Profile-Guided Optimization (PGO) Databases


This article is a continuation of my MSDN article on Profile-Guided Optimization (PGO), which you can find here.

The PGD profile is useful for more than guiding the compiler’s optimizations. Although pgomgr.exe is typically used to merge multiple PGC files, it also serves a different purpose: it offers three switches that let you view the contents of the PGD file and understand how your code behaves in the exercised scenarios. The first switch, /summary, tells the tool to emit a textual summary of the PGD file contents. The second switch, /detail, used in conjunction with the first, tells the tool to emit a detailed textual profile description. The last switch, /unique, tells the tool to undecorate the function names (particularly useful for C++ codebases).
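As a minimal sketch of how the three switches combine, the following Python helper builds the pgomgr command line and, optionally, runs it and captures the text. The file name app.pgd and both function names are hypothetical, and the sketch assumes that pgomgr is on the PATH and writes the profile description to standard output.

```python
import subprocess


def pgomgr_dump_command(pgd_path, detail=True, unique=True):
    """Build the argument list for dumping a PGD file as text.

    /detail is only meaningful together with /summary, so /summary
    is always included.
    """
    cmd = ["pgomgr", "/summary"]
    if detail:
        cmd.append("/detail")
    if unique:
        cmd.append("/unique")
    cmd.append(pgd_path)
    return cmd


def dump_pgd(pgd_path):
    """Run pgomgr and return its textual output (assumes stdout)."""
    result = subprocess.run(
        pgomgr_dump_command(pgd_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # "app.pgd" is a placeholder name, not a file from the article.
    print(" ".join(pgomgr_dump_command("app.pgd")))
```

Building the argument list separately from running it keeps the command easy to inspect before anything is executed.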

When you run the tool with all of these switches and the name of the PGD file constructed in the previous section, you’ll see what was recorded during the training phase. Your numbers will probably differ slightly from those shown here. The first interesting line of the profile is:

Module Count: 1 Function Count: 2244 Arc Count: 5193 Value Count: 502

Module Count is the number of times the same executable was used to generate constituent PGC files. Function Count is the number of instrumented functions (including those that were never executed). Arc Count is the number of count probes, and Value Count is the total number of value probes injected in the code. After that, you’ll see a list of undecorated function names and some general statistics:

Static instructions: 46223 Basic blocks: 9897 Average BB size: 4.7 Dynamic instructions: 24192609520

The static instructions metric shows the number of static instructions generated for all instrumented functions. Those instructions are called static because they’re in an internal intermediate language that closely resembles an assembly language. The other type of instruction is dynamic, referring to executed machine instructions. The output also shows the total number of basic blocks and their average size in terms of static instructions.
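Both summary lines shown so far are sequences of "name: number" pairs, so they can be pulled apart with a single regular expression. The sketch below is only an illustration based on the two sample lines in this article; the function name parse_counts is hypothetical, and the real output may contain variations the pattern does not cover. It also recomputes the average basic block size from the other two numbers.

```python
import re

# One "Name: number" pair; names are letters and spaces only.
PAIR_RE = re.compile(r"([A-Za-z][A-Za-z ]*?):\s*([\d.]+)")


def parse_counts(line):
    """Turn a pgomgr summary line into a {name: number} dictionary."""
    result = {}
    for name, value in PAIR_RE.findall(line):
        result[name.strip()] = float(value) if "." in value else int(value)
    return result


if __name__ == "__main__":
    header = parse_counts(
        "Module Count: 1 Function Count: 2244 Arc Count: 5193 Value Count: 502"
    )
    stats = parse_counts(
        "Static instructions: 46223 Basic blocks: 9897 "
        "Average BB size: 4.7 Dynamic instructions: 24192609520"
    )
    print(header["Function Count"])
    # Average BB size is static instructions divided by basic blocks.
    print(stats["Static instructions"] / stats["Basic blocks"])
```

For the sample numbers, 46223 static instructions over 9897 basic blocks gives roughly 4.67, which agrees with the 4.7 that the tool reports.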

A large table with the following columns will follow those descriptions:

  • The undecorated function name.
  • The entry count or the number of times the function was called.
  • The number of static instructions in the function.
  • The number of dynamic instructions (the number of instructions executed from this function). This number is the sum over all executions of the function.
  • The %total metric is equal to the previous metric divided by the total number of dynamic instructions of the program, expressed as a percentage.
  • Run total is the accumulated version of the previous metric. This should be 100 for the last function listed.
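The last two columns can be reproduced from the dynamic instruction counts alone. The sketch below is an assumption about how the numbers relate, not pgomgr’s actual implementation: it assumes that displayed values are truncated (not rounded) to one fractional digit and that the run total accumulates the exact counts. The function names and the two-function example are hypothetical.

```python
def format_tenths(numerator, denominator):
    """Percentage with one fractional digit, truncated toward zero.

    Integer arithmetic avoids floating-point surprises near a
    digit boundary.
    """
    tenths = numerator * 1000 // denominator
    return f"{tenths // 10}.{tenths % 10}"


def total_columns(dynamic_counts):
    """Return (%total, run total) strings for each function, in order."""
    total = sum(dynamic_counts)
    if total == 0:
        return []
    rows = []
    running = 0
    for count in dynamic_counts:
        running += count
        rows.append((format_tenths(count, total), format_tenths(running, total)))
    return rows


if __name__ == "__main__":
    # Hypothetical program with two functions.
    for row in total_columns([84, 999916]):
        print(row)
    # The sample function: 84 of 24192609520 dynamic instructions.
    print(format_tenths(84, 24192609520))
```

With these assumptions, a function that executed 84 of the program’s 24192609520 dynamic instructions is displayed with a %total of 0.0, as in the sample output.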

Unfortunately, pgomgr shows only one digit of the fractional portion, rounded toward zero. That’s why %total might be displayed as zero even though it’s not actually zero. Under each listed function, there’s a detailed table of the basic blocks of the function with their statistics:

CDXUTDialogResourceManager::CreateTexture11
2 42 84 0.0 100.0
Blk 1: 3117- 3122 5 (11.9%)s 10 (11.9%)d
taken ( 5) 0, not-taken 2
Blk 2: 3124- 3124 2 ( 4.8%)s 4 ( 4.8%)d
taken ( 5) 0, not-taken 2
Blk 3: 3124- 3124 2 ( 4.8%)s 4 ( 4.8%)d
taken ( 5) 0, not-taken 2
Blk 4: 3126- 3127 6 (14.3%)s 12 (14.3%)d
taken ( 6) 0, not-taken 2
Blk 5: 3205- 3215 25 (59.5%)s 50 (59.5%)d
Blk 6: 3218- 3218 2 ( 4.8%)s 4 ( 4.8%)d

This function consists of six basic blocks. Next to each basic block is the following information:

  • The line numbers in the source code file for that block.
  • The number of static instructions and the percentage of the total number of the static instructions of the function.
  • The number of dynamic instructions and the percentage of the total number of the dynamic instructions of the function.

If a block ends with a conditional instruction that changes control flow, you’ll see a line showing the number of times the branch has been taken and not taken. The number in parentheses that might appear next to the taken or not-taken phrases is the block number of the corresponding flow of control.
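To make the layout of these lines concrete, here is a minimal parsing sketch. The regular expressions are derived only from the sample output above, so they are an assumption about the format, and the function names and dictionary keys are hypothetical.

```python
import re

# "Blk 1: 3117- 3122 5 (11.9%)s 10 (11.9%)d"
BLOCK_RE = re.compile(
    r"Blk\s+(\d+):\s*(\d+)-\s*(\d+)\s+"
    r"(\d+)\s+\(\s*[\d.]+%\)s\s+"
    r"(\d+)\s+\(\s*[\d.]+%\)d"
)

# "taken ( 5) 0, not-taken 2"; the parenthesized block number is optional.
BRANCH_RE = re.compile(
    r"taken\s*(?:\(\s*(\d+)\))?\s*(\d+),\s*"
    r"not-taken\s*(?:\(\s*(\d+)\))?\s*(\d+)"
)


def parse_block(line):
    """Parse a basic block line; return None if the line does not match."""
    m = BLOCK_RE.match(line.strip())
    if m is None:
        return None
    number, first, last, static, dynamic = (int(g) for g in m.groups())
    return {"block": number, "first_line": first, "last_line": last,
            "static": static, "dynamic": dynamic}


def parse_branch(line):
    """Parse a taken/not-taken line; return None if the line does not match."""
    m = BRANCH_RE.match(line.strip())
    if m is None:
        return None
    taken_target, taken, not_taken_target, not_taken = m.groups()
    return {
        "taken_target": int(taken_target) if taken_target else None,
        "taken": int(taken),
        "not_taken_target": int(not_taken_target) if not_taken_target else None,
        "not_taken": int(not_taken),
    }


if __name__ == "__main__":
    print(parse_block("Blk 1: 3117- 3122 5 (11.9%)s 10 (11.9%)d"))
    print(parse_branch("taken ( 5) 0, not-taken 2"))
```

In the sample, the static counts of the six blocks add up to 42 and the dynamic counts to 84, matching the function-level row; each block’s dynamic count is twice its static count, consistent with an entry count of 2.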

Following that table is a summary table that includes the following columns:

  • The module name.
  • The undecorated function name. The name is truncated to 25 characters if it is longer.
  • The percentage of executed blocks in the function.
  • The percentage of arcs traversed in the function.
  • The percentage of executed static instructions.

If none of the blocks were executed, the phrase “Never Executed” is displayed. If not all blocks were executed, the table will list the non-executed blocks. At the end of the file, you’ll see some general statistics:

block: 33.9% arc: 23.7% inst: 40.2% functions called: 42.6%

This shows the percentage of blocks and static instructions that were executed, the arcs that were traversed, and the functions that were called. You can also use this textual representation of the PGD file for other purposes. However, the lack of formal documentation and of a proper visual display makes using this information difficult. If you think this information is useful to you in some way, you should let the Visual C++ team know about it.
