The Art of Profiling Using Intel VTune Amplifier, Part 8 (Final)


Part 1, Part 2, and Part 3 of this series provided an introduction to profiling and showed how to set up VTune. The first optimization was discussed in Part 4, in which the number of times printf is executed was reduced. The second optimization was discussed in Part 5, in which strlen was replaced with a much cheaper alternative. The third optimization was discussed in Part 6, in which the amount of computation required to report progress was reduced. The fourth optimization was discussed in Part 7, in which the function do_pswd was inlined into its caller. The following chart shows by how much each optimization improved password cracking throughput.

All four optimizations were significant, but reducing the number of printf calls yielded the largest improvement. In general, it’s recommended to either reduce I/O, use asynchronous I/O, or perform I/O operations in dedicated threads. This applies to graphics drawing as well. It was also interesting to see that the compiler failed to realize the importance of inlining do_pswd and that we had to do that manually. Perhaps, if we had used profile-guided optimization, it would have figured that out by itself.
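To make the "reduce I/O" advice concrete, here is a minimal sketch in C of reporting progress only once every fixed number of candidates rather than on every iteration. It is not the code from Part 4: the function name try_candidates, the constant REPORT_INTERVAL, and the message format are all assumptions made for illustration, and the actual password check is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical reporting interval; this value is not taken from the series. */
#define REPORT_INTERVAL 1000000u

/*
 * Loops over `total` candidates and writes one progress line to `out`
 * every REPORT_INTERVAL candidates instead of once per candidate.
 * Returns the number of progress lines written.
 */
static unsigned long try_candidates(uint64_t total, FILE *out)
{
    unsigned long reports = 0;

    assert(out != NULL);

    for (uint64_t i = 1; i <= total; ++i) {
        /* The real work on candidate i would go here (omitted). */

        if (i % REPORT_INTERVAL == 0) {
            fprintf(out, "tried %llu candidates\n", (unsigned long long)i);
            ++reports;
        }
    }
    return reports;
}
```

Calling try_candidates(2500000, stdout) under these assumptions writes two progress lines instead of 2.5 million, which is the kind of reduction in I/O calls that the first optimization was after.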

I refrained from getting into algorithmic, microarchitectural, and parallelization optimizations in this series to keep it short and simple. Maybe I’ll discuss them in future articles. VTune can certainly be used to guide those kinds of optimizations too.
