Profile-guided Optimization (PGO) improves application performance by shrinking code size, reducing branch mispredictions, and reorganizing code layout to reduce instruction-cache misses. PGO tells the compiler which areas of an application are executed most frequently. Knowing these areas, the compiler can be more selective and specific in optimizing the application.
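1. Instrumented (profiling) build: add the -fprofile-generate=(dir) option to COMMON_FLAGS, where dir is the directory in which the profile data files will be stored. For example: COMMON_FLAGS="$COMMON_FLAGS -fprofile-generate=/u01/profile"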
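2. Profiling run: once the build completes, run the test cases through the normal test flow. All profile data files are written to the directory specified earlier, with names of the form #xx#xx#xx.cc.gcda; a corresponding .gcda file is generated for every source file whose code was executed.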
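Before moving on, it can help to confirm that the profiling run actually produced data. The following is a minimal sketch, not part of the original workflow: `count_gcda` is a hypothetical helper name, and `PROFILE_DIR` is an assumed variable that defaults to the example path used in step 1.

```shell
# Hypothetical helper: count the .gcda files found under a profile directory.
count_gcda() {
    find "$1" -type f -name '*.gcda' | wc -l | tr -d ' '
}

# Assumed variable; defaults to the directory from the step 1 example.
PROFILE_DIR="${PROFILE_DIR:-/u01/profile}"

if [ -d "$PROFILE_DIR" ]; then
    echo "found $(count_gcda "$PROFILE_DIR") .gcda file(s) in $PROFILE_DIR"
else
    echo "profile directory $PROFILE_DIR does not exist" >&2
fi
```

A count of zero usually means the instrumented binary was not the one that ran, or that it could not write to the directory.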
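3. Merging profile data: if several profiling runs were made, with the results stored in, for example, profile_data/1, profile_data/2, and profile_data/3, merge them with the following commands:
    gcov-tool merge profile_data/1 profile_data/2 -o temp1
    gcov-tool merge temp1 profile_data/3 -o profile_merged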
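Because gcov-tool merge takes two directories at a time, more than two runs have to be folded pairwise, as the commands above do by hand. The function below is a sketch of that fold for any number of directories. `merge_profiles` and the `GCOV_TOOL` variable are hypothetical names introduced here, and the intermediate results go to a temporary directory rather than temp1.

```shell
# Assumption: gcov-tool is on PATH; set GCOV_TOOL to use a differently named binary.
GCOV_TOOL="${GCOV_TOOL:-gcov-tool}"

# usage: merge_profiles OUT DIR1 DIR2 [DIR3 ...]
# Merges DIR1 with DIR2, then the result with DIR3, and so on; the last merge writes OUT.
# The body runs in a subshell so its variables do not leak into the caller.
merge_profiles() (
    out=$1
    shift
    if [ "$#" -lt 2 ]; then
        echo "usage: merge_profiles OUT DIR1 DIR2 [DIR3 ...]" >&2
        exit 1
    fi
    acc=$1
    shift
    tmp=$(mktemp -d)
    i=0
    for dir in "$@"; do
        i=$((i + 1))
        # The final step writes to OUT; earlier steps write to a scratch directory.
        if [ "$i" -eq "$#" ]; then next=$out; else next="$tmp/step$i"; fi
        "$GCOV_TOOL" merge -o "$next" "$acc" "$dir" || { rm -rf "$tmp"; exit 1; }
        acc=$next
    done
    rm -rf "$tmp"
)

# Example call, equivalent to the two commands above:
# merge_profiles profile_merged profile_data/1 profile_data/2 profile_data/3
```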
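4. Optimized build: add -fprofile-use=(dir) -Wno-missing-profile -fprofile-correction to COMMON_FLAGS, where dir is the directory containing the .gcda files (the final merged directory if several profiling runs were made). For example: COMMON_FLAGS="$COMMON_FLAGS -fprofile-use=/u01/profile -Wno-missing-profile -fprofile-correction" The resulting build has the PGO optimizations applied.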
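The instrumented and optimized builds differ only in the flags appended to COMMON_FLAGS, so the choice can be driven by a single switch. The snippet below is a sketch under assumptions: `pgo_flags`, `PGO_PHASE`, and `PROFILE_DIR` are hypothetical names that do not belong to any particular build system, and only the compiler flags themselves come from the steps above.

```shell
# Hypothetical helper: print the extra compiler flags for one PGO phase.
# usage: pgo_flags generate|use DIR
pgo_flags() {
    case "$1" in
        generate) printf '%s\n' "-fprofile-generate=$2" ;;
        use)      printf '%s\n' "-fprofile-use=$2 -Wno-missing-profile -fprofile-correction" ;;
        *)        echo "usage: pgo_flags generate|use DIR" >&2; return 1 ;;
    esac
}

# Assumed variables: the phase to build for and the profile directory.
PGO_PHASE="${PGO_PHASE:-generate}"
PROFILE_DIR="${PROFILE_DIR:-/u01/profile}"

COMMON_FLAGS="${COMMON_FLAGS:-} $(pgo_flags "$PGO_PHASE" "$PROFILE_DIR")"
echo "COMMON_FLAGS=$COMMON_FLAGS"
```

Running the build once with PGO_PHASE=generate and once with PGO_PHASE=use keeps both phases pointed at the same directory, which avoids a mismatch between where the profile is written and where it is read.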
3 PGO benefits
PGO provides the following benefits:
- Use profile information for register allocation to optimize the location of spill code.
- Improve branch prediction for indirect function calls by identifying the most likely targets. Some processors have longer pipelines, so improved branch prediction translates into large performance gains.
- Detect loops that execute only a small number of iterations and skip vectorizing them, reducing the run-time overhead that vectorization might otherwise add.