=============================================================
How To Build Clang and LLVM with Profile-Guided Optimizations
=============================================================

Introduction
============

PGO (Profile-Guided Optimization) allows your compiler to better optimize code
for how it actually runs. Users report that applying this to Clang and LLVM can
decrease overall compile time by 20%.

This guide walks you through how to build Clang with PGO, though it also applies
to other subprojects, such as LLD.


Using the script
================

We have a script at ``utils/collect_and_build_with_pgo.py``. This script is
tested on a few Linux flavors, and requires a checkout of LLVM, Clang, and
compiler-rt. Despite the name, it performs four clean builds of Clang, so it
can take a while to run to completion. Please see the script's ``--help`` for
more information on how to run it, and the different options available to you.
If you want to get the most out of PGO for a particular use-case (e.g. compiling
a specific large piece of software), please do read the section below on
'benchmark' selection.

Please note that this script is only tested on a few Linux distros. Patches to
add support for other platforms, as always, are highly appreciated. :)

This script also supports a ``--dry-run`` option, which causes it to print
important commands instead of running them.
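A first look at the script might go like the sketch below. It assumes you run
it from the LLVM source directory and that ``python3`` is on your ``PATH``;
both are assumptions for illustration. Only ``--help`` and ``--dry-run`` come
from the text above, so any other arguments the script needs should be taken
from its ``--help`` output rather than from this example.

```shell
#!/bin/sh
# Sketch: inspect the PGO helper script before running it for real.
# Assumes the current directory is the LLVM source directory.
SCRIPT="utils/collect_and_build_with_pgo.py"

if [ -f "$SCRIPT" ]; then
  # List the options the script accepts.
  python3 "$SCRIPT" --help
else
  echo "not found: $SCRIPT (run this from the LLVM source directory)"
fi
```

Once you know which arguments your setup needs, adding ``--dry-run`` to that
command line prints the important commands instead of running them.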


Selecting 'benchmarks'
======================

PGO does best when the profiles gathered represent how the user plans to use the
compiler. Notably, highly accurate profiles of llc building x86_64 code aren't
incredibly helpful if you're going to be targeting ARM.

By default, the script above does two things to get solid coverage. It:

- runs all of Clang and LLVM's lit tests, and
- uses the instrumented Clang to build Clang, LLVM, and all of the other
  LLVM subprojects available to it.

Together, these should give you:

- solid coverage of building C++,
- good coverage of building C,
- great coverage of running optimizations,
- great coverage of the backend for your host's architecture, and
- some coverage of other architectures (if other arches are supported backends).

Altogether, this should cover a diverse set of uses for Clang and LLVM. If you
have very specific needs (e.g. your compiler is meant to compile a large browser
for four different platforms, or similar), you may want to do something else.
This is configurable in the script itself.


Building Clang with PGO
=======================

If you prefer not to use the script, this section briefly goes over how to
build Clang/LLVM with PGO.

First, you should have at least LLVM, Clang, and compiler-rt checked out
locally.

Next, at a high level, you're going to need to do the following:

1. Build a standard Release Clang and the relevant libclang_rt.profile library
2. Build Clang using the Clang you built above, but with instrumentation
3. Use the instrumented Clang to generate profiles, which consists of two steps:

   - Running the instrumented Clang/LLVM/lld/etc. on tasks that represent how
     users will use said tools.
   - Using a tool to convert the "raw" profiles generated above into a single,
     final PGO profile.

4. Build a final release Clang (along with whatever other binaries you need)
   using the profile collected from your benchmark

In more detailed steps:

1. Configure a Clang build as you normally would. It's highly recommended that
   you use the Release configuration for this, since it will be used to build
   another Clang. Because you need Clang and supporting libraries, you'll want
   to build the ``all`` target (e.g. ``ninja all`` or ``make -j4 all``).

2. Configure a Clang build as above, but add the following CMake args:

   - ``-DLLVM_BUILD_INSTRUMENTED=IR`` -- This causes us to build everything
     with instrumentation.
   - ``-DLLVM_BUILD_RUNTIME=No`` -- A few projects have bad interactions when
     built with profiling, and aren't necessary to build. This flag turns them
     off.
   - ``-DCMAKE_C_COMPILER=/path/to/stage1/clang`` -- Use the Clang we built in
     step 1.
   - ``-DCMAKE_CXX_COMPILER=/path/to/stage1/clang++`` -- Same as above.

   In this build directory, you simply need to build the ``clang`` target (and
   whatever supporting tooling your benchmark requires).

3. As mentioned above, this has two steps: gathering profile data, and then
   massaging it into a useful form:

   a. Build your benchmark using the Clang generated in step 2. The 'standard'
      benchmark recommended is to run ``check-clang`` and ``check-llvm`` in your
      instrumented Clang's build directory, and to do a full build of Clang/LLVM
      using your instrumented Clang. So, create yet another build directory,
      with the following CMake arguments:

      - ``-DCMAKE_C_COMPILER=/path/to/stage2/clang`` -- Use the Clang we built
        in step 2.
      - ``-DCMAKE_CXX_COMPILER=/path/to/stage2/clang++`` -- Same as above.

      If your users are fans of debug info, you may want to consider using
      ``-DCMAKE_BUILD_TYPE=RelWithDebInfo`` instead of
      ``-DCMAKE_BUILD_TYPE=Release``. This will grant better coverage of
      debug info pieces of Clang, but will take longer to complete and will
      result in a much larger build directory.

      It's recommended to build the ``all`` target with your instrumented Clang,
      since more coverage is often better.

   b. You should now have a few ``*.profraw`` files in
      ``/path/to/stage2/profiles/``. You need to merge these using
      ``llvm-profdata`` (even if you only have one! The profile merge transforms
      profraw into actual profile data, as well). This can be done with
      ``/path/to/stage1/llvm-profdata merge
      -output=/path/to/output/profdata.prof /path/to/stage2/profiles/*.profraw``.

4. Now, build your final, PGO-optimized Clang. To do this, you'll want to pass
   the following additional arguments to CMake:

   - ``-DLLVM_PROFDATA_FILE=/path/to/output/profdata.prof`` -- Use the PGO
     profile from the previous step.
   - ``-DCMAKE_C_COMPILER=/path/to/stage1/clang`` -- Use the Clang we built in
     step 1.
   - ``-DCMAKE_CXX_COMPILER=/path/to/stage1/clang++`` -- Same as above.

   From here, you can build whatever targets you need.

   .. note::
      You may see warnings about a mismatched profile in the build output. These
      are generally harmless. To silence them, you can add
      ``-DCMAKE_C_FLAGS='-Wno-backend-plugin'
      -DCMAKE_CXX_FLAGS='-Wno-backend-plugin'`` to your CMake invocation.
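The four steps above can be sketched as one shell script. This is a minimal
sketch under stated assumptions, not the project's own tooling: the ``SRC`` and
``OUT`` locations, the ``stage1``/``stage2``/``benchmark``/``final`` directory
names, the ``bin/`` subdirectory for built binaries, the ``run`` helper, and the
choice of Ninja are all illustrative. By default it only prints the commands;
set ``RUN=1`` in the environment to execute them.

```shell
#!/bin/sh
# Sketch of the four PGO steps. Paths, directory names, and the run() helper
# are illustrative assumptions, not part of LLVM.
set -eu

SRC="${SRC:-/path/to/llvm}"      # LLVM source directory (assumed)
OUT="${OUT:-/path/to/builds}"    # parent of all build directories (assumed)
# Whatever you normally pass to CMake (generator, enabled projects, ...).
# Expanded unquoted below on purpose, so it splits into separate arguments.
BASE_ARGS="-G Ninja -DCMAKE_BUILD_TYPE=Release"

run() {
  # Print each command; execute it only when RUN=1.
  printf '+ %s\n' "$*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Step 1: plain Release build of everything.
run cmake -S "$SRC" -B "$OUT/stage1" $BASE_ARGS
run cmake --build "$OUT/stage1" --target all

# Step 2: instrumented Clang, compiled by the stage1 Clang.
run cmake -S "$SRC" -B "$OUT/stage2" $BASE_ARGS \
  -DLLVM_BUILD_INSTRUMENTED=IR \
  -DLLVM_BUILD_RUNTIME=No \
  -DCMAKE_C_COMPILER="$OUT/stage1/bin/clang" \
  -DCMAKE_CXX_COMPILER="$OUT/stage1/bin/clang++"
run cmake --build "$OUT/stage2" --target clang

# Step 3a: exercise the instrumented Clang (lit tests plus a full build).
run cmake --build "$OUT/stage2" --target check-clang
run cmake --build "$OUT/stage2" --target check-llvm
run cmake -S "$SRC" -B "$OUT/benchmark" $BASE_ARGS \
  -DCMAKE_C_COMPILER="$OUT/stage2/bin/clang" \
  -DCMAKE_CXX_COMPILER="$OUT/stage2/bin/clang++"
run cmake --build "$OUT/benchmark" --target all

# Step 3b: merge the raw profiles into a single profile.
run "$OUT/stage1/bin/llvm-profdata" merge \
  -output="$OUT/profdata.prof" "$OUT"/stage2/profiles/*.profraw

# Step 4: final PGO-optimized build, again compiled by the stage1 Clang.
# "all" is only an example; build whatever targets you need.
run cmake -S "$SRC" -B "$OUT/final" $BASE_ARGS \
  -DLLVM_PROFDATA_FILE="$OUT/profdata.prof" \
  -DCMAKE_C_COMPILER="$OUT/stage1/bin/clang" \
  -DCMAKE_CXX_COMPILER="$OUT/stage1/bin/clang++"
run cmake --build "$OUT/final" --target all
```

Printing by default keeps the sketch safe to try: you can read the full command
sequence and adjust the paths before committing to several long builds.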


Congrats! You now have a Clang built with profile-guided optimizations, and you
can delete all but the final build directory if you'd like.

If this worked well for you and you plan on doing it often, there's a slight
optimization that can be made: LLVM and Clang each have a TableGen tool
(``llvm-tblgen`` and ``clang-tblgen``) that's built and run during the build
process. While it's potentially nice to build these for coverage as part of
step 3, none of your other builds should benefit from building them. You can
pass the CMake options
``-DCLANG_TABLEGEN=/path/to/stage1/bin/clang-tblgen
-DLLVM_TABLEGEN=/path/to/stage1/bin/llvm-tblgen`` to step 2 and onward to avoid
these useless rebuilds.
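As a rough sketch of that optimization, the two options can be generated once
and appended to each later CMake command line. The ``STAGE1`` variable and the
``tablegen_args`` helper are hypothetical names introduced here for
illustration; only the two ``-D`` options come from the text above.

```shell
#!/bin/sh
# Sketch: reuse stage1's TableGen binaries in later builds.
# STAGE1 is an assumed path to the step 1 build directory.
STAGE1="${STAGE1:-/path/to/stage1}"

tablegen_args() {
  # Print the two CMake options, one per line.
  printf '%s\n' \
    "-DCLANG_TABLEGEN=$STAGE1/bin/clang-tblgen" \
    "-DLLVM_TABLEGEN=$STAGE1/bin/llvm-tblgen"
}

# Show the options; append them to the CMake command lines of step 2 onward.
tablegen_args
```

In a script, something like ``cmake ... $(tablegen_args)`` would splice both
options into an existing invocation, provided the paths contain no spaces.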