On Compiling WebKit (now twice as fast!)

Are you tired of waiting for ages to build large C++ projects like WebKit? Slow headers are generally the problem. Your C++ source code file #includes a few headers, all those headers #include more, and those headers #include more, and more, and more, and since it’s C++ a bunch of these headers contain lots of complex templates to slow things down even more. Not fun.

It turns out that much of the time spent building large C++ projects is effectively spent parsing the same headers again and again, over, and over, and over, and over, and over….
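
To make that concrete, here’s a minimal sketch (with invented file names): each .cpp file is a separate translation unit, so each one re-parses the full text of every header it pulls in, directly or transitively.

// Widget.h
#pragma once
#include <string>   // pulls in thousands of lines of library headers
#include <vector>   // ...and thousands more

struct Widget {
    std::string name;
    std::vector<int> data;
};

// A.cpp — re-parses <string>, <vector>, and everything they include
#include "Widget.h"

// B.cpp — re-parses all of it again, from scratch
#include "Widget.h"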

There are three possible solutions to this problem:

  • Shred your CPU and buy a new one that’s twice as fast.
  • Use C++ modules: import instead of #include (see the sketch after this list). This will soon become the best solution, but it’s not standardized yet. For WebKit’s purposes, we can’t use it until it works the same in MSVC, Clang, and three-year-old versions of GCC. So it’ll be quite a while before we’re able to take advantage of modules.
  • Use unified builds (sometimes called unity builds).
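
For comparison, here’s a minimal sketch of what the modules approach looks like (C++20-style syntax; the module and function names are invented): the module is compiled once, and importing it doesn’t re-parse any header text.

// math.cppm — module interface, compiled once
export module math;

export int square(int x)
{
    return x * x;
}

// main.cpp — import loads the compiled module instead of parsing text
import math;

int main()
{
    return square(7);
}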

WebKit has adopted unified builds. This work was done by Keith Miller, from Apple. Thanks, Keith! (If you’ve built WebKit before, you’ll probably want to say that again: thanks, Keith!)

For a release build of WebKitGTK+, on my desktop, our build times used to look like this:

real 62m49.535s
user 407m56.558s
sys 62m17.166s

That was taken using WebKitGTK+ 2.17.90; build times with any 2.18 release would be similar. Now, with trunk (or WebKitGTK+ 2.20, which will be very similar), our build times look like this:

real 33m36.435s
user 214m9.971s
sys 29m55.811s

Twice as fast.

The approach is pretty simple: instead of telling the compiler to build the original C++ source code files that developers see, we instead tell the compiler to build unified source files that look like this:

// UnifiedSource1.cpp
#include "CSSValueKeywords.cpp"
#include "ColorData.cpp"
#include "HTMLElementFactory.cpp"
#include "HTMLEntityTable.cpp"
#include "JSANGLEInstancedArrays.cpp"
#include "JSAbortController.cpp"
#include "JSAbortSignal.cpp"
#include "JSAbstractWorker.cpp"

Since headers are included only once per translation unit (that’s what include guards are for), we now have to parse the same headers only once per unified source file, rather than once per individual original source file, and we get a dramatic build speedup. It’s pretty terrible, yet extremely effective.

Now, how many original C++ files should you #include in each unified source file? To get the fastest clean build time, you would want to #include all of your C++ source files in a single unified source file, so that the compiler sees each header only once. (Meson can do this for you automatically!) But that causes two problems. First, you have to make sure none of the files throughout your entire codebase use conflicting variable names, since the static keyword and anonymous namespaces no longer restrict your definitions to a single file; the sketch below shows how this breaks. That’s impractical in a large project like WebKit. Second, because there’s now only one file passed to the compiler, incremental builds take as long as clean builds, which is not fun if you are a WebKit developer and actually need to make changes to it. Unifying more files together will always make incremental builds slower. After some experimentation, Apple determined that, for WebKit, the optimal number of files to include together is roughly eight: at that point, there’s not yet much negative impact on incremental builds, and past it there are diminishing returns in clean build improvement.
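
Here’s a minimal sketch of that first problem (file and variable names invented): two files that are fine on their own stop compiling once they’re bundled into the same unified source.

// Source1.cpp
static const int bufferSize = 16;  // file-local in a normal build

// Source2.cpp
static const int bufferSize = 32;  // also fine on its own

// UnifiedSource1.cpp
#include "Source1.cpp"
#include "Source2.cpp"  // error: redefinition of 'bufferSize'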

In WebKit’s implementation, the files to bundle together are computed automatically at build time using CMake black magic. Adding a new file to the build can change how the files are bundled together, potentially causing build errors in different files if there are symbol clashes. But this is usually easy to fix, because only files from the same directory are bundled together, so random unrelated files will never be built together. The bundles are always the same for everyone building the same version of WebKit, so you won’t see random build failures; only developers who are adding new files will ever have to deal with name conflicts.

To significantly reduce name conflicts, we now limit the scope of using statements. That is, stuff like this:

using namespace JavaScriptCore;
namespace WebCore {
//...
}

has been changed to this:

namespace WebCore {
using namespace JavaScriptCore;
// ...
}
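
To see why this matters, here’s a contrived sketch (the class name is invented): in a unified source, a file-scope using directive in one file stays active for every file bundled after it, and two file-scope directives can make unqualified names ambiguous.

namespace JavaScriptCore { class Identifier { }; }
namespace WebCore { class Identifier { }; }

// --- first file bundled into the unified source ---
using namespace JavaScriptCore;  // file scope: leaks into everything below

// --- second file, bundled after the first ---
using namespace WebCore;

void handleName()
{
    Identifier name;  // error: 'Identifier' is ambiguous
}

Moving the directive inside namespace WebCore keeps JavaScriptCore names from spilling into the global scope of every file that follows.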

Some files need to be excluded due to unsolvable name clashes. For example, files that include X11 headers, which contain lots of unnamespaced symbols that conflict with WebCore symbols, don’t really have any chance. But only a few files should need to be excluded, so this does not have much impact on build time. We’ve also opted to not unify most of the GLib API layer, so that we can continue to use conventional GObject names in our implementation, but again, the impact of not unifying a few files is minimal.
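
As a concrete illustration of the X11 problem (the enum is invented, but the macro really is in X11’s headers): X.h defines common words like None as plain macros, which then break any later code in the same translation unit that uses those names.

#include <X11/X.h>   // contains: #define None 0L

enum class FillMode { None, Forwards };  // error: 'None' expands to 0L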

We still have some room for further performance improvement, because some significant parts of the build are still not unified, including most of the WebKit layer on top. But I suspect developers who have to regularly build WebKit will already be quite pleased.

8 Replies to “On Compiling WebKit (now twice as fast!)”

  1. How does this interact with -j and, more importantly, peak (top) memory usage during a clean build?
    I’m guessing -j works just as usual at the make/ninja level, so -j8 will actually build 64 .cpp files’ worth in parallel now, due to the 8-way unified ones?
    Is the peak memory usage of a clean build still about the same for the same number of compilation units (which are now much longer)? Or, if we are memory starved, should we start looking at possibly having to reduce -j to lower than the CPU core count (more so than before unified builds)?

    1. I wouldn’t change -j at all. It’s still only 8 .cpp files being built in parallel; they’re just 8x larger files. That doesn’t mean you should start building only four files in parallel; that’s not going to help you.

      I recommend passing -GNinja to CMake to get the Ninja generator, though, which has sane defaults and doesn’t require passing -j.

      1. Actually, someone is currently telling me the opposite of what I just said. Unified builds increase RAM requirements, and everything gets way slower if you run out of RAM. If you’re running out of memory, then of course you should definitely reduce -j.

        1. Yeah, I was asking in terms of Gentoo packaging and the user support afterwards (most users compile on their own systems, after all), from potentially more OOMs, etc.
          Chromium recently received this for us as well, but it’s opt-in due to the increased memory requirements.
          If, due to RAM starvation, people now need to reduce -j to lower than their CPU count, that sounds a bit counter-productive to the supposed lower build times, and maybe in some cases we could end up with longer build times instead?

          I guess I can’t really muck with automatically lowering -j for them, and will just continue business as usual, especially if you don’t provide an opt-out for unified builds either. But that remains to be found out while packaging it in the future.

          We already make cmake use ninja in our webkit-gtk packages since webkit-gtk-2.8.3 times (most other cmake using packages default to make).

  2. If somebody is reading this: DO NOT DO THIS.

    Unified builds change the semantics of the language: e.g. the static keyword changes its meaning, anonymous namespaces start to conflict, etc.

    You really should be using precompiled headers instead. They are the real predecessor of modules.
