On setenv() and Explosions

Skim over this abbreviated and badly-formatted function definition, from glibpacrunner.c:

static void
handle_method_call (GDBusConnection *connection, /* ... */)
{
  /* ... */

  if (/* ... */)
    {
      gchar *libproxy_url = g_strdup_printf ("pac+%s", pac_url);
      g_setenv ("http_proxy", libproxy_url, TRUE);
      g_free (libproxy_url);
    }
  else
    g_setenv ("http_proxy", "wpad://", TRUE);

  g_proxy_resolver_lookup_async (/* ... */);
}

Note:

  1. This is a GDBus method call handler, so you know GDBus is currently running, and you know this is a multithreaded application.
  2. The code calls setenv().

Because you’re a careful programmer, you check the manpage before using a function, and read all the way to the bottom, so you know that setenv() is not threadsafe. Unless your application is trivial and links to no libraries, you can be pretty confident that it is running secondary threads, and that, somewhere, those threads call getenv(), because getenv() is used pervasively in libraries. (A frequent culprit for internationalized applications is gettext(), conventionally defined as the underscore function _().)

So, what does this mean in practice? It means that if any other thread is calling getenv() at the same time that you call setenv(), one or the other call will blow up. This is a timing-dependent race that will only happen occasionally. Thanks to Murphy, you can expect it to only occur on the hardware that your important customer is running, and never for your developers when they attempt to reproduce the issue.

The safest solution is to ensure your application only ever calls setenv() at the very top of main(), before your first secondary thread is initialized. It can be quite hard to know which library calls will initialize secondary threads. (What about if you use GOptionContext? Or GApplication? If you need to check a setting before changing the environment, how many threads does a GSettings object spawn when you create it? Even if it was zero — and it’s not — what if a custom GIO module was in use?) Your best bet is to restrict yourself to using setenv() at the very, very top of main(), where you know it’s safe. This applies to unsetenv(), as well.

This is not some theoretical problem. In practice, we have documented:

  • Unsafe use of setenv() in gnome-session would cause gnome-session to crash, bringing down the entire desktop session and returning the user to the login screen.
  • In a downstream product using a modified version of gnome-initial-setup, unsafe use of setenv() to change LANG on the language selection page would cause the first boot experience to crash, leaving the sufficiently-unlucky user staring at an empty unusable desktop after turning on her new computer.

Both of these issues were rare races that were not possible to reproduce.

In a sense, setenv() is not really special, in that lots of POSIX functions modify global state and are unsafe to use across threads. The problem with setenv() is that, once such an issue is identified, it can be very difficult to fix. Sometimes there is an alternative to setting an environment variable, but often environment variables are the only way to configure important settings. If you have no other choice but to set an environment variable, then there is not really much you can do other than risk randomly crashing. Lesson: never make an environment variable the only way for an application to configure an important setting at runtime. Your inconsiderate mandatory environment variable might not be a problem for your own code, but it will be a problem for unforeseen future code that needs to use your code. Mandatory environment variables have caused many problems in GNOME, some unfixable on platforms without library constructors, and some frankly ludicrous. (libproxy, you are not batting very well right now.) The Linux desktop is already stuck with many mandatory environment variables that we cannot get rid of; we do not need to keep adding more. Please, please, please: learn from our mistakes.

I have no clue how to fix glib-pacrunner. It’s already a separate D-Bus service just to allow it to use unsetenv() (which it does safely) to trick libproxy. I suppose fixing this would require spawning a second helper process in order to run setenv() before the proxy lookup. More likely, we’ll just leave it broken, since to my knowledge, nobody has reported a crash here yet.

Software is wonderful!

On Compiling WebKit (now twice as fast!)

Are you tired of waiting for ages to build large C++ projects like WebKit? Slow headers are generally the problem. Your C++ source code file #includes a few headers, all those headers #include more, and those headers #include more, and more, and more, and since it’s C++ a bunch of these headers contain lots of complex templates to slow down things even more. Not fun.

It turns out that much of the time spent building large C++ projects is effectively spent parsing the same headers again and again, over, and over, and over, and over, and over….

There are three possible solutions to this problem:

  • Shred your CPU and buy a new one that’s twice as fast.
  • Use C++ modules: import instead of #include. This will soon become the best solution, but it’s not standardized yet. For WebKit’s purposes, we can’t use it until it works the same in MSVCC, Clang, and three-year-old versions of GCC. So it’ll be quite a while before we’re able to take advantage of modules.
  • Use unified builds (sometimes called unity builds).

WebKit has adopted unified builds. This work was done by Keith Miller, from Apple. Thanks, Keith! (If you’ve built WebKit before, you’ll probably want to say that again: thanks, Keith!)

For a release build of WebKitGTK+, on my desktop, our build times used to look like this:

real 62m49.535s
user 407m56.558s
sys 62m17.166s

That was taken using WebKitGTK+ 2.17.90; build times with any 2.18 release would be similar. Now, with trunk (or WebKitGTK+ 2.20, which will be very similar), our build times look like this:

real 33m36.435s
user 214m9.971s
sys 29m55.811s

Twice as fast.

The approach is pretty simple: instead of telling the compiler to build the original C++ source code files that developers see, we instead tell the compiler to build unified source files that look like this:

// UnifiedSource1.cpp
#include "CSSValueKeywords.cpp"
#include "ColorData.cpp"
#include "HTMLElementFactory.cpp"
#include "HTMLEntityTable.cpp"
#include "JSANGLEInstancedArrays.cpp"
#include "JSAbortController.cpp"
#include "JSAbortSignal.cpp"
#include "JSAbstractWorker.cpp"

Since files are included only once per translation unit, we now have to parse the same headers only once for each unified source file, rather than for each individual original source file, and we get a dramatic build speedup. It’s pretty terrible, yet extremely effective.

Now, how many original C++ files should you #include in each unified source file? To get the fastest clean build time, you would want to #include all of your C++ source files in one, that way the compiler sees each header only once. (Meson can do this for you automatically!) But that causes two problems. First, you have to make sure none of the files throughout your entire codebase use conflicting variable names, since the static keyword and anonymous namespaces no longer work to restrict your definitions to a single file. That’s impractical in a large project like WebKit. Second, because there’s now only one file passed to the compiler, incremental builds now take as long as clean builds, which is not fun if you are a WebKit developer and actually need to make changes to it. Unifying more files together will always make incremental builds slower. After some experimentation, Apple determined that, for WebKit, the optimal number of files to include together is roughly eight. At this point, there’s not yet much negative impact on incremental builds, and past here there are diminishing returns in clean build improvement.

In WebKit’s implementation, the files to bundle together are computed automatically at build time using CMake black magic. Adding a new file to the build can change how the files are bundled together, potentially causing build errors in different files if there are symbol clashes. But this is usually easy to fix, because only files from the same directory are bundled together, so random unrelated files will never be built together. The bundles are always the same for everyone building the same version of WebKit, so you won’t see random build failures; only developers who are adding new files will ever have to deal with name conflicts.

To significantly reduce name conflicts, we now limit the scope of using statements. That is, stuff like this:

using namespace JavaScriptCore;
namespace WebCore {
//...
}

Has been changed to this:

namespace WebCore {
using namespace JavaScriptCore;
// ...
}

Some files need to be excluded due to unsolvable name clashes. For example, files that include X11 headers, which contain lots of unnamespaced symbols that conflict with WebCore symbols, don’t really have any chance. But only a few files should need to be excluded, so this does not have much impact on build time. We’ve also opted to not unify most of the GLib API layer, so that we can continue to use conventional GObject names in our implementation, but again, the impact of not unifying a few files is minimal.

We still have some room for further performance improvement, because some significant parts of the build are still not unified, including most of the WebKit layer on top. But I suspect developers who have to regularly build WebKit will already be quite pleased.

On Python Shebangs

So, how do you write a shebang for a Python program? Let’s first set aside the python2/python3 issue and focus on whether to use env. Which of the following is correct?

#!/usr/bin/env python
#!/usr/bin/python

The first option seems to work in all environments, but it is banned in popular distros like Fedora (and I believe also Debian, but I can’t find a reference for this). Using env in shebangs is dangerous because it can result in system packages using non-system versions of python. python is used in so many places throughout modern systems, it’s not hard to see how using #!/usr/bin/env in an important package could badly bork users’ operating systems if they install a custom version of python in /usr/local. Don’t do this.

The second option is broken too, because it doesn’t work in BSD environments. E.g. in FreeBSD, python is installed in /usr/local/bin. So FreeBSD contributors have been upstreaming patches to convert #!/usr/bin/python shebangs to #!/usr/bin/env python. Meanwhile, Fedora has begun automatically rewriting #!/usr/bin/env python to #!/usr/bin/python, but with a warning that this is temporary and that use of #!/usr/bin/env python will eventually become a fatal error causing package builds to fail.

So obviously there’s no way to write a shebang that will work for both major Linux distros and major BSDs. #!/usr/bin/env python seems to work today, but it’s subtly very dangerous. Lovely. I don’t even know what to recommend to upstream projects.

Next problem: python2 versus python3. By now, we should all be well-aware of PEP 394. PEP 394 says you should never write a shebang like this:

#!/usr/bin/env python
#!/usr/bin/python

unless your python script is compatible with both python2 and python3, because you don’t know what version you’re getting. Your python script is almost certainly not compatible with both python2 and python3 (and if you think it is, it’s probably somehow broken, because I doubt you regularly test it with both). Instead, you should write the shebang like this:

#!/usr/bin/env python2
#!/usr/bin/python2
#!/usr/bin/env python3
#!/usr/bin/python3

This works as long as you only care about Linux and BSDs. It doesn’t work on macOS, which provides /usr/bin/python and /usr/bin/python2.7, but still no /usr/bin/python2 symlink, even though it’s now been six years since PEP 394. It’s hard to understate how frustrating this is.

So let’s say you are WebKit, and need to write a python script that will be truly cross-platform. How do you do it? WebKit’s scripts are only needed (a) during the build process or (b) by developers, so we get a pass on the first problem: using /usr/bin/env should be OK, because the scripts should never be installed as part of the OS. Using #!/usr/bin/env python — which is actually what we currently do — is unacceptable, because our scripts are python2 and that’s broken on Arch, and some of our developers use that. Using #!/usr/bin/env python2 would be dead on arrival, because that doesn’t work on macOS. Seems like the option that works for everyone is #!/usr/bin/env python2.7. Then we just have to hope that the Python community sticks to its promise to never release a python2.8 (which seems likely).

…yay?