Best Practices for Build Options

Build options are sometimes tricky to get right. Here’s my take on best practices. The golden rule is to set good upstream defaults. Everything else follows from this.

Rule #1: Choose Good Upstream Defaults

Occasionally I see upstream developers complain that a downstream operating system has built their software “incorrectly,” generally because some important dependency or feature has been disabled. Sometimes downstreams really do mess up, but more often poor upstream defaults are to blame. Upstreams must set good defaults because upstream software developers know far more about their projects than downstream packagers do. Upstreams generally have a good idea of how they expect software to be built by downstreams, whereas downstreams generally do not. Accordingly, do the thinking upstream whenever possible. When you set good defaults, it becomes easier for downstreams to build your software the way you expect, because active effort is required for downstreams to mess things up.

For example, say a project has the following two build options:

Option Name Default Value
--enable-thing-you-usually-want-enabled false
--disable-thing-you-rarely-want-disabled true

The thing you usually want enabled is not enabled by default, and the thing you rarely want disabled is disabled by default. Sad. Unfortunately, this pattern used to be extremely common with Autotools build systems, because in the real world, the names of the options are more subtle than this, and also because nobody likes squinting at configure.ac files to audit whether the options make sense. Meson build systems tend to be somewhat better because meson_options.txt is separate from the rest of the build definitions, making it easier to review all your options and check to ensure their defaults are set appropriately. However, there are still a few more subtle ways you can mess up your Meson build system, which I’ll discuss below.

Rule #2: Prefer Upstream Defaults Downstream

Conversely, downstreams should not second-guess upstream defaults unless you have a good reason to do so and really know what you’re doing.

For example, glib-networking’s Meson build system provides you with two different TLS backend options: OpenSSL or GnuTLS. The GnuTLS backend is enabled by default (well, sort of, see the next section on auto dependencies) while the OpenSSL backend is disabled by default. There’s a good reason for this: the OpenSSL backend is half-baked, partially due to bugs in glib-networking, and partially because OpenSSL just cannot do certain things that GnuTLS can. The OpenSSL backend is provided because some environments really do require it for license reasons, but it’s not the right choice for general-purpose operating systems. It may be tempting to think that you can pick whichever library you prefer, but you really should not.

Another example: WebKitGTK’s CMake build system provides a USE_WPE_RENDERER build option, which is enabled by default. This option controls which graphics rendering stack is used: if enabled, rendering uses libwpe and wpebackend-fdo, whereas if disabled, rendering uses a legacy internal Wayland compositor. The option is provided because libwpe and wpebackend-fdo are newer dependencies that are expected to be unavailable on older (pre-2020) operating systems, so older operating systems legitimately need to be able to disable it. But this configuration receives little serious testing and the upstream developers do not notice when it breaks, so you really should not be using it unless you have to. This recently caused rendering bugs that appeared to be distribution-specific, which upstream developers were not willing to investigate because upstream developers could not reproduce the issue.

Sticking with upstream defaults is generally safest. Sometimes you really need to override them. If so, go ahead. Just be careful.

Rule #3: Handle Auto Dependencies and Features with Care

The worst default ever is “build with feature enabled only if dependency xyz is installed; otherwise, disable it.” This is called an auto dependency. If using CMake or Autotools, auto dependencies are almost never permissible, and in this case “handle with care” means repent and fix it. Auto dependencies are acceptable only if you are using the Meson build system.

The theory behind auto dependencies is that it’s convenient for people casually building the software to do so with the fewest number of build errors possible, which is true. Problem is, this screws over serious production builds of your software by requiring your downstreams to possess magical knowledge of what dependencies are required to build your software properly. Users generally expect most features to be enabled at build time, but if upstream uses auto dependencies, the result is a build dependencies lottery: your feature will be enabled or disabled due to luck, based on which downstream build dependencies transitively depend on which other build dependencies. Even if it’s built properly today, that could easily change tomorrow when some other dependency changes in some other package. Just say no. Do not expect downstreams to look at your build system at all, let alone study the possible build options and make accurate judgments about which build dependencies are required to enable them. Avoiding auto dependencies is part of setting good upstream defaults.

Look at this example from WebKit’s OptionsGTK.cmake:

if (ENABLE_SPELLCHECK)
    find_package(Enchant)
    if (NOT PC_ENCHANT_FOUND)
        message(FATAL_ERROR "Enchant is needed for ENABLE_SPELLCHECK")
    endif ()
endif ()

ENABLE_SPELLCHECK is ON by default. If you don’t have enchant installed, the build will fail unless you manually disable it by passing -DENABLE_SPELLCHECK=OFF". This makes it hard to mess up: downstreams have to make an intentional choice to build with spellchecking disabled. It cannot happen by accident.

Many projects would instead write it like this:

if (ENABLE_SPELLCHECK)
    find_package(Enchant)
    if (NOT PC_ENCHANT_FOUND)
        set(ENABLE_SPELLCHECK OFF)
    endif ()
endif ()

But this is an auto dependency, which results in downstream build dependency lottery. If you write your build system like this, you cannot complain when the feature winds up disabled by mistake in downstream builds. Don’t do this.

Exception: if you use Meson, auto dependencies are acceptable if you use the feature option type and set the default to auto. Although auto features are silently enabled or disabled by default depending on whether the required dependency is present, you can easily override this behavior for serious production builds by passing -Dauto_features=enabled, which enables all the auto features and will result in build failures if dependencies are missing. All major Linux operating systems do this when building Meson packages, so Meson’s auto features should not cause problems.

Rule #4: Be Very Careful with Meson’s Build Types

Let’s categorize software builds into production builds or non-production builds. A production build is intended to be either distributed to end users or else run production workloads, whereas a non-production build is intended for testing or development and might have extra debug features enabled, like slow assertions. (These are more commonly called release builds or debug builds, but that terminology would be very confusing in the context of this discussion, as you’re about to see.)

The CMake and Meson build systems give us more than just two build types. Compare CMake build types to the corresponding Meson build types:

CMake Build Type Meson Build Type Meson debug Option Production Build?  (excludes Windows)
Release release false Yes
Debug debug true No
RelWithDebInfo debugoptimized true Yes, be careful!
MinSizeRel minsize true Yes, be careful!
N/A plain false Yes

To simplify, let’s exclude Windows from the discussion for now. (We’ll come back to Windows in a bit.) Now, notice the nomenclature difference between CMake’s RelWithDebInfo (“release with debuginfo”) build type versus Meson’s debugoptimized build type. This build type functions exactly the same for both Meson and CMake, but CMake’s name is better because it clearly indicates that this is a release or production build type, whereas the Meson name seems to indicate it is a debug or non-production build type, and Meson’s debug option is set to true. In fact, it is an optimized production build with debuginfo enabled, the same style of build that almost all Linux operating systems use for their packages (although operating systems use the plain build type instead). The same problem exists for Meson’s minsize build type. This is another production build type where debug is true.

The Meson build type name accurately reflects that the debug option is enabled, but this is very confusing because for most platforms, that option only controls whether debuginfo is generated. Looking at the table above, you can see that you must never use the debug option alone to decide whether you have a production build or a non-production build. As the table indicates, the only non-production build type is the vanilla debug build type, which you can detect by checking the combination of the debug and optimization options. You have a non-production (debug) build if debug is true and if optimization is 0 or g; otherwise, you have a production build.  I wrote this in bold because it is important and not at all obvious. (However, before applying this rule in a cross-platform project, keep reading below to see the huge caveat regarding Windows.)

Here’s an example of what not to do in your meson.build:

# Use debug/optimization flags to determine whether to enable debug or disable
# cast checks
gtk_debug_cflags = []
debug = get_option('debug')
optimization = get_option('optimization')
if debug
  gtk_debug_cflags += '-DG_ENABLE_DEBUG'
  if optimization in ['0', 'g']
    gtk_debug_cflags += '-DG_ENABLE_CONSISTENCY_CHECKS'
  endif
elif optimization in ['2', '3', 's']
  gtk_debug_cflags += ['-DG_DISABLE_CAST_CHECKS', '-DG_DISABLE_ASSERT']
endif

This is from GTK’s meson.build. The code based only on the optimization option is OK, but the code that sets -DG_ENABLE_DEBUG is looking only at the debug option. What the code really wants to do is set G_ENABLE_DEBUG if this is a non-production build, but instead it is tied to debuginfo, which is not the desired result. Downstreams are forced to scratch their heads as to what they should do. Impassioned build engineers have held spirited debates about this particular meson.build snippet. Don’t do this! (I will submit a merge request to improve this.)

Here’s a much better, although still not perfect, example of how to do the same thing, this time from GLib’s meson.build:

# Use debug/optimization flags to determine whether to enable debug or disable
# cast checks
glib_debug_cflags = []
glib_debug = get_option('glib_debug')
if glib_debug.enabled() or (glib_debug.auto() and get_option('debug'))
  glib_debug_cflags += ['-DG_ENABLE_DEBUG']
  message('Enabling various debug infrastructure')
elif get_option('optimization') in ['2', '3', 's']
  glib_debug_cflags += ['-DG_DISABLE_CAST_CHECKS']
  message('Disabling cast checks')
endif

if not get_option('glib_assert')
  glib_debug_cflags += ['-DG_DISABLE_ASSERT']
  message('Disabling GLib asserts')
endif

if not get_option('glib_checks')
  glib_debug_cflags += ['-DG_DISABLE_CHECKS']
  message('Disabling GLib checks')
endif

Notice how GLib provides explicit build options that allow downstreams to decide whether debug should be enabled or not. Using explicit build options here was a good idea! The defaults for glib_assert and glib_checks are intentionally set to true to encourage their use in production builds, while G_DISABLE_CAST_CHECKS is based only on the optimization level. But sadly, if not explicitly configured, GLib sets the value of the glib_debug_cflags option automatically, based on only the value of the debug option. This is actually an OK use of an auto feature, because it is a carefully-considered attempt to provide good default behavior for downstreams, but it fails here because it assumes that debug means “non-production build,” which we have previously established cannot be determined without checking optimization as well. (I will submit a merge request to improve this.)

Here’s another helpful table that shows how the various build types correspond to CFLAGS:

CMake/Meson Build Type CMake CFLAGS Meson CFLAGS
Release/release -O3 -DNDEBUG -O3
Debug/debug -g -O0 -g
RelWithDebInfo/debugoptimized -O2 -g -DNDEBUG -O2 -g
MinSizeRel/minsize -Os -DNDEBUG -Os -g

Notice Meson’s minsize build type includes debuginfo, while CMake’s does not. Since debuginfo requires a huge amount of space, CMake’s behavior seems better here. We’ll discuss NDEBUG momentarily.

OK, so that all makes sense, right? Well I thought so too, until I ran a draft of this blog post past Jussi, who pointed out that the Meson build types function completely differently on Windows than they do on other platforms. Unfortunately, whereas on most platforms the debug option only controls debuginfo generation, on Windows it instead controls whether the C library enables extra runtime debugging checks. So while debugoptimized and minsize are production build types on Linux and have nice corresponding CMake build types, they are non-production build types on Windows. This is a Meson defect. The point to remember is that the debug option is completely different on Windows than it is on other platforms, so my otherwise-nice rule for detecting production builds does not work properly on Windows. Cross-platform projects need to be especially careful with the debug option. There are various ways this could be fixed in Meson in the future: a nice simple proposal would be to add a new debuginfo option separate from debug, then deprecate the debugoptimized build type and replace it with releasewithdebuginfo.

CMake dodges all these problems and avoids any ambiguity because its build types are named differently: “RelWithDebInfo” and “MinSizeRel” leave no doubt that you are dealing with a release (production) build.

Rule #5: Think about NDEBUG

The other behavior difference visible in the table above is that CMake defines NDEBUG for its production build types, whereas Meson has a separate option bn_debug that controls whether to define NDEBUG. NDEBUG controls whether the C and C++ assert() macro is enabled: if this value is defined, asserts are disabled. CMake is the only build system that defines NDEBUG for you automatically. You really need to think about this: if your software is performance-sensitive and contains slow assertions, the consequences of messing this up are severe, e.g. see this historical mesa bug where Fedora’s mesa package suffered a 10x slowdown because mesa upstream accidentally enabled assertions by default. Again, please, do not blame downstreams for bad upstream defaults: downstreams are (usually) not experts on upstream software, and cannot possibly be expected to pick better defaults than upstream’s.

Meson allows developers to explicitly choose whether to enable assertions in production builds. Assertions are enabled in production by default, the opposite of CMake’s behavior. Some developers prefer that all asserts be disabled in production builds to optimize speed as far as possible, but this is usually not the best choice: having assertions enabled in production provides valuable confidence that your code is actually functioning as intended, and often improves security by converting many code execution exploits into denial of service. Most assertions do not have noticeable performance impact, so I prefer to leave most assertions enabled by default in production, and disable only asserts that are slow. Hence, I like Meson’s default behavior. But many engineers disagree with me, and some projects really do need assertions disabled in production; in particular, everyone agrees that performance-sensitive assertions should not be running in production builds. If you’re using Meson and want assertions disabled in production builds, you’re in theory supposed to use b_ndebug=if-release, but it doesn’t actually work because it only disables assertions if your build type is release or plain, while leaving assertions enabled for debugoptimized and minsize builds. We’ve already established that these are both production build types, so sadly that behavior is broken. Instead, it’s better to manually define NDEBUG except in non-production builds. Again, you have a non-production (debug) build when debug is true and if optimization is 0 or g; otherwise, you have a production build (except on Windows).

Rule #6: plain Means “Production Build,” Not “No Flags”

The GNOME release team recently had an exciting debate about the meaning of Meson’s plain build type. It is impressive how build engineers can be so enthusiastic about build options!

I asked Jussi to explain the plain build type. His response was: “Meson does not, by itself, add compiler flags,” emphasis mine. It does not mean your project should not add its own compiler flags, and it certainly does not mean it’s OK to set bad defaults as long as they are vanilla-flavored. It is a production build type, and you should ensure that it receives defaults in line with the other production build types. You’ll be fine if you follow the same rule we already established: you have a non-production (debug) build if debug is true and if optimization is 0 or g; otherwise, you have a production build (except on Windows).

The plain build type exists because it makes it easier for downstreams to implement their own compiler flags. Downstreams have to pass -O2 -g via CFLAGS because CMake and Meson are the only build systems that can do this automatically, and it’s easier to let downstreams disable this functionality than to force downstreams to set different CFLAGS separately for each supported build system.

Rule #7: Don’t Forget Hardening Flags

Sadly, by default all build systems generate insecure, unhardened binaries that should never be used in production. This is true of Autotools, CMake, Meson, and likely also whatever other build system you are thinking of. You must manually add your own hardening flags or your builds will be insecure. Unfortunately this is a little complicated to do. Fedora and RHEL’s recommended compiler flags are documented here. The freedesktop-sdk and GNOME Flatpak runtimes use these recommendations as the basis for their compiler flags, and by default, so do Flatpak applications based on these runtimes. It’s actually not very easy to replicate the same hardening flags since libraries and executables require different flags, so naively setting CFLAGS is not possible. Fedora and RHEL use GCC spec files to achieve this, whereas freedesktop-sdk relies on building GCC itself with a non-default configuration (yes, second-guessing upstream defaults). The good news is that all major downstreams have figured this out one way or another, so you only need to worry about it if you’re doing your own production builds.

Conclusion

That’s all you need to know. Remember, upstreams know their software better than downstreams do, so the hard thinking should happen upstream. We can minimize mistakes and trouble if upstreams carefully set good defaults, and if downstreams deviate from those defaults only when truly necessary. Keep these rules in mind to avoid unnecessary bug reports from dissatisfied users.

History

I updated this blog post on August 3, 2022 to change the primary guidance to “You have a non-production (debug) build if debug is true and if optimization is 0 or g; otherwise, you have a production build.” Originally, I failed to consider g. -Og means “optimize debugging experience” and it is supposedly a better choice than -O0 for debugging according to gcc(1). It’s definitely not actually, but at least that’s the intent.

2 Replies to “Best Practices for Build Options”

  1. An undocumented hack that is sometimes used with CMake to get something similar to Meson’s “plain” build type is -DCMAKE_BUILD_TYPE=”none”. I guess “plain” or anything else that is not a valid build type would have the same effect. Though it is undocumented and can have ugly side effects with some projects that do not expect it.

  2. One thing that helps is if upstreams explain their options: If you have an –enable-vulkan option that is described with “enable Vulkan support”, that’s not very useful, but if it says “enable experimental Vulkan support”, that single adjective already helps a lot. One can even get more informative with things like “enable deprecated Vulkan support (slower)” or “enable alternative Vulkan support (misses some features)” here.
    Of course, that only works if those descriptions are kept up-to-date by the developers. If they aren’t, it’s worse than them not being there in the first place.

Comments are closed.