BuildStream progress and booting images

It’s been a while since my initial post about BuildStream and I’m happy to see that it has generated some discussion.

Since then, here at Codethink we’ve made quite a bit of progress, but we still have some road to travel before we can purport to solve all of the world’s build problems.

So here is a report on the progress we’ve made in various areas.

Infrastructure

Last time I blogged, project infrastructure was still not entirely sorted. Now that this is in place and will remain fixed for the foreseeable future, I’ll provide the more permanent links:

The corresponding links in the previous post have also been updated.

A note on GitLab

GitLab provides us with some irresistible features.

Aside from the Merge Request feature, which really does lower the barrier to contributing patches, the pre-merge CI pipelines ensure that the test cases run before any patch is accepted. These were deciding factors in keeping our git repository hosted on GitLab rather than creating a repo on git.gnome.org.

Another thing we get for free with GitLab’s pipelines is automatic publishing of the documentation generated from source whenever a commit lands on the master branch; this was all very easy to set up.
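
For illustration, here is a minimal sketch of the kind of GitLab Pages job that can publish generated documentation; the documentation build command and paths are assumptions rather than our actual configuration:

pages:
  stage: deploy
  script:
  # Build the HTML documentation (assumed command) and publish the
  # "public" directory, which is what GitLab Pages serves
  - make -C doc html
  - mv doc/build/html public
  artifacts:
    paths:
    - public
  only:
  - master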

User Experience

A significant portion of a software developer’s time is spent building and assembling software. Especially in tight debug and test loops, the seconds and the menial tasks which stand between an added printf() statement and a running test to reproduce some issue can make the difference between tooling which actually helps the user and tooling which just gets in the way of progress.

As such, we are paying attention to the user experience and have plans in place to ensure the most productive experience possible.

Here are some of the advancements made since my first post:

Presentation

Some of the elements we considered as important when viewing the output of a build include:

  • Separate and easy-to-find log files. Many build tools which use a serial build model leave you with one huge log file to parse in order to figure out what happened, which is rather unwieldy to read. On the other hand, tools which use a parallelized build model can leave you searching through log directories for the build log you are looking for.
  • Constant feedback of what is being processed. When your build appears to hang for 30 minutes while all of your cores are being hammered down by a WebKit build, it’s nice to have some indication that a WebKit build is in fact taking place.
  • Consideration of terminal width. It’s desirable, though not always possible, to avoid wrapping lines in the output of any command line interface.
  • Colorful and aligned output. When viewing a lot of terminal output, it helps to use some colors to assist the user in identifying some output they may be searching for. Likewise, alignment and formatting of text helps the user to parse more information with less frustration.

Here is a short video showing what the output currently looks like:

I’m particularly happy about how the status bar remains at the bottom of the terminal output while the regular rolling log continues above. While the status bar tells us what is going on right now, the rolling log above provides detail about what tasks are being launched, how long they took to complete and in what log files you can find the detailed build log.

Note that colors and status lines are automatically disabled when BuildStream is not connected to a tty. Interactive mode is also automatically disabled in that case. However, using the bst --log-file /path/to/build.log … option will allow you to preserve the master build log of the entire session, and it also works in interactive mode.
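
For example, a hedged sketch of such an invocation (the element name here is purely illustrative):

bst --log-file ~/logs/session.log build gnome/gnome-system-image.bst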

Job Control

Advancements have also been made in the scheduler and how child tasks are managed.

When CTRL-C is pressed in interactive mode, all ongoing tasks are suspended and the user is presented with some choices:

  • continue – Carries on processing and queuing jobs
  • quit – Carries on with ongoing jobs but stops queuing new jobs
  • terminate – Terminates any ongoing jobs and exits immediately

Similarly, if an ongoing build fails in interactive mode, all ongoing tasks will be suspended while the user has the same choices, and an additional choice to debug the failing build in a shell.

Unfortunately, continuing with a “repaired” build is not possible at this time in the same way as it is with JHBuild. One day, however, it should be possible in some developer mode where the user accepts that anything built from that point on can only be used locally (any generated artifacts would be tainted as they don’t really correspond to their deterministic cache keys; those artifacts should be rebuilt with a fix to the input .bst file before they can be shared with peers).

New Element Plugins

For those who have not been following closely, BuildStream is a system for the modeling and running of build pipelines. While it is fully intended for building software and for decoupling the build problem from the distribution problem, from a more abstract perspective it can be said that BuildStream provides an environment for modeling pipelines, which consist of elements that perform mutations on filesystem data.

The full list of Element and Source plugins currently implemented in BuildStream can be found on the front page of the documentation.

As a part of my efforts to fully reproduce and provide a migration path for Baserock’s declarative definitions, some interesting new plugins were required.

meson

The meson element is a BuildElement for building modules which use meson as their build system.

Thanks goes to Patrick Griffis for filing a patch and adding this to BuildStream.
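
A hedged sketch of what a meson element might look like; the filenames and source URL are illustrative, not taken from an actual project:

kind: meson
description: An example module which builds with meson

depends:
- filename: base.bst
  type: build

sources:
- kind: git
  url: https://example.com/example-module.git
  track: master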

compose

The compose plugin creates a composition of its own build dependencies. That is to say, its direct dependencies are not transitive, and depending on a compose element can only pull in the output artifact of the compose element itself, none of its dependencies (a brief explanation of build and runtime dependencies can be found here).

Basically this is just a way to collect the output of various dependencies and compress it into a single artifact, with some additional options.

For the purpose of categorizing the output of a set of dependencies, we have also introduced the split-rules public data, which can be read off of the dependencies of a given element. The default split-rules are defined in BuildStream’s default project configuration, which can be overridden on a per-project and also on a per-element basis.

The compose element makes use of this public data in order to provide a more versatile composition, which is to say that it’s possible to create an artifact composition of all of the files which are captured by a given domain declared in your split-rules, for instance all of the files related to internationalization, or the debugging symbols.

Example:

kind: compose
description: Initramfs composition
depends:
- filename: gnu-toolchain.bst
  type: build
- filename: initramfs/initramfs-scripts.bst
  type: build

config:
  # Include only the minimum files for the runtime
  include:
  - runtime

The above example takes the gnu-toolchain.bst stack, which basically includes a base runtime with busybox, and adds some scripts to it. In this case the initramfs-scripts.bst element just imports the init and shutdown scripts required for the simplest of initramfs variations. The output is integrated, which is to say that things like ldconfig have been run and their output has been collected in the output artifact. Further, any documentation, localization, debugging symbols etc. have been excluded from the composition.
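
As a variation, here is a hedged sketch of composing only the files captured by a couple of split-rules domains; the i18n domain name is illustrative and depends on the split-rules your project actually declares:

kind: compose
description: Runtime files plus translations only
depends:
- filename: gnu-toolchain.bst
  type: build

config:
  # Compose only the files captured by these split-rules domains
  include:
  - runtime
  - i18n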

script

The script element is a simple but powerful element allowing one to stage more than one set of dependencies into the sandbox in different places.

One set of dependencies is used to stage the base runtime for the sandbox, and the other is used to stage the input which one intends to mutate in some way to produce output, to be collected in the regular /buildstream/install location.

Example:

kind: script
description: The compressed initramfs
depends:
- filename: initramfs/initramfs.bst
  type: build
- filename: foundation.bst
  type: build

config:
  base: foundation.bst
  input: initramfs/initramfs.bst

  commands:
  - mkdir -p %{install-root}/boot
  - (find . -print0 | cpio -0 -H newc -o) |
    gzip -c > %{install-root}/boot/initramfs.gz

This example element will take the foundation.bst stack element (which in this context, is just a base runtime with your regular shell tools available) and stage that at the root of the sandbox, providing the few tools and runtime we want to use. Then, still following the same initramfs example as above, the integrated composition element initramfs/initramfs.bst will be staged as input in the /buildstream/input directory of the build sandbox.

The script commands then simply use the provided base tools to create a gzipped cpio archive inside the /buildstream/install directory, which will be collected as the artifact produced by this script.

A bootable system

Another thing we’ve been doing since last we touched base is providing a migration path for Baserock users to use BuildStream.

This is a particularly interesting case for BuildStream because Baserock systems provide metadata to build a bootable system from the ground up, from a libc and compiler bootstrapping phase all the way up to the creation and deployment of a bootable image.

In this way we cover a lot of ground and can now demonstrate that bootstrapping, building and deploying a bootable image as a result is all possible using BuildStream.

The bootstrap

One of the more interesting parts is that the bootstrap remains almost unchanged, except for the key ingredient which is that we never allow any host tools to be present in the build sandbox.

The working theory is that whenever you bootstrap, you bootstrap from some tools. If you were ever able to obtain these tools in binary form installed on your computer, then it should also be possible to obtain them in the form of a chrootable sysroot (or “SDK”).

Anyone who has had a hand in maintaining a tree of build instructions which include a bootstrap phase from host tooling to first get off the ground (like buildroot or yocto) will have lived through the burden of vetting new distros as they roll out and patching builds so as to work “on the latest debian” or whatnot. This whole maintenance aspect is simply dropped from the equation by ensuring that host tools are not a variable in the equation but rather a constant.

Assembling the image

When it comes time to assemble an image to boot with, there are various options and it should not be such a big deal, right ? Well, unfortunately it’s not quite that simple.

It turns out that even in 2017, the options we have for assembling a bootable file system image as a regular unprivileged user are still quite limited.

Short of building qemu and using some virtualization, I’ve found that the only straightforward method of installing a boot loader is with syslinux on a vfat filesystem. There are some tools around for manipulating ext2 filesystems in user space, but these are largely unneeded anyway: static device nodes and assigning file ownership to arbitrary uids/gids are mostly unnecessary when using modern init systems. In any case, recent versions of e2fsprogs provide an option for populating the filesystem at creation time.

Partitioning an image for your file systems is also possible as a regular user, but populating those partitions is a game of splicing filesystem images into their respective partition locations.

I am hopeful however that with some virtualization performed entirely inside the build sandbox, we can achieve a much better outcome using libguestfs. I’m not altogether clear on how supermin and libguestfs come together but from what I understand, this technology will allow us to mount any linux supported filesystem in userspace, and quite possibly without even having (or using) the supporting filesystem drivers in your host kernel.

That said, for now we settle for the poor man’s basic tooling and live with the restriction of having our boot partition be a vfat partition. The image can be created using the script element described above.

Example:

kind: script
description: Create a deployment of the GNOME system
depends:
- filename: gnome/gnome-system.bst
  type: build
- filename: deploy-base.bst
  type: build

variables:
  # Size of the disk to create
  #
  # Should be able to calculate this based on the space
  # used, however it must be a multiple of (63 * 512) bytes
  # as mtools wants a size that is divisible by sectors (512 bytes)
  # per track (63).
  boot-size: 252000K

  rootfs-size: 4G
  swap-size: 1G
  sector-size: 512

config:
  base: deploy-base.bst
  input: gnome/gnome-system.bst

  commands:

  - |
    # Split up the boot directory and the other
    #
    # This should be changed so that the /boot directory
    # is created separately.

    cd /buildstream
    mkdir -p /buildstream/sda1
    mkdir -p /buildstream/sda2

    mv %{build-root}/boot/* /buildstream/sda1
    mv %{build-root}/* /buildstream/sda2

  - |
    # Generate an fstab
    cat > /buildstream/sda2/etc/fstab << EOF
    /dev/sda2 / ext4 defaults,rw,noatime 0 1
    /dev/sda1 /boot vfat defaults 0 2
    /dev/sda3 none swap defaults 0 0
    EOF

  - |
    # Create the syslinux config
    mkdir -p /buildstream/sda1/syslinux
    cat > /buildstream/sda1/syslinux/syslinux.cfg << EOF
    PROMPT 0
    TIMEOUT 5

    ALLOWOPTIONS 1
    SERIAL 0 115200

    DEFAULT boot
    LABEL boot

    KERNEL /vmlinuz
    INITRD /initramfs.gz

    APPEND root=/dev/sda2 rootfstype=ext4 init=/sbin/init
    EOF

  - |
    # Create the vfat image
    truncate -s %{boot-size} /buildstream/sda1.img
    mkdosfs /buildstream/sda1.img

  - |
    # Copy all that stuff into the image
    mcopy -D s -i /buildstream/sda1.img -s /buildstream/sda1/* ::/

  - |
    # Install the bootloader on the image, it will load the
    # config file from inside the vfat boot partition
    syslinux --directory /syslinux/ /buildstream/sda1.img

  - |
    # Now create the root filesys on sda2
    truncate -s %{rootfs-size} /buildstream/sda2.img
    mkfs.ext4 -F -i 8192 /buildstream/sda2.img \
              -L root -d /buildstream/sda2

  - |
    # Create swap
    truncate -s %{swap-size} /buildstream/sda3.img
    mkswap -L swap /buildstream/sda3.img

  - |

    ########################################
    #        Partition the disk            #
    ########################################

    # First get the size in bytes
    sda1size=$(stat --printf="%s" /buildstream/sda1.img)
    sda2size=$(stat --printf="%s" /buildstream/sda2.img)
    sda3size=$(stat --printf="%s" /buildstream/sda3.img)

    # Now convert to sectors
    sda1sec=$(( ${sda1size} / %{sector-size} ))
    sda2sec=$(( ${sda2size} / %{sector-size} ))
    sda3sec=$(( ${sda3size} / %{sector-size} ))

    # Now get the offsets in sectors, first sector reserved
    # for MBR partition table
    sda1offset=1
    sda2offset=$(( ${sda1offset} + ${sda1sec} ))
    sda3offset=$(( ${sda2offset} + ${sda2sec} ))

    # Get total disk size in sectors and bytes
    sdasectors=$(( ${sda3offset} + ${sda3sec} ))
    sdabytes=$(( ${sdasectors} * %{sector-size} ))

    # Create the main disk and do the partitioning
    truncate -s ${sdabytes} /buildstream/sda.img
    parted -s /buildstream/sda.img mklabel msdos
    parted -s /buildstream/sda.img unit s mkpart primary fat32 \
       ${sda1offset} $(( ${sda1offset} + ${sda1sec} - 1 ))
    parted -s /buildstream/sda.img unit s mkpart primary ext2 \
       ${sda2offset} $(( ${sda2offset} + ${sda2sec} - 1 ))
    parted -s /buildstream/sda.img unit s mkpart primary \
       linux-swap \
       ${sda3offset} $(( ${sda3offset} + ${sda3sec} - 1 ))

    # Make partition 1 the boot partition
    parted -s /buildstream/sda.img set 1 boot on

    # Now splice the existing filesystems directly into the image
    dd if=/buildstream/sda1.img of=/buildstream/sda.img \
      ibs=%{sector-size} obs=%{sector-size} conv=notrunc \
      count=${sda1sec} seek=${sda1offset} 

    dd if=/buildstream/sda2.img of=/buildstream/sda.img \
      ibs=%{sector-size} obs=%{sector-size} conv=notrunc \
      count=${sda2sec} seek=${sda2offset} 

    dd if=/buildstream/sda3.img of=/buildstream/sda.img \
      ibs=%{sector-size} obs=%{sector-size} conv=notrunc \
      count=${sda3sec} seek=${sda3offset} 

  - |
    # Move the image where it will be collected
    mv /buildstream/sda.img %{install-root}
    chmod 0644 %{install-root}/sda.img

As you can see, the script element is a bit too verbose for this type of task. Following the pattern we have in place for the various build elements, we will soon be creating a reusable element with some simpler parameters (filesystem types, image sizes, swap size, partition table type, etc.) for the purpose of whipping together bootable images.
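
Purely as an illustration of the direction we have in mind, a declaration for such an element might look something like the following; this element does not exist yet and every option name shown here is invented:

kind: image
description: Bootable disk image of the GNOME system
depends:
- filename: gnome/gnome-system.bst
  type: build

config:
  # Hypothetical parameters for a future image element
  partition-table: msdos
  boot-filesystem: vfat
  root-filesystem: ext4
  boot-size: 252000K
  rootfs-size: 4G
  swap-size: 1G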

A booting demo

So for those who want to try this at home, we’ve prepared a complete system which can be built in the build-gnome branch of the buildstream-tests repository.

BuildStream now requires Python 3.4 instead of 3.5, so this should hopefully be repeatable on most stable distros, e.g. Debian Jessie ships 3.4 (and also has the required ostree and bubblewrap available in the jessie-backports repository).

Here are some instructions to get you off the ground:

mkdir work
cd work

# Clone related repositories
git clone git@gitlab.com:BuildStream/buildstream.git
git clone git@gitlab.com:BuildStream/buildstream-tests.git

# Checkout build-gnome branch
cd buildstream-tests
git checkout build-gnome
cd ..

# Make sure you have ostree and bubblewrap provided by your distro
# first, you will also need pygobject for python 3.4

# Install BuildStream as local user, does not require root
# If this fails, it's because you lack some required dependency.
cd buildstream
pip install --user -e .
cd ..

# If you've gotten this far, then the following should also succeed
# after many hours of building.
cd buildstream-tests
bst build gnome/gnome-system-image.bst

# Once the above completes, there is an image which can be
# checked out from the artifact cache.
#
# The following command will create ~/VM/sda.img
#
bst checkout gnome/gnome-system-image.bst ~/VM/

# Now you can use your favorite VM to boot the image, e.g.:
qemu-system-x86_64 -m size=1024 ~/VM/sda.img

# GDM is currently disabled in this build, once the VM boots
# you can login as root (no password) and in that VM you can run:
systemctl start gdm

# And the above will bring up gdm and start the regular
# gnome-initial-setup tool.

With SSD storage and a powerful quad core CPU, this build completes in less than 5 hours (and pretty much makes full use of your machine’s resources all along the way). All told, the build will take around 40GB of disk space to build and store the result of around 500 modules. I would advise having at least 50GB of free space for this though, especially to account for some headroom in the final step.

Note: This is not an up to date GNOME system based on current modulesets yet, but rather a clone/conversion of the system I tried integrating last year using YBD. I will soon be starting on creating a more modular repository which builds only the components relevant to GNOME and follows the releases, for that I will need to open some dialog and sort out some of the logistics.

Note on modularity

The mentioned buildstream-tests repository is one huge repository with build metadata to build everything from the compiler up to a desktop environment and some applications.

This is not what we ultimately want because, first off, it’s obviously a huge mess to maintain and you don’t want your project to be littered with build metadata that you’re not going to use (which is what happens when forking projects like buildroot). Secondly, even when you are concerned with building an entire operating system from scratch, we have found that without modularity, changes introduced in the lower levels of the stack tend to be pushed onto the stacks which consume those modules. This introduces much friction in the development and integration process for such projects.

Instead, we will eventually be using recursive pipeline elements to allow modular BuildStream projects to depend on one another in such a way that consuming projects can always decide what version of a project they depend on will be used.

 

Introducing BuildStream

Greetings fellow Gnomies :)

At Codethink over the past few months we’ve been revisiting our approach to assembly of whole-stack systems, particularly for embedded Linux systems and custom GNOME based systems.

We’ve taken inspiration, lessons and use-cases from various projects including OBS, Reproducible Builds, Yocto, Baserock, buildroot, Aboriginal, GNOME Continuous, JHBuild, Flatpak Builder and Android repo.

The result of our latest work is a new project, BuildStream, which aims initially to satisfy clear requirements from GNOME and Baserock, and grow from there. BuildStream uses some key GNOME plumbing (OSTree, bubblewrap) combined with declarative build-recipe description to provide sandboxed, repeatable builds of GNOME projects, while maintaining the flexibility and speed required by GNOME developers.

But before talking about BuildStream, let’s go over what this can mean for GNOME in 2017.

Centralization of build metadata

Currently we build GNOME in various ways, including JHBuild XML, Flatpak JSON for the GNOME Runtime and SDK, and GNOME Continuous JSON for CI.

We hope to centralize all of this so that the GNOME release team need only maintain one single set of core module metadata in one repository in the same declarative YAML format.

To this end, we will soon be maintaining a side branch of the GNOME release modulesets so people can try this out early.

GNOME Developer Experience

JHBuild was certainly a huge improvement over the absolutely nothing that we had in place before it, but is generally unreliable due to its reliance on host tooling and dependencies.

  • Newcomers can have a hard time getting off the ground and making sure they have satisfied the system dependencies.
  • Builds are not easily repeatable; you cannot easily build GNOME 3.6 today with a modern set of dependencies.
  • Not easy to test core GNOME components like gnome-session or the gnome-initial-setup tool.

BuildStream nips these problems in the bud with an entirely no-host-tooling policy; in fact, you can potentially build all of GNOME on your computer without ever installing gcc. Instead, GNOME will be built on top of a deterministic runtime environment which closely resembles the freedesktop-sdk-images Flatpak runtime but will also include the minimal requirements for booting the results in a VM.

Building in the Swarm

BuildStream supports artifact cache sharing so that authenticated users may upload successful build results to share with their peers. I doubt that we’ll want to share all artifacts between random users, but having GNOME Continuous upload to a common artifact cache will alleviate the pain of webkit rebuilds (unless you are hacking on webkit of course).

Flatpak / Flathub support

BuildStream will also be available as an alternative to flatpak-builder.

We will be providing an easy migration path and conversion script for Flatpak JSON which should be good enough for most if not all Flatpak app builds.

As the Flathub project develops, we will also work towards enabling submission of BuildStream metadata as an alternative to the Flatpak Builder JSON.

About BuildStream

Unlike many existing build systems, BuildStream treats the problem of building and distribution as separate problem spaces. Once you have built a stack in BuildStream it should be trivial enough to deploy it as rpms, debian packages, a tarball/ostree SDK sysroot, as a flatpak, or as a bootable filesystem image which can be flashed to hardware or booted in a VM.

Our view is that build instructions as structured metadata used to describe modules and how they integrate together is a valuable body of work on its own. As such we should be able to apply that same body of work reliably to a variety of tasks – the BuildStream approach aims to prove this view while also offering a clean and simple user experience.

BuildStream is written in Python 3, has fairly good test coverage at this stage and is quite well documented.

BuildStream works well right now but still lacks some important features. Expect some churn over the following months before we reach a stable release and become a viable alternative for developing GNOME on your laptop/desktop.

Dependencies

Note that for the time being the OSTree requirement may be too recent for many users running currently stable distros (e.g. Debian Jessie). This is because we use the OSTree gobject introspection bindings, which require a version from August 2016. Due to this hard requirement it made little sense to include special case support for older Python versions.

However with that said; if this transitional period is too painful, we may decide to lower the Python requirement and just use the OSTree command line interface instead.

Build Pipelines

The BuildStream design in a nutshell is to have one abstract core, which provides the mechanics for sandboxing build environments (currently using bubblewrap as our default sandbox), interpreting the YAML data model and caching/sharing the build results in an artifact cache (implemented with OSTree), together with an ecosystem of “Element” plugins which process filesystem data as inputs and outputs.

In a very abstract view, one can say that BuildStream is like GStreamer but its extensible set of element plugins operate on filesystem data instead of audio and video buffers.

This should allow for a virtually unlimited variety of pipelines, here are some sketches which attempt to illustrate the kinds of tasks we expect to accomplish using BuildStream.

  • Import a custom vendor tarball, build an updated graphics stack and BSP on top of that, and use a custom export element to deploy the build results as RPMs.
  • Import the base runtime ostree repository generated with Yocto, build the modules for the freedesktop-sdk-images repository on top of that runtime, and then deploy both Runtime and SDK from that base, while filtering out the irrelevant SDK specific bits from the Runtime deployment.
  • Import an arbitrary but deterministic SDK (not your host !) to bootstrap a compiler, C runtime and linux kernel, and deploy a bootable filesystem image.
  • Build pipelines are modular and can be built recursively, so a separate project/pipeline can consume the same base system we just built and extend it with a graphics stack.
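
To give a flavor of the data model, here is a hedged sketch of the kind of import element which might sit at the base of such a pipeline; the URL is illustrative:

kind: import
description: Import a prebuilt vendor sysroot
sources:
- kind: tar
  url: https://example.com/releases/vendor-sysroot.tar.gz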

A working demo

What follows are some instructions to try out BuildStream in its early stages.

For this demo we chose to build a popular application (gedit) in the flatpak style, however this does not yet include an ostree export or generation of the metadata files which flatpak requires; the built gedit result cannot be run with flatpak without those steps but can be run in a `build-stream shell` environment.

# Installing BuildStream

# Before installing BuildStream you will need to first install
# Python >= 3.5, bubblewrap and OSTree >= v2016.8 as stated above.

# Create some arbitrary directory, don't use ~/buildstream because
# that's currently used by buildstream unless you override the 
# configuration.
mkdir ~/testing
cd ~/testing
git clone https://gitlab.com/BuildStream/buildstream

# There are a handful of ways to install a python setuptools
# package, we recommend for developer checkouts that you first
# install pip, and run the following command.
#
# This should install build-stream and its pythonic dependencies
# into your users local python environment without touching any
# system directories:
cd buildstream
pip install --user -e .

# Clone the demo project repository
cd ..
git clone https://gitlab.com/BuildStream/buildstream-tests
cd buildstream-tests

# Take a peek of the gedit.bst pipeline state (optional)
#
# This will tell us about all the dependencies in the pipeline,
# what their cache keys are and their local state (whether they
# are already cached or need to be built, or are waiting for a
# dependency to be built first).
build-stream show --deps all gedit.bst

# Now build gedit on top of a GNOME Platform & Sdk
build-stream build gedit.bst

#
# This will take some time and quite some disk space, building
# on SSD is highly recommended.
#
# Once the artifact cache sharing features are in place then this
# will take half the disk space it currently takes, in the majority
# of cases where BuildStream already has an artifact for the
# GNOME Platform and SDK bases.
#

# Ok, the build may have taken some time but I'm pretty sure it
# succeeded.
#
# Now we can launch a sandbox shell in an environment with the
# built gedit:
build-stream shell --scope run gedit.bst

# And launch gedit. Use the --standalone option to be sure we are
# running the gedit we just built, not a new window in the gedit
# installed on your host
gedit --standalone

Getting Involved

As you can see we’re currently hosted from my user account on GitLab, so our next step is to sort out proper hosting for the project, including a mailing list, bug tracking and a place to publish our documentation.

For right now, the best place to reach us and talk about BuildStream is in the #buildstream channel on GNOME IRC.

If you’d like to play around with the source, a quick read into the HACKING file will provide some orientation for getting started, coding conventions, building documentation and running tests.

 

With that, I hope you’ve all enjoyed FOSDEM and the beer that it entails :)

Software Build Topologies

In recent months, I’ve found myself discussing the pros and cons of different approaches used for building complete operating systems (desktops or appliances), or let’s say, software build topologies. What I’ve found is that I frequently lack the vocabulary to categorize existing build topologies or describe some common characteristics of build systems, and the decisions and tradeoffs which various projects have made. This is mostly just a curiosity piece; a writeup of some of my observations on different build topologies.

Self Hosting Build Topologies

Broadly, one could say that the vast majority of build systems use one form or another of self hosting build topology. We use this term to describe tools which build themselves; Wikipedia says that self hosting is:

the use of a computer program as part of the toolchain or operating system that produces new versions of that same program

While this term does not accurately describe a category of build topology, I’ve been using it loosely to describe build systems which use software installed on the host to build the source for that same host; it’s a pretty good fit.

Within this category, I can observe two separate topologies in use; let’s call these the Mirror Build and the Sequential Build, for lack of any existing terminology I can find.

The Mirror Build

This topology is one where the system has already been built once, either on your computer or another one. This build process treats the bootstrapping of an operating system as an ugly and painful process for the experts, only to be repeated when porting the operating system to a new architecture.

The basic principle here is that once you have an entire system that is already built, you can use that entire system to build a complete new set of packages for the next version of that system. Thus the next version is a sort of reflection of the previous version.

One of the negative results of this approach is that circular dependencies tend to crop up unnoticed, since you already have a complete set of the last version of everything. For example: it’s easy enough to have perl require autotools to build, even though you needed perl to build autotools in the first place. This doesn’t matter because you already have both installed on the host.

Of course circular dependencies become a problem when you need to bootstrap a system like this for a new architecture, and so you end up with projects like this one, specifically tracking down cropped up circular dependencies to ensure that a build from scratch actually remains possible.

One common characteristic of build systems which are based on the Mirror Build is that they are usually largely non-deterministic. Usually, whatever tools and library versions happen to be lying around on the system can be used to build a new version of a given module, so long as each dependency of that module is satisfied. A dependency here is usually quite loosely specified as a lower minimal bound dependency: the oldest version of foo which can possibly be used to build or link against, will suffice to build bar.

This Mirror Build is historically the most popular, born of the desire to allow the end user to pick up some set of sources and compile the latest version, while encountering the least resistance to do so.

While the famous RPM and Debian build systems have their roots in this build topology, it’s worth noting that the surrounding tooling has since evolved to build RPMs or Debian packages under a different topology. For instance, when using OBS to build RPMs or Debian packages, each package is built in sequence, staging into a minimal VM only the dependencies which the next package needs from previous builds. Since we are bootstrapping often and isolating the environment for each build in sequence from a predefined manifest of specifically versioned packages, it is much more deterministic and becomes a Sequential Build instead.

The Sequential Build

The Sequential Build, again for the lack of any predefined terminology I can find, is one where the entire OS can be built from scratch. Again and again.

The LFS build, without any backing build system per se, I think is a prime example of this topology.

This build can still be said to be self hosting; indeed, one previously built package is used to build the next package in sequence. Aside from the necessary toolchain bootstrapping, the build host where all the tools are executed is also the target where all software is intended to run. The distinction I make here is that only packages (and those package versions) which are part of the resulting OS are ever used to build that same OS, so a strict order must be enforced, and in some cases the same package needs to be built more than once to achieve the end result; however, determinism is favored.

It’s also noteworthy that this common property, where host = target, is what is generally expected by most project build scripts, while cross compiles (more on that below) typically have to struggle and force things to build in some contorted way.

While the Ports, Portage, and Pacman build systems, which encourage the build to occur on your own machine, seem to lend themselves better to the Sequential Build, this only seems to be true at bootstrap time (I would need to look more closely into these systems to say more). Also, these systems are not without their own set of problems. With Gentoo’s Portage, one can also fall into circular dependency traps where one needs to build a given package twice while tweaking the USE flags along the way. Also with Portage, package dependencies are not strictly defined but again loosely defined as lower minimal bound dependencies.

I would say that a Sequential Self Hosted Build lends itself better to determinism and repeatability, but a build topology which is sequential is not inherently deterministic.

Cross Compiles

The basic concept of Cross Compiling is simple: Use a compiler that runs on host and outputs binary to be later run on target.

But the activity of cross compiling an entire OS is much more complex than just running a compiler on your host and producing binary output for a target.
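
For example, a hedged sketch of the basic operation; the triple-prefixed compiler name is just a common convention and will vary with your toolchain:

# Runs on the build host, emits an ARM binary for the target
arm-linux-gnueabihf-gcc -o hello hello.c

# Confirm the result is a foreign-architecture binary
file hello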

Direct Cross Build

It is theoretically possible to compile everything for the target using only host tools and a host installed cross compiler, however I have yet to encounter any build system which uses such a topology.

This is probably primarily because it would require that many host installed tools be sysroot aware beyond just the cross compiler. Hence we resort to a Multi Stage Cross Build.

Multi Stage Cross Build

This Multi Stage Cross Build, which can be observed in projects such as Buildroot and Yocto shares some common ground with the Sequential Self Hosted Build topology, except that the build is run in multiple stages.

In the first stage, all the tools which might be invoked during the cross build are built into a sysroot prefix of host-runnable tooling. This is where you will find your host -> target cross compiler along with autotools, pkg-config, flex, bison, and basically every tool you may need to run on your host during the build. These tools installed in your host tooling sysroot are specially configured so that when they are run they find their comrades in the same sysroot, but look for other payload assets (like shared libraries) in the eventual target sysroot.

Only after this stage, which may have involved patching some tooling to make it behave well for the next stage, do we really start cross compiling.

In the second stage we use only tools built into the toolchain’s sysroot to build the target. Starting by cross compiling a C library and a native compiler for your target architecture.

Aside from this defining property, that a cross compile is normally done in separate stages, there is the detail that pretty much everything under the sun besides the toolchain itself (which must always support bootstrapping and cross compiling) needs to be coerced into cooperation with added obscure environment variables, or sometimes beaten into submission with patches.
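
To make that concrete, here is a hedged sketch of the kind of environment coercion involved in the second stage; the paths and the target triple are illustrative, and individual modules often need more persuasion than this:

# Illustrative only: point the build at the cross toolchain and target sysroot
export SYSROOT=/opt/toolchain/arm-linux-gnueabihf/sysroot
export PATH=/opt/toolchain/bin:$PATH
export PKG_CONFIG_SYSROOT_DIR=$SYSROOT
export PKG_CONFIG_LIBDIR=$SYSROOT/usr/lib/pkgconfig

# Tell autotools we are cross compiling; many packages need extra flags or patches
./configure --host=arm-linux-gnueabihf --prefix=/usr
make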

Virtual Cross Build

While a cross compile will always be required for the base toolchain, I am hopeful that with modern emulation, tools like Scratchbox 2 and approaches such as Aboriginal Linux; we can ultimately abolish the Multi Stage Cross Build topology entirely from existence. The added work involved in maintaining build scripts which are cross build aware and constant friction with downstream communities which insist on cross building upstream software is just not worth the effort when a self hosting build can be run in a virtual environment.

Some experimentation already exists: the Mer Project was successful in running OBS builds inside a Scratchbox 2 environment to cross compile RPMs without having to deal with the warts of traditional cross compiles. I also did some experimentation this year building the GNU/Linux/GNOME stack with Aboriginal Linux.

This kind of virtual cross compile does not constitute a unique build topology since it in fact uses one of the Self Hosting topologies inside a virtual environment to produce a result for a new architecture.

Finally

In closing, there are certainly a great variety of build systems out there, all of which have made different design choices and share common properties. Not much vocabulary exists to describe these characteristics. This suggests that the area of building software remains somewhat unexplored, and that the tooling we use for such tasks is largely born of necessity, barely holding together with lots of applied duct tape. With interesting new developments for distributing software such as Flatpak, and studies into how to build software reliably and deterministically, such as the reproducible builds project, hopefully we can expect some improvements in this area.

I hope you’ve enjoyed my miscellaneous ramblings of the day.

Flatpak builds available on a variety of architectures

Following the recent work we’ve been doing at Codethink in cooperation with Endless, we have had the capability of building flatpak SDKs and apps for ARM architectures for a while now, and consequently also for 32bit Intel architectures.

Alex has been tying this together and setting up the Intel build machines and as of this week, flatpak builds are available at sdk.gnome.org in a variety of arches and flavors.

Arches

The supported architectures are as follows:

  • x86_64, the 64bit Intel architecture which is the only one we’ve been building until now
  • i386, this is the name we are using for 32bit Intel, this is only i386 in name but the builds are in fact tuned for the i586 instruction set
  • aarch64, speaks for itself, this is the 64bit ARM architecture
  • arm, like i386, this is a generic name chosen to indicate 32bit arm, this build is tuned for ARMv7-A processors and will make use of modern features such as vfpv3 and the neon simd. In other words, this will not run on older ARM architectures but should run well on modern ARM processors such as the Cortex-A7 featured in the Raspberry Pi 2.

Build Bots

The build bots are currently driven with this set of build scripts, which should be able to turn an Intel or ARM machine with a vanilla Ubuntu 16.04 or RHEL 7 installation into a flatpak build machine.

ARM and Intel builds run on a few distributed build machines and are then propagated to sdk.gnome.org for distribution.

The build machines also push notifications of build status to IRC, currently we have it setup so that only failed builds are announced in #flatpak on freenode, while the fully verbose build notifications are announced in #flatpak-builds also on freenode (so you are invited to lurk in #flatpak-builds if you would like to monitor how your favorite app or library is faring on various build architectures).

 

Many thanks to all who were involved in making this happen, thanks to Alex for being exceptionally responsive and helpful on IRC, thanks to Endless for sponsoring the development of these build services and ARM support, thanks to Codethink for providing the build machines for the flatpak ARM builds and a special thanks to Dave Page for setting up the ARM build server infrastructure and filling in the IT knowledge gap where I fall short (specifically with things networking related).

Endless and Codethink team up for GNOME on ARM

A couple of months ago Alberto Ruiz issued a Call to Arms here on planet GNOME. This was met with an influx of eager contributions including a wide variety of server grade ARM hardware, rack space and sponsorship to help make GNOME on ARM a reality.

Codethink and Endless are excited to announce their collaboration in this initiative and it’s my pleasure to share the details with you today.

Codethink has donated 8 cartridges dedicated to building GNOME things for ARM architectures in our Moonshot server. These cartridges are AppliedMicro™ X-Gene™ with 8 ARMv8 64-bit cores at 2.4Ghz, 64GB of DDR3 PC3L-12800 (1600 MHz) Memory and 120GB M.2 solid state storage.

Endless has also enlisted our services for the development and deployment of a Flatpak (formerly known as xdg-app) build farm to run on these machines. The goal of this project is to build and distribute both stable and bleeding edge versions of GNOME application bundles and SDKs on a continuous basis.

And we are almost there !

After one spontaneous hackfest and a long list of patches; I am happy to add here that runtimes, sdks and apps are building and running on both AArch64 and 32bit ARMv7-A architectures. As a side effect of this effort, Flatpak sdks and applications can now also be built for 32bit Intel platforms (this may have already been possible, but not from an x86_64 build host).

The builds are already automated at this time and will shortly be finding their way to sdk.gnome.org.

In the interest of keeping everything repeatable, I have been maintaining a set of scripts which setup nightly builds on a build machine, which can be configured to build various stable/unstable branches of the SDK and app repositories. These are capable of building our 4 supported target architectures: x86_64, i386, aarch64 and arm.

Currently they are only well tested with vanilla installations of Ubuntu 16.04 and are also known to work on Debian Stretch, but it should be trivial to support some modern RPM based distros as well.

Stay tuned for further updates on GNOME’s new found build farm, brought to you by Endless and Codethink !

Aboriginal YBD – An exploration in cross building

The last couple of months at Codethink have been an exploration into cross compiling, or rather, cross compiling without the hassle of cross compiling.

In brief, this post is about an experimental technique for cross building operating systems we’ve come up with, in which we use a virtual machine to run the builds, a cross compiler over distccd to do the heavy lifting and a virtfs 9p mount to share the build directory with the guest build slave.

Let’s start at the beginning

In a recent post, I showcased a build of GNOME from scratch. This was created using the ybd build tool to build GNOME from Baserock YAML definitions.

Once we had a working system, I was asked if I could repeat that for arm. There was already a build story for building arm with Baserock definitions, but getting off the ground to bootstrap it was difficult, and the whole system needs to be built inside an arm emulator or on arm hardware. We started looking at improving the build story for cross compilation.

We examined a few approaches…

Full traditional cross compile

Some projects, such as yocto or buildroot, provide techniques for cross compiling an entire OS from scratch.

I did a writeup on the complications involved in cross building systems
in this email, but in summary:

  • The build process is complex: packages need to be compiled for both the $host and $target all the way up the stack, since modules tend to provide tooling which needs to run on the build host, usually needed by other modules which depend on them (e.g. icu-config or pkg-config).
  • Building involves trickery: one needs to set up the build environment very specifically so that host tools are run in the build scripts of a given module, and this setup varies from module to module depending on the kind of build scripts they use.
  • The further up the stack you get, the more modules tend to expect a self hosting (or native) build environment. This means there is a lot of friction in maintaining something like buildroot: in some cases it involves very strange autogen/configure incantations, and in worse cases downstream patches need to be maintained just to get things to work.
  • Sometimes you even encounter projects which compile C programs that are not distributed, but are only used to generate header files and the like during the build, and often these programs are not compiled specifically with $HOST_CC but directly with $CC.

In any case, this was obviously not a viable option. If one wants to be able to build the bleeding edge on a regular basis, cross compiling all the way up the stack involves too much friction.

The scratchbox2 project

This was an avenue which shows promise indeed. The scratchbox project allows one to set up a build environment that is completely tweaked for optimal build performance, using qemu user mode emulation, and much, much more.

I took a look at the internals PDF document and, while I remain impressed, I just don’t think this is the right fit.

The opening statement of the referred pdf says:

Documenting a system as complex as Scratchbox 2 is not an easy task.

And this is no understatement by any means. Scratchbox’s internal design is extremely difficult to grasp, there are many moving parts and details to this build environment; all of which, at least at face value, I perceive to be potential points of failure.

Scratchbox 2 inserts itself in between the qemu user mode emulator and the host operating system and makes decisions, based on configuration data the user needs to provide, about what tooling can be used during a build, and what paths are really going to be accessed.

In short, scratchbox 2 will sometimes call host tools and run them directly without emulation, and sometimes it will use target tools in user mode emulation, these are managed by a virtual filesystem “view” and both execution modes will see the underlying filesystem in different ways. This way you basically get the fastest solution possible: you run a host cross compiler to build binaries for the target at build time, you run host built coreutils and shells and perl and such at configure time, and if you are well configured, you presumably only ever run target binaries in user emulation when those are tools which were built in your build and need to run during the build.

Scratchbox is what you get when you try to get the maximum performance out of a virtualized native build environment. And it is a remarkable feat, but I have reservations about depending on something as complex as this:

  • Will I be able to easily repeat the same build I did today 10 years from now, and easily obtain the same result ?
  • If something ever goes wrong, will it always be possible to find an engineer who is easily capable of fixing it ?
  • When creating entirely new builds, how much effort is going to go into setting up and configuring the scratchbox environment ?

But we’re getting closer, scratchbox2 provides a virtualized environment so that when compiling, your upstream packages believe that they are in a native environment, removing that friction with upstreams and allowing one to upgrade modules without maintaining obscure build instructions and downstream patches.

The Aboriginal Linux project

This is the approach we took as a starting point, it’s not the fastest as a build environment but has some values which align quite nicely with our goals.

What Aboriginal Linux provides is mostly just a handful of shell scripts which allow one to bootstrap for a given architecture quite elegantly.

When running the Aboriginal build itself, you just have to tell it what the host and target architectures are, and after the build completes, you end up with the following ingredients:

A statically linked, relocatable $host -> $target cross compiler

This is interesting: you get a gcc which you can untar on any machine of the given $host architecture, and it will compile for $target.

A minimal system image to run on the target

This includes:

    • A minimal kernel configured for that arch
    • Busybox / Toybox for your basic utilities
    • Bash for your basic shell utilities
    • A native compiler for the target arch
    • distcc
    • An init.sh to boot the system

A set of scripts to launch your kernel & rootfs under qemu

These scripts are generated for your specific target arch so they “just work”, and they setup the guest so that distcc is plugged into the cross compiler you just built.

A couple of nice things about Aboriginal

Minimal build requirements

Running the Aboriginal scripts and getting a build requires:

ar, as, nm, ranlib, ld, cc, gcc, g++, objdump, make and sh

The build starts out by setting up an environment which has access to these, and only these binaries.

This highly controlled early stage build environment is attractive to me because I think the chances are very high that in 10 years I can launch the same build script and get a working result, this is to be at the very base of our stack.

Elegant configuration for bootstrapping targets

Supporting a target architecture in Aboriginal Linux is a bit tricky but once it’s done it should remain reliable and repeatable.

Aboriginal keeps some target configuration files which are sourced at various stages of the build in order to determine:

  • Extra compiler flags for building binutils & gcc and libc
  • Extra configuration options for the kernel build
  • Magical obscure qemu incantation for bringing up the OS in qemu

Getting a compiler, kernel and emulator tuple to work is a delicate dance of matching up these configurations exactly. Once that is done however, it should basically keep working forever.

The adventure begins

When I started testing things, I first just wanted a proof of concept, lets see if we can build our stack from within the Aboriginal Linux emulator.

In my first attempts, I built all the dependencies I needed to run python and git, which are basically the base requirements for running the ybd build tool. This was pretty smooth sailing except that I had to relocate everything I was building into a home directory (read-only root). By the time I started to build baserock’s definitions though I hit a wall. I, quite innocently, wanted to just go ahead and build glibc with Aboriginal’s compiler, thinking no big deal right ? Boy was I wrong.

The first problem was that glibc seems to care a great deal about which compiler is used to build it, and the last GPLv2 version of gcc was not going to cut it. Surprisingly, the errors I encountered were not about the compiler not supporting a recent C standard or such; it was explicitly about gcc – glibc has a deep longing desire to be compiled with gcc, and a moderately recent version of it at that.

Aboriginal Linux had frozen at the very latest releases (and even git commits) of packages which were still available under GPLv2. It took some convincing but since that toolchain is getting old, Rob Landley agreed that it would be desirable, in a transitional period until llvm is ready, to have an optional build mode allowing one to build Aboriginal Linux using the newer GPLv3 contaminated toolchain.

So, I set myself to work and, hoping that it would just cost me a weekend (wrong again), cooked up a branch which supports an option to compile Aboriginal with GCC 5.3 and binutils 2.25.1. A report of the changes this branch introduced can be found on the aboriginal mailing list.

In this time I became intimately acquainted with building compilers and cross compilers. As I mentioned, Aboriginal has a very neat build model which bootstraps everything, running build.sh basically runs like:

CROSS_COMPILER_HOST=i686 SYSIMAGE_TYPE=ext2 ./build.sh armv5l

So essentially you choose the host arch and target arch (both of which need to have support, i.e. a description file like this one in the aboriginal sources), and then the build runs in stages, sourcing the description files for the said architecture depending on what it’s building along the way.

Because I spent considerable time on this, and am still sufficiently fascinated, I’m going to give a break down of the process here.

1.) Build host tooling

First we create a host directory where we will install tools we want to use during the build, we intentionally symlink to only a few minimal host tools that we require be on your system, these are your host compilers, linkers, a functional shell and make implementation.

Then, we build toybox, busybox, e2fsprogs and distcc, basically any tools which we actually have a chance of running on your host.

2.) Build a stage 1 cross compiler for ${target}

This is the compiler we’re going to use to build everything that is going to run on your target qemu guest image, including of course the native compiler.

In this step we build gcc, musl libc and then gcc again; we build gcc a second time in order to complete the runtime and get libstdc++.

Previous versions of Aboriginal did not require this second build of gcc, but since GCC folks decided to start using C++, we need a C++ capable cross compiler to build the native compiler.

3.) Build a stage 1 cross compiler for ${host}

This is the first step towards building your statically linked and relocatable cross compiler, which you’ll want to be plugging into distcc and using on any machine of the given ${host} arch.

This step is run in exactly the same way as the previous step, except that we are building a cross compiler from your real host -> ${host}

4.) Build the full ${host} -> ${TARGET} cross compiler

In this stage we will use the cross compiler we built in the previous step in order to build a cross compiler which runs on ${host} and produces code for ${target}. Neither of these has to be the arch you are actually running on; you could be building, on a mips, a cross compiler which runs on arm machines and produces x86 code, if you were that sadistic.

In this second stage compiler the setup is a bit different, we start out by compiling musl libc right away and just build gcc once, since we already have a full compiler and we already have musl libc compiled for the target ${host}.

Note: This is actually called a “Canadian Cross”, and no, I was also surprised to find out that it is not named after a tattoo one gets when joining a fringe religious group in Canada.

5.) Build the native compiler

Now, in exactly the same way as we built the Canadian Cross, we’re going to build a native compiler using the stage 1 cross compiler for ${target}.

This compiler is cross compiled to run on the target, and configured to produce code for that same target, so it is a cross compiled native compiler.
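In the same configure terms, the native compiler is simply the case where --host and --target name the same arch while --build is still the machine doing the compiling; again an illustrative sketch rather than the exact invocation:

../gcc-5.3.0/configure \
    --build=x86_64-pc-linux-gnu \
    --host=armv5l-unknown-linux-musleabi \
    --target=armv5l-unknown-linux-musleabi \
    --prefix=/usr --enable-languages=c,c++
make && make install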

6.) Build the kernel and system image

Finally, we use our stage 1 cross compiler again to compile the kernel which will run in qemu, and the root filesystem. The root filesystem includes toybox, busybox, make, bash and distcc.

We wrap this up with a few scripts: an init.sh to run on the resulting guest image, and a generated run-emulator.sh script which just “knows” how to properly bring up this guest.
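For the curious, the generated run-emulator.sh essentially just bakes in the right qemu invocation for the chosen target; for armv5l it amounts to something along these lines, where the machine type, file names and kernel arguments are an approximation rather than the exact generated script:

qemu-system-arm -M versatilepb -nographic -no-reboot \
    -kernel linux-kernel-armv5l \
    -hda hda-armv5l.ext2 \
    -append "root=/dev/sda rw console=ttyAMA0 panic=1"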

A word on the ccwrap compiler frontend

Before moving on, I should say a word or two about the compiler frontend ccwrap.c.

I mentioned before that the cross compiler Aboriginal creates is statically linked and relocatable. This is achieved with that frontend to the compiler tooling, whose purpose in life is to fight, tooth and nail, GCC’s desire to hard-code itself into the location you’ve compiled it for.

ccwrap counters gcc’s tactics by sitting in place of gcc, cc, g++, c++ and cpp, and figuring out the real location of standard includes and linking stubs, and then calling into the original gcc binaries using a modified set of command line arguments; adding -nostdinc and -nostdlib where necessary, and providing the include paths and stubs to the command line.

This is a violent process, and gcc puts up a good fight, but the result is that the cross compiler you generate can be untarred anywhere on any host of the correct ${host} architecture, and it will just run and create binaries for ${target}, building and linking against musl libc by default (more on libc further down).
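If it helps to picture it, here is a heavily simplified shell rendition of the idea; the real ccwrap.c is written in C and handles many more cases, and the rawcc name below is just a stand-in for the real gcc binary:

#!/bin/sh
# Locate the toolchain relative to wherever the tarball was extracted
TOPDIR="$(cd "$(dirname "$0")/.." && pwd)"

# Refuse gcc's hard-coded defaults and feed it our own headers,
# start/end stubs and libraries instead (link case shown)
exec "$TOPDIR/bin/rawcc" -nostdinc -nostdlib \
    -isystem "$TOPDIR/include" \
    "$TOPDIR/lib/crt1.o" "$TOPDIR/lib/crti.o" \
    "$@" \
    -L"$TOPDIR/lib" -lgcc -lc "$TOPDIR/lib/crtn.o"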

 

To port all of this to work with new GCC and binutils versions, I needed to find the right patches for gcc and binutils; these are mostly upstream already in unreleased versions of gcc and binutils. Then I had to reconstruct the building of the stage 1 compilers so that they build with C++ support, and finally iron out the remaining kinks.

This part was all pretty fun to wrap my head around, hope it was also enjoyable to read about :)

The journey from musl libc to glibc

So after all that, we have an Aboriginal Linux setup which is capable of building glibc, but the ride is not over! When building a whole operating system, there is a small chance that someone out there used C++, and if we’re going to distribute a glibc-based system, we’re probably also going to want a libstdc++ that is actually linked against that glibc.

Well, that was what I was thinking; in fact it runs deeper than this. gcc itself provides libgcc.a and its start/end stubs which complement the host libc’s start/end stubs, but it also provides a shared library and a libgcc_eh.so which need to be linked against the host libc.

In any case, at this stage I was a bit worried that the musl-linked gcc compiler I had might not be capable of building and linking programs against the new glibc. Of course it should work, since this is just a standards-compliant compiler on one hand and a standard C library on the other, but seeing that the gcc / glibc entanglement runs so deeply, we had to be sure.

After some time building and rebuilding glibc and gcc on a puny armv5l qemu emulator, I found the magic concoction which makes the build pass. For glibc the build pretty much runs smoothly: you first have to install the appropriate linux kernel includes and tell glibc that --with-headers=/usr/include, lest it tread off the beaten path and go searching obscure host-triple prefixed paths all on its own.
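Concretely, the glibc side of that concoction looked more or less like this; the version numbers and prefix are illustrative:

# Install sanitized kernel headers for glibc to build against
make -C linux-4.x headers_install INSTALL_HDR_PATH=/usr

# Then configure glibc, explicitly pointing it at those headers
mkdir glibc-build && cd glibc-build
../glibc-2.x/configure --prefix=/usr --with-headers=/usr/include
make && make install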

To build the gcc runtimes (so that you get the desired libstdc++), you actually have to build gcc as if you were building a cross compiler.

In the armv5l transition from musl libc to GNU libc, you would tell it that:

--build=armv5l-thingamajiggie-musleabi
--host=armv5l-thingamajiggie-musleabi
--target=armv5l-thingamajiggie-gnueabi

With this setup, it will build all the host tooling using the existing musl libc which our existing compiler is hardwired to use, but when building the runtimes, it will look into ${prefix} and find the glibc we previously compiled, linking the gcc runtimes against the fresh glibc.

And yeah, it’s actually important to specify ‘-musleabi’ and ‘-gnueabi’ in those triples: gcc’s build scripts will parse the triples and behave differently depending on what suffix you give them.
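Putting those triples together, the runtime-only gcc build looked roughly like this; the thingamajiggie vendor field is the placeholder used above, and the remaining flags are a sketch:

../gcc-5.3.0/configure \
    --build=armv5l-thingamajiggie-musleabi \
    --host=armv5l-thingamajiggie-musleabi \
    --target=armv5l-thingamajiggie-gnueabi \
    --prefix=/usr --enable-languages=c,c++
make
# only the runtime libraries (libstdc++, libgcc_s and friends) were
# then installed by hand, in the layout the Aboriginal frontend expects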

In my setup, I did not want to use the new compiler, just the runtimes. So I did a custom install of the gcc runtimes in precisely the way that the Aboriginal frontend expects to find them.

At this stage, we can now use environment variables to tell the Aboriginal compiler frontend how to behave, telling it which runtime linker we want to use and where it should look for its start stubs and end stubs and such.

Once we have installed glibc and new gcc runtimes into a chroot staging area on the target emulator, we can now set the following env vars:

CCWRAP_DYNAMIC_LINKER=/lib/ld.so
CCWRAP_TOPDIR=/usr

And gcc will look for standard headers and library paths in /usr and use the dynamic linker installed by glibc.

Now we can compile C and C++ programs against glibc and a glibc-based libstdc++, using our nifty compiler which was built against, and statically linked to, musl libc.
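As a quick usage sketch, switching the frontend over and sanity-checking the result looks something like this, assuming the chroot staging area described above:

export CCWRAP_DYNAMIC_LINKER=/lib/ld.so
export CCWRAP_TOPDIR=/usr

cat > hello.cc <<'EOF'
#include <iostream>
int main() { std::cout << "linked against glibc\n"; }
EOF

g++ hello.cc -o hello   # via ccwrap, now linking glibc and the new libstdc++
./hello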

What have we done with this?

The next step was integrating all of this into the YBD build tool and using the Aboriginal compilers and emulator image to virtually cross-compile Baserock definitions from whatever host you are running.

What we have now is a build model that looks something like this:

I’ll just take a bit more space here to give a rundown of what each component is doing.

YBD Builder

The YBD builder tool remains mostly unchanged in my downstream branch.

Mostly it differs inasmuch as it no longer performs the builds in a chroot sandbox, but instead marshals those builds to slaved Aboriginal guests running in qemu emulators (plural of course, because we want to parallelize the builds as much as dependencies and host resources allow).

What YBD does is basically:

  • Clones the sources to be built from git; all sources are normalized into git repositories hosted on the trove.
  • Stages dependencies, i.e. results of previous builds, into a sysroot in a temporary directory for the build; this is done in the virtfs staging grounds.
  • Stages the git repository into the build directory.
  • Tells a running emulator that it’s time to build.
  • Waits for the result.
  • If successful, collects the build results and creates an “artifact” (tarball).

Also, of course YBD parses the YAML definitions format and constructs and navigates a dependency graph.
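In pseudo-shell, one build cycle for a single component looks conceptually like this; the helper names are invented for illustration, and the actual signalling goes over the IPC mechanism described below:

# Stage previously built artifacts into the virtfs staging ground
for artifact in $(dependencies_of "$component"); do
    unpack_artifact "$artifact" "$VIRTFS_DIR/$build_id/sysroot"
done

# Stage the sources, then hand the build over to a free emulator
git --git-dir="$repo/.git" checkout-index -a \
    --prefix="$VIRTFS_DIR/$build_id/build/"
send_build_request "$emulator" "$build_id"
wait_for_result "$emulator" "$build_id" || exit 1

# On success, collect the install results into an artifact tarball
create_artifact "$VIRTFS_DIR/$build_id/install" \
    "$ARTIFACT_CACHE/$component.tar.gz"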

IPC Interpreter / Modified Init.sh

This component currently lives in the aboriginal controller repository, but should eventually be migrated into the YBD build tool itself as it makes little sense to have this many moving parts.

This is essentially some host-side shell scripts and some guest-side shell scripts. The guest is launched in a specific way so as to run in the background and listen to commands over the virtio serial port (this IPC needs to be fixed; it’s a shaky thing and should probably be done over the actual network instead of the serial ports).

Build Sandbox

The build sandbox is just your basic chroot calling shell script, except that it is a bit peculiar in the way it does things.

  • It conditionally stages toybox/busybox if and only if tools are not already found in the staging area
  • It stages statically linked binaries only and is perfectly operational in the absence of any libc

Well, not all that peculiar I guess.

Virtfs 9p shared directory

Here is another, really fun part of this experimental build process.

Qemu has support for exporting a shared directory which can be accessed by the guest kernel if it is compiled with:

CONFIG_VIRTIO=y
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_PCI_LEGACY=y
CONFIG_NET_9P=y
CONFIG_NET_9P_VIRTIO=y

When the guest mounts the exported directory with -t 9p, qemu will basically just perform the reads and writes on the guest’s behalf.

More interestingly, qemu provides a few security models, the most basic being passthrough, which just reads and writes using the credentials of the user who launched qemu. In any case, qemu can only access the underlying filesystem using the credentials it has. However, qemu does provide a security model called “mapped” (or “mapped-file”, which we ended up using).

Firstly, of course the shared directory is practical because it allows the host running the YBD tool to stage things in the same directory where they will be built by the emulator, but things become interesting when the emulator is installing files under specific uids/gids, or creating device files which should be shipped in the resulting OS – basically anything that normally requires root.

Using the “mapped-file” security model allows the guest emulator to believe that it, as root, can manipulate the 9p-mounted filesystem for all intents and purposes. On the actual underlying filesystem that qemu is writing to, everything will be created in mode 0600 and belong to the user running qemu, but extra metadata about the files qemu creates goes into corresponding files in a .virtfs_metadata directory.
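For reference, wiring this up looks roughly like the following; the mount tag and paths are illustrative, but -virtfs and the 9p mount are standard qemu and kernel features:

# Host side: add a 9p export of the staging directory to the qemu command line
qemu-system-arm [existing options] \
    -virtfs local,path=/path/to/staging,mount_tag=staging,security_model=mapped-file

# Guest side: mount the export over virtio
mount -t 9p -o trans=virtio staging /mnt/staging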

The solution we came up with (I had much help in this area from Rob Taylor) was to write a small translation layer which allows us to also interact with the virtfs staging directory on the host side. What this translation layer does is basically:

  • Collect build results and create “real” tarballs from those results. The regular user is not allowed to create device files or files which belong to root, but they are at least allowed to own a tarball containing such files.
  • The reverse of the previous: stage the content of a real tarball into a virtfs staging ground, so that files are extracted under the user’s credentials but the correct virtfs metadata is created, and the guest (build slave) will see the right thing.
  • Stage files and directories into the virtfs staging grounds. This part is required for the extracted git repositories which we intend to build.

This way, the whole operating system image can potentially be built from scratch by a regular user on the host.

Summary

At this unfinished stage, I have built over 300 of the ~420 components which go into the basic GNOME system using this method of compilation, building for armv5l on my x86_64 laptop. The only build instructions which needed to be changed in order to build these were the base compiler and glibc builds, plus a couple of minor changes to get some packages to build on armv5l.

Most of the kinks have been ironed out. I still have to build over 100 high-level components and deploy and test the resulting image, but the higher up the stack you get, the fewer problems you tend to encounter, so I presume we’re through the worst of it.

Performance-wise, this will never be as fast as scratchbox; however, it’s possible that we explore qemu’s user mode emulation at some point. The problem with performance is that the more you optimize here, the more nasty hacks you introduce (if, say, you want to run host perl while building in the emulator), and the less comprehensive a build system you end up with. We will try to keep a nice balance here and prioritize repeatability and the convenience this can offer in terms of bootstrapping an OS with Baserock build instructions on new architectures.

I can say, however, that regarding performance, libtool is probably next on the chopping block. It serves basically no purpose when building on Linux, and building a libtool object takes about 8 to 10 times as long as simply compiling a regular object over distcc.

I will have to put this work down for a while as I have other work landing on my plate which requires attention, so I hope there will be an army of developers to continue this work in my absence :)

If you would like to try and repeat this work, a HOWTO can be found at the bottom of this email. Note that in that email, we had not yet tried the virtfs mapped security model which solves the problem of building as a regular user, however the instructions to get a build off the ground are still valid.

For now I see this as an interesting research project; we have tried some pretty new and interesting things, and I am curious to see where this will lead us.

And, special thanks are owed to Rob Landley for giving me pointers along the way while navigating the Aboriginal build system, and for being generally entertaining in #toybox on freenode. Also thanks to Rob Taylor for digging into the qemu sources and coming up with the wild idea of manhandling the virtfs mapped metadata.

 

DX Hackfest & FOSDEM

This is one of those back-to-work posts you intend to write and then kick yourself for forgetting… after a few starts this week I finally managed to squeeze in the time to finish this post.

Last week thanks to Codethink, I was able to travel to Brussels and attend the DX Hackfest followed by FOSDEM. What follows is a run down of things we did there.

Day 0

The Hackfest started on the 27th, so I had arrived in Brussels on the 26th, bright and early after around 16 hours of travel including the layover. Feeling hungry, I stumbled out of my hotel room, which was downtown by Sainte-Catherine square, to fetch a kebab sandwich. I was thoroughly enjoying my messy pita and fries at a small kebab shack beside the church when, by coincidence, Juan Pablo came moseying by, admiring the view and taking pictures of the church. With a healthy streak of spicy mayonnaise dripping down my face, I called out his name so as not to miss him.

Juan and I had a bit of a chance to talk about what things we could accomplish in Glade in our short time in Brussels.

Of course, property bindings came up, which is something that we have wanted for a long time, and which Denis Washington had attempted before as his GSoC project.

No, we did not implement that, but here are a few reasons why we determined it was a no go for a few days of intense hacking:

Property Sensitivity

Glade has a notion of object properties having a sensitive or insensitive state, which is determined and driven by the widget adaptor of the object type owning a given property. This is typically used in cases where it makes no sense to set a given property; for instance, we make GtkLabel’s wrap mode property insensitive when the label is not set to wrap.

When considering that a property can be set up as a binding target, it stands to reason that the bound property editor should also be insensitive, as it makes no sense to give it a value if its value is driven by another property. Further, it may also make no sense to allow binding of a property at all if the given target property is meaningless in the current widget’s configuration. So, for instance, when setting a GtkButton to use custom content instead of the icon name & label, we would have to undoably clear the binding state of the icon name property as well as its value.

Cut, Copy & Paste

When we cut, copy and paste in Glade we do so with branches of an object hierarchy. Some interesting new cases we would have to handle include:

  • When we paste a hierarchy which contains a property source/target pair, we should have the new target property re-routed to the copied source object property.
  • When we paste a hierarchy which contains a bound property for which the source object is outside of the pasted hierarchy, we should maintain that binding in the pasted hierarchy so that it continues to reference the same out-of-hierarchy source.
  • When we paste a hierarchy which contains a bound property for which the source object is outside of the pasted hierarchy, but paste it in a separate project / glade file, the originally bound property should be cleared as it refers to a source property that is now in a different project.

So, after having considered some of the complexities of this, we decided not to be over ambitious and set our sights on lower hanging fruit.

Day 1

On day one we met up relatively bright and early at the betacowork space where the hackfest took place. Some of the morning was spent looking at the agenda and seeing if there were specific things people wanted to talk about; however, as Glade has a huge todo list, it makes little sense to think too far ahead about bright and shiny desirable features, so I did not add anything to the agenda.

Juan and I had decided that we could absolutely support Glade files which do not always specify the ID field, which GtkBuilder has not been requiring for some time now. The benefit of adding this seemingly mundane feature to Glade is mostly better support for Glade files in the wild. Since the ID field is not required by GtkBuilder anymore, it turns out that many hand-written files in the wild can no longer be loaded in Glade.

We spent around an hour discussing what issues we might face, and decided the path of least resistance would be to always have an ID internally, under a special __glade_unnamed_ prefix; we simply avoid serializing the IDs of unnamed objects and invent them as we load files that omit the ID.

Further, we ensure at all times that if an object is referred to as a property of another object, it must have an explicit name. We achieve the rollover when running the object selection dialog: if any object is selected as a property of another object, the referred object is undoably given a traditional name like label1 while assigning that reference.

By the end of the day this was working pretty well…

Day 2

By now we thought we had pretty much everything covered for the ID-less widgets, and then we encountered the <action-widgets> of GtkDialog and GtkInfoBar.

These have the unfortunate history of being implemented in an odd way, and I’m not sure how far back this dates, but historically you would denote an action widget by giving it a Response ID integer property and placing the widget in the action area. Since some version of GTK+ 3.x (or possibly even 3.0 ?) we need to refer to these action widgets by their ID in the Glade file and serialize an <action-widgets> node containing those references.

This should ideally be changed in Glade so that the dialog & infobar have actual references to the action widgets (consequently forcing them to have an ID), and probably have another object selection dialog allowing one to select widgets inside of the GtkDialog / GtkInfoBar hierarchy as action widgets. If however the <action-widgets> semantic is newer than GTK+ 3.0 then it gets quite tricky to make this switch and still support the older semantics of adding buttons with response IDs into the action area.

In any case, we settled on simply forcing the action widgets to have an ID at save time, without any undo support, for the singular case of GtkDialog/GtkInfoBar action widgets. Disturbingly, this also includes autosave, and annoyingly it modifies the Glade data model without any undoable user interaction, but it’s the corner case hack.

After this roadblock, and ironing out a few other obstacles (like serializing the IDs even if they don’t exist when launching the preview, which requires an ID to preview)… we were able to at least nail this feature by the end of Day 2.

I also closed this bug by ensuring we don’t handle scroll events in the already scrolling property editor, something we probably should have done many years ago.

Also, Juan Pablo revived the old-school logo (for those who recall the flaming globe logo) in Glade’s workspace, so the workspace is a little more fancy. This tribute to the older logo has in fact been present for years in the loading screen. Unfortunately… only a small number of users work on projects which contain thousands of widgets, so most of you have been missing out on the awesome old logo tribute, which will now appear in its full glory in the background of Glade’s workspace.

Day 3

By now we were getting a bit tired; this post hasn’t covered the more gory details, but as we were in Brussels, of course we had the responsibility of sampling every kind of beer. By around 4 pm I was falling asleep at my desk, but before that I was able to make a pass through the GTK+ widget catalog and update it with new deprecations and newly added properties and signals, in some cases updating the custom editors to integrate the new properties nicely. For instance, GtkLabel now has a “lines” property which is only sensitive and relevant if ellipsizing and word wrapping are enabled simultaneously.

We also fixed a few more bugs.

FOSDEM

And then there was FOSDEM, my first time attending this conference. I was planning on sleeping in but managed to arrive around 10am.

I enjoyed hanging around the booths and mingling mostly, which led to a productive conversation with Andre Klapper about various bug tracking and workflow solutions. I attended some talks in the distros dev room; Sam Thursfield gave his talk about the benefits of using declarative and structured data to represent build and integration instructions in build systems. I also enjoyed some LibreOffice talks.

By the end of the second day, and just in the nick of time, I was informed that “if I had not gotten a waffle from a proper waffle van at the venue, then I had not really been to FOSDEM”. I hurried along and was lucky enough to catch one of the last waffles off of a closing van, which was indeed the most delicious waffle I’ve ever tasted.

I guess the conclusion is that waffles are not what FOSDEM is all about, and that’s a good thing – I’d rather be eating a waffle at a conference about free software, than writing free software at a conference about waffles.

 

A build of GNOME from scratch

Hi all, long time no blog !

As is usual when a long time has passed without blogging, we end up with a mishmash of subjects which ideally should go into separate posts. Sorry about that; I’ve titled this post “A build of GNOME from scratch” because that’s what I’ll be focusing on most here.

First, I have been out of touch with GNOME for some time, mostly because I have been involved with my own Canada-based startup company, which has been juicing me for every spare hour of work I could lay my hands on. This of course takes a toll on your life in general, so the time has come to slow down the pace a bit for the sake of retaining a small measure of sanity.

So this is mostly why I have not been involved in GNOME as much as I would have liked in recent years, but fear not; I am back and hope to be solving problems *cough* causing trouble on a regular basis again :)

New Employment

In late 2015, I started to play for team Codethink.

I am very happy with the new arrangement for a number of reasons. One of them, of course, is that I will be able to make some FOSS contributions again in the course of my work. The other reason is that when it comes to consulting company logos there is no competition; we obviously have the coolest logo:

logo

But seriously, I am both proud and humbled to be working along side such a talented group of individuals.

A build of GNOME from scratch

The majority of the work I’ve been doing so far with Codethink has been to build and integrate a GNOME reference build with the Baserock build system.

I’ll probably return at some point to give a more thorough explanation of exactly what Baserock is and what it is not, but that is not the point of this post. Suffice it to say that it is a build system, and as of the close of 2015 it includes a reference build of GNOME which is quite functional and fairly well integrated.

Some of the things I’m happy about:

Input Methods

When booting the Baserock GNOME image, you can choose your language and input method, and it just works: launch the control center, enter the “Region and Languages” panel, choose your input method, and you’re done.

GNOME in Korean, entering text in Nautilus using hangul input method

I have yet to see a distribution which does this. To get my Korean input method working, I usually have to ask google about it, find out what packages I need to install (sometimes with multiple ways to set it up) and in some cases I’ve needed to inject things into my environment manually to get it working.

Online Desktop

The online desktop integration works throughout the user experience.

It also means that we’ve got gnome-initial-setup working properly again after some bitrot. So when you create the first regular user on the OS with gnome-initial-setup, the online accounts credentials get handed off to the new user seamlessly (the above linked patch still needs to be merged upstream, though).

Of course, if you’ve setup online accounts, you will get all that sugar such as GNOME Shell notifications of events in your online account calendar, and ability to access your online account emails in Evolution, etc.

Location Services

Geoclue also works, so when you boot up your GNOME system for the first time, geoclue guesses your timezone automatically, even if you’ve selected a locale.

Audio / Video

Audio and video work for most popular formats; pulseaudio is working and all of the important GStreamer bits are there.

Firing up Epiphany will allow you to watch (and listen to) videos on the web, except for YouTube (but that’s OK, Epiphany also does not play YouTube videos on my Debian system; apparently there are still things in WebKitGtk blocking that).

Most core GNOME apps

Most of the core GNOME apps are integrated, with a few exceptions.

gnome-applications

All of the apps you see in the screenshot actually work… which sounds pathetic, but I assure you it’s more than just building the apps themselves; there are unfortunately many subtle details to get right in a fully integrated desktop experience, and we’re not quite there yet, but pretty darn close.

Where this is all going right now is not entirely decided. For starters, the GNOME build will serve as a better real world test case for the Baserock infrastructure.

There is talk about scripting this so as to output nightly images, so that one could potentially try out a bleeding edge nightly GNOME image fresh from git master at any time.

Crashing the party in Brussels

I will be crashing the DX hackfest in Brussels at the end of the month. I’ve inserted myself in the list next to Juan Pablo because I’m so shy that I would prefer to lurk in his shadow :)

I hope to close a lot of Glade bugs at the hackfest and get closer to the goal of properly supporting all of the UI files in the GTK+ tree. I also look forward to hearing more about Builder and seeing what we can do on our side to make that developer experience better.

This coding sprint for Glade is of course brought to you by Codethink, and I would like to personally thank Codethink for the opportunity to attend FOSDEM for the first time.

A few words to end this

This has gone on quite long enough.

Last Friday I wrote a post that was as painful to write for me as it was hurtful to others.

Unfortunately I felt, and still feel, that shining some light on our issues was necessary to protect open discussion and general inclusiveness in our community. I truly hope that stirring the waters here has led us to some long-needed introspection about how things are done around here.

I have just now closed comments on the post; a few days of discussion on this is quite enough. You’ll just have to take my word that I have not doctored any of the comments and did not discriminate against any commenters, regardless of whether or not I liked what they had to say.

The reason for this follow up, and the reason for it being a separate post, is that I have to stress how painful it was for me to level accusations against some really nice people, and if my words are in any way harmful to their overall reputation, then this by itself needs to be rectified at least so much as is possible from my side.

Firstly, for anyone who does not know Paolo Borelli: he does not have a hurtful bone in his body, really he is among the nicest people in GNOME I have met. The undertones surrounding this situation are complex; there is a lot of pressure in the community to avoid any conflicts, and it’s sad to see people get pulled into this.

Paolo is actually the one who, you could say, “mentored” me over ten years ago now. He helped me a lot to understand how things work with IRC and the politics around being a maintainer in GNOME, and I hope this serves to clarify how painful it was for me to bring his name into this.

Secondly, I’ve been exchanging emails with Alberto over the weekend; he is also a really nice guy who I would not have expected to take a stance. However, something that I failed to recognize in all of this is that Alberto, being the maintainer of Planet GNOME, was under extreme pressure from various people to remove Philip from Planet GNOME at the end of May; of course, he had to take a position in a lose-lose battle and was already caught in the crossfire.

I do not envy Alberto’s position in all of this at all, and while we may disagree on some matters, he does not deserve to be painted in the light that I painted him in.

Paolo, Alberto and Emmanuele, you deserve, and have my deepest apologies for having dragged your names into this.

That said, the fact that there was so much pressure in the community to take a public stance against any and all forms of criticism regarding OPW and the direction of GNOME and our priorities is a problem, and I’m glad we got it out in the open to discuss it.

My blog will not be a venue for further discussion on this matter for the moment; I’ve contributed enough hours to this, and we are going into a beta testing phase in one month and really need to focus on the work we are doing.

 

I’m looking at you

Hi,

I write to you all today on a solemn matter, one which I fear will be forgotten and ignored if nobody starts some discussion on this.

Earlier this week, some of you may have noticed that for a very short time there was a rather angry post by Philip Van Hoof; he sounded quite frustrated and disturbed, and the title of his post basically said to please remove him from the Planet GNOME feeds.

Unfortunately this blog post was even deleted from his own blog, so there is nothing to refer to here; it was also gone so fast that I have a hunch many Planet GNOME readers did not get a chance to see what was going on.

What I want to highlight in this post is not this frustrated angry post by Philip, but rather the precursor which seems to have led us to this sad turn of events.

Let’s make things better

In late May this summer, Philip submitted the post “Let’s make things better“. This post is also deleted from his own blog, I’m not sure for what reasons; I’m keeping the link alive here in case Philip feels inspired enough to reactivate that post (it would help for people to see this in perspective, as people who have not read that entry may suspect it contained rudeness or bad language or sweeping accusations or something, which simply was not the case at all).

Yes, a lot of you readers know about that post, and many of you would probably prefer I don’t bring it up, but the problem is that many people just don’t know what happened. Also, the result of him deleting his post is that people don’t get any chance to verify these false claims of indecency which were aimed towards him for writing a very sensible post.

What I can say, is that the post did not use any distasteful language, he was not rude and did not single anyone out or blame anyone, he just said some really sensible things which happened to annoy a certain few members of our community.

I think the critical part which made people react irrationally to his prose, ran something like:

Maybe if we spent a little less time on outreach, and a little bit more on development…

And it went on from there; he was basically arguing that our efforts on sustaining programs such as OPW are not a part of our mission, and that maybe our attention would be better spent writing excellent software (I’ll be happy if the post re-appears so people can read it in its entirety, as I don’t have a copy anymore).

I think, given the turn of events, this recent post by Philip requesting to be removed was a final attempt to try to do something good for a community that just keeps telling him that his views are wrong, dirty, and need to be censored, i.e. he got a lot of flak from the community at large for absolutely no good reason at all – if anyone needs to be ashamed, it’s us, as a community, for failing him.

I’m looking at you

It’s generally bad form to name people in public; however, the wider GNOME community needs to know what is really going on in this case, and they will not have the evidence to judge for themselves without references. That said, these are only a couple of excerpts from the circus of public shaming which followed Philip’s perfectly reasonable blog post.

 

Paolo Borelli makes a response to someone who quoted Philip’s blog in a positive light on a public mailing list, and he goes out of his way to mention his public opposition to Philip speaking his mind:

However you also started off by citing Philip’s blog post and honestly I found that post wrong and disturbing

Taken in the context of the mail thread, it looks as though the original poster is to be considered lucky to be taken seriously in any measure, just for referring to said blog post which puts a little scrutiny on our GNOME identity as an outreach foundation.

Paolo, really? I would never have expected this behavior; do you really feel it’s necessary to call Philip’s call to reevaluate our position on these matters “wrong and disturbing”?

We have a long history you and I, I thought I knew you better than that.

 

Alberto Ruiz takes it a step further, again taking a public stance against Philip:

“I’ve been asked to remove your blog by several people and I’ve reached the conclusion that it would be a really bad idea because
it would set the wrong precedence and it would shift the discussion to the wrong topic (censorship yadda yadda). Questioning OPW should be allowed.

The problem with your post is that if not questioned by other people (as many have done already) it would send the wrong message to the public and prospect GSoC, OPW and general contributors. Your blog was the wrong place to question and your wording makes it clear that you have misunderstandings about how the community works.”

Alberto, I’m disappointed in you. There is no censorship on Planet GNOME, you know that, I know that, and aside from one silly “upskirt” incident in the history of Planet GNOME, this has never caused any issues.

Moreover, it is simply not your call, or anyone’s call to make, to decide that a long-time member of our community’s politely and concisely formed opinion be censored from Planet GNOME just because it disagrees with what some of the other members think.

It is not your call to say that people should not be questioning things on Planet GNOME, especially since that is EXACTLY where it will be heard. Have you considered that he takes this issue very seriously and has decided, as is his right, to raise the matter for open public discussion? Public discussion on the direction of GNOME is what we do in GNOME; we are the foundation and contributors, and public discussion needs to happen about critical matters in order for us, the public, to make good decisions about the future of GNOME.

 

Finally, Emmanuele Bassi. I know his recent post was pretty “out there”, but anyone would expect him to be frustrated after the treatment this community has given him; the public shaming and insolence this community has shown him by taking such an opposed stance against his expressing himself would be enough to drive anyone nuts.

Don’t you think, though, that his post was a last-effort attempt to be heard and be a positive influence for change in GNOME ?

Do you really think this immediate response to a frustrated blog post was the correct way to diffuse the situation ?

Really, we should do better to protect our own. Philip obviously had a rough time in the last couple of months; his blog post was not an excuse to quickly sweep him under the rug, but a challenge to call people to action and actually openly discuss change.

If we don’t have people like Philip who are at least willing to fight for our ability to openly discuss things, then I fear the worst for this community in the long run.

Moral of the story, guys… please get a grip. I’m really not impressed with how people have responded to Philip this summer; it could equally have been any of you, and if you had something important to share, I would be equally disappointed if the community had so aggressively shouted you down.

And no, I was never a proponent of the CoC effort, but please guys at least try to remember the first rule: Assume that others mean well.

All the best.

 

Amendment

Today someone pointed out that since the original post at the end of May is missing, no one can form an opinion of their own. I did not have access to it at the time, but another commenter was kind enough to paste a copy:

Matthew gets that developers need good equipment.
 
Glade, Scaffolding (DevStudio), Scintilla & GtkSourceView, Devhelp, gnome-build and Anjuta also got it earlier.
 
I think with GNOME’s focus on this and a bit less on woman outreach programs; this year we could make a difference.
 
Luckily our code is that good that it can be reused for what is relevant today.
 
It’s all about what we focus on.
 
Can we please now go back at making software?
 
ps. I’ve been diving in Croatia. Trogir. It was fantastic. I have some new reserves in my mental system.
 
ps. Although we’re very different I have a lot of respect for your point of view, Matthew.