BuildStream at ApacheCon 2022 New Orleans

About ApacheCon

This was my first real conference since worldwide panic spread a few years ago, and it is hard to overstate how exciting it was to be in New Orleans and meet real people again, especially since this was an opportunity to meet the Apache contributor base for the first time.

The conference took place at the Sheraton hotel on Canal Street, where we had access to a discounted rate for rooms, a large room where everyone could attend the keynotes, a coffee/snack area outside where the booths were located, and six small conference rooms on the 8th floor for the various tracks.

The talks I attended were refreshingly outside of my comfort zone, such as big data workflow scheduling with DolphinScheduler, and a talk about Apache Toree, a Jupyter kernel for Scala / Apache Spark (a framework for data analysis and visualization). My personal favorite was a talk about SDAP (the Science Data Analytics Platform), a platform built to support Earth Science use cases; it explored some of the implementation details and use cases of an engine which can search efficiently through Earth related data (collected from satellites and various sensors) which can be freely obtained from NASA. In case you’re wondering, unfortunately actually leveraging this data requires that you download and house the (immense) data which you intend to analyze, either in the elastic cloud or on on-premise clusters.

In the evenings, the foundation prepared a speakers reception on Tuesday and a general attendee reception on Wednesday, and the Gradle folks also organized a bonus event. This provided a nice balance between socializing with new people and exploring the music scene in New Orleans.

The Spotted Cat jazz club on Frenchmen street (highly recommended)

Overall this felt very much like a GUADEC inasmuch as it is a small(ish) tightly knit community, leaving us with much quality time for networking and bonding with new fellow hackers.

BuildStream talk

On Tuesday I gave my BuildStream presentation, which was an introduction to BuildStream, along with a problem statement about working with build systems like Maven which love to download dependencies from the internet at build time. This bore some fruit as I was able to discuss the problem at length with people more familiar with java/mvn later on over dinner.

But… all of that is not the point of this blog post.

The point is that directly after my talk, Sander (a long time contributor and stakeholder) and I scurried off into a corner and finally pounded out a piece of code we had been envisioning for years.

BuildStream / Bazel vision

Good integration with Bazel has always been an important goal for us, especially since we got involved with the remote execution space and, as such, use the same Remote Execution API (or REAPI).

While BuildStream generally excels in build correctness and robustness, Bazel is the state of the art in build optimization. Bazel achieves this by maximizing parallelism of discrete compilation units, coupled with caching the results of these fine grained compilation units. Of course this requires that a project adopt/support Bazel as a build system, similar to how a project may support being built with cmake or autotools.

Our perspective, and I hope I’m being fair here, has been that:

  • Bazel requires significant configuration and setup, from the perspective of one who just wants to download a project and build it for the first time.
  • Bazel will build against your system toolchain by default, but provides some features for executing build actions inside containers, which is slow due to spawning separate containers for each action, and is still considered an experimental feature.
  • BuildStream should be able to provide a safe harness for running Bazel without too many worries about configuring build environments, as well as provide strict control over what dependencies (toolchain) Bazel can see.
  • A marriage of both tools in this manner should provide a powerful and reliable setup for organizations which are concerned with:
    • Building large mono repos with Bazel
    • Building the base toolchain/runtime and operating system on which their large Bazel built code bases are intended to be built with and run

While running Bazel inside BuildStream has always been possible and does ensure determinism of all build inputs (including Bazel itself), this has never been attractive since Bazel would not have access to remote execution features and more importantly it would not have access to the CAS where all of the discrete compilation units are cached.

And this is the crux of the issue; in order to have a performant solution for running Bazel in BuildStream, we need to have the option for tools in the sandbox to have access to some services, such as the CAS service and remote execution services.

And so this leads me to our proof of concept.

BuildStream / recc proof of concept

At the conference we initially tried this with Bazel, but I was unable to successfully configure Bazel to enable remote caching, so we went for an equivalent test using the Remote Execution Caching Compiler (recc). Recc is essentially like ccache, but uses the CAS service for caching compilation results.

After some initial brainstorming, some hacking, and some advice from Jürg, we were able to come up with this one line patch all in a matter of hours.

As visualized in the following approximate diagram, the patch for now simply and unconditionally exposes a unix domain socket at a well known location in the build environment, allowing tooling in the sandbox to access the CAS service.

Aside from the simple patch, we put together the buildstream-recc-demo repository to showcase recc in action. The configuration for this in the sandbox is dead simple, as evidenced by the example.bst; building the example simply results in running the following command twice:

recc g++ -I. -c -o hello-time.o hello-time.cc
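For a rough idea of what such an element looks like, here is a sketch of a manual element which wraps its compile commands with recc. This is not the actual example.bst from the demo repository; the base element, source, socket path and environment values below are assumptions, with recc configured through its usual RECC_* environment variables:

kind: manual
description: Sketch of a recc-enabled compile (not the real example.bst)

depends:
- filename: base.bst
  type: build

sources:
- kind: local
  path: files/hello

environment:
  # Point recc at the CAS socket exposed inside the sandbox (path is hypothetical)
  RECC_CAS_SERVER: unix:/run/casd/casd.sock
  # Only use the cache, do not attempt remote execution
  RECC_CACHE_ONLY: "1"

config:
  build-commands:
  - recc g++ -I. -c -o hello-time.o hello-time.cc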

One can verify that this is working by inspecting the BuildStream logs of the element, and indeed the second compile (or any subsequent compile) results in the expected cache hit.

By itself, using recc in this context to build large source bases like WebKit should provide significant performance benefits, especially if the cache results can be pushed to a central / shared CAS and shared with peers.

Next steps

In order to bring this proof of concept to fruition, we’ll have to consider topics such as:

  • How to achieve the same thing in a remote execution context
    • Possibly this needs to be a feature for a remote worker to implement, and as such at first, BuildStream would probably only support the BuildGrid remote execution service implementation for this feature
  • Consider the logistics of sharing cached data with peers using remote CAS services
    • Currently this demo gives access to the local CAS service
  • Consider possible side effects when multiple different projects are using the same CAS, and whether we need to delegate any trust to the tools we run in our sandboxes:
    • Do tools like recc or bazel consider the entire build environment digest when composing the actions used to index build results / cache hits ?
    • If not, could these tools be extended to be informed of a digest which uniquely identifies the build environment ?
    • Could BuildStream or BuildBox act as a middle man in this transaction, such that we could automatically augment the actions used to index cached data to include a unique digest of the build environment ?

The overarching story here is of course more work than our one line patch, but put in perspective it is not an immense amount of work. We’re hoping that our proof of concept is enough to rekindle the interest required to actually push us over the line here.

In closing

To summarize, I think it was great to meet a completely different community at ApacheCon, and I would recommend that anyone involved in FLOSS diversify the projects and conferences they are involved in, both as a learning experience and because, in a sense, we are all working towards very similar common goods.

Thanks to Codethink for sponsoring the BuildStream project’s presence at the conference. With the 2.0 release very close on the horizon, it should be an exciting time for the project.

 

BuildStream 2 news

After a long time without any BuildStream updates, I’m proud to finally announce that the unstable BuildStream 2 development phase is coming to a close.

As of the 1.95.0 beta release, we have now made the promise to remain stable and refrain from introducing any more API breaking changes.

At this time, we are encouraging people to try migrating to the BuildStream 2 API and to inform us of any issues you have via our issue tracker.

Installing BuildStream 2

At this time we recommend installing BuildStream 2 into a Python venv; this allows it to be installed in parallel with BuildStream 1 in case you need both on the same system.

First you need to install BuildBox using these instructions, and then you can install BuildStream 2 from source following these commands:

# Install BuildStream 2 in a virtual environment
# (run these from a local clone of the BuildStream repository)
cd buildstream
git checkout 1.95.0
python3 -m venv ~/buildstream2
~/buildstream2/bin/pip install .

# Activate the virtual environment to use `bst`
. ~/buildstream2/bin/activate

Porting projects to BuildStream 2

In order to assist in the effort of porting your projects, we have compiled a guide which should be helpful in updating your BuildStream YAML files.

The guide does not cover the python API. In case you have custom plugins which need to be ported to the new API, you can observe the API reference here and you are encouraged to reach out to us on our project mailing list where we will be eager to answer any questions and help out in the porting effort.

What’s new in BuildStream 2 ?

An exhaustive list would be very long, so I’ll try to summarize the main changes as succinctly as possible.

Remote Execution

BuildStream now supports building remotely using the Remote Execution API (REAPI), which is a standard protocol used by various applications and services such as Bazel, recc, BuildGrid, BuildBarn and Buildfarm.

As the specification allows for variations in setup and implementation of remote execution clusters, there are some limitations on what can be used by BuildStream documented here.
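To give a rough idea of what this looks like from the user’s side, a remote execution cluster is pointed at from the project or user configuration along these lines (a minimal sketch; the endpoint URLs are placeholders, and the full set of options, including TLS credentials and instance names, is covered in the documentation):

remote-execution:
  execution-service:
    url: http://remote-exec.example.com:50051
  storage-service:
    url: http://cas.example.com:11001
  action-cache-service:
    url: http://cache.example.com:50052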

Load time performance

In order to support large projects (in the 100,000 elements range), the loading codepaths have undergone a massive overhaul. Various recursive algorithms have been eliminated in favor of iterative ones, and cython is used for the hot code paths.

This makes large projects manageable, and noticeably improves performance and responsiveness on the command line.

Artifact and Source caching

BuildStream now exclusively uses implementations of the Content Addressable Storage (CAS) and Remote Asset services, which are part of the Remote Execution API, to store artifacts and sources in remote caches. As such, we no longer ship our own custom implementation of an artifact server, and currently recommend using BuildBarn.

Supported implementations of cache servers are documented here.

In addition to caching and sharing built artifacts on artifact servers, BuildStream now also caches source code in CAS servers. This implies a performance degradation on the initial build while BuildStream populates the caches, but should improve performance under regular operation, assuming that you have persistent local caches, or decent bandwidth between your build machines and CAS storage services.

BuildBox Sandboxing

Instead of using bubblewrap directly, BuildStream now uses the BuildBox abstraction for its sandboxing implementation, which will use bubblewrap on Linux platforms.

On Linux build hosts, BuildBox will use a fuse filesystem to expose files in CAS directly to the containerized build environment. This results in some optimization of the build environment and also avoids the hard limit on hardlinks which we sometimes encounter with BuildStream 1.

BuildBox is also used for the worker implementation when building remotely using the BuildGrid remote execution cluster implementation.

Cache Key Stability

BuildStream 2 comes with a new promise to never inadvertently change artifact cache keys in future releases.

This is helpful to reduce unnecessary rebuilds and potential resulting validation of artifacts when upgrading BuildStream to new versions.

In the future, it is always possible that a new feature might require a change to the artifact content and format. Should such a scenario arise, our policy is to support the old format and make sure that new features which require artifact changes be opt-in.

Caching failed builds

With BuildStream 2, it is now possible to preserve and share the state of a failed build artifact.

This should be useful for local debugging of builds which have failed in CI or on another user’s machine.

YAML Format Enhancements

A variety of enhancements have been made to the YAML format:

  • Variable expansion is now performed unconditionally on all YAML. This means it is now possible to use variable substitution when declaring sources as well as elements.
  • The toplevel-root and project-root automatic variables have been added, allowing some additional flexibility for specifying sources which must be obtained in project relative locations on the build host.
  • Element names are more clearly specified, and it is now possible to refer to elements across multiple junction/project boundaries, both on the command line and also as dependencies.
  • It is now possible to override an element in a subproject, including the ability to override junction definitions in subprojects. This can be useful to resolve conflicts where multiple projects depend on the same junction (diamond shaped project dependency graphs). This can also be useful to override how a single element is built in a subproject, or cause subproject elements to depend on local dependencies.
  • The core link element has been introduced, which simply acts as a symbolic link to another element in the same project or in a subproject.
  • New errors are produced when loading multiple junctions to the same project; these errors can be explicitly avoided using the duplicates or internal keywords in your project.conf.
  • ScriptElement and BuildElement implementations now support extended configuration in dependencies, allowing one to stage dependencies in alternative locations at build time.
  • Loading pip plugins now supports versioning constraints; this offers a more reliable method for specifying what plugins your project depends on when loading plugins from pip packages installed in the host python environment.
  • Plugins can now be loaded across project boundaries using the new junction plugin origin. This is now the recommended way to load plugins, and plugin packages such as buildstream-plugins are accessible using this method. An example of creating a junction to a plugin package and using plugins from that package is included in the porting guide.
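As a rough illustration of that last point, loading element plugins through a junction boils down to something along these lines in project.conf, assuming a junction element (named plugins-junction.bst here purely for illustration) whose source points at the buildstream-plugins repository; the porting guide contains a complete, authoritative example:

plugins:
- origin: junction
  junction: plugins-junction.bst
  elements:
  - autotools
  - cmake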

New Plugin Locations

Plugins which used to be a part of the BuildStream core, along with some additional plugins, have been migrated to the buildstream-plugins repository. A list of migrated core plugins and their new homes can be found in the porting guide.

Porting Progress

Abderrahim has been maintaining the BuildStream 2 ports of freedesktop-sdk and gnome-build-meta, and they are looking close to complete, as can be observed in the epic issue.

Special Thanks

We’ve come a long way in the last years, and I’d like to thank:

  • All of our contributors individually, who are listed in our beta release announcement.
  • Our downstream projects and users who have continued to have faith in the project.
  • Our stakeholder Bloomberg, who funded a good portion of the 2.0 development, took us into the Remote Execution space and also organized the first Build Meetup events.
  • Our stakeholder Codethink, who continues to support and fund the project since founding the BuildStream project in 2016, organized several hackfests over the years, organized the 2021 chapter of the Build Meetup during lockdowns, and has granted us the opportunity to tie up the loose ends and bring BuildStream 2.0 to fruition.

BuildStream news and 2.0 planning

Hi all,

It has been a very long time since my last BuildStream related post, and there has been a huge amount of movement since then, including the initial inception of our website, the 1.2 release, the beginnings of BuildGrid, and several hackfests: one in Manchester hosted by Codethink, and another one in London followed by the Build Meetup, both hosted by Bloomberg at their London office.

BuildStream Gathering (BeaverCon) January 2019 group photo

As a New Year’s resolution, I should make a commitment to blogging about upcoming hackfests as soon as we decide on dates; sorry I have been failing at this !

Build Meetup

While the other BuildStream related hackfests were very eventful (each of them with at least 20 attendees and jam packed with interesting technical and planning sessions), the Build Meetup in London is worth a special mention as it marks the beginnings of a collaborative community of projects in the build space.

Developers of Bazel, BuildStream & BuildGrid and Buildbarn (and more!) united in London to discuss our wide variety of requirements for the common remote execution API specification, had various discussions on how we can improve performance and interoperability of our tooling in distributed build scenarios, and most importantly, had an opportunity to meet in person and drink beer together !

BuildStream 2.0 planning

As some may have noticed, we did not release a BuildStream 1.4 in the expected timeframe, and the big news is that we won’t be doing so.

After much deliberation and hair pulling, we have decided to focus on a new BuildStream 2.0 API.

Why ?

The reasons for this are multifold, but mainly:

  • The project has received much more interest and funding than anticipated, and as such we currently have a staggering number of contributors working full time on BuildStream & BuildGrid combined. Considering the magnitude of features and changes we want to make, committing to a 6 month stable release cycle for this development is causing too much friction.
  • A majority of contributors would prefer to refine and perfect the command line interface rather than commit to the interface we initially had, and indeed, after adding new functionality, the interface will be more comprehensive and intuitive. These changes will not, however, be backwards compatible.
  • With a long term plan to make it easier for external plugins to be developed, along with a plan to split out plugins into more sensible separate repositories, we also anticipate some API breakages to the format related to plugin loading which cannot be avoided.

To continue with the 1.x API while making the changes we want to make would be dishonest and just introduce unexpected breakages.

What are we planning ?

Some highlights of what we are working on for future BuildStream 2.x include:

Remote execution support

A huge amount of the ongoing work is towards supporting distributed build scenarios and interoperability with other tooling on server clusters which implement the remote execution API.

This means:

  • Element builds which occur on distributed worker machines
  • Ability to implement a Bazel element which operates on the same build servers and share the same caching mechanics
  • Ability to use optimizations such as RECC which interact with the same servers and caching mechanics

Optimizing for large scale projects

While performance of BuildStream 1.x is fairly acceptable when working with smaller projects with 500-1000 elements, it falls flat on its face when processing larger projects with ~50K elements or more.

New commands for interacting with artifacts

For practical purposes, we have been getting along well enough without the ability to list the artifacts in your cache or examine details about specific artifacts, but the overall experience will be greatly enriched with the ability to specify and manipulate artifacts (rather than elements) directly on the command line.

Stability of artifact cache keys

This is a part of BuildStream which has never been stable (yet), meaning that any minor point upgrade (e.g. from 1.0 -> 1.2) would result in a necessary rebuild of all of a given project’s artifacts.

While this is not exactly on the roadmap yet, there is a growing interest in all contributing parties to make these keys stable, so I have an expectation that this will stabilize before any 2.0 release.

Compatibility concerns

BuildStream 2.x will be incompatible with 1.x; as such we will need to guarantee safety to users of either 1.x or 2.x, probably in the form of ensuring parallel installability of BuildStream itself, ensuring that the correct plugins are always loaded for the correct version, and ensuring that 1.x and 2.x clients do not cross contaminate each other in any way.

While there are no clear commitments about just how much the 2.x API will resemble the 1.x API, we are committed to providing a clear migration guide for projects which need to migrate, and we have a good idea what is going to change.

  • The command line interface will change a lot. We consider this to be the least painful for users to adapt to, and as there are a lot of things which we want to enhance, we expect to take a lot of liberties in changing the command line interface.
  • The plugin-facing Python API will change; we cannot be sure how much. One important change is that plugins will not interact directly with the filesystem but instead must use an abstract API. There are not that many plugins out in the wild, so we expect that this migration will not be very painful to end users.
  • The YAML format will change the least. We recognize that this is the most painful API surface to change and as such we will try our best to not change it too much. Not only will we ensure that a migration process is well documented, but we will make efforts to ensure that such migrations require minimal effort.

We have also made some commitment to testing BuildStream 2.x on a continuous basis and ensuring that we can build the freedesktop-sdk project with the new BuildStream at all times.

Schedule

We have not committed to any date for the 2.0 release as of yet, and we explicitly do not have any intention to commit to a date until such a time that we are comfortable with the new API and that our distributed build story has sufficiently stabilized. I can say with confidence that this will certainly not be in 2019.

We will be releasing development snapshots regularly while working towards 2.0, and these will be released in the 1.90.x range.

In the meantime, as far as I know there are no features which are desperately needed by users at this time, and as long as bugs are identified and fixed in the 1.2.x line, we will continue to issue bugfix releases.

We expect that this will provide the safety required for users while also granting us the freedom we need to develop towards the new 2.0.

FOSSASIA 2019

Often, a conference is an opportunity to visit exotic and distant locations like Europe, and maybe we forget to check out the exciting conferences closer to home.

In this spirit, I will be giving another BuildStream talk at FOSSASIA this year. This conference has been running strong since 2009, takes place in Singapore and lasts four days including talks, workshops, an exhibition and a hackathon.

I’m pretty excited to finally see this for myself and would like to encourage people to join me there !

BuildStream Hackfest and FOSDEM

Hi all !

In case you happen to be here at FOSDEM, we have a BuildStream talk in the distributions track tomorrow at noon.

BuildStream Hackfest

I also wanted to sum up a last minute BuildStream hackfest which occurred in Manchester just a week ago. Bloomberg sent some of their Developer Experience engineering team members over to the Codethink office in Manchester where the whole BuildStream team was present, and we split up into groups to plan upcoming coding sprints, land some outstanding work and fix some bugs.

Here are some of the things we’ve been talking about…

Caching the build tree

Currently we only cache the output in a given artifact, but caching the build tree itself, including the sources and intermediate object files, can bring some advantages to the table.

Some things which come to mind:

  • When opening a workspace, if the element has already been built, it becomes interesting to check out the whole build tree instead of only the sources. This can turn your first workspace build into an incremental build.
  • Reconstruct build environment to test locally. In case you encounter failures in remote CI runners, it’s interesting to be able to reproduce the shell environment locally in order to debug the build.

Benchmarking initiative

Angelos Evripiotis has started an initiative to get a benchmarking process in place to measure BuildStream’s performance in various aspects, such as load times for complex projects and staging time for preparing a build sandbox, so we can keep an eye on performance improvements and regressions over time.

We talked a bit about the implementation details of this, I especially like that we can use BuildStream’s logging output to drive benchmarks as this eliminates the observer side effects.

Build Farm

We talked about farming out tasks and discussed the possibility of leveraging the remote execution framework which the Bazel folks are working on. We noticed that if we could leverage such a framework to farm out the execution of a command against a specific filesystem tree, we could essentially use the same protocol to farm out a command to a worker with a specified CPU architecture, solving the cross build problem at the same time.

Incidentally, we also spent some time with some of the Bazel team here at FOSDEM, and seeing as there are some interesting opportunities for these projects to complement each other, we will try to see about organizing a hackfest for these two projects.

Incremental Builds

Chandan Singh finally landed his patches which mount the workspaces into the build environment at build time instead of copying them in; this is the first big step towards giving us incremental builds for workspaced elements.

Besides this, we need to handle the case where configure commands should not be run after they have already been run once, with an option to reset or force them to be run again (i.e. avoid recreating config.h and rebuilding everything anyway); there is also another subtle bug to investigate here.

But more than this, we also got to talking about a possible workflow where one runs CI on the latest of everything in a given mainline, and runs the builds incrementally, without workspaces. This would require some client state which would recall what was used in the last build in order to feed in the build tree for the next build and then apply/patch the differences to that tree, causing an incremental build. This can be very interesting when you have highly active projects which want to run CI on every commit (or branch merge), and then scale that to thousands of modules.

Reproducibility Testing

BuildStream strives for bit-for-bit reproducibility in builds, but it should also have some minimal support for allowing one to easily check just how reproducible their builds are.

Jim MacArthur has been following up on this.

 

In Summary

The hackfest was a great success but was organized on rather short notice; we will try to organize the next one well in advance and give more people a chance to participate next time around.

 

And… here is a group photo !

BuildStream Hackfest Group Photo

 

If you’re at FOSDEM… then see you in a bit at the GNOME beer event !

GUADEC & BuildStream

After a much needed 2 week vacation following GUADEC, I’m finally getting around to writing up a GUADEC post.

GUADEC

Great venue, conveniently located close to downtown. I feel like it was a smaller event this year compared to other years that I have attended (which are not many); still, I enjoyed most, if not all, of the talks I went to.

Some highlights for me were attending Juan’s Glade talk; he always does something fun with the presentation, and he creates the presentation itself using Glade, which is pretty cool. It would always be nice to get some funding for that project as it’s always had potential…

Both Christian and Michael’s talks on Builder and LibreOffice respectively were entertaining and easy to digest.

Probably the most interesting and important talk I attended was Richard Brown’s talk which was about software distribution models and related accountability (broadly speaking). It’s important that we hear the concerns of those who have been maintaining traditional distros, so that we are prepared for the challenges and responsibilities that come with distributing binary stacks (Flatpak runtimes and apps).

My deepest regret is that I missed out on all of Carlos Garnacho’s talks. Here we are, all concerned with the one talk we have to give, and meanwhile Carlos is so generous that he is giving three talks, all without breaking a sweat (a round of applause please !).

First and foremost to me, GUADEC is an opportunity to connect in person with the human backend component of the irc personas we interact with on a day to day basis and this is always a pleasure 🙂

So thanks to the local organization team for organizing a great event, and for manning the booths and doing all the stuff that makes this possible, and thanks to everyone who sponsored the event !

BuildStream

It has been a huge year so far for BuildStream and I’m very happy with the results. Last year we started out in November from scratch, and now, while our initial road map is not exactly complete, it’s in great shape. This is thanks in no small part to the efforts and insights of my team members who have been dealing with build system related problems for years before me (Sam Thursfield, Jürg Billeter, Jonathan Maw and more).

I threw together a last minute demo for my talk at GUADEC, showing that we can viably build Flatpak runtimes using BuildStream and that we can do it using the same BuildStream project (automatically generated from the current jhbuild modulesets) used to build the rest of GNOME. I messed up the demo a bit because it required a small feature that I wrote up the day before and that had a bug, but I was able to pull it together just in time and show gedit running in an SDK that causes GtkLabels to be upside down (this allowed me to also demo the workspaces and non-strict build modes features which I’ve been raving about).

The unconference workshop did not go quite as well as I had hoped; I think I just lacked the talent to coordinate around 15 people and get everyone to follow a step by step hands on experience, lesson learned I guess. Still, people did try building things and there was a lot of productive conversation which went on throughout the workshop.

GNOME & BuildStream

During the unconference days I had some discussions with some of the release team members regarding moving forward with building GNOME using BuildStream.

The consensus at this time is that we are going to look into a migration after releasing GNOME 3.26 (which is soon !), which means I will be working full time on this for… a while.

Some things we’re looking into (but not all at once):

  • Migrate to maintaining a BuildStream project instead of JHBuild modulesets
  • Ensure the GNOME BuildStream project can be used to build Flatpak runtimes
  • Migrate the Flatpak sdkbuilder machines to use BuildStream to build the GNOME SDK
  • Also build the freedesktop SDK using BuildStream as we can tie things in better this way
  • Finally replace the Yocto based freedesktop-sdk-base project with a runtime bootstrap built with BuildStream
  • Adapt GNOME Continuous to consume the same GNOME BuildStream project, and also have GNOME Continuous contribute to a common artifact share (so developers don’t have to build things which GNOME Continuous already built).

These are all things which were discussed but I’m only focusing on landing the first few of the above points before moving on with other enhancements.

Also, I have accepted an invitation to join the GNOME release team !

This probably won’t change my life dramatically as I expect I’ll be working on building and releasing GNOME anyway, just in a more “official” capacity.

Summary

This year we did a lot of great work, and GUADEC was a climax in our little BuildStream journey. All in all I enjoyed the event and am happy about where things are going with the project; it’s always exciting to show off a new toy and rewarding when people are interested.

Thanks to everyone who helped make both GUADEC and BuildStream happen !

Continuous BuildStream conversions of GNOME Modulesets

This is another followup post in the BuildStream series.

As I’ve learned from my recent writings on d-d-l, it’s hard to digest information when too much of it is presented at once, so today I’m going to try to talk about one thing only: A continuous migration path for moving away from the JHBuild modulesets to the YAML files which BuildStream understands.

Why continuous migrations ?

Good question !

It would be possible to just provide a conversion tool (which is indeed a part of the mechanics we’ve put in place), but that would leave GNOME in a situation where there would have to be a decided “flag day” on which the JHBuild based modulesets would be deprecated and everyone would have to move at once.

This continuous migration allows us to improve the user and maintainer story in BuildStream, and allows people to try building GNOME with BuildStream along the way until such a point that the GNOME release team and community is ready to take a plunge.

How does it work ?

The migrations are running on a dedicated server (which just happens to be one of the servers that Codethink contributed to build GNOME things on arm architectures last year) and there are a few moving parts, which should be quite transparent to the end user.

Here is a little diagram I threw together to illustrate the big picture:

About debootstrap-ostree

One of the beautiful things about BuildStream is that we do not ever allow host tools or dependencies to infiltrate the build sandbox. This means that in order to obtain the system dependencies required to build GNOME, we need to provide them from somewhere. Further, we need that source to be something revisioned so that builds can be reproducible.

The debootstrap-ostree script does this by running debian’s multistrap program to cross bootstrap the GNOME system dependencies (currently using debian testing) for the four architectures we are currently interested in: i386, x86_64, arm and aarch64.

This process simply commits the results of the multistrap into an ostree repository hosted on the same machine, and runs once daily.
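In rough strokes, each run boils down to something like the following (an illustrative sketch only; the actual script lives in the repository linked below, and the architecture, directory, repository path and branch names here are made up):

# Cross bootstrap the base system dependencies for one architecture
multistrap -a arm -d sysroot-arm -f multistrap.conf

# Commit the resulting sysroot to the hosted ostree repository
ostree --repo=/srv/repo commit --branch=base/arm sysroot-arm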

In case you are wondering “why use a debian base” instead of “my favorite distro foo”, the answer is just that time is precious and this was very easy to set up. With that said, longer term I think it will be interesting to set up a service for running builds against a multitude of baselines (as I tentatively outlined in this message).

  • Code Repository: https://gitlab.com/BuildStream/debootstrap-ostree
  • Hosted outputs: https://gnome7.codethink.co.uk/repo

About jhbuild2bst

The jhbuild2bst conversion program is a fork of upstream JHBuild itself; this was just the easiest way to do it, as we leverage all of JHBuild’s internal XML parsing logic.

Again because of the “no host dependency” nature of BuildStream, there is some extra static data which goes into the conversion. This data defines the base system that everything we convert depends on to build. Naturally this static data defines a BuildStream element which imports the debootstrap-ostree output, you can take a look at that here.

The automated conversion works by first checking out the latest gnome-modulesets-base repository and then running jhbuild2bst in that repository to get a conversion of the latest upstream jhbuild modulesets and the latest static data together.

This conversion runs every 10 minutes on gnome7.codethink.co.uk and results in a no-op in the case that the static data and upstream modulesets have not changed. The output is then committed to the master branch of this git repository.

  • jhbuild2bst fork: https://gitlab.com/BuildStream/jhbuild2bst
  • static data: https://gitlab.com/BuildStream/gnome-modulesets-base
  • converted: https://gnome7.codethink.co.uk/gnome-modulesets.git

The “tracked” branch

In addition to the automated conversions taking place above, which produce output in the master branch of the automated git repository, there is another task which runs bst track on the same output; this is a heavier workload and runs less frequently than the conversions themselves.

So what is “bst track” exactly ?

If you look at the GNOME modulesets, you will find that (except for in previous stable releases which refer to precise tarball versions), the currently developed version of GNOME specifies only branches for every module.

The raw conversions (available in the master branch of the converted repo) will only show the tracking branches taken from the jhbuild modulesets, but running “bst track” will modify the BuildStream project files in place with the latest source references for each respective tracking branch.

As BuildStream favors determinism, it will refuse to build anything without explicit references to an exact set of inputs; however, BuildStream will automate this part for the user with “bst track” so you don’t have to care. It is nevertheless important to us that the activity of changing your exact source references be an explicit one. For instance, BuildStream also now supports “bst build --track” so that you can track the newest source references on each branch and then build them, all in one session.
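In practice that boils down to something like this (using the same target element as the build instructions further below):

# Resolve the latest source references for the tracking branches
bst track meta-gnome-apps-tested.bst

# Or resolve the references and build, all in one session
bst build --track meta-gnome-apps-tested.bst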

While it is certainly possible to just grab the latest conversion from the master branch and build it with “bst build --track”, it is also interesting that we have a repository where it has been done automatically. This allows me to say that I have built exactly this <commit sha> from the tracked branch, and that if you build the same commit on your host, the results you get will be exactly the same.

To get the latest tracked sources:

git clone https://gnome7.codethink.co.uk/gnome-modulesets.git
cd gnome-modulesets
git checkout tracked

How can I try it ?

So we’ve finally reached the fun part !

First, install buildstream by following the “Installing BuildStream” instructions in the documentation.

Now just run these commands:

git clone https://gnome7.codethink.co.uk/gnome-modulesets.git
cd gnome-modulesets
bst build --track meta-gnome-apps-tested.bst

So the above will build or fail depending on the time of day, while building the tracked branch will build or fail entirely predictably. In the meantime I will be babysitting the converted builds and trying to get as much as I can building reliably.

Now lets say you want to run something that you’ve built:

bst shell --scope run gnome-sudoku.bst

[ ... BuildStream now prepares a sandbox and launches a shell ... ]

# gnome-sudoku

Warning: This will take around 15GB of disk space (better have 20) and a lot of processing (on my very powerful laptop, just over 2 hours to build around 130 modules). The artifact sharing feature which is almost complete at this time will help to at least reduce the initial amount of disk space needed to get off the ground. Eventually (and after we get dual cache key modes implemented), if there are enough builders contributing to the artifact cache, you will be able to get away with only building the modules you care about working on.

What Next ?

Flying pigs, naturally 🙂

Jokes aside, the next steps for this will be to leverage the variants feature in BuildStream to allow building multiple variants of GNOME pipelines. Most particularly, we will want to build only a subset of the GNOME modulesets against the freedesktop-sdk-images runtime as a base, and output a GNOME Flatpak runtime. This will be particularly challenging as we will have to integrate this into the continuous conversion, so it will most probably have to just take a list of which modules in the modulesets we want to include in the output GNOME Flatpak runtime.

Also, for the full GNOME builds, we will be looking into creating a single target which encompasses everything we want to roll into a desktop system and generate bootable images directly from the continuously converted GNOME builds.

 

BuildStream progress and booting images

It’s been a while since my initial post about BuildStream and I’m happy to see that it has generated some discussion.

Since then, here at Codethink we’ve made quite a bit of progress, but we still have some road to travel before we can purport to solve all of the world’s build problems.

So here is a report on the progress we’ve made in various areas.

Infrastructure

Last time I blogged, project infrastructure was still not entirely sorted. Now that this is in place and will remain fixed for the foreseeable future, I’ll provide the more permanent links:

The corresponding links in the previous post have been updated accordingly.

A note on GitLab

GitLab provides us with some irresistible features.

Aside from the Merge Request feature, which really does lower the barrier to contributing patches, the pre-merge CI pipelines allow us to ensure the test cases run prior to accepting any patch, and are a deciding factor in remaining hosted on GitLab for our git repository in lieu of creating a repo on git.gnome.org.

Another feature we get for free with GitLab’s pipelines is that we can automatically publish our documentation generated from source whenever a commit lands on the master branch; this was all very easy to set up.

User Experience

A significantly large portion of a software developer’s time is spent building and assembling software. Especially in tight debug and test loops, the seconds and menial tasks which stand between an added printf() statement and a running test to reproduce some issue can make the difference between tooling which is actually helpful to the user and tooling which just gets in the way of progress.

As such, we are paying attention to the user experience and have plans in place to ensure the most productive experience is possible.

Here are some of the advancements made since my first post.

Presentation

Some of the elements we considered important when viewing the output of a build include:

  • Separate and easy-to-find log files. Many build tools which use a serial build model will leave you with one huge log file to parse and figure out what happened, which is rather unwieldy to read. On the other hand, tools which exercise a parallelized build model can leave you searching through log directories for the build log you are looking for.
  • Constant feedback of what is being processed. When your build appears to hang for 30 minutes while all of your cores are being hammered down by a WebKit build, it’s nice to have some indication that a WebKit build is in fact taking place.
  • Consideration of terminal width. It’s desirable, however not always possible, to avoid wrapping lines in the output of any command line interface.
  • Colorful and aligned output. When viewing a lot of terminal output, it helps to use some colors to assist the user in identifying some output they may be searching for. Likewise, alignment and formatting of text helps the user to parse more information with less frustration.

Here is a short video showing what the output currently looks like:

I’m particularly happy about how the status bar remains at the bottom of the terminal output while the regular rolling log continues above. While the status bar tells us what is going on right now, the rolling log above provides detail about what tasks are being launched, how long they took to complete and in what log files you can find the detailed build log.

Note that colors and status lines are automatically disabled when BuildStream is not connected to a tty. Interactive mode is also automatically disabled in that case. However, using the bst --log-file /path/to/build.log … option will allow you to preserve the master build log of the entire session and also work in interactive mode.

Job Control

Advancements have also been made in the scheduler and how child tasks are managed.

When CTRL-C is pressed in interactive mode, all ongoing tasks are suspended and the user is presented with some choices:

  • continue – Carries on processing and queuing jobs
  • quit – Carries on with ongoing jobs but stops queuing new jobs
  • terminate – Terminates any ongoing jobs and exits immediately

Similarly, if an ongoing build fails in interactive mode, all ongoing tasks will be suspended while the user has the same choices, and an additional choice to debug the failing build in a shell.

Unfortunately, continuing with a “repaired” build is not possible at this time in the same way as it is with JHBuild; however, one day it should be possible in some developer mode where the user accepts that anything built further can only be used locally (any generated artifacts would be tainted as they don’t really correspond to their deterministic cache keys, and those artifacts should be rebuilt with a fix to the input bst file before they can be shared with peers).

New Element Plugins

For those who have not been following closely, BuildStream is a system for the modeling and running of build pipelines. While this is fully intended for software building and for decoupling the build problem from the distribution problem, from a more abstract perspective it can be said that BuildStream provides an environment for the modeling of pipelines, which consist of elements that perform mutations on filesystem data.

The full list of Element and Source plugins currently implemented in BuildStream can be found on the front page of the documentation.

As a part of my efforts to fully reproduce and provide a migration path for Baserock’s declarative definitions, some interesting new plugins were required.

meson

The meson element is a BuildElement for building modules which use meson as their build system.

Thanks goes to Patrick Griffis for filing a patch and adding this to BuildStream.
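Example (a minimal sketch rather than something lifted from a real project; the dependency and source URL here are placeholders):

kind: meson
description: An example module built with meson

depends:
- filename: gnu-toolchain.bst
  type: build

sources:
- kind: git
  url: https://example.com/some-module.git
  track: master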

compose

The compose plugin creates a composition of its own build dependencies. That is to say, its direct dependencies are not transitive, and depending on a compose element can only pull in the output artifact of the compose element itself and none of its dependencies (a brief explanation of build and runtime dependencies can be found here).

Basically this is just a way to collect the output of various dependencies and compress them into a single artifact, with some additional options.

For the purpose of categorizing the output of a set of dependencies, we have also introduced the split-rules public data which can be read off of the dependencies of a given element. The default split-rules are defined in BuildStream’s default project configuration, which can be overridden on a per project and also on a per element basis.

The compose element makes use of this public data in order to provide a more versatile composition, which is to say that it’s possible to create an artifact composition of all of the files which are captured by a given domain declared in your split-rules, for instance all of the files related to internationalization, or the debugging symbols.

Example:

kind: compose
description: Initramfs composition
depends:
- filename: gnu-toolchain.bst
  type: build
- filename: initramfs/initramfs-scripts.bst
  type: build

config:
  # Include only the minimum files for the runtime
  include:
  - runtime

The above example takes the gnu-toolchain.bst stack, which basically includes a base runtime with busybox, and adds some scripts to it. In this case the initramfs-scripts.bst element just imports an init and shutdown script required for the simplest of initramfs variations. The output is integrated, which is to say that things like ldconfig have run and their output has been collected in the output artifact. Further, any documentation, localization, debugging symbols etc. have been excluded from the composition.
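For the curious, the split-rules public data mentioned above is declared on an element along these lines (a sketch only; the domain name and pattern are made up for illustration, and the %{bindir} variable comes from the default project configuration):

public:
  bst:
    split-rules:
      extra-tools:
      - "%{bindir}/some-extra-tool"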

script

The script element is a simple but powerful element allowing one to stage more than one set of dependencies into the sandbox in different places.

One set of dependencies is used to stage the base runtime for the sandbox, and the other is used to stage the input which one intends to mutate in some way to produce output, to be collected in the regular /buildstream/install location.

Example:

kind: script
description: The compressed initramfs
depends:
- filename: initramfs/initramfs.bst
  type: build
- filename: foundation.bst
  type: build

config:
  base: foundation.bst
  input: initramfs/initramfs.bst

  commands:
  - mkdir -p %{install-root}/boot
  - (find . -print0 | cpio -0 -H newc -o) |
    gzip -c > %{install-root}/boot/initramfs.gz

This example element will take the foundation.bst stack element (which in this context, is just a base runtime with your regular shell tools available) and stage that at the root of the sandbox, providing the few tools and runtime we want to use. Then, still following the same initramfs example as above, the integrated composition element initramfs/initramfs.bst will be staged as input in the /buildstream/input directory of the build sandbox.

The script commands then simply use the provided base tools to create a gzipped cpio archive inside the /buildstream/install directory, which will be collected as the artifact produced by this script.

A bootable system

Another thing we’ve been doing since last we touched base is providing a migration path for Baserock users to use BuildStream.

This is a particularly interesting case for BuildStream because Baserock systems provide metadata to build a bootable system from the ground up, from a libc and compiler bootstrapping phase all the way up to the creation and deployment of a bootable image.

In this way we cover a lot of ground and can now demonstrate that bootstrapping, building and deploying a bootable image as a result is all possible using BuildStream.

The bootstrap

One of the more interesting parts is that the bootstrap remains almost unchanged, except for the key ingredient which is that we never allow any host tools to be present in the build sandbox.

The working theory is that whenever you bootstrap, you bootstrap from some tools. If you were ever able to obtain these tools in binary form installed on your computer, then it should also be possible to obtain them in the form of a chrootable sysroot (or “SDK”).

Anyone who has had a hand in maintaining a tree of build instructions which include a bootstrap phase from host tooling to first get off the ground (like buildroot or yocto) will have lived through the burden of vetting new distros as they roll out and patching builds so as to work “on the latest debian” or whatnot. This whole maintenance aspect is simply dropped from the equation by ensuring that host tools are not a variable in the equation but rather a constant.

Assembling the image

When it comes time to assemble an image to boot with, there are various options and it should not be such a big deal, right ? Well, unfortunately it’s not quite that simple.

It turns out that even in 2017, the options we have for assembling a bootable file system image as a regular unprivileged user are still quite limited.

Short of building qemu and using some virtualization, I’ve found that the only straightforward method of installing a boot loader is with syslinux on a vfat filesystem. There are some tools around for manipulating ext2 filesystems in user space, but these are largely unneeded anyway, as static device nodes and assigning file ownership to arbitrary uid/gids are mostly unnecessary when using modern init systems. In any case, recent versions of e2fsprogs provide an option for populating the filesystem at creation time.

Partitioning an image for your file systems is also possible as a regular user, but populating those partitions is a game of splicing filesystem images into their respective partition locations.

I am hopeful however that with some virtualization performed entirely inside the build sandbox, we can achieve a much better outcome using libguestfs. I’m not altogether clear on how supermin and libguestfs come together but from what I understand, this technology will allow us to mount any linux supported filesystem in userspace, and quite possibly without even having (or using) the supporting filesystem drivers in your host kernel.

That said, for now we settle for the poor man’s basic tooling and live with the restriction of having our boot partition be a vfat partition. The image can be created using the script element described above.

Example:

kind: script
description: Create a deployment of the GNOME system
depends:
- filename: gnome/gnome-system.bst
  type: build
- filename: deploy-base.bst
  type: build

variables:
  # Size of the disk to create
  #
  # Should be able to calculate this based on the space
  # used, however it must be a multiple of (63 * 512) bytes
  # as mtools wants a size that is divisible by sectors (512 bytes)
  # per track (63).
  boot-size: 252000K

  rootfs-size: 4G
  swap-size: 1G
  sector-size: 512

config:
  base: deploy-base.bst
  input: gnome/gnome-system.bst

  commands:

  - |
    # Split up the boot directory and the other
    #
    # This should be changed so that the /boot directory
    # is created separately.

    cd /buildstream
    mkdir -p /buildstream/sda1
    mkdir -p /buildstream/sda2

    mv %{build-root}/boot/* /buildstream/sda1
    mv %{build-root}/* /buildstream/sda2

  - |
    # Generate an fstab
    cat > /buildstream/sda2/etc/fstab << EOF
    /dev/sda2 / ext4 defaults,rw,noatime 0 1
    /dev/sda1 /boot vfat defaults 0 2
    /dev/sda3 none swap defaults 0 0
    EOF

  - |
    # Create the syslinux config
    mkdir -p /buildstream/sda1/syslinux
    cat > /buildstream/sda1/syslinux/syslinux.cfg << EOF
    PROMPT 0
    TIMEOUT 5

    ALLOWOPTIONS 1
    SERIAL 0 115200

    DEFAULT boot
    LABEL boot

    KERNEL /vmlinuz
    INITRD /initramfs.gz

    APPEND root=/dev/sda2 rootfstype=ext4 init=/sbin/init
    EOF

  - |
    # Create the vfat image
    truncate -s %{boot-size} /buildstream/sda1.img
    mkdosfs /buildstream/sda1.img

  - |
    # Copy all that stuff into the image
    mcopy -D s -i /buildstream/sda1.img -s /buildstream/sda1/* ::/

  - |
    # Install the bootloader on the image, it will load the
    # config file from inside the vfat boot partition
    syslinux --directory /syslinux/ /buildstream/sda1.img

  - |
    # Now create the root filesys on sda2
    truncate -s %{rootfs-size} /buildstream/sda2.img
    mkfs.ext4 -F -i 8192 /buildstream/sda2.img \
              -L root -d /buildstream/sda2

  - |
    # Create swap
    truncate -s %{swap-size} /buildstream/sda3.img
    mkswap -L swap /buildstream/sda3.img

  - |

    ########################################
    #        Partition the disk            #
    ########################################

    # First get the size in bytes
    sda1size=$(stat --printf="%s" /buildstream/sda1.img)
    sda2size=$(stat --printf="%s" /buildstream/sda2.img)
    sda3size=$(stat --printf="%s" /buildstream/sda3.img)

    # Now convert to sectors
    sda1sec=$(( ${sda1size} / %{sector-size} ))
    sda2sec=$(( ${sda2size} / %{sector-size} ))
    sda3sec=$(( ${sda3size} / %{sector-size} ))

    # Now get the offsets in sectors, first sector reserved
    # for MBR partition table
    sda1offset=1
    sda2offset=$(( ${sda1offset} + ${sda1sec} ))
    sda3offset=$(( ${sda2offset} + ${sda2sec} ))

    # Get total disk size in sectors and bytes
    sdasectors=$(( ${sda3offset} + ${sda3sec} ))
    sdabytes=$(( ${sdasectors} * %{sector-size} ))

    # Create the main disk and do the partitioning
    truncate -s ${sdabytes} /buildstream/sda.img
    parted -s /buildstream/sda.img mklabel msdos
    parted -s /buildstream/sda.img unit s mkpart primary fat32 \
       ${sda1offset} $(( ${sda1offset} + ${sda1sec} - 1 ))
    parted -s /buildstream/sda.img unit s mkpart primary ext2 \
       ${sda2offset} $(( ${sda2offset} + ${sda2sec} - 1 ))
    parted -s /buildstream/sda.img unit s mkpart primary \
       linux-swap \
       ${sda3offset} $(( ${sda3offset} + ${sda3sec} - 1 ))

    # Make partition 1 the boot partition
    parted -s /buildstream/sda.img set 1 boot on

    # Now splice the existing filesystems directly into the image
    dd if=/buildstream/sda1.img of=/buildstream/sda.img \
      ibs=%{sector-size} obs=%{sector-size} conv=notrunc \
      count=${sda1sec} seek=${sda1offset} 

    dd if=/buildstream/sda2.img of=/buildstream/sda.img \
      ibs=%{sector-size} obs=%{sector-size} conv=notrunc \
      count=${sda2sec} seek=${sda2offset} 

    dd if=/buildstream/sda3.img of=/buildstream/sda.img \
      ibs=%{sector-size} obs=%{sector-size} conv=notrunc \
      count=${sda3sec} seek=${sda3offset} 

  - |
    # Move the image where it will be collected
    mv /buildstream/sda.img %{install-root}
    chmod 0644 %{install-root}/sda.img

As you can see, the script element is a bit too verbose for this type of task. Following the pattern we have in place for the various build elements, we will soon be creating a reusable element with a few simpler parameters (filesystem types, image sizes, swap size, partition table type, etc.) for the purpose of whipping together bootable images.
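To give a rough idea of the direction, a simplified declaration might end up looking something like this (the element kind and the parameter names below are purely hypothetical at this stage, sketched only to illustrate the intent):

kind: image        # hypothetical element kind, not implemented yet
description: Create a bootable image of the GNOME system
depends:
- filename: gnome/gnome-system.bst
  type: build

config:
  partition-table: msdos
  boot:
    filesystem: vfat
    size: 252000K
  root:
    filesystem: ext4
    size: 4G
  swap:
    size: 1G

All of the mtools/parted/dd plumbing from the script above would then live inside the element implementation, rather than in every project that wants to produce an image.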

A booting demo

So for those who want to try this at home, we’ve prepared a complete system which can be built in the build-gnome branch of the buildstream-tests repository.

BuildStream now requires only Python 3.4 instead of 3.5, so this should hopefully be repeatable on most stable distros, e.g. debian jessie ships 3.4 (and also has the required ostree and bubblewrap available in the jessie-backports repository).

Here are some instructions to get you off the ground:

mkdir work
cd work

# Clone related repositories
git clone git@gitlab.com:BuildStream/buildstream.git
git clone git@gitlab.com:BuildStream/buildstream-tests.git

# Checkout build-gnome branch
cd buildstream-tests
git checkout build-gnome
cd ..

# Make sure you have ostree and bubblewrap installed from your
# distro first; you will also need pygobject for python 3.4

# Install BuildStream as a local user (does not require root).
# If this fails, it's because you lack some required dependency.
cd buildstream
pip install --user -e .
cd ..

# If you've gotten this far, then the following should also succeed
# after many hours of building.
cd buildstream-tests
bst build gnome/gnome-system-image.bst

# Once the above completes, there is an image which can be
# checked out from the artifact cache.
#
# The following command will create ~/VM/sda.img
#
bst checkout gnome/gnome-system-image.bst ~/VM/

# Now you can use your favorite VM to boot the image, e.g.:
qemu-system-x86_64 -m size=1024 ~/VM/sda.img

# GDM is currently disabled in this build. Once the VM boots
# you can log in as root (no password) and in that VM you can run:
systemctl start gdm

# And the above will bring up gdm and start the regular
# gnome-initial-setup tool.

With SSD storage and a powerful quad core CPU, this build completes in less than 5 hours (and makes pretty much full use of your machine’s resources all along the way). All told, the build will take around 40GB of disk space to build and store the results of around 500 modules. I would advise having at least 50GB of free space for this though, especially to account for some headroom in the final step.

Note: This is not an up to date GNOME system based on current modulesets yet, but rather a clone/conversion of the system I tried integrating last year using YBD. I will soon be starting on a more modular repository which builds only the components relevant to GNOME and follows the releases; for that I will need to open a dialogue and sort out some of the logistics.

Note on modularity

The mentioned buildstream-tests repository is one huge repository with build metadata to build everything from the compiler up to a desktop environment and some applications.

This is not what we ultimately want. First off, it’s obviously a huge mess to maintain, and you don’t want your project to be littered with build metadata that you’re not going to use (which is what happens when forking projects like buildroot). Secondly, even when you are concerned with building an entire operating system from scratch, we have found that without modularity, changes introduced in the lower levels of the stack tend to be pushed onto the stacks which consume those modules. This introduces much friction in the development and integration process for such projects.

Instead, we will eventually be using recursive pipeline elements to allow modular BuildStream projects to depend on one another, in such a way that a consuming project can always decide which version of the projects it depends on will be used.
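While the exact syntax for this has yet to be designed, one can imagine a consuming project declaring something along these lines (a purely hypothetical sketch; the element kind, repository URL and option names below do not exist yet):

kind: pipeline     # hypothetical element which nests another BuildStream project
description: Consume the GNOME core project as a modular dependency
config:
  # Which project to build, and which version of it this consumer wants
  project: https://gitlab.com/BuildStream/gnome-core.git
  ref: gnome-3-24
  # The element in that project whose output we depend on
  target: core/gnome-platform.bst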

 

Introducing BuildStream

Greetings fellow Gnomies 🙂

At Codethink over the past few months we’ve been revisiting our approach to assembly of whole-stack systems, particularly for embedded Linux systems and custom GNOME based systems.

We’ve taken inspiration, lessons and use-cases from various projects including OBS, Reproducible Builds, Yocto, Baserock, buildroot, Aboriginal, GNOME Continuous, JHBuild, Flatpak Builder and Android repo.

The result of our latest work is a new project, BuildStream, which aims initially to satisfy clear requirements from GNOME and Baserock, and grow from there. BuildStream uses some key GNOME plumbing (OSTree, bubblewrap) combined with declarative build-recipe description to provide sandboxed, repeatable builds of GNOME projects, while maintaining the flexibility and speed required by GNOME developers.

But before talking about BuildStream, let’s go over what this can mean for GNOME in 2017.

Centralization of build metadata

Currently we build GNOME in various ways, including JHBuild XML, Flatpak JSON for the GNOME Runtime and SDK, and GNOME Continuous JSON for CI.

We hope to centralize all of this so that the GNOME release team need only maintain one single set of core module metadata in one repository in the same declarative YAML format.

To this end, we will soon be maintaining a side branch of the GNOME release modulesets so people can try this out early.
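As a taste of what that could look like, a core module collection can be expressed as a simple stack of dependencies; the following is only a hypothetical sketch (the file names are made up for illustration):

kind: stack
description: The GNOME core modules
depends:
- core/glib.bst
- core/gobject-introspection.bst
- core/gtk+.bst
- core/gnome-shell.bst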

GNOME Developer Experience

JHBuild was certainly a huge improvement over the absolutely nothing that we had in place before it, but it is generally unreliable due to its reliance on host tooling and dependencies.

  • Newcomers can have a hard time getting off the ground and making sure they have satisfied the system dependencies.
  • Builds are not easily repeatable; you cannot easily build GNOME 3.6 today with a modern set of dependencies.
  • It is not easy to test core GNOME components like gnome-session or the gnome-initial-setup tool.

BuildStream nips these problems in the bud with an entirely no-host-tooling policy; in fact, you can potentially build all of GNOME on your computer without ever installing gcc. Instead, GNOME will be built on top of a deterministic runtime environment which closely resembles the freedesktop-sdk-images Flatpak runtime, but will also include the minimal requirements for booting the results in a VM.

Building in the Swarm

BuildStream supports artifact cache sharing so that authenticated users may upload successful build results to share with their peers. I doubt that we’ll want to share all artifacts between random users, but having GNOME Continuous upload to a common artifact cache will alleviate the pain of webkit rebuilds (unless you are hacking on webkit of course).
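In terms of user configuration, the idea is roughly that you point BuildStream at a shared artifact cache and, if you have push access, your successful local builds are uploaded for others to reuse. A minimal sketch of what that configuration might look like follows; the keys and URL here are illustrative, not a finalized format:

# ~/.config/buildstream.conf (illustrative sketch)
artifacts:
  url: https://artifacts.gnome.org/cache    # hypothetical shared cache
  push: true                                # only meaningful for authenticated users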

Flatpak / Flathub support

BuildStream will also be available as an alternative to flatpak-builder.

We will be providing an easy migration path and conversion script for Flatpak JSON which should be good enough for most if not all Flatpak app builds.

As the Flathub project develops, we will also work towards enabling submission of BuildStream metadata as an alternative to the Flatpak Builder JSON.

About BuildStream

Unlike many existing build systems, BuildStream treats building and distribution as separate problem spaces. Once you have built a stack in BuildStream, it should be trivial enough to deploy it as rpms, debian packages, a tarball/ostree SDK sysroot, as a flatpak, or as a bootable filesystem image which can be flashed to hardware or booted in a VM.

Our view is that build instructions, as structured metadata describing modules and how they integrate together, are a valuable body of work in their own right. As such, we should be able to apply that same body of work reliably to a variety of tasks; the BuildStream approach aims to prove this view while also offering a clean and simple user experience.

BuildStream is written in Python 3, has fairly good test coverage at this stage and is quite well documented.

BuildStream works well right now but still lacks some important features. Expect some churn over the following months before we reach a stable release and become a viable alternative for developing GNOME on your laptop/desktop.

Dependencies

Note that for the time being the OSTree requirement may be too recent for many users running currently stable distros (e.g. Debian Jessie). This is because we use the OSTree gobject introspection bindings, which require a version from August 2016. Given this hard requirement, it made little sense to also include special case support for older Python versions.

That said, if this transitional period proves too painful, we may decide to lower the Python requirement and just use the OSTree command line interface instead.

Build Pipelines

The BuildStream design, in a nutshell, is one abstract core which provides the mechanics for sandboxing build environments (currently using bubblewrap as our default sandbox), interpreting the YAML data model, and caching/sharing the build results in an artifact cache (implemented with ostree), surrounded by an ecosystem of “Element” plugins which process filesystem data as inputs and outputs.

In a very abstract view, one can say that BuildStream is like GStreamer, except that its extensible set of element plugins operates on filesystem data instead of audio and video buffers.
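To make the data model a little more concrete, an individual element is declared in a small YAML file. A typical build element looks roughly like the following sketch (the module, repository URL and dependency names here are only examples):

kind: autotools
description: Build the example library

depends:
- filename: base/base-system.bst
  type: build

sources:
- kind: git
  url: https://gitlab.gnome.org/GNOME/example-library.git
  track: master

The autotools element knows how to run the usual configure/make/make install steps inside the sandbox, staging its dependencies as the build environment and capturing the installed output as the element’s artifact.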

This should allow for a virtually unlimited variety of pipelines. Here are some sketches of the kinds of tasks we expect to accomplish using BuildStream:

  • Import a custom vendor tarball, build an updated graphics stack and BSP on top of that, and use a custom export element to deploy the build results as RPMs.
  • Import the base runtime ostree repository generated with Yocto, build the modules for the freedesktop-sdk-images repository on top of that runtime, and then deploy both the Runtime and SDK from that base, while filtering out the irrelevant SDK specific bits from the Runtime deployment.
  • Import an arbitrary but deterministic SDK (not your host!) to bootstrap a compiler, C runtime and Linux kernel, and deploy a bootable filesystem image.
  • Since build pipelines are modular and can be built recursively, a separate project/pipeline can consume the same base system we just built and extend it with a graphics stack.

A working demo

What follows are some instructions to try out BuildStream in its early stages.

For this demo we chose to build a popular application (gedit) in the flatpak style; however, this does not yet include an ostree export or generation of the metadata files which flatpak requires. The built gedit result cannot be run with flatpak without those steps, but it can be run in a `build-stream shell` environment.

# Installing BuildStream

# Before installing BuildStream you will need to first install
# Python >= 3.5, bubblewrap and OSTree >= v2016.8 as stated above.

# Create some arbitrary directory; don't use ~/buildstream because
# that's currently used by buildstream unless you override the
# configuration.
mkdir ~/testing
cd ~/testing
git clone https://gitlab.com/BuildStream/buildstream

# There are a handful of ways to install a python setuptools
# package; for developer checkouts we recommend that you first
# install pip and then run the following command.
#
# This should install build-stream and its pythonic dependencies
# into your user's local python environment without touching any
# system directories:
cd buildstream
pip install --user -e .

# Clone the demo project repository
cd ..
git clone https://gitlab.com/BuildStream/buildstream-tests
cd buildstream-tests

# Take a peek at the gedit.bst pipeline state (optional)
#
# This will tell us about all the dependencies in the pipeline,
# what their cache keys are and their local state (whether they
# are already cached or need to be built, or are waiting for a
# dependency to be built first).
build-stream show --deps all gedit.bst

# Now build gedit on top of a GNOME Platform & Sdk
build-stream build gedit.bst

#
# This will take some time and quite some disk space, building
# on SSD is highly recommended.
#
# Once the artifact cache sharing features are in place, this
# will take half the disk space it currently takes, in the majority
# of cases where BuildStream already has an artifact for the
# GNOME Platform and SDK bases.
#

# Ok, the build may have taken some time but I'm pretty sure it
# succeeded.
#
# Now we can launch a sandbox shell in an environment with the
# built gedit:
build-stream shell --scope run gedit.bst

# And launch gedit. Use the --standalone option to be sure we are
# running the gedit we just built, not a new window in the gedit
# installed on your host
gedit --standalone

Getting Involved

As you can see we’re currently hosted from my user account on gitlab, so our next step is to sort out proper hosting for the project, including a mailing list, bug tracking and a place to publish our documentation.

For right now, the best place to reach us and talk about BuildStream is in the #buildstream channel on GNOME IRC.

If you’d like to play around with the source, a quick read through the HACKING file will provide some orientation for getting started: coding conventions, building documentation and running tests.

 

With that, I hope you’ve all enjoyed FOSDEM and the beer that it entails 🙂

Flatpak builds available on a variety of architectures

Following the recent work we’ve been doing at Codethink in cooperation with Endless, we have for a while now had the capability of building flatpak SDKs and apps for ARM architectures, and consequently also for 32bit Intel architectures.

Alex has been tying this together and setting up the Intel build machines, and as of this week flatpak builds are available at sdk.gnome.org in a variety of arches and flavors.

Arches

The supported architectures are as follows:

  • x86_64, the 64bit Intel architecture, which is the only one we’ve been building until now
  • i386, the name we are using for 32bit Intel; this is i386 in name only, as the builds are in fact tuned for the i586 instruction set
  • aarch64, which speaks for itself; this is the 64bit ARM architecture
  • arm, which, like i386, is a generic name chosen to indicate 32bit ARM; this build is tuned for ARMv7-A processors and will make use of modern features such as vfpv3 and the neon simd. In other words, it will not run on older ARM architectures but should run well on modern ARM processors such as the Cortex-A7 featured in the Raspberry Pi 2.

Build Bots

The build bots are currently driven with this set of build scripts, which should be able to turn an Intel or ARM machine with a vanilla Ubuntu 16.04 or RHEL 7 installation into a flatpak build machine.

ARM and Intel builds run on a few distributed build machines and are then propagated to sdk.gnome.org for distribution.

The build machines also push notifications of build status to IRC. Currently we have it set up so that only failed builds are announced in #flatpak on freenode, while the fully verbose build notifications are announced in #flatpak-builds, also on freenode (so you are invited to lurk in #flatpak-builds if you would like to monitor how your favorite app or library is faring on the various build architectures).

 

Many thanks to all who were involved in making this happen, thanks to Alex for being exceptionally responsive and helpful on IRC, thanks to Endless for sponsoring the development of these build services and ARM support, thanks to Codethink for providing the build machines for the flatpak ARM builds and a special thanks to Dave Page for setting up the ARM build server infrastructure and filling in the IT knowledge gap where I fall short (specifically with things networking related).

Endless and Codethink team up for GNOME on ARM

A couple of months ago Alberto Ruiz issued a Call to Arms here on planet GNOME. This was met with an influx of eager contributions including a wide variety of server grade ARM hardware, rack space and sponsorship to help make GNOME on ARM a reality.

Codethink and Endless are excited to announce their collaboration in this initiative and it’s my pleasure to share the details with you today.

Codethink has donated 8 cartridges in our Moonshot server dedicated to building GNOME things for ARM architectures. These cartridges are AppliedMicro™ X-Gene™ with 8 ARMv8 64-bit cores at 2.4GHz, 64GB of DDR3 PC3L-12800 (1600 MHz) memory and 120GB M.2 solid state storage.

Endless has also enlisted our services for the development and deployment of a Flatpak (formerly known as xdg-app) build farm to run on these machines. The goal of this project is to build and distribute both stable and bleeding edge versions of GNOME application bundles and SDKs on a continuous basis.

And we are almost there!

After one spontaneous hackfest and a long list of patches, I am happy to add here that runtimes, SDKs and apps are building and running on both AArch64 and 32bit ARMv7-A architectures. As a side effect of this effort, Flatpak SDKs and applications can now also be built for 32bit Intel platforms (this may have already been possible, but not from an x86_64 build host).

The builds are already automated at this time and will shortly be finding their way to sdk.gnome.org.

In the interest of keeping everything repeatable, I have been maintaining a set of scripts which set up nightly builds on a build machine, and which can be configured to build various stable/unstable branches of the SDK and app repositories. These are capable of building our 4 supported target architectures: x86_64, i386, aarch64 and arm.

Currently they are only well tested with vanilla installations of Ubuntu 16.04 and are also known to work on Debian Stretch, but it should be trivial to support some modern RPM based distros as well.

Stay tuned for further updates on GNOME’s newfound build farm, brought to you by Endless and Codethink!