Rethinking the Linux distribution

Recently I’ve been thinking about how Linux desktop distributions work, and how applications are deployed. I have some ideas for how this could work in a completely different way.

I want to start with a small screencast showing how bundles work for an end user before getting into the technical details:

http://www.youtube.com/watch?v=qpRjSAD_3wU

Notice how easy it is to download and install apps? That's just one of the benefits of bundles. But before we get to bundles I want to take a step back and look at the problems with the current Linux distribution model.

Desktop distributions like Fedora or Ubuntu work remarkably well and have a lot of applications packaged. However, they are not as reliable as you would like. Most Linux users have experienced a package update that broke their system or made an application stop working, and typically this happens at the worst possible time. Linux users quickly learn to disable upgrades before leaving for an important presentation or meeting.

It's easy to blame this on a lack of testing and too many updates, but I think there are some deeper issues here that affect testability in general:

  • Every package installs into a single large “system” where everything interacts in unpredictable ways. For example, upgrading a library to fix one app might affect other applications.
  • Everyone is running a different set of bits:
    • The package set for each user is different, and as noted above, all packages interact, which can cause problems
    • Package installation modifies the system at runtime, including running scripts on the user's machine. This can give different results depending on package set, install order, hardware, etc.

Also, while it is very easy to install the latest packaged version of an application, other things are not so easy:

  • Installing applications not packaged for your distribution
  • Installing a newer version of an application that requires newer dependencies than what is in your current repositories
  • Keeping multiple versions of the same app installed
  • Keeping older versions of applications running as you update your overall system

So, how can we make this better? First, we make everyone run the same bits. (Note: from here on it gets pretty technical.)

I imagine a system where the OS is a well defined set of non-optional core libraries, services and apps. The OS is shipped as a read-only image that gets loopback mounted at / during early boot. So, not only does everyone have the same files, they are using (and testing) *exactly* the same bits. We can do semi-regular updates by replacing the image (keeping the old one for easy rollback), and we can do security hot-fixes by bind-mounting over individual files.
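To make the mount mechanics concrete, here is a minimal sketch in C, assuming the image is a squashfs that has already been attached to a loop device; the paths and the filesystem type are illustrative assumptions, not part of the design:

    #include <stdio.h>
    #include <sys/mount.h>

    int main(void)
    {
        /* Early boot (simplified): mount the read-only OS image.
           /dev/loop0 and /sysroot are illustrative; a real init would
           first attach the image file to the loop device. */
        if (mount("/dev/loop0", "/sysroot", "squashfs", MS_RDONLY, NULL) != 0) {
            perror("mount os image");
            return 1;
        }

        /* Security hot-fix: bind-mount a patched file over the original,
           leaving the rest of the immutable image untouched. */
        if (mount("/updates/libfoo.so.1", "/sysroot/usr/lib/libfoo.so.1",
                  NULL, MS_BIND, NULL) != 0) {
            perror("bind hot-fix");
            return 1;
        }
        return 0;
    }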

The core OS is separated into two distinct parts. Let's call them the platform and the desktop. The platform is a small set of highly ABI-stable and reliable core packages. It would have things like libc, coreutils, libz, libX11, libGL, dbus, libpng, Gtk+, Qt, and bash. Enough Unix to run typical scripts, plus some core libraries that are supportable and that lots of apps need.

The desktop part is a runtime that lets you work with the computer. It has the services needed to start and log into a desktop UI, including things like the login manager, window manager, desktop shell, and the core desktop utilities. By necessity there will be some libraries needed in the desktop that are not in the platform; these are considered internal details, so we don't ship header files for them or support third-party binaries using them.

Secondly, we untangle the application interactions.

All applications are shipped as bundles: single files that contain everything (libraries, files, tools, etc.) the application depends on, except that they can (optionally) depend on things from the OS platform. Bundles are self-contained, so they don't interact with other bundles that are installed. This means that if a bundle works once, it will keep working, as long as the platform honors its ABI stability guarantee. Running a new app is as easy as downloading and clicking a file. Installing it is as easy as dropping it in a known directory.

I’ve started writing a new bundle system, called Glick 2, replacing an old system I did called Glick. Here is how the core works:

When a bundle is started, it creates a new mount namespace, a kernel feature that lets different processes see different sets of mounts. Then the bundle file itself is mounted as a fuse filesystem at a well-known prefix, say /opt/bundle. This mount is only visible to the bundle process and its children. Then an executable from the bundle is started, which is compiled to read all its data and libraries from /opt/bundle. Another kernel feature called shared subtrees is used to make the new mount namespace share all non-bundle mounts in the system, so that if a USB stick is inserted after the bundle is started, it will still be visible inside the bundle.
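As a rough sketch of that launcher logic (the fuse step is stubbed out with a bind mount of an already-extracted directory, since the real code drives libfuse; the entry-point path is an assumption):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <sys/mount.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
        if (argc < 2)
            return 1;

        /* 1. New mount namespace: mounts made below are visible only
           to this process and its children. */
        if (unshare(CLONE_NEWNS) != 0) {
            perror("unshare");
            return 1;
        }

        /* 2. Shared subtrees: make every mount a recursive slave, so
           mounts that appear later on the system (a USB stick, say)
           propagate in, while the bundle mount does not leak out. */
        if (mount(NULL, "/", NULL, MS_SLAVE | MS_REC, NULL) != 0) {
            perror("make-rslave");
            return 1;
        }

        /* 3. Mount the bundle at the well-known prefix. The real
           launcher mounts the bundle file itself via fuse; as a
           stand-in we bind-mount an already-extracted directory. */
        if (mount(argv[1], "/opt/bundle", NULL, MS_BIND, NULL) != 0) {
            perror("mount bundle");
            return 1;
        }

        /* 4. Run the entry point, which is compiled to find its data
           and libraries under /opt/bundle. */
        execl("/opt/bundle/bin/app", "app", (char *) NULL);
        perror("execl");
        return 1;
    }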

There are some problematic aspects of bundles:

  • It's a lot of work to create a bundle, as you have to build all the dependencies of your app yourself
  • Shared libraries used by several apps are not shared, leading to higher memory use and more disk I/O
  • It's hard for bundles to interact with the system, for instance to expose icons and desktop files to the desktop, or to add a new mimetype

In Glick 2, all bundles are composed of a set of slices. When the bundle is mounted we see the union of all the slices as the file tree, but in the file itself they are distinct bits of data. When creating a bundle you build just your application, then pick existing library bundles for the dependencies and combine them into a final application bundle that the user sees.
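A bundle file format along these lines could look roughly like this (a hypothetical layout for illustration; the actual Glick 2 format may differ):

    #include <stdint.h>

    #define SLICE_FLAG_EXPORTED 0x1   /* slice is visible system-wide
                                         (used for exports, see below) */

    /* One slice: a self-contained piece of file tree, e.g. the app
       itself or one of its library dependencies. */
    struct slice_entry {
        uint64_t offset;        /* where the slice data starts in the file */
        uint64_t size;          /* length of the slice in bytes */
        uint8_t  checksum[32];  /* content hash, used for sharing (below) */
        uint32_t flags;
        uint32_t reserved;
    };

    /* The bundle is a header plus n_slices slice_entry records; the
       mounted file tree is the union of all the slices. */
    struct bundle_header {
        char     magic[8];      /* e.g. "GLICK2\0\0" */
        uint32_t version;
        uint32_t n_slices;
    };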

With this approach one can easily imagine a whole ecosystem of library bundles for free software, maintained similarly to distro repositories (ideally maintained by upstream). This way it becomes pretty easy to package applications in bundles.

Additionally, with a set of shared slices like this used by applications, it becomes increasingly likely that an up-to-date set of apps will be using the same build of some of their dependencies. Glick 2 takes advantage of this by using a checksum of each slice and keeping track of all the slices in use globally on the desktop. If any two bundles use the same slice, only one copy of the slice on disk will be used, and the files in the two bundle mounts will use the same inode. This means we read the data from disk only once, and we share the memory for the library in the page cache. In other words, slices work like traditional shared libraries.
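The sharing logic is then essentially interning slices by checksum. A minimal sketch (the names are illustrative, not the actual Glick 2 API):

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    struct shared_slice {
        uint8_t checksum[32];
        int refcount;
        /* ... backing file, inode table, etc ... */
        struct shared_slice *next;
    };

    static struct shared_slice *registry;  /* all slices in use on the desktop */

    /* Return the already-registered slice with this checksum, or register
       the candidate as new. Two bundles containing the same slice then
       back their files with the same data and the same inode, so the
       kernel reads from disk once and shares the page cache. */
    struct shared_slice *slice_intern(struct shared_slice *candidate)
    {
        for (struct shared_slice *s = registry; s != NULL; s = s->next) {
            if (memcmp(s->checksum, candidate->checksum, 32) == 0) {
                s->refcount++;
                free(candidate);
                return s;
            }
        }
        candidate->refcount = 1;
        candidate->next = registry;
        registry = candidate;
        return candidate;
    }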

Interaction with the system is handled by allowing bundle installation. This really just means dropping the bundle file in a known directory, like ~/Apps or some system directory. The session then tracks files being added to this directory, and whenever a bundle is added we look at it for slices marked as exported. All the exported slices of all the installed bundles are then made visible in a desktop-wide instance of /opt/bundle (and in the process-private instances).
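The tracking itself is straightforward; here is a sketch of the session side using inotify, where handle_new_bundle() is a placeholder for the export-merging step described above:

    #include <limits.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/inotify.h>
    #include <unistd.h>

    static void handle_new_bundle(const char *name)
    {
        /* Placeholder: open the bundle, find the slices marked as
           exported, and merge them into the desktop-wide export tree. */
        printf("new bundle installed: %s\n", name);
    }

    int main(void)
    {
        char path[PATH_MAX];
        char buf[sizeof(struct inotify_event) + NAME_MAX + 1];
        int fd = inotify_init1(0);

        /* Watch the per-user install directory; a real session would
           also watch the system-wide one. */
        snprintf(path, sizeof(path), "%s/Apps", getenv("HOME"));
        inotify_add_watch(fd, path, IN_CREATE | IN_MOVED_TO);

        for (;;) {
            /* Simplified: assumes one event per read. */
            ssize_t len = read(fd, buf, sizeof(buf));
            if (len <= 0)
                break;
            handle_new_bundle(((struct inotify_event *) buf)->name);
        }
        return 0;
    }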

This means that bundles can mark things like desktop files, icons, dbus service files, mimetypes, etc. as exported and have them globally visible (so that other apps and the desktop can see them). Additionally, we expose symlinks to the installed bundles themselves in a well-known location like /opt/bundle/.bundles/<bundle-id>, so that e.g. the desktop file can reference the application binary in an absolute fashion.
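For example, an exported desktop file could launch the app through that symlink (the bundle id here is made up for illustration):

    [Desktop Entry]
    Type=Application
    Name=My App
    Icon=myapp
    # The symlink points at the installed bundle file itself, so this
    # works no matter where the user dropped the bundle.
    Exec=/opt/bundle/.bundles/org.example.MyApp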

There is nothing that prevents bundles from running on regular distributions too, as long as the base set of platform dependencies is installed, for instance via distro metapackages. So bundles can also be used as a way to create binaries for cross-distro deployment.

The current codebase is of prototype quality. It works, but requires some handholding, and lacks some features I want. I hope to clean it up and publish it in the near future.