Compatibility in a sandboxed world

Compatibility has always been a complex problems in the Linux world. With the advent of containers/sandboxing it has become even more complicated. Containers help solve compatibility problems, but there are still remaining issues. Especially on the Linux desktop where things are highly interconnected. In fact, containers even create some problems that we didn’t use to have.

Today I’ll take a look at the issues in more details and give some ideas on how to best think of compatibility in this post-container world, focusing on desktop use with technologies like flatpak and snap.

Forward and backwards compatibility

First, lets take a look at what we mean by compatibility. Generally we’re considering changing some part of the system that interacts with other parts (for example command-line arguments, a library, a service or even a file format). We say the change is compatible if the complete system still works as intended after the change.

Most of the time when compatibility is mentioned it refers to backwards compatibility. This means that a change is done such that all old code keeps working. However, after the update other things may start to rely on the new version and there is no guarantee that these new versions will work without the change. A simple example of a backwards compatible change is a file format change, where the new app can read the old files, but the old app may not necessarily read the new files.

However, there is also a another concept called forward compatibility. This is a property of the design rather than any particular change. If something is designed to be forward compatible that means any later change to it will not cause earlier versions to stop working. For example, a file format is designed to be forward compatible if it guarantees a file produced by a new app is still readable by an older app (possibly somewhat degraded due to the lack of new features).

The two concepts are complementary in the sense that if something is forward compatible you can “upgrade” to older versions and that change will be backwards compatible.

API compatibility

API stands for Application Programming Interface and it defines how a program interfaces with some other code. When we talk about API compatibility we mean at the programming level. In other words, a change is API compatible if the some other source code can be recompiled against your changed code and it still builds and works.

Since the source code is very abstract and flexible this means quite a lot of changes are API compatible. For example, the memory layout of a structure is not defined at the source code level, so that can change and still be API compatible.

API compatibility is mostly interesting to programmers, or people building programs from source. What affects regular users instead is ABI compatibility.

ABI compatibility

ABI means Application Binary Interface, and it describes how binaries compiled from source are compatible. Once the source code is compiled a lot more details are made explicit, such as the layout of memory. This means that a lot of changes that are API compatible are not ABI compatible.

For example, changing the layout of a (public) structure is API compatible, but not ABI compatible as the old layout is encoded in the compiled code. However, with sufficient care by the programmer there are still a lot of changes that can be made that are ABI backward compatible. The most common example is adding new functions.

Symbol versioning

One thing that often gets mentioned when talking about ABI compatibility is symbol versioning. This is a feature of the ELF executable format that allows the creation of multiple versions of a function in the binary with the same name. Code built against older versions of the library calls the old function, and code built against the new version will call the new function. This is a way to extend what is possible to change and still be backwards ABI compatible.

For example, using this it may be possible to change the layout of a structure, but keep a copy of the previous structure too. Then you have two functions that each work on its own particular layout, meaning the change is still ABI compatible.

Symbol versioning is powerful, but it is not a solution for all problems. It is mostly useful for small changes. For example, the above change is workable if only one function uses the structure. However, if the modified structure is passed to many functions then all those functions need to be duplicated, and that quickly becomes unmanageable.

Additionally, symbol versioning silently introduces problems with forward compatibility. Even if an application doesn’t rely on the feature that was introduced in the new version of the library a simple rebuild will pick up a dependency of the new version, making it unnecessarily incompatible with older versions. This is why Linux applications must be built against the oldest version of glibc they want to support running against rather than against the latest.

ABI domains

When discussing ABI compatibility there is normally an implicit context that is assumed to be fixed. For example, we naturally assume both builds are for the same CPU architecture.

For any particular system there are a lot of details that affect
this context, such as:

  • Supported CPU features
  • Kernel system call ABI
  • Function calling conventions
  • Compiler/Linker version
  • System compiler flags
  • ABI of all dependent modules (such as e.g. glibc or libjpeg versions)

I call any fixed combination of these an ABI domain.

Historically the ABI domain has not been very important in the context of a particular installation, as it is the responsibility of the distribution to ensure that all these properties stay compatible over time. However, the fact that ABI domains differ is one of the primary causes of incompatibility between different distributions (or between different versions of the same distribution).

Protocol compatibility

Another type of compatibility is that of communication protocols. Two programs that talk to each other using a networking API (which could be on two different machines, or locally on the same machine) need to use a protocol to understand each other. Changes to this protocol need to be carefully considered to ensure they are compatible.

In the remote case this is pretty obvious, as it is very hard to control what software two different machines use. However, even for local communication between processes care has to be taken. For example, a local service could be using a protocol that has several implementations and they all need to stay compatible.

Sometimes local services are split into a service and a library and the compatibility guarantees are defined by the library rather than the service. Then we can achieve some level of compatibility by ensuring the library and the service are updated in lock-step. For example a distribution could ship them in the same package.

Enter containers

Containers is (among other things) a way to allow having multiple ABI domains on the same machine. This allows running a single build of an application on any distribution. This solves a lot of ABI compatibility issues in one fell swoop.

And there was much rejoicing!

There are still some remnants of ABI compatibility issues left, for example CPU features and kernel system calls still need to be compatible. Additionally it is important to keep the individual ABI domains internally compatible over time. For flatpak this ends up being the responsibility of the maintainers of the runtime wheras for docker it is up to each container image. But, by and large, we can consider ABI compatibility “solved”.

However, the fact that we now run multiple ABI domains on one machine brings up some new issues that we really didn’t need to care much about before, namely protocol compatibility and forward compatibility.

Protocol compatibility in a container world

Server containers are very isolated from each other, relying mainly on the stability of things like DNS/HTTP/SQL. But the desktop world is a lot more interconnected. For example, it relies on X11/Wayland/OpenGL for graphics, PulseAudio for audio, and Cups for printing, etc. Lots of desktop services use DBus to expose commonly used desktop functionality (including portals), and there is a plethora of file formats shared between apps (such as icons, mime-types, themes, etc).

If there is only one ABI domain on the machine then all these local protocols and formats need not consider protocol compatibility at all, because services and clients can be upgraded in lock-step. For example, the evolution-data-server service has an internal versioned DBus API, but client apps just use the library. If the service protocol changes we just update the library and apps continue to work.

However, with multiple ABI domains, we may have different versions of the library, and if the library of one domain doesn’t match the version of the running service then it will break.

So, in a containerized world, any local service running on the desktop needs to consider protocol stability and extensibility much more careful than what we used to do. Fortunately protocols are generally much more limited and flexible than library ABIs, so it’s much easier to make compatible changes, and failures are generally error messages rather than crashes.

Forward compatibility in a container world

Historically forward compatibility has mainly needed to be considered for file formats. For instance, you might share a home directory between different machines and then use and different versions of an app to open a file on the two machines, and you don’t want the newer version to write files the older can’t read.

However, with multiple ABI domains it is going to be much more common that some of the software on the machine update versions faster than others. For example, you might be running a very up-to-date app on an older distribution. In this case it is important that any host services were designed to be forward compatible so that they doesn’t break when talking to the newer clients.

Summary

So, in this new world we need to have new rules. And the rule is this:

Any interface that spans ABI domains (for example between system and app, or between two different apps) should be backwards compatible and try to be forward compatible as much as possible.

Any consumer of such an interface should be aware of the risks involved with version differences and degrade gracefully in case of a mismatch.

Putting container updates on a diet

For infrastructure reasons the Fedora flatpaks are delivered using the container registry, rather than OSTree (which is normally used for flatpaks). Container registries are quite inefficient for updates, so we have been getting complaints about large downloads.

I’ve been working on ways to make this more efficient. That would help not only the desktop use-case, as smaller downloads are also important in things like IoT devices. Also, less bandwidth used could lead to significant cost savings for large scale registries.

Containers already have some features designed to save download size. Lets take a look at them in more detail to see why that often doesn’t work.

Consider a very simple Dockerfile:

FROM fedora:32
RUN dnf -y install httpd
COPY index.html /var/www/html/index.html
ENTRYPOINT /usr/sbin/httpd

This will produce an container image that looks like this:

The nice thing about this setup is that if you change the html layer and re-deploy you only need to download the last layer, because the other layers are unchanged.

However, as soon as one of the other layer changes you need to download the changed layer and all layers below it from scratch. For example, if there is a security issue and you need to update the base image all layers will change.

In practice, such updates actually change very little in the image. Most files are the same as the previous version, and the few that change are still similar to the previous version. If a client is doing an update from the previous version the old files are available, and if they could be reused that would save a lot of work.

One complexity of this is that we need to reconstruct the exact tar file that we would otherwise download, rather than the extracted files. This is because we need to checksum it and verify the checksum to protect against accidental or malicious modifications. For containers,  the checksum that clients use is the checksum of the uncompressed tarball. Being uncompressed is fortunate for us, because reproducing identical compression is very painful.

To handle such reconstruction, I wrote a tool called tar-diff which does exactly what is needed here:

$ tar-diff old.tar.gz new.tar.gz delta.tardiff
$ tar xf old.tar.gz -C extracted/
$ tar-patch delta.tardiff extracted/ reconstructed.tar
$ gunzip new.tar.gz
$ cmp new.tar reconstructed.tar

I.e. it can use the extracted data from an old version, together with a small diff file to reconstruct the uncompressed tar file.

tar-diff uses knowledge of the tar format, as well as the bsdiff binary diff algorithm and zstd compression to create very small files for typical updates.

Here are some size comparisons for a few representative images. This shows the relative size of the deltas compared to the size of the changed layers:

Red Hat Universal Base Image 8.0 and 8.1
fluent/fluentd, a Ruby application on top of a small base image
OpenShift Enterprise prometheus releases
Fedora 30 flatpak runtime updates

These are some pretty impressive figures. Its clear from this that some updates are really very small, yet we are all downloading massive files anyway. Some updates are larger, but even for those the deltas are in the realm of 10-15% of the original size. So, even in the worst case deltas are giving around 10x improvement.

For this to work we need to store the deltas on a container registry and have a way to find the deltas when pulling an image. Fortunately it turns out that the OCI specification is quite flexible, and there is a new project called OCI artifacts specifying how to store other types of binary data in a container.

So, I was able to add support for this in skopeo and podman, allowing it both to generate deltas and use them to speed up downloads. Here is a short screen-cast of using this to generate and use deltas between two images stored on the docker hub:

All this is work in progress and the exact details of how to store deltas on the repository is still being discussed. However, I wanted to give a heads up about this because I think it is some really powerful technology that a lot of people might be interested in.

Introducing GVariant schemas

GLib supports a binary data format called GVariant, which is commonly used to store various forms of application data. For example, it is used to store the dconf database and as the on-disk data in OSTree repositories.

The GVariant serialization format is very interesting. It has a recursive type-system (based on the DBus types) and is very compact. At the same time it includes padding to correctly align types for direct CPU reads and has constant time element lookup for arrays and tuples. This make GVariant a very good format for efficient in-memory read-only access.

Unfortunately the APIs that GLib has for accessing variants are not always great. They are based on using type strings and accessing children via integer indexes. While this is very dynamic and flexible (especially when creating variants) it isn’t a great fit for the case where you have serialized data in a format that is known ahead of time.

Some negative aspects are:

  • Each g_variant_get_child() call allocates a new object.
  • There is a lot of unavoidable (atomic) refcounting.
  • It always uses generic codepaths even if the format is known.

If you look at some other binary formats, like Google protobuf, or Cap’n Proto they work by describing the types your program use in a schema, which is compiled into code that you use to work with the data.

For many use-cases this kind of setup makes a lot of sense, so why not do the same with the GVariant format?

With the new GVariant Schema Compiler you can!

It uses a interface definition language where you define the types, including extra information like field names and other attributes, from which it generates C code.

For example, given the following schema:

type Gadget {
  name: string;
  size: {
    width: int32;
    height: int32;
  };
  array: []int32;
  dict: [string]int32;
};

It generates (among other things) these accessors:

const char *    gadget_ref_get_name   (GadgetRef v);
GadgetSizeRef   gadget_ref_get_size   (GadgetRef v);
Arrayofint32Ref gadget_ref_get_array  (GadgetRef v);
const gint32 *  gadget_ref_peek_array (GadgetRef v,
                                       gsize    *len);
GadgetDictRef   gadget_ref_get_dict   (GadgetRef v);

gint32 gadget_size_ref_get_width  (GadgetSizeRef v);
gint32 gadget_size_ref_get_height (GadgetSizeRef v);

gsize  arrayofint32_ref_get_length (Arrayofint32Ref v);
gint32 arrayofint32_ref_get_at     (Arrayofint32Ref v,
                                    gsize           index);

gboolean gadget_dict_ref_lookup (GadgetDictRef v,
                                 const char   *key,
                                 gint32       *out);

Not only are these accessors easier to use and understand due to using C types and field names instead of type strings and integer indexes, they are also a lot faster.

I wrote a simple performance test that just decodes a structure over an over. Its clearly a very artificial test, but the generated code is over 600 times faster than the code using g_variant_get(), which I think still says something.

Additionally, the compiler has a lot of other useful features:

  • You can add a custom prefix to all generated symbols.
  • All fixed size types generate C struct types that match the binary format, which can be used directly instead of the accessor functions.
  • Dictionary keys can be declared sorted: [sorted string] { ... } which causes the generated lookup function to use binary search.
  • Fields can declare endianness: foo: bigendian int32 which will be automatically decoded when using the generated getters.
  • Typenames can be declared ahead of time and used like foo: []Foo, or declared inline: foo: [] 'Foo { ... }. If you don’t name the type it will be named based on the fieldname.
  • All types get generated format functions that are (mostly) compatible with g_variant_print().

Gthree – ready to play

Today I made a new release of Gthree, version 0.2.0.

Newly added in this release is support for Raycaster, which is important if you’re making interactive 3D applications. For example, it’s used if you want clicks on the window to pick a 3D object from the scene. See the interactive demo for an example of this.

Also new is support for shadow maps. This allows objects between a light source and a target to cast shadows on the target. Here is an example from the demos:

I’ve been looking over the list of feature that we support, and in this release I think all the major things you might want to do in a 3D app is supported to at least a basic level.

So, if you ever wanted to play around with 3D graphics, now would be a great time to do so. Maybe just build the code and study/tweak the code in the examples subdirectory. That will give you a decent introduction to what is possible.

If you just want to play I added a couple of new features to gnome-hexgl based on the new release. Check out how the tracks casts shadows on the buildings!

Gaming with GThree

The last couple of week I’ve been on holiday and I spent some of that hacking on gthree. Gthree is a port of three.js, and a good way to get some testing of it is to port a three.js app. Benjamin pointed out HexGL, a WebGL racing game similar to F-Zero.

This game uses a bunch of cool features like shaders, effects, sprites, particles, etc, so it was a good target. I had to add a bunch of features to gthree and fix some bugs, but its now at a state where it looks pretty cool as a demo. However it needs more work to be playable as a game.

Check out this screenshot:

Or this (lower resolution) video:

If you’re interested in playing with it, the code is on github. It needs latest git versions of graphene and gthree to build.

I hope to have a playable version of this for GUADEC. See you there!

Gthree update, It moves!

Recently I have been backporting some missing three.js features and fixing some bugs. In particular, gthree supports:

  • An animation system based on keyframes and interpolation.
  • Skinning, where a model can have a skeleton and modifying a bone affects the whole model.
  • Support in the glTF loader for the above.

This is pretty cool as it enables us to easily load and animate character models. Check out this video:

Gthree is alive!

A long time ago in a galaxy far far away I spent some time converting three.js into a Gtk+ library, called gthree.

It was never really used for anything and wasn’t really product quality. However, I really like the idea of having an easy way to add some 3D effects to Gnome apps.

Also, time has moved on and three.js got cool new features like a PBR material and a GLTF support.

So, last week I spent some time refreshing gthree to the latest shaders from three.js, adding a GLTF loader and a few new features, ported to meson and did some general cleanup/polish. I still want to add some more features in like skinning, morphing and animations, but just the new rendering is showing off just how cool this stuff is:

Here is a screencast showing off the model viewer:

Cool, eh!

I have to do some other work too, but I hope to get some more time in the future to work on gthree and to use it for something interesting.

Broadway adventures in Gtk4

One of my long running side projects is a Gtk backend called “Broadway”. Instead of rendering to the screen this backend creates a HTTP server that you can connect to, and then exposes the UI remotely in the browser.

The original version of broadway was essentially streaming image frames, although there were various ways to optimize what got sent. This matches pretty well with how Gtk 3 rendering works, particularly on Wayland. Every frame it calls out to all widgets, letting them draw on top of a buffer and then sends the final frame to the compositor. Broadway just inserts some image delta computation and JavaScript magic in the middle of this.

Enter Gtk 4, breaking everything!

However, time moves on, and the current development branch of Gtk (which will be Gtk 4) has completely changed how rendering works, with the goal of doing efficient rendering on modern GPUs.

In the new model widgets don’t directly render to a buffer. Instead they build up a model of how the final result should look in terms of something called render nodes. These describe rendering as a tree of highlevel operations. The backend (we have software, OpenGL and Vulkan backends) then knows how to take this description and submit it to the GPU in an efficient way. This is somewhat similar to the firefox WebRender project.

Its would be possible to implement the broadway backend by hooking up the software renderer, letting it generate a buffer and then send that to the browser.  However, that is pretty lame!

CSS comes to the rescue!

Instead I’ve been looking at making the browser actually draw the render nodes. Gtk defines a lot of its UI in terms of CSS these days, and that means that the render nodes actually are very close to the CSS rendering model. For example, the basic drawing operation are things like rounded boxes with borders, shadows, etc.

So, I was thinking, could we not take these render node and turn them into actual DOM nodes with CSS styles and send them to the browser. Then every frame we can just diff the DOM trees, sending the minimal changes necessary.

Sounds crazy right? But, it turns out to work pretty well.

Check out this example page which I created with the magic of “save as”. In particular, try zooming into that page in the browser, and play with the developer tools inspector to see the nodes. Here is a part of it zoomed in:

The icons and the text are not CSS, so they don’t scale, but look at those gorgeous borders, shadows and gradients!

Entering the 3rd dimension!

Particularly interesting is the support in Gtk for general 3D transforms. This maps well to the CSS transform on the browser style.

Check out this example of a spinning-cube transition. If you open up the browser inspector you can see that each individual element in the cube is still a regular CSS box.

Some technical notes

If you look at the examples above they all use data: uris for images. This is a custom mode that lets “screenshots” like the above work. Normally broadway uses blobs for the images.

Also, looking at the examples they seem very heavy in terms of images, as all the text are images. However, in a typical frame most of the render tree is identical to the previous frame, meaning any label that was used in the last frame need not be sent again. In fact, even if it changes position in the tree due to a parent node changing (scrolling, cube-switching, etc) it can still be reused as-is.

However, text is clearly the weak point in here. Unfortunately HTML/CSS has no low-level text rendering APIs we could use. I’m considering generating a texture atlas with pre-rendered glyphs that can be reused (like CSS sprites) when rendering text, that would mean we will have to download less data at least. If anyone has other ideas I would love to hear about it.

Introducing flat-manager

A long time ago I wrote a blog post about how to maintain a Flatpak repository.

It is still a nice, mostly up to date, description of how Flatpak repositories work. However, it doesn’t really have a great answer to the issue called syncing updates in the post. In other words, it really is more about how to maintain a repository on one machine.

In practice, at least on a larger scale (like e.g. Flathub) you don’t want to do all the work on a single machine like this. Instead you have an entire build-system where the repository is the last piece.

Enter flat-manager

To support this I’ve been working on a side project called flat-manager. It is a service written in rust that manages Flatpak repositories. Recently we migrated Flathub to use it, and its seems to work quite well.

At its core, flat-manager serves and maintains a set of repos, and has an API that lets you push updates to it from your build-system. However, the way it is set up is a bit more complex, which allows some interesting features.

Core concept: a build

When updating an app, the first thing you do is create a new build, which just allocates an id that you use in later operations. Then you can upload one or more builds to this id.

This separation of the build creation and the upload is very powerful, because it allows you to upload the app in multiple operations, potentially from multiple sources. For example, in the Flathub build-system each architecture is built on a separate machine. Before flat-manager we had to collect all the separate builds on one machine before uploading to the repo. In the new system each build machine uploads directly to the repo with no middle-man.

Committing or purging

An important idea here is that the new build is not finished until it has been committed. The central build-system waits until all the builders report success before committing the build. If any of the builds fail, we purge the build instead, making it as if the build never happened. This means we never expose partially successful builds to users.

Once a build is committed, flat-manager creates a separate repository containing only the new build. This allows you to use Flatpak to test the build before making it available to users.

This makes builds useful even for builds that never was supposed to be generally available. Flathub uses this for test builds, where if you make a pull request against an app it will automatically build it and add a comment in the pull request with the build results and a link to the repo where you can test it.

Publishing

Once you are satisfied with the new build you can trigger a publish operation, which will import the build into the main repository and do all the required operations, like:

  • Sign builds with GPG
  • Generate static deltas for efficient updates
  • Update the appstream data and screenshots for the repo
  • Generate flatpakref files for easy installation of apps
  • Update the summary file
  • Call out out scripts that let you do local customization

The publish operation is actually split into two steps, first it imports the build result in the repo, and then it queues a separate job to do all the updates needed for the repo. This way if multiple builds are published at the same time the update can be shared. This saves time on the server, but it also means less updates to the metadata which means less churn for users.

You can use whatever policy you want for how and when to publish builds. Flathub lets individual maintainers chose, but by default successful builds are published after 3 hours.

Delta generation

The traditional way to generate static deltas is to run flatpak build-update-repo --generate-static-deltas. However, this is a very computationally expensive operation that you might not want to do on your main repository server. Its also not very flexible in which deltas it generates.

To minimize the server load flat-manager allows external workers that generate the deltas on different machines. You can run as many of these as you want and the deltas will be automatically distributed to them. This is optional, and if no workers connect the deltas will be generated locally.

flat-manager also has configuration options for which deltas should be generated. This allows you to avoid generating unnecessary deltas and to add extra levels of deltas where needed. For example, Flathub no longer generates deltas for sources and debug refs, but we have instead added multiple levels of deltas for runtimes, allowing you to go efficiently to the current version from either one or two versions ago.

Subsetting tokens

flat-manager uses JSON Web Tokens to authenticate API clients. This means you can assign different permission to different clients. Flathub uses this to give minimal permissions to the build machines. The tokens they get only allow uploads to the specific build they are currently handling.

This also allows you to hand out access to parts of the repository namespace. For instance, the Gnome project has a custom token that allows them to upload anything in the org.gnome.Platform namespace in Flathub. This way Gnome can control the build of their runtime and upload a new version whenever they want, but they can’t (accidentally or deliberately) modify any other apps.

Rust

I need to mention Rust here too. This is my first real experience with using Rust, and I’m very impressed by it. In particular, the sense of trust I have in the code when I got it past the compiler. The compiler caught a lot of issues, and once things built I saw very few bugs at runtime.

It can sometimes be a lot of work to express the code in a way that Rust accepts, which makes it not an ideal language for sketching out ideas. But for production code it really excels, and I can heartily recommend it!

Future work

Most of the initial list of features for flat-manager are now there, so I don’t expect it to see a lot of work in the near future.

However, there is one more feature that I want to see; the ability to (automatically) create subset versions of the repository. In particular, we want to produce a version of Flathub containing only free software.

I have the initial plans for how this will work, but it is currently blocking on some work inside OSTree itself. I hope this will happen soon though.

Nvidia drivers in Fedora Silverblue

UPDATE:

The updated drivers packages are now in the repos, so you don’t need the specially built rpm. Using rpm-ostree install kmod-nvidia xorg-x11-drv-nvidia is enough. If you installed the custom build you need to uninstall it as it can cause upgrade issues.

I really like how Fedora Silverblue combines the best of atomic, image-based updates and local tweaking with its package layering idea.

However, one major issue many people has had with it is support for the NVIDIA drivers. Given they ares not free software they can’t be shipped with the image, so one imagines using package layering to would be a good way to install it. In theory this works, but unfortunately it often runs into issues, because frequent kernel updates cause there to be no pre-built nvidia module for your particular kernel/driver version.

In a normal Fedora installation this is handled by something called akmods. This is a system where the kernel modules ship as sources which get automatically rebuilt on the target system itself when a new kernel is installed.

Unfortunately this doesn’t quite work on Silverblue, because the system image is immutable. So, I’ve been working recently on making akmods work in silverblue. The approach I’ve taken is having the modules being built during the rpm-ostree update command (in the %post script) and the output of that being integrated into the newly constructed image.

Last week the final work landed in the akmods and kmodtools packages (currently available in updates-testing), which means that anyone can easily experiment with akmods, including the nvidia drivers.

Preparing the system

First we need the latest of everything:

$ sudo rpm-ostree update

The required akmods packages are in updates-testing at the moment, so we’ll enable that for now:

$ sudo vi /etc/yum.repos.d/fedora-updates-testing.repo
... Change enabled to 1 ..

Then we add the rpmfusion repository:

$ sudo rpm-ostree install https://download1.rpmfusion.org/free/fedora/rpmfusion-free-release-29.noarch.rpm https://download1.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-29.noarch.rpm

At this point you need to reboot into the new ostree image to enable installation from the new repositories.

$ systemctl reboot

Installing the driver

The akmod-nvidia package in the current rpm-fusion is not built against the new kmodtools, so until it is rebuilt it will not work. This is a temporary issue, but  I built a new version we can use until it is fixed.

To install it, and the driver itself we do:

$ sudo rpm-ostree install http://people.redhat.com/alexl/akmod-nvidia-418.43-1.1rebuild.fc29.x86_64.rpm xorg-x11-drv-nvidia

Once the driver in rpm-fusion is rebuilt the custom rpm should not be necessary.

We also need to blacklist the built-in nouveau driver so to avoid driver conflicts:

$ sudo rpm-ostree kargs --append=rd.driver.blacklist=nouveau --append=modprobe.blacklist=nouveau --append=nvidia-drm.modeset=1

Now you’re ready to boot into your fancy new silverblue nvidia experience:

$ systemctl reboot

What about Fedora 30/Rawhide?

All the changes necessary for this to work have landed, but there is no Fedora 30 Silverblue image yet (only a rawhide one), and the rawhide kernel is built with mutex debugging which is not compatible with the nvidia driver.

However, the second we have a Fedora 30 Silverblue image with a non-debug kernel the above should work there too.