Will Thompson

Age rating data for Flathub apps

OARS (Open Age Ratings Service) defines a scheme to include content rating information in apps’ AppData/AppStream file. GNOME Software and similar tools use this metadata to show age ratings for applications. In Endless OS, we also support restricting which applications a given user can install based on this data – see this page, and the reports it links to, for a bit more information about this feature and its future.

Screenshot: “Age Rating: 7. The application was rated this way because it features: Characters in aggressive conflict easily distinguishable from reality”

Every new app on Flathub should include OARS metadata, but there are many existing apps which don’t have this data, ~~so it’s not (yet) enforced at build time~~. Edit: Bartłomiej tells me that it has been enforced at build time for a little over a month. (See App Requirements and AppData Guidelines on the Flathub wiki for some more information on what’s required and recommended; this branch of appstream-glib is modified to enforce Flathub’s policies.) My colleague Andre Magalhaes crunched the data and opened a tracker task for Flathub apps without OARS metadata. ((We have a similar list for our in-house apps.)) This information is much more useful if it can be relied upon to be present.

If you’re familiar with an app on this list, generating the OARS data is a simple process: open this generator in your browser, answer some questions about the app, and receive some XML. The next step is to put that OARS data into the AppData. ((If you’re the upstream maintainer for the app, you probably already know how to do all this, and can stop reading here!)) Take a look at the app’s Flathub repo and check whether it has an .appdata.xml file.

If it doesn’t, then the app’s AppData must be maintained upstream. Great! Find the upstream repository for the project, and send a merge request there. (Here’s one I sent earlier, for D-Feet.) You can either add the same patch to the Flathub packaging (as I did for D-Feet) or wait for a new upstream release and then update the Flathub packaging to that version (as I also did for D-Feet, a couple of days later).

If the appdata is maintained in the Flathub repo, make the relevant changes there directly. (Here’s a PR I opened for Tux, of Math Command while writing this post.) Ideally, the appdata would make its way upstream, but there are a fair few apps on Flathub which do not have active upstreams.

You might well find that the appdata requirements have become more strict about things other than OARS since the app was last updated, and these will have to be fixed too. In both the example cases above, I had to add release information, which has become mandatory for Flathub apps since these were last updated.

Rebasing downstream translations

At Endless, we maintain downstream translations for an number of GNOME projects, such as gnome-software, gnome-control-center and gnome-initial-setup. These are projects where our (large) downstream modifications introduce new user-facing strings. Sometimes, our translation for a string differs from the upstream translation for the same string. This may be due to:

a deliberate downstream style choice – such as tú vs. usted in Spanish
our fork of the project changing the UI so that the upstream translation does not fit in the space available in our UI – “Suspend” was previously translated into German as „In Bereitschaft versetzen“, which we changed to „Bereitschaft“ for this reason
the upstream translation being incorrect
the whim of a translator

Whenever we update to a new version of GNOME, we have to reconcile our downstream translations with the changes from upstream. We want to preserve our intentional downstream changes, and keep our translations for strings that don’t exist upstream; but we also want to pull in translations for new upstream strings, as well as improved translations for existing strings. Earlier this year, the translation-rebase baton was passed to me. My predecessor would manually reapply our downstream changes for a set of officially-supported languages, but unlike him, I can pretty much only speak English, so I needed something a bit more mechanical.

I spoke to various people from other distros about this problem. ((I’d love to credit the individuals I spoke to but my memory is awful. Please let me know if you remember being on the other side of these conversations and I’ll update this post!)) A common piece of advice was to not maintain downstream translation changes: appealing, but not really an option at the moment. I also heard that Ubuntu follows a straightforward rule: once the translation for a string has been changed downstream, all future upstream changes to the translation for that string are ignored. The assumption is that all downstream changes to a translation must have been made for a reason, and should be preserved. This is essentially a superset of what we’ve done manually in the past.

I wrote a little tool to implement this logic, ~~pomerge~~ translate-o-tron 3000 (or “t3k” for short). ((Thanks to Alexandre Franke for pointing out the existence of at least one existing tool called “pomerge”. In my defence, I originally wrote this script on a Eurostar with no internet connection, so couldn’t search for conflicts at the time.)) Its “rebase” mode takes the last common upstream ancestor, the last downstream commit, and a working copy with the newest downstream code. For each locale, for each string in the translation in the working copy, it compares the old upstream and downstream translations – if they differ, it merges the latter into the working copy. For example, Endless OS 3.5.x was based on GNOME 3.26; Endless OS 3.6.x is based on GNOME 3.32. I might rebase the translations for a module with:

$ cd src/endlessm/gnome-control-center

# The eos3.6 branch is based on the upstream 3.32.1 tag
$ git checkout eos3.6

# Update the .pot file
$ ninja -C build meson-gnome-control-center-2.0-pot

# Update source strings in .po files
$ ninja -C build meson-gnome-control-center-2.0-update-po

# The eos3.6 branch is based on the upstream 3.26.1 tag;
# merge downstream changes between those two into the working copy
$ t3k rebase `pwd` 3.26.1 eos3.5

# Optional: Python's polib formats .po files slightly differently to gettext;
# reformat them back. This has no semantic effect.
$ ninja -C build meson-gnome-control-center-2.0-update-po

$ git commit -am 'Rebase downstream translations'

It also has a simpler “copy” mode which copies translations from one .po file to another, either when the string is untranslated in the target (the default) or for all strings. In some cases, we’ve purchased translations for strings which have not been translated upstream; I’ve used this to submit some of those upstream, such as Arabic translations of the myriad OARS categories, and hope to do more of that in the future now I can make a computer do the hard work.

$ t3k copy \
    ~/src/endlessm/gnome-software/po/ar.po \
    ~/src/gnome/gnome-software/po/ar.po

Yes, it was something to do with controlling terminals

Back in June, I wrote about the difficulty of killing the dbus-monitor subprocess when Bustle is recording the system bus:

But sending SIGKILL from the unprivileged Bustle process to the privileged dbus-monitor process has no effect. Weirdly, if I run pkexec dbus-monitor --system in a terminal, I can hit Ctrl+C and kill it just fine. (Is this something to do with controlling terminals? Please tell me if you know how I could make this work.)

Greetings, past me! I’m just like you, only 6 months older, and I’m here to tell you how to make this work.

When you run a process in a terminal (emulator), its controlling TTY is a pseudoterminal (hereafter PTY) slave ((not my choice of terminology)) device. When you press Ctrl+C, the terminal doesn’t send SIGINT to the running process; it writes the ^C character '\x03' into the corresponding PTY master. This is intercepted by the kernel’s TTY layer, which eats this character and sends SIGINT to the (foreground) process (group) attached to the PTY slave. It doesn’t matter if that process (group) is owned by another user: being able to write to its controlling terminal is the capability you need.

Armed with this knowledge, and some further reading on how to set the controlling TTY, it was reasonably easy to wire this up in Bustle:

Before forking the dbus-monitor subprocess, open a PTY master/slave pair and tell GSubprocess to use the slave FD as the subprocess’s stdin
After forking, but before execing pkexec dbus-monitor, call setsid () to move the process to a new session, then ioctl (STDIN_FILENO, TIOCSCTTY, 0); to set its controlling TTY
When it comes time to stop the monitor, write '\x03' into the PTY master FD

You may wonder how the second step works in the Flatpak case, where the pkexec dbus-monitor process is not spawned directly by Bustle, but by the Flatpak session helper daemon. Well, we don’t get a chance to run our own code between fork() and exec(), but that’s okay because the session helper does exactly the same thing for us. (I think this is why Ctrl+C works in a host shell running within Flatpak:d GNOME Builder.)

The devil makes work for idle processes

TLDR: in Endless OS, we switched the IO scheduler from CFQ to BFQ, and set the IO priority of the threads doing Flatpak downloads, installs and upgrades to “idle”; this makes the interactive performance of the system while doing Flatpak operations indistinguishable from when the system is idle.

At Endless, we’ve been vaguely aware for a while that trying to use your computer while installing or updating apps is a bit painful, particularly on spinning-disk systems, because of the sheer volume of IO performed by the installation/update process. This was never particularly high priority, since app installations are user-initiated, and until recently, so were app updates.

But, we found that users often never updated their installed apps, so earlier this year, in Endless OS 3.3.10, we introduced automatic background app updates to help users take advantage of “new features and bug fixes” (in the generic sense you so often see in iOS/Android app release notes). This fixed the problem of users getting “stuck” on old app versions, but made the previous problem worse: now, your computer becomes essentially unusable at arbitrary times when app updates happen. It was particularly bad when users unboxed a system with an older version of Endless OS (and hence a hundred or so older apps) pre-installed, received an automatic OS update, then rebooted into a system that’s unusable until all those apps have been updated.

At first, I looked for logic errors in (our versions of) GNOME Software and Flatpak that might cause unneccessary IO during app updates, without success. We concluded that heavy IO load when updating a large app or runtime is largely unavoidable, ((modulo Umang’s work mentioned in the coda)) so I switched to looking at whether we could mitigate this by tweaking the IO scheduler.

The BFQ IO scheduler is supposed to automatically prioritize interactive workloads over bulk workload, which is pretty much exactly what we’re trying to do. The specific example its developers give is watching a video, without hiccups, while copying a huge file in the background. I spent some time with the BFQ developers’ own suite of benchmarks on two test systems: a Lenovo Yoga 900 (with an Intel i5-6200U @ 2.30GHz and a consumer-grade M.2 SSD) and an Endless Mission One (an older system with a Celeron CPU and a laptop-class spinning disk). Neither JP nor I were able to reproduce any interesting results for the dropped-frames benchmark: with either BFQ or CFQ (the previous default IO scheduler), the Yoga essentially never dropped frames, whereas the IO workloads immediately rendered the Mission totally unusable. I had rather more success with a benchmark which measures the time to launch LibreOffice:

On the Yoga, when the system was idle, the mean launch time went from 2.838s under CFQ to 2.98s under BFQ (a slight regression), but with heavy background IO, the mean launch time went from 16s with CFQ (standard deviation 0.11) to 3s with BFQ (standard deviation 0.51).
On the Mission, with modest background IO, the mean launch time was 108 seconds under BFQ, which sounds awful; but under CFQ, I gave up waiting for LibreOffice to start after 8 minutes!

Emboldened by these results, I went on to look at how the same “time to launch LibreOffice” benchmark fared when the background IO load is “installing and uninstalling a Lollipop Flatpak bundle in a loop”. I also looked at using ionice -c3 to set the IO priority of the install/uninstall loop to idle, which does what its name suggests: BFQ essentially will never serve IO at the idle priority if there is IO pending at any higher priority. You can see some raw data or look at some extended discussion copied from our internal issue tracker to a Flatpak pull request, but I suggest just looking at this chart:

What does it all mean?

The coloured bars represent median launch time in seconds for LibreOffice, across 15/30 trials for Yoga/Mission respectively.
The black whiskers show the minimum and maximum launch times observed. I know this should have been a box-and-whiskers or violin plot, but I realised too late that multitime does not give enough information to draw those.
“unloaded” refers to the performance when the system is otherwise idle.
“shell-loop” refers to running while true; do flatpak install -y /home/wjt/Downloads/org.gnome.Lollypop.flatpak; flatpak uninstall -y org.gnome.Lollypop/x86_64/stable; done; “long-lived” refers to performing the same operations with the Flatpak API in a long-lived process. I tried this because I understood that BFQ gives new processes a slight performance boost, but on a real system the GNOME Software and Flatpak system helper processes are long-lived. As you can see, the behaviour under BFQ is actually the other way around in the worst case, and identical for CFQ and in the median case.
The “ionice-” prefix means the Flatpak operation was run under ionice -c3.
Switching from CFQ to BFQ makes the worst case a little worse at the default IO priority, but the median case much better.
Setting the IO priority of the Flatpak process(es) to idle erases that worst-case regression under BFQ, and dramatically improves the median case under CFQ.
In combination, the time to launch LibreOffice while performing Flatpak operations in the background on the Mission went from 24 seconds to 12 seconds by switching to BFQ & setting the IO priority to idle.

So, by switching to BFQ and setting IO priorities appropriately, the system’s interactive performance while performing background updates is now essentially indistinguishable from when the system is idle. To implement this in practice, Rob McQueen wrote some patches to set the IO priority of the Flatpak system helper and GNOME Software’s worker threads to idle (both changes are upstream) and changed Endless OS’s default IO scheduler to BFQ where available. As Matthias put it on #flatpak when shown this chart and that first link: “not bad for a 1-line change”.

Of course, this means apps take a bit longer to install, even on a mostly-idle system. No, I don’t have numbers on how big the impact is: this work happened months ago and it’s taken me this long to write it up because I couldn’t find the time to collect more data. But my colleague Umang is working on eliminating up to half of the disk IO performed during Flatpak installations so that should more than make up for it!

Wandering in the symlink forest forever

Last week, Philip Withnall told me that Meson has built-in support for generating code coverage reports: just configure with -Db_coverage=true, run your tests with ninja test, then run ninja coverage-{text,html,xml} to generate the report in the format of your choice. The XML format is compatible with Cobertura’s output, which is convenient since Endless’s Jenkins is already configure to consume Cobertura XML generated by Autotools projects using our EOS_COVERAGE_REPORT macro. So it was a simple matter of adding gcovr to the build enviroment, running ninja coverage-xml after the tests, and moving the report to the right place for Jenkins to find it. It worked well on the projects I tested, so I decided to enable it for all Meson projects built in our CI. Sure, I thought, it’s not so useful for our forks of GNOME and third-party projects, but it’s harmless and saves adding per-project config, right?

Fast-forward to yesterday, when someone noticed that a systemd build had been stuck on the ninja coverage-xml step for 16 hours. Uh oh.

It turns out that gcovr follows symlinks when scanning for coverage files, but didn’t check for cycles. systemd’s test suite generates a fake sysfs tree, with many circular references via symlinks. For example, there are 64 self-referential ttyX trees:

$ ls -l build/test/sys/devices/virtual/tty/tty1
total 12
-rw-r--r-- 1 wjt wjt    4 Oct  9 12:16 dev
drwxr-xr-x 2 wjt wjt 4096 Oct  9 12:16 power
lrwxrwxrwx 1 wjt wjt   21 Oct  9 12:16 subsystem -> ../../../../class/tty
-rw-r--r-- 1 wjt wjt   16 Oct  9 12:16 uevent
$ ls -l build/test/sys/devices/virtual/tty/tty1/subsystem/tty1
lrwxrwxrwx 1 wjt wjt 30 Oct  9 12:16 build/test/sys/devices/virtual/tty/tty1/subsystem/tty1 -> ../../devices/virtual/tty/tty1
$ readlink -f build/test/sys/devices/virtual/tty/tty1/subsystem/tty1
/home/wjt/src/endlessm/systemd/build/test/sys/devices/virtual/tty/tty1

And, worse, all other ttyY trees are accessible via the symlinks from each ttyX tree. The kernel caps the number of symlinks per path to 40 before lookups fail with ELOOP, but that’s still 64⁴⁰ paths to resolve, just for the fake ttys. Quite a big number!

The fix is straightforward: maintain a set of visited (st_dev, st_ino) pairs while walking the tree, and prune subtrees we’ve already visited. I tried adding a similar highly self-referential symlink graph to the gcovr test suite, so that it would run in reasonable time if the fix works and essentially never terminate if it does not. Unfortunately, pytest has exactly the same bug: while searching for tests to run, it gets lost wandering in the symlink forest forever.

This bug is a good metaphor for my habit of starting supposedly-quick side-projects.

They should have called it Mirrorball

TL;DR: there’s now an rsync server at rsync://images-dl.endlessm.com/public from which mirror operators can pull Endless OS images, along with an instance of Mirrorbits to redirect downloaders to their nearest—and hopefully fastest!—mirror. Our installer for Windows and the eos-download-image tool baked into Endless OS both now fetch images via this redirector, and from the next release of Endless OS our mirrors will be used as BitTorrent web seeds too. This should improve the download experience for users who are near our mirrors.

If you’re interested in mirroring Endless OS, check out these instructions and get in touch. We’re particularly interested in mirrors in Southeast Asia, Latin America and Africa, since our mission is to improve access to technology for people in these areas.

Big thanks to Niklas Edmundsson, who administers the mirror at Academic Computer Club, Umeå University, who recommended Mirrorbits and provided the nudge needed to get this work going, and to dotsrc.org and Mythic Beasts who are also mirroring Endless OS already.

Read on if you are interested in the gory details of setting this up.

We’ve received a fair few offers of mirroring over the years, but without us providing an rsync server, mirror operators would have had to fetch over HTTPS using our custom JSON manifest listing the available images: extra setup and ongoing admin for organisations who are already generously providing storage and bandwidth. So, let’s set up an rsync server! One problem: our images are not stored on a traditional filesystem, but in Amazon S3. So we need some way to provide an rsync server which is backed by S3.

I decided to use an S3-backed FUSE filesystem to mount the bucket holding our images. It needs to provide a 1:1 mapping from paths in S3 to paths on the mounted filesystem (preserving the directory hierarchy), perform reasonably well, and ideally offer local caching of file contents. I looked at two implementations (out of the many that are out there) which have these features:

s3fs-fuse, which is packaged for Debian as s3fs. Debian is the predominant OS in our server infrastructure, as well as the base for Endless OS itself, so it’s convenient to have a package. ((Do not confuse s3fs-fuse with fuse-s3fs, a totally separate project packaged in Fedora. It uses its own flattened layout for the S3 bucket rather than mapping S3 paths to filesystem paths, so is not suitable for what we’re doing here.))
goofys, which claims to offer substantially better performance for file metadata than s3fs.

I went with s3fs first, but it is a bit rough around the edges:

Our S3 bucket name contains dots, which is not uncommon. By default, if you try to use one of these with s3fs, you’ll get TLS certificate errors. This turns out to be because s3fs accesses S3 buckets as $NAME.s3.amazonaws.com, and the certificate is for *.s3.amazonaws.com, which does not match foo.bar.s3.amazonaws.com. s3fs has a -o use_path_request_style flag which avoids this problem by putting the bucket name into the request path rather than the request domain, but this use of that parameter is only documented in a GitHub Issues comment.
If your bucket is in a non-default region, AWS serves up a redirect, but s3fs doesn’t follow it. Once again, there’s an option you can use to force it to use a different domain, which once again is documented in a comment on an issue.
Files created with s3fs have their permissions stored in an x-amz-meta-mode header. Files created by other tools (which is to say, all our files) do not have this header, so by default get mode 0000 (ie unreadable by anybody), and so the mounted filesystem is completely unusable (even by root, with FUSE’s default settings).

There are two ways to fix this last problem, short of adding this custom header to all existing and future files:

The -o complement_stat option forces files without the magic header to have mode 0400 (user-readable) and directories 0500 (user-readable and -searchable).
The -o umask=0222 option (from FUSE) makes the files and directories world-readable (an improvement on complement_stat in my opinion) at the cost of marking all files executable (which they are not)

I think these are all things that s3fs could do by itself, by default, rather than requiring users to rummage through documentation and bug reports to work out what combination of flags to add. None of these were showstoppers; in the end it was a catastrophic memory leak (since fixed in a new release) that made me give up and switch to goofys.

Due to its relaxed attitude towards POSIX filesystem semantics where performance would otherwise suffer, goofys’ author refers to it as a “Filey System”. ((This inspired a terrible “joke”.)) In my testing, throughput is similar to s3fs, but walking the directory hierarchy is orders of magnitude faster. This is due to goofys making more assumptions about the bucket layout, not needing to HEAD each file to get its permissions (that x-amz-meta-mode thing is not free), and having heuristics to detect traversals of the directory tree and optimize for that case. ((I’ve implemented a similar optimization elsewhere in our codebase: since we have many "directories", it takes many less requests to ask S3 for the full contents of the bucket and transform that list into a tree locally than it does to list each directory individually.))

For on-disk caching of file contents, goofys relies on catfs, a separate FUSE filesystem by the same author. It’s an interesting design: catfs just provides a write-through cache atop any other filesystem. The author has data showing that this arrangement performs pretty well. But catfs is very clearly labelled as alpha software (“Don’t use this if you value your data.”) and, as described in this bug report with a rather intemperate title, it was not hard to find cases where it DOSes itself or (worse) returns incorrect data. So we’re running without local caching of file data for now. This is not so bad, since this server only uses the file data for periodic synchronisation with mirrors: in day-to-day operation serving redirects, only the metadata is used.

With this set up, it’s plain sailing: a little rsync configuration generator that uses the filter directive to only publish the last two releases (rather than forcing terabytes of archived images onto our mirrors) and setting up Mirrorbits. Our Mirrorbits instance is configured with some extra “mirrors” for our CloudFront distribution so that users who are closer to a CloudFront-served region than any real mirrors are directed there; it could also have been configured to make the European mirrors (which they all are, as of today) only serve European users, and rely on its fallback mechanism to send the rest of the world to CloudFront. It’s a pretty nice piece of software.

If you’ve made it this far, and you operate a mirror in Southeast Asia, South America or Africa, we’d love to hear from you: our mission is to improve access to technology for people in these areas.

Using Vundle from the Vim Flatpak

I (mostly) use Vim, and it’s available on Flathub. However: my plugins are managed with Vundle, which shells out to git to download and update them. But git is not in the org.freedesktop.Platform runtime that the Vim Flatpak uses, so that’s not going to work!

If you’ve read any of my recent posts about Flatpak, you’ll know my favourite hammer. I allowed Vim to use flatpak-spawn by launching it as:

flatpak run --talk-name=org.freedesktop.Flatpak org.vim.Vim

I saved the following file to /tmp/git:

#!/bin/sh
exec flatpak-spawn --host git "$@"

then ran the following Vim commands to make it executable, add it to the path, then fire up Vundle:

:r !chmod +x /tmp/git
:let $PATH = '/tmp:/app/bin:/usr/bin'
:VundleInstall

This tricks Vundle into running git outside the sandbox. It worked!

I’m posting this partly as a note to self for next time I want to do this, and partly to say “can we do better?”. In this specific case, the Vim Flatpak could use org.freedesktop.Sdk as its runtime, like many other editors do. But this only solves the problem for tools like git which are included in the relevant SDK. What if I’m writing Python and want to use pyflakes, which is not in the SDK?

Everybody’s Gone To The GUADEC

It’s been ten days since I came back from GUADEC 2018, and I’ve finally caught up enough to find the time to write about it. As ever, it was a pleasure to see familiar faces from around the community, put some new faces to familiar names, and learn some entirely new names and faces! Some talk highlights:

In “Patterns of refactoring C to Rust”, Federico Mena Quintero pulled off the difficult trick of giving a very source code-centric talk without losing the audience. (He said afterwards that the style he used is borrowed from a series of talks he referenced in his slides, but the excellent delivery was certainly a large part of why it worked.)
Christian Hergert and Corentin Noël’s talk on “What’s happening in Builder?” left me feeling good about the future of cross-architecture and cross-device GNOME app development. Developing OS and platform components in a desktop-containerised world is not a fully-solved problem; between upcoming plans for Builder and Philip Chimento’s Flapjack, I think we’re getting there.
I’m well-versed in Flatpak but know very little about Snap, so Robert Ancell’s talk on “Snap Package support in GNOME” was enlightening. It’s heartening that much of the user-facing infrastructure to solve problems common to Snap and Flatpak (such as GNOME Software and portals) is shared, and it was interesting to learn about some of the unique featues of Snap which make it attractive to ISVs.

I couldn’t get to Almería until the Friday evening; I’m looking forward to checking out video recordings of some of the talks I missed. (Shout-out to the volunteers editing these videos! Update: the videos are now mostly published; I’ve added links to the three talks above.)

One of the best bits of any conference is the hallway track, and GUADEC did not disappoint. Fellow Endlesser Carlo Caione and I caught up with Javier Martinez Canillas from Red Hat to discuss some of the boot-loader questions shared between Endless OS and Silverblue, like the downstream Boot Loader Specification module for GRUB, and how to upgrade GRUB itself—which lives outside the atomic world of OSTree—in as robust and crash-proof a manner as is feasible.

On the bus to the campus on Sunday, I had an interesting discussion with Robert Ancell about acquiring domain expertise too late in a project to fix the design decisions made earlier on (which has happened to me a fair few times). While working on LightDM, he avoided this trap by building a thorough integration test suite early on; this allowed him to refactor with confidence as he discovered murky corners of display management. As I understand it (sorry if I’ve misremembered from the noisy bus ride!), he wrote a library which essentially shims every syscall. This made it easier to mock and exercise all the complicated interactions the display manager has with many different parts of the OS via many IPC mechanisms. I always regret it when I procrastinate on test coverage; I’ll keep this discussion in mind as extra ammunition to do the right thing.

My travel to and from Almería was kindly sponsored by the GNOME Foundation. Thank you!

Bustle 0.7.1: jumping the ticket barrier

Bustle 0.7.1 is out now and supports monitoring the system bus, without requiring any prior system configuration. It also lets you monitor any other bus by providing its address, which I’ve already used to spy on ibus traffic.

Screenshot: Bustle monitoring IBus messages.

Bustle used to try to intercept all messages by adding one match rule per message type, with the eavesdrop=true flag set. This is how dbus-monitor also worked when Bustle was created. This works okay on the session bus, but for obvious reasons, a regular user being able to snoop on all messages on the system bus would be undesirable; even if you were root, you had to edit the policy to allow you to do this. So, the Bustle UI didn’t have any facility for monitoring the system bus; instead, the web page had some poorly-specified instructions suggesting you remove all restrictive policies from the system bus, reboot, do your monitoring, then reimpose the security policies. Not very stetic!

D-Bus 1.5.6 ((or version 0.18 of the spec if you’re into that)) added a BecomeMonitor method explicitly designed for debugging tools: if the bus considers you privileged enough, it will send you a copy of every future message that passes through the bus until you disconnect. Privileged enough is technically implementation defined; the reference implementation is that if you are root, or have the same UID as the bus itself, you can become a monitor.

Three panels: woman smiling while looking at computer screen; mouse pointer pointing at 'Become a fan' button from an old version of Facebook; woman has been replaced by a desk fan.

Bustle, which runs as a regular user, needs to outsource the task of monitoring the system bus to some other privileged process. The normal way to do this kind of thing – as used by tools like sysprof – is to have a system service which runs as root and uses polkit to determine whether to allow the request. The D-Bus daemon itself is not about to talk to polkit (itself a D-Bus service), though, and when distributed with Flatpak it’s not possible for Bustle to install its own system service.

I decided to cheat and assume that pkexec – the polkit equivalent of sudo – and dbus-monitor are both installed on the host system. A few years ago, dbus-monitor grew the ability to dump out raw D-Bus messages in pcap format, which by pure coincidence is the on-disk format Bustle uses. So Bustle builds a command line as follows:

If running inside a Flatpak, append flatpak-spawn --host
If trying to monitor the system bus, pkexec
dbus-monitor --pcap [--session | --system | --address ADDRESS]

It launches this subprocess, then reads the stream of messages from its stdout. In the system bus case, polkit will prompt the user to authenticate and approve the action:

Assuming you authenticate successfully, the messages will flow:

There are a few more fiddly details:

If the dbus-monitor subprocess quits unexpectedly, we want to show an error; but if the user hits cancel on the polkit authentication dialog, we don’t. pkexec signals this with exit code 126. Now you know why I care about flatpak-spawn propagating exit codes correctly: without that, Bustle can’t readily distinguish these two cases.
Bustle’s old internal monitor code did an initial state dump of names and their owners when it connected to the bus. dbus-monitor doesn’t do this yet. For now, Bustle waits until it knows the monitor is running, then makes its own connection to the same bus and performs the same state dump, which will be captured by the monitor and end up in the same stream. This means that Bustle in Flatpak still needs access to the session and system buses.
Once started, dbus-monitor only stops when the bus exits, it tries and fails to write to its stdout pipe, or it is killed by a signal. But sending SIGKILL from the unprivileged Bustle process to the privileged dbus-monitor process has no effect. Weirdly, if I run pkexec dbus-monitor --system in a terminal, I can hit Ctrl+C and kill it just fine. (Is this something to do with controlling terminals? Please tell me if you know how I could make this work.) So instead we just close the pipe that dbus-monitor is writing to, and assume that at some point in the future it will try and fail to log a new message to it. ((I’ve realised while writing this post that I can bring “some point in the future” forward by just connecting to the bus again.))
Why bother with the pcap framing when Bustle immediately takes it back apart? It means messages are timestamped in the dbus-monitor process, so they’re marginally more likely to be accurate than they would be if the Bustle UI process (which might be doing some rendering) was adding the timestamps.

On the one hand, I’m pleased with this technique, because it allows me to implement a feature I’ve wanted Bustle to have since I started it in 2008(!). On the other hand, this isn’t going to cut it in the general case of a Flatpaked development tool like sysprof, which needs a system helper but can’t reasonably assume that it’s already installed on the host system. Of course there’s nothing to stop you doing something like flatpak-spawn --host pkexec flatpak run --command=my-system-helper com.example.MyApp ... I suppose…

When is an exit code not an exit code?

TL;DR: I found an interesting bug in flatpak-spawn which taught me that there is a difference between the exit code you pass to exit(), the exit status reported by waitpid(), and the shell variable $?.

One of the goals of Flatpak is to isolate applications from the host system; they can normally only directly run external programs supplied by the Flatpak platform they are built against, rather than whatever executables happen to be installed on the host. But some developer tools do need to be able to run commands on the host system. One example is GNOME Builder, which allows you to compile software on the host; another is flatpak-builder which uses this to build flatpak:s from within a flatpak. (For my part, I’m occasionally working on making Bustle run pkexec dbus-monitor --system on the host, to allow reading all messages on the system bus (a privileged operation) from an unprivileged, sandboxed application. More on this in a future blog post.)

Flatpak’s session helper provides a D-Bus API to do this: a HostCommand method that launches a given command outside the sandbox and returns its process ID; and a HostCommandExited signal which is emitted when the process exists, with its exit status as a uint32. Apps can use this D-Bus API directly, but recent versions of the common runtimes include a wrapper command which is much easier to adapt existing code to use: just replace cat /etc/passwd with flatpak-spawn --host cat /etc/passwd.

In theory, flatpak-spawn --host propagates the exit status from the command it runs, but I found that in practice, it did not. For example, false is a program which does nothing, unsuccessfully:

$ false; echo exit status: $?
1

But when run via flatpak-spawn --host, its exit status is 0:

$ flatpak run --env='PS1=sandbox$ ' \
> --talk-name=org.freedesktop.Flatpak \
> --command=bash org.freedesktop.Sdk//1.6
sandbox$ flatpak-spawn --host false; echo exit status: $?
0

If you care whether the command you launched succeeded, this is problematic! The first clue to what’s going on is in the output of flatpak-spawn --verbose:

sandbox$ flatpak-spawn --verbose --host false; echo exit status: $?
F: child_pid: 18066
F: child exited 18066: 256
exit status: 0

Here’s the code, from the HostCommandExited signal handler:

g_variant_get (parameters, "(uu)", &client_pid, &exit_status);
g_debug ("child exited %d: %d", client_pid, exit_status);

if (child_pid == client_pid)
  exit (exit_status);

So exit_status is 256, even though false actually returns 1. If you read man 3 exit, you will learn:

void exit(int status);

The exit() function causes normal process termination and the value of status & 0377 is returned to the parent (see wait(2)).

256 == 0x0100 and 0377 == 0x00ff; so exit_status & 0377 == 0. Now we know why flatpak-spawn returns 0, but why is exit_status equal to 256 rather than 1 in the first place?

It comes from a g_child_watch_add_full() callback. The g_child_watch_add_full() docs tell us:

In many programs, you will want to call g_spawn_check_exit_status() in the callback to determine whether or not the child exited successfully.

Following the link, we learn:

On Unix, [the exit status] is guaranteed to be in the same format waitpid() returns.

And reading the waitpid() documentation, we finally learn that the exit status is an opaque integer which must be inspected with a set of macros. On Linux, the layout is, roughly:

When a process calls exit(x), the exit status is ((x & 0xff) << 8); the low byte is 0. This explains why the exit_status for false is 256.
When a process is killed by signal y, the exit status is stored in the low byte, with its high bit (0x80) set if the process dumped core. So a process which segfaults and dumps core will have exit status 11 | 0x80 == 11 + 128 == 139

What’s funny about this is that, if the subprocess segfaults and dumps core, when testing from the shell flatpak-spawn --host appears to work.

host$ /home/wjt/segfault; echo exit status: $?
Segmentation fault (core dumped)
exit status: 139
sandbox$ flatpak-spawn --verbose --host /home/wjt/segfault; echo exit status: $?
F: child_pid: 20256
F: child exited 20256: 139
exit status: 139

But there’s a difference between this and a process which actually exits 139:

sandbox$ flatpak-spawn --verbose --host /bin/sh -c 'exit 139'; echo exit status: $?
F: child_pid: 20481
F: child exited 20481: 35584
exit status: 0

I always thought these two were the same. Actually, mapping the signal that killed a process to $? = 128 + signum is just shell convention.

To fix flatpak-spawn, we need to inspect the exit status and recover the exit code or signal. For normal termination, we can pass the exit code to exit(). For signals, the options are:

Reset all signal() handlers to SIG_DFL, then send the signal to ourselves and hope we die
Follow the shell convention and exit(128 + signal number)

I think the former sounds scary and unreliable, so I implemented the latter. Imperfect, but it’ll do.