TL;DR: I found an interesting bug in flatpak-spawn which taught me that there is a difference between the exit code you pass to exit(), the exit status reported by waitpid(), and the shell variable $?.
One of the goals of Flatpak is to isolate applications from the host system; they can normally only directly run external programs supplied by the Flatpak platform they are built against, rather than whatever executables happen to be installed on the host. But some developer tools do need to be able to run commands on the host system. One example is GNOME Builder, which allows you to compile software on the host; another is
flatpak-builder, which uses this to build Flatpaks from within a Flatpak. (For my part, I’m occasionally working on making Bustle run
pkexec dbus-monitor --system on the host, to allow reading all messages on the system bus (a privileged operation) from an unprivileged, sandboxed application. More on this in a future blog post.)
Flatpak’s session helper provides a D-Bus API to do this: a HostCommand method that launches a given command outside the sandbox and returns its process ID; and a HostCommandExited signal which is emitted when the process exits, with its exit status as a uint32. Apps can use this D-Bus API directly, but recent versions of the common runtimes include a wrapper command which makes it much easier to adapt existing code: just replace
cat /etc/passwd with
flatpak-spawn --host cat /etc/passwd.
flatpak-spawn --host is supposed to propagate the exit status of the command it runs, but I found that in practice it did not. For example,
false is a program which does nothing, unsuccessfully:
$ false; echo exit status: $?
But when run via
flatpak-spawn --host, its exit status is 0:
$ flatpak run --env='PS1=sandbox$ ' \
> --talk-name=org.freedesktop.Flatpak \
> --command=bash org.freedesktop.Sdk//1.6
sandbox$ flatpak-spawn --host false; echo exit status: $?
If you care whether the command you launched succeeded, this is problematic! The first clue to what’s going on is in the output of flatpak-spawn --verbose:
sandbox$ flatpak-spawn --verbose --host false; echo exit status: $?
F: child_pid: 18066
F: child exited 18066: 256
exit status: 0
Here’s the code, from the
HostCommandExited signal handler:
g_variant_get (parameters, "(uu)", &client_pid, &exit_status);
g_debug ("child exited %d: %d", client_pid, exit_status);
if (child_pid == client_pid)
When the PIDs match, flatpak-spawn passes exit_status on to exit(). But here exit_status is 256, even though
false actually returns
1. If you read
man 3 exit, you will learn:
void exit(int status);
The exit() function causes normal process termination and the value of
status & 0377 is returned to the parent (see wait(2)).
256 == 0x0100 and
0377 == 0x00ff; so
exit_status & 0377 == 0. Now we know why
flatpak-spawn returns 0, but why is
exit_status equal to 256 rather than 1 in the first place?
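As a quick check of the arithmetic above, here is a minimal C sketch of the truncation that exit(3) describes. The helper name code_seen_by_parent is made up for illustration and is not part of flatpak-spawn.

```c
/* Hypothetical helper, not part of flatpak-spawn: the portion of the
 * value passed to exit() that the parent gets to see, per exit(3). */
int
code_seen_by_parent (int status)
{
  return status & 0377;  /* 0377 is octal for 0xff: keep only the low byte */
}
```

Calling code_seen_by_parent (1) gives 1, but code_seen_by_parent (256) gives 0, which is the value the shell then reports in $?.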
It comes from a
g_child_watch_add_full() callback. The g_child_watch_add_full() docs tell us:
In many programs, you will want to call
g_spawn_check_exit_status() in the callback to determine whether or not the child exited successfully.
Following the link, we learn:
On Unix, [the exit status] is guaranteed to be in the same format waitpid() returns.
And reading the
waitpid() documentation, we finally learn that the exit status is an opaque integer which must be inspected with a set of macros. On Linux, the layout is, roughly:
- When a process calls
exit(x), the exit status is ((x & 0xff) << 8); the low byte is 0. This explains why the exit status for false is 256.
- When a process is killed by signal
y, the exit status has y in the low byte, with its high bit (0x80) set if the process dumped core. So a process which segfaults (signal 11) and dumps core will have exit status
11 | 0x80 == 11 + 128 == 139.
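The layout in the list above can be sketched in C. The first two helpers build a status by hand following that Linux-specific layout, and the last two decode one using the standard macros from <sys/wait.h>, which is the supported way to inspect it. All four function names are hypothetical, chosen for illustration.

```c
#include <sys/wait.h>

/* Hypothetical helpers which build a status by hand, following the
 * Linux-specific layout described above. Real code should never need
 * to do this; it is only here to make the layout concrete. */
int
status_for_exit (int x)
{
  return (x & 0xff) << 8;   /* exit code in the second byte, low byte 0 */
}

int
status_for_signal (int y, int dumped_core)
{
  return (y & 0x7f) | (dumped_core ? 0x80 : 0);   /* signal in the low byte */
}

/* Returns the exit code, or -1 if the process did not exit normally. */
int
exit_code_of (int status)
{
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}

/* Returns the fatal signal number, or -1 if no signal killed the process. */
int
fatal_signal_of (int status)
{
  return WIFSIGNALED (status) ? WTERMSIG (status) : -1;
}
```

On Linux, status_for_exit (1) is 256, and exit_code_of (256) recovers 1, whereas 139 decodes as a death by signal 11 rather than as an exit code.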
What’s funny about this is that, if the subprocess segfaults and dumps core,
flatpak-spawn --host appears to work when tested from the shell:
host$ /home/wjt/segfault; echo exit status: $?
Segmentation fault (core dumped)
exit status: 139
sandbox$ flatpak-spawn --verbose --host /home/wjt/segfault; echo exit status: $?
F: child_pid: 20256
F: child exited 20256: 139
exit status: 139
But there’s a difference between this and a process which actually exits 139:
sandbox$ flatpak-spawn --verbose --host /bin/sh -c 'exit 139'; echo exit status: $?
F: child_pid: 20481
F: child exited 20481: 35584
exit status: 0
I always thought these two were the same. Actually, mapping the signal that killed a process to
$? = 128 + signum is just shell convention.
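To see the distinction without a shell in the way, here is a rough C sketch which forks a child that either exits with a given code or kills itself with a given signal, and returns the raw status from waitpid(). The helper name raw_status_of_child is invented for this example.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child which either sends itself signal
 * sig (if sig is non-zero) or calls _exit (code). Returns the raw
 * status from waitpid(), or -1 if fork() or waitpid() failed. */
int
raw_status_of_child (int code, int sig)
{
  int status;
  pid_t pid = fork ();

  if (pid < 0)
    return -1;

  if (pid == 0)
    {
      if (sig != 0)
        raise (sig);   /* if this does not kill us, fall through */
      _exit (code);    /* _exit() skips atexit handlers and stdio flushing */
    }

  if (waitpid (pid, &status, 0) < 0)
    return -1;

  return status;
}
```

With raw_status_of_child (139, 0), WIFEXITED() is true and WEXITSTATUS() is 139. With raw_status_of_child (0, SIGKILL), WIFSIGNALED() is true and WTERMSIG() is SIGKILL. SIGKILL is used here because it cannot be caught, blocked or ignored, and it does not produce a core dump.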
To fix flatpak-spawn, we need to inspect the exit status and recover the exit code or signal. For normal termination, we can pass the exit code to
exit(). For signals, the options are:
- Reset all signal() handlers to SIG_DFL, then send the signal to ourselves and hope we die
- Follow the shell convention and exit(128 + signal number)
I think the former sounds scary and unreliable, so I implemented the latter. Imperfect, but it’ll do.
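A minimal sketch of the second option follows. It illustrates the idea and is not the actual flatpak-spawn patch; the function name and the fallback value of 255 are assumptions made for this example.

```c
#include <sys/wait.h>

/* Hypothetical helper: turn a waitpid()-style status into a value that
 * is safe to pass to exit(), following the shell convention for
 * processes killed by a signal. */
int
exit_code_from_wait_status (int status)
{
  if (WIFEXITED (status))
    return WEXITSTATUS (status);       /* normal termination */

  if (WIFSIGNALED (status))
    return 128 + WTERMSIG (status);    /* shell convention: 128 + signum */

  return 255;  /* assumption: arbitrary fallback for anything else */
}
```

With the statuses seen earlier (on Linux), 256 maps to 1, 139 maps to 128 + 11 == 139, and 35584 also maps to 139. The last two collide, which is the imperfection mentioned above.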