Using Vundle from the Vim Flatpak

I (mostly) use Vim, and it’s available on Flathub. However: my plugins are managed with Vundle, which shells out to git to download and update them. But git is not in the org.freedesktop.Platform runtime that the Vim Flatpak uses, so that’s not going to work!

If you’ve read any of my recent posts about Flatpak, you’ll know my favourite hammer. I allowed Vim to use flatpak-spawn by launching it as:

flatpak run --talk-name=org.freedesktop.Flatpak org.vim.Vim

I saved the following file to /tmp/git:

#!/bin/sh
exec flatpak-spawn --host git "$@"

then ran the following Vim commands to make it executable, add it to the path, then fire up Vundle:

:r !chmod +x /tmp/git
:let $PATH = '/tmp:/app/bin:/usr/bin'
:VundleInstall

This tricks Vundle into running git outside the sandbox. It worked!

I’m posting this partly as a note to self for next time I want to do this, and partly to say “can we do better?”. In this specific case, the Vim Flatpak could use org.freedesktop.Sdk as its runtime, like many other editors do. But this only solves the problem for tools like git which are included in the relevant SDK. What if I’m writing Python and want to use pyflakes, which is not in the SDK?

Everybody’s Gone To The GUADEC

It’s been ten days since I came back from GUADEC 2018, and I’ve finally caught up enough to find the time to write about it. As ever, it was a pleasure to see familiar faces from around the community, put some new faces to familiar names, and learn some entirely new names and faces! Some talk highlights:

  • In “Patterns of refactoring C to Rust”, Federico Mena Quintero pulled off the difficult trick of giving a very source code-centric talk without losing the audience. (He said afterwards that the style he used is borrowed from a series of talks he referenced in his slides, but the excellent delivery was certainly a large part of why it worked.)
  • Christian Hergert and Corentin Noël’s talk on “What’s happening in Builder?” left me feeling good about the future of cross-architecture and cross-device GNOME app development. Developing OS and platform components in a desktop-containerised world is not a fully-solved problem; between upcoming plans for Builder and Philip Chimento’s Flapjack, I think we’re getting there.
  • I’m well-versed in Flatpak but know very little about Snap, so Robert Ancell’s talk on “Snap Package support in GNOME” was enlightening. It’s heartening that much of the user-facing infrastructure to solve problems common to Snap and Flatpak (such as GNOME Software and portals) is shared, and it was interesting to learn about some of the unique featues of Snap which make it attractive to ISVs.

I couldn’t get to Almería until the Friday evening; I’m looking forward to checking out video recordings of some of the talks I missed. (Shout-out to the volunteers editing these videos! Update: the videos are now mostly published; I’ve added links to the three talks above.)

One of the best bits of any conference is the hallway track, and GUADEC did not disappoint. Fellow Endlesser Carlo Caione and I caught up with Javier Martinez Canillas from Red Hat to discuss some of the boot-loader questions shared between Endless OS and Silverblue, like the downstream Boot Loader Specification module for GRUB, and how to upgrade GRUB itself—which lives outside the atomic world of OSTree—in as robust and crash-proof a manner as is feasible.

On the bus to the campus on Sunday, I had an interesting discussion with Robert Ancell about acquiring domain expertise too late in a project to fix the design decisions made earlier on (which has happened to me a fair few times). While working on LightDM, he avoided this trap by building a thorough integration test suite early on; this allowed him to refactor with confidence as he discovered murky corners of display management. As I understand it (sorry if I’ve misremembered from the noisy bus ride!), he wrote a library which essentially shims every syscall. This made it easier to mock and exercise all the complicated interactions the display manager has with many different parts of the OS via many IPC mechanisms. I always regret it when I procrastinate on test coverage; I’ll keep this discussion in mind as extra ammunition to do the right thing.

My travel to and from Almería was kindly sponsored by the GNOME Foundation. Thank you!

Sponsored by GNOME Foundation

Bustle 0.7.1: jumping the ticket barrier

Bustle 0.7.1 is out now and supports monitoring the system bus, without requiring any prior system configuration. It also lets you monitor any other bus by providing its address, which I’ve already used to spy on ibus traffic.

Screenshot: Bustle monitoring IBus messages.

Bustle used to try to intercept all messages by adding one match rule per message type, with the eavesdrop=true flag set. This is how dbus-monitor also worked when Bustle was created. This works okay on the session bus, but for obvious reasons, a regular user being able to snoop on all messages on the system bus would be undesirable; even if you were root, you had to edit the policy to allow you to do this. So, the Bustle UI didn’t have any facility for monitoring the system bus; instead, the web page had some poorly-specified instructions suggesting you remove all restrictive policies from the system bus, reboot, do your monitoring, then reimpose the security policies. Not very stetic!

D-Bus 1.5.61 added a BecomeMonitor method explicitly designed for debugging tools: if the bus considers you privileged enough, it will send you a copy of every future message that passes through the bus until you disconnect. Privileged enough is technically implementation defined; the reference implementation is that if you are root, or have the same UID as the bus itself, you can become a monitor.

Three panels: woman smiling while looking at computer screen; mouse pointer pointing at 'Become a fan' button from an old version of Facebook; woman has been replaced by a desk fan.

Bustle, which runs as a regular user, needs to outsource the task of monitoring the system bus to some other privileged process. The normal way to do this kind of thing – as used by tools like sysprof – is to have a system service which runs as root and uses polkit to determine whether to allow the request. The D-Bus daemon itself is not about to talk to polkit (itself a D-Bus service), though, and when distributed with Flatpak it’s not possible for Bustle to install its own system service.

I decided to cheat and assume that pkexec – the polkit equivalent of sudo – and dbus-monitor are both installed on the host system. A few years ago, dbus-monitor grew the ability to dump out raw D-Bus messages in pcap format, which by pure coincidence is the on-disk format Bustle uses. So Bustle builds a command line as follows:

  • If running inside a Flatpak, append flatpak-spawn --host
  • If trying to monitor the system bus, pkexec
  • dbus-monitor --pcap [--session | --system | --address ADDRESS]

It launches this subprocess, then reads the stream of messages from its stdout. In the system bus case, polkit will prompt the user to authenticate and approve the action:

Screenshot: Authentication is needed to run /usr/bin/dbus-monitor as the super user.

Assuming you authenticate successfully, the messages will flow:

Screenshot: Bustle recording messages on the system bus.

There are a few more fiddly details:

  • If the dbus-monitor subprocess quits unexpectedly, we want to show an error; but if the user hits cancel on the polkit authentication dialog, we don’t. pkexec signals this with exit code 126. Now you know why I care about flatpak-spawn propagating exit codes correctly: without that, Bustle can’t readily distinguish these two cases.
  • Bustle’s old internal monitor code did an initial state dump of names and their owners when it connected to the bus. dbus-monitor doesn’t do this yet. For now, Bustle waits until it knows the monitor is running, then makes its own connection to the same bus and performs the same state dump, which will be captured by the monitor and end up in the same stream. This means that Bustle in Flatpak still needs access to the session and system buses.
  • Once started, dbus-monitor only stops when the bus exits, it tries and fails to write to its stdout pipe, or it is killed by a signal. But sending SIGKILL from the unprivileged Bustle process to the privileged dbus-monitor process has no effect. Weirdly, if I run pkexec dbus-monitor --system in a terminal, I can hit Ctrl+C and kill it just fine. (Is this something to do with controlling terminals? Please tell me if you know how I could make this work.) So instead we just close the pipe that dbus-monitor is writing to, and assume that at some point in the future it will try and fail to log a new message to it.2
  • Why bother with the pcap framing when Bustle immediately takes it back apart? It means messages are timestamped in the dbus-monitor process, so they’re marginally more likely to be accurate than they would be if the Bustle UI process (which might be doing some rendering) was adding the timestamps.

On the one hand, I’m pleased with this technique, because it allows me to implement a feature I’ve wanted Bustle to have since I started it in 2008(!). On the other hand, this isn’t going to cut it in the general case of a Flatpaked development tool like sysprof, which needs a system helper but can’t reasonably assume that it’s already installed on the host system. Of course there’s nothing to stop you doing something like flatpak-spawn --host pkexec flatpak run --command=my-system-helper com.example.MyApp ... I suppose…

  1. or version 0.18 of the spec if you’re into that []
  2. I’ve realised while writing this post that I can bring “some point in the future” forward by just connecting to the bus again. []

When is an exit code not an exit code?

TL;DR: I found an interesting bug in flatpak-spawn which taught me that there is a difference between the exit code you pass to exit(), the exit status reported by waitpid(), and the shell variable $?.

One of the goals of Flatpak is to isolate applications from the host system; they can normally only directly run external programs supplied by the Flatpak platform they are built against, rather than whatever executables happen to be installed on the host. But some developer tools do need to be able to run commands on the host system. One example is GNOME Builder, which allows you to compile software on the host; another is flatpak-builder which uses this to build flatpak:s from within a flatpak. (For my part, I’m occasionally working on making Bustle run pkexec dbus-monitor --system on the host, to allow reading all messages on the system bus (a privileged operation) from an unprivileged, sandboxed application. More on this in a future blog post.)

Flatpak’s session helper provides a D-Bus API to do this: a HostCommand method that launches a given command outside the sandbox and returns its process ID; and a HostCommandExited signal which is emitted when the process exists, with its exit status as a uint32. Apps can use this D-Bus API directly, but recent versions of the common runtimes include a wrapper command which is much easier to adapt existing code to use: just replace cat /etc/passwd with flatpak-spawn --host cat /etc/passwd.

In theory, flatpak-spawn --host propagates the exit status from the command it runs, but I found that in practice, it did not. For example, false is a program which does nothing, unsuccessfully:

$ false; echo exit status: $?
1

But when run via flatpak-spawn --host, its exit status is 0:

$ flatpak run --env='PS1=sandbox$ ' \
> --talk-name=org.freedesktop.Flatpak \
> --command=bash org.freedesktop.Sdk//1.6
sandbox$ flatpak-spawn --host false; echo exit status: $?
0

If you care whether the command you launched succeeded, this is problematic! The first clue to what’s going on is in the output of flatpak-spawn --verbose:

sandbox$ flatpak-spawn --verbose --host false; echo exit status: $?
F: child_pid: 18066
F: child exited 18066: 256
exit status: 0

Here’s the code, from the HostCommandExited signal handler:

g_variant_get (parameters, "(uu)", &client_pid, &exit_status);
g_debug ("child exited %d: %d", client_pid, exit_status);

if (child_pid == client_pid)
  exit (exit_status);

So exit_status is 256, even though false actually returns 1. If you read man 3 exit, you will learn:

void exit(int status);

The exit() function causes normal process termination and the value of status & 0377 is returned to the parent (see wait(2)).

256 == 0x0100 and 0377 == 0x00ff; so exit_status & 0377 == 0. Now we know why flatpak-spawn returns 0, but why is exit_status equal to 256 rather than 1 in the first place?

It comes from a g_child_watch_add_full() callback. The g_child_watch_add_full() docs tell us:

In many programs, you will want to call g_spawn_check_exit_status() in the callback to determine whether or not the child exited successfully.

Following the link, we learn:

On Unix, [the exit status] is guaranteed to be in the same format waitpid() returns.

And reading the waitpid() documentation, we finally learn that the exit status is an opaque integer which must be inspected with a set of macros. On Linux, the layout is, roughly:

  • When a process calls exit(x), the exit status is ((x & 0xff) << 8); the low byte is 0. This explains why the exit_status for false is 256.
  • When a process is killed by signal y, the exit status is stored in the low byte, with its high bit (0x80) set if the process dumped core. So a process which segfaults and dumps core will have exit status 11 | 0x80 == 11 + 128 == 139

What’s funny about this is that, if the subprocess segfaults and dumps core, when testing from the shell flatpak-spawn --host appears to work.

host$ /home/wjt/segfault; echo exit status: $?
Segmentation fault (core dumped)
exit status: 139
sandbox$ flatpak-spawn --verbose --host /home/wjt/segfault; echo exit status: $?
F: child_pid: 20256
F: child exited 20256: 139
exit status: 139

But there’s a difference between this and a process which actually exits 139:

sandbox$ flatpak-spawn --verbose --host /bin/sh -c 'exit 139'; echo exit status: $?
F: child_pid: 20481
F: child exited 20481: 35584
exit status: 0

I always thought these two were the same. Actually, mapping the signal that killed a process to $? = 128 + signum is just shell convention.

To fix flatpak-spawn, we need to inspect the exit status and recover the exit code or signal. For normal termination, we can pass the exit code to exit(). For signals, the options are:

  • Reset all signal() handlers to SIG_DFL, then send the signal to ourselves and hope we die
  • Follow the shell convention and exit(128 + signal number)

I think the former sounds scary and unreliable, so I implemented the latter. Imperfect, but it’ll do.

Everything In Its Right Place

Back in July, I wrote about trying to get Endless OS working on DVDs. To recap: we have published live ISO images of Endless OS for a while, but until recently if you burned one to a DVD and tried to boot it, you’d get the Endless boot-splash, a lot of noise from the DVD drive, and not much else. Definitely no functioning desktop or installer!

I’m happy to say that Endless OS 3.3 boots from a DVD. The problems basically boiled down to long seek times, which are made worse by data not being arranged in any particular order on the disk. Fixing this had the somewhat unexpected benefit of improving boot performance on fixed disks, too. For the gory details, read on!

The initial problem that caused the boot process to hang was that the D-Bus system bus took over a minute to start. Most D-Bus clients assume that any method call will get a reply within 25 seconds, and fail particularly badly if method calls to the bus itself time out. In particular, systemd calls a number of methods on the system bus right after it launches it; if these calls fail, D-Bus service activation will not work. iotop and systemd-analyze plot strongly suggested that dbus-daemon was competing for IO with systemd-udevd, modprobe incantations, etc. Booting other distros’ ISOs, I noticed local-fs.target had a (transitive) dependency on systemd-udev-settle.service, which as the name suggests waits for udev to settle down.1 This gets most hardware discovery out of the way before D-Bus and friends get started; doing the same in our ISOs means D-Bus starts relatively quickly and the boot process can continue.

Even with this change, and many smaller changes to remove obviously-unnecessary work from the boot sequence, DVDs took unacceptably long to reach the first-boot experience. This is essentially due to reading lots of small files which are scattered all over the disk: the laser has to be physically repositioned whenever you need to seek to a different part of the DVD, which is extremely slow. For example, initialising IBus involves running ibus-engine-m17n --xml which reads hundreds of tiny files. They’re all in the same directory, but are not necessarily physically close to one another on the disk. On an otherwise idle system with caches flushed, running this command from an loopback-mounted ISO file on an SSD took 0.82 seconds, which we can assume is basically all squashfs decompression overhead. From a DVD, this command took 40 seconds!

What to do? Our systemd is patched to resurrect systemd-readahead (which was removed upstream some time ago) because many of our target systems have spinning disks, and readahead improves boot performance substantially on those systems. It records which files are accessed during the boot sequence to a pack file; early in the next boot, the pack file is replayed using posix_fadvise(..., POSIX_FADV_WILLNEED); to instruct the kernel that these files will be accessed soon, allowing them to be fetched eagerly, in an order matching the on-disk layout. We include a pack file collected from a representative system in our OS images to have something to work from during the first boot.

This means we already have a list of all2 files which are accessed during the boot process, so we can arrange them contiguously on the disk. The main stumbling block is that our ISOs (like most distros’) contain an ext4 filesystem image, inside a GPT disk image, inside a squashfs filesystem image, and ext4 does not (to my knowledge!) provide a straightforward way to move certain files to a particular region of the disk. To work around this, we adapt a trick from Fedora’s livecd-tools, and create the ext4 image in two passes. First, we calculate the size of the files listed in the readahead pack file (it’s about 200MB), add a bit for filesystem overhead, create an ext4 image which is only just large enough to hold these files, and copy them in. Then we grow the filesystem image to its final size (around 10GB, uncompressed, for a DVD-sized image) and copy the rest of the filesystem contents. This ensures that the files used during boot are mostly contiguous, near the start of the disk.3

Does this help? Running ibus-engine-m17n --xml on a DVD prepared this way takes 5.6 seconds, an order of magnitude better than the 40 seconds observed on an unordered DVD, and booting the DVD is more than a minute faster than before this change. Hooray!

Due to the way our image build and install process works, the GPT disk image inside the ISO is the same one that gets written to disk when you install Endless OS. So: how will this trick affect the installed system? One potential problem is that mke2fs uses the filesystem size to determine various attributes, like block and inode sizes, and 200MB is small enough to trigger the small profile. So we pass -T default to explicitly select more appropriate parameters for the final filesystem size.4 As far as I can tell, the only impact on installed systems is positive: spinning disks also have high seek latency, and this change cuts 15% off the boot time on a Mission One. Of course, this will gradually decay when the OS is updated, since new files used at boot will not be contiguous, but it’s still nice to have. (In the back of my mind, I’ve always wondered why boot times always get worse across the lifetime of a device; this is the first time I’ve deliberately caused this to be the case.)

The upshot: from Endless OS 3.3 onwards, ISOs boot when written to DVD. However, almost all of our ISOs are larger than 4.7 GB! You can grab the Basic version, which does fit, from the Linux/Mac tab on our website and give it a try. I hope we’ll make more DVD-sized ISOs available in a future release. New installations of Endless OS 3.3 or newer should boot a bit more quickly on rotating hard disks, too. (Running the dual-boot installer for Windows from a DVD doesn’t work yet; for a workaround, copy all the files off the DVD and run them from the hard disk.)

Oh, and the latency simulation trick I described? Since it delays reads, not seeks, it is actually not a good enough simulation when the difference between the two matters, so I did end up burning dozens of DVD+Rs. Accurate simulation of optical drive performance would be a nice option in virtualisation software, if any Boxes or VirtualBox developers are reading!

  1. Fedora’s is via dmraid-activation.service, which may or may not be deliberate; anecdotally, SUSE DVDs deliberately add a dependency for this reason. []
  2. or at least the majority of []
  3. When I described this technique internally at Endless, Juan Pablo pointed out that DVDs can actually read data faster from the end (outside) of the disk. The outside of the disk has more data per rotation than the centre, and the disk spins at a constant rotation speed. A quick test with dd shows that my drive is twice as fast reading data from the end of the disk compared to the start. It’s harder to put the files at the end of the ext4 image, but we might be able to artificially fragment the squashfs image to put the first few hundred MBs of its contents at the end. []
  4. After Endless OS is installed, the filesystem is resized again to fill the free space on disk. []

Endless Reddit AMA

Along with many colleagues from all across the company (and globe), I’m taking part in Endless’s first Reddit Ask Me Anything today. From my perspective in London it starts at 5pm on Wednesday 11th; check our website for a countdown or this helpful table of time conversions.

Have you been wanting to ask about our work and products in a public forum, but never found a socially-acceptable opportunity? Now’s your chance!

Simulating read latency with device-mapper

Like most distros, Endless OS is available as a hybrid ISO 9660 image. The main uses (in my experience) of these images are to attach to a virtual machine’s emulated optical drive, or to write them to a USB flash drive. In both cases, disk access is relatively fast.

A few people found that our ISOs don’t always boot properly when written to a DVD. It seems to be machine-dependent and non-deterministic, and the journal from failed boots shows lots of things timing out, which suggests that it’s something to do with slower reads – and higher seek times – on optical media. I dug out my eight-year-old USB DVD-R drive, but didn’t have any blank discs and really didn’t want to have to keep burning DVDs on a hot summer day. It turned out to be pretty easy to reproduce using qemu-kvm plus device-mapper’s delay target.

According to AnandTech, DVD seek times are somewhere in the region of 90-135ms. It’s not a perfect simulation but we can create a loopback device backed by the ISO image (which lives on a fast SSD), then create a device-mapper device backed by the loopback device that delays all reads by 125 ms (for the sake of argument), and boot it:

$ sudo losetup --find --show \
  eos-eos3.1-amd64-amd64.170520-055517.base.iso
/dev/loop0
$ echo "0 $(sudo blockdev --getsize /dev/loop0)" \
  "delay /dev/loop0 0 125" \
  | sudo dmsetup create delayed-loop0
$ qemu-kvm -cdrom /dev/mapper/delayed-loop0 -m 1GB

Sure enough, this fails with exactly the same symptoms we see booting a real DVD. (It really helps to remember the -m 1GB because modern desktop Linux distros do not boot very well if you only allow them QEMU’s default 128MB of RAM.)

Computer discoveries from February 2016

I found a text file named TIL.md lying around on my computer, with one section dated 17th February 2016. Apparently I’d planned to keep a log of the weird or interesting computer things I learned each day, but forgot after a day. I’d also forgotten all the facts in the file and was surprised afresh. Maybe you’ll be surprised too:

  • Windows’ shell and user interface do not support filenames with trailing spaces, so if you have a directory called worstever.christmas˽ (where ˽ represents a space) on your Unix fileserver, and serve it to Windows over SMB, you’ll see a filename like CQHNYI~0. I think this is the DOS-style 8.3 compatibility filename but I’m not sure where it gets generated in this case – Samba?
  • TIFF files can contain multiple images.
  • If you have a multi-subfile TIFF, multi.tiff, and run convert multi.tiff multi.jpeg, you will not get back a file called multi.jpeg; convert will silently assume you meant convert multi.tiff multi-%d.jpeg and give you back multi-0.jpeg, multi-1.jpeg, etc.

For some context: at the time, I was trying to work out why a script that imported a few tens of thousands of photographs into pan.do/ra – which doesn’t like TIFFs – had skipped some photographs, and imported others as a blank white rectangle; and why a Windows application pointed at the same fileserver showed a different number of photographs again. This was also the first time I encountered an inadvertent homoglyph attack: x.jpg and х.jpg are indistinguishable in most fonts.

My next EP will be released as a corrupted GPT image

Since July last year I’ve been working at Endless Computers on the downloadable edition of Endless OS.1 A big part of my work has been the Endless Installer for Windows: a Wubi-esque tool that “installs” Endless OS as a gigantic image file in your Windows partition2, sparing you the need to install via a USB stick and make destructive changes like repartitioning your drive. It’s derived from Rufus, the Reliable USB Formatting Utility, and our friends at Movial did a lot of the heavy lifting of turning it to our installer.

Endless OS is distributed as a compressed disk image, so you just write it to disk to install it. On first boot, it resizes itself to fill the whole disk. So, to “install” it to a file we decompress the image file, then extend it to the desired length. When booting, in principle we want to loopback-mount the image file and treat that as the root device. But there’s a problem: NTFS-3G, the most mature NTFS implementation for Linux, runs in userspace using FUSE. There are some practical problems arranging for the userspace processes to survive the transition out of the initramfs, but the bigger problem is that accessing a loopback-mounted image on an NTFS partition is slow, presumably because every disk access has an extra round-trip to userspace and back. Is there some way we can avoid this performance penalty?

Robert McQueen and Daniel Drake came up with a neat solution: map the file’s contents directly, using device mapper. Daniel wrote a little tool, ntfsextents, which uses the ntfs-3g library to find the position and size (in bytes) within the partition of each chunk of the Endless OS image file.3 We feed these to dm-setup to create a block device corresponding to the Endless OS image, and then boot from that – bypassing NTFS entirely! There’s no more overhead than an LVM root filesystem.

This is safe provided that you disallow concurrent modification of the image file via NTFS (which we do), and provided that you get the mapping right. If you’ve ensured that the image file is not encrypted, compressed, or sparse, and if ntfsextents is bug-free, then what could go wrong?

Unfortunately, we saw some weird problems as people started to use this installation method. At first, everything would work fine, but after a few days the OS image would suddenly stop booting. For some reason, this always seemed to happen in the second half of the week. We inspected some affected image files and found that, rather than ending in the secondary GPT header as you’d expect, they ended in zeros. Huh?

We were calling SetEndOfFile to extend the image file. It’s documented to “[set] the physical file size for the specified file”, and “if the file is extended, the contents of the file between the old end of the file and the new end of the file are not defined”. For our purposes this seems totally fine: the extended portion will be used as extra free space by Endless OS, so its contents don’t matter, but we need it to be fully physically allocated so we can use the extra space. But we missed an important detail! NTFS maintains two lengths for each file: the allocation size (“the size of the space that is allocated for a file on a disk”), and the valid data length (“the length of the data in a file that is actually written”).4 SetEndOfFile only updates the former, not the latter. When using an NTFS driver, reads past the valid data length return zero, rather than leaking whatever happens to be on the disk. When you write past the valid data length, the NTFS driver initializes the intervening bytes to zero as needed. We’re not using an NTFS driver, so were happily writing into this twilight zone of allocated-but-uninitialized bytes without updating the valid data length; but when the file is defragmented, the physical contents past the valid data length are not copied to their new home on the disk (what would be the point? it’s just uninitialized data, right?). So defragmenting the file would corrupt the Endless OS image.

One could fix this in our installer in two ways: write a byte at the end of the file (forcing the NTFS driver to write tens of gigabytes of zeros to initialize the file), or use SetFileValidData to mark the unused space as valid without actually initializing it. We chose the latter: installing a new OS is already a privileged operation, and the permissions on the Endless OS image file are set to deny read access to mere mortals, so it’s safe to avoid the cost of writing ten billion zeros.5

We weren’t quite home and dry yet, though: some users were still seeing their Endless OS image file corrupting itself after a few days. Having been burned once, we guessed this might be the defragmenter at work again. It turned out to be a quirk of how chunks of a file which happen to be adjacent can be represented, which we were not handling correctly in ntfsextents, leading us to map parts of the file more than once, like a glitchy tape loop. (We got lucky here: at least all the bytes we mapped really were part of the image file. Imagine if we’d mapped some arbitrary other part of the Windows system drive and happily scribbled over it…)

(Oh, why did these problems surface in the second half of any given week? By default, Windows defragments the system drive at 1am every Wednesday, or as soon as possible after that.)

  1. If you’re not familiar with Endless OS, it’s a GNOME- and Debian-derived desktop distribution, focused on reliable, easy-to-use computing for everyone. There was lots of nice coverage from CES last week. People seem particularly taken by the forthcoming “flip the window to edit the app” feature. []
  2. and configures a bootloader – more on this in a future post… []
  3. See debian/patches/endless*.patch in our ntfs-3g source package. []
  4. I gather many other filesystems do the same. []
  5. A note on the plural of “zero”: I conducted a poll on Twitter but chose to disregard the result when it was pointed out that MATLAB and NumPy both spell it without an “e”. See? No need to blindly implement the result of a non-binding referendum! []

Machine-specific Git config changes

Update (2018-03-28): if you have work and personal projects on the same machine, a better way to do this is to put all your work projects in one directory and use conditional configuration includes, introduced in Git 2.13.

I store my .gitconfig in Git, naturally. It contains this block:

[user]
        email = will@willthompson.co.uk
        name = Will Thompson

which is fine until I want to use a different email address for all commits on my work machine, without needing git config user.email in every working copy. In the past I’ve just made a local branch of the config, merging and cherry-picking as needed to keep in sync with the master version, but I noticed that Git reads four different config files, in this order, with later entries overriding earlier entries:

  1. /etc/gitconfig – system-wide stuff, doesn’t help on multi-user machines
  2. $XDG_CONFIG_HOME/git/config (aka ~/.config/git/config) – news to me!
  3. ~/.gitconfig
  4. $GIT_DIR/config – per-repo, irrelevant here

So here’s the trick: put the standard config file at ~/.config/git/config, and then override the email address in ~/.gitconfig:

[user]
        email = wjt@endlessm.com

Ta-dah! Machine-specific Git config overrides. The spanner in the works is that git config --global always updates ~/.gitconfig if it exists, but it’s a start.