Simulating read latency with device-mapper

Like most distros, Endless OS is available as a hybrid ISO 9660 image. In my experience, these images are mainly either attached to a virtual machine’s emulated optical drive or written to a USB flash drive. In both cases, disk access is relatively fast.

A few people found that our ISOs don’t always boot properly when written to a DVD. It seems to be machine-dependent and non-deterministic, and the journal from failed boots shows lots of things timing out, which suggests that it’s something to do with slower reads – and higher seek times – on optical media. I dug out my eight-year-old USB DVD-R drive, but didn’t have any blank discs and really didn’t want to have to keep burning DVDs on a hot summer day. It turned out to be pretty easy to reproduce using qemu-kvm plus device-mapper’s delay target.

According to AnandTech, DVD seek times are somewhere in the region of 90–135 ms. It’s not a perfect simulation, but we can create a loopback device backed by the ISO image (which lives on a fast SSD), then create a device-mapper device backed by the loopback device that delays all reads by 125 ms (for the sake of argument), and boot it:

$ sudo losetup --find --show \
  eos-eos3.1-amd64-amd64.170520-055517.base.iso
/dev/loop0
$ echo "0 $(sudo blockdev --getsize /dev/loop0)" \
  "delay /dev/loop0 0 125" \
  | sudo dmsetup create delayed-loop0
$ qemu-kvm -cdrom /dev/mapper/delayed-loop0 -m 1GB

Sure enough, this fails with exactly the same symptoms we see booting a real DVD. (It really helps to remember the -m 1GB because modern desktop Linux distros do not boot very well if you only allow them QEMU’s default 128MB of RAM.)
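To try other points in that 90–135 ms range, the table line can be generated by a small helper. This is only a sketch: delay_table is a name made up here, the sector count in the example is a placeholder, and it assumes the single-device form of the delay target used above, which as far as I know delays every I/O to the device (harmless for a read-only ISO).

```shell
# delay_table SECTORS DEVICE DELAY_MS
# Print a one-line device-mapper table that maps the whole of DEVICE
# through the "delay" target. (delay_table is a hypothetical helper name.)
delay_table() {
    sectors=$1
    device=$2
    delay_ms=$3
    # Fields: start sector, length in sectors, target, device, offset, delay (ms)
    echo "0 $sectors delay $device 0 $delay_ms"
}

# Example: the same shape of table as above, with a 90 ms delay and a
# placeholder sector count
delay_table 3500000 /dev/loop0 90
```

In practice the sector count would come from sudo blockdev --getsize /dev/loop0, and the output would be piped to sudo dmsetup create delayed-loop0. Afterwards, sudo dmsetup remove delayed-loop0 followed by sudo losetup --detach /dev/loop0 tears the devices down again.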

Computer discoveries from February 2016

I found a text file named TIL.md lying around on my computer, with one section dated 17th February 2016. Apparently I’d planned to keep a log of the weird or interesting computer things I learned each day, but forgot after a day. I’d also forgotten all the facts in the file and was surprised afresh. Maybe you’ll be surprised too:

  • Windows’ shell and user interface do not support filenames with trailing spaces, so if you have a directory called worstever.christmas˽ (where ˽ represents a space) on your Unix fileserver, and serve it to Windows over SMB, you’ll see a filename like CQHNYI~0. I think this is the DOS-style 8.3 compatibility filename but I’m not sure where it gets generated in this case – Samba?
  • TIFF files can contain multiple images.
  • If you have a multi-subfile TIFF, multi.tiff, and run convert multi.tiff multi.jpeg, you will not get back a file called multi.jpeg; convert will silently assume you meant convert multi.tiff multi-%d.jpeg and give you back multi-0.jpeg, multi-1.jpeg, etc.

For some context: at the time, I was trying to work out why a script that imported a few tens of thousands of photographs into pan.do/ra – which doesn’t like TIFFs – had skipped some photographs, and imported others as a blank white rectangle; and why a Windows application pointed at the same fileserver showed a different number of photographs again. This was also the first time I encountered an inadvertent homoglyph attack: x.jpg and х.jpg are indistinguishable in most fonts.
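Both the trailing-space names and the homoglyphs can be flagged mechanically before an import. The following is a rough sketch rather than what I ran at the time: suspect_names is a hypothetical helper that reads one file name per line and prints any name that ends in a space or contains a byte outside printable ASCII. It assumes a grep that supports -a (GNU and BSD grep do) and it does not cope with newlines inside file names.

```shell
# suspect_names: read one file name per line on stdin and print those that
# end in a space or contain any byte outside printable ASCII.
# (Hypothetical helper; names containing newlines are not handled.)
suspect_names() {
    # LC_ALL=C makes grep compare raw bytes; -a stops it treating the
    # input as binary data.
    LC_ALL=C grep -a -e ' $' -e '[^ -~]'
}

# Three names: Latin "x.jpg", Cyrillic "х.jpg" (UTF-8 bytes 0xD1 0x85,
# written as octal escapes so the difference is visible), and a name with
# a trailing space.
printf 'x.jpg\n\321\205.jpg\nworstever.christmas \n' | suspect_names
```

Only the last two names should be printed. Feeding it the output of find over the photo directory would give a list of names worth checking by hand; a legitimately accented file name will show up too, so it is a prompt for inspection, not a verdict.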

My next EP will be released as a corrupted GPT image

Since July last year I’ve been working at Endless Computers on the downloadable edition of Endless OS.[1] A big part of my work has been the Endless Installer for Windows: a Wubi-esque tool that “installs” Endless OS as a gigantic image file in your Windows partition[2], sparing you the need to install via a USB stick and make destructive changes like repartitioning your drive. It’s derived from Rufus, the Reliable USB Formatting Utility, and our friends at Movial did a lot of the heavy lifting of turning it into our installer.

Endless OS is distributed as a compressed disk image, so you just write it to disk to install it. On first boot, it resizes itself to fill the whole disk. So, to “install” it to a file we decompress the image file, then extend it to the desired length. When booting, in principle we want to loopback-mount the image file and treat that as the root device. But there’s a problem: NTFS-3G, the most mature NTFS implementation for Linux, runs in userspace using FUSE. There are some practical problems arranging for the userspace processes to survive the transition out of the initramfs, but the bigger problem is that accessing a loopback-mounted image on an NTFS partition is slow, presumably because every disk access has an extra round-trip to userspace and back. Is there some way we can avoid this performance penalty?

Robert McQueen and Daniel Drake came up with a neat solution: map the file’s contents directly, using device mapper. Daniel wrote a little tool, ntfsextents, which uses the ntfs-3g library to find the position and size (in bytes) within the partition of each chunk of the Endless OS image file.[3] We feed these to dmsetup to create a block device corresponding to the Endless OS image, and then boot from that – bypassing NTFS entirely! There’s no more overhead than an LVM root filesystem.
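As a rough illustration of that last step, here is how a list of extents could be turned into a device-mapper table using the linear target. This is a sketch under assumptions, not the code we ship: the input format (one “byte_offset byte_length” pair per line, in file order) is invented for the example and is not necessarily what ntfsextents prints, and extents_to_dm_table is a hypothetical name.

```shell
# extents_to_dm_table DEVICE
# Read "byte_offset byte_length" pairs on stdin, in file order, and print a
# device-mapper table that concatenates them using the "linear" target.
# (Hypothetical helper; the input format is an assumption.)
extents_to_dm_table() {
    device=$1
    start=0   # running position within the new device, in 512-byte sectors
    while read -r offset length; do
        # Device-mapper works in 512-byte sectors, so refuse anything
        # that is not sector-aligned rather than silently rounding.
        if [ $((offset % 512)) -ne 0 ] || [ $((length % 512)) -ne 0 ]; then
            echo "extent not sector-aligned: $offset $length" >&2
            return 1
        fi
        sectors=$((length / 512))
        # Fields: start, length, target, backing device, offset on that device
        echo "$start $sectors linear $device $((offset / 512))"
        start=$((start + sectors))
    done
}

# Example: two extents on a made-up partition
printf '1048576 4096\n8388608 8192\n' | extents_to_dm_table /dev/sda2
```

Each output line maps the next run of sectors of the new device onto one chunk of the partition; piping the result to sudo dmsetup create with a name of your choosing would create the block device.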

This is safe provided that you disallow concurrent modification of the image file via NTFS (which we do), and provided that you get the mapping right. If you’ve ensured that the image file is not encrypted, compressed, or sparse, and if ntfsextents is bug-free, then what could go wrong?

Unfortunately, we saw some weird problems as people started to use this installation method. At first, everything would work fine, but after a few days the OS image would suddenly stop booting. For some reason, this always seemed to happen in the second half of the week. We inspected some affected image files and found that, rather than ending in the secondary GPT header as you’d expect, they ended in zeros. Huh?

We were calling SetEndOfFile to extend the image file. It’s documented to “[set] the physical file size for the specified file”, and “if the file is extended, the contents of the file between the old end of the file and the new end of the file are not defined”. For our purposes this seems totally fine: the extended portion will be used as extra free space by Endless OS, so its contents don’t matter, but we need it to be fully physically allocated so we can use the extra space.

But we missed an important detail! NTFS maintains two lengths for each file: the allocation size (“the size of the space that is allocated for a file on a disk”), and the valid data length (“the length of the data in a file that is actually written”).[4] SetEndOfFile only updates the former, not the latter. When using an NTFS driver, reads past the valid data length return zero, rather than leaking whatever happens to be on the disk. When you write past the valid data length, the NTFS driver initializes the intervening bytes to zero as needed.

We’re not using an NTFS driver, so we were happily writing into this twilight zone of allocated-but-uninitialized bytes without updating the valid data length; but when the file is defragmented, the physical contents past the valid data length are not copied to their new home on the disk (what would be the point? it’s just uninitialized data, right?). So defragmenting the file would corrupt the Endless OS image.

One could fix this in our installer in two ways: write a byte at the end of the file (forcing the NTFS driver to write tens of gigabytes of zeros to initialize the file), or use SetFileValidData to mark the unused space as valid without actually initializing it. We chose the latter: installing a new OS is already a privileged operation, and the permissions on the Endless OS image file are set to deny read access to mere mortals, so it’s safe to avoid the cost of writing ten billion zeros.[5]

We weren’t quite home and dry yet, though: some users were still seeing their Endless OS image file corrupting itself after a few days. Having been burned once, we guessed this might be the defragmenter at work again. It turned out to be a quirk of how chunks of a file which happen to be adjacent can be represented, which we were not handling correctly in ntfsextents, leading us to map parts of the file more than once, like a glitchy tape loop. (We got lucky here: at least all the bytes we mapped really were part of the image file. Imagine if we’d mapped some arbitrary other part of the Windows system drive and happily scribbled over it…)
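A cheap sanity check of the following kind would flag a double mapping before it reaches dmsetup. To be clear, this is a hypothetical sketch and not the fix we made in ntfsextents: check_extents is an invented name, and it assumes the same made-up “byte_offset byte_length” per-line input. It sorts the extents by their position on disk and fails if any extent starts before the previous one ends.

```shell
# check_extents: read "byte_offset byte_length" pairs on stdin and succeed
# only if no two extents cover the same bytes of the partition.
# (Hypothetical helper and input format.)
check_extents() {
    sorted=$(sort -n)   # order the extents by on-disk offset
    prev_end=0
    while read -r offset length; do
        [ -n "$offset" ] || continue   # skip blank lines (e.g. empty input)
        if [ "$offset" -lt "$prev_end" ]; then
            echo "extent at byte $offset overlaps the previous one" >&2
            return 1
        fi
        prev_end=$((offset + length))
    done <<EOF
$sorted
EOF
    return 0
}

# Example: the second extent starts inside the first, so this reports an overlap
printf '0 4096\n2048 4096\n' | check_extents || echo "refusing to map"
```

It only checks for overlap; comparing the sum of the lengths against the expected image size would be a natural second check.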

(Oh, why did these problems surface in the second half of any given week? By default, Windows defragments the system drive at 1am every Wednesday, or as soon as possible after that.)

  1. If you’re not familiar with Endless OS, it’s a GNOME- and Debian-derived desktop distribution, focused on reliable, easy-to-use computing for everyone. There was lots of nice coverage from CES last week. People seem particularly taken by the forthcoming “flip the window to edit the app” feature.
  2. and configures a bootloader – more on this in a future post…
  3. See debian/patches/endless*.patch in our ntfs-3g source package.
  4. I gather many other filesystems do the same.
  5. A note on the plural of “zero”: I conducted a poll on Twitter but chose to disregard the result when it was pointed out that MATLAB and NumPy both spell it without an “e”. See? No need to blindly implement the result of a non-binding referendum!

Machine-specific Git config changes

I store my .gitconfig in Git, naturally. It contains this block:

[user]
        email = will@willthompson.co.uk
        name = Will Thompson

which is fine until I want to use a different email address for all commits on my work machine, without needing git config user.email in every working copy. In the past I’ve just made a local branch of the config, merging and cherry-picking as needed to keep in sync with the master version, but I noticed that Git reads four different config files, in this order, with later entries overriding earlier entries:

  1. /etc/gitconfig – system-wide stuff, doesn’t help on multi-user machines
  2. $XDG_CONFIG_HOME/git/config (aka ~/.config/git/config) – news to me!
  3. ~/.gitconfig
  4. $GIT_DIR/config – per-repo, irrelevant here

So here’s the trick: put the standard config file at ~/.config/git/config, and then override the email address in ~/.gitconfig:

[user]
        email = wjt@endlessm.com

Ta-dah! Machine-specific Git config overrides. The spanner in the works is that git config --global always updates ~/.gitconfig if it exists, but it’s a start.

Bustle 0.5: Gtk+-3-ier, hidpi-friendlier

I finally replaced my vintage 2008 ThinkPad X200s, after months of agonising over which of keyboard, form factor, hidpi display, and software freedom to compromise on. Just in the nick of time, Dell released a developer edition of their widely-lauded XPS 13, which is spot on: same comfortable form factor (with 2015-era thin-ness); huge display with minimal bezel; conservative, usable keyboard layout; supplied and supported with free software; and the Sputnik team are very amiable on Twitter. I’m very happy with it.

I was less happy with how ludicrous Bustle looked on it, with its total ignorance of hidpi scaling and retro Gtk+ 2 stylings. I finally found the spare evening I mentioned 18 months ago and freshened it up to not let the amazing screen down.

[Screenshot: Bustle 0.5.0]

Source tarball and x86-64 binary available from the usual place. The freedesktop.org git repository is temporarily out-of-date due to some poorly-synchronised GPG and SSH key migrations, so for now it’s at GitHub. Update: the freedesktop.org git repository is current again!

(Hey, actual Bustle users! How do you feel about it as a piece of software? I’d be interested to hear.)

Bustle 0.4.3: I think you mean ‘fewer crashy’.

It’s Bustle release season! Here is version 0.4.3’s flagship new feature:

[Screenshot: Bustle not crashing when it can't connect to the bus.]

(Previously, it would crash.) I fixed a couple of other crashes, too, spurred by Sujith Sudhi reporting an i486-only crash. I feel compelled to point out that all of these crashes occurred in C code or at the inter-language boundary.

No Gtk+ 3 yet, I’m afraid, though experimental support in the Haskell binding was released a couple of days ago so it’s just a matter of a spare evening…

Meanwhile, why not follow my latest Twitter bot, @fewerror? It will provide you with 100% accurate corrections on a tricky point of English grammar.

Moving on

Yesterday was my last day at Collabora. It’s been a fun five years of working with smart and friendly people (the best kind) on interesting problems. I’ve learnt a lot, created many things I’m proud to have been a part of, and made a lot of friends all over the globe; and now I’ve decided to take a break, then try my hand at something different.

I think it’s notable that quite a few of those smart and friendly people I’m thinking of were neither colleagues nor clients. It’s been a privilege to work predominantly in the open, alongside others with the shared goal of advancing the causes of free software, open platforms and open communication systems. I’m not planning to disappear from the GNOME community any time soon, so I’m looking forward to running into a lot of familiar names, faces and IRC nicks in the future. :)

Thanks to Rob, Philippe and everyone I’ve worked with at Collabora over the last half-decade. It’s been great! (Oh, hey, also, Collabora is hiring. I’d recommend working there. Maybe they’ll get an application from Guybrush soon…)

I know that it’s not a party if it happens every night

I think I’ve just about caught up on sleep, four days after getting back from A Coruña. This year’s GUADEC was pretty great. One highlight was the bumper crop of interns’ lightning talks. In general, I’m a huge fan of the lightning talk format, because good talks are just as good when they’re three minutes long, and bad talks are only three minutes long. In this session, I didn’t have to invoke that second clause: the quality was really consistently high, the speakers had prepared well, and the talks kept me interested for the duration. Change-overs were smooth, and a few truncated-slide hiccups didn’t trip anyone up. It’s great to see so many people excited about contributing in all manner of ways. Congratulations all round.

🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙

Michael Meeks informed and amused[1] as ever. Discussion about Telepathy’s historically patchy support for IRC during the Empathy BOF pushed me into a drive-by release of the IRC backend. Adam Dingle and Jim Nelson’s keynote also stood out: free software business models are a tricky matter, and it was interesting to hear their thoughts on sustaining the dream. I learnt a lot from Owen’s talk on smooth animations, and particularly enjoyed the un-dramatic reveal in Neil and Robert’s talk on Wayland-ifying the Shell, where they switched Pinpoint out of fullscreen to reveal their demo: an apparently-unremarkable Gnome Shell running both X and Wayland applications, including the presentation itself.

Outside the conference itself, my poor scheduling meant I missed the GNOME OS BOF, to my chagrin, in favour of spending a beautiful day exploring A Coruña. I fell into my usual trap of trying to visit museums on Monday (when they are generally closed), but the Torre de Hércules happened to be both open and free[2]. Well worth a visit, if you’re ever there.

For me, chatting to old and new friends about GNOME, music, and everything in between is the best part of GUADEC, and this year was no exception.[3] Of course, over the week I also saw a lot of Pulpo a la Gallega. I felt a bit like this cat in the third panel.

  1. slide 35 is a highlight
  2. how appropriate
  3. We didn’t have an official party this time around, but the nightly Collabora beach party welcomed many wonderful people, including tens of colleagues I rarely get the opportunity to see in person.

Maps and clocks and contact locations

Once upon a time, three intrepid individuals made Empathy publish your location to your contacts, and show your contacts’ locations on a map. Today, I noticed that the Location tab is missing from Preferences—I guess Debian’s Empathy is built without GeoClue support for some reason—and as a result the map looks rather forlorn, what with none of my contacts publishing their location:

[Screenshot: Empathy's empty Contact Map View window]

A map is an obvious demo to build, but I don’t think it’s that useful (even when it had more than zero contacts on it, I never looked at it).[1] So what would be more useful? For starters, here’s some “relevant art” from Skype, showing a contact’s local time in their tooltip:

[Screenshot: Raúl's Skype tooltip shows it's 6am where he is.]

Adding that to Empathy might be a useful first step. But unlike Skype, it’s possible to use this information outside the IM app. So, if I spend a lot of time chatting to friends in Melbourne and New York, why not automatically add those timezones to GNOME Clocks? (The last two mock-ups in that section look particularly bare—perhaps the names of some contacts could show up in the space where “local time” does for Boston.)

For this to be useful, of course, someone would have to fix the publishing of location information in the first place. But if fixing it produced a more compelling feature than a map, it would not be such a thankless task.

  1. Top designers agree! To quote Allan Day, “I could live without contacts on a map ;)”.

A brief list of observed meanings of the word “port”

port (v):

  1. Reindent and reformat.

    empathy-time: port to TP coding style

  2. Update to compile against a backwards-incompatible version of an API.

    <ocrete> twi: I’m porting farstream to [GStreamer] 0.11 this week

  3. Rewrite to use a different widget set and network library.

    You should port Sojourner to Qt4!

  4. Reimplement in an entirely different programming language.

    Zeitgeist has been ported from Python to Vala.

  5. Translate into a different data format.

    Using Semantics3’s web crawlers, we were able to get hold of the data within a half hour, after which we spent a further half hour cleaning up the data and porting it to SQL.