Fitting Endless OS images on small disks

Last week I read Jorge Castro’s article On “Wasting disk space” with interest, and not only because it cites one of my own articles 😉. Jorge encouraged me to write up the conversation we had off the back of it, so here we go!

People like to fixate on the disk space used by installing a calculator app as a Flatpak when you don’t have any other Flatpak apps installed. For example, on my system GNOME Calculator takes up 9.3 MB for itself, plus 803.1 MB for the GNOME 42 runtime it depends on. Regular readers will not be surprised when I say that that 803.1 MB figure looks rather different when you realise that Calculator is just one of 70 apps on my system that use that runtime; 11.5 MB of runtime per app feels a lot more reasonable.
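The amortisation argument is just a division, but it is easy to check. A minimal sketch using only the figures quoted above; the function name is made up for illustration:

```python
# Figures quoted in the paragraph above, in MB.
RUNTIME_MB = 803.1        # GNOME 42 runtime
CALCULATOR_MB = 9.3       # GNOME Calculator itself
APPS_USING_RUNTIME = 70   # apps on the system sharing that runtime


def amortised_runtime_mb(runtime_mb: float, app_count: int) -> float:
    """Split the size of a runtime evenly among the apps that depend on it."""
    if app_count < 1:
        raise ValueError("a runtime needs at least one dependent app")
    return runtime_mb / app_count


per_app = amortised_runtime_mb(RUNTIME_MB, APPS_USING_RUNTIME)
print(f"{per_app:.1f} MB of runtime per app")
print(f"{CALCULATOR_MB + per_app:.1f} MB for Calculator including its share")
```

With a single dependent app the same function returns the full 803.1 MB, which is the case people tend to fixate on.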

But I do have one app installed which depends on the GNOME 3.34 runtime, which has been unsupported since August 2020, and the GNOME 3.34 runtime only shares 102 MB of its files with the GNOME 42 runtime, leaving 769 MB installed solely for this one 11 MB app. Not such a big deal on my laptop with a half-terabyte drive, but this gets to a point Jorge makes near the end of his article:
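To make "shares 102 MB of its files" concrete, here is a rough sketch of how shared and unshared bytes between two runtimes could be estimated, assuming files are deduplicated by content checksum. The trees below are toy data, and hashing only the file contents is a simplification: a real content-addressed store may also take file metadata into account.

```python
import hashlib


def index_by_checksum(files: dict[str, bytes]) -> dict[str, int]:
    """Map content checksum -> size in bytes for every file in a tree."""
    return {hashlib.sha256(data).hexdigest(): len(data) for data in files.values()}


def shared_and_unique(old: dict[str, bytes], new: dict[str, bytes]) -> tuple[int, int]:
    """Return (bytes of `old` also present in `new`, bytes only in `old`)."""
    old_index = index_by_checksum(old)
    new_index = index_by_checksum(new)
    shared = sum(size for digest, size in old_index.items() if digest in new_index)
    unique = sum(old_index.values()) - shared
    return shared, unique


# Toy stand-ins for two runtime trees (hypothetical paths and contents).
old_runtime = {
    "lib/libfoo.so": b"A" * 100,
    "lib/libbar.so.1": b"B" * 50,
    "share/icons/x.png": b"C" * 30,
}
new_runtime = {
    "lib/libfoo.so": b"A" * 100,
    "lib/libbar.so.2": b"D" * 60,
    "share/icons/x.png": b"C" * 30,
}

shared, unique = shared_and_unique(old_runtime, new_runtime)
print(f"shared: {shared} bytes, only in the old runtime: {unique} bytes")
```

The "only in the old runtime" figure is the one that matters here: it is the space that a single app on an old runtime is responsible for.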

Yes, they take up more room, but it’s not by much, and unless you’re on an extremely size-restrained system (like say a 64GB Chromebook you are repurposing) then for most systems it’s a wash.

Part of the insight behind Endless OS is that it can be more practical and cost-effective to fill a large hard disk with apps and educational resources than to get access to high-bandwidth connectivity in remote or disadvantaged communities, and the devices we and our deployment partners use generally have plentiful storage. But inexpensive, lower-end devices with 64 GB storage are still quite common, and having runtimes installed for one or two apps consumes valuable space that could be used for more offline content. So this can be a real problem for us, if a small handful of apps are bloating an image to the point where it doesn’t fit, or leaves little space for documents & updates.

So, our OS image builder has a mode to tabulate the apps that will be preinstalled in a given image configuration, grouped by the runtime they use, with their approximate sizes. I added this mode when I was trying to make an Endless OS ISO fit on a 4.7 GB DVD a few years back, packing in as much content as possible. It’s a bit rough and ready, and tends to overestimate the disk space needed (because it does not take into account deduplication of identical files between different apps and runtimes, or installing only a subset of translations for an app or runtime), but it is still useful to get a sense of where the space is going.
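The image builder's real implementation is not shown here, but the idea of the tabulation can be sketched in a few lines. Everything below is illustrative: the function name is invented, "Text Editor" and its size are hypothetical, and the 871 MB figure for GNOME 3.34 is simply the 102 MB shared plus 769 MB unshared quoted earlier. Like the real tool, this sketch ignores deduplication, so it overestimates.

```python
from collections import defaultdict

# Approximate sizes in MB. Runtime names are shortened for readability.
RUNTIME_MB = {"GNOME 42": 803.1, "GNOME 3.34": 871.0}
APPS = [
    # (app, runtime, size in MB)
    ("Calculator", "GNOME 42", 9.3),
    ("Text Editor", "GNOME 42", 5.0),      # hypothetical entry
    ("TurtleBlocks", "GNOME 3.34", 11.0),
]


def tabulate(apps, runtime_mb):
    """Group apps by runtime and rank runtimes by cost per dependent app."""
    by_runtime = defaultdict(list)
    for name, runtime, size_mb in apps:
        by_runtime[runtime].append((name, size_mb))

    rows = []
    for runtime, members in by_runtime.items():
        rows.append({
            "runtime": runtime,
            "apps": sorted(name for name, _ in members),
            "total_mb": runtime_mb[runtime] + sum(size for _, size in members),
            "runtime_mb_per_app": runtime_mb[runtime] / len(members),
        })
    # Runtimes that exist for only one or two apps float to the top.
    rows.sort(key=lambda row: row["runtime_mb_per_app"], reverse=True)
    return rows


for row in tabulate(APPS, RUNTIME_MB):
    print(f'{row["runtime"]:<12} {row["total_mb"]:>8.1f} MB  {", ".join(row["apps"])}')
```

Sorting by runtime cost per app is a design choice for this sketch: it puts the "one app keeps a whole runtime installed" cases first, which are the ones worth updating or removing.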

I ran this for the downloadable English configuration of Endless OS earlier today. I’ll spare you the full output here but if you are curious it’s in this gist. It shows that, like on my system, TurtleBlocks is the only reason the GNOME 3.34 runtime is preinstalled, and so removing that app or updating it to a more modern runtime would probably save somewhere between 0.5 GB and 1 GB in the image! Normally after seeing this I would go and see if I can take a few minutes to send an update to the app, but in this case someone has beaten me to it. (If you are a Python expert interested in block-based programming tools for kids, why not lend a hand getting this over the line?) It also shows that a couple of our unmaintained and sadly closed-source first-party apps, Resumé and My Budget, are stuck on an even more prehistoric runtime.

Running it on a deployment partner’s custom image that they reported is too big for the 64 GB target hardware showed that the Othman Quran browser is also using an older runtime; once more, someone else has already noticed in the last couple of days, and if you are an expert on GTK’s font selection your input would be welcome on that pull request.

This tool is how it came to pass that I updated the runtimes of gbrainy and Genius, two apps I have never used, last month, and Klavaro back in May, among others.

There tends to be a flurry of community activity every 6-12 months to update all the apps that depend on newly end-of-lifed runtimes, and many of these turn out not to be rocket science, though it is not (yet!) something that flatpak-external-data-checker can do automatically. Then we are left with a long tail of more lightly-maintained apps where the update is more cumbersome to do, as in the case of gbrainy, where it took a bit of staring at the unfamiliar error messages produced by the Mono C# compiler to figure out where the problem might lie. If you are good at debugging build failures, this kind of thing is a great way to contribute to Flathub; and if someone can remind me of the URL of the live-updating TODO list of apps on obsolete runtimes I once saw, I’ll add the link here.

While I’ve focused on one of the problems that apps depending on obsolete runtimes can cause, it’s not all bad news. If you really love, or need, an app that is abandoned or has not been updated in a while, the Flatpak model means you can still install & use that app and its old runtime without your distribution having to keep around some obsolete version of the libraries, or you having to stay on an old version of the distro. As and when that app does get an update, the unused and end-of-lifed runtime on your system will be uninstalled automatically by modern versions of Flatpak.

Everything In Its Right Place

Back in July, I wrote about trying to get Endless OS working on DVDs. To recap: we have published live ISO images of Endless OS for a while, but until recently if you burned one to a DVD and tried to boot it, you’d get the Endless boot-splash, a lot of noise from the DVD drive, and not much else. Definitely no functioning desktop or installer!

I’m happy to say that Endless OS 3.3 boots from a DVD. The problems basically boiled down to long seek times, which are made worse by data not being arranged in any particular order on the disk. Fixing this had the somewhat unexpected benefit of improving boot performance on fixed disks, too. For the gory details, read on!

The initial problem that caused the boot process to hang was that the D-Bus system bus took over a minute to start. Most D-Bus clients assume that any method call will get a reply within 25 seconds, and fail particularly badly if method calls to the bus itself time out. In particular, systemd calls a number of methods on the system bus right after it launches it; if these calls fail, D-Bus service activation will not work. iotop and systemd-analyze plot strongly suggested that dbus-daemon was competing for IO with systemd-udevd, modprobe incantations, etc. Booting other distros’ ISOs, I noticed local-fs.target had a (transitive) dependency on systemd-udev-settle.service, which as the name suggests waits for udev to settle down. (Fedora’s is via dmraid-activation.service, which may or may not be deliberate; anecdotally, SUSE DVDs deliberately add a dependency for this reason.) This gets most hardware discovery out of the way before D-Bus and friends get started; doing the same in our ISOs means D-Bus starts relatively quickly and the boot process can continue.

Even with this change, and many smaller changes to remove obviously-unnecessary work from the boot sequence, DVDs took unacceptably long to reach the first-boot experience. This is essentially due to reading lots of small files which are scattered all over the disk: the laser has to be physically repositioned whenever you need to seek to a different part of the DVD, which is extremely slow. For example, initialising IBus involves running ibus-engine-m17n --xml which reads hundreds of tiny files. They’re all in the same directory, but are not necessarily physically close to one another on the disk. On an otherwise idle system with caches flushed, running this command from a loopback-mounted ISO file on an SSD took 0.82 seconds, which we can assume is basically all squashfs decompression overhead. From a DVD, this command took 40 seconds!
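A back-of-the-envelope check that seeks alone can explain the gap. This is a sketch that assumes every second beyond the SSD baseline is spent seeking, and that one seek costs somewhere in the 90-135 ms range commonly quoted for DVD drives; both are assumptions, not measurements.

```python
# Timings quoted in the paragraph above, in seconds.
SSD_SECONDS = 0.82
DVD_SECONDS = 40.0


def implied_seeks(dvd_s: float, baseline_s: float, seek_s: float) -> float:
    """Number of seeks needed to account for the extra time on DVD."""
    return (dvd_s - baseline_s) / seek_s


for seek_ms in (90, 135):
    seeks = implied_seeks(DVD_SECONDS, SSD_SECONDS, seek_ms / 1000)
    print(f"{seek_ms} ms per seek -> about {seeks:.0f} seeks")
```

Either end of the range gives a few hundred seeks, which is the same ballpark as "hundreds of tiny files" each needing the laser to be repositioned.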

What to do? Our systemd is patched to resurrect systemd-readahead (which was removed upstream some time ago) because many of our target systems have spinning disks, and readahead improves boot performance substantially on those systems. It records which files are accessed during the boot sequence to a pack file; early in the next boot, the pack file is replayed using posix_fadvise(..., POSIX_FADV_WILLNEED) to instruct the kernel that these files will be accessed soon, allowing them to be fetched eagerly, in an order matching the on-disk layout. We include a pack file collected from a representative system in our OS images to have something to work from during the first boot.
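The replay step can be illustrated with Python's wrapper around the same system call. This is only a sketch of the idea, not systemd-readahead's code: it assumes the pack file has already been decoded into a plain list of paths, sorted in on-disk order, which is a simplification of the real pack format.

```python
import os


def prefetch(paths):
    """Hint to the kernel that each file will be read soon.

    Returns the number of files successfully hinted. Files that cannot be
    opened (for example because they no longer exist) are skipped.
    """
    hinted = 0
    for path in paths:
        try:
            fd = os.open(path, os.O_RDONLY)
        except OSError:
            continue
        try:
            # Offset 0 with length 0 means "from the start to the end of the file".
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
            hinted += 1
        finally:
            os.close(fd)
    return hinted
```

Calling `prefetch(["/etc/os-release", "/no/such/file"])` would hint at most one file and silently skip the other. The hint is advisory: the call returns quickly and the kernel fetches the data in the background, which is what lets the reads be issued in disk order ahead of the processes that need them.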

This means we already have a list of all (or at least the majority of) files which are accessed during the boot process, so we can arrange them contiguously on the disk. The main stumbling block is that our ISOs (like most distros’) contain an ext4 filesystem image, inside a GPT disk image, inside a squashfs filesystem image, and ext4 does not (to my knowledge!) provide a straightforward way to move certain files to a particular region of the disk. To work around this, we adapt a trick from Fedora’s livecd-tools, and create the ext4 image in two passes. First, we calculate the size of the files listed in the readahead pack file (it’s about 200MB), add a bit for filesystem overhead, create an ext4 image which is only just large enough to hold these files, and copy them in. Then we grow the filesystem image to its final size (around 10GB, uncompressed, for a DVD-sized image) and copy the rest of the filesystem contents. This ensures that the files used during boot are mostly contiguous, near the start of the disk. (When I described this technique internally at Endless, Juan Pablo pointed out that DVDs can actually read data faster from the end (outside) of the disk. The outside of the disk has more data per rotation than the centre, and the disk spins at a constant rotation speed. A quick test with dd shows that my drive is twice as fast reading data from the end of the disk compared to the start. It’s harder to put the files at the end of the ext4 image, but we might be able to artificially fragment the squashfs image to put the first few hundred MBs of its contents at the end.)
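The planning half of the two-pass trick can be sketched without touching a real filesystem. The helper names below are invented, and the 10% allowance for filesystem overhead is an assumption chosen for illustration; the text above only says "a bit".

```python
MIB = 1024 * 1024


def first_pass_size(boot_file_sizes, overhead_percent=10, granularity=MIB):
    """Size in bytes of the small first-pass image.

    Sums the boot files, adds an allowance for filesystem overhead
    (an assumed percentage), and rounds up to a whole number of MiB.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    payload = sum(boot_file_sizes)
    padded = -(-payload * (100 + overhead_percent) // 100)   # ceiling division
    return -(-padded // granularity) * granularity


def copy_order(all_paths, boot_paths):
    """Split the tree into (first pass, second pass) copy lists.

    Boot files keep the order given by the pack file; entries in the
    pack file that are not in the tree are ignored.
    """
    present = set(all_paths)
    boot = set(boot_paths)
    first = [path for path in boot_paths if path in present]
    rest = [path for path in all_paths if path not in boot]
    return first, rest


size = first_pass_size([200 * MIB])
print(f"first pass image: {size // MIB} MiB")
print(copy_order(["a", "b", "c", "d"], ["c", "a", "x"]))
```

After the first list has been copied into the small image, the image is grown to its final size and the second list is copied in, so the boot files end up clustered near the start.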

Does this help? Running ibus-engine-m17n --xml on a DVD prepared this way takes 5.6 seconds, about seven times faster than the 40 seconds observed on an unordered DVD, and booting the DVD is more than a minute faster than before this change. Hooray!

Due to the way our image build and install process works, the GPT disk image inside the ISO is the same one that gets written to disk when you install Endless OS. So: how will this trick affect the installed system? One potential problem is that mke2fs uses the filesystem size to determine various attributes, like block and inode sizes, and 200MB is small enough to trigger the small profile. So we pass -T default to explicitly select more appropriate parameters for the final filesystem size. (After Endless OS is installed, the filesystem is resized again to fill the free space on disk.) As far as I can tell, the only impact on installed systems is positive: spinning disks also have high seek latency, and this change cuts 15% off the boot time on a Mission One. Of course, this will gradually decay when the OS is updated, since new files used at boot will not be contiguous, but it’s still nice to have. (In the back of my mind, I’ve long wondered why boot times always get worse across the lifetime of a device; this is the first time I’ve deliberately caused this to be the case.)
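To see why a 200MB first pass lands in the small profile, here is a sketch of the size-based selection as I understand it from the mke2fs.conf documentation. The thresholds are taken from that documentation as best I recall them, so treat them as an assumption and check your local man page; the function name is made up.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4


def mke2fs_size_type(size_bytes: int) -> str:
    """Profile mke2fs would pick from the filesystem size when -T is not given."""
    if size_bytes < 3 * MIB:
        return "floppy"
    if size_bytes < 512 * MIB:
        return "small"
    if size_bytes < 4 * TIB:
        return "default"
    if size_bytes < 16 * TIB:
        return "big"
    return "huge"


print(mke2fs_size_type(200 * MIB))   # size of the first-pass image
print(mke2fs_size_type(10 * GIB))    # size of the final image
```

The two calls return different profiles, which is the mismatch that passing -T default avoids: the filesystem is created with parameters suited to the size it will eventually have, not the size it starts at.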

The upshot: from Endless OS 3.3 onwards, ISOs boot when written to DVD. However, almost all of our ISOs are larger than 4.7 GB! You can grab the Basic version, which does fit, from the Linux/Mac tab on our website and give it a try. I hope we’ll make more DVD-sized ISOs available in a future release. New installations of Endless OS 3.3 or newer should boot a bit more quickly on rotating hard disks, too. (Running the dual-boot installer for Windows from a DVD doesn’t work yet; for a workaround, copy all the files off the DVD and run them from the hard disk.)

Oh, and the latency simulation trick I described? Since it delays reads, not seeks, it is actually not a good enough simulation when the difference between the two matters, so I did end up burning dozens of DVD+Rs. Accurate simulation of optical drive performance would be a nice option in virtualisation software, if any Boxes or VirtualBox developers are reading!
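A toy model shows why delaying every read cannot stand in for seek latency. Both cost functions below are deliberately crude and entirely hypothetical: the first charges a fixed delay per read, the second charges only when the next block is not the one immediately following the last.

```python
def delay_model_cost(offsets, delay_s):
    """Fixed penalty on every read, regardless of where it lands."""
    return len(offsets) * delay_s


def seek_model_cost(offsets, seek_s, start=0):
    """Penalty only when the head has to move to a non-adjacent block."""
    cost = 0.0
    position = start
    for offset in offsets:
        if offset != position:
            cost += seek_s
        position = offset + 1
    return cost


contiguous = list(range(100))                 # files laid out back to back
scattered = [i * 1000 for i in range(100)]    # files spread across the disc

for name, layout in (("contiguous", contiguous), ("scattered", scattered)):
    print(name,
          "delay model:", delay_model_cost(layout, 0.125),
          "seek model:", seek_model_cost(layout, 0.125))
```

Under the per-read delay model the two layouts cost exactly the same, so reordering files appears to achieve nothing; under the seek model the contiguous layout is essentially free. Only real hardware (or a simulator that models seeks) can show the benefit of the reordering.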

Simulating read latency with device-mapper

Like most distros, Endless OS is available as a hybrid ISO 9660 image. The main uses (in my experience) of these images are to attach to a virtual machine’s emulated optical drive, or to write them to a USB flash drive. In both cases, disk access is relatively fast.

A few people found that our ISOs don’t always boot properly when written to a DVD. It seems to be machine-dependent and non-deterministic, and the journal from failed boots shows lots of things timing out, which suggests that it’s something to do with slower reads – and higher seek times – on optical media. I dug out my eight-year-old USB DVD-R drive, but didn’t have any blank discs and really didn’t want to have to keep burning DVDs on a hot summer day. It turned out to be pretty easy to reproduce using qemu-kvm plus device-mapper’s delay target.

According to AnandTech, DVD seek times are somewhere in the region of 90-135 ms. It’s not a perfect simulation, but we can create a loopback device backed by the ISO image (which lives on a fast SSD), then create a device-mapper device backed by the loopback device that delays all reads by 125 ms (for the sake of argument), and boot it:

$ sudo losetup --find --show \
  eos-eos3.1-amd64-amd64.170520-055517.base.iso
/dev/loop0
$ echo "0 $(sudo blockdev --getsize /dev/loop0)" \
  "delay /dev/loop0 0 125" \
  | sudo dmsetup create delayed-loop0
$ qemu-kvm -cdrom /dev/mapper/delayed-loop0 -m 1GB

Sure enough, this fails with exactly the same symptoms we see booting a real DVD. (It really helps to remember the -m 1GB because modern desktop Linux distros do not boot very well if you only allow them QEMU’s default 128MB of RAM.)
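For reference, the table line piped to dmsetup above has the shape "start length target arguments", with start and length in 512-byte sectors. A small sketch that builds the same line from an image size in bytes; the helper name is made up, and it only formats the string without touching any devices:

```python
SECTOR_BYTES = 512


def delay_table(device: str, size_bytes: int, delay_ms: int, offset_sectors: int = 0) -> str:
    """Build a device-mapper table line for the delay target.

    The mapping starts at sector 0 and covers the whole backing device.
    """
    if size_bytes % SECTOR_BYTES:
        raise ValueError("size must be a whole number of 512-byte sectors")
    sectors = size_bytes // SECTOR_BYTES
    return f"0 {sectors} delay {device} {offset_sectors} {delay_ms}"


# A 1 MiB device delayed by 125 ms, matching the delay used above.
print(delay_table("/dev/loop0", 1024 * 1024, 125))
```

Changing the last argument is all it takes to try other points in the 90-135 ms range.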