Creating Windows installation media on Linux

Every so often I need to install Windows, most recently for my GNOME on WSL experiments, and to do this I need to write the Windows installer ISO to a USB stick. Unlike most Linux distro ISOs, these are true, pure ISO 9660 images—not hybrid images that can also be treated as a DOS/MBR disk image—so they can’t just be written directly to the disk. Microsoft’s own tool is only available for Windows, of course.

I’m sure there are other ways but this is what I do. Edit: check the comments for an approach which involves 2 partitions and a little more careful copying, but no special tools. I’m writing it down so I can easily find the instructions next time!

The basic process is quite simple:

  • Download an ISO 9660 disk image from Microsoft
  • Partition the USB drive with a single basic data partition, formatted as FAT32
  • Mount the ISO image – on GNOME, you should just be able to double-click it to mount it with Disk Image Mounter
  • Copy all the files from the mounted ISO image to the USB drive

But there is a big catch with that last step: at least one of the .wim files in the ISO is too large for a FAT32 partition.

The trick is to first copy all the files to a writeable directory on internal storage, then use wimlib-imagex split, from wimlib, to split the large .wim file into a number of smaller .swm files before copying them to the FAT32 partition. I think I compiled it from source, in a toolbox container, but you could also use this OCI container image, whose README helpfully provides instructions along these lines:

find . -size +4294967000c -iname '*.wim' -print | while read -r wimpath; do
  wimbase="$(basename "$wimpath" '.wim')"
  wimdir="$(dirname "$wimpath")"
  echo "splitting ${wimpath}"
  # No --interactive/--tty here: stdin is the pipe from find, not a terminal,
  # and an interactive container would swallow the remaining paths
  docker run \
    --rm \
    --volume "$(pwd):/work" \
    "backplane/wimlib-imagex" \
      split "$wimpath" "${wimdir}/${wimbase}.swm" 4000 < /dev/null
done

Now you can copy all those files, minus the too-large .wim, onto the FAT32 drive, and then boot from it.
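
Putting the copy-and-split steps together, a rough sketch might look like the following. It assumes a natively installed wimlib-imagex rather than the container, and the function name, its arguments and the example paths are all made up for illustration. The size threshold and the 4000 MiB part size are the ones from the snippet above.

```shell
#!/bin/sh
# Sketch only: stage the ISO contents on internal storage, then split any
# .wim file that is too large for FAT32 into .swm parts.
# Usage: stage_and_split ISO_MOUNT STAGING_DIR [MAX_BYTES]
stage_and_split() {
    iso_mount=$1
    staging=$2
    max_bytes=${3:-4294967000}   # same threshold as the find command above

    mkdir -p "$staging" || return 1
    # "/." copies the directory's contents, including hidden files
    cp -r "$iso_mount/." "$staging/" || return 1
    # Files copied from an ISO are read-only; make the staging copy writable
    chmod -R u+w "$staging" || return 1

    find "$staging" -type f -iname '*.wim' -size "+${max_bytes}c" -print |
    while IFS= read -r wim; do
        echo "splitting $wim"
        # Only delete the oversized original if the split succeeded
        wimlib-imagex split "$wim" "${wim%.*}.swm" 4000 < /dev/null &&
            rm -f "$wim"
    done
}
```

With the ISO mounted at /mnt/iso and the FAT32 partition mounted at /mnt/usb (both hypothetical paths), you would run stage_and_split /mnt/iso ~/win-staging and then cp -r ~/win-staging/. /mnt/usb/ to fill the stick.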

This all assumes that you only care about a modern system with EFI firmware. I have no idea about creating a BIOS-bootable Windows installer on Linux, and fortunately I have never needed to do this: to test stuff on a BIOS Windows installation, I have used the time-limited virtual machines that Microsoft publishes for testing stuff in old versions of Internet Explorer.

I was inspired to resurrect this old draft post by a tweet by Ross Burton.

Release (semi-)automation

The time I have available to maintain GNOME Initial Setup is very limited, as anyone who has looked at the commit history will have noticed. I’d love more eyes & hands on this important but easy-to-overlook component, particularly to guide it kindly but firmly into the modern age of GTK 4 and the refreshed HIG.

I found that making a batch of 1–3 releases across different GNOME branches every few months was surprisingly time-consuming and error-prone, even with the pretty comprehensive release process checklist on the GNOME Wiki, so I’ve been periodically trying to automate bits of it away.

Philip Withnall’s gitlab-changelog script makes writing the NEWS file a lot quicker. I taught it to output the human-readable names of each updated translation (a nice additional contribution would be to also include the name of the human who updated the translation) and made it a little smarter about guessing the Git commit range to scan.

Beyond that, I added a Meson run target, maintainer-upload-release, pointing at a script which performs some rudimentary coherence checks on the version number, tags the release (using git-evtag if available), atomically pushes the branch and that tag to GNOME GitLab, then copies the source tarball to master.gnome.org. (Apparently it has been almost 12 years since I did something similar in telepathy-gabble, building on the make maintainer-upload-release target that Simon McVittie added in 2008, which is where I borrowed the name.) Other module maintainers may find this script useful too – it’s quite generic.
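
For illustration only, the heart of such a script could be sketched roughly as below. This is not the actual script: the function names are hypothetical, it assumes stable branches named gnome-NN and versions of the form NN.Y, it uses a plain annotated tag rather than git-evtag, and it leaves out the tarball upload entirely.

```shell
#!/bin/sh
# Hypothetical check: does VERSION (e.g. 42.3) belong on BRANCH (e.g. gnome-42)?
version_matches_branch() {
    version=$1
    branch=$2
    case "$branch" in
        gnome-*) major=${branch#gnome-} ;;
        *) return 0 ;;   # e.g. the main branch: nothing to compare against
    esac
    case "$version" in
        "$major".*) return 0 ;;
        *) return 1 ;;
    esac
}

# Tag the current commit, then push the branch and the tag in one atomic push,
# so the remote never ends up with one but not the other.
tag_and_push() {
    version=$1
    branch=$(git rev-parse --abbrev-ref HEAD) || return 1
    if ! version_matches_branch "$version" "$branch"; then
        echo "version $version does not look right for branch $branch" >&2
        return 1
    fi
    git tag -a -m "Release $version" "$version" || return 1
    git push --atomic origin "$branch" "$version"
}
```

The point of the version check is to catch the tired-evening mistake of tagging, say, 43.0 on the gnome-42 branch before anything is pushed.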

Putting these together, the release flow looks like this:

git switch gnome-42
git pull
../pwithnall/gitlab-changelog/gitlab-changelog.py GNOME/gnome-initial-setup
# Manually edit NEWS to incorporate the changelog, adjusted as needed
# Manually check the version in meson.build
git commit -am 'NEWS for 42.Y'
ninja -C _build dist maintainer-upload-release

Another release-related quality-of-life improvement is to make GitLab CI not only build and test the project (in the vain hope that there might actually be tests!) but also check that the install and gnome-initial-setup-pot targets both work. (At one point or another both have failed at or around release time; now they never will again, famous last words.)

I know none of this is rocket science, but I find it all makes the process quicker and less cumbersome, and it’s stopped me from repeating errors like uploading the wrong version on a few tired evenings. Obviously this could all be taken further: perhaps a manually-invoked CI pipeline that does all this stuff, more checks, etc. But while I’m on this train of thought:

Why do we release GNOME modules one-by-one at all?

The workflow we use to release Endless OS is a bit different to GNOME. Once we merge a change to some module’s Git repository, such as eos-updater or our shrinking branch of GNOME Software, that change embarks on a scenic automated journey that takes it to the next nightly build of the entire OS, both as an OSTree update and as fresh installation media. I use these nightly builds for my daily work, safe in the knowledge that I can roll back to the previous build if necessary.

We don’t make releases of individual modules: instead, when it comes time to release the OS, we trigger a pipeline that (among many other things) pushes the already-built OS update to the production repo, and creates Release_x.y.z tags on each Git repo.

This was quite an adjustment for me at first, compared to lovingly hand-crafting NEWS files and coming up with funny/esoteric release names, but now that I’m used to it, it’s hard to go back. Why can’t GNOME do the same?

At this point in the post, we are straying into territory that I have limited first-hand knowledge of. Caveat lector! But here goes:

Thanks to GNOME OS, GNOME already has nightly builds of the entire desktop and apps: so rather than having to build everything yourself, or wait for a development release of GNOME, you can just update & reboot your GNOME OS VM and test the change right there. gnome-build-meta knows how to build every GNOME module; and if you can build the code, it seems a conceptually small step to run ninja dist and the stuff above to publish tags and tarballs for each module.

So you could well imagine on 43.beta release day, someone in the release team could boot the latest GNOME OS nightly, declare it to be Good, and push a button that tags every relevant GNOME module & builds and uploads all the tarballs, and then go back to their day, rather than having to chase down module owners who haven’t quite got around to making the release, fix random build breakages, and so on.

To make this work reliably, I think you’d need every module’s CI to be run through gnome-build-meta, building that MR against the rest of the project, so that g-b-m build failures would be caught before (not after) the offending change lands in the module in question. Seems doable – in Endless we have the equivalent thing managed by a jenkins-job-builder template, the GitHub Pull Request Builder plugin, and a gnarly script.

Continuous integration and deployment are becoming the norm throughout the software industry, for good reasons laid out quite well in articles like Shipping Fast Changes Your Life: the smaller the gap between making a change and it reaching a user, the faster the feedback, and the less costly it is to fix a bug or change course.

The free software movement has historically been ahead of the curve on this, with the “release early, release often” philosophy. And GNOME in particular has used a time-based release process for two decades, allowing major distros to align their schedules to GNOME and get updates into the hands of users quickly, which went some way towards overcoming the fact that GNOME does not own the full pipeline from source code to end users.

Havoc Pennington’s June 2002 email proposing this model has aged rather well, in my opinion, and places a heavy emphasis on the development branch being usable:

The unstable branch must always be dogfood-quality. If testers can’t test it by using it daily, they can’t make the jump. If the unstable branch becomes too unstable, we can’t release it on a reliable schedule, so we have to start breaking the stable branch as a stopgap.

Interestingly the time-based release schedule wiki page states that the schedule should contain:

Regular test release dates, approximately every 2 weeks.

These days, GNOME releases are closer to monthly. In the context of the broader industry where updates reach users multiple times a day, this is starting to look a little less forward-thinking! Of course, continuously deploying an entire OS to production is rather harder than continuously deploying web apps or apps in app stores, if only because the stakes are higher: you need a really robust automatic rollback mechanism to save your users’ plant-based bacon substitute if a new OS build fails to boot, or worse, contains an updater bug that prevents future updates being applied! Still, I believe that a bit of automation would go a long way in allowing module maintainers and the release team alike to spend their scarce mental energy on other things, and allow the project to increase the frequency of releases. What am I missing?

Small steps towards a GTK 4-based Initial Setup

Over the Christmas holidays, I was mostly occupied with the literal care and feeding of small humans, but I found a bit of time for the metaphorical care and feeding of Initial Setup for GNOME 42 as well. Besides a bit of review and build and CI housekeeping, I wrote some patches to update it for API changes in libgnome-desktop (merged) and libgweather (pending). The net result is an app which looks and works exactly the same, complete with a copy of the widget formerly known as GWeatherLocationEntry (RIP) with its serial numbers filed off.

Of course, my ultimate goal was to port Initial Setup to GTK 4. I made some other tiny steps in that direction, such as removing a redundant use of GtkFrame that becomes actively harmful with the removal of the shadow-type property in GTK 4, and now have a proof-of-concept port of just the final page which both compiles and runs!

Screenshots of "All done!" page of Initial Setup

But, I will not have time to complete this port in time for the GNOME 42 UI freeze on 12th February. If you are reading this and feel inspired to pick this up, even just a page or two, more hands would be much appreciated.

γυαδεκ? χκπτγεδ?

GUADEC in Thessaloniki was a great experience, as ever. Thank you once again to the GNOME Foundation for sponsoring my attendance!

Sponsored by GNOME Foundation

Some personal highlights, in no particular order:

  • A lot of useful and informative discussion at the GNOME Advisory Board meeting on Thursday – we ran out of time, which seems like a good sign.
  • After Benjamin Berg and Iain Lane’s great talk on Managing GNOME Sessions with Systemd, Benjamin and I discussed the special case they had to make to run GNOME Initial Setup’s “copy worker” early in the user session, and whether we might be able to improve this and various other aspects by launching Initial Setup in a different way.
  • Via Matthias’ talk on Portals, I got thinking about the occasional requests for an “is this app installed?” portal, and I realised that you can actually fake it with existing machinery in some cases. If you care about a specific app, you probably want to be able to talk to it, so you specify --talk-name=org.example.Foo; at which point you can call org.freedesktop.DBus.ListActivatableNames() and check whether org.example.Foo is in the returned list.
  • The Intern Lightning Talks were inspiring: it’s great to see what has caught the interest of new contributors. This year, I was inspired by Srestha Srivastava’s work on Boxes to send a merge request to osinfo-db to generate the necessary XML for Endless OS. This in turn led to a great discussion with Fabiano and Felipe, and to some more issues and merge requests.
  • Alex Larsson was a tough act to follow at the lightning talks, but based on hallway discussion, my bit on Flatpak External Data Checker was of interest. (I taught it how to update appdata on the flight home. The person sitting next to me told me that writing code on flights is a young-person thing, which I took as a compliment.)
  • Not one, but two talks on user testing! One thing I took away is that while it’s possible to conduct remote usability testing, you’ll miss out on body language cues from the test subjects, and in the specific case of GNOME you’ll either bias the sample towards people who already use GNOME, or you’ll introduce the additional variable of whatever remote access tool the user uses. Not ideal!
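
The “is this app installed?” trick from the Portals item above can be sketched from the command line with gdbus. This is a minimal sketch, assuming gdbus is available and that the caller may talk to org.example.Foo; the function names are made up, and matching on the quoted name in gdbus’s text output is a shortcut rather than proper parsing.

```shell
#!/bin/sh
# Succeeds if NAME appears, single-quoted, in gdbus-style output on stdin.
# Matching the quotes too avoids treating org.example.Foo as present just
# because org.example.FooBar is.
name_in_list() {
    grep -qF "'$1'"
}

# Ask the session bus which names are activatable and look for NAME.
is_activatable() {
    gdbus call --session \
        --dest org.freedesktop.DBus \
        --object-path /org/freedesktop/DBus \
        --method org.freedesktop.DBus.ListActivatableNames |
    name_in_list "$1"
}
```

For example, is_activatable org.example.Foo && echo installed. This only tells you the app is D-Bus activatable, which is why it works “in some cases” rather than all of them.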

On the Endless front, the launch of the Coding Education Challenge, and the various talks from my esteemed colleagues about varied activities, were all great to see.

There were lots of clashes for me, so I’m grateful to the AV team for their great work on recording all the talks. (Unfortunately, one of the talks I couldn’t make it to, on GDPR, was not recorded, to avoid distributing what could be construed as legal advice. Alas!) Many thanks to the local team and the GNOME Foundation staff and volunteers who made the event run so smoothly.

Using Vundle from the Vim Flatpak

I (mostly) use Vim, and it’s available on Flathub. However: my plugins are managed with Vundle, which shells out to git to download and update them. But git is not in the org.freedesktop.Platform runtime that the Vim Flatpak uses, so that’s not going to work!

If you’ve read any of my recent posts about Flatpak, you’ll know my favourite hammer. I allowed Vim to use flatpak-spawn by launching it as:

flatpak run --talk-name=org.freedesktop.Flatpak org.vim.Vim

I saved the following file to /tmp/git:

#!/bin/sh
exec flatpak-spawn --host git "$@"

then ran the following Vim commands to make it executable, add it to the path, then fire up Vundle:

:r !chmod +x /tmp/git
:let $PATH = '/tmp:/app/bin:/usr/bin'
:VundleInstall

This tricks Vundle into running git outside the sandbox. It worked!

I’m posting this partly as a note to self for next time I want to do this, and partly to say “can we do better?”. In this specific case, the Vim Flatpak could use org.freedesktop.Sdk as its runtime, like many other editors do. But this only solves the problem for tools like git which are included in the relevant SDK. What if I’m writing Python and want to use pyflakes, which is not in the SDK?
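
As a stopgap, the same hammer generalises to any tool that is installed on the host. Here is a small sketch, with a made-up function name, that writes the same kind of wrapper as the /tmp/git one for an arbitrary command. It assumes the app was launched with --talk-name=org.freedesktop.Flatpak as above, and of course it punches the same hole in the sandbox.

```shell
#!/bin/sh
# Write an executable wrapper called TOOL into DIR which runs the host's
# TOOL via flatpak-spawn, passing all arguments through unchanged.
make_host_wrapper() {
    dir=$1
    tool=$2
    mkdir -p "$dir" || return 1
    printf '#!/bin/sh\nexec flatpak-spawn --host %s "$@"\n' "$tool" \
        > "$dir/$tool" || return 1
    chmod +x "$dir/$tool"
}
```

Running make_host_wrapper /tmp pyflakes would then let anything with /tmp at the front of its PATH call the host’s pyflakes, provided it is installed there.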

Everybody’s Gone To The GUADEC

It’s been ten days since I came back from GUADEC 2018, and I’ve finally caught up enough to find the time to write about it. As ever, it was a pleasure to see familiar faces from around the community, put some new faces to familiar names, and learn some entirely new names and faces! Some talk highlights:

  • In “Patterns of refactoring C to Rust”, Federico Mena Quintero pulled off the difficult trick of giving a very source code-centric talk without losing the audience. (He said afterwards that the style he used is borrowed from a series of talks he referenced in his slides, but the excellent delivery was certainly a large part of why it worked.)
  • Christian Hergert and Corentin Noël’s talk on “What’s happening in Builder?” left me feeling good about the future of cross-architecture and cross-device GNOME app development. Developing OS and platform components in a desktop-containerised world is not a fully-solved problem; between upcoming plans for Builder and Philip Chimento’s Flapjack, I think we’re getting there.
  • I’m well-versed in Flatpak but know very little about Snap, so Robert Ancell’s talk on “Snap Package support in GNOME” was enlightening. It’s heartening that much of the user-facing infrastructure to solve problems common to Snap and Flatpak (such as GNOME Software and portals) is shared, and it was interesting to learn about some of the unique features of Snap which make it attractive to ISVs.

I couldn’t get to Almería until the Friday evening; I’m looking forward to checking out video recordings of some of the talks I missed. (Shout-out to the volunteers editing these videos! Update: the videos are now mostly published; I’ve added links to the three talks above.)

One of the best bits of any conference is the hallway track, and GUADEC did not disappoint. Fellow Endlesser Carlo Caione and I caught up with Javier Martinez Canillas from Red Hat to discuss some of the boot-loader questions shared between Endless OS and Silverblue, like the downstream Boot Loader Specification module for GRUB, and how to upgrade GRUB itself—which lives outside the atomic world of OSTree—in as robust and crash-proof a manner as is feasible.

On the bus to the campus on Sunday, I had an interesting discussion with Robert Ancell about acquiring domain expertise too late in a project to fix the design decisions made earlier on (which has happened to me a fair few times). While working on LightDM, he avoided this trap by building a thorough integration test suite early on; this allowed him to refactor with confidence as he discovered murky corners of display management. As I understand it (sorry if I’ve misremembered from the noisy bus ride!), he wrote a library which essentially shims every syscall. This made it easier to mock and exercise all the complicated interactions the display manager has with many different parts of the OS via many IPC mechanisms. I always regret it when I procrastinate on test coverage; I’ll keep this discussion in mind as extra ammunition to do the right thing.

My travel to and from Almería was kindly sponsored by the GNOME Foundation. Thank you!

When is an exit code not an exit code?

TL;DR: I found an interesting bug in flatpak-spawn which taught me that there is a difference between the exit code you pass to exit(), the exit status reported by waitpid(), and the shell variable $?.

One of the goals of Flatpak is to isolate applications from the host system; they can normally only directly run external programs supplied by the Flatpak platform they are built against, rather than whatever executables happen to be installed on the host. But some developer tools do need to be able to run commands on the host system. One example is GNOME Builder, which allows you to compile software on the host; another is flatpak-builder, which uses this to build Flatpaks from within a Flatpak. (For my part, I’m occasionally working on making Bustle run pkexec dbus-monitor --system on the host, to allow reading all messages on the system bus (a privileged operation) from an unprivileged, sandboxed application. More on this in a future blog post.)

Flatpak’s session helper provides a D-Bus API to do this: a HostCommand method that launches a given command outside the sandbox and returns its process ID; and a HostCommandExited signal which is emitted when the process exits, with its exit status as a uint32. Apps can use this D-Bus API directly, but recent versions of the common runtimes include a wrapper command which is much easier to adapt existing code to use: just replace cat /etc/passwd with flatpak-spawn --host cat /etc/passwd.

In theory, flatpak-spawn --host propagates the exit status from the command it runs, but I found that in practice, it did not. For example, false is a program which does nothing, unsuccessfully:

$ false; echo exit status: $?
exit status: 1

But when run via flatpak-spawn --host, its exit status is 0:

$ flatpak run --env='PS1=sandbox$ ' \
> --talk-name=org.freedesktop.Flatpak \
> --command=bash org.freedesktop.Sdk//1.6
sandbox$ flatpak-spawn --host false; echo exit status: $?
exit status: 0

If you care whether the command you launched succeeded, this is problematic! The first clue to what’s going on is in the output of flatpak-spawn --verbose:

sandbox$ flatpak-spawn --verbose --host false; echo exit status: $?
F: child_pid: 18066
F: child exited 18066: 256
exit status: 0

Here’s the code, from the HostCommandExited signal handler:

g_variant_get (parameters, "(uu)", &client_pid, &exit_status);
g_debug ("child exited %d: %d", client_pid, exit_status);

if (child_pid == client_pid)
  exit (exit_status);

So exit_status is 256, even though false actually returns 1. If you read man 3 exit, you will learn:

void exit(int status);

The exit() function causes normal process termination and the value of status & 0377 is returned to the parent (see wait(2)).

256 == 0x0100 and 0377 == 0x00ff; so exit_status & 0377 == 0. Now we know why flatpak-spawn returns 0, but why is exit_status equal to 256 rather than 1 in the first place?
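
The masking is easy to check with shell arithmetic. In this small sketch the helper name is made up; the mask is the 0377 from the man page.

```shell
#!/bin/sh
# What the parent sees when a process calls exit(STATUS): only the low 8 bits.
exit_value() {
    echo $(( $1 & 0377 ))
}

exit_value 256     # prints 0
exit_value 1       # prints 1
exit_value 139     # prints 139
```

So any value whose low byte is zero, such as 256, looks like success to the caller.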

It comes from a g_child_watch_add_full() callback. The g_child_watch_add_full() docs tell us:

In many programs, you will want to call g_spawn_check_exit_status() in the callback to determine whether or not the child exited successfully.

Following the link, we learn:

On Unix, [the exit status] is guaranteed to be in the same format waitpid() returns.

And reading the waitpid() documentation, we finally learn that the exit status is an opaque integer which must be inspected with a set of macros. On Linux, the layout is, roughly:

  • When a process calls exit(x), the exit status is ((x & 0xff) << 8); the low byte is 0. This explains why the exit_status for false is 256.
  • When a process is killed by signal y, the exit status is y, stored in the low byte, with that byte’s high bit (0x80) set if the process dumped core. So a process which segfaults and dumps core will have exit status 11 | 0x80 == 11 + 128 == 139.
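
To make the layout concrete, here is a sketch that decodes a raw status using the Linux layout described above. The function name is made up, it ignores the stopped and continued states, and real C code should use the WIFEXITED() family of macros rather than bit-twiddling like this.

```shell
#!/bin/sh
# Decode a raw waitpid()-style status (Linux layout; stopped states ignored).
describe_wait_status() {
    status=$1
    sig=$(( status & 0x7f ))          # low 7 bits: terminating signal, if any
    if [ "$sig" -eq 0 ]; then
        echo "exited with code $(( (status >> 8) & 0xff ))"
    elif [ $(( status & 0x80 )) -ne 0 ]; then
        echo "killed by signal $sig (core dumped)"
    else
        echo "killed by signal $sig"
    fi
}

describe_wait_status 256      # exited with code 1
describe_wait_status 139      # killed by signal 11 (core dumped)
describe_wait_status 35584    # exited with code 139
```

The three example values are the statuses that flatpak-spawn --verbose reports for false, for a segfaulting program, and for /bin/sh -c 'exit 139'.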

What’s funny about this is that, if the subprocess segfaults and dumps core, when testing from the shell flatpak-spawn --host appears to work.

host$ /home/wjt/segfault; echo exit status: $?
Segmentation fault (core dumped)
exit status: 139
sandbox$ flatpak-spawn --verbose --host /home/wjt/segfault; echo exit status: $?
F: child_pid: 20256
F: child exited 20256: 139
exit status: 139

But there’s a difference between this and a process which actually exits 139:

sandbox$ flatpak-spawn --verbose --host /bin/sh -c 'exit 139'; echo exit status: $?
F: child_pid: 20481
F: child exited 20481: 35584
exit status: 0

I always thought these two were the same. Actually, mapping the signal that killed a process to $? = 128 + signum is just shell convention.

To fix flatpak-spawn, we need to inspect the exit status and recover the exit code or signal. For normal termination, we can pass the exit code to exit(). For signals, the options are:

  • Reset all signal() handlers to SIG_DFL, then send the signal to ourselves and hope we die
  • Follow the shell convention and exit(128 + signal number)

I think the former sounds scary and unreliable, so I implemented the latter. Imperfect, but it’ll do.
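
The mapping I chose can be sketched in shell arithmetic. This mirrors the logic rather than reproducing the actual C patch, the function name is made up, and it assumes the Linux status layout described earlier.

```shell
#!/bin/sh
# Map a raw waitpid()-style status to the code to pass to exit():
# the child's own exit code, or 128 + signal number if a signal killed it.
status_to_exit_code() {
    status=$1
    sig=$(( status & 0x7f ))
    if [ "$sig" -eq 0 ]; then
        echo $(( (status >> 8) & 0xff ))
    else
        echo $(( 128 + sig ))
    fi
}

status_to_exit_code 256      # prints 1
status_to_exit_code 139      # prints 139
status_to_exit_code 35584    # prints 139
```

The last two lines show the imperfection: a segfault and a genuine exit(139) become indistinguishable to the caller, exactly as they are in the shell’s $?.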

Computer discoveries from February 2016

I found a text file named TIL.md lying around on my computer, with one section dated 17th February 2016. Apparently I’d planned to keep a log of the weird or interesting computer things I learned each day, but forgot after a day. I’d also forgotten all the facts in the file and was surprised afresh. Maybe you’ll be surprised too:

  • Windows’ shell and user interface do not support filenames with trailing spaces, so if you have a directory called worstever.christmas˽ (where ˽ represents a space) on your Unix fileserver, and serve it to Windows over SMB, you’ll see a filename like CQHNYI~0. I think this is the DOS-style 8.3 compatibility filename but I’m not sure where it gets generated in this case – Samba?
  • TIFF files can contain multiple images.
  • If you have a multi-subfile TIFF, multi.tiff, and run convert multi.tiff multi.jpeg, you will not get back a file called multi.jpeg; convert will silently assume you meant convert multi.tiff multi-%d.jpeg and give you back multi-0.jpeg, multi-1.jpeg, etc.
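
If you only want the first image from such a file, ImageMagick lets you select it explicitly by appending an index in square brackets to the input name. A minimal sketch, with a made-up helper name:

```shell
#!/bin/sh
# Convert only the first image of a (possibly multi-image) TIFF.
# Usage: first_frame_to_jpeg INPUT.tiff OUTPUT.jpeg
first_frame_to_jpeg() {
    # The [0] suffix asks ImageMagick for the image at index 0 only,
    # so exactly one output file is written, under the name given
    convert "${1}[0]" "$2"
}
```

Calling first_frame_to_jpeg multi.tiff multi.jpeg should then produce a single multi.jpeg rather than a numbered series.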

For some context: at the time, I was trying to work out why a script that imported a few tens of thousands of photographs into pan.do/ra – which doesn’t like TIFFs – had skipped some photographs, and imported others as a blank white rectangle; and why a Windows application pointed at the same fileserver showed a different number of photographs again. This was also the first time I encountered an inadvertent homoglyph attack: x.jpg and х.jpg are indistinguishable in most fonts.
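
Lookalike names like these are easy to flush out once you know to look. The sketch below, with made-up function names, flags any filename containing bytes outside ASCII; it assumes UTF-8 filenames and will of course also flag perfectly legitimate accented names.

```shell
#!/bin/sh
# Succeeds if NAME contains any byte outside the 7-bit ASCII range.
has_non_ascii() {
    # Delete every ASCII byte; anything left over is non-ASCII
    [ -n "$(printf '%s' "$1" | LC_ALL=C tr -d '\000-\177')" ]
}

# Print the names in DIR that contain non-ASCII bytes.
list_non_ascii_names() {
    for path in "$1"/*; do
        name=${path##*/}
        if has_non_ascii "$name"; then
            printf '%s\n' "$name"
        fi
    done
}
```

Assuming the second name above starts with Cyrillic х (U+0445), it is encoded as the two bytes 0xD1 0x85 in UTF-8 and gets flagged, while the Latin x (0x78) does not.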

Moving on

Yesterday was my last day at Collabora. It’s been a fun five years of working with smart and friendly people (the best kind) on interesting problems. I’ve learnt a lot, created many things I’m proud to have been a part of, and made a lot of friends all over the globe; and now I’ve decided to take a break, then try my hand at something different.

I think it’s notable that quite a few of those smart and friendly people I’m thinking of were neither colleagues nor clients. It’s been a privilege to work predominantly in the open, alongside others with the shared goal of advancing the causes of free software, open platforms and open communication systems. I’m not planning to disappear from the GNOME community any time soon, so I’m looking forward to running into a lot of familiar names, faces and IRC nicks in the future. 🙂

Thanks to Rob, Philippe and everyone I’ve worked with at Collabora over the last half-decade. It’s been great! (Oh, hey, also, Collabora is hiring. I’d recommend working there. Maybe they’ll get an application from Guybrush soon…)

I know that it’s not a party if it happens every night

I think I’ve just about caught up on sleep, four days after getting back from A Coruña. This year’s GUADEC was pretty great. One highlight was the bumper crop of interns’ lightning talks. In general, I’m a huge fan of the lightning talk format, because good talks are just as good when they’re three minutes long, and bad talks are only three minutes long. In this session, I didn’t have to invoke that second clause: the quality was really consistently high, the speakers had prepared well, and the talks kept me interested for the duration. Change-overs were smooth, and a few truncated-slide hiccups didn’t trip anyone up. It’s great to see so many people excited about contributing in all manner of ways. Congratulations all round.

🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙 🐙

Michael Meeks informed and amused (slide 35 is a highlight) as ever. Discussion about Telepathy’s historically patchy support for IRC during the Empathy BOF pushed me into a drive-by release of the IRC backend. Adam Dingle and Jim Nelson’s keynote also stood out: free software business models are a tricky matter, and it was interesting to hear their thoughts on sustaining the dream. I learnt a lot from Owen’s talk on smooth animations, and particularly enjoyed the un-dramatic reveal in Neil and Robert’s talk on Wayland-ifying the Shell, where they switched Pinpoint out of fullscreen to reveal their demo: an apparently-unremarkable GNOME Shell running both X and Wayland applications, including the presentation itself.

Outside the conference itself, my poor scheduling meant I missed the GNOME OS BOF, to my chagrin, in favour of spending a beautiful day exploring A Coruña. I fell into my usual trap of trying to visit museums on Monday (when they are generally closed), but the Torre de Hércules happened to be both open and free (how appropriate). Well worth a visit, if you’re ever there.

For me, chatting to old and new friends about GNOME, music, and everything in between is the best part of GUADEC, and this year was no exception. (We didn’t have an official party this time around, but the nightly Collabora beach party welcomed many wonderful people, including tens of colleagues I rarely get the opportunity to see in person.) Of course, over the week I also saw a lot of Pulpo a la Gallega. I felt a bit like this cat in the third panel.