I’ve had a little adventure with my Fedora Atomic Workstation this morning and almost missed a meeting because I couldn’t get to a desktop session.
I’ve been using the rawhide branch of Fedora Atomic Workstation to keep up to speed with the latest developments in Fedora. As is to be expected of rawhide, it recently stopped getting me to a login screen (much less a working desktop session). I just booted back into my working image and ignored this for a few days.
The Adventure begins
But since it didn’t go away by itself, yesterday I decided to see if I could debug it a bit. Looking at the journal for the last unsuccessful boot gave some hints:
gnome-shell: Failed to create backend: Failed to initialize renderer: Missing extension for GBM renderer: EGL_KHR_platform_gbm
gnome-session-binary: WARNING: App 'org.gnome.Shell.desktop' exited with code 1
gnome-session-binary: Unrecoverable failure in required component org.gnome.Shell.desktop
Poking the nearest graphics team member about this, I was asked to provide the output of eglinfo in this situation. Since I had an hour to spare before the meeting, I booted back into the broken image in runlevel 3, logged in on a vt, … and found that eglinfo is not in the OS image.
Well, that’s easy enough to fix on an Atomic system, using package layering:
rpm-ostree install egl-utils
After that, I proceeded to reboot to get to the OS image with the newly added layer, and when I got to the boot prompt, I realized my mistake: rpm-ostree never replaces the booted image, since it (reasonably) assumes that the booted image is ‘working’. But it only keeps two images around, so it had to replace the other one – which was the image which successfully boots to my desktop.
Now, at the boot prompt, I was faced with the choice between
- the broken image
- the broken image + egl-utils
Ugh. Not what I had hoped for. And my meeting starts in 50 minutes. Admittedly, this was entirely my fault. rpm-ostree behaved as it should and as documented. Since it is a snow day, I need to do the meeting from home and need a web browser for that.
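One way to guard against this is to pin the known-good deployment before layering anything, so that it is not garbage-collected when a new deployment is written. The following is only a sketch, assuming an ostree release that provides the `ostree admin pin` subcommand. The `run` and `pin_then_layer` helpers are hypothetical names made up for illustration, and the deployment index is an assumption that has to be checked against `ostree admin status` first.

```shell
#!/bin/sh
# Sketch: pin a deployment, then layer a package on top of the booted image.
# Assumption: `ostree admin pin` is available in the installed ostree version.

# run CMD...: print the command, then execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

# pin_then_layer INDEX PACKAGE
# INDEX is the position of the deployment to keep in the list printed by
# `ostree admin status` (0 is the first entry). PACKAGE is only layered
# if the pin succeeded.
pin_then_layer() {
    run ostree admin pin "$1" &&
    run rpm-ostree install "$2"
}

# Example (as root), after confirming the index with `ostree admin status`:
#   pin_then_layer 1 egl-utils
# Preview the commands without running them:
#   DRY_RUN=1 pin_then_layer 1 egl-utils
```

With the working deployment pinned, the boot menu should end up with three entries instead of two, and the pin can be dropped again later with `ostree admin pin --unpin`.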
So, what can be done? I remembered that ostree is ‘like git for binaries’, so there should be history, right? After some fiddling with the ostree command line, I found the log command, which shows the history of my local repository. But sadly, the output was disappointing:
$ ostree log fedora/rawhide/x86_64/workstation
commit fa09fd6d2551a501bcd3670c84123a22e4c704ac30d9cb421fa76821716d8c20
ContentChecksum: 74ff34ccf6cc4b7554d6a8bb09591a42f489388ba986102f6726f9e662b06fcb
Date: 2018-03-20 10:27:42 +0000
Version: Rawhide.20180320.n.0
(no subject)
<< History beyond this commit not fetched >>
rpm-ostree defaults to only keeping the latest commit in the local repository, a bit like a shallow git clone. Thankfully, just like git, ostree is versatile, and a bit more searching brought me to the pull command and its --depth option:
# ostree pull --depth=5 onerepo fedora/rawhide/x86_64/workstation
Receiving metadata objects: 698/(estimating) 2.2 MB/s 23.7 MB
This command writes to the local repo in /sysroot/ostree/repo and thus needs to be run as root.
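Once more history has been fetched, the log output gets long, and picking out a candidate commit is easier from a condensed view. Here is a minimal sketch that reduces `ostree log` output to one line per commit. It assumes the field layout shown in the log output above (`commit` and `Version:` each on their own line), and `summarize_log` is a hypothetical helper name, not part of ostree.

```shell
#!/bin/sh
# summarize_log: read `ostree log` output on stdin and print
# "<12-character commit prefix> <version>" for each commit.
# Commits without a Version: line are not printed.
summarize_log() {
    awk '
        $1 == "commit"   { id = substr($2, 1, 12) }   # remember the current commit
        $1 == "Version:" { print id, $2 }             # emit once the version is seen
    '
}

# Example:
#   ostree log fedora/rawhide/x86_64/workstation | summarize_log
```

The commit prefix printed in the first column can then be passed to other ostree commands, which accept an unambiguous prefix of the commit ID.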
Now ostree log showed a few older commits. I had to bump the depth a few times to find the last working commit. Then, I made that commit available for booting into again, using the deploy command:
# ostree admin deploy 76723f34b8591434fd9ec0
where that hex string is a prefix of the commit ID of the last working commit. This command also needs to be run as root.
Now a quick reboot, and… the boot loader menu had an entry for the working image again. I made it back to my desktop with 5 minutes to spare before the meeting. Phew!
Update: Since you might be wondering, the output of eglinfo was:
eglinfo: eglInitialize failed