Installing a “full” disk encrypted Ubuntu 16.04 Hetzner server

I needed to install a server multiple times recently. Fully remotely, via the network. In this case, the machines stood at Hetzner, a relatively large German hoster with competitive prices. Once you buy a machine, they boot it into a rescue image that they deliver via PXE. You can log in and start an installer or do whatever you want with the machine.

The installation itself can be as easy as clicking a button in their Web interface. The manual install with their installer is almost as easily performed. You will get a minimal (Ubuntu) installation with the SSH keys or password of your choice deployed.

While having such an easy way to (re-)install the system is great, I’d rather want to have as much of my data encrypted as possible. I came up with a series of commands to execute in order to have an encrypted system at the end. I have put the “full” in the title in quotes, because I dislike the term “full disk encryption”. Mainly because it makes you believe that the disk will be fully encrypted, but it is not. Currently, you have to leave parts unencrypted in order to decrypt the rest. We probably don’t care so much about the confidentiality there, but we would like the contents of our boot partition to be somewhat integrity protected. Anyway, the following shows how I managed to install an Ubuntu with the root partition on LUKS and RAID.

Note: This procedure will disable your machine from booting on its own, because someone will need to unlock the root partition.

shred --size=1M  /dev/sda* /dev/sdb*
installimage -n bitbox -r yes  -l 1 -p swap:swap:48G,/boot:ext3:1G,/mnt/disk:btrfs:128G,/:btrfs:all  -K /root/.ssh/robot_user_keys   -i /root/.oldroot/nfs/install/../images/Ubuntu-1604-xenial-64-minimal.tar.gz


## For some weird reason, Hetzner puts swap space in the RAID.
#mdadm --stop /dev/md0
#mdadm --remove /dev/md0
#mkswap /dev/sda1
#mkswap /dev/sdb1

mount /dev/md3 /mnt
btrfs subvolume snapshot -r /mnt/ /mnt/@root-initial-snapshot-ro

mkdir /tmp/disk
mount /dev/md2 /tmp/disk
btrfs send /mnt/@root-initial-snapshot-ro | btrfs receive -v /tmp/disk/ 
umount /mnt/

luksformat -t btrfs  /dev/md3 
cryptsetup luksOpen /dev/md3 cryptedmd3

mount /dev/mapper/cryptedmd3  /mnt/

btrfs send /tmp/disk/@root-initial-snapshot-ro | btrfs receive -v /mnt/
btrfs subvolume snapshot /mnt/@root-initial-snapshot-ro /mnt/@

btrfs subvolume create /mnt/@home
btrfs subvolume create /mnt/@var
btrfs subvolume create /mnt/@images


blkid -o export /dev/mapper/cryptedmd3  | grep UUID=
sed -i  's,.*md/3.*,,'   /mnt/@/etc/fstab
echo  /dev/mapper/cryptedmd3   /     btrfs defaults,subvol=@,noatime,compress=lzo  0  0  | tee -a /mnt/@/etc/fstab
echo  /dev/mapper/cryptedmd3   /home btrfs defaults,subvol=@home,compress=lzo,relatime,nodiratime  0  0  | tee -a /mnt/@/etc/fstab

umount /mnt/
mount /dev/mapper/cryptedmd3 -osubvol=@ /mnt/

mount /dev/md1 /mnt/boot

mv /mnt//run/lock /tmp/
chroot-prepare /mnt/; chroot /mnt


passwd

echo  "termcapinfo xterm* ti@:te@" | tee -a /etc/screenrc
sed "s/UMASK[[:space:]]\+022/UMASK   027/" -i /etc/login.defs  
#echo   install floppy /bin/false  | tee -a    /etc/modprobe.d/blacklist
#echo "blacklist floppy" | tee /etc/modprobe.d/blacklist-floppy.conf

# Hrm. for some reason, even with crypttab present, update-initramfs does not include cryptsetup in the initrd except when we force it:
# https://bugs.launchpad.net/ubuntu/+source/cryptsetup/+bug/1256730
# echo "export CRYPTSETUP=y" | tee /usr/share/initramfs-tools/conf-hooks.d/forcecryptsetup



echo   cryptedmd3 $(blkid -o export /dev/md3  | grep UUID=) none luks     | tee  -a  /etc/crypttab
# echo   swap   /dev/md0   /dev/urandom   swap,cipher=aes-cbc-essiv:sha256  | tee  -a  /etc/crypttab


apt-get update
apt-get install -y cryptsetup
apt-get install -y busybox dropbear


mkdir -p /etc/initramfs-tools/root/.ssh/
chmod ug=rwX,o=   /etc/initramfs-tools/root/.ssh/
dropbearkey -t rsa -f /etc/initramfs-tools/root/.ssh/id_rsa.dropbear

/usr/lib/dropbear/dropbearconvert dropbear openssh \
        /etc/initramfs-tools/root/.ssh/id_rsa.dropbear \
        /etc/initramfs-tools/root/.ssh/id_rsa

dropbearkey -y -f /etc/initramfs-tools/root/.ssh/id_rsa.dropbear | \
        grep "^ssh-rsa " > /etc/initramfs-tools/root/.ssh/id_rsa.pub



cat /etc/initramfs-tools/root/.ssh/id_rsa.pub >> /etc/initramfs-tools/root/.ssh/authorized_keys

cat /etc/initramfs-tools/root/.ssh/id_rsa

 
update-initramfs -u -k all
update-grub2

exit

umount -l /mnt
mount /dev/mapper/cryptedmd3 /mnt/
btrfs subvolume snapshot -r /mnt/@ /mnt/@root-after-install

umount -l /mnt

Let’s walk through it.


shred --size=1M /dev/sda* /dev/sdb*

I was under the impression that results are a bit more deterministic if I blow away the partition table before starting. This is probably optional.


installimage -n somehostname -r yes -l 1 -p swap:swap:48G,/boot:ext3:1G,/mnt/disk:btrfs:128G,/:btrfs:all -K /root/.ssh/robot_user_keys -i /root/.oldroot/nfs/install/../images/Ubuntu-1604-xenial-64-minimal.tar.gz

This is Hetzner’s install script. You can look at the script here. It’s setting up some hostname, a level 1 RAID, some partitions (btrfs), and SSH keys. Note that my intention is to use dm-raid here and not btrfs raid, mainly because I trust the former more. Also, last time I checked, btrfs’ raid would not perform well, because it used the PID to determine which disk to hit.



mdadm --stop /dev/md0
mdadm --remove /dev/md0
mkswap /dev/sda1
mkswap /dev/sdb1

If you don’t want your swap space to be in the RAID, remove the array and reformat the partitions. I was told that there are instances in which it makes sense to have a raided swap. I guess it depends on what you want to achieve…



mount /dev/md3 /mnt
btrfs subvolume snapshot -r /mnt/ /mnt/@root-initial-snapshot-ro

mkdir /tmp/disk
mount /dev/md2 /tmp/disk
btrfs send /mnt/@root-initial-snapshot-ro | btrfs receive -v /tmp/disk/
umount /mnt/

Now we first snapshot the freshly installed image not only in case anything breaks and we need to restore, but also we need to copy our data off, set LUKS up, and copy the data back. We could also try some in-place trickery, but it would require more scripts and magic dust.


luksformat -t btrfs /dev/md3
cryptsetup luksOpen /dev/md3 cryptedmd3
mount /dev/mapper/cryptedmd3 /mnt/

Here we set the newly encrypted drive up. Remember your passphrase. You will need it as often as you want to reboot the machine. You could think of using pwgen (or similar) to produce a very very long password and save it encryptedly on a machine that you will use when babysitting the boot of the server.


btrfs send /tmp/disk/@root-initial-snapshot-ro | btrfs receive -v /mnt/
btrfs subvolume snapshot /mnt/@root-initial-snapshot-ro /mnt/@

Do not, I repeat, do NOT use btrfs add because the btrfs driver had a bug. The rescue image may use a fixed driver now, but be warned. Unfortunately, I forgot the details, but it involved the superblock being confused about the number of devices used for the array. I needed to re-set the number of devices before systemd would be happy about booting the machine.


btrfs subvolume create /mnt/@home
btrfs subvolume create /mnt/@var
btrfs subvolume create /mnt/@images

We create some volumes at our discretion. It’s up to you how you want to partition your device. My intention is to be able to backup the home directories without also backing up the system files. The images subvolume might become a non-COW storage for virtual machine images.


blkid -o export /dev/mapper/cryptedmd3 | grep UUID=
sed -i 's,.*md/3.*,,' /mnt/@/etc/fstab
echo /dev/mapper/cryptedmd3 / btrfs defaults,subvol=@,noatime,compress=lzo 0 0 | tee -a /mnt/@/etc/fstab
echo /dev/mapper/cryptedmd3 /home btrfs defaults,subvol=@home,compress=lzo,relatime,nodiratime 0 0 | tee -a /mnt/@/etc/fstab

We need to tell the system where to find our root partition. You should probably use the UUID= notation for identifying the device, but I used the device path here, because I wanted to eliminate a certain class of errors when trying to make it work. Because of the btrfs bug mentioned above I had to find out why systemd wouldn’t mount the root partition. It was a painful process with very little help from debugging or logging output. Anyway, I wanted to make sure that systemd attempts to take exactly that device and not something that might have changed.

Let me state the problem again: The initrd successfully mounted the root partition and handed control over to systemd. Systemd then wanted to ensure that the root partition is mounted. Due to the bug mentioned above it thought the root partition was not ready so it was stuck on mounting the root partition. Despite systemd itself being loaded from that very partition. Very confusing. And I found it surprising to be unable to tell systemd to start openssh as early as possible. There are a few discussions on the Internet but I couldn’t find any satisfying solution. Is it that uncommon to want the emergency mode to spawn an SSHd in order to be able to investigate the situation?


umount /mnt/
mount /dev/mapper/cryptedmd3 -osubvol=@ /mnt/

mount /dev/md1 /mnt/boot

mv /mnt//run/lock /tmp/
chroot-prepare /mnt/; chroot /mnt

Now we mount the actual root partition of our new system and enter its environment. We need to move the /run/lock directory out of the way to make chroot-prepare happy.


passwd

We start by creating a password for the root user, just in case.


echo "termcapinfo xterm* ti@:te@" | tee -a /etc/screenrc
sed "s/UMASK[[:space:]]\+022/UMASK 027/" -i /etc/login.defs
#echo install floppy /bin/false | tee -a /etc/modprobe.d/blacklist
#echo "blacklist floppy" | tee /etc/modprobe.d/blacklist-floppy.conf

Adjust some of the configuration to your liking. I want to be able to scroll in my screen sessions and I think having a more restrictive umask by default is good.


echo "export CRYPTSETUP=y" | tee /usr/share/initramfs-tools/conf-hooks.d/forcecryptsetup

Unless bug 1256730 is resolved, you might want to make sure that mkinitramfs includes everything that is needed in order to decrypt your partition. Please scroll down a little bit to check how to find out whether cryptsetup is indeed in your initramdisk.


echo cryptedmd3 $(blkid -o export /dev/md3 | grep UUID=) none luks | tee -a /etc/crypttab
# echo swap /dev/md0 /dev/urandom swap,cipher=aes-cbc-essiv:sha256 | tee -a /etc/crypttab

In order for the initramdisk to know where to find which devices, we populate /etc/crypttab with the name of our desired mapping, its source, and some options.


apt-get update
apt-get install -y cryptsetup
apt-get install -y busybox dropbear

Now, in order for the boot process to be able to decrypt our encrypted disk, we need to have the cryptsetup package installed. We also install busybox and dropbear to be able to log into the boot process via SSH. The installation should print you some warnings or hints as to how to further proceed in order to be able to decrypt your disk during the boot process. You will probably find some more information in /usr/share/doc/cryptsetup/README.remote.gz.


mkdir -p /etc/initramfs-tools/root/.ssh/
chmod ug=rwX,o= /etc/initramfs-tools/root/.ssh/
dropbearkey -t rsa -f /etc/initramfs-tools/root/.ssh/id_rsa.dropbear

/usr/lib/dropbear/dropbearconvert dropbear openssh \
/etc/initramfs-tools/root/.ssh/id_rsa.dropbear \
/etc/initramfs-tools/root/.ssh/id_rsa

dropbearkey -y -f /etc/initramfs-tools/root/.ssh/id_rsa.dropbear | \
grep "^ssh-rsa " > /etc/initramfs-tools/root/.ssh/id_rsa.pub

cat /etc/initramfs-tools/root/.ssh/id_rsa.pub >> /etc/initramfs-tools/root/.ssh/authorized_keys

cat /etc/initramfs-tools/root/.ssh/id_rsa

Essentially, we generate a SSH keypair, convert it for use with openssh, leave the public portion in the initramdisk so that we can authenticate, and print out the private part which you better save on the machine that you want to use to unlock the server.


update-initramfs -u -k all
update-grub2

Now we need to regenerate the initramdisk so that it includes all the tools and scripts to be able decrypt the device. We also need to update the boot loader so that includes the necessary Linux parameters for finding the root partition.


exit

umount -l /mnt
mount /dev/mapper/cryptedmd3 /mnt/
btrfs subvolume snapshot -r /mnt/@ /mnt/@root-after-install

umount -l /mnt

we leave the chroot and take a snapshot of the modified system just in case… You might now think about whether you want your boot and swap parition to be in a RAID and act accordingly. Then you can try to reboot your system. You should be able to SSH into the machine with the private key you hopefully saved. Maybe you use a small script like this:


cat ~/.server_boot_secret | ssh -o UserKnownHostsFile=~/.ssh/server-boot.known -i ~/.ssh/id_server_boot root@server "cat - >/lib/cryptsetup/passfifo"

If that does not work so well, double check whether the initramdisk contains everything necessary to decrypt the device. I used something like


zcat /boot/initrd.img-4.4.0-47-generic > /tmp/inird.cpio
mkdir /tmp/initrd
cd /tmp/initrd
cpio -idmv < ../inird.cpio
find . -name '*crypt*'

If there is no cryptsetup binary, something went wrong. Double check the paragraph above about forcing mkinitramfs to include cryptsetup.

With these instructions, I was able to install a new machine with an encrypted root partition within a few minutes. I hope you will be able to as well. Let me know if you think anything needs to be adapted to make it work with more modern version of either Ubuntu or the Hetzner install script.

(Re)mastering a custom Ubuntu auto-install ISO

Recently, I had to install GNU/Linux on a dozen or so machines. I didn’t want to install manually, mainly because I was too lazy, but also because the AC in the data centre is quite strong and I didn’t want to catch a cold… So I looked for some lightweight way of automatically installing an Ubuntu or so. Fortunately, I don’t seem to be the first person to be looking for a solution, although, retrospectively, I think the tooling is still poor.

I would describe my requirements as being relatively simple. I want to turn one of the to be provisioned machines on, wait, and then be able to log in via SSH. Ideally, most of the software that I want to run would already be installed. I’m fine with software the distribution ships. The installation must not require the Internet and should just work™, i.e. it should wipe the disk and not require anything special from the network which I have only little control over.

I looked at tools like Foreman, Cobbler, and Ubuntu’s MAAS. But I decided against them because it doesn’t necessarily feel lightweight. Actually, Cobbler doesn’t seem to work well when run on Ubuntu. It also fails (at least for me) when being behind an evil corporate proxy. Same for MAAS. Foreman seems to be more of a machine management framework rather than a hit and run style of tool.

So I went for an automated install using the official CD-ROMs. This is sub-optimal as I need to be physically present at the machines and I would have preferred a non-touch solution. Fortunately, the method can be upgrade to delivering the installation medium via TFTP/PXE. But most of the documents describing the process insist on Bind which I dislike. Also, producing an ISO is less error-prone so making that work first should be easier; so I thought.

Building an ISO

The first step is to mount to ISO and copy everything into a working directory. You could probably use something like isomaster, too.


mkdir iso.vanilla
sudo mount -oloop ubuntu.iso ./iso.vanilla
mkdir iso.new
sudo cp -ar ./iso.vanilla/* ./iso.vanilla/.* iso.new/

After you have made changes to your image, you probably want to generate a new ISO image that you can burn to CD later.


sudo mkisofs -J -l -b isolinux/isolinux.bin -no-emul-boot -boot-load-size 4 -boot-info-table -z -iso-level 4 -c isolinux/isolinux.cat -o /tmp/ubuntu-16.04-myowninstall-amd64.iso -joliet-long iso.new

You’d expect that image to work If you now dd it onto a pendrive, but of course it does not… At least it didn’t for me. After trying many USB creators, I eventually found that you need to call isohybrid.


sudo isohybrid /tmp/ubuntu-16.04-myowninstall-amd64.iso

Now you can test whether it boots with qemu:


qemu-img create -f qcow2 /tmp/ubuntu.qcow2 10G
qemu-system-x86_64 -m 1G -cdrom ubuntu-16.04-server-amd64.iso -hda /tmp/ubuntu-nonet.qcow2

If you want to test whether a USB image would boot, try with -usb -usbdevice disk:/tmp/ubuntu-16.04-myowninstall-amd64.iso. If it doesn’t, then you might want to check whether you have assigned enough memory to the virtual machine. I needed to give -m 1G, because the default didn’t work with the following mysterious error.

Error when running with too little memory

It should also be possible to create a pendrive with FAT32 and to boot it on EFI machines. But my success was limited…

Making Changes

Now what changes do you want to make to the image to get an automated installation?
First of all you want to get rid of the language selection. Rumor has it that


echo en | tee isolinux/lang

is sufficient, but that did not work for me. Replacing timeout values in files in the isolinux to something strictly positive worked much better for me. So edit isolinux/isolinux.cfg.

If the image boots now, you don’t want the installer to ask you questions. Unfortunately, there doesn’t seem to be “fire and forget” mode which tries to install as aggressively as possible. But there are at least two mechanisms: kickstart and preseed. Ubuntu comes with a kickstart compatibility layer (kickseed).

Because I didn’t know whether I’ll stick with Ubuntu, I opted for kickstart which would, at least theoretically, allow me for using Fedora later. I installed system-config-kickstart which provides a GUI for creating a kickstart file. You can then place the file in, e.g. /preseed/ks-custom.cfg next to the other preseed files. To make the installer load that file, reference it in the kernel command line in isolinux/txt.cfg, e.g.


default install
label install
menu label ^Install Custom Ubuntu Server
kernel /install/vmlinuz
append file=/cdrom/preseed/ubuntu-server.seed vga=788 initrd=/install/initrd.gz ks=cdrom:/preseed/ks-custom.cfg DEBCONF_DEBUG=5 cdrom-detect/try-usb=false usb_storage.blacklist=yes --

Ignore the last three options for now and remember them later when we talk about issues installing from a pen drive.

When you boot now, you’d expect it to “just work”. But if you are me then you’ll run into the installer asking you questions. Let’s discuss these.

Multiple Network Interfaces

When you have multiple NICs, the installer apparently asks you for which interface to use. That is, of course, not desirable when wanting to install without interruption. The documentation suggest to use


d-i netcfg/choose_interface select auto

That, however, seemed to crash the installer when I configured QEMU to use four NICs… I guess it’s this bug which, at least on my end, had been cause by my accidentally putting “eth0” instead of “auto”. It’s weird, because it worked fine with the single NIC setup. The problem, it seems, is that eth0 does not exist! It’s 2016 and we have “predictable device names” now. Except that we still have /dev/sda for the first harddisk. I wonder whether there is a name for the first NIC. Anyway, if you do want to have the eth0 scheme back, it seems to be possible by setting biosdevname=0 as kernel parameter when booting.

2016-multiple-networks

You can test with multiple NICs and QEMU like this:


sudo qemu-system-x86_64 -m 1G -boot menu=on -hda /tmp/ubuntu-nonet.qcow2 -runas $USER -usb -usbdevice disk:/tmp/ubuntu-16.04-myowninstall-amd64.iso -netdev user,id=network0 -device e1000,netdev=network0 -netdev user,id=network1 -device e1000,netdev=network1 -netdev user,id=network2 -device e1000,netdev=network2 -netdev user,id=network3 -device e1000,netdev=network3 -cdrom /tmp/ubuntu-16.04-myowninstall-amd64.iso

No Internet Access

When testing this with the real servers, I realised that my qemu testbed was still too ideal. The real machines can resolve names, but cannot connect to the Internet. I couldn’t build that scenario with qemu, but the following gets close:


sudo qemu-system-x86_64 -m 1G -boot menu=on -hda /tmp/ubuntu-nonet.qcow2 -runas $USER -usb -usbdevice disk:/tmp/ubuntu-16.04-myowninstall-amd64.iso -netdev user,id=network0,restrict=y -device e1000,netdev=network0 -netdev user,id=network1,restrict=y -device e1000,netdev=network1 -netdev user,id=network2,restrict=y -device e1000,netdev=network2 -netdev user,id=network3,restrict=y -device e1000,netdev=network3 -cdrom /tmp/ubuntu-16.04-myowninstall-amd64.iso

That, however, fails:

2016-default-route

The qemu options seem to make the built-in DHCP server to not hand out a default gateway via DHCP. The installer seems to expect that, though, and thus stalls and waits for user input. According to the documentation a netcfg/get_gateway value of "none" could be used to make it proceed. It’s not clear to me whether it’s a special none type, the string literal “none”, or the empty string. Another uncertainty is how to actually make it work from within the kickstart file, because using this debconf syntax is for preseeding, not kickstarting. I tried several things,


preseed netcfg/get_gateway none
preseed netcfg/get_gateway string
preseed netcfg/get_gateway string 1.2.3.4
preseed netcfg/get_gateway string none
preseed netcfg/no_default_route boolean true

The latter two seemed to worked better. You may wonder how I found that magic configuration variable. I searched for the string being displayed when it stalled and found an anonymous pastebin which carries all the configurable items.

After getting over the gateway, it complained about missing nameservers. By putting


preseed netcfg/get_nameservers string 8.8.8.8

I could make it proceed automatically.

2016-nameservers

Overwriting existing partitions

When playing around you eventually get to the point where you need to retry, because something just doesn’t work. Then you change your kickseed file and try again. On the same machine you’ve just left half-installed with existing partitions and all. For a weird reason the installer mounts the partition(s), but cannot unmount them

2016-mounted

The documentation suggest that a line like


preseed partman/unmount_active boolean true

would be sufficient, but not so for me. And it seems to be an issue since 2014 at least. The workarounds in the bug do not work. Other sources suggested to use partman/early_command string umount -l /media || true, partman/filter_mounted boolean false, or partman/unmount_active seen true. Because it’s not entirely clear to me, who the “owner” , in terms of preseed, is. I’ve also experimented with setting, e.g. preseed --owner partman-base partman/unmount_active boolean true. It started to work when I set preseed partman/unmount_active DISKS /dev/sda and preseed --owner partman-base partman/unmount_active DISKS /dev/sda. I didn’t really believe my success and reordered the statements a bit to better understand what I was doing. I then removed the newly added statements and expected it to not work. However, it did. So I was confused. But I didn’t have the time nor the energy to follow what really was going on. I think part of the problem is also that it sometimes tries to mount the pendrive itself! Sometimes I’ve noticed how it actually installed the system onto the pendrive *sigh*. So I tried hard to make it not mount USB drives. The statements that seem to work for me are the above mentioned boot parameters (i.e. cdrom-detect/try-usb=false usb_storage.blacklist=yes) in combination with:


preseed partman/unmount_active boolean true
preseed --owner partman-base partman/unmount_active boolean true
preseed partman/unmount_active seen true
preseed --owner partman-base partman/unmount_active seen true

#preseed partman/unmount_active DISKS /dev/sda
#preseed --owner partman-base partman/unmount_active DISKS /dev/sda

preseed partman/early_command string "umount -l /media || true"
preseed --owner partman-base partman/early_command string "umount -l /media ||$

How I found that, you may ask? Enter the joy of debugging.

Debugging debconf

When booting with DEBCONF_DEBUG=5, you can see a lot of information in /var/log/syslog. You can see what items are queried and what it thinks the answer is. It looks somewhat like this:

2016-debug

You can query yourself with the debconf-get tool, e.g.


# debconf-get partman/unmount_active
true

The file /var/lib/cdebconf/questions.dat seems to hold all the possible items. In the templates.dat you can see the types and the defaults. That, however, did not really enlighten me, but only wasted my time. Without knowing much about debconf, I’ve noticed that you seem to be able to not only store true and false, but also flags like “seen”. By looking at the screenshot above I’ve noticed that it forcefully sets partman/unmount_active seen false. According to the documentation mentioned above, some code really wants this flag to be reset. So that way was not going to be successful. I noticed that the installer somehow sets the DISKS attribute to the partman/unmount_active, so I tried to put the disk in question (/dev/sda) and it seemed to work.

Shipping More Software

I eventually wanted to install some packages along with the system, but not through the Internet. I thought that putting some more .debs in the ISO would be as easy as copying the file into a directory. But it’s not just that easy. You also need to create the index structure Debian requires. The following worked well enough for me:


cd iso.new
cd pool/extras
apt-get download squid-deb-proxy-client
cd ../..
sudo apt-ftparchive packages ./pool/extras/ | sudo tee dists/stable/extras/binary-i386/Packages

I was surprised by the i386 suffix. Although I can get over the additional apt-ftparchive, I wish it wouldn’t be necessary. Another source of annoyance is the dependencies. I couldn’t find a way to conveniently download all the dependencies of a given package.

These packages can then be installed with the %packages directive:


%packages
@ ubuntu-server
ubuntu-minimal
openssh-server
curl
wget
squid-deb-proxy-client
avahi-daemon
avahi-autoipd
telnet
nano
#build-essential
#htop

Or via a post-install script:


%post

apt-get install -y squid-deb-proxy-client
apt-get update
apt-get install -y htop
apt-get install -y glusterfs-client glusterfs-server
apt-get install -y screen
apt-get install -y qemu-kvm libvirt-bin

Unfortunately, I can’t run squid-deb-proxy-client in the installer itself. Not only because I don’t know how to properly install the udeb, but also because it requires the dbus daemon to be run inside the to-be-installed system which proves to be difficult. I tried the following without success:


preseed anna/choose_modules string squid-deb-proxy-client-udeb

preseed preseed/early_command string apt-install /cdrom/pool/extras/squid-deb-proxy-client_0.8.14_all.deb

%pre
anna-install /cdrom/pool/extras/squid-deb-proxy-client-udeb_0.8.14_all.udeb

If you happen to know how to make it work, I’d be glad to know about it.

Final Thoughts

Having my machines installed automatically cost me much more time than installing them manually. I expected to have tangible results much quicker than I actually did. However, now I can re-install any machine within a few minutes which may eventually amortise the investment.

I’m still surprised by the fact that there is no “install it, dammit!” option for people who don’t really care about the details and just want to get something up and running.

Unfortunately, it seems to be non-trivial to just save the diff of the vanilla and the new ISO :-( The next Ubuntu release will then require me to redo the modifications. Next time, however, I will probably not use the kickseed compatibility layer and stick to the pure method.

A journey to an updated Linux 18

Oh what joy this whole GNU/Linux thing brings. I took a few days off to upgrade my machines. I had the pleasure to update one laptop twice, i.e. from the Ubuntu 12.04 LTS to the current 13.04 and a desktop from Fedora 17 to Fedora 18.

The Laptop was almost easy. It took long time for the system to install packages. And there are stupid dialogues to confirm which block the whole process. Not very nice. I let it run for a couple of hours, everything went more less fine until I couldn’t log in anymore. LightDM saved my GNOME preference but there was no gnome-session left. So I went to the console and got myself ubuntu-gnome-desktop (arr. stupid wordpress doesn’t render apt:// links).

The second update from 12.10 to 13.04 took as long as the first, with nothing noteworthy happening. Interestingly though, it didn’t want to install the 13.04 unless being told to install a “development release”. Bollocks.

Anyway, Ubuntu’s GNOME runs almost nicely on my tiny laptop. GNOME-Shell is very slow when it comes to alt-tab. It takes three or four seconds to switch a window. Distraction free computing at its best.

The Fedora desktop is full bucket of joy. The FedUp utility keeps what it promises. It’s surprisingly refreshing. This time, the whole upgrade procedure worked flawlessly. No really! In 2013! I’m amazed. It only took a while for it to fetch everything but then a reboot straight into the upgrade system made the magic happen. Very cool.

Not so cool was the surprise of the machine not booting. Of course. Systemd hung somewhere in NFS related daemons and bailed out because they failed. The old GRUB menu entry booted a little further, just until sendmail, and enabled me to investigate.

Sendmail could not be brought up, because “-bd is not supported by sSMTP”. Right. I have sSMTP installed. And to make a long story short, something did place an init script in /etc/rc.d/init.d/. And that script failed now. NOW. After a couple of years. It was probably never used but got activated with the migration to systemd. Anyway, you might want to delete your stray init scripts and eventually get rid of the packages altogether.

Then GDM wouldn’t come up. Only flicker. It took me a while to find the relevant log files (thinking that everything was in the Journal by now…) but grepping for the usual “EE” and “WW” didn’t reveal much.

# grep -r -e EE -e WW /var/log/gdm/
/var/log/gdm/:5.log.1: (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
/var/log/gdm/:5.log.1:Initializing built-in extension MIT-SCREEN-SAVER
/var/log/gdm/:5.log.1:(WW) Falling back to old probe method for vesa
/var/log/gdm/:5.log.1:(WW) Falling back to old probe method for modesetting
/var/log/gdm/:5.log.1:(WW) Falling back to old probe method for fbdev
/var/log/gdm/:5.log: (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
/var/log/gdm/:5.log:Initializing built-in extension MIT-SCREEN-SAVER
/var/log/gdm/:5.log:(WW) Falling back to old probe method for vesa
/var/log/gdm/:5.log:(WW) Falling back to old probe method for modesetting
/var/log/gdm/:5.log:(WW) Falling back to old probe method for fbdev
/var/log/gdm/:1.log.2: (WW) warning, (EE) error, (NI) not implemented, (??) unknown.

But. There were also the logs for the “slaves”. They contained:

gdm-simple-slave[1030]: WARNING: Failed to give slave programs access to the display. Trying to proceed.
gdm-launch-environment][1046]: pam_unix(gdm-launch-environment:session): session opened for user gdm by (uid=0)
gdm-launch-environment][1046]: pam_unix(gdm-launch-environment:session): session closed for user gdm
gdm-simple-slave[1030]: GLib-GObject-CRITICAL: g_object_ref: assertion `object->ref_count > 0′ failed
gdm-simple-slave[1030]: GLib-GObject-CRITICAL: g_object_unref: assertion `object->ref_count > 0′ failed

And there was a hint given by systemd:

# systemctl status gdm --full
gdm.service - GNOME Display Manager
Loaded: loaded (/usr/lib/systemd/system/gdm.service; enabled)
Active: active (running) since Fr 2013-05-03 12:22:04 CEST; 9s ago
Main PID: 1843 (gdm-binary)
CGroup: name=systemd:/system/gdm.service
└─1843 /usr/sbin/gdm-binary

Mai 03 12:22:07 bigbox gdm[1843]: gdm-binary[1843]: WARNING: GdmDisplay: display lasted 0.510350 seconds
Mai 03 12:22:07 bigbox gdm-binary[1843]: WARNING: GdmDisplay: display lasted 0.510350 seconds
Mai 03 12:22:07 bigbox gdm-simple-slave[1997]: WARNING: Failed to give slave programs access to the display. Trying to proceed.
Mai 03 12:22:08 bigbox gdm-simple-slave[1997]: GLib-GObject-CRITICAL: g_object_ref: assertion `object->ref_count > 0' failed
Mai 03 12:22:08 bigbox gdm[1843]: gdm-binary[1843]: WARNING: GdmDisplay: display lasted 0.507905 seconds
Mai 03 12:22:08 bigbox gdm-binary[1843]: WARNING: GdmDisplay: display lasted 0.507905 seconds
Mai 03 12:22:08 bigbox gdm-binary[1843]: WARNING: GdmLocalDisplayFactory: maximum number of X display failures reached: check X server log for errors
Mai 03 12:22:08 bigbox gdm-binary[1843]: WARNING: GdmDisplay: display lasted 0.509609 seconds
Mai 03 12:22:08 bigbox gdm[1843]: gdm-binary[1843]: WARNING: GdmLocalDisplayFactory: maximum number of X display failures reached: check X server log for errors
Mai 03 12:22:08 bigbox gdm[1843]: gdm-binary[1843]: WARNING: GdmDisplay: display lasted 0.509609 seconds

Aha! There is the problem! But.. what is it? No indication whatsoever. Not even a tiny hint as to where to look next.

I decided to make baby steps and tried to bring up X on my own. My computer liked “X”. But it didn’t “startx”. That in turn revealed a missing library. libicule.so.48. But the current version is .49. Why on earth would something try to link against an old version? “yum distro-sync” proves me right that my packages are up to date. I thus set out to find the weird library causing me trouble. But there were many!


# ldd /lib64/libgailutil-3.so | grep not
libicule.so.48 => not found
libicuuc.so.48 => not found
libicudata.so.48 => not found

I thought I got rid of them by doing

for f in /lib64/*.so; do ldd $f | (grep -q “not found” && echo $f); done | xargs yum remove -y

but that didn’t help. The ldd resolves symbols recursively but I really want to know the symbols needed by the library itself, not its dependencies. Readelf comes to mind. And after chasing a few libraries manually, I was tired so I came up with

for lib in $(cat /tmp/libs); do echo $lib; for l in $(readelf -d /lib64/$lib | grep NEEDED | cut -d[ -f2 | cut -d] -f1); do echo $lib: $l; done; done | less

which showed nicely which library the culprit was.

It was /lib64/libharfbuzz.so.0 from harfbuzz-0.9.13-1.fc20.x86_64. Where does this package come from, you may ask. So did I. I didn’t know how to make yum tell me, but I found out that it belonged to the F17 texlive repository.

Interestingly enough, yum check told me that there was a problem but couldn’t handle it. The solution, very similar to the command above, but with an important difference:

yum --disablerepo texlive distro-sync

Hope this will be useful to someone in the future. Chances are quite good.