On Boot Times

Why does it take as long to boot Fedora 23 in 2016 as it did to boot Windows 95 in 1995?

I knew we were slow, but I did not realize how slow:

$ systemd-analyze
Startup finished in 9.002s (firmware) + 5.586s (loader) + 781ms (kernel) + 24.845s (initrd) + 1min 16.803s (userspace) = 1min 57.019s

Two minutes. (Edit: The 25 seconds in initrd is mostly time spent waiting for me to enter my LUKS password. Still, 1.5 minutes.)

$ systemd-analyze blame
32.247s plymouth-quit-wait.service
22.837s systemd-cryptsetup@luks\x2df1993bc3\x2da397\x2d4b38\x2d9bef\x2d
18.058s systemd-journald.service
16.804s firewalld.service
9.314s systemd-udev-settle.service
8.905s libvirtd.service
7.890s dev-mapper-fedora_victory\x2d\x2droad\x2droot.device
5.712s abrtd.service
5.381s accounts-daemon.service
2.982s packagekit.service
2.871s lvm2-monitor.service
2.646s systemd-tmpfiles-setup-dev.service
2.589s systemd-journal-flush.service
2.370s dmraid-activation.service
2.230s proc-fs-nfsd.mount
2.024s systemd-udevd.service
2.000s lm_sensors.service
1.932s polkit.service
1.931s systemd-fsck@dev-disk-by\x2duuid-30901da9\x2dab7e\x2d41fc\x2d9b
1.852s systemd-fsck@dev-mapper-fedora_victory\x2d\x2droad\x2dhome.serv
1.795s iio-sensor-proxy.service
1.786s gssproxy.service
1.759s gdm.service

(Truncated.)

This review of Fedora 23 shows how severely our boot speed has regressed (spoiler: 56.5% slower than Fedora 21, 49% slower than Ubuntu 15.10). The review also shows that Fedora 23 takes twice as long to power off as Fedora 22.

I think we can do better.

24 Replies to “On Boot Times”

  1. Out of interest, please will you post the output of `systemd-analyze critical-chain`? This provides more useful information. `systemd-analyze blame` shows `plymouth-quit-wait.service` as the service with the longest startup time; however, due to its nature, it contributes very little to the total boot time. `systemd-analyze critical-chain` will better show which services are actually lengthening the boot.

    1. That’s a useful one:

      $ systemd-analyze critical-chain
      The time after the unit is active or started is printed after the "@" character.
      The time the unit takes to start is printed after the "+" character.

      graphical.target @1min 19.933s
      └─multi-user.target @1min 19.933s
        └─libvirtd.service @44.197s +7.404s
          └─remote-fs.target @44.158s
            └─remote-fs-pre.target @44.158s
              └─iscsi-shutdown.service @44.157s
                └─network.target @44.145s
                  └─wpa_supplicant.service @1min 15.035s +971ms
                    └─dbus.service @22.995s
                      └─basic.target @22.907s
                        └─sockets.target @22.903s
                          └─cups.socket @22.884s
                            └─sysinit.target @22.781s
                              └─systemd-update-utmp.service @22.594s +186ms
                                └─auditd.service @22.401s +191ms
                                  └─systemd-tmpfiles-setup.service @21.454s +945ms
                                    └─systemd-journal-flush.service @3.451s +17.991s
                                      └─systemd-remount-fs.service @2.711s +691ms
                                        └─systemd-fsck-root.service @584542y 2w 2d 20h 1min 48.194s +99ms
                                          └─systemd-journald.socket
                                            └─-.slice

  2. Is this running on hardware from 1995, too? Just kidding, but Fedora 23 takes only about 20s to boot on my 3-year-old laptop, and 1/3 of that is the time I need to type in my disk encryption passphrase. So in reality it takes 13-14s to boot my F23, and that’s with some optional services enabled.

    If you need to speed it up, disable abrtd and libvirtd, and replace firewalld with iptables. And train yourself to type your passphrase faster, since that time is included in your 2-minute boot time, too. Finally, you could build an initrd with only the modules you really need, which should gain you some time as well.
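
    A rough sketch of those steps on a stock Fedora install (whether dropping abrtd and libvirtd is safe depends on what you actually use; iptables-services is the package that ships the plain iptables unit):

      # disable the two heavyweight services mentioned above
      sudo systemctl disable abrtd.service libvirtd.service

      # swap firewalld for the static iptables service
      sudo dnf install iptables-services
      sudo systemctl disable firewalld.service
      sudo systemctl enable iptables.service

      # rebuild the initramfs with only the modules this machine needs
      sudo dracut --hostonly --force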

  3. I don’t know what’s going on there, but it’s nothing like that for me. Recent Fedoras boot in approx. 10-12 seconds, and 3 of those on my desktop are the bridged network connection doing a two second MTP timeout.

    What hardware was this on?

    1. The hard drive is: SAMSUNG HD321KJ (CP100-12)

      It’s probably about 10 years old. The Internet says it’s 7200 rpm, which is the same speed that’s available today….

      The processor is: Intel® Core™ i7-4770 CPU @ 3.40GHz × 8

      But I think this is not happening just to me; the reviewer I linked to measured a dramatic regression since F21 (although his absolute times were much faster than mine).

      1. Just based on the tone and content of that review I’m rather suspicious of it, but when I’m home I’ll check and see if I can reproduce his results on my test box (which uses HDDs). I don’t recall F22 or F23 seeming significantly slower to boot on that test box than F21 was, but of course there’s six months between releases and my memory isn’t perfect.

        I’m fairly sure he’s wrong about why readahead was dropped from Fedora; AFAICS it got sort of naturally orphaned by the systemd transition and it doesn’t look like anyone cared to pick it up. Ubuntu has something called ‘ureadahead’ that appears to have a systemd unit, but I’ve no idea what that is, how actively it’s maintained, or how efficient it is.
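
        On a current Fedora you can check whether any readahead units are still shipped at all with something like:

          $ systemctl list-unit-files | grep -i readahead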

      2. BTW, have you run a SMART test on the drive and checked if there are any kernel errors going on? 10 years is *very* old for a spinning rust drive, it could well be dying. Their usual life span is five years if you’re lucky…

        1. I don’t think I have any kernel errors.

          I ran a SMART test just now with GNOME Disks. I’m not sure what to make of most of the data, but it says “disk is OK” and it has an OK in every row.

          It also has either “pre-fail” or “old age” in every row. I am just going to assume that’s what it’s testing for, since it claims to be “OK.”

          1. Yeah, that’s right, it’s the type of test. smartctl gets you more detail and I don’t know whether GNOME disks runs the full exhaustive test, but it sounds like it’s not obviously dying at least.
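
            If you want more detail than GNOME Disks gives, something along these lines kicks off the long self-test and then dumps the full report (assuming the drive is /dev/sda):

              # run the extended offline self-test (takes an hour or more on an HDD)
              sudo smartctl -t long /dev/sda

              # later: overall health, attribute table, and self-test log
              sudo smartctl -a /dev/sda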

          2. This seems like a good time to pretend that I’m not the maintainer of GNOME Disks.

            If anyone knowledgeable in the area of disks is interested in picking this up….

          3. # dmesg | egrep 'reset|UNC'
            If there are bad sectors causing long read attempts, they’d only show up in kernel messages if they take longer than the kernel’s default SCSI command timer of 30 seconds. If that were the case, I’d expect even worse boot times. If they’re bad enough to cause a read error sooner than 30 seconds, I’d expect startup to faceplant. So either way it seems unlikely, but the above will find both kinds of errors.

            Unfortunately SMART fails to give a heads-up for impending drive failure in something like 60% of failure cases. There are two kinds of attributes, “pre-fail” and “old-age”; as long as their assessments are OK, that attribute is not a factor in the opinion of the manufacturer. Mainly you want “value” and “worst” to be the same: if they aren’t, that attribute is a factor for pre-fail or old-age, even if it hasn’t dropped to “thresh” and flipped the state to FAIL. So the drive can actually have a pretty nasty cold and still report its health as OK.

  4. Startup finished in 3.001s (firmware) + 5.326s (loader) + 4.387s (kernel) + 3.111s (userspace) = 15.827s

    That’s my Arch Linux timing. And I didn’t do much to reach it, either. You can do it :)

  5. Curious. Without LVM, software RAID, an encrypted disk and lm_sensors (is it usual to install that on a desktop?) you might go a little bit faster. What sort of disk are you using – HDD?

  6. I think everyone else here uses SSDs, because I have never seen a boot time of 12 or 15 seconds with Fedora 22 or 23 on my 2 PCs. Even with fresh installations with new users.

    Mine takes 1 minute; it’s not as bad as yours, but still…

    http://i.imgur.com/Z9orMp7.png

    1. I wouldn’t expect <15 secs on an HDD, indeed, but I'm surprised it takes as long for Michael as it does. I'm definitely gonna run some tests on a test system with an HDD when I get home from the trip I'm on and see where we're at with clean installs for the last few releases.

  7. >>systemd-fsck-root.service @584542y 2w 2d 20h 1min 48.194s +99ms

    Really? That’s not suspicious at all. That’s hilarious. But no, really, seeing as this is impossible, what’s going on there?

    >>9.002s (firmware) + 5.586s (loader)

    Firmware delays you can’t do anything about, except on newer UEFI systems: if you disable USB on boot, it’ll be about 1.5s. On a 10 year old system with BIOS you’re stuck at 10s if you’re lucky; some servers won’t complete POST for a minute plus.

    Bootloader is slow because /etc/default/grub contains GRUB_TIMEOUT=5 by default.
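
    If you’re happy with a shorter menu timeout, a minimal tweak on Fedora looks roughly like this (the grub.cfg path shown is for BIOS installs; UEFI installs use /boot/efi/EFI/fedora/grub.cfg):

      # drop the menu timeout from 5 seconds to 1
      sudo sed -i 's/^GRUB_TIMEOUT=.*/GRUB_TIMEOUT=1/' /etc/default/grub

      # regenerate the GRUB configuration
      sudo grub2-mkconfig -o /boot/grub2/grub.cfg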

    1. I dunno what’s up with systemd-fsck-root.service. Pretty sure it did not start two years into the boot process. :)

      The computer is actually a newer UEFI system (1.5 years old). Only the hard disk is salvaged from an older computer. It’s some dodgy BioStar motherboard that has tons of boot options that I don’t understand, so I’ve mostly left them alone….

    2. I honestly doubt that systemd counts grub timeout. Unless of course grub itself reports it somewhere.

  8. Hey, Michael. I looked at this stuff a bunch at Endless. One thing I would suggest is looking at the analyze plot. It’s a lot easier to spot the interdependencies of jobs. Another suggestion is to boot with systemd.log_level=debug as that will print lots of info on when and how the jobs are being triggered.

    One thing I suspect is causing a slowdown is that systemd in Fedora likely has the readahead implementation turned off, as it apparently causes issues in some cases. However, for our hardware, I found that having readahead on and working correctly made a really big difference for booting. Ping me if you want to see some more details on that.
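
    A quick sketch of both of those (the SVG filename is arbitrary):

      # render the boot timeline as an SVG you can open in a browser
      systemd-analyze plot > boot.svg

      # for one boot, append this to the kernel command line from the
      # bootloader menu, then read the result with: journalctl -b
      systemd.log_level=debug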

    1. Maybe it’s because we’re on an older systemd, but even in our version it was disabled by default. We enabled it and it makes a huge difference on HDDs. I actually wouldn’t be surprised if it’s in the 50% range. Every job I saw took about twice as long to run without readahead. But, again, without the details it’s hard to say.
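
      On a systemd old enough to still ship readahead (roughly pre-217), enabling it was just a matter of turning on the collect and replay units; those units no longer exist in the systemd that Fedora 23 ships:

        systemctl enable systemd-readahead-collect.service systemd-readahead-replay.service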
