Offline Updates in Fedora 20

In GNOME 3.10 we’re encouraging more people to use the offline-update functionality that we’ve been using in Fedora for a little while now. A couple of people have told me it’s really slow, but I test updates all the time and hadn’t seen an offline update take more than a minute or so. To reproduce the problem, I spun up a seldom-used Fedora 20 alpha image and let GNOME download and prepare all the updates in the background. I then added some profiling code to the pk-offline-update binary, and rebooted. The offline update took almost 17 minutes to run.
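
The profiling code itself isn’t shown in this post. As a minimal sketch of the general idea (per-phase wall-clock timing), here is an illustration in Python rather than the C of pk-offline-update. The names `timed_phase`, `phase_times` and `report` are hypothetical, and the sleeps merely stand in for real work.

```python
import time
from contextlib import contextmanager

# Hypothetical helper state: accumulated wall-clock seconds per phase name.
phase_times = {}


@contextmanager
def timed_phase(name):
    """Time the enclosed block and add the elapsed seconds to phase_times."""
    start = time.monotonic()
    try:
        yield
    finally:
        elapsed = time.monotonic() - start
        phase_times[name] = phase_times.get(name, 0.0) + elapsed


def report():
    """Return (phase, seconds) pairs, slowest phase first."""
    return sorted(phase_times.items(), key=lambda kv: kv[1], reverse=True)


# Stand-in workloads; a real updater would run its transaction steps here.
with timed_phase("depsolve"):
    time.sleep(0.01)
with timed_phase("install"):
    time.sleep(0.1)

for name, secs in report():
    print(f"{name:10s} {secs:6.2f} s")
```

A monotonic clock is used so that a clock adjustment during the update cannot distort the measured durations.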

So, what was it doing all that time, considering that we’d already downloaded the packages and depsolved the transaction?

Transaction Phase           Time (s)
Start up PackageKit              0.3
Starting up yum                  3
Depsolving                      10
Signature Check                  8
Test Commit                      5
Install new packages           704
Remove old packages            168
Run post-install scripts        90
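
To put those numbers in proportion, the short Python sketch below sums the phases and prints each one’s share of the total. The figures are copied from the table above; the variable names are only illustrative.

```python
# Phase timings in seconds, copied from the table above.
PHASES = [
    ("Start up PackageKit", 0.3),
    ("Starting up yum", 3),
    ("Depsolving", 10),
    ("Signature Check", 8),
    ("Test Commit", 5),
    ("Install new packages", 704),
    ("Remove old packages", 168),
    ("Run post-install scripts", 90),
]

total = sum(secs for _, secs in PHASES)
print(f"Total: {total:.1f} s ({total / 60:.1f} min)")

# List the phases slowest first, with each one's share of the total.
for name, secs in sorted(PHASES, key=lambda p: p[1], reverse=True):
    print(f"{name:26s}{secs:7.1f} s {100 * secs / total:5.1f}%")
```

Installing new packages accounts for roughly 71% of the run, and installing plus removing packages for roughly 88%.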

This is about an order of magnitude slower than what I expected. Some of my observations:

  • 10 seconds to depsolve an already depsolved transaction
  • 8 seconds to check a few hundred signatures
  • 168 seconds just to delete a few thousand files
  • over 10 minutes to install a few hundred RPMs seems crazy
  • 90 seconds to rebuild a few indexes seems like a huge amount of time

Some notable offenders:

Package                     Time to install (s)
selinux-policy-targeted     122
kernel-devel                 25
libreoffice-core             21
selinux-policy               17
hugin                        12

Package                     Time to cleanup (s)
gramps                       11
wireshark-gnome               8
hugin                         7
meld                          6
control-center                5

Hopefully Fedora 21 will move to the hawkey backend, which should get us closer to raw librpm speed (which seems to be quite a speed boost), but even that is too slow. I’ll be looking into the individual packages this week, trying to find out what makes them so slow and what we can do to speed things up.

12 responses to “Offline Updates in Fedora 20”

  1. Tom Hughes

    Well selinux and the kernel are always far and away the slowest packages to update. In the case of selinux because of the time taken to build and load the new policy into the kernel and in the case of the kernel because of the time taken to rebuild the initrd.

    Whilst I’d welcome speeding those updates up (though I suspect it will be hard) so that my yum updates go through quickly, it won’t really help the people that fall for your offline updates silliness, because the main slowdown for them won’t be the time to run the update, it will be the half hour spent recreating their desktop setup after each reboot.

  2. Emmanuele

    @Tom Hughes: your offline updates silliness — nice way to insult people and their work. good job.

    1. Tom Hughes

      I wasn’t intending to insult anybody; I just happen to think that offline updates are a massive step backwards. We’ve spent years laughing at Windows for making people reboot all the time and now we’re going to tell Linux users they need to do the same?

      The simple fact is, that for all the talk of it not being possible to update a running system and of all the things that can go wrong, real life experience says that it simply doesn’t happen very often. I run “yum update” every day and only very rarely will an application start behaving oddly as a result and then I just restart that application and the problem is solved.

      1. Emmanuele

        we’ve spent years laughing at Windows for making people reboot all the time and now we’re going to tell Linux users they need to do the same?

        dunno who you’re talking about: I never did laugh at that. people usually point out that updates requiring 10+ reboots, rather than one, are ridiculous on a newly installed system. also, you’d be hard-pressed to “laugh” if you spent a small amount of time researching the problem space, like Richard has done.

        if you notice, everyone (desktop OSes, mobile OSes) does offline updates for the system. it’s the only way to ensure a proper update. on Linux we’re just reaping what we have sown in the past, with our apps based on plugins and our libraries dispersed over the entire file system and loaded by a bajillion applications. in short: distributions made a mess of this bed, and now we get to live in it. offline updates are the least problematic solution for this, to avoid inconsistent states.

        again, reading up on the problem space would help in understanding the issue.

      2. bochecha

        > we’ve spent years laughing at Windows for making people reboot all the time

        Maybe you shouldn’t have done that, then. ;)

        Really, “we’ve always said A” is a poor excuse for keeping the status quo. People can change their opinion, and realize that something they were proud of might have been a bad idea all along.

        > I run “yum update” every day and only very rarely will an application start behaving oddly as a result

        Me too. And it’s perfectly fine for you and me.

        But when it happens to her, my mum instantly fires a screenshot and sends that to me in an email.

        It doesn’t matter how often I tell her “oh, it’s ok after an update, just restart the application”, she just doesn’t get it. And she’s right not to get it. It doesn’t make any sense that an application starts working weirdly after an update (except for regressions of course, but these are different), unless you know how things work.

        And if the app crashes after the update, you might lose some work.

        > and then I just restart that application and the problem is solved.

        It’s just a terrible user experience. It’s part of why Linux and Free Software are perceived as being “just for geeky hobbyists”. And it leads people to delay their updates because “my application might start behaving weirdly or crash if I update now”.

  3. kparal

    One of the major problems I notice with RPM is that it is only single-threaded. In the cleanup phase, the only task is to remove some folders and files, I believe. Still, it’s very slow (much slower than rm) and I see 100% CPU utilization during the whole period. I have an SSD, so being CPU-bound is clearly visible in my case. Still, I believe that the CPU can be a bottleneck even on HDD-based systems.

    Btw, my system is CPU-bound even for the install phase, though not as much as in cleanup. Giving RPM multi-threading would yield a large speed boost.

    Having said that, I’m not sold on the offline-updates idea. I know that it’s supposed to be safer (for common users only; experienced users will be safer when they see an error and can fix it in realtime). But this is overkill, many of the updates can be performed online and are perfectly safe. The best approach here could be to redesign our packaging system and start separating safe and unsafe updates (or even, more granularly, tasks inside a single update) and perform only the unsafe ones offline. Unfortunately, that would be quite a change and I don’t expect it to happen.

  4. Adam Williamson

    What would probably be easier to implement and would help a lot in the short term is simply better progress indication for the offline update – you kinda get a bit if you hit Esc, but if you don’t, all you get is a normal ‘Fedora booting’ screen with no information that an update is even happening.

  5. Alexander Larsson

    I agree with Adam: the main problem is that it just says “Fedora booting” during the full update cycle. As soon as the actual update takes a non-negligible time that is just plain wrong, and it should at least change to “applying updates” or something; otherwise it’s very unclear that it’s doing something and not just hanging during boot.

  6. Alexander E. Patrakov

    I have already tried to suggest this idea, and will do this once more.

    Use snapshots (no matter LVM or btrfs) to avoid modifying the live system, and do only the required stuff on reboot. Of course, this would only work if /usr is on its own LVM or btrfs volume. This is to avoid the situation when admin’s changes to configuration files are lost – the idea is that /usr has no configuration files.

    So here we go.

    1. Detect that updates are available. We already do that before the reboot.
    2. Solve dependencies. We do that already in order to know what to download.
    3. Download packages. We already do that before the reboot.
    4. Check signatures.
    5. Make a new snapshot of /usr – should be fast
    6. Install new packages into that snapshot, while the main system is up and running. Store postinst scripts somewhere in /usr instead of running them. This should hide 704 seconds.
    7. Remove old unwanted packages from the snapshot, while the main system is up and running. This should hide 168 seconds.
    8. Reboot into the new snapshot at the user’s pleasure.
    9. Run previously stored post-install scripts – 90 seconds.
    10. Erase them – should be fast and can be done in the background.

    With the above scheme, the downtime would be only 90 seconds (strictly required for post-install scripts) instead of 988 – i.e. a possibility of 10x improvement even without switching to librpm.

    Of course, the above scheme would require moving the package database from /var into /usr, because it has to stay consistent with the contents of /usr even if the update is rolled back.
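
As a rough check of the arithmetic in the snapshot proposal above, the Python sketch below compares the measured offline run with the proposed downtime. It assumes, as the comment’s 90-second figure implies, that only the post-install scripts would still have to run at reboot; nothing here measures a real snapshot-based update.

```python
# Phase timings in seconds, copied from the table in the post.
PHASES = {
    "Start up PackageKit": 0.3,
    "Starting up yum": 3,
    "Depsolving": 10,
    "Signature Check": 8,
    "Test Commit": 5,
    "Install new packages": 704,
    "Remove old packages": 168,
    "Run post-install scripts": 90,
}

# Assumption taken from the proposal: everything except the post-install
# scripts is done against the snapshot while the system is still running.
AT_REBOOT = {"Run post-install scripts"}

current_downtime = sum(PHASES.values())
proposed_downtime = sum(secs for name, secs in PHASES.items() if name in AT_REBOOT)
speedup = current_downtime / proposed_downtime

print(f"Current downtime:  {current_downtime:.1f} s")
print(f"Proposed downtime: {proposed_downtime:.1f} s")
print(f"Improvement:       {speedup:.1f}x")
```

This only moves work to before the reboot; the total time spent installing and removing packages is unchanged.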

  7. Sparks

    It would be nice to be able to turn off this functionality easily. Just the other day I was on a very-low bandwidth connection trying to transmit an email. The connection to my SMTP server was timing out and I was having problems getting the message sent. I pulled up Wireshark and noticed that most of my bandwidth was being hogged by a PackageKit instance that was pulling down updates even though I hadn’t asked it to. Now normally this wouldn’t have been a problem as I’m usually on a much higher bandwidth connection but this functionality was causing me big troubles.

    I killed the offending process only to have it regenerate. I tried to find some information on this “feature” but, again, I didn’t have the bandwidth. I finally found a config file that I was able to change which seemed to fix the problem but it took a while.

    I don’t think the feature is bad but it needs to be easily toggled on and off from the GUI. Personally I’ll most likely leave it off on both of my laptops because it’s just not worth it to me to hog my bandwidth when I need it for other things.

  8. Dave Airlie

    update-mime-cache seems to be eating a lot of time here on f20 updates; it seems to run multiple times, then once more at the end.

    Also, regarding the comment about the kernel: afaik we don’t do the initrd update now until after all the packages are installed, so it won’t show up in the timing for the kernel package.
