Testing the hawkey backend in Fedora 20

The grand plan is that Fedora is replacing yum with dnf in Fedora 21/22. For a few technical reasons PackageKit isn’t going to be using the python DNF layer, but instead using the main two libraries that DNF is build upon directly, namely hawkey (which in turn uses libsolv) and librepo.

I’ve been working with the hawkey and librepo developers on-and-off for a few months now, and we’ve now got a “hawkey” backend in PackageKit which I’ve been stress-testing every day for the last week or so. Today I released PackageKit 0.8.13 with all the fixes in the hawkey backend that make it, well, actually work correctly.

If you’d like to test out the backend, the procedure is pretty simple. Either wait for PackageKit-0.8.13-1.fc20 to hit updates-testing or manually download all the packages. Make sure you’ve updated to 0.8.13-1, and then install the PackageKit-hawkey subpackage and then remove the PackageKit-yum subpackage. If you don’t know how to do this you probably should stick to the tried and tested yum backend for now :)

Reboot, and then pkcon backend-details should tell you that you’re indeed running with the hawkey backend. The first transaction will take a little time as all the metadata will be downloaded and built into a .solv file, but after that it should be fine. From there, test offline updates, gnome-software and all the new stuff and file bugs with a way to reproduce and a backtrace if anything fails (and grab me on IRC if you can). Known issues is that installing and removing groups is not implemented, but that should only affect the old gpk-application application.

And the most important question… Is hawkey faster than yum? I’ll have to let the early adopters be the judge of that. :)

Offline Updates Performance Notes

So, after my epic 20+ minute offline update of 245 packages, last night I decided to look at some profiling numbers. All my testing was done using git master PackageKit (for the new strace support) on an otherwise unmodified Fedora 20 of a snapshot from last week. For the strace I chose to update two packages, otherwise the strace -tt output went maaaasive. Some salient points:

  • yum opens and closes the rpmdb 6692 times (that’s about 6690 more than it needs to) –we’re investigating why
  • fdatasync and fsync are killing us:

 

duration(ms) system call
805.749 fdatasync(17)
752.828 fsync(27)
658.659 fdatasync(9)
614.367 fdatasync(15)
598.182 fdatasync(33)
535.642 wait4(903, [{WIFEXITED(s) && WEXITSTATUS(s) =
423.247 wait4(911, [{WIFEXITED(s) && WEXITSTATUS(s) =
368.85 fsync(22)
309.556 stat(“/var/lib/yum/yumdb/g/gvfs-fuse-1.18.3-1.fc20-x86_64/checksum_type”
217.877 fdatasync(18)
179.002 close(23)

The full strace log is here (warning, huge) if you’re interested. I’ve got some other work to be doing today, but I’ll continue to work on this at the weekend.

Offline Updates in Fedora 20

In GNOME 3.10 we’re encouraging more people to use the offline-update functionality which we’ve been using in Fedora for a little while now. A couple of people have told me it’s really slow, but I hadn’t seen an offline update take more than a minute or so as I test updates all the time. To reproduce this, I spun up a seldom-used Fedora 20 alpha image and let GNOME download and prepare all the updates in the background. I then added some profiling code to the pk-offline-update binary, and rebooted. The offline update took almost 17 minutes to run.

So, what was it doing all that time, considering that we’ve already downloaded the packages and depsolved the transaction:

Transaction Phase Time (s)
Start up PackageKit 0.3
Starting up yum 3
Depsolving 10
Signature Check 8
Test Commit 5
Install new packages 704
Remove old packages 168
Run post-install scripts 90

This is about an order of magnitude slower than what I expected. Some of my observations:

  • 10 seconds to depsolve an already depsolved transaction
  • 8 seconds to check a few hundred signatures
  • 168 seconds just to delete a few thousand files
  • over 10 minutes to install a few hundred RPMs seems crazy
  • 90 seconds to rebuild a few indexes seems like a huge amount of time

Some notable offenders:

Package Time to install (s)
selinux-policy-targeted 122
kernel-devel 25
libreoffice-core 21
selinux-policy 17
hugin 12

 

Package Time to cleanup (s)
gramps 11
wireshark-gnome 8
hugin 7
meld 6
control-center 5

Hopefully Fedora 21 will move to the hawkey backend, and we can get closer to raw librpm speed (which seems to be quite a speed boost) but even that is too slow. I’ll be looking into the individual packages this week, and trying to find what makes them so slow, and what we can do about them to speed things up.

Upstream adoption of AppData so far

By popular request, some update on the upstream adoption of AppData so far:

Applications in Fedora with long descriptions: 168 (9%)
Applications in Fedora with screenshots: 140 (7%)
Applications in GNOME with AppData: 60 (50%)
Applications in KDE with AppData: 1 (1%)
Applications in XFCE with AppData: 0 (0%)

You can look at a few ways:

  • We’ve made significant progress in the last year-or-so and many popular applications are already shipping the extra data.
  • There are a lot of situations where the upstream authors do not know what an AppData file is, don’t have time to add one, or simply do not care.
  • GNOME is clearly ahead of KDE and XFCE, probably because of the existing GNOME Goal and my nag emails to the desktop-devel mailing list. A little thing to bear in mind is that Apper (the KDE application installer) can also make use of the AppStream data, so this is a little disappointing for KDE users who probably don’t see any difference at the moment.

So where do we go from here? Clearly KDE and XFCE have some catching up to do, and I need someone familiar with those communities to lead this effort. There is also a huge number of upstreams that need a little push in the right direction, and I’ve been trying to do that for the last couple of months. Without help, this would be a never-ending battle for me. A little reminder: In GNOME 3.12 we are penalising applications that don’t ship AppData by including them lower in the search results, and in GNOME 3.14 we’re not going to be showing them at all.

If you’re interested to see all the applications shown by default in Fedora 20, I’ve put together this page showing a quick overview. If you see anything there that shouldn’t be an application and needs blacklisting, just let me know. If you see an application you care about without a long description or screenshots, then please file a bug upstream pointing them at the AppData specification page. Thanks.