Make yum faster

Pleeeaaaaase… It’s not acceptable, really…

Note: This is a rant…

51 Responses to Make yum faster

  1. nacho says:

    +1 from me

    • Jesse van den Kieboom says:

      That seems interesting, maybe I will try it. I never use the graphical tool for installing packages though…

  2. Evan says:

    I agree, yum could be faster. I just switched to Fedora after many years of Debian, and it feels like rpm is faster than dpkg, but aptitude is faster than yum.

    The fastestmirror plugin helps.

  3. Jef Spaleta says:

    I’ve been doing a lot of private benchmarking with yum and zif and I don’t understand this comment at all. I’ve done repeated tests for depsolving, listing, searching and removals. Once you factor out network activity, which is outside the control of yum, everything that is happening on disk with regard to transaction setup, depsolving or repodata querying seems reasonably fast to me.

    The _only_ situation I’ve seen where yum comes up slow is if you have been manipulating the rpmdb outside of yum prior to a yum transaction. Yum notices this and this condition changes how yum operates, basically interacting with the rpmdb more. I don’t see how this can be avoided and still ensure correct transaction operation.

    If you have repeatable tests that you want me to try to reproduce, I’m more than happy to do so on my systems. But rants like this don’t help identify problems. I’m telling you man, I’ve looked really hard at this in the last few weeks and I’m not seeing anything egregious.

    -jef

    • Jesse van den Kieboom says:

      @Jef: The problem is as Bjoern states. yum notoriously updates its cache db (even with yum -C somehow it decides to retrieve updates). It also doesn’t seem to download these updates very efficiently. I am absolutely positive that this process can be much faster (e.g. look at apt). This is a rant in the sense that I know if you want to see something fixed, the best way is to try and fix it, but I’m not in the mood to try and ‘fix’ yum :) However, the issue is still there, and I’m really not the only one having issues with it.

  4. NickG says:

    Rumour has it that Fedora 17 will have Zif by default for GUI stuff but YUM will be provided for the CLI.

    If there’s an effort to replace YUM I will support it happily with both time and cash.

  5. Bjoern says:

    Zif sounds interesting but I don’t think it targets the right problem regarding the performance of yum. Of course more layers decrease performance. But in my experience the real problem which makes yum so much slower than apt is the fact that every yum command updates the complete package list, while apt keeps the package list in a database and only updates it if the user asks for it (apt-get update).

  6. Milan Bouchet-Valat says:

    Actually, the advantage of apt-get/apt-cache is that it doesn’t download data at all on start. You need to call apt-get update if you want to get fresh information from the net. I think it makes sense, since when you run ‘yum info’ or ‘apt-cache show’, you certainly don’t want to wait one minute downloading information; usually, you don’t need it to be absolutely up-to-date.

  7. Jef Spaleta says:

    Jesse,

    I question your notorious comment. If you want to change how yum caches so it doesn’t have to go back out and pull additional repo data for some transactions, you can do that. By default yum is configured to just pull the primary repodata file. This is not enough for all possible depsolving transactions. If you want yum to cache the full repodata payload from a repository, configure yum to do that.

    The mdpolicy option in newer versions of yum will let you define what your local policy is in regard to metadata pulling. Instead of group:primary use group:all and you’ll cache all the metadata for each repository.

    -jef

  8. Jef Spaleta says:

    Milan,
    metadata_expire option in yum.conf is your friend.

    set it to never

    yum makecache

    enjoy

    -jef
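
The two settings from comments 7 and 8 both belong in the [main] section of yum.conf. As a minimal sketch rather than an official recipe, the Python below renders such a fragment. The helper name `render_yum_main` and the `APT_LIKE` dictionary are made up for illustration, and as Jef notes, mdpolicy needs a newer yum.

```python
import configparser
import io


def render_yum_main(options):
    """Render a [main] section for yum.conf from a dict of options."""
    parser = configparser.ConfigParser()
    parser["main"] = options
    buf = io.StringIO()
    # yum.conf conventionally uses key=value without spaces around "=".
    parser.write(buf, space_around_delimiters=False)
    return buf.getvalue()


# Values taken from the comments above; whether they suit a given
# system is a judgment call, not a recommendation.
APT_LIKE = {
    "metadata_expire": "never",  # never treat cached metadata as expired
    "mdpolicy": "group:all",     # cache all repodata, not just primary
}

fragment = render_yum_main(APT_LIKE)
print(fragment)
```

The printed lines would go into the existing [main] section of /etc/yum.conf, followed by a single yum makecache to populate the cache.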

  9. Jeff Diddley says:

    I only use yum from the command line, but I’ve discovered something really cool that makes it seem to go a lot faster. It turns out, you don’t have to just sit and stare at the terminal for yum to work. You can switch to another screen / window and do work there until yum completes the requested transaction.

    This trick has literally saved me many multiples of minutes over the past six months.

  10. Bjoern says:

    @Jef: That’s really great! Thanks a lot! I never heard about this option but I just tried it and it makes yum so much faster.

  11. Alexandre Rosenfeld says:

    For some reason disabling the yum expire option wasn’t as good as apt is with apt-get update. Zif looks interesting though, since one of its goals is “working with offline copies of metadata whenever possible”, which seems to me to be what apt does.

  12. jeff says:

    From what I understand of Jef Spaleta’s suggestions combined with the comments in /etc/yum.conf about the metadata_expire property, it seems that the way to make yum/zif much faster would simply be to run gpk-prefs and set the “Check for updates: [ Hourly ]” option…

    Can someone confirm this?

    Also, why doesn’t yum use binary diffs for repo metadata just like it does (through presto) for the actual package updates?

  13. ac says:

    I’m wondering if they do profiling on yum each time they change yum’s code.

    Profile and optimize please.

    Yum is slow no matter how you apologize for it. And yum is one of the primary tools used by developers and beta/alpha testers.

    So by keeping yum slow, you are losing developers and testers on that distribution.

  14. Milan Bouchet-Valat says:

    Jef: Why isn’t it the default? You can’t expect every Fedora user to find out about this… and they’ll rant about yum being slow! ;-)

    AFAICT, an outdated cache is really a problem only when installing packages, so yum could automatically update the cache when installing, but keep it outdated when calling e.g. show.

  15. Stijn says:

    I never understood why people complain about a *package manager* being slow. What are you doing, installing and deinstalling packages all day?

    Other than that, as Jef says, simply make yum behave like apt does and the “problem” goes away.

    Personally, I like yum’s default behaviour so much more than having to remember to do a manual ‘update’ of something. I prefer spending my typing time on intelligent stuff, not out-smarting the computer on when to cache something.

  16. Bjoern says:

    @Milan I think even for installing new packages you don’t need to update the cache every time. In a first step yum could try to install a package based on the information provided by the cache. If this fails, e.g. because in the meantime package foo3.1 was updated to foo3.2, yum could update the cache and try again. This would be even better than apt, which just stops with an error message in such a case.
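
Bjoern's "try the cache first, refresh only on failure" idea can be sketched in a few lines of Python. This is not how yum works. `PackageGone`, `install_cache_first` and the three callbacks are hypothetical names, and the in-memory `cache` and `mirror` dictionaries stand in for real repository metadata.

```python
class PackageGone(Exception):
    """The cached metadata names a version the mirror no longer carries."""


def install_cache_first(name, resolve, fetch, refresh):
    """Install from cached metadata; refresh and retry once if it is stale.

    resolve(name)        -> version according to the local cache
    fetch(name, version) -> downloads the package or raises PackageGone
    refresh()            -> re-downloads the metadata
    Returns (installed_version, cache_was_refreshed).
    """
    version = resolve(name)
    try:
        fetch(name, version)
        return version, False
    except PackageGone:
        # Only now pay the network cost of a metadata update.
        refresh()
        version = resolve(name)
        fetch(name, version)
        return version, True


# Toy data: the cache still lists foo 3.1, the mirror only has foo 3.2.
cache = {"foo": "3.1"}
mirror = {"foo": "3.2"}


def resolve(name):
    return cache[name]


def fetch(name, version):
    if mirror.get(name) != version:
        raise PackageGone(f"{name}-{version}")


def refresh():
    cache.update(mirror)


result = install_cache_first("foo", resolve, fetch, refresh)
print(result)
```

In the common case where the cache is still valid, no refresh happens at all, which is the behaviour Bjoern is asking for.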

  17. Benjamin Otte says:

    Jef, the problem is not about how yum does the things it does, but that it does the things it does. If I tell yum I want some information and yum decides it’s first going to ignore me while it downloads packages, then it has picked the absolute worst possible moment to do that. It can download new packages in a cron job, after it has answered me, or whenever, but not when I actually need it.

    It’s exactly as annoying as Windows Update that decides to install updates exactly when I want to shut down my computer. Or alternatively, when I just turn it on and want to do stuff with it.

    Or imagine locate updating its database every time you invoke it so it returns the most accurate results. Worst. Possible. Moment.

  18. Jef Spaleta says:

    @Milan
    Because stale mirrors SUCK. They SUCK a lot. Fedora repositories see much more churn than Debian, by the nature of the development model used for the release. The Fedora updates repository sees continual package additions, sometimes daily.
    Are the default metadata cache settings optimal? Probably not. Is metadata expiration set to never optimal? Absolutely NOT…not for Fedora. If you want to have a data-based discussion about what a more optimal default is, I’m willing to be a part of it. If you show up recommending that we flip it to never expire by default, I’m going to march in and stomp all over you. Fair warning.

    Benjamin Otte,

    Ignore you? I don’t understand what you are talking about.

    Everyone,
    I’ll point out that everyone here who has a complaint has so far FAILED to provide me with a test case that I can run for myself to first verify I’m seeing similar behavior. I’m going to repeat what I said. I’ve spent a lot of time benchmarking yum and zif in the last few weeks (at the request of others). If there is a usage scenario that is problematic that I’m not aware of, I want to know about it in enough detail to put it into my benchmarking scripts.

    That being said, I think a lot of the griping simply comes down to “I’m used to apt, and yum by default is different from my expectations.” Now I’m more than happy to help you configure yum to work in line with your expectations, but you MUST do a better job of explaining them to me. I am not a day-to-day apt user so I don’t have your pre-existing baggage when it comes to operation. Tell me what you want yum to be doing in clear English, or give me an example situation (complete with your entire set of repository configurations and yum.conf) that I can repeat for myself, and I can probably point you to a set of settings that work adequately for you.

    -jef

  19. Milan Bouchet-Valat says:

    Jef: No need for benchmarking, yum is quite fast (I’d say faster than apt-get) when the cache has been downloaded.

    Here are two clear use cases that could be improved:
    1) I want to get information about an installed package (say the version, to report a bug somewhere). I type ‘yum info mypackage’, and wait for two minutes for the cache to get updated to get information about something local.
    2) I want to know what package provides a given file. Again, I don’t care about updated packages, since this doesn’t change everyday. The cache is enough.

    That’s why I suggested only yum install directly checks for updates. Other operations should ideally return local results, and only if needed update the cache. If you cannot know whether it’s needed or not, update the cache after showing cached results.

    I think that’s not only a technical issue linked to the package turnover in the repos, but also a matter of habits: in Debian, you know apt-cache and apt-get don’t use up-to-date information if you didn’t run apt-get update. yum could work the same, provided it’s smart enough to think: oh, this package no longer exists in the repo, I really need to update the cache now. But you need to accept the idea that the cache is not guaranteed to be up-to-date all the time.
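
Milan's proposal amounts to a small policy function: read-only queries always answer from the cache, and only operations that resolve dependencies honour the expiry clock. The sketch below encodes that proposal, not yum's actual behaviour. `READ_ONLY`, `should_refresh` and the use of `None` for "never expire" are assumptions made for illustration.

```python
from datetime import timedelta

# Commands that only report information (an illustrative, incomplete set).
READ_ONLY = {"info", "list", "search", "provides"}


def should_refresh(command, cache_age, expire):
    """Decide whether to download metadata before running `command`.

    cache_age: timedelta since the cache was last updated
    expire:    timedelta after which the cache counts as stale,
               or None for "never expire"
    """
    if command in READ_ONLY:
        return False  # answer from the cache, however old it is
    if expire is None:
        return False  # the user asked for a never-expiring cache
    return cache_age > expire


three_days = timedelta(days=3)
six_hours = timedelta(hours=6)
for cmd in ("info", "provides", "install"):
    print(cmd, should_refresh(cmd, three_days, six_hours))
```

With a three-day-old cache and a six-hour expiry, only the install triggers a download under this policy.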

  20. afanen01 says:

    As a reasonably sophisticated user, and a recent convert from Ubuntu, I have to chime in with the rest that Yum is annoyingly slow.

    I agree with Otte that it picks the worst possible moments to do updates, especially when I don’t need it to.

    Yes, I install packages almost every day, because I basically get a fresh install every 6 months, and I don’t re-install every single program I use back on the first day; I install them as the need for them arises.

    So, yea, as a typical user, Yum is slow, and given my track record of bugzilla reports, I stand a better chance of finding another distro (planning on testing out OpenSuse) than getting yum fixed.

    The experience is tolerable if you happen to be on a superfast link with latency under 100ms, which is the case for most of you guys in the West, but it renders Fedora a non-starter almost everywhere else.

    I am fortunate enough to be where I have a fast network connection right now, and for the next few months, but once I go back home, I’m probably going to switch to debian for the simple fact that doing a “yum search” doesn’t need to download megabytes off a crappy 3G connection. Ubuntu was OK in this sense because it works quite decently offline, and seems to do a reasonably decent job on crappy connections.

  21. afanen01 says:

    I should mention that I applaud the RPMDelta stuff for upgrades. It ranks up there with sliced bread and toast.

    Also, I discovered zif by reading the comments on this blog post, I’ve installed and am using it, but it seems to be less smart at guessing the correct package to install from incomplete package names (something I quite admire in yum). Also, it floods the terminal with messages… changing this may be a task accessible to my feeble skills even :p

    You should really consider giving yum more balanced (subject to definition of course) defaults. The fact that its behaviour is configurable doesn’t make its defaults acceptable.

  22. Jef Spaleta says:

    Milan,
    First I’m going to humbly suggest that neither of these two situations is in fact a common day-to-day operation for _users_. For testers and packagers (like myself and other contributors) these are quite common day-to-day tasks. The default yum.conf is meant to prevent everyday users from getting into trouble with stale repository information, which would lead them into situations where they are not getting the correct information about security and bugfix updates.

    Now that being said. Both of these situations can be handled.

    situation 1:
    first ensure you have the repomd policy configured to pull _all_ repo data on makecache.

    yum makecache to pull _all_ repodata to local cache.
    sometime later…..
    yum -C info installed
    full listing of everything that is installed from cache even if cache is expired. If you want to make sure that cache

    yum -C info installed packagename
    will only give you information for the _installed_ version of packagename

    yum -C info available packagename
    will only give you information for the _available_ version of packagename

    Situation two:
    first ensure you have the repomd policy configured to pull _all_ repo data on makecache. The default policy does not pull the provides repodata information.
    sometime later…
    yum -C provides whatever
    So if you haven’t updated your yum.conf to tell it to pull all repodata, then yes.. yum -C provides whatever… will pull some repodata because some repodata is _missing_ in the local cache. Commonly the filelist repodata. Yum provides matches both the filelist provides and the virtual provides (rpm -q -l and rpm -q --provides)… because items in filelist are valid rpm dependencies even if they are not listed in rpm -q --provides. I believe this is a key difference between dpkg and rpm (I could be wrong about that).

    And for both situations, if you don’t care about the repository metadata… just use rpm. rpm -qi packagename and rpm -q --provides packagename. rpm works strictly on the _installed_ rpmdb..no repodata…no cache.

    -jef
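
For anyone scripting the queries from comment 22, here is a small sketch that builds the command lines Jef lists and runs them only if the tool is installed. The flags are the ones quoted in the comment. The function names, the two lookup tables and the example targets (`bash`, `/bin/bash`) are illustrative assumptions.

```python
import shutil
import subprocess

# rpm reads only the installed rpmdb; yum -C reads the local metadata cache.
RPM_FLAGS = {"info": ["-qi"], "provides": ["-q", "--provides"]}
YUM_SUBCOMMANDS = {"info": ["info", "installed"], "provides": ["provides"]}


def local_query_argv(tool, kind, target):
    """Build a query command that should not need fresh repository metadata.

    For rpm, `target` is an installed package name. For yum provides it is
    the file or capability to look up.
    """
    if tool == "rpm":
        return ["rpm", *RPM_FLAGS[kind], target]
    if tool == "yum":
        return ["yum", "-C", *YUM_SUBCOMMANDS[kind], target]
    raise ValueError(f"unknown tool: {tool}")


def run_if_available(argv):
    """Run the command and return its stdout, or None if the tool is absent."""
    if shutil.which(argv[0]) is None:
        return None
    done = subprocess.run(argv, capture_output=True, text=True, check=False)
    return done.stdout


for argv in (
    local_query_argv("rpm", "info", "bash"),
    local_query_argv("yum", "provides", "/bin/bash"),
):
    print(" ".join(argv))
```

The loop only prints the commands. Passing one of the lists to `run_if_available` executes it on a machine that has rpm or yum.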

  23. Jef Spaleta says:

    afanen01,
    I cannot recommend zif to anyone at this time based on my benchmarking.

    I’m still waiting for a data centric view of what a more optimal default configuration would be that does not leave normal users with stale cache on a day to day basis given the rate of Fedora update repository churn.

    Everyone coming to Fedora from one of the Debian distros needs to take a step back and understand one significant point. Fedora as a distribution produces an order of magnitude or greater amount of repository updates for end-users in the updates and updates-testing repositories which supplement each Fedora release. This is significantly different from Ubuntu or Debian, which both have extremely conservative update policies in comparison. The yum configs in Fedora are decided in large part to provide a compromise that minimizes the amount of repository metadata churn for normal “updating”. This means that only a subset of the repodata is pulled by default and the metadata is expired on a timescale which tries to anticipate the rate of repository churn in the Fedora updates. If we are pushing updates on a daily basis…then yes..we want users to pull metadata often to avoid having stale metadata. The yum default policy for metadata expiration is in line with the distribution-level update policy as it exists.

    You can not take the operational expectations from a distro with a conservative update policy over to a distro with a liberal update policy. The distro level policy directly impacts what the optimal solution is going to look like. A never expiring metadata cache will not work for Fedora and will leave end-users entirely high and dry when it comes to important bug and security fixes.

    -jef

  24. afanen01 says:

    Jef,

    I think the never expiring cache is a good idea and its implications can be managed. When a user attempts to install a package that no longer exists because of churn, then yum can suggest or perform an update. This is what happens on Debian/Ubuntu, and I’ve experienced it a number of times. Apt is not smart enough to know why the install failed, but I have experienced it enough times to know that an apt-get update fixes it, if it isn’t a genuine network access issue.

    It may also be possible to ship with some cronjob perhaps that updates the cache silently, dependent on network state. This way you can give the user both worlds. A cache that stays reasonably up to date, and a yum that behaves acceptably, especially in offline conditions.

  25. Jef Spaleta says:

    afanen01,
    There is an _optional_ cronjob for yum; it’s packaged in the yum-cron package. Feel free to install it and see if its default behavior is reasonable to you. If it is not, its operation can be reconfigured via its configuration file in /etc/sysconfig/

    -jef

  26. Milan Bouchet-Valat says:

    I’m not sure the argument that Fedora packages change fast really holds. For Debian stable and Ubuntu releases, OK. But many Debian users are constantly on testing, which changes very fast, and Ubuntu has development versions too…

    Anyway, you made clear plenty of options are available to avoid updating the cache, but I think the request is that the *defaults* handle better that problem, because that’s what most people use[1]. Maybe you’re right that yum cannot handle correctly the case where an operation would fail because of an outdated cache, but I still think this makes a very poor user experience because of that, even on fast network connections.

    Note I’m not saying this to criticize yum, as I find it potentially better than apt-get if only this problem was mitigated. But I know I’m not going to convince you now… ;-)

    1: These “people” are of course developers, but also casual users who use yum via the PackageKit GUI, and suffer from the cache update when they just want to find an application, which is instant e.g. in Ubuntu Software center.

  27. afanen01 says:

    Jef,

    Thanks for the suggestions. I’ve installed and configured yum-cron (I assume I don’t need to manually create the cronjob) to “check and download only”. I’ll see how it works over the next few days and maybe write a blog post to encourage people to consider this route if they don’t like the default behaviour of yum.

  28. Jef Spaleta says:

    Milan,

    There is no reason why PK couldn’t issue a makecache at login and update the cache, resetting the expire clock and hiding the network cost of pulling metadata. But that’s a PK policy decision, not a yum policy decision.

    The default yum policy expire clock is a _fallback_. The default expiration clock setting was originally designed with the idea of yum-updatesd running on a periodic basis in the background (which was in fact exactly how things used to be set up prior to PK).
    No idea what I’m talking about? Read the yum.conf manpage and yum info yum-updatesd

    Does the default expiration clock need to be adjusted now that yum-updatesd is not expected to be run in a default configuration on Fedora? Maybe. Is the more optimal clock value never? Abso-freaking-lutely not. I’ll reaffirm that there is room to have a discussion about what a more optimal default config is, but “never” is a non-starter.

    -jef

  29. Milan Bouchet-Valat says:

    What would make sense is to run yum-updatesd *by default*, adjusting the period so that it corresponds to the reasonable delay PackageKit will ask users to install updates. Then, the cache expiration should be set to match this period. Or have the PackageKit daemon update the cache itself, but with the same period. Jef, what do you think of this?

  30. Jef Spaleta says:

    I think updatesd is a historic non-starter. I don’t think it’s going to be part of a default configuration ever again.

    -jef

  31. Benjamin Otte says:

    Seriously, I have to edit configuration files or pass obscure command-line arguments to make yum not suck?

    Maybe it’s the fact that I’m using GNOME but this is a non-starter for me.

    In the default configuration yum is an application that takes 20+ seconds to inform me about things that apt-cache doesn’t take a second to. That means yum is 2000+% slower. And such a factor means that it must suck a lot. I don’t care why. It’s not my job to figure out why yum developers can’t manage to make stuff fast.

  32. Jef Spaleta says:

    Benjamin,

    I’m trying to keep this discussion civil. But if you insist on using emotive language, I can oblige, but I’m pretty sure you won’t like that as I probably have a much wider vocabulary of pejoratives than you do. You should look into expanding yours, especially if you’re a big “words with friends” player. Anyways….

    If you want to say something sucks, please make sure you cast the blame at the appropriate place. The problem you are having with yum and metadata caching/expiration is a configuration issue entirely inside the control of the specific distribution which is making use of yum as a tool…not yum itself. Yum works, it works great, it works fast, even for people expecting apt-like metadata pulling ONCE IT IS CONFIGURED TO MEET THEIR EXPECTATIONS.

    So if anything sucks it’s Fedora’s default yum policy..not yum itself. If you can’t understand that then I don’t really know what else to tell you.
    I’ve tried very hard in this comment thread to show you that yum can be configured to cache metadata in a way more likely to do what you expect. I’ve so far also gone out of my way not to put a value judgement on your expectations. I’ve also tried very hard to explain why “never” expiration is a non-starter for Fedora. If you want to suggest something in between the current default and “never” then I’m willing to facilitate that discussion for a more optimal default inside the Fedora distribution release management process with people who have the authority to make the changes. But I’m not going to do that if the level of discussion continues to be stuck at the “this sucks” level. Good day.

    -jef

  33. Alexandre Rosenfeld says:

    I understand the argument that we are biased towards the apt behaviour and so our expectations are not met by yum. So I’m going to talk about use cases, which might be biased towards my expectation as a power user, but it’s the best I can do.
    When I use a Mac and go to the Software Update, it takes a long time to open the list of updates, which as a user, looks a lot like what PackageKit does when I try to update my Fedora system. And it’s ok, I’ve seen users complain about this behaviour, but it’s not something we do often. If you tell me that the update list could show incorrect data if it wasn’t downloading the list of updates, I can agree that I prefer a delay to having the wrong information.
    But then I can go to the Mac App Store and while it’s slow sometimes, I can usually query the store fast enough and I can install new software without waiting for it to load anything. But if I open PackageKit and I try to query the list of software available I have to wait for it to do some kind of loading, which technically is it updating the information of available software. The case of installing software, however, is not so rare and it can be quite annoying to wait for this update. And as a user, I don’t care if the software I want to install has a minor update which is not showing for me, I just want to see if it’s what I want and install it.
    So, for a regular user, PackageKit, yum, or whatever is responsible for installing software, it sucks. And the regular user doesn’t care why.
    As a power user, I have somewhat more expectations of how I want things to work and I agree that expecting yum to work as apt is not right. But most of these frustrations of the regular user bite us when we try to use yum, and we too are users, so we too get annoyed. Telling us that it’s just our configuration that is wrong is just as frustrating.
    As a personal case, Fedora was impossible for me to use when I had a slow connection that was dropping sometimes, because yum always took so much time to do simple things (when it takes half an hour to update the yum cache and it does so every day, I was constantly losing track of what I wanted to do in the first place). But Ubuntu was fine because of the “apt-get update” behaviour. And no matter how much I tried to change the yum configuration, it was simply too frustrating.

  34. afanen01 says:

    Jef,

    I think that in making the choices for the default configuration, you ought to give more weight to the user experience.

    The reasons why yum needs to have its cache current constantly are understandable, but in its present state, yum seems to want to do as it pleases without regard for the user.

    Package updates and security updates are good and welcome, but the user is perfectly capable of running a yum update, yum upgrade when they are ready to update packages, or just yum update when a stale cache causes problems.

    That is a much more agreeable compromise than being expected to wait through often unneeded and quite unpredictable network activity.

    I don’t understand why you say that a never expiring cache is a non-starter. As far as I can see, it doesn’t affect the Fedora project in any way, and the OS certainly doesn’t need to go out of its way to force package updates on the user, for whatever reason (that is what it feels like going by your reasons for not wanting an out of date cache).

    The only time when a “yum update” is required is when a package install fails because a particular package referenced by the cache no longer exists, or when the user explicitly requests an update. In the mean time, stale package caches provide other useful functions as have already been mentioned in previous comments.

  35. Jef Spaleta says:

    Alexandre Rosenfeld:

    “Telling us that it’s just our configuration that is wrong is just as frustrating.”

    So you’re frustrated… noted. Doesn’t change the fact that this _is_ a configuration issue. I’m pointing out that whatever advocacy for change anyone is going to do needs to happen at the distribution level to get the defaults changed.

    As for your use case. People are aware of the problem with trying to support an App Store. However stale metadata is not the answer. New metadata is. Stale metadata is a quick fix that happens to work okay for conservative distributions. Stale metadata does not work in Fedora. I know it’s difficult to understand coming from distributions with very low update churn in their stable releases, but Fedora repositories see a lot of churn, even in the inter-package dependencies. Stale Fedora updates repository metadata can cause you lots of problems when trying to update for security or what not.

    But back to your “I want to browse PackageKit like it’s an App Store” use case. You should really look at how the PackageKit upstream developer wants to make use of a dedicated app_data.xml file (as an optional repomd payload meant specifically to work with the browsing use case), which better serves the needs of the application browsing use case you are talking about.

    repomd is designed to be flexible to allow for new optional repository metadata files as new use cases other than depsolving come up.
    The primary.xml that yum pulls is designed for depsolving. Not browsing…depsolving.. Which makes it much bigger and more critical to keep updated as things inside the repository change with updates. Once app_data.xml is available and PackageKit knows how to make use of it, it should pull it selectively as needed as part of a browsing operation to support an App Store concept without having to pull any of the _much_ larger depsolving related metadata. Until, you know, you actually want to install something and you need correct depsolving.

    -jef

  36. Jef Spaleta says:

    afanen01,

    You clearly don’t understand how much update churn Fedora update repositories can see. You can absolutely get into situations where your stale cached metadata is referring to package versions which no longer exist on any mirror because there have been multiple updates of that package. Fedora mirrors do not have to keep every single update that was issued. Just the most recent.

    So there you are with week-old cached metadata and it depsolves for a set of packages that no longer exist. Versioned dependencies are fun! You reach out to a mirror and boom, failure, because inside of that last week packages were introduced which changed the dependency graph and the mirrors aren’t holding the old updates any longer. Now add to that that PackageKit is configured to silently skip broken dependency issues when doing updates and you end up with a situation where end-users never actually get the security updates they expected to get because they are sitting on stale metadata.

    Default never expire metadata will not work for Fedora without leaving casual end-users high and dry.

    -jef

  37. Benjamin Otte says:

    I’m using emotive language to express how I feel every time I have to use the software. And I think “sucks” describes it quite well. Heck, you get blog posts like this one from people who are frustrated.

    And I don’t care where to cast the blame. I don’t care what the issue is. I see a software that looks incredibly slow (yum on Fedora) and a software that does the same thing lightning-fast (apt on Ubuntu). And I see a bunch of people defending the slow software by claiming it is not slow/slow on purpose or blaming me for failing to configure it right. Seriously?

    PS: Easy fix: print (“Index looks outdated, continuing anyway. Use --some-switch to cause an update.”)
    PPS: Harder fix: Make index updating not take so long. Even if you have to write it in C. You have 100ms. My network does 1MB/s. Make it only update the package I asked about if you have to. “yum show inkscape” does not need to download new data for libnih.

  38. Jef Spaleta says:

    Benjamin,

    I care. And I’m willing to talk civilly with people who care and can control their emotions well enough to have a rational discussion and talk through the issues. Here’s some free advice: never try to talk through a disagreement when angry. Not only does that help with contentious software development issues, but it will save marriages too.

    But on to your quick fixes.
    yum -C provides what you want. I don’t understand why yum -C does not adequately do the job you want.

    If yum has the necessary metadata cache on disk yum -C will _not_ go out and grab new metadata. Even if the metadata is expired, yum -C will use it. If yum only has a partial set of repository metadata and the command needs an additional metadata file (for example filelist or group information) then yum -C will pull new metadata to fill the need.

    -jef
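
    Read as a decision rule, the behaviour described above looks roughly like the Python sketch below. It models only the description in this comment, not yum's actual source; `files_to_fetch` and its arguments are hypothetical names, and the non-`-C` branch assumes that expired metadata is simply refreshed in full.

```python
def files_to_fetch(required, cached, expired, cache_only):
    """Return the set of metadata files that would be downloaded.

    required   -- metadata files the command needs (e.g. {"primary", "filelists"})
    cached     -- metadata files already on disk
    expired    -- True if the cached metadata is past its expiry time
    cache_only -- True when running with -C
    """
    missing = set(required) - set(cached)
    if cache_only:
        # -C: expiry is ignored; only files absent from the cache are pulled.
        return missing
    if expired:
        # Assumed default: expired metadata is refreshed before use.
        return set(required)
    return missing


# Expired but complete cache, with -C: nothing is downloaded.
print(files_to_fetch({"primary"}, {"primary"}, expired=True, cache_only=True))
# Partial cache, with -C: only the missing file is pulled.
print(files_to_fetch({"primary", "filelists"}, {"primary"}, expired=True, cache_only=True))
```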

  39. Milan Bouchet-Valat says:

    > yum -C provides what you want. I don’t understand why yum -C does not adequately do the job you want.
    It seems to, but it’s not the default. What would people say if GNOME Shell took 20s to open the overview every new day unless you go to the control center and untick a “Don’t check for updates” check box?

    > If yum has the necessary metadata cache on disk yum -C will _not_ go out and grab new metadata. Even if the metadata
    > is expired, yum -C will use it. If yum only has a partial set of repository metadata and the command needs an additional
    > metadata file (for example filelist or group information) then yum -C will pull new metadata to fill the need.
    Sounds like a very reasonable behavior. Make it the default. Since you say depsolving can fail with an outdated cache, you can still force updating when you resolve dependencies. But *only in that case*.

    Thanks for your attention, BTW – and sorry if you’re the guy who has to put up with our complaints (I’m not personally frustrated about that behavior, but I find it sad that yum appears to suck just because of something trivial that can be fixed)!

  40. Jef Spaleta says:

    Shrug, don’t be sorry. I’m not a dev; I’ve got no dog in this fight.

    But if people are going to complain, I’d like it to be accurate and focused enough to be actionable. But that requires some effort to understand why the current default Fedora policy is different than your pre-existing expectations.

    So on to your suggestion. You are basically asking for yum to behave inconsistently to avoid using a single commandline argument. That is a non-starter.

    If -C is needed for some operations to force a cache-centric operation and to override metadata expiration, but is not needed for others, that is inconsistent within the scope of day-to-day yum operation and very, very confusing. Not cool.

    I can, and do, use yum -C update to do offline package installs on some systems. I do this by batching yum -y --downloadonly update operations over multiple days, caching the packages that fill the transaction (without disrupting my system during the work week). Then, say on the weekend (when I’m at home with poor bandwidth), I run yum -C update using the self-consistent metadata that corresponds to my local cache of packages. It’s a completely valid use case for the -C flag in a situation where depsolving from the cache is needed.

    Your suggestions would require me to use different flags for different operations to get inverted default behavior depending on the operation. That’s an exceedingly bad idea. It’s _much_ easier to remember that -C will override metadata expiration clocks for all operations.

    -jef
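
    For anyone wanting to script the routine described above, here is a rough Python sketch. The two command lines are the ones quoted in this comment (`--downloadonly` comes from the downloadonly plugin); the weekday/weekend split and the `command_for` helper are assumptions made for illustration, and the script only prints the command rather than running it.

```python
import datetime

# Work week: fetch packages into the local cache without installing them.
DOWNLOAD_ONLY = ["yum", "-y", "--downloadonly", "update"]
# Weekend: run the update entirely from the cache, ignoring metadata expiry.
CACHE_ONLY_UPDATE = ["yum", "-C", "update"]


def command_for(weekday):
    """Pick the yum command for a day of the week (0 = Monday ... 6 = Sunday)."""
    return CACHE_ONLY_UPDATE if weekday >= 5 else DOWNLOAD_ONLY


if __name__ == "__main__":
    # Dry run: show what would be executed today.
    print(" ".join(command_for(datetime.date.today().weekday())))
```

    Hooking this up to cron or a timer, and actually executing the command with root privileges, is left out on purpose.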

  41. Elad Alfassa says:

    @Benjamin:
    I asked the yum developers to switch to xz compression for the metadata. For now, they won’t (unless yum is ported to python3, which won’t happen soon). This is one of the main things that make updating the cache so slow.
    So at least one part of the “harder fix” is declared as “won’t fix in the foreseeable future”.

    @All:
    I think zif is the answer here. But I also agree zif is not perfect. zif needs a lot more testing, and a lot more work, to become 100% usable for the end user. We need your help to make it better.

    Also, “Rumour has it that Fedora 17 will have Zif by default for GUI stuff but YUM will be provided for the CLI.”
    is incorrect; FESCo vetoed this feature request.

  42. Torben says:

    I hate that there are still packages to download after a local rebuild. Why can’t rebuilding some packages and finishing downloads of others happen in parallel?

  43. Rudd-O says:

    It’s not just the network. It is also the INCREDIBLE AMOUNTS of disk IO and memory that yum uses. Each time I run yum, yum costs me half a gigabyte of disk IO at the very least, and a shitload of RAM.

    Yum is slow because it is horribly inefficient. That is the cold, hard truth.

  44. Rudd-O says:

    Look, no network IO, still fucking slow:

    ———————————

    ~@karen.dragonfear α:
    dropcaches
    /usr/local/bin/dropcaches: line 4: /proc/sys/vm/drop_caches: Permission denied

    ~@karen.dragonfear α:
    sudo dropcaches

    ~@karen.dragonfear α:
    time yum search syslog
    Loaded plugins: auto-update-debuginfo, downloadonly, fastestmirror, fs-snapshot, presto, priorities, refresh-packagekit, remove-with-leaves, rpm-warm-cache, show-leaves, upgrade-helper,
    : verify, versionlock
    Loading mirror speeds from cached hostfile
    * fedora: mirror.lib.ucdavis.edu
    * fedora-debuginfo: mirror.lib.ucdavis.edu
    * kde: apt.kde-redhat.org
    * kde-testing: apt.kde-redhat.org
    * rpmfusion-free: mirrors.tummy.com
    * rpmfusion-free-debuginfo: mirrors.tummy.com
    * rpmfusion-free-updates: mirrors.tummy.com
    * rpmfusion-free-updates-debuginfo: mirrors.tummy.com
    * rpmfusion-nonfree: mirrors.tummy.com
    * rpmfusion-nonfree-debuginfo: mirrors.tummy.com
    * rpmfusion-nonfree-updates: mirrors.tummy.com
    * rpmfusion-nonfree-updates-debuginfo: mirrors.tummy.com
    * updates: mirror.lib.ucdavis.edu
    * updates-debuginfo: mirror.lib.ucdavis.edu
    ===================================================================================== N/S Matched: syslog =====================================================================================
    erlang-erlsyslog.x86_64 : Syslog facility for Erlang
    perl-Unix-Syslog.x86_64 : Perl interface to the UNIX syslog(3) calls
    perl-Unix-Syslog-debuginfo.x86_64 : Debug information for package perl-Unix-Syslog
    rsyslog-debuginfo.x86_64 : Debug information for package rsyslog
    rsyslog-gnutls.x86_64 : TLS protocol support for rsyslog
    rsyslog-gssapi.x86_64 : GSSAPI authentication and encryption support for rsyslog
    rsyslog-libdbi.x86_64 : libdbi database support for rsyslog
    rsyslog-mysql.x86_64 : MySQL support for rsyslog
    rsyslog-pgsql.x86_64 : PostgresSQL support for rsyslog
    rsyslog-relp.x86_64 : RELP protocol support for rsyslog
    rsyslog-snmp.x86_64 : SNMP protocol support for rsyslog
    rsyslog-sysvinit.x86_64 : SysV init script for rsyslog
    sblim-cmpi-syslog.i686 : SBLIM syslog instrumentation
    sblim-cmpi-syslog.x86_64 : SBLIM syslog instrumentation
    sblim-cmpi-syslog-debuginfo.x86_64 : Debug information for package sblim-cmpi-syslog
    sblim-cmpi-syslog-test.x86_64 : SBLIM Syslog Instrumentation Testcases
    syslog-ng.i686 : Next-generation syslog server
    syslog-ng.x86_64 : Next-generation syslog server
    syslog-ng-debuginfo.x86_64 : Debug information for package syslog-ng
    syslog-ng-devel.i686 : Development files for syslog-ng
    syslog-ng-devel.x86_64 : Development files for syslog-ng
    syslog-ng-libdbi.x86_64 : libdbi support for syslog-ng
    eventlog.i686 : Syslog-ng v2 support library
    eventlog.x86_64 : Syslog-ng v2 support library
    eventlog-devel.i686 : Syslog-ng v2 support library development files
    eventlog-devel.x86_64 : Syslog-ng v2 support library development files
    eventlog-static.x86_64 : Syslog-ng v2 support static library files
    petit.noarch : Log analysis tool for syslog, Apache and raw log files
    phplogcon.noarch : A syslog data viewer for the web
    rsyslog.x86_64 : Enhanced system logging and kernel message trapping daemon
    rsyslog-udpspoof.x86_64 : Provides the omudpspoof module
    snoopy.x86_64 : A preload library to send shell commands to syslog

    Name and summary matches only, use “search all” for everything.

    real 0m16.112s
    user 0m2.002s
    sys 0m3.635s
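
    One way to read those three numbers, sketched in Python: CPU time is user plus sys, and whatever is left of the wall-clock time was spent waiting. Attributing that wait to disk IO is an assumption (plausible here, since the caches were dropped first), and `breakdown` is just a helper name made up for this example.

```python
def breakdown(real, user, sys_time):
    """Split wall-clock time into CPU time and time spent waiting."""
    cpu = user + sys_time
    waiting = real - cpu
    return cpu, waiting


# Figures taken from the `time yum search syslog` run above.
real, user, sys_time = 16.112, 2.002, 3.635
cpu, waiting = breakdown(real, user, sys_time)
print(f"CPU: {cpu:.3f}s, waiting: {waiting:.3f}s ({waiting / real:.0%} of wall time)")
```

    So roughly two thirds of the run was not spent computing at all.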

  45. Rudd-O says:

    And if I showed you the strace log, the amount of reads, seeks, and files it opens… you would be APPALLED.

  46. Craig says:

    The best thing about using zif as the default backend for PackageKit is that it then means yum is purely a developer tool and no longer needs to babysit end users for the sake of cache correctness and low-maintenance use. Please, yum developers, for heaven’s sake, use this fact to your advantage.

    Also, despite the unhelpful default behaviours, the number of cycles that yum is clearly wasting is insanity. I only need 3 letters to make this point clear. X M L. Yes, yum uses XML to store metadata. Using a tree structure to serialise a list of files is the antithesis of Unix philosophy.

    I really don’t need to dig into the yum codebase or do benchmarks in order to have a reasoned opinion. Let’s not do too much technical masturbation here. I have used enough package managers to know that yum is at least an order of magnitude slower than almost everything else out there.

  47. jon says:

    Jef Spaleta’s attitude is exactly why after all these years yum is still slow. Try to argue with end users over where the problem is (“it’s the default config, not the app”). This disconnect from the users is amazing. Trying to rationalize away a problem doesn’t solve it.

    It’s really simple:
    time yum install
    time apt-get install

    Notice how apt-get takes about 2 seconds and yum takes about 30? How do you not understand the problem? How can you honestly sit there and try and argue that there’s absolutely nothing wrong with yum (the application, its config, the defaults, WHATEVER)?
