Xz compression

Yesterday I changed ftpadmin to generate .tar.xz and .tar.bz2 tarballs. Shortly after that, tracker 0.10.15 was released.

Tracker 0.10.14:

[   ] tracker-0.10.14.tar.bz2     19-May-2011 17:58  7.5M  
[   ] tracker-0.10.14.tar.gz      19-May-2011 17:58  8.9M  

Tracker 0.10.15:

[   ] tracker-0.10.15.tar.bz2     26-May-2011 17:37  7.5M  
[   ] tracker-0.10.15.tar.xz      26-May-2011 17:37  5.3M  

The smallest version you could download for 0.10.14 is 7.5MB (bz2). For 0.10.15 you’re able to get the 5.3MB xz version. A 29% saving to get the same bits.
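The 29% figure follows directly from the two listed sizes. A minimal sketch of the arithmetic, assuming the rounded sizes shown in the directory listings (the function name `percent_saving` is only illustrative):

```python
def percent_saving(old_size_mb: float, new_size_mb: float) -> float:
    """Return how much smaller new_size_mb is than old_size_mb, in percent."""
    return (old_size_mb - new_size_mb) / old_size_mb * 100.0


# Sizes as shown in the listings above (rounded to 0.1M by the web server).
bz2_mb = 7.5  # smallest tarball available for tracker 0.10.14
xz_mb = 5.3   # smallest tarball available for tracker 0.10.15

print(f"{percent_saving(bz2_mb, xz_mb):.0f}% smaller")  # prints "29% smaller"
```

Because the listing rounds sizes to one decimal place, the exact saving on the real byte counts may differ slightly.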

Note: The plan is to only create xz tarballs once the last 3.2 GNOME stable version has been released (November).
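For anyone who wants to try producing both formats from one source tree during the transition, here is a rough sketch using Python's standard `tarfile` module. This is not how ftpadmin does it; the function name `make_tarballs` and its arguments are assumptions made up for illustration.

```python
import tarfile
from pathlib import Path


def make_tarballs(source_dir: str, output_stem: str) -> list[str]:
    """Write <output_stem>.tar.bz2 and <output_stem>.tar.xz from source_dir.

    Hypothetical helper for illustration; not part of ftpadmin.
    """
    source = Path(source_dir)
    written = []
    for suffix, mode in ((".tar.bz2", "w:bz2"), (".tar.xz", "w:xz")):
        target = output_stem + suffix
        with tarfile.open(target, mode) as archive:
            # Store paths relative to the top-level directory name,
            # e.g. "tracker-0.10.15/README".
            archive.add(str(source), arcname=source.name)
        written.append(target)
    return written
```

Calling `make_tarballs("tracker-0.10.15", "tracker-0.10.15")` on an unpacked source directory would write both tarballs next to it, so their sizes can be compared locally.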

18 Replies to “Xz compression”

  1. Why is it the plan to drop the others?

    In other words, is the cumulative bandwidth savings from dropping bz2
    worth the cumulative hassle for people whose tar and scripts won’t
    deal with xz?

    1. There won’t be much hassle as I’ve announced and discussed it many times and got no negative feedback. Furthermore, there is a 6 month transition period where things can still be adjusted if the need arises.

  2. Nice. Still, bz2 is the obsolete format while gz is not. Gzip is still a simple and fast algorithm while bz2, chasing better compression using more CPU, is made obsolete by xz.

  3. I should add that I did go over the email archives and read the relevant
    threads. The thing that stands out spectacularly: there are no numbers,
    not even the basics. No tar.gz downloaded per day. No .tar.bz2 per day.
    No count of new tarballs per day. It’s all gut feelings and hand waving.

    That’s not a way to run this operation.

    There are, on the other hand, silly assertions like “if we store it, we have
    to back it up” (no, you don’t) and worrying that mirroring will take longer
    (if it even enters the radar — no data again — have the mirrors regenerate
    bz2 locally and verify sha1).

    1. Fedora is very happy with it, as is Debian as they do need to back it up.

      Anyways, I don’t get the purpose of your complaint other than stop energy. E.g. you ignore what the distributions said, and call a few things ‘silly’. You also seem to have ignored the rest of what I said or am planning.

      If you had read the threads, you would have noticed that I’ve already acknowledged that the benefit is not significant.

  4. Just curious…. Why keep bz2 rather than gz as the fallback alternative? Even if bzip2 has been around for a long time the gzip format is still much more universally supported…. gzip “works everywhere” (and it’s larger, but bzip2 has the same drawback when both are compared to xz) and would thus IMHO be a better fallback alternative….

    1. Because a lot of the spec files use/reference bz2 currently and they’d all break. Further, you have Solaris users who download files directly from http://ftp.gnome.org and 99% of the spec files they use reference the bz2 file.

      So in short: gz isn’t used much.

          1. wow, that’s quite a difference. although i rather thought about download numbers.

            btw, never heard about this xz thing before your post. also often preferred gzip over bzip2 for its so much better speed. how well is xz supported outside of linux world?

          2. Xz support will be in the next Solaris, and will be supported within a few months at most in the current one (discussed it with Oracle). Various programs support xz on Windows, for instance 7zip does. Same for Mac. BSDs do not have an issue.

            Bzip2 decompression is pretty slow. For me xz is faster.

            Download numbers are affected by e.g. jhbuild. There is also influence by bots (they love gzip compressed stuff, dislike bzip2). Hit wise: 983 gzip, 4936 bzip2. At least 200+ gzip hits are from bots.

            Xz is just LZMA in a better file format.

      1. > So in short: gz isn’t used much.

        You’ve obviously only cared to look at spec files and nothing else… You just broke the entire watch/update system of debian for example.

        So in short: you’re misinformed.

        1. In short: you do not know what you’re talking about.

          I’ve announced this many times, also to the various distributions. People from Debian responded and did not raise this.

          I still expected breakage and people not knowing. And making such a big fuss about a small watch system is a bit weird.

          You might want to inform yourself a bit more before stating things. Most of the discussion has been archived on public mailing lists. I think you’re a bit silly 🙂

          You expect a switch to happen ‘by magic’?

  5. This is great ovitters,

    Actually XZ uses LZMA2, which is an improved version of LZMA.

    I already use it for backup mails and mysql databases.

    1. Compression should be the same (both use LZMA/LZMA2), just the file format differs. 7z can store various files (like a tarball); .xz only stores 1 file (more unixy). Xz is supported by GNU tar, used for various GNU releases, available in various ‘enterprise editions’, etc.
