fwupd and xz metadata

A few people (and multi-billion dollar companies!) have asked for my response to the xz backdoor. The fwupd metadata that millions of people download every day is a 9.5MB XML file — which thankfully is very compressible. This used to be compressed as gzip by the LVFS, making it a 1.6MB download for end-users, but in 2021 we switched to xz compression instead.

What actually happens behind the scenes is that the libxmlb library loads the optionally compressed metadata into a mmap-able binary blob, and then it gets used by fwupd to look for new updates for specific hardware. In libxmlb 0.3.3 we added support for xz as a compression format. Then fwupd 1.8.7 was released with xz support, preferring the xz format to the “legacy” gz format — as the metadata became a 1.1MB download, saving significant amounts of data from the CDN.
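As a rough illustration of why the choice of compressor matters for a large, repetitive XML file, here is a minimal Python sketch using only the standard-library `gzip` and `lzma` modules. The sample XML is synthetic, and the function names `compressed_sizes` and `make_sample_metadata` are made up for this example. It is not how the LVFS generates metadata, and the ratios will differ from those of the real 9.5MB file.

```python
import gzip
import lzma


def compressed_sizes(payload: bytes) -> dict:
    """Return the size in bytes of payload raw, as gzip, and as xz."""
    return {
        "raw": len(payload),
        "gz": len(gzip.compress(payload)),
        "xz": len(lzma.compress(payload, format=lzma.FORMAT_XZ)),
    }


def make_sample_metadata(n_components: int = 2000) -> bytes:
    """Build synthetic, repetitive XML (hypothetical, not real LVFS data)."""
    parts = ["<components>"]
    for i in range(n_components):
        parts.append(
            f'<component type="firmware"><id>com.example.device{i}</id>'
            f'<release version="1.0.{i}"/></component>'
        )
    parts.append("</components>")
    return "".join(parts).encode("utf-8")


if __name__ == "__main__":
    sample = make_sample_metadata()
    # Print one line per format so the sizes can be compared by eye.
    for name, size in compressed_sizes(sample).items():
        print(f"{name}: {size} bytes")
```

Running it against a real copy of the metadata, rather than the synthetic sample, would be the only way to reproduce the download sizes quoted above.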

Then this week we learned that xz wasn’t the kind of thing we want to depend on. Out of an abundance of caution (and to be clear, my understanding is that there is no fwupd or LVFS security problem of any kind) I’ve switched the LVFS to also generate zstd metadata, made libxmlb no longer hard-depend on lzma, and switched fwupd to prefer the zstd metadata over the xz metadata if the installed version of libjcat supports it. The zstd metadata is also ~3% smaller than xz (and faster to decompress), but the real benefit is that I now trust it a lot more than xz.
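The preference order described here (zstd first, then xz, then the legacy gz) can be sketched as a small selection function. This is only an illustration in Python, not fwupd's actual C code: the name `pick_metadata_format` and the short format labels `"zst"`, `"xz"` and `"gz"` are assumptions made for the example.

```python
from typing import Iterable, Optional

# Preference order as described in the post: zstd, then xz, then legacy gz.
# The labels themselves are hypothetical, chosen for this sketch.
PREFERENCE = ("zst", "xz", "gz")


def pick_metadata_format(
    offered: Iterable[str], supported: Iterable[str]
) -> Optional[str]:
    """Pick the most preferred format that the server offers
    and that the locally installed libraries can handle."""
    offered_set = set(offered)
    supported_set = set(supported)
    for fmt in PREFERENCE:
        if fmt in offered_set and fmt in supported_set:
            return fmt
    # Nothing in common: the caller has to handle this case.
    return None
```

A client whose libraries lack zstd support would simply fall through to xz or gz, which is why generating the extra format on the server side does not break older installations.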

I’ll be doing new libxmlb and fwupd releases with the needed changes next week.

Published by

hughsie

Richard has over 10 years of experience developing open source software. He is the maintainer of GNOME Software, PackageKit, GNOME PackageKit, GNOME Power Manager, GNOME Color Manager, colord, and UPower, and also contributes to many other projects and open source standards. Richard has three main areas of interest on the free desktop: color management, package management, and power management. Richard graduated a few years ago from the University of Surrey with a Masters in Electronics Engineering. He now works for Red Hat in the desktop group, and also manages a company selling open source calibration equipment. Richard's outside interests include taking photos and eating good food.

11 thoughts on “fwupd and xz metadata”

  1. xz always had a smell to it. It depended on a single person who appears to be an amateur in the field.

    https://www.nongnu.org/lzip/xz_inadequate.html

    This article has been called FUD on many mailing lists, but I never saw a convincing rebuttal of the arguments.

    How is that related to supply-chain attacks? Everyone who publishes their program as free software is a hero. But the fact that xz invented their own file format and called it an algorithm gives the impression that this project was created by amateurs who do not even know what an algorithm is. Maybe this also explains why the project did not have any expert contributors. Low activity for such a central project increases the risk of supply-chain attacks. I do not think that our whole free software ecosystem should rely on an amateur project in this way.

    1. The article is technically correct for the most part, but the criticisms therein are too marginal to be relevant. That’s why people considered the article trolling to advertise the author’s own format. If you want to find criticisms, there are plenty in tech forums like Hacker News or Lobsters. But I’m also sure that not everyone would look for them, so here is my rebuttal for the 4th (or 5th?) time.

      While a bit misguided, the xz format is clearly designed as a much simpler version of the 7zip format, and it was a good thing to have at that time. As a result most claims made by the article also apply to 7z (have you ever read the 7z format specification?), but no one would attack 7z for such minor reasons today. The article’s lengthy devotion to error detection and recovery is also quite telling; it is good to have for compression formats, but no, it is never mandatory, and you should use a dedicated parity archive or similar if that is necessary. The article devotes too much length to analyzing the error detection probability for short fields, which are indeed the part where xz was misguided, but such a design issue is too minor to be detailed in that way. It suffices to say that xz should have been designed for agility, not extensibility.

      Lzip, by the way, is no better than xz. In fact it took out so many features that I had to question the soundness of the author’s claims as well. For example, an lzip file cannot be verified against its checksum until it has been fully decoded, and there is no way to determine which part of the compressed stream was corrupted. Like, what? It is rather easy to split the input into multiple chunks and put a checksum on each chunk, so that each chunk can be individually recovered in spite of other corrupted blocks. It is possible for lzip to produce multiple streams for a similar effect (because they should be concatenable), but it doesn’t do so to my knowledge: the closest alternative is to produce multiple volumes and concatenate them manually. Sure, lzip the format can probably support the use case without any change, but if the implementation doesn’t handle it, it is effectively unsupported.

      1. Thank you very much. This helped a lot.

        Let’s hope that zstd will end the xz vs. lzip rivalry for good.

      2. > Sure, lzip the format can probably support the use case without any change, but if the implementation doesn’t handle it, it is effectively unsupported.

        In fact it is supported. Try the option --member-size of lzip.
        Perhaps this option is not clearly explained in the manual. I’ll report it to the maintainer so that he can improve the description.
        Thanks for pointing this out.

    1. The problem wasn’t that liblzma was linked. The problem was that it was an understaffed project, which made the subversion of the git (pun intended) possible.

  2. “A few people (and multi-billion dollar companies!)…” This is a discouraging start to a blog post about free software.
