Adding an optional install duration to LVFS firmware

We’ve just added an optional feature to fwupd and the LVFS that some people might find useful: The firmware update process can now tell the user how long in seconds the update is going to take.

This means that users can know that a dock update might take 5 minutes, and so can start the update process before going to lunch. A UEFI update might require multiple reboots and take 45 minutes to apply, and so the user can choose to apply the update at the end of the day rather than losing access to their computer for nearly an hour.

If you want to use this feature, there are currently three ways to assign the duration to the update:

  • Changing the value on the LVFS admin console — the component update panel now has an extra input field to enter the duration in seconds
  • Adding a new attribute to the <release> element, for instance:
    <release version="3.0.2" date="2018-11-09" install_duration="120">
    
  • Adding a ‘quirk’ to fwupd, for instance:
    [DeviceInstanceId=USB\VID_1234&PID_5678]
    InstallDuration = 40
    
For updates requiring a reboot, the install duration should include the time to POST the system both before and after the update has run, but it can be approximate. Only users running very new versions of fwupd and gnome-software will be shown the install duration; older versions will just ignore the new property. It’s therefore safe to include in all versions of firmware without adding a dependency on a specific fwupd version.
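
As a rough sketch of how a client could read the value (assuming recent libfwupd, where fwupd_device_get_install_duration() is the accessor — treat the exact names as an assumption):

g_autoptr(GError) error = NULL;
g_autoptr(FwupdClient) client = fwupd_client_new ();
g_autoptr(GPtrArray) devices = fwupd_client_get_devices (client, NULL, &error);
if (devices == NULL)
    return FALSE;
for (guint i = 0; i < devices->len; i++) {
    FwupdDevice *dev = g_ptr_array_index (devices, i);
    /* zero just means the vendor didn't set a duration */
    g_print ("%s: ~%u seconds\n",
             fwupd_device_get_name (dev),
             fwupd_device_get_install_duration (dev));
}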

More fun with libxmlb

A few days ago I cut the 0.1.4 release of libxmlb, which is significant because it includes the last three features I needed in gnome-software to achieve the same search results as appstream-glib.

The first is something most users of database libraries will be familiar with: Bound variables. The idea is you prepare a query which is parsed into opcodes, and then at a later time you assign one of the ? opcode values to an actual integer or string. This is much faster as you do not have to re-parse the predicate, and also means you avoid failing in incomprehensible ways if the user searches for nonsense like ]@attr. Borrowing from SQL, the syntax should be familiar:

g_autoptr(XbQuery) query = xb_query_new (silo, "components/component/id[text()=?]/..", &error);
xb_query_bind_str (query, 0, "gimp.desktop", &error);
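
Executing the prepared query is then a separate step; a minimal sketch, assuming xb_silo_query_full() is the executor (check the installed headers for the exact name):

g_autoptr(GPtrArray) results = xb_silo_query_full (silo, query, &error);
if (results == NULL)
    return FALSE;
for (guint i = 0; i < results->len; i++) {
    XbNode *n = g_ptr_array_index (results, i);
    /* export the matched <component> back to XML */
    g_autofree gchar *xml = xb_node_export (n, XB_NODE_EXPORT_FLAG_NONE, &error);
    g_print ("%s\n", xml);
}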

The second feature makes the caller jump through some hoops, but hoops that make things faster: indexed queries. As might be apparent to some, libxmlb stores all the text in a big deduplicated string table after the tree structure is defined. That means if you do <component component="component">component</component> then we only store one string! When we actually set up an object to check a specific node for a predicate (for instance, text()='fubar') we actually do strcmp("fubar", "component") internally, which in most cases is very fast…

Unless you do it 10 million times…

Using indexed strings tells the XbMachine processing the predicate to first check if fubar exists in the string table, and if it doesn’t, the predicate can’t possibly match and is skipped. If it does exist, we know the integer position in the string table, and so when we compare the strings we can just compare two uint32_t values, which is quite a lot faster, especially on ARM for some reason. In the case of fwupd, it is searching for a specific GUID when returning hardware results. Using an indexed query takes the per-device query time from 3.17ms to about 0.33ms – which, if you have a large number of connected updatable devices, makes a big difference to the user experience. As using indexed queries can have a negative impact and requires extra code, it is probably only useful in a handful of cases. In case you do need this feature, this is the code you would use:

xb_silo_query_build_index (silo, "component/id", NULL, &error); // the cdata
xb_silo_query_build_index (silo, "component", "type", &error); // the @type attr
g_autoptr(XbNode) n = xb_silo_query_first (silo, "component/id[text()=$'test.firmware']", &error);

The indexing is denoted by $'' rather than the normal pair of single quotes. If there is something more standard to denote this kind of thing, please let me know and I’ll switch to that instead.

The third feature is stemming, which means you can search for “gaming mouse” and still get results that mention games, game and Gaming. It is also how a search for a word like Kongreßstraße can match kongressstrasse. In an ideal world stemming would be computationally free, but if we are comparing millions of records each call to libstemmer really adds up. Adding the stem() XPath operator took a few minutes, but making it usable took up a whole weekend.

The query we wanted to run would be of the form id[text()~=stem('?')], but the stem() would be called millions of times on the very same string for each comparison. To fix this, and to make other XPath operators faster, I implemented an opcode-rewriting optimisation pass in the XbMachine parser. This means if you call lower-case(text())==lower-case('GIMP.DESKTOP') we only call the UTF-8 strlower function N+1 times, rather than 2N times. For lower-case() the performance increase is slight, but for stem() it actually makes the feature usable in gnome-software. The opcode-rewriting optimisation pass is kinda dumb in how it works (“let’s try all combinations!”), but it works with all of the registered methods, and makes all existing queries faster almost for free.
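
From the caller’s side, the stemmed search then looks something like this (a sketch; the literal 'gaming' stands in for whatever the user typed):

/* match ids where a token stems to the same root as "gaming" */
g_autoptr(GPtrArray) results = xb_silo_query (silo,
    "components/component/id[text()~=stem('gaming')]", 0, &error);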

One common question I’ve had is if libxmlb is supposed to obsolete appstream-glib, and the answer is “it depends”. If you’re creating or building AppStream metadata, or performing any AppStream-specific validation then stick to the appstream-glib or appstream-builder libraries. If you just want to read AppStream metadata you can use either, but if you can stomach a binary blob of rewritten metadata stored somewhere, libxmlb is going to be a couple of orders of magnitude faster and use a ton less memory.

If you’re thinking of using libxmlb in your project send me an email and I’m happy to add more documentation where required. At the moment libxmlb does everything I need for fwupd and gnome-software and so apart from bugfixes I think it’s basically “done”, which should make my manager somewhat happier. Comments welcome.

Using the LVFS to influence procurement decisions

The National Cyber Security Centre (part of GCHQ, the UK equivalent of the NSA) wrote a nice article on using the LVFS to influence procurement decisions. It’s probably also worth noting that the two biggest OEMs making consumer hardware now require all their ODMs to support firmware updates on the LVFS. More and more mega-corporations also have “supports the LVFS” as a requirement for procurement.

The LVFS is slowly and carefully moving to the Linux Foundation, so expect more outreach and announcements soon.

libxmlb now a dependency of fwupd and gnome-software

I’ve just released libxmlb 0.1.3, and merged the branches for fwupd and gnome-software so that it becomes a hard dependency of both projects. A few people have reviewed the libxmlb code, and Mario, Kalev and Robert reviewed the fwupd and gnome-software changes, so I’m pretty confident I’ve not broken anything too important — but more testing is very welcome. GNOME Software RSS usage is about 50% of what is shipped in 3.30.x, and fwupd is down by 65%! If you want to ship the upcoming fwupd 1.2.0 or gnome-software 3.31.2 in your distro you’ll need to have libxmlb packaged, or be happy using a meson subproject to download libxmlb during the build of each dependent project.

Announcing the first release of libxmlb

Today I did the first 0.1.0 preview release of libxmlb. We’re at the “probably API stable, but no promises” stage. This is the library I introduced a couple of weeks ago, and since then I’ve been porting both fwupd and gnome-software to use it. The former is almost complete, and nearly ready to merge, but the latter is still work in progress with a fair bit of code to write. I did manage to launch gnome-software with libxmlb yesterday, and modulo a bit of brokenness it’s both faster to start (over 800ms faster from cold boot!) and uses an amazing 90Mb less RSS at runtime. I’m planning to merge the libxmlb branch into the unstable branch of fwupd in the next few weeks, so I need volunteers to package up the new hard dep for Debian, Ubuntu and Arch.

The tarball is in the usual place – it’s a simple Meson-built library that doesn’t do anything crazy. I’ve imported and built it already for Fedora, much thanks to Kalev for the super speedy package review.

I guess I should explain how applications are expected to use this library. At its core, there are just five different kinds of objects you need to care about:

  • XbSilo – a deduplicated string pool and a read-only node tree. This is typically kept alive for the application lifetime.
  • XbNode – a GObject-wrapped immutable node, available as a query result from XbSilo.
  • XbBuilder – a “compiler” to build the XbSilo from XbBuilderNodes or XbBuilderSources. This is typically created and destroyed at startup, or when the blob needs regenerating.
  • XbBuilderNode – a mutable node that can have a parent, children, attributes and a value.
  • XbBuilderSource – a source of data for XbBuilder, e.g. a .xml.gz file or just a raw XML string.

The way most applications will use libxmlb is to create a local XbBuilder instance, add some XbBuilderSources and then “ensure” a local cache file. The “ensure” process either mmap()s the binary blob if all the file mtimes are unchanged, or compiles a new blob and saves it to a file. You can also tell the XbSilo to watch all the sources that it was built with, so that if any files change at runtime the valid property gets set to FALSE and the application can xb_builder_ensure() again at a convenient time.
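
A minimal sketch of that flow, using the current API names (the exact calls may differ slightly between releases, and the paths are just examples):

g_autoptr(GError) error = NULL;
g_autoptr(XbBuilder) builder = xb_builder_new ();
g_autoptr(XbBuilderSource) source = xb_builder_source_new ();
g_autoptr(GFile) file_xml = g_file_new_for_path ("/usr/share/app-info/xmls/fedora.xml.gz");
g_autoptr(GFile) file_blob = g_file_new_for_path ("/var/cache/example/appstream.xmlb");
g_autoptr(XbSilo) silo = NULL;

/* load one source; repeat for each XML file you care about */
if (!xb_builder_source_load_file (source, file_xml,
                                  XB_BUILDER_SOURCE_FLAG_NONE, NULL, &error))
    return FALSE;
xb_builder_import_source (builder, source);

/* mmap the cached blob if the mtimes match, otherwise recompile it */
silo = xb_builder_ensure (builder, file_blob,
                          XB_BUILDER_COMPILE_FLAG_NONE, NULL, &error);
if (silo == NULL)
    return FALSE;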

Once the XbBuilder has been compiled, an XbSilo pops out. With the XbSilo you can query using most common XPath statements – I actually ended up implementing a FORTH-style stack interpreter so we can now do queries like /components/component/id[contains(upper-case(text()),'GIMP')] – I’ll say a bit more on that in a minute. Queries can limit the number of results (for speed), and are deduplicated in a sane way, so it’s really quite a simple process to achieve something that would otherwise be a lot of C code. It’s possible to directly query an attribute or text value from a node, so the silo doesn’t have to be passed around either.
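
That query looks like this in C, reusing the silo built earlier (a sketch):

/* find the first <id> mentioning GIMP, case-insensitively */
g_autoptr(XbNode) n = xb_silo_query_first (silo,
    "components/component/id[contains(upper-case(text()),'GIMP')]", &error);
if (n != NULL)
    g_print ("found: %s\n", xb_node_get_text (n));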

In the process of porting gnome-software, I had to make libxmlb thread-safe – which required some internal reorganisation. We now have a non-exported XbMachine stack interpreter, and the XbSilo registers the XML-specific methods (like contains()) and operators (like ~=). These get passed some per-method user data, and also some per-query private data that is shared with the node tree – allowing things like [last()] and position()=3 to work. The function callbacks just get passed a query-specific stack, which means you can allow things like comparing “1” to 1.00f. This makes it easy to support more of XPath in the future, or to support something completely application-specific like gnome-software-search() without editing the library.

If anyone wants to do any API or code review I’d be super happy to answer any questions. Coverity and valgrind seem happy enough with all the self tests, but that’s no replacement for a human asking tricky questions. Thanks!

Speeding up AppStream: mmap’ing XML using libxmlb

AppStream and the related AppData are XML formats that have been adopted by thousands of upstream projects and are being used in about a dozen different client programs. The AppStream metadata shipped in Fedora is currently a huge 13Mb XML file, which with gzip compresses down to a more reasonable 3.6Mb. AppStream is awesome; it provides translations of lots of useful data into basically all languages and includes screenshots for almost everything. GNOME Software is built around AppStream, and we even use a slightly extended version of the same XML format to ship firmware update metadata from the LVFS to fwupd.

XML does have two giant weaknesses. The first is that you have to decompress and then parse the files – which might include all the ~300 tiny AppData files as well as the distro-provided AppStream files, if you want to list installed applications not provided by the distro. Seeking lots of small files isn’t so slow on an SSD, and loading+decompressing a small file is actually quicker than loading an uncompressed larger file. Parsing an XML file typically means you set up some callbacks which then get called for every start tag, text section and end tag – so for a 13Mb XML document that’s nested very deeply you have to do a lot of callbacks. This means you have to process the description of GIMP in every language before you can even see if Shotwell exists at all.

The second weakness is memory. The typical way of parsing XML involves creating a “node tree”. This allows you to treat the XML document as a Document Object Model (DOM), navigating the tree and parsing the contents in an object-oriented way. It also means you typically allocate the nodes themselves on the heap, plus copies of all the string data. AsNode in libappstream-glib has a few tricks to reduce RSS usage after parsing, which include:

  • Interning common element names like description, p, ul, li
  • Freeing all the nodes, but retaining all the node data
  • Ignoring node data for languages you don’t understand
  • Reference counting the strings from the nodes into the various appstream-glib GObjects

This still has drawbacks: we need to store in hot memory all the screenshot URLs of all the apps you’re never going to search for, and we also need to parse all the long translated description data just to find out if gimp.desktop is actually installable. Deduplicating strings at runtime takes a nontrivial amount of CPU, and means we build a huge hash table that uses nearly as much RSS as we save by deduplicating.

On a modern system, parsing ~300 files takes less than a second, and the total RSS is only a few tens of Mb – which is fine, right? Except on resource-constrained machines it takes 20+ seconds to start, and 40Mb is nearly 10% of the total memory available on the system. We have exactly the same problem with fwupd, where we get one giant file from the LVFS, all of which gets stored in RSS even though you don’t have most of the hardware it matches against. Slow starting of fwupd and gnome-software is one of the reasons they stay resident rather than shutting down on idle and restarting when required.

We can do better.

We do need to keep the source format, but that doesn’t mean we can’t create a managed cache that does some clever things. Traditionally I’ve been quite vocal against squashing structured XML data into databases like SQLite and Xapian, as it’s like pushing a square peg into a round hole, and forces you to think like a database, doing 10-level nested joins to query some simple thing. What we want to use is something like XPath, where you can query data using the XML structure itself.

We also want to be able to navigate the XML document as if it was a DOM, i.e. be able to jump from one node to its sibling without parsing all the great-great-great-grandchild nodes to get there. This means storing the offset to the sibling in a binary file.

If we’re creating a cache, we might as well do the string deduplication at creation time once, rather than every time we load the data. This has the added benefit that we’re converting the string data from variable-length strings that you compare using strcmp() to quarks that you can compare just by checking two integers. This is much faster, as any SAT solver will tell you. If we’re storing a string table, we can also store the NUL byte. This seems wasteful at first, but has one huge advantage – you can mmap() the string table. In fact, you can mmap the entire cache. If you order the string table in a sensible way then you store all the related data in one block (e.g. the <id> values) so that you don’t jump all over the cache invalidating almost everything just for a common query. mmap’ing the strings means you can avoid strdup()ing every string just in case; under memory pressure the kernel automatically reclaims the memory, and the next time it’s needed it gets loaded back from disk as required. It’s almost magic.
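
To illustrate why the quark comparison wins, here’s a toy sketch — not the libxmlb implementation, and the XQuark name is made up for the example:

#include <stdbool.h>
#include <stdint.h>

/* toy deduplicated string table: each string stored exactly once, NUL included */
static const char strtab[] = "gimp.desktop\0shotwell.desktop\0";

/* a "quark" is just an offset into the string table */
typedef uint32_t XQuark;

static bool
quark_equal (XQuark a, XQuark b)
{
    /* identical strings are stored only once, so equal offsets mean equal
     * strings; compare with strcmp (strtab + a, strtab + b), which has to
     * walk every byte of both strings */
    return a == b;
}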

I’ve spent the last few days prototyping a library, which is called libxmlb until someone comes up with a better name. I’ve got a test branch of fwupd that I’ve ported from libappstream-glib and I’m happy to say that RSS has reduced from 3Mb (peak 3.61Mb) to 1Mb (peak 1.07Mb) and the startup time has gone from 280ms to 250ms. Unless I’ve missed something drastic I’m going to port gnome-software too, and will expect even bigger savings as the amount of XML is two orders of magnitude larger.

So, how do you use this thing? First, let’s create a baseline doing things the old way:

$ time appstream-util search gimp.desktop
real	0m0.645s
user	0m0.800s
sys	0m0.184s

To create a binary cache:

$ time xb-tool compile appstream.xmlb /usr/share/app-info/xmls/* /usr/share/appdata/* /usr/share/metainfo/*
real	0m0.497s
user	0m0.453s
sys	0m0.028s

$ time xb-tool compile appstream.xmlb /usr/share/app-info/xmls/* /usr/share/appdata/* /usr/share/metainfo/*
real	0m0.016s
user	0m0.004s
sys	0m0.006s

Notice the second time it compiled nearly instantly, as none of the filenames or modification timestamps of the sources had changed. This is exactly what programs would do every time they are launched.

$ du -h appstream.xmlb
4.2M	appstream.xmlb

$ time xb-tool query appstream.xmlb "components/component[@type='desktop']/id[text()='firefox.desktop']"
RESULT: <id>firefox.desktop</id>
RESULT: <id>firefox.desktop</id>
RESULT: <id>firefox.desktop</id>
real	0m0.008s
user	0m0.007s
sys	0m0.002s

The 8ms includes the time to load the file, search for all the components that match the query, and export the XML. You get three results as there’s one AppData file, one entry in the distro AppStream, and an extra one shipped by Fedora to make Firefox featured in gnome-software. You can see the whole XML component of each result by appending /.. to the query, as shown below. Unlike appstream-glib, libxmlb doesn’t try to merge components – which makes it much less magic, and a whole lot simpler.
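
For example, to dump each whole component rather than just the <id> node:

$ xb-tool query appstream.xmlb "components/component[@type='desktop']/id[text()='firefox.desktop']/.."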

Some questions answered:

  • Why not just use a GVariant blob?: I did initially, and the cache was huge. The deeply nested structure was packed inefficiently as you have to assume everything is a hash table of a{sv}. It was also slow to load; not much faster than just parsing the XML. It also wasn’t possible to implement the zero-copy XPath queries this way.
  • Is this API and ABI stable?: Not yet, but it will be as soon as gnome-software is ported.
  • You implemented XPath in C‽: No, only a tiny subset. See the README.md

Comments welcome.

3 Million Firmware Files and Counting…

In the last two years the LVFS has supplied over 3 million firmware files to end users. We now have about two dozen companies uploading firmware, of which 9 are multi-billion-dollar companies.

Every month about 200,000 more devices get upgraded, and from the reports so far the number of failed updates is less than 0.01% averaged over all firmware types. The number of downloads is going up month on month, although we’re no longer growing exponentially, thank goodness. The server load average is 0.18, and we’ve made two changes recently to scale even further for less money: signing files in a 30-minute cron job rather than immediately, and switching from Amazon to BunnyCDN.

The LVFS is mainly run by just one person (me!) and my time is sponsored by the ever-awesome Red Hat. The hardware costs, which recently included random development tools for testing the dfu and nvme plugins, and the server and bandwidth costs are being paid from charitable donations from the community. We’re even cost positive now, so I’m building up a little pot for the next server or CDN upgrade. By pretty much any metric, the LVFS is a huge success, and I’m super grateful to all the people that helped the project grow.

The LVFS does have one weakness: it has a bus factor of one. In other words, if I got splattered by a bus, the LVFS would probably cease to exist in its current form. To further grow the project, and to reduce the dependence on me, we’re going to be moving various parts of the LVFS to the Linux Foundation. This means that there’ll be sysadmins who don’t have to google basic server things, a proper community charter, and access to an actual legal team. From an OEM point of view, nothing much should change, including the most important thing: it’ll continue to be free to use for everyone. The existing server and all the content will be migrated to the Linux Foundation infrastructure. From a user’s point of view, new metadata and firmware will be signed by the Linux Foundation key rather than my key, although we haven’t decided on a date for the switch-over yet. The LF key has been trusted by fwupd for firmware since 1.0.8, and it’s trivial to backport to older branches if required.

Before anyone gets too excited and starts pointing me at all my existing bugs for my other stuff: I’ll probably still be the person “onboarding” vendors onto the LVFS, and I’m fully expecting to remain the maintainer and core contributor to the lvfs-website code itself — but I certainly should have a bit more time for GNOME Software and color stuff.

In related news, even more vendors are jumping on the LVFS. No public announcements yet, but hopefully soon. A lot of hardware companies will only be comfortable “going public” once the new hardware currently in development is on shelves in stores. So, please raise a glass to the next 3 million downloads!

Realtek on the LVFS!

For the last week I’ve been working with Realtek engineers adding USB3 hub firmware support to fwupd. We’re still fleshing out all the details, as we also want to update any devices attached to the hub using i2c – which is more important than it first seems. A lot of “multifunction” dongles or USB3 hubs are actually USB3 hubs with other hardware connected internally. We’re going to be working on updating the HDMI converter firmware next, probably just by dropping a quirk file and adding some standard keys to fwupd. This will let us use the same plugin for any hardware that uses the rts54xx chipset as the base.

Realtek have been really helpful and open about the hardware, which is a refreshing difference to a lot of other hardware companies. I’m hopeful we can get the new plugin in fwupd 1.1.2 although supported hardware won’t be available for a few months yet, which also means there’s no panic getting public firmware on the LVFS. It will mean we get a “works out of the box” experience when the new OEM branded dock/dongle hardware starts showing up.

Fun with SuperIO

While I’m waiting to hear back from NVMe vendors (already one tentatively onboard!) I’ve started looking at “embedded controller” devices. The EC on your laptop historically used to just control the PS/2 keyboard and mouse, but now does fan control, power management, UARTs, GPIOs, LEDs, SMBus, and various other tasks the main CPU is too important to care about. Vendors issue firmware updates for this kind of device, but normally wrap the EC update up as part of the “BIOS” update, as the system firmware and EC work together using various ACPI methods. Some vendors do the EC update out-of-band, and so we need to teach fwupd how to query the EC to get the model and version on that specific hardware. The Linux laptop vendor Tuxedo wants to update the EC and system firmware separately using the LVFS, and helpfully loaned me an InfinityBook Pro 13 that was immediately disassembled and connected to all kinds of exotic external programmers. On first impressions the N131WU seems quick, stable and really well designed internally — I’m sure it would get a 10/10 for repairability.

At the moment I’m just concentrating on SuperIO devices from ITE. If you’re interested in what SuperIO chip(s) you have on your machine you can use either superiotool from coreboot-utils or sensors-detect from lm_sensors. If you’ve got a SuperIO device from ITE please post what signature, vendor and model machine you have in the comments, and I’ll ask if I need any more information from you. I’m especially interested in vendors that use devices with the signature 0x8587, which seems to be a favourite with the Clevo reference board. Thanks!

Please welcome AKiTiO to the LVFS

Another week, another vendor. This time the vendor is called AKiTiO, a company that makes a large number of very nice Thunderbolt peripherals.

Over the last few weeks AKiTiO added support for the Node and Node Lite devices, and I’m sure there’ll be more in the future. It’s been a pleasure working with the engineers and getting them up to speed with uploading to the LVFS.

In other news, Lenovo also added support for the ThinkPad T460 on the LVFS, so get any updates while they’re hot. If you want to try this you’ll have to enable the lvfs-testing remote either using fwupdmgr enable-remote lvfs-testing or using the sources dialog in recent versions of GNOME Software. More Lenovo updates coming soon, and hopefully even more vendor announcements too.