Adventures with NVMe, part 2

A few days ago I asked people to upload their NVMe “cns” data to the LVFS. So far, 908 people did that, and I appreciate each and every submission. I promised I’d share my results, and this is what I’ve found:

Number of vendors implementing slot 1 read only “s1ro” factory fallback: 10 – this was way less than I hoped. Not all is lost: the number of slots in a device “nfws”indicate how many different versions of firmware the drive can hold, just like some wireless broadband cards. The idea is that a bad firmware flash means you can “fall back” to an old version that actually works. It was surprising how many drives didn’t have this feature because they only had one slot in total:

I also wanted to know how many firmware versions there were for a specific model (deduping by removing the capacity string in the model); the idea being that if drives with the same model string all had the same version firmware then the vendor wasn’t supplying firmware updates at all, and might be a lost cause, or have perfect firmware. Vendors don’t usually change shipped firmware on NMVe drives for no reason, and so a vendor having multiple versions of firmware for a given model could indicate a problem or enhancement important enough to re-run all the QA checks:

So, not all bad, but we can’t just assume that trying to flash a firmware is a safe thing to do for all drives. The next, much bigger problem was trying to identify which drives should be flashed with a specific firmware. You’d think this would be a simple problem, where the existing firmware version would be stored in the “fr” firmware revision string and the model name would be stored in the “mn” string. Alas, only Lenovo and Apple store a sane semver like 1.2.3, other vendors seem to encode the firmware revision using as-yet-unknown methods. Helpfully, the model name alone isn’t all we need to identify the firmware to flash as different drives can have different firmware for the laptop OEM without changing the mn or fr. For this I think we need to look into the elusive “vs” vendor-defined block which was the reason I was asking for the binary dump of the CNS rather than the nvme -H or nvme -o json output. The vendor block isn’t formally defined as part of the NVMe specification and the ODM (and maybe the OEM?) can use this however they want.

Only 137 out of the supplied ~650 NVMe CNS blobs contained vendor data. SK hynix drives contain an interesting-looking string of something like KX0WMJ6KS0760T6G01H0, but I have no idea how to parse that. Seagate has simply 2002. Liteon has a string like TW01345GLOH006BN05SXA04. Some Samsung drives have things like KR0N5WKK0184166K007HB0 and CN08Y4V9SSX0087702TSA0 – the same format as Toshiba CN08D5HTTBE006BEC2K1A0 but it’s weird that the blob is all ASCII – I was somewhat hoping for a packed GUID in the sea of NULs. They do have some common sub-sections, so if you know what these are please let me know!

I’ve built a fwupd plugin that should be able to update firmware on NVMe drives, but it’s 100% untested. I’m going to use the leftover donation money for the LVFS to buy various types of NVMe hardware that I can flash with different firmware images and not cry if all the data gets wiped or the device get bricked. I’ve already emailed my contact at Samsung and fingers crossed something nice happens. I’ll do the same with Toshiba and Lenovo next week. I’ll also update this blog post next week with the latest numbers, so if you upload your data now it’s still useful.

NVMe Firmware: I Need Your Data

In a recent Google Plus post I asked what kind of hardware was most interesting to be focusing on next. UEFI updating is now working well with a large number of vendors, and the LVFS “onboarding” process is well established now. On that topic we’ll hopefully have some more announcements soon. Anyway, back to the topic in hand: The overwhelming result from the poll was that people wanted NVMe hardware supported, so that you can trivially update the firmware of your SSD. Firmware updates for SSDs are important, as most either address data consistency issues or provide nice performance fixes.

Unfortunately there needs to be some plumbing put in place first, so don’t expect anything awesome very fast. The NVMe ecosystem is pretty new, and things like “what version number firmware am I running now” and “is this firmware OEM firmware or retail firmware” are still queried using vendor-specific extensions. I only have two devices to test with (Lenovo P50 and Dell XPS 13) and so I’m asking for some help with data collection. Primarily I’m trying to find out what NMVe hardware people are actually using, so I can approach the most popular vendors first (via the existing OEMs). I’m also going to be looking at the firmware revision string that each vendor sets to find quirks we need — for instance, Toshiba encodes MODEL VENDOR, and everyone else specifies VENDOR MODEL. Some drives contain the vendor data with a GUID, some don’t, I have no idea of the relative number or how many different formats there are. I’d also like to know how many firmware slots the average SSD has, and the percentage of drives that have a protected slot 1 firmware. This all lets us work out how safe it would be to attempt a new firmware update on specific hardware — the very last thing we want to do is brick an expensive new NMVe SSD with all your data on.

So, what do I would like you to do. You don’t need to reboot, unmount any filesystems or anything like that. Just:

  1. Install nvme (e.g. dnf install nvme-cli or build it from source
  2. Run the following command:
    sudo nvme id-ctrl --raw-binary /dev/nvme0 > /tmp/id-ctrl
    
  3. If that worked, run the following command:
    curl -F type=nvme \
        -F "machine_id="`cat /etc/machine-id` \
        -F file=@/tmp/id-ctrl \
        https://staging.fwupd.org/lvfs/upload_hwinfo

If you’re not sure if you have a NVMe drive you can check with the nvme command above. The command isn’t doing anything with the firmware; it’s just asking the NVMe drive to report what it knows about itself. It should be 100% safe, the kernel already did the same request at system startup.

We are sending your random machine ID to ensure we don’t record duplicate submissions — if that makes you unhappy for some reason just choose some other 32 byte hex string. In the binary file created by nvme there is the encoded model number and serial number of your drive; if this makes you uneasy please don’t send the file.

Many thanks, and needless to say I’ll be posting some stats here when I’ve got enough submissions to be statistically valid.

GNOME Software and automatic updates

For GNOME 3.30 we’ve enabled something that people have been asking for since at least the birth of the gnome-software project: automatically installing updates.

This of course comes with some caveats. Since it’s still not safe to auto-update packages (trust me, I triaged the hundreds of bugs) we will restrict automatic updates to Flatpaks. Although we do automatically download things like firmware updates, ostree content, and package updates by default they’re deployed manually like before. I guess it’s important to say that the auto-update of Flatpaks is optional and can easily be turned off in the GUI, and that you’ll be notified when applications have been auto-updated and need restarting.

Another common complaint with gnome-software was that it didn’t show the same list of updates as command line tools like dnf. The internal refactoring required for auto-deploying updates also allows us to show updates that are available, but not yet downloaded. We’ll still try and auto-download them ahead of time if possible, but won’t hide them until that. This does mean that “new” updates could take some time to download in the updates panel before either the firmware update is performed or the offline update is scheduled.

This also means we can add some additional UI controlling whether updates should be downloaded and deployed automatically. This doesn’t override the existing logic regarding metered connections or available battery power, but does give the user some more control without resorting to using gsettings invocation on the command line.

Also notable for GNOME 3.30 is that we’ve switched to the new libflatpak transaction API, which both simplifies the flatpak plugin considerably, and it means we install the same runtimes and extensions as the flatpak CLI. This was another common source of frustration as anyone trying to install from a flatpakref with RuntimeRepo set will testify.

With these changes we’ve also bumped the plugin interface version, so if you have out-of-tree plugins they’ll need recompiling before they work again. After a little more polish, the new GNOME Software 2.29.90 will soon be available in Fedora Rawhide, and will thus be available in Fedora 29. If 3.30 is as popular as I think it might be, we might even backport gnome-software 3.30.1 into Fedora 28 like we did for 3.28 and Fedora 27 all those moons ago.

Comments welcome.

Please welcome Lenovo to the LVFS

I’d like to formally welcome Lenovo to the LVFS. For the last few months myself and Peter Jones have been working with partners of Lenovo and the ThinkPad, ThinkStation and ThinkCenter groups inside Lenovo to get automatic firmware updates working across a huge number of different models of hardware.

Obviously, this is a big deal. Tens of thousands of people are likely to be offered a firmware update in the next few weeks, and hundreds of thousands over the next few months. Understandably we’re not just flipping a switch and opening the floodgates, so if you’ve not seen anything appear in fwupdmgr update or in GNOME Software don’t panic. Over the next couple of weeks we’ll be moving a lot of different models from the various testing and embargoed remotes to the stable remote, and so the list of supported hardware will grow. That said, we’ll only be supporting UEFI hardware produced fairly recently, so there’s no point looking for updates on your beloved T61. I also can’t comment on what other Lenovo branded hardware is going to be supported in the future as I myself don’t know.

Bringing Lenovo to the LVFS has been a lot of work. It needed changes to the low level fwupdate library, fwupd, and even the LVFS admin portal itself for various vendor-defined reasons. We’ve been working in semi-secret for a long time, and I’m sure it’s been frustrating to all involved not being able to speak openly about the grand plan. I do think Lenovo should be applauded for the work done so far due to the enormity of the task, rather than chastised about coming to the party a little late. If anyone from HP is reading this, you’re now officially late.

We’re still debugging a few remaining issues, and also working on making the update metadata better quality, so please don’t judge Lenovo (or me!) too harshly if there are initial niggles with the update process. Updating the firmware is slightly odd in that it sometimes needs to reboot a few times with some scary-sounding beeps, and on some hardware the first UEFI update you do might look less than beautiful. If you want to do the firmware update on Lenovo hardware, you’ll have a lot more success with newer versions of fwupd and fwupdate, although we should do a fairly good job of not offering the update if it’s not going to work. All our testing has been done with a fully updated Fedora 28 workstation. It of course works with SecureBoot turned on, but if you’ve enabled the BootOrder lock manually you’ll need to turn that off first.

I’d like to personally thank all the Lenovo engineers and managers I’ve worked with over the last few months. All my time has been sponsored by Red Hat, and they rightfully deserve love too.

Affiliated Vendors on the LVFS

We’ve just about to deploy another feature to the LVFS that might be interesting to some of you. First, some nomenclature:

OEM: Original Equipment Manufacturer, the user-known company name on the outside of the device, e.g. Sony, Panasonic, etc
ODM: Original Device Manufacturer, typically making parts for one or more OEMs, e.g. Foxconn, Compal

There are some OEMs where the ODM is the entity responsible for uploading the firmware to the LVFS. The per-device QA is typically done by the OEM, rather than the ODM, although it can be both. Before today we didn’t have a good story about how to handle this other than having a “fake” oem_odm@oem.com useraccounts that were shared by all users at the ODM. The fake account isn’t of good design from a security or privacy point of view and so we needed something better.

The LVFS administrator can now mark other vendors as “affiliates” of other vendors. This gives the ODM permission to upload firmware that is “owned” by the OEM on the LVFS, and that appears in the OEM embargo metadata. The OEM QA team is also able to edit the update description, move the firmware to testing and stable (or delete it entirely) as required. The ODM vendor account also doesn’t have to appear in the search results or the vendor table, making it hidden to all users except OEMs.

This also means if an ODM like Foxconn builds firmware for two different OEMs, they also have to specify which vendor should “own” the firmware at upload time. This is achieved with a simple selection widget on the upload page, but will only be shown if affiliations have been set up. The ODM is able to manage their user accounts directly, either using local accounts with passwords, or ODM-specific OAuth which is the preferred choice as it means there is only one place to manage credentials.

If anyone needs more information, please just email me or leave a comment below. Thanks!

fwupdate is {nearly} dead; long live fwupd

If the title confuses you, you’re not the only one that’s been confused with the fwupdate and fwupd project names. The latter used the shared library of the former to schedule UEFI updates, with the former also providing the fwup.efi secure-boot signed binary that actually runs the capsule update for the latter.

In Fedora the only user of libfwupdate was fwupd and the fwupdate command line tool itself. It makes complete sense to absorb the redundant libfwupdate library interface into the uefi plugin in fwupd. Benefits I can see include:

  • fwupd and fwupdate are very similar names; a lot of ODMs and OEMs have been confused, especially the ones not so Linux savy.
  • fwupd already depends on efivar for other things, and so there are no additional deps in fwudp.
  • Removal of an artificial library interface, with all the soname and package-induced pain. No matter how small, maintaining any project is a significant use of resources.
  • The CI and translation hooks are already in place for fwupd, and we can use the merging of projects as a chance to write lots of low-level tests for all the various hooks into the system.
  • We don’t need to check for features or versions in fwupd, we can just develop the feature (e.g. the BGRT localised background image) all in one branch without #ifdefs everwhere.
  • We can do cleverer things whilst running as a daemon, for instance uploading the fwup.efi to the ESP as required rather than installing it as part of the distro package.
    • The last point is important; several distros don’t allow packages to install files on the ESP and this was blocking fwupdate being used by them. Also, 95% of the failures reported to the LVFS are from Arch Linux users who didn’t set up the ESP correctly as the wiki says. With this new code we can likely reduce the reported error rate by several orders of magnitude.

      Note, fwupd doesn’t actually obsolete fwupdate, as the latter might still be useful if you’re testing capsule updates on something super-embedded that doesn’t ship Glib or D-Bus. We do ship a D-Bus-less fwupdate-compatible command line in /usr/libexec/fwupd/fwupdate if you’re using the old CLI from a shell script. We’re all planning to work on the new integrated fwupd version, but I’m sure they’ll be some sharing of fixes between the projects as libfwupdate is shipped in a lot of LTS releases like RHEL 7.

      All of this new goodness is available in fwupd git master, which will be the new 1.1.0 release probably available next week. The 1_0_X branch (which depends on libfwupdate) will be maintained for a long time, and is probably the better choice to ship in LTS releases at the moment. Any distros that ship the new 1.1.x fwupd versions will need to ensure that the fwup.efi files are signed properly if they want SecureBoot to work; in most cases just copying over the commands from the fwupdate package is all that is required. I’ll be updating Fedora Rawhide with the new package as soon as it’s released.

      Comments welcome.

Updating Wacom Firmware In Linux

I’ve been working with Wacom engineers for a few months now, adding support for the custom update protocol used in various tablet devices they build. The new wacomhid plugin will be included in the soon-to-be released fwupd 1.0.8 and will allow you safely update the bluetooth, touch and main firmware of devices that support the HID protocol. Wacom is planning to release a new device that will be released with LVFS support out-of-the-box.

My retail device now has a 0.05″ SWI debugging header installed…

In other news, we now build both flatpak and snap versions of the standalone fwupdtool tool that can be used to update all kinds of hardware on distributions that won’t (or can’t) easily update the system version of fwupd. This lets you easily, for example, install the Logitech unifying security updates when running older versions of RHEL using flatpak and update the Dell Thunderbolt controller on Ubuntu 16.04 using snapd. Neither bundle installs the daemon or fwupdmgr by design, and both require running as root (and outside the sandbox) for obvious reasons. I’ll upload the flatpak to flathub when fwupd and all the deps have had stable releases. Until then, my flatpak bundle is available here.

Working with the Wacom engineers has been a pleasure, and the hardware is designed really well. The next graphics tablet you buy can now be 100% supported in Linux. More announcements soon.

System76 and the LVFS

Update 2021-11-12: System76 is now using the LVFS to distribute firmware. We’re working together to put much more firmware on the LVFS in the future.

System76 is a hardware vendor that builds laptops with the Pop_OS! Linux distribution pre-loaded. System76 machines do get firmware updates, but do not use the fwupd and LVFS shared infrastructure. I’m writing this blog post so I can point people at some static text rather than writing out long replies to each person that emails me wanting to know why they don’t just use the LVFS.

In April of last year, System76 contacted me, wanting to work out how to get on the LVFS. We wrote 30+ cordial emails back and forth with technical details. Discussions got stuck when we found out they currently use a nonfree firmware flash tool called afuefi rather than use the UEFI specification called UpdateCapsule. All vendors have support for capsule updates as a requirement for the Windows 10 compliance sticker, so it should be pretty easy to use this instead. Every major vendor of consumer laptops is already using capsules, e.g. Dell, HP, Lenovo and many others.

There was some resistance to not using the proprietary AUEFI executable to do the flashing. We certainly can’t include the non-free and non-redistributable afuefi as a binary in the .cab file uploaded to the LVFS, as even if System76 does have special permission to distribute it, as the LVFS would be a 3rd party and is mirrored to various places. IANAL and all that.

An employee of System76 wrote a userspace tool in rust to flash the embedded controller (EC) using a reverse engineered protocol (fwupd is written in C) and the intention was add a plugin to fwupd to do this. Peter Jones suggested that most vendors just include the EC update as part of the capsule as the EC and system firmware typically form a tightly-coupled pair. Peter also thought that afuefi is really just a wrapper for UpdateCapsule, and S76 was going to find out how to make the AMI BIOS just accept a capsule. Apparently they even built a capsule that works using UpdateCapsule.

I was really confused when things went so off-course with a surprise announcement in July that System76 had decided that they would not use the LVFS and fwupd afterall even after all the discussion and how it all looked like it was moving forwards. Looking at the code it seems the firmware update notifier and update process is now completely custom to System76 machines.

Apparently System76 decided that having their own client tools and firmware repository was a better fit for them. At this point the founder of System76 got cc’d and told me this wasn’t about politics, and it wasn’t competition. I then got told that I’d made the LVFS and fwupd more complicated than it needed to be, and that I should have adopted the infrastructure that System76 had built instead.

The way forward from my point of view would be for System76 to spend a few hours making UpdateCapsule work correctly, another few days to build an EFI binary with the EC update, and a few more hours to write the metadata for the LVFS. I don’t require an apology, and would happily create them a OEM account on the LVFS. I guess it might make sense for them to require Pop_OS! on their hardware but it’s not going to help when people buy System76 hardware and want to run Red Hat Enterprise Linux in a business. It also means System76 also gets to maintain all this security-sensitive server and client code themselves for eternity.

It was a hugely disappointing end to the discussion as I had high hopes System76 would do the right thing and work with other vendors on shared infrastructure. I don’t actually mind if System76 doesn’t use fwupd and the LVFS, I just don’t want people to buy new hardware and be disappointed. I’ve heard nothing more from System76 about uploading firmware to the LVFS or using fwupd since about November, and I’d given multiple people many chances to clarify the way forward.

The LVFS CDN will change soon

tl;dr: If you have https://s3.amazonaws.com/lvfsbucket/downloads/firmware.xml.gz in /etc/fwupd/remotes.d/lvfs.conf then you need to nag your distribution to update the fwupd package to 1.0.6.

Slightly longer version:

The current CDN (~$100/month) is kindly sponsored by Amazon, but that won’t last forever and the donations I get for the LVFS service don’t cover the cost of using S3. Long term we are switching to a ‘dumb’ provider (currently BunnyCDN) which works out 1/10th of the cost — which is covered by the existing kind donations. We’ve switched to use a new CNAME to make switching CDN providers easy in the future, which means this should be the only time this changes client side.

If you want to patch older versions of fwupd, you can apply this commit.

LVFS Mailing List

I have created a new low-volume lvfs-announce mailing list for the Linux Vendor Firmware Service, which will only be used to make announcements about new features and planned downtime. If you are interested in what’s happening with the LVFS you can subscribe here. If you need to contact me about anything LVFS-related, please continue to email me (not the mailing list) as normal.