3000 Reviews on the ODRS

The Open Desktop Ratings service is a simple Flask web service that various software centers use to retrieve and submit application reviews. Today it processed the 3000th review, and I thought I should mark this occasion here. I wanted to give a huge thanks to all the people who have submitted reviews; you have made life easier for people unfamiliar with installing software. There are reviews in over a hundred different languages and over 600 different applications have been reviewed.

Over 4000 people have clicked the “was this useful to you” buttons in the reviews, which affect how the reviews are ordered for a particular system. Without people clicking those buttons we don’t really know how useful a review is. Since we started this project, 37 reviews have been reported for abuse, of which 15 have been deleted for things like swearing and racism.

Here are some interesting graphs, first, showing the number of requests we’re handling per month. As you can see we’re handling nearly a million requests a month.

The second show the number of people contributing reviews. At about 350 per month this is a tiny fraction compared to the people requesting reviews, but this is to be expected.

The third shows where reviews come from; the notable absence is Ubuntu, but they use their own review system rather than the ODRS. Recently Debian has been increasing the fastest, I assume because at last they ship a gnome-software package new enough to support user reviews, but the reviews are still coming in fastest from Fedora users. Maybe Fedora users are the kindest in the open source community? Maybe we just shipped the gnome-software package first? :)

One notable thing missing from the ODRS is a community of people moderating reviews; at the moment it’s just me deciding which reviews are indeed abuse, and also fixing up common spelling errors in the submitted text. If this is something you would like to help with, please let me know and I can spend a bit of time adding a user type somewhere in-between benevolent dictator (me) and unauthenticated. Ideas welcome.

Reverse engineering ComputerHardwareIds.exe with winedbg

In an ideal world vendors could use the same GUID value for hardware matching in Windows and Linux firmware. When installing firmware and drivers in Windows vendors can always use some generated HardwareID GUIDs that match useful things like the BIOS vendor and the product SKU. It would make sense to use the same scheme as Microsoft. There are a few issues in an otherwise simple plan.

The first, solved with a simple kernel patch I wrote, exposes a few more SMBIOS fields into /sys/class/dmi/id that are required for the GUID calculation.

The second problem is a little more tricky. We don’t actually know how Microsoft joins the strings, what encoding is used, or more importantly the secret namespace UUID used to seed the GUID. The only thing we have got is the closed source ComputerHardwareIds.exe program in the Windows DDK. This, luckily, runs in Wine although Wine isn’t able to get the system firmware data itself. This can be worked around, and actually makes testing easier.

So, some research. All we know from the MSDN page is that Each hardware ID string is converted into a GUID by using the SHA-1 hashing algorithm which actually tells us quite a bit. Generating a GUID from a SHA-1 hash means this has to be a type 5 UUID.

The reference code for a type-5 UUID is helpfully available in the IETF RFC document so it’s quite quick to get started with research. From a few minutes of searching online, the most likely symbols the program will be using are the BCrypt* set of functions. From the RFC code, we call the checksum generation update function with first the encoded namespace (aha!) and then the encoded joined string (ahaha!). For Win32 programs, BCryptHashData is the function we want to trace.

So, to check:

wine /home/hughsie/ComputerHardwareIds.exe /mfg "To be filled by O.E.M."

…matches the reference HardwareID-14 output from Microsoft. So onto debugging, using +relay shows all the calling values and return values from each Win32 exported symbol:

WINEDEBUG=+relay winedbg --gdb ~/ComputerHardwareIds.exe
Wine-gdb> b BCryptHashData
Wine-gdb> r ~/ComputerHardwareIds.exe /mfg "To be filled by O.E.M." /family "To be filled by O.E.M."
005b:Call bcrypt.BCryptHashData(0011bab8,0033fcf4,00000010,00000000) ret=0100699d
Breakpoint 1, 0x7ffd85f8 in BCryptHashData () from /lib/wine/bcrypt.dll.so
Wine-gdb> 

Great, so this is the secret namespace. The first parameter is the context, the second is the data address, the third is the length (0x10 as a length is indeed SHA-1) and the forth is the flags — so lets print out the data so we can see what it is:

Wine-gdb> x/16xb 0x0033fcf4
0x33fcf4:	0x70	0xff	0xd8	0x12	0x4c	0x7f	0x4c	0x7d
0x33fcfc:	0x00	0x00	0x00	0x00	0x00	0x00	0x00	0x00

Using either the uuid in python, or uuid_unparse in libuuid, we can format the namespace to 70ffd812-4c7f-4c7d-0000-000000000000 — now this doesn’t look like a randomly generated UUID to me! Onto the next thing, the encoding and joining policy:

Wine-gdb> c
005f:Call bcrypt.BCryptHashData(0011bb90,00341458,0000005a,00000000) ret=010069b3
Breakpoint 1, 0x7ffd85f8 in BCryptHashData () from /lib/wine/bcrypt.dll.so
Wine-gdb> x/90xb 0x00341458
0x341458:	0x54	0x00	0x6f	0x00	0x20	0x00	0x62	0x00
0x341460:	0x65	0x00	0x20	0x00	0x66	0x00	0x69	0x00
0x341468:	0x6c	0x00	0x6c	0x00	0x65	0x00	0x64	0x00
0x341470:	0x20	0x00	0x62	0x00	0x79	0x00	0x20	0x00
0x341478:	0x4f	0x00	0x2e	0x00	0x45	0x00	0x2e	0x00
0x341480:	0x4d	0x00	0x2e	0x00	0x26	0x00	0x54	0x00
0x341488:	0x6f	0x00	0x20	0x00	0x62	0x00	0x65	0x00
0x341490:	0x20	0x00	0x66	0x00	0x69	0x00	0x6c	0x00
0x341498:	0x6c	0x00	0x65	0x00	0x64	0x00	0x20	0x00
0x3414a0:	0x62	0x00	0x79	0x00	0x20	0x00	0x4f	0x00
0x3414a8:	0x2e	0x00	0x45	0x00	0x2e	0x00	0x4d	0x00
0x3414b0:	0x2e	0x00
Wine-gdb> q

So there we go. The encoding looks like UTF-16 (as expected, much of the Windows API is this way) and the joining character seems to be &. An extra nugget of information found through bitter experience is that the string has to be stripped of leading and trailing spaces.

I’ve written some code in fwupd so that this happens:

$ fwupdmgr hwids # use /usr/libexec/fwupd/fwupdtool hwids for fwupd > 1.1.1
Computer Information
--------------------
BiosVendor: LENOVO
BiosVersion: GJET75WW (2.25 )
Manufacturer: LENOVO
Family: ThinkPad T440s
ProductName: 20ARS19C0C
ProductSku: LENOVO_MT_20AR_BU_Think_FM_ThinkPad T440s
EnclosureKind: 10
BaseboardManufacturer: LENOVO
BaseboardProduct: 20ARS19C0C

Hardware IDs
------------
{c4159f74-3d2c-526f-b6d1-fe24a2fbc881}   <- Manufacturer + Family + ProductName + ProductSku + BiosVendor + BiosVersion + BiosMajorRelease + BiosMinorRelease
{ff66cb74-5f5d-5669-875a-8a8f97be22c1}   <- Manufacturer + Family + ProductName + BiosVendor + BiosVersion + BiosMajorRelease + BiosMinorRelease
{2e4dad4e-27a0-5de0-8e92-f395fc3fa5ba}   <- Manufacturer + ProductName + BiosVendor + BiosVersion + BiosMajorRelease + BiosMinorRelease
{3faec92a-3ae3-5744-be88-495e90a7d541}   <- Manufacturer + Family + ProductName + ProductSku + BaseboardManufacturer + BaseboardProduct
{660ccba8-1b78-5a33-80e6-9fb8354ee873}   <- Manufacturer + Family + ProductName + ProductSku
{8dc9b7c5-f5d5-5850-9ab3-bd6f0549d814}   <- Manufacturer + Family + ProductName
{178cd22d-ad9f-562d-ae0a-34009822cdbe}   <- Manufacturer + ProductSku + BaseboardManufacturer + BaseboardProduct
{da1da9b6-62f5-5f22-8aaa-14db7eeda2a4}   <- Manufacturer + ProductSku
{059eb22d-6dc7-59af-abd3-94bbe017f67c}   <- Manufacturer + ProductName + BaseboardManufacturer + BaseboardProduct
{0cf8618d-9eff-537c-9f35-46861406eb9c}   <- Manufacturer + ProductName
{f4275c1f-6130-5191-845c-3426247eb6a1}   <- Manufacturer + Family + BaseboardManufacturer + BaseboardProduct
{db73af4c-4612-50f7-b8a7-787cf4871847}   <- Manufacturer + Family
{5e820764-888e-529d-a6f9-dfd12bacb160}   <- Manufacturer + EnclosureKind
{f8e1de5f-b68c-5f52-9d1a-f1ba52f1f773}   <- Manufacturer + BaseboardManufacturer + BaseboardProduct
{6de5d951-d755-576b-bd09-c5cf66b27234}   <- Manufacturer

Which basically matches the output of ComputerHardwareIds.exe on the same hardware. I’ve merged the fwupd branch to master and vendors are already using the Microsoft HardwareID GUID values.

New fwupd release, and why you should buy a Dell

This morning I released the first new release of fwupd on the 0.8.x branch. This has a number of interesting fixes, but more importantly adds the following new features:

  • Adds support for Intel Thunderbolt devices
  • Adds support for some Logitech Unifying devices
  • Adds support for Synaptics MST cascaded hubs
  • Adds support for the Altus-Metrum ChaosKey device
  • Adds Dell-specific functionality to allow other plugins turn on TBT/GPIO

Mario Limonciello from Dell has worked really hard on this release, and I can say with conviction: If you want to support a hardware company that cares about Linux — buy a Dell. They seem to be driving the importance of Linux support into their partners and suppliers. I wish other vendors would do the same.

Open Desktop Review System : One Year Review

This weekend we had the 2,000th review submitted to the ODRS review system. Every month we’re getting an additional ~300 reviews and about 500,000 requests for reviews from the system. The reviews that have been contributed are in 94 languages, and from 1387 different users.

Most reviews have come from Fedora (which installs GNOME Software as part of the default workstation) but other distros like Debian and Arch are catching up all the time. I’d still welcome KDE software center clients like Discover and Apper using the ODRS although we do have quite a lot of KDE software reviews submitted using GNOME Software.

Out of ~2000 reviews just 23 have been marked as inappropriate, of which I agreed with 7 (inappropriate is supposed to be swearing or abuse, not just being unhelpful) and those 7 were deleted. The mean time between a review being posted that is actually abuse and it being marked as such (or me noticing it in the admin panel) is just over 8 hours, which is certainly good enough. In the last few months 5523 people have clicked the “upvote” button on a review, and 1474 people clicked the “downvote” button on a review. Although that’s less voting that I hoped for, that’s certainly enough to give good quality sorting of reviews to end users in most locales. If you have a couple of hours on your hands, gnome-software --mode=moderate is a great way to upvote/downvote a lot of reviews in your locale.

So, onward to 3,000 reviews. Many thanks to those who submitted reviews already — you’re helping new users who don’t know what software they should install.

Last chance for ColorHug(1) users to get upgraded

For the early adopters of the original ColorHug I’ve been offering a service where I send all the newer parts out to people so they can retrofit their device to the latest design. This included an updated LiveCD, the large velcro elasticated strap and the custom cut foam pad that replaced the old foam feet. In the last two years I’ve sent out over 300 free upgrades, but this has reduced to a dribble recently as later ColorHug1’s and all ColorHug2 had all the improvements and extra bits included by default. I’m going to stop this offer soon as I need to make things simpler so I can introduce a new thing (+? :) next year. If you do need a HugStrap and gasket still, please fill in the form before the 4th January. Thanks, and Merry Christmas to all.

Logitech Unifying Hardware Required

Does anyone have a spare Logitech Unifying dongle I can borrow? I specifically need the newer Texas Instruments version, rather than the older Nordic version.

You can tell if it’s the version I need by looking at the etching on the metal USB plug, if it says U0008 above the CE marking then it’s the one I’m looking for. I’m based in London, UK if that matters. Thanks!

Linux communities, we need your help!

There are a lot of Linux communities all over the globe filled with really nice people who just want to help others. Typically these people either can’t (or don’t feel comfortable) coding, and I’d love to harness some of that potential by adding a huge number of new application reviews to the ODRS. At the moment we have about 1100 reviews, mostly covering the more popular applications, and also mostly written in English.

What I would love is for a few groups of people to come together for their next LUG/outreach/InstallFest and sit down together somewhere cozy and write a few reviews. Bonus points if you use a less-well-known application, and even more points if you can write in a language other than English. Submitting a review is easy; just open up GNOME Software, find the application, and click ‘Write a Review‘ at the bottom of the page.

Application reviews help new users what to install, and the star ratings you give means we can return useful search results full of great applications. Please write an email, ask about helping the ODRS, and perhaps you can help a lot of new users next time you meet with your Linuxy friends.

Thanks!

Last batch of ColorHugALS

I’ve got 9 more ColorHugALS devices in stock and then when they are sold they will be no more for sale. With all the supplier costs going recently up my “sell at cost price” has turned into “make a small loss on each one” which isn’t sustainable. It’s all OpenHardware, both hardware design and the firmware itself so if someone wanted to start building them for sale they would be doing it with my blessing. Of course, I’m happy to continue supporting the existing sold devices into the distant future.

colorhug-als1-large

In part the original goal is fixed, the kernel and userspace support for the new SensorHID protocol works great and ambient light functionality works out of the box for more people on more hardware. I’m slightly disappointed more people didn’t get involved in making the ambient lighting algorithms more smart, but I guess it’s quite a niche area of development.

Plus, in the Apple product development sense, killing off one device lets me start selling something else OpenHardware in the future. :)

Searching in GNOME Software

I’ve spent a few days profiling GNOME Software on ARM, mostly for curiosity but also to help our friends at Endless. I’ve merged a few patches that make the existing --profile code more useful to profile start up speed. Already there have been some big gains, over 200ms of startup time and 12Mb of RSS, but there’s plenty more that we want to fix to make GNOME Software run really nicely on resource constrained devices.

One of the biggest delays is constructing the search token cache at startup. This is where we look at all the fields of the .desktop files, the AppData files and the AppStream files and split them in a UTF8-sane way into search tokens, adding them into a big hash table after stemming them. We do it with 4 threads by default as it’s trivially parallelizable. With the search cache, when we search we just ask all the applications in the store “do you have this search term” and if so it gets added to the search results and ordered according to how good the match is. This takes 225ms on my super-fast Intel laptop (and much longer on ARM), and this happens automatically the very first time you search for anything in GNOME Software.

At the moment we add (for each locale, including fallbacks) the package name, the app ID, the app name, app single line description, the app keywords and the application long description. The latter is the multi-paragraph long description that’s typically prose. We use 90% of the time spent loading the token cache just splitting and adding the words in the description. As the description is prose, we have to ignore quite a few words e.g. “and”, “the”, “is” and “can” are some of the most frequent, useless words. Just the nature of the text itself (long non-technical prose) it doesn’t actually add many useful keywords to the search cache, and the ones that is does add are treated with such low priority other more important matches are ordered before them.

My proposal: continue to consume everything else for the search cache, and drop using the description. This means we start way quicker, use less memory, but it does require upstream actually adds some [localized] Keywords=foo;bar;baz in either the desktop file or <keywords> in the AppData file. At the moment most do, especially after I sent ~160 emails to the maintainers that didn’t have any defined keywords in the Fedora 25 Alpha, so I think it’s fairly safe at this point. Comments?