Searching in GNOME Software

I’ve spent a few days profiling GNOME Software on ARM, mostly for curiosity but also to help our friends at Endless. I’ve merged a few patches that make the existing --profile code more useful to profile start up speed. Already there have been some big gains, over 200ms of startup time and 12Mb of RSS, but there’s plenty more that we want to fix to make GNOME Software run really nicely on resource constrained devices.

One of the biggest delays is constructing the search token cache at startup. This is where we look at all the fields of the .desktop files, the AppData files and the AppStream files and split them in a UTF8-sane way into search tokens, adding them into a big hash table after stemming them. We do it with 4 threads by default as it’s trivially parallelizable. With the search cache, when we search we just ask all the applications in the store “do you have this search term” and if so it gets added to the search results and ordered according to how good the match is. This takes 225ms on my super-fast Intel laptop (and much longer on ARM), and this happens automatically the very first time you search for anything in GNOME Software.

At the moment we add (for each locale, including fallbacks) the package name, the app ID, the app name, app single line description, the app keywords and the application long description. The latter is the multi-paragraph long description that’s typically prose. We use 90% of the time spent loading the token cache just splitting and adding the words in the description. As the description is prose, we have to ignore quite a few words e.g. “and”, “the”, “is” and “can” are some of the most frequent, useless words. Just the nature of the text itself (long non-technical prose) it doesn’t actually add many useful keywords to the search cache, and the ones that is does add are treated with such low priority other more important matches are ordered before them.

My proposal: continue to consume everything else for the search cache, and drop using the description. This means we start way quicker, use less memory, but it does require upstream actually adds some [localized] Keywords=foo;bar;baz in either the desktop file or <keywords> in the AppData file. At the moment most do, especially after I sent ~160 emails to the maintainers that didn’t have any defined keywords in the Fedora 25 Alpha, so I think it’s fairly safe at this point. Comments?

Comments about OARS and CSM age ratings

I’ve had quite a few comments from people stating that using age rating classification values based on American culture is wrong. So far I’ve been using the Common Sense Media research (and various other psychology textbooks) to essentially clean-room implement a content-rating to appropriate age algorithm.

Whilst I do agree that other cultures have different sensitivities (e.g. Smoking in Uganda, references to Nazis in Germany) there doesn’t appear to be much research on the suggested age ratings for different categories for those specific countries. Lots of things are outright banned for sale for various reasons (which the populous may completely ignore), but there doesn’t seem to be many statistics that back up the various anecdotal statements. For instance, are there any US-specific guidelines that say that the age rating for playing a game that involves taking illegal drugs should be 18, rather than the 14 which is inferred from CSM? Or the age rating should be 25+ for any game that features drinking alcohol in Saudi Arabia?

Suggestions (especially references) welcome. Thanks!

GNOME Software and Age Ratings

After all the tarballs for GNOME 3.22 the master branch of gnome-software is now open to new features. Along with the usual cleanups and speedups one new feature I’ve been working on is finally merging the age ratings work.

screenshot-from-2016-09-21-10-22-36

The age ratings are provided by the upstream-supplied OARS metadata in the AppData file (which can be generated easily online) and then an age classification is generated automatically using the advice from the appropriately-named Common Sense Media group. At the moment I’m not doing any country-specific mapping, although something like this will be required to show appropriate ratings when handling topics like alcohol and drugs.

At the moment the only applications with ratings in Fedora 26 will be Steam games, but I’ve also emailed any maintainer that includes an <update_contact> email address in the appdata file that also identifies as a game in the desktop categories. If you ship an application with an AppData and you think you should have an age rating please use the generator and add the extra few lines to your AppData file. At the moment there’s no requirement for the extra data, although that might be something we introduce just for games in the future.

I don’t think many other applications will need the extra application metadata, but if you know of any adult only applications (e.g. in Fedora there’s an application for the sole purpose of downloading p0rn) please let me know and I’ll contact the maintainer and ask what they think about the idea. Comments, as always, welcome. Thanks!

LVFS and ODRS are down

The LVFS firmware server and ODRS reviews server are down because my credit card registered with OpenShift expired. I’ve updated my credit card details, paid the pending invoice and still can’t start any server. I rang customer service who asked me to send an email and have heard nothing back.

screenshot-from-2016-09-14-17-34-59

I have backups a few days old, but this whole situation is terrible on so many levels.

EDIT: cdaley has got everything back working again, it appears I found a corner case in the code that deals with payments.

Fedora 25 and Additional Software Sources

I was asked to produce a checklist for applications that we want to show up in GNOME Software in Fedora 25. In this post I’ll refer to applications as graphical programs, rather than other system add-on components like drivers and codecs (which the next post will talk about). There is a big checklist, which really is the bare minimum that the distributor has to provide so that the application is listed correctly. If any of these points is causing problems or is confusing, please let me know and I’ll do my best to help.

So, these things really have to be done:

  • Verify that you ship a .desktop file for each built application, and that these keys exist: Name, Comment, Icon, Categories, Keywords and Exec and that desktop-file-validate correctly validates the file.
  • Verify that there is a PNG (with transparent background) or SVG icon is installed in /usr/share/icons, /usr/share/icons/hicolor/*/apps/*, or /usr/share/${app_name}/icons/* and is at least 64×64 in size.
  • At least one valid AppData file with the suffix .appdata.xml file must be installed into /usr/share/appdata with an <id> that matches the name of the .desktop file, e.g. gimp.appdata.xml. Ideally the name of both the desktop file and appdata should be reverse DNS, e.g. com.hughski.ColorHug.desktop rather than colorhug-client.desktop although this isn’t critically important.
  • Include several 16:9 aspect screenshots in the AppData file along with a compelling translated description made up of multiple paragraphs. Make sure you follow the style guide, which can be tested using appstream-util validate foo.appdata.xml
  • Make sure that there are not two applications installed with one package; in this case split up the package so that there are multiple subpackages or mark one of the .desktop files as NoDisplay=true. Make sure the application-subpackages depend on any -common subpackage and deal with upgrades (perhaps using a metapackage) if you’ve shipped the application before.
  • Make sure your application is visible in the example.xml.gz file when running appstream-builder on the binary rpm(s).
  • Make sure the AppStream metadata is regenerated when the application is updated in the repo, for more details see an entire blog post on this
  • Ensure that enabled_metadata=1 is set in the .repo file. This means that PackageKit will automatically download just the application metadata even when the repository is disabled.

Getting S3 Statistics using S3stat

I’ve been using Amazon S3 as a CDN for the LVFS metadata for a few weeks now. It’s been working really well and we’ve shifted a huge number of files in that time already. One thing that made me very anxious was the bill that I was going to get sent by Amazon, as it’s kinda hard to work out the total when you’re serving cough millions of small files rather than a few large files to a few people. I also needed to keep track of which files were being downloaded for various reasons and the Amazon tools make this needlessly tricky.

I signed up for the free trial of S3stat and so far I’ve been pleasantly surprised. It seems to do a really good job of graphing the spend per day and also allowing me to drill down into any areas that need attention, e.g. looking at the list of 404 codes various people are causing. It was fairly easy to set up, although did take a couple of days to start processing logs (which is all explained in the set up). Amazon really should be providing something similar.

Screenshot from 2016-08-24 11-29-51

For people providing less than 200,000 hits per day it’s only $10, which seems pretty reasonable. For my use case (bazillions of small files) it rises to a little-harder-to-justify $50/month.

I can’t justify the $50/month for the LVFS, but luckily for me they have a Cheap Bastard Plan (their words, not mine!) which swaps a bit of advertising for a free unlimited license. Sounds like a fair swap, and means it’s available for a lot of projects where $600/yr is better spent elsewhere.

Devo Firmware Updating

Does anybody have a Devo RC transmitter I can borrow for a few weeks? I need model 6, 6S, 7E, 8, 8S, 10, 12, 12S, F7 or F12E — it doesn’t actually have to work, I just need the firmware upload feature for testing various things. Please reshare/repost if you’re in any UK RC groups that could help. Thanks!

Updating Firmware on 8Bitdo Game Controllers

I’ve spent a few days adding support for upgrading the firmware of the various wireless 8Bitdo controllers into fwupd. In my opinion, the 8Bitdo hardware is very well made and reasonably priced, and also really good retro fun.

Although they use a custom file format for firmware, and also use a custom flashing protocol (seriously hardware people, just use DFU!) it was quite straightforward to integrate into fwupd. I’ve created a few things to make this all work:

  • a small libebitdo library in fwupd
  • a small ebitdo-tool binary that talks to the device and can flash a vendor supplied .dat file
  • a ebitdo fwupd provider that uses libebitdo to flash the device
  • a firmware repo that contains all the extra metadata for the LVFS

I guess I need to thank the guys at 8Bitdo; after asking a huge number of questions they open sourced their OS-X and Windows flashing tools, and also allowed me to distribute the firmware binary on the LVFS. Doing both of those things made it easy to support the hardware.

Screenshot from 2016-08-18 10-36-56

The result of all this is that you can now do fwupd update when the game-pad is plugged in using the USB cable (not just connected via bluetooth) and the firmware will be updated to the latest version. Updates will show in GNOME Software, and the world is one step being closer to being awesome.

LVFS has a new CDN

Now that we’re hitting cough Cough COUGH1 million users a month the LVFS is getting slower and slower. It’s really just a flask app that’s handling the admin panel and then apache is serving a set of small files to a lot of people. As switching to a HA server is taking longer than I hoped2, I’m in the process of switching to using S3 as a CDN to take the load off. I’ve pushed a commit that changes the default in the fwupd.conf file. If you want to help test this, you can do a substitution of secure-lvfs.rhcloud.com to s3.amazonaws.com/lvfsbucket in /etc/fwupd.conf although the old CDN will be running for a long time indeed for compatibility.

  1. Various vendors have sworn me to secrecy
  2. I can’t believe GPGME and python-gpg is the best we have…

Adding suggestions to AppData files

An oft-requested feature is to show suggestions for other apps to install. This is useful if the apps are part of a larger suite of application, or if the apps are in some way complimentary to each other. A good example might be that we want to recommend libreoffice-writer when the user is looking at the details (or perhaps had just installed) libreoffice-calc.

At the moment we’ve got got any UI using this kind of data, as simply put, there isn’t much data to use. Using the ODRS I can kinda correlate things that the same people look at (i.e. user A got review for B and C, so B+C are possibly related) but it’s not as good as actual upstream information.

Those familiar with my history will be unsuprised: AppData to the rescue! By adding lines like this in the foo.appdata.xml file you can provide some information to the software center:

<suggests>
<id>libreoffice-draw.desktop</id>
<id>libreoffice-calc.desktop</id>
</suggests>

You don’t have to specify the parent app (e.g. libreoffice-writer.desktop in this case) and is the only tag that’s accepted. If isn’t found in the AppStream metadata then it’s just ignored, so it’s quite safe to add things that might not be in stable distros.

If enough upstreams do this then we can look at what UI makes sense. If you make use of this feature, please let me know and I can make sure we discuss the use-case in the design discussions.