Application installing

In the Linux desktop we have a very big problem: We focus very much on packages. Packages are interesting to programmers, but users care about applications.

I’ll explain the difference:

  • Packages can contain zero, one or multiple applications. Packages have names like “openoffice-common”.
  • Each application belongs to exactly one package. Applications have names like “OpenOffice.org Writer”.

Now, when a user wants to install an application, they have to search Google for the package name (which differs between distros) or hope that the application name is mentioned in the package description in the distro metadata. Not ideal at all.

Now, I said as a desktop we have one big problem, well, we actually have two: We don’t all speak English well. Some of us speak no English at all.

We want to be able to display package descriptions to the user in all languages, but we don’t want to download 80x the metadata to do so. Most packaging systems only understand en_US anyway, and there certainly aren’t the resources to translate every spec file or emerge instruction for each distro. To add to the problems, each package needs an icon in various sizes, so we can show the application icon rather than a generic box.

So we’re sunk, right?

No. In each package, there are desktop files that contain all the applications, with nice translations gently massaged and QA’d by upstream. It would be nice if we could search on that data. At the moment, this is impossible, unless we want to download every package in the archive, and extract the data from it. This is sort of how Ubuntu does gnome-app-install, and it seems to work fairly well. It is Ubuntu specific, but maybe we could work on that.

So, we cache all the desktop files, and push this out to the repo metadata?

No. If you did that you’d make a lot of people very unhappy, as even when compressed the metadata and icons make up over 80MB.

So we’re sunk, right?

No. What we can do is create a sub-package for each repo (e.g. rpmfusion-appdata) which ships a tarball of icons and a few hundred KB of SQL. Every time the repo maintainer can be bothered (once a month?) the new data is generated, and a new package is pushed out to the mirrors. If the repo maintainer can’t be bothered to do that, then none of the new packages will show up in the application browser. It’s optional.

This data clogs up my system right?

Well, it’s only a few tens of MB if you want all the icons in most of the sizes; if you only choose the 48×48 option then it’s much less. When you install the new $repo-appdata subpackage, it removes all the stale applications and re-adds the latest data. This happens in the vendor spec files as postinst scripts.
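
The remove-then-re-add step can be sketched as a single transaction. This is only an illustration: the `applications` table, its columns and the `refresh_repo` helper are assumptions rather than anything taken from the spec, and a real postinst script may well do this differently.

```python
import sqlite3

def make_db():
    """Create an in-memory database with a hypothetical applications table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE applications ("
        "application_id TEXT, package_name TEXT, repo_id TEXT, name TEXT)"
    )
    return conn

def refresh_repo(conn, repo_id, fresh_rows):
    """Drop every row owned by repo_id, then add the latest data for it.

    fresh_rows is an iterable of (application_id, package_name, name) tuples.
    Rows that belong to other repos are left untouched.
    """
    with conn:  # commits on success, rolls back if anything raises
        conn.execute("DELETE FROM applications WHERE repo_id = ?", (repo_id,))
        conn.executemany(
            "INSERT INTO applications (application_id, package_name, repo_id, name) "
            "VALUES (?, ?, ?, ?)",
            [(app_id, pkg, repo_id, name) for app_id, pkg, name in fresh_rows],
        )
```

Running the delete and the inserts in one transaction means a failed update leaves the previous data in place instead of an empty application list.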

How do I query the data?

It’s a simple sqlite database in /var/lib/app-install/desktop.db, and the icons are located in /usr/share/app-install/icons/$size/*.png. There’s no GUI installer using this yet, but expect a few before too long.
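
A minimal sketch of what a query might look like from Python. The table layout, the column names and the sample rows (including the package name given for Writer) are all hypothetical; the real schema is whatever the spec defines. The sketch uses an in-memory database so it runs anywhere, where a real tool would open /var/lib/app-install/desktop.db.

```python
import sqlite3

# Hypothetical schema for illustration only; not taken from the spec.
SCHEMA = """
CREATE TABLE applications (
    application_id TEXT,  -- desktop file ID
    package_name   TEXT,  -- package that ships the desktop file
    locale         TEXT,  -- "C" for the untranslated strings
    name           TEXT,
    summary        TEXT
);
"""

# Made-up sample rows, not real repo data.
SAMPLE_ROWS = [
    ("openoffice.org-writer", "openoffice.org-writer", "C",
     "OpenOffice.org Writer", "Word processor"),
    ("gnome-system-monitor", "gnome-system-monitor", "C",
     "System Monitor", "View current processes"),
    ("gnome-system-monitor", "gnome-system-monitor", "fr",
     "Moniteur système", "Voir les processus en cours"),
]

def demo_connection():
    """Build an in-memory stand-in for the on-disk database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO applications VALUES (?, ?, ?, ?, ?)", SAMPLE_ROWS)
    return conn

def find_packages(conn, term, locale="C"):
    """Return (application name, package name) pairs matching term in one locale."""
    pattern = f"%{term}%"
    cursor = conn.execute(
        "SELECT DISTINCT name, package_name FROM applications "
        "WHERE locale = ? AND (name LIKE ? OR summary LIKE ?) "
        "ORDER BY name",
        (locale, pattern, pattern),
    )
    return cursor.fetchall()
```

With this layout, a French-speaking user searching for “moniteur” with locale "fr" gets back the package name gnome-system-monitor without ever having to know it.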

Great! Another Ubuntu vs. Red Hat standards war!

No. Roderick Greening, Sebastian Heinlein and I drafted the specification together, and made it generic enough for all the distros to use. It’s totally expected that each distro will code a different tool to extract the metadata, but that’s because they are different in some important ways.

So the maintainers have to install everything just to get the desktop files?!

No. You can download and install a package to a prefix without the deps. We don’t need the binary to run, we just need the data. In this way we don’t need to download -data subpackages, only the one with the desktop file in.
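
Once a package is unpacked into a prefix, pulling the translated names out of a desktop file is straightforward. The following is a rough sketch using Python's standard configparser; the `localized_names` helper and the sample file contents are illustrative assumptions, not part of any distro's extraction tool.

```python
import configparser
import re

# Made-up desktop file contents for illustration.
SAMPLE_DESKTOP_FILE = """\
[Desktop Entry]
Type=Application
Name=System Monitor
Name[fr]=Moniteur système
Exec=gnome-system-monitor
Icon=utilities-system-monitor
"""

def localized_names(desktop_text):
    """Map locale -> application name, using "C" for the untranslated Name key."""
    parser = configparser.ConfigParser(
        interpolation=None,      # values such as "Exec=foo %U" must stay literal
        delimiters=("=",),       # desktop files only use "=" between key and value
        comment_prefixes=("#",),
        strict=False,
    )
    parser.optionxform = str     # keep keys case-sensitive ("Name", not "name")
    parser.read_string(desktop_text)

    names = {}
    for key, value in parser["Desktop Entry"].items():
        if key == "Name":
            names["C"] = value
        else:
            match = re.fullmatch(r"Name\[([^\]]+)\]", key)
            if match:
                names[match.group(1)] = value
    return names
```

The same loop could collect Comment and Icon keys; only the name is shown here to keep the sketch short.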

Can I add some more features to the spec?

Yes, in a little while. We want to get version 1 of the spec finished, with it being used in a few distros. When we’re comfortable this works correctly, we’ll start working on version 2, and add stuff like popularity metrics and metadata about suggesting gnome-power-manager rather than kpowersave if you’re running GNOME. There are lots of things we need to add for this to work really well.

So, Comments?

Published by

hughsie

35 thoughts on “Application installing”

  1. “Now, when a user wants to install an application, they have to research on Google what the package name is called” & “We don’t all speak English well”

    I’m French and I have the same problem with bug reporting.
    If I want to report a bug against an application, I go to Help > About and look at the name of the application, for instance Moniteur Système.

    But if I want to file a bug report, I have to select a package. There is no way to know that Moniteur Système stands for gnome-system-monitor.

    So I run Alacarte, click on the Moniteur Système entry, select Properties and look at the command to get the name gnome-system-monitor.

    Maybe there would be a lot more bug reports if the package name appeared in the About box, close to the translated application name?

  2. Three quick comments:
    – Users don’t care about applications. Users have something to accomplish and are looking for an application that does it for them.
    – Debian already has translated package descriptions (I’m surprised about your claim that translations for descriptions are not available).
    – Enrico Zini’s debtags work probably is a far more interesting set of metadata for improving package selection than desktop files.
    Your efforts to help users find the packages for apps are cool, but your post sounds like it’s inventing something fairly primitive when more advanced stuff already exists. Maybe it’d be a good idea to get (even) more people involved before starting from scratch. The aptitude developer and debian-devel might be good addresses to try, to benefit from some of the thoughts Debian people have had on this.

  3. @Thomas:

    I know Debian can do debtags and other clever things, but unless Debian starts working with other distros on building a common solution, we’ll continue to work on this spec. It’s also ignoring the package vs. application argument, which I think is an important point to make.

  4. @jonathan:

    From a user view, almost exactly the same. The difference is how the data is stored, merged and removed in a cross distro way.

  5. Hello.

    I’m the author of command-not-found (for reference). I think I have some interesting points to add:

    First and most importantly:

    Creating an offline system is a _bad_idea_ for installing applications that ship .desktop icons. People will _download_ them from the internet anyway, so it’s right to assume they are online. At the very least the (sensible IMHO) default should be to search online and cache icons, with offline data installable as an option. This is important for getting suggestions done right (you can see search history and build from there, you can share this data with other distros). This limits the amount of insane data people need to update every time a new app pops up. This also allows you to get high-resolution icons for the applications that ship them (limiting to 48×48 is silly IMHO). This way you can also get ratings, comments etc. in a good way. If you ask me, look at the Apple App Store, the Linspire/Lindows marketplace and similar things for inspiration on how to do this.

    Second most important thing:

    USERS CARE FOR APPLICATIONS NOT TASKS
    Users understand branding. Really, that’s how the current world works. I know it’s nice to say that if you want to edit a text file we can offer the following 100s of programs, but that’s not what users want. When they hunt for applications they know the NAME of what they are looking for. Doing ‘task’ hunting right can be accomplished by different means (again, look at app stores, categories, etc.)

    Other technical stuff:
    * maintaining the data is a major pain in the ass; any good solution makes it possible for people to contribute in an easy way and _extend_ this data with their own repositories
    * data must be open so that a cross-distro solution can be built eventually
    * knowing which package to offer is difficult (packages that contain binary files/desktop files are not ideal, often you need to offer meta-package).
    * localization is really difficult to get right unless you are online

  6. This is also relevant to the problem of codenames vs. application names. Most people can’t infer the purpose of a program from its name, except in very individual cases where:

    – The program is very popular. Example: Firefox.
    – The program name is very clear. Example: OpenOffice. It’s a name which describes the purpose of the program.

    However, let’s talk about the general case, not the exceptions. Who in the world will know the purpose of “alacarte”, or “baobab”? Maybe we should use generic names for these programs in the desktop, until they reach some level of popularity (or get a better program name) by consensus.

    This is such a big problem that there even exist distros out there with the sole purpose of renaming programs (desktop files, you name it…).

    I wonder if you have plans also to address this in your spec.

    Regards,

    Andrés

  7. @hughsie: with all due respect, the Debian people have solved the problems they were facing, because that is what motivated them. If your motivation is to develop a cross-distro solution, you shouldn’t discount prior art from any particular camp just because it wasn’t in their interest at the time to solve this globally. Interesting that you take this attitude with debtags and package description translations but not with Ubuntu’s gnome-app-install work.

    I speak in complete generalities here because I can’t honestly see the applicability of debtags nor package translations to the problem you are facing (but I may not be looking at it right)

  8. @Zygmunt Krynicki: I’m sorry but you are wrong in your assumption that people will solely install packages whilst online. I’m sure the majority do, but by no means everyone, especially when we are talking globally rather than about a given distro.

  9. So which Icon theme will you use (oh and please default on 96×96 or 128×128)?

    If I find the icon in the installer program, install the program and then… I can’t find it… it doesn’t look like it did before, because I’m running a different theme?

  10. Why not choose scalable icons? They compress well and look better. At some point we have to move away from having many pixel variants for applications, can we start now?

    Are these appdata bundles likely to be an issue on slower storage systems like early netbook SSDs? If the appdata database could throw away any languages it doesn’t need after downloading, it would certainly help.

  11. If we still expect users not to alias ‘sudo apt-get install’ and ‘sudo apt-get remove’, we may have bigger problems than packages vs apps: often times they are the same thing.

    $ install empathy
    $ uninstall pidgin
    $ install epiphany
    $ uninstall firefox
    $ give root ~/.bashrc
    $ unmount thumb
    $ restart

  12. On the topic of the “offline systems are a bad idea” reply, do keep in mind that some people would prefer to install from offline sources (ex., DVDs) or local repositories if possible. Installations are not all online.

  13. The problem with downloading these icons “on the fly” is that downloading 80mb of data to show the user a list of applications available is not currently feasible. Caching is only going to help the second time you do this.

    So there has to be an offline solution somehow. My favoured one, would be to pre-cache the icons with the distro, and update/download new ones on the fly. This would probably work, especially since Linux distributions tend not to expand their repository much during the life cycle of a given version. New applications are generally added only when a new version comes out (I don’t like this, but that is a different question).

  14. How big is just the textual metadata? I’m guessing most of the 80MB is icons. If not, you could slice the metadata by language and have one package per language :-)

  15. Hang on a minute. You’re still thinking of how to distribute the needed data to every machine. But if I want to install an application, I need a network connection, right? What’s the point of talking about an offline solution? (Alternatively I might have all the packages on a DVD, and I can search the metadata there, but then I don’t have to worry about minimizing what is shipped).

    Why not just have a service, located at the distro (since packaging choices are distro-specific), where users can search . Everybody doesn’t have to download metadata for every single application, if it exists on the distro’s site. The query volume would be small compared to the cost of keeping mirrors updated.

  16. antistress: the “right” way to solve the bug reporting problem is probably to put a “Report a problem” menu item into the application (it already exists for some apps in Ubuntu, not sure about other distros).

    Andrés G. Aragoneses: many desktop files already contain “descriptive” names for applications, which are used by the Gnome Application menu already (like “Analyze disk usage” instead of “baobab”).

    Regarding the online/offline part, I think that it would be nice if users could at least _see_ which apps they could install if they were online, even if they are offline at the moment. OTOH, my netbook only has 4GB of SSD disk space, so installing several MBs of data which I seldom use is probably wasteful :-)

  17. “In each package, there are desktop files that contain all the applications, with nice translations gently massaged and QA’d by upstream.”

    About the nice translations part…

    I added the possibility to translate the .desktop file info in gnome-app-install[1], and the template contains about 4000 strings. After extracting existing translations from the files, I see the resulting po files contain about 500-1000 translated strings only – because of the “not sure if anybody uses it, but we have a package” projects of the Universe repo, aka “long tail of OSS”[2].

    So we probably want to set up a translation domain for these files and submit po files to some non-distribution-specific translation team.

    [1]: https://translations.launchpad.net/ubuntu/jaunty/+source/app-install-data-ubuntu/+pots/app-install-data
    [2]: http://linuxhaters.blogspot.com/2008/06/good-software-isnt-really-free.html

  18. Re the spec: “The interchange format is native SQL, rather than a compressed database.”

    Unfortunately, the extent to which SQL is actually standardized here isn’t as great as it might be, and I think you’ll have to be more careful about specifying which version of SQL, which dialect, etc. I feel coming up with a simple XML DTD would be a far better idea; solves this, plus the nasty issue of text encoding.

  19. @Jon

    I didn’t say an offline system should not be developed. I understand the value of having an offline system. But if you think about it: when you make an online system you can make it much more compelling for users. It’s about giving 80% of people _exactly_ what they want. The remaining 20% is either so special that creating a good solution for them is beyond cross-distro interest (too much effort for too little gain) or they are fine with yum/apt/whatever anyway. From my experience the most strange and ‘important’ use cases are made by a vocal minority with technical skills. I think it’s more important to design for the average user.

  20. I don’t think this design is very good. Moving 80 MiB of regularly updated data into a package won’t make mirrors any happier. On the contrary you’ve just created a major mirror choke-point: every month, when your metadata package is updated, and all the client PK instances decide they need to update to show pretty application lists, your mirrors will be hammered to death.

    What you need is to design a kind of metadata that caches (http) well and can be downloaded with a fine granularity (as in, check the user is actually interested in an app, has selected it in the UI and the associated data is older than X before trying to update the info about this app only, use partial cheap descriptions before going gfx, etc)

    And then you can take care of the initial population with a full-repo monolithic seed. But this seed does not need to be in a package, and it certainly should not be used after the initial seeding.

    It does not matter if the install is offline or online. Offline installs will need a full metadata copy anyway, regardless of its content.

  21. Joe Buck:

    A single service per distro is not going to cut it. There is and will continue to be software that 3rd party repos provide that a central “distro” service won’t be able to aggregate and list to users, for the same reasons the distro doesn’t include that software in the first place. Whether those reasons are political or legal doesn’t really matter. 3rd party repositories will exist, and a single distro service won’t be able to cover their software in a fair and equitable manner side-by-side with the “distro” collection.

    That being said, I’m not sure this optional information as a client-side installable package makes the most sense. I guess I’d have to understand how it is expected to be used in terms of user interface.

    An optional per-repository service that PK could interact with might make sense for the use case of network-enabled clients, in the same way that yum repositories can provide an optional MirrorManager service now: the MM service hands over a mirrorlist that the client caches for some period of time before asking for a new mirrorlist. An Appdata service could do the same sort of thing, except we are talking about a lot more data to pass over the wire for the application metadata, like icons.

    What you sort of want is a way to pre-seed the client information when a new repository is added (even if that repository is a media-based one), and then ask the repository service to pass only new information incrementally to clients as new applications become available. That would be closer to ideal, I think, instead of passing 10s of megabytes of mostly redundant information every time a group of applications is added, even if it’s just once a month. Delta rpms would help here, but I’m not sure that is as distro-agnostic as a per-repository service.

    -jef

  22. I think the whole idea of tying it to repositories is busted. Most users communicate applications to each other with links. Most users search with GOOGLE, not with some random installer application buried away in the menus.

    If a user wants Open Office (and he can’t find it on his desktop), he opens his web browser and types in Open Office. He then gets the Open Office website, not some online package repository interface (that is, he isn’t going to get Click ‘N’ Run or anything like that).

    Your entire approach needs to at the very least _include_ the Web, if not build primarily around it.

    What would be best is if projects can include a simple “INSTALL FOO” icon/link right on their website. This would ideally include some kind of unique ID that a browser plugin (or just a filetype handler) can open up and use to find the proper package to install in the distro (or even opt to install a new add-on repository hosted by the project if no package can be found).

    As for offline users, they are pretty much stuck installing packages from installation media or from local package repositories. Local repositories can just start running a simple search daemon (if the user decides he even wants it) and the package searching utility can be smart enough to ask for the install media if a net connection is not available.

    When approaching the problem, though, please find out how real users look for software. Don’t guess. Don’t ask your mega-geek friends how they do it or how they guess people do it. Don’t just ask your non-geek friends. Do what Ximian used to do: run real random polls (pull people off the street), use paper mockups to test ideas, etc.

    Software installation is THE biggest issue with the Linux desktop in my experience. When I used to have too much spare time and I installed Linux desktops for people, 99% of the support requests people called me with were things like “how do I install .”

    I know we’re all trying to push Free Software and that conventional wisdom is that distro repositories are The Way, but I really would give at least a _little_ thought to software that will not (or even cannot!) be in any repository. e.g., games. Even if we’re talking about a theoretical high-quality modern Free Software game, such a game could easily have 4GB of data to install! People want to install off a DVD, not download it off the Net. (And they definitely don’t want to have to redownload all 4GB for every little patch… or every little packaging tweak!)

    People get software by searching Google and clicking Download. When they aren’t doing that, they’re installing it off a disc. Only the geeks like us who’ve been trained to always go look inside the “proprietary” (to each distro) repository are going to actually look there first (or second, or third).

  23. @Nicolas:

    About the mirror choke point. I’m not expecting to update this more frequently than every 6 months after a stable distro release like F10. For distros like rawhide it’ll be more often, but that’s not unusual.

    Also, the 80MB was calculated with 4 repos installed, as the actual data in each repo is only about 20MB in size.

  24. @Dave Malcolm:

    I wanted to avoid wrapping this in XML, as this has to be quick to insert into a database. You are correct that I’ll have to define the encoding and so on, but that should be pretty simple.

  25. Jef Spaleta:
    Yes. Fedora seems to be a bit of an exception, but I still only see a 20% increase in the number of packages added over 1.5 years for Fedora 9. Devel is obviously a different beast, but this is not an end user version.

    I see hughsie updated his numbers to 20Mb for the icons. Downloading this on the fly will take approximately 5 minutes on a fairly common 0.5 Mbit connection or easily an hour on dial-up. Not really feasible imo, even if you do something clever about the order in which you download these icons. This needs to appear instant to users.

    If the initial 20Mb of icons are pre-cached in a package (possible to remove if you don’t want them), then updating them every time there is a new package online is not a problem. Even in the most extreme case, where you install Fedora 9 now and have 1.5 years of package additions to update, it will still only be about 4Mb. This can download in the background while you start the service, filling in the blanks as you go along. It should only take 1 minute on a 0.5 Mbit connection, or not much more than 10 minutes on dialup.
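
As a rough check on the download times quoted in this thread, the arithmetic can be written out. This sketch assumes that “Mb” here means megabytes, that dial-up means a nominal 56 kbit/s, and that protocol overhead is ignored; none of those assumptions are stated in the comments themselves.

```python
def transfer_seconds(size_megabytes, link_kbit_per_s):
    """Ideal transfer time: megabytes -> kilobits, divided by the link speed."""
    return size_megabytes * 8 * 1000 / link_kbit_per_s

BROADBAND_KBIT = 500  # the "0.5 Mbit" connection mentioned above
DIALUP_KBIT = 56      # assumed nominal dial-up speed

for label, size_mb in (("full icon set", 20), ("1.5 years of additions", 4)):
    for link, speed in (("0.5 Mbit", BROADBAND_KBIT), ("dial-up", DIALUP_KBIT)):
        minutes = transfer_seconds(size_mb, speed) / 60
        print(f"{label} ({size_mb} MB) over {link}: {minutes:.1f} min")
```

Under these assumptions the full 20 MB takes a little over 5 minutes on the 0.5 Mbit link and roughly 48 minutes on dial-up, while the 4 MB update takes about a minute and a bit under 10 minutes respectively.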

  26. Looks less than useful from an always-in-development distro point of view. There’s just no way we’ll find the manpower to install thousands of packages every month, and we certainly don’t want to force people to re-download packages with a size close to openoffice.

    As others mentioned, create an online service with a well-defined API to both download and upload (merge) distro-specific data. As for the offline clients (we won’t really care about these), make it possible for devs to download an SQLite 3 snapshot with packages just for that single distro. Icons are not important – if there’s no network link you have bigger concerns than missing OpenOffice icons in the installer.

  27. @Zygmunt Krynicki: OK so I argue that not all users will be online, you argue that the majority will be… should such arguments perhaps be deferred until some statistics are compiled? We’re both making assumptions and could be completely wrong.
