How many Flathub apps reuse other package formats?

Today I read Comparison of Fedora Flatpaks and Flathub remotes by Hari Rana, who is an active and valued member of the Flatpak community. The article is a well-researched and well-written overview of how these two Flatpak ecosystems differ, and contains the following remark about one major difference (emphasis mine):

Flathub is open with what source a Flatpak application (re)uses, whereas Fedora Flatpaks strictly reuses the RPM format.

As such, Flathub has tons of applications that reuse other package formats.

When this article was discussed in the Flatpak Matrix channel, several people wondered whether “tons” is a fair assessment. Let’s find out!

The specific examples given in the article are of apps which reuse a .deb (to which I will add .rpm), AppImage, Snap package, or binary .tar.gz archive. It’s not so easy to distinguish a binary tarball from a source tarball, so as a substitute I will look for apps which use the extra-data to download external sources at install time rather than at build time.

I have cloned every repo from the Flathub GitHub organisation with this script I had lying around. There are 2,220 such repositories. This is a bigger number than the 1,518 apps cited in the blog post, because it includes many thing which are not apps, such as 258 GTK themes and 60 digital audio workstation plugins. I also believe that the 1,518 number does not include end-of-lifed apps, whereas my methodology does. This post will also ignore the existence of OBS Studio and Firefox, where those projects build the Flatpak from source on their own infrastructure and push the result into Flathub.

Now I’m just going to grep them all for the offending strings:

$ (for i in */
do
    if git -C $i grep --quiet -E '(\.(deb|rpm|AppImage|snap)\>)|(extra-data)'
    then
        echo $i
    fi
done) | wc -l
237

(Splitting apart the search terms, we have 141 repos matching .deb, 10 for .rpm, 23 for .AppImage, 6 for .snap, and 110 for extra-data. These numbers don’t sum to 237 because the same repo can use multiple formats, and these binary files are often used by extra-data apps.)

So by my back-of-an-envelope calculation, 237 out of 2220 repos on Flathub repackage other binary formats. This is a little under 11%. Of those 237, 51 are GTK themes, specifically variations of the Mint, Pop and Yaru themes. If we assume that all the other 186 are apps, and that none of them are EOLed, then 186 divided by 1,518 gives us a little more than 12% of apps on Flathub that are repackaged from other binary formats. (I believe this is a slight overestimate but I have run out of time this morning.)

Is that a big number? It’s roughly what I expected. Is it “ton[ne]s”? Compared to Fedora’s Flatpak repo, where everything is built from source, it certainly is: indeed, it’s more than the total number of apps in the Fedora Flatpak repo!

If it is valuable for Flathub to provide proprietary apps like Slack whose publishers do not currently wish to support Flatpak (which I believe it is) then it’s unavoidable that some apps repackage other binary formats. OK, time for one last bit of data: what if we exclude extra-data apps?

$ (for i in */
do
    if ! git -C $i grep --quiet extra-data && \
       git -C $i grep --quiet -E '\.(deb|rpm|AppImage|snap)\>'
    then
        echo $i
    fi
done )| wc -l
127

So (ignoring non-extra-data apps which use binary tarballs, if any such apps exist) that’s something like 76 apps and 51 GTK themes which probably could be built from source by Flathub, but aren’t. It may be hard to build some of these apps from source (perhaps the upstream build system requires network access) but the rewards would include support for aarch64 and any other architectures Flathub may add, and arguably greater transparency in how the app is built.

If you want to do your own research in this vein, you may be interested in gasinvein‘s Flatpak remote metadata fetcher, which would let you generate and analyse a 200 MiB JSON file rather than by cloning and grep-ing 4.6 GiB of Git repositories. His analysis using this data yields 174 apps, quite close to my 186 estimate above.

./flatpak-remote-metadata.py -u https://dl.flathub.org/repo flathub | \
    jq -r '.[] | select(
        .manifest | objects | .modules[] | recurse(.modules | arrays | .[]) |
        .sources | arrays | .[] | .url | strings | test(".*.(deb|rpm|snap|AppImage)$")
    ) | .metadata.Application.name // .metadata.Runtime.name' | \
    sort -u | wc -l