On Flatpak disk usage and deduplication

There is a blog post doing the rounds asserting that Flatpak Is Not The Future. The post is really long, and it seems unlikely that I and the author will ever agree on this topic, so I’m only going to talk about a couple of paragraphs about disk usage and sharing of runtimes between apps which caught my eye. This is highly relevant to my day job because all apps on Endless OS are Flatpaks—for example, the English downloadable version has 58 Flatpak apps pre-installed, and 13 runtimes—and I’ve had and answered some of the same questions discussed in the post.

They claim that they deduplicate runtimes. I question how much can really be shared between different branches when everything is recompiled.

This question is really easy to answer using du, which does not double-count files which are hardlinked together. Let’s compare the 20.08 and 21.08 versions of the freedesktop runtime:

wjt@camille:~$ du -sh /var/lib/flatpak/runtime/org.freedesktop.Platform/x86_64/20.08
674M	/var/lib/flatpak/runtime/org.freedesktop.Platform/x86_64/20.08
wjt@camille:~$ du -sh /var/lib/flatpak/runtime/org.freedesktop.Platform/x86_64/21.08
498M	/var/lib/flatpak/runtime/org.freedesktop.Platform/x86_64/21.08
wjt@camille:~$ du -sh /var/lib/flatpak/runtime/org.freedesktop.Platform/x86_64/20.08 /var/lib/flatpak/runtime/org.freedesktop.Platform/x86_64/21.08
674M	/var/lib/flatpak/runtime/org.freedesktop.Platform/x86_64/20.08
385M	/var/lib/flatpak/runtime/org.freedesktop.Platform/x86_64/21.08
wjt@camille:~$ echo $(( 498 - 385 ))                                                                                                                                                
113

113 MB (out of 498 MB for the smaller, more up-to-date 21.08 runtime) is shared between these two runtimes.

How about the GNOME 41 runtime, which is derived from the 21.08 freedesktop runtime?

wjt@camille:~$ du -sh /var/lib/flatpak/runtime/org.freedesktop.Platform/x86_64/21.08                                                                                                
498M	/var/lib/flatpak/runtime/org.freedesktop.Platform/x86_64/21.08
wjt@camille:~$ du -sh /var/lib/flatpak/runtime/org.gnome.Platform/x86_64/41                                                                                                
715M	/var/lib/flatpak/runtime/org.gnome.Platform/x86_64/41
wjt@camille:~$ du -sh /var/lib/flatpak/runtime/org.freedesktop.Platform/x86_64/21.08 /var/lib/flatpak/runtime/org.gnome.Platform/x86_64/41
498M	/var/lib/flatpak/runtime/org.freedesktop.Platform/x86_64/21.08
327M	/var/lib/flatpak/runtime/org.gnome.Platform/x86_64/41
wjt@camille:~$ echo $(( 715 - 327 ))                                                                                                                                                
388

388 MB (out of 715 MB) of the GNOME 41 runtime is shared with the 21.08 runtime.

I can’t imagine what system updates will be like in the future when you have a few dozen apps storing tens of gigabytes of runtimes that all want to be kept up to date.

There is no need to imagine! I have 163 Flatpak apps on my Endless OS system. Let’s see how many runtimes I have, how big they are, and how many apps use each one:

wjt@camille:~$ flatpak list --app --columns=runtime | sort | uniq -c | wc -l
18
wjt@camille:~$ flatpak list --app --columns=runtime | sort | uniq -c | sort -n
      1 com.endlessm.apps.Platform/x86_64/6
      1 org.freedesktop.Platform/x86_64/18.08
      1 org.gnome.Platform/x86_64/3.34
      1 org.gnome.Sdk/x86_64/41
      1 org.kde.Platform/x86_64/5.14
      2 com.endlessm.Platform/x86_64/eos3.2
      2 org.gnome.Platform/x86_64/3.28
      3 org.freedesktop.Platform/x86_64/19.08
      3 org.freedesktop.Sdk/x86_64/21.08
      3 org.gnome.Platform/x86_64/3.38
      3 org.kde.Platform/x86_64/5.15-21.08
      5 org.gnome.Platform/x86_64/3.36
      9 org.kde.Platform/x86_64/5.15
     10 org.freedesktop.Platform/x86_64/20.08
     24 com.endlessm.apps.Platform/x86_64/5
     28 org.freedesktop.Platform/x86_64/21.08
     30 org.gnome.Platform/x86_64/40
     36 org.gnome.Platform/x86_64/41
wjt@camille:~$ cd /var/lib/flatpak/runtime; flatpak list --app --columns=runtime | sort | uniq | xargs du -sh --total
918M	com.endlessm.apps.Platform/x86_64/5
907M	com.endlessm.apps.Platform/x86_64/6
2.0G	com.endlessm.Platform/x86_64/eos3.2
632M	org.freedesktop.Platform/x86_64/18.08
211M	org.freedesktop.Platform/x86_64/19.08
569M	org.freedesktop.Platform/x86_64/20.08
385M	org.freedesktop.Platform/x86_64/21.08
782M	org.freedesktop.Sdk/x86_64/21.08
26M	org.gnome.Platform/x86_64/3.28
329M	org.gnome.Platform/x86_64/3.34
264M	org.gnome.Platform/x86_64/3.36
263M	org.gnome.Platform/x86_64/3.38
198M	org.gnome.Platform/x86_64/40
277M	org.gnome.Platform/x86_64/41
264M	org.gnome.Sdk/x86_64/41
436M	org.kde.Platform/x86_64/5.14
231M	org.kde.Platform/x86_64/5.15
215M	org.kde.Platform/x86_64/5.15-21.08
8.7G	total

I have 18 runtimes, totalling 8.7 GB of storage (deduplicated), not “tens of gigabytes”. The top 5 most-used runtimes on my system cover 128 of the 163 apps. (I am ignoring the .Locale extensions of each runtime: the English and French translations of the 21.08 runtime total 17 MB, compared to 498 MB for the runtime itself, so I think this is a reasonable simplification for rough numbers.) As for updates? GNOME Software applies them automatically and silently. I don’t think about them at all.

People may disagree about whether the numbers above are large or small, compared to the upsides that Flatpak does or does not bring. But the numbers themselves are readily accessible, as is much of the past and ongoing work that has gone into making them as small/large as they are.

Personally, I think the trade-off is absolutely worth it for me and for Endless OS users, particularly since going all-in on Flatpak means that the base, immutable Endless OS install is just 4.2 GB. Of course there is room for improvement, and years ago I wrote a quick hack to help study exactly which files would ideally be shared between two runtimes but are not. At the time, the primary cause was non-reproducible builds. Since then, the Flatpak ecosystem has moved over to Buildstream which should help a lot, though I haven’t rerun the experiments except for what you see above. Automated statistics about apps using obsolete runtimes might be useful for the Flathub community, as might automated runtime updates, and some further work on understanding if and how the derived runtimes (GNOME & KDE) could share more with the freedesktop runtime version they are based on. And, would widespread use of filesystems that support block-level deduplication (like btrfs) help?

4 thoughts on “On Flatpak disk usage and deduplication”

  1. I’m not an expert but I have an idea that, why not just put version number, and then the libraries can be stored with different versions. Every app can then choose to use whatever library it wishes. And additionally there can be default libraries for each library which system uses. Isn’t that sort of thing possible.

  2. I guess you are referring to a tiny app depending on a huge runtime. If Kcalc is the only Flatpak you have installed, sure, not great, but that’s an artificial degenerate case. If you have most of your apps installed with Flatpak, the numbers are very different.

  3. Libraries themselves have versioned dependencies on other libraries. A Flatpak runtime is just a bundle of libraries at particular versions, and apps declare a dependency on a particular bundle of libraries.

    As for sharing libraries between apps and the OS: we used to do this in Endless OS by building Flatpak runtimes from distro packages, then making Flatpak and the system OSTree share a repository. It worked but the savings are not as great as you might hope, in part due to version drift and since only the Endless apps used the Endless runtime. Sharing the repo introduced robustness and performance complications, too. Fedora now builds runtimes out of distro packages so they could in theory pull the same repo-sharing trick with Silverblue, but TTBOMK hasn’t.

Comments are closed.