There’s been a lot of fuss lately around Tracker as a security risk, and as the de facto maintainer of Tracker I feel obliged to comment. I’ll comment purely on the Tracker bits; I will not comment on other topics that, OTOH, were not as debated but are similarly affected, like thumbnailing, previewing, autodownloading, or the state of maintenance of gstreamer-plugins-bad.
First of all, I’m glad to say that Tracker now sandboxes its extractors, so its only point of exposure to exploits is now much more constrained, leaving very little room for malicious code to do anything harmful. This fix has been backported to 1.10 and 1.8, and new tarballs have been rolled. Everyone rejoice.
Now, the original post that raised the dust storm certainly achieved its dramatic effect. Despite Tracker not doing anything insecure besides calling a closed, well-known set of 3rd party libraries (which, after all, are most often installed from the same trusted sources that Tracker comes from), it’s been in the “security” spotlight across several bugs/MLs/sites with different levels of accuracy. I’ll publicly comment on some of the assertions I’ve seen in the last few days.
This is a design flaw in Tracker!
Tracker has always performed metadata extraction in a separate process for stability reasons, which means we already count on this process possibly crashing and burning away.
Tracker was indeed optimistic about the possible reasons why that might happen, but precisely thanks to Tracker’s design it’s been a breeze to isolate the involved parts. A ~200 line change hardly counts as a redesign.
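To make the “separate process that may crash and burn” idea concrete, here is a minimal sketch in Python. It is not Tracker’s code: `CHILD_CODE`, `extract_in_child` and `extract_all` are hypothetical names, and the “parser” is a stand-in that aborts on files ending in `.bad`. The point is only that the parent survives whatever happens in the child.

```python
import subprocess
import sys

# Hypothetical stand-in for a 3rd party parser that crashes on bad input.
CHILD_CODE = """
import os, sys
path = sys.argv[1]
if path.endswith(".bad"):
    os.abort()  # simulate a hard crash inside the parser
print("title=" + path)
"""


def extract_in_child(path, timeout=10):
    """Run the risky parsing step in a throwaway child process.

    Returns the child's output, or None if it crashed or timed out.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", CHILD_CODE, path],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None
    if result.returncode != 0:
        return None  # the child died; the parent carries on
    return result.stdout.strip()


def extract_all(paths):
    """Extract every path, recording None for the ones that blew up."""
    return {p: extract_in_child(p) for p in paths}
```

Calling `extract_all(["song.ogg", "evil.bad"])` leaves the caller alive and simply records that the second file could not be processed.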
All of Tracker’s daemons are inherently insecure! Or its funnier cousin: Tracker leaks all collected information to the outside world!
This security concern has only arisen because of the use of 3rd party parsers (well, in the case of the GStreamer vulnerability in question, decoders; why a parsing facility like GstDiscoverer triggers decoding is another question worth asking), and this parsing of content happens in exactly one place in your common setup: tracker-extract.
Let’s dissect the Tracker daemons’ functionality a bit:
- tracker-store: It is the manager of your user Tracker database. It connects to the session bus and gets read-write access to a database in ~/.cache. It also notifies of changes in the database through the user bus.
- tracker-miner-fs: It’s the process watching for changes in the filesystem, and filling in the basic information that can be extracted from shared-mime-info sniffing (which usually involves matching some bytes inside the file, few conditionals involved), struct dirent and struct stat.
- tracker-extract: Guilty as charged! It receives the notification of changes, and is basically a loop that picks the next unprocessed file, runs it through 3rd party parsers, sends a series of insert clauses over dbus, and picks the next file. Wash, rinse, repeat.
- tracker-miner-applications: A very simplified version of tracker-miner-fs that just parses the keyfiles in various .desktop file locations.
- tracker-miner-rss: This might be another potential candidate, as it parses “arbitrary” content through libgrss. However, it must be configured by the user; otherwise it has no RSS feeds to read from. I’ll take the possibility of hijacking famous blogs and news sites to hack through tracker-miner-rss as remote enough to fix it after a breather.
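The tracker-extract loop described in the list above can be sketched roughly as follows. This is an illustration in Python, not the real implementation: the parser functions, the `PARSERS` table and the shape of the insert clause are all assumptions made up for the example, and string values are not escaped.

```python
from collections import deque


# Hypothetical parsers; the real ones wrap 3rd party libraries.
def parse_audio(path):
    return {"nie:title": path.rsplit("/", 1)[-1]}


def parse_text(path):
    return {"nie:title": path.rsplit("/", 1)[-1], "nie:mimeType": "text/plain"}


# A closed, fixed table of modules: nothing is discovered at runtime.
PARSERS = {"audio/ogg": parse_audio, "text/plain": parse_text}


def to_insert(uri, metadata):
    """Build a SPARQL-style INSERT clause (illustrative, values not escaped)."""
    triples = " ; ".join(f'{k} "{v}"' for k, v in sorted(metadata.items()))
    return f"INSERT {{ <{uri}> {triples} }}"


def extract_loop(pending, send):
    """Pick the next unprocessed file, parse it, send the insert, repeat."""
    queue = deque(pending)
    while queue:
        uri, mime = queue.popleft()
        parser = PARSERS.get(mime)
        if parser is None:
            continue  # no module for this type, nothing to extract
        send(to_insert(uri, parser(uri)))
```

Here `send` stands in for whatever delivers the clause to the store; passing `list.append` is enough to see what would go over the bus.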
So, per-parser specifics aside, Tracker consists of one database stored under 0600 permissions, with information being added to it through requests on the D-Bus session bus and being read by apps through a read-only handle created by libtracker-sparql. The read and write channels can be isolated independently.
If you are really terrified by your user information being stored inside your homedir, or can’t sleep thinking of your session bus as a dark alley, you certainly want to run all your applications in a sandbox; that way they won’t be able to poke at org.freedesktop.Tracker1.Store or sniff around ~/.cache.
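As a rough illustration of “one private database, one writer, read-only readers”, here is a sketch using Python’s sqlite3 module as a stand-in. The table layout and function names are invented for the example and say nothing about Tracker’s actual schema or libtracker-sparql’s API.

```python
import os
import sqlite3


def create_private_db(path):
    """Create the database file with 0600 permissions, then populate it."""
    # O_EXCL makes creation fail if the file already exists.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    os.close(fd)
    writer = sqlite3.connect(path)
    writer.execute("CREATE TABLE meta (uri TEXT, title TEXT)")
    writer.execute("INSERT INTO meta VALUES (?, ?)", ("file:///a.ogg", "a.ogg"))
    writer.commit()
    writer.close()


def open_readonly(path):
    """Open a read-only handle; writes through it raise OperationalError."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)
```

A reader that only ever gets the handle from `open_readonly` can query the data but cannot modify it, which is the kind of separation that lets the read and write paths be locked down independently.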
But again, there is nothing that makes Tracker as a whole inherently insecure, at least not more than the average session bus service, or the average application storing data in your homedir. Everything that could be distrusted is down to specific parsers, and that is anything but inherent in Tracker.
Tracker-extract runs plugins and is thus unsafe!
No, tracker-extract has a modular design, but is not extensible itself. It reads a closed set of modules implemented by Tracker from a folder that should be in /usr/lib/tracker-1.0 if your setup is right. The API of these modules is private and subject to change. If anything manages to add or modify modules there, you’ve got way worse concerns.
Now, one of these extractor modules uses GStreamer, which to my knowledge is still the go-to library if you want anything multimedia on Linux, and it happens to open an arbitrary list of plugins itself, which is beyond Tracker’s control or reach.
It should be written in Rust!
What do we gain from that? As said, tracker-extract is in essence a very simple loop. All the scary stuff is handled by external libraries that will still be implemented in “unsafe languages”, so Rust is just as useful as gift paper to wrap this.
Extraction should be further isolated into another process!
There are good reasons not to do that. Having two separate processes running completely interlocked tasks (one process can’t do anything until the other is finished) is pretty much a worst case for scheduling, context switching, performance and battery life at once.
Furthermore, such a tertiary service would need exactly the same whitelisted syscalls and exactly the same number of ways out of the process. So I think I won’t attract the “Tracker is heavy/slow” zealots this time… There is a throwaway process, and it is tracker-extract.
The silver linings
Tracker is already more secure, now let’s silence the remaining noise. Quite certainly one area of improvement is Flatpak integration, so sandboxed applications can launch isolated Tracker instances that run under the same sandboxed environment, and extracted data is only visible within the sandbox.
This is achievable with the current Tracker design. However, the “Tracker as a service” approach sounds excessive given this status quo: Tracker needs to adapt to being usable as a local store, and before that it needs to become more of a generic SPARQL endpoint.
But this is just adapting to the new times. Flatpak is relatively young and Tracker is slow moving, so they haven’t met yet. But there is a pretty clear roadmap, and we’ll get there.