In free software some fashions never change, and some are particularly hard to overcome. Today I’ll talk about the “Tracker makes $ANYTHING slow” adage, with gnome-music lately in the spotlight here. I’m glad that I could personally clear this up for some individuals at the hackfests/conferences I’ve been to lately.
But convincing is never-ending labor, there are still confused people around the internets, and disdainful looks don’t work as well over there. The next best thing I could do is fix things myself to make Tracker look less like the bad guy. So, from the “can’t someone else do it” department, here are some commits to improve the situation. The astute reader might notice that there is nothing about Tracker in these changes.
There’s of course more to it, AFAICT other minor performance hits are caused by:
- grilo emitting one signal per media item found, which is somewhat bad on huge lists
- icon view performance generally sucking, which makes scrolling not as smooth in the Albums view while covers are loading
- After all that, well sure, Tracker queries can be marginally optimized.
This will eventually hit master and packages. Until then, do me a favor and point anyone still saying that Tracker made gnome-music slow to this post.
Developer experience hackfest
Kind of on topic with this, I attended the Developer experience hackfest a few weeks ago. Besides trying to peg round pieces into square holes, after some talk about how steep a barrier SPARQL is as a prerequisite for accessing Tracker data, I started there on a simpler query API that abstracts all of these gritty details. Code is just shaping up, but I expect it to cover the most common use cases. I must thank Red Hat and Collabora for enabling me to go there, all the people there, and particularly Philip for being such a great host.
Oh, and I also attended FOSDEM and DevConf, and even talked at the latter about the input plans going on in GNOME. Busy days!
I also think iconview is slow because grilo works with signals, and this makes it impossible to have a threaded loading view as all the code runs in the mainloop… I may be wrong here 🙂
BTW, I don’t blame Tracker, it’s really fast! But grilo is doing too many requests to get all the information needed by gnome-music… I think that’s the main problem… On my computer (XPS 13), loading my collection (main icon view) takes more than 30 minutes with my computer going crazy…
I can still see a lot of disk thrashing on Tracker’s behalf (80% from tracker-store, but also tracker-miner-fs), on a btrfs filesystem. I tried marking the ~/.cache/tracker folder with chattr +C to disable COW, but it is still unacceptably slow due to tons of I/O, especially when having some terabytes of music and videos around.
While I agree that most of the performance issues in GNOME Music (and let’s not forget GNOME Photos) are on their client side, I can also see a general slowdown at each login, due to blocking I/O. Sometimes the PC freezes for seconds. Is sqlite3 + btrfs maybe to blame? Either way, I have a last-generation PC, and it’s slowed down to a crawl. Given that btrfs is slowly gaining traction, maybe Tracker should take it into account? Is it possible to set up a postgresql backend for Tracker, just to check if the problem lies on the sqlite3 side of things? Is there a preferred way to profile Tracker?
I actually tried several times to get hold of someone hacking on Tracker at the GNOME booth at FOSDEM, but evidently I was out of luck :-(.
@Cédric Bellegarde: making that fully async might help indeed, it is mainly model insertions/modifications that are somewhat hindering here. GtkTreeView used to fix this for large models with fixed-height rows and whatnot, and since we do know the item width/height in iconview, it could certainly be smarter here.
@Matteo: It is true that probably nobody paid attention to Tracker on btrfs, and behavior there should be checked. Initial indexing of data is going to take CPU and I/O nonetheless.
After initial indexing has finished, Tracker still has the task to check for changes on session restarts, and to reinstate file monitors so changes can be followed. This indeed means that 1) mtimes are compared with the db’s all across and 2) inotify monitors are added on each directory Tracker steps in.
This would explain the I/O you experience on these 2 processes on startup. TBH, for your volume of data I can only suggest checking the Tracker configuration options: you can e.g. schedule crawling to happen every X days, and/or avoid monitoring directories. This would make Tracker no longer “real-time” for you, but you may prefer this to facing slowness on startup.
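To make the startup cost concrete, here is a minimal sketch of the two steps described above: comparing on-disk mtimes against stored ones, and collecting one directory per monitor. It is not Tracker’s code; `crawl_for_changes` and the `known_mtimes` dict (a stand-in for the database) are hypothetical names used only for illustration.

```python
import os


def crawl_for_changes(root, known_mtimes):
    """Walk the tree under root and return (changed_files, dirs_to_watch).

    known_mtimes is a hypothetical stand-in for the indexer's database:
    a dict mapping file path -> mtime recorded at the previous crawl.
    """
    changed = []
    dirs_to_watch = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Non-recursive monitoring needs one watch per directory visited.
        dirs_to_watch.append(dirpath)
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat avoids failing on dangling symlinks.
            mtime = os.lstat(path).st_mtime
            # New files, and files whose mtime differs, need reindexing.
            if known_mtimes.get(path) != mtime:
                changed.append(path)
    return changed, dirs_to_watch
```

Both lists grow with the size of the tree even when nothing changed, since every file is stat'ed and every directory is visited. That is the work a scheduled crawl or disabled monitors would skip.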
Ideally, to avoid recursively descending on startup, there should be some kind of logged file notification mechanism, so that Tracker could just know what changed last on startup, and keep getting notifications for the whole tree from there. There’s nothing similar yet on Linux.
So, until something like this can be done above the kernel, Tracker won’t be “one size fits all” (especially for terabytes of data), and the recommendation will be to tweak config options, sorry…
@carlosg: would something like “btrfs subvolume find-new ” give the information needed?
Initial indexing is really not an issue. I am more concerned about the little freezes upon log-in. Maybe there is a way to make tracker a bit less aggressive when it comes to I/O?
I also wonder whether tracker hits the maximum FD limit for inotify watches, if it watches everything recursively. Even checking out a git copy of the linux kernel in a watched directory (such as “Documents”) might alone add hundreds of watches. Since I have several big checkouts… can that be an issue?
@Matteo: oh nice, it certainly looks like something that could help on btrfs /home partitions.
However, in order to avoid descending recursively through the tree on startup, we should also be able to monitor for changes without recursive watches. inotify is the best we’ve got nowadays, but it certainly wasn’t thought out for this, as you can also see from the performance hit.
So if you want to avoid recursive checks at this time, I can only recommend checking the /org/freedesktop/tracker/miner/files/crawling-interval and enable-monitors settings.
As for exhausting the inotify watch limit, Tracker makes sure to leave some 500 for other apps, and fails gracefully when it runs out of watches (the not-really-watched directories would just be checked again on next startup).
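The watch budgeting just described can be sketched roughly as follows. This is an illustration under assumptions, not Tracker’s implementation: `plan_watches` and `read_watch_limit` are hypothetical helpers, and the sketch ignores watches already held by other processes of the same user. The per-user limit itself is exposed on Linux in /proc/sys/fs/inotify/max_user_watches.

```python
def read_watch_limit(path="/proc/sys/fs/inotify/max_user_watches"):
    """Read the per-user inotify watch limit (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())


def plan_watches(directories, limit, reserve=500):
    """Split directories into (watched, deferred) under a watch budget.

    reserve is the number of watches left free for other applications
    (the "some 500" mentioned above). Deferred directories get no
    monitor and would simply be re-checked on the next startup.
    """
    directories = list(directories)
    budget = max(limit - reserve, 0)
    return directories[:budget], directories[budget:]
```

For example, `plan_watches(dirs, read_watch_limit())` on a tree with more directories than the budget allows returns the overflow in the second list instead of failing.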
But as you say, checkouts of large repos in directories that Tracker inspects recursively are a sure way to stress it. This could also be prevented with the /org/freedesktop/tracker/miner/files/ignored-directories-with-content setting if you prefer to avoid git repos altogether (I should just add “.git” by default there).
As for me personally, all my code stuff is in ~/Source and ~/Build, and these wouldn’t be inspected by Tracker by default. But then there are people explicitly wanting Tracker as a sort of git grep for their source trees. “One size fits all” is indeed hard on Tracker, as you see 🙂
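As a rough illustration of what skipping directories by content means, here is a small sketch that prunes any directory containing a marker entry such as “.git”. It assumes the setting works this way and is not Tracker’s code; `crawl_skipping` and `markers` are hypothetical names.

```python
import os


def crawl_skipping(root, markers=(".git",)):
    """Return the directories that would be crawled under root,
    skipping any directory that contains one of the marker entries."""
    crawled = []
    for dirpath, dirnames, filenames in os.walk(root):
        if any(m in dirnames or m in filenames for m in markers):
            # Clearing dirnames in place stops os.walk from descending.
            dirnames[:] = []
            continue
        crawled.append(dirpath)
    return crawled
```

A whole checkout is dropped at its top level, so none of its subdirectories would ever need a watch.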