This post started as a reply to Mikkel’s post about startup times and how people were jumping to conclusions before actually testing anything. But because I’m way too smart to have pointless arguments in blog posts (*cough*), I actually measured things.
Here’s the first benchmark (disclaimer: it’s run on my development system, so a crazy mix of Fedora 15 and hand-compiled stuff including debugging symbols): I launched “time $APPLICATION” from a terminal and immediately held down Ctrl+W or Ctrl+Q to make it quit again. I did a few runs and averaged the numbers in my head. That gave me a pretty good idea of roughly how long these applications take to do a warm start. Here are the totally unscientific numbers, in seconds:
application | real | user | sys |
---|---|---|---|
Epiphany | 1.0 | 0.8 | 0.05 |
Evince | 0.8 | 0.4 | 0.03 |
GEdit | 0.8 | 0.6 | 0.05 |
kwrite | 0.7 | 0.3 | 0.05 |
Totem | 1.0 | 0.7 | 0.05 |
gnome-terminal | 0.3 | 0.2 | 0.03 |
konsole | 0.3 | 0.1 | 0.04 |
Gimp | 3.4 | 2.7 | 0.20 |
Evolution | 3.3 | 2.5 | 0.30 |
Gnumeric | 0.7 | 0.4 | 0.02 |
oocalc | 1.0 | 0.6 | 0.15 |
Firefox | 1.8 | 1.4 | 0.20 |
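The measurement above can be approximated with a small script. This is only a sketch: `time_command` and `average` are names made up for this example, it assumes a Unix system, and somebody still has to hold down Ctrl+W or Ctrl+Q to make the application quit.

```python
import resource
import subprocess
import time


def time_command(argv):
    """Run argv until it exits and return (real, user, sys) in seconds."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    subprocess.run(argv, check=False)
    real = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # RUSAGE_CHILDREN accumulates CPU time of children that have been
    # waited for, so the difference belongs to the command we just ran.
    user = after.ru_utime - before.ru_utime
    system = after.ru_stime - before.ru_stime
    return real, user, system


def average(runs):
    """Average a list of (real, user, sys) tuples column by column."""
    return tuple(sum(column) / len(runs) for column in zip(*runs))
```

Something like `average([time_command(["gedit"]) for _ in range(3)])` would then replace the averaging in my head.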
Now I also wanted to do a cold start, so I ran all of the applications again, but ran “sudo sysctl vm.drop_caches=3” before every run, so I was sure the start really was cold. For KDE applications, I did 2 runs, one with kdelauncher already running, one without. And I only did it once, because it takes so long. Here are the numbers again:
application | real | user | sys |
---|---|---|---|
Epiphany | 9.4 | 0.8 | 0.15 |
Evince | 6.4 | 0.4 | 0.10 |
GEdit | 8.3 | 0.7 | 0.10 |
kwrite | 13.1 | 0.3 | 0.20 |
kwrite (second run) | 9.4 | 0.3 | 0.15 |
Totem | 7.6 | 0.5 | 0.10 |
gnome-terminal | 6.5 | 0.2 | 0.10 |
konsole | 12.3 | 0.2 | 0.15 |
konsole (second run) | 8.5 | 0.2 | 0.15 |
Gimp | 13.0 | 2.8 | 0.35 |
Evolution | 19.6 | 2.6 | 0.45 |
Gnumeric | 7.1 | 0.4 | 0.20 |
oocalc | 15.2 | 0.7 | 0.30 |
Firefox | 13.0 | 1.5 | 0.25 |
Yuck.
While that looks really bad, it probably is not as bad as it looks in the real world, because a bunch of applications are already running and a bunch of stuff is already being cached. So here’s another benchmark. This time I ran “sudo sysctl vm.drop_caches=3 && gnome-control-center” before every run, so that the GNOME libs were already loaded, and only application-specific things had to be loaded. More numbers:
application | real | user | sys |
---|---|---|---|
Epiphany | 4.6 | 0.8 | 0.10 |
Evince | 1.8 | 0.4 | 0.05 |
GEdit | 4.4 | 0.6 | 0.10 |
kwrite | 6.3 | 0.3 | 0.10 |
Totem | 2.7 | 0.5 | 0.05 |
gnome-terminal | 1.3 | 0.2 | 0.05 |
konsole | 5.1 | 0.1 | 0.10 |
Gimp | 9.2 | 2.8 | 0.30 |
Evolution | 13.4 | 2.6 | 0.40 |
Gnumeric | 3.5 | 0.4 | 0.05 |
oocalc | 11.6 | 0.6 | 0.25 |
Firefox | 9.1 | 1.4 | 0.25 |
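Both cache-dropping procedures could be scripted roughly like this. It is a sketch under a few assumptions: it must run as root, it writes to `/proc/sys/vm/drop_caches` directly (which is what the sysctl call does), and `drop_caches` and `cold_start` are names invented for this example.

```python
import os
import subprocess

DROP_CACHES = "/proc/sys/vm/drop_caches"


def drop_caches(path=DROP_CACHES):
    """Write 3 to drop_caches: free page cache, dentries and inodes."""
    os.sync()  # write dirty pages back first so they can be dropped too
    with open(path, "w") as f:
        f.write("3\n")


def cold_start(argv, warmup=None, path=DROP_CACHES):
    """Drop caches, optionally run a warm-up command, then run argv.

    The warm-up command runs until it exits, like the shell's `&&`.
    """
    drop_caches(path)
    if warmup is not None:
        subprocess.run(warmup, check=False)
    return subprocess.run(argv, check=False).returncode
```

Calling `cold_start(["evince"])` corresponds to the fully cold run, and `cold_start(["evince"], warmup=["gnome-control-center"])` to the run with the GNOME libs already loaded.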
Hrm. So what now? It seems there are a few things one can learn from this.
- It looks like every application is able to start in 1 second or less, unless it does a lot of things that it probably shouldn’t do. Or said differently: If your application takes more than 1s to start up, it is slower than LibreOffice.
- Evolution and Gimp are really slow. They spend almost 3s calculating stuff. I suppose somebody should take a good look at what they do during startup.
- GNOME applications are really fast at starting up in a GNOME environment. (Epiphany seems to spend too much time initializing the disk cache, not sure what GEdit is doing, but probably loading plugins (and Python?))
- GNOME running gives a huge boost to startup times for all applications (seems to be pretty consistent at 4s-5s).
- Cold start of C++ applications seems to be really slow.
- It is never the kernel spending too much CPU.
- I cannot guess which of Ctrl+W and Ctrl+Q actually closes an app.
- gnome-control-center is the only application I found that responds to neither Ctrl+W nor Ctrl+Q (which is why I used it for repopulating the cache and didn’t benchmark it).
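The boost from having GNOME running can be read off the tables by subtracting the third table’s real times from the cold ones. A small sketch of that arithmetic, with the numbers copied from above; for kwrite and konsole it uses the first listed cold run, which is an arbitrary choice on my part.

```python
# Real (wall clock) seconds from the cold-start table.
COLD = {
    "Epiphany": 9.4, "Evince": 6.4, "GEdit": 8.3, "kwrite": 13.1,
    "Totem": 7.6, "gnome-terminal": 6.5, "konsole": 12.3, "Gimp": 13.0,
    "Evolution": 19.6, "Gnumeric": 7.1, "oocalc": 15.2, "Firefox": 13.0,
}
# Real seconds after dropping caches and running gnome-control-center.
GNOME_LOADED = {
    "Epiphany": 4.6, "Evince": 1.8, "GEdit": 4.4, "kwrite": 6.3,
    "Totem": 2.7, "gnome-terminal": 1.3, "konsole": 5.1, "Gimp": 9.2,
    "Evolution": 13.4, "Gnumeric": 3.5, "oocalc": 11.6, "Firefox": 9.1,
}

# Seconds saved per application by having the GNOME libs cached.
saved = {app: round(COLD[app] - GNOME_LOADED[app], 1) for app in COLD}

for app, seconds in sorted(saved.items(), key=lambda item: item[1]):
    print(f"{app:15s} {seconds:4.1f}")
```

With these inputs the savings range from 3.6s to 7.2s.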
Now how can we make startup even faster? I know it has to involve I/O, but I don’t know what I/O is really hurting here. It could be anything from loading lots of libraries to loading icons to dconf to extensive D-Bus activations that is to blame here. I suppose that requires smarter benchmarking processes. But the answer is certainly not stabbing in the dark, or blaming building the widget tree, parsing files, complex data models or in fact any code that is run at startup. And I’m pretty sure the solution doesn’t involve copying ugly solutions from handhelds, because we’re really not that far away from a fast startup.
35 comments ↓
For GIMP, the answer is pretty easy: loading brushes, patterns, gradients and other resources is most likely the answer. Many small files that all need to be loaded.
Yeah, but none of those are necessary at application startup, they only need to be available once you start selecting a brush, pattern, gradient or plugin. In fact, when I open it, edit a picture and save it, I often don’t need any of these features.
I guess gimp loads them, like krita, in order to show the resources in the various resource selectors. Krita loads the resources in threads, though, so the application appears to be starting faster.
@Benjamin Agreed … pasting a picture from a screenshot or other copy/paste activity does not need any of those.
I think you swapped the columns ‘user’ and ‘real’.
Yes, I always thought the Gimp should do some sort of background loading of the brushes and everything; maybe the thread would become blocking when it really needs to show the stuff.
Note that for the Gimp it hurts soooo bad on Windows and on MacOSX…
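The background loading idea from the comments above (start loading in a thread, block only when the resources are really needed) could look roughly like this. It is a generic sketch, not how GIMP or Krita actually do it; `BackgroundResources` and the `*.gbr` pattern in the usage note are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


class BackgroundResources:
    """Load many small files in a worker thread while the UI starts up."""

    def __init__(self, directory, pattern="*"):
        pool = ThreadPoolExecutor(max_workers=1)
        # Loading starts immediately; the constructor returns right away.
        self._future = pool.submit(self._load_all, Path(directory), pattern)
        # Accept no further jobs; the submitted one still runs to completion.
        pool.shutdown(wait=False)

    @staticmethod
    def _load_all(directory, pattern):
        return {
            path.name: path.read_bytes()
            for path in sorted(directory.glob(pattern))
            if path.is_file()
        }

    def ready(self):
        """True once loading has finished; never blocks."""
        return self._future.done()

    def get(self):
        """Return name -> bytes, blocking only if loading is unfinished."""
        return self._future.result()
```

An application would create `BackgroundResources(brush_dir, "*.gbr")` at startup and call `get()` the first time a brush selector is shown.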
@ChornHulio: Thanks, fixed.
> gnome-control-center is the only application I found that
> responds to neither Ctrl+W nor Ctrl+Q
That’s because you’re using an old version, that’s been fixed some time ago (by me).
On my system it can take > 60 seconds to start Inkscape; I talked to the developers and it’s because of the number of fonts I have installed. I don’t know if the number of fonts affects other apps in quite the same way, but it might be worth looking at how many fonts are installed on the systems the benchmarks are being taken on.
Also, I’d be interested to see your numbers for GNote. I have noticed more recent versions of GNote are very slow to create new notes and then again to allow you to type inside them. I thought it might be the too-many-fonts issue Inkscape was showing, but I removed my fonts and still got the slow times on GNote.
This doesn’t have any scientific basis, which is sad. “I launched ‘time $APPLICATION’ from a terminal and immediately held down Ctrl+W or Ctrl+Q to make it quit again.” This is terrible and you don’t get any accurate numbers. But still, we can get an idea of what’s wrong.
I look at strace output regularly for Gnumeric.
Currently it looks like gtk’s icon stuff is the cause of most io:
Count | Filename |
---|---|
73 | /usr/share/icons/crystalsvg |
72 | /usr/share/icons/crystalsvg/icon-theme.cache |
66 | /opt/kde3/share/icons/crystalsvg |
65 | /opt/kde3/share/icons/crystalsvg/icon-theme.cache |
29 | /opt/kde3/share/icons/hicolor |
28 | /opt/kde3/share/icons/hicolor/icon-theme.cache |
16 | /home/welinder/.icons/default/index.theme |
16 | /usr/share/icons/default/index.theme |
16 | /usr/share/pixmaps/default/index.theme |
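A count like the one above can be produced from an strace log with a few lines of script. This is a guess at the approach, not the script that was actually used; `count_paths` and the list of syscalls are assumptions made for this sketch.

```python
import re
from collections import Counter

# Matches e.g.  open("/path", ...)  or  openat(AT_FDCWD, "/path", ...),
# optionally prefixed by a pid as written by `strace -f`.
LINE = re.compile(
    r'^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\((?:AT_FDCWD,\s*)?"(/[^"]*)"'
)
FILE_SYSCALLS = {
    "open", "openat", "stat", "lstat", "access",
    "readlink", "statx", "newfstatat",
}


def count_paths(lines):
    """Count how often each absolute path is passed to a file syscall."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line)
        if match and match.group(1) in FILE_SYSCALLS:
            counts[match.group(2)] += 1
    return counts
```

After something like `strace -f -o trace.txt gnumeric`, `count_paths(open("trace.txt")).most_common(10)` lists the most frequently touched paths.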
Taras Glek just gave a talk at the plumbers conference about Firefox startup, and he’s written quite a few blog posts on it. A lot of it seems to be that library IO is quite random during startup, which hurts with disk seeks. He has some systemtap scripts for measurement that I’m sure you could adapt, and probably suggestions on other tools too.
For the C++ case, I suggest you read Ulrich Drepper’s article on ‘How to write shared libraries’, which contains some useful information on the drawbacks of the common C++ ABI that could account for the slow load times when C++ libraries are involved. Specifically, the number of string comparisons of mangled symbol names in the case of OpenOffice.org 1.0 is mentioned (on page 8):
http://www.akkadia.org/drepper/dsohowto.pdf
The article also contains useful tips on improving performance with C libraries (this is the main theme) and other accumulated wisdom.
Have none of you heard of profilers?
@Luis Medinas: Why is it terrible? I mean sure, you don’t want to measure CPU clock ticks for that kind of benchmark, but it’s certainly a good way to measure startup performance. Or are there any better ways?
@David King: The thing is: oocalc loads in 1.0s, konsole loads in 0.3s. That is as fast as C code. They only fall flat on cold startup. So I’m pretty sure it’s not strcmp() that is to blame. In particular, because the “user” time does not grow when going from warm to cold start, only the total time grows significantly.
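To make that argument concrete, here is the arithmetic on the numbers from the tables. It is a rough sketch: it treats real minus user minus sys as time spent waiting, which assumes startup is mostly single-threaded, and it uses the first listed cold run for konsole.

```python
# (real, user, sys) in seconds, copied from the warm and cold tables.
WARM = {"oocalc": (1.0, 0.6, 0.15), "konsole": (0.3, 0.1, 0.04)}
COLD = {"oocalc": (15.2, 0.7, 0.30), "konsole": (12.3, 0.2, 0.15)}


def cpu(real, user, sys_):
    """Seconds the process actually spent computing."""
    return round(user + sys_, 2)


def waiting(real, user, sys_):
    """Seconds of wall clock time not accounted for by CPU time."""
    return round(real - user - sys_, 2)


for app in WARM:
    print(
        f"{app}: cpu {cpu(*WARM[app])} -> {cpu(*COLD[app])}, "
        f"waiting {waiting(*WARM[app])} -> {waiting(*COLD[app])}"
    )
```

For oocalc, CPU time goes from 0.75s to 1.0s while waiting goes from 0.25s to 14.2s, so nearly all of the extra cold start time is spent not computing.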
C-W should be close window and C-Q should be close application.
I think you should stress more the CPU time bits. Suppose we wanted all of our applications to start in less than 250 ms. Even discarding IO, only gnome-terminal and konsole are able to pass the test. Bummer!
Or, using LibreOffice again as a threshold: if your application uses more than 0.6s of CPU time, you’re slower than LibreOffice.
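Both thresholds can be checked against the warm start table. A quick sketch, with the numbers copied from the post and CPU time taken as user plus sys for the 250 ms budget:

```python
# (real, user, sys) in seconds from the warm-start table.
WARM = {
    "Epiphany": (1.0, 0.8, 0.05), "Evince": (0.8, 0.4, 0.03),
    "GEdit": (0.8, 0.6, 0.05), "kwrite": (0.7, 0.3, 0.05),
    "Totem": (1.0, 0.7, 0.05), "gnome-terminal": (0.3, 0.2, 0.03),
    "konsole": (0.3, 0.1, 0.04), "Gimp": (3.4, 2.7, 0.20),
    "Evolution": (3.3, 2.5, 0.30), "Gnumeric": (0.7, 0.4, 0.02),
    "oocalc": (1.0, 0.6, 0.15), "Firefox": (1.8, 1.4, 0.20),
}

BUDGET = 0.25  # seconds of CPU time allowed for startup

# Applications whose CPU time alone fits into the budget.
within_budget = sorted(
    app for app, (_real, user, sys_) in WARM.items() if user + sys_ < BUDGET
)

# Applications using more user time than oocalc does.
oocalc_user = WARM["oocalc"][1]
slower_than_oocalc = sorted(
    app for app, (_real, user, _sys) in WARM.items() if user > oocalc_user
)

print(within_budget)
print(slower_than_oocalc)
```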
Alt-F4 consistently works.
I believe several folks have worked on this problem:
http://asofyet.org/muppet/software/gtk2-perl/startup-time-poll/
http://people.gnome.org/~michael/blog/ooo-commit-stats-2008.html
http://blog.mozilla.com/tglek/2010/04/07/icegrind-valgrind-plugin-for-optimizing-cold-startup/
M Welinder:
Not saying that icon i/o is not bad, but repeated i/o on the same files is likely to be cached and thus not have the type of problems that cold boot has. So, I’m not sure counting like that matters. It’s more about the total set of disk i/o and how the i/o patterns look.
GNOME startup as a whole was analyzed in
http://people.gnome.org/~lcolitti/gnome-startup/analysis/
Dave Jones mentions some tips in
http://www.linuxsymposium.org/archives/OLS/Reprints-2006/jones-reprint.pdf
Evolution won’t quit until it’s refreshed all folders, I think, and that’s network-dependent for an IMAP account. If you measured ‘time to interaction’ it’d probably look a lot better.
$ time emacs -q --eval '(save-buffers-kill-terminal)'
real 0m0.510s
user 0m0.236s
sys 0m0.016s
Hah!
Well, actually, the access to the icon theme directory is a stat that is likely cached, on the other hand, the icon theme cache file is mmaped and paged in on demand, so that may very well generate poor i/o patterns.
Time to resurrect the performance list:
http://mail.gnome.org/archives/performance-list/
More info on GNOME performance (tools, results):
http://live.gnome.org/GnomePerformance
Awesome to put some hard numbers on this Benjamin :-) I have some myself coming up from a foray into “Gwibber in 500ms” that I’ll try to put in a blog post.
I want to note, however, that I don’t think the point is to “have keyboard shortcuts ready asap”, but rather to focus on having the main app window instantly mapped, and to give the user the impression that the response is instant (and any dirty tricks are allowed as long as the user does not notice).
I really don’t know if there’s a shared metric we can use across all apps for this – maybe there is, but I have a hunch that it’s also gonna require some per-app evaluation.
I developed a habit (in the last months I had to do a lot of photo editing): in the morning, open an empty instance of GIMP and leave it open all day long. In any case, DO NOT open it for the first time with a photo copied from my camera (at 18 MP), since it takes aeons to load, while the computer is unresponsive.
As for Open Office/Libre Office… there is no need for words about them, it was a pain to start since back in the days it was called Star Office and it was a proprietary app (and all the time you see in release notes about it becoming faster, with no reflection in reality).
Firefox could be also a bit faster to start… so it is again another of the apps you have to keep open all day long.
“I cannot guess which of Ctrl+W and Ctrl+Q actually closes an app.”
There must be someone on the Gnome project that really loves to be able to claim that the UI is incredibly consistent because: “Ctrl+W always closes your document and Ctrl+Q always quits your application if it’s not showing a single document!”
Nevermind that I constantly find myself pressing Ctrl+Q trying to get Evince to quit (because damnit, Q is for Quit, right?) and it doesn’t work.
I could unleash a torrent of very rude words in the direction of the person who made the stupid design decision to distinguish between closing a document and quitting an application when it only has a single document open.
“I suppose somebody should take a good look at what they do during startup.”
For those: VTune is available free-of-charge under a non-commercial license, see
http://milianw.de/blog/vtune-and-kde
;)
“If your application takes more than 1s to start up, it is slower than Libreoffice”
This quote is awesome!
Boudewijn Rempt said “For GIMP, the answer is pretty easy: loading brushes, patterns, gradients and other resources is most likely the answer. Many small files that all need to be loaded.”
What about doing something like the new extension packaging model in Firefox 4 that aims at reducing the amount of I/O? https://developer.mozilla.org/en/Extensions/Updating_extensions_for_Firefox_4#XPI_unpacking
Another tool that can be used:
http://maemo.org/development/tools/doc/diablo/sp-startup-time/ sources are here http://repository.maemo.org/pool/maemo5.0/free/s/sp-startup-time/
@antistress, the reason Firefox can access extensions in XPIs is that it can read resources directly from ZIP files (including .jar and .xpi files) without unpacking them into a file system.
Internally, Firefox 4 bundles hundreds of resources (CSS, .js, icons, properties) into a single zip file named omni.jar, and then Mozilla optimized the layout and construction of this mega-zip file. Code (or users) can use special URLs to access the contents of this, e.g. load chrome://global/locale/appstrings.properties in Firefox and it’ll access the appropriate strings file for your locale.
http://blog.mozilla.com/tglek/2010/09/14/firefox-4-jar-jar-jar/
http://blog.mozilla.com/mwu/2010/08/13/omnijar-how-does-it-work/
Good stuff but I don’t know how applicable it is to other programs. As @Josh Stone mentions, Taras Glek and others have done other fine work optimizing Firefox startup that is relevant to other C code, e.g. segment layout in executables and libraries to minimize disk I/O.
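The basic idea of reading resources straight out of one archive instead of many small files can be tried with any ZIP library. The sketch below is a generic illustration and not Firefox’s implementation (which also optimizes the archive layout); `read_resource` and `list_resources` are names made up for this example.

```python
import zipfile


def read_resource(archive, member):
    """Return one member's bytes without unpacking the archive to disk."""
    with zipfile.ZipFile(archive) as bundle:
        return bundle.read(member)


def list_resources(archive):
    """Return the member names stored in the archive."""
    with zipfile.ZipFile(archive) as bundle:
        return bundle.namelist()
```

`archive` can be a file name or an open binary file object, so an application could keep a single bundle open and serve icons, CSS and strings from it.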