Performance Measurement (I)

Update: OK, as usual, a brown-paper-bag bug in the first release: gtk2-10cairopatch was actually using the pangoxft GTK+ instead of stock. Now I’m using the stock version, and sadly the double-to-fixed patch looks slightly less impressive… Anyway, we still need more like that! :)

As part of the ongoing work at Nokia to make GTK+ perform decently on ARM I’ve done a series of performance tests using several different versions of GTK+.

I’ve used the excellent GTK+ Theme Torturer and a home-grown PyCairo plotter (which is more or less my second-ever Python program: comments, fixes and rewrites warmly welcome) based on the ones made by Federico.

First, the graphics:

The other widgets are GtkCheckButton, GtkEntry, GtkFrame, GtkHScale, GtkNotebook, GtkProgressBar,

Now the boring details:

  • gtk2-6 is the heavily patched GTK+ 2.6 currently used in the Maemo platform. It’s here.
  • gtk2-10xft is stock GTK+ 2.10.6 with this patch applied. It basically replaces pangocairo with pangoxft for text rendering. Everywhere else, it still uses Cairo (1.2.4).
  • gtk2-10cairo is GTK+ 2.10.6 stock using Cairo 1.2.4.
  • gtk2-10cairopatch is the same as gtk2-10cairo, but Cairo has this patch applied. It applies heavy wizardry (and I mean heavy) to optimize the hell out of the double-to-fixed conversion.
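
For readers wondering what a double-to-fixed conversion looks like, here is a minimal Python sketch of one well-known trick of this kind. It assumes a 16.16 fixed-point layout and IEEE 754 doubles; whether the patch above does exactly this is an assumption, and the function names are made up for illustration. Adding a carefully chosen constant makes the floating-point unit do the rounding, so the result can be read straight out of the low bits of the double.

```python
import struct

FRAC_BITS = 16  # assumption: 16.16 fixed point
# 1.5 * 2**(52 - FRAC_BITS): adding this constant shifts the scaled value
# into the low bits of the double's 52-bit mantissa.
MAGIC = 1.5 * (1 << (52 - FRAC_BITS))


def fixed_from_double_naive(d):
    """Reference version: scale, then round to the nearest integer."""
    return int(round(d * (1 << FRAC_BITS)))


def fixed_from_double_magic(d):
    """Same result, read from the bit pattern of d + MAGIC.

    Only valid while the result fits in a signed 32-bit integer.
    """
    raw = struct.pack("<d", d + MAGIC)
    # The first four little-endian bytes are the low 32 bits of the mantissa.
    return struct.unpack("<i", raw[:4])[0]


for value in (0.0, 1.0, -1.0, 0.5, 3.14159, -100.25):
    assert fixed_from_double_magic(value) == fixed_from_double_naive(value)
```

In C the same idea replaces a comparatively slow float-to-int conversion with an addition and a memory read, which is where the gain on a CPU without a hardware FPU would come from; in Python the sketch only demonstrates that the two versions agree.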

All the tests were made on ARM, using the Raleigh theme.

The big empty space is the Expose-Resize test from gtk-theme-torturer. I’ve omitted it because we were getting random results (as in, each time you run the tests you get completely different numbers, while all the other tests appear the same as before). We are trying to track this down, but if anyone has a wild guess about it, comments are welcome. Of course, this could also mean all my numbers are pure crap, so take these graphs with a grain of salt.
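
One way to put a number on "random" is to repeat the whole run a few times and look at the spread of each test relative to its mean. The sketch below is not part of the plotter; the function names, the 0.25 threshold and the timings are all made up for illustration.

```python
import statistics


def relative_spread(samples):
    """Return stdev / mean for one test's timings across repeated runs."""
    mean = statistics.mean(samples)
    if mean == 0:
        return 0.0
    return statistics.stdev(samples) / mean


def unstable_tests(runs, threshold=0.25):
    """runs maps a test name to a list of timings, one per run.

    Returns the names whose relative spread exceeds the threshold,
    sorted alphabetically.
    """
    return sorted(name for name, samples in runs.items()
                  if len(samples) >= 2 and relative_spread(samples) > threshold)


# Made-up timings in seconds, purely for illustration.
example = {
    "expose": [0.101, 0.099, 0.100],
    "resize": [0.050, 0.400, 0.120],
}
print(unstable_tests(example))
```

A test flagged this way is a candidate for being left out of the graphs, as the resize test was here, until the cause of the variation is understood.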

As you can see cairo is still some way away from the 2.6 performance levels (especially on some widgets, *ahem*GtkRadioButton*ahem*), but with patches like this one it shouldn’t be too difficult to reach reasonable performance numbers. Keep up the good work!

P.S.: I’ll try to post periodic updates to track the ongoing performance effort in Cairo/GTK+.

P.P.S: Thanks to fer for the top-notch hosting! ;)


10 Responses to Performance Measurement (I)

  1. Maybe you already know, but the current cairo tessellator is reconverting the doubles to floats, so it may be a bottleneck. The new tessellator cworth is working on should avoid the problem using only fixed point math and be faster.

  2. Sean Kelley says:

    Yes, I have heard that. But the reality is that embedded development needs to use those patches sooner rather than later. So it would be good to hear Carl comment on them.

  3. Ross says:

    W.T.F. happened in GTK+ 2.10 to GtkRadioButton? Is this hitting a codepath in the gtk2-10xft build that causes it to hit Cairo?

  4. Carl posted some pretty detailed stuff about it to the cairo list here, so maybe you could try a fifth test with this branch of Cairo to see what, if any, difference it makes to the tests.

  5. Can you please email jdub and get yourself on Planet GNOME?

    Your work is amazing and I would love to see it in my normal p.g.o perusal. Thanks for what you are doing!

  6. Benjamin Otte says:

    Good job guys, if you continue your hard work, you might get the same conclusions at Christmas we got in August. (see for the boring details) To name a few:
    - the theme torturer has some problems, especially the resize test (since that’s not one test run multiple times, but multiple tests run one time) and not discarding wildly off measurements
    - it’s not of much use to test whole widgets, since they are composed of lots of drawing operations and only one is slow. You should better test the drawing ops directly (and with representative sizes)
    - There’s a big bug in cairo when stroking that causes it to hit the slow path every time, which makes stroking a rectangle 5-10x slower on a desktop machine.

    But your graphing looks a lot nicer than mine. :)

    who’s a bit surprised to not have gotten much feedback for his benchmarks only to see someone repeating his work 3 months later – worse.

  7. Since you invited a rewrite for the Python script, I started doing some little touch-ups, and got somewhat carried away:

    It is also a Bzr branch, if you want the full history in tiny incremental commits. Highlights:

    – conforms to the Python style guide (PEP-8)
    – informative docstrings (try pydoc plot-torturer)
    – comprehensible math for scale marks
    – columns are sorted alphabetically, not randomly
    – there are named constants instead of hardcoded numbers for almost all plot attributes (margins, spacings, etc.)

    What’s missing:

    – copyright notice
    – unit tests

  8. Xan López says:

    Wow, thanks Marius! I wasn’t really expecting someone to step up and fix my crappy code, so I spent most of yesterday adding lots of features and basically rewriting a huge chunk of the script. Now it can: plot all the graphs using the same scale (“normalize”); group the timings by widget (as it was in the first version) or by “property” (expose, map, etc. Really useful IMHO); and filter arbitrary widgets or properties from the graphs, so you can do stuff like make a graph with boot::expose for GtkLabel, GtkButton and GtkRadioButton.
    I’m going to do the following: I’ll open a project in, merge my new version with yours and put a license on the damn thing. When it’s done I’ll make another post about it so you can go and fix all the broken stuff again if you feel like it.

    Benjamin: it’s a pity that I didn’t find your code earlier, but anyway it’s not like I wasted my life writing a ~300-line Python script. The SQL thing is a very good idea though. And it’s good to know that the resize thing is broken.

  9. Xan: I somehow missed the “plot-torturer on garage” post. Is your blog syndicated on Planet GNOME?

    Anyway, I took a peek, and it looks like you merged almost all of my changes. I have only a few nitpicky stylistic suggestions:

    – Mixing tabs and spaces in the indentation often causes trouble, or so I’ve heard. Some people have their editors assume a tab size of 4 spaces (heretics! burn!).

    – “merge_dict” is a function that has a side effect (it modifies its first argument and returns it). The side-effect-y nature of it would be highlighted by making it not return anything. That would also simplify your foo_merge(): master[k] = merge_dict(master[k], dict[k]) would become just merge_dict(master[k], dict[k])

    – “foo_merge” is not a very clear function name. Hm… “merge_datasets”?

    – “if condition: continue” on one line is a bit hard to read when the condition is long.

    – Inconsistent function call spacing (i.e. mixing both “fn(args)” and “fn (args)” in the same source file) is somewhat distracting. PEP-8 recommends “fn(args)” with no space.

    – I would find

    parser = getattr(self, “parse_%s” %
    result = [parser(file, options) for file in file_list]

    easier to read than

    method_name = “parse_%s” %
    result = [getattr(self, method_name)(file, options) for file in file_list]
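
To make the merge_dict suggestion from this comment concrete, here is a minimal sketch. The function bodies are assumptions, since the real script is not shown here; only the names merge_dict and merge_datasets come from the comment. The point is the calling convention: a function that mutates its argument returns None, the same way list.sort() does.

```python
def merge_dict(target, source):
    """Merge source into target in place; deliberately returns None."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            # Both sides are dicts: merge recursively instead of overwriting.
            merge_dict(target[key], value)
        else:
            target[key] = value


def merge_datasets(master, new):
    """Fold one parsed dataset into master, key by key (hypothetical body)."""
    for key, values in new.items():
        # No assignment needed: merge_dict changes master[key] directly.
        merge_dict(master.setdefault(key, {}), values)


# Made-up data, purely for illustration.
master = {"GtkLabel": {"expose": 1.0}}
merge_datasets(master, {"GtkLabel": {"map": 2.0}, "GtkButton": {"expose": 3.0}})
```

Because merge_dict returns nothing, a reader cannot mistake it for a function that builds a new dictionary and leaves its input alone.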


  10. Xan says:

    Man, patches welcome! :)
    The most alarming thing is the tabs vs. spaces issue: has Emacs betrayed me beyond all repair?
    All of them are sensible comments anyway (you don’t like foo_merge!?), will fix them.
