Investigating Bugzilla performance with Cairo

Within GNOME we use Bugzilla a bit differently than upstream. GNOME has a lot more products and components than e.g. the Bugzilla used by Mozilla (which is also used to tracks the bugs of Bugzilla itself). More than 350 products, a lot of components, versions, etc. This especially can cause performance issues that aren’t seen when you only have a few products. One of these issues is bug 372795. This is a performance bug about Bugzilla doing one query per component/version/milestones on query.cgi?format=advanced. That bug has a patch where all components/versions/milestones will use just one query per item. However, you might not always want that information and the patch could trigger instances where the information is loaded while it isn’t being used.

Performance testing Bugzilla is always annoying. You have to do something like:

DBI_PROFILE='2/DBI::ProfileDumper' time ./query.cgi 'format=advanced'

This generates a dbi.prof file which can be analyzed with the dbiprof command. Running command like shown above becomes annoying when you want to test a lot of these scripts. Plus sometimes you want the uncached timings (/proc/sys/vm/drop_caches + MySQL stuff). It is all just too slow. I wrote a script that handles this stuff and does all the work of comparing a standard Bugzilla installation to when that installation is patched. The script parses the ‘time’ and ‘dbiprof’ output and generates a few .csv files. However, having a chart would be much nicer.

It took a while to find a nice (working) Python Cairo chart module — I ended up with a patched pycha. I had to patch it as it didn’t allow the font size to be changed and it also generated area charts instead of a real line chart. The end result is picture which is far easier to understand than a .csv with loads of figures:


I still want to make that chart clearer. However, it is a big improvement compared to the .csv files. What you see are various tests ran against 4 ‘Bugzilla installations’: -anon means that the test was ran while logged out, ‘auth’ means while logged in, ‘patched’ is when the patch was applied. To check for performance regressions you just have to compare the 1st ‘tick’ to the 2nd and the 3rd to the 4th. Which in a chart is easy, just look where the lines go up instead of down. In above chart, query.cgi is much faster. However, editproduct, ‘select_classification’ and the ‘My Bugs’ buglist are slower. The timings are in seconds btw and are for the cached cache (each test ran 5 times, the timings are for the 5th run).
Making a script to run such tests is really great (hate doing things manually). Even better is having Cairo to really do something with these results — the chart is so much clearer. I’d still like a better Cairo charting option from within Python though — perhaps some nice goffice Python bindings?

Release notes on GNOME Library

In a combined effort Frederic Peters and me moved the GNOME 2.12+ release notes to the GNOME Library. There have been various changes over the years in how these release notes were generated (only discussing GNOME 2.x). In the beginning, the release notes were stored in the releng module. Someone had to fetch the docbook sources, generate HTML from these files and commit the HTML to the www.gnome.org module (gnomeweb-wml). Starting with 2.10 the release notes were translated into various languages. Someone still had to commit the HTML files to gnomeweb-wml. That changed as of 2.12. The release-notes were put into gnomeweb-wml and its build was changed to automatically generate the HTML (in various languages). The release-notes build process within gnomeweb-wml wasn’t smart. It would generate HTML files no matter what. Even if you just changed a file in www.gnome.org/projects/ it would still generate the 2.12 (and newer) release notes in various languages. This severely slowed down the update of www.gnome.org. From 2.18 onwards the build was changed to not recreate release notes for older GNOME versions (as no change would be expected); although ideally it should just rebuild when needed.

As of now, you can see the release notes on the GNOME Library (I’ll wait with the URL until 2.22). This has the following benefits:

  • General GNOME Library goodness. Meaning: users should automatically get the release notes in their language (when available)
  • No duplication of docbook intelligence
  • Less maintenance
    Initially I changed the gnomeweb-wml release notes build system to change the layout to match
  • Only builds when needed

However, it took some effort to get these release notes on the GNOME Library. Most importantly, the GNOME Library builds from tarballs, not SVN. This meant somehow giving a new tarball to the GNOME Library after a commit. Secondly, GNOME Library assumes tarballs do not change their contents (e.g. you cannot just give it an updated release-notes-2.22.tar.gz ). And the last important issue was that the old release notes had some invalid docbook.

Creating a new tarball involved the moving of the 2.12+ release-notes from gnomeweb-wml into a new repository. This to make the tarball generating scripts easier, make damned-lies integration easier and it is a lot cleaner as well. Moving these files with history meant a painful combination of svndumpfilter, vim (to change the locations), and trial and error until the repository import worked again (due to usage of svn cp between release note versions). Basically the stuff changed from /trunk/www.gnome.org/start/$VERSION(/notes)?/ to /branches/gnome-$VERSION/help. The ‘help’ directory is there to ease damned-lies (has some defaults); same for the ‘branches’.
After the new repository was created, a tarball had to be made. Although actually a tarball per branch. This was a problem as although there was enough infrastructure to build something after a commit (the ‘gnomeweb’ scripts), these had no notion of branches. Everything works using trunk and the post-commit script assumes the commit was done in trunk. First task was making a post-commit script that would figure out which branch(es) a commit was made to. For this I started with a copy of the damned-lies post-commit script, changed it enough to introduce a horrible bug and spent loads of time figuring out why it wasn’t working in some cases. Oh well.
Next up was adding branch intelligence to the ‘gnomeweb’ scripts. There are basically two scripts. One notices that a commit was made. That script is started after receiving a certain email. The other script starts via cron and is smart enough to avoid updating e.g. GNOME Library if there is another update running atm. However, it will notice if a new commit was made while a build was running. In that case, it’ll rebuild again (and again). This is important as the GNOME Library doesn’t support 2 updated running at the same time (although it usually works fine). After the branch support was added, it was time to make ‘hook’ script (the ‘gnomeweb’ script optionally runs a different hook script depending on the svn repos name).

The hook script itself is pretty simple (not perfect, but works and it is pretty clear what is going on):

#!/bin/sh

DIR="$(basename "$(pwd)")"
TGZ="/usr/local/www/gnomeweb/libgo-extra-tarballs/$DIR.tar.gz"

cd help
for LANG in $(find -maxdepth 1 -type d)
do
    PO_FILE="$LANG/$LANG.po"
    if [ -e "$PO_FILE" ]
    then
        for XML_FILE in C/*.xml
        do
            DEST="$LANG/$(basename $XML_FILE)"
            BUILD=0
            test -e "$DEST" || BUILD=1
            test "$PO_FILE" -nt "$DEST" && BUILD=1
            test "$XML_FILE" -nt "$DEST" && BUILD=1
            if [ "$BUILD" = "1" ]
            then
                xml2po -e -p "$PO_FILE" "$XML_FILE" > "$DEST"
            fi
        done
    fi
done
cd ..

cd ..
rm -f "$TGZ"
tar czf "$TGZ" --exclude=".svn" "$DIR"

# Trigger update of GNOME Library
echo "$DIR forced" | /home/admin/bin/gnomeweb/set_buildflag.py -m library-web

Comments on above hook script: The gnomeweb script automatically runs this hook script from the checkout directory. Further, I added some assumptions to that script on how to name the checkout directory for a branch. Basically: $REPOSNAME-$BRANCHNAME, but if the branchname looks like: gnome-2-20, name the checkout directory like: $REPOSNAME-2.20. This allows the hook script to easily tar that directory. The directory name is important as (AFAIK) GNOME Library wants a ‘release-notes-2.20.tar.gz’ tarball to have a ‘release-notes-2.20’ directory.

Above hook script triggers a rebuild of the GNOME Library. I’ve added some (customizable) ordering to the gnomeweb scripts to ensure ‘library-web’ is checked after ‘release-notes’.

Frederic worked on making GNOME Library (AKA libgo / library-web) show these release-notes somewhere. I’m not fully aware of the various things he had to do, but hopefully he’ll make a post if something is missing (or I’ll update this one). Initially he had added to libgo to always rebuild certain modules (meaning: release-notes). Later he changed that to look at the timestamp of the tarball. Of course he had to add workarounds due to the various ways in which the release-notes XML was wrong. Stuff like a no/wrong ‘id’ being set on the article element. Another issue was with getting a title/abstract from the release-notes. This is done automatically and works perfectly for every document aside from the release-notes. However, for the release-notes it only gets some random text. Forcing Frederic to add an option not to fetch a title/abstract for some modules (release-notes) and using some standard text instead.
‘Layout’-wise Frederic made various important changes. First of all the release notes appeared under ‘users’, while the target is everyone (users, developers, administrators, …). Frederic made a new category for the release-notes called ‘misc’. Another change was to render the release notes ‘flat’. Meaning: everything in one big page. This as we learned in older release notes that people 1) didn’t bother to click the link to the release notes 2) often stopped at the first page. Further, he added a list of languages to the side of the release notes. Before we had a link to these languages within the release notes themselves. Meaning: if there we’re 45 languages, the translators would translate those links again. Now GNOME Library handles that. Also avoids changes to the release notes docbook just because it was translated.
He also made some changes that aren’t really release-notes specific. First of all a translated document will have links to the specific translated image. Meaning it links to image.png.es instead of image.png. Before it had a link to image.png. This relied on the user having the correct browser setting being correct. That broke down whenever you e.g. we’re intentially looking at a language other than the one your browser was set to. Second change was how links within the same document we’re made. Before it always had links like: $FILENAME#someid. Now it only does that if it links to another file. Otherwise if you’re looking at release-notes/2.20/, your browser would load index.html.en#someid — but that is the same document!

Currently a commit to release-notes will start the following process:
commit on socket -> post-hook command -> nasty gawk script -> email to menubar -> fwd’ed to window -> procmail as gnomeweb user -> setting of buildflag -> cron -> update_sites.py -> svn checkout/update -> release-notes hook script -> tarball -> forced libgo update -> library-web svn update -> library-web hook script -> lots of magic (I’m not Frederic) -> updated document. This whole process takes 5-7 minutes (due to cron — that is on purpose so it sometimes combines multiple commits).