Speeding up NewsBruiser – Initial results

Switched to the hotshot profiler instead of profile. The hotshot profile is better than profile. Made two changes, one to delay reading a notebooks configuration until something actually needs something out of it. Second change is storing the order of the blogs in a seperate file. This avoids NewsBruiser reading the order from each blog.

The hotshot profiler produces different results. A profile before above changes:

         77867 function calls (75825 primitive calls) in 0.905 CPU seconds

   Ordered by: internal time, call count
   List reduced from 856 to 100 due to restriction 

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      139    0.179    0.001    0.347    0.002 Notebook.py:626(readConfiguration)
        1    0.062    0.062    0.165    0.165 core.py:15(?)
        3    0.062    0.021    0.063    0.021 __init__.py:9(?)
       86    0.058    0.001    0.506    0.006 NBConfig.py:156(__registerPluginDir)
    15078    0.057    0.000    0.057    0.000 string.py:351(find)
   402/71    0.039    0.000    0.095    0.001 sre_parse.py:374(_parse)
       26    0.039    0.001    0.481    0.019 __init__.py:1(?)
     6459    0.038    0.000    0.049    0.000 util.py:59(replaceBaseURLs)
     6526    0.031    0.000    0.048    0.000 IWantOptions.py:154(getOption)

And after:

         28853 function calls (26806 primitive calls) in 0.467 CPU seconds

   Ordered by: internal time, call count
   List reduced from 856 to 100 due to restriction 

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.060    0.060    0.077    0.077 drv_libxml2.py:35(?)
   402/71    0.040    0.000    0.106    0.001 sre_parse.py:374(_parse)
        1    0.037    0.037    0.114    0.114 saxexts.py:41(_create_parser)
       26    0.020    0.001    0.697    0.027 __init__.py:1(?)
        1    0.019    0.019    0.312    0.312 feedparser.py:12(?)
   747/67    0.018    0.000    0.038    0.001 sre_compile.py:27(_compile)
     3541    0.017    0.000    0.025    0.000 sre_parse.py:201(get)
        3    0.016    0.005    0.065    0.022 __init__.py:3(?)
        1    0.015    0.015    0.051    0.051 cookielib.py:26(?)

Have to do a little work before I can commit this. I need to ensure the cache file will be saved whenever the order changes (or a new blog is added). Also need to rest it on Python 2.2 (version on the server). Still pretty good for a 120 line patch (large part consists of indenting changes and some debugging code).

Update: Patch has been committed.

More on ResourceAbuser

Yesterday I did an initial investigation to find out why NewsBruiser (software behind blogs.gnome.org) is so slow. Put a copy of blogs.gnome.org on my machine so I can hack it without breaking stuff. Did a profile of NewsBruiser as it served an image. Result:

         133023 function calls (130602 primitive calls) in 2.150 CPU seconds

   Ordered by: internal time
   List reduced from 935 to 100 due to restriction 

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      139    0.230    0.002    0.860    0.006 Notebook.py:643(readConfiguration)
     6526    0.120    0.000    0.200    0.000 IWantOptions.py:154(getOption)
   402/71    0.110    0.000    0.380    0.005 sre_parse.py:374(_parse)
     9198    0.110    0.000    0.110    0.000 :0(append)
    15077    0.110    0.000    0.110    0.000 string.py:351(find)
   747/67    0.100    0.000    0.250    0.004 sre_compile.py:27(_compile)
     6630    0.090    0.000    0.090    0.000 :0(replace)
     6526    0.070    0.000    0.110    0.000 Options.py:42(isThemed)
 1062/384    0.070    0.000    0.080    0.000 sre_parse.py:140(getwidth)
     6459    0.060    0.000    0.170    0.000 util.py:59(replaceBaseURLs)
     4511    0.050    0.000    0.070    0.000 sre_parse.py:182(__next)
     6543    0.050    0.000    0.050    0.000 :0(getattr)

There is a lot of stuff in there. Decided that starting with at the top was the best (readConfiguration). Initially I only looked at cumtime. Should have looked at ncalls and percall as that would have saved me some time. readConfiguration uses code to parse a handmade configuration file. Would be better if it just used some config file supported by Python (better chance that it is implemented in C or Python and optimized for speed). Thought about using a pickle file as a cache.

At this point I decided to learn abit more about NewsBruiser and the interaction between all the classes/files. Looked into all the stuff NewsBruiser does before it actually transmits the image. NewsBruiser actually reads (using not-speedy Python code) the configuration files of every blog within blogs.gnome.org (which I could have known earlier by looking at ncalls). Meaning, if more blogs are added to blogs.gnome.org, it slow down because of that. Grr. Added a quick hack to delay loading the config file until something wanted to access the config. Didn’t work. Seems the ordering of a blog is stored as a number in the config file and NewsBruiser really wants that. Not good. Looked at the ncalls vs the number of blogs. We do not have that many blogs. Seems that ResourceAbuser reads those configuration files twice.

If I avoid 138 readConfiguration calls, the cumtime would drop to 0.006, saving 0.854 secs out of the 2.150 CPU seconds. Pretty good for an initial investigation. Saw a tip in the NewsBruiser documentation for increasing the performance. There goes my my assumption that the developer just did not care about performance issues.

Although I’m investigating how I can optimize the NewsBruiser image serving this is not my goal. Images should be served by Apache (statically). Using a Python script to do that is stupid. The reason why I’m still investigating how NewsBruiser serves images because I want to understand why it isn’t faster. It should be like: 1) locate file, 2) read file 3) push content to stdout. Reading the configuration files of other blogs twice is not one of the things that should be part of this.

Slow bugzilla.gnome.org

Getting annoyed with bugzilla.gnome.org being slow. The server hosting Bugzilla also hosts almost all of the *.gnome.org websites and anoncvs. One known problem is blogs.gnome.org (aka NewsBruiser). That software is very resource intensive. The Nautilus Search blog post by alexl caused a load of >500. Because of the high load the sysadmins couldn’t log in to kill the processes (Elijah and me where logged in but couldn’t do anything).. end result was a reboot.

Today the load hit 127. Again due to blogs.gnome.org. There are some protections set in the apache config (renice, max cpu time, max # of processes, caching), but it is not enough. Fortunately got sudo access to the blogs.gnome.org just a few hours before and was able to kill all the evil processes.

Some 15 minutes later and the load again rose to at 15, except there where no blogs.gnome.org hits. Apparently 6 anoncvs sessions are also enough.

Really need a few servers just for Bugzilla In the meantime I’ll have to fix the biggest problems with blogs.gnome.org. Rather hack on b.g.o.