2003 – James Henstridge

17 December 2003

Post author:James Henstridge
Post published:17 December, 2003
Post category:Uncategorised

Callum: the slowness of the modular DocBook XSLT stylesheets is in the chunking code, as I found out a while ago. You will find that if you turn off chunking (i.e. produce one huge output file rather than many smaller files), the processing time will be cut in half. Interestingly, the older DSSSL stylesheets showed the opposite behaviour.

One thing that might be interesting would be to try porting gtk-doc over to using Shaun McCance's new XSLT stylesheets (there are more details on his website). If these are suitable, they could give a significant boost to building API and user docs.

5 November 2003

Post published:5 November, 2003
Post category:Uncategorized

Mark: the support for building the freedesktop.org X server hasn't been there for long; it was only added yesterday by Johan Dahlin. If anyone else is interested in building some of the stuff in freedesktop.org CVS using jhbuild, I wrote some instructions and put them in the wiki.

Atom

Post published:3 November, 2003
Post category:Uncategorized

Have been playing around with Atom, which looks like a nicer form of RSS. Assuming your content is already in XHTML, it looks a lot easier to generate an Atom file than an RSS file, because the content can be embedded directly rather than needing to be escaped as character data. Similarly, an Atom file is easier to process with standard XML tools than RSS, because the document only needs to be parsed once to get at the content (which is probably what you were after anyway).

I decided to take a look at what would be necessary to get Advogato to produce nice Atom feeds. One of the difficulties is that all the content is stored as plain HTML that is not XML compatible. After a little bit of head scratching I realised that libxml can already do this kind of normalisation without much trouble, as it has an HTML parser that produces a DOM tree compatible with its XML parser/dumper APIs. I did some simple test programs in Python and C.

I wonder whether code like this could be used directly in the diary posting code? With some small extensions, it would be pretty easy to implement tag/attribute sanitisation, and double new line to new paragraph conversion (the current implementation of this is quite annoying -- it still adds extra <p> tags for new lines that are clearly outside of a paragraph).
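The normalisation step described above (loose HTML in, well-formed XML out) can be sketched roughly as follows. This is not the libxml-based test program mentioned in the post: it swaps in the `html.parser` and `xml.etree.ElementTree` modules from the standard library of a current Python 3, and the names `TreeBuilder`, `VOID_TAGS` and `normalise` are made up for illustration. It only handles a few of the cases a real HTML parser would (unclosed tags, void elements, stray end tags, bare ampersands).

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Assumption: a small, non-exhaustive set of elements that never take an end tag.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class TreeBuilder(HTMLParser):
    """Build an ElementTree from loosely written HTML (illustrative only)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        # Wrap everything in a <div> so the result has a single root element.
        self.root = ET.Element("div")
        self.stack = [self.root]  # innermost open element is last

    def handle_starttag(self, tag, attrs):
        # A new <p> directly inside an open <p> implicitly closes the old one.
        if tag == "p" and self.stack[-1].tag == "p":
            self.stack.pop()
        # Valueless attributes (e.g. <input disabled>) get an empty string.
        elem = ET.SubElement(self.stack[-1], tag, {k: v or "" for k, v in attrs})
        if tag not in VOID_TAGS:
            self.stack.append(elem)

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: add the element, never leave it open.
        ET.SubElement(self.stack[-1], tag, {k: v or "" for k, v in attrs})

    def handle_endtag(self, tag):
        # Close back to the nearest matching open element; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        parent = self.stack[-1]
        if len(parent):
            # Text after a closed child belongs to that child's tail.
            last = parent[-1]
            last.tail = (last.tail or "") + data
        else:
            parent.text = (parent.text or "") + data


def normalise(html):
    """Return the HTML fragment as a well-formed XML string."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return ET.tostring(builder.root, encoding="unicode")


if __name__ == "__main__":
    print(normalise("<p>one<br>two<p>three & <b>four"))
```

Because the result is an ordinary element tree, tag/attribute sanitisation could in principle be added as a pass over the tree before serialising, which is the kind of small extension the post has in mind.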

22 October 2003

Post published:22 October, 2003
Post category:Uncategorised

Laptop

I started running out of space on my laptop, so I decided it would be easier to buy a new hard disk than to clean things up (after all, I could get a 40GB drive for about AU$200, which would give me more than 3 times as much storage, and it had almost identical power requirements). If only things were that easy ...

After backing everything up, the first problem was taking the old hard disk out of the machine. The m300 is quite a nice machine, as you only need to undo one screw to remove the hard drive mounting. Getting the hard drive out of the mounting was a bit more of a problem, as there were two torx screws holding the drive in. Moreover, I didn't have access to a small enough torx driver :(. Luckily the screw heads were raised enough that it was possible to undo them using some pliers without damaging anything.

After getting the new drive into the mounting frame and into the machine, I needed to get Windows 98 onto the drive. This was required to get hibernation working under Linux (the BIOS saves the contents of memory to a special file on the Windows partition). It turned out that the CD that came with the laptop was a quick restore disk, and wanted to create a full 40GB partition rather than use the smaller partition I had already created. It then proceeded to screw up the restore, leaving me with a system that (a) wouldn't boot fully, and (b) was convinced that there were errors on the hard disk, but just couldn't find them. I guess that the restore CD managed to mis-format the drive somehow. In the end, I had to borrow a Windows 98 CD and do a clean install, which worked perfectly (and let me install to a smaller partition). I can see how a quick restore CD could be useful in many common cases, but this one was nowhere near as robust as I would have liked. Compared to this, getting Linux up and running was trivial.

After completing the restore, I did a few tests with hdparm -Tt, which showed that the new disk had a read performance of 25MB/s (in comparison, the old disk did 13MB/s). This has resulted in noticeably shorter compile times on the laptop. It is also a lot quieter when busy. This should put off the need to get a new laptop for quite a while.

Gnome 2.5

Updated my system to CVS head, and things are looking good. The new Nautilus feels even faster (especially in spatial mode). Apparently metadata plugins are planned for 2.6, which should be interesting. It should allow people to implement things like TortoiseCVS, augmenting the existing views rather than creating a completely new view like Apotheke does.

Python

Post published:20 October, 2003
Post category:Uncategorised

Been reading over Ulrich Drepper's paper on how to write shared libraries, and it struck me that use of the PyArg_ParseTupleAndKeywords() function will result in a lot of relocations that can't be avoided.

I did a few tests using some dummy extension modules that contained a number of functions. I tried varying the number of functions, the number of arguments for each function, and whether keyword arguments were supported. I found that in the PyArg_ParseTuple() case, the number of relocations was proportional to the number of functions (as expected -- a few relocations for each entry in the PyMethodDef array). For the PyArg_ParseTupleAndKeywords() case, there was also one relocation for each argument listed in the keyword list array, which dominated as the number of arguments went up.

I haven't checked how much influence this has on the startup speed, but it would make a difference to the amount of code shareable between processes for larger modules.
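The scaling described above can be written down as a back-of-the-envelope model. This is only a sketch of the arithmetic, not a measurement: the function name `estimate_relocations` is made up, and the default of 3 relocations per method table entry is an assumption (one for each pointer field of a PyMethodDef entry: name, C function and docstring), standing in for the "few" mentioned in the post. Real counts depend on the compiler, the linker and how the module is written.

```python
def estimate_relocations(n_functions, n_args, keywords, per_entry=3):
    """Rough relocation count for a dummy extension module.

    per_entry is an assumption: one relocation per pointer field
    (name, C function, docstring) in each method table entry.
    """
    # Method table: a fixed number of relocations per function.
    total = n_functions * per_entry
    if keywords:
        # Keyword list arrays: one relocation per argument name, per function.
        total += n_functions * n_args
    return total


if __name__ == "__main__":
    for n_args in (1, 5, 10):
        plain = estimate_relocations(10, n_args, keywords=False)
        kw = estimate_relocations(10, n_args, keywords=True)
        print(n_args, plain, kw)
```

Under these assumptions the keyword lists account for more than half of the relocations once functions take more than three arguments each, which is the "dominated" behaviour noted above.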