bzr-dbus hacking

When working on my bzr-avahi plugin, Robert asked me how it should fit in with his bzr-dbus plugin. The two plugins offer complementary features and could share a fair bit of infrastructure code. Furthermore, if the two plugins did not cooperate, there was a risk that they would break when installed together.

Given the dependencies of the two packages, it made more sense to put common infrastructure in bzr-dbus and have bzr-avahi depend on it. That said, bzr-dbus is a bit more difficult to install than bzr-avahi, since it requires installation of a D-Bus service activation file. After looking at the code, it seemed that there was room to simplify how bzr-dbus worked and improve its reliability at the same time.

The primary purpose of bzr-dbus is to send signals over the session bus whenever the head revision of a branch changes. This was implemented with a daemon, started via D-Bus activation, that sent out the signals in response to method calls made by short-lived bzr processes.

While this seems to be the design the dbus-python tutorial guides you to use, I don’t think it is the best fit for bzr-dbus. The approach I took was to do away with the daemon altogether: the D-Bus session bus does a pretty good job of broadcasting the signals on its own.

The code that previously asked the broadcast daemon to send the revision signal was changed to simply send the signal. The following helper made this pretty easy to do without having to write any extra classes to emit the signals:

import dbus
import dbus.lowlevel

def send_signal(bus, dbus_interface, signal_name, signature, *args):
    """Send a signal on the bus."""
    # Emit from the root object path; receivers match on interface and name.
    message = dbus.lowlevel.SignalMessage('/', dbus_interface, signal_name)
    message.append(signature=signature, *args)
    bus.send_message(message)
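
For illustration, a commit hook could then fire off a signal along these lines (the interface name, signal name and arguments here are made up for the example, not necessarily the ones bzr-dbus uses):

import dbus

bus = dbus.SessionBus()
# 'ss' marshals the two string arguments: the branch URL and revision id.
send_signal(bus, 'org.example.bzr.Broadcast', 'Revision',
            'ss', 'file:///srv/branches/trunk', 'some-revision-id')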

With these changes, the commit hook now only needs to connect to the session bus, fire off the signal and return. Previously it was connecting to the bus, getting a proxy for the broadcast service (which might involve activating it), sending a method call message and waiting for a method return message. The new code is faster, and if no one is listening for the signals it only wakes the bus daemon.

Code consuming the signals had to switch to the bus.add_signal_receiver() method to register its callbacks, which allows subscribing to a signal irrespective of its origin.
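
A consumer would then look something like the following sketch (again with made-up interface and signal names), using a GLib main loop to dispatch incoming messages:

import dbus
import gobject
from dbus.mainloop.glib import DBusGMainLoop

# A main loop must be attached to the connection before signals can be received.
DBusGMainLoop(set_as_default=True)
bus = dbus.SessionBus()

def on_revision(url, revision_id):
    print('New head revision %s at %s' % (revision_id, url))

# Leaving out bus_name and path matches the signal regardless of who sent it.
bus.add_signal_receiver(on_revision,
                        signal_name='Revision',
                        dbus_interface='org.example.bzr.Broadcast')

gobject.MainLoop().run()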

The only missing feature with these changes was annotating the signals with additional URLs when the branch was being shared over the network. As these additional URLs are only really interesting when accessing the branch remotely, I moved the functionality to the “bzr lan-notify” command so that it annotates the revision announcements just before broadcasting them to the local network.

With all the changes applied, the D-Bus API consists entirely of signal emissions, which gives a looser coupling between the various components: each component will happily function in the absence of the others, which is great for reliability.

Once the patches are merged, I’ll have to look at porting bzr-avahi to this infrastructure. Together, these two plugins offer compelling features for local network collaboration.

Running Valgrind on Python Extensions

As most developers know, Valgrind is an invaluable tool for finding memory leaks. However, when debugging Python programs the pymalloc allocator gets in the way.

There is a Valgrind suppression file distributed with Python that gets rid of most of the false positives, but it does not give particularly good diagnostics for memory allocated through pymalloc. To properly analyse leaks, you often need to recompile Python without pymalloc.
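
For reference, a typical run with the suppression file (shipped as Misc/valgrind-python.supp in the Python source tree) looks something like this, with myscript.py standing in for the program under test:

valgrind --tool=memcheck --leak-check=full \
    --suppressions=Misc/valgrind-python.supp python myscript.py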

As I don’t like having to recompile Python, I took a look at Valgrind’s client API, which provides a way for a program to detect whether it is running under Valgrind. Using the client API, I was able to put together a patch that automatically disables pymalloc when appropriate. It can be found attached to bug 2422 in the Python bug tracker.

The patch still needs a bit of work before it will be mergeable with Python 2.6/3.0 (mainly autoconf foo).  I also need to do a bit more benchmarking on the patch.  If the overhead of turning on this patch is negligible, then it’d be pretty cool to have it enabled by default when Valgrind is available.

Honey Bock

Yesterday I bottled the honey bock that has been brewing over the last week. This one was made with the following ingredients:

  1. A Black Rock Bock beer kit
  2. 1kg of honey
  3. 500g of dextrose
  4. Caster sugar for carbonation

The only difference from the standard procedure was replacing part of the brewing sugar with honey. Before being added, the honey needs to be pasteurised, which involves heating it to 80°C and keeping it at that temperature for half an hour or so. This kills off any wild yeasts or other undesirables that might spoil the brew.

I’ve used honey in a few other brews over the years but had not tried it with a dark beer, so it will be interesting to see how it turns out. The previous beers had a stronger honey flavour than commercial beers like Beez Neez, which is probably a good thing for a dark beer.  I guess I’ll find out after it matures for about a month.

Two-Phase Commit in Python’s DB-API

Marc uploaded a new revision of the Python DB-API 2.0 Specification yesterday that documents the new two-phase commit extension that I helped develop on the db-sig mailing list.

My interest in this started from the desire to support two-phase commit in Storm: without that feature, there are far fewer occasions where its ability to talk to multiple databases can be put to use. As I was doing some work on psycopg2 for Launchpad, I initially put together a PostgreSQL-specific patch, which was (rightly) rejected by Federico.

He suggested that it would be better to try to standardise an API on the db-sig list, so that’s what I did. I looked over the APIs exposed by other database adapters that supported 2PC, and the 2PC APIs of the major free databases that lacked support in their Python adapters (MySQL and PostgreSQL). The resulting API is a bit more complicated than my original PostgreSQL-only proposal, but has the advantage of being implementable on other databases such as MySQL.

Below is a simple example of using the API directly (missing some of the error handling):

# begin transactions for each database connection
conn1.tpc_begin(conn1.xid(42, 'transaction ID', 'connection 1'))
conn2.tpc_begin(conn2.xid(42, 'transaction ID', 'connection 2'))
# Do stuff with both connections
...
try:
    conn1.tpc_prepare()
    conn2.tpc_prepare()
except DatabaseError:
    conn1.tpc_rollback()
    conn2.tpc_rollback()
else:
    conn1.tpc_commit()
    conn2.tpc_commit()

Alternatively, if you’ve got one connection supporting 2PC and another that only supports one-phase commit, it could be structured as follows:

# begin a two-phase transaction on the connection that supports 2PC;
# the other connection just uses an ordinary transaction
conn1.tpc_begin(conn1.xid(42, 'transaction ID', 'connection 1'))
# Do stuff with both connections
...
try:
    conn1.tpc_prepare()
    # commit the one-phase connection between prepare and commit, while
    # the prepared transaction can still be rolled back
    conn2.commit()
except DatabaseError:
    conn1.tpc_rollback()
    conn2.rollback()
else:
    conn1.tpc_commit()

While it is possible to use the 2PC API directly, it is expected that most applications will rely on a transaction manager, such as Zope’s transaction module, to coordinate global transactions.
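
To give a feel for what such a manager does, here is a minimal sketch of the commit phase, assuming every connection supports 2PC and has already had tpc_begin() called; commit_two_phase() and DatabaseError are illustrative names rather than part of the specification:

def commit_two_phase(connections):
    """Prepare every connection, then commit them all."""
    try:
        for conn in connections:
            conn.tpc_prepare()
    except DatabaseError:
        # Nothing has committed yet, so the whole global transaction
        # can still be rolled back cleanly.
        for conn in connections:
            conn.tpc_rollback()
        raise
    else:
        for conn in connections:
            conn.tpc_commit()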

The hope is that by offering a consistent API, Python application frameworks will be more likely to bother supporting this feature of databases. Hopefully you’ll be able to use the API with PostgreSQL and Storm soon.