Transaction Management in Django

In my previous post about Django, I mentioned that I found the transaction handling strategy in Django to be a bit surprising.

Like most object relational mappers, it caches information retrieved from the database, since you don’t want to be constantly issuing SELECT queries for every attribute access. However, it defaults to committing after saving changes to each object. So a single web request might end up issuing many transactions:

Operation          Transaction
Change object 1    Transaction 1
Change object 2    Transaction 2
Change object 3    Transaction 3
Change object 4    Transaction 4
Change object 5    Transaction 5

Unless no one else is accessing the database, other users could modify objects that the ORM has cached across those transaction boundaries. This also makes it difficult to test your application in any meaningful way, since it is hard to predict what changes will occur at those points. Django does provide a few ways to get better transactional behaviour.

The @commit_on_success Decorator

The first is a decorator that turns on manual transaction management for the duration of the function and does a commit or rollback when it completes depending on whether an exception was raised. In the above example, if the middle three operations were made inside a @commit_on_success function, it would look something like this:

Operation          Transaction
Change object 1    Transaction 1
Change object 2    Transaction 2
Change object 3
Change object 4
Change object 5    Transaction 3

Note that the decorator is most often applied to view functions, so it will usually cover most of the request. That said, there are a number of cases where extra work might be done outside of the function. Some examples include work done in middleware classes and views that call other view functions.
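
As a reminder of how the decorator is typically used, here is a minimal sketch; the view and the Player model are made up, but the decorator itself comes from django.db.transaction in the Django versions discussed here:

from django.db import transaction
from django.http import HttpResponse
from myapp.models import Player   # stand-in model for the example

@transaction.commit_on_success
def award_points(request):
    for player in Player.objects.all():
        player.score += 1
        player.save()
    # All of the saves above are committed together when the function
    # returns, or rolled back if an exception escapes it.
    return HttpResponse('done')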

The TransactionMiddleware class

Another alternative is to install the TransactionMiddleware middleware class for the site. This turns on transaction management for the duration of each request, similar to what you’d see with other frameworks, giving results something like this:

Operation          Transaction
Change object 1    Transaction 1
Change object 2
Change object 3
Change object 4
Change object 5
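
Enabling it is just a matter of listing the class in the settings file; a rough sketch of the relevant setting (the surrounding entries are only illustrative):

MIDDLEWARE_CLASSES = (
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.transaction.TransactionMiddleware',
)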

Combining @commit_on_success and TransactionMiddleware

At first, it would appear that these two approaches cover pretty much everything you’d want. But there are problems when you combine the two. If we use the @commit_on_success decorator as before and TransactionMiddleware, we get the following set of transactions:

Operation          Transaction
Change object 1    Transaction 1
Change object 2
Change object 3
Change object 4
Change object 5    Transaction 2

The transaction for the @commit_on_success function has extended to cover the operations made beforehand. This also means that operations #1 and #5 are now in separate transactions despite the use of TransactionMiddleware. The problem also occurs with nested use of @commit_on_success, as reported in Django bug 2227.

A better behaviour for nested transaction management would be something like this:

  1. On success, do nothing. The changes will be committed by the outside caller.
  2. On failure, do not abort the transaction, but instead mark it as uncommittable. This would have similar semantics to the Zope transaction.doom() function.

It is important that the nested call does not abort the transaction because that would cause a new transaction to be started by subsequent code: that should be left to the code that began the transaction.
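
To make the proposal concrete, here is a small self-contained sketch of a nesting-aware decorator that dooms the transaction on inner failure instead of aborting it. None of this is Django code; the state object and the commit/rollback helpers are toy stand-ins for what would really live on the database connection.

class _TxnState(object):
    depth = 0
    doomed = False

_state = _TxnState()

def _commit():
    print 'COMMIT'

def _rollback():
    print 'ROLLBACK'

def commit_on_success(func):
    def wrapper(*args, **kwargs):
        outermost = (_state.depth == 0)
        _state.depth += 1
        try:
            return func(*args, **kwargs)
        except Exception:
            # An inner failure marks the transaction as uncommittable
            # rather than aborting it (cf. Zope's transaction.doom()).
            _state.doomed = True
            raise
        finally:
            _state.depth -= 1
            if outermost:
                if _state.doomed:
                    _rollback()
                else:
                    _commit()
                _state.doomed = False
    return wrapper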

The @autocommit decorator

While the above interaction looks like a simple bug, the @autocommit decorator is another matter. It turns autocommit on for the duration of a function call, no matter what the transaction mode for the caller was. If we took the original example and wrapped the middle three operations with @autocommit and used TransactionMiddleware, we’d get 4 transactions: one for the first two operations, then one for each of the remaining operations.

I can’t think of a situation where it would make sense to use it, and I wonder if it was just added for completeness.


While the nesting bugs remain, my recommendation would be to go for the TransactionMiddleware and avoid use of the decorators (both in your own code and third party components). If you are writing reusable code that requires transactions, it is probably better to assert that django.db.transaction.is_managed() is true so that you get a failure for improperly configured systems while not introducing unwanted transaction boundaries.
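
A minimal sketch of that guard, assuming the old django.db.transaction API that provides is_managed(); the function and model attributes here are made up:

from django.db import transaction

def transfer_funds(source, dest, amount):
    # Fail fast if the caller has not enabled managed transactions,
    # rather than silently issuing extra commits.
    assert transaction.is_managed(), \
        'transfer_funds() requires a managed transaction'
    source.balance -= amount
    dest.balance += amount
    source.save()
    dest.save()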

For the Storm integration work I’m doing, I’ve set it to use managed transaction mode to avoid most of the unwanted commits, but it still falls prey to the extra commits when using the decorators. So I guess inspecting the code is still necessary. If anyone has other tips, I’d be glad to hear them.

Storm 0.13

Yesterday, Thomas rolled the 0.13 release of Storm, which can be downloaded from Launchpad.  Storm is the object relational mapper for Python used by Launchpad and Landscape, so it is capable of supporting quite large scale applications.  It is seven months since the last release, so there are a lot of improvements.  Here are a few simple statistics:

                        0.12   0.13   Change
Tarball size (KB)        117    155       38
Mainline revisions       213    262       49
Revisions in ancestry    552    875      323

So it is a fairly significant update by any of these metrics.  Among the new features are:

  • Infrastructure for tracing the SQL statements issued by Storm.  Sample tracer implementations are provided to implement bounded statement run times and for logging statements (both features used for QA of Launchpad).
  • A validation framework.  The property constructors take a validator keyword argument, which should be a function taking the arguments (object, attr_name, value) and returning the value to set.  If the function raises an exception, it can prevent a value from being set.  By returning something different from its third argument it can transform values (see the sketch after this list).
  • The find() and ResultSet API has been extended to make it possible to generate queries that use GROUP BY and HAVING.  The primary use case is result sets that contain an object plus some aggregates associated with that object.
  • Some core parts of Storm have been accelerated through a C extension.  This code is turned off by default, but can be enabled by setting the STORM_CEXTENSIONS environment variable to 1.  While it is disabled by default, it is pretty stable.  Barring any serious problems reported over the next release cycle, I’d expect it to be enabled by default for the next release.
  • The minimum dependencies of the storm.zope.zstorm module have been reduced to just the zope.interface and transaction modules.  This makes it easier to use the per-thread store management code and global transaction management outside of Zope apps (e.g. for integrating with Django).
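
As an illustration of the validator hook, here is a small sketch; the Account class and the non_negative function are made up, while the validator keyword itself is the 0.13 feature described above:

from storm.locals import Int, Storm

def non_negative(obj, attr_name, value):
    # Raising here prevents the assignment; returning a different
    # value would transform it instead.
    if value is not None and value < 0:
        raise ValueError('%s must not be negative' % attr_name)
    return value

class Account(Storm):
    __storm_table__ = 'account'
    id = Int(primary=True)
    balance = Int(validator=non_negative)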

It doesn’t include my Django integration code though, since that isn’t fully baked.  I’ll post some more about that later.

Using Storm with Django

I’ve been playing around with Django a bit for work recently, which has been interesting to see what choices they’ve made differently to Zope 3.  There were a few things that surprised me:

  • The ORM and database layer defaults to autocommit mode rather than using transactions.  This seems like an odd choice given that all the major free databases support transactions these days.  While autocommit might work fine when a web application is under light use, it is a recipe for problems at higher loads.  By using transactions that last for the duration of the request, the testing you do is more likely to help with the high load situations.
  • While there is a middleware class to enable request-duration transactions, it only covers the database connection.  There is no global transaction manager to coordinate multiple DB connections or other resources.
  • The ORM appears to only support a single connection for a request.  While this is the most common case and should be easy to code with, allowing an application to expand past this limit seems prudent.
  • The tutorial promotes schema generation from Python models, which I feel is the wrong choice for any application that is likely to evolve over time (i.e. pretty much every application).  I’ve written about this previously and believe that migration based schema management is a more workable solution.
  • It poorly reinvents thread local storage in a few places.  This isn’t too surprising for things that existed prior to Python 2.4, and probably isn’t a problem for its default mode of operation.

Other than these things I’ve noticed so far, it looks like a nice framework.

Integrating Storm

I’ve been doing a bit of work to make it easy to use Storm with Django.  I posted some initial details on the mailing list.  The initial code has been published on Launchpad but is not yet ready to merge. Some of the main details include:

  • A middleware class that integrates the Zope global transaction manager (which requires just the zope.interface and transaction packages).  There doesn’t appear to be any equivalent functionality in Django, and this made it possible to reuse the existing integration code (an approach that has been taken to use Storm with Pylons).  It will also make it easier to take advantage of other future improvements (e.g. only committing stores that are used in a transaction, two phase commit).
  • Stores can be configured through the application’s Django settings file, and are managed as long lived per-thread connections.
  • A simple get_store(name) function is provided for accessing per-thread stores within view code.
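
For illustration, view code using the per-thread stores might look roughly like this.  Only get_store(name) itself is described above; the import paths, the 'main' store name and the Person class are assumptions:

from django.shortcuts import render_to_response
from storm.django.stores import get_store    # assumed module path

from myapp.models import Person               # a Storm class, not a Django model

def person_list(request):
    store = get_store('main')                  # store name from the settings file
    people = store.find(Person).order_by(Person.name)
    return render_to_response('person_list.html', {'people': people})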

What this doesn’t do yet is provide much integration with existing Django functionality (e.g. django.contrib.admin).  I plan to try and get some of these bits working in the near future.

How not to do thread local storage with Python

The Python standard library contains a function called thread.get_ident().  It will return an integer that uniquely identifies the current thread at that point in time.  On most UNIX systems, this will be the pthread_t value returned by pthread_self(). At first look, this might seem like a good value to key a thread local storage dictionary with.  Please don’t do that.

The value uniquely identifies the thread only as long as it is running.  The value can be reused after the thread exits.  On my system, this happens quite reliably with the following sample program printing the same ID ten times:

import thread, threading

def foo():
    print 'Thread ID:', thread.get_ident()

for i in range(10):
    t = threading.Thread(target=foo)
    t.start()
    # wait for the thread to exit so that its identifier can be reused
    t.join()

If the return value of thread.get_ident() was used to key thread local storage, all ten threads would share the same storage. This is not generally considered to be desirable behaviour.

Assuming that you can depend on Python 2.4 (released 3.5 years ago), then just use a threading.local object. It will result in simpler code, correctly handle serially created threads, and you won’t hold onto TLS data past the exit of a thread.
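
For example, the kind of code that keys a dictionary on get_ident() can be replaced with something like the following (the names are illustrative):

import threading

_tls = threading.local()

def set_current_request(request):
    # Each thread sees its own 'request' attribute, and the data is
    # dropped when the thread exits.
    _tls.request = request

def get_current_request():
    return getattr(_tls, 'request', None)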

You will save yourself (or another developer) a lot of time at some point in the future. Debugging these problems is not fun when you combine code doing proper TLS with other code doing broken TLS.

Psycopg migrated to Bazaar

Last week we moved psycopg from Subversion to Bazaar.  I did the migration using Gustavo Niemeyer’s svn2bzr tool with a few tweaks to map the old Subversion committer IDs to the email address form conventionally used by Bazaar.

The tool does a good job of following tree copies and creating related Bazaar branches.  It doesn’t have any special handling for stuff in the tags/ directory (it produces new branches, as it does for other tree copies).  To get real Bazaar tags, I wrote a simple post-processing script to calculate the heads of all the branches in a tags/ directory and set them as tags in another branch (provided those revisions occur in its ancestry).  This worked pretty well except for a few revisions synthesised by a previous cvs2svn migration.  As these tags were from pretty old psycopg 1 releases I don’t know how much it matters.

As there is no code browsing set up yet, I set up mirrors of the 2.0.x and 1.x branches on Launchpad to provide that.

It is pretty cool having access to the entire revision history locally, and should make it easier to maintain full credit for contributions from non-core developers.

Psycopg2 2.0.7 Released

Yesterday Federico released version 2.0.7 of psycopg2 (a Python database adapter for PostgreSQL).  I made a fair number of the changes in this release to make it more usable for some of Canonical’s applications.  The new release should work with the development version of Storm, and it shouldn’t be too difficult to get everything working with other frameworks.

Some of the improvements include:

  • Better selection of exceptions based on the SQLSTATE result field.  This causes a number of errors that were reported as ProgrammingError to use a more appropriate exception (e.g. DataError, OperationalError, InternalError).  This was the change that broke Storm’s test suite as it was checking for ProgrammingError on some queries that were clearly not programming errors.
  • Proper error reporting for commit() and rollback(). These methods now use the same error reporting code paths as execute(), so an integrity error on commit() will now raise IntegrityError rather than OperationalError.
  • The compile-time switch that controls whether the display_size member of Cursor.description is calculated is now turned off by default.  The code was quite expensive and the field is of limited use (and not provided by a number of other database adapters).
  • New QueryCanceledError and TransactionRollbackError exceptions.  The first is useful for handling queries that are canceled by statement_timeout.  The second provides a convenient way to catch serialisation failures and deadlocks: errors that indicate the transaction should be retried (see the sketch after this list).
  • Fixes for a few memory leaks and GIL misuses. One of the leaks was in the notice processing code that could be particularly problematic for long-running daemon processes.
  • Better test coverage and a driver script to run the entire test suite in one go.  The tests should all pass too, provided your database cluster uses unicode (there was a report just before the release of one test failing for a LATIN1 cluster).
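
A small sketch of the retry pattern that TransactionRollbackError enables; the exception lives in psycopg2.extensions, while the helper function and its arguments are made up for the example:

from psycopg2.extensions import TransactionRollbackError

def run_in_transaction(conn, operation, max_attempts=3):
    # Serialisation failures and deadlocks are safe to retry, so roll
    # back and run the whole transaction again a few times.
    for attempt in range(max_attempts):
        try:
            operation(conn)
            conn.commit()
            return
        except TransactionRollbackError:
            conn.rollback()
            if attempt == max_attempts - 1:
                raise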

If you’re using previous versions of psycopg2, I’d highly recommend upgrading to this release.

Future work will probably involve support for the DB-API two phase commit extension, but I don’t know when I’ll have time to get to that.

Running Valgrind on Python Extensions

As most developers know, Valgrind is an invaluable tool for finding memory leaks. However, when debugging Python programs the pymalloc allocator gets in the way.

There is a Valgrind suppression file distributed with Python that gets rid of most of the false positives, but it does not give particularly good diagnostics for memory allocated through pymalloc. To properly analyse leaks, you often need to recompile Python with pymalloc disabled.
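
For reference, a typical invocation using the suppression file shipped in the CPython source tree looks something like this (the path to the file will vary with your checkout or distribution):

valgrind --suppressions=Misc/valgrind-python.supp python myscript.py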

As I don’t like having to recompile Python I took a look at Valgrind’s client API, which provides a way for a program to detect whether it is running under Valgrind. Using the client API I was able to put together a patch that automatically disables pymalloc when appropriate. It can be found attached to bug 2422 in the Python bug tracker.

The patch still needs a bit of work before it will be mergeable with Python 2.6/3.0 (mainly autoconf foo).  I also need to do a bit more benchmarking on the patch.  If the overhead of turning on this patch is negligible, then it’d be pretty cool to have it enabled by default when Valgrind is available.

Two-Phase Commit in Python’s DB-API

Marc uploaded a new revision of the Python DB-API 2.0 Specification yesterday that documents the new two phase commit extension that I helped develop on the db-sig mailing list.

My interest in this started from the desire to support two phase commit in Storm – without that feature there are far fewer occasions where its ability to talk to multiple databases can be put to use. As I was doing some work on psycopg2 for Launchpad, I initially put together a PostgreSQL specific patch, which was (rightly) rejected by Federico.

He suggested that it would be better to try and standardise on an API on the db-sig list, so that’s what I did. I looked over the API exposed by other database adapters that supported 2PC, and the 2PC APIs of the major free databases that did not have support in their Python adapters (MySQL and PostgreSQL). The resulting API is a bit more complicated than my original PostgreSQL-only patch, but it has the advantage of being implementable on other databases such as MySQL.

Below is a simple example of using the API directly (missing some of the error handling):

# begin transactions for each database connection
conn1.tpc_begin(conn1.xid(42, 'transaction ID', 'connection 1'))
conn2.tpc_begin(conn2.xid(42, 'transaction ID', 'connection 2'))
# Do stuff with both connections, then prepare and commit both:
try:
    conn1.tpc_prepare(); conn2.tpc_prepare()
except DatabaseError:
    conn1.tpc_rollback(); conn2.tpc_rollback()
else:
    conn1.tpc_commit(); conn2.tpc_commit()

Or alternatively, if you’ve got one connection supporting 2PC and the other only supporting one-phase commit, it could be structured as follows:

conn1.tpc_begin(conn1.xid(42, 'transaction ID', 'connection 1'))
# Do stuff with both connections, then prepare conn1 and commit conn2:
try:
    conn1.tpc_prepare(); conn2.commit()
except DatabaseError:
    conn1.tpc_rollback(); conn2.rollback()
else:
    conn1.tpc_commit()

While it is possible to use the 2PC API directly, it is expected that most applications will rely on a transaction manager to coordinate global transactions, such as Zope’s transaction module.

The hope is that by offering a consistent API, Python application frameworks will be more likely to bother supporting this feature of databases. Hopefully you’ll be able to use the API with PostgreSQL and Storm soon.

Zeroconf Branch Sharing with Bazaar

At Canonical, one of the approaches taken to accelerate development is to hold coding sprints (otherwise known as hackathons, hackfests or similar). Certain things get done a lot quicker face to face compared to mailing lists, IRC or VoIP.

When collaborating with someone at one of these sprints the usual way to let others look at my work would be to commit the changes so that they could be pulled or merged by others. With legacy version control systems like CVS or Subversion, this would generally result in me uploading all my changes to a server in another country only for them to be downloaded back to the sprint location by others.

In contrast, with a modern VCS like Bazaar we should be able to avoid this since the full history of the branch is available locally – enough information to let others pull or merge the changes. That said, we’ve often ended up using a server on the internet to exchange changes despite this. This is the same work flow we use when working from home, so I guess the pain of switching to a new work flow outweighs the potential productivity gains.

The Solution

Bazaar makes it easy to run a read-only server locally:

bzr serve [--directory=DIR]

However, there is still the issue of others finding the branch. They’d need to know the IP address assigned to my computer at the sprint, and the path to the branch on the server. Ideally they’d just need to know the name of my branch. As it happens, we’ve got the technology to fix this.

Avahi makes it trivial to advertise and browse for services on the local network without having to worry about what IP addresses have been assigned or what people name their computer. So the solution is to hook Avahi and Bazaar together. This was fairly easy due to Avahi’s DBus interface and the dbus-python bindings.

The result is my bzr-avahi plugin. You can either download tarballs or install the latest version directly with Bazaar:

bzr branch lp:bzr-avahi ~/.bazaar/plugins/avahi

To use the plugin, you must have at least version 1.1 of Bazaar, the Python bindings for DBus and Avahi, and a working Avahi setup. Once the plugin is installed, it hooks into the standard “bzr serve” command to do the following:

  • scan the directory being served for branches that the user has asked to advertise.
  • ask Avahi to advertise said branches

You can ask to advertise a branch using the new “bzr advertise” command:

bzr advertise [BRANCH-NAME]

If no name is specified, the branch’s nickname is used. The advertise command sends a signal over the session bus to tell any running servers about the change, so there is no need to restart “bzr serve” to see the change.

At this point, the advertised branches should be visible with a service browser like avahi-discover, so that’s half the problem solved. From the client side two things are provided: a special redirecting transport and a command to list all advertised branches on the local network.

The transport allows you to access the branch by its advertised name with most Bazaar commands. For example, merging a branch is as simple as:

$ bzr merge local:BRANCH-NAME
local:BRANCH-NAME is redirected to bzr://hostname.local:4155/path/to/branch
All changes applied successfully.

If you want to get a list of all advertised branches on the network, the “bzr browse” command will print out a list of branch names and the URLs they translate to.

I believe using these tools together should offer a low enough overhead for direct sharing of branches at sprints that people would actually bother using it. It should be quite useful at the next sprint I go to.

Re: Python factory-like type instances

Nicolas: Your metaclass example is a good example of when not to use metaclasses. I wouldn’t be surprised if it is executed slightly differently to how you expect. Let’s look at how Foo is evaluated, starting with what’s written:

class Foo:
    __metaclass__ = FooMeta

This is equivalent to the following assignment:

Foo = FooMeta('Foo', (), {...})

As FooMeta has an __new__() method, the attempt to instantiate FooMeta will result in it being called. As the return value of __new__() is not a FooMeta instance, there is no attempt to call FooMeta.__init__(). So we could further simplify the code to:

Foo = {
    'linux2': LinuxFoo,
    'win32': WindowsFoo,
}.get(PLATFORM, None)
if not Foo:
    # XXX: this should _really_ raise something other than Exception
    raise Exception, 'Platform not supported'

So the factory function is gone completely here, and it is clear that the decision about which class to use is being made at module import time rather than class instantiation time.

Now this isn’t to say that metaclasses are useless. In both implementations, the code responsible for selecting the class has knowledge of all implementations. To add a new implementation (e.g. for Solaris or MacOS X), the factory function needs to be updated. A better solution would be to provide a way for new implementations to register themselves with the factory. A metaclass could be used to make the registration automatic:

class FooMeta(type):
    def __init__(cls, name, bases, attrs):
        super(FooMeta, cls).__init__(name, bases, attrs)
        # register any class in the hierarchy that declares a platform
        if cls.platform is not None:
            register_foo_implementation(cls.platform, cls)

class Foo:
    __metaclass__ = FooMeta
    platform = None

class LinuxFoo(Foo):
    platform = 'linux2'

Now the simple act of defining a SolarisFoo class would be enough to have it registered and ready to use.
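
For completeness, here is one way the register_foo_implementation() helper and a matching factory could look; these are my own sketch rather than anything from the original posting, and they would need to be defined before the classes above:

import sys

_foo_implementations = {}

def register_foo_implementation(platform, cls):
    _foo_implementations[platform] = cls

def create_foo(*args, **kwargs):
    # Pick the implementation registered for the running platform.
    try:
        cls = _foo_implementations[sys.platform]
    except KeyError:
        raise NotImplementedError('Platform not supported: %s' % sys.platform)
    return cls(*args, **kwargs)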