Python class advisors

Anyone who has played with Zope 3 has probably seen the syntax used to declare what interfaces a particular class implements. It looks something like this:

class Foo:
    implements(IFoo, IBar)
    ...

This leads to the following question: how can a function call inside a class definition’s scope affect the resulting class? To understand how this works, a little knowledge of Python metaclasses is needed.

Metaclasses

In Python, classes are instances of metaclasses. For new-style classes, the default metaclass is type (which happens to be its own metaclass). When you create a new class or subclass, you are creating a new instance of the metaclass. The constructor for a metaclass takes three arguments: the class’s name, a tuple of the base classes and a dictionary attributes and methods. So the following two definitions of the class C are equivalent:

class C(object):
    a = 42

C = type('C', (object,), {'a': 42})

The metaclass for a particular class can be picked in a number of ways:

  • A __metaclass__ variable at module or class scope.
  • Use the same metaclass as the base class.

If no metaclass is specified through either of these means, an “old style” class is created. I won’t cover old style classes here.

Now in Python calling a function and creating a new instance look pretty similar. In fact the metaclass machinary doesn’t really care. The following two class definitions are also equivalent:

class C:
    __metaclass__ = type

def not_the_metaclass(name, bases, attrs):
    return type(name, bases, attrs)

class C:
    __metaclass__ = not_the_metaclass

So using a function or other callable object as the metaclass allows you to hook into the class creation without affecting the type of the resulting class.

Class Advisors

The tricks performed by the Zope implements() function are wrapped up in the zope.interface.advice module. It does so by making use of the fact that Python programs can inspect their execution stack at runtime.

  1. Walk up the stack to where the scope of the class being defined.
  2. Check to see if a “__metaclass__” variable has been set, which would indicate the that a metaclass has been specified for this particular class already.
  3. Check the module scope for a “__metaclass__” variable.
  4. Define a function advise(name, bases, cdict) that does the following:
    • Deduce the metaclass (either what __metaclass__ was set to in the class scope, the module scope, or check base classes).
    • Call the metaclass to create the new class.
    • Do something to the new class (in the case of Zope, it sets what interfaces the class implements).
  5. Set the “__metaclass__” variable in the class scope to this function.

The actual implementation is a little more complicated to handle the case of registering multiple class advisors for a single class. The actual interface provided is quite simple though:

from zope.interface.advice import addClassAdvisor

def setA():
    def advisor(cls):
        cls.a = 42
        return cls
    addClassAdvisor(advisor)

class C:
    setA()

This simply sets the attribute ‘a’ on the class after it has been created. Also, since method decorators are implemented as a single function call, they can add a class advisor as a way to perform some extra work on the class or method after the class has been constructed.

Version control discussion on the Python list

The Python developers have been discussing a migration off CVS on the python-dev mailing list. During the discussion, Bazaar-NG was mentioned. A few posts of note:

I’m going to have to play around with bzr a bit more, but it looks very nice (and should require less typing than baz …)

Overriding Class Methods in Python

One of the features added back in Python 2.2 was class methods. These differ from traditional methods in the following ways:

  1. They can be called on both the class itself and instances of the class.
  2. Rather than binding to an instance, they bind to the class. This means that the first argument passed to the method is a class object rather than an instance.

For most intents and purposes, class methods are written the same way as normal instance methods. One place that things differ is overriding a class method in a subclass. The following simple example demonstrates the problem:

class SubClass(ParentClass):
    @classmethod
    def create(cls, arg):
        ret = ParentClass.create(cls, arg)
        ret.dosomethingelse()
        return ret

This code is broken because the ParentClass.create() call is calling the version of create() method in the context of ParentClass, rather than calling an unbound method like it would with a normal instance method. The most likely outcome will be a TypeError due to the method receiving too many arguments.

So how do you chain up to the parent class implementation? You use the super() object, which was also added in Python 2.2 as an alternative way to chain to the parent implementation of a method. The above code rewritten as follows:

class SubClass(ParentClass):
    @classmethod
    def create(cls, arg):
        ret = super(SubClass, cls).create(arg)
        ret.dosomethingelse()
        return ret

If you haven’t ever used the super() object, this is what it is doing in the above example:

  1. SubClass is looked up in the list cls.__mro__ (a linearised list of ancestor classes in the order used for method resolution).
  2. The class dict for each ancestor class coming after SubClass in cls.__mro__ is checked to see if it contains “create“.
  3. The super() object returns a version of “create” in the context of cls using the __get__(cls) “descriptor get” method.
  4. When this bound method gets called, cls will be passed in instead of the parent class.

Previously I’d ignored super() for the most part, since I could use the old chaining syntax. This shows a place where the old-style syntax can’t be applied.

Python Challenge

Found out about The Python Challenge. While you don’t need to use Python to solve most of the problems, a knowledge of the language certainly helps. While the initial problems are fairly easy, some of the later ones are quite difficult, and cover many topics.

If you decide to have a go, here are a few hints that might help:

  • Keep a log of what you do. Solutions to may provide insight into subsequent problems.
  • Look at ALL the information provided to you. If the solution isn’t apparent, look for patterns in the information and extrapolate.
  • If you are using brute force to solve a problem, there is probably a quicker and simpler method to get the answer.
  • If you get stuck, check the forum for hints.

There is also a solutions wiki, however, you need to have solved the corresponding problem before it will give you access.

8 April 2005

Tracing Python Programs

I was asked recently whether there was an equivalent of sh -x for Python (ie. print out each statement before it is run), to help with debugging a script. It turns out that there is a module in the Python standard library to do so, but it isn’t listed in the standard library reference for some reason.

To use it, simply run the program like this:

/usr/lib/python2.4/trace.py -t program.py

This’ll print out the filename, line number and contents of that line before executing the code. If you want to skip the output for the standard library (ie. only show statements from your own code), simply pass --ignore-dir=/usr/lib/python2.4 (or similar) as an option.

BitKeeper

So the free (no-cost) version of BitKeeper has been discontinued, leaving just the commercial version and the limited open source version (which is essentially limited to checking out the head revision of a particular tree).

It seems a bit weird that one of the stated reasons for discontinuing the free version is a dispute with OSDL, where some employees were using BitKeeper (eg. Linus), while another unrelated employee was reverse engineering it as a personal project. This is a bit surprising, since it seems that a scenario almost the same as this was brought up last year and Larry said his concern was a licensed BitKeeper user helping someone else reverse engineer the code. Of course, there are probably other issues involved here.

This does bring up an interesting issue of what users of the free version are going to do with their repositories. While they can use the open source editing to easily check out the head revision and continue development, it isn’t clear that it can be used to extract all the information stored in a repository. And since BitMover has refused to sell the commercial version to some people, it is conceivable that some projects could find themselves unable to access their revision history with BitKeeper.

I doubt this situation is acceptable to many users (they are using a version control system, so probably want to keep their revision history), so there will probably be some programs written to extract all the information from a BitKeeper repository. Ironically, this could add some value to BitKeeper for BitMover’s commercial customers — insurance for their data in case BitMover disappears or something else makes BitKeeper unusable to them.

Airports

If you are coming to Australia for first time, make sure you pack your camel suit and other valuable in your cabin luggage, rather than the checked luggage. It will save you trouble in the long run.

Python Unicode Weirdness

While discussing unicode on IRC with owen, we ran into a peculiarity in Python’s unicode handling. It can be tested with the following code:

>>> s = u'\U00010001\U00010002'
>>> len(s)
>>> s[0]

Python can be compiled to use either 16-bit or 32-bit widths for characters in its unicode strings (16-bit being the default). When compiled in 32-bit mode, the results of the last two statements are 2 and u'\U00010001' respectively. When compiled in 16-bit mode, the results are 4 and u'\ud800'.

So rather than just being an implementation detail, the unicode string width chosen at compile time can alter the result of Python programs that manipulate characters outside of the basic multilingual plane. It would be nice if Python programs didn’t have to care about this sort of detail …