Version control discussion on the Python list

The Python developers have been discussing a migration off CVS on the python-dev mailing list. During the discussion, Bazaar-NG was mentioned. A few posts of note:

I’m going to have to play around with bzr a bit more, but it looks very nice (and should require less typing than baz …)

Overriding Class Methods in Python

One of the features added back in Python 2.2 was class methods. These differ from traditional methods in the following ways:

  1. They can be called on both the class itself and instances of the class.
  2. Rather than binding to an instance, they bind to the class. This means that the first argument passed to the method is a class object rather than an instance.

For most intents and purposes, class methods are written the same way as normal instance methods. One place that things differ is overriding a class method in a subclass. The following simple example demonstrates the problem:

class SubClass(ParentClass):
    @classmethod
    def create(cls, arg):
        # Broken: ParentClass.create is already bound to ParentClass
        ret = ParentClass.create(cls, arg)
        return ret

This code is broken because looking up create() on ParentClass returns a method that is already bound to ParentClass, rather than an unbound method as it would for a normal instance method. The explicit cls is therefore passed as an extra argument, and the most likely outcome is a TypeError because the method receives too many arguments.
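The failure can be reproduced with a minimal sketch. The body of ParentClass here is an assumption made for illustration (the post does not show it); any class method taking a single argument would behave the same way.

```python
class ParentClass(object):
    def __init__(self, arg):
        self.arg = arg

    @classmethod
    def create(cls, arg):
        # cls is whichever class the method was looked up on
        return cls(arg)


class SubClass(ParentClass):
    @classmethod
    def create(cls, arg):
        # ParentClass.create is already bound to ParentClass, so cls
        # arrives as an unwanted extra positional argument
        return ParentClass.create(cls, arg)


try:
    SubClass.create(42)
except TypeError as exc:
    error = exc   # the call fails before any instance is created
else:
    error = None
```

Even if the argument counts happened to line up, the parent implementation would see ParentClass as cls rather than SubClass, which is rarely what the subclass wants.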

So how do you chain up to the parent class implementation? You use the super() object, which was also added in Python 2.2 as an alternative way to chain to the parent implementation of a method. The above code can be rewritten as follows:

class SubClass(ParentClass):
    @classmethod
    def create(cls, arg):
        # super() finds the parent's create() and binds it to cls
        ret = super(SubClass, cls).create(arg)
        return ret

If you haven’t ever used the super() object, this is what it is doing in the above example:

  1. SubClass is looked up in the list cls.__mro__ (a linearised list of ancestor classes in the order used for method resolution).
  2. The class dict for each ancestor class coming after SubClass in cls.__mro__ is checked to see if it contains “create”.
  3. The super() object returns a version of “create” in the context of cls using the __get__(cls) “descriptor get” method.
  4. When this bound method gets called, cls will be passed in instead of the parent class.
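The steps above can be approximated by hand. This is only a sketch: lookup_after() is a hypothetical helper written for illustration, not how super() is actually implemented, and the ParentClass and GrandChild classes are assumptions added so the example runs on its own.

```python
class ParentClass(object):
    @classmethod
    def create(cls, arg):
        return (cls.__name__, arg)


class SubClass(ParentClass):
    @classmethod
    def create(cls, arg):
        return super(SubClass, cls).create(arg)


class GrandChild(SubClass):
    pass


def lookup_after(start, cls, name):
    """Hypothetical helper approximating super(start, cls).<name>."""
    mro = cls.__mro__
    # Step 1: locate start in the MRO; step 2: scan the classes after it
    for klass in mro[mro.index(start) + 1:]:
        if name in klass.__dict__:
            # Step 3: bind the descriptor in the context of cls
            return klass.__dict__[name].__get__(None, cls)
    raise AttributeError(name)


# Step 4: the parent implementation receives GrandChild as cls
via_super = GrandChild.create(1)
via_helper = lookup_after(SubClass, GrandChild, "create")(1)
```

Both calls reach ParentClass.create() with cls set to GrandChild, so the parent code still sees the class the method was originally called on.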

Previously I’d ignored super() for the most part, since I could use the old chaining syntax. This shows a place where the old-style syntax can’t be applied.

Python Challenge

Found out about The Python Challenge. While you don’t need to use Python to solve most of the problems, a knowledge of the language certainly helps. While the initial problems are fairly easy, some of the later ones are quite difficult, and cover many topics.

If you decide to have a go, here are a few hints that might help:

  • Keep a log of what you do. Solutions to earlier problems may provide insight into subsequent ones.
  • Look at ALL the information provided to you. If the solution isn’t apparent, look for patterns in the information and extrapolate.
  • If you are using brute force to solve a problem, there is probably a quicker and simpler method to get the answer.
  • If you get stuck, check the forum for hints.

There is also a solutions wiki; however, you need to have solved the corresponding problem before it will give you access.

8 April 2005

Tracing Python Programs

I was asked recently whether there was an equivalent of sh -x for Python (i.e. print out each statement before it is run), to help with debugging a script. It turns out that there is a module in the Python standard library to do so, but it isn’t listed in the standard library reference for some reason.

To use it, simply run the program like this:

/usr/lib/python2.4/ -t

This’ll print out the filename, line number and contents of that line before executing the code. If you want to skip the output for the standard library (i.e. only show statements from your own code), simply pass --ignore-dir=/usr/lib/python2.4 (or similar) as an option.
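The same kind of tracing can also be driven from inside a program. The sketch below assumes the module in question is the standard library's trace module, which accepts the same -t and --ignore-dir options on its command line; greet() is a made-up function used only as something to trace.

```python
import sys
import trace


def greet(name):
    message = "hello, " + name
    return message


# ignoredirs plays the role of --ignore-dir: skip the standard library
tracer = trace.Trace(trace=1, count=0,
                     ignoredirs=[sys.prefix, sys.exec_prefix])

# Lines executed inside greet() are reported as they run
result = tracer.runfunc(greet, "world")
```

runfunc() returns whatever the traced function returns, so the tracer can wrap an existing call without changing the surrounding code.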


So the free (no-cost) version of BitKeeper has been discontinued, leaving just the commercial version and the limited open source version (which is essentially limited to checking out the head revision of a particular tree).

It seems a bit weird that one of the stated reasons for discontinuing the free version is a dispute with OSDL, where some employees were using BitKeeper (e.g. Linus), while another unrelated employee was reverse engineering it as a personal project. This is a bit surprising, since it seems that a scenario almost the same as this was brought up last year and Larry said his concern was a licensed BitKeeper user helping someone else reverse engineer the code. Of course, there are probably other issues involved here.

This does bring up an interesting issue of what users of the free version are going to do with their repositories. While they can use the open source edition to easily check out the head revision and continue development, it isn’t clear that it can be used to extract all the information stored in a repository. And since BitMover has refused to sell the commercial version to some people, it is conceivable that some projects could find themselves unable to access their revision history with BitKeeper.

I doubt this situation is acceptable to many users (they are using a version control system, so probably want to keep their revision history), so there will probably be some programs written to extract all the information from a BitKeeper repository. Ironically, this could add some value to BitKeeper for BitMover’s commercial customers — insurance for their data in case BitMover disappears or something else makes BitKeeper unusable to them.


If you are coming to Australia for the first time, make sure you pack your camel suit and other valuables in your cabin luggage, rather than your checked luggage. It will save you trouble in the long run.

Python Unicode Weirdness

While discussing unicode on IRC with owen, we ran into a peculiarity in Python’s unicode handling. It can be tested with the following code:

>>> s = u'\U00010001\U00010002'
>>> len(s)
>>> s[0]

Python can be compiled to use either 16-bit or 32-bit widths for characters in its unicode strings (16-bit being the default). When compiled in 32-bit mode, the results of the last two statements are 2 and u'\U00010001' respectively. When compiled in 16-bit mode, the results are 4 and u'\ud800'.
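The two sets of results can be reproduced on any build by looking at the string's UTF-16 and UTF-32 encodings, which correspond to what the 16-bit and 32-bit builds store internally. This is an emulation written for illustration rather than a look at the interpreter's real storage; on Python 3.3 and later the build-time choice no longer exists and len(s) is always 2.

```python
import struct

s = u'\U00010001\U00010002'

# UTF-16 (little-endian, no BOM) split into 16-bit code units,
# mirroring what a 16-bit build indexes and counts
data = s.encode('utf-16-le')
units = struct.unpack('<%dH' % (len(data) // 2), data)

# UTF-32 gives one 32-bit unit per code point, as in a 32-bit build
wide = s.encode('utf-32-le')
points = struct.unpack('<%dI' % (len(wide) // 4), wide)

narrow_len, narrow_first = len(units), units[0]    # 4 and 0xD800
wide_len, wide_first = len(points), points[0]      # 2 and 0x10001
```

Each character outside the basic multilingual plane becomes a surrogate pair in UTF-16, which is why the 16-bit build reports a length of 4 and returns the lone high surrogate for s[0].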

So rather than just being an implementation detail, the unicode string width chosen at compile time can alter the result of Python programs that manipulate characters outside of the basic multilingual plane. It would be nice if Python programs didn’t have to care about this sort of detail …