MySQL Announces Move to Bazaar

Bazaar logoIt has been a while coming, but MySQL has announced their move to Bazaar for version control.  This has been a long time coming, and it is great to finally see it announced publicly.

The published Bazaar branches include 8 years of history going back to MySQL 3.23.22, imported from the BitKeeper repositories.  So you can see a lot more than just the history since the switch: you can use all the normal Bazaar tools to see where the code came from and how it evolved.  Giuseppe Maxia has posted some instructions on how to check out the code for those who are interested.

I haven’t checked extensively, but I wouldn’t be surprised if this is the largest public code base managed with Bazaar.  I’ve known from personal experience working on Launchpad that it is capable of handling large trees, but it is good to have a high profile project to point at as an example now.

How not to do thread local storage with Python

The Python standard library contains a function called thread.get_ident().  It will return an integer that uniquely identifies the current thread at that point in time.  On most UNIX systems, this will be the pthread_t value returned by pthread_self(). At first look, this might seem like a good value to key a thread local storage dictionary with.  Please don’t do that.

The value uniquely identifies the thread only as long as it is running.  The value can be reused after the thread exits.  On my system, this happens quite reliably with the following sample program printing the same ID ten times:

import thread, threading

def foo():
    print 'Thread ID:', thread.get_ident()

for i in range(10):
    t = threading.Thread(target=foo)
    t.start()
    t.join()

If the return value of thread.get_ident() was used to key thread local storage, all ten threads would share the same storage. This is not generally considered to be desirable behaviour.

Assuming that you can depend on Python 2.4 (released 3.5 years ago), then just use a threading.local object. It will result in simpler code, correctly handle serially created threads, and you won’t hold onto TLS data past the exit of a thread.

You will save yourself (or another developer) a lot of time at some point in the future. Debugging these problems is not fun when you combine code doing proper TLS with other code doing broken TLS.