--create-prefix not needed with bazaar.launchpad.net

When outlining the use of team branches on Launchpad previously, I used the --create-prefix option when pushing the branch to sftp://bazaar.launchpad.net. This was to make sure the initial push would succeed, even if the /~username/product directory the branch would be created in didn’t exist.

To simplify things for users, we made a change to the SFTP server in the latest release, so that --create-prefix is no longer necessary. This does not affect the allowed branch directories though: the structure is used to associate the branches with products, and decide who can write to the branches.
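To make the directory structure concrete, here is a minimal Python sketch that splits a branch URL into the owner, product and branch name. The function name is my own invention and the branch name in the example URL is a placeholder; this only illustrates the layout and is not the code the SFTP server runs.

```python
from urllib.parse import urlsplit


def parse_branch_url(url):
    """Split sftp://host/~owner/product/branchname into its parts."""
    path = urlsplit(url).path
    segments = [segment for segment in path.split("/") if segment]
    if len(segments) != 3 or not segments[0].startswith("~"):
        raise ValueError("expected /~owner/product/branchname, got %r" % path)
    owner, product, name = segments
    # The leading "~" marks the person or team that owns the branch.
    return {"owner": owner[1:], "product": product, "name": name}


if __name__ == "__main__":
    print(parse_branch_url(
        "sftp://bazaar.launchpad.net/~username/product/branchname"))
```

A path that does not have exactly these three components is rejected with a ValueError, which mirrors the point that the allowed branch directories have not changed.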

Another change included in the rollout is the ability to rename branches and reassign them to different owners through the web interface. So for instance, you can give ownership of a personal branch to a team if your project grows to multiple developers. This should be used sparingly, since it changes the published branch URLs, which can confuse people using your branch.

Gnome-gpg 0.5.0 Released

Over the weekend, I released gnome-gpg 0.5.0. The main features in this release are support for running without gnome-keyring-daemon (of course, you can’t save the passphrase in this mode), and use of the same keyring item name for the passphrase as Seahorse. The release can be downloaded here:


I also switched over from Arch to Bazaar. The conversion was fairly painless using bzr baz-import-branch, and means that I have both my revisions and Colin’s revisions in a single tree. The branch can be pulled from:

bzr branch http://www.gnome.org/~jamesh/bzr/gnome-gpg/devel gnome-gpg

All of the converted revisions authored by me have been signed with my PGP key. As signatures can’t get moved over in the conversion process, none of Colin’s revisions are signed. Note that the signatures in Bazaar are for particular tree states rather than changes between two tree states, so it doesn’t affect the trust of the current revisions.

While I was at it, I also converted the other branches I had in my www.gnome.org Arch archive over to bzr. The only other branch that people might find useful is the http-resource code, which I’ve updated to compile with the latest libsoup.

Shared Branches using Bazaar and Launchpad

Earlier, David Allouche described how to host Bazaar branches on
Launchpad. At the end, he alluded to the ability to create branches
that can be committed to by anyone on a team. I’ll describe how this
works here.

Launchpad Teams

Launchpad allows people to organise themselves into teams. Most
of the things people can do in Launchpad can also be done by teams,
including owning branches.

You can create a new team at the following page:


There are three different membership policies you can choose from:

  • Open: anyone can join. Choosing this sort of team
    effectively gives everyone write access to branches owned by the
    team.
  • Moderated: new memberships must be approved by one of the
    administrators (this is the default policy). This makes it easy for
    people to request commit access to the branch while still requiring
    approval from a team administrator.
  • Restricted: new members can only be added by the team
    administrators. This is appropriate if new members shouldn’t be able
    to propose themselves normally.

Once the team has been created, members of the team can create the

Uploading a Team Owned Branch

Now that you are a member of a team, you can upload branches to
that team’s directory on bazaar.launchpad.net. This is done
in the same way as uploading personal branches, as described in David’s
article:

cd branchdir
bzr push --create-prefix sftp://bazaar.launchpad.net/~teamname/product/branchname

When the command completes, the team owned branch will have been
created. Now you can treat this branch like a personal branch, but
once someone else pushes a commit to the branch, “bzr push”
will tell you that the branch has diverged, and not let you push your
changes until you merge them to your branch.

An alternative model is to use checkouts, which provide a workflow
closer to CVS and Subversion without losing Bazaar’s ability to work
while disconnected.

Bazaar Checkouts

A Bazaar checkout is a local working copy bound to a remote branch
such that changes are committed to the remote location. The remote
branch data is also cached locally to speed up local operations and
allow you to work while disconnected from the network. A checkout of
the previously created team branch can be created with the following
commands:

bzr checkout sftp://bazaar.launchpad.net/~teamname/product/branchname team-branch
cd team-branch

Alternatively, if you still have the local branch used to create
the team branch, it can be converted to a checkout with the “bzr
bind” command:

cd branchdir
bzr bind sftp://bazaar.launchpad.net/~teamname/product/branchname

You can then make commits to the checkout as you would with any
other branch, provided the checkout is up to date with the remote
branch. If another team member has committed to the branch in the
meantime though, you will be prompted to update your checkout to the
latest revision of the remote branch.

If this happens, the checkout can be updated by issuing the
“bzr update” command. You can then retry the commit, after
fixing any conflicts that are reported.

Disconnected Operation with Checkouts

If you are disconnected from the network, it will be impossible to
publish your changes to the remote branch, so running the “bzr
commit” command on the checkout will fail.

To handle this situation, Bazaar lets you make local commits in
your checkout. This is performed with the “bzr commit
--local” command. You can treat these commits just like regular
commits and get diffs between them, etc.

When you are connected to the network again, run “bzr
update”. This will pull in any changes made to the remote branch
and turn your local commits into a pending merge. After fixing any
conflicts (if there are any), running “bzr commit” will
publish the changes to the remote branch for the world to see.
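
The checkout behaviour described in the last two sections can be summarised with a toy Python model. None of this is Bazaar code: the class and attribute names are made up, history is reduced to a list of messages, and refusing a normal commit while unmerged local commits exist is a simplification of mine. It only shows the order of events: a commit is refused when the checkout is out of date, local commits stay in the checkout, and an update turns them into a pending merge that the next commit publishes.

```python
class OutOfDate(Exception):
    """The checkout no longer matches the remote branch."""


class RemoteBranch:
    def __init__(self):
        self.revisions = []  # list of (message, merged_local_commits)


class Checkout:
    def __init__(self, remote):
        self.remote = remote
        self.seen = len(remote.revisions)  # how much remote history we hold
        self.local = []    # commits made while disconnected
        self.pending = []  # local commits waiting to be merged

    def commit(self, message, local=False):
        if local:
            self.local.append(message)  # recorded in the checkout only
            return
        if self.local or self.seen != len(self.remote.revisions):
            raise OutOfDate("update the checkout, then retry the commit")
        self.remote.revisions.append((message, tuple(self.pending)))
        self.pending = []
        self.seen = len(self.remote.revisions)

    def update(self):
        self.seen = len(self.remote.revisions)
        self.pending.extend(self.local)  # local commits become a pending merge
        self.local = []


if __name__ == "__main__":
    remote = RemoteBranch()
    mine, theirs = Checkout(remote), Checkout(remote)
    theirs.commit("fix typo")
    mine.commit("offline work", local=True)
    mine.update()
    mine.commit("merge offline work")
    print(remote.revisions)
```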

Feature Branches

If you are developing a feature that is not yet appropriate to
check into the mainline team branch, the checkout workflow may not be
convenient. In this case, it may make sense to create a personal
branch to do the work and then merge the changes later on.

You can create a new branch using the “bzr branch”
command. Since the checkout made previously contains full history
data we can branch from it directly, which saves downloading the
branch again:

bzr branch checkoutdir mybranch

If you want to make this branch available to others, it can be
published to bazaar.launchpad.net as described in David’s
original article.

Merging your branch into the checkout is the same as merging into
any other Bazaar branch:

bzr update
bzr merge mybranch
# resolve any conflicts that may be reported
bzr commit

Once the commit completes, the changes will be available on the
team branch.


Without much trouble, you can create a shared mainline branch with Bazaar and Launchpad and use it in a way familiar to Subversion users. With one extra command you can extend the familiar model to allow commits while disconnected, providing the power of distributed revision control when you need it.

JHBuild Improvements

I’ve been doing most JHBuild development in my bzr branch recently. If you have bzr 0.8rc1 installed, you can grab it here:

bzr branch http://www.gnome.org/~jamesh/bzr/jhbuild/jhbuild.dev

I’ve been keeping a regular CVS import going at http://www.gnome.org/~jamesh/bzr/jhbuild/jhbuild.cvs using Tailor, so changes people make to module sets in CVS make their way into the bzr branch. I’ve used a small hack so that merges back into CVS get recorded correctly in the jhbuild.cvs branch:

  1. Apply the diff between jhbuild.cvs and jhbuild.dev to my CVS checkout and commit.
  2. Modify tailor to pause before committing to jhbuild.cvs.
  3. While tailor is paused, run bzr revert followed by a merge of the same changes from jhbuild.dev.
  4. Let tailor complete the commit.

It’s a bit of a hack, but it allows me to do repeated merges from the CVS import to my development branch (and back again). It also means that any file moves I do in my bzr branch are reflected in the CVS import when merged.

So now when filing bug reports on jhbuild, you can submit fixes in the form of bzr branches as well as patches.

So, on to the improvements:

Generic Version Control Interface

Previously, to add support for a new version control system the following additions were needed:

  • Some code to invoke the version control utility to make checkouts and update working trees.
  • Code to implement the build state machine for modules using the version control system (these classes would generally derive from AutogenModule which implemented most of the build logic).
  • Code to create instances of the above module type when parsing .modules files.

This was quite a bit of work, and in the end would only help if the code in question was set up to build the same way as most Gnome modules (i.e. with an autogen.sh script and autotools). If you wanted to build a module using Python distutils out of Subversion, a new module type would be needed. If you wanted to build a distutils module from a tarball, that would be another module type again.

With the new system, the different version control support modules provide a common interface. This means that a single module type is capable of implementing the build state machine for any version control system. Similarly, it should now be possible to implement distutils module support such that it will work with any supported version control system.
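
As a rough sketch of the idea (not JHBuild’s actual classes), the common interface could look something like the following. The names Branch, SubversionBranch, BzrBranch and GenericModule are hypothetical, and the commands are only returned as argument lists rather than executed; the update commands are meant to be run inside the working tree.

```python
from abc import ABC, abstractmethod


class Branch(ABC):
    """What the build code needs from any version control system."""

    def __init__(self, url, checkoutdir):
        self.url = url
        self.checkoutdir = checkoutdir

    @abstractmethod
    def checkout_command(self):
        """Return the argv that creates a fresh working tree."""

    @abstractmethod
    def update_command(self):
        """Return the argv that updates an existing working tree."""


class SubversionBranch(Branch):
    def checkout_command(self):
        return ["svn", "checkout", self.url, self.checkoutdir]

    def update_command(self):
        return ["svn", "update"]


class BzrBranch(Branch):
    def checkout_command(self):
        return ["bzr", "branch", self.url, self.checkoutdir]

    def update_command(self):
        return ["bzr", "pull"]


class GenericModule:
    """One module type whose checkout stage works with any Branch."""

    def __init__(self, name, branch):
        self.name = name
        self.branch = branch

    def checkout_stage(self, have_working_tree):
        # The module never needs to know which tool is behind the branch.
        if have_working_tree:
            return self.branch.update_command()
        return self.branch.checkout_command()
```

The point of the design is that GenericModule only talks to the Branch interface, so supporting another version control system means writing one more small Branch subclass rather than a whole new module type.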

This work is not yet finished though. A bit more work needs to be done to parse version control system agnostic module definitions from .modules files. Once that is in place, a fair bit of the current syntax can be deprecated and eventually removed, and adding support for a new version control system shouldn’t take more than 100-200 lines.

Module Type Simplifications

As well as reducing the number of module types that need to be maintained in JHBuild, I’ve been working on simplifying the code in these module types. Previously, each stage of a module build was represented by a method call on the module type. The return value of the method was used to say (a) whether the stage succeeded or not, (b) what the next state would be and (c) if an error occurred some alternative next states to go to (e.g. offer to rerun autogen.sh).

With the new system, the next state and error states are declared as attributes on the method object. The method can indicate a failure by raising a particular exception. This greatly simplifies the cases where a build stage involves a number of separate actions that could each fail individually, since the exception cuts processing short without the error checks getting in the way of the code.
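
Here is a minimal sketch of that pattern in Python. It is an illustration rather than JHBuild’s real code: the names StageFailed, ToyModule, next_state, error_states and run_build are all assumptions, and the "actions" just append to a log instead of running commands.

```python
class StageFailed(Exception):
    """Raised by a build stage to report failure."""


class ToyModule:
    def __init__(self, failing_action=None):
        self.failing_action = failing_action
        self.log = []

    def run_action(self, action):
        if action == self.failing_action:
            raise StageFailed(action)
        self.log.append(action)

    def do_configure(self):
        self.run_action("autogen.sh")
    do_configure.next_state = "build"
    do_configure.error_states = []

    def do_build(self):
        # Several actions that can each fail; no per-call error checks needed.
        self.run_action("make")
        self.run_action("make install")
    do_build.next_state = "done"
    do_build.error_states = ["configure"]


def run_build(module, state="configure"):
    """Drive the state machine; return (final_state, alternative_states)."""
    while state != "done":
        stage = getattr(module, "do_" + state)
        try:
            stage()
        except StageFailed:
            # Stop at the failed stage and offer its declared alternatives.
            return state, stage.error_states
        state = stage.next_state
    return state, []
```

For example, run_build(ToyModule("make install")) stops in the build stage and reports "configure" as the alternative state to offer, while the log shows that "make" had already completed.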

There are still a few module build stages not converted to the new system since their next state depends on various config settings (e.g. if running “make check” has been enabled or not). Since these generally involve skipping a stage based on some criteria, the plan is to move the logic to the stage being skipped, which should simplify things further.

New Default Branch Format in Bzr

One of the new features in the soon to be released bzr 0.8 is the new “knit” storage format.

When comparing the size of the repository data for jhbuild with “knit” and “metadir” formats (metadir is just the old storage format with repository, branch and checkout bookkeeping separated), I see the following:

                   metadir   knit
Size               9.9MB     5.5MB
Number of files    1267      307

The reason for the smaller number of files is that information about all revisions in the repository is now stored together rather than in separate files. So the file count comes out at a constant plus 2 times the number of tracked files (a knit index file plus the knit data file). For comparison, the CVS repository I imported this from was 4.4MB, and comprised 143 files.

As well as reducing storage requirements, the new knit repository format is designed to reduce network traffic. With the current weave repository format, the weave file for each file touched by a commit gets rewritten to include the contents of the new revision. In contrast to this, the information about the new revision can simply be appended to the knit data file and the knit index file updated to match. This means publishing a branch to a server via sftp mainly involves append operations, resulting in a nice speed up.

Similarly when pulling new changes from a published branch, bzr only needs to download a knit index to find out which sections of the knit data are missing locally. It can then ask for just the changed sections (by an HTTP range request or a partial read with sftp), rather than downloading the entire contents of the changed weaves.
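
The append and partial read behaviour can be illustrated with a toy model. This is not bzr’s on-disk format (real knits store compressed deltas and a richer index); ToyKnit and fetch_missing are names I made up, and the index is just a list of (revision id, offset, length) entries.

```python
class ToyKnit:
    """Append-only data plus an index saying where each revision lives."""

    def __init__(self):
        self.data = bytearray()
        self.index = []  # (revision_id, offset, length)

    def add(self, revision_id, text):
        # New revisions are appended; existing bytes are never rewritten.
        offset = len(self.data)
        self.data += text
        self.index.append((revision_id, offset, len(text)))

    def read_range(self, offset, length):
        # Stands in for an HTTP range request or a partial sftp read.
        return bytes(self.data[offset:offset + length])


def fetch_missing(remote, local):
    """Copy only the revisions local lacks; return the bytes transferred."""
    have = {revision_id for revision_id, _, _ in local.index}
    transferred = 0
    for revision_id, offset, length in remote.index:
        if revision_id not in have:
            local.add(revision_id, remote.read_range(offset, length))
            transferred += length
    return transferred
```

Only the index has to be read in full; the amount of data transferred depends on what is missing locally rather than on the total size of the file’s history.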

Overall, this should make bzr 0.8 a lot more usable than 0.7 for various network operations.

Repositories in Bzr

One of the new features coming up in the next release of bzr is support for shared repositories. This provides a way to reduce disk space needed to store multiple related branches. To understand how repositories work, it helps to know a bit about how branches are stored by bzr.

[bzr repository diagram]

There are three concepts that make up a bzr branch:

  1. A checkout or working tree. These are the source files you are working with. They represent the state of the source code at some recorded revision plus any local changes you’ve made. In the diagram on the right, it is represented as the red node.
  2. The branch, consisting of a linear sequence of revisions. This is represented by the blue nodes in the diagram. Note that there may be multiple paths from the first revision to the current revision due to branching and merging. The branch revision history indicates the path that was taken by this particular branch.
  3. The repository, being a store of the text of all the revisions in the ancestry of the branch, plus metadata about those revisions. This essentially stores information about every node and edge in the diagram.

In previous versions of bzr, this information was not clearly separated. However with the new default branch format in bzr 0.8 they are separated, and a particular directory need not contain all three parts, which is what makes the space savings and performance improvements possible.

One of the biggest space savings is achieved from sharing the repository data between branches. If a particular branch does not contain any repository information, bzr will check each parent directory in turn until it finds a repository. If a collection of branches share some of their history, then the single shared repository will be significantly smaller than the space used if each branch had its own repository data.
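
The parent directory search can be sketched in a few lines of Python. This is an illustration rather than bzr’s implementation: the function name is hypothetical, and I am assuming that a repository is marked by a .bzr/repository directory.

```python
from pathlib import Path

# Assumed layout: a directory holding repository data has .bzr/repository.
REPOSITORY_MARKER = Path(".bzr") / "repository"


def find_repository(start):
    """Return the nearest directory at or above start holding a repository."""
    start = Path(start).resolve()
    for directory in (start, *start.parents):
        if (directory / REPOSITORY_MARKER).is_dir():
            return directory
    return None  # reached the filesystem root without finding one
```

With a layout such as shared/.bzr/repository and a branch in shared/project/feature, calling find_repository on the branch directory returns the shared directory, so every branch below it ends up using the same store.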

Another way to reduce disk usage is to create branches without checkouts. This is useful when publishing a branch, since people pulling or merging from that branch don’t use the checkout files.

Finally, it is possible to create a checkout which does not contain branch or repository data, instead containing a pointer to where that data is located. This is quite useful when combined with a central shared repository.

So how big is this space saving? For my JHBuild conversion to bzr, the repository data totals 10MB, the branch data totals 100KB and a checkout is 1.4MB.

So to publish a second branch without the use of shared repositories means another 10MB of storage (a bit more if I include a checkout at the published location). If I use shared repositories, the cost of the second branch is 100KB plus an amount proportional to the size of the changes I make on that branch. So for many projects, the cost of publishing another branch is lost in the noise.
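
To put numbers on that, here is the arithmetic as a small Python sketch using the sizes quoted above. It assumes branches are published without checkouts and takes 1MB as 1024KB; the function names and the new_revisions_kb parameter (the size of changes that exist only on the extra branches) are my own.

```python
REPOSITORY_KB = 10 * 1024  # repository data: 10MB
BRANCH_KB = 100            # branch data: 100KB


def standalone_cost_kb(branches):
    """Every branch carries its own copy of the repository data."""
    return branches * (REPOSITORY_KB + BRANCH_KB)


def shared_cost_kb(branches, new_revisions_kb=0):
    """One repository for all branches, plus whatever is new on them."""
    return REPOSITORY_KB + branches * BRANCH_KB + new_revisions_kb


if __name__ == "__main__":
    for count in (1, 2, 5):
        print(count, standalone_cost_kb(count), shared_cost_kb(count))
```

For two branches with no new revisions, this gives 20680KB standalone against 10440KB shared, so the second branch costs 100KB instead of a little over 10MB.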

Using Tailor to Convert a Gnome CVS Module

In my previous post, I mentioned using Tailor to import jhbuild into a Bazaar-NG branch. In case anyone else is interested in doing the same, here are the steps I used:

1. Install the tools

First create a working directory to perform the import, and set up tailor. I currently use the nightly snapshots of bzr, which did not work with Tailor, so I also grabbed bzr-0.7:

$ wget http://darcs.arstecnica.it/tailor-0.9.20.tar.gz
$ wget http://www.bazaar-ng.org/pkg/bzr-0.7.tar.gz
$ tar xzf tailor-0.9.20.tar.gz
$ tar xzf bzr-0.7.tar.gz
$ ln -s ../bzr-0.7/bzrlib tailor-0.9.20/bzrlib

2. Prepare a local CVS Repository to import from

The import will run a lot faster with a local CVS repository. If you have a shell account on window.gnome.org, this is trivial to set up:

$ mkdir cvsroot
$ cvs -d `pwd`/cvsroot init
$ rsync -azP window.gnome.org:/cvs/gnome/jhbuild/ cvsroot/jhbuild/

3. Check for history inconsistency

As I discovered, Tailor will bomb out part way through the import if time goes backwards at some point in your CVS history. The quick fix for this is to directly edit the RCS ,v files to correct the dates. Since you are working with a copy of the repository, there isn’t any danger of screwing things up.

I wrote a small program to check an RCS file for such discontinuities:


When editing the dates in the RCS files, make sure that you change the dates in the different files in a consistent way. You want to make sure that revisions in different files that are part of the same changeset still have the same date after the edits.
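
For readers who want to try this kind of check themselves, here is a minimal Python sketch. It is not the program mentioned above, just an illustration under some assumptions: it only looks at trunk revisions (two-part numbers such as 1.2), it relies on the usual layout where a revision number is followed by a "date" line, and the sample text and author name are invented.

```python
import re
from datetime import datetime

# A revision number on its own line, followed by its "date ...;" field.
DELTA_RE = re.compile(r"^(\d+(?:\.\d+)+)\s+date\s+([\d.]+);", re.MULTILINE)


def parse_rcs_date(text):
    fields = [int(field) for field in text.split(".")]
    if fields[0] < 100:  # RCS wrote two-digit years before 2000
        fields[0] += 1900
    return datetime(*fields)


def find_skew(rcs_text):
    """Return (older_rev, newer_rev) pairs where the date goes backwards."""
    header = rcs_text.split("\ndesc\n", 1)[0]  # ignore log messages and texts
    trunk = []
    for number, date in DELTA_RE.findall(header):
        parts = tuple(int(part) for part in number.split("."))
        if len(parts) == 2:  # trunk revisions only
            trunk.append((parts, parse_rcs_date(date)))
    trunk.sort()
    skew = []
    for (old, old_date), (new, new_date) in zip(trunk, trunk[1:]):
        if new_date < old_date:
            skew.append((".".join(map(str, old)), ".".join(map(str, new))))
    return skew


SAMPLE = """head    1.3;
access;
symbols;
locks; strict;

1.3
date    2003.09.20.08.00.00;    author someone;    state Exp;
branches;
next    1.2;

1.2
date    97.01.01.00.00.00;    author someone;    state Exp;
branches;
next    1.1;

1.1
date    2001.05.10.12.00.00;    author someone;    state Exp;
branches;
next    ;


desc
@@
"""

if __name__ == "__main__":
    print(find_skew(SAMPLE))
```

In the sample, revision 1.2 is dated before revision 1.1, so that pair is reported. Running the same function over each ,v file in the copied repository would show which files need their dates edited.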

4. Create a Tailor config file

Here is the Tailor config file I used to import jhbuild:

#!/usr/bin/env tailor
"""
[DEFAULT]
verbose = True
projects = jhbuild
encoding = utf-8

[jhbuild]
target = bzr:target
start-revision = INITIAL
root-directory = basedir/jhbuild.cvs
state-file = tailor.state
source = cvs:source
subdir = .
before-commit = remap_author
patch-name-format =

[bzr:target]
encoding = utf-8

[cvs:source]
module = jhbuild
repository = basedir/cvsroot
encoding = utf-8
"""

def remap_author(context, changeset):
    # Turn a bare CVS user name into "name <name@cvs.gnome.org>"
    if '@' not in changeset.author:
        changeset.author = '%s <%s@cvs.gnome.org>' % (changeset.author,
                                                      changeset.author)
    return True

The remap_author function at the bottom maps the CVS user names to something closer to what bzr normally uses.

5. Perform the conversion

Now it is possible to run the conversion:

$ python tailor-0.9.20/tailor -vv --configfile jhbuild.tailor

When the conversion is complete, you should be left with a bzr branch containing the history of the HEAD branch from CVS. Now is a good time to check that the converted bzr branch looks sane.

6. Use the new branch

Rather than using the converted branch directly, it is a good idea to branch off it and do the development there:

$ bzr branch jhbuild.cvs jhbuild.dev

The advantage of doing this is that you have the option of rsyncing in new changes to the CVS repository and running tailor again to incrementally import them. You can then merge those changes to your development branch.

Revision Control Migration and History Corruption

As most people probably know, the Gnome project is planning a migration to Subversion. In contrast, I’ve decided to move development of jhbuild over to bzr. This decision is a bit easier for me than for other Gnome modules because:

  • No need to coordinate with GDP or GTP, since I maintain the docs and there are no translations.
  • Outside of the moduleset definitions, the large majority of development and commits are done by me.
  • There aren’t really any interesting branches other than the mainline.

I plan to leave the Gnome module set definitions in CVS/Subversion though, since many people help in keeping them up to date, so leaving them there has some value.

I performed a test conversion using Tailor 0.9.20. My first attempt at performing the conversion failed part way through. Looking at what had been imported, it was apparent that the first few changesets created weren’t the first changesets I’d created in CVS. What was weirder still was the dates on those changesets: they were dated 1997, while I hadn’t started jhbuild until 2001.

It turns out that it was caused by clock skew on the CVS server back in September 2003, so the revision dates for a few files are not monotonic. I did the quick fix of directly editing the RCS files (I was working off a local copy of the repo), which allowed the conversion to run through to completion. The problem has been reported as bug #37 in Tailor’s bug tracker.

This made me a bit worried about whether the CVS to Subversion conversion script being used for the rest of the Gnome modules was also vulnerable to this sort of clock skew problem. Sure enough it was, and the first real changeset of jhbuild had been imported as revision 323.

I did a bit more checking of the CVS repository, and found that there were 98 other modules exhibiting clock skew in their revision history, spread over 1245 files (some files with multiple points of skew). I’ve only checked the SVN test conversions of some of these modules, but all the ones I checked exhibited the same type of corruption.

It is going to be a fair bit of work cleaning it all up before the final conversion.

OpenSSH support in bzr

I updated my bzr openssh plugin to be a proper patch against bzr.dev, and got it merged. So if you have bzr-openssh-sftp.py in your ~/.bazaar/plugins directory, you should remove it when upgrading.

Unfortunately there was a small problem resolving a conflict when merging it, which causes the path to get mangled a little inside _sftp_connect(). Once this is resolved, the mainline bzr should fully follow settings in ~/.ssh/config, because it will be running the same ssh binary as you normally use.

One thing I learnt when adding the support code was a quirk in the SFTP URI spec’s interpretation of paths, which differs from gnome-vfs’s interpretation. The URI sftp://remotehost/directory is interpreted as /directory on remotehost by gnome-vfs, while the spec says that it should be interpreted as ~/directory.

To refer to /directory on remotehost, the spec says you should use sftp://remotehost/%2Fdirectory. I filed this as bug 322394.
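
The spec’s reading of paths, as described above, can be sketched in Python like this. The function name is hypothetical and this is not the code in bzr; it just drops the slash that separates the host from the path, decodes percent escapes, and treats whatever is left as relative to the home directory unless it starts with a slash.

```python
from urllib.parse import unquote, urlsplit


def sftp_remote_path(uri):
    """Map an sftp:// URI to the remote path the spec says it refers to."""
    parts = urlsplit(uri)
    if parts.scheme != "sftp":
        raise ValueError("not an sftp URI: %r" % uri)
    path = parts.path
    if path.startswith("/"):
        path = path[1:]  # this slash only separates the host from the path
    path = unquote(path)  # %2F becomes a real leading slash
    if path.startswith("/"):
        return path       # absolute path on the remote host
    return "~/" + path    # relative to the user's home directory


if __name__ == "__main__":
    print(sftp_remote_path("sftp://remotehost/directory"))
    print(sftp_remote_path("sftp://remotehost/%2Fdirectory"))
```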