Using OpenSSH with bzr

One of the transports available in bzr is sftp. This is implemented using the Paramiko SSH and SFTP library. Unfortunately there are a few issues I experienced with the code:

- Since it is an independent implementation of SSH, none of my OpenSSH settings in ~/.ssh/config were recognised. The particular options I rely on include:
    - User: when the remote username doesn't match my local one. One less thing to remember when connecting to a remote machine.
    - IdentityFile: use different keys to access different machines.
    - ProxyCommand: access work machines that are behind the firewall.
- Paramiko does not currently support SSH compression. This is a real pain for larger trees.

The easiest way to fix all these problems would be to use OpenSSH directly, so I wrote a small plugin to do so. I decided to follow the model used to do this in gnome-vfs and Bazaar 1.x: communicate with an ssh subprocess via pipes and implement the SFTP protocol internally.

Since SFTP is layered fairly cleanly on top of SSH, and the paramiko code was also quite modular, it was possible to use the paramiko SFTP implementation with openssh. The result is a small plugin that monkey-patches the existing SFTP transport:

    http://people.ubuntu.com/~jamesh/bzr-openssh-plugin/

Just copy openssh-sftp.py into the ~/.bazaar/plugins directory, and use bzr as normal.

The compression seems to make a noticeable difference to performance, but it should be possible to improve things further with a pipelined SFTP client implementation. Of course, the biggest performance optimisation will probably come from the smart server, when that is implemented.
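The ssh-subprocess approach described above can be sketched roughly as follows. This is a minimal illustration and not the plugin's actual code: the helper names build_sftp_argv and open_sftp_pipes, the host name and the user name are all hypothetical. It only shows how the ssh command line might be assembled so that ~/.ssh/config applies and compression is turned on (-C enables compression, -s requests a subsystem on the remote host).

```python
import subprocess


def build_sftp_argv(host, user=None, port=None, compress=True):
    """Build an ssh command line that starts the remote sftp subsystem.

    Hypothetical helper for illustration. Because the real ssh binary is
    used, settings from ~/.ssh/config (User, IdentityFile, ProxyCommand)
    are picked up without any extra work.
    """
    argv = ["ssh"]
    if compress:
        argv.append("-C")              # enable SSH compression
    if user is not None:
        argv.extend(["-l", user])      # override the login name
    if port is not None:
        argv.extend(["-p", str(port)])
    # -s asks ssh to start the named subsystem instead of a shell command.
    argv.extend(["-s", host, "sftp"])
    return argv


def open_sftp_pipes(host, **kwargs):
    """Start ssh with pipes attached to its stdin and stdout.

    An SFTP client implementation would write request packets to
    proc.stdin and read responses from proc.stdout.
    """
    return subprocess.Popen(
        build_sftp_argv(host, **kwargs),
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )


print(build_sftp_argv("example.com", user="alice"))
```

Nothing in this sketch speaks SFTP itself; in the plugin that part is handled by the existing paramiko SFTP code running over the two pipes.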

Comparison of Configs/Aliases in Bazaar, CVS and Subversion

When a project grows to a certain size, it will probably need a way to share code between the multiple software packages it releases. In the context of Gnome, one example is the sharing of the libbackground code between Nautilus and gnome-control-center.

The simplest way to do this is to just copy over the files in question and manually synchronise them. This is a pain to do, and can lead to problems if changes are made to both copies, so you'd want to avoid it if possible. For this reason, most version control systems provide some way to share code like this. As with the previous articles, I'll focus on Bazaar, CVS and Subversion. Unlike the common operations, each system implements this feature in a different way, so I'll go over each one in turn and then compare them.

CVS

When you run the "cvs checkout module" command, CVS will look in the CVSROOT/modules file for the repository. For example, the file might contain the following:

    module foobar

This would tell CVS to check out the foobar directory from the repository into a directory named module when the user asks for module. If no entry is found for a particular name, the directory by that name is checked out from the repository.

To compose multiple modules into a single working copy, the ampersand syntax can be used:

    module foo &bar &baz
    bar othermodule/bar

With this modules file, "cvs checkout module" would give the following working copy:

    Working Copy    Repository
    module          foo
    module/bar      othermodule/bar
    module/baz      baz

Operations like tag, commit, update, etc. will descend into included modules, so for the most part a user can treat the resulting working copy as a single tree. If a particular branch tag exists on all the included modules, you can even check out a branch of the combined working copy.

There are some problems with the support though:

- While "cvs update" will update the working copy, it won't take into account any changes in CVSROOT/modules.
- If you've only got write access to part of the repository, and can't write to CVSROOT/modules, then you can't change configurations.
- While CVS lets you check out old versions of code, you still use the latest version of CVSROOT/modules. This can make it difficult to check out historical versions of the tree.
- Since "cvs tag" descends into included modules, you can end up with many branch tags on some modules. For instance, the gnome-common/macros directory in Gnome CVS has 282 branch tags, which makes it almost impossible to feed fixes to all those branches.

Subversion

Rather than a single repository-wide file describing the module configuration for checkouts, Subversion makes use of the svn:externals property on directories. Any directory can have such a property attached. Each line in the property is of the form:

    subdir [-rrevnum] absolute-uri-of-tree-to-include

This will check out the given tree at the given subdirectory whenever "svn checkout" or "svn update" is used. However, unlike CVS, "svn commit" will not descend into the included modules. Some…
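To make the svn:externals line format described above concrete, here is a small sketch of a parser for it. It is an illustration only and not Subversion's own code: the External type, the parse_externals name and the example URLs are all made up, and it handles nothing beyond the "subdir [-rrevnum] uri" form given in the text (accepting the revision written either as "-r123" or as "-r 123").

```python
from typing import NamedTuple, Optional


class External(NamedTuple):
    subdir: str               # where the tree is checked out
    revision: Optional[int]   # pinned revision, or None for the latest
    uri: str                  # absolute URI of the tree to include


def parse_externals(prop_value):
    """Parse the text of an svn:externals property into External records."""
    externals = []
    for line in prop_value.splitlines():
        fields = line.split()
        if not fields:
            continue                      # skip blank lines
        subdir, rest = fields[0], fields[1:]
        revision = None
        if rest and rest[0].startswith("-r"):
            rev_text = rest[0][2:]
            rest = rest[1:]
            if not rev_text:              # written as "-r 123"
                if not rest:
                    raise ValueError("missing revision in: %r" % line)
                rev_text, rest = rest[0], rest[1:]
            revision = int(rev_text)
        if len(rest) != 1:
            raise ValueError("expected exactly one URI in: %r" % line)
        externals.append(External(subdir, revision, rest[0]))
    return externals


example = """\
macros -r282 http://svn.example.org/repos/common/macros
libbackground http://svn.example.org/repos/libbackground/trunk
"""
for ext in parse_externals(example):
    print(ext)
```

Because the property lives on a directory and is versioned with it, each historical revision of the tree carries its own list of included trees, in contrast to the single repository-wide CVSROOT/modules file.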

Version control discussion on the Python list

The Python developers have been discussing a migration off CVS on the python-dev mailing list. During the discussion, Bazaar-NG was mentioned. A few posts of note:

- Mark Shuttleworth provides some information on the Bazaar roadmap. Importantly, Bazaar-NG will become Bazaar 2.0.
- Steve Alexander describes how we use Bazaar to develop Launchpad. This includes a description of the branch review process we use to integrate changes into the mainline.

I'm going to have to play around with bzr a bit more, but it looks very nice (and should require less typing than baz ...)

Version Control Workflow

Havoc: we are looking at ways to better integrate version control in Launchpad. There are many areas that could benefit from better use of version control, but I'll focus on bug tracking since you mentioned it.

Take the attachment handling in Bugzilla, for instance. In non-ancient versions, you can attach statuses such as "obsolete" to attachments. That status gets some special handling in the UI: obsolete attachments are struck out, and it is easy to mark old attachments as obsolete when uploading a new one. This makes it easy to track and manage a sequence of patches as a fix for a bug is developed (bug 118372 is a metacity bug with such a chain of patches).

If you look at this from a version control perspective, this sequence of patches forms a branch off the mainline of the software, where each newly attached patch is a new revision. The main differences are:

- No explicit indication of what the patch was made against (code base or revision), or what options were used to create the patch.
- No linkage between successive patches (can be a bit confusing if multiple patch series are attached to the same bug report).

So why not just use real version control to manage patches in the bug tracker? The big reason for projects using CVS or Subversion is that only authenticated users can create branches in the repository, and you don't want to require contributors to ask permission before submitting fixes.

So this is an area where a distributed version control system can help: anyone can make a branch, so potential contributors don't need permission to begin working on a bug. This also has the benefit that the contributors get access to the same tools as the developers (which is also helpful if they ever become regular developers).
Now if you combine this with history-sensitive merging and tell the bug tracker what the mainline branches of the products are, you can do some useful things:

- Try to merge the changes from the bug fix branch onto the mainline, and see if it merges cleanly. This can tell a developer at a glance whether the patch has bitrotted. This could also be used to produce an up-to-date diff to the mainline, which can aid review of the changes.
- Check if the bug fix branch has been merged into the mainline. No need for developers to manually flag the attachment as such.

We discussed some of these features in the context of Launchpad at the recent Brazil meeting.
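The "has this branch been merged?" check can be illustrated with a toy model. This is only a sketch under stated assumptions, not how Launchpad or Bazaar implement it: history is modelled as a plain dictionary mapping each revision id to its parent revision ids, and all of the names and revision ids below are invented. A bug fix branch counts as merged when its tip revision appears in the ancestry of the mainline tip.

```python
def ancestry(graph, tip):
    """Return the set of all revisions reachable from tip, including tip."""
    seen = set()
    stack = [tip]
    while stack:
        rev = stack.pop()
        if rev in seen:
            continue
        seen.add(rev)
        stack.extend(graph.get(rev, ()))   # follow parent links
    return seen


def is_merged(graph, mainline_tip, branch_tip):
    """True if the branch tip is already part of the mainline's history."""
    return branch_tip in ancestry(graph, mainline_tip)


def unmerged_revisions(graph, mainline_tip, branch_tip):
    """Revisions on the branch that the mainline does not have yet."""
    return ancestry(graph, branch_tip) - ancestry(graph, mainline_tip)


# Toy history: f1 and f2 are a bug fix branch made from m1,
# and m3 is a mainline revision that merges f2.
history = {
    "m1": [],
    "m2": ["m1"],
    "f1": ["m1"],
    "f2": ["f1"],
    "m3": ["m2", "f2"],
}

print(is_merged(history, "m2", "f2"))
print(is_merged(history, "m3", "f2"))
print(sorted(unmerged_revisions(history, "m2", "f2")))
```

With this kind of information recorded in the history itself, a bug tracker would not need anyone to flag a fix as applied by hand; it could answer the question from the revision graph.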

Bryan’s Bazaar Tutorial

Bryan: there are a number of steps you can skip in your little tutorial:

- You don't need to set my-default-archive. If you often work with multiple archives, you can treat working copies for all archives pretty much the same. If you are currently inside a working copy, any branch names you use will be relative to your current one, so you can still use short branch names in almost all cases (this is similar to the reason I don't set $CVSROOT when working with CVS).
- If you have a directory which contains only the files you want to import into your Bazaar archive, the following commands will add them all, and convert the directory into a Bazaar working copy:

      cd background-channels
      baz import -a bclark@redhat.com--gnomearchive/background-channels--dev--0.1

  No need for init-tree, add or commit.
- Running archive-mirror in your working copy will mirror that archive, so it doesn't need my-default-archive set.
- Other people probably don't want to set your archive as their default. Also, they can omit the register-archive call entirely:

      baz get http://gnome.org/~clarkbw/arch/background-channels--dev--0.1

  This checks out the branch, and registers the archive as a side effect.
- If you want to find out what is inside an archive, the following command is quite convenient:

      baz abrowse http://gnome.org/~clarkbw/arch

Some things you might want to do:

- If you have a PGP key, create a signed archive. This will cryptographically sign all revisions. When people check out your branches, the signatures get checked automatically (this is useful if the server hosting your mirror gets broken into and you need to verify that nothing has been tampered with). If you have already created the archive, you can turn on signing with baz change-archive (remember to update the mirror archive too).
- If you turn on signing, consider using a PGP agent like gnome-gpg. You can configure it in ~/.arch-params/archives/defaults.
- It is customary to name the archive directory the same as the archive name. This has the benefit that the branch name matches the last portion of the URL.
- If you haven't set up a revision library, you should do so:

      mkdir ~/.arch-revlib
      baz my-revision-library ~/.arch-revlib
      baz library-config --greedy --sparse ~/.arch-revlib