Merging In Bazaar

This posting follows on from my previous postings about Bazaar, but is a bit more advanced. In most cases you don't need to worry about this, since the tools should just work. However, if problems occur (or if you're just curious about how things work), it can be useful to know a bit about what's going on inside.

Changesets vs. Tree Snapshots

A lot of the tutorials for Arch list "changeset orientation" as one of its benefits over other systems such as Subversion, which were said to be based on "tree snapshots". At first this puzzled me, since from my mathematical background the relationship between these two concepts seemed the same as the relationship between integrals and derivatives:

- A changeset is just the difference between two tree snapshots.
- The state of a tree at a particular point is just the result of taking the initial tree state (which might be an empty tree), and applying all changesets on the line of development made before that point.

The distinction isn't clear cut in the existing tools either -- Subversion uses changesets to store the data in the repository while providing a "tree snapshot" style view, and Bazaar generates tree snapshots in its revision library to increase performance of some operations. So the distinction people talk about isn't a simple matter of the repository storage format. Instead the difference is in the metadata stored along with the changes that describes the ancestry of the code.

Changesets and Branching

In the simple case of a single branch, you end up with a simple series of changesets. The tree for each revision is constructed by taking the last revision's tree and applying the relevant changeset. Alternatively, you can say that the tree for patch-3 contains the changesets base-0, patch-1, patch-2 and patch-3.

Branching fits into this model pretty well. As with other systems, a particular revision can have multiple children.
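The relationship between changesets and tree snapshots described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: modelling a tree as a dict of path to contents, and the names diff, apply and build, are invented here and say nothing about how Arch or Bazaar actually store revisions.

```python
# Hypothetical model: a tree snapshot is a dict of path -> file contents,
# and a changeset is a dict of path -> new contents (None means "deleted").

def diff(old, new):
    """Return the changeset that turns snapshot `old` into snapshot `new`."""
    changes = {}
    for path in old.keys() | new.keys():
        if old.get(path) != new.get(path):
            changes[path] = new.get(path)  # None marks a removed file
    return changes

def apply(tree, changeset):
    """Return a new snapshot with `changeset` applied to `tree`."""
    result = dict(tree)
    for path, contents in changeset.items():
        if contents is None:
            result.pop(path, None)
        else:
            result[path] = contents
    return result

def build(changesets):
    """Rebuild a snapshot by applying changesets in order to an empty tree."""
    tree = {}
    for changeset in changesets:
        tree = apply(tree, changeset)
    return tree

# Made-up revisions named after the ones in the text.
base_0 = {"README": "hello\n"}
patch_1 = {"main.c": "int main(void) { return 0; }\n"}
patch_2 = {"README": None}

snapshot = build([base_0, patch_1, patch_2])
```

Under this model, build plays the part of the integral (summing up changesets into a tree) and diff plays the part of the derivative (recovering the changeset between two trees).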
In the diagram below, the trees for both patch-2 from the original branch and patch-1 from the new branch "contain" base-0 and patch-1 from the original branch. Any apparent asymmetry is just in the naming and storage locations -- both revisions are just patches against the same parent revision.

So far, there's no rocket science. Nothing that Subversion doesn't represent. Pretty much every version control system under the sun tracks this kind of linear revision ancestry (as can be seen using svn log or similar). The differences really only become apparent when merges are taken into consideration.

Merges

Just as a particular revision can have multiple child revisions (a.k.a. branching), a tree can have multiple parent revisions when merges occur. When you merge two revisions, the result should contain all the changes that exist in the parent revisions. In the above diagram, we want to merge the changes made on the second branch back into the original one. The usual way to merge changes goes something like this:

1. Identify the most recent common ancestor of the two revisions.
2. Take the difference between…
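The first merge step, identifying a common ancestor, can be illustrated with a small Python sketch over an ancestry graph. Everything here is an assumption made for illustration: the PARENTS table, the "branch/patch-1" label for the revision on the second branch, and the choice of "closest to the first revision" as a stand-in for "most recent". It is not the algorithm baz or tla use.

```python
from collections import deque

# Hypothetical ancestry graph: revision name -> list of parent revisions.
# The names mirror the example in the text; "branch/patch-1" is an assumed
# label for the first revision on the second branch.
PARENTS = {
    "base-0": [],
    "patch-1": ["base-0"],
    "patch-2": ["patch-1"],
    "branch/patch-1": ["patch-1"],
}

def ancestors(parents, revision):
    """Map each ancestor of `revision` (itself included) to its distance."""
    distance = {revision: 0}
    queue = deque([revision])
    while queue:
        current = queue.popleft()
        for parent in parents[current]:
            if parent not in distance:
                distance[parent] = distance[current] + 1
                queue.append(parent)
    return distance

def common_ancestor(parents, left, right):
    """Return a shared ancestor closest to `left`, or None if there is none."""
    left_distance = ancestors(parents, left)
    right_distance = ancestors(parents, right)
    shared = set(left_distance) & set(right_distance)
    if not shared:
        return None
    # Ties are broken by name so the result is deterministic.
    return min(shared, key=lambda rev: (left_distance[rev], rev))

print(common_ancestor(PARENTS, "patch-2", "branch/patch-1"))  # prints patch-1
```

Because each revision lists its parents, the same walk works once merges give a revision more than one parent, which is where the extra ancestry metadata starts to matter.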

Bazaar (continued)

I got a few responses to the comparison between CVS, Subversion and Bazaar command line interfaces I posted earlier from Elijah, Mikael and David. As I stated in that post, I was looking at areas where the three systems could be compared. Of course, most people would choose Arch because of the things it can do that Subversion and CVS can't. Below I'll discuss two of those things: disconnected development and distributed development. I'll follow on from the examples in the previous post.

Disconnected Development

Disconnected development allows you to continue working on some code while not having access to the main repository. I hinted at how to do this in the previous post, but left out most of the details. The basic steps are:

1. Create an archive on your machine.
2. Branch the module you want to work on into your local archive.
3. Perform your development as normal.
4. When you connect again, switch back to the mainline, merge your local branch and commit the changes.

To create the local archive, you follow the same procedure as for creating the original archive. Something like this:

    mkdir ~/archives
    baz make-archive --signed joe@example.com ~/archives/joe@example.com

This creates an archive named joe@example.com (archive names are required to be an email address, optionally followed by some extra info) stored in the user's home directory. Now we can create a branch in the local archive. From a working copy of the mainline branch, run the following command:

    baz branch joe@example.com/modulename--devel--0

It was necessary to specify an archive name in this call to baz branch, because the branch was being created in a different archive to the one the working copy was pointing at. This leaves the working copy pointing at the new branch, so you can start working on it immediately. You can commit as many revisions as you want, and compare to other revisions on the branch.
When you have access to the main repository again, it is trivial to merge your changes back into the mainline:

    baz switch arch@example.org/modulename--devel--0
    baz merge joe@example.com/modulename--devel--0
    # fix conflicts, if any exist, and mark them resolved
    baz commit -s 'merge changes from joe@example.com/modulename--devel--0'

You can then ignore the branch in the joe@example.com archive, or continue to use it. If you want to continue working on the branch in that module, it is a simple matter to merge from the arch@example.org archive first to pick up the changes made while you were disconnected.

Distributed Development

In a distributed development environment, there is no main branch. Instead, each developer maintains their own branch, and pulls changes from other developers' archives. A few things fall out from this model:

- To start working on a distributed project, you need to branch off from another developer's archive. This can be achieved using the same instructions as found in the "disconnected development" section above.
- In order for other developers to pull changes from your archive, they will need to be able to access it. This isn't possible if it only exists in your home…

SCM Command Line Interface Comparison

With the current discussion on gnome-hackers about whether to switch Gnome over to Subversion, it has been brought up a number of times that people can switch from CVS to Subversion without thinking about it (the implication being that this is not true for Arch). Given the improvements in Bazaar, it isn't clear that Subversion is the only system that can claim this benefit. For the sake of comparison, I'm considering the case of a shared repository accessed by multiple developers over SSH. While this doesn't exploit all the benefits of Arch, it gives a better comparison of the usability of the different tools.

Setup

Before using any of CVS, Subversion or Arch, you'll need a repository. This can be done with the following commands (run on the repository server):

    cvs init /cvsroot
    svnadmin create --fs-type=fsfs /svnroot
    baz make-archive --signed arch@example.org /archives/arch@example.org

(the --signed flag can be omitted if you don't want to cryptographically sign change sets)

Once the archive is created, you'll need to make sure that everyone has write access to the files, and that new files will be created with the appropriate group ownership. This procedure is the same for each system.

Before users of the Arch archive can start using it, they will need to tell baz which user ID to associate with their changes. Each user only needs to do this once. The email address used should match that on your PGP key, if you're using a signed archive.

    baz my-id "Joe User <joe@example.com>"

Next you'll want to import some code into the repository. This will be done from one of the client machines, from the source directory:

    cvs -d :ext:user@hostname:/cvsroot import modulename vendor-tag release-tag
    svn import . svn+ssh://user@hostname/svnroot/modulename/trunk
    baz import -a sftp://user@hostname/archives/arch@example.org/modulename--devel--0

In the Subversion case, we're using the standard convention of putting the main branch in a trunk/ subdirectory.
In the Arch case, you need a three-level module name, so I picked a fairly generic one.

Working with the repository

The first thing a user will want to do is to create a working copy of the module:

    cvs -d :ext:user@hostname:/cvsroot get modulename
    svn checkout svn+ssh://user@hostname/svnroot/modulename/trunk modulename
    baz get sftp://user@hostname/archives/arch@example.org/modulename--devel--0 modulename

The user can then make changes to the working copy, adding new files with the add sub-command, and removing files with the rm sub-command. For Subversion there are also mv and cp sub-commands. For Arch, the mv sub-command is supported.

To bring the working copy up to date with the repository, all three systems use the update sub-command. The main difference is that CVS and Subversion will only update the current directory and below, while Arch will update the entire working copy.

If there are any conflicts during the update, you'll get standard three-way merge conflict markers in all three systems. Unlike CVS, both Subversion and Arch require you to mark each conflict resolved using the resolved sub-command.

To see what changes you have in your working copy, all three systems support a diff command. Again, this works on the full tree in Arch, while…

6 January 2005

Travels

I've put up some of the photos from my trip to Mataró and the short stopover in Japan on the way back. The Mataró set includes a fair number taken around La Sagrada Família, and the Japan set is mostly of things around the Naritasan temple (I didn't have enough time to get into Tokyo).

Multi-head

A few months back, I got a second monitor for my computer and configured it in a Xinerama-style setup (I'm actually using the MergedFB feature of the radeon driver, but it looks like Xinerama to X clients). Overall it has been pretty nice, but there are a few things that Gnome could do a bit nicer in the setup:

- Backgrounds get stretched over both screens. The Ubuntu backgrounds already looked a bit weird at a 5:4 aspect ratio. They look even worse at a 5:2 ratio :-). Ideally the background image would be repeated on each monitor of the virtual screen. Some details are available as bug 147808, but it looks like the fix would be in EelBackground code.
- Most parts of the desktop treat the monitors as independent (which is good, since most people pick Xinerama over classic X multi-screen so that dragging windows between monitors works, rather than to build video walls), but there are a few bits that don't. One of the more obvious ones is in Metacity: the alt+tab dialog pops up centred on the monitor where the mouse currently resides, but it cycles through all the windows visible on the virtual screen. This is a bit confusing, since it looks like it will be a monitor-local operation based on the position of the dialog (however, if it was monitor-local I'm not sure how you'd switch focus to a window on the other monitor with only the keyboard ...).

Bazaar

The new merge command in baz is quite nice. This provides support for merging in ways that tla can't. One of the limitations of star-merge is that it can get confused if you don't strictly follow the star topology when merging.
That is, you should only merge to/from the person you branched from, and people who branched from you. If siblings merge for instance, it can cause problems with subsequent merges. The new merge command doesn't suffer from that problem, and allows you to merge from anyone. Of course, if you break the star topology, people wanting to merge from you will either need to be using Bazaar, or ask for you to merge from them first (so that the star-merge algorithm merges the right changes).
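The star topology rule above (merge only with the branch you came from, or with branches made from yours) can be written down as a small Python check. This is a sketch under stated assumptions: the BRANCHED_FROM table, the developer names and the function follows_star_topology are all invented for illustration, and neither baz nor tla exposes anything like it.

```python
# Hypothetical record of who branched from whom: branch -> its parent branch.
BRANCHED_FROM = {
    "alice": "mainline",
    "bob": "mainline",
    "carol": "alice",
}

def follows_star_topology(branched_from, source, target):
    """True if `source` and `target` are directly related as parent and child."""
    return (branched_from.get(source) == target
            or branched_from.get(target) == source)

# alice and bob are siblings, so a merge between them breaks the star.
sibling_merge_ok = follows_star_topology(BRANCHED_FROM, "alice", "bob")
# carol branched from alice, so merging in either direction is fine.
child_merge_ok = follows_star_topology(BRANCHED_FROM, "carol", "alice")
```

A merge that fails this check is the sibling case that star-merge can get confused by, and that the new merge command is meant to handle.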

15 December 2004

Mataró

The conference has been great so far. The PyGTK BoF on the weekend was very productive, and I got to meet Anthony Baxter (who, as well as being the Python release manager, wrote a cool VoIP application called Shtoom). There was an announcement of some of the other things Canonical have been working on, which has been reported on in LWN (currently subscriber only) among other places.

Over the weekend, I had a little time to do some tourist-type things in Barcelona. I went to La Sagrada Família. It was a great place to visit, and there was an amazing level of detail in the architecture. You can walk almost to the very top of the cathedral, and see out over the Barcelona skyline (and see various bits of the cathedral not visible from the ground). I'll have to put my photos up online.

Bazaar

I've been using Bazaar a bit more at work, and it is becoming quite usable, compared to tla. It is a little interesting using daily builds of baz from the 1.1 development branch, where some features appear, get renamed or removed as they get developed, but it has a few more useful features not found in the 1.0 release. From a user point of view, it feels like the command line interface for baz is being designed to be easy to use, while tla's feels like they made choices based on what was easy to implement.

I built some Fedora Core 2 i386 builds of the 1.0.1 release, and some 1.1 snapshots that are now up on the Bazaar website in case anyone wants to try them. When I get back home and install FC3 onto my AMD64 box (it only has Ubuntu on at the moment), I'll do some FC3 x86-64 and i386 builds too.