24 May, 2005 – James Henstridge

Merging In Bazaar

This posting follows on from my previous postings about Bazaar, but is a bit more advanced. In most cases you don't need to worry about this, since the tools should just work. However, if problems occur (or if you're just curious about how things work), it can be useful to know a bit about what's going on inside.

Changesets vs. Tree Snapshots

A lot of the tutorials for Arch list "changeset orientation" as one of its benefits over other systems such as Subversion, which were said to be based on "tree snapshots". At first this puzzled me, since from my mathematical background the relationship between these two concepts seemed the same as the relationship between integrals and derivatives:

A changeset is just the difference between two tree snapshots.

The state of a tree at a particular point is just the result of taking the initial tree state (which might be an empty tree) and applying all changesets on the line of development made before that point.

The distinction isn't clear cut in the existing tools either -- Subversion uses changesets to store the data in the repository while providing a "tree snapshot" style view, and Bazaar generates tree snapshots in its revision library to increase performance of some operations.

So the distinction people talk about isn't a simple matter of the repository storage format. Instead, the difference is in the metadata stored along with the changes, which describes the ancestry of the code.

Changesets and Branching

In the simple case of a single branch, you end up with a simple series of changesets. The tree for each revision is constructed by taking the previous revision's tree and applying the relevant changeset. Alternatively, you can say that the tree for patch-3 contains the changesets base-0, patch-1, patch-2 and patch-3.

Branching fits into this model pretty well. As with other systems, a particular revision can have multiple children.
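
The integral/derivative relationship between changesets and tree snapshots can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how Arch, Bazaar or Subversion actually store anything: a tree snapshot is a plain dict mapping file paths to contents, a changeset is a dict of changed paths (with None marking a deletion), and the function names and file contents are made up for illustration.

```python
# Toy model (an assumption for illustration, not Bazaar's storage format):
#   snapshot  = dict mapping path -> file contents
#   changeset = dict mapping path -> new contents, or None if the file is deleted


def diff(old, new):
    """Return the changeset that turns snapshot `old` into snapshot `new`."""
    changes = {}
    for path in old.keys() | new.keys():
        if old.get(path) != new.get(path):
            changes[path] = new.get(path)  # None marks a deleted file
    return changes


def apply_changeset(tree, changeset):
    """Return a new snapshot: `tree` with `changeset` applied."""
    result = dict(tree)
    for path, contents in changeset.items():
        if contents is None:
            result.pop(path, None)
        else:
            result[path] = contents
    return result


def build_tree(changesets):
    """Rebuild a snapshot by applying changesets, in order, to an empty tree."""
    tree = {}
    for changeset in changesets:
        tree = apply_changeset(tree, changeset)
    return tree


# Snapshots along a single line of development (contents are invented).
snapshots = [
    {"README": "v1"},                      # base-0
    {"README": "v1", "main.c": "int x;"},  # patch-1
    {"README": "v2", "main.c": "int x;"},  # patch-2
    {"README": "v2"},                      # patch-3
]

# "Differentiate": one changeset per revision. base-0 is a diff against
# the empty tree.
changesets = [diff(prev, cur) for prev, cur in zip([{}] + snapshots, snapshots)]

# "Integrate": the tree for revision n is the first n + 1 changesets applied
# to an empty tree, i.e. patch-3 "contains" base-0 through patch-3.
rebuilt = [build_tree(changesets[: i + 1]) for i in range(len(snapshots))]
```

Either list can be derived from the other, which is why the storage format alone doesn't separate the two kinds of system.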
In the diagram below, the trees for both patch-2 from the original branch and patch-1 from the new branch "contain" base-0 and patch-1 from the original branch. Any apparent asymmetry is just in the naming and storage locations -- both revisions are just patches against the same parent revision.

So far, there's no rocket science. Nothing that Subversion doesn't represent. Pretty much every version control system under the sun tracks this kind of linear revision ancestry (as can be seen using svn log or similar). The differences really only become apparent when merges are taken into consideration.

Merges

Just as a particular revision can have multiple child revisions (a.k.a. branching), a tree can have multiple parent revisions when merges occur. When you merge two revisions, the result should contain all the changes that exist in the parent revisions.

In the above diagram, we want to merge the changes made on the second branch back into the original one. The usual way to merge changes goes something like this:

Identify the most recent common ancestor of the two revisions.

Take the difference between…