Metrics for success of a DVCS

One claim made in the GNOME DVCS debate is that “git diff” is as easy to run as “svn diff”, so the learning curve issue is moot.  I’d have to disagree here.

Traditional Centralised Version Control

With traditional version control systems (e.g. CVS and Subversion) as used by Free Software projects like GNOME, there are effectively two classes of users that I will refer to as “committers” and “patch contributors”:

[Diagram: Centralised VCS Users]

Patch contributors are limited to read-only access to the version control system.  They can check out a working copy to make changes, and then produce a patch with the “diff” command to submit to a bug tracker or send to a mailing list.  This is where new contributors start, so it is important that it be easy to get started in this mode.

Once a contributor is trusted enough, they may be given write access to the repository, moving them to the committers group. They now have access to more functionality from the VCS, including the ability to checkpoint changes into focused commits, possibly on branches.  The contributor may still be required to go through patch review before committing, or may be given free rein to commit changes as they see fit.

Some problems with this arrangement include:

  • New developers are given a very limited set of tools to do their work.
  • If a developer goes to the trouble of learning the advanced features of the version control system, they are still limited to the read-only subset if they decide to start contributing to another project.

Distributed Workflow

A DVCS allows anyone to commit to their own branches and provides the full feature set to all users.  This splits the “committers” class into two classes:

[Diagram: Distributed VCS Users]

The social aspect of the “committers” group now becomes the group of people who can commit to the main line of the project – the core developers. Outside this group, we have people who make use of the same features of the VCS as the core developers but do not have write access to the main line: their changes must be reviewed and merged by a core developer.

I’ve left the “patch contributor” class in the above diagram because not all contributors will bother learning the details of the VCS.  For projects I’ve worked on that used a DVCS, I’ve still seen people send simple patches (either from the “xxx diff” command, or as diffs against a tarball release) and I don’t think that is likely to change.

Measuring Success

Making the lives of core developers better is often brought up as a reason to switch to a DVCS (e.g. through features like offline commits, local cache of history, etc).  I’d argue that making life easier for non-core contributors is at least as important.  One way we can measure this is by looking at whether such contributors are actually using VCS features beyond what they could use with a traditional centralised setup.

By comparing the number of contributors who submit plain patches with the number who either publish branches or submit changesets, we can get an idea of how much of the VCS they have used.
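As a rough sketch of that comparison, here is how the ratio might be computed once contributions have been classified.  Everything in it is an assumption for illustration: the (contributor, kind) input format, the three kind labels and the function name are hypothetical, and deciding which kind a given submission falls under is the hard part that this leaves out.

```python
from collections import defaultdict

# Hypothetical labels for how a contribution arrived.
PLAIN_PATCH = "patch"      # output of "xxx diff", or a diff against a tarball
BRANCH = "branch"          # a published branch
CHANGESET = "changeset"    # a submitted changeset carrying commit history

# Kinds that are only available to a non-committer under a DVCS.
BEYOND_CENTRALISED = {BRANCH, CHANGESET}


def vcs_feature_adoption(submissions):
    """Return the fraction of contributors who used more than plain patches.

    `submissions` is an iterable of (contributor, kind) pairs covering
    non-core contributors only.  Returns None if there are no contributors.
    """
    kinds_by_contributor = defaultdict(set)
    for contributor, kind in submissions:
        kinds_by_contributor[contributor].add(kind)

    total = len(kinds_by_contributor)
    if total == 0:
        return None

    # A contributor counts once, however many branches or changesets they sent.
    advanced = sum(
        1 for kinds in kinds_by_contributor.values()
        if kinds & BEYOND_CENTRALISED
    )
    return advanced / total


if __name__ == "__main__":
    sample = [
        ("alice", PLAIN_PATCH),
        ("alice", BRANCH),
        ("bob", PLAIN_PATCH),
        ("carol", CHANGESET),
        ("dave", PLAIN_PATCH),
    ]
    print(vcs_feature_adoption(sample))
```

Counting contributors rather than individual submissions keeps one prolific person from skewing the figure, which seems closer to the question of whether newcomers in general are picking up the extra features.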

It’d be interesting to see the results of a study based on contributions to various projects that have already adopted DVCS.  Although I don’t have any reliable numbers, I can guess at two things that might affect the results:

  1. Familiarity for existing developers.  There is a lot of cross-pollination in Free Software, so it isn’t uncommon for a new contributor to have worked on another project beforehand.  Using a VCS with a familiar command set can help here (or using the same VCS).
  2. A gradual learning curve.  New contributors should be able to get going with a small command set, and easily learn more features as they need them.

I am sure that there are other things that would affect the results, but these are the ones that I think would have the most noticeable effects.

DVCS talks at GUADEC

Yesterday, a BoF was scheduled for discussion of distributed version control systems with GNOME.  The BoF session did not end up really discussing the issues of what GNOME needs out of a revision control system, and some of the examples Federico used were a bit snarky.

We had a more productive meeting in the session afterwards where we went over some of the concrete goals for the system.  The list from the blackboard was:

  • Contributor collaboration (i.e. let anyone use the tool rather than just core developers).
  • Distro ⇔ distro and distro ⇔ upstream collaboration.
  • Host GNOME source code repositories
  • Code review
  • Server side hooks
  • Translators: what to do?
  • Enforced checks
  • Offline operations
  • Documentation authors?
  • Support Win32/Mac (important for GTK)

The sys admin tasks were broken down to:

  • MAINTAINERS file syntax checking
  • PO file syntax checking
  • CIA integration.
  • Commits mailing list
  • Check that commit messages are not empty
  • Trigger updates from commits (e.g. the web site module).
  • Release notes tarballs
  • Damned Lies support

It was clear from the discussion that neither Git nor Bazaar satisfied all of the criteria.

The Playground

John Carr did a great job setting up Bazaar mirrors of all the GNOME modules.  This provided an easy way for people to play around with Bazaar.  However, it only gave you half the experience since it didn’t provide a way to publish code and collaborate.

To aid in this, we have set up the bzr-playground.gnome.org machine, which any GNOME developer should be able to use to publish branches based on John’s imports.  Instructions on getting set up can be found on the wiki.  I hope that we will get a lot of people trying out this infrastructure.

We gave a presentation today on some of the things Bazaar provides that could be useful when hacking on GNOME.  Demoing bzr-playground was a bit problematic due to the internet connection problems at the venue, but I think we still showed some useful tools for local collaboration, searching and code review.

Meanwhile, Robert Collins has been working on some of the GNOME sysadmin features that Bazaar was lacking.  Among other things, he got Damned Lies working with both Subversion and Bazaar, with a test installation on the playground machine.