Metrics for success of a DVCS

One point raised in the GNOME DVCS debate is that running “git diff” is as easy as running “svn diff”, so the learning curve issue is moot. I’d have to disagree here.

Traditional Centralised Version Control

With traditional version control systems (e.g. CVS and Subversion) as used by Free Software projects like GNOME, there are effectively two classes of users, which I will refer to as “committers” and “patch contributors”:

[Diagram: Centralised VCS Users]

Patch contributors are limited to read-only access to the version control system. They can check out a working copy, make changes, and then produce a patch with the “diff” command to submit to a bug tracker or send to a mailing list. This is where new contributors start, so it is important that it be easy to get started in this mode.

Once a contributor is trusted enough, they may be given write access to the repository, moving them to the committers group. They now have access to more functionality from the VCS, including the ability to checkpoint changes into focused commits, possibly on branches. The contributor may still be required to go through patch review before committing, or may be given free rein to commit changes as they see fit.

Some problems with this arrangement include:

  • New developers are given a very limited set of tools to do their work.
  • If a developer goes to the trouble of learning the advanced features of the version control system, they are still limited to the read-only subset when they start contributing to another project.

Distributed Workflow

A DVCS allows anyone to commit to their own branches and provides the full feature set to all users.  This splits the “committers” class into two classes:

[Diagram: Distributed VCS Users]

Socially, the “committers” group now becomes the group of people who can commit to the main line of the project: the core developers. Outside this group are people who use the same VCS features as the core developers but do not have write access to the main line, so their changes must be reviewed and merged by a core developer.

I’ve left the “patch contributor” class in the above diagram because not all contributors will bother learning the details of the VCS.  For projects I’ve worked on that used a DVCS, I’ve still seen people send simple patches (either from the “xxx diff” command, or as diffs against a tarball release) and I don’t think that is likely to change.

Measuring Success

Making the lives of core developers better is often brought up as a reason to switch to a DVCS (e.g. through features like offline commits and a local cache of history). I’d argue that making life easier for non-core contributors is at least as important. One way we can measure this is by looking at whether such contributors are actually using VCS features beyond those available to them in a traditional centralised setup.

By comparing the number of contributors who submit plain patches with the number who either publish branches or submit changesets, we can get an idea of how much of the VCS they have used.
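As a rough illustration of that comparison, here is a minimal Python sketch. It assumes someone has already collected a list of contributions from non-core contributors and labelled how each one arrived; the labels “patch”, “branch” and “changeset” and the function name dvcs_adoption are made up for this example and don’t come from any real tool.

```python
from collections import defaultdict

# Hypothetical labels for how a contribution was submitted.
PLAIN_KINDS = {"patch"}                # what a read-only user could already do
DVCS_KINDS = {"branch", "changeset"}   # needs the wider feature set


def dvcs_adoption(contributions):
    """Return the fraction of contributors who used a DVCS feature at least once.

    `contributions` is an iterable of (contributor, kind) pairs, where
    `kind` is one of the hypothetical labels defined above.
    """
    kinds_by_person = defaultdict(set)
    for person, kind in contributions:
        if kind not in PLAIN_KINDS | DVCS_KINDS:
            raise ValueError(f"unknown contribution kind: {kind!r}")
        kinds_by_person[person].add(kind)

    if not kinds_by_person:
        return 0.0  # no contributors, so nothing to measure

    # Count people, not contributions, so one prolific contributor
    # does not skew the result.
    dvcs_users = sum(1 for kinds in kinds_by_person.values() if kinds & DVCS_KINDS)
    return dvcs_users / len(kinds_by_person)
```

Counting people rather than individual contributions is a deliberate choice here: the question is how many contributors move beyond plain patches, not how much they each produce.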

It’d be interesting to see the results of a study based on contributions to various projects that have already adopted a DVCS. Although I don’t have any reliable numbers, I can guess at two things that might affect the results:

  1. Familiarity for existing developers. There is a lot of cross-pollination in Free Software, so it isn’t uncommon for a new contributor to have worked on another project beforehand. Using the same VCS, or one with a familiar command set, can help here.
  2. A gradual learning curve.  New contributors should be able to get going with a small command set, and easily learn more features as they need them.

I am sure that there are other things that would affect the results, but these are the ones that I think would have the most noticeable effects.