Archive for January, 2009

GNOME DVCS Survey results

Saturday, January 3rd, 2009

The GNOME DVCS (Distributed Version Control System) Survey completed
about a week and a half ago, with responses from 579 different people with
svn accounts. (There are 1083 people with commit access to
GNOME SVN, so this is about a 53% response rate.) The survey was
intended to collect data related to a possible move for the GNOME
project from SVN to a distributed version control system in 2009, thus
questions about svn were included despite the fact that it is not
distributed. The results of the survey are shown below.  (I got the data from Behdad; the scripts I used to generate the plots can be found here.)

Bias

The plots of the data I present simply cover all the questions —
twice. Once to show the percentages of respondents with each answer
for the specific question, then again to contrast how those who
answered a given question differently had differing rankings for the
various VCSes. So the plots are as neutral as I think is possible.

I also add some commentary of my own, analyzing the data and noting
items that surprised me (I had several predictions about how the
survey would turn out; many of my predictions were right but there
were a number of surprises for me too). I don’t think it’s possible
to make such commentary unbiased. In fact, since I noticed a clear
front-runner in looking at the results, I thought it most useful to
look at that particular system, so the majority of my comments focus
on it. If you do not want my bias, ignore my comments and draw your
own conclusions from the data.

Survey Questions

First, let’s remind everyone what the survey questions were:

  • Your GNOME SVN user id
  • Do you currently maintain any GNOME modules in SVN?
    • Yes, I maintain multiple modules
    • Yes, I maintain a single module
    • No, I am not a maintainer
  • Do you currently develop any GNOME modules in SVN?
    • Yes, I develop multiple modules
    • Yes, I develop a single module
    • No, I do not develop any modules
  • Do you commit to GNOME SVN?
    • Yes, I regularly commit to GNOME SVN
    • Yes, I sometimes commit to GNOME SVN
    • No, I do not commit to GNOME SVN myself
  • How do you best characterize your current GNOME SVN contributions?
    • I develop code
    • I write documentation
    • I test
    • I translate
    • Other

    (Edit: I wish the question, “In which ways do you characterize
    your current GNOME SVN contributions?” had also been asked.
    It would be really interesting to see the results of such a
    select-all-that-apply question.)

  • Which of the following distributed version control systems are you familiar with? (select all that apply)
    • bzr
    • git
    • hg
  • How do you best summarize which DVCS systems you use *regularly*? (select all that apply)
    • bzr
    • git
    • hg
  • How do you feel about GNOME changing version control system to one of bzr, git, or hg in 2009?
    • Not again! We just switched systems, like, yesterday (no)
    • No strong feeling, I’d use whatever is provided
    • What’s wrong with SVN? (why?)
    • I do not care
    • Please do! Anything is better than svn (except for cvs of course)
    • Other
  • Which one do you prefer? Please rank the following:
    • anything other than svn (no preference)
    • bzr
    • git
    • hg
    • svn (no change)

Basic stats

Contribution statistics

Why do we attract so few people that self-identify as primarily being
documenters? Is it because people who get involved in documentation
then also get heavily involved in other areas and thus put themselves
in the “Other” category (most of the documenters I can think of
probably did this)? Are distros more likely to attract this kind of
volunteer? Do we just have a fundamental shortcoming somewhere?

DVCS familiarity statistics, and should we switch

Wow…we have an awful lot of people already familiar with other
VCSes. Over 60% familiar with git, and nearly half the people already
use it regularly? I knew there were a lot of people out there, but I
didn’t know it was that many. bzr and hg also have fairly strong
representation among the community (there’s even 31 people who are
familiar with all three systems, and one person who regularly uses all
three — no I’m not that person). The number of people who regularly
use git still leads the other two systems by quite a bit; I thought
they (or at least bzr) would have caught up more by now but I guess
not.

The lion’s share of the votes for whether we should switch were either
for those that wanted to switch or those that didn’t have a strong
feeling. Although only a small percentage (less than 3%) voted “no”,
that may have been due to the wording; for purposes of counting, the
“why?” column should be lumped with the “no”s. It’s a lighter no, but
still a no. The “other” column is a bit of a wildcard and represents
a somewhat significant cross-section of the community. As can be seen
in the next section, among this group who chose “other” in answer to
the question of whether we should switch, there was a preference for
git over the other systems.

VCS rankings

Note that I’ve created an extra plot derived from the other five, ‘Average rank’, which shows the average rank of each VCS (the number in parenthesis for this extra plot is the number of people whose rankings were averaged). If the community were evenly divided, or if no one cared which system we used, then every VCS would have a rank of 3. So the relevant question in the average rank plot is how far from rank 3 each system is.

Note that the different graphs have different y-axis ranges, as was true with previous plots too. Sorry.

This set of plots really surprised me. I have often thought of git as
polarizing and expected it to have the most first place votes and the
most last place votes. It definitely got the most first place votes,
was close on second place votes, and significantly lagged all other
systems in second-to-last and last place votes. I was floored by
this.

Average rankings for different demographics

One question I was really interested in was which version control
system various demographics preferred. For example, there were a
significant number of people who selected “other” for whether we
should switch to another system. What’s their preference? Do
translators or testers have a different favorite system than coders?
Do maintainers of multiple modules have a different outlook than
non-maintainers? So, in this section I try to look into this
question.  Note that in each plot, the number in parentheses are the number of people across whom the average was taken.

Average VCS ranking by maintainence/development load

It looks like VCS preference doesn’t change much relative to
maintainence and development load. However, I found it interesting
that bzr had its highest support among maintainers/developers of a
single module and that git had its highest support among
maintainers/developers of multiple modules. (Mercurial had more
support among non-maintainers and non-developers, though that may just
be a reflection of the latter demographic having less strongly held
opinions.) That matched my intuition about design choices of bzr and
git, what they were optimized for, and how it has reflected in their
usage. However, although I was correct about the trend, the size of
the trend turned out to be nearly negligible.

Average VCS ranking by commit frequency

Not much variance here either. As expected, it looks like regular committers have stronger opinions (average rankings further from 3) than occasional or non-committers.

Average VCS ranking by contribution type

I was surprised by these plots. I expected support for git
to be found almost exclusively among coders, but apparently that is
not the case at all. git is ranked highest by all groups other than
documenters. Documenters, though, do rank git dead last.

Some might suggest we discard the last plot given the tiny sample size
(only 4 people self-identify as being ‘primarily’ documenters!).
While there’s some merit to that claim, I find it to be the most
interesting plot (as a bit of a VCS junkie) since it is the only
non-VCS related demographic for which git does not come in first
place.

I also find the translator plot interesting (as a VCS junkie), as it’s
the only other such plot for which git does not have a commanding
preference lead over all other VCSes. Honestly, though, I was quite
surprised that git was even close to svn for translators, let alone
that it had a small lead.

Average VCS ranking by DVCS usage/familiarity

No real surprise here as far as the favorite goes — users who are familiar with or regularly use a certain system tend to prefer that system. However, git enjoys positive support in all cases and at least comes in second? I found that somewhat surprising. I thought it would get a average ranking lower than 3 by those familiar with or using bzr/hg — much as bzr, svn, and hg did among those familiar with or regularly using git.

Average VCS ranking by propensity to switch systems

Those who think we should switch want to go to git. Those who have no
strong preference or selected other, also had a preference for git.
Those who don’t care whether we switch, wonder what’s wrong with
subversion, or think we just shouldn’t switch, all prefer subversion.
Even among the latter group, git came in a positive second for the
“why?” and “I don’t care” groups.

Final thoughts

It looks like there’s a strong preference in the community toward
switching, and that git has a strong lead in preference among the
community, followed by svn, then bzr, then mercurial.

Among the non-VCS-related demographics, there was only two in which
git did not have a commanding lead: testers and documenters. Among
testers, git was still the preferred system, but it only marginally
lead svn (and these two strongly lead bzr and hg). Among documenters,
git came dead last by a large margin (while bzr came in a commanding
first). It would be interesting to find out why; perhaps we should
poll the 4 relevant people.

Among the VCS-related demographics, people familiar with or regularly
using a certain system tended to prefer that system. git always came
in a positive second, though. Also, those not wanting to switch
systems or not caring *at all* whether we switched strongly supported
subversion, while everyone else (including those with no strong
feeling about the switch) strongly preferred git. Even among the “why
switch” and “I don’t care” groups that preferred subversion, git came
in a positive second. Among the tiniest switch preference group,
those that don’t want to change systems at all, bzr was second
followed fairly closely by git.

I spent a lot more time discussing git than bzr or hg in my comments
here, but that was mostly a reflection of where it appeared in the
stats. As shown in the survey results, the other systems don’t appear
to be nearly as preferred in the community, so I simply didn’t discuss
them as much. I apologize if that makes my analysis looks biased; as
I said at the beginning, feel free to ignore my analysis and draw your
own conclusions from the stats.