The GNOME DVCS (Distributed Version Control System) Survey completed
about a week and a half ago, with responses from 579 different people with
svn accounts. (There are 1083 people with commit access to
GNOME SVN, so this is about a 53% response rate.) The survey was
intended to collect data related to a possible move for the GNOME
project from SVN to a distributed version control system in 2009; thus
questions about svn were included despite the fact that it is not
distributed. The results of the survey are shown below. (I got the data from Behdad; the scripts I used to generate the plots can be found here.)
Bias
The plots of the data I present simply cover all the questions —
twice. Once to show the percentages of respondents with each answer
for the specific question, then again to contrast how those who
answered a given question differently had differing rankings for the
various VCSes. So the plots are as neutral as I think is possible.
I also add some commentary of my own, analyzing the data and noting
items that surprised me (I had several predictions about how the
survey would turn out; many of my predictions were right but there
were a number of surprises for me too). I don’t think it’s possible
to make such commentary unbiased. In fact, since I noticed a clear
front-runner in looking at the results, I thought it most useful to
look at that particular system, so the majority of my comments focus
on it. If you do not want my bias, ignore my comments and draw your
own conclusions from the data.
Survey Questions
First, let’s remind everyone what the survey questions were:
- Your GNOME SVN user id
- Do you currently maintain any GNOME modules in SVN?
  - Yes, I maintain multiple modules
  - Yes, I maintain a single module
  - No, I am not a maintainer
- Do you currently develop any GNOME modules in SVN?
  - Yes, I develop multiple modules
  - Yes, I develop a single module
  - No, I do not develop any modules
- Do you commit to GNOME SVN?
  - Yes, I regularly commit to GNOME SVN
  - Yes, I sometimes commit to GNOME SVN
  - No, I do not commit to GNOME SVN myself
- How do you best characterize your current GNOME SVN contributions?
  - I develop code
  - I write documentation
  - I test
  - I translate
  - Other
  (Edit: I wish the question, “In which ways do you characterize
  your current GNOME SVN contributions?” had also been asked.
  It would be really interesting to see the results of such a
  select-all-that-apply question.)
- Which of the following distributed version control systems are you familiar with? (select all that apply)
  - bzr
  - git
  - hg
- How do you best summarize which DVCS systems you use *regularly*? (select all that apply)
  - bzr
  - git
  - hg
- How do you feel about GNOME changing version control system to one of bzr, git, or hg in 2009?
  - Not again! We just switched systems, like, yesterday (no)
  - No strong feeling, I’d use whatever is provided
  - What’s wrong with SVN? (why?)
  - I do not care
  - Please do! Anything is better than svn (except for cvs of course)
  - Other
- Which one do you prefer? Please rank the following:
  - anything other than svn (no preference)
  - bzr
  - git
  - hg
  - svn (no change)
Basic stats
Contribution statistics
Why do we attract so few people that self-identify as primarily being
documenters? Is it because people who get involved in documentation
then also get heavily involved in other areas and thus put themselves
in the “Other” category (most of the documenters I can think of
probably did this)? Are distros more likely to attract this kind of
volunteer? Do we just have a fundamental shortcoming somewhere?
DVCS familiarity statistics, and should we switch
Wow…we have an awful lot of people already familiar with other
VCSes. Over 60% familiar with git, and nearly half the people already
use it regularly? I knew there were a lot of people out there, but I
didn’t know it was that many. bzr and hg also have fairly strong
representation among the community (there are even 31 people who are
familiar with all three systems, and one person who regularly uses all
three; no, I’m not that person). The number of people who regularly
use git still leads the other two systems by quite a bit; I thought
they (or at least bzr) would have caught up more by now but I guess
not.
The lion’s share of the votes on whether we should switch came from
those who wanted to switch or those who didn’t have a strong
feeling. Although only a small percentage (less than 3%) voted “no”,
that may have been due to the wording; for purposes of counting, the
“why?” column should be lumped with the “no”s. It’s a lighter no, but
still a no. The “other” column is a bit of a wildcard and represents
a somewhat significant cross-section of the community. As can be seen
in the next section, among this group who chose “other” in answer to
the question of whether we should switch, there was a preference for
git over the other systems.
VCS rankings
Note that I’ve created an extra plot derived from the other five, ‘Average rank’, which shows the average rank of each VCS (the number in parentheses for this extra plot is the number of people whose rankings were averaged). If the community were evenly divided, or if no one cared which system we used, then every VCS would have a rank of 3. So the relevant question in the average rank plot is how far from rank 3 each system is.
Note that the different graphs have different y-axis ranges, as was true with previous plots too. Sorry.
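To make the "every VCS would have a rank of 3" claim concrete, here is a minimal sketch of how an average rank can be computed. It is not the script I used for the plots; the function name `average_ranks`, the ballot format, and the three sample ballots are all hypothetical, and it assumes every ballot ranks all five options.

```python
from itertools import permutations

# The five options from the ranking question.
SYSTEMS = ["no preference", "bzr", "git", "hg", "svn"]

def average_ranks(ballots):
    """Map each system to (mean rank, number of ballots averaged).

    Each ballot is a sequence of all five systems, best first.
    """
    totals = {s: 0 for s in SYSTEMS}
    for ballot in ballots:
        for position, system in enumerate(ballot, start=1):
            totals[system] += position
    n = len(ballots)
    return {s: (totals[s] / n, n) for s in SYSTEMS}

# An evenly divided community: every possible ordering appears once.
even_split = list(permutations(SYSTEMS))
neutral = average_ranks(even_split)

# Hypothetical ballots (NOT survey data) showing a skewed result.
sample = [
    ("git", "bzr", "hg", "no preference", "svn"),
    ("svn", "git", "bzr", "hg", "no preference"),
    ("git", "hg", "no preference", "bzr", "svn"),
]
skewed = average_ranks(sample)

for system, (mean, n) in sorted(skewed.items(), key=lambda kv: kv[1][0]):
    print(f"{system}: {mean:.2f} ({n})")
```

With the evenly divided ballots every system averages exactly 3, which is why distance from 3 is the thing to look at in the plot.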
This set of plots really surprised me. I have often thought of git as
polarizing and expected it to have the most first place votes and the
most last place votes. It definitely got the most first place votes,
was close on second place votes, and significantly lagged all other
systems in second-to-last and last place votes. I was floored by
this.
Average rankings for different demographics
One question I was really interested in was which version control
system various demographics preferred. For example, there were a
significant number of people who selected “other” for whether we
should switch to another system. What’s their preference? Do
translators or testers have a different favorite system than coders?
Do maintainers of multiple modules have a different outlook than
non-maintainers? So, in this section I try to look into this
question. Note that in each plot, the numbers in parentheses are the numbers of people across whom the average was taken.
Average VCS ranking by maintenance/development load
It looks like VCS preference doesn’t change much relative to
maintenance and development load. However, I found it interesting
that bzr had its highest support among maintainers/developers of a
single module and that git had its highest support among
maintainers/developers of multiple modules. (Mercurial had more
support among non-maintainers and non-developers, though that may just
be a reflection of the latter demographic having less strongly held
opinions.) That matched my intuition about design choices of bzr and
git, what they were optimized for, and how that is reflected in their
usage. However, although I was correct about the trend, the size of
the trend turned out to be nearly negligible.
Average VCS ranking by commit frequency
Not much variance here either. As expected, it looks like regular committers have stronger opinions (average rankings further from 3) than occasional or non-committers.
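One way to put a number on "stronger opinions" is the mean distance of a group's average ranks from the neutral value of 3. This is only a sketch of that idea, not something computed for the plots; the function name `opinion_strength` and the average ranks below are hypothetical illustration values, not survey results.

```python
NEUTRAL_RANK = 3.0

def opinion_strength(avg_ranks):
    """Mean absolute distance of a group's average ranks from neutral (3)."""
    distances = [abs(rank - NEUTRAL_RANK) for rank in avg_ranks.values()]
    return sum(distances) / len(distances)

# Hypothetical average ranks per group (NOT taken from the survey data).
regular_committers = {"bzr": 3.4, "git": 2.0, "hg": 3.6, "svn": 3.0}
occasional_committers = {"bzr": 3.2, "git": 2.5, "hg": 3.3, "svn": 2.9}

strong = opinion_strength(regular_committers)
weak = opinion_strength(occasional_committers)
print(f"regular: {strong:.3f}, occasional: {weak:.3f}")
```

A larger value means the group's averages sit further from 3, i.e. the group as a whole leans more firmly one way or the other.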
Average VCS ranking by contribution type
I was surprised by these plots. I expected support for git
to be found almost exclusively among coders, but apparently that is
not the case at all. git is ranked highest by all groups other than
documenters. Documenters, though, do rank git dead last.
Some might suggest we discard the last plot given the tiny sample size
(only 4 people self-identify as being ‘primarily’ documenters!).
While there’s some merit to that claim, I find it to be the most
interesting plot (as a bit of a VCS junkie) since it is the only
non-VCS related demographic for which git does not come in first
place.
I also find the translator plot interesting (as a VCS junkie), as it’s
the only other such plot for which git does not have a commanding
preference lead over all other VCSes. Honestly, though, I was quite
surprised that git was even close to svn for translators, let alone
that it had a small lead.
Average VCS ranking by DVCS usage/familiarity
No real surprise here as far as the favorite goes: users who are familiar with or regularly use a certain system tend to prefer that system. However, git enjoys positive support in all cases and comes in at least second, which I found somewhat surprising. I thought it would get an average ranking worse than the neutral 3 from those familiar with or using bzr/hg, much as bzr, svn, and hg did among those familiar with or regularly using git.
Average VCS ranking by propensity to switch systems
Those who think we should switch want to go to git. Those who have no
strong preference or selected “other” also had a preference for git.
Those who don’t care whether we switch, wonder what’s wrong with
subversion, or think we just shouldn’t switch, all prefer subversion.
Even among the latter group, git came in a positive second for the
“why?” and “I don’t care” groups.
Final thoughts
It looks like there’s a strong preference in the community toward
switching, and that git has a strong lead in preference among the
community, followed by svn, then bzr, then mercurial.
Among the non-VCS-related demographics, there were only two in which
git did not have a commanding lead: testers and documenters. Among
testers, git was still the preferred system, but it only marginally
led svn (and these two strongly led bzr and hg). Among documenters,
git came dead last by a large margin (while bzr came in a commanding
first). It would be interesting to find out why; perhaps we should
poll the 4 relevant people.
Among the VCS-related demographics, people familiar with or regularly
using a certain system tended to prefer that system. git always came
in a positive second, though. Also, those not wanting to switch
systems or not caring *at all* whether we switched strongly supported
subversion, while everyone else (including those with no strong
feeling about the switch) strongly preferred git. Even among the “why
switch” and “I don’t care” groups that preferred subversion, git came
in a positive second. Among the tiniest switch preference group,
those that don’t want to change systems at all, bzr was second
followed fairly closely by git.
I spent a lot more time discussing git than bzr or hg in my comments
here, but that was mostly a reflection of where it appeared in the
stats. As shown in the survey results, the other systems don’t appear
to be nearly as preferred in the community, so I simply didn’t discuss
them as much. I apologize if that makes my analysis look biased; as
I said at the beginning, feel free to ignore my analysis and draw your
own conclusions from the stats.
Elijah, thanks very much for doing this analysis! This has been such a hot-button topic for so long, it’s great to see some actual numbers now. And in particular it’s good to see that the numbers are pretty decisive. I was worried that we’d get like a 30-30-30-10 vote (with svn getting 10%). 90% of people agreeing we need DVCS but evenly split between bzr/git/hg would have been the worst possible outcome.
thanks Elijah — the analysis is definitely great work. kudos!
so, now that we have the numbers: when do we switch?
To be honest, DVCS is important for those who do not own an SVN account. I tend to contribute from time to time using git and the GNOME git mirror, as it is the only way of tracking things more complicated than a single commit or multiple patches.
Interesting analysis, nicely done.
Just a thought on reading your analysis. Git is already famous and, if I had one new VCS to learn, I would choose Git because it’s a knowledge that would be useful for a lot of projects. So, without knowing any of those VCSes, I would have chosen Git, of course. Then, if I knew, for example, bzr, I would have chosen bzr but put git in second place because it’s famous.
So, the whole study can be summarized as: “People tend to prefer what they know. A lot of projects already use Git.” -> “A lot of people had to learn git to contribute to one project.” -> “A lot of people know git.” -> “Git will be the winner.” Also add: “A lot of people are still only using SVN” -> “A lot of people will vote to keep svn”.
I find a lot of graphs mostly irrelevant as a simple count of VCS used by popular projects would give exactly the same result.
What I found more interesting is the division between coder/translator/documenter. Git is known as a very powerful but hard to learn system, with a steep learning curve. (Disclaimer: I do not use Git myself.) So it raises these questions:
– Isn’t Git exclusively for technical geeks? It should be fine for 90% of contributors, but it could be a great barrier to new contributors, documenters, translators and others with no technical knowledge. (This is a question.) How do the other VCSes fare on this question?
– Isn’t Git so hard that, once people know it, they want to use it everywhere to justify the time it took to learn it? (Also known as the “I don’t want an automatic transmission for my car because I learned how to switch gears manually” syndrome.)
Another interesting point is the one/many modules. You say that it highlights some design choices for git and bzr (and I don’t understand which ones, not being a VCS specialist), but it seems an interesting point to dig into.
So you basically have no maintainers or testers yet you’re busy conducting huge surveys about which VCS to use?
Excellent analysis! Thanks a lot!
This was something very interesting to read on a Sunday afternoon. Thanks for the great analysis!
I agree with Ploum that git is getting even more popular because it’s the most popular. That’s called the network effect and it’s a good thing. I work on a project that had components in 5 different VCSes (git, bzr, svn, cvs, darcs). This was a terrible pain. It means that a new contributor has to learn all of them.
When I started doing open source, there was only CVS; you learned it, and you were set to work on almost any open source project. My great hope is that we will come back to that kind of equilibrium. And git is the only one that’s far enough ahead to do that.
As for git being harder to learn, it’s less and less true with each new release. It’s now mostly fear-mongering coming from proponents of other DVCSes.
For documenters and translators and other non-coders, I think we should maybe think of writing special “porcelain” tools to streamline their work. Although I’m not in a good position to comment, since I have no idea what their needs are.
I think there is a flaw in this analysis. Certainly people who use git would pick git. I think the only result you can get from this survey is the popularity of the different VCSes (or DVCSes) across the different “demographics” you had there. I certainly do not think it is possible to produce a ranking of which VCS is better. People have been using Windows & IE for years, and it is certainly the most popular in terms of numbers. However, I wouldn’t think GNOME would start thinking of making an IE port, would it?
To really rank these systems, you need to consider users who are very familiar with all of these systems. My personal opinion is that I like mercurial, mainly because I use it. I have tried using git, and even after being made easier, as Linus mentions, it is still quite complicated and cumbersome to use.
@ Ploum
Certainly, for translators, an automated tool could be developed (some already exist actually) so that those people who don’t want/can’t use git could still contribute efficiently.
Ploum: “Git is already famous and, if I had one new VCS to learn, I would choose Git because it’s a knowledge that would be useful for a lot of projects.” Sure, isn’t that a great reason to choose git? It’s useful knowledge to have, lots of people working on other projects will already have it, and if we choose it we’re providing contributors the chance to learn something useful. What’s wrong with that? Besides, see shaunm’s graphs (which I hadn’t seen until after doing my analysis) showing those that know both git and bzr, or both git and hg, prefer git: http://www.gnome.org/~shaunm/survey/first-picks-permutations.png
Also, I used to think that git was primarily or even exclusively for technical geeks (I ripped its documentation in particular in http://blogs.gnome.org/newren/2008/03/15/how-not-to-write-newbie-friendly-documentation/ and even wrote a simplified wrapper script called EasyGit, available at http://www.gnome.org/~newren/eg/). Git has improved a lot since then, but its improvements aren’t what convinced me I was wrong; this survey did. If git was exclusively for technical geeks, I’m sure it would have ranked low for translators and testers rather than first. Also, I’m convinced it would have received a lot of second-to-last and last place rankings in general, but it received the fewest such votes *by far*.
jegHegy: Your comment doesn’t make sense to me. We have 216 maintainers (93 are maintainers of multiple modules, the rest of just one). We do only have 19 people who self-identify as *primarily* being testers who also have commit access to svn and responded to the survey…but that’s heavily undercounting the testers. And finally, the survey took about 2 minutes for people to complete. So I really don’t understand your comment.
@Hatem: The survey wasn’t designed to determine which VCS was best. It was designed to “help us better understand familiarity and preferences of our active contributor base regarding the future version control system for GNOME.” However, if you want head to head comparisons of those that are familiar with more than one system, see http://www.gnome.org/~shaunm/survey/first-picks-permutations.png.
@Rui: Note that in the survey, translators ranked git the *highest* of all systems. To me, that shoots down the idea that git isn’t usable for translators. And, yes, I agree that there are simple wrappers out there too, if needed.
I suspect the documenters might be the Ubuntu documentation team, who’d obviously support Bzr.
While I applaud the effort and the results are indeed telling, this essay is worth reading: http://keithp.com/blogs/Tyrannical_SCM_selection/
“when selecting and SCM, I decided (already the tyrant) that it really couldn’t involve even a substantial minority of the project developers. Learning enough about the available SCMs takes a lot of time; I spent about a year looking at options and trying things out. During that time, I downloaded SCM source code, built repositories, converted bits of the X.org tree and looked at the results.”
Gnome’s repository isn’t some simple thing that you can just crank through the existing tools and expect to get a useful result at the end; it will be monolithic and not very usable without implementing a lot of features in git nobody got around to yet (in particular, sparse checkouts). Also when branches are moved between projects if you want to preserve that history you need to use submodules which the current SVN importer does not know how to do.
Anyway, I’d suggest that volunteers and experts who are eager to convert to any or all of the above systems should be supported in their efforts, and perhaps one of them will succeed. There’s no point in making a selection, if you still don’t actually have usable options on the table.
Bruce:
I’m not on the Ubuntu Documentation team, but yeah. You’ve probably hit the nail on the head. I’m actually surprised bzr didn’t do better in general thanks to Ubuntu users that like having access to bazaar.launchpad.net for storing their branches. As far as I know, casual GNOME contributors have nowhere to put a branch, so bazaar.lp is nice to have.
I’ve used all of the VCSs named here, but bzr was the easiest to learn. I still have a lot more to learn, because I only use the most basic things, but at least with bzr I feel like I can get things done without the manpage or asking for help. I still need someone sitting next to me telling me which of the myriad git commands does the thing I need…so I don’t use it unless necessary.
I’m really happy about the GNOME Bzr Playground.
@Sam: Thanks for taking the time to read and respond. Your work on importing the Perl archives into git was _very_ impressive.
I’ve read keithp’s blog posting on vcses, but I’m not so sure it’s applicable here. The desire in the GNOME project to switch to a DVCS has been there for some time, but we’ve gotten hung up on (sometimes unsubstantiated) arguments between systems, and we have no tyrant or benevolent dictator that can choose and just make it happen. Actually, I guess we have the foundation board of directors, but they like doing things by consensus more than fiat…and, in fact, I think that might be the reason for the survey. (I didn’t initiate the survey or run it, just ran some analysis on it since I was interested.)
Also, I think you’re assuming we have a single monolithic subversion repository? That’s not the case…we have one subversion repository per gnome module; I’m not aware of any module in gnome that makes use of sparse checkouts. (I’ve heard translators make use of the ability to check out a single file, so maybe that’s not entirely true, but for the most part the lack of sparse checkouts shouldn’t be a hangup for GNOME.)
@Sam Vilain: Um.
The X.Org repository dates back a hell of a long way, and was a single huge monolithic repository (http://webcvs.freedesktop.org/xorg/xc IIRC), before being exploded into multiple separate repositories via cp. These were then reassembled with parsecvs. The repo had, of course, been quite brutally hacked in a number of cruel and unusual ways with both CVS and RCS directly to simulate decent merges or to just get the thing to work. It was pretty epic.
@Elijah: X.Org lacks a tyrant or a dictator as well; it’s just that no-one bothered to step in and reverse it (being that making a decision is usually better than not).
@Hatem Nassrat:
The point of this exercise was not to evaluate the VCS:s but to evaluate the preferences of the people who work on the code base (directly, it doesn’t count those who just create patches and submit them to bugzilla of course).
“Why do we attract so few people that self-identify as primarily being
documenters?”
Kalle hit the nail on the head. This is just selection bias in your survey. Those who write the documentation (and people who write small patches) are not necessarily those who commit the work.
Some translators may have an interest in version control systems but it is developers who should be interested in keeping it simple so they can get on with development and have to spend less time managing the work of others that needs to be committed. (The natural choke point serves as a good time to review the work but I’m sure documentation writers and translators do manage their own commits when it isn’t too much hassle.)
@Alan:
The burden of committing for others is lessened with *any* DVCS because anyone can clone the destination repo and become their own committer. It’s *so* much easier to merge a series of small changes versus merging one large change.
“Those who write the documentation (and people who write small patches) are not necessarily those who commit the work. ”
Personally, I think those who will benefit the most from a DVCS are people who write big patches without commit rights. I can imagine working on a small patch and sending an svn diff. But I cannot imagine working on a >200 line patch without using the git mirror (I have one in progress).
I’d second what Bruce Cowan said. The preference among documenters for bzr probably came from documentation contributors from Ubuntu, who currently use bzr. I know that we sometimes get requested to contribute to upstream docs (e.g., Banshee docs).
Hah at “switch = why?” preferring SVN. Time to learn git’s (or more specifically, DVCS’s) benefits then.
D-Bus was in the same boat as X.Org on a much smaller scale. When we decided to switch we moved to git for 2 reasons. First the network effect of most of freedesktop.org going to git meant I had a wealth of resources to help me with the transition. Second git made it fairly easy to split up the bindings and core while keeping history intact.
Some people seem to think popularity is a bad reason to make a choice. What you are seeing is not popularity but momentum. The more people using it means that there are more resources, both in development and in just being able to ask people how to do things. This builds momentum which means your odds of running into a project using git is much higher than other developing systems while older systems which were on top are seeing decelerated usage and will eventually slip below the radar. By hitching to the technology which has the most momentum (what some cynically deride as winning a popularity contest) one increases their odds of being familiar with the dvcs in use when jumping from project to project thereby increasing their own usefulness within the ecosystem. It is simple economic/survival theory. The more people use it, the more useful it becomes, the more people thrive using it, continue the cycle until something else which is a game changer appears and gains more momentum.
What’s the result for those who are familiar with all 3 VCS systems?
I am neither a committer nor a documenter... but I guess maybe those guys try to write one doc for all distros... and I just know that Ubuntu largely supports bzr through launchpad.net
I GUESS maybe this is a small point... so they can write once and use it everywhere easily
at least with no repo sync/transform/migrate
Why wasn’t Svk offered in the survey? That way you get DVCS without having to do yet another repository conversion?
It sounds like many folks think the “take away” from the data is to move from Subversion to Git. However, I don’t think you can really read that from the data – what would those that supported Hg or Bzr support if they knew that Hg and Bzr were out of the running? To assume it is Git would be a fallacy.
Thanks for the analysis, I was too lazy to analyze raw data myself, though was very interested in Gnome devs’ public opinion.
Hope KDE will some day think about this too.
@Arne: See http://www.gnome.org/~shaunm/survey/first-picks-permutations.png (based on early results, so not complete but close) or http://wingolog.org/archives/2009/01/06/git-and-bzr.
@David: See http://blogs.gnome.org/newren/2007/11/17/adoption-of-various-vcses/ for at least one reason to not use svk.
@David: Did you even read the survey results? Sure, if we only asked for first choice from everyone and there wasn’t a clear leader, then we couldn’t realistically claim we knew what the “preferred” system of those polled was. However, (1) we asked for first, second, third, fourth, and fifth place rankings so we could look at other rankings and figure out what people would want, and (2) even if all people who chose bzr (78 people) or hg (38 people) or svn (151 people) for their first choice system had all voted for the same system they still wouldn’t have as many votes as those who picked git as their first choice (286 people).
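The arithmetic behind point (2) can be checked in a few lines. This is just a sketch using the four first-choice counts quoted above; the variable names are mine, and the counts for the "anything other than svn" option are not given here, so they are left out.

```python
# First-choice counts as quoted in the comment above.
first_choice = {"git": 286, "bzr": 78, "hg": 38, "svn": 151}

# Worst case for git: everyone who picked bzr, hg, or svn first
# unites behind one single alternative.
others_combined = sum(n for system, n in first_choice.items() if system != "git")
git_margin = first_choice["git"] - others_combined

print(f"bzr + hg + svn = {others_combined}")
print(f"git leads the combined bloc by {git_margin}")
```

So even in that worst case, the combined bloc (267) still falls short of git's 286 first-place votes.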
Some interesting statistics, which I think are in some ways more interesting than the averages used in the analysis above:
Second choices are always interesting, especially where there are two or more options which would naturally seem to go together. I’d expect those who ranked git, hg or bzr first to rank the other two before SVN, but it seems that this isn’t the case.
38% of people who ranked Git first preferred svn before bzr or hg, while 31% of bzr users felt the same way about svn over hg or git (although 45% chose git as a second choice). It is hard to explain why there are more than a few die-hards who would prefer svn to the other distributed alternatives, given that their first choice isn’t really all that different.
Of those who regularly used git, 83% had it listed as their first choice, quickly tailing off toward 0%, with only 1% ranking it 4th or 5th. For bzr, 71% of users ranked it as their first choice, with the dropoff following the same (scaled) pattern from there.
Of the 20 people who used both regularly, 50% preferred git vs 30% preferring bzr. Interestingly, 2nd choices for svn here were a slightly more realistic 20%.
I started this hoping it would put bzr in a more favorable light since, as someone who only very occasionally (never? can’t remember) contributes anything to the ecosystem, I personally prefer ease of use above all else. I think however that all my analysis can show is that people prefer what they already know above unknowns, even if they appear similar on the surface. It would be interesting to find out why the people who put svn in second place did so, and what could be done to improve the chosen vcs to change their minds.
@Elijah
Thanks for that stats link, and all the other stats under the same directory (http://www.gnome.org/~shaunm/survey/). They were quite helpful.
I have been contemplating Git over and over for the past few weeks. I did try it out for a brief session. Although they do say it has come a long way, I found it lacking from a usability perspective and shunned it immediately. However, as a mercurial user I did have to deal with a few things that Git solves, such as the staging area for example. I am thinking I am sold on Git. I will have to work with aliases for a while, and maybe submit a patch to at least get mind-reading abbreviations going like Hg has. I will use Git for one or two projects and see how it feels.
Having found the complexity of git somewhat painful myself, I am glad to learn (from one of the comments) about Easy Git. I’m giving it a try. Maybe it will make me more willing to recommend git, particularly in environments where not everyone is a developer. And maybe wider use of it would further enhance git’s popularity?
I’m not a Gnome contributor, just a passing statistician really.
I just wanted to point out explicitly, as several have hinted, that this survey isn’t a technical comparison of DVCSes amongst unbiased responders, but rather a highly biased referendum on the perceived merits of switching.
But then, that’s exactly the point. So, well done! Good survey!
I think the graph people should be taking most note of is the “Regular use of DVCSs”, since it’s a history-based question rather than opinion-based. 50% of your ‘user base’ regularly uses GIT. Choosing a system that half of your base already knows is always going to be an easier transition than educating the larger majority in an unfamiliar system.
Asking what people actually do tends to give more useful answers than asking what they think.