We’ve been running into some interesting issues with the GNOME census, which are causing us to twist our tiny brains to get useful results. I thought it might be interesting to share some of them.

  1. – a large number of people commit with their address (or, but have also committed with a different address in the past. So many of you received our survey request twice or more (oops). addresses pose another problem too – when attempting to identify a developer’s employer, gitdm uses domain name matching, and so many gainfully employed GNOME hackers use their addresses to commit that this doesn’t work very well (oops). Finally, we have observed so far that the response rate among unpaid GNOME developers is much higher than the response rate among professional GNOME developers, which has made identifying employers for specific addresses even more difficult.
  2. – Some Canonical developers commit with their address, some with their address, and others apparently use their address. Some unpaid Ubuntu hackers & packagers also commit with email addresses. So identifying the exact number of Canonical developers & Canonical upstream commits has proven very difficult.
  3. Time – many GNOME developers have changed employers at some point, or gone from being unpaid GNOME developers to paid GNOME developers, or changed companies through acquisition or merger. Old email addresses bounce. And yet, it’s the same person. Dealing with time has been one of our toughest challenges, and one where we still don’t have satisfactory answers.
  4. Self-identity – One of the issues we’ve had running the survey is that simple domain name pattern matching doesn’t tell the whole story. Does someone who works for Red Hat on packaging and then spends his evenings hacking his pet project count as a volunteer or a professional? We have noticed a significant number of people who commit with their professional email addresses and consider themselves volunteers on the GNOME project. For a problem which is already complicated enough, this adds further nuance to any quantitative statistics which result.
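
To make the first three problems concrete, here is a minimal Python sketch of one way to combine them: merge known addresses into a single person, look up that person's employer at the date of the commit, and only fall back to domain matching when nothing better is known. This is not how gitdm works internally and not what we actually run; every name, address, domain and date below is made up for illustration.

```python
from datetime import date

# Hypothetical alias table: several addresses that belong to the same person.
ALIASES = {
    "alice@personal.example": "alice",
    "alice@corp.example": "alice",
    "bob@personal.example": "bob",
}

# Hypothetical employment history: (person, start, end or None, employer).
# The end date is exclusive; None means "still employed there".
EMPLOYMENT = [
    ("alice", date(2008, 1, 1), date(2009, 7, 1), "Corp A"),
    ("alice", date(2009, 7, 1), None, "Corp B"),
]

# Hypothetical fallback: plain domain name matching.
DOMAIN_EMPLOYERS = {"corp.example": "Corp A"}


def employer_for(email, when):
    """Guess the employer behind a commit made with `email` on date `when`."""
    email = email.lower()
    person = ALIASES.get(email)
    if person is not None:
        # Prefer what we know about the person at that point in time.
        for who, start, end, employer in EMPLOYMENT:
            if who == person and start <= when and (end is None or when < end):
                return employer
    # Otherwise fall back to matching on the domain of the address.
    domain = email.rpartition("@")[2]
    return DOMAIN_EMPLOYERS.get(domain, "Unknown")


def count_committers(commits):
    """Count distinct people in (email, date) pairs, merging known aliases."""
    people = {ALIASES.get(email.lower(), email.lower()) for email, _ in commits}
    return len(people)
```

Note what the sketch cannot do: it says nothing about self-identity. Someone committing from a work address in their own time still shows up under their employer, which is exactly the nuance that only the survey answers can supply.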

Thank you to everyone who has taken the time to answer the 3 to 7 questions (depending on how you self-identify) in the survey – the data has been interesting, and has led us to question some of our preconceptions. To those of you who have not answered yet, let me assure you that the email we sent was not spam, and Vanessa is doing a great job collating your answers and (in some cases) preparing follow-up questions.

Any insights which people might be able to give in elucidating the issues we’ve noticed are welcome! Please do leave comments.

9 Responses

  3. Owen Taylor Says:

    A problem with the census is that (as far as I noticed) when you get multiple copies, there’s no way to say “this is the same person as this other address that I already answered the census for”. So, it’s actually unclear what you are supposed to do when you end up with a small pile of emails.

    In terms of the time question – people changing roles and employers – I think that’s largely alleviated if you try to assemble results only for the last year, at least as a first step. (Which should also help with the – many commits before the GIT era will be, because we decided explicitly not to resolve old SVN accounts to current email addresses. If there was a clear ChangeLog formatted commit message we used that for Author, otherwise we used

  4. Andrés G. Aragoneses Says:

    Maybe you could include in this Gnome Census project a new requirement which is very interesting wrt the “welcomeness” of contributions (because ISVs want to know if their efforts are going to be taken into account or not): what’s the rate of patches being reviewed per month / number of patches posted in the bug tracker. This, if done, would also help other efforts in downstream to help other higher-level metrics to be synced (outside Gnome) such as the idea behind

    Another metric would be: number of bugs fixed per month / number of bugs filed, to know the level of “support” from the community for a project.

  5. Sandy Says:

    Yeah, this actually caused me to not complete the census yet (I had ignored the fact that I was sent two emails).

    If Novell lets me use a bit of work time to do Tomboy releases, but it’s not really my job, how do I respond?

    If I’m on a team that makes decisions about at-spi2, but I’m not the one committing the changes, how do I respond?

    I realized it would require a bit of thought, and the survey got relegated to a tab that I soon forgot about.

  6. Dave Neary Says:

    @Sandy: In your case, it seems like “Both” is the obvious answer?

    @Owen: Agreed, a major oversight on our part. It would have been very useful information to eliminate duplicates and get closer to the true number of committers.

    @Andrés: I agree this would be useful information, but it’s considerably outside the scope of what we’re doing for this report. A dashboard for the GNOME project might be a logical next step, but it’s not something I’ve currently planned to invest in.

  7. Hub Says:

    Notwithstanding that you are required to answer questions that are not relevant.

    If you answer that you “used to” in the first question, the second question becomes irrelevant, and yet is required. So I just skipped it.

    And I’ll pass on the fact that you sent an email in HTML…. That’s Gmail-style bad taste.

  8. Sandy Says:

    @Dave: I must have missed that, thanks. Survey completed (only once).

