Estimating merge costs

community, freesoftware, General, maemo 2 Comments

After commenting on Mal Minhas’s “cost of non-participation” paper (PDF), I’ve been thinking about the cost of performing a merge back to a baseline, and I think I have something to work with.

First, this might be obvious, but worth stating: merging a branch which has changed with a branch which has not changed is trivial, and has zero cost.

So merging only has a cost if both of the trees involved in the merge have changed.

We can also make another observation: if we are only adding new function points to a branch, and the mainline branch does not change the API, the cost of merging is very small (almost zero). There may be some cost if functions with similar names, doing similar things, have been added to the mainline branch, but we can trivially merge even a large diff if we are not touching any of the baseline code, and are only adding new files, objects, or functions.

With that said, let’s get to the nuts & bolts of the analysis:

Let’s say that a code tree has n function points. A vendor takes a branch and makes a series of modifications which affects x function points in the program. The community develops the mainline, and changes y function points in the original program. Both vendor and community add new function points to extend functionality, but we’re assuming that merging these is an almost zero cost.

The probability of conflicts is obviously greater the bigger x and y are, and it increases very quickly as those numbers grow. Let’s assume that every time a given function point has been modified by both the vendor and the community, there is a conflict which must be manually resolved (1). If we assume that changes are independently distributed across the codebase (2), the probability of at least one conflict works out as 1 – ((n-x)!(n-y)!)/(n!(n-x-y)!), if I haven’t messed up my maths (thanks to derf for the help!).

So if we have 20 function points, and one gets modified on the mainline and another on the vendor branch, there is a 5% chance of a conflict; but if we modify 5 on each side, the probability goes up to over 80%. This is the same phenomenon which lets you show that if you have 23 people in a room, the odds are better than even that at least two of them share a birthday.

We can also calculate the expected number of conflicts, and thus the expected cost of the merge, if we assume the cost of each of these conflicts is a constant cost C (3). However, the maths to do that is outside the scope of my skillz right now 🙁 Anyone else care to give it a go & put it in the comments?
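
Here is a first stab, for what it’s worth: under assumption (2), the overlap between the two sets of changes is hypergeometric, so by linearity of expectation the expected number of conflicts is x·y/n, and the expected merge cost is C·x·y/n, if my maths holds. A minimal sketch of both calculations in Python (the function names are mine):

```python
from math import comb  # Python 3.8+

def p_conflict(n, x, y):
    """Probability of at least one conflict when the vendor modifies x
    of n function points and the community modifies y, with changes
    independently distributed across the codebase (assumption 2)."""
    # P(no overlap) = C(n-x, y) / C(n, y), which expands to the
    # (n-x)!(n-y)! / n!(n-x-y)! formula above
    return 1 - comb(n - x, y) / comb(n, y)

def expected_merge_cost(n, x, y, c=1.0):
    """Expected merge cost: x*y/n expected conflicts (the hypergeometric
    mean), each costing a constant C (assumption 3)."""
    return c * x * y / n

print(p_conflict(20, 1, 1))           # 0.05  -- the 5% case above
print(p_conflict(20, 5, 5))           # ~0.81 -- "over 80%"
print(expected_merge_cost(20, 5, 5))  # 1.25 expected conflicts at C=1
```

For the example above, each side touching 5 of 20 function points gives 1.25 expected conflicts, so an expected merge cost of 1.25·C.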

We have a bunch of data we can analyse to estimate C quantitatively (for example, Nokia’s merge of Hildon work from GTK+ 2.6 to 2.10), and of course we can quite easily measure n and y over time from the database of source code available to us, so it should be possible to build a very basic estimate of the cost of a merge from public data.

Footnotes:

(1) It’s entirely possible to have automatic merges happen within a single function, and the longer the function, the more likely this is to happen if the patches are short.

(2) A poor assumption, since changes tend to be disproportionately concentrated in a few key functions.

(3) I would guess that the cost is usually proportional to the number of lines in the function, perhaps to the square of the number of lines – resolving a conflict in a 40-line function is probably more than twice as easy as resolving a conflict in an 80-line function. This is slightly at odds with footnote (1), so overall the assumption of constant cost seems reasonable to me.

The value of engagement

community, freesoftware, General, gimp, gnome, maemo, work 5 Comments

(Reposted from Neary Consulting)

Mal Minhas of the LiMo Foundation announced and presented a white paper at OSiM World called “Mobile Open Source Economic Analysis” (PDF link). Mal argues that by forking off a version of a free software component to adjust it to your needs, running intensive QA, and shipping it in a device (a process which can take up to 2 years), you are leaving money on the table, by way of what he calls “unleveraged potential” – you don’t benefit from all of the features and bug fixes which have gone into the software since you forked it.

While this is true, it is also not the whole story. Trying to build a rock-solid software platform on shifting sands is not easy. Many projects do not commit to regular stable releases of their software. In the not too distant past, the FFmpeg project, universally shipped in Linux distributions, had never made a release of any kind, stable or unstable. The GIMP spent the entire period from version 1.2.0 in December 1999 to 2.0.0 in March 2004 in unstable development, with only bug-fix releases on the 1.2 series.

In these circumstances, getting both the stability your customers need and the latest & greatest features is not easy. Time-based releases, pioneered by the GNOME project in 2001 and now almost universally followed by major free software projects, mitigate this: they give you periodic sync points where you can get software which meets a certain standard of feature stability and robustness. But no software release is bug-free, and this is true for both free and proprietary software. In The Mythical Man-Month, Fred Brooks described the difficulties of system integration, and estimated that 25% of the time in a project would be spent integrating and testing relationships between components which had already been planned, written and debugged. Building a system or a Linux distribution, then, takes a lot longer than just throwing the latest stable version of every project together and hoping it all works.

By participating actively in the QA process of the project leading up to the release, and by maintaining automated test suites and continuous integration, you can both mitigate the shifting sands of unstable development versions and reduce the integration overhead once you have a stable release. At some stage, you must draw a line in the sand and start preparing for a release. In the GNOME project, we freeze progressively: first the API & ABI of the platform, then the features to be included in existing modules, then new module proposals, then strings and user interface changes, before finally imposing a complete code freeze pre-release. Similarly, distributors decide early what versions of components they will include in their platforms, and while occasional slippages may be tolerated, moving to a new major version of a major platform component would send integration testing more or less back to zero – the overhead is enormous.

The difficulty, then, is what to do once this line is drawn. Serious bugs will be fixed in the stable branch, and they can be merged into your platform easily. But what about features you develop to solve problems specific to your device? Typically, free software projects expect new features to be built and tested on the unstable branch, but you are building your platform on the stable version. You have three choices at this point, none pleasant – never merge, merge later, or merge now:

  • Develop the feature you want on your copy of the stable branch, resulting in a delta which will be unique to your code-base, and which you will have to maintain separately forever. In addition, if you want to benefit from the features and bug fixes added to later versions of the component, you will incur the cost of merging your changes into the latest version – a non-negligible amount of time.
  • Once you have released your product and your team has more time, propose the features you have worked on piecemeal to the upstream project, for inclusion in the next stable version. This solution has many issues:
    • If the period is long enough, the codebase will have evolved far beyond the point where you made your changes, and merging them into the latest unstable tree will be a major task.
    • You may be redundantly solving problems that the community has already addressed, in a different or incompatible way.
    • Features may need substantial re-writing to meet community standards. This is doubly true if you have not consulted the community before developing the feature, to see how it might best be integrated.
    • In the worst case, you may have built a lot of software on an API which is only present in your copy of the component’s source tree, and if your features are rejected, you are stuck maintaining the component, or re-writing substantial amounts of code to work with upstream.
  • Develop your feature on the unstable branch of the project, submit it for inclusion (with the overhead that implies), and back-port the feature to your stable branch once it is included. This guarantees a smaller delta from the next stable version to your branch, and ensures your work gets upstream as soon as possible, but adds a time & labour overhead to the creation of your software platform.

In all of these situations there is a cost: the time & effort of developing software within the community and back-porting; the maintenance cost (and related unleveraged potential) of maintaining your own branch of a major component; or the huge cost of integrating a large delta back into the community-maintained version many months after the code has been written.

Intuitively, it feels like the long-term cheapest solution is to develop, where possible, features in the community-maintained unstable branch, and back-port them to your stable tree when you are finished. While this might be nice in an ideal world, feature proposals have taken literally years to get to the point where they have been accepted into the Linux kernel, and you have a product to ship – sometimes the only choice you have is to maintain the feature yourself out-of-tree, as Robert Love did for over a year with inotify.

While he addresses the raw value of the code produced by the community in the interim, Mal does not quantify the costs associated with these options. Indeed, it is difficult to do so. In some cases, the cost is not only in terms of time & effort, but also in terms of the goodwill and standing of your engineers within the community – the type of cost which is very hard to put a dollar value on. I would like to see a way to do so, though, and I think it would be possible to quantify the community overhead (as a mean), for example by looking at the average time for patch acceptance and/or the number of lines modified from initial proposal to final mainline merge.
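
To make that concrete, here is a minimal sketch of the kind of metric I have in mind. The patch records are entirely made up – real numbers would come from a project’s bug tracker or mailing list archives:

```python
from datetime import date

# Hypothetical records: (lines in the initial proposal, lines at merge,
# date first proposed, date merged into mainline)
patches = [
    (120, 340, date(2008, 3, 1), date(2009, 1, 15)),
    (45, 50, date(2008, 6, 10), date(2008, 7, 2)),
    (800, 560, date(2007, 11, 20), date(2009, 4, 30)),
]

# Mean time from initial proposal to mainline merge, in days
mean_latency = sum((merged - proposed).days
                   for _, _, proposed, merged in patches) / len(patches)

# Mean rework ratio: how much a patch changed between proposal and merge
mean_rework = sum(abs(final - initial) / initial
                  for initial, final, _, _ in patches) / len(patches)

print(f"Mean acceptance latency: {mean_latency:.0f} days")
print(f"Mean rework ratio: {mean_rework:.0%}")
```

A long mean latency quantifies the overhead of the “merge now” option, while a high rework ratio suggests that features are being developed too far from community standards before being proposed.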

Anyone have any other thoughts on ways you could measure the cost of maintaining a big diff, or the cost of merging a lot of code?

Ton Roosendaal to keynote the Summit community days

community, freesoftware, maemo 1 Comment
Ton Roosendaal (on the left)

I’m very pleased to share that the opening keynote for the community days during the Maemo Summit will be Ton Roosendaal of the Blender Foundation. It’s been my pleasure to know Ton, mostly from afar, for the past few years, and he is one of the most amazing people I know in the free software world.

Ton is one of those people who has a sense of doing things big, and doing them right. Over the past few years, Ton has raised money to hire artists and developers to work on commercial-quality films and games – resulting in Project Orange, Project Apricot and Project Peach, with Project Durian now in pre-production (you can buy your copy now and get your name in the credits!) – with the goal of showing off what Blender can do, and of making the program better by working closely with artists to see what needs work. The results are truly impressive, and the amount of foresight and hard work which went into getting each of these projects off the ground and completed is amazing.

The Blender community is also amazing. Ton has continually given passionate users a reason to stay around, and ways to help the project, and that has been rewarded by a diverse community of artists and developers working together. The BlenderNation fan/news site is a testament to the creativity and passion of the community.

I’m really looking forward to hearing Ton speak.

GCDS round-up 6: The tail-off

community, freesoftware, gnome, guadec No Comments

The last in the series!

After Mobile Day on Wednesday, I chilled out on Thursday morning, and attended the GNOME Foundation AGM where I gave a quick report on GNOME Mobile, before heading off to play in the fourth annual FreeFA world cup, with the mad dogs and Englishmen who went out in the midday Gran Canaria sun to play football for 2 hours.

As usual, the team I played on won and the team that Bastien played on lost 🙂 After the match, though, I let Bastien in on my secret: always play with the team with the most local guys. Why? Because the people from the local team who take time off to come and play usually play regularly. The rule has never let me down 🙂

Another highlight of the match was Diego, who went the whole game without breaking a sweat, then finally broke one just as we were taking the group photo at the end.

Thursday evening, met up with Federico Mena and Jonathan and Rosanna Blandford for a very interesting hippie BOF, with conversation ranging across a bunch of subjects, including compost heaps, growing trees and herbs, architecture pattern languages, cultural variations in building design, and more. On to dinner, and home to the hotel early, ready for the cycling trip on Friday.

Up early on Friday, down to breakfast, no sign of Aaron or J5 yet, so I start eating without them, and go get started with the bike. Turns out we were sitting in different parts of the lobby in the Fataga.

Armed with bikes, we set off around 9:30 to get to Arucas, on our way to Teror. We made pretty good going of it along the waterfront, and after taking it fairly easy along the edge of the motorway on the coastline (surprising that we could cycle there, actually, but apparently that’s the only way to get where we were going), we finally came to the intersection for Arucas. John was starting to find the going a little tough already, but nothing had prepared us for what came next.

The nice straight GC-20, from the coast to Arucas, was steep, much steeper than I had expected (if I had to guess, I’d say an average of around 8% with some bits around 10%). It was a struggle, but we got to the top, before a nice long downhill stretch to come into Arucas, after which we all needed a water stop. We agreed that the goal of Teror (600m higher and 15km further on) was probably not realistic, so we decided to cut across on the GC 300 to Tamaraceite, grabbing lunch in the first village we came to on the road.

And so we set out after a nice long break & a walk around the “cathedral” in Arucas, the town gardens and the main shopping district (with a detour by a farm supplies store) for what we thought would be a nice light 20 minute cycle to the next town over. No such luck.

After climbing a nice hill straight out of Arucas, we had our reward – a really nice winding fast descent towards Tenoya. But when we got up to the village of Tenoya, we couldn’t find a restaurant anywhere. Eventually a nice old man pointed us towards “la cantina”, which turned out to be a bar with some very nice young men standing outside calling us crazy. So we decided to go to the next village over.

Through a road tunnel – watch out for the oncoming cars! The traffic lights don’t know about bikes going through, and we didn’t have lights. Then we got to a sort of service station with a promising sign: “Supermarket in Las Mesas: 500m”. If you ever come across that sign, don’t believe it. Between us and lunch was a killer hill and 2km of dusty road.

We settled into la Cantina to weather the hottest part of the day, had some nice lunch (food always tastes better after physical effort) and set off again to get back to Las Palmas. Getting around Tamaraceite was a bit tricky – we took one or two wrong turns before finding the nice small back road to get us on the right track. Then one last killer hill, up the Cuesta Blanca to the major shopping district, and then downhill all the way through the roundabouts, right down to the hospital near the golf club, and home.

It was a great ride, lots of fun, and I’m happy I made time for it. Aaron & John were great partners.

After that, packed my bags, out for dinner with Lefty, home to bed quite early, and up with the birds for an early flight, where I ran into lots of GCDS attendees at the airport – a nice breakfast with Guy Lunardi, Jonathan & Zana and Owen Taylor (IIRC), and I was off on my plane once more. Homeward bound, for a few days, before heading off to OSCON and the Community Leadership Summit.

Barriers to community growth

community, freesoftware 13 Comments
Barriers to entry

I often talk to vendors who are interested in growing their developer communities around their free software projects. When I do, my advice centers around two things, one of which I think I can help with.

The first is your project vision – why would someone look at your stuff instead of anyone else’s? You are competing for the attention of the pool of free software developers out there, as well as trying to grow that pool, and what will draw people to your project is your vision.

The second is the Hippocratic principle of community building: Primum non nocere, first do no harm.

Most communities fail to reach critical mass because people become interested in the project, then bounce off it because of difficulties they meet when engaging with you. To build a successful community, it is usually sufficient to articulate a compelling vision and remove all non-essential barriers to participation in your project.

I have compiled a check-list of various barriers to entry found in vendor-led projects, roughly grouped into technical, social and legal barriers. Sometimes it’s appropriate for a new community member to face a learning curve – you want to maintain a tone in your community, and ensure that core developers understand its social and technical norms – but often the things they have to learn are incidental rather than essential, and removing those is a worthwhile thing to do.

Without further ado, here is Community barriers to entry (pdf) – I’m publishing this under CC BY-SA 3.0 and I would be delighted to get feedback on this to help improve it and make it more useful. Comments welcome!

Why I disagree with RMS concerning Mono

freesoftware, gimp, gnome, maemo, openwengo 43 Comments

The GNOME press contact alias got a mail last weekend from Sam Varghese asking about the possibility of new Mono applications being added to GNOME 3.0, and I answered it. I didn’t think much about it at the time, but I see now that the reason Sam was asking was Richard Stallman’s recent warnings about Mono – Sam’s article has since appeared with the ominous-looking title “GNOME 3.0 may have more Mono apps”. And indeed it may. It may also have more alien technology; we’re not sure yet. We’re still working on an agreement with the DoD to get access to the alien craft in Fort Knox.

Anyway – that aside, Richard’s position is that it’s dangerous to include Mono to the point where removing it is difficult, should that become necessary to legally distribute your software. On the surface, I agree. But he goes a little further, saying that since it is dangerous to depend on Mono, we should actively discourage its use. And on this point, we disagree.

I’m not arguing that we should encourage its use either, but I fundamentally disagree with discouraging someone from pursuing a technology choice because of the threat of patents. In this particular case, the law is an ass. The patent system in the United States is out of control and dysfunctional, and it is bringing the rest of the world down with it. The time has come to take a stand and say “We don’t care about patents. We’re just not going to think about them. Sue us if you want.”

The healthy thing to do now would be to provoke a test case of the US patent system. Take advantage of one of the many cease & desist letters that get sent out for vacuous patented technology to make a case against the US PTO’s policy pertaining to software and business process patents. Run an “implement your favourite stupid patent as free software” competition.

In all of the projects I have been involved in over the years, patent fears have had a negative effect on developer productivity and morale. In the GIMP, we struggled with patent issues related to compression algorithms for GIF and TIFF, colour management, and some plug-ins. In GNOME, it has mostly been Mono, but also MP3, and related (and unrelated) issues have handicapped basic functionality like playing DVDs for years. In Openwengo, the area of audio and video codecs is mined with patent restrictions, including the popular G.729 and H.264 codecs among others.

What could we have achieved if standards bodies had a patent pledge as part of their standardisation process, and released reference implementations under an artistic licence? How much further along would we be if cryptography, filesystems, codecs and data compression weren’t so heavily handicapped by patents? Or if we’d just ignored the patents and created clean-room implementations of these patented technologies?

That’s what I believe we need to do. Ignore the patent system completely. I believe strongly in respecting licencing requirements related to third party products and developer packs. I think it’s reasonable to respect people’s trademarks and trade secrets. But having respect for patents, and the patent system, is ridiculous. Let a thousand flowers bloom, and let the chips fall where they may.

So if you want to write a killer app in Mono, then don’t let anyone tell you otherwise. If you build it, they will come.

Too many platforms?

community, freesoftware, maemo 6 Comments

Fabrizio Capobianco of Funambol wondered recently if there are too many mobile Linux platforms.

The context was the recent announcement of oFono by Intel and Nokia, and some confusion and misunderstanding about what oFono represents. Apparently, several people in the media thought that oFono would be Yet Another Complete Stack, and Fabrizio took the bait too.

As far as I can tell, oFono is a component of a mobile stack, supplying the kind of high-level API for telephony functions which GStreamer does for multimedia applications. Looked at like this, it is a natural complement to Moblin and Maemo, and potentially a candidate for inclusion in the GNOME Mobile module set.

Which brings me to my main point. Fabrizio mentions five platforms besides oFono in his article: Android, LiMo, Symbian, Maemo and Moblin. First, Symbian is not Linux. Of the other four, LiMo, Maemo and Moblin share a bunch of technology in their platforms. Common components across the three are: The Linux kernel (duh), DBus, Xorg, GTK+, GConf, Gstreamer, BlueZ, SQLite… For the most part, they use the same build tools. The differences are in the middleware and application layers of the platform, but the APIs that developers are mostly building against are the same across all three.

Maemo and Moblin share even more technology, and both have very solid community roots. Nokia has invested heavily in getting its developers working upstream, as has Intel. Both leverage community projects right through the stack, and focus on differentiation at the top, in the user experience. The same goes for Ubuntu Netbook Edition (the nearest thing Moblin has to a direct competitor at the moment).

So where is the massive diversity in mobile platforms? Right now, there is Android in smartphones, LiMo targeting smartphones, Maemo in personal internet tablets and Moblin on netbooks. And except for Android, they are all leveraging the work being done by projects like GNOME, rather than re-inventing the wheel. This is not fragmentation, it is adaptability. It is the basic system being tailored to very specific use-cases by groups who decide to use an existing code base rather than starting from scratch. It is, in a word, what rocks about Linux and free software in general.

Football clubs and free software projects

community, freesoftware, gnome, maemo 4 Comments

A few weeks ago I pointed out some similarities between community software projects and critical mass. After watching Chelsea-Barcelona last night – an entertaining match for many of the wrong reasons and a few of the right ones – I wanted to share another analogy that could perhaps be useful in analysing free software projects. What can we learn from football clubs?

Before you roll your eyes, hear me out for a second. I’m a firm believer that building software is just like building anything else. And free software communities share lots of similarities with other communities. And football clubs are big communities of people with shared passions.

Football clubs share quite a few features with software development. As with free software, there are different degrees of involvement: the star players and managers on the field, the back-room staff – physiotherapists, trainers and administrators – the business development and marketing people who help grease the wheels and make the club profitable, and then the supporters. The star players are often somewhat mercenary – they help their club to win because they get paid for it. Similarly, in many free software projects, many of the core developers are hired to develop – this doesn’t mean they’re not passionate about the project, but Stormy’s presentation about the relationship between money and volunteer efforts, “would you do it again for free?”, rings true.

Even within the supporters, you have different levels of involvement – members of supporter clubs and lifetime ticket holders, the people who wouldn’t miss a match unless they were on their death bed, people who are bringing their son to the first match of his life in the big stadium, and the armchair fans, who “follow” their team but never get closer than the television screen.

The importance of the various groups echoes free software projects too – those fanatical supporters may think that the club couldn’t survive without them, and they might be right, but the club needs trainers, back-room staff and players more. In the free software world, we see many passionate users getting “involved” in the community by sending lots of email to mailing lists suggesting improvements, but we need people hacking code, translating the software and in general “doing stuff” more than we need this kind of input. The input is welcome, and without our users the software we produce would be irrelevant, but the contribution of a supporter needs to be weighed against the work done by core developers, the “stars” of our community.

Drogba shares the love

Football clubs breed a club culture, like software projects. For years West Ham was known for having the ‘ardest players in the league, with the ‘ardest supporters – the “West ‘Am Barmy Army”. Other clubs have built a culture of respect for authority – this is particularly true in a sport like rugby. More and more, the culture in football is one of disrespect for authority. Clubs like Manchester United have gotten away with en masse intimidation of match officials when decisions didn’t go their way. I was ashamed to see players I have admired from afar – John Terry, Didier Drogba, Michael Ballack – show, in the heat of the moment, the utmost disrespect for the referee. That culture goes right through the club – when supporters see their heroes outraged and aggressive, they get that way too. The referee in question has received death threats today.

Another similarity is the need for a sense of identity and leadership. Football fans walk around adorned in their club’s colours – it gives them a sense of identity, a shared passion. And so do free software developers – and the more obscure the t-shirt you’re wearing, the better. “I was at the first GUADEC” to a GNOME hacker is like saying “I was in Istanbul” to a Liverpool supporter.

This is belonging

So, given the similarities – spheres of influence and involvement, lots of different roles needed to make a successful club, a common culture and identity – what can we learn from football clubs?

A few ideas:

  • Recruitment: Football clubs work very very hard to ensure a steady stream of talented individuals coming up through the ranks. They have academies where they grow new talent, scouts, reserve teams and feeder clubs where they keep an eye on promising talent, and they will buy a star away from a competing club based on his reputation and track record.
  • Teams have natural lifecycles: When old leaders come to the end of the road, managers often have trouble filling the leadership void. Often it’s not one player leaving, but a group of friends who have played together for years. Good teams, though, see further ahead, and are constantly looking to renew themselves, so that they don’t end up losing 5 or 6 key players in one season.
  • Build belonging: Supporters want to show their sense of belonging, and people who don’t have the skillz to be on the field still want to wear their team colours and share their passion for the team. Merchandising is one way to do that, but not the only way. We should look at the way clubs cultivate their user groups and create a passionate following.
  • Leaders decide the culture: We owe it to ourselves to systematically grow a nurturing culture at the heart of our project – core developers, thought leaders, anyone who is a figurehead within the project. If we are polite and respectful to each other, considerate of the feelings of those we deal with and sensitive to how our words will be received, our supporters will follow suit.

Are there any other dodgy analogies we can make with free software development communities? Any other lessons we might be able to draw from this one?

Oracle buys MySQL shocker (and they get the rest of Sun too)

freesoftware 8 Comments

In an effort to further perturb the free software market, which has recently been threatening its middleware, application server and database businesses, Oracle’s management has come up with the masterly stroke of buying Sun Microsystems, and with it a chief competitor, MySQL.

Oracle had already announced their intention to undermine MySQL a few years ago when they bought InnoDB, the ACID database engine used by MySQL, just 18 months before their licensing agreement with the Swedes was due to expire. If you don’t understand why a licensing agreement was needed, you need to think about what licence MySQL was distributing InnoDB under when selling commercial MySQL licences.

My thought at the time was that Oracle could just refuse to renew the licensing arrangement, leaving MySQL without a revenue stream. The problem became moot when Sun bought MySQL, and less critical when MySQL announced they were working on an alternative ACID engine.

Oracle has been trying to undermine free software vendors for a while now, including with its launch of Unbreakable Linux to compete directly with Red Hat. Red Hat had previously purchased JBoss, a product competing directly with WebLogic, Oracle’s application server.

I figure that Oracle will do what some people were suggesting Sun should have done: split Sun into three entities – storage, servers and software. They have an interest in keeping the storage unit around – there’s considerable synergy possible there for a database company.

Within the software business unit, they will probably drop OpenOffice.org pretty quickly, and I can’t see them maintaining support for OpenSolaris or GNOME. They will keep selling and supporting Solaris as a cash cow for years to come, in the same way IBM did with the Informix database server some years ago, and Java will be a valuable asset to them.

I don’t think Oracle has a strategic interest in becoming a hardware vendor, however, and I can’t imagine a very big percentage of their client base is using SPARC systems these days, so I don’t see them keeping the server business around for long.

Interesting days ahead in the free software world! From the point of view of MySQL, it will be interesting to see if some ex-MySQL employees take the old GPL code and keep the project going under a different company & different name, or if Drizzle (or one of the other forks) gets critical mass as a community-run project to take over a sizeable chunk of the install base. For Oracle, it will be interesting to see if they start trying to move existing MySQL customers over to Oracle, or if they maintain both products, or if they EOL all support on MySQL altogether and force people into a choice. I imagine that the most likely scenario is that they will maintain support staff, cut development staff, and let the product die a slow and painful death.

Copyright assignment and other barriers to entry

community, freesoftware, gnome 18 Comments

Daniel Chalef and Matthew Aslett responded to my suggestion at OSBC that copyright assignment was unnecessary, and potentially harmful, to building a core community around your project. Daniel wrote that he even got the impression that I thought requesting copyright assignment was “somewhat evil”. This seems like a good opportunity for me to clarify exactly what I think about copyright assignment for free software projects.

First: copyright assignment is usually unnecessary.

Many of the most vibrant and diverse communities around do not have copyright assignment in place. GIMP, GNOME, KDE, Inkscape, Scribus and the Linux kernel all get along just fine without requesting copyright assignment (joint or otherwise) from new contributors.

There are some reasons why copyright assignment might be useful, and Matthew mentions them. Relicencing your software is easier when you own everything, and extremely difficult when you don’t. Pursuing copyright infringers is potentially easier if there is a single copyright holder. The Linux kernel is pretty much set as GPL v2, because even creating a list of all of the copyright holders would be problematic; getting their agreement to a licence change would be nigh on impossible.

Not quite 100% impossible, though, as Mozilla has shown. The relicencing effort of Mozilla took considerable time and resources, and I’m sure the people involved would be delighted not to have needed to go through it. But it is possible.

There is another reason proponents say a joint copyright assignment (JCA) is useful: client indemnification. I happen to think this is a straw man. Enterprise has embraced Linux, GNOME, Apache and any number of other projects without the need for indemnification, and those clients who do need indemnification can get it from companies like IBM, Sun and Red Hat. Owning all the copyright might give more credibility to your client indemnification, but it’s certainly not necessary.

There is a conflation of issues going on with customer indemnification too. What matters more than the ownership of the code is its origin. I would certainly agree that projects should follow decent due diligence procedures to ensure that a submission is the submitter’s own work, and that he has the right or permission to submit the code under the project’s licence. But this is independent of copyright assignment.

Daniel mentions Mozilla as an example of a non-vendor-led project requiring copyright assignment – he is mistaken. The Mozilla Committer’s Agreement (pdf) requires a new committer to do due diligence on the origin of the code he contributes, and not to commit code which he is not authorised to submit. But it does not require joint copyright assignment. Also note when the agreement gets signed – not on your first patch, but when you are becoming a core committer, when you are getting right to the top of the Mozilla food chain.

Second: Copyright assignment is potentially harmful.

It is right and proper that a new contributor to your project jump through some hoops to learn the ways of the community. Communities are layered according to involvement, and the trust which they earn through their involvement. You don’t give the keys to the office to a new employee on day one. What you do on day one is show someone around, introduce them to everyone, let them know what the values of your community are.

Now, what does someone learn about the values of your community if, once they have gone to the effort to modify the software to add a new feature, had their patch reviewed by your committers and met your coding standards, the very next thing you do is send them a legal form that they need to print, sign, and return (and incidentally, agree with) before you will integrate their code in your project?

The hoops that people should be made to jump through are cultural and technical. Learn the tone, meet the core members, learn how to use the tools, the coding conventions, and familiarise yourself with the vision of the community. The role of community members at this stage is to welcome and teach. The equivalent of showing someone around on the first day.

Every additional difficulty which a new contributor experiences is an additional reason for him to not stick around. If someone doesn’t make the effort to familiarise himself with your community processes and tools, then it’s probably not a big deal if he leaves – he wasn’t a good match for the project. But if someone walks away for another reason, something that you could change, something that you can do away with without changing the nature of the community, then that’s a loss.

Among the most common superfluous barriers to entry that you find in free software projects are complicated build systems or uncommon tools, long delays in having questions answered and patches reviewed, and unnecessary bureaucracy around contributing. A JCA fits squarely into that third category.

In short, the core principle is: to build a vibrant core developer community independent of your company, have as few barriers to contributing as possible.

There is another issue at play here, one which might not be welcomed by the vendors driving the communities where I think a JCA requirement does the most harm. That issue is trust.

One of the things I said at OSBC during my presentation is that companies aren’t community members – their employees might be. Communities are made up of people, individual personalities, quirks, beliefs. While we often assign human characteristics to companies, companies don’t believe. They don’t have morals. The personality of a company can change with the board of directors.

Luis Villa once wrote “what if the corporate winds change? … At that point, all the community has is the license, and [the company]’s licensing choices … When [the company] actually trusts communities, and signals as such by treating the community as equals […] then the community should (and I think will) trust them back. But not until then.”

Luis touches on an important point. Trust is the currency we live & die by. And companies earn trust by the licencing choices they make. The Apache Foundation, Python Software Foundation and Free Software Foundation are community-run non-profits. As well as their licence choices, we also have their by-laws, their membership rules and their history. They are trusted entities. In a fundamental way, assigning or sharing copyright with a non-profit with a healthy governance structure is different from sharing copyright with a company.

There are many cases of companies taking community code and forking commercial versions off it, keeping some code just for themselves. Trolltech, SugarCRM and Digium have notably released commercial versions which differ from their GPL editions. (Update: several people have written in to tell me that this is no longer the case for Trolltech, since they were bought by Nokia and Qt was relicenced under the LGPL – it appears people felt clarification was necessary, although the original point stands: Trolltech did sell a commercial Qt different from their GPL “community” edition.)

There are even cases of companies withdrawing from the community completely and forking commercial-only versions of software which had previously been released under the GPL. A recent example is Novell’s sale of NetMail to Messaging Architects, resulting in the creation of the Bongo project, forked off the last GPL release available. In 2001, Sunspire (since defunct) decided to release future versions of Tux Racer as a commercial game, resulting in the creation of Planet Penguin Racer, among others, off the last GPL version. Xara dipped their toes in by releasing most of their core product under the GPL, but decided after a few years that the experiment had failed. Xara Xtreme continues with a community effort to port the rendering engine to Cairo, but to my knowledge no-one from Xara is working on that effort.

Examples like these show that companies can not be trusted to continue developing the software indefinitely as free software. So as an external developer being asked to sign a JCA, you have to ask yourself the question whether you are prepared to allow the company driving the project the ability to build a commercial product with your code in there. At best, that question constitutes another barrier to entry.

At OSBC, I was pointing out some of the down sides of choices that people are making without even questioning them. JCAs are good for some things, but bad at building a big developer community. What I always say is that you first need to know what you want from your community, and set up the rules appropriately. Nothing is inherently evil in this area, and of course the copyright holder has the right to set the rules of the game. What is important is to be aware of the trade-offs which come from those choices.

To summarise where I stand: copyright assignment or sharing agreements are usually unnecessary, and potentially harmful if you are trying to build a vibrant core developer community, because they make bureaucracy and trust in your company core issues for new contributors. There are situations where a JCA is merited, but it comes at a cost in terms of the number of external contributors you will attract.

Updates: Most of the comments concentrated on two things which I had said, but not emphasised enough; I have tried to clarify the text where appropriate. First, Trolltech used to distribute commercial and community editions of Qt which were different, but under the Qt Software group in Nokia this is no longer the case – showing, as it happens, that licencing can change for the better after an acquisition. Second, assigning copyright to a non-profit is, I think, a less controversial proposition for most people, because of the extra trust afforded to non-profits through their by-laws, governance structures and not-for-profit status. It is also worth pointing out, as Aaron Seigo did, that KDE e.V. has a voluntary joint copyright assignment which contributors are encouraged to sign – a neat way to make future relicencing easier without adding an initial barrier to entry.
