Estimating merge costs

community, freesoftware, General, maemo 2 Comments

After commenting on Mal Minhas’s “cost of non-participation” paper (PDF), I’ve been thinking about the cost of performing a merge back to a baseline, and I think I have something to work with.

First, this might be obvious, but worth stating: Merging a branch which has changed and a branch which has not changed is trivial, and has zero cost.

So merging only has a cost if we have a situation where the two trees concerned with the merge have changed.

We can also make another observation: If we are only adding new function points to a branch, and the mainline branch does not change the API, there is a very small cost to merging (almost zero). There may be some cost if functions with similar names, performing similar functions, have been added to the mainline branch, but we can trivially merge even a large diff if we are not touching any of the baseline code, and only adding new files, objects, or functions.

With that said, let’s get to the nuts & bolts of the analysis:

Let’s say that a code tree has n function points. A vendor takes a branch and makes a series of modifications which affect x function points in the program. The community develops the mainline, and changes y function points in the original program. Both vendor and community add new function points to extend functionality, but we’re assuming that merging these has an almost zero cost.

The probability of conflicts is obviously greater the bigger x and y are, and it climbs very quickly as they grow. Let’s assume that every time a given function point has been modified by both the vendor and the community, there is a conflict which must be manually resolved (1). If we assume that changes are independently distributed across the codebase (2), we can work out that the probability of at least one conflict is 1 – [(n-x)!(n-y)!] / [n!(n-x-y)!] if I haven’t messed up my maths (thanks to derf on for the help!).

So if we have 20 functions, and one function gets modified on the mainline and one on the vendor branch, we have a 5% chance of a conflict, but if we modify 5 each, the probability goes up to over 80%. This is the same phenomenon which lets you show that if you have 23 people in a room, there is a better than even chance that at least two of them will share a birthday.
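For anyone who wants to play with the numbers, here is a minimal Python sketch of that formula. The function name p_conflict is just something I made up for illustration, and it leans on the same assumptions as above: changes land independently and uniformly across the n function points, and any function point touched by both sides counts as a conflict. It uses the equivalent binomial form 1 – C(n-x, y)/C(n, y), which avoids computing huge factorials directly.

```python
from math import comb


def p_conflict(n: int, x: int, y: int) -> float:
    """Probability that at least one function point was changed on both sides.

    n: function points in the original tree
    x: function points changed by the vendor
    y: function points changed by the community
    Assumes changes are spread independently and uniformly (footnote 2).
    """
    if not (0 <= x <= n and 0 <= y <= n):
        raise ValueError("x and y must be between 0 and n")
    # The community's y changes all miss the vendor's x changes when they
    # fall entirely within the n - x untouched function points.
    p_no_conflict = comb(n - x, y) / comb(n, y)
    return 1 - p_no_conflict


if __name__ == "__main__":
    print(p_conflict(20, 1, 1))  # roughly 0.05
    print(p_conflict(20, 5, 5))  # roughly 0.81
```

When x + y is bigger than n, comb(n - x, y) is zero and the function returns 1, which matches intuition: the two sets of changes cannot avoid overlapping.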

We can also calculate the expected number of conflicts, and thus the expected cost of the merge, if we assume the cost of each of these conflicts is a constant cost C (3). However, the maths to do that is outside the scope of my skillz right now 🙁 Anyone else care to give it a go & put it in the comments?
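One possible way to get at the expected number of conflicts, under the same independence assumption: each of the vendor’s x modified function points has a y/n chance of also being among the community’s y changes, so by linearity of expectation the expected number of conflicts is x·y/n, and the expected cost is C·x·y/n. The sketch below (function names are my own, for illustration only) computes that closed form and cross-checks it by summing over the exact distribution of overlaps.

```python
from fractions import Fraction
from math import comb


def expected_conflicts(n: int, x: int, y: int) -> Fraction:
    """Closed form: each vendor change is also a community change with probability y/n."""
    return Fraction(x * y, n)


def expected_conflicts_bruteforce(n: int, x: int, y: int) -> Fraction:
    """Cross-check: sum k * P(exactly k overlapping function points)."""
    total = Fraction(0)
    for k in range(min(x, y) + 1):
        # Choose k of the vendor's x changes and y - k of the untouched n - x.
        ways = comb(x, k) * comb(n - x, y - k)
        total += Fraction(k * ways, comb(n, y))
    return total


def expected_cost(n: int, x: int, y: int, cost_per_conflict: float) -> float:
    """Expected merge cost, assuming a constant cost C per conflict (footnote 3)."""
    return cost_per_conflict * float(expected_conflicts(n, x, y))


if __name__ == "__main__":
    print(expected_conflicts(20, 5, 5))             # 5/4
    print(expected_conflicts_bruteforce(20, 5, 5))  # 5/4
```

So in the 20-function example with 5 changes on each side, you would expect 1.25 conflicts on average, even though the chance of at least one is over 80%.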

We have a bunch of data we can analyse to calculate the cost of merges in quantitative terms (for example, Nokia’s merge of Hildon work from GTK+ 2.6 to 2.10), to estimate C, and of course we can quite easily measure n and y over time from the database of source code we have available to us, so it should be possible to give a very basic estimate metric for cost of merge with the public data.

Footnotes:

(1) It’s entirely possible to have automatic merges happen within a single function, and the longer the function, the more likely this is to happen if the patches are short.

(2) A poor assumption, since changes tend to be disproportionately concentrated in a few key functions.

(3) I would guess that the cost is usually proportional to the number of lines in the function, perhaps to the square of the number of lines – resolving a conflict in a 40 line function is probably more than twice as easy as resolving a conflict in an 80 line function. This is slightly at odds with footnote (1), so overall the assumption of constant cost seems reasonable to me.

The value of engagement

community, freesoftware, General, gimp, gnome, maemo, work 5 Comments

(Reposted from Neary Consulting)

Mal Minhas of the LiMo Foundation announced and presented a white paper at OSiM World called “Mobile Open Source Economic Analysis” (PDF link). Mal argues that by forking off a version of a free software component to adjust it to your needs, run intensive QA, and ship it in a device (a process which can take up to 2 years), you are leaving money on the table, by way of what he calls “unleveraged potential” – you don’t benefit from all of the features and bug fixes which have gone into the software since you forked off it.

While this is true, it is also not the whole story. Trying to build a rock-solid software platform on shifting sands is not easy. Many projects do not commit to regular stable releases of their software. In the not too distant past, the FFmpeg project, universally shipped in Linux distributions, had never had a stable or unstable release. The GIMP went from version 1.2.0 in December 1999 to 2.0.0 in March 2004 in unstable mode, with only bug-fix releases on the 1.2 series.

In these circumstances, getting both the stability your customers need, and the latest & greatest features, is not easy. Time-based releases, pioneered by the GNOME project in 2001, and now almost universally followed by major free software projects, mitigate this. They give you periodic sync points where you can get software which meets a certain standard of feature stability and robustness. But no software release is bug-free, and this is true for both free and proprietary software. In The Mythical Man-Month, Fred Brooks described the difficulties of system integration, and estimated that 25% of the time in a project would be spent integrating and testing relationships between components which had already been planned, written and debugged. Building a system or a Linux distribution, then, takes a lot longer than just throwing the latest stable version of every project together and hoping it all works.

By participating actively in the QA process of the project leading up to the release, and by maintaining automated test suites and continuous integration, you can both mitigate the effects of the shifting sands of unstable development versions and reduce the integration overhead once you have a stable release. At some stage, you must draw a line in the sand, and start preparing for a release. In the GNOME project, we freeze modules progressively: first the API & ABI of the platform, then the features to be included in existing modules, new module proposals, strings and user interface changes, before finally we have a complete code freeze pre-release. Similarly, distributors decide early what versions of components they will include on their platforms, and while occasional slippages may be tolerated, moving to a new major version of a major component of the platform would cause integration testing to return more or less to zero – the overhead is enormous.

The difficulty, then, is what to do once this line is drawn. Serious bugs will be fixed in the stable branch, and they can be merged into your platform easily. But what about features you develop to solve problems specific to your device? Typically, free software projects expect new features to be built and tested on the unstable branch, but you are building your platform on the stable version. You have three choices at this point, none pleasant – never merge, merge later, or merge now:

  • Develop the feature you want on your copy of the stable branch, resulting in a delta which will be unique to your code-base, which you will have to maintain separately forever. In addition, if you want to benefit from the features and bug fixes added to later versions of the component, you will incur the cost of merging your changes into the latest version, a non-negligible amount of time.
  • Once you have released your product and your team has more time, propose the features you have worked on piecemeal to the upstream project, for inclusion in the next stable version. This solution has many issues:
    • If the period is long enough, your feature additions will be far removed from the codebase as it has evolved, and merging your changes into the latest unstable tree will be a major task.
    • You may be redundantly solving problems that the community has already addressed, in a different or incompatible way.
    • Feature requests may need substantial re-writing to meet community standards. This problem is doubly so if you have not consulted the community before developing the feature, to see how it might best be integrated.
    • In the worst case, you may have built a lot of software on an API which is only present in your copy of the component’s source tree, and if your features are rejected, you are stuck maintaining the component, or re-writing substantial amounts of code to work with upstream.
  • Develop your feature on the unstable branch of the project, submit it for inclusion (with the overhead that implies), and back-port the feature to your stable branch once included. This guarantees a smaller delta from the next stable version to your branch, and ensures your work gets upstream as soon as possible, but adds a time & labour overhead to the creation of your software platform.

In all of these situations there is a cost: the time & effort of developing software within the community and back-porting, the maintenance cost (and related unleveraged potential) of maintaining your own branch of a major component, or the huge cost of integrating a large delta back to the community-maintained version many months after the code has been written.

Intuitively, it feels like the long-term cheapest solution is to develop, where possible, features in the community-maintained unstable branch, and back-port them to your stable tree when you are finished. While this might be nice in an ideal world, feature proposals have taken literally years to get to the point where they have been accepted into the Linux kernel, and you have a product to ship – sometimes the only choice you have is to maintain the feature yourself out-of-tree, as Robert Love did for over a year with inotify.

While addressing the raw value of the code produced by the community in the interim, Mal does not quantify the costs associated with these options. Indeed, it is difficult to do so. In some cases, there is not only a cost in terms of time & effort, but also in terms of goodwill and standing of your engineers within the community – this is the type of cost which it is very hard to put a dollar value on. I would like to see a way to do so, though, and I think that it would be possible to quantify, for example, the community overhead (as a mean) by looking at the average time for patch acceptance and/or number of lines modified from initial proposal to final mainline merge.

Anyone have any other thoughts on ways you could measure the cost of maintaining a big diff, or the cost of merging a lot of code?

Frustration

community, maemo 16 Comments

I wonder if it was a mistake to adopt the “evaluate as they come in” method for Maemo Summit presentations. As we received proposals, for each proposal on its merits we said yes, no or maybe. If you were a yes, you were added to the schedule. A no got a nice email. A maybe stayed in the queue.

We set a deadline for submissions of September 13th, but this was a deadline for us to finish the schedule, not for people who wanted to give presentations to submit. I said as much in the call for content: “The final deadline for submissions will be September 13th but the sooner you submit your proposal the better chances you will have to get a slot”.

After Nokia World, a bunch of people came out of the woodwork to propose quality presentations, and after reviewing pending proposals last week, we now have an agenda which is almost full – there are 5 open slots and about 8 open lightning talk slots, about half of which are potentially taken already.

So it’s slightly frustrating to see 16 new submissions come in over the past 2 days as people saw the deadline arriving and the schedule filling up. If they were all there before, our choices might have been different, but now we will unfortunately be obliged to reject otherwise great presentations, simply because the proposers waited too long to ask for a slot.

It’s a tough problem to solve, though – if we had set an earlier deadline, we would not have received many of those presentations, or they would have been vague proposals like “can’t say much yet, but this’ll be a cool presentation about something related to Fremantle”. Approving presentations early allowed the council to have better information for travel subsidies and allowed people to book travel earlier and thus cheaper. But we’re going to miss out on some presentations I think would be pretty good. Pity.

Ton Roosendaal to keynote the Summit community days

community, freesoftware, maemo 1 Comment

Ton Roosendaal (on the left)

I’m very pleased to share that the opening keynote for the community days during the Maemo Summit will be Ton Roosendaal of the Blender Foundation. It’s been my pleasure to know Ton, mostly from afar, for the past few years, and he is one of the most amazing people I know in the free software world.

Ton is one of those people who has a sense of doing things big, and doing them right. Over the past few years, Ton has raised money to hire artists and developers to work on commercial quality films and games, resulting in Project Orange, Project Apricot, Project Peach and now Project Durian is in pre-production (you can buy your copy now and get your name on the credits!), with the goal of showing off what Blender can do and making the program better by working closely with artists to see what needs work. The results are truly impressive, and the amount of foresight and hard work which went into getting each of these projects off the ground and completed is amazing.

The Blender community is also amazing. Ton has continually given passionate users a reason to stay around, and ways to help the project, and that has been rewarded by a diverse community of artists and developers working together. The BlenderNation fan/news site is a testament to the creativity and passion of the community.

I’m really looking forward to hearing Ton speak.

Maemo Community Council elections Q3 09: Nominations open

community, maemo 1 Comment

The nomination period for candidatures for the Q3 2009 Maemo community council election is now open.

Candidates eligible for election according to the rules of the council can be nominated by anyone in the community. If a maemo.org community member nominates someone other than themselves, the nomination must be accepted by the nominee before it is official.

Nominations may be made before 23:59 UTC, September 20th, at which time a voting period of one week will open, by sending an email to the maemo-community mailing list with the subject “Council Nomination:” followed by the name of the nominee. Nominations can be confirmed by the nominee replying to this email.

I encourage anyone who would like to be on the council to nominate themselves early, and I would encourage all community members to be forthcoming with questions for the candidates.

Important election dates:

  • Sept 7: Nominations open for Maemo Community Council elections
  • Sept 21: Nominations close, voting opens
  • Sept 28: Voting closes, provisional results declared
  • Oct 5: If no challenges are upheld, results for elections are final

Good luck to all!

Six word novels

General 20 Comments

I happened on a Wired article about the 6 word novel this morning, in a link from a newsletter I’m subscribed to.

Like Haiku and other very formulaic structures, the six word novel gives the author enormous freedom while constraining them in a fish-bowl.

My favourite examples:

  • “For sale: Baby shoes. Never worn.” – Ernest Hemingway
  • “Longed for him. Got him. Shit.” – Margaret Atwood
  • “With bloody hands, I say good-bye.” – Frank Miller

Reading these, and seeing the second-place finisher of the competition in this newsletter (which was my favourite): “My secret discovered. Plane ticket purchased.” made me want to give it a try.

After some work, here’s my best effort.

As the noose tightened, she remembered.

A bit macabre, but nonetheless I’m pretty happy with the images it brings forward, and the questions it leaves unanswered.

Anyone else care to try?

GCDS round-up 6: The tail-off

community, freesoftware, gnome, guadec No Comments

The last in the series!

After Mobile Day on Wednesday, I chilled out on Thursday morning, and attended the GNOME Foundation AGM where I gave a quick report on GNOME Mobile, before heading off to play in the fourth annual FreeFA world cup, with the mad dogs and Englishmen who went out in the midday Gran Canaria sun to play football for 2 hours.

As is usual, the team I played on won and the team that Bastien played on lost 🙂 After the match, though, I let Bastien in on my secret: always play with the team with the most local guys. Why? Because the people from the local team who take time off to go play in the match usually play regularly. The rule has never let me down 🙂

Another highlight of the match was Diego who, having got through the whole match without breaking a sweat, finally broke one just as we were taking the group photo at the end.

Thursday evening, met up with Federico Mena, and Jonathan and Rosanna Blandford for a very interesting hippie BOF, with conversation varying across a bunch of subjects, including compost heaps, growing trees and herbs, architecture pattern languages, cultural variations in building design, and more. On to dinner, and home to the hotel early, ready for the cycling trip on Friday.

Up early on Friday, down to breakfast, no sign of Aaron or J5 yet, so I start eating without them, and go get started with the bike. Turns out we were sitting in different parts of the lobby in the Fataga.

Armed with bikes, we set off around 9:30 to get to Arucas, on our way to Teror. We made pretty good going of it along the waterfront, and after taking it fairly easy on the edge of the motorway (surprising that we could cycle there actually, but apparently that’s the only way to get where we were going) along the coastline, we finally came to the intersection for Arucas. John was starting to find the going a little tough already, but nothing had prepared us for what was next.

The nice straight GC-20, from the coast to Arucas, was steep, much steeper than I had expected (if I had to guess, I’d say an average of around 8% with some bits around 10%). It was a struggle, but we got to the top, before a nice long downhill stretch to come into Arucas, after which we all needed a water stop. We agreed that the goal of Teror (600m higher and 15km further on) was probably not realistic, so we decided to cut across on the GC 300 to Tamaraceite, grabbing lunch in the first village we came to on the road.

And so we set out after a nice long break & a walk around the “cathedral” in Arucas, the town gardens and the main shopping district (with a detour by a farm supplies store) for what we thought would be a nice light 20 minute cycle to the next town over. No such luck.

After climbing a nice hill straight out of Arucas, we had our reward – a really nice winding fast descent towards Tenoya. But when we got up to the village of Tenoya, we couldn’t find a restaurant anywhere. Eventually a nice old man pointed us towards “la cantina”, which turned out to be a bar with some very nice young men standing outside calling us crazy. So we decided to go to the next village over.

Through a road tunnel – watch out for the oncoming cars! Traffic lights don’t know about bikes going through and we didn’t have lights. Then we got to a sort of service station with a promising sign: “Supermarket in Las Mesas: 500m”. If you ever come across that sign, don’t believe it. Between us and lunch was a killer hill and 2km of dusty road.

We settled in to la Cantina to weather the hottest part of the day, had some nice lunch (food always tastes better after physical effort) and set off again to get back to Gran Canaria. Getting around Tamaraceite was a bit tricky, we took one or two wrong turns before finding the nice small back road to get us on the right track. Then one last killer hill, up the Cuesta Blanca to the major shopping district, and then downhill all the way through the roundabouts, right down to the hospital near the golf club, and home.

It was a great ride, lots of fun, and I’m happy I made time for it. Aaron & John were great partners.

After that, packed my bags, out for dinner with Lefty, home to bed quite early, and up with the birds for an early flight, when I got to run into lots of GCDS attendees in the airport – a nice breakfast with Guy Lunardi, Jonathan & Zana and Owen Taylor (IIRC), and I was off on my plane once more. Homeward bound, for a few days, before heading off to OSCON and the Community Leadership Summit.

GCDS round-up 5: Mobile Day

community, gimp, gnome, guadec, maemo 5 Comments

Nearing the end of the series on the Gran Canaria Desktop Summit.

On Wednesday morning (after SMASHED), we had to get to the new location for the conference. I missed the bus window of 8am to 9am, so I took a taxi, without knowing the address of where we were going, other than knowing that it was the “Gran Canaria university, informatics building”. Turns out that’s not enough information for a taxi driver 🙂 Anyway, got there eventually, late for the opening session, and a little more expensive than expected. I also lost some change down the back of the bucket seat, so he even got a tip.

Anyway, the rest of the day went pretty well, and we had some great mobile related presentations (to complement all of the other mobile related content in the conference):

  • Multimedia in your pocket, by Stefan Kost: Nice presentation on using MAFW to build complex multimedia applications
  • Designing Moblin-Netbook. A free desktop on a 7-10″ Screen, by Nick Richards: Great overview of the Moblin platform, and the design principles guiding it – from design requirements, personas, and dealing with constraints.
  • Hildon desktop in Maemo 5 by Kimmo Hämäläinen: An overview of the Hildon desktop on a whiteboard by Kimmo.
  • MAFW: the Media Application Framework for Maemo by Iago Toral: Drilling down into the details of MAFW.
  • Why its easier to re-invent rather than participate on the mobile? by Shreyas Srinivasan: My favourite presentation of the day. Shreyas laid out what he had expected from GNOME Mobile, the problems he encountered, his understanding of the issues, and some proposed solutions to those problems. All in 15 minutes. I really appreciate people who don’t pad out the content that they have to present and instead focus on making a high-impact presentation.
  • GNOME Mobile BOF, led by myself: We talked about how far we’ve come, the original goals of the initiative, and identified a bunch of things that we can improve short-term and medium-term.

Had a great dinner again on Wednesday, in a tapas bar with some Red Hatters and Michael Meeks, and then on to the party. Wednesday night was the golf club party, sponsored by Collabora, with a free bar until 1 (of which I mostly did not avail – I was being good), and I was in bed by 2. It was a great party, and I picked up another couple of cyclists for the outing I had been planning for Friday, before they wimped out on me.

GCDS round-up 4: Days 2 – 4

General, gnome, guadec, maemo 2 Comments

Sunday, Monday and Tuesday were the “core” days of the Gran Canaria Desktop Summit, with cross-desktop and KDE & GNOME specific presentations throughout. I caught a number of presentations, but mostly I was chatting in the hallway track, or doing work on the schedule, or actually working.

For me, the story of the 3 days was “parties”. I missed the early sessions on Sunday and Monday to get breakfast at 10am, after the parties hosted by Nokia (Sunday night) and Igalia (Monday night) – I was relieved that there was no party planned on Tuesday night, my 35 year old body couldn’t stand the pace! Great parties, mostly not marred by excessive boozing, and some great chats, notably with jrb, and Adam Dingle and Jim Nelson from Yorba, makers of Shotwell, a Vala photo manager with some really nice features and plans. And some great discussions with Michael Meeks and Matthew Garrett on the futon during the Igalia party, with Federico Mena Quintero on architecture design patterns, and Jorge Castro on dinosaurs. I also got to meet Joaquim from Igalia. The Macacque band were great, but I’m sure that a hoarse Lefty regretted “Sweet Home Chicago” and “Smoke on the Water” the day after.

I did get to some presentations though (here with a one line summary):

  • Power management by Matthew Garrett: “Power management isn’t doing the same amount of work, slower. Do less work, or you’re killing polar bears.”
  • ConnMan by Not Marcel Holtmann (Joshua Lock from Intel gave the talk in the end – thanks Emmanuele!): “ConnMan solves some problems for Moblin that NetworkManager wasn’t designed to solve.” (I think).
  • Bluetooth on Linux by Bastien Nocera: “It mostly works now”
  • Introduction to GNOME Shell, by Owen Taylor: “It’s pretty cool stuff already”
  • GNOME Zeitgeist, by Thorsten, Seif and Federico: “We record what you’re doing”
  • Communicating design in development, by Celeste Lyn Paul: “Keep it simple until they get the design principle, excessive realism too early just makes the discussion about the details”. Unfortunately, I don’t see a video available; it would be highly recommended viewing if there were one.
  • GNOME 3.0: A live circus^Wstatus update, by Vincent Untz et al: “It’s not just GTK+, Zeitgeist and GNOME Shell”
  • GNOME 1,2,3, by Fernando Herrera and Xan Lopez: “A history of GNOME with thanks to YouTube” (my favourite presentation of the conference)
  • Personal Passion lightning talks, by Aaron Bockover: “We’re not just Free Software hackers!” This was absolutely my second favourite session of the conference. We got a 10 minute overview of the burnout cycle from Jono Bacon, underlining how important it is to have a life outside of software, and heard from people whose passion was running (complete with a soundtrack of me finishing a marathon), airplanes, motorcycling, cooking, bacon, dinosaurs, Aikido, Buddhism and calligraphy, trekking in Argentina, and also a couple of geeky ones on icon design and Scheme (which was very enjoyable indeed, thanks Andy!)

Update: Memory playing tricks with me – for of course, Tuesday evening was the highly anticipated meeting of SMASHED. We finally met at the Mare Baja again, where the opening night party was held, and enjoyed a bunch of tapas courtesy of CodeThink, before scoffing down some great whisk[e]y, including (from memory) a 21yo Highland Park, a nice 16yo Longmorn, a very old bottle of Oude Ginever from Lefty, an old standard Connemara single malt, and a Yamazaki 10yo I brought.

SMASHED 2009 in Gran Canaria

Festivities carried on until after 1am, when I left with Andrew Savory and someone else (whose name I don’t recall), and Behdad got in an unprovoked fight with the footpath on the way back to the hotel – it came right up and hit him in the face. Some nice KDE people took him to the hospital to get sewn up – luckily the group photos had been taken earlier in the day.

Got back to the hotel around 2, and tried to catch up on some of that beauty sleep before Mobile Day on Wednesday in the new conference location in the university.

Gathering Gran Canaria press and feedback

del.icio.us, gnome, guadec No Comments

I have been bookmarking Gran Canaria Desktop Summit blogs, articles, photos and more today, and I could definitely use some help (help!).

So, here’s what I have:

  • Tag all GCDS related webpages as gcds
  • Tag pages related to specific presentations with presentation and talk (slides or video or presentation description inline)
  • Tag blog entries as blog
  • Tag photo pages as photos
  • Tag news articles (outside press, including slashdot for example) as press
  • Tag non-English articles with the language they’re written in
  • Optionally tag GNOME specific content with guadec gnome
  • Optionally tag KDE specific content with akademy kde

This way, we can find all of the GCDS related press, GCDS related blogs, GCDS photos and so on, as well as having one big bucket of GCDS related content. There are hundreds of blog entries, and I didn’t get them all. So as an easy first step, if you blogged about your presentation, put up a photo set, spot an article about the conference (in any language) or blogged about the conference in general, please tag your content in delicious. It makes it easier to filter everything and get an idea of the buzz created, generate publicity for future conferences, and a ton of other good stuff.

Thanks!
