The 100,000 line question

Clinton Rogers (Yorba developer and all-pro Telestrations master) pointed out something interesting yesterday.  Ohloh now lists Shotwell as having achieved an odd milestone just ten days shy of its third birthday: Shotwell has reached 100,000 lines of code.  That number represents the work of 51 code contributors and 89 translators.  (It also represents blanks and comments — 100,000 lines of pure code is a ways off.)

It’s an odd milestone because there’s a rosy tradition in computer programming of honoring tight code and efficient algorithms.  The code placed on pedestals are not thousands of lines long, they’re short and sweet, like a witty joke or a clever haiku.  Duff’s Device is my favorite example; you probably have your own.  (Tom Duff’s history of his Device is a great short read and offers a basketful of concise observations on code optimization.)

Which is why reaching the 100,000 mark makes me simultaneously proud and a little uncomfortable.  Shotwell has grown quite a bit in three years — but is it really doing the work of 100,000 lines of code?  Ur-spreadsheet VisiCalc was also 100,000 lines of code, pretty lean compared to the Macbeth that knocked it off its throne, Lotus 1-2-3 (clocking in at 400,000 lines).  Compare that to the original Colossal Cave game, which was (gulp) 700 lines of FORTRAN and 700 lines of data.  It later grew to a whopping 4,800 lines of code and data that ran entirely in memory.  100,000 lines of code feels downright luxurious, even bourgeois, in comparison.

(I’m not claiming Shotwell should be placed alongside these landmarks.  It’s just interesting to consider what 100,000 lines of code represents.  I’m also aware that there’s a number of people who think line count is a misleading, or even useless, metric.  I do think lines of code provides some scale of complexity and size.  I’ve never seen a program grow in size and get simpler.)

There’s probably no reason to duck my head in shame.  Sure, there’s plenty of features we want to add and bugs we want to squash, but those 100,000 lines of code we have today are pulling a lot of collective weight.  They include the basic organizer, a nondestructive photo editor, a standalone image viewer (which also includes an editor), hierarchical tagging, stored searches, ten plug-ins, and plenty more.  Could we scrape away 1,000 lines of code and still have the same functionality?  Almost certainly.  10,000?  I can think of a few places where fat could be trimmed, but I don’t think it’s excessive.

Note that Ohloh is counting lines of Vala code, not the C generated by valac.  Although valac does not exactly produce sparse output, it’s worth mentioning that sloccount reports over 720,000 lines of C code generated by Vala.  If Vala is producing on average six times more C code than a competent human programmer (and I’m not asserting it does), that’s 120,000 extra lines.  Reducing that by the magic factor of six means Vala saved us from writing 20,000 lines of C code, a victory worth popping open a can of beer and celebrating over.

2 thoughts on “The 100,000 line question”

  1. congrats , shotwell is useful piece of free-software , a successful Vala project!!

    good work yorba, keep it up!

  2. It clocks in at 59552 source lines of Vala, according to this. That seems pretty reasonable considering how much functionality Shotwell provides.

    Love the work you guys are doing.

Leave a Reply

Your email address will not be published. Required fields are marked *