ODF vs OOX: Asking the wrong questions

Recent posts from Brian Jones (Microsoft) and Robert Weir (IBM) highlight the strange times we live in. A squadron of flying pigs is out doing loops. On one hand you have MS documenting their file format and providing positive press for competitors. On the other you’ve got IBM paying full-time staff to critique development versions of Gnumeric. I’d like to thank both authors for the publicity and the triaging, but I also want to point out that IMO their examples are looking at things the wrong way.

Brian’s example of Numbers reading an OOX file written by Gnumeric could just as easily have been an XLS file. Indeed, that would likely have had better support on both ends. The binary format was poorly documented and a miserable pain to read, but it has been around a long time and is the dominant interchange format. All spreadsheets needed to support it.

Rob heads off in another direction with his examination of Gnumeric’s limited support for xlsx, on both import and export. What he neglects to consider is the amount of effort required. The initial importer was written on the flight to London for the ECMA meeting, and export was added on the flight back. Toss in a few hours of debugging, and reading and writing the sample file from Brian’s example took under a week of effort. After reading his post I added basic chart import and export the following day.

[Figure: Basic XLSX chart import]

This was several orders of magnitude simpler than writing the binary filters, which required parsers for OLE2, BIFF, and binary expressions just to get far enough to start reading the format details. However, even that is not the most salient question to ask. The core of this argument is the OOX vs ODF showdown, so what implementers really want to know is:

How hard was it to implement ODF support compared to OOX?

It was significantly more difficult. To be clear, ODF support was nowhere near as much work as the old binary filters; we are talking about XML here. However, while import filters start with parsing the structure, in the end extracting the basic state is no more than the ante for the real work. You need to handle the impedance mismatches between the concepts in the file format and your implementation. ODF’s model of ‘chartness’ didn’t fit well with Gnumeric. In contrast, XLSX may be ugly, but its concepts were very familiar from XLS. We already had much of the code required to handle it.

I suspect most spreadsheet implementers are in the same position.