OpenDocument for Spreadsheets

From time to time I am asked if Gnumeric supports or will support the OpenDocument standard. I am sad to report that the question is meaningless. The purpose of a standard ought to be interoperability; this standard is not a step in that direction.

Executive summary: The OpenDocument Standard v1.0 completely ignores semantics of spreadsheets, thus virtually ensuring that if two implementations of the standard existed then they could not use each other’s spreadsheets. That, in turn, renders the alleged standard pointless for spreadsheets.

The OpenDocument standard v1.0 is 708 pages long. You would think that in such an elephant they could find the space to hammer out a usable spec for spreadsheets. But no, they do not.

Formulas (or formulae if you are feeling learned) is an important part of what goes into spreadsheets. What does the alleged standard have to say about those? Basically you go to page 184-186 and find the two (equivalent) formula examples that it has:

  =sum([.A1:.A5])
  =sum([.A1];[.A2];[.A3];[.A4];[.A5])

Pedantics — that would be me — would point out that both these fail to comply with the formula requirement “Every formula should begin with a namespace prefix specifying the syntax and semantics used within the formula.” Hmmm… Or maybe “=” is that prefix and anyone seeing it will magically understand the syntax and semantics.

Regardless, we learn

  • the existence of a function named sum. I hope you can make good use of it, because it is the only function we get to know about. (There is a handful more in contexts unrelated to formulas.)
  • that there exists operators. Which ones seems to be a secret, apart from the fact that some of them are logical.
  • that numbers can be part of formulas. We will have to assume that regular C-like syntax for them is used.
  • that strings can be part of formulas. There is no clue about how to write them; I will guess that you put them in quotes, but what about embedded quotes, then?

There is what appears to be the beginning of a value system on pages 184-185. You get numbers, dates, times, booleans, and strings. Notice the absence of error values and “empty”, let alone arrays. Excel has those, Gnumeric has them, and OOo-calc has at least error values. Thus there is no chance that someone following the OpenDocument standard can approximate Excel semantics, nor for that matter is there any chance that OOo-calc can implement the standard with its current value system.

It gets worse. Let us assume that a proper value system was in place. If you wanted to have any hope of achieving interoperability you would them have to define function (and operator) semantics for a set of functions large enough to be useful. Details matter. For example, for our friend sum above, Excel will completely ignore strings like “42” (with the quotes) in referenced cells, but it will convert them to numbers and include them in the sum when given as direct arguments to SUM. (Yes, that is sick.) In technical terms, Excel does not have referential integrity. OOo-calc lacks referential integrity too and in the direct-argument case produces an error, i.e., it is sick in a different way. (For completeness, Gnumeric ignores all strings for SUM; this might change.) As a cuter example, OOo-calc manages to evaluate “1”+1 to TRUE!

So there. As far as spreadsheets are concerned, the OpenDocument Standard v1.0 is the equivalent of giving precise punctuation rules for sentences without telling if it is for English, German, French, or something else.

Only vaguely related is the question of whether Gnumeric should be able to read OOo-calc files. The answer is yes, and we do have some support for it. In practice it is probably easier to interoperate via the xls format.

[Formatting updated 2015]