Archive for the ‘Gnumeric’ Category

Themes Are Evil, Part II

Saturday, February 2nd, 2008

In a previous post, I showed how a GTK+ theme engine can corrupt memory of any application unfortunate enough to be used with it.

In today’s edition, our guest star is the Qt theme engine. It does not, as far as I know, corrupt your memory or otherwise make your innocent application crash.[*] Instead it changes how your program works. For example, for Gnumeric it changes how imported numbers are handled.

If you import the number “8,5” in a decimal-comma locale then you would hope to get eight-and-a-half, right? Well, with the Qt theme you get eight, and we, the Gnumeric team, look incompetent. The problem arises because the Qt theme, quite reasonably, initializes the Qt library. During that, less reasonably, the following code gets executed:

setlocale( LC_ALL, "" ); // use correct char set mapping
setlocale( LC_NUMERIC, "C" ); // make sprintf()/scanf() work

I am not kidding. The Qt library thinks it should change your locale. What on Earth have the Trolls been drinking? Impure home-distilled booze in large quantities?

This problem, in various disguises, has had us puzzled for quite a while, and only very recently was the Qt theme identified as the triggering factor. Once that happened, it was not too hard to locate, but before that we had spent maybe 40 hours looking for this bug. The workaround is to set up a one-shot idle handler that resets the locale properly when the GUI comes up. (Repeat this for every GTK+ program that displays or accepts floating-point values.)
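The core of such a reset can be sketched in plain C. This is only an illustration, not Gnumeric's actual code: rude_library_init is a hypothetical stand-in for the Qt initialization quoted above, and the save/restore helpers are names made up for this sketch. In a GTK+ program the restore step would be what the one-shot idle handler runs.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a library initializer that changes the
 * process locale, mirroring the two calls quoted above. */
static void rude_library_init(void)
{
    setlocale(LC_ALL, "");
    setlocale(LC_NUMERIC, "C");
}

/* Heap copy of the current LC_NUMERIC name, or NULL on failure.  A copy
 * is needed because a later setlocale() call may overwrite the string
 * that setlocale() returned. */
static char *save_numeric_locale(void)
{
    const char *cur = setlocale(LC_NUMERIC, NULL);
    char *copy = cur ? malloc(strlen(cur) + 1) : NULL;
    if (copy)
        strcpy(copy, cur);
    return copy;
}

/* Re-apply a saved LC_NUMERIC name and free it.  Returns 1 on success. */
static int restore_numeric_locale(char *saved)
{
    int ok = saved != NULL && setlocale(LC_NUMERIC, saved) != NULL;
    free(saved);
    return ok;
}

/* Usage: the numeric locale survives the initializer. */
static void locale_demo(void)
{
    setlocale(LC_ALL, "");               /* what the application asked for */
    char *before = save_numeric_locale();
    char *saved = save_numeric_locale();
    assert(before != NULL && saved != NULL);

    rude_library_init();                 /* LC_NUMERIC is now "C" */
    int ok = restore_numeric_locale(saved);
    assert(ok);
    (void)ok;
    assert(strcmp(setlocale(LC_NUMERIC, NULL), before) == 0);
    free(before);
}
```

A simpler variant would be to call setlocale(LC_ALL, "") again instead of saving and restoring. Either way, the footnote's caveat applies: setlocale is not safe to call once other threads are running.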

The Qt theme people never caught this. If they are mostly “theme” people I can understand, but if they are mostly “Qt” people they really should have known. In either case, it is another exhibit for the case that the GTK+ theme model is seriously flawed.

[*] Well, if you use threads it might. The Qt library calls setlocale to change locale and that’s not allowed in a threaded program.

OOXML vs ODF

Tuesday, September 11th, 2007

I had a look at “OOXML is defective by design” and, quite frankly, I am not impressed.

On the surface it is a comparison of OOXML and ODF, and it comes out as a landslide victory for ODF. But anyone who has worked with spreadsheet file formats will easily see that it was written by someone who, intentionally or otherwise, is deaf, dumb, and blind to the shortfalls of ODF. And if that is where you start, then what is the point?

For example, OOcalc suffers from exactly the same rounding issue that Excel does. How could it be any different when both are based on floating-point numbers? (And in neither case is that a file format issue, but rather an implementation issue.)
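To see why any program built on binary floating point shares this behaviour, here is a minimal sketch assuming IEEE 754 double precision. It is not the specific case from the article, just the classic illustration; the function names are made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if 0.1 + 0.2 compares equal to 0.3 as doubles, else 0.
 * volatile keeps the compiler from evaluating the sum at a higher
 * precision than double. */
static int tenth_sum_is_exact(void)
{
    volatile double a = 0.1, b = 0.2;
    volatile double sum = a + b;
    return sum == 0.3;
}

/* Write the sum with 17 significant digits, which is enough to tell
 * any two distinct doubles apart. */
static void show_tenth_sum(char *buf, size_t size)
{
    volatile double a = 0.1, b = 0.2;
    volatile double sum = a + b;
    snprintf(buf, size, "%.17g", sum);
}

/* Usage: print the sum and check that it is not exactly 0.3. */
static void rounding_demo(void)
{
    char buf[32];
    show_tenth_sum(buf, sizeof buf);
    assert(!tenth_sum_is_exact());
    printf("0.1 + 0.2 = %s\n", buf);
}
```

Neither 0.1 nor 0.2 has an exact binary representation, so the error is there before any file format gets involved.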

For example, the reason that he can happily declare that ODF has backwards compatibility is that he chooses a graph sample. And OOcalc’s graphing system has not, shall we say, seen a lot of improvement since the version he tried with.

Don’t complain that “ECMA 376 documents just do not exist” when the same can be said for ODF. As of version 1.1 of the specification, there still seems to be no syntax for 2+2.

One could also ask a question such as “how well can legacy spreadsheet files be represented in either format?” A very reasonable question, in my humble opinion, given the number of sheets out there. Of course, since ODF doesn’t actually have non-trivial formulas, we should probably just interpret it with respect to OOcalc’s format. I do not think ODF would fare well here.

Disclaimer: I have not, and I probably will not, read the full OOXML spec.

Formatting Numbers

Tuesday, February 20th, 2007

I have spent a few evenings working on Gnumeric’s number formatting,
i.e., the process that takes a value (3.14, “xyz”, TRUE, …) and
a format (an object initialised from a string like “[red]0.00”)
and uses them to produce the string displayed in a spreadsheet cell.

Format strings are, if the user gets near them, an unmitigated GUI
disaster. How about this beauty?

  dd-mmmm-yyyy[$-40b]/dd-mmmm-yyyy[Whitestone"76]*;;0/128[Blue]

(Which means: typeset a non-negative number, representing a date, twice, once with the month in the current language and once in Finnish. If there is room left over in the cell, fill on the right side with semicolons. Oh, and make it all white. Negative numbers, however, should be written in blue as the nearest 128th, without the minus. Non-numbers should be left as-is.)
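As a rough illustration of just the “nearest 128th, without the minus” part, here is a sketch in C. It is not Gnumeric's or Excel's code: format_128ths is a hypothetical helper, the "N/128" output shape is my assumption about how such a fraction is written, and colours, fill characters, and everything else in the format string are ignored.

```c
#include <assert.h>
#include <stdio.h>

/* Write |x| rounded to the nearest 128th as "N/128" into buf.
 * Returns 0 on success, -1 if the value is not finite, is too large to
 * convert safely, or does not fit in buf. */
int format_128ths(double x, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);

    double scaled = (x < 0 ? -x : x) * 128.0;

    /* Refuse NaN, infinity, and anything at or above 2^52 rather than
     * overflowing the integer conversion below. */
    if (!(scaled < 4503599627370496.0))
        return -1;

    long long whole = (long long)scaled;    /* truncates toward zero */
    if (scaled - (double)whole >= 0.5)      /* round half up */
        whole++;

    int n = snprintf(buf, size, "%lld/128", whole);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With this sketch, format_128ths(-0.5, buf, sizeof buf) leaves "64/128" in buf. The explicit range check is the design point: a huge input is rejected instead of being pushed through an integer conversion that cannot hold it.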

Excel actually exposes hexadecimal numbers there! And
the parsing rules are really complicated and very much undocumented.
Well, it is documented in a variety of places, but the documentation is always some combination of wrong and incomplete.
I doubt anyone currently at Microsoft knows the details at this point
in time, but they can at least look at the source code. And format
strings can be translated (back and forth) in undocumented ways too.
Ick.

Anyway, I have been compiling a test workbook for formats. It uses the TEXT function which conveniently exposes most of the formatting logic. (Note: you must run in the US locale as many tests depend on that.)
Think of the file as a collection of horrors.
With my (unpublished) code, the score is:

Gnumeric: Pass: 606; Fail: 0
Excel: Pass: 594; Fail: 12
OOo: Pass: 221; Fail: 69788

It is important to understand some things here:

  • Excel can be wrong even though it is nominally defining the semantics. Most of the failures are avoidable overflows in fraction formats.
  • The workbook was not written to make Gnumeric look good. It was written as a tool to help Gnumeric become good. And, in fact, if you loaded the file in older Gnumerics, you would see less than stellar results. Prior to version 1.7.7, Gnumeric would even read memory beyond the end of strings and thus possibly crash or, more likely, produce bogus results.
  • The workbook was not written to make OO look bad. The fact that Gnumeric appears better is not only that I fixed Gnumeric, but also that I can only test the things I can think of. There might very well be formats that OO handles and Gnumeric does not. That is the problem with a basically undocumented language. Further, one problem might very well result in five or ten tests failing — things are not independent.
  • The weird failure count for OO comes from array formulas that OO cannot handle. :-) At least one failure comes from incorrectly loading the constant to check against.

10x+ Better Compression Than Gzip

Sunday, January 7th, 2007

I wanted to create an archive of all released Gnumeric versions.
Gnumeric’s CVS tree saw a lot of hacking on the ,v files so neither
CVS nor the derived SVN tree is useful for reconstructing past
releases. They are useful for tracking a given file’s history
minus the renames it went through.

So I hacked up a script to create a git archive for me. (You cannot actually run that script, though: it hits a “tar” bug — ick! And after hacking that, beware that it takes a long, long time to run.)

Total size of 172 tar files: 1508026377 bytes.
Total size of git archive: 139733921 bytes
Ratio: 10.8

Not too shabby, eh? Even if the corpus is pretty special.

Seeing what changed between releases is as
simple as git diff -u GNUMERIC_1_7_0..GNUMERIC_1_7_1
and very fast.

Testing is not an Option!

Wednesday, November 22nd, 2006

I released Gnumeric 1.7.3
only to discover that a little too much editing killed evaluation in
very common situations. Bad me! 1.7.4 is out.

That is not going to happen again.

I sat down and spent a few hours automating most of our tests. Then
I added a valgrind run and the beginning of tests of our importers.
It is part of “make distcheck”, so testing is now mandatory and automatic.

The workhorse of these tests is ssconvert, our handy little command-line utility that converts from one format to another.
By forcing evaluation of all cells between import and export,
we end up exercising quite a large part of the core. As well
as a few importers and exporters. No GUI tests are currently performed, but I suspect we can add that
too somehow.

A Bugfix A Day…

Friday, May 5th, 2006

It has been a while since I have poured some water out of my
ears here, but I have been busy. A couple of months ago I decided
to fix a Gnumeric bug per day. And I have by and large kept that up,
and our NEWS has been growing like weeds.

Note that there are huge differences in the amount of work behind
the items listed. “Allow ={+42}” was a trivial one-liner, while
“Introduce top-level expressions” was a massive and
intrusive patch.

But I am running out of little issues to fix on lazy days and I
rarely have any significant amount of time during the week.

One Point Six

Monday, October 10th, 2005

It still lacks a catchy nickname, but
Gnumeric version 1.6.0
is out.

We also ought to cook up a proper stable series announcement, of course.

Bug? There is no bug!

Friday, September 9th, 2005

So what do you do when distributions (Mandriva, Ubuntu, and Gentoo so far)
send out
security fixes for a non-problem?

Ignore it? Certainly an option.

Spam the security lists with “that’s not a problem”?

Namespaces

Thursday, August 25th, 2005

Gnumeric’s solver was broken in HEAD and while fixing it, I
updated to the latest version of lp_solve.

Let me tell you, lp_solve is a prime example of how not to make
a library! It looks like there used to be a program and that it
was made into a library by removing main.

There is no concept of namespaces there. When you include the
relevant header file, you get everything used anywhere internally:
EQ, gcd, MALLOC, TRUE, is_int, and about 400-600
other identifiers.

You cannot isolate that problem to just where you use the header,
by the way, as static is practically unused.

I decided to throw a perl script at the problem and combine everything into one
giant C file. All 44186 lines of it after pruning about 5000 lines.
The script adds tons of statics in the process,
renames the relevant part of the API, and extracts
that API. Extra points for you if you can read the perl script
without losing your breakfast.
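For contrast, here is a minimal sketch of what the result aims for, in C. The names are hypothetical and not taken from lp_solve or from the script: helpers are static, so they stay private to the one C file, and only a prefixed entry point is visible to the linker.

```c
#include <assert.h>
#include <stddef.h>

/* Internal helper: static gives it internal linkage, so this gcd
 * cannot collide with a gcd defined by the application or by another
 * library. */
static int gcd(int a, int b)
{
    while (b != 0) {
        int t = a % b;
        a = b;
        b = t;
    }
    return a < 0 ? -a : a;
}

/* Public entry point, prefixed with a hypothetical library name.
 * Reduces the fraction *num / *den in place.  Returns 1 on success,
 * 0 if both parts are zero. */
int mylp_reduce(int *num, int *den)
{
    assert(num != NULL && den != NULL);

    int g = gcd(*num, *den);
    if (g == 0)
        return 0;
    *num /= g;
    *den /= g;
    return 1;
}
```

A program linking against this file sees mylp_reduce and nothing else, and it remains free to define its own gcd, EQ, or TRUE.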

The Cat’s Out…

Thursday, August 18th, 2005

…and so is Gnumeric 1.5.3. (Complete with a big, ugly, but harmless
error message on xls save — oops! At least we got it fixed in time
for Debian and the Win32 build.)

Home;
Source;
Release notes;
Changes.