Source of the Life

I found this sign in a park in Beijing:

Mystery sign

I tried to follow it to see what it meant, but it turned out to be a dangling pointer.

Themes are Evil!

…although I am willing to entertain the idea that they are a necessary
evil.

Themes in GTK+ land are, simplified, arbitrary pieces of dynamically
loaded code that is hooked into the drawing layers of widgets. There
is no fault isolation, so when (not if) a theme does something naughty,
the application crashes. Then the user curses a few times and blames
the usual suspects: the screen/keyboard and the application.

Have a look at bug 438456. Theme code used in, for example, the industrial theme released various
kind of memory allocated by GTK+ using the wrong function, in this case g_free. Result: instant memory corruption in all GTK+ applications! Not good.

I do not think it is realistic to redesign the theming of GTK+ at
this point in time, but we really need to do something about
stability.

  • Theme authors need to be extremely careful with their coding.
  • Theme authors should use the full range of automated testing
    at our disposal: running under Valgrind and separately with G_SLICE=debug-blocks during some kind of theme torture test is a bare minimum. For every release.

I realize that I am asking other people to do some work. I do not feel too bad about that, though, because it is theme code that is causing my application to crash, causing people to lose data, and the blame is sent my way.

Ok, go ahead and tell me how wrong I am.

Hidden Message

When your four-year old asks you to shave and your six-year old asks
you to dress up, there is some kind of hidden message.

Formatting Numbers

I have spent a few evenings working on Gnumeric‘s number formatting,
i.e., the process that takes a value (3.14, “xyz”, TRUE, …) and
a format (an object initialised from a string like “[red]0.00”)
and use them to produce the string displayed in a spreadsheet cell.

Format strings are, if the user gets near them, an unmitigated GUI
disaster. How about this beau?

  dd-mmmm-yyyy[$-40b]/dd-mmmm-yyyy[Whitestone"76]*;;0/128[Blue]

(Which means typeset a non-negative number, representing a date, twice, once with month in the current langugage and once in Finnish. If there is room leftover in the cell, fill on the right side with semicolons. Oh,
and make it all white. Negative numbers, however, should be written in blue as the nearest 128th, without the minus. Non-numbers should be left as-is.)

Excel actually exposes hexadecimal numbers there! And
the parsing rules are really complicated and very much undocumented.
Well, it is documented in a variety of places, but the documentation is always combinations of wrong and incomplete.
I doubt anyone currently at Microsoft knows the details at this point
in time, but they can at least look at the source code. And format
strings can be translated (back and forth) in undocumented ways too.
Ick.

Anyway, I have been compiling a test workbook for formats. It uses the TEXT function which conveniently exposes most of the formatting logic. (Note: you must run in the US locale as many tests depend on that.)
Think of the file as a collection of horrors.
With my (unpublished) code, the score is:

Gnumeric: Pass: 606; Fail: 0
Excel: Pass 594; Fail: 12
OOo: Pass: 221; Fail: 69788

It is important to understand somethings here:

  • Excel can be wrong even though it is nominally defining the semantics. Most of the failures are avoidable overflows in fraction formats.
  • The workbook was not written to make Gnumeric look good. It was written as a tool to help Gnumeric become good. And, in fact, if you loaded the file in older Gnumerics, you would see less than stellar results. Prior to version 1.7.7, Gnumeric would even read memory beyond the end of strings and thus possibly crash or, more likely, produce bogus results.
  • The workbook was not written to make OO look bad. The fact that Gnumeric appears better is not only that I fixed Gnumeric, but also that I can only test the things I can think of. There might very well be formats that OO handles and Gnumeric does not. That is the problem with a basically undocumented language. Further, one problem might very well result in five or ten tests failing — things are not independent.
  • The weird failure count for OO comes from array formulas that OO cannot handle. 🙂 At least one failure comes from incorrectly loading the constant to check against.

Shark!

A shark, a lot of little fish, and a few trees in one picture:

Shark, fish, tree

Taken at “Atlantis”, Paradise Island, The Bahamas.

Scary Git Grep

I wanted to know when a certain identifier was introduced in goffice.
It turned out that it was fairly simple:

  git grep -w GO_FORMAT_MARKUP `git tag -l 'GOFFICE*'` -- '*.h'

That commands searches all .h files from all Goffice releases.
It is very fast: about 0.5s (cpu time) on a fairly slow machine.

How do I do that with SVN?

10x+ Better Compression Than Gzip

I wanted to create an archive of all released Gnumeric versions.
Gnumeric’s CVS tree saw a lot of hacking on the ,v files so neither
CVS nor the derived SVN tree are useful for reconstructing past
releases. They are useful for tracking a given file’s history
minus the renames it went through.

So I hacked up a script to create a git archive for me. (You cannot actually run that script, though: it hits a “tar” bug — ick! And after hacking that, beware that it takes a long, long time to run.)

Total size of 172 tar files: 1508026377 bytes.
Total size of git archive: 139733921 bytes
Ratio: 10.8

Not too shabby, eh? Even if the corpus is pretty special.

Seeing what changed between releases is as
simple as git diff -u GNUMERIC_1_7_0..GNUMERIC_1_7_1
and very fast.

SVN

I’ll happily add myself to the chorus: switching to SVN? Someone must
have been drinking out of the potty.

If we are going to suffer the pain of a conversion, then we have better get a lot out of it.

  • Retraining. We all know cvs’ quirks. (And my fingers like to type c-v-s.)
  • Lost history. None of the conversion tools are perfect.
  • Recoding. Custom scripts will have to be recoded.

These apply to all systems (except cvs). The trouble with SVN
is that it is hard to see it as more than a stepping stone.

Dave, if the only argument that people have against it is that “everyone knows that distributed’s better” is what you believe, then you have not been listening. Start with KeithP’s write-up saying, in part, that (1) SVN lacks corruption detection,
(2) Git is a whole lot faster than SVN, and (3) SVN is a space pig compared to Git.

By the way, why is it relevant whether most distributions include the system by default? We expect, from time to time, that developers have HEAD versions of various off-site libraries, so surely we can get the same developers to obtain a recent copy of Git (etc.), right?

So can we please have git and gitweb running somewhere on gnome.org
with access to, say, ~user/public_git/ directories? I promise that I will not complain too much about SVN if that happens.

f-spot

I have been trying out f-spot recently and I must say that using
it gives me one of those rare warm fuzzy feelings inside.

It has well designed interface with relatively little clutter.
It by and large does what I want it to.
And it does not crash or hang for me.
It does not even spew a lot of scary warnings in my session log file.

I can find nits, sure I can. In fact I just filed a pile of them, but we are talking the would-be-nice department here.

Nice job on this, guys!

Testing is not an Option!

I released Gnumeric 1.7.3
only to discover that a little too much editing killed evaluation in
very common situations. Bad me! 1.7.4 is out.

That is not going to happen again.

I sat down and spent a few hours automating most of our tests. Then
I added a valgrind run and the beginning of tests of our importers.
It is part of “make distcheck”, so testing is now mandatory and automatic.

The workhorse of these tests is ssconvert, our handy little command-line utility that converts from one format to another.
By forcing evaluation of all cells between import and export,
we end up exercising quite a large part of the core. As well
as a few importers and exporters. No GUI tests are currently performed, but I suspect we can add that
too somehow.