Hidden Message

March 31st, 2007 by mortenw

When your four-year-old asks you to shave and your six-year-old asks
you to dress up, there is some kind of hidden message.

Formatting Numbers

February 20th, 2007 by mortenw

I have spent a few evenings working on Gnumeric’s number formatting,
i.e., the process that takes a value (3.14, “xyz”, TRUE, …) and
a format (an object initialised from a string like “[red]0.00”)
and uses them to produce the string displayed in a spreadsheet cell.

Format strings are, if the user gets near them, an unmitigated GUI
disaster. How about this beau?

  dd-mmmm-yyyy[$-40b]/dd-mmmm-yyyy[Whitestone"76]*;;0/128[Blue]

(Which means: typeset a non-negative number, representing a date, twice, once with the month in the current language and once in Finnish. If there is room left over in the cell, fill on the right side with semicolons. Oh,
and make it all white. Negative numbers, however, should be written in blue as the nearest 128th, without the minus. Non-numbers should be left as-is.)

Excel actually exposes hexadecimal numbers there! And
the parsing rules are really complicated and very much undocumented.
Well, they are documented in a variety of places, but the documentation is always some combination of wrong and incomplete.
I doubt anyone currently at Microsoft knows the details at this point
in time, but they can at least look at the source code. And format
strings can be translated (back and forth) in undocumented ways too.
Ick.
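
To make one small piece of that format concrete, here is a minimal Python sketch of what the last section (0/128) asks for: the magnitude written as a numerator over a fixed denominator of 128, with no minus sign. The function name is made up for illustration, and this is only a reading of the description above, not Gnumeric's or Excel's actual code. How exact halves get rounded is an assumption.

```python
def format_as_128ths(value: float) -> str:
    """Render the magnitude of value as 'N/128', dropping any sign.

    Illustrative only: the real formatters also handle colours, fill
    characters, overflow and much more.
    """
    # Python's round() sends exact halves to the nearest even integer;
    # whether a spreadsheet does the same is an assumption here.
    numerator = round(abs(value) * 128)
    return f"{numerator}/128"


if __name__ == "__main__":
    for sample in (-0.5, -1.5, -0.3):
        print(sample, "->", format_as_128ths(sample))
```

Even this toy version hints at where overflows can creep in: the numerator grows 128 times faster than the value being formatted.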

Anyway, I have been compiling a test workbook for formats. It uses the TEXT function which conveniently exposes most of the formatting logic. (Note: you must run in the US locale as many tests depend on that.)
Think of the file as a collection of horrors.
With my (unpublished) code, the score is:

Gnumeric: Pass: 606; Fail: 0
Excel: Pass: 594; Fail: 12
OOo: Pass: 221; Fail: 69788

It is important to understand some things here:

  • Excel can be wrong even though it is nominally defining the semantics. Most of the failures are avoidable overflows in fraction formats.
  • The workbook was not written to make Gnumeric look good. It was written as a tool to help Gnumeric become good. And, in fact, if you loaded the file in older Gnumerics, you would see less than stellar results. Prior to version 1.7.7, Gnumeric would even read memory beyond the end of strings and thus possibly crash or, more likely, produce bogus results.
  • The workbook was not written to make OO look bad. The fact that Gnumeric appears better is not only that I fixed Gnumeric, but also that I can only test the things I can think of. There might very well be formats that OO handles and Gnumeric does not. That is the problem with a basically undocumented language. Further, one problem might very well result in five or ten tests failing — things are not independent.
  • The weird failure count for OO comes from array formulas that OO cannot handle. :-) At least one failure comes from incorrectly loading the constant to check against.

Shark!

January 21st, 2007 by mortenw

A shark, a lot of little fish, and a few trees in one picture:

[Image: Shark, fish, tree]

Taken at “Atlantis”, Paradise Island, The Bahamas.

Scary Git Grep

January 21st, 2007 by mortenw

I wanted to know when a certain identifier was introduced in goffice.
It turned out that it was fairly simple:

  git grep -w GO_FORMAT_MARKUP `git tag -l 'GOFFICE*'` -- '*.h'

That command searches all .h files from all Goffice releases.
It is very fast: about 0.5s (cpu time) on a fairly slow machine.

How do I do that with SVN?

10x+ Better Compression Than Gzip

January 7th, 2007 by mortenw

I wanted to create an archive of all released Gnumeric versions.
Gnumeric’s CVS tree saw a lot of hacking on the ,v files so neither
CVS nor the derived SVN tree are useful for reconstructing past
releases. They are useful for tracking a given file’s history
minus the renames it went through.

So I hacked up a script to create a git archive for me. (You cannot actually run that script, though: it hits a “tar” bug — ick! And after hacking that, beware that it takes a long, long time to run.)

Total size of 172 tar files: 1508026377 bytes.
Total size of git archive: 139733921 bytes
Ratio: 10.8

Not too shabby, eh? Even if the corpus is pretty special.
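
For anyone who wants to check the arithmetic, the ratio is just the two byte counts above divided. A tiny Python sketch (the variable names are only for illustration):

```python
# Byte counts as quoted in the post.
tar_bytes = 1508026377  # total size of the 172 tar files
git_bytes = 139733921   # total size of the git archive

ratio = tar_bytes / git_bytes
print(f"Ratio: {ratio:.1f}")
```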

Seeing what changed between releases is as simple as

  git diff -u GNUMERIC_1_7_0..GNUMERIC_1_7_1

and very fast.

SVN

December 26th, 2006 by mortenw

I’ll happily add myself to the chorus: switching to SVN? Someone must
have been drinking out of the potty.

If we are going to suffer the pain of a conversion, then we had better get a lot out of it.

  • Retraining. We all know cvs’ quirks. (And my fingers like to type c-v-s.)
  • Lost history. None of the conversion tools are perfect.
  • Recoding. Custom scripts will have to be recoded.

These apply to all systems (except cvs). The trouble with SVN
is that it is hard to see it as more than a stepping stone.

Dave, if you believe that the only argument people have against it is that “everyone knows that distributed’s better”, then you have not been listening. Start with KeithP’s write-up saying, in part, that (1) SVN lacks corruption detection,
(2) Git is a whole lot faster than SVN, and (3) SVN is a space pig compared to Git.

By the way, why is it relevant whether most distributions include the system by default? We expect, from time to time, that developers have HEAD versions of various off-site libraries, so surely we can get the same developers to obtain a recent copy of Git (etc.), right?

So can we please have git and gitweb running somewhere on gnome.org
with access to, say, ~user/public_git/ directories? I promise that I will not complain too much about SVN if that happens.

f-spot

December 5th, 2006 by mortenw

I have been trying out f-spot recently and I must say that using
it gives me one of those rare warm fuzzy feelings inside.

It has a well-designed interface with relatively little clutter.
It by and large does what I want it to.
And it does not crash or hang for me.
It does not even spew a lot of scary warnings in my session log file.

I can find nits, sure I can. In fact I just filed a pile of them, but we are talking the would-be-nice department here.

Nice job on this, guys!

Testing is not an Option!

November 22nd, 2006 by mortenw

I released Gnumeric 1.7.3
only to discover that a little too much editing killed evaluation in
very common situations. Bad me! 1.7.4 is out.

That is not going to happen again.

I sat down and spent a few hours automating most of our tests. Then
I added a valgrind run and the beginning of tests of our importers.
It is part of “make distcheck”, so testing is now mandatory and automatic.

The workhorse of these tests is ssconvert, our handy little command-line utility that converts from one format to another.
By forcing evaluation of all cells between import and export,
we end up exercising quite a large part of the core, as well
as a few importers and exporters. No GUI tests are currently performed, but I suspect we can add those
too somehow.
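
As a rough idea of what one such test step might look like, here is a minimal Python sketch that drives ssconvert from a script. It is not the actual test harness. The helper names are invented, and the --recalc option is an assumption about how to force evaluation between import and export, so check it against ssconvert --help for your version.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source, target):
    """Build the command line for one import/export round trip.

    The --recalc flag is assumed here to force recalculation of all
    cells before the result is written.
    """
    return ["ssconvert", "--recalc", str(source), str(target)]


def roundtrip_ok(source, target) -> bool:
    """Run the conversion; report whether it exited cleanly and wrote output."""
    if shutil.which("ssconvert") is None:
        raise RuntimeError("ssconvert is not installed or not on PATH")
    result = subprocess.run(
        build_convert_command(source, target),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0 and Path(target).exists()
```

A real test would go on to compare the exported file against a known-good copy, since a clean exit alone says little about whether evaluation produced the right values.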

Hungry (or Greedy)?

November 16th, 2006 by mortenw

My company is hiring. Note that
the jobs have nothing to do with open software and that we are rather
picky.

Gnome Terminal

November 14th, 2006 by mortenw

Until recently I have been using xterm and twm. Call me a stone-age
throw-back if you like, but those two are fast and lean.

But I am giving gnome-terminal and metacity a spin now. Metacity
suffers somewhat from click-to-focus, even if you set
it to focus-follows-mouse, but that is a story for another day.

Today it is gnome-terminal gripe time. Ok, so it is actually not
too bad, but it feels slow. Starting the first terminal window,
for example, takes a few seconds. Seconds! I want my window to
show up in less than .2 seconds.

A bit of stracing reveals the cause. It is dlopen-ing a zillion modules related to character support.
They are used to populate what is, I estimate, a rarely used menu. In my humble opinion, the penalty for this kind of thing should not be paid on startup, but when the rarely-used feature is activated. Or in the background while I type away happily. Luckily this does not seem
too hard to fix.
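
The general idea of paying only when the feature is used can be sketched in a few lines of Python. This has nothing to do with gnome-terminal's actual code. The class and the loader callback are hypothetical stand-ins for "the menu" and "dlopen-ing all those modules".

```python
from functools import cached_property


class CharacterMenu:
    """Hypothetical menu whose entries are expensive to compute."""

    def __init__(self, loader):
        # Only remember how to load; do no work at construction time.
        self._loader = loader

    @cached_property
    def entries(self):
        # Runs on first access only; the result is cached afterwards.
        return self._loader()


if __name__ == "__main__":
    menu = CharacterMenu(lambda: ["UTF-8", "ISO-8859-1"])  # cheap to create
    print(menu.entries)  # the loader runs here, not at startup
```

Startup then costs nothing for the menu, and the first person to open it pays once.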

On top of this, Fontconfig is doing a whole lot of work on startup.
It does that for other GTK+ based programs too, so I doubt the
applications are to blame. So what does it do?

  • Stats ~200 directories all over the place.
  • Feels the need to do things three times:
    access("/etc/fonts/suse-hinting.conf", R_OK) = 0
    stat64("/etc/fonts/suse-hinting.conf", {st_mode=S_IFREG|0644, st_size=6575, ...}) = 0
    open("/etc/fonts/suse-hinting.conf", O_RDONLY) = 18
    

    Are you there? Are you there? Please open! This kind of behaviour
    is fairly cheap on a local file system, but not necessarily so on
    a remote one. And it is wrong: the file could have changed twice between the three calls.

  • Zig-zag reading files:
    read(17, "     fd9 78563412    1    4    4"..., 255) = 255
    lseek(17, -136, SEEK_CUR)               = 158
    read(17, "/usr/X11R6/lib/X11/fonts/CID/us"..., 4109) = 3938
    lseek(17, -3909, SEEK_CUR)              = 187
    read(17, "/usr/X11R6/lib/X11/fonts/URW/us"..., 4109) = 3909
    lseek(17, -3880, SEEK_CUR)              = 216
    read(17, "/usr/X11R6/lib/X11/fonts/uni/us"..., 4109) = 3880
    lseek(17, -3851, SEEK_CUR)              = 245
    

    Back-and-forth over the same area again and again. Short of timing
    system call overhead I do not see what it is doing.
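
For contrast, here is a minimal Python sketch of the pattern one might hope for instead: a single open whose failure answers the "are you there?" question, an fstat on the descriptor that was actually opened, and one front-to-back buffered read with no seeking backwards. The function name is made up, and this says nothing about how Fontconfig is structured internally.

```python
import os


def read_config_once(path):
    """Return (size, lines) for path, or None if it cannot be opened."""
    try:
        # One call both checks and opens, so nothing can change in between.
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return None
    with os.fdopen(fd, "rb") as stream:
        # Stat the open descriptor rather than the path name.
        info = os.fstat(stream.fileno())
        # Iterating a buffered reader moves forward through the file only.
        lines = [line.rstrip(b"\n") for line in stream]
    return info.st_size, lines
```

On a remote file system each avoided call is a round trip saved.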

I am also having problems with focus in the terminal. I am not entirely sure how I manage, but
somehow the focus ends up on one of the tabs where it has no business being.