Archive for the ‘Gnumeric’ Category

Writing Tests is Humbling

Tuesday, March 11th, 2014

I have spent some time recently writing import/export tests cases for Gnumeric. It is what you do when you see a mistake that it should not have been possible to make, in this case a hang when writing certain strings to the obsolete xls/biff7 format.

Writing these tests has been a very humbling experience. Highly recommended.

A lot of the code being subjected to tests is quite old: 10-15 years old. You would think that by now it would have had any obvious bugs beating out of it. No such luck. Not only were there ancient bugs — such as the direction of diagonal borders being flipped on load+save — but there were also bugs accidentally introduced when fixing other things.

Bugs happen, even where you think you are being careful. And that is not a problem. The problem arises when the bugs are not caught and make it into releases. This is where a brutal test suite is needed. Our test suite clearly was not evil enough for import and export of various formats, so I have been adding something I call round-trip tests for ods, xls/biff7, xls/biff8, and xlsx.

A round-trip test is a test that when we convert to a given format and back to our own Gnumeric format, nothing changes. Or, if something does change, then only parts we understand change. If the format is deficient, there is not much we can do. For example: ods cannot store patterned background or the sheet size; xlsx cannot store solver parameters; xls/biff7 cannot store arbitrary unicode; and xls/biff8 has a fixed sheet size of 64k-by-256. Note, that by itself a round-trip test does not guarantee that what we produce is correct. We could, hypothetically, have swapped division and multiplication are still gotten a perfect round-trip. To test that the generated files are correct one has to load the resulting sheet in Excel or LibreOffice (which, despite claims to the contrary, are what really defines xls/xlsx and ods formats). Unfortunately, I do not know how to script that so it is not automatic.

As a result of all the new tests, the recently released Gnumeric 1.12.12 should interact better with other spreadsheets.

Note: some of these new tests were probably a decade overdue. My excuse is that The Gnumeric Team is fairly small. I do hope that LO/OO already have an evil test suite, but I am not optimistic. I ran a few of my test sheets through LO and saw things like truncated strings.

Spreadsheets and the Command Line

Tuesday, January 1st, 2013

Spreadsheets are not the most obvious type of document to manipulate from the command line. They are essentially a visual tool meant for interactively exploring data. Experience has shown, however, that there are certain spreadsheet tasks for which the command line is very useful and Gnumeric supplies a number of command line tools for this.

  • ssconvert converts spreadsheets from one format to another, for example from xls to ods. That sounds fairly simple — load one, save the other — and it pretty much is although there are things such as merging several files or extracting parts of files that add a little complication. Since Gnumeric can save as pdf, this tool also allows command-line printing of spreadsheets.
  • ssgrep is like grep is for text files. And it has about the same set of options.
  • ssindex is used by things like tracker and beagle to find of pieces of text in spreadsheet files.
  • ssdiff is a new tool in the upcoming release. It compares two spreadsheets and outputs a list of differences between them. There are three output modes so far: (1) a text format, (2) an xml format, and (3) a mode that outputs a copy of one of the input files with differing cells marked in neon yellow.

None of these programs are big: 300-1100 lines of C code, ssdiff being the largest only because of its three output modes.

Where are the app developers?

Tuesday, June 26th, 2012

Jewelfox asks where the app developers for Gnome are.

My personal guess is that they have seen the incessant API breaks in the Gnome stack and have realized that making a Gnome application is task for Sisyphus.

For the record, the latest API break I am aware of is the Gtk+ change that broke scroll wheels.

Sorting Icons Theme Mess

Thursday, April 14th, 2011

In my long-running series on why themes are evil, I bring you the newest installment.

Consider the gtk stock icon GTK_STOCK_SORT_ASCENDING which is supposed to represent sorting elements to make them increasing according to some order, typically numerically or alphabetically. The icon for such an action is supposed to somehow convey what happens when it is pressed, all in, say, 24×24 pixels.

Take a look at different themes and how they implement the icon:

eog `find /usr/share/icons/ -print | grep sort-ascending`

This command will show you the icon images with some duplication due to multiple sizes.

Observations:

  • Some show an up-arrow, others show a down-arrow. Yet others show a diagonal arrow which isn’t as bad as it sounds because such arrows are annotated.
  • Some arrows have no annotations, some are annotated by “1..9”, and yet others are annotated to “a..z”.

Officially this is a mess[tm]. When annotations are present they hint at either numerical or alphabetical ordering which may or may not match what the application does. That’s minor. But when no annotations are present, the situation is far worse: my sort-ascending button looks like someone else’s sort-descending simply because of theme differences!

I don’t know how this mess came about, but it ought to be resolved. I suggest that when the icons look like vertical arrows, sort-ascending should point down because the elements of a list will then be increasing in the direction of the arrow.

Hunting Leaks in GTK+ Applications

Thursday, April 7th, 2011

Hunting leaks in GTK+ application used to be fairly simple: you would run your application under Purify (or, later, Valgrind) and the leak reports would pretty much tell you where to go plug.

That was a long time ago. In the meantime, GTK+ has gotten more complex over its iterations with more caches, more inter-object links, and deliberately-unfreed objects. On top of that, Valgrind and Purify are not particular well suited for finding the cause of leaks: by design they will tell you the backtrace of the call that allocated memory which was never leaked. In a ref-counted world that information is often quite insufficient: the leaked widget was allocated by the gui builder — oh goodie! What you really want to know is who holds the extra ref.

Enter the gobject debugger first introduced by Danielle here. After some major internal work, it has become mature.

I used this for Gnumeric, which was already one of the strictest leak-policed applications in GTK+ land. We leaked a number of GtkTreeModel/GtkListStore objects, for example. Easily fixed. Also, when touching print code or file choosers we leaked massively: one object per printer and/or two objects per file in the current directory. A sequence of bug reports (646815, 646462, 646446, 646461, 646460, 646458, 646457, and 645483) later, GTK+ is now behaving much better. All but the last of these have been fixed. With this, we are down to leaking about 20 Gtk-related objects: the recent-documents manager, im-module objects, the default icon factory, and theme engines. Basically stuff that GTK+ does not want to release.

Please try this on your applications, especially if they are long-running. I still have to kill gnome-terminal, banshee, metacity, and gvfsd from time to time when they grow to absurd sizes. That doesn’t have to be caused by GObject leaks, but it might very well. (I know some of these samples are obsolete; I am not naive enough to believe their replacements would fare any better.)

This might be a good time to remind people that g_object_get and gtk_tree_model_get will give you a reference when you use them to retrieve GObjects. You need to unref when done. The problem is that it is not immediately clear from the g_object_get/gtk_tree_model_get call whether existing code is getting objects, so a certain knowledge of the code is needed.

GHashTable Memory Requirements

Saturday, March 26th, 2011

Someone threw a 8-million cell csv file at Gnumeric. We handle it, but barely. Barely is better than LibreOffice and Excel if you don’t mind that it takes 10 minutes to load. And if you have lots of memory.

I looked at the memory consumption and, quite surprisingly, the GHashTable that we use for cells in sheet is on top of the list: a GHashTable with 8 million entries uses 600MB!

Here’s why:

  • We are on a 64-bit platform so each GHashNode takes 24 bytes including four bytes of padding.
  • At a little less than 8 million entries the hash table is resized to about 16 million entries.
  • While resizing, we therefore have 24 million GHashNodes around.
  • 24*24000000 is around 600M, all of which is being accessed during the resize.

So what can be done about it? Here are a few things:

  • Some GHashTables have identical keys and values. For such tables, there’s no need to store both.
  • If the hash function is cheap, there’s no need to keep the hash values around. This gets a little tricky with the unused/tombstone special pseudo-hash values used by the current implementation. It can be done, though.

I wrote a proof of concept which skips things like key destructors because I don’t need them. This uses 1/3 the memory of GHashTable. It might be possible to lower this further if an in-place resize algorithm could be worked out.

Code Quality, Part II

Wednesday, September 29th, 2010

I have been known to complain loudly when I see code that I feel should have been better before seeing the light of day. But what about my own code? Divinely inspired and bug free from day one? Not a chance!

With Gnumeric as the example, here is what we do to keep the bug count down.

  • Testing for past classes of errors. For example, we found errors in Gnumeric’s function help texts, such as referring to arguments that do not exist or not describing all the arguments. The solution was not only to fix the problems we found, but also to write a test that checks all the function help texts for this kind of errors. Sure enough, there were several more. They are gone now, and new ones will not creep in. We do not like to make the same mistake twice!
  • Use static code checkers. This means that we keep the warning count from “gcc -Wall” down so know nothing serious is being ignored. We have looked at c-lang and Coverity output and fixing the apparent problems. (Those tools have pretty high false report rates, though.) We occasionally use sparse too and have a handful of dumb perl scripts looking for things like GObject destroy/finalize/etc handlers that fail to chain up to the parent class.
  • Use run-time code checkers. Gnumeric has been run through Valgrind and Purify any numbers of times. It is part of the test suite, so it happens regularly. This is regrettably getting harder because newer versions of Gtk+ and the libraries upon which it is built hold on to more and more memory with no way of forcing release. Glib has a built-in checker for some memory problems. We use that too.
  • Automated tests of as many part of the program as we have found time to write. The key word here is “automated”. I used to be somewhat scared of changing the format string (number rendering) code, because there was basically no way of making sure no new errors were introduced in that hairy piece of code. With the extensive test suite, I have no such reservations anymore.
  • Fuzzing, i.e., deliberately throwing garbled input at the program. I wrote tools to do this subtly for xml and files inside a zip archive in such a way that the files are still syntactically correct xml or zip files — otherwise you end up only testing the xml/zip parser which is fine, but not sufficient.
  • Google for Gnumeric. Not every will report problems to us, but they might discuss issues with others. Google seems to be pretty good at finding such occurrences.

The take-home message from this is that code quality is work. Lots of work. And yet we still let mistakes through. I blame that on the lack of a proper QA department.

ODF Plus Five Years

Wednesday, February 10th, 2010

Five years ago I strongly criticized the OpenDocument standard for being critically incomplete for spreadsheets since it left out the syntax and semantics of formulas. As a consequence it was unusable as a basis for creating interoperable spreadsheets.

Off the record several ODF participants agreed. The explanation for the sorry state of the matter was that there was a heavy pressure for getting the ODF standard out of the door early. The people working on the text document part of the standard were not willing to wait for the spreadsheet part to be completed.

That was then and this is now. Five years have passed and there has been no relevant updates to the standard. However, one thing that has happened is that is that Microsoft started exporting ODF documents that highlight the problems I pointed out. ODF supporters cried foul when it turned out that those spreadsheets did not work with OpenOffice. In my humble opinion, those same loud ODF supporters should look for the primary culprit at the nearest mirror. You were warned; the problem was obvious for anyone dealing with programming language semantics; you did nothing.

So given the state of the standard, where does that leave ODF support in spreadsheets? Microsoft took the least-work approach and just exported formulas with their own (existing) syntax and semantics. Of course they knew that it would not play well with anyone else, but that was clearly not a priority in Redmond. Anyone else at this point realizes that ODF for spreadsheets is not defined by the standard, but by what part OpenOffice happens to implement. Just like XLS is whatever Excel says it is.

One implications is that ODF changes whenever OpenOffice does. For example, OpenOffice has changed formula syntax at least once — a change that broke Gnumeric’s import. If you follow that link, you can see that OpenOffice did precisely the same thing that Microsoft did: introduce a new formula namespace. Compare the reactions. For the record, in Gnumeric the work involved in supporting those two new namespaces were about the same.

For Gnumeric the situation remains that we will support ODF as any other spreadsheet file format. Until and unless the deficiencies are fixed, ODF is not suitable as the native format for Gnumeric or any other spreadsheet. (There are other known problems with ODF, but those are somewhat technical and not appropriate here.)

Note: I want to make clear that the above relates to spreadsheets only. I know of no serious problems with ODF and text documents, nor do I have reason to believe that are any.

It’s a Python Bug

Friday, January 30th, 2009

I am not perfect. Therefore the code I write is not perfect. Every once in a (rare!) while I have been known to write code that people of evil mind could use. I deserve blame for that. I do not deserve this kind of blame.

What we have here is a Python bug: when embedding Python, we (and half a dozen other applications) use PySys_SetArgv according to spec. Python, in its wisdom, uses that as a clue to start loading python files in unexpected places. The right thing to do would be to fix Python as well as any code that might depend on the bug. That was not done.

Somehow it was chosen instead to file this against Gnumeric and half a dozen other applications. As a design error no less. That is simply offensive! Let us hope no problem is found in libc’s malloc, because dealing with bug reports for all users of that would take some effort.

Why does Python get this kind of reputation protection? They screwed up, so let them take the blame and have them fix the bug. If that breaks other applications, well so be it – the Python people will eloquently explain that to their community.

Applications

Wednesday, July 16th, 2008

I my optics, computers are here to get certain jobs done. That means it is all about applications, not eye candy: bouncing icons, themes, semi-transparent windows. My real-life work desk is not transparent, and I do not use semi-transparent paper.

Producing large applications is a lot of work, so when I write a piece of (hopefully) well-designed code, I want that code to stay written. I do not want next week’s GTK+ deprecation to come along and, effectively, cause my code to bitrot. (and I really do not want to write two different pieces of code for the job: one for “old” GTK+ and one for “new” GTK+.)

Moving from GTK+ 1.x to GTK+ 2.x was painful. I do not need anything like that again. Talks about breaking API every 3-4 years and advice like “Stay up to date, adapt your application code early” (and, by implication, often) is a clear indication that keeping applications running is likely to mean spending much time cleaning up after someone with an attention span of a few years.

Maintaining code like GTK+ is not hard. Calling it hard because you want to play with some new toy is deceiving.  Maintaining can be tedious, but if you do not want to maintain, please do not start writing new GTK+ code. You will surely abandon that prematurely too, so you have no business writing library code. Instead, go write a useful application: if you abandon that, I probably do not have to care.