Binary Search

I am less impressed by that blog entry.

Let’s see… “In C this causes an array index out of bounds with unpredictable results.” Nope. In C it causes a signed integer overflow and hence unpredictable results. Consequently, casting the sum to unsigned will not fix it.

And if you are going to worry about overflow for one addition, why not
worry about overflow for the others? Line 10 and the subtraction on line 12 look safe, but
line 16 can overflow. That expression should be changed from
-(low + 1) to -low - 1. (I do not actually know if that is needed in Java, but in C it would be.)
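
To make the two fixes concrete, here is a minimal C sketch of an int-indexed search. The name bsearch_int and the "-(insertion point) - 1" return convention are my assumptions, modelled on the expression quoted above; this is not the code from the blog entry.

```c
#include <stddef.h>

/* Hypothetical helper: search sorted a[0..n-1] for key.
 * Returns the index if found, otherwise -(insertion point) - 1. */
static int
bsearch_int (const int *a, int n, int key)
{
	int low = 0;
	int high = n - 1;

	while (low <= high) {
		/* low + (high - low) / 2 cannot overflow when 0 <= low <= high,
		 * unlike (low + high) / 2. */
		int mid = low + (high - low) / 2;

		if (a[mid] < key)
			low = mid + 1;
		else if (a[mid] > key)
			high = mid - 1;
		else
			return mid;
	}

	/* -low - 1 stays in range for every non-negative low;
	 * -(low + 1) overflows in the addition when low is INT_MAX. */
	return -low - 1;
}
```

With a = {1, 3, 5, 7}, searching for 5 returns 2, and searching for 4 returns -3, meaning "not found, would be inserted at index 2".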

Finally, the algorithm still does not work for sizes from 2^31 and up. To fix that, one would start working with unsigned quantities (thus solving the overflow problem as a side effect) and think of some other way of returning “not found”.
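
One possible shape for such an unsigned version, as a sketch only: the name bsearch_size and the choice of reporting "not found" through a bool plus an out-parameter are my assumptions, and other conventions would work just as well.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: search sorted a[0..n-1] for key using unsigned
 * indices.  Returns true and stores the index in *pos if found;
 * otherwise returns false and stores the insertion point in *pos. */
static bool
bsearch_size (const int *a, size_t n, int key, size_t *pos)
{
	size_t low = 0;
	size_t high = n;	/* half-open range [low, high) */

	while (low < high) {
		/* high - low is never negative here, so nothing wraps. */
		size_t mid = low + (high - low) / 2;

		if (a[mid] < key)
			low = mid + 1;
		else if (a[mid] > key)
			high = mid;
		else {
			*pos = mid;
			return true;
		}
	}

	*pos = low;
	return false;
}
```

The half-open range avoids ever computing mid - 1, which would wrap around for mid == 0 with unsigned indices.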


I really need to write that patch I have been threatening for a while
now: make gcc issue a warning when -Werror is used.
The purpose of -Werror seems to be to make code not compile on machines
that are different from the one on which -Werror was introduced. My
gcc patch would direct the pain where it belongs.

Case in point: gcc’s warnings about potentially uninitialized
variables come and go with different versions, optimization
levels, OS, etc.

Case in point: alignment requirements differ between archs. Thus
warnings about them will too, and in an object system like glib’s
we have a _lot_ of casts that can change alignment requirements.

Case in point: my system libraries like to define the same external
functions [identically] in several different headers. That’s
perfectly valid, but gcc will warn given enough -Wflags.

Case in point: pragmas in system headers. Warnings I can simply
ignore, but with -Werror what am I to do? Fix gcc not to warn?

Case in point: signal handlers. You try dealing with them in a
way that does not produce warnings on some arch.

"We Know Better"

xorg-redhat-die-ugly-pattern-die-die-die.patch is a perfect example
of a bad we-know-better attitude.

It removes the beautiful grey cross hatch stipple from X. Evidently
because someone knows better what other people like. How does one
get it back, short of tiling a bitmap? If it is anything like
SuSE’s similar patch, you don’t. At least not while the X server
is running.

In summary. Before: you could have a solid-colour background or you
could have a cross hatch background. Just use xsetroot.

After: you get what some monkey with poor eyesight thought looked
nice. You cannot change it. After all, he knows better.

Incidentally, the default X pattern also serves to show the poor
state of flatscreen technology. It seems the pattern cannot be
shown over the whole screen without a certain amount of flicker.


DTrace looks nice, but it is not as if the means to do what it does
haven’t been around for a long time.

My humble contribution: strace-account.
To use…

strace -o ~/ttt top
# q when you get tired
strace-account ~/ttt | less

from which we learn:

Cumulative Syscall Times.

Syscall  Count  Time(s)
select       8    19.42
read      3939     0.06
open      3165     0.04
close     3172     0.03
alarm     2142     0.02
stat      1565     0.02
fcntl     1523     0.01
(other)    336     0.01

Repetitive File Name Usage.

This is a list of files that are accessed, one way or another, at least
twice.  Note, that current directory is not being tracked.

Count  Filename                   
   14  /var/run/utmp
    8  /proc
    8  /proc/1
    8  /proc/1/stat
    8  /proc/1/statm
    2  /proc/meminfo
    2  /proc/stat
    2  /usr/share/terminfo/x/xterm

Small-Chunk File Input/Output.

This is a list of files that are accessed in small chunks.  "Badness" is a
heuristic measure of this.  The list is truncated at badness 2.00.  A file
can appear more than once if it is opened more than once.

Badness  Bytes  I/Os  File                       
  12.75  38784   102  /var/run/utmp
  12.75  38784   102  /var/run/utmp
  12.75  38784   102  /var/run/utmp
  12.75  38784   102  /var/run/utmp
  12.75  38784   102  /var/run/utmp
  12.75  38784   102  /var/run/utmp
  12.75  38784   102  /var/run/utmp
   2.67   2122     8  /usr/share/terminfo/x/xterm
   2.26  40878    23  (stdout)

My interpretation of this version of top’s output is that it is
opening way too many files way too many times. Why is it also
calling stat(2), alarm(2), and fcntl(2) so often? [So it can connect
to /var/run/nscd/socket in non-blocking mode over and over again, if
you must know.] It also likes to read utmp many, many times in little

(strace-account is a bit of a hack. An strace with an output format
more suited for machine reading would be nice. It might even have
appeared since whenever I wrote strace-account.)
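
For a feel of what such a tool has to do, here is a minimal C sketch that reproduces only the Count column of the first table. It is not strace-account: the names tally, syscall_name, tally_add and tally_file are made up for this example, and it assumes the plain "name(args) = result" lines that strace -o writes when no pid prefix is present. The Time(s) column would additionally need strace's timing output, which is not handled here.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_SYSCALLS 256
#define MAX_NAME 32

struct tally {
	char name[MAX_NAME];
	unsigned long count;
};

/* Extract the syscall name from one line, i.e. everything before the
 * first '('.  Returns 0 on success, -1 for lines that do not start
 * with an identifier followed by '(' (signals, exit status, ...). */
static int
syscall_name (const char *line, char *name, size_t size)
{
	const char *paren = strchr (line, '(');
	size_t len;

	if (paren == NULL || paren == line)
		return -1;
	len = (size_t)(paren - line);
	if (len >= size)
		return -1;
	for (size_t i = 0; i < len; i++) {
		unsigned char c = (unsigned char)line[i];
		if (!isalnum (c) && c != '_')
			return -1;
	}
	memcpy (name, line, len);
	name[len] = '\0';
	return 0;
}

/* Count one occurrence of name in tab[0..*n-1], appending if new. */
static void
tally_add (struct tally *tab, size_t *n, const char *name)
{
	for (size_t i = 0; i < *n; i++) {
		if (strcmp (tab[i].name, name) == 0) {
			tab[i].count++;
			return;
		}
	}
	if (*n < MAX_SYSCALLS) {
		strcpy (tab[*n].name, name);
		tab[*n].count = 1;
		(*n)++;
	}
}

/* Read strace output from fp into tab, which must have room for
 * MAX_SYSCALLS entries.  Returns the number of distinct syscalls. */
static size_t
tally_file (FILE *fp, struct tally *tab)
{
	char line[4096];
	char name[MAX_NAME];
	size_t n = 0;

	while (fgets (line, sizeof line, fp) != NULL)
		if (syscall_name (line, name, sizeof name) == 0)
			tally_add (tab, &n, name);
	return n;
}
```

A linear table is plenty here, since the number of distinct syscalls is small compared to the number of lines.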

ABI/API Stability

Both GTK+ and Gnome promise ABI and API stability.

But what does that mean?

To me it means that updating API/ABI stable libraries should not
break existing applications. This should not keep one from fixing
bugs — if you depend on those you get what you deserve — but
existing, well-defined, well-documented interfaces ought to continue working as before.

The recent updating-glib-breaks-pango issue could, with a little good will, be considered a case of a
bug in old pango that has now been triggered.

But what about the new updating-gtk+-breaks-stuff issue? Evidently someone thought it would be nice to change the basic way reference counts are handled for GTK+ objects. That causes leaks, and leaks for widgets can mean they stay visible when they should not be, i.e., applications are now broken. This is not right!

There is some talk about fixing the GTK_OBJECT_SET_FLAGS macro (and presumably the GTK_OBJECT_FLOATING macro) to plaster over the issue and regain some amount of API compatibility. The ABIs would remain incompatible,
so someone updating their libraries would still see their
applications break and come to me.

What can I as an application programmer do about this?

  • Demand that the GTK+ changes be rolled back. That would be the right
    thing to do and is a very reasonable demand. They could go back in for GTK+ 3.
  • Throw in some “incompatible with GTK+ 2.9 and later” dependency
    and get it deployed. Awful, right? Deployment is a multi-year
  • Rush in some devious works-both-places code and get that
    deployed. Marginally less awful, but still suffers from the deployment delay.

Incidentally, it is not at all clear what tangible benefit there
is to the changed reference count scheme. I am all ears.

GTK+ Themes

There is a burst of activity in the GTK+ theme world these days.
How do I know? Well, I get bugs like this for Gnumeric. This was a debug build with a reporter
that took the time (and had the knowledge) to tell us that a theme was involved, so it was pretty easy to diagnose as Someone Else’s Problem.
But when it happens with some distribution’s theme, it generally
requires a crystal ball.

In my humble opinion, something is wrong with the way themes are
done in GTK+. The current situation is:

  • When a theme engine crashes, the blame is placed on the application by both the user and Bug Buddy.
  • There is no fault-separation between the application and the theme. Theme engines are written in a fault-intolerant language.
  • Theme code is written by people more interested in visual effects than code. The code receives less scrutiny than, say, GTK+’s main code.
  • Application developers cannot test with themes they do not know.


Week numbers

Week numbers are not used in the US so it is pointless to start
thinking about how weeks might be numbered there. I do not know
if week numbers are in use in Asia, but I do not think so.

That leaves Europe[*] and the name of the game is ISO 8601 and The Right
Answer[tm] therefore is:

Week number of 24/12/2005: 51
Week number of 25/12/2005: 51
Week number of 26/12/2005: 52
Week number of 27/12/2005: 52
Week number of 28/12/2005: 52
Week number of 29/12/2005: 52
Week number of 30/12/2005: 52
Week number of 31/12/2005: 52
Week number of 1/1/2006: 52
Week number of 2/1/2006: 1
Week number of 3/1/2006: 1
Week number of 4/1/2006: 1
Week number of 5/1/2006: 1
Week number of 6/1/2006: 1
Week number of 7/1/2006: 1

as dumped by

#include <glib.h>
#include <stdio.h>

static void
dump (int d, int m, int y)
{
	GDate *gd = g_date_new ();
	g_date_set_dmy (gd, d, m, y);
	printf ("Week number of %d/%d/%d: %d\n",
		d, m, y,
		g_date_get_iso8601_week_of_year (gd));
	g_date_free (gd);
}

So there you have it: all weeks are seven days long (what a concept!)
and go from Monday to Sunday. The first week of 2006 starts on the second day of January 2006.
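
The numbers can be cross-checked without glib. The sketch below leans on the C99 strftime "%V" conversion, which is specified as the ISO 8601 week number; the helper name iso_week is an assumption of mine, not anything from glib.

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: ISO 8601 week number of day d, month m, year y,
 * computed with the C99 strftime "%V" conversion.  Returns -1 on error. */
static int
iso_week (int d, int m, int y)
{
	struct tm t = { 0 };
	char buf[8];

	t.tm_mday = d;
	t.tm_mon = m - 1;	/* struct tm months are 0-based */
	t.tm_year = y - 1900;	/* struct tm years count from 1900 */
	t.tm_hour = 12;		/* noon, well away from midnight */
	t.tm_isdst = -1;	/* let the library decide about DST */

	/* mktime normalizes the struct and fills in tm_wday and tm_yday,
	 * which "%V" needs. */
	if (mktime (&t) == (time_t)-1)
		return -1;
	if (strftime (buf, sizeof buf, "%V", &t) == 0)
		return -1;
	return atoi (buf);
}
```

Called as iso_week (1, 1, 2006) this should agree with the table above and yield 52, with iso_week (2, 1, 2006) yielding 1.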

[*] Thus assuming that no-one in Africa, for example, has been
foolish enough to invent their own date magic for no good reason.

Optimizing g_utf8_offset_to_pointer

Do not bother.

No, really. That function, regardless of implementation, basically
screams “if you call me often, your program will exhibit quadratic
or worse time behaviour, so do not do that.”
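
A sketch of where the quadratic behaviour comes from. To keep it free of glib, utf8_next and utf8_offset_to_pointer below are simplified stand-ins of my own for g_utf8_next_char and g_utf8_offset_to_pointer; they assume valid UTF-8 and a non-negative offset, and the two counting functions are made up for illustration.

```c
#include <stddef.h>

/* Stand-in for g_utf8_next_char: advance past one character of valid
 * UTF-8 by skipping continuation bytes (10xxxxxx). */
static const char *
utf8_next (const char *p)
{
	p++;
	while (((unsigned char)*p & 0xC0) == 0x80)
		p++;
	return p;
}

/* Stand-in for g_utf8_offset_to_pointer: there is no way around
 * walking offset characters from the start of the string. */
static const char *
utf8_offset_to_pointer (const char *s, size_t offset)
{
	while (offset-- > 0)
		s = utf8_next (s);
	return s;
}

/* Quadratic: every iteration rescans the string from the beginning.
 * nchars must not exceed the number of characters in s. */
static size_t
count_ascii_slow (const char *s, size_t nchars)
{
	size_t n = 0;

	for (size_t i = 0; i < nchars; i++) {
		const char *p = utf8_offset_to_pointer (s, i);
		if ((unsigned char)*p < 0x80)
			n++;
	}
	return n;
}

/* Linear: keep a pointer and advance it one character at a time. */
static size_t
count_ascii_fast (const char *s)
{
	size_t n = 0;

	for (const char *p = s; *p != '\0'; p = utf8_next (p))
		if ((unsigned char)*p < 0x80)
			n++;
	return n;
}
```

Both functions compute the same count; only the second one does it in a single pass, and no amount of tuning the offset function changes that.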

It is fine for occasional use, but then you would not care about its


Federico is bringing up the file chooser’s lack of speed again. Good.

The first step, IMHO, should be to get rid of reloading the folder
on widget mapping. It is wrong to do non-widget, expensive and externally-visible actions in a widget mapping handler. If someone
switches to another virtual screen and back (for example to peek at something),
do we really want to reload the folder? Do we lose the selections
in the process? If The Gimp wants that behaviour, I say it can
install a handler and trigger it itself.

Would things appear to be faster if we installed a single-shot
idle-handler that created a file chooser and threw it away?
(That would be a work-around more than a fix, of course.)

There is more to your item 7 than just performance, btw. It should
not stat() all those parent directories because it may not be allowed
to do so. If you just succeeded in stat(“/foo/bar/baz”) then
it should not be necessary to check that “/foo” and “/foo/bar” are

Libc is Broken, Part 1a

Robert, I said “int”, not “long”.
The code you presented does not work for platforms where the two are
different. There is no strtoi so you get to use strtol and do the
extra range checking.

It also does not work because there is no
requirement that strtol set errno for any condition
but overflow and underflow. In particular, it need not set errno for the empty string.

It also does not work because libc functions may
set errno when no error is detected. And, in fact, they
often do.

And, finally, it also does not work for strings like “010”.
You get 8, but you should get 10 when someone mumbles “decimal”.
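
For the record, here is roughly what I would expect instead. It is a sketch, not the one true way: the name parse_int and the bool-plus-out-parameter interface are my own choices, and it deliberately accepts whatever leading whitespace and sign strtol accepts.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: parse s as a decimal int.  Returns true and
 * stores the value in *result only if the whole string is a number
 * that fits in an int. */
static bool
parse_int (const char *s, int *result)
{
	char *end;
	long l;

	errno = 0;			/* discard any stale value */
	l = strtol (s, &end, 10);	/* explicit base 10: "010" is ten */

	if (end == s)			/* no digits at all, e.g. "" */
		return false;
	if (*end != '\0')		/* trailing junk after the number */
		return false;
	/* Overflow of long: trust errno only together with the
	 * saturated return value. */
	if ((l == LONG_MAX || l == LONG_MIN) && errno == ERANGE)
		return false;
	/* Fits in a long but not in an int. */
	if (l < INT_MIN || l > INT_MAX)
		return false;

	*result = (int)l;
	return true;
}
```

The empty string is caught by the end pointer rather than by errno, and errno is only consulted for the one condition strtol is required to report through it.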

Update: And while I am picking at you, it seems I
have to point you at the ctype man pages too. isdigit
is defined only for the special value EOF and for integers within the range
of unsigned char.

In practice, what libc implementations do is something like

  #define EOF (-1)
  #define isdigit(_c) ((__somearray+1)[(_c)] & SOME_BIT)

That can and will core dump if you send a random signed character.

(Glibc has a misguided attempt at making things work for signed
characters also. They essentially add 128 and duplicate half the
table. That works fine unless you want the right answer.)
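
The portable way to feed a char to the ctype functions is to convert it to unsigned char first. A minimal sketch, with count_digits being a made-up name for the example:

```c
#include <ctype.h>
#include <stddef.h>

/* Count the decimal digits in s.  The cast to unsigned char keeps the
 * argument inside the domain of isdigit even where plain char is
 * signed and the string contains bytes above 0x7f. */
static size_t
count_digits (const char *s)
{
	size_t n = 0;

	for (; *s != '\0'; s++)
		if (isdigit ((unsigned char)*s))
			n++;
	return n;
}
```

Without the cast, a byte such as 0xe9 becomes a negative int on platforms with signed char and indexes in front of the table in the macro above.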