17.08.2007 What are ELF libraries good for anyway?

See: http://blogs.testbit.eu/timj/2007/08/17/17082007-what-are-elf-libraries-good-for-anyway/ (page moved)

Or: Who in the world is actually linking against GLib without Gtk+?

I’m currently in the process of building some general purpose infrastructure for Rapicorn. Because a good portion of the C++ utilities used in Beast and Rapicorn are factored out into an extra library called Birnet, I’m running into complexity issues similar to those we already encountered with libglib vs. libgobject vs. libgdk. That is, lower level modules cannot reference or use mechanisms provided at a higher level (for instance, deriving GHashTable from GObject). This becomes even worse with C++, which lacks support for partial classes (a class definition spanning multiple source files, like implementing gtk_widget_foo() outside of libgtk).

One remedy would be to actually merge libbirnet and librapicorn into a single project and let Beast depend on librapicorn. However, that’s essentially how we started out with GLib and Gtk+ in late 1996, when they were split out of the Gimp package together. In June 1998 we split off libglib into a separate package for programs needing C utilities but not libgtk. So now, after almost a decade, I’m reconsidering this separation. Reflecting on that move is probably not a bad idea before attempting the opposite with Rapicorn:

How many programs are actually linking against libglib but not libgdk, and is there any real benefit from separation?
If your machine has libglib and a few spare minutes, you can use this command line to find out on your own:

	# -perm /111 matches files with any executable bit set (older GNU find spelled this -perm +111)
	find `echo $PATH | sed 's/:/ /g'` -type f -perm /111 | xargs ldd 2>/dev/null |
	  awk '/\<libglib-2\>/{ lg+=1 } /\<libgobject\>/{ lo+=1 } /\<libgdk_pixbuf\>/{ lp+=1 }
               END { print "glib=" lg " gobject=" lo " gdk=" lp ;
                     print "glib-only:    " 100*(lg-lo)/lg "%";
                     print "gobject-only: " 100*(lo-lp)/lg "%";
                     print "gdk-lot:      " 100*lp/lg "%"; }'
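
The one-liner pools the ldd output of all binaries and counts matching lines. As a rough sketch of the same bucketing applied to a single binary, the helper below reads one `ldd` listing on stdin and names the bucket it falls into. The function name classify_ldd and the sample listing are made up for illustration; the library patterns are the ones used in the command above, and the check order assumes that anything linking libgdk_pixbuf also links libgobject and libglib.

```shell
# classify_ldd (hypothetical helper): read the output of `ldd <binary>` on
# stdin and print which bucket the binary falls into.
classify_ldd() {
  awk '
    /libgdk_pixbuf/ { gdk = 1 }      # counted as "gdk" in the post
    /libgobject/    { gobject = 1 }  # "gobject-only" if no gdk_pixbuf
    /libglib-2/     { glib = 1 }     # "glib-only" if neither of the above
    END {
      if (gdk)          print "gdk"
      else if (gobject) print "gobject"
      else if (glib)    print "glib"
      else              print "none"
    }'
}

# Demo with a fabricated two-line listing, so it runs without GLib installed.
printf '\tlibglib-2.0.so.0 => /usr/lib/libglib-2.0.so.0\n\tlibc.so.6 => /lib/libc.so.6\n' |
  classify_ldd
```

On a real system the input would come from something like `ldd /path/to/binary | classify_ldd`, where the path is whatever executable you want to inspect.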

Here are results from a couple machines I had quick access to:

	Site                       Gdk  GObject     GLib  #Apps
	gimp.org:                50.0%    12.5%    37.5%     16
	gtk.org:                 47.1%    17.6%    35.3%     17
	developer.gnome.org:     19.5%    53.7%    26.8%     41
	my server (sarge):       66.7%    14.1%    19.3%    192
	my laptop (etch):        72.7%    14.8%    12.4%    209
	my desktop (feisty):     72.2%    17.1%    10.7%    252
	64bit KDE desktop (sid): 53.0%    16.7%    30.2%    338
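
To make the columns concrete, here is a small sketch that applies the same three formulas to raw match counts. The function name percentages is made up, and the counts 16/10/8 are not from the original measurements: they are back-calculated from the gimp.org row (16 applications, 50.0% / 12.5% / 37.5%), so treat them as an assumption.

```shell
# percentages LG LO LP (hypothetical helper): LG, LO, LP are the number of
# binaries linking libglib, libgobject and libgdk_pixbuf respectively.
percentages() {
  awk -v lg="$1" -v lo="$2" -v lp="$3" 'BEGIN {
    # Guard against division by zero on machines without any GLib users.
    if (lg == 0) { print "no GLib users found"; exit }
    printf "gdk=%.1f%% gobject-only=%.1f%% glib-only=%.1f%%\n",
           100 * lp / lg, 100 * (lo - lp) / lg, 100 * (lg - lo) / lg
  }'
}

# Counts reconstructed from the gimp.org row: 16 glib, 10 gobject, 8 gdk.
percentages 16 10 8
# prints: gdk=50.0% gobject-only=12.5% glib-only=37.5%
```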

That is, the servers have a quite limited set of GUI applications installed, but across a full-fledged desktop the vast majority of applications are linked against libgdk anyway. I found Stefan Westerfeld’s Amd64 KDE desktop particularly interesting: it actually has the most applications linking against GLib, probably because it has Mono and libQtCore.so installed, which both link against GLib these days. Does anyone actually have a system with libglib installed but not libgdk?

Next, let’s take a look at the actual “savings”:

	ls -l libglib-2.0.so.0.1200.4 libgmodule-2.0.so.0.1200.4 \
              libgobject-2.0.so.0.1200.4 libgthread-2.0.so.0.1200.4
	-rw-r--r-- 1 root root 596608 2006-11-16 10:26 libglib-2.0.so.0.1200.4
	-rw-r--r-- 1 root root   9784 2006-11-16 10:26 libgmodule-2.0.so.0.1200.4
	-rw-r--r-- 1 root root 237156 2006-11-16 10:26 libgobject-2.0.so.0.1200.4
	-rw-r--r-- 1 root root  14028 2006-11-16 10:26 libgthread-2.0.so.0.1200.4
	=====================================
	                       857576 = 837KB

	ls -l libatk-1.0.so.0.1214.0 libcairo.so.2.9.2 libpango-1.0.so.0.1400.8 \
	      libgtk-x11-2.0.so.0.800.20 libgdk_pixbuf-2.0.so.0.800.20 \
              libgdk-x11-2.0.so.0.800.20
	-rw-r--r-- 1 root root  102504 2007-03-14 13:44 libatk-1.0.so.0.1214.0
	-rw-r--r-- 1 root root  395836 2006-10-20 07:44 libcairo.so.2.9.2
	-rw-r--r-- 1 root root  235464 2007-01-14 22:27 libpango-1.0.so.0.1400.8
	-rw-r--r-- 1 root root   88604 2007-03-04 22:21 libgdk_pixbuf-2.0.so.0.800.20
	-rw-r--r-- 1 root root  528580 2007-03-04 22:21 libgdk-x11-2.0.so.0.800.20
	-rw-r--r-- 1 root root 3043524 2007-03-04 22:21 libgtk-x11-2.0.so.0.800.20
	=======================================
	                       4394512 = 4292KB

GtkGdkPangoAtkCairoGlib / GLib ratio: (4394512 + 857576) / 857576 = 6.12.
So the “savings” turn out to be splitting off 5/6th of the GUI stack size. At best, a GLib-only program is saving 4.2MB that way. But then again, those libraries are likely to be in memory already anyway. And most definitely, they reside in a library, available as const .text on the hard disk. To put that into perspective: we’re talking about only a handful of megabytes here. In physical size, that is actually less than a high-resolution desktop background image, smaller than some icon themes, or roughly 10% of a full linux-2.6.8 kernel + modules binary footprint (41MB). Also, symbol resolution performance doesn’t necessarily improve through splits; Ulrich Drepper has a few words on multiple DSOs during deployment.
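
The arithmetic can be re-checked with a few lines of shell. This is only a sketch: the byte sizes are copied from the two ls listings above, while the helper name sum_bytes and the variable names are made up for illustration.

```shell
# sum_bytes (hypothetical helper): add up the byte sizes given as arguments.
sum_bytes() {
  total=0
  for n in "$@"; do total=$((total + n)); done
  echo "$total"
}

# libglib, libgmodule, libgobject, libgthread
glib_total=$(sum_bytes 596608 9784 237156 14028)
# libatk, libcairo, libpango, libgdk_pixbuf, libgdk-x11, libgtk-x11
gui_total=$(sum_bytes 102504 395836 235464 88604 528580 3043524)

awk -v g="$glib_total" -v u="$gui_total" 'BEGIN {
  printf "GLib libraries: %d bytes\n", g              # 857576
  printf "GUI libraries:  %d bytes\n", u              # 4394512
  printf "full stack / GLib: %.2f\n", (g + u) / g     # 6.12
  printf "GUI share of stack: %.1f%%\n", 100 * u / (g + u)
}'
```

The last line comes out at about 83.7%, which is where the "5/6th" figure comes from.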

Given that selecting Gtk+ subcomponents via configure.in options, to allow selective shrinkage for size-constrained embedded devices, is on our TODO list anyway, the verdict for this set of libraries is: size is not worth a split!

Of course, other aspects not reflected in library size weigh in much more strongly, such as the ability to build different user communities around projects (e.g. for Cairo vs. GLib) or the reduced dependency chain (sshfs and Qt wouldn’t depend on GLib if it drew in Pango or Gtk+).
So while for the vast majority of applications the GLib split-off doesn’t make a huge difference technically, software evolution arguments make all the difference, definitely justifying a separation. The GLib-internal splits, though, are definitely not worth the hassle and the conceptual boundaries they impose. If we ever break binary compatibility again, merging the GLib-internal libraries and their respective pkg-config packages would be a good idea.

Back to my original problem… I’m now pretty convinced that merging Rapicorn with the Birnet utility library will most likely not pose a problem anytime soon. And Qt 4 kindly demonstrates that splits can also be carried out successfully later in the game.