C – scrambled tofu

mistakes with g_value_set_boxed()

In today’s PSA, mistakes with g_value_set_boxed(). A mistake that’s been made several times by yours truly, and only realised today thanks to Xavier.

At some point in the GLib 2.22 cycle, types such as GArray, GPtrArray, GByteArray and GHashTable gained ref() and unref() methods, which allowed things like g_boxed_copy() and g_boxed_free() to be a stack faster for those types. This was mid-2008. (You can discover what g_boxed_copy() and g_boxed_free() will do for a type by looking for its G_DEFINE_BOXED_TYPE in GLib, which is probably in gobject/gboxed.c.)

Danni, who didn’t get the memo, had been occasionally writing bits of code like this:

GArray *array = g_array_new (...);

g_value_set_boxed (value, array);
g_array_free (array, TRUE);

Which, if you read the code, will keep the wrapper alive (g_value_set_boxed() took its own reference via g_boxed_copy(), so the GValue is still holding one), so your code won’t crash, but will release all the data the array contains. Similarly for g_ptr_array_free() etc. It nicely zeros out the length of the array, so when you come to read it in your GValue, it looks empty.

The correct thing to do (of course) is unref your data, which will then free the memory and the wrapper when the refcount reaches zero.

GArray *array = g_array_new (...);

g_value_set_boxed (value, array);
g_array_unref (array);

Update: As pointed out, in this simple example you can replace this with the following pattern:

GArray *array = g_array_new (...);

g_value_take_boxed (value, array);

Which passes your ownership to the GValue. Where this falls down is for more complex values built for dbus-glib, where the ownership isn’t clear-cut, and so you need to free the components separately afterwards.

Finding redundant GObject classes in Empathy

As part of a bugfix I did today (thanks, as always, to my employer, Collabora, for letting me work on Empathy), I made a commit to remove a redundant class from Empathy, which made me wonder if there were any other redundant classes in Empathy. A quick grep of G_DEFINE_ told me there are some 116 classes in Empathy, so I wasn’t going to check them all by hand.

Instead I put together this script, which I share with you in case it’s useful. It basically checks for all classes defined with G_DEFINE_TYPE and then looks to see if anything with that namespace is used in another file. It generates false positives for classes that aren’t used outside the file they’re defined in, or classes that have different namespaces to the classname, but it produces a much more manageable list.

I did consider looking for unused symbols, but couldn’t work out an easy way to do it properly. Empathy’s compile process is split into two archive libraries (libempathy.a and libempathy-gtk.a) and several binaries (empathy, empathy-call, empathy-accounts, etc.), so I couldn’t think of a way to ask the linker to find any unused symbols. I put together this second script, which will build a list of symbols in archives and look for those symbols copied into the binaries, which gets some of the way there, but will miss any unused symbols defined in the binaries’ sources (src/).

LD_PRELOAD GObject lifetime debugging tool

For years and years I’ve dreamt of a tool that would show me what GObjects are currently alive, and let me have a look at information about them. Today, while trying to debug a particularly horrid reference leak (program wedged under refdbg, and without being able to script SIGTRAP I’ve never found this method useful), I actually started writing the tool.

The tool uses LD_PRELOAD to replace g_object_new() — it should really replace g_object_newv() and g_object_new_valist() also, but I didn’t need those — to install a weak reference on the created objects to track their lifetime.

The result of which is something like this:

[danni@adelie src]$ LD_PRELOAD=libgobject-list.so \
    GOBJECT_LIST_FILTER=Gabble GABBLE_PERSIST=1 \
    /usr/lib/telepathy/telepathy-gabble 
 ++ Created object 0x9912530, GabbleGatewayPlugin
 ++ Created object 0x990c460, GabblePluginLoader
 ++ Created object 0x9917ab8, GabbleJabberProtocol
 ++ Created object 0x99172f0, GabbleConnectionManager
...

You can then list the currently living objects and their reference counts by sending SIGUSR1.

[danni@adelie src]$ killall -USR1 telepathy-gabble
Living Objects:
 - 0x9b47038, GabblePresence: 1 refs
 - 0x990c460, GabblePluginLoader: 2 refs
 - 0x9918ae0, GabbleRosterChannel: 1 refs
 - 0x991cb38, GabbleIMChannel: 1 refs
...

This by itself was useful in showing which object wasn’t finalized but should have been (and why my object in question also hadn’t finalized), but didn’t show me who was still holding the reference.

My hunch was that the problem was a reference cycle between two objects, so I hacked a bit of code into my tool to list the “connection” property on all objects that had it. [In Telepathy Connection Managers, objects are typically meant to let this go when connections become disconnected.]

  while (g_hash_table_iter_next (&iter, (gpointer) &obj, NULL))
    {
      GObjectClass *klass;
      GObject *conn;

      g_print (" - %p, %s: %u refs\n",
          obj, G_OBJECT_TYPE_NAME (obj), obj->ref_count);

      klass = G_OBJECT_GET_CLASS (obj);
      if (g_object_class_find_property (klass, "connection") != NULL)
        {
          g_object_get (obj,
              "connection", &conn,
              NULL);
          g_print ("\t + connection = %p\n", conn);
          if (conn != NULL)
            g_object_unref (conn);
        }
    }

This showed me which object was still holding a ref it should have let go. [Now I just need to work out why; the code in question is in an abstract base class, for which 5 of 6 concrete classes work correctly.]

If you’re interested, the code is in Git.

The tool could still benefit from a lot of work. For instance, the filtering is pretty basic at the moment and it should support all the ways to create GObjects. Originally I had envisioned this tool as a GUI window that popped up, so that you could click on an object and view all of its properties, connected signals, etc. That would be pretty neat actually.

Update: cassidy has already contributed by adding a listing of the objects remaining when a program exits and thus found a leak in Empathy. Awesome!

making my C more like Python

Asynchronous programming in C can be such a pain. You’re always creating little structs in order to pass around user_data. So often I find myself wishing it were like Python, and I could just create an anonymous tuple.

Then I had this crazy idea.

DBus is always passing around tuples. dbus-glib represents these as GValueArrays. They’re so ubiquitous that telepathy-glib added utility functions to deal with them: tp_value_array_build() and tp_value_array_unpack().

We can use them like this:

my_request_async (obj, my_callback,
  tp_value_array_build (2,
    G_TYPE_POINTER, my_pointer,
    G_TYPE_UINT, my_uint,
    G_TYPE_INVALID));

With the callback:

static void
my_callback (GObject *obj,
    GAsyncResult *result,
    gpointer user_data)
{
  gpointer my_pointer;
  guint my_uint;
 
  tp_value_array_unpack (user_data, 2,
      &my_pointer,
      &my_uint);
 
  ...
 
finally:
  g_value_array_free (user_data);
}

In my specific use case, I was making requests for a property, and then needed the generic callback to know which property the request was for. I could have copied and pasted the request multiple times, or I could get increasingly meta:

#define MY_REQUEST_ASYNC(prop) \
  my_request_async (obj, prop, \
      my_callback, tp_value_array_build (2, \
        G_TYPE_POINTER, my_pointer, \
        G_TYPE_UINT, prop, \
        G_TYPE_INVALID));
 
  MY_REQUEST_ASYNC (PROP_1);
  MY_REQUEST_ASYNC (PROP_2);
 
#undef MY_REQUEST_ASYNC

Where this gets really useful is when you’re having to copy or reference your members: strings, objects, hash tables, etc. You no longer need to write new() and free() functions for each structure. GValue already knows how to take care of it.

my_request_async (obj, my_callback,
  tp_value_array_build (3,
    G_TYPE_OBJECT, my_obj,
    G_TYPE_STRING, "escher",
    G_TYPE_BOOLEAN, TRUE,
    G_TYPE_INVALID));

Everything will be released when you call g_value_array_free().

You could go further. For optional numbers of arguments you could use an a{sv} map (telepathy-glib also has utility methods to manipulate these, e.g. tp_asv_new()).

Unfortunately not everyone has telepathy-glib in their stack. I bet you could also achieve the same result using GVariant with g_variant_new(), g_variant_get() and g_variant_unref(). Store your pointers as type ‘t’ and remember to cast them to (guint64) for correct var-args alignment. Unfortunately GVariant can’t do your ref-counting.

g_variant_equal() and dictionaries

For anyone using g_variant_equal() with a type containing a dictionary, you should be aware that, somewhat unexpectedly in my opinion, g_variant_equal() returns true only if the keys in the dictionary are in the same order (GVariant implements dictionaries as an array of key-value pairs).

You can resolve this by using a function that recursively checks for semantic equivalence (such as this one).

I’m sure I’m not going to be the only person to find this useful. I’ve filed this as bug #622590.

introspecting tp-glib and converting DBus-GLib GValues to GVariants

Telepathy is a modular framework. At the very heart of Telepathy there is a specification that describes how the various connection managers, clients and other components interact with each other. The specification is written in terms of a set of D-Bus APIs.

Telepathy takes full advantage of what’s provided by D-Bus, which is why you see so many different interfaces in the specification (one day I’ll write more about this), but one thing it makes frequent use of is variant types and specifically string->variant maps (known in D-Bus parlance as a{sv} maps).

a{sv} maps are used by the DBus.Properties.GetAll() method, to return all of the properties in an interface (which are all of different types), but in Telepathy they’re used more generically with namespaced properties to make requests for communication channels and collate information. The types are specified by the key names.

Telepathy-GLib exposes these maps as GHashTables of element type string->GValue, which actually works quite well in C. Telepathy-GLib registers specific GTypes with GLib via DBus-GLib and we have generic utility functions to unpack the GValues into native C types. For simple types (int, string, boolean, etc.) this also works really well in the bindings: GJS converts the GValue values in our map into native JS values and things are pretty neat. Unfortunately this does not work for complex container types (e.g. sa{sv}as); GLib doesn’t provide enough API to query the complete type information of the GValue’s contents, thus it’s not possible to introspect.

The GLib solution to this problem is GVariant, a way to store values that keeps all of the type information in an easily accessible way. If tp-glib exposed a{sv} maps as string->GVariant, we would easily be able to unpack the complete type in GJS or PyGI. GDBus uses GVariant for passing around types, but porting tp-glib to GDBus would take a lot of effort and radically break the API and ABI, so just isn’t feasible at this stage.

But there is a solution! Unsurprisingly, DBus-GLib has the information to recursively unpack a GValue registered with itself (it requires this in order to marshal those GValues into a D-Bus message), and it also exposes this API so we can use it for our own ends. Thus it becomes possible to convert these GValues to GVariants! In fact here is some sample code (Telepathy-GLib is not strictly required, it’s only being used here to provide test cases). I’m hoping this utility can be provided as part of DBus-GLib and then used to provide alternative API for tp-glib that can be exposed to the bindings.

To complete the circle, I wrote GVariant unpacking for GJS, but it needs reworking to only unpack when explicitly requested instead of implicitly always.

Update: bug #28715 provides a proposed patch to dbus-glib.

this is what goes around; and this.. this is what comes around

It used to be that no applications would compile for 64-bit architectures, because everyone was trying to cram pointers into ints. Today I had the opposite. The head of some code I'm working on wouldn't compile in a 32-bit environment, because someone was trying to store 5 bytes in a long.

We have truly come full circle.