Working on a personal itch-scratching project, I found myself wanting a sortable treeview. So I stuffed my treemodel into a TreeModelSort and set up my sort functions, and everything was happy. Except, when running my application, I kept getting strange segfaults within glib’s qsort implementation. I had set the following function as my sort_func (yes, it’s C++, but it should be fairly self-explanatory):
static int
compare_foo (const Gtk::TreeModel::iterator& a, const Gtk::TreeModel::iterator& b)
{
std::tr1::shared_ptr<Foo> l1, l2;
l1 = a->get_value (COLUMN_FOO);
l2 = b->get_value (COLUMN_FOO);
if (!l1)
return -1;
if (!l2)
return 1;
return l1->get_id () - l2->get_id ();
}
Can you spot the error? Running it under valgrind indicated that I was reading data from *before* the start of the array that was being sorted. In fact, the following lines of code in g_qsort_with_data() were causing tmp_ptr to be decreased back past the start of the array:
while ((*compare_func) ((void *) run_ptr, (void *) tmp_ptr, user_data) < 0)
tmp_ptr -= size;
Hmm. Why is there no guard to prevent tmp_ptr from being decreased past the start? Then I noticed the following comment up a few lines:
/* Find smallest element in first threshold and place it at the
array's beginning. This is the smallest array element,
and the operation speeds up insertion sort's inner loop. */
So at this point, the code is assuming that the very first element of the array is the smallest, so compare_func should always return >= 0 when it reaches the start of the array. Aha! So the problem is my compare_func. It turns out I forgot to handle the single case of both values being NULL.
l1 = a->get_value (COLUMN_FOO);
l2 = b->get_value (COLUMN_FOO);
+
+ if (!l1 && !l2)
+ return 0;
if (!l1)
return -1;
So there’s not really a big lesson to be learned here, but if you ever hit strange segfaults deep inside glib’s qsort while sorting a treemodel, do yourself a favor and double-check that your sort function is sane first. Also, do yourself a favor and run it under valgrind as soon as you get a strange segfault. Knowing the exact point of the invalid read is infinitely more helpful than waiting until that invalid read causes a segfault.
Thank you! This has haunted me for 3 years. Damned. That was a quick fix.
Wouldn’t it be even nicer to just do:
if (l1 == l2) return 0;
You’ll save yourself resolving some pointers and calling some functions both in the identity case (which is a superset of two NULLs).
Stupid touchpad, I meant “both resolving and …” 😉
Sure, that might be a nicer solution. But that wasn’t really the point of the post 😉
When you hit this issue once, you never forget again. In general you MUST ensure the following properties on your cmp function:
1) cmp(a,b)==-cmp(b,a)
2) cmp(a,a)==0
3) cmp(a,b)==0 && cmp(b,c)==0 => cmp(a,c)==0
4) cmp(a,b)<0 && cmp(b,c) cmp(a,c)<0
behdad
Thanks for the helpful list Behdad. I think something may have gotten eaten on #4 above due to the angle brackets ?
You’re right. Lets try again:
4) cmp(a,b)<0 && cmp(b,c)<0 => cmp(a,c)<0