Working on a personal itch-scratching project, I found myself wanting a sortable treeview. So I stuffed my treemodel into a TreeModelSort and set up my sort functions, and everything was happy. Except, when running my application, I kept getting strange segfaults within glib’s qsort implementation. I had set the following function as my sort_func (yes, it’s C++, but it should be fairly self-explanatory):
static int
compare_foo (const Gtk::TreeModel::iterator& a, const Gtk::TreeModel::iterator& b)
{
std::tr1::shared_ptr<Foo> l1, l2;
l1 = a->get_value (COLUMN_FOO);
l2 = b->get_value (COLUMN_FOO);
if (!l1)
return -1;
if (!l2)
return 1;
return l1->get_id () - l2->get_id ();
}
Can you spot the error? Running it under valgrind indicated that I was reading data from *before* the start of the array that was being sorted. In fact, the following lines of code in g_qsort_with_data() were causing tmp_ptr to be decreased back past the start of the array:
while ((*compare_func) ((void *) run_ptr, (void *) tmp_ptr, user_data) < 0)
tmp_ptr -= size;
Hmm. Why is there no guard to prevent tmp_ptr from being decreased past the start? Then I noticed the following comment up a few lines:
/* Find smallest element in first threshold and place it at the
array's beginning. This is the smallest array element,
and the operation speeds up insertion sort's inner loop. */
So at this point, the code is assuming that the very first element of the array is the smallest, so compare_func should always return >= 0 when it reaches the start of the array. Aha! So the problem is my compare_func. It turns out I forgot to handle the single case of both values being NULL.
l1 = a->get_value (COLUMN_FOO);
l2 = b->get_value (COLUMN_FOO);
+
+ if (!l1 && !l2)
+ return 0;
if (!l1)
return -1;
So there’s not really a big lesson to be learned here, but if you ever hit strange segfaults deep inside glib’s qsort while sorting a treemodel, do yourself a favor and double-check that your sort function is sane first. Also, do yourself a favor and run it under valgrind as soon as you get a strange segfault. Knowing the exact point of the invalid read is infinitely more helpful than waiting until that invalid read causes a segfault.
Thank you! This has haunted me for 3 years. Damned. That was a quick fix.
Wouldn’t it be even nicer to just do:
if (l1 == l2) return 0;
You’ll save yourself resolving some pointers and calling some functions both in the identity case (which is a superset of two NULLs).
Stupid touchpad, I meant “both resolving and …”
Sure, that might be a nicer solution. But that wasn’t really the point of the post
When you hit this issue once, you never forget again. In general you MUST ensure the following properties on your cmp function:
1) cmp(a,b)==-cmp(b,a)
2) cmp(a,a)==0
3) cmp(a,b)==0 && cmp(b,c)==0 => cmp(a,c)==0
4) cmp(a,b)<0 && cmp(b,c) cmp(a,c)<0
behdad
Thanks for the helpful list Behdad. I think something may have gotten eaten on #4 above due to the angle brackets ?
You’re right. Lets try again:
4) cmp(a,b)<0 && cmp(b,c)<0 => cmp(a,c)<0