Yesterday’s post about -fstack-protector challenges leads us to today’s post. If you haven’t read it, go back and read it first.
Basically, the workaround I had at the time was to just disable -fstack-protector for the get_type() functions. It certainly made things faster, but it was a compromise. The get_type() functions can have user-provided code inserted into them via macros like G_DEFINE_TYPE_EXTENDED() and friends, so disabling the protector there leaves that code unguarded.
A real solution should restore the hot path to its pre-stack-protector performance without sacrificing the protection gained by using it.
So I spent some time today making that happen. In bugzilla #795180 I’ve added some patches which break the get_type() functions into two: one containing the hot path, and a second that does the full type registration, which is protected by our g_once_init_enter(). If we add the magic __attribute__((noinline)) to the function doing the full type registration, it can’t be inlined into the fast path, allowing the fast path to pass the stack-protector sniff test (and therefore not incur the stack checking wrath).
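To make the split concrete, here is a rough sketch of the shape, not the actual patch. It assumes GCC or Clang for the attribute. The names foo_type_id, foo_get_type_once() and registration_calls are made up for illustration, and a plain C11 atomic stands in for g_once_init_enter(), so unlike the real thing it does not serialize two threads racing on the first call.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for GType; the real type comes from GLib. */
typedef size_t foo_type_id;

/* Counts how often the slow path runs (illustration only). */
static int registration_calls = 0;

/* Slow path: full "registration". noinline keeps this body (and any
 * user-provided code a macro would paste in here) out of the caller. */
__attribute__((noinline))
static foo_type_id
foo_get_type_once (void)
{
  /* A local buffer like this is the kind of thing that can make the
   * compiler add a stack check to whichever function contains it. */
  char name[32];
  foo_type_id id;

  snprintf (name, sizeof name, "Foo%d", 1);
  registration_calls++;

  id = 42;          /* pretend result of registering the type */
  assert (id != 0); /* 0 is reserved to mean "not registered yet" */
  return id;
}

/* Hot path: one atomic load and a branch, with no buffers of its own. */
foo_type_id
foo_get_type (void)
{
  /* Simplified guard; the real code uses g_once_init_enter(). */
  static _Atomic foo_type_id type_id = 0;
  foo_type_id id = atomic_load_explicit (&type_id, memory_order_acquire);

  if (id == 0)
    {
      id = foo_get_type_once ();
      atomic_store_explicit (&type_id, id, memory_order_release);
    }

  return id;
}
```

Calling foo_get_type() repeatedly runs the registration once and then stays on the short path. Whether the compiler really leaves the stack check out of the hot function depends on the protector flags in use, so it is worth checking the generated assembly.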
The best part is that applications and libraries only need to be recompiled to get the speedup (due to macro expansion).
Not a bad experience diving into some compiler bits this week.
Many of these calls to get_type() functions can be removed by building with -DG_DISABLE_CAST_CHECKS, at the cost of turning off valuable runtime assertions. Do you suggest doing so in production builds?
I’m aware that Debian uses this when compiling WebKit, for example, though Fedora does not.
That only helps cast-checks. I don’t think that gets used all that often except by people like me in my overly-zealous precondition checks (g_return_if_fail() situations). Those too can be disabled.
Where this matters far more is in stuff like FOO_GET_IFACE(self)->vfunc or gtk_widget_get_ancestor(self, FOO_TYPE_BAR), or literally anything that accesses information about an object, some of which I listed in the posts.