Bisect’ing outside of the box

There is a sort of thrill to a bug hunt once you dig your heels in deep enough. This is one of those stories.

Earlier in the week a fellow Red Hatter messaged me about a bug in Ptyxis where if you right click on a tab, the tab stays in the active state. Clearly not a very good look. My immediate thought was that we don’t do anything special there, so maybe the issue is lower in the stack.

A screenshot of Ptyxis with 4 terminal tabs open. Two of which appear "focused" due to an overzealous active state on the tab widget.

Oopsie!

Libadwaita?

The first layer beneath that is of course libadwaita and so naturally I’ll test Text Editor to see if I get the same behavior. Lo-and-behold it exists there. Cool, file an issue in libadwaita while I simultaneously see if I can fix it.

Nothing stands out too obvious, but I try a few things which make it slightly better but I can still get into this state with enough attempts. Bummer. Also, Alice thinks the real issue is in GTK because we’ve seen this in a number of places. It appears to be something wrong with active state tracking.

GTK?

No worries, I know that code-base well, so we hop on over there and see what is going on. First things first though, where does active state tracking happen? A quick git grep later we see it is largely in gtkmain.c in response to incoming GdkEvent.

GTK or GDK?

Having written a large portion of the macOS back-end for GTK 4, I know that events can be extremely platform specific. So lets cut this problem in half by determining if the issue really is in the GTK-side or GDK-side. So we run the test again with GDK_BACKEND=x11 ptyxis -s and see if we can reproduce.

The issue appears to go away, lovely! Although, let’s test it on macOS too just to be certain. And again, it does not reproduce there. So that must mean the issue lies somewhere in gdk/wayland/ or related.

Regressions?

Another important detail that Alice mentioned was that it seemed to be a regression. That would be great because if so, it would mean that we can bisect to find an answer. However, since both gdk/wayland/ and compositors move forward at a somewhat lock-step pace, that can be quite difficult to prove reliably enough for bisect.

So instead I move on to something a bit whacky. I have a CentOS 7 installation here that I use to test that Ptyxis crazy GLibc hacks work for ptyxis-agent. That has GNOME 3.28 which can run as a Wayland compositor if you install the appropriate session file. Thus I do, and what do you know, it has the same issue there. That was surely to hit the “old wayland protocol” code-paths which are a bit different than the newer extensions, which was what I had hoped to test.

Either way, very old Mutter still equals broken. Womp.

Weston Enters the Ring

We have a “reference compositor” in the form of Weston which can help shake out protocol issues (assuming the diligence to Weston correctness is upheld). So I try running Ptyxis on my local machine inside a nested Weston instance (which appears to be 13.0.0). Surprising, things actually work!

Okay, but I still have that CentOS 7 VM running, I wonder if it has Weston available? It does not. Bummer. But I also have an Alma Linux 9 VM around that I use for testing occasionally and that must have it. Let’s try there.

In fact it does, but it’s version 8.0.0. Let’s see if it works there too!

A screenshot of GNOME Boxes running Alma Linux 9 in a GNOME session with a nested Weston compositor containing Ptyxis with 2 of 3 tabs broken.

Bisecting Weston

So now we know Weston 8.0.0 is broken and 13.0.0 works. Now we have something I can bisect!

So I fire up meson locally and ensure that my local build fails when it is 8.0.0. Indeed it does. And 13.0.0 appears to work. Bisect commence!

We find the commit which “fixes” things from a Weston point of view to be “Set grab client before calling grab functions“.

That gives me enough information to start commenting out lines of code in Mutter like the uncivilized monkey I am. Eventually, Shakespeare has been written and the bug goes away. Though certainly not in an upstream-able fashion. Bugs filed, lines of code pointed to, and here I hope for our lovely Mutter maintainers to take over.

Back to GDK

Lots of amazing things have happened since we got Wayland compositors. One of the more undeniable difficulties though has been equal event ordering amongst them. It’s almost certainly the case that other compositors are doing things in slightly different ways and it may not even be possible to change that based on their use-case/designs.

So some sort of mitigation will be required on the GDK side.

After some interpretation of what is really going on here, a very small patch to drop certain leave events.

And that’s the story of how I bisected Weston to find an issue in Mutter (and GDK).

Manuals on Flathub

Manuals contains the documentation engine from Builder as a standalone application. Not only does it browse documentation organized by SDK but can install additional SDKs too. This is done using the same techniques Builder uses to manage your project SDKs.

It should feel very familiar if you’re already using the documentation tooling in Builder.

In the past, we would just parse all the *.devhelp2 files up-front when loading. GMarkupParseContext is fast enough that it isn’t too much overhead at start-up for a couple hundred files.

However, once you start dealing with SDKs and multiple versions of all these files the startup performance can take quite a hit. So Manuals indexes these files into SQLite using GOM and performs queries using that instead. It conveniently makes cross-referencing easy too so you can jump between SDK revisions for a particular piece of documentation.

Enjoy!

Red Hat Day of Learning

Occasionally at Red Hat we have a “Day of Learning” where we get to spend time learning about technology of our choice.

I spent some time listening to various AI explanations which were suggested readings for the day. Nothing too surprising but also not exactly engaging to me. Maybe that’s because I grew up with a statistics professor for a father.

So while that was playing I spent a little time learning how the GitLab API works. Immediately it stood out that one of the primary challenges in presenting UI for such an API would be in bridging GListModel to their implementation.

So I spent a little time on the architecture for how you might want to do that in the form of a Gitlab-GLib library.

Pagination

There are essentially two modes of pagination supported by GitLab depending on the result set size. Both methods use HTTP headers to denote information about the result set.

Importantly, any design would want to have an indirection object (and in this case GitlabListItem) which can have it’s resource dynamically loaded.

Resources (such as a GitlabProject, or GitlabIssue) are “views” into the JSON result set. They are only used for reading, not for mutating or creating resources on the API server.

Offset-based Pagination

This form is somewhat handy because you can know the entire result-set size up front. Then you can back-fill entries as accessed by the user using bucketed lazy-loading.

When the bucketed page loads, the indirection objects are supplied with their resource object which provides a typed API over the JsonNode backing it.

This is the “ideal” form from the consuming standpoint but can put a great deal of load on the GitLab server instance.

Progressive Pagination

The other form of pagination lets you fetch the next page after each subsequent request. The header provides the next page number.

One might expect that you could use this to still jump around to the appropriate page. However if you are not provided the “number of pages” header from the server then there is not much you can do to clamp your page range.

Conclusion

This was a fun little side-project to get to know some of the inner workings of the API backing what those of us in GNOME use every day. I have no idea if anything will come of it, but it certainly could be useful from Builder if anyone has time to run with it.

For example, imagine having access to common GitLab operations from the header bar.

A screenshot of GNOME Builder with a new GtkMenuButton in the headerbar containing a GitLab icon and menu.