More Sysprof’ing

GWeather

Last time I wrote, we talked about a new search index for libgweather. In the end I decided to take another route so that we can improve application performance without any changes. Instead, I added a kd-tree to do nearest neighbor search when deserializing GWeatherLocation. That code path was looking for the nearest city from a latitude/longitude in degrees.

The merge request indexes some 10,000 lat/lon points in radians at startup into a kd-tree. When deserializing, it can find the nearest city without the need for a linear scan. Maybe this is enough to allow significantly more data into the database someday so my small hometown can be represented.
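As a rough sketch of the idea (not the code from the merge request), a nearest-city lookup over a 2-D kd-tree could look like the following. The names `KdPoint`, `kd_build()`, and `kd_nearest()` are hypothetical, and the distance is plain Euclidean distance in the lat/lon plane, which ignores longitude wrap-around and distortion near the poles.

```c
#include <assert.h>
#include <float.h>
#include <stdlib.h>

#define DEG_TO_RAD (3.14159265358979323846 / 180.0)

typedef struct {
  double coord[2]; /* latitude, longitude in radians */
  int city_id;     /* hypothetical identifier for the city */
} KdPoint;

static int kd_sort_axis; /* axis read by kd_compare() during qsort() */

static int kd_compare(const void *a, const void *b) {
  double x = ((const KdPoint *)a)->coord[kd_sort_axis];
  double y = ((const KdPoint *)b)->coord[kd_sort_axis];
  return (x > y) - (x < y);
}

static KdPoint kd_point_from_degrees(double lat, double lon, int city_id) {
  KdPoint p = {{lat * DEG_TO_RAD, lon * DEG_TO_RAD}, city_id};
  return p;
}

/* Arrange pts[0..n) as an implicit kd-tree: the median of every
 * range is the node, and the two halves are its subtrees. */
static void kd_build(KdPoint *pts, size_t n, int depth) {
  if (n < 2)
    return;
  kd_sort_axis = depth % 2;
  qsort(pts, n, sizeof *pts, kd_compare);
  kd_build(pts, n / 2, depth + 1);
  kd_build(pts + n / 2 + 1, n - n / 2 - 1, depth + 1);
}

static double kd_dist2(const KdPoint *a, const KdPoint *b) {
  double dlat = a->coord[0] - b->coord[0];
  double dlon = a->coord[1] - b->coord[1];
  return dlat * dlat + dlon * dlon;
}

static void kd_search(const KdPoint *pts, size_t n, int depth,
                      const KdPoint *query, const KdPoint **best,
                      double *best_d2) {
  if (n == 0)
    return;
  size_t mid = n / 2;
  double d2 = kd_dist2(&pts[mid], query);
  if (d2 < *best_d2) {
    *best_d2 = d2;
    *best = &pts[mid];
  }
  double delta = query->coord[depth % 2] - pts[mid].coord[depth % 2];
  const KdPoint *left = pts, *right = pts + mid + 1;
  size_t n_left = mid, n_right = n - mid - 1;
  /* Descend into the half containing the query first. */
  if (delta < 0)
    kd_search(left, n_left, depth + 1, query, best, best_d2);
  else
    kd_search(right, n_right, depth + 1, query, best, best_d2);
  /* Only cross the splitting line if a closer point could be there. */
  if (delta * delta < *best_d2) {
    if (delta < 0)
      kd_search(right, n_right, depth + 1, query, best, best_d2);
    else
      kd_search(left, n_left, depth + 1, query, best, best_d2);
  }
}

/* The query arrives in degrees while the index is stored in radians. */
static int kd_nearest(const KdPoint *pts, size_t n, double lat, double lon) {
  assert(n > 0);
  KdPoint query = kd_point_from_degrees(lat, lon, -1);
  const KdPoint *best = NULL;
  double best_d2 = DBL_MAX;
  kd_search(pts, n, 0, &query, &best, &best_d2);
  return best->city_id;
}
```

The tree is built once and each lookup then prunes most of the points, which is what removes the linear scan per deserialized location.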

Nautilus

I found a peculiarity in that I was seeing a lot of gtk_init() calls while profiling search. That means processes are being spawned. Since I have D-Bus session capture in Sysprof now, I was able to find this being caused by Nautilus sending an org.freedesktop.DBus.Peer.Ping() RPC to kgx and gnome-disks.

Seems like a reasonable way to find out if a program exists, but it does result in those applications being spawned. gtk_init() can take about 2% CPU alone and combined this is now 4-5% that is unnecessary. Nautilus itself needing to initialize GTK on startup plus these two as a casualty puts us combined over 6%.

D-Bus has an org.freedesktop.DBus.ListActivatableNames() RPC which will tell you what names are activatable. Same outcome, less work.

Corey, like a great maintainer, has already jumped into action.

Photos

I was hesitant to look at gnome-photos because I think it’s falling out of core this next cycle. But since it’s on my system now, I want to take a look.

First off, I was seeing about 10 instances of gnome-photos in a single Sysprof capture. That means that the process was spawned 10 times in response to search queries. In other words, it either must have crashed or exited after each search request I typed into GNOME Shell. Thankfully, it was the latter. But since we know gtk_init() alone is about 2%, and combined this looks like it’s in the 30-40% range, there must be more to it.

So, easy fix. Call g_application_set_inactivity_timeout() with something reasonable. The search provider was already doing the right thing in calling g_application_hold() and g_application_release() to extend the process lifetime. But without inactivity-timeout set, it doesn’t help much.

After that, we’re down to just one instance. Cool.

Next capture we see we’re still at a few percent, which means something beyond just gtk_init() is getting called. Looks like it’s spending a bunch of time in gegl_init(). Surely we don’t need GEGL to provide search results (which come from tracker anyway), so make a quick patch to defer that until the first window is created. That generally won’t happen when just doing Shell queries, so it disappears from profiles now.

Calculator

Rarely do I have gnome-calculator actually running when performing a Shell search despite using it a lot. That means it too needs to be spawned to perform the search.

It’s already doing the right thing in having a dedicated gnome-calculator-search-provider binary that is separate from the application (so you can reduce start-up time and memory usage) so why is it showing up on profiles? Looks like it’s initializing GTK even though that isn’t used at all in providing search results. Probably a vestige of yesteryear.

Easy fix by just removing gtk_init(). Save time connecting to the display server, setting up seats, icon themes, and most importantly, parsing unused CSS.

Another couple percent saved.

Measure, Measure, Measure

Anyway, in conclusion, I might leave you with this tidbit. Nobody gets code right on the first try, especially me. If you don’t take a look and observe it, my guess is that it looks a lot different at run-time than it does in your head.

Sysprof is my attempt to make that a painless process.

Writing Fast Search

The problem we encountered in my last writing was that gnome-clocks was taking about 300 milliseconds to complete a basic search query. I guess the idea is that if you type “paris” into GNOME Shell you’ll get the time in either Paris, France or one of the cities named Paris in the United States. I guess 300 milliseconds wouldn’t be so bad if it didn’t also consume 100% of the CPU during that time.

Thankfully in my career I’ve had plenty of opportunity to work with database search indexes. So I have some practical experience in making that stuff fast(er).

So this morning I put together a small search index which can be generated from the Locations.bin using the libgweather API. That search index contains the serialized document form and a series of trigrams for the GWeatherLocation textual representation. That search index is meant to be static and installed alongside Locations.bin.

Then for search, you take your term list and generate another series of trigrams. The SearchIndex provides iterators for each of those trigrams to find documents which contain it. So if you line those up with a sorted document list you can create an O(n*m) worst case iterator across potentially matching documents. In practice you look at a very small subset of the corpus.
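To make the trigram step concrete, here is a minimal sketch under stated assumptions: it is not the SearchIndex code itself, the function names are hypothetical, and each trigram’s posting list is assumed to be a sorted array of document ids. Intersecting two sorted lists keeps only documents that contain both trigrams.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pack three bytes, lowercased, into a single integer key. */
static uint32_t trigram_key(const char *s) {
  return ((uint32_t)tolower((unsigned char)s[0]) << 16) |
         ((uint32_t)tolower((unsigned char)s[1]) << 8) |
         (uint32_t)tolower((unsigned char)s[2]);
}

/* Write every overlapping trigram of `term` into `out` (at most `max`). */
static size_t collect_trigrams(const char *term, uint32_t *out, size_t max) {
  size_t len, n = 0;
  assert(term != NULL);
  len = strlen(term);
  for (size_t i = 0; i + 3 <= len && n < max; i++)
    out[n++] = trigram_key(term + i);
  return n;
}

/* Walk two sorted document-id lists in lockstep and keep the ids
 * present in both. `out` needs room for min(na, nb) entries. */
static size_t intersect_sorted(const uint32_t *a, size_t na,
                               const uint32_t *b, size_t nb,
                               uint32_t *out) {
  size_t i = 0, j = 0, n = 0;
  while (i < na && j < nb) {
    if (a[i] < b[j])
      i++;
    else if (a[i] > b[j])
      j++;
    else {
      out[n++] = a[i];
      i++;
      j++;
    }
  }
  return n;
}
```

The documents that survive the intersection are only candidates; the full term matching still runs on them, just over far fewer entries.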

As you iterate through those, you do your full termlist matching as you would have previously. Except instead of looking at thousands of entries, you look at just a few.

Long story short, you can go from 100% CPU for 300 milliseconds repeatedly to about 10 milliseconds and it keeps getting faster the more you type.

Once again, without tools like Sysprof and distributions with courage to enable frame-pointers like GNOME OS and Fedora, finding this stuff can be quite nebulous.

How to use Sysprof (again)

Every once in a while I take a moment to test GNOME OS on physical hardware.

The experience today was quite a bit underwhelming. Fresh install, type a few characters into the search box, and things grind to a halt.

Being the system profiler author I am, where would I consider spending time to make this better? Here ya go, and please do help because I can make the tools but I need people like you to help go resolve the problems they uncover.

I had to build Sysprof from source quick on GNOME OS until new GNOME OS builds are out (soon).

$ sysprof-cli --session-bus --gnome-shell capture.syscap 
$ sysprof ./capture.syscap

An overview of time spent in various processes

Interesting, a couple systemd-coredump processes busy doing zstd compression on Nautilus crashes (in search providers). Issue filed.

Next up, gnome-software clocking in at 23% CPU (and remember, we’re competing against multiple zstd compressors for CPU time) which is busy doing appstream search for Flatpaks. Seems a bit high for something which is pre-compiled into a binary format and mmap()d at runtime to reduce CPU and memory overhead. Issue filed.

A screenshot of gnome-clocks search provider busy in libgweather deserialization.

Next is gnome-clocks at a whopping 15% to show me the time in cities near to whatever I type which is obviously “Riga” given GUADEC. Again, that’s 15% while competing with multiple zstd so in reality it’d be even more. Appears to be busy in libgweather doing deserialization, but specifically in finding the nearest city to a lat/lon position. A quick look at the code shows that this is probably one of the most expensive operations you can do and it’s done for every object deserialized. Probably could use some flags to avoid that from a search provider. Issue filed.

A screenshot of gnome-characters search provider taking 10% of system time in filter_keywords

Lastly in our top-offenders list is gnome-characters search provider. It’s clocking in at roughly 10% of system time (again, would be more if not for zstd) filtering keywords and getting character names. Considering we’re only showing up to maybe 3 of these results that seems significantly high. Issue filed.

So I implore my readers to go and make things fast.

Additionally, to be a good citizen myself, I put together an MR that makes search in Characters much, much faster.

And some fixes to make libxmlb faster (Software) here and here.

Sysprof 45

Unfortunately I couldn’t be at GUADEC this year, but that won’t stop me from demoing new things!

I’ve been doing a lot of work on Sysprof now that we have semi-reliable frame unwinding on Fedora, Silverblue and GNOME OS. When I have tooling that works on the OS it makes it a lot easier to build profilers and make them useful.

Additionally, we’re at a good point in GTK 4 where you can do really powerful things if you design your data models correctly. So this cycle I’ve spent time redesigning how we record and process our captured data.

There is certainly more work to be done, but the big strokes of the new design are in place. It could really use the benefit of another person joining in to help polish various bits of the apps like scales and legends.

For 45 I decided to remove the tabbed interface and Builder will now just open captures with Sysprof directly. It’s too cumbersome to try to shove all this information into a single view widget just so I can embed it in Builder.

Greeter

The first thing you’ll see is a new greeter. It still has a bit more to finish but my primary goal was to elevate how things work. That was something lacking with just icons like we had previously.

A screenshot of the window that displays when you start Sysprof 45

You’ll also notice you can capture either to disk or to memory. Depending on your situation that may be of use. For example, if you’re testing under memory pressure, creating an unbounded memfd may not be what you want. Instead you can capture to disk and the capture will periodically flush when the buffer is full.

Recording Pad

While recording, Sysprof now creates a much smaller recording pad that you can use to stop the recording. The goal here is to further reduce overhead created by Sysprof itself. It still updates once per second to give you an idea of how many data frames have been recorded to the capture.

A screenshot showing a small dialog that appears while recording to minimize rendering overhead.

Exploring Captures

After capturing your system, you’ll be presented with a window to explore the capture.

A screenshot showing a window to explore captured data. It has categories along the left sidebar with a chart showing stack depth above a traditional callgraph display.

Things were getting pretty cramped before, so the new sections in the sidebar make it easier for us to put related information together in a way that is understandable.

I tried very hard to keep the callgraph in the three-section format we’ve used for many years. However, it has a nice filter now on the functions list thanks to GtkFilterListModel making it so easy.

Selecting Time Spans

Many parts of the window will automatically filter themselves based on the selected time span. Use the charts at the top of the window to select time ranges that are interesting. You can use the controls in the sidebar to navigate the capture as well.

You can click the + icon within the selection to zoom into that range.

A screenshot showing a time span selected with a filtered callgraph only containing stack traces from that time range.

Callgraph Options

There are a number of new callgraph options you can toggle.

  • Categorized Frames
  • Hide System Libraries
  • Include Threads
  • Bottom Up

A menu showing options for the callgraph.

They are all pretty standard things in a profiler so I don’t need to dwell on them much. But having a “Bottom Up” option means we have some help when you run into truncated stack traces and still want to get an idea of what’s going on by function fragments. The new “Include Threads” option lets you break up your callgraph by one more level, the thread that was running.

Categorized Stack Traces

While I was working on this I had to add a few things I’ve wanted for a while. One such thing was a utility sidebar that can be shown with additional information relative to the current selection. In this case, you can expand the callgraph and see a list of all the stack traces that contributed to that callgraph frame showing up in the capture. Additionally, we can categorize stack traces based on the libraries and functions contributing to them to give you a high-level overview of where time is being spent.

A screenshot showing the utility sidebar on the right of the callgraph with the ability to select and view stacktraces one-by-one and a categorization breakdown of recorded stacktraces such as Kernel, Memory Allocations, Paint, Layout, and more.

Logs View

When spawning an application from Sysprof it can write logs by integrating with libsysprof-capture-4.a. That’s not new but what is new is that Sysprof now has a journald collector which can be interposed in your capture.

A screenshot showing logs from Builder and journald side-by-side, captured as part of the system capture.

Marks

Marks have gone through substantial work to be more useful.

A mark is just a data frame in the capture that has a time and duration associated with a category, name, and optional message. These are used by GNOME Shell to annotate what is happening in the compositor as well as by GTK to denote what is happening during the frame cycles. Furthermore, GLib has optional Sysprof support which can annotate your main loop cycles so you can see why applications are waking up and for how long.

Marks Chart

The first new view we have for this is the “Mark Chart”. It contains a breakdown of the selected time span by category and name. The X axis is of course time.

A screenshot showing a chart of marks and their durations in a convenient and compact display.

Marks Table

Sysprof now has a long-requested mark table.

A screenshot containing a list of marks in a table which contains time, cpu, duration, and more all of which can be sorted.

Sometimes it’s easier to look at data in a more raw form, especially since you can sort by column and dive into what you care about. It doesn’t hurt that this is much more accessibility friendly too.

Marks Waterfall

We still have the old waterfall-style display as well so you can see how things naturally depend on one another.

A screenshot of marks in order of time and duration which naturally shows dependency graphs.

You can double click on these waterfall entries and the visible time region will update to match that item’s duration.

Marks Summary

It was a bit hidden before, but we still have a mark summary. Although I’ve beefed it up a bit and provide median values in addition to mean. These are also sortable like the other tables you’ll find in Sysprof.

A screenshot showing the breakdown of marks and their min, max, mean, and median durations.

Processes

We now give you a bit more insight into the processes we discovered running during your capture. The new Processes section shows you a timeline of the processes that ran.

A timeline of processes that were run and their durations and command line arguments.

Additionally there is a table view, again more accessible and sometimes easier to read, sort, and analyze. If you double click a row you’ll get additional information on that process such as the address layout, mounts, and thread information we have.

This is all information that Sysprof collects to be able to do its job as a profiler and we might as well make that available to you too.

A screenshot showing the table of process information and the additional information on a single process including Address Layout.

D-Bus Messages

You can record D-Bus messages on your session or system bus now. We may end up needing to tweak how we get access to the system bus so that you are more certain to have privileges beyond just listening from your read socket.

There are no fancy viewers like Bustle yet, but you do have a table of messages. Someone could use this as a basis to connect the reply message with the send message so that you can draw proper message durations in a chart.

A screenshot containing a table of D-Bus messages that were recorded from the session bus.

Counters

Counters have been broken up a bit more so that we can expand on them going forward. Different sections have different additional data to view. For example the CPU section will give you the CPU breakdown we recorded such as processor model and what CPU id maps to what core.

I find it strange that my Xeon skips core 6 and 7.

A visual breakdown of CPU information.

There are all the same counters we had previously for CPU, Energy (RAPL), Battery Charge, Disk I/O, Network I/O, and GTK counters such as FPS.

A screenshot of the Graphics counters including FPS and GTK GL renderer specific information.

Files

Sysprof supports embedding files in chunks within the *.syscap file. The SysprofDocument exports a GListModel of those which can be reconstructed at will. Since we needed that support to be able to model process namespaces, we might as well give the user insight too. Lots of valuable information is stored here, typically compressed, although Sysprof will transparently decompress it for you.

This will hopefully speed up maintainers’ ability to get necessary system information without back-and-forths with someone filing an issue.

A screenshot showing the list of files embedded in the system capture, and a window displaying the contents of the /etc/os-release file.

Metadata

A metadata frame is just a key/value pair that you can embed into capture files. Sysprof uses them to store various information about the capture for quick reference later. Since we’re capturing information about a user’s system, we want to put them in control of knowing what is in that capture. But again, this is generally system statistics that help us track down issues without back-and-forths.

A screenshot containing a table of metadata such as the display environment variable, system memory usage, and the command line arguments used to spawn a profiled application.

Symbolizing

The symbolizing phase of Sysprof has also been redesigned. Effectively handling the changes in how systems are built now, compared to when Sysprof was last revamped, requires quite a bit of hand-waving. We have containers with multiple and sometimes overlapping storage technologies, varying file-systems used for the operating system (including those with subvolumes which might not match those a process sees), chroots, and ostrees.

Making things mostly work across the number of systems I have at my fingertips to test with required quite a bit of iterative tweaking. The end result is that we basically try to model the mount namespace of the target process and the mount namespace of the host and cross-correlate to get a best guess at where to resolve the library path. At that point, we can try to resolve additional paths so that looking at .gnu_debuglink still results in something close to correct.

We also give you more data in the callgraph now, so if you do get an inode mismatch or otherwise unresolvable symbol, you at least get an offset within the .text section of the ELF you can manually disassemble in your debugger. Few people will likely do this, but I’ve had to a number of times.

To make that stuff fast, Sysprof has a new symbol cache. It is the combination of an augmented Red-Black tree with address ranges (so an interval tree). It’s maintained per-process and can significantly reduce decoding overhead.
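The lookup a symbol cache has to answer is “which symbol covers this address?”. The sketch below is not Sysprof’s interval tree. It swaps in a simpler technique, a binary search over a sorted array, and assumes the ranges do not overlap. `SymbolRange` and `symbol_lookup()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
  uint64_t begin;   /* first address covered by the symbol */
  uint64_t end;     /* one past the last covered address */
  const char *name; /* resolved symbol name */
} SymbolRange;

/* Binary search over ranges sorted by `begin` and assumed not to
 * overlap. Returns NULL when no cached range contains `addr`. */
static const SymbolRange *symbol_lookup(const SymbolRange *ranges, size_t n,
                                        uint64_t addr) {
  size_t lo = 0, hi = n;
  assert(ranges != NULL || n == 0);
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (addr < ranges[mid].begin)
      hi = mid;
    else if (addr >= ranges[mid].end)
      lo = mid + 1;
    else
      return &ranges[mid];
  }
  return NULL;
}
```

A tree keeps the same logarithmic lookup while also allowing cheap insertion as new symbols are decoded, which a sorted array does not.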

PERF_EVENT_MMAP2 and build_id

Sysprof now records mmap2 records from Perf while also requesting build_id for executable pages. The goal here is that we would be able to use the build_id to resolve symbols rather than all the process mount namespace and .gnu_debuglink madness. In practice, I haven’t had too much success getting these values but in time I assume that would allow for symbolizing with tools such as debuginfod.

Writing your own Profiler

You can always write your own profiler using libsysprof and get exactly what you want. The API is significantly reduced and cleaned up for GNOME 45.

SysprofProfiler *profiler = sysprof_profiler_new ();
SysprofCaptureWriter *writer = sysprof_capture_writer_new ("capture.syscap", 0);

sysprof_profiler_add_instrument (profiler, sysprof_sampler_new ());
sysprof_profiler_add_instrument (profiler, sysprof_network_usage_new ());
sysprof_profiler_add_instrument (profiler, sysprof_disk_usage_new ());
sysprof_profiler_add_instrument (profiler, sysprof_energy_usage_new ());
sysprof_profiler_add_instrument (profiler, sysprof_power_profile_new ("performance"));

/* If you want to symbolize at end of capture and attach to the capture,
 * use this. It makes your capture more portable for sharing.
 */
sysprof_profiler_add_instrument (profiler, sysprof_symbols_bundle_new ());

sysprof_profiler_record_async (profiler, writer, record_cb, NULL, NULL);

You get the idea.

Writing your own Analyzer

You can also use libsysprof to analyze an existing capture.

GError *error = NULL;
SysprofDocumentLoader *loader = sysprof_document_loader_new ("capture.syscap");

/* there is a sensible default symbolizer, but you can even disable it if you
 * know you just want to look at marks/counters/etc.
 */
sysprof_document_loader_set_symbolizer (loader, sysprof_no_symbolizer_get ());

SysprofDocument *document = sysprof_document_loader_load (loader, NULL, &error);

GListModel *counters = sysprof_document_list_counters (document);
GListModel *samples = sysprof_document_list_samples (document);
GListModel *marks = sysprof_document_list_marks (document);

This stuff is all generally fast because at load time we’ve indexed the whole thing into low-cardinality indexes that can be intersected. The SysprofDocument itself is also a GListModel of every data frame in the capture which makes for fun data-binding opportunities.

Thanks for reading and happy performance hacking!

GListModel as a file format interface

One of the things I’ve done this cycle leading up to GNOME 45 is some rework on how we process *.syscap files. In particular, I wanted to really push the GListModel interface in GTK 4.

That is a tall order at first sight because Sysprof capture files easily have hundreds of thousands of data frames. To create an object for each would be an enormous amount of overhead.

However, GListModel allows you to create objects on demand, which means you only need to create them as necessary.

But that becomes a bit more difficult once you need to segment those data frames like a database.

For example, one data frame type is samples (e.g. a stacktrace). Those need to be processed very differently than counter values. Since these documents are read-only, we can do a bunch of fun performance hacks!

Sysprof, in what will land very soon, uses a mmap() file for the underlying document. That document may be in a non-native endian format, so it has a bunch of helpers to deal with that so you can keep the memory map read-only. The document then exports a GListModel of data frames.
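A helper of that kind might look roughly like this. It is a sketch, not Sysprof’s implementation: `read_u32()` and its `needs_swap` flag are hypothetical, and the point is only that the value is copied out and swapped in a local variable so the mapping itself is never written to.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v) {
  return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
         ((v & 0x00ff0000u) >> 8) | ((v & 0xff000000u) >> 24);
}

/* Read a 32-bit field at `offset` from a read-only mapping of `len`
 * bytes. The copy also avoids unaligned access; the swap only touches
 * the local value, never the mapped memory. */
static uint32_t read_u32(const uint8_t *map, size_t len, size_t offset,
                         int needs_swap) {
  uint32_t v;
  assert(offset + sizeof v <= len);
  memcpy(&v, map + offset, sizeof v);
  return needs_swap ? swap32(v) : v;
}
```

Whether a swap is needed would be decided once per document, for example from a byte-order marker in the file header, and then passed to every field read.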

But let’s say you want to have a SysprofDocument:samples property that is also a GListModel? You probably don’t want to filter a few hundred thousand objects (each needing to be inflated) just to create that index.

Sysprof gets around this by doing a single “full-table-scan” at startup and indexing the memory offsets of all the data frames. From there, we create roaring bitmaps of all the sorts of indexes we need at runtime. Then you can take that document (a GListModel) plus a roaring bitmap to create a new indexed GListModel. Position 0 in that new model will map to position N in the document, based on the roaring bitmap index. Very handy.
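The position mapping can be illustrated with a plain array of 64-bit words standing in for a roaring bitmap. This is a simplification for illustration and `FrameIndex` is a hypothetical name. Item `position` of the indexed model is the position of the n-th set bit in the document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
  const uint64_t *words; /* bit i set: document frame i is in the index */
  size_t n_words;
} FrameIndex;

static unsigned popcount64(uint64_t w) {
  unsigned count = 0;
  for (; w != 0; w &= w - 1) /* clear the lowest set bit */
    count++;
  return count;
}

/* Number of items the indexed model would report. */
static size_t frame_index_n_items(const FrameIndex *index) {
  size_t n = 0;
  assert(index != NULL);
  for (size_t w = 0; w < index->n_words; w++)
    n += popcount64(index->words[w]);
  return n;
}

/* Map `position` in the indexed model to a position in the document.
 * Returns SIZE_MAX when `position` is out of range. */
static size_t frame_index_get_item(const FrameIndex *index, size_t position) {
  assert(index != NULL);
  for (size_t w = 0; w < index->n_words; w++) {
    uint64_t word = index->words[w];
    unsigned count = popcount64(word);
    unsigned bit = 0;
    if (position >= count) {
      position -= count; /* target lies in a later word */
      continue;
    }
    for (; position > 0; position--)
      word &= word - 1; /* skip the set bits before the target */
    while ((word & 1) == 0) {
      word >>= 1;
      bit++;
    }
    return w * 64 + bit;
  }
  return SIZE_MAX;
}
```

With this shape, the indexed model never stores objects of its own. It only translates positions and asks the document to inflate the frame on demand.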

So now you can have all sorts of properties on that document like samples, memory allocations, counters, logs, marks, files, metadata, and more. This makes data binding to GtkBuilder templates very easy and natural.

To take it to the next level though, you have essentially a table with indexes. And where those become powerful is through the use of index intersection.

So now Sysprof will even index which symbols show up in which stack traces. That means you can very quickly find stack traces which have a prefix/suffix or even just contain any number of specific functions by intersecting the indexes, which itself yields a new index.
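In bitmap terms, intersecting two indexes is a bitwise AND. The sketch below again uses plain 64-bit words rather than roaring bitmaps, and `index_intersect()` is a hypothetical helper. Each bit stands for one stack trace and each input bitmap marks the traces containing one symbol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compute out = a AND b over `n_words` words and return how many
 * bits remain set, i.e. how many stack traces are in both indexes. */
static size_t index_intersect(const uint64_t *a, const uint64_t *b,
                              uint64_t *out, size_t n_words) {
  size_t matches = 0;
  assert(a != NULL && b != NULL && out != NULL);
  for (size_t i = 0; i < n_words; i++) {
    out[i] = a[i] & b[i];
    for (uint64_t w = out[i]; w != 0; w &= w - 1)
      matches++;
  }
  return matches;
}
```

The result has the same shape as its inputs, so it can be intersected again with a third symbol’s index or handed to whatever maps index positions back to document positions.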

Anyway, GListModel all the way down seems to be working out a whole lot better than I anticipated, and this will probably change how I write applications going forward.

Spellchecking for GTK 4

Apparently, spellchecking was preventing some people from porting their applications to GTK 4. So I spent a little time today extracting Text Editor’s spellcheck engine into a library you can use in your GTK 4 application without having to write fun data-structures on your own.

It’s slightly different since I have to avoid putting code in subclasses of GtkTextView and GtkTextBuffer, but it should work nonetheless.

It does benefit from GtkSourceView from git though, as I added a new function to avoid doing work while buffers are loading (as Text Editor does).

https://gitlab.gnome.org/chergert/libspelling/

GJS plugins in GNOME Builder

As I mentioned in my last post, Builder has switched to GJS as its dynamic language for plugins. We already support a number of compiled languages including C, C++, Rust, and Vala.

Previously we had used PyGObject. Due to the lack of GTypeInstance support in PyGObject, that isn’t an option currently. I already ported all of Builder’s plugins written in Python to C over the course of a week last summer. That ended up both making things more stable and allowing us to ship the GTK 4 port on time.

This past year I wrote a new async/futures framework for GLib called libdex which provides Fibers, Futures, Channels, await, threadpools, io_uring support, and more. That tool heavily uses the same GTypeInstance features that GTK 4 uses.

GJS has improved a lot over the years due to how it is being maintained and its importance in the GNOME Shell stack. I’d like to double down on that so Builder can benefit from their hard work. Therefore, if you want to write plugins in JavaScript and maintain them upstream, that’s something I’m happy to see happen.

You can see some examples for how to write a JavaScript plugin for Builder in the examples directory.

GJS plugins for libpeas-2.0

One of the main features I want to land for the libpeas-2.0 ABI break is support for plugins in JavaScript.

With the right set of patches, you can get that. Thanks to Philip Chimento, GJS will hopefully soon land support for running code in a SpiderMonkey realm. Philip also did us a solid and wrote the code to exfiltrate enough GType information from an imported JavaScript module. That allows libpeas to correlate which GTypes are provided by a plugin.

With the GJS realm support in place, we can land the new GJS loader for libpeas-2.0.

My personal goal for this is to enable JavaScript-based plugins in GNOME Builder. With how much GJS has improved over the years to support GNOME Shell, it is probably our most-maintained language binding for a dynamic language with modern JIT features.

For example, if you wanted to make an addin in Builder which responded to changes of a file within the editor, you might write something like this as your plugin. Keep in mind I’m not a JavaScript developer and GJS developers may tell you there are fancy new language features you can use to simplify this code further.

import GObject from 'gi://GObject';
import Ide from 'gi://Ide';

export var TestBufferAddin = GObject.registerClass({
    Implements: [Ide.BufferAddin],
}, class TestBufferAddin extends GObject.Object {

    vfunc_language_set(buffer, language_id) {
        print('language set to', language_id);
    }

    vfunc_file_loaded(buffer, file) {
        print(file.get_uri(), 'loaded');
    }

    vfunc_save_file(buffer, file) {
        print('before saving buffer to', file.get_uri());
    }

    vfunc_file_saved(buffer, file) {
        print('after buffer saved to', file.get_uri());
    }

    vfunc_change_settled(buffer) {
        print('spurious changes have settled');
    }

    vfunc_load(buffer) {
        print('load buffer addin');
    }

    vfunc_unload(buffer) {
        print('unload buffer addin');
    }

    vfunc_style_scheme_changed(buffer) {
        let scheme = buffer.get_style_scheme();
        print('style scheme changed to', scheme ? scheme.get_id() : scheme);
    }
});

You can easily correlate that to the IdeBufferAddin interface definition.

libpeas-2

Now that GNOME 44 is out the door, I took some time to do a bunch of the refactoring I’ve wanted in libpeas for quite some time. For those not in the know, libpeas is the plugin engine behind applications like Gedit and Builder.

This does include an ABI break but libpeas-1.0 and libpeas-2 can be installed side-by-side.

In particular, I wanted to remove a bunch of deprecated API that is well over a decade old. It wasn’t used for very long and causes libpeas to unnecessarily link against gobject-introspection-1.0.

Additionally, there is no need for the libpeas-gtk library anymore. With GTK 4 came much more powerful list widgets. Combined with radically different plugin UI designs, the “one stop plugin configuration widget” in libpeas-gtk just isn’t cutting it.

Now that there is just the single library, using subdirectories in includes does not make sense. Just #include <libpeas.h> now.

Therefore, PeasEngine is now a GListModel containing PeasPluginInfo.

I also made PeasExtensionSet a GListModel, which can be convenient when you want to filter which extensions you care about using something like GtkFilterListModel.

And that is one of the significant reasons for the ABI break. Previously, PeasPluginInfo was a boxed-type, incompatible with GListModel. It is now derived from GObject and thusly provides properties for all the important bits, including PeasPluginInfo:loaded to denote if the plugin is loaded.

A vestige of the old-days is PeasExtension which was really just an alias to GObject. This just isn’t needed anymore and we use GObject directly in function prototypes.

PeasActivatable is also removed because creating interfaces is so easy these days with language bindings and/or G_DECLARE_INTERFACE() that it doesn’t make sense to have such an interface in-tree. Just create the interface you want rather than shoehorning this one in.

I’ve taken this opportunity to rename our development branch to main and you can get the old libpeas-1.0 ABI from the very fresh 1.36 branch.

Smoother Scrolling of Text Views

When working on GTK 4, special care was taken to ensure that most of a GtkTextView’s content could be rendered without GL program changes and maximal use of glDrawArrays() with vertices from a VBO.

That goes a long way towards making smooth scrolling.

In recent releases, GTK gained support for scrolling using more precise scroll units. On macOS with Apple touchpads, for example, that might map physical distance to a certain number of logical pixels within the application.

If you’re at 2x scaling, you might get values from the input system with “half pixel” values (e.g. .5) since that would map just fine to the physical pixel boundary of the underlying display server.

That’s great for our future, but not everything from GTK’s early designs around X11 has been excised from the toolkit. Currently, widget allocations are still integer based, meaning at 2x scaling they will sit on 2x physical pixel boundaries even though 1x physical pixel boundaries would be just fine to keep lines sharp (assuming you aren’t fractionally scaling afterwards).

Furthermore, GtkTextView works in logical pixels as well. That means that if you want to smooth scroll a text view and you get those 0.5 logical pixel values (1x physical pixels) you won’t scroll until you get to 1.0 which then jumps you 2x physical pixels.

That can create some unsightly jitter, most noticeable during kinetic deceleration.

To fix the widget allocation situation, a future ABI break in GTK would have to move to some sort of float/double for widget coordinates. Everything underneath GTK (like GSK/GDK/Graphene/etc) is already largely doing this as GDK went through substantial improvements and simplification for GTK 4.

But with a little creativity, abstraction, and willingness to completely break ABI for a prototype, you can get an idea of what that would be like.

I put together a quick branch today which makes GtkTextView use double for coordinates so that I could push it to snap to physical pixel boundaries.
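The arithmetic behind snapping can be sketched as follows. These helpers are hypothetical and are not taken from the branch. The first assumes integer logical coordinates simply drop the fraction, while the second rounds to the nearest physical pixel and converts back to logical units.

```c
#include <assert.h>

/* Integer logical pixels: the fractional part of the offset is lost,
 * so 0.5 stays at 0 and the view only moves once 1.0 is reached. */
static int scroll_truncate_logical(double logical) {
  return (int)logical;
}

/* Snap a logical offset to the nearest physical pixel and return it
 * in logical units. At scale 2, multiples of 0.5 are representable. */
static double scroll_snap_physical(double logical, int scale) {
  double physical;
  long rounded;
  assert(scale >= 1);
  physical = logical * scale;
  /* Round half away from zero without needing libm. */
  rounded = physical >= 0 ? (long)(physical + 0.5) : -(long)(0.5 - physical);
  return (double)rounded / scale;
}
```

At 2x scaling, an offset of 0.5 logical pixels survives the second helper unchanged, so the content can move by one physical pixel at a time instead of jumping two.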

The fun thing is finding all the ways it breaks stuff. Like text underline getting into situations where it looks different as you scroll or having to allocate GtkWidget embedded within the GtkTextView on logical pixels.

Like I said earlier, it completely breaks ABI of GtkTextView, so don’t expect to replace your system GTK with it or anything.

P.S. I’m on Mastodon now.