Asynchronous Varlink with varlink-glib

I’ve been putting together varlink-glib, which is a library for writing Varlink clients and services in C. The basic idea is to keep the transport policy out of the library. You get a connected GIOStream however you want, whether that is GLib networking, socket activation, or something more specialized, and then wrap it in a VarlinkClientProtocol or VarlinkServerProtocol.

The API is built around DexFuture, which makes the async parts feel a lot nicer in C than the usual callback layering. Client calls return a future, server replies return a future, and internally the protocol can use fibers for the work that wants to look sequential while still integrating with the GLib main context. This is very much the style of API I want for system services: explicit enough that you can debug it, but not so painfully manual that every call site becomes a state machine.

future = example_calc_add_call_invoke (proxy,
                                       call,
                                       VARLINK_METHOD_CALL_DEFAULT,
                                       add_reply_cb,
                                       NULL,
                                       NULL);

There is also a varlink-codegen tool which takes a .varlink interface and generates typed C wrappers for it. That gives you proxy objects, server skeletons, call objects, reply objects, and error constants instead of making every application hand-roll JSON. You can still drop down to VarlinkMessage, VarlinkMessageBuilder, and VarlinkMessageReader for forwarding or weird infrastructure cases, but most code should get to stay typed.

File descriptor passing works when the transport is a Unix socket connection. This follows the same general model as systemd’s Varlink support: the JSON payload contains an integer index, while the actual descriptors are sent out-of-band with SCM_RIGHTS and attached to the containing message as a GUnixFDList. Generated code can continue to treat the field as an integer, while the actual descriptor list stays attached to the underlying VarlinkMessage.

fd_list = g_unix_fd_list_new ();
fd_index = g_unix_fd_list_append (fd_list, fd, &error);

varlink_message_set_unix_fd_list (parameters, fd_list);

Protocol upgrades are supported too. A method call can ask to upgrade the connection, and once the final successful reply is sent, Varlink stops being valid on that connection. The VarlinkProtocol is still a GIOStream, so the next protocol can continue reading and writing through the same object. That keeps the handoff explicit without requiring a separate transport abstraction.

I also wired in optional Sysprof capture support. When enabled, client and server RPCs can show up as Sysprof marks with useful bits like method name, result, reply count, one-way, multi-reply, upgrade, Varlink error, and GError details. That matters because once you have concurrent calls, generated dispatch through VarlinkServerRegistry, and services doing real work, “it got slow somewhere” is not enough information.

A screenshot of Sysprof showing captured varlink calls over time with the RPC and message metadata inline.

There is still more polish to do, but the shape is there: typed generated APIs for normal users, low-level message APIs for infrastructure, DexFuture for async flow, Unix FD passing for the system service cases, protocol upgrades for handoff cases, and profiler hooks so it can be debugged when the happy path stops being happy.

Numbers

So what does that look like performance wise you might ask? For a very simple Echo interface in both D-Bus and Varlink you can get a rough estimate. No daemons, just serialization on a socketpair(). I haven’t started performance tuning yet so there may be ground to make up on both sides. But the answer is that the testcase for varlink-glib is about 5x faster than the testcase for GDBusConnection in either synchronous or asynchronous modes.

This doesn’t apply to all use-cases of D-Bus of course. But for a specific case I use it for (P2P IPC between peer processes), it is pretty big difference.

A graph showing the performance differences between Varlink-GLib and GDBusConnection.

A small personal note: as I wrote in my recent update from France, I am no longer employed by Red Hat. Work like this is currently self-funded, out of pocket, while my family and I settle into a new chapter. If you find it useful, a note of encouragement or a contribution means a lot right now. It helps make it possible to keep improving the free software infrastructure many of us rely on.

Translating French

I have been spending more time learning French lately, and as often happens, that turned into a small side project. liblingua is not intended to be a big mainstream translation platform. It is a fun GLib/GObject library for experimenting with local machine translation from applications.

The library uses Bergamot from Mozilla for the translation backend. Instead of sending text to a web service, liblingua resolves the language pair you ask for, downloads the required language model into the local user cache, and then performs translation locally.

The high-level API is built around a few small objects: LinguaRegistry discovers available translation profiles, LinguaProfile represents a model that can be loaded or downloaded, LinguaProgress reports download progress, and LinguaTranslator performs the translation.

All potentially blocking work is exposed as DexFuture, so it fits naturally into libdex based applications. If you are already using fibers, the code can stay linear and easy to read with dex_await_object().

A Small Example

Here is the basic shape of translating French into English:

#include <liblingua.h>

static void
translate_example (void)
{
  g_autoptr(LinguaProgress) progress = lingua_progress_new ();
  g_autoptr(LinguaRegistry) registry = NULL;
  g_autoptr(LinguaProfile) profile = NULL;
  g_autoptr(LinguaTranslator) translator = NULL;
  g_autoptr(LinguaTranslation) result = NULL;
  g_autoptr(GListModel) profiles = NULL;
  g_autoptr(GError) error = NULL;

  if ((registry = dex_await_object (lingua_registry_new (), &error)) &&
      (profiles = lingua_registry_resolve (registry, "fr", "en")) &&
      (profile = g_list_model_get_item (profiles, 0)) &&
      (translator = dex_await_object (lingua_profile_load (profile, progress), &error)) &&
      (result = dex_await_object (lingua_translator_translate (translator, "Bonjour"), &error)))
    g_print ("%s\n", lingua_translation_get_translation (result));
  else
    g_printerr ("Error: %s\n", error->message);
}

The first time a model is needed, loading the profile may download it. After that, the model is reused from the local cache. That makes liblingua useful for little tools, demos, and desktop experiments where local translation is preferable to wiring everything through a remote service.

In the future this is probably the type of thing we would want as a desktop service to avoid duplicating caches amongst Flatpak applications. It would also be extremely useful to do live translation in Camera and Image Preview apps. I played a bit with that using Tesseract for OCR and it worked better than expected.

Limiters in libdex

Libdex now has DexLimiter, a small utility for bounding how much asynchronous work runs at once.

This is useful when a workload can produce more parallelism than the underlying machine, subsystem, or service should actually handle. Common examples include indexing files, downloading URLs, generating thumbnails, parsing documents, or querying a service with a fixed concurrency budget.

The usual API is dex_limiter_run(). It acquires a permit, starts a fiber, and releases the permit when that fiber finishes.

static DexFuture *
load_one_file (gpointer user_data)
{
  GFile *file = user_data;

  return dex_file_load_contents_bytes (file);
}

DexLimiter *limiter = dex_limiter_new (8);
DexFuture *future = dex_limiter_run (limiter,
                                     NULL,
                                     0,
                                     load_one_file,
                                     g_object_ref (file),
                                     g_object_unref);

In this example, no more than eight file loads will run at the same time, regardless of how many files are queued. The returned DexFuture resolves or rejects with the result of the spawned fiber.

One important detail is that dropping the returned future does not cancel a fiber that has already started. Once work has acquired a permit, it is allowed to complete so that the limiter can release the permit cleanly.

For more specialized cases, DexLimiter also supports manual acquire and release:

g_autoptr(GError) error = NULL;

if (dex_await (dex_limiter_acquire (limiter), &amp;error))
  {
    do_limited_work ();
    dex_limiter_release (limiter);
  }

This is useful when the limited section is not naturally represented by a single fiber. However, callers must release exactly once for every successful acquire. In most cases, dex_limiter_run() is preferable because it handles release on both success and failure paths.

The limit should describe the constrained resource, not the number of items being processed. Remote APIs and databases may need a small limit. CPU-heavy work should usually be near the amount of useful worker parallelism. Local I/O can often tolerate a larger value, depending on the storage system. Separate resources should usually have separate limiters, so one workload does not consume another workload’s concurrency budget.

Finally, dex_limiter_close() can be used during shutdown. Once closed, pending and future acquisitions reject with DEX_ERROR_SEMAPHORE_CLOSED. Work that already holds a permit may continue, but releasing after close does not make new permits available.

The goal is to make bounded parallelism simple: queue as much asynchronous work as you need, but only run as much of it as the system should handle.

A Small Update from France

For about the past month, I have no longer been with Red Hat.

That is a strange sentence to write after so many years, but life has a way of changing the scenery whether or not one has finished packing. My family and I have made it safely to France, and we are quite happy here. The light is different, the pace is different, and there is a great deal to learn. For now, that is exactly where our attention needs to be.

I also think there is a broader lesson here for people whose safety, immigration status, or family stability may depend on employer flexibility. Do not assume that long tenure, remote work history, or prior verbal guidance will be enough. My own experience left me with the uncomfortable conclusion that these processes can become very narrow exactly when the human stakes are highest. Get things in writing, understand the policy surface area, and protect your family first.

This also means that some of the things I wrote about in my earlier mid-life update remain unresolved. I am currently in France on a visitor visa, which does not authorize work here. Our focus is on integration, language learning, and getting ourselves properly settled for the long term. That takes time, patience, paperwork, and a certain tolerance for being a beginner again.

As a result, I will not be taking on ongoing software maintenance responsibilities for the foreseeable future. I may still scratch the occasional itch where it directly improves my own computing life, but I am not currently able to provide the kind of broad, sustained stewardship that many projects deserve.

That is not an easy thing to say. Open source is entering an especially difficult period. We are now seeing AI systems used not only to generate code, but also to probe, disrupt, and attack critical software infrastructure. I suspect this will have a negative effect on the maintenance burden for a lot of projects, particularly the foundational pieces that distributions, companies, and users all rely on without always seeing the people behind them.

But there is a limit to what can reasonably be carried as unpaid labor, especially when the primary financial beneficiaries are downstream organizations with considerably more resources than the individual maintainers doing the work. At the moment, I need to prioritize my family, our stability, and the next chapter of our life here.

That said, I am still reachable for appropriate professional inquiries, advisory conversations, or consulting opportunities where the structure and location make sense. The best address for that now is christian at sourceandstack dot com.

For now, we are safe, settling in, and doing our best to build something durable out of a rather odd moment in the world. There are worse places to begin again than France.

A calm Mediterranean shoreline at sunset, with gentle waves rolling onto a pebbled beach beneath a wide blue sky streaked with soft clouds and pastel pink light on the horizon.

Mid-life transitions

The past few months have been heavy for many people in the United States, especially families navigating uncertainty about safety, stability, and belonging. My own mixed family has been working through some of those questions, and it has led us to make a significant change.

Over the course of last year, my request to relocate to France while remaining in my role moved up and down the management chain at Red Hat for months without resolution, ultimately ending in a denial. That process significantly delayed our plans despite providing clear evidence of the risks involved to our family. At the beginning of this year, my wife and I moved forward by applying for long-stay visitor visas for France, a status that does not include work authorization.

During our in-person visa appointment in Seattle, a shooting involving CBP occurred just a few parking spaces from where we normally park for medical outpatient visits back in Portland. It was covered by the news internationally and you may have read about it. Moments like that have a way of clarifying what matters and how urgently change can feel necessary.

Our visas were approved quickly, which we’re grateful for. We’ll be spending the next year in France, where my wife has other Tibetan family. I’m looking forward to immersing myself in the language and culture and to taking that responsibility seriously. Learning French in mid-life will be humbling, but I’m ready to give it my full focus.

This move also means a professional shift. I’ve dedicated a substantial portion of my time to maintaining and developing key components across the GNOME platform and its surrounding ecosystem. These projects are widely used, including in major Linux distributions and enterprise environments, and they depend on steady, ongoing care.

For the better part of two decades I’ve been putting in well beyond forty hours each week maintaining and advancing this stack. That level of unpaid or ad-hoc effort isn’t something I can sustain, and my direct involvement going forward will be very limited. Given how widely this software is used in commercial and enterprise environments, long-term stewardship really needs to be backed by funded, dedicated work rather than spare-time contributions.

If you or your organization depend on this software, now is a good time to get involved. Perhaps by contributing engineering time, supporting other maintainers, or helping fund long-term sustainability.

The folliwing is a short list of important modules where I’m roughly the sole active maintainer:

  • GtkSourceView – foundation for editors across the GTK eco-system
  • Text Editor – GNOME’s core text editor
  • Ptyxis – Default terminal on Fedora, Debian, Ubuntu, RHEL/CentOS/Alma/Rocky and others
  • libspelling – Necessary bridge between GTK and enchant2 for spellcheck
  • Sysprof – Whole-systems profiler integrating Linux perf, Mesa, GTK, Pango, GLib, WebKit, Mutter, and other statistics collectors
  • Builder – GNOME’s flagship IDE
  • template-glib – Templating and small language runtime for a scriptable GObject Introspection syntax
  • jsonrpc-glib – Provides JSONRPC communication with language servers
  • libpeas – Plugin library providing C/C++/Rust, Lua, Python, and JavaScript integration
  • libdex – Futures, Fibers, and io_uring integration
  • GOM – Data object binding between GObject and SQLite
  • Manuals – Documentation reader for our development platform
  • Foundry – Basically Builder as a command-line program and shared library, used by Manuals and a future Builder (hopefully)
  • d-spy – Introspect D-Bus connections
  • libpanel – Provides IDE widgetry for complex GTK/libadwaita applications
  • libmks – Qemu Mouse-Keyboard-Screen implementation with DMA-BUF integration for GTK

There are, of course, many other modules I contribute to, but these are the ones most in need of attention. I’m committed to making the transition as smooth as possible and am happy to help onboard new contributors or teams who want to step up.

My next chapter is about focusing on family and building stability in our lives.

pgsql-glib

Much like the s3-glib library I put together recently, I had another itch to scratch. What would it look like to have a PostgreSQL driver that used futures and fibers with libdex? This was something I wondered about more than a decade ago when writing the libmongoc network driver for 10gen (later MongoDB).

pgsql-glib is such a library which I made to wrap the venerable libpq PostgreSQL state-machine library. It does operations on fibers and awaits FD I/O to make something that feels synchronous even though it is not.

It also allows for something more “RAII-like” using g_autoptr() which interacts very nicely with fibers.

API Documentation can be found here.

Experimenting with S3 APIs

My Red Hat colleague Jan Wildeboer made some posts recently playing around with Garage for S3 compatible storage. Intrigued by the prospect, I was curious what a nice libdex-based API would look like.

Many years ago I put together a scrappy aws-glib library to do something similar but after changing jobs I failed to find any free time to finish it. S3-GLib is an opportunity to do it better thanks to all the async/future/fiber support I put into libdex.

I should also mention what a pleasure it is to be able to quickly spin up a Garage instance to run the testsuite.

Status Week 48

This week was Thanksgiving holiday in the US and I spent the entire week quite sick with something that was not Covid nor Flu according to rapid tests.

Managed to get a few things done through sheer force of will. I don’t recommend it.

Red Hat

  • A lot of misdirection related to our international move towards France where my wife has family. The policies at Red Hat may mean that we need to come up with a strategy to keep a million-plus lines of code maintained if my only option to protect my family is to go the route of long-stay-visa (e.g. work not permitted).

    Some very lovely people have reached out and thank you for that!

    Touch base if you have options that would allow me to work from France if you like my approach to engineering. While I’d love to continue working on GNOME/GTK and related technologies, some things are more important and so I’m flexible.

    Lets all hope that the Corporate Leadership Team at Red Hat determines that supporting multi-ethnic families get safely out of the United States near their support network is of value to our mutual success.

Ptyxis

  • Triage a handful of issues. This project has a pretty high-velocity file-rate while having an pretty low “bug exists here” rate. This legitimately burns a lot of my time every week that could be spent creating things.

    So if you are going to file issues in projects like Ptyxis or Text Editor, please take the time to do basic binary-search on where the problem really exists. Points given just for trying.

    In fact, this is a rant about how you should do that in all aspects of your life. Cut the problems you have in half — and half again.

Libdex

  • Fumbled my way through adding a gi override to integrate DexFuture with asyncio. This allows for await some_future() but you need to make sure you have a GLib main context running as well and integrated with asyncio.

    Hopefully this means that you will soon be able to easily use all the nice Foundry APIs from Python with relative ease.

Foundry

  • Add support for communicating with SSH agent for simple signing requests. Was surprised to not find this in libssh2 but maybe I missed something. We already link against that for libgit2 support.

  • Add support for signing commits in the FoundryGitCommitBuilder commit helper. This supports GPG and SSH commit signing (no X509) because I have no need for it. Though the SSH form is tested best.

  • Iteration on tracking changes to untracked/unstaged/staged files while using the commit builder API.

  • Abstract commit signing into generic buffer signing so that it can be used for more things in the future (like signing tags).

  • Add support for staging/unstaging files, hunks, and lines.

  • New test-git-commit-builder-gtk to be able to test out the infrastructure for commit creation. It’s extremely scrappy but gets the job done enough to test the API out even if not very ergonomic.

  • New foundry_git_vcs_stash() helper to be like git stash.

  • Lots of new APIs around generating diffs, deltas, and patches.

  • Lots of refactoring of Git subsystem to make the combination of threading and convenient APIs easier to maintain.

Builder

  • Some work on git change management panel, diff viewer, etc

  • VCS history panel which is useful as auxiliary for files as well as for viewing diffs.

  • Details auxiliary panel for forge issues/merge-requests

  • Experiment with using AdwPreferencesGroup and friends for creating AdwBottomSheet style menus in narrow mode (e.g. mobile).

Libdex Futures from asyncio

One of my original hopes for Libdex was to help us structure complex asynchronous code in the GNOME platform. If my work creating Foundry and Sysprof are any indicator, it has done leaps and bounds for my productivity and quality in that regard.

Always in the back of my mind I hoped we could make those futures integrate with language runtimes.

This morning I finally got around to learning enough of Python’s asyncio module to write the extremely minimal glue code. Here is a commit that implements integration as a PyGObject introspection override. It will automatically load when you from gi.repository import Dex.

This only integrates with the asyncio side of things so your application is still responsible for integrating asyncio and GMainContext or else nothing will be pumping the DexScheduler. But that is application toolkit specific so I must leave that to you.

I would absolutely love it if someone could work on the same level of integration for GJS as I have even less experience with that platform.

Status Week 47

Ptyxis

  • Issue filed about highlight colors. Seems like a fine addition but again, I can’t write features for everyone so it will need a MR.

  • Walk a user through changing shortcuts to get the behavior they want w/ ctrl+shift+page up/down. Find a bug along the way.

  • Copy GNOME Terminal behavior for placement of open/copy links from the context menu.

  • Add a ptyxis_tab_grab_focus() helper and use that everywhere instead of gtk_widget_grab_focus() on the tab (which then defers to the VteTerminal). This may improve some state tracking of keyboard shift state, which while odd, is a fine enough workaround.

  • Issue about adding waypipe to the list of “network” command detection. Have to nack for now just because it is so useful in situations without SSH.

  • Issue about growing size of terminal window (well shrinking back) after changing foreground. Triaged enough to know this is still a GDK Wayland issue with too-aggressively caching toplevel sizes and then re-applying them even after first applying a different size. Punted to GTK for now since I don’t have the cycles for it.

Foundry

  • Make progress icon look more like AdwSpinner so we can transition between the two paintables for progress based on if we get any sort of real fractional value from the operation.

  • Paintable support for FoundryTreeExpander which is getting used all over Foundry/Builder at this point.

  • Now that we have working progress from LSPs we need to display it somewhere. I copied the style of Nautilus progress operations into a new FoundryOperationBay which sits beneath the sidebar content. It’s not a perfect replicate but it is close enough for now to move on to other things.

    Sadly, we don’t have a way with the Language Server Protocol to tie together what operation caused a progress object to be created so cancellation of progress is rather meaningless there.

  • Bring over the whole “snapshot rewriting” we do in Ptyxis into the FoundryTerminal subclass of VTE to make it easy to use in the various applications getting built on Foundry.

  • Fix FoundryJsonrpcDriver to better handle incoming method calls. Specifically those lacking a params field. Also update things to fix reading from LF/Nil style streams where the underlying helper failed to skip bytes after read_upto().

  • It can’t be used for much but there is a foundry mcp server now that will expose available tools. As part of this make a new Resources type that allows providing context in an automated fashion. They can be listed, read, track changes to list, subscribe to changes in a resource, etc.

    There are helpers for JSON style resources.

    For example, this makes the build diagnostics available at a diagnostics:// URI.

  • Make PluginFlatpakSdk explicitly skip using flatpak build before the flatpak build-init command in the pipeline should have run.

  • Allow locating PluginDevhelpBook by URI so that clicking on the header link in documentation can properly update path navigators.

  • Make FoundryPtyDiagnostics auto-register themselves with the FoundryDiagnosticManager so that things like foundry_diagnostic_manager_list_all() will include extracted warnings that came from the build pipeline automatically rather than just files with diagnostic providers attached.

  • Support for “linked pipelines” which allows you to insert a stage into your build pipeline which then executes another build pipeline from another project.

    For example, this can allow you to run a build pipeline for GTK or GtkSourceView in the early phase of your projects build. The goal here is to simplify the effort many of us go through working on multiple projects at the same time.

    Currently there are limitations here in that the pipeline you want to link needs to be setup correctly to land in the same place as your application. For jhbuild or any sort of --prefix= install prefix this is pretty simple an works.

    For the flatpak case we need more work so that we can install into a separate DESTDIR which is the staging directory of the app we’re building. I prefer this over setting up incremental builds just for this project manually as we have no configuration information to key off.

  • Add new FoundryVcsDiffHunk and FoundryVcsDiffLine objects and API necessary access them via FoundryVcsDelta. Implement the git plugin version of these too.

  • Add a new FoundryGitCommitBuilder high-level helper class to make writing commit tooling easier. Provides access to the delta, hunk, and lines as well as commit details. Still needs more work though to finish off the process of staging/unstaging individual lines and try to make the state easy for the UI side.

    Add a print-project-diff test program to test that this stuff works reasonably well.

Builder

  • Lots of little things getting brought over for the new editor implementation. Go to line, position tracking, etc.

  • Searchable diagnostics page based on build output. Also fix up how we track PTY diagnostics using the DiagnosticManager now instead of only what we could see from the PTY diagnostics.

Manuals

  • Merge support for back/forward mouse buttons from @pjungkamp

  • Merge support for updating pathbar elements when navigating via the WebKit back/forward list, also from @pjungkamp.

Systemd

  • Libdex inspired fiber support for systemd which follows a core design principle which is that fibers are just a future like any other.

    https://github.com/systemd/systemd/pull/39771

CentOS

  • Merge vte291/libadwaita updates for 10.2

GTK

  • Lots of digging through GdkWayland/GtkWindow to see what we can do to improve the situation around resizing windows. There is some bit of state getting saved that is breaking our intent to have a window recalculate it’s preferred size.

GNOME Settings Daemon

  • Took a look over a few things we can do to reduce periodic IO requests in the housekeeping plugin.

    Submitted a few merge requests around this.

GLib

  • Submitted a merge request adding some more file-systems to those that are considered “system file system types”. We were missing a few that resulted in extraneous checking in gsd-housekeeping.

    While researching this, using a rather large /proc/mounts I have on hand, and modifying my GLib to let me parse it rather than the system mounts, I verified that we can spend double-digit milliseconds parsing this file. Pretty bad if you have a system where the mounts can change regularly and thus all users re-parse them.

    Looking at a Sysprof recording I made, we spend a huge amount of time in things like strcmp(), g_str_hash(), and in hashtable operations like g_hash_table_resize().

    This is ripe for a better data-structure instead of strcmp().

    I first tried to use gperf to generate perfect hashes for the things we care about and that was a 10% savings. But sometimes low-tech is even better.

    In the end I went with bsearch() which is within spitting distance of the faster solutions I came up with but a much more minimal change to the existing code-base at the simple cost of keeping lists sorted.

    There is likely still more that can be done on this with diminishing returns. Honestly, I was surprised a single change would be even this much.

Red Hat

This week also had me spending a significant amount of time on Red Hat related things.