Vivid colors in Brno

Co-authored by Sebastian Wick & Jonas Ådahl.

From April 24 to 26, Red Hat invited people working on compositors and display drivers to come together to collaborate on bringing the Linux graphics stack to the next level. There were three high level topics that were discussed at length: Color Management, High Dynamic Range (HDR) and Variable Refresh Rate (VRR). This post goes through the discussions that took place, and the occasional rough consensus reached among the people who attended.

The event itself aimed to be as inclusive and engaging as possible, meaning participants could attend either in person, in the Red Hat office in Brno, Czech Republic, or remotely via a video link. The format of the event was structured to give remote attendees and physical attendees an equal opportunity to participate in discussions. While the hallway track can be a great way to collaborate, discussions accessible remotely were prioritized by having two available rooms, each with its own video link.

This meant that if the main room wanted to continue on the same topic, while some wanted to do a breakout session, they could go to the other room, and anyone attending remotely could tag along by connecting to the other video link. In the end, the breakout room became the room where people collaborated on various things in a less structured manner, leaving the main room to cover the main topics. A reason for this is that the microphones in both rooms were a bit too good, effectively catching any conversation anyone had anywhere in the room. Making one of the rooms a bit more chaotic, while keeping the other focused, also allowed for both ways of collaborating.

For the kernel side, people working on AMD, Intel and NVIDIA drivers were among the attendees, and for user space there was representation from gamescope, GNOME, KDE, smithay, Wayland, weston and wlroots. Some of those people are community contributors and some of them were attending on behalf of Red Hat, Canonical, System76, sourcehut, Collabora, Blue Systems, Igalia, AMD, Intel, Google, and NVIDIA. We had a lot of productive discussion, ending up in total with a 20 (!) page document of notes.

Discussion with remote attendees during the hackfest

Color management & HDR


Color management in the Linux graphics stack is shifting in the way it is implemented, away from the style used in X11, where the display server takes a hands-off approach and the end result is dependent on individual client capabilities, to an architecture where the Wayland display server takes an active role to ensure that all clients, be they color aware or not, show up on screen correctly.

Pekka Paalanen and Sebastian Wick gave a summary of the current state of digital color on Linux and Wayland. For full details, see the Color and HDR documentation repository.

They described the in-development color-representation and color-management Wayland protocols. The color-representation protocol lets clients describe the way color channels are encoded, and the color-management protocol lets clients describe the color channels’ meaning, to completely describe the appearance of surfaces. The latter also gives clients information about how they can optimize their content for the target monitor’s capabilities to minimize the color transformations in the compositor.

Another key aspect of the Wayland color protocols in development is that compositors will be able to choose what they want to support. This makes it possible, for example, to implement HDR without involving ICC workflows.

There is already a broad consensus that this type of active color management aligns with the Wayland philosophy and while work is needed in compositors and client toolkits alike, the protocols in question are ready for prototyping and review from the wider community.

Colors in kernel drivers & compositors

There are two parts to HDR and color management for compositors. The first one is to create content from different SDR and HDR sources using color transformations. The second is signaling the monitor to enter the desired mode. Given the current state of kernel API capabilities, compositors are in general required to handle all of their color transformations using shaders during composition. For the short term we will focus on removing the last blockers for HDR signaling, and in the long term work on making it possible to offload color space conversions to the display hardware, which should ideally make it possible to power down the GPU while playing e.g. a movie.

Short term

Entering HDR mode is done by setting the colorimetry (KMS Colorspace property) and overriding the transfer characteristics (KMS HDR_OUTPUT_METADATA property).

Unfortunately the design of the Colorspace property does not mix well with the current broader KMS design where the output format is an implementation detail of the driver. We’re going to tweak the behavior of the Colorspace property such that it doesn’t directly control the InfoFrame but lets the driver choose the correct variant and transparently convert to YCC using the correct matrix if required. This should allow AMD to support HDR signaling upstream as well.

The HDR_OUTPUT_METADATA property is a bit weird as well and should be documented. Changing it might require a mode set and changing the transfer characteristics part of the blob will make monitors glitch, while changing other parameters must not require a mode set and must not glitch.

Both landing support upstream for the AMD driver, and improvements to the documentation should happen soon, enabling proper upstream HDR signaling.

Vendor specific uAPI for color pipelines

Recently a proposal for adding vendor specific properties for exposing hardware color pipelines via KMS has been posted, and while it is great to see work being done to improve the situation in the Linux kernel, there are concerns that this opens the door to per-vendor APIs that compositors end up having to implement, effectively reintroducing per-vendor GPU drivers in userspace outside of mesa.

Still, upstream support in the kernel has its upsides, as it for example makes it much easier to experiment. A way forward that was discussed is to propose that vendor specific color pipeline properties should be handled with care, by requiring them to be clearly documented as experimental, and disabled by default both with a build configuration option and an off-by-default module parameter.

A proposal for this will be sent by Harry Wentland to the relevant kernel mailing lists.

Color pipelines in KMS

Long term, KMS should support color pipelines without any experimental flags, and there is wide agreement that it should be done with a vendor agnostic API. To achieve this, a proposal was discussed at length; to summarize it, the goal is to introduce a new KMS object for color operations. A color operation object exposes a low level mathematical function (e.g. matrix multiplication, 1D or 3D look-up tables) and a link to the next operation. To declare a color pipeline, drivers construct a linked list of these operations, for example 1D LUT → Matrix → 1D LUT to describe the current DEGAMMA_LUT → CTM → GAMMA_LUT KMS pipeline.
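To make the linked-list idea concrete, here is a minimal Python sketch that models color operations as objects chained through a next link. It is an illustration only and not the proposed kernel uAPI: the class names (ColorOp, Lut1D, Matrix3x3), the chain and apply_pipeline helpers, and the sample LUT and matrix values are all hypothetical.

```python
from typing import Optional, Sequence, Tuple

Pixel = Tuple[float, float, float]


class ColorOp:
    """One pipeline step; `next` links to the following operation."""

    def __init__(self) -> None:
        self.next: Optional["ColorOp"] = None

    def apply(self, px: Pixel) -> Pixel:
        raise NotImplementedError


class Lut1D(ColorOp):
    """Per-channel 1D look-up table with linear interpolation."""

    def __init__(self, table: Sequence[float]) -> None:
        super().__init__()
        self.table = list(table)

    def _lookup(self, v: float) -> float:
        v = min(max(v, 0.0), 1.0)          # clamp the input to [0, 1]
        pos = v * (len(self.table) - 1)    # fractional table index
        lo = int(pos)
        hi = min(lo + 1, len(self.table) - 1)
        frac = pos - lo
        return self.table[lo] * (1.0 - frac) + self.table[hi] * frac

    def apply(self, px: Pixel) -> Pixel:
        r, g, b = px
        return (self._lookup(r), self._lookup(g), self._lookup(b))


class Matrix3x3(ColorOp):
    """3x3 matrix multiplication on an RGB triple."""

    def __init__(self, rows: Sequence[Sequence[float]]) -> None:
        super().__init__()
        self.rows = [list(row) for row in rows]

    def apply(self, px: Pixel) -> Pixel:
        r, g, b = (sum(c * v for c, v in zip(row, px)) for row in self.rows)
        return (r, g, b)


def chain(*ops: ColorOp) -> ColorOp:
    """Link the operations in order and return the head of the list."""
    for cur, nxt in zip(ops, ops[1:]):
        cur.next = nxt
    return ops[0]


def apply_pipeline(head: Optional[ColorOp], px: Pixel) -> Pixel:
    """Walk the linked list, feeding each result into the next operation."""
    op = head
    while op is not None:
        px = op.apply(px)
        op = op.next
    return px


# 1D LUT -> Matrix -> 1D LUT, mirroring DEGAMMA_LUT -> CTM -> GAMMA_LUT.
identity_lut = [0.0, 0.5, 1.0]
swap_red_blue = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]
pipeline = chain(Lut1D(identity_lut), Matrix3x3(swap_red_blue), Lut1D(identity_lut))
result = apply_pipeline(pipeline, (0.25, 0.5, 1.0))
```

Because each operation only knows about its successor, a different pipeline shape is expressed simply by chaining a different sequence of operations.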

The discussions primarily focused on per plane color pipelines for the pre-blending stage, but the same concept should be reusable for the post blending stage on the CRTC.

Eventually this work should also make it possible to cleanly separate KMS properties which change the colors (i.e. color operations) from properties changing the mode and signaling to sinks, such as Broadcast RGB, Colorspace, max_bpc.

It was also agreed that user space needs more control over the output format, i.e. what is transmitted over the wire. Right now this is a driver implementation detail and chosen such that the bandwidth requirements of the selected mode will be satisfied. In particular, making it possible to turn off YCC subsampling, specify the minimum bit depth and specify the compression strength for DSC seems to have consensus.

There are a lot more details that handle all the quirks that hardware may have. For more details and further discussion about the color pipeline proposal, head over to the RFC that Simon Ser just sent to the relevant mailing lists.

Testing & VKMS

Testability of color pipelines and KMS in general was a topic that was brought up as well, with two areas of interest: testing compositors and the generic DRM layer in the kernel using VKMS, and testing actual kernel drivers.

The state of VKMS is to some degree problematic; it currently lacks a large enough pool of established contributors that can take maintainership responsibilities, i.e. reviewing and landing code, but at the same time, there is an urge to make it a more central part of GPU driver development in general, where it can take a more active role in ensuring cross driver conformance. Ways to create more incentive for both kernel developers and compositor developers to help out were discussed, and while the ability to test compositors is a relatively good incentive, one idea was to require new DRM properties to always get a VKMS implementation as well to be able to land. This is, however, not easy, since a significant amount of bootstrapping is needed to make that viable. Some ideas were thrown around, and hopefully something will come out of it; keep an eye on the relevant mailing lists for something related to this area.

For testing actual drivers, the usage of Chamelium was discussed, and while everyone agreed it’s something that is definitely nice to have, it takes a significant amount of resources to maintain wired up CI runners for the community to rely on. Ideally a setup that can be shared across the different compositors and GPU drivers would be great, but it’s a significant task to handle.

Variable Refresh Rate

Smoothing out refresh rate changes

Variable Refresh Rate monitors driven at a certain mode have a minimum and maximum refresh cycle duration and the actual duration can be chosen for every refresh cycle. One problem with most existing VRR monitors however is that when the refresh duration changes too quickly, they tend to produce visible glitches. They appear as brightness changes for a fraction of a second and can be very jarring. To avoid them, each refresh cycle must change the duration only up to some fixed amount. The amount however varies between monitors, with some having no restriction at all.
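As a rough illustration of limiting the change per refresh cycle, the following Python sketch clamps each step towards a target duration. The function names, the microsecond units, and the example limits (a 48 to 144 Hz range and a 4000 µs maximum step) are assumptions made up for this example; real limits differ per monitor.

```python
from typing import List


def next_refresh_duration(current_us: int, target_us: int, max_step_us: int,
                          min_us: int, max_us: int) -> int:
    """Move one refresh cycle towards the target without exceeding the step limit."""
    # Keep the target within the range the monitor supports at this mode.
    target_us = max(min_us, min(max_us, target_us))
    # Limit how much the duration may change in a single refresh cycle.
    delta = max(-max_step_us, min(max_step_us, target_us - current_us))
    return current_us + delta


def ramp(current_us: int, target_us: int, max_step_us: int,
         min_us: int, max_us: int) -> List[int]:
    """Durations of the successive refresh cycles until the target is reached."""
    if max_step_us <= 0:
        raise ValueError("max_step_us must be positive")
    goal = max(min_us, min(max_us, target_us))
    durations = []
    while current_us != goal:
        current_us = next_refresh_duration(current_us, goal, max_step_us,
                                           min_us, max_us)
        durations.append(current_us)
    return durations


# Hypothetical monitor: 48 Hz (20833 us) to 144 Hz (6944 us) per refresh cycle.
steps = ramp(current_us=20833, target_us=6944, max_step_us=4000,
             min_us=6944, max_us=20833)
```

A monitor with no restriction at all corresponds to a step limit at least as large as the whole range, in which case the target is reached in a single cycle.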

A VESA certification is currently being deployed aiming to certify monitors where any change in the refresh cycle duration does not result in glitches. For all other monitors, the increase and decrease in duration which does not result in glitches is unknown if not provided by optional EDID/DisplayID data blocks.

Driving monitors glitch-free without machine readable information therefore requires another approach. One idea is to make the limits configurable. Requiring all users to tweak and fiddle to make it work well enough, however, is not very user friendly, so another idea that was discussed is to maintain a database similar to the one used by libinput, but in libdisplay-info, that contains the required information about monitors, even if there is no such information made available by the vendor.

With all of the required information, the smoothing of refresh rate changes still needs to happen somewhere. It was debated whether this should be handled transparently by the kernel, or if it should be completely up to user space. There are pros and cons to both ways, for example better timing ability in the kernel, but less black box magic if handled by user space. In the end, the conclusion is for user space components (i.e. compositors) to handle this themselves first, and then reconsider at some point in the future whether that is enough, or whether new kernel uAPI is needed.

Low Framerate Compensation

The usual frame rates that a VRR monitor can achieve typically do not cover a bunch of often used low frame rates, such as 30, 25, or 24 Hz. To still be able to show such content without stutter, the display can be driven at a multiple of the target frame rate and present new content on every n-th refresh cycle.
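The arithmetic behind this can be sketched in a few lines of Python. The helper name lfc_multiplier and the 48 to 144 Hz range are assumptions chosen for illustration, and this is not how any particular kernel driver implements LFC.

```python
import math
from typing import Optional


def lfc_multiplier(content_hz: float, min_hz: float, max_hz: float) -> Optional[int]:
    """Smallest n such that n * content_hz lies within [min_hz, max_hz].

    Returns None when no integer multiple fits into the supported range.
    """
    if content_hz <= 0:
        raise ValueError("content_hz must be positive")
    # Smallest multiple that reaches the lower end of the VRR range.
    n = max(1, math.ceil(min_hz / content_hz))
    return n if n * content_hz <= max_hz else None


# Hypothetical monitor supporting 48 to 144 Hz.
for fps in (24, 25, 30, 60):
    n = lfc_multiplier(fps, 48, 144)
    print(f"{fps} Hz content: refresh at {fps * n} Hz, new content every {n} cycle(s)")
```

With this range, 24 Hz content would be shown by refreshing at 48 Hz and presenting new content on every second refresh cycle.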

Right now this Low Framerate Compensation (LFC) feature is built into the kernel driver, and when VRR is enabled, user space can transparently present content at refresh rates even lower than what the display supports. While this seems like a good idea, there are problems with this approach. For example the cursor can only be updated when there is a content update, making it very sluggish because of the low rate of content updates even though the screen refreshes multiple times. This either requires a special KMS commit which does not result in an immediate page flip but ends up on the refresh cycles inserted by LFC, or implementing LFC in user space instead. Like with the refresh rate change smoothing talked about earlier, moving LFC to user space might be possible but also might require some help from the kernel to be able to time page flips well enough.


For VRR to work, applications need to provide content updates on a surface in a semi-regular interval. GUI applications for example often only draw when something changed which makes the updates irregular, driving VRR to its minimum refresh rate until e.g. an animation is playing and VRR is ramping up the refresh rate over multiple refresh cycles. This results in choppy mouse cursor movements and animations for some time. GUI applications sometimes do provide semi-regular updates, e.g. during animations or video playback. Some applications, like games, always provide semi-regular updates.

Currently there is no Wayland protocol [1] letting applications advertise that a surface works with VRR at a moment in time, or at all. There is also no way for a compositor to automatically determine if an app or a surface is suitable for VRR. For Wayland-native applications a protocol to communicate this information could be created, but there are a lot of applications out there which would work fine with VRR but will not get updated to support this protocol.

Maintaining a database similar to the one mentioned above, but for applications, was discussed, but there is no clear winner in how to do so, and where to store the data. Maintaining a list is cumbersome, and complicates the ability for applications to work with VRR on release, or on distributions with out of date databases. Another idea was a desktop file entry stating support, but this too has its downsides. All in all, there is no clear path forward in how to actually enable VRR for applications transparently without causing issues.

1. Except for a protocol proposal.


Brno, Czech Republic

The hackfest was a huge success! Not only was this a good opportunity to get everyone up to speed and learn about what everyone is doing, having people with different backgrounds in the discussions made it possible to discuss problems, ideas and solutions spanning all the way from clients over compositors, to drivers and hardware. Especially on the color and HDR topics we came up with good, actionable consensus and a clear path to where we want to go. For VRR we managed to pin-point the remaining issues and know which parts require more experimentation.

For GNOME, color management, HDR and VRR are all topics that are being actively worked on, and the future is both bright and dynamic, not only when it comes to luminescence and color intensity, but also when it comes to the rate at which monitors present all these intense colors.

Dor Askayo who has been working on bringing VRR to GNOME attended part of the hackfest, and together we can hopefully bring experimental VRR to GNOME soon. There will be more work needed to iron out the overall experience, as covered above, but getting the fundamental building blocks in place is a critical first step.

For HDR, work has been going on to attach color state information to the scene graph, and at the hackfest Georges Basile Stavracas, Sebastian Wick and Jonas Ådahl sat down and sketched out a new Clutter rendering API that aims to replace the current Clutter paint nodes API used in Mutter and GNOME Shell, and which will make color transformations a first class citizen. We will initially focus on using shaders for everything, but down the road, the goal is to utilize the future color pipeline KMS uAPI for both performance and power consumption improvements.

We’d like to thank Red Hat for organizing and hosting the hackfest and for allowing us to work on these interesting topics, Red Hat and Collabora for sponsoring food and refreshments, and especially Carlos Soriano Sanchez and Tomas Popela for actually doing all the work making the event happen. It was great. Also thanks to Jakub Steiner for the illustration, and Carlos Soriano Sanchez for the photo from the hackfest.

For another great hackfest write-up, head over to Simon Ser’s blog post.

Ensuring steady frame rates with GPU-intensive clients

On Wayland, a surface is the basic primitive used to build what users refer to as a “window”. Wayland clients define their contents by attaching buffers to surfaces. This turns the contents of the buffer into the current surface contents. Wayland clients are free to attach a new buffer to a surface anytime. When a Wayland compositor like Mutter starts working on a new output frame, it picks the latest available buffer for each visible surface. This is called “mailbox semantics” (the buffers are metaphorical letters falling into a mailbox, the visible “letter” is the last one on top).


With hardware accelerated drawing, a client normally attaches a new buffer to a surface right after it finished calling OpenGL/Vulkan/<insert your favourite drawing API> APIs to define the contents of the buffer. When the compositor processes the protocol requests attaching the buffer to the surface, the GPU generally hasn’t finished drawing to the buffer yet.

Since the contents of the compositor’s output frame depend on the contents of each visible surface, the former cannot complete before the GPU finishes drawing to each of the picked surface buffers (and subsequently to the compositor’s own output buffer, in the general case).

If the GPU does not finish drawing in time for the next display refresh cycle, the compositor’s output frame misses that cycle and is delayed by at least the duration of one refresh cycle. This can be noticeable as judder/stutter, because the compositor’s frame rate is reduced, and the contents of some frames are not consistent with the timing when they become visible.

The likelihood of that happening depends largely on the clients, mainly on how long it takes the GPU to draw their buffer contents and how much time lies between when a client starts drawing to its buffer and when the compositor starts working on its resulting output frame.

In summary, a Wayland compositor can miss a display refresh cycle because the GPU failed to finish drawing to a client buffer in time.

This diagram visualizes a normal and problematic case:

Left side: normal case, right side: problematic case


Basic idea

The basic idea is simple: the compositor considers a client buffer “available” per the mailbox semantics only once the GPU finishes drawing to it. Until then, it picks the previously available buffer.


Now if it was as simple as that might sound, there would be no need to write a >1000-word article about it. 🙂

The main thing which makes things more complicated is that, together with attaching a new buffer, various other surface states can be modified in the same commit. All state changes in the same commit must be applied atomically, i.e. the user must either see all or none of them (per Wayland’s “every frame is perfect” motto). For example, there are various states which affect how a Wayland surface is scaled for display. Attaching a new buffer and changing the scaling state in the same commit ensures that the surface always appears consistently. If the buffer size and scaling state were to change independently, the surface might intermittently appear in the wrong size.

As if that wasn’t complicated enough, Wayland has so-called synchronized sub-surfaces. State changes for a synchronized sub-surface are not applied immediately, but only the next time any state changes are applied for its parent surface. Conceptually, one can think of the committed sub-surface state becoming part of the parent surface’s state commit. Again, all state combined like this between sub-surfaces (which can be nested, i.e. a sub-surface can be the parent of another sub-surface) and their parents must be applied atomically, all or nothing, to ensure that sub-surfaces and their parents always appear consistently as a whole.

This means that the compositor cannot simply wait for the GPU to finish drawing to client buffers, while applying other corresponding surface state immediately. It needs to stage the committed state changes somehow, and actually apply them only once the GPU has finished drawing to all new buffers attached in the same combined state commit.

Enter transactions

The idea for “stage somehow” is to introduce the concept of a transaction, which combines a set of state changes for one or multiple (sub-)surfaces. When a client commits a set of state changes for a surface, they are inserted into an appropriate transaction; either a new one or an existing one, depending on circumstances.

When the committed state changes should get applied per Wayland protocol semantics, the transaction is committed and inserted into a queue of committed transactions. The queue is ordered such that for any given surface, state commits are applied in the same order as they were committed by the client. This ensures that the contents of a surface never appear to “move backwards” because one transaction affecting the surface managed to “overtake” another one.

A transaction is considered ready to be applied only once both of these conditions are true:

  1. It’s the oldest (closest to the queue head) transaction in the queue for all surfaces it carries state for.
  2. The GPU has finished drawing to all client buffers attached in the transaction.

Once both of these conditions are true, the transaction is applied atomically. From that point on, the compositor uses the state in the transaction for its output frames.
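The two readiness conditions can be modeled with a small Python sketch. This is not Mutter’s actual implementation: the Buffer, Transaction and TransactionQueue classes are hypothetical, surfaces are identified by plain strings, and a boolean gpu_done flag stands in for a real GPU fence.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Buffer:
    name: str
    gpu_done: bool = False  # stand-in for waiting on a real GPU fence


@dataclass
class Transaction:
    # Pending state per surface; reduced here to the newly attached buffer.
    state: Dict[str, Buffer] = field(default_factory=dict)

    def surfaces(self) -> Set[str]:
        return set(self.state)


class TransactionQueue:
    def __init__(self) -> None:
        self.queue: List[Transaction] = []    # index 0 is the queue head
        self.applied: Dict[str, Buffer] = {}  # state used for output frames

    def commit(self, txn: Transaction) -> None:
        self.queue.append(txn)

    def _is_ready(self, index: int) -> bool:
        txn = self.queue[index]
        # Condition 1: no older queued transaction carries state for any of
        # the same surfaces.
        for older in self.queue[:index]:
            if older.surfaces() & txn.surfaces():
                return False
        # Condition 2: the GPU has finished drawing to every attached buffer.
        return all(buf.gpu_done for buf in txn.state.values())

    def apply_ready(self) -> None:
        i = 0
        while i < len(self.queue):
            if self._is_ready(i):
                txn = self.queue.pop(i)
                self.applied.update(txn.state)  # all surfaces change together
            else:
                i += 1


q = TransactionQueue()
a1 = Buffer("a1")                 # GPU still drawing
a2 = Buffer("a2", gpu_done=True)
b1 = Buffer("b1", gpu_done=True)
q.commit(Transaction({"A": a1}))
q.commit(Transaction({"A": a2}))  # must not overtake a1 on surface A
q.commit(Transaction({"B": b1}))  # unrelated surface, can be applied now
q.apply_ready()
a1.gpu_done = True                # the GPU finishes drawing to a1
q.apply_ready()
```

In this example surface B is updated on the first pass, while surface A keeps its previous contents until the GPU finishes with a1; only then are a1 and a2 applied, in commit order.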


I implemented the solution described above in Mutter merge request !1880, which was merged for the GNOME 44 release. While it went under the radar of news outlets, I hope that many of you will notice the benefits!

One situation where the benefits of transactions can be noticed is interactive OpenGL applications such as games. With “vsync” disabled (e.g. for better input → output latency), you should be less likely to see stuttering due to Mutter missing a display refresh cycle, in particular in fullscreen and if Mutter can use direct scanout of client buffers.

If the GPU & drivers support true high priority EGL contexts which can preempt lower priority ones (as of this writing, this is true e.g. with “not too old” Intel GPUs), Mutter can now sustain full frame rate even if clients are GPU-bound to lower frame rates, as demonstrated in this video:

Even if the GPU & drivers do not support this, Mutter should now get bogged down less by such heavy clients, in particular the mouse cursor.

It’s effective for X clients running via Xwayland as well, not only for native Wayland clients.

Long term, all major Wayland compositors will want to do something like this. gamescope already does.


It took almost two years (on and off, not full-time) from having the initial idea, deciding to try implementing it myself, until finally getting it ready to be merged. I wasn’t very familiar with the Mutter code or Wayland protocol semantics when I started, so I couldn’t have done it without a lot of help from many Mutter and Wayland developers. I am deeply grateful to all of you.

Thanks to Jakub Steiner for the featured image and to Niels De Graef for the diagram of this post.

I would also like to thank Red Hat for giving me the opportunity to work on this, even though “Mutter developer” isn’t really a central part of my job description.

Automated testing of GNOME Shell

Automated testing is important to ensure software continues to behave as intended, and it is part of more or less all modern software projects, including GNOME Shell and many of the dependencies it builds upon. However, as with most testing, we can always do better to get more complete testing. In this post, we’ll dive into how we recently improved testing in GNOME Shell, and what this unlocks in terms of future testability.

Already existing testing

GNOME Shell already performs testing as part of its continuous integration pipeline (CI), but tests have been limited to unit testing, meaning testing selected components in isolation to ensure they behave as expected. Due to the nature of the functionality that the Shell implements, the amount of testing one can do as unit testing is rather limited. In something like GNOME Shell, it is just as important to test how things behave when used in their natural environment, i.e. instead of testing specific functionality in isolation, the whole Shell instance needs to be executed with all bits and pieces running as a whole, as if it was a real session.

In other words, what we need is to be able to run all of GNOME Shell as if it was installed and logged into on a real system.

Test Everything

As discussed, to actually test enough things, we need to run all of GNOME Shell with all its features, as if it was a real session. What this also means is that we don’t necessarily have the ability to set up actual test cases filled with asserts as one does with unit testing; instead we need mechanisms to verify the state of the compositor in a way that looks more like regular usage. Enter “perf tests”.

Since many years back, GNOME Shell has had automated performance tests, that would measure how well the Shell performed doing various tasks. Each test is a tiny JavaScript function that performs a few operations, while making sure all the performed operations actually happened, and when it finishes, the Shell instance is terminated. For example, a “perf test” could look like

  1. Open overview
  2. Open notifications
  3. Close notifications
  4. Leave overview

As it turns out, this infrastructure fits rather neatly with the kind of testing we want to add here – tests that perform various tasks that exercise user facing functionality.

There are, however, more ways to verify that things behave as expected than triggering these operations and ensuring that they executed correctly. The most immediate next step is to ensure that there were no warnings logged during the whole test run. This is useful in part due to the fact that GNOME Shell is largely written in JavaScript, as this means the APIs provided by lower level components such as Mutter and GLib tend to have runtime input validation in introspected API entry points. Consequently, if an API is misused by some JavaScript code, it tends to result in warnings being logged. We can be more confident that a particular change won’t introduce regressions when it runs GNOME Shell completely without warnings.

This, however, is easier said than done, for two main reasons: we’ll be running in a container, and there are complications that come with mixing the memory management models of different programming languages.

Running GNOME Shell in a container

For tests to be useful, they need to run in CI. Running in CI means running in a container, and that is not all that straightforward when it comes to compositors. The containerized environment is rather different from a regularly installed and set up Linux distribution; it lacks many services that are expected to be running and that provide important functionality needed to build a desktop environment, like service and session management (e.g. logging out), system management (e.g. rebooting), dealing with network connectivity, and so on.

Running with most of these services missing is possible, but results in many warnings, and a partially broken session. To get any useful testing done, we need to eliminate all of these warnings, without just silencing them. Enter service mocking.

Mocked D-Bus Services

In the world of testing, “mocking” involves creating an implementation of an API, without the actual real world API implementation sitting behind it. Often these mocked services provide a limited pre-defined subset of functionality, for example hard coding results of API operations given a pre-defined set of possible input arguments. Sometimes, a mocked API only needs to be there to pretend that a service is available, and nothing more is needed unless the functionality it provides needs to be actively triggered.
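The idea of hard-coded results can be shown with a tiny, D-Bus-free Python sketch. This is not python-dbusmock’s API; the MockService class, the org.example.Power name and its methods are invented purely for illustration.

```python
class MockService:
    """Answers method calls from a fixed table instead of doing real work."""

    def __init__(self, name, canned):
        self.name = name
        self._canned = dict(canned)  # (method, args) -> hard-coded result
        self.calls = []              # record of calls, handy for assertions

    def call(self, method, *args):
        self.calls.append((method, args))
        try:
            return self._canned[(method, args)]
        except KeyError:
            # Anything outside the pre-defined subset is simply not provided.
            raise NotImplementedError(
                f"{self.name}: {method}{args} is not mocked") from None


# A hypothetical power service that always reports the same battery state.
power = MockService("org.example.Power", {
    ("GetBatteryLevel", ()): 80,
    ("IsOnBattery", ()): True,
})
level = power.call("GetBatteryLevel")
on_battery = power.call("IsOnBattery")
```

The code under test cannot tell the difference as long as it only uses the mocked subset, which is what makes it possible to run a session without the real services present.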

As part of CI testing in Mutter, the basic building blocks for mocking services needed to run a display server in CI have been implemented, but GNOME Shell needs many more compared to plain Mutter. As of this writing, in addition to the few APIs Mutter relies on, GNOME Shell also needs the following:

  • org.freedesktop.Accounts (accountsservice) – For e.g. the lock screen
  • org.freedesktop.UPower (upower) – E.g. battery status
  • org.freedesktop.NetworkManager (NetworkManager) – Manage internet
  • org.freedesktop.PolicyKit1 (polkit) – Act as a PolKit agent
  • net.hadess.PowerProfiles (power-profiles-daemon) – Power profiles management
  • org.gnome.DisplayManager (gdm) – Registering with GDM
  • org.freedesktop.impl.portal.PermissionStore (xdg-permission-store) – Permission checking
  • org.gnome.SessionManager (gnome-session) – Log out / Reboot / …
  • org.freedesktop.GeoClue2 (GeoClue) – Geolocation control
  • org.gnome.Shell.CalendarServer (gnome-shell-calendar-server) – Calendar integration

The mock services used by Mutter are implemented using python-dbusmock, and Mutter conveniently installs its own service mocking implementations. Building on top of this, we can easily continue mocking API after API until all the needed ones are provided.

As of now, either upstream python-dbusmock or GNOME Shell have mock implementations of all the mentioned services. All but one, org.freedesktop.Accounts, either existed or needed a trivial implementation. In the future, for further testing that involves interacting with the system, e.g. configuring Wi-Fi, we will need to expand what these mocked API implementations can do, but for what we’re doing initially, it’s good enough.

Terminating GNOME Shell

Mixing JavaScript, a garbage collected language, and C, with all its manual memory management, has its caveats, and this is especially true during teardown. In the past the Mutter context was terminated first, later followed by the JavaScript context. Terminating the JavaScript context last prevented Clutter and Mutter objects from being destroyed, as JavaScript may still have references to these objects. If you ever wondered why there tend to be warnings in the journal when logging out, this is why. All of these warnings and potential crashes mean any tests that rely on zero warnings would fail. We can’t have that!

To improve this situation, we have to shuffle things around a bit. In rough terms, we now terminate the JavaScript context first, ensuring there are no references held by JavaScript, before tearing down the backend and the Mutter context. To make this possible without introducing even more issues, this meant tearing down the whole UI tree on shut-down, making sure the actual JavaScript context disposal more or less only involves cleaning up defunct JavaScript objects.

In the past, this has been complicated too, since not all components can easily handle bits and pieces of the Shell getting destroyed in a rather arbitrary order, as it means signals get emitted when they were not expected to, e.g. when parts of the shell that was expected to still exist has already been cleaned up. A while ago, a new door was opened making it possible to handle rather conveniently: enter the signal tracker, a helper that makes it possible to write code using signal handlers that automatically disconnects signal handlers on shutdown.
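The general pattern behind such a helper can be sketched in Python as follows. This is not GNOME Shell’s actual signal tracker, which is written in JavaScript; the Emitter and SignalTracker classes and the monitors-changed signal name are stand-ins invented for this example.

```python
from typing import Callable, Dict, List, Tuple


class Emitter:
    """Tiny stand-in for a GObject-style signal source."""

    def __init__(self) -> None:
        self._handlers: Dict[int, Tuple[str, Callable[..., None]]] = {}
        self._next_id = 1

    def connect(self, signal: str, handler: Callable[..., None]) -> int:
        handler_id = self._next_id
        self._next_id += 1
        self._handlers[handler_id] = (signal, handler)
        return handler_id

    def disconnect(self, handler_id: int) -> None:
        self._handlers.pop(handler_id, None)

    def emit(self, signal: str, *args) -> None:
        # Iterate over a copy so handlers may disconnect while being called.
        for name, handler in list(self._handlers.values()):
            if name == signal:
                handler(*args)


class SignalTracker:
    """Remembers connections made for an owner so they can be dropped at once."""

    def __init__(self) -> None:
        self._connections: List[Tuple[Emitter, int]] = []

    def connect(self, emitter: Emitter, signal: str,
                handler: Callable[..., None]) -> None:
        self._connections.append((emitter, emitter.connect(signal, handler)))

    def destroy(self) -> None:
        for emitter, handler_id in self._connections:
            emitter.disconnect(handler_id)
        self._connections.clear()


display = Emitter()
seen = []
tracker = SignalTracker()
tracker.connect(display, "monitors-changed", lambda: seen.append("panel updated"))
display.emit("monitors-changed")  # the handler runs
tracker.destroy()                 # e.g. when the owning component is torn down
display.emit("monitors-changed")  # nothing runs, the handler is gone
```

Since the tracker owns the bookkeeping, a component being torn down does not have to remember every handler it connected, and late signal emissions no longer reach code whose objects are already gone.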

With the signal tracker in place and in use, a few smaller final fixes here, and the aforementioned reversed order in which we tear down the JavaScript context and the Mutter bits, we can now terminate without any warnings being logged.

And as a result, the tests pass!

Enabled in CI

Right now we’re running the “basic” perf test on each merge request in GNOME Shell. It performs some basic operations, including opening the quick settings menu, handling an incoming notification, and opening the overview and application grid. A screen recording of what it does can be seen below.

What’s Next

More Tests

Testing more functionality than basic.js. There are some more existing “perf tests” that could potentially be used, but tests that aim for testing specific functionality, for example window management, or configuring the Wi-Fi, that isn’t related to performance don’t really exist yet. This will become easier after the port to standard JavaScript modules, when tests no longer have to be included in the gnome-shell binary itself.

Input Events

So far, widgets are triggered programmatically. Using input events via virtual input devices means we get more fully tested code paths. Better test infrastructure for things related to input is being worked on for Mutter, and can hopefully be reused in GNOME Shell’s tests.

Running tests from Mutter’s CI

GNOME Shell provides a decent sanity test for Clutter, Mutter’s compositing library, so ensuring that it runs successfully and without warnings is useful to make sure changes there don’t introduce regressions.

Screenshot-based Tests

Using so-called reference screenshots, tests will be able to ensure there were no actual visual changes unless they were intended. The basic infrastructure exists in and can be exposed by Mutter, but for something like GNOME Shell, we probably need a way to store reference images other than in-tree, as is done in Mutter, in order to not make the gnome-shell git repository grow out of hand.
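As a rough sketch of what a reference screenshot comparison boils down to, the following Python snippet compares two images pixel by pixel. It is an illustration only and does not reflect Mutter's actual test infrastructure; the `images_match` helper, the image representation (rows of RGB tuples) and the `tolerance` parameter are all assumptions made for the example.

```python
def images_match(reference, actual, tolerance=0):
    """Return True if both images have the same size and every color
    channel differs by at most `tolerance`.

    Images are lists of rows, each row a list of (r, g, b) tuples.
    """
    if len(reference) != len(actual):
        return False
    for ref_row, act_row in zip(reference, actual):
        if len(ref_row) != len(act_row):
            return False
        for ref_px, act_px in zip(ref_row, act_row):
            if any(abs(r - a) > tolerance for r, a in zip(ref_px, act_px)):
                return False
    return True


# A 2x2 reference image and a capture with one slightly different pixel.
reference = [[(0, 0, 0), (255, 255, 255)],
             [(255, 0, 0), (0, 0, 255)]]
capture = [[(0, 0, 0), (255, 255, 255)],
           [(254, 0, 0), (0, 0, 255)]]

exact = images_match(reference, capture)               # strict comparison
fuzzy = images_match(reference, capture, tolerance=1)  # allow tiny deviations
```

A small tolerance is one way to keep such tests from failing on rounding differences between renderers, at the cost of missing very subtle regressions.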


Currently the tests use a single fixed resolution virtual monitor, but this should be expanded to involve multi monitor and hotplugging. Mutter has ways to create virtual monitors, but does not yet expose this via an API consumable by GNOME Shell.

GNOME Shell Extensions

Not only does GNOME Shell itself need testing; running tests specifically for extensions, or running GNOME Shell’s own tests as part of testing extensions, would have benefits as well.

GNOME Shell on mobile: An update

It’s been a while since the last update on GNOME Shell mobile, but there’s been a huge amount of progress during that time, which culminated in a very successful demo at the Prototype Fund Demo Day last week.

The current state of the project is that we have branches with all the individual patches for GNOME Shell and Mutter, which together comprise a pretty complete mobile shell experience. This includes all the basics we set out to cover during the Prototype Fund project (navigation gestures, screen size detection, app grid, on-screen keyboard, etc.) and some additional things we ended up adding along the way.

The heart of the mobile shell experience is the sophisticated 2D gesture navigation: The gestures to go to the overview vertically and switch horizontally between apps are fluid, interruptible, and multi-dimensional. This allows for navigation which is not only quick and ergonomic, but also intuitive thanks to an incredibly simple spatial model.

While the overall gesture paradigm we use is quite similar to what iOS and Android have, there’s one important difference: We have a single overview for both launching and switching, instead of two separate screens on iOS (home screen and multitasking) and three separate screens on Android (home screen, app drawer, multitasking).

This allows us to avoid the awkward “swipe, stop, and wait” gesture to go to multitasking that other systems rely on, as well as the confusing spatial model, where apps live both within the app icon and next to the home screen, and sometimes show up from the left when swiping… up?

Our overview is always a single swipe away, and allows instant access to both open apps and the app grid, without having to choose between launching and switching.

In case you’re wondering where the “overview” state with just the multitasking cards (like we had in previous iterations) went – After some experimentation and informal user research we realized that it’s not really adding any value over the row of thumbnails in the app grid state. The smaller thumbnails are more than large enough to interact with, and more useful because you can see more of them at the same time.

We ported the shell search experience to a single-column layout for the narrower screen, which coincidentally is a direction we’re also exploring for the desktop search layout.

We completely replaced the on-screen keyboard gesture input, applying several tricks that OSKs on other mobile OSes employ, e.g. releasing the currently pressed key when another one is pressed. The heuristics for when the keyboard shows up are a lot more intuitive now and more in line with other mobile OSes.

The keyboard layout was adapted to the narrower size and the emoji keyboard got a redesign. There’s also a very fancy new gesture for hiding the keyboard, and it automatically hides when scrolling the view.

The app grid layout was adapted to portrait sizes, including a new style for folders and lots of spacing and padding tweaks to make it work well for the phone use case. All the advanced re-ordering and organizing features the app grid already had before are of course available.

Luckily for us, Florian independently implemented the new Quick Settings this cycle. These work great on the phone layout, but on top of that we also added notifications to that same menu, to get a unified system menu you can open with a swipe from the top. This is not as mature as other parts of the mobile shell yet and needs further work, which we’ll hopefully get to soon as part of the planned notifications overhaul.

One interesting new feature here is that notifications can be swiped away horizontally to close, and notification bubbles can be swiped up to hide them.

Next steps

From a development perspective the next steps are primarily upstreaming all of the work done so far, starting with the new gesture API, which is used by many different parts of the mobile shell and will bring huge improvements to gestures on desktop as well. This upstreaming effort is going to require many separate merge requests that depend on each other, and will likely take most of the 44 cycle.

Beyond upstreaming what already exists there are many additional things we want or need to work on to make the mobile experience really awesome, including:

  • Calls on the lock screen (i.e. an API for apps to draw over the lock screen)
  • Emergency calls
  • Haptic feedback
  • PIN Unlock
  • Adapt terminal keyboard layout for mobile, more custom keyboard layouts e.g. for URLs
  • Notifications revamp, including grouping and better actions
  • Flashlight quick settings toggle
  • Workspace reordering in the overview

There are also a few rough edges visually which need lower-level changes to fix:

  • Rounded thumbnails in the overview
  • Transparent panel
  • A way for apps to draw behind the top and bottom bars and the keyboard (to allow for glitch-free keyboard showing/hiding)

Help with any of the above would be highly appreciated!

How to try it

In addition to further development work there’s also the question of getting testing images. While the current version is definitely still work in progress, it’s quite usable overall, so we feel it would make sense to start having experimental GNOME OS Nightly images with it. There’s also postmarketOS, who are working to add builds of the mobile shell to their repositories.

The hardware question

The main question we’re being asked by everyone is “What device do I have to get to start using this?”, which at this stage is especially important for development. Unfortunately there’s not a great answer to this right now.

So far we used a Pinephone Pro sponsored by the GNOME Foundation to allow for testing, but unfortunately it’s nowhere near ready in terms of hardware enablement and it’s unclear when it will be.

The original Pinephone is much further along in hardware enablement, but the hardware is too weak to be realistically usable. The Librem 5 is probably the best option in both hardware support and performance, but it still takes a long time to ship. There are a number of Android phones that sort of work, but there unfortunately isn’t one that’s fully mainlined, performant enough, and easy to buy.

Thanks to the Prototype Fund

All of this work was possible thanks to the Prototype Fund, a grant program supporting public interest software by the German Ministry of Education (BMBF).



Towards GNOME Shell on mobile

As part of the design process for what ended up becoming GNOME 40 the design team worked on a number of experimental concepts, a few of which were aimed at better support for tablets and other smaller devices. Ever since then, some of us have been thinking about what it would take to fully port GNOME Shell to a phone form factor.

GNOME Shell mockup from 2020, showing a tiling-first tablet shell overview and two phone-sized screens
Concepts from early 2020, based on the discussions at the hackfest in The Hague

It’s an intriguing question because post-GNOME 40, there’s not that much missing for GNOME Shell to work on phones, even if not perfectly. A few of the most difficult pieces you need for a mobile shell are already in place today:

  • Fully customizable app grid with pagination, folders, and drag-and-drop re-ordering
  • “Stick-to-finger” horizontal workspace gestures, which are pretty close to what we’d want on mobile for switching apps
  • Swipe up gesture for navigating to overview and app grid, which is also pretty close to what we’d want on mobile

On top of that, many of the things we’re currently working towards for desktop are also relevant for mobile, including quick settings, the notifications redesign, and an improved on-screen keyboard.

Possible thanks to the Prototype Fund

Given all of this synergy, we felt this is a great moment to actually give mobile GNOME Shell a try. Thanks to the Prototype Fund, a grant program supporting public interest software by the German Ministry of Education (BMBF), we’ve been working on mobile support for GNOME Shell for the past few months.


We’re not expecting to complete every aspect of making GNOME Shell a daily driveable phone shell as part of this grant project. That would be a much larger effort because it would mean tackling things like calls on the lock screen, PIN code unlock, emergency calls, a flashlight quick toggle, and other small quality-of-life features.

However, we think the basics of navigating the shell, launching apps, searching, using the on-screen keyboard, etc. are doable in the context of this project, at least at a prototype stage.

Three phone-sized UI mockups, one showing the shell overview with multitasking cards, the second showing the app grid with tiny multitasking cards on top, and the third showing quick toggles with notifications below.
Mockups for some of the main GNOME Shell views on mobile (overview, app grid, system status area)

Of course, making a detailed roadmap for this kind of effort is hard and we will keep adjusting it as things progress and become more concrete, but these are the areas we plan to work on in roughly the order we want to do them:

  • New gesture API: Technical groundwork for the two-dimensional navigation gestures (done)
  • Screen size detection: A way to detect the shell is running on a phone and adjust certain parts of the UI (done)
  • Panel layout: Using the former, add a separate mobile panel layout, with a different top panel and a new bottom panel for gestures (in progress)
  • Workspaces and multitasking: Make every app a fullscreen “workspace” on mobile (in progress)
  • App Grid layout: Adapt the app grid to the phone portrait screen size, ideally as part of a larger effort to make the app grid work better at various resolutions (in progress)
  • On-screen keyboard: Add a narrow on-screen keyboard mode for mobile portrait
  • Quick settings: Implement the new quick settings designs

Current Progress

One of the main things we want to unlock with this project is the fully semantic two-dimensional navigation gestures we’ve been working towards since GNOME 40. This required reworking gesture recognition at a fairly basic level, which is why most of the work so far has been focused around unlocking this. We introduced a new gesture tracker and had to rewrite a fair amount of the input handling fundamentals in Clutter.

Designing a good API around this took a lot of iterations and there’s a lot of interesting details to get into, but we’ll cover that in a separate deep-dive blogpost about touch gesture recognition in the near future.

Based on the gesture tracking rework, we were able to implement two-dimensional gestures and to improve the experience on touchscreens quite a bit in general. For example, the on-screen keyboard now behaves a lot more like you’re used to from your smartphone.

Here’s a look at what this currently looks like on laptops (highly experimental, the second bar would only be visible on phones):

Some other things that already work or are in progress:

  • Detecting that we’re running on a phone, and disabling/adjusting UI elements based on that
  • A more compact app grid layout that can fit on a mobile portrait screen
  • A bottom bar that can act as handle for gesture navigation; we’ll definitely need this for mobile but it’s also a potentially interesting future direction for larger screens

Taken together, here’s what all of this looks like on actual phone hardware right now:

Most of this work is not merged into Mutter and GNOME Shell yet, but there are already a few open MRs in case you’d like to dive into the details:

Next Steps

There’s a lot of work ahead, but going forward progress will be faster and more visible because it will be work on the actual UI, rather than on internal APIs. Now that some of the basics are in place we’re also excited to do more testing and development on actual phone hardware, which is especially important for tweaking things like the on-screen keyboard.

Photo of the app grid on a Pinephone Pro leaning against a wood panel.
The current prototype running on a Pinephone Pro sponsored by the GNOME Foundation

An Eventful Instant

Artists, gamers, rejoice! GNOME Shell 42 will let applications handle input events at the full input device rate.

It’s a long story

Traditionally, GNOME Shell has been compressing pointer motion events so that their handling is synchronized to the monitor refresh rate. This means applications would typically see approximately 60 events per second (or 144 if you follow the trends).
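To illustrate what motion compression means in practice, here is a small Python sketch that keeps only the latest motion event per display frame. It is a simplified model rather than Clutter's actual implementation; the `compress_motion` function, the event tuples and the 1000 events per second input are hypothetical.

```python
def compress_motion(events, refresh_hz=60):
    """Keep only the latest motion event for each display frame.

    `events` is an iterable of (timestamp_ms, x, y) tuples with integer
    millisecond timestamps, sorted by time.
    """
    latest_per_frame = {}
    for timestamp_ms, x, y in events:
        # Integer arithmetic avoids rounding issues at frame boundaries.
        frame_index = timestamp_ms * refresh_hz // 1000
        latest_per_frame[frame_index] = (timestamp_ms, x, y)
    return [latest_per_frame[i] for i in sorted(latest_per_frame)]


# One second of input from a hypothetical device reporting every millisecond.
events = [(t, t, 0) for t in range(1000)]
compressed = compress_motion(events, refresh_hz=60)

print(len(events), "events in,", len(compressed), "events out")
```

With a 60Hz display, the thousand input events collapse into 60 delivered events, and every intermediate position within a frame is lost to the application.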

This trait, inherited from the early days of Clutter, was not just a shortcut: handling motion events implies looking up the actor that is beneath the pointer (mainly so we know which actor to send the event to), and that was an expensive enough operation that it made sense to do it at the lowest frequency possible. If you are a regular reader of this blog you might remember how this area got great improvements in the past.

But that alone is not enough: motion events can also end up handled in JS land, and it is in the best interest of GNOME Shell (and people complaining about frame loss) that we don’t need to jump into the JavaScript machinery too often in the course of a frame. This again makes sense to keep to a minimum.

Who wants it different?

Applications typically don’t care a lot about motion events, beyond keeping up with the frame rate. Others, however, rely on motion event data so heavily that this event compression is suboptimal.

Some examples where sending input events at the device rate matters:

  • Applications that use device input for velocity/direction/acceleration calculations (e.g. a drawing app applying a brush effect) want as much granularity as possible; compressing events is going to smooth values and tamper with those calculations.
  • Applications that render more often than the frame rate (e.g. games with vsync off) may spend multiple frames without seeing a motion event. Many of those are also timing sensitive, and want not just as much granularity as possible, but also events delivered as fast as possible.

How crazy is crazy?

As mentioned, events are now sent at the input device rate, but… what rate is that? This starts at tens of times per second on cheap devices, up to the lower hundred-or-so in your regular laptop touchpad, to the low hundreds on drawing tablets.

But enter the gamer: high-end gaming mice have an input frequency of 1000Hz, which means there are approximately 16 events per frame (in the typical case of a 60Hz display) that must get through to the application ASAP. This use case is significantly more demanding than the others, and not by a small margin.

A look under the hood

Having to look up the actor beneath the pointer 1000 times a second (16x as often) means it doesn’t suffice to avoid GPU-based picking in favor of SIMD operations; there has to be a very aggressive form of caching as well.

To keep the required calculations to a minimum, Mutter now caches a set of rectangles that approximates the visible, uncovered area of the actor beneath the pointer. These are in the same coordinate space as input events so comparisons are direct. If the pointer moves outside the expressed region or the cache is dropped by other means (e.g. a relayout), the actor is looked up again and the new area cached.
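The caching strategy described here can be sketched in a few lines of Python. This is a simplified model and not Mutter's code: `Rect`, `PickCache` and the `slow_pick` callback standing in for the expensive actor lookup are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float

    def contains(self, px, py):
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


class PickCache:
    """Caches the actor under the pointer together with rectangles
    approximating its visible, uncovered area."""

    def __init__(self, pick_func):
        # pick_func(x, y) -> (actor, [Rect, ...]) is the expensive lookup.
        self._pick_func = pick_func
        self._actor = None
        self._rects = []
        self.full_picks = 0

    def invalidate(self):
        # Called when the cache is dropped by other means, e.g. a relayout.
        self._actor = None
        self._rects = []

    def actor_at(self, x, y):
        if self._actor is not None and any(r.contains(x, y) for r in self._rects):
            return self._actor  # cheap path: still inside the cached area
        self.full_picks += 1
        self._actor, self._rects = self._pick_func(x, y)
        return self._actor


def slow_pick(x, y):
    # Hypothetical scene: two actors splitting a 200x100 stage in half.
    if x < 100:
        return "left-button", [Rect(0, 0, 100, 100)]
    return "right-button", [Rect(100, 0, 100, 100)]


cache = PickCache(slow_pick)
hits = [cache.actor_at(x, 50) for x in range(200)]
print(cache.full_picks, "full lookups for", len(hits), "motion events")
```

Sweeping the pointer across both actors triggers the expensive lookup only when the pointer leaves the cached area; every other event is answered by a handful of rectangle comparisons.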

This is of course most optimal when the actors are big, with pointer picking virtually dropping to 0 on e.g. fullscreen clients, but it helps even when blazing your pointer across many actors on the screen. Crossing a button from left to right can take a surprising number of events.

But what about JavaScript? Would it maybe trigger a thousand times a second? Absolutely not: events as handled within Clutter (and GNOME Shell actors) are still motion compressed. This unthrottled event delivery only applies in the direction of Wayland clients.

Other areas got indirectly stressed by the many additional events, so there have been a number of optimizations across the board to make sure things don’t turn bad even when Mutter is doing so much more.

How does it feel?

This is something you’d have to check for yourself. But we can show you how this looks!

Green is good.

This is a quick and dirty test application that displays timing information about the received events. Some takeaways from this:

  • Due to human limitations, it is next to impossible to produce a steady 1000Hz input rate for a full second. Moving the mouse left and right wastes precious milliseconds decelerating to accelerate again, and even drawing the most perfect circle is too slow to require one event per millisecond. The devices are capable of shorter 1000Hz bursts though.
  • The separation between events (i.e. the time difference between the current and last events as received by the client) is predominantly sub-frame. There is only some larger separation when Mutter is busy putting images onscreen.
  • The event latency (time elapsed between emission by the hw/kernel and reception by the application) is <2ms in most cases. There are surely a few events that take longer to reach the application, but it is essentially noise.
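The two metrics in the list above can be computed from per-event timestamps as in the following Python sketch. The `event_timing` function and the sample values are made up for illustration; they are not taken from the test application or from real measurements.

```python
def event_timing(samples):
    """Compute per-event timing from (emitted_us, received_us) pairs.

    Returns (separations, latencies) in microseconds:
    - separation: time between receiving consecutive events in the client
    - latency: time between emission by the device/kernel and reception
    """
    latencies = [received - emitted for emitted, received in samples]
    separations = [
        later[1] - earlier[1] for earlier, later in zip(samples, samples[1:])
    ]
    return separations, latencies


# Three hypothetical events from a 1000Hz device (timestamps in microseconds).
samples = [(0, 1500), (1000, 2400), (2000, 3600)]
separations, latencies = event_timing(samples)

frame_us = 1_000_000 // 60  # duration of one frame on a 60Hz display
sub_frame = all(s < frame_us for s in separations)
```

In this made-up sample every event arrives less than a frame after the previous one and with a latency below 2ms, which is the pattern the takeaways describe.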

Gamers (and other people that care about responsiveness) should notice this as “less janky”.

Didn’t drawing tablets have this before?

Yes and no. Mutter skipped motion compression altogether for drawing tablets, since the applications interested in these really preferred the extra events despite the drawbacks. With these changes in place, drawing tablet users will purely benefit from the improved performance.

Why so loooong

If you have been following GNOME Shell development, you might have heard about this change before. Why did it take so long to have this merged?

The showstopper was probably what you would suspect the least: applications that are not handling events. If an application is not reading events in time (is temporarily blocking the main loop, frozen, slow, in a breakpoint, …), these events will queue up.

But this queue is not infinite; the client would eventually be shut down by the compositor. With these input devices that could take a long… less than half a second. Clearly, there had to be a solution in place before we rolled this in.

There’s been some back and forth here, and several proposed solutions. The applied fix is robust, but unfortunately still temporary; a better solution is being proposed at the Wayland library level but it’s unlikely to be ready before GNOME 42. In the meantime, users can happily shake their input devices without thinking about how many times a second is enough.

Until the next adventure!

GNOME 40 & your extension

As you are probably aware by now, GNOME 40 will bring some big changes.

This is exciting, but these changes also mean that some extensions will have to adjust to continue working in GNOME 40.

To help with that, this post provides a brief overview(!) of the most important changes.

You can join the #gnome-shell and #shell-extensions channels on IRC/Matrix for further questions, and the friendly folks of the extensions rebooted project provide helpful resources like a testing VM image as well as advice.


The overview was the focus of the GNOME 40 changes, so it is not surprising that it is also the place where adjustment is most likely to be needed.


This is now the central place that controls the overall state and ties the various overview components together:

    • dash (now horizontal and at the bottom, otherwise largely the same as before)
    • window picker
    • app grid
    • workspace minimap (formerly known as workspace switcher)
    • search controller (formerly known as view selector)

All those components have seen changes to their internals as well, so watch out for those if your extension modifies any of them.

Adjustments, adjustments, adjustments

Most state is now controlled by adjustments, so that transitions can either be animated or controlled by gestures:

    • overview adjustment
      controls the overall overview state, with the possible ControlsState values HIDDEN, WINDOW_PICKER and APP_GRID
    • fit-mode adjustment
      controls how workspaces are displayed, namely whether centering on a single workspace (0) or fitting all workspaces in the available space (1)
    • workspace adjustment
      controls which workspace is in view, that is the value corresponds to the active workspace (or an in-between value during transitions)
    • app grid adjustment
      controls the scroll position of app grid pages
    • workspace state adjustment
      controls whether window previews are shown floating (0) as outside the overview, or spread out according to the used layout strategy (1)

The first one is the most important one, driving both the overview transition and the fit-mode and workspace-state adjustments.
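One way to picture how a single adjustment can drive the others is linear interpolation between per-state target values, as in the Python sketch below. This is only an illustration of the idea: the numeric values assigned to the ControlsState members, the per-state targets in `FIT_MODE` and `WORKSPACE_STATE`, and the `derive` helper are assumptions, not a description of GNOME Shell's code.

```python
import math
from enum import IntEnum


class ControlsState(IntEnum):
    # Numeric values are assumed for this sketch.
    HIDDEN = 0
    WINDOW_PICKER = 1
    APP_GRID = 2


# Assumed target value of each dependent adjustment in each overview state.
FIT_MODE = {
    ControlsState.HIDDEN: 0.0,         # centered on a single workspace
    ControlsState.WINDOW_PICKER: 0.0,
    ControlsState.APP_GRID: 1.0,       # all workspaces fit the available space
}
WORKSPACE_STATE = {
    ControlsState.HIDDEN: 0.0,         # windows floating as outside the overview
    ControlsState.WINDOW_PICKER: 1.0,  # windows spread out by the layout strategy
    ControlsState.APP_GRID: 1.0,
}


def derive(overview_value, targets):
    """Interpolate a dependent adjustment from the overview adjustment."""
    value = max(float(ControlsState.HIDDEN),
                min(float(overview_value), float(ControlsState.APP_GRID)))
    lower = ControlsState(math.floor(value))
    upper = ControlsState(min(lower + 1, ControlsState.APP_GRID))
    progress = value - lower
    return targets[lower] + (targets[upper] - targets[lower]) * progress


# Halfway through a swipe from the session into the window picker:
halfway_spread = derive(0.5, WORKSPACE_STATE)
# Halfway between the window picker and the app grid:
halfway_fit = derive(1.5, FIT_MODE)
```

Because the overview adjustment can take any in-between value, the same code path serves both timed animations and gestures that follow the finger.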

Backgrounds have moved into workspaces

This is a relatively minor change, but it affected two extensions I’m maintaining, so I decided it was worth mentioning after all.


Extension preferences must use GTK4 now.

It is not possible to use both GTK3 and GTK4 from the same process, so we all have to take the plunge together; and as the process that opens preference dialogs was ported, now is an excellent time for that 🙂

The GTK documentation contains a migration guide that lists most of the changes that are required.

Porting a single preference dialog should be a lot easier than porting an entire application. At least that’s what I found when porting the gnome-shell-extensions and Fedora’s background-logo extensions, so hopefully it won’t be much more work for you.

Version validation

With all those changes, we expect more extensions to have compatibility issues than usual.

To protect against that, we are again doing version validation. That means unless the shell-version field in an extension’s metadata.json file includes “40”, it will be disabled and marked as out-of-date.

Apropos “version”: We are following the new GNOME version scheme, so if you do any version comparisons yourself, make sure to take the major version into account.
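For extensions or tools that do their own version comparisons, the following Python sketch shows one way to take the major version into account under both the old and the new scheme. It is not the check GNOME Shell itself performs; the `major_version` and `is_compatible` helpers are hypothetical, and the rule that the old scheme's "major" part is the first two components (e.g. 3.38) is an assumption of this example.

```python
import json


def major_version(version):
    """Return the part of a version string that identifies a release series.

    New scheme: "40.beta" or "40.1" -> "40".
    Old scheme: "3.38.2" -> "3.38" (assumed: first two components).
    """
    parts = version.split(".")
    if parts[0].isdigit() and int(parts[0]) >= 40:
        return parts[0]
    return ".".join(parts[:2])


def is_compatible(metadata_text, shell_version):
    """Check whether the extension metadata lists the running shell series."""
    metadata = json.loads(metadata_text)
    supported = metadata.get("shell-version", [])
    wanted = major_version(shell_version)
    return any(major_version(v) == wanted for v in supported)


updated = '{"shell-version": ["3.38", "40"]}'
outdated = '{"shell-version": ["3.36", "3.38"]}'

print(is_compatible(updated, "40.0"))      # lists the 40 series
print(is_compatible(outdated, "40.beta"))  # does not list the 40 series
```

Comparing release series rather than full version strings means a point release such as 40.1 does not require another metadata update.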

… and one more thing

We no longer put arrows in top bar menus.

There have been no significant changes to top bar menus this cycle, so if your extension just adds a menu or indicator, it is unlikely to break.

It will just look a bit foreign if you show an arrow next to your menu, so we recommend removing it.

GNOME Shell 40 and multi-monitor

Multi-monitor has come up a fair bit in conversations about the GNOME Shell UX updates that are coming in GNOME 40. There’s been some uncertainty and anxiety in this area, so we wanted to provide more detail on what the multi-monitor experience will exactly be like, so people know what to expect. We also wanted to provide background on the decisions that have been made.


Before we get into multi-monitor, a short status update! As you would expect for this stage in the development cycle, the main bulk of the UI changes are now in the GNOME Shell master branch. This was the result of a really hard push by Georges and Florian, so huge thanks to them! Anyone who is interested in following this work should ensure that they are running the master branch, and not the now redundant development branch.

There are still a few relatively minor UI changes that we are hoping to land this cycle, but overall the emphasis is now on stabilisation and bug fixing, so if you are testing and have spotted any issues, now’s the time to report them.


OK, back to multi-monitor!

In many key respects, multi-monitor handling in GNOME 40 will be identical to how it is in 3.38. GNOME 40 still defaults to workspaces only on the primary display, as we have since 3.0. The top bar and overview will only be shown on the primary display, and the number of workspaces will still be dynamic. GNOME 40 should therefore feel very similar to previous GNOME versions in many respects.

That still leaves a lot of unanswered questions, of course, so let’s run through the GNOME 40 multi-monitor experience in more detail. Much of this concerns how workspaces will work in combination with multi-monitor setups.

Default configuration

As mentioned already, GNOME 40 will continue to default to only showing workspaces on the primary display. With a dual display setup, the overview will therefore look like this by default:

One detail to notice is how we’re scaling down the background on the secondary display, to communicate that it’s a single workspace like those on the primary display. We feel that this presentation helps to make the logic of the multiple displays clearer, and helps to unify the different screens.

To get an idea of what this will look like in use, Jakub Steiner has kindly created some motion mockups. These are intended to communicate how each part fits together and what the transitions will be like, rather than being a 100% accurate rendering of the final product (in particular, the transitions have been slowed down).

Here you can get an idea of what it will look like opening the overview and moving between workspaces. Just like the current default configuration, workspace switching only happens on the primary display.

Workspaces on all displays

While workspaces only being on the primary display is the default behaviour, GNOME also supports having workspaces on all displays, using the workspaces-only-on-primary settings key. The following static mockup shows what the overview will look like in GNOME 40 with this configuration.

As you can see, this is very similar to the workspaces only on primary configuration. The main difference is that you can see the additional workspaces extending to the right on the secondary display. It’s also possible to see that the workspace navigator (the small set of thumbnails at the top) is visible on both displays. The introduction of the workspace navigator on secondary displays is a new change for GNOME 40, which is intended to improve the experience for users who opt to have workspaces on all displays. We know from our user research that this is something that many users will welcome.

Like in other GNOME versions, when workspaces are on all displays, they are switched in unison. For example, all displays show workspace 2 at the same time. This can be seen in the motion mockups:

Keyboard shortcuts

The existing workspace shortcuts will continue to work in GNOME 40. Super+PgUp/PgDown will continue to switch workspace. Adding Shift will continue to move windows between workspaces.

We are also introducing additional shortcuts which align with the horizontal layout. The new shortcut to switch workspace will be Super+Alt+←/→. Moving windows between workspaces will be Super+Alt+Shift+←/→. Super+Alt+↑ will also open the overview and then app grid, and Super+Alt+↓ will close them.

These directional keyboard shortcuts have matching touchpad gestures: three-finger swipes left and right will switch workspaces, and three-finger swipes up and down will open the overview and app grid.

Why horizontal?

A few people have pointed out that horizontal workspaces aren’t as clean with horizontal multi-monitor setups. The concern is that, when multiple displays are horizontal, they end up clashing with the layout of the workspaces. There is some truth in that, and we recognise that some users might need to adjust to this aspect of the design.

However, it’s worth pointing out that horizontal workspaces are a feature of every other desktop out there. Not only is it how every other desktop does it, but it is also how GNOME used to do it prior to 3.0, and how GNOME’s classic mode continues to do it. Therefore, we feel that horizontal workspaces and horizontally-arranged displays can get along just fine. If anyone is concerned about this, we’d suggest that you give it a try and see how it goes.

Some people have also asked why we are making the switch to horizontal workspaces at all, which is fair! Here I think that it needs to be understood that horizontal workspaces are fundamental to the design we’re pursuing for 40: the film-strip of workspaces (which proved effective in testing), the clearer organisation of the overview, the coherent touchpad gestures, a dash that can more comfortably scale to include more items, and so on. This is all facilitated by the workspace orientation change, and would not be possible without it.

GNOME ❤️ multi-monitor

In case there’s any doubt: multi-monitor is absolutely a priority for us in the shell design and development team. We know that the multi-monitor experience is important to many GNOME users (including many of us who work on GNOME!), and it is something that we’re committed to improving. This applies to both the default workspaces behaviour and the workspaces on all displays option.

Multi-monitor considerations regularly featured in the design planning for GNOME 40. They were also a research theme, both in our early discovery interviews and survey, as well as in the diary study that we ran. As a result of this, we are confident that GNOME 40 will provide an excellent multi-monitor experience.

We actually have a few plans for multi-monitor improvements in the future. Some of these pre-date the GNOME 40 work that is currently happening, and we hope to get back to them during the next development cycle. Our ambition is for the multi-monitor story to keep on getting better.

Thanks for reading!

Shell UX Changes: The Research

This post is part of an ongoing series about the overview design changes which are being worked on for GNOME 40. (For previous posts, see here.)

Ongoing user research has been a major feature of this design initiative, and I would say that it is by far the best researched project that I have worked on. Our research has informed the design as it has proceeded, resulting in particular design choices and changes, which have improved the overall design and will make it a better experience for users. As a result of this, we have a much greater degree of confidence in the designs.

This post is intended as a general overview of the research that we’ve been doing. I’m excited to share this, as a way of explaining how the design came about, as well as sharing some of the insights that we’ve found along the way.

What we did

In total, we conducted six separate research exercises as part of this initiative. These ran alongside the design and development effort, in order to answer the questions we had at each stage of the process.

Many of the research exercises were small and limited. This reflected our ambition to use a lean approach, which suited the limited resources we had available. These smaller exercises were supplemented with a larger piece of paid research, which was conducted for us by an external research company. In what follows I’ll go through each exercise in order, and give a brief description of what was done and what we found out.

So far the data from our research isn’t publicly available, largely because it contains personal information about test participants which can’t be shared. However, we do plan on making a version of the data available, with any identifying information removed.

1. Exploratory interviews

I already blogged about this exercise back in September. A summary: the initial interviews were an exploratory, sensitising exercise to find out how existing users used and felt about GNOME Shell. We spoke to seven GNOME users who had a range of roles and technical expertise. Each participant showed us their desktop setup and how they used it, and we asked them questions to find out how the existing shell design was working for them.

We found out a bunch of valuable things from those early interviews. A good portion of the people we spoke to really liked the shell, particularly its minimalism and the lack of distractions. We also discovered a number of interesting behaviours and user types around window and workspace usage.

2. Initial behavioural survey

The initial survey exercise was also covered in my September blog post. It was intended to provide some numbers on app, window and workspace usage, in order to provide some insight into the range of behaviours that any design changes needed to accommodate.

The survey was a deliberately quick exercise. We found out that most people had around 8 open windows, and that the number of people with a substantially higher number of open windows was low. We also found that most people were only using a single workspace, and that high numbers of workspaces in use (say, above six) were quite rare.

3. Running apps experiment

During the early design phase, the design team was interested in the role of the running apps in the dash. To explore this, I ran a little experiment: I got colleagues to install an extension which removes running apps from the dash, and asked them to record any issues that they experienced.

We found that most people got along just fine without running apps in the dash. Despite this, in the end we decided to keep the running apps, based on other anecdotal reports we’d seen.

4. External user testing

Thanks to support by Endless, we were lucky to have the opportunity to contract out some research work. This was carried out by Insights and Experimentation Agency Brooks Bell and was contracted under the umbrella of the GNOME Foundation.

The research occurred at a point in the process where we were weighing up key design decisions, which the research was designed to help us answer in an informed manner.


The research itself consisted of 20 moderated user testing sessions, which were conducted remotely. Each participant tested GNOME 3.38 and then either a prototype of the new design or Endless OS. This provided us with a means to compare how each of the three desktops performed, with a view to identifying the strengths and weaknesses of each.

Each session involved a combination of exploration and evaluation. Participants were interviewed about their typical desktop usage, and were invited to recreate a typical desktop session within the test environment. They were then asked to perform some basic tasks. After testing both environments, they were required to fill in a post-test survey to give feedback on the two desktops they had tried.

Research participants included both existing GNOME users and users who had never used GNOME before. The sample included a range of technical abilities and experience levels. It also included a mix of professional and personal computer users. The study was structured in such a way that we could analyse the differences between user groups, and so get a sense of how each desktop performed with each of them. Participants were recruited from six countries: Brazil, Canada, Germany, Italy, the United Kingdom and the USA.

Brooks Bell were a great firm to work with. Our own design and development team were able to have detailed planning conversations with them, as well as lengthy sessions to discuss the research findings. We were also given access to all the research data, to enable us to do our own analysis work.


The external research provided a wealth of useful information and analysis. It addressed the specific research questions that we had for the study, but also went further to address general questions about how and why the participants responded to the designs in the way that they did, as well as identifying a number of unrelated design issues which we hope to address in future releases.

One of the themes in the research was the degree to which users positively responded to UI conventions with which they were already familiar. This was reflected both in how participants responded to the designs in general and in how successfully they were able to use specific aspects of them. For example, interactions with the app grid and dash were typically informed by the participants’ experiences with similar UIs from other platforms.

This might seem like an obvious finding; however, the utility of the research was in demonstrating how this general principle played out in the specific context of our designs. It was also very interesting to see how conventions from both mobile and desktop informed user behaviour.

In terms of specific findings, there wasn’t a single clear story from the tests, but rather multiple overlapping findings.

Existing GNOME users generally felt comfortable with the desktop they already use. They often found the new design to be exciting and liked the look and feel, and some displayed a negative reaction to Endless, due to its similarity with Windows.

“I like the workspaces moving sideways, it feels more comfortable to switch between them.”
—Comment on the prototype by an existing GNOME user

All users seemed to find the new workspace design to be more engaging and intuitive, in comparison with the workspaces in GNOME 3.38. This was one particular area where the new design seemed to perform better than existing GNOME Shell.

“[It feels] quicker to navigate through. It [has a] screen where I can view my desktop at the top and the apps at the bottom, this makes it quicker to navigate.”
—Comment on the prototype by a non-GNOME user

On the other hand, new users generally got up to speed more quickly with Endless OS, often due to its similarity to Windows. Many of these testers found the bottom panel to be an easy way to switch applications. They also made use of the minimize button. In comparison, both GNOME 3.38 and the prototype generally took more adjustment for these users.

“I really liked that it’s similar to the Windows display that I have.”
—Comment on Endless OS by a non-GNOME user

5. Endless user testing

The final two research exercises we conducted were used to fill in specific gaps in our existing knowledge, largely as a validation exercise for the design we were working towards. The first of these consisted of 10 remote user testing sessions, conducted by Endless with participants from Guatemala, Kenya and the USA. These participants were picked from particular demographics that are of importance to Endless, particularly young users with limited computing experience.

Each test involved the participant running through a series of basic desktop tasks. Like the tests run by Brooks Bell, these sessions had a comparative element, with participants trying both Endless OS and the prototype of the new design. In many respects, these sessions confirmed what we’d already found through the Brooks Bell study, with participants both responding well to the workspace design in the prototype, and having to adjust to designs that were unfamiliar to them.

“Everything happens naturally after you go to Activities. The computer is working for you, you’re not working for it”
—Tester commenting on the new design

6. Diary study

The diary study was intended to identify any issues that might be encountered with long-term usage, which might have been missed in the previous user tests. Workspaces and multi-monitor usage were a particular focus for this exercise, and participants were selected based on whether they use these features.

The five diary study participants installed the prototype implementation and used it for a week. I interviewed them before the test to find out their existing usage patterns, then twice more over the test period, to see how they were finding the new design. The participants also kept a record of their experiences with the new design, which we referred back to during the interviews.

This exercise didn’t turn up any specific issues with multi-monitor or workspace usage, despite including participants who used those features. In all, the participants generally had a positive response to the new design and preferred it over the existing GNOME Shell they were using. It should be mentioned that this wasn’t universal, however.

7. Community testing and feedback

While community testing isn’t strictly a research exercise, it has nevertheless been an important part of our data-driven approach for this initiative. One thing that we’ve managed to do relatively successfully is have a range of easy ways to test the new design. This was a priority for us from the start and has resulted in us being able to have a good round of feedback and design adjustment.

It should be noted that those of us on the design side have had detailed follow-up conversations with those who have provided feedback, in order to ensure that we have a complete understanding of the issues as described. (This often requires having background knowledge about users’ setups and usage patterns.) I have personally found this to be an effective way of developing empathy and understanding. It is also a good example of how our previous research has helped, by providing a framework within which to understand feedback.

The main thing that we have got from this stage of the process is testing with a wider variety of setups, which in particular has informed the multi-monitor and workspace aspects of the design.


As I wrote in the introduction to this post, GNOME has never had a design initiative that has been so heavily accompanied by research work. The research we’ve done has undoubtedly improved the design that we’re pursuing for GNOME 40. It has also enabled us to proceed with a greater degree of confidence than we would have otherwise had.

We’re not claiming that every aspect of the research we’ve done has been perfect or couldn’t have been improved. There are gaps which, if we were able to do it all again, we would have liked to have filled. But perfect is the enemy of good and doing some research – irrespective of its issues – is certainly better than doing none at all. Add to this the fact that we have been doing research in the context of an upstream open source project with limited resources, and I think we can be proud of what we’ve achieved.

When you put together the lessons from each of the research exercises we’ve done, the result is a picture of different user segments having somewhat different interests and requirements. On the one hand, we have the large number of people who have never used GNOME or an open source desktop, for whom a familiar design is generally preferable. On the other hand, there are users who don’t want a carbon copy of the proprietary desktops, and there are (probably more technical) users who are particularly interested in a more minimal, pared back experience which doesn’t distract them from their work.

The best way for the GNOME project to navigate this landscape is a tricky question, and it involves a difficult balancing act. However, with the changes that are coming in GNOME 40, we hope that we are starting out on that path, with an approach that adopts some familiar conventions from other platforms while developing and refining GNOME’s unique strengths.

Another Shell UX Update

Another update on the UX changes that are being worked on for GNOME 40! (See previous posts here, here, here…)

A status update

First off, a summary of where we are. Development work has been proceeding apace, and the main batch of changes is currently in the process of being merged into master. Other polish changes are being queued up alongside this, ready to be merged.

This work has primarily been undertaken by Georges Stavracas, with assistance from Florian Müllner. Georges has even been doing live coding sessions, where you can see him do the work in real time!

We currently have about two weeks until UI freeze. This means that, once the major changes have been merged, we will have a short, intense period of polishing and bug fixing work. (If we discover issues after the freeze, there’s also the possibility of getting exceptions to land changes.)

There has been plenty of activity on the design side, too. The design team has been busy testing the development branch and dealing with issues as they have come up. We have a fairly long list of issues that we’re tracking, which we will be turning into a more accessible roadmap very shortly.

We have also been busy discussing solutions to the issues that have come up in feedback. One of the major changes to come out of that is a new workspace navigator, which has been added to the “window picker” in the Activities Overview.

Testing testing testing

We have a variety of methods available for people to test the new design, and invite everyone who is interested to try it out. Once the changes have landed in master, that will be a great time to file issues.

Current testing methods include:

Run a prebuilt VM image in Boxes

Felipe Borges has done a great job creating a VM image containing the changes, which can be downloaded and run in Boxes. To do this:

  • Download the image
  • Extract the downloaded .qcow2.gz file (from Files, right click on the file and click Extract Here)
  • Open Boxes
  • Press the add (+) button, then select Create a virtual machine…
  • Scroll down to Operating System Image File – press that, then select the .qcow2 file that was extracted
  • At the next step, click the Template entry and select Fedora 33. Then click Next.
  • At the next Review and Create step, click Create to make the VM
  • When the VM has finished installing and has booted, log in to the gnome user account. The password is “gnome”.
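If you would rather do the extraction step from a terminal than from Files, something along these lines should work. This is only a sketch: the extract_image function name is made up for illustration, and the path you pass in is whatever your downloaded file happens to be called.

```shell
#!/bin/sh
# Sketch: decompress a downloaded .qcow2.gz image from the terminal.
# extract_image is a hypothetical helper name, not part of Boxes or GNOME.
extract_image() {
    archive="$1"
    case "$archive" in
        *.qcow2.gz) ;;
        *)
            echo "expected a .qcow2.gz file, got: $archive" >&2
            return 1
            ;;
    esac
    # Write the decompressed image next to the archive and keep the archive.
    # The output name is the input name with the trailing .gz removed.
    gzip -dc "$archive" > "${archive%.gz}" && echo "${archive%.gz}"
}
```

Calling extract_image with the path to the download prints the path of the resulting .qcow2 file, which is the one to select in Boxes.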

These images are based on Fedora using a COPR repository. They can be updated using DNF to get the latest design and development changes.

Use the testing COPR from Fedora 33

The main development branches for the changes have been made available as a COPR repository for Fedora 33. This can be used with an existing Fedora 33 install, either in a VM or on bare metal.

The obvious warning applies here – this is development software with limited testing, and you will be swapping out key desktop components. You should be prepared for your system to break and have an idea of how to recover should this happen. This is definitely a case of proceeding at your own risk.

Using the COPR is simply a matter of adding it and updating with the following commands:

  • sudo dnf copr enable haeckerfelix/gnome-shell-40
  • sudo dnf update

Then reboot.
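If you want to see exactly what will be run before committing to it, the two commands above can be wrapped in a small script with a dry-run switch. This is a rough sketch: DRY_RUN, run and enable_gnome40_copr are names invented here for illustration and are not features of dnf or COPR.

```shell
#!/bin/sh
# Sketch: enable the testing COPR and update, with an optional dry run.
# DRY_RUN is a hypothetical variable for this script, not a dnf option.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only print the command instead of executing it.
        echo "would run: $*"
    else
        "$@"
    fi
}

enable_gnome40_copr() {
    # Same two commands as listed above; stop if enabling the COPR fails.
    run sudo dnf copr enable haeckerfelix/gnome-shell-40 &&
    run sudo dnf update
}
```

With DRY_RUN set to 1, calling enable_gnome40_copr only prints the two commands. Without it, the commands are run for real, after which you still need to reboot.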

Build the branch in a virtual machine

If you want to track the latest changes in real time, or want to help with development, this could be a good option for you, and doesn’t take a huge amount of work. Instructions can be found here.

What next?

Everyone is welcome to test the design using the methods described above. Note that, at the moment, all of these involve using the development branch which will not be identical to the final implementation. The branch is primarily useful for testing the general design and giving general feedback to the design team. Once more work has landed in master, that will be the time to file specific bugs.

If other testing methods become available, please let us know and we’ll add references to them in the blog post.