GNOME Shell on mobile: An update

It’s been a while since the last update on GNOME Shell mobile, but there’s been a huge amount of progress during that time, which culminated in a very successful demo at the Prototype Fund Demo Day last week.

​The current state of the project is that we have branches with all the individual patches for GNOME Shell and Mutter, which together comprise a pretty complete mobile shell experience. This includes all the basics we set out to cover during the Prototype Fund project (navigation gestures, screen size detection, app grid, on-screen keyboard, etc.) and some additional things we ended up adding along the way.

The heart of the mobile shell experience is the sophisticated 2D gesture navigation: The gestures to go to the overview vertically and switch horizontally between apps are fluid, interruptible, and multi-dimensional. This allows for navigation which is not only quick and ergonomic, but also intuitive thanks to an incredibly simple spatial model.

While the overall gesture paradigm we use is quite similar to what iOS and Android have, there’s one important difference: We have a single overview for both launching and switching, instead of two separate screens on iOS (home screen and multitasking) and three separate screens on Android (home screen, app drawer, multitasking).

This allows us to avoid the awkward “swipe, stop, and wait” gesture to go to multitasking that other systems rely on, as well as the confusing spatial model, where apps live both within the app icon and next to the home screen, and sometimes show up from the left when swiping… up?

Our overview is always a single swipe away, and allows instant access to both open apps and the app grid, without having to choose between launching and switching.

In case you’re wondering where the “overview” state with just the multitasking cards (like we had in previous iterations) went – After some experimentation and informal user research we realized that it’s not really adding any value over the row of thumbnails in the app grid state. The smaller thumbnails are more than large enough to interact with, and more useful because you can see more of them at the same time.

We ported the shell search experience to a single-column layout for the narrower screen, which coincidentally is a direction we’re also exploring for the desktop search layout.

We completely replaced the on-screen keyboard gesture input, applying several tricks that OSKs on other mobile OSes employ, e.g. releasing the currently pressed key when another one is pressed. The heuristics for when the keyboard shows up are a lot more intuitive now and more in line with other mobile OSes.
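The "release the pressed key when another one is pressed" trick can be illustrated with a small sketch. This is not the GNOME Shell implementation; the `KeyTracker` class and its method names are hypothetical and only model the idea that a second touch commits the first key right away.

```python
class KeyTracker:
    """Tracks at most one pressed key and commits it when another is pressed."""

    def __init__(self):
        self.pressed = None   # key currently held down, if any
        self.emitted = []     # keys committed so far, in order

    def press(self, key):
        # A second finger landing on another key commits the first key
        # immediately instead of waiting for the first finger to lift.
        if self.pressed is not None and self.pressed != key:
            self.emitted.append(self.pressed)
        self.pressed = key

    def release(self, key):
        # Releases for keys already committed by a later press are ignored.
        if self.pressed == key:
            self.emitted.append(key)
            self.pressed = None


tracker = KeyTracker()
# Fast typing: "h" is still held when "i" goes down.
tracker.press("h")
tracker.press("i")
tracker.release("h")  # late release of "h" is ignored
tracker.release("i")
print("".join(tracker.emitted))  # prints "hi"
```

Without this rule, overlapping touches during fast two-thumb typing could drop or reorder characters.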

The keyboard layout was adapted to the narrower size and the emoji keyboard got a redesign. There’s also a very fancy new gesture for hiding the keyboard, and it automatically hides when scrolling the view.

The app grid layout was adapted to portrait sizes, including a new style for folders and lots of spacing and padding tweaks to make it work well for the phone use case. All the advanced re-ordering and organizing features the app grid already had before are of course available.

Luckily for us, Florian independently implemented the new Quick Settings this cycle. These work great on the phone layout, but on top of that we also added notifications to that same menu, to get a unified system menu you can open with a swipe from the top. This is not as mature as other parts of the mobile shell yet and needs further work, which we’ll hopefully get to soon as part of the planned notifications overhaul.

One interesting new feature here is that notifications can be swiped away horizontally to close, and notification bubbles can be swiped up to hide them.

Next steps

From a development perspective the next steps are primarily upstreaming all of the work done so far, starting with the new gesture API, which is used by many different parts of the mobile shell and will bring huge improvements to gestures on desktop as well. This upstreaming effort is going to require many separate merge requests that depend on each other, and will likely take most of the 44 cycle.

Beyond upstreaming what already exists there are many additional things we want or need to work on to make the mobile experience really awesome, including:

  • Calls on the lock screen (i.e. an API for apps to draw over the lock screen)
  • Emergency calls
  • Haptic feedback
  • PIN Unlock
  • Adapt terminal keyboard layout for mobile, more custom keyboard layouts e.g. for URLs
  • Notifications revamp, including grouping and better actions
  • Flashlight quick settings toggle
  • Workspace reordering in the overview

There are also a few rough edges visually which need lower-level changes to fix:

  • Rounded thumbnails in the overview
  • Transparent panel
  • A way for apps to draw behind the top and bottom bars and the keyboard (to allow for glitch-free keyboard showing/hiding)

Help with any of the above would be highly appreciated!

How to try it

In addition to further development work there’s also the question of getting testing images. While the current version is definitely still work in progress, it’s quite usable overall, so we feel it would make sense to start having experimental GNOME OS Nightly images with it. There’s also postmarketOS, who are working to add builds of the mobile shell to their repositories.

The hardware question

The main question we’re being asked by everyone is “What device do I have to get to start using this?”, which at this stage is especially important for development. Unfortunately there’s not a great answer to this right now.

So far we used a Pinephone Pro sponsored by the GNOME Foundation to allow for testing, but unfortunately it’s nowhere near ready in terms of hardware enablement and it’s unclear when it will be.

The original Pinephone is much further along in hardware enablement, but the hardware is too weak to be realistically usable. The Librem 5 is probably the best option in both hardware support and performance, but it still takes a long time to ship. There are a number of Android phones that sort of work, but there unfortunately isn’t one that’s fully mainlined, performant enough, and easy to buy.

Thanks to the Prototype Fund

All of this work was possible thanks to the Prototype Fund, a grant program supporting public interest software by the German Ministry of Education (BMBF).

 

 

Towards GNOME Shell on mobile

As part of the design process for what ended up becoming GNOME 40 the design team worked on a number of experimental concepts, a few of which were aimed at better support for tablets and other smaller devices. Ever since then, some of us have been thinking about what it would take to fully port GNOME Shell to a phone form factor.

GNOME Shell mockup from 2020, showing a tiling-first tablet shell overview and two phone-sized screens
Concepts from early 2020, based on the discussions at the hackfest in The Hague

It’s an intriguing question because post-GNOME 40, there’s not that much missing for GNOME Shell to work on phones, even if not perfectly. A few of the most difficult pieces you need for a mobile shell are already in place today:

  • Fully customizable app grid with pagination, folders, and drag-and-drop re-ordering
  • “Stick-to-finger” horizontal workspace gestures, which are pretty close to what we’d want on mobile for switching apps
  • Swipe up gesture for navigating to overview and app grid, which is also pretty close to what we’d want on mobile

On top of that, many of the things we’re currently working towards for desktop are also relevant for mobile, including quick settings, the notifications redesign, and an improved on-screen keyboard.

Possible thanks to the Prototype Fund

Given all of this synergy, we felt this is a great moment to actually give mobile GNOME Shell a try. Thanks to the Prototype Fund, a grant program supporting public interest software by the German Ministry of Education (BMBF), we’ve been working on mobile support for GNOME Shell for the past few months.

Scope

We’re not expecting to complete every aspect of making GNOME Shell a daily driveable phone shell as part of this grant project. That would be a much larger effort because it would mean tackling things like calls on the lock screen, PIN code unlock, emergency calls, a flashlight quick toggle, and other small quality-of-life features.

However, we think the basics of navigating the shell, launching apps, searching, using the on-screen keyboard, etc. are doable in the context of this project, at least at a prototype stage.

Three phone-sized UI mockups, one showing the shell overview with multitasking cards, the second showing the app grid with tiny multitasking cards on top, and the third showing quick toggles with notifications below.
Mockups for some of the main GNOME Shell views on mobile (overview, app grid, system status area)

Of course, making a detailed roadmap for this kind of effort is hard and we will keep adjusting it as things progress and become more concrete, but these are the areas we plan to work on in roughly the order we want to do them:

  • New gesture API: Technical groundwork for the two-dimensional navigation gestures (done)
  • Screen size detection: A way to detect the shell is running on a phone and adjust certain parts of the UI (done)
  • Panel layout: Using the former, add a separate mobile panel layout, with a different top panel and a new bottom panel for gestures (in progress)
  • Workspaces and multitasking: Make every app a fullscreen “workspace” on mobile (in progress)
  • App Grid layout: Adapt the app grid to the phone portrait screen size, ideally as part of a larger effort to make the app grid work better at various resolutions (in progress)
  • On-screen keyboard: Add a narrow on-screen keyboard mode for mobile portrait
  • Quick settings: Implement the new quick settings designs

Current Progress

One of the main things we want to unlock with this project is the fully semantic two-dimensional navigation gestures we’ve been working towards since GNOME 40. This required reworking gesture recognition at a fairly basic level, which is why most of the work so far has been focused around unlocking this. We introduced a new gesture tracker and had to rewrite a fair amount of the input handling fundamentals in Clutter.

Designing a good API around this took a lot of iterations and there are a lot of interesting details to get into, but we’ll cover that in a separate deep-dive blogpost about touch gesture recognition in the near future.

Based on the gesture tracking rework, we were able to implement two-dimensional gestures and to improve the experience on touchscreens quite a bit in general. For example, the on-screen keyboard now behaves a lot more like you’re used to from your smartphone.

Here’s a look at what this currently looks like on laptops (highly experimental, the second bar would only be visible on phones):

Some other things that already work or are in progress:

  • Detecting that we’re running on a phone, and disabling/adjusting UI elements based on that
  • A more compact app grid layout that can fit on a mobile portrait screen
  • A bottom bar that can act as handle for gesture navigation; we’ll definitely need this for mobile but it’s also a potentially interesting future direction for larger screens

Taken together, here’s what all of this looks like on actual phone hardware right now:

Most of this work is not merged into Mutter and GNOME Shell yet, but there are already a few open MRs in case you’d like to dive into the details:

Next Steps

There’s a lot of work ahead, but going forward progress will be faster and more visible because it will be work on the actual UI, rather than on internal APIs. Now that some of the basics are in place we’re also excited to do more testing and development on actual phone hardware, which is especially important for tweaking things like the on-screen keyboard.

Photo of the app grid on a Pinephone Pro leaning against a wood panel.
The current prototype running on a Pinephone Pro sponsored by the GNOME Foundation

An Eventful Instant

Artists, gamers, rejoice! GNOME Shell 42 will let applications handle input events at the full input device rate.

It’s a long story

Traditionally, GNOME Shell has compressed pointer motion events so that their handling is synchronized to the monitor refresh rate. This means applications would typically see approximately 60 events per second (or 144 if you follow the trends).

This trait, inherited from the early days of Clutter, was not just a shortcut. Handling motion events implies looking up the actor that is beneath the pointer (mainly so we know which actor to send the event to), and that was an expensive enough operation that it made sense to do it as rarely as possible. If you are a regular reader of this blog you might remember how this area got great improvements in the past.

But that alone is not enough. Motion events can also end up handled in JS land, and it is in the best interest of GNOME Shell (and people complaining about frame loss) that we don’t need to jump into the JavaScript machinery too often in the course of a frame. This again makes sense to keep to a minimum.
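As a rough illustration of the motion compression described above, the sketch below keeps only the latest event that falls within each frame interval. It is a simplified model, not Mutter's code; `compress_motion` and its `frame_ms` parameter are hypothetical names.

```python
def compress_motion(events, frame_ms):
    """Keep only the latest motion event in each frame interval.

    events: list of (timestamp_ms, x, y) tuples sorted by timestamp.
    frame_ms: frame duration in milliseconds (hypothetical parameter).
    """
    latest_per_frame = {}
    for event in events:
        frame_index = int(event[0] // frame_ms)
        latest_per_frame[frame_index] = event  # later events overwrite earlier ones
    return [latest_per_frame[i] for i in sorted(latest_per_frame)]


# One event per millisecond for 50 ms, i.e. a 1000 Hz device.
events = [(t, t * 2, 0) for t in range(50)]
compressed = compress_motion(events, frame_ms=1000 / 60)
print(len(events), len(compressed))  # prints "50 3"
```

The application still sees the final pointer position for each frame, but everything in between is lost.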

Who wants it different?

Most applications don’t care much about motion events beyond keeping up with the frame rate. Others, however, rely on motion event data so heavily that this event compression is suboptimal.

Some examples where sending input events at the device rate matters:

  • Applications that use device input for velocity/direction/acceleration calculations (e.g. a drawing app applying a brush effect) want as much granularity as possible. Compressing events smooths the values and tampers with those calculations.
  • Applications that render more often than the frame rate (e.g. games with vsync off) may spend multiple frames without seeing a motion event. Many of those are also timing sensitive, and want the events not just with as much granularity as possible, but also delivered as fast as possible.
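To make the first point concrete, here is a minimal sketch of how dropping samples smooths a velocity calculation. The stroke data is invented for illustration, and `speeds` is a hypothetical helper, not part of any toolkit.

```python
def speeds(samples):
    """Per-interval speed (pixels per ms) from (timestamp_ms, position_px) samples."""
    return [
        (p1 - p0) / (t1 - t0)
        for (t0, p0), (t1, p1) in zip(samples, samples[1:])
    ]


# Hypothetical stroke sampled every 2 ms: slow, one quick flick, then slow again.
full_rate = [(0, 0), (2, 2), (4, 4), (6, 24), (8, 26),
             (10, 28), (12, 30), (14, 32), (16, 34)]
# Compression modelled as keeping every fourth sample (one per 8 ms).
compressed = full_rate[::4]

print(max(speeds(full_rate)))   # prints "10.0"
print(max(speeds(compressed)))  # prints "3.25"
```

A brush engine that maps speed to stroke width would see the flick at full rate, but only a mild bump after compression.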

How crazy is crazy?

As mentioned, events are now sent at the input device rate, but… what rate is that? It ranges from tens of events per second on cheap devices, through a hundred or so on a regular laptop touchpad, to the low hundreds on drawing tablets.

But enter the gamer: high-end gaming mice have an input frequency of 1000Hz, which means there are approximately 16 to 17 events per frame (in the typical case of a 60Hz display) that must get through to the application ASAP. This use case is significantly more demanding than the others, and not by a small margin.
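The events-per-frame figure is simple arithmetic: device rate divided by display rate. A quick sketch, with `events_per_frame` as a hypothetical helper name:

```python
def events_per_frame(device_hz, display_hz):
    """Average number of input events that arrive during one display frame."""
    return device_hz / display_hz


for display_hz in (60, 144):
    print(display_hz, round(events_per_frame(1000, display_hz), 1))
# prints "60 16.7" then "144 6.9"
```

Even on a 144Hz display, a 1000Hz mouse delivers several events per frame.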

A look under the hood

Having to look up the actor beneath the pointer 1000 times a second (16x as often) means that avoiding GPU-based picking in favor of SIMD operations is not enough; there has to be a very aggressive form of caching as well.

To keep the required calculations to a minimum, Mutter now caches a set of rectangles that approximates the visible, uncovered area of the actor beneath the pointer. These are in the same coordinate space as input events, so comparisons are direct. If the pointer moves outside the expressed region or the cache is dropped by other means (e.g. a relayout), the actor is looked up again and the new area cached.
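The caching scheme described above could look roughly like the following sketch. It is a simplified model rather than Mutter's actual data structures: `PickCache`, the `pick` callback, and the example scene are all hypothetical.

```python
class PickCache:
    """Caches the actor under the pointer together with rectangles covering it."""

    def __init__(self, pick):
        self._pick = pick        # expensive lookup: (x, y) -> (actor, rectangles)
        self._actor = None
        self._rects = []         # (x, y, width, height) in input coordinates
        self.lookups = 0         # number of expensive lookups performed

    def invalidate(self):
        # Called when e.g. a relayout makes the cached area stale.
        self._actor = None
        self._rects = []

    def actor_at(self, x, y):
        if self._actor is not None and any(
            rx <= x < rx + rw and ry <= y < ry + rh
            for rx, ry, rw, rh in self._rects
        ):
            return self._actor   # cache hit: no lookup needed
        self.lookups += 1
        self._actor, self._rects = self._pick(x, y)
        return self._actor


def pick(x, y):
    # Hypothetical scene: a 100x100 button on top of a 1920x1080 window.
    if 0 <= x < 100 and 0 <= y < 100:
        return "button", [(0, 0, 100, 100)]
    # The uncovered part of the window is everything except the button.
    return "window", [(100, 0, 1820, 1080), (0, 100, 100, 980)]


cache = PickCache(pick)
actors = [cache.actor_at(x, 50) for x in range(0, 300, 10)]
print(cache.lookups)  # prints "2"
```

Thirty pointer positions cost only two real lookups here: one when entering the button and one when leaving it for the window.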

This is of course most effective when the actors are big, with pointer picking virtually dropping to 0 on e.g. fullscreen clients, but it helps even when blazing your pointer across many actors on the screen. Crossing a button from left to right can take a surprising number of events.

But what about JavaScript? Would it maybe trigger a thousand times a second? Absolutely not. Events as handled within Clutter (and GNOME Shell actors) are still motion compressed. This unthrottled event delivery only applies in the direction of Wayland clients.

Other areas got indirectly stressed by the many additional events, so there have been a number of optimizations across the board to keep things from turning bad even when Mutter is doing so much more.

How does it feel?

This is something you’d have to check for yourself. But we can show you how this looks!

Green is good.

This is a quick and dirty test application that displays timing information about the received events. Some takeaways from this:

  • Due to human limitations, it is next to impossible to produce a steady 1000Hz input rate for a full second. Moving the mouse left and right wastes precious milliseconds decelerating in order to accelerate again, and even drawing the most perfect circle is too slow to require one event per millisecond. The devices are capable of shorter 1000Hz bursts though.
  • The separation between events (i.e. the time difference between the current and last events as received by the client) is predominantly sub-frame. There is only some larger separation when Mutter is busy putting images onscreen.
  • The event latency (time elapsed between emission by the hw/kernel and reception by the application) is <2ms in most cases. There are surely a few events that take longer to reach the application, but that is essentially noise.
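The two metrics in the list above, separation and latency, can be computed from per-event timestamps as in this sketch. The timestamps are invented, both are assumed to come from the same clock, and the helper names are hypothetical; this is not the test application shown above.

```python
def separations(received_ms):
    """Time between consecutive events as seen by the client."""
    return [b - a for a, b in zip(received_ms, received_ms[1:])]


def latencies(emitted_ms, received_ms):
    """Time each event spent between emission and reception."""
    return [r - e for e, r in zip(emitted_ms, received_ms)]


# Hypothetical timestamps in milliseconds; the fourth event is delayed,
# as it might be while the compositor is busy painting a frame.
emitted = [0.0, 1.0, 2.0, 3.0, 4.0]
received = [0.5, 1.5, 2.5, 9.0, 9.25]

frame_ms = 1000 / 60
seps = separations(received)
lats = latencies(emitted, received)
print(seps)                                  # prints "[1.0, 1.0, 6.5, 0.25]"
print(all(sep < frame_ms for sep in seps))   # prints "True"
print(sum(1 for lat in lats if lat < 2))     # prints "3"
```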

Gamers (and other people that care about responsiveness) should notice this as “less janky”.

Didn’t drawing tablets have this before?

Yes and no. Mutter skipped motion compression altogether for drawing tablets, since the applications interested in these really preferred the extra events despite the drawbacks. With these changes in place, drawing tablet users will purely benefit from the improved performance.

Why so loooong

If you have been following GNOME Shell development, you might have heard about this change before. Why did it take so long to have this merged?

The showstopper was probably what you would suspect the least: applications that are not handling events. If an application is not reading events in time (is temporarily blocking the main loop, frozen, slow, in a breakpoint, …), these events will queue up.

But this queue is not infinite, and the compositor would eventually shut the client down. With these input devices that could take a long… less than half a second. Clearly, there had to be a solution in place before we rolled this out.
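To see why the device rate matters so much here, consider how quickly a bounded queue fills when the client reads nothing. The sketch below is only arithmetic; the capacity of 400 events is an arbitrary illustrative figure, not the real size of any Wayland buffer.

```python
def seconds_until_full(capacity_events, device_hz):
    """Time until a bounded queue fills if the client reads nothing."""
    return capacity_events / device_hz


# 400 is a made-up capacity chosen only to show the scaling.
for hz in (60, 1000):
    print(hz, round(seconds_until_full(400, hz), 2))
# prints "60 6.67" then "1000 0.4"
```

Whatever the real capacity is, a 1000Hz device exhausts it roughly 17 times faster than frame-rate-limited events would.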

There’s been some back and forth here, and several proposed solutions. The applied fix is robust, but unfortunately still temporary. A better solution is being proposed at the Wayland library level, but it’s unlikely to be ready before GNOME 42. In the meantime, users can happily shake their input devices without thinking about how many times a second is enough.

Until the next adventure!

Extensions Rebooted: Porting your existing extensions to GNOME 40

This will be the first blog post in a series to help get extensions quickly updated after each release. While communications have been quiet, we have not been idle! For the past few months, we have been working on building the structure for a robust extensions community.

GNOME 40 will be released soon, and it is important to know what that means for extension developers. Since there have been significant changes in GNOME Shell, it is important to understand where those changes are and how they might affect the various extensions that are out there.

The changes in GNOME Shell have primarily been around the overview (as you can imagine if you’ve been keeping up with the GNOME Shell blog posts) and the preferences. Previously, preferences were GTK3-based; they now require GTK4.

To help with updating your extensions, community member Just Perfection has created a porting guide that you can use to learn how to modify your extension to work with the GNOME 40 shell release. With the advent of the porting guide, we hope that porting will be a lot smoother than it has been in the past.

It’s also important to highlight another change: GNOME Shell will once again perform strict version checking. Because there are numerous changes, and because some distros will still be using GNOME 3.38, we will be enforcing version checking from here on for GNOME 40 compatible extensions. If you do not set your version to the correct GNOME Shell version, the extension will fail to load.

Testing Your Extension

To test your extension for GNOME 40, please download the GNOME OS image from [link] and then use GNOME Boxes from https://flathub.org/. Do not use your distro’s version of GNOME Boxes, as that will not work with the GNOME OS image.

GNOME Boxes needs to be compiled with UEFI support, but not all distros have compiled that support in, so it’s more consistent to use the Flatpak version as the canonical source.

Download the GNOME 40 release candidate from here – https://os.gnome.org/download/40.rc/gnome_os_installer_40.rc.iso  [~2.2GB].

Once you have downloaded the image, you can import it in GNOME Boxes. Go through the GNOME initial setup to create your account and then you’ll be fully logged in.

GNOME Software will automatically notify you if there are any updates. You can manually keep up with changes with:

$ sudo ostree admin upgrade -r

To use the image effectively for testing, you will need to switch to the development tree. To do that:

$ sudo ostree admin switch gnome-os:gnome-os/40/x86_64-devel

This might require resizing your partition so that there is enough space. If so, add an extra 2G of storage to the partition to accommodate the developer toolchain.

You can then use git and other tools to copy your extension into the image. Please note that this image is not based on a distro; it is built completely from source using BuildStream and freedesktop-sdk, and is managed through OSTree. If your extensions have any external dependencies, you will need to bring them in manually.

Once you’ve got the extension working to your expectations, please make sure you update the metadata file to include the GNOME 40 version so that it will properly load before packaging it and uploading it to https://extensions.gnome.org/.
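Updating the metadata can also be scripted. The following is a minimal sketch, assuming the metadata is JSON with a `shell-version` list; the `add_shell_version` helper and the example extension are hypothetical.

```python
import json


def add_shell_version(metadata_text, version="40"):
    """Return metadata JSON text with `version` listed in shell-version."""
    metadata = json.loads(metadata_text)
    versions = metadata.setdefault("shell-version", [])
    if version not in versions:
        versions.append(version)
    return json.dumps(metadata, indent=2)


# Hypothetical extension metadata that only declares GNOME 3.38 support.
original = '{"uuid": "example@example.com", "name": "Example", "shell-version": ["3.38"]}'
updated = add_shell_version(original)
print(json.loads(updated)["shell-version"])  # prints "['3.38', '40']"
```

Keeping "3.38" in the list lets the same upload keep working for users on older releases, provided the code itself still supports them.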

In the future, we will try to provide a better experience for testing – but for now, we will use this method. If you have questions or run into problems – you are welcome to use our community communication channels to ask them.

Porting your extension

For your convenience, we provide a porting guide, located at https://gjs.guide/. The guide has a porting section that helps you port your extension by identifying the function calls you are using and suggesting alternatives.

During the porting process, you are welcome to join our community communication channels and ask us for help.

Reaching out to the Community

There are several ways you can reach out to us. You’re welcome to ask your questions on discourse.gnome.org and/or join us on Matrix at https://matrix.to/#/#extensions:gnome.org. If you prefer IRC, you can use our IRC server irc.gnome.org and then join #extensions.

To our users

We hope this is a beginning of a much better experience for those who use extensions on the GNOME platform. To aid us in this, we need your help. Since this is a relatively new initiative – it would be wonderful if you would aid us in outreach. If you have a favorite extension please politely call attention to this blog post. The greater the outreach the better we can hope to have most of your favorite extensions available for you when GNOME 40 arrives at your distribution.

GNOME 40 & your extension

As you are probably aware by now, GNOME 40 will bring some big changes.

This is exciting, but these changes also mean that some extensions will have to adjust to continue working in GNOME 40.

To help with that, this post provides a brief overview(!) of the most important changes.

You can join the #gnome-shell and #shell-extensions channels on IRC/Matrix for further questions, and the friendly folks of the extensions rebooted project provide helpful resources like a testing VM image as well as advice.

Overview

The overview was the focus of the GNOME 40 changes, so it is not surprising that it is also the place where adjustment is most likely to be needed.

OverviewControls

This is now the central place that controls the overall state and ties the various overview components together:

    • dash (now horizontal and at the bottom, otherwise largely the same as before)
    • window picker
    • app grid
    • workspace minimap (formerly known as workspace switcher)
    • search controller (formerly known as view selector)

All those components have seen changes to their internals as well, so watch out for those if your extension modifies any of them.

Adjustments, adjustments, adjustments

Most state is now controlled by adjustments, so that transitions can either be animated or controlled by gestures:

    • overview adjustment
      controls the overall overview state, with the possible ControlsState values HIDDEN, WINDOW_PICKER and APP_GRID
    • fit-mode adjustment
      controls how workspaces are displayed, namely whether centering on a single workspace (0) or fitting all workspaces in the available space (1)
    • workspace adjustment
      controls which workspace is in view, that is the value corresponds to the active workspace (or an in-between value during transitions)
    • app grid adjustment
      controls the scroll position of app grid pages
    • workspace state adjustment
      controls whether window previews are shown floating (0) as outside the overview, or spread out according to the used layout strategy (1)

The first one is the most important one, driving both the overview transition and the fit-mode and workspace-state adjustments.
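One way to picture how the overview adjustment can drive the other two is linear interpolation between per-state target values. This is a sketch under stated assumptions, not the shell's code: the numeric values of `ControlsState` and both lookup tables are guesses made for illustration, and `derive` is a hypothetical helper.

```python
from enum import IntEnum


class ControlsState(IntEnum):
    # The names come from the text; the numeric values are an assumption.
    HIDDEN = 0
    WINDOW_PICKER = 1
    APP_GRID = 2


# Assumed target values of the dependent adjustments in each state.
FIT_MODE = {
    ControlsState.HIDDEN: 0.0,
    ControlsState.WINDOW_PICKER: 0.0,
    ControlsState.APP_GRID: 1.0,
}
WORKSPACE_STATE = {
    ControlsState.HIDDEN: 0.0,
    ControlsState.WINDOW_PICKER: 1.0,
    ControlsState.APP_GRID: 0.0,
}


def derive(overview_value, table):
    """Linearly interpolate a dependent value from the overview adjustment."""
    value = min(max(overview_value, ControlsState.HIDDEN), ControlsState.APP_GRID)
    lower = ControlsState(min(int(value), ControlsState.APP_GRID - 1))
    upper = ControlsState(lower + 1)
    progress = value - lower
    return table[lower] + (table[upper] - table[lower]) * progress


# Halfway through a swipe from the window picker to the app grid.
print(derive(1.5, FIT_MODE))         # prints "0.5"
print(derive(1.5, WORKSPACE_STATE))  # prints "0.5"
```

Because the dependent values are pure functions of one adjustment, a gesture only has to move that single value and everything else follows.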

Backgrounds have moved into workspaces

This is a relatively minor change, but it affected two extensions I’m maintaining, so I decided it was worth mentioning after all.

Preferences

Extension preferences must use GTK4 now.

It is not possible to use both GTK3 and GTK4 from the same process, so we all have to take the plunge together; and as the process that opens preference dialogs was ported, now is an excellent time for that 🙂

The GTK documentation contains a migration guide that lists most of the changes that are required.

Porting a single preference dialog should be a lot easier than porting an entire application. At least that’s what I found when porting the gnome-shell-extensions and Fedora’s background-logo extensions, so hopefully it won’t be much more work for you.

Version validation

With all those changes, we expect more extensions to have compatibility issues than usual.

To protect against that, we are again doing version validation. That means unless the shell-version field in an extension’s metadata.json file includes “40”, it will be disabled and marked as out-of-date.
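The check can be approximated as follows. This is a simplified sketch and not the shell's actual validation logic: `is_out_of_date` is a hypothetical helper that compares only the major version component.

```python
import json


def is_out_of_date(metadata_text, shell_major="40"):
    """True if no shell-version entry has the given major version."""
    metadata = json.loads(metadata_text)
    supported = metadata.get("shell-version", [])
    # Assumption: only the part before the first dot has to match.
    return not any(str(v).split(".")[0] == shell_major for v in supported)


old = '{"uuid": "old@example.com", "shell-version": ["3.36", "3.38"]}'
new = '{"uuid": "new@example.com", "shell-version": ["3.38", "40"]}'
print(is_out_of_date(old))  # prints "True"
print(is_out_of_date(new))  # prints "False"
```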

Apropos “version”: We are following the new GNOME version scheme, so if you do any version comparisons yourself, make sure to take the major version into account.
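A typical pitfall is code that only ever looked at the minor version because the major version used to be 3. The sketch below contrasts that with a comparison on whole version tuples; the helper names are hypothetical, and treating tags like "rc" as sorting before "0" is an assumption.

```python
def parse_version(version):
    """Turn "3.38.2", "40.rc" or "40.1" into a tuple that compares correctly."""
    return tuple(
        int(part) if part.isdigit() else -1  # tags like "rc" sort before "0"
        for part in version.split(".")
    )


def minor_at_least(version, minimum_minor):
    # Fragile pre-40 habit: assume the major version is always 3.
    return parse_version(version)[1] >= minimum_minor


def at_least(version, required):
    # Compare whole tuples so the major version is taken into account.
    return parse_version(version) >= parse_version(required)


print(minor_at_least("40.0", 36))  # prints "False": wrongly rejects GNOME 40
print(at_least("40.0", "3.36"))    # prints "True"
print(at_least("40.rc", "3.38"))   # prints "True"
```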

… and one more thing

We no longer put arrows in top bar menus.

There have been no significant changes to top bar menus this cycle, so if your extension just adds a menu or indicator, it is unlikely to break.

It will just look a bit foreign if you show an arrow next to your menu, so we recommend removing them.

GNOME Shell 40 and multi-monitor

Multi-monitor has come up a fair bit in conversations about the GNOME Shell UX updates that are coming in GNOME 40. There’s been some uncertainty and anxiety in this area, so we wanted to provide more detail on what the multi-monitor experience will exactly be like, so people know what to expect. We also wanted to provide background on the decisions that have been made.

Newsflash

Before we get into multi-monitor, a short status update! As you would expect for this stage in the development cycle, the main bulk of the UI changes are now in the GNOME Shell master branch. This was the result of a really hard push by Georges and Florian, so huge thanks to them! Anyone who is interested in following this work should ensure that they are running the master branch, and not the now redundant development branch.

There are still a few relatively minor UI changes that we are hoping to land this cycle, but overall the emphasis is now on stabilisation and bug fixing, so if you are testing and have spotted any issues, now’s the time to report them.

Multi-monitor

OK, back to multi-monitor!

In many key respects, multi-monitor handling in GNOME 40 will be identical to how it is in 3.38. GNOME 40 still defaults to workspaces only on the primary display, as GNOME has since 3.0. The top bar and overview will only be shown on the primary display, and the number of workspaces will still be dynamic. GNOME 40 should therefore feel very similar to previous GNOME versions.

That still leaves a lot of unanswered questions, of course, so let’s run through the GNOME 40 multi-monitor experience in more detail. Much of this concerns how workspaces will work in combination with multi-monitor setups.

Default configuration

As mentioned already, GNOME 40 will continue to default to only showing workspaces on the primary display. With a dual display setup, the overview will therefore look like this by default:

One detail to notice is how we’re scaling down the background on the secondary display, to communicate that it’s a single workspace like those on the primary display. We feel that this presentation helps to make the logic of the multiple displays clearer, and helps to unify the different screens.

To get an idea of what this will look like in use, Jakub Steiner has kindly created some motion mockups. These are intended to communicate how each part fits together and what the transitions will be like, rather than being a 100% accurate rendering of the final product (in particular, the transitions have been slowed down).

Here you can get an idea of what it will look like opening the overview and moving between workspaces. Just like the current default configuration, workspace switching only happens on the primary display.

Workspaces on all displays

While workspaces only being on the primary display is the default behaviour, GNOME also supports having workspaces on all displays, using the workspaces-only-on-primary settings key. The following static mockup shows what the overview will look like in GNOME 40 with this configuration.
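For reference, the key can be changed from the command line with the gsettings tool. The sketch below only builds the command; the key name comes from the text above, while the `org.gnome.mutter` schema is an assumption you should verify on your system, and the helper function is hypothetical.

```python
def workspaces_on_all_displays_command(enabled):
    """Build a gsettings command that toggles workspaces on all displays."""
    # Key name is from the text; the org.gnome.mutter schema is an assumption.
    value = "false" if enabled else "true"
    return ["gsettings", "set", "org.gnome.mutter",
            "workspaces-only-on-primary", value]


command = workspaces_on_all_displays_command(True)
print(" ".join(command))
# prints "gsettings set org.gnome.mutter workspaces-only-on-primary false"
```

Note the inversion: enabling workspaces on all displays means setting the "only on primary" key to false. The printed command can be pasted into a terminal, or the list passed to `subprocess.run`.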

As you can see, this is very similar to the workspaces only on primary configuration. The main difference is that you can see the additional workspaces extending to the right on the secondary display. It’s also possible to see that the workspace navigator (the small set of thumbnails at the top) is visible on both displays. The introduction of the workspace navigator on secondary displays is a new change for GNOME 40, which is intended to improve the experience for users who opt to have workspaces on all displays. We know from our user research that this is something that many users will welcome.

Like in other GNOME versions, when workspaces are on all displays, they are switched in unison. For example, all displays show workspace 2 at the same time. This can be seen in the motion mockups:

Keyboard shortcuts

The existing workspace shortcuts will continue to work in GNOME 40. Super+PgUp/PgDown will continue to switch workspace. Adding Shift will continue to move windows between workspaces.

We are also introducing additional shortcuts which align with the horizontal layout. The new shortcut to switch workspace will be Super+Alt+←/→. Moving windows between workspaces will be Super+Alt+Shift+←/→. Super+Alt+↑ will also open the overview and then app grid, and Super+Alt+↓ will close them.

These directional keyboard shortcuts have matching touchpad gestures: three-finger swipes left and right will switch workspaces, and three-finger swipes up and down will open the overview and app grid.

Why horizontal?

A few people have pointed out that horizontal workspaces aren’t as clean with horizontal multi-monitor setups. The concern is that, when multiple displays are horizontal, they end up clashing with the layout of the workspaces. There is some truth in that, and we recognise that some users might need to adjust to this aspect of the design.

However, it’s worth pointing out that horizontal workspaces are a feature of every other desktop out there. It is also how GNOME used to do it prior to 3.0, and how GNOME’s classic mode continues to do it. Therefore, we feel that horizontal workspaces and horizontally-arranged displays can get along just fine. If anyone is concerned about this, we’d suggest that you give it a try and see how it goes.

Some people have also asked why we are making the switch to horizontal workspaces at all, which is fair! Here I think that it needs to be understood that horizontal workspaces are fundamental to the design we’re pursuing for 40: the film-strip of workspaces (which proved effective in testing), the clearer organisation of the overview, the coherent touchpad gestures, a dash that can more comfortably scale to include more items, and so on. This is all facilitated by the workspace orientation change, and would not be possible without it.

GNOME ❤️ multi-monitor

In case there’s any doubt: multi-monitor is absolutely a priority for us in the shell design and development team. We know that the multi-monitor experience is important to many GNOME users (including many of us who work on GNOME!), and it is something that we’re committed to improving. This applies to both the default workspaces behaviour and the workspaces on all displays option.

Multi-monitor considerations regularly featured in the design planning for GNOME 40. They were also a research theme, in our early discovery interviews and survey as well as in the diary study that we ran. As a result of this, we are confident that GNOME 40 will provide an excellent multi-monitor experience.

We actually have a few plans for multi-monitor improvements in the future. Some of these pre-date the GNOME 40 work that is currently happening, and we hope to get back to them during the next development cycle. Our ambition is for the multi-monitor story to keep on getting better.

Thanks for reading!

Shell UX Changes: The Research

This post is part of an ongoing series about the overview design changes which are being worked on for GNOME 40. (For previous posts, see here.)

Ongoing user research has been a major feature of this design initiative, and I would say that it is by far the best researched project that I have worked on. Our research has informed the design as it has proceeded, resulting in particular design choices and changes, which have improved the overall design and will make it a better experience for users. As a result of this, we have a much greater degree of confidence in the designs.

This post is intended as a general overview of the research that we’ve been doing. I’m excited to share this, as a way of explaining how the design came about, as well as sharing some of the insights that we’ve found along the way.

What we did

In total, we conducted six separate research exercises as part of this initiative. These ran alongside the design and development effort, in order to answer the questions we had at each stage of the process.

Many of the research exercises were small and limited. This reflected our ambition to use a lean approach, which suited the limited resources we had available. These smaller exercises were supplemented with a larger piece of paid research, which was conducted for us by an external research company. In what follows I’ll go through each exercise in order, and give a brief description of what was done and what we found out.

So far the data from our research isn’t publicly available, largely because it contains personal information about test participants which can’t be shared. However, we do plan on making a version of the data available, with any identifying information removed.

1. Exploratory interviews

I already blogged about this exercise back in September. A summary: the initial interviews were an exploratory, sensitising exercise, to find out how existing users used and felt about GNOME Shell. We spoke to seven GNOME users who had a range of roles and technical expertise. Each participant showed us their desktop setup and how they used it, and we asked them questions to find out how the existing shell design was working for them.

We found out a bunch of valuable things from those early interviews. A good portion of the people we spoke to really liked the shell, particularly its minimalism and the lack of distractions. We also discovered a number of interesting behaviours and user types around window and workspace usage.

2. Initial behavioural survey

The initial survey exercise was also covered in my September blog post. It was intended to provide some numbers on app, window and workspace usage, in order to provide some insight into the range of behaviours that any design changes needed to accommodate.

The survey was a deliberately quick exercise. We found out that most people had around 8 open windows, and that the number of people with a substantially higher number of open windows was low. We also found that most people were only using a single workspace, and that using a high number of workspaces (say, above six) was quite rare.

3. Running apps experiment

During the early design phase, the design team was interested in the role of the running apps in the dash. To explore this, I ran a little experiment: I got colleagues to install an extension which removes running apps from the dash, and asked them to record any issues that they experienced.

We found that most people got along just fine without running apps in the dash. Despite this, in the end we decided to keep the running apps, based on other anecdotal reports we’d seen.

4. External user testing

Thanks to support from Endless, we were lucky to have the opportunity to contract out some research work. This was carried out by the insights and experimentation agency Brooks Bell and was contracted under the umbrella of the GNOME Foundation.

The research occurred at a point in the process where we were weighing up key design decisions, which the research was designed to help us answer in an informed manner.

Methodology

The research itself consisted of 20 moderated user testing sessions, which were conducted remotely. Each participant tested GNOME 3.38 and then either a prototype of the new design or Endless OS. This provided us with a means to compare how each of the three desktops performed, with a view to identifying the strengths and weaknesses of each.

Each session involved a combination of exploration and evaluation. Participants were interviewed about their typical desktop usage, and were invited to recreate a typical desktop session within the test environment. They were then asked to perform some basic tasks. After testing both environments, they were required to fill in a post-test survey to give feedback on the two desktops they had tried.

Research participants included both existing GNOME users and users who had never used GNOME before. The sample included a range of technical abilities and experience levels. It also included a mix of professional and personal computer users. The study was structured in such a way that we could analyse the differences between user groups, and get a sense of how each desktop performed with each of them. Participants were recruited from six countries: Brazil, Canada, Germany, Italy, the United Kingdom and the USA.

Brooks Bell were a great firm to work with. Our own design and development team were able to have detailed planning conversations with them, as well as lengthy sessions to discuss the research findings. We were also given access to all the research data, to enable us to do our own analysis work.

Findings

The external research provided a wealth of useful information and analysis. It addressed the specific research questions that we had for the study, but also went further to address general questions about how and why the participants responded to the designs in the way that they did, as well as identifying a number of unrelated design issues which we hope to address in future releases.

One of the themes in the research was the degree to which users positively responded to UI conventions with which they were already familiar. This was reflected both in how participants responded to the designs in general, and in how successfully they were able to use specific aspects of them. For example, interactions with the app grid and dash were typically informed by the participants’ experiences with similar UIs from other platforms.

This might seem like an obvious finding; however, the utility of the research was in demonstrating how this general principle played out in the specific context of our designs. It was also very interesting to see how conventions from both mobile and desktop informed user behaviour.

In terms of specific findings, there wasn’t a single clear story from the tests, but rather multiple overlapping findings.

Existing GNOME users generally felt comfortable with the desktop they already use. They often found the new design to be exciting and liked the look and feel, and some displayed a negative reaction to Endless, due to its similarity with Windows.

“I like the workspaces moving sideways, it feels more comfortable to switch between them.”
—Comment on the prototype by an existing GNOME user

All users seemed to find the new workspace design to be more engaging and intuitive, in comparison with the workspaces in GNOME 3.38. This was one particular area where the new design seemed to perform better than existing GNOME Shell.

“[It feels] quicker to navigate through. It [has a] screen where I can view my desktop at the top and the apps at the bottom, this makes it quicker to navigate.”
—Comment on the prototype by a non-GNOME user

On the other hand, new users generally got up to speed more quickly with Endless OS, often due to its similarity to Windows. Many of these testers found the bottom panel to be an easy way to switch applications. They also made use of the minimize button. In comparison, both GNOME 3.38 and the prototype generally took more adjustment for these users.

“I really liked that it’s similar to the Windows display that I have.”
—Comment on Endless OS by a non-GNOME user

5. Endless user testing

The final two research exercises we conducted were used to fill in specific gaps in our existing knowledge, largely as a validation exercise for the design we were working towards. The first of these consisted of 10 remote user testing sessions, conducted by Endless with participants from Guatemala, Kenya and the USA. These participants were picked from particular demographics that are of importance to Endless, particularly young users with limited computing experience.

Each test involved the participant running through a series of basic desktop tasks. Like the tests run by Brooks Bell, these sessions had a comparative element, with participants trying both Endless OS and the prototype of the new design. In many respects, these sessions confirmed what we’d already found through the Brooks Bell study, with participants both responding well to the workspace design in the prototype, and having to adjust to designs that were unfamiliar to them.

“Everything happens naturally after you go to Activities. The computer is working for you, you’re not working for it”
—Tester commenting on the new design

6. Diary study

The diary study was intended to identify any issues that might be encountered with long-term usage, which might have been missed in the previous user tests. Workspaces and multi-monitor usage were a particular focus for this exercise, and participants were selected based on whether they use these features.

The five diary study participants installed the prototype implementation and used it for a week. I interviewed them before the test to find out their existing usage patterns, then twice more over the test period, to see how they were finding the new design. The participants also kept a record of their experiences with the new design, which we referred back to during the interviews.

This exercise didn’t turn up any specific issues with multi-monitor or workspace usage, despite including participants who used those features. In all, the participants generally had a positive response to the new design and preferred it over the existing GNOME Shell they were using. It should be mentioned, however, that this wasn’t universal.

7. Community testing and feedback

While community testing isn’t strictly a research exercise, it has nevertheless been an important part of our data-driven approach for this initiative. One thing that we’ve managed to do relatively successfully is have a range of easy ways to test the new design. This was a priority for us from the start and has resulted in us being able to have a good round of feedback and design adjustment.

It should be noted that those of us on the design side have had detailed follow-up conversations with those who have provided feedback, in order to ensure that we have a complete understanding of the issues as described. (This often requires having background knowledge about users’ setups and usage patterns.) I have personally found this to be an effective way of developing empathy and understanding. It is also a good example of how our previous research has helped, by providing a framework within which to understand feedback.

The main thing that we have got from this stage of the process is testing with a wider variety of setups, which in particular has informed the multi-monitor and workspace aspects of the design.

Reflection

As I wrote in the introduction to this post, GNOME has never had a design initiative that has been so heavily accompanied by research work. The research we’ve done has undoubtedly improved the design that we’re pursuing for GNOME 40. It has also enabled us to proceed with a greater degree of confidence than we would have otherwise had.

We’re not claiming that every aspect of the research we’ve done has been perfect or couldn’t have been improved. There are gaps which, if we were able to do it all again, we would have liked to have filled. But perfect is the enemy of good and doing some research – irrespective of its issues – is certainly better than doing none at all. Add to this the fact that we have been doing research in the context of an upstream open source project with limited resources, and I think we can be proud of what we’ve achieved.

When you put together the lessons from each of the research exercises we’ve done, the result is a picture of different user segments having somewhat different interests and requirements. On the one hand, we have the large number of people who have never used GNOME or an open source desktop, for whom a familiar design is generally preferable. On the other hand, there are users who don’t want a carbon copy of the proprietary desktops, and there are (probably more technical) users who are particularly interested in a more minimal, pared back experience which doesn’t distract them from their work.

The best way for the GNOME project to navigate this landscape is a tricky question, and it involves a difficult balancing act. However, with the changes that are coming in GNOME 40, we hope that we are starting out on that path, with an approach that both adopts some familiar conventions from other platforms and develops and refines GNOME’s unique strengths.

Another Shell UX Update

Another update on the UX changes that are being worked on for GNOME 40! (See previous posts here, here, here…)

A status update

First off, a summary of where we are. Development work has been proceeding apace, and the main batch of changes is currently in the process of being merged into master. Other polish changes are being queued up alongside this, ready to be merged.

This work has primarily been undertaken by Georges Stavracas, with assistance from Florian Müllner. Georges has even been doing live coding sessions, where you can see him do the work in real time!

We currently have about two weeks until UI freeze. This means that, once the major changes have been merged, we will have a short, intense period of polishing and bug fixing work. (If we discover issues after the freeze, there’s also the possibility of getting exceptions to land changes.)

There has been plenty of activity on the design side, too. The design team has been busy testing the development branch and dealing with issues as they have come up. We have a fairly long list of issues that we’re tracking, which we will be turning into a more accessible roadmap very shortly.

We have also been busy discussing solutions to the issues that have come up in feedback. One of the major changes to come out of that is a new workspace navigator, which has been added to the “window picker” in the Activities Overview.

Testing testing testing

We have a variety of methods available for people to test the new design, and invite everyone who is interested to try it out. Once the changes have landed in master, it will be a great time to file issues.

Current testing methods include:

Run a prebuilt VM image in Boxes

Felipe Borges has done a great job creating a VM image containing the changes, which can be downloaded and run in Boxes. To do this:

  • Download the image
  • Extract the downloaded .qcow2.gz file (from Files, right click on the file and click Extract Here)
  • Open Boxes
  • Press the add (+) button, then select Create a virtual machine…
  • Scroll down to Operating System Image File – press that, then select the .qcow2 file that was extracted
  • At the next step, click the Template entry and select Fedora 33. Then click Next.
  • At the next Review and Create step, click Create to make the VM
  • When the VM has finished installing and has booted, log in to the gnome user account. The password is “gnome”.
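
The extraction step can also be done without Files. As a rough sketch, the following Python helper decompresses a .qcow2.gz file next to the original. The function name and the example filename are hypothetical, since the actual image name depends on what you downloaded.

```python
import gzip
import shutil
from pathlib import Path


def extract_image(gz_path: Path) -> Path:
    """Decompress <name>.qcow2.gz into <name>.qcow2 in the same folder."""
    if gz_path.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got {gz_path.name}")
    out_path = gz_path.with_suffix("")  # drops only the trailing .gz
    # Stream the data so a multi-gigabyte image is never held in memory.
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path


if __name__ == "__main__":
    # Hypothetical filename; replace it with the file you downloaded.
    image = Path("gnome-shell-40-test.qcow2.gz")
    if image.exists():
        print("Extracted to", extract_image(image))
    else:
        print("No such file:", image)
```

The resulting .qcow2 file is the one to select in the Operating System Image File step.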

These images are based on Fedora using a COPR repository. They can be updated using DNF to get the latest design and development changes.

Use the testing COPR from Fedora 33

The main development branches for the changes have been made available as a COPR repository for Fedora 33. This can be used with an existing Fedora 33 install, either in a VM or on bare metal.

The obvious warning applies here – this is development software with limited testing, and you will be swapping out key desktop components. You should be prepared for your system to break and have an idea of how to recover should this happen. This is definitely a case of proceeding at your own risk.

Using the COPR is simply a matter of adding it and updating with the following commands:

  • sudo dnf copr enable haeckerfelix/gnome-shell-40
  • sudo dnf update

Then reboot.

Build the branch in a virtual machine

If you want to track the latest changes in real time, or want to help with development, this could be a good option for you, and doesn’t take a huge amount of work. Instructions can be found here.

What next?

Everyone is welcome to test the design using the methods described above. Note that, at the moment, all of these involve using the development branch which will not be identical to the final implementation. The branch is primarily useful for testing the general design and giving general feedback to the design team. Once more work has landed in master, that will be the time to file specific bugs.

If other testing methods become available, please let us know and we’ll add references to them in the blog post.

Threaded input adventures

Come around and gather: in this article we will talk about how Mutter got an input thread in the native backend.

Weaver, public domain (Source)

A trip down memory lane

Mutter wasn’t always a self-contained compositor toolkit. In the past it relied on the Clutter and Cogl libraries for all the benefits usually brought by toolkits: being able to draw things on screen, and being able to receive input.

During the rise of Wayland, that reliance on an external toolkit drove many of the design decisions around input management, usually involving adding support in the toolkit, and the necessary hooks so Mutter could use or modify the behavior. It was unavoidable that both sides were involved.

Later on, Mutter merged its own copies of Clutter and Cogl, but the API barrier stayed essentially the same at first. Slowly over time, and still ongoing, we’ve been refactoring Mutter so all the code that talks to the underlying layers of your OS lives together in src/backends, taking this code away from Clutter and Cogl.

A quick jump to the near past

However, in terms of input, the Clutter API barrier still existed for the most part. It was still heavily influenced by X11 design, and was pretty much used as it was initially designed. Some examples, in no particular order:

  • We still forwarded input axes in a compact manner that requires querying the input device to decode event->motion.axes[3] positions into CLUTTER_INPUT_AXIS_PRESSURE. This space-saving peculiarity comes straight from XIEvent and XIQueryDevice.
  • Pointer constraints were done by hooking a function that could impose the final pointer position.
  • Emission of wl_touch.cancel had strange hooks into libinput event handling, as the semantics of CLUTTER_TOUCH_CANCEL varied slightly.
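
To illustrate the first point, here is a simplified Python model of compact axis forwarding: the event carries only a flat list of values, and the meaning of each position has to be looked up on the device. All names here are hypothetical, and the sketch does not mirror the actual Clutter or XInput data structures.

```python
from typing import Dict, Sequence


def decode_axes(device_axes: Sequence[str],
                values: Sequence[float]) -> Dict[str, float]:
    """Pair each compact event value with the axis the device reports."""
    if len(values) != len(device_axes):
        raise ValueError("event does not match the device's axis layout")
    return dict(zip(device_axes, values))


# Hypothetical stylus whose fourth axis (index 3) is pressure.
STYLUS_AXES = ["x", "y", "distance", "pressure"]

if __name__ == "__main__":
    event_axes = [120.0, 48.5, 0.0, 0.73]  # compact form: positions only
    decoded = decode_axes(STYLUS_AXES, event_axes)
    print("pressure:", decoded["pressure"])
```

The catch is visible in the signature: the event cannot be interpreted on its own, so whoever reads it also needs access to the device state at the time the event was produced.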

Polishing these interactions so the backend code stays more self-contained has been a very significant part of the work involved.

Enter the input thread

The main thread context is already a busy place, in the worst case (and grossly simplified) we:

  • Dispatch several libinput events, convert them to ClutterEvents
  • Process several ClutterEvents across the stage actors, let them queue state changes
  • Process the frame clock
    • Relayout
    • Repaint
  • Push the Framebuffer changes

All in the course of a frame. The input thread takes the first step out of that process. For this to work seamlessly, the thread needs a certain degree of independence: it needs to produce ClutterEvents and know where the pointer will end up without any external agents. For example:

  • Device configuration
  • Pointer barriers
  • Pointer locks/constraints

The input thread takes over all these. There is of course some involvement from the main thread (e.g. specifying what barriers or constraints are in effect, or virtual input), but these synchronization points are either scarce, or implicitly async already.

The main goal of the input thread is to provide the main thread with ClutterEvents, with the fact that they are produced in a distinct thread being irrelevant. In order to do so, all the information derived from them must be independent of the input thread state. ClutterInputDevice and ClutterInputDeviceTool (representing input devices and drawing tablet tools) are consequently morphing into immutable objects; all changes underneath (e.g. configuration) are handled internally in the input thread, and abstracted away in the emitted events.
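
The hand-off described above can be sketched with a producer thread and a queue. This is a simplified Python model rather than Mutter’s C implementation: the class and function names are hypothetical, and clamping to the screen size stands in for the real barrier and constraint handling.

```python
import queue
import threading
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Device:
    """Immutable device description, safe to share between threads."""
    name: str


@dataclass(frozen=True)
class Event:
    """Self-contained event: it already carries the final pointer position."""
    device: Device
    x: float
    y: float


def input_thread(deltas: Iterable[Tuple[float, float]],
                 out: "queue.Queue", width: int, height: int) -> None:
    """Turn raw motion deltas into events without consulting the main thread."""
    device = Device("pointer")
    x = y = 0.0
    for dx, dy in deltas:
        # Constraints are applied here, so the position is final on emission.
        x = max(0, min(width - 1, x + dx))
        y = max(0, min(height - 1, y + dy))
        out.put(Event(device, x, y))
    out.put(None)  # sentinel: no more input


def run_main_loop(deltas: List[Tuple[float, float]],
                  width: int = 1920, height: int = 1080) -> List[Event]:
    """Main-thread side: consume events and never touch input state."""
    events: "queue.Queue" = queue.Queue()
    worker = threading.Thread(target=input_thread,
                              args=(deltas, events, width, height))
    worker.start()
    received = []
    while True:
        event = events.get()
        if event is None:
            break
        received.append(event)
    worker.join()
    return received


if __name__ == "__main__":
    for ev in run_main_loop([(10, 5), (-100, 0), (5000, 0)]):
        print(ev.device.name, ev.x, ev.y)
```

Making the event and device objects frozen is what lets the main thread use them freely: nothing it reads can change underneath it while the input thread keeps running.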

“The Dark Side of the Loom” by aldoaldoz is licensed under CC BY-NC-SA 2.0

What it brings today

Having a thread always ready to dispatch libinput may sound like a small part of the complexity involved in giving you a new frame, but it does already bring some benefits:

  • Libinput events are always dispatched ASAP, so this will mean less “client bug: event processing lagging behind by XXms” messages in the journal.
  • Input handling can no longer be stalled by the rest of the operations in the main thread, which means fewer awkward situations where we don’t process events in time (e.g. a key release stopping key repeat, at least on the compositor side).
  • With the cursor’s logical position being worked out entirely by the input thread, updating the cursor plane to reflect the most up-to-date position for the next frame simply requires asking the input thread for it.
  • Generally, a tidier organization of input code where fewer details leak outside the backend domain.

What it does not bring (yet)

Code is always halfway to a better place; the merged work does not yet achieve everything that could be achieved. Here are some things you shouldn’t expect to see fixed yet:

  • The main thread is still in charge of KMS, and updating the cursor plane buffer and position. This means the pointer cursor will still freeze if the main thread stalls, despite the input thread handling events underneath. In the future, there could be another separate thread handling atomic KMS operations, so it’d be possible for the input and KMS threads to talk to each other, bypassing any main thread stalls.
  • The main thread still has some involvement in handling of Ctrl+Alt+Fn, should you need to switch to another TTY while hard-locked. Making it fully handled in the input thread would be a small nicety for developers, perhaps a future piece of work.
  • Having input handling that is unblocked by almost anything else is a prerequisite for handling 1000Hz mice and other high-frequency input devices. But the throttling behavior towards those is unchanged; better behavior should be expected in the short term.

Conclusions

We’ve so far been working really hard on making Mutter as fast and lock-free as possible. This is the first step towards a next level in design that is internally protective against stall situations.

A shell UX update

Last month I shared an updated activities overview design, which is planned for the next GNOME release, version 40.

The new design has prompted a lot of interest and comment, which we’re all really thrilled about. In this post I wanted to provide an update of where the initiative is at. I also want to take the opportunity to answer some of the common questions that have come up.

Where we’re at

Development work has moved rapidly since I blogged last, thanks mostly to a big effort by Georges. As a result, a lot of the basic elements of the design are now in place in a development branch. The following is a short screencast of the development branch (running in a VM), to give an idea of where the development effort has got to:

There are still work items remaining and the branch has noticeable polish issues. Anyone testing it should bear this in mind – as it stands, it isn’t a complete reflection of the actual design.

On the design side, we’ve been reviewing the feedback that has been provided on the design so far, and are tracking the main points as they’ve emerged. This is all really valuable, but we’d also suggest that people wait to try the new design before jumping to conclusions about it. We plan on making it easier to test the development version, and will provide details about how to do so in the near future.

The roadmap from here is to develop the branch with the new design, open it up to testing, and have an intensive period of bug fixing and evaluation prior to the UI freeze in about a month’s time. As we progress it will become easier for people to get involved both in terms of design and development.

What the design means for users

In the rest of this post, I’m going to address some of the common questions and concerns that we’ve heard from people about the new design. My main goal here is to clear up any confusion or uncertainty that people might have.

Argh, change!

A good portion of the comments that we’ve had about the design reflect various concerns about existing workflows being disrupted by the design changes. We understand these concerns and an effort has been made to limit the scale and disruptiveness of the updated design. As a result, the changes that are being introduced are actually quite limited.

Everything about the shell remains the same except for the overview, and even that is structurally the same as the previous version. The overview contains the same key elements – windows overview, search, the dash, the app grid – which are accessed in the same sequence as before. The old features that are tied to muscle memory will work just as before: the super key will open the overview, search will function as before, and the existing shortcuts for workspaces will continue to be supported.

One piece of feedback that we got from initial testing is that testers often didn’t notice a massive difference with the new design. If you’re concerned about potential disruption, we’d encourage you to wait to try the design, and see how it behaves in practice. You might be surprised at how seamless the transition is.

Advantages of the new design

A few users have asked me: “so how is the new design better for me?” Which is a fair question! I’ll run through what I see as the main advantages here. Users should bear in mind that some of the improvements are particularly relevant to new rather than existing users – there are some positive impacts which you might not personally benefit from.

Boot experience

The boot experience is something that we’ve struggled with throughout GNOME 3, and with the new design we think we’ve cracked it. Instead of being greeted by a blank desktop (and then, a blank overview), when you boot into the new design, you’ll be presented with the overview and your favourite apps that you can launch. Overall, it’s a more welcoming experience, and is less work to use.

I have been asked why this change isn’t possible with the existing shell UI. Couldn’t we just show the overview on boot, without making these other changes? Theoretically we could, but the new overview design is much better suited to being shown after boot: the layout provides a focus for action and places app launching more centrally. In contrast, the old shell design places launching on the periphery and does not guide the user into their session as effectively.

Touchpad gestures

Touchpad gestures can be incredibly effective for navigation, yet our gestures for navigating the shell have historically been difficult to use and lacking a clear schema. The new design changes that, by providing a simple, easy and coherent set of touchpad gestures for moving around the system. Up and down moves in and out of the overview and app grid. Left and right moves between workspaces. If you’re primarily using the touchpad, this is going to be a huge win and it’s a very easy way to move around.

Easy workspaces

In our user testing, the new workspace design demonstrated itself to be more engaging and easier to get to grips with than the old one. New users could easily understand workspaces as “screens” and found it easier to get started with them, compared to the current design which wasn’t as accessible.

Feel and organisation

Designers often talk about mental and spatial models, and the new design is stronger in both regards. What does this translate to for users? Mostly, that the new design will generally feel better. Everything fits together better, and is more coherent. Moving around the system should be more natural and intuitive.

Other advantages

Beyond those main advantages, there are some more minor plus points to the new design:

  • Personalised app grid – you can now fully rearrange the app grid to your liking, using drag and drop. This is something that we’ve been working on independently of the other changes, but it has continued to evolve and improve this cycle, and it fits very nicely with the other overview changes.
  • App icons in the window overview – the window overview now shows the app icon for each window, to help with identification.
  • Improved app titles – we have a new behaviour for GNOME 40, which shows the full title of the application when hovering over its launcher.

Q & A

The following are some of the other questions that have come up in comments about the designs. Many of these have been answered in place, and it seemed worthwhile to share the answers more widely.

How will window drag and drop between workspaces work?

The current design works by zooming out the view to show all workspaces when a window is dragged:

Will I be able to search from the overview after pressing super?

Yes, that won’t change.

Will there be an option to restore the old design?

We don’t plan on supporting this option, largely because of the work involved. However, there could of course be community extensions which restore some aspects of the old design (say, having a vertical dash along the side). We’re happy to work with extension developers to help this to happen.

Please keep the hot corner!

OK. 🙂 (We weren’t planning on removing it.)

How will the new design affect multi-display setups?

It should have very little impact on multi-monitor. The same workspace behaviour that we currently have will be supported: by default, only the primary display has workspaces and secondary displays are standalone. We don’t anticipate any major regressions and have some ideas for how to improve multi-monitor support, but that might need to wait until a future release.

Will the new design work OK with vertical displays?

Yes, it will work just fine.

End

That’s it for now. With this initiative proceeding quickly, we hope to have more updates soon. We also aim to provide another post with details on our user research in the not too distant future.