Local-First Workshop (feat. p2panda)

This week we had a local-first workshop at offline in Berlin, co-organized with the p2panda project. As I’ve written about before, some of us have been exploring local-first approaches as a way to sync data between devices, while also working great offline.

We had a hackfest on the topic in September, where we mapped out the problem space and discussed different potential architectures and approaches. We also realized that while there are mature solutions for the actual data syncing part with CRDT libraries like Automerge, the network and discovery part is more difficult than we thought.

Network Woes

The issues we need to address at the network level are the classic problems any distributed system has, including:

  • Discovering other peers
  • Connecting to the other peers behind a NAT
  • Encryption and authentication
  • Replication (which clients need what data?)

We had sketched out a theoretical architecture for first experiments at the last hackfest, using WebRTC data channel to send data, and hardcoding a public STUN server for rendezvous.

A few weeks after that I met Andreas from p2panda at an event in Berlin. He mentioned that in p2panda they have robust networking already, including mDNS discovery on the local network, remote peer discovery using rendezvous servers, p2p connections via UDP holepunching or relays, data replication, etc. Since we’re very interested in getting a low-fi prototype working sooner rather than later it seemed like a promising direction to explore.

p2panda

The p2panda project aims to provide a batteries-included SDK for easy local-first app development, including all the hard networking stuff mentioned above. It’s been around since about 2020, and is currently primarily developed by Andreas Dzialocha and Sam Andreae.

The architecture consist of nodes and clients. Nodes include networking, materialization, and an SQL database. Clients sign and create data, and interact with the node using a GraphQL API.

As of the latest release there’s TLS transport encryption between nodes, but end-to-end data-encryption using MLS is still being worked on, as well as a capabilities system and privacy-respecting deletion. Currently there’s a single key/value CRDT being used for all data, with no high-level way for apps to customize this.

The Workshop

The idea for the workshop was to bring together people from the GNOME and local-first communities, discuss the problem space, and do some initial prototyping.

For the latter Andreas prepared a little bookmark manager demo project (git repository) that people can open in Workbench and hack on easily. This demo runs a node in the background and accesses the database via GraphQL from a simple GTK frontend, written in Rust. The demo app automatically finds other peers on the local network and syncs the data between them.

Bookmark syncing demo using p2panda, running in Workbench

We had about 10 workshop participants with diverse backgrounds, including an SSB developer, a Mutter developer, and some people completely new to both local-first and GTK development. We didn’t get a ton of hacking done due to time constraints (we had enough program for an all-day workshop realistcally :D), but a few people did start projects they plan to pursue after the workshop, including C/GObject bindings for p2panda-rs and an app/demo to sync a list of map locations. We also had some really good discussions on local-first architecture, and the GNOME perspective on this.

Thoughts on Local-First Architectures

The way p2panda splits responsibilities between components is optimized for simple client development, and being able to use it in the browser using the GraphQL API. All of the heavy lifting is done in the node, including networking, data storage, and CRDTs. It currently only supports one CRDT, which is optimized for database-style apps with lots of discrete fields.

One of the main takeaways from our previous hackfest was that data storage and CRDTs should ideally be in the client. Different apps need different CRDTs, because these encode the semantics of the data. For example, a text editor would need a custom text CRDT rather than the current p2panda one.

Longer-term we’ll probably want an architecture where clients have more control over their data to allow for more complex apps and diverse use cases. p2panda can provide these building blocks (generic reducer logic, storage providers, networking stack, etc.) but these APIs still need to be exposed for more flexibility. How exactly this could be done and if/how parts of the stack could be shared needs more exploration :)

Theoretical future architectures aside, p2panda is a great option for local-first prototypes that work today. We’re very excited to start playing with some real apps using it, and actually shipping them in a way that people can try.

What’s Next?

There’s a clear path towards first prototype GNOME apps using p2panda for sync. However, there are two constraints to keep in mind when considering ideas for this:

  • Data is not encrypted end-to-end for now (so personal data is tricky)
  • The default p2panda CRDT is optimized for key / value maps (more complex ones would need to be added manually)

This means that unfortunately you can’t just plug this into a GtkSourceView and have a Hedgedoc replacement. That said, there’s still lots of cool stuff you can do within these constraints, especially if you get creative in the client around how you use/access data. If you’re familiar with Rust, the Workbench demo Andreas made is a great starting point for app experiments.

Some examples of use cases that could be well-suited, because the data is structured, but not very sensitive:

  • Expense splitting (e.g. Splittypie)
  • Meeting scheduling (e.g. Doodle)
  • Shopping list
  • Apartment cleaning schedule

Thanks to Andreas Dzialocha for co-organizing the event and providing the venue, Sebastian Wick for co-writing this blog post, Sonny Piers for his help with Workbench, and everyone who joined the event. See you next time!

GNOME 45 Release Party & Hackfest

In celebration of the 45 release we had a hackfest and release party in Berlin last week. It was initially supposed to be a small event, but it turns out the German community is growing more rapidly than we thought! In the end we were around 25 people, about half of them locals from Berlin :)

GNOME OS

Since many of the GNOME OS developers were in town for All Systems Go, this was one of the main topics. In addition to Valentin, Javier, and Jordan (remote), we also had Lennart from systemd and Adrian from carbonOS and discussed many of the key issues for image-based operating systems.

I was only present for part of these discussions so I’ll leave it to others to report the results in detail. It’s very exciting how things are maturing in this area though, as everyone is standardizing on systemd’s tools for image-based OSes.

Discussing the developer story on image-based OSes with Jonas and Sebastian from GNOME Shell/mutter. Left side: Adrian, Javier, Valentin, Sebastian. Right side: Jonas, Kai, Lennart

Local-First

On Saturday the primary topic was local-first. This is the idea that software should always work offline, and optionally use the network when available for device sync and collaboration. This allows for people to own their data, but still have access to modern features like multiplayer editing.

People in the GNOME community have long been interested in local-first and we’ve had various discussions and experiments in this direction over the past few years. However, so far we have not really investigated how we’d implement it at a larger scale, and what concrete steps in that direction would look like.

For context, any sync system (local-first or not) needs the following things:

  • Network: Device discovery, channel to send the actual data, way to handle offline nodes, encryption, device authentication, account management
  • Sync: Merging data from different peers, handling conflicts
  • UI: User interface for viewing and manipulating the data, showing sync status, managing devices, permissions, etc.

Local-first usually refers to systems that do the “sync” part on the client, though that doesn’t mean the other areas are easy :)

Muse

Adam Wiggins stopped by on Saturday morning to tell us about his work on Muse, a local-first whiteboard app for Apple platforms. While it’s a totally different tech stack and background, it was super interesting because Muse is one of very few consumer apps using local-first sync in production today.

Some of my takeaways from the session with Adam:

  • Local-first means all the logic lives in the client. In the Muse architecture, the server is extremely simple, basically just a dumb pipe routing data between clients. While data is not encrypted end-to-end in their case, it’s possible to use this same architecture with E2E.
  • CRDTs (conflict-free replicated data types) are a magical new advancement in computer science over the past few years, which makes the actual merging of content relatively easy.
  • Merge conflicts are not as big a deal as one might think, and not the hardest problem to solve in this space.
  • Local-first is a huge opportunity for desktop projects like GNOME. We were not really able to be competitive with proprietary software in the past decade on features like sync and multiplayer because we can’t realistically run huge cloud services for every single app/use case. Local-first could change this, since the logic is shifting back to the client. Servers become generic dumb pipes, which all kinds of apps can use without needing their own custom sync server.

To learn more about Muse, I recommend watching Adam’s Local-First Meetup talk from earlier this year, which touches on many of the topics we discussed in our hackfest session as well.

Other Relevant Art

The two projects we discussed as relevant art from our community are Christian Hergert’s Bonsai, and Carlos Garnacho’s work on RDF sync in tracker (codename “Emergence”).

Bonsai is not quite local-first architecturally, since it assumes an always-on home server. This server hosts your data and runs services, which apps on your other devices can use to access data, sync, etc. This is quite different from the dumb pipe server model discussed above, and of course comes with the usual caveats with any kind of public-facing service on local networks (NAT, weird network configurations, etc.).

Codename “Emergence” is a way to sync graph databases (such as tracker’s SPARQL database). It only touches on the “sync” layer, and is only intended for app data, e.g. bookmarks, contacts, and the like. There was a lot of discussion at the hackfest about whether the conflict resolution algorithm is/could be a CRDT, but regardless, using this system for syncing some types of content wouldn’t affect the overall architecture. We could use it for syncing e.g. bookmarks, and share the rest of the stack (e.g. network layer) with other apps not using tracker.

Afternoon local-first discussion. Left-to-right: Zeeshan, Sebastian, Andrei, Marvin, Adrian, Carlos, and myself

Next Steps

By the end of the hackfest, we had a rough consensus that long-term we probably want something like this:

  • Muse-style architecture with dumb pipe sync servers that only route encrypted traffic between clients
  • Some kind of system daemon that apps can use to send packets to sync servers, so they don’t all have to run in the background
  • The ability to fall back to other kinds of transport with full compatibility, e.g. local network or USB keys
  • A client library that makes it easy to integrate sync into apps, using well-established CRDTs

However, there was also a general feeling that we want to go slowly and explore the space before coming up with over-engineered solutions. To this end, we think the best next step is to try CRDTs in small, self-contained apps. We brainstormed a number of potentially interesting use cases, including:

  • Alarms: Make extra sure you hear your alarms by having it synced on all devices
  • Scratchpad: Super simple notepad that’s always in sync across devices
  • Emoji history: The same recently used emoji on all devices
  • Podcasts: Sync subscription list, episode playback state, per-episode progress, currently played episode, etc.
  • Birthday reminders: Simple list of birthdays with reminder notifications that syncs across all devices

For a first minimum viable prototype we discussed ways to cut as many corners as possible, and came up with the following plan:

  • Use an off-the shelf plain text CRDT to build a syncing scratchpad as a first experiment
  • To avoid having to deal with servers, do peer-to-peer transfer only and send data via WebRTC data channel
  • For peer discovery, just hardcode a public WebRTC STUN server in the client
  • Simple Rust GTK app mostly consisting of a text area, using gstreamer for WebRTC and automerge for CRDTs
  • Sync only between two devices

We’ll see how this develops, but it’s great to have mapped out the territory, and put together a concrete plan for next steps in this direction. I’m also conscious that we’re a huge community and only a handful of people were present at the hackfest. It’s very likely that these plans will evolve as more people get involved and we get more experience working with the technology.

For more detail on the discussions, check out the full notes from our local-first sessions.

If you’d like to experiment with this in your own app and have any questions, don’t hesitate to reach out :)

Our beautiful GNOME 45 cake :)

Transparent Panel

Jonas has had an open merge request for a transparent panel for a number of years, and while we’ve tried to get it over the line a few times we never quite managed. Recently Adrian Vovk was interested in giving it another try, so at the hackfest him and Jonas sat down and did some archaeology on Jonas’ old commits, rebased it, got it to work agian, and opened a new merge request.

Adrian and Jonas hard at work rebasing ancient commits

While there are still a few open questions and edge cases, it’s early in the cycle so there’s a real chance that we might finally get this in for 46 :)

And More!

A few other things I was involved with during the hackfest:

  • Julian looked into one of my pet bugs: The default generated avatars when you create a new account not looking like AdwAvatar, but using an older, uglier implementation. This is surprisingly tricky because GDM/GNOME Shell can’t show GTK widgets, and the exported PNG avatars from libadwaita can only be exported at the size they’re being displayed at (which is smaller than in GDM/GNOME Shell).
  • We worked a bit on Annotations in Evince with Pablo, and also interviewed Keywan about how they use annotations in the editorial process for their magazine.
  • DieBahn was officially renamed to Railway, and we discussed next steps for the app and train APIs in general. Railway works for so many providers by accident, because so many of them use the same backend (HAFAS), but it’d be great to have actual open APIs for querying trains from all providers. Perhaps we need a lobbying group to get some EU legislation for this? :)
  • We discussed a “demo mode”, i.e. an easy way to set up a device with a bunch of nice-looking apps, pre-loaded with nice-looking content. One potential approach we discussed was a script that installs a set of apps, and sets them up with data by pre-filling their .var/app/ directories. The exact process for creating and updating this data would need looking into, but I’m very interested in getting something set up for this, because not having it really our software hard to demo.
  • Marvin showed me how he uses CLion for C/Vala development, and we discussed what features Builder would need to gain for him to switch from his custom Vala setup in CLion to Builder.
Julian working on fixing the default avatars

Thanks to Sonny for co-organizing, Cultivation Space for hosting us, and the GNOME Foundation for financial sponsorship! See you next time :)