Modern Text Editor Design

I’ve been fascinated about a few technologies in my career. I have a fondness for finding the right data-structure for a problem. Maybe it was because of all those years playing with cars that gave me the “I wanna go fast” mentality. It lead me to various jobs, including working on databases.

Another technology I love are text editors. There is some really fascinating technology going on behind the scenes.

Gtk4 development is heating up, and we are starting to see a toolkit built like a game engine. That’s pretty cool. But how will that change how we write editors? Should it?

In the Gtk3 cycle, I added support to GtkTextView that would render using Alex’s GtkPixelCache. It helped us amortize the cost of rendering into mostly just an XCopyArea() when drawing a frame. It’s why we have that nice 60fps two-finger-scrolling.

But now that we can have GPU textures, do we want to start doing paginated rendering like Apple did with Preview to make PDF rendering ultra fast? Do we want to focus on just sending text data to the GPU to render from an glyph atlas? How about layout? And intermixing right-aligned with left-aligned text? What if we just focus on code editing and not generic editing with rich font support. How about inserting widgets in-between rows? Do we want unlimited undo? How about crash recovery?

These are all questions that can inform the design of a text editor, and they are things I’m starting to think about.

To inform myself on the problem domain better, I started writing a piecetable implementation with various tweaks to outperform those I’ve seen previously. What I’ve come up with is a combination of a b+tree (bplus tree) and a piecetable. The neat thing about it is the high-density you get per-cacheline as compared to something in a RBTree or SplayTree (Atom recently did this, and AbiWord did it a decade ago). It’s at least as fast as those, but with much less malloc overhead because you need fewer, but densely packed allocations.

I prefer dense-cacheline packed structures over pointer chasing (dereferencing pointers) because the CPU can crunch through numbers in a cacheline much faster than it can load another cacheline (as it may not yet be in L1-3 caches). So while it looks like you’re doing more work, you may in fact be saving time.

On my 10 year old 1.2ghz ThinkPad, I can do 1.5 to 2 million random inserts per-second. So I think it’s “good enough” to move on to solving the next layer of problems.

One neat thing about using the linked-leaves from a b+tree is that you get a “next pointer” from each leaf to the next sequential leaf. So if you need to reconstruct the buffer, it’s a simple scan without tree traversal. This is a common problem in a text editor, because we’re sending out text data to diagnostic engines constantly and it needs to be optimized for.

Part of the piecetable design is that you have two buffers. The original data (file state at loading) and change data (append only buffer if each character typed by the user). The piece table is just pointing to ranges in each buffer set to reconstruct the final buffer.

If you log all of your operations to a log file, you can fairly quickly get yourself a crash recovery mechanism. Additionally, you can create unlimited undo.

One thing I don’t like about GtkTextBuffer today is that you cannot open “very large files” with it. It can certainly handle 100mb text files, but it has to load all of that data into memory. And that means that opening a 10gb SQL dump is going to be fairly difficult. But if we implemented on-demand, paginated data loading (reading from disk when that portion of the file is needed), we can get a fixed memory overhead for a file.

One downside to that approach is that if the file is modified behind the scenes, you are basically screwed. (A proper rename() would not affect things since the old FD would still be valid). One way to work around this is to copy the file before editing (a swap file). If your filesystem has reflink support, that copy is even “free”.

Some files are in encodings that require conversion to UTF-8. If a character encoding crossed the page/block boundary, we’d not be able to convert it without locating the neighboring page. So it seems somewhat at odds with this design. But if we just do the iconv (or similar) encoding conversion as part of the copy to our swap file, you can ensure you have valid UTF-8 to begin with. It’s also a convenient place to count new lines so that you can get relatively accurate document height from the start (otherwise you have to scan the data and count newlines which will jump the scrollbar around).

Another thing that might be an interesting trick is to keep around PangoLayouts for each of the cursor lines. It might allow us to mutate the layout immediately upon the key-press-event and render the content out of order (without going through layout cycles). This is somewhat similar to what other editors do to make things “feel” more interactive. It guarantees you render the key event on the next frame, even if slightly incorrect.

In short, writing a good, fast, modern text editor today is the combination of writing a database and a graphics engine. Btrees, low-cardinality indexes, page caches, write ahead logs, transaction replays, memory pooling, GPU dispatching, texture atlases, layout, and more.

https://github.com/chergert/pieceplustree

Implementing GActionGroup

Gtk applications are more and more using GAction and GActionGroup and it’s easy to see why. They are stateful, allow parameters when activating, and can be inserted into the widget hierarchy using gtk_widget_insert_action_group(). The latter is useful so that you only activate actions (or toggle button sensitivity) in the portion of the user interface that makes sense.

One thing to consider is what your strategy will be for using GActionGroup. One way is to encapsulate the GActionGroup by using GSimpleActionGroup. Another, which I prefer, is to implement the GActionGroupInterface. Although, this requires much more boilerplate code.

Until now…

In libdazzle, I added a header-only helper to ease creating action groups with much less effort on your part.

It goes something like this.

#include <dazzle.h>

#include "foo-bar.h"

static void foo_bar_action_frobnicate (FooBar   *self,
                                       GVariant *param);

DZL_DEFINE_ACTION_GROUP (FooBar, foo_bar, {
  { "frobnicate", foo_bar_action_frobnicate },
})

G_DEFINE_TYPE_WITH_CODE (FooBar, foo_bar, G_TYPE_OBJECT,
                         G_IMPLEMENT_INTERFACE (G_TYPE_ACTION_GROUP, foo_bar_init_action_group))

There are a few niceties when defining action groups this way.

  • Your function signatures get your proper class type as parameter 1.
  • You no longer need to instantiate a GSimpleAction for every action.
  • No unnecessary parameters (like user_data) need to be provided anymore.
  • It uses GActionGroupInterface.query_action to optimize for certain queries.
  • You can change action state with ${prefix}_set_action_state().
  • You can change if an action is enabled with ${prefix}_set_action_enabled().
  • You can take your new GActionGroup and gtk_widget_insert_action_group() directly.

That’s it for now, happy hacking!

A new gutter for Builder

The GtkSourceView library has this handy concept of a GtkSourceGutterRenderer. They are similar in concept to a GtkCellRenderer but for the gutter to the left or right of the text editor.

Like a GtkCellRenderer, you pack it into a container and they are placed one after another with some amount of optional spacing in-between. This is convenient because you can start quickly by mixing and matching what you need from existing components. Those include text (such as line numbers), pixbuf rendering, or even code folding regions.

However, there is a cost to this sort of composition. One is function call overhead, but that isn’t particularly interesting to me because there are ways to amortize that away (like we did with the pixel cache). The real problem is one of physical space. Each time a renderer is added, the width of the gutter is increased.

Builder 3.26.0 added a new column for breakpoints, and so we increased our width by another 18 pixels or so. Enough to be cumbersome. It looked like the following which has 4 renderers displayed.

  • Breakpoints renderer
  • Diagnostics renderer
  • Line numbers
  • Line changes (git)

Once you reach some level of complexity, you need to bite the bullet and implement a single renderer that has all the features you want in one place. It allows you to overlap content for density and use the background itself as a component. We just did that for Builder and here is what it looks like.

There are a couple other nice points performance-wise by implementing the gutter as a single renderer. We can take a number of “shortcuts” in the render path that a generic renderer cannot without sacrificing flexibility. Since the gutter is not pixel cached, this has improved the performance of kinetic scrolling on various HiDPI displays. There is always more performance work to do, but I’m rather happy with the result so far.

You’ll find this in the upcoming 3.26.1 release of Builder and is already available in Builder’s Nightly flatpak.

Builder 3.26 has landed

We’ve updated our Wiki page to give people discovering our application more insight into what Builder can do. You can check it out at wiki.gnome.org/Apps/Builder.

Furthermore, you’ll see prominently links to download either our Stable (now 3.26) or Nightly (3.27) builds via Flatpak.

We have also continued to fill in gaps in Builder’s documentation. If Builder is missing a plugin to do something you need, it’s high time you started writing it. ;-)

We want plugins upstream for the same reason the Linux kernel does. It helps us ship better software and avoid breaking API you use.

Builder 3.26 Sightings

We’re getting close to 3.26 and a number of features have landed. Let’s take a quick screenshot tour to see what you’re likely to see in 3.26.

Most of us have seen the new visual design by now

Visual refresh

A modest debugger

A debugger for Builder

Integrated Symbol Search by GSoC student Anoop Chandu

symbol search

Inline Documentation by GSoC student Lucie Charvát

Inline documentation

Word Completion based on distance from cursor by GSoC student Umang Jain

word completion

I expect the word completion to gain some fancy features like following #include files and custom sort ordering which will look and feel similar to Vim users.

Debugging

I’ve been quiet since I got back from GUADEC. It’s been a busy summer, but I’ve managed to sneak away and build this in-between my other maintainer/GSoC duties.

There is still plenty to do, but this gets the basic plumbing in place for a debugger.

It can also debug flatpak-based applications (although you’ll need .Debug runtimes for good symbols in gtk/glib/etc).

Builder 3.25.5

Like every year, GUADEC has snuck up on me. I’ll be heading out to Manchester in a handful of days which means things are going to get hectic any moment now.

We just reached another milestone in our development phase towards 3.26. I’ve landed two important components which will have important roles in Builder’s future. The new visual layout, and the new shortcut engine. Neither are complete yet, but they are good enough to land on master and allow us to iterate without giant branches.

The new layout is based upon a design iteration lead by Allan which Andreas, Jakub, and myself took part. Not everything is implemented but the parts which most heavily contribute to user workflow are in place.

The new shortcut engine has also landed. This is a monumental amount of work because it completely overhauls the keyboard input model within Builder. We’ve moved to a “capture and bubble” event delivery for keyboard input so that we have more flexibility in how shortcuts are activated.

The event delivery is in one of three phases. Capture, Dispatch, or Bubble. The capture phase allows the widget hierarchy (starting from the toplevel) to handle an input event before the destination widget. Dispatch is delivery to the expected destination, where GtkBindingSets are activated. Finally, the bubble phase allows the hierarchy to handle the event if nothing handled it previously.

This also includes a whole swath of features around it for user-modifiable shortcut themes, custom keyboard controllers, a shortcut editor and more. To simplify the process of merging into Builder it has a compatibility interface to support CSS-based -gtk-key-bindings properties.

Effectively, bubble/capture allows us to have keyboard shortcuts that only activate if the source editor or terminal have not stolen the input. A convincing Vim implementation for our source editor can really benefit from this.

Enjoy some screenshots

This week in Builder

I’ve been aggressively pushing forward our new layout branch to integrate the new design work we’ve been iterating on. I don’t think we need to have it all done to merge to master, so we are rapidly approaching the time-frame for the branch to land.

There is much from the design still missing, but every day more pieces come together. I expect to have a good portion of it done by GUADEC.

A lot happened this week on the wip/chergert/layout branch. Here are some highlights.

  • Georges landed a “focus mode” fullscreen mode for Builder.
  • We’ve landed a patch in flatpak-builder to update terminal titles. This will allow us to scrape that info via the PTY in Builder for better progress messages.
  • Document titlebars now match the primary-color of the content. We also fade between states. This took some craftiness to avoid cascading the entire CSS tree. Video example below.
  • DzlPropertiesGroup is a convenient GActionGroup implementation for exporting multiple object properties from a GObject. This is more convenient than GPropertyAction when you are exposing lots of properties together.
  • The TODO plugin was ported to C and employs various techniques to use reduce memory overhead.
  • Dazzle gained convenience API to do insertion sort on a GtkListStore in O(log n) time by accessing the GSequence directly. This should be a safe layer-violation on Gtk+ 3.x.
  • Terminal, Devhelp, and HTML-Preview were ported to the new layout design.
  • IdeSymbolResolver gained a “find nearest scope” API which is used to update the scope name in the document titlebar. Both C and Vala are supported. Other languages need an implementation still.
  • The project tree was tweaked to look more like the mockup. While this won’t be our default project-tree for 3.26, we will keep it around and therefore it should match the design.
  • Overview map was ported to the new layout design.
  • Plugins should use the new IdeBuffer::change-settled API to be notified of when they should update their state based on buffer content (unless alternate API is provided via the plugin interface).
  • We’ve added type icons to the devhelp search that match the language-feature icons used elsewhere in Builder.
  • Scrolling to the insertion point on buffer load has been vastly improved.
  • Performance of buffer loading has also improved.
  • Todo and build diagnostics have been ported to the new design.

Reflowing text in GtkTreeView

You can do some pretty evil things with GtkTreeView if you put your mind to it. One of the most common questions over the years has been how to reflow (wrap) text. The short answer is you can’t. The answer for those willing to hack around it is something like:

So the real question is “Why not use GtkListBox?”

GtkListBox is a fantastic widget. But there are some things it does not deal well with today. I presume that once it can deal with only keeping a minimum number of widgets in memory (and the size-request magic that goes with it) it would be a better solution.

Builder keeps a lot of widgets active during the lifetime of the application, and that makes style propagation particularly slow. The fewer widgets we can use, the more responsive we can keep things. (And of course, there are multiple ways to do this).

evil tree view
evil cell renderer