Code indexing in Builder

Anoop, one of Builder’s GSoC students this past summer, put together a code-index engine built upon Builder’s fuzzy search algorithm. It shipped with support for C and C++. Shortly after the 3.27 cycle started, Patrick added support for GJS. Today I added support for Vala, which was rather easy given the other code we have in Builder.

It looks something like this:

A screenshot of Builder displaying the code search results for Vala

Happy Hacking!

Simplifying contributions

With every release of both GNOME and Builder, we try to lower the barrier a bit more for new contributions. Bastian mentioned to me at GUADEC that we could make things even simpler from the Builder side of things. After a few mockups, I finally found some time to start implementing it.

With the upcoming Nightly build of Builder, you’ll be able to warp right through cloning and building of an application that is ready for newcomer contributions. Just open Builder and click on the application’s icon.

The greeter now shows a grid of icons so newcomers can simply click on the given icon to clone and build.

There is still more to do here, like adding a language emblem and such. Of course, if you want to work on that, do get in touch.

Closures with Async Operations

Way back in 2011 people were discussing usage of modern GCC features like __attribute__((cleanup())). A few years later it found its way into our APIs in GLib with one small caveat: it is only supported by GCC and Clang (so no MSVC/Xlc/SunProC). Since I couldn’t care less about MSVC, I’ve been using it for years (and really Microsoft, you could contribute more to the mental health of open source programmers by modernizing MSVC).
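For readers who have not seen the attribute on its own, here is a minimal sketch in plain C, without GLib, of the mechanism that g_autoptr() and g_steal_pointer() build on. The names free_buffer, auto_buffer, steal_buffer and the buffers_freed counter are made up for this illustration and are not GLib API. It needs GCC or Clang.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counter so the automatic cleanup can be observed. */
static int buffers_freed = 0;

/* The compiler calls this with the address of the variable leaving scope. */
static void
free_buffer (char **bufp)
{
  if (*bufp != NULL)
    {
      free (*bufp);
      buffers_freed++;
    }
}

/* Hypothetical stand-in for a g_autofree-style declaration. */
#define auto_buffer __attribute__((cleanup(free_buffer))) char *

/* Hand the pointer to a new owner and clear the local variable,
 * similar in spirit to g_steal_pointer(). */
static char *
steal_buffer (char **bufp)
{
  char *ret = *bufp;
  *bufp = NULL;
  return ret;
}

/* The copy is released automatically on every return path. */
static size_t
measure_copy (const char *text)
{
  auto_buffer copy = NULL;

  assert (text != NULL);

  copy = malloc (strlen (text) + 1);
  if (copy == NULL)
    return 0;
  strcpy (copy, text);

  return strlen (copy);
}

/* Ownership is passed to the caller explicitly, so cleanup sees NULL. */
static char *
make_copy (const char *text)
{
  auto_buffer copy = NULL;

  assert (text != NULL);

  copy = malloc (strlen (text) + 1);
  if (copy != NULL)
    strcpy (copy, text);

  return steal_buffer (&copy);
}
```

The point of the steal helper is that the only way a value escapes the scope is through a visible call, which is what makes auditing easier.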

I want to give a few examples of patterns I use to make tracking down issues easier.

Using GTask

static void
my_async_cb (GObject      *object,
             GAsyncResult *result,
             gpointer      user_data)
{
  // take ownership of task from caller
  g_autoptr(GTask) task = user_data;
  g_autoptr(GError) error = NULL;

  g_assert (G_IS_TASK (task));
  g_assert (G_IS_ASYNC_RESULT (result));

  if (!do_something_finish (result, &error))
    // explicitly pass ownership of error to GTask
    g_task_return_error (task, g_steal_pointer (&error));
  else
    g_task_return_boolean (task, TRUE);
}

void
my_obj_frob_async (MyObj               *self,
                   GCancellable        *cancellable,
                   GAsyncReadyCallback  callback,
                   gpointer             user_data)
{
  g_autoptr(GTask) task = NULL;

  g_return_if_fail (MY_IS_OBJ (self));
  g_return_if_fail (!cancellable || G_IS_CANCELLABLE (cancellable));

  task = g_task_new (self, cancellable, callback, user_data);
  g_task_set_source_tag (task, my_obj_frob_async);

  // pass task ownership to callback
  do_something_async (cancellable,
                      my_async_cb,
                      g_steal_pointer (&task));
}

The nice thing about this style is that all ownership transfers are explicit. I hope that in the future we can get some automatic checking of this via coverity or gcc/clang plugins. But we’re not quite there yet. Either way, it simplifies the auditing case.

Using Idle Callbacks

State tracking during idle callbacks can very easily turn into security issues. So make sure your function always has access to a reference, and simplify your releasing of the data by allowing the GSource to own the closure. For example, with a GObject it is pretty simple.

static gboolean
frob_from_idle_cb (gpointer data)
{
  MyObj *self = data;

  my_obj_frob (self);

  return G_SOURCE_REMOVE;
}

gdk_threads_add_idle_full (G_PRIORITY_LOW,
                           frob_from_idle_cb,
                           g_object_ref (obj),
                           g_object_unref);

The GSource which is registered and calls frob_from_idle_cb() will automatically call g_object_unref() after the function returns G_SOURCE_REMOVE. This also ensures your object isn’t finalized before the callback has occurred.

This also works with g_timeout_add_full() and gdk_threads_add_timeout_full().

Creating Custom Closures

Sometimes you might have state that is more complex than passing around a single GObject. In that case, create a closure structure and define a cleanup function so you can use g_autoptr().

typedef struct
{
  MyObj *self;
  guint  count;
} FrobState;

static void
frob_state_free (FrobState *state)
{
  g_clear_object (&state->self);
  g_free (state);
}

G_DEFINE_AUTOPTR_CLEANUP_FUNC (FrobState, frob_state_free)

With the above definition, you can use g_autoptr(FrobState) state = user_data; like you would for objects. This also works with the idle functions, just use (GDestroyNotify)frob_state_free as your cleanup function.

Improving Builder docs

We use reStructuredText/sphinx for Builder’s documentation because, quite frankly, I found it the easiest for writing massive amounts of documentation in short order. I’m not sure if it’s what I want to stick with long term, but it’s doing the job short term.

However, one thing we don’t get (and I really want) out of any documentation system is support for cross-referencing the user documentation and API docs. This would be useful for us in Builder because we are the use-case for turning users into contributors. Reading our documentation on writing plugins and then locating the API docs should be seamless.

So last week I put together a prototype to generate reStructuredText/Sphinx API docs from .gir files. It will use the provided .gir, including any located dependencies, and generate API docs for them. It can’t be 100% complete for C because the .gir is missing some information that is only found in the sgml files. But it does an okay job.

Mostly, this was just a prototype to see what the state of the documentation systems is. I’m still fairly dissatisfied and am leaning towards the path of “more prototyping necessary”. For example, some things I still want from a modern documentation platform:

  • Cross-referencing among user and API docs
  • Switch between C, Python, GJS, etc
  • Jump to function/class source code
  • Find example usage of API in GNOME git
  • Partial gettext/i18n support so we don’t have to translate API docs, but can still cross-reference them.
  • Fast lookup indexes for use in IDE auto-completion docs using only read-only mmap()'d files.
  • We might want surrogate links to API docs when cross-referencing so end-user documentation stays small.

Anyway, if you’re bored, play around with it here and give me constructive feedback.

Builder gains multi-touch gestures

If you’re running Wayland and have a touchpad capable of multi-touch, Builder (Nightly) now lets you do fun stuff like the following video demonstrates.

Just three-finger-swipe left or right to move the document. Content is sticky-to-fingers, which is my expectation when using gestures.

It might also work on a touchscreen, but I haven’t tried.

Modern Text Editor Design

I’ve been fascinated by a few technologies in my career. I have a fondness for finding the right data-structure for a problem. Maybe it was all those years playing with cars that gave me the “I wanna go fast” mentality. It led me to various jobs, including working on databases.

Another technology I love is the text editor. There is some really fascinating technology going on behind the scenes.

Gtk4 development is heating up, and we are starting to see a toolkit built like a game engine. That’s pretty cool. But how will that change how we write editors? Should it?

In the Gtk3 cycle, I added support to GtkTextView that would render using Alex’s GtkPixelCache. It helped us amortize the cost of rendering into mostly just an XCopyArea() when drawing a frame. It’s why we have that nice 60fps two-finger-scrolling.

But now that we can have GPU textures, do we want to start doing paginated rendering like Apple did with Preview to make PDF rendering ultra fast? Do we want to focus on just sending text data to the GPU to render from a glyph atlas? How about layout? And intermixing right-aligned with left-aligned text? What if we just focus on code editing and not generic editing with rich font support? How about inserting widgets in-between rows? Do we want unlimited undo? How about crash recovery?

These are all questions that can inform the design of a text editor, and they are things I’m starting to think about.

To inform myself on the problem domain better, I started writing a piecetable implementation with various tweaks to outperform those I’ve seen previously. What I’ve come up with is a combination of a b+tree (bplus tree) and a piecetable. The neat thing about it is the high density you get per cacheline compared to something like an RBTree or SplayTree (Atom recently did this, and AbiWord did it a decade ago). It’s at least as fast as those, but with much less malloc overhead because you need fewer, more densely packed allocations.

I prefer dense-cacheline packed structures over pointer chasing (dereferencing pointers) because the CPU can crunch through numbers in a cacheline much faster than it can load another cacheline (as it may not yet be in L1-3 caches). So while it looks like you’re doing more work, you may in fact be saving time.
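To make the density argument a bit more tangible, here is a small sketch of a branch node that keeps the text length under each child in one packed array. The layout, the fanout of 16, and the 64-byte line size are assumptions for illustration, not the layout used in the implementation described here. Locating the child for an offset is a linear scan over numbers that sit next to each other, with no pointer dereferenced until the child is chosen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 16 lengths of 4 bytes each fill one 64-byte cache line (assumed size). */
enum { BRANCH_FANOUT = 16 };

typedef struct
{
  uint32_t lengths[BRANCH_FANOUT]; /* bytes of text under each child */
  uint8_t  n_children;
} Branch;

/* Returns the index of the child containing offset and stores the offset
 * relative to that child. Returns -1 if offset is past the end. */
static int
branch_find_child (const Branch *branch,
                   uint32_t      offset,
                   uint32_t     *offset_in_child)
{
  assert (branch != NULL);
  assert (offset_in_child != NULL);

  for (int i = 0; i < branch->n_children; i++)
    {
      if (offset < branch->lengths[i])
        {
          *offset_in_child = offset;
          return i;
        }

      /* Skip this child's text and keep scanning the same array. */
      offset -= branch->lengths[i];
    }

  return -1;
}
```

A binary tree would visit several separately allocated nodes to answer the same question, each of which may be a cache miss.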

On my 10-year-old 1.2 GHz ThinkPad, I can do 1.5 to 2 million random inserts per second. So I think it’s “good enough” to move on to solving the next layer of problems.

One neat thing about using the linked-leaves from a b+tree is that you get a “next pointer” from each leaf to the next sequential leaf. So if you need to reconstruct the buffer, it’s a simple scan without tree traversal. This is a common problem in a text editor, because we’re sending out text data to diagnostic engines constantly and it needs to be optimized for.
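The sequential scan could look roughly like the sketch below. The Leaf and Run types and the leaves_copy_text() helper are hypothetical and simplified (a run points directly at its text), so treat it as an illustration of walking the next pointers rather than the real structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { RUNS_PER_LEAF = 4 };

/* A contiguous span of text (hypothetical, simplified). */
typedef struct
{
  const char *text;
  size_t      length;
} Run;

typedef struct Leaf
{
  Run          runs[RUNS_PER_LEAF];
  size_t       n_runs;
  struct Leaf *next; /* next sequential leaf, NULL at the end */
} Leaf;

/* Copy the whole buffer by following leaf->next, never touching branches.
 * Stops early if out is too small. Returns the number of bytes written,
 * not counting the terminating NUL. */
static size_t
leaves_copy_text (const Leaf *first,
                  char       *out,
                  size_t      out_size)
{
  size_t n = 0;

  assert (out != NULL && out_size > 0);

  for (const Leaf *leaf = first; leaf != NULL; leaf = leaf->next)
    {
      for (size_t i = 0; i < leaf->n_runs; i++)
        {
          const Run *run = &leaf->runs[i];

          if (n + run->length >= out_size)
            goto done;

          memcpy (out + n, run->text, run->length);
          n += run->length;
        }
    }

done:
  out[n] = '\0';
  return n;
}
```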

Part of the piecetable design is that you have two buffers: the original data (file state at loading) and the change data (an append-only buffer of each character typed by the user). The piece table just points to ranges in each buffer, which together reconstruct the final buffer.
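Here is a minimal sketch of that idea, using a flat array of pieces instead of a b+tree to keep it short. All names and the fixed capacities are made up for the example, and deletion is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAX_PIECES = 64, ADD_CAPACITY = 256 };

typedef struct
{
  int    from_add; /* 0: original buffer, 1: append-only add buffer */
  size_t offset;
  size_t length;
} Piece;

typedef struct
{
  const char *original;          /* file contents at load, never modified */
  char        add[ADD_CAPACITY]; /* everything typed, in typing order */
  size_t      add_len;
  Piece       pieces[MAX_PIECES];
  size_t      n_pieces;
} PieceTable;

static void
piece_table_init (PieceTable *pt, const char *original)
{
  memset (pt, 0, sizeof *pt);
  pt->original = original;
  if (original[0] != '\0')
    {
      pt->pieces[0] = (Piece) { 0, 0, strlen (original) };
      pt->n_pieces = 1;
    }
}

/* Insert text at pos. Returns 0 on success, -1 if pos is out of range
 * or the fixed capacities of this sketch are exhausted. */
static int
piece_table_insert (PieceTable *pt, size_t pos, const char *text)
{
  size_t len = strlen (text);
  size_t i = 0, start = 0, split;
  Piece added;

  if (len == 0)
    return 0;
  if (pt->add_len + len > ADD_CAPACITY || pt->n_pieces + 2 > MAX_PIECES)
    return -1;

  /* Find the piece containing pos; start is that piece's first offset. */
  while (i < pt->n_pieces && start + pt->pieces[i].length <= pos)
    start += pt->pieces[i++].length;
  if (i == pt->n_pieces && pos != start)
    return -1;

  /* The new text is only ever appended to the add buffer. */
  added = (Piece) { 1, pt->add_len, len };
  memcpy (pt->add + pt->add_len, text, len);
  pt->add_len += len;

  split = pos - start;
  if (split > 0)
    {
      /* Split piece i into left and right halves around the new piece. */
      Piece right = pt->pieces[i];
      right.offset += split;
      right.length -= split;
      pt->pieces[i].length = split;
      memmove (&pt->pieces[i + 3], &pt->pieces[i + 1],
               (pt->n_pieces - i - 1) * sizeof (Piece));
      pt->pieces[i + 1] = added;
      pt->pieces[i + 2] = right;
      pt->n_pieces += 2;
    }
  else
    {
      /* pos falls on a piece boundary: just slot the new piece in. */
      memmove (&pt->pieces[i + 1], &pt->pieces[i],
               (pt->n_pieces - i) * sizeof (Piece));
      pt->pieces[i] = added;
      pt->n_pieces += 1;
    }

  return 0;
}

/* Reconstruct the final buffer by concatenating the referenced ranges. */
static size_t
piece_table_read (const PieceTable *pt, char *out, size_t out_size)
{
  size_t n = 0;

  assert (out_size > 0);

  for (size_t i = 0; i < pt->n_pieces; i++)
    {
      const Piece *p = &pt->pieces[i];
      const char *src = (p->from_add ? pt->add : pt->original) + p->offset;

      if (n + p->length >= out_size)
        break;
      memcpy (out + n, src, p->length);
      n += p->length;
    }

  out[n] = '\0';
  return n;
}
```

Note that neither buffer is ever rewritten. An insert only appends to the add buffer and adjusts a few small piece records.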

If you log all of your operations to a log file, you can fairly quickly get yourself a crash recovery mechanism. Additionally, you can create unlimited undo.
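As a rough sketch of why the log gives you both features, the example below records every edit as a small invertible operation. The types and function names are hypothetical, the log lives in a fixed-size in-memory array standing in for a log file, and the text is a plain char array rather than a piece table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TEXT_CAP = 128, LOG_CAP = 32, OP_TEXT_CAP = 16 };

typedef enum { OP_INSERT, OP_DELETE } OpKind;

typedef struct
{
  OpKind kind;
  size_t pos;
  char   text[OP_TEXT_CAP]; /* inserted or deleted text, kept for inversion */
} Op;

typedef struct
{
  char   text[TEXT_CAP];
  Op     log[LOG_CAP]; /* stands in for an append-only log file */
  size_t n_ops;
} Document;

static void
document_init (Document *doc, const char *initial)
{
  assert (strlen (initial) < TEXT_CAP);
  memset (doc, 0, sizeof *doc);
  strcpy (doc->text, initial);
}

/* Apply op to buf, or its inverse when invert is non-zero. */
static void
apply_op (char *buf, const Op *op, int invert)
{
  size_t len = strlen (op->text);
  int insert = (op->kind == OP_INSERT) != (invert != 0);

  if (insert)
    {
      memmove (buf + op->pos + len, buf + op->pos, strlen (buf + op->pos) + 1);
      memcpy (buf + op->pos, op->text, len);
    }
  else
    memmove (buf + op->pos, buf + op->pos + len,
             strlen (buf + op->pos + len) + 1);
}

static int
document_insert (Document *doc, size_t pos, const char *text)
{
  size_t len = strlen (text), cur = strlen (doc->text);
  Op *op;

  if (doc->n_ops == LOG_CAP || len >= OP_TEXT_CAP ||
      pos > cur || cur + len >= TEXT_CAP)
    return -1;

  op = &doc->log[doc->n_ops++];
  op->kind = OP_INSERT;
  op->pos = pos;
  strcpy (op->text, text);
  apply_op (doc->text, op, 0);
  return 0;
}

static int
document_delete (Document *doc, size_t pos, size_t len)
{
  Op *op;

  if (doc->n_ops == LOG_CAP || len >= OP_TEXT_CAP ||
      pos + len > strlen (doc->text))
    return -1;

  op = &doc->log[doc->n_ops++];
  op->kind = OP_DELETE;
  op->pos = pos;
  memcpy (op->text, doc->text + pos, len); /* remember what was removed */
  op->text[len] = '\0';
  apply_op (doc->text, op, 0);
  return 0;
}

/* Undo: pop the newest record and apply its inverse. */
static int
document_undo (Document *doc)
{
  if (doc->n_ops == 0)
    return -1;
  apply_op (doc->text, &doc->log[--doc->n_ops], 1);
  return 0;
}

/* Crash recovery: start from the text as loaded and replay the log.
 * out must hold at least TEXT_CAP bytes. */
static void
document_replay (const Document *doc, const char *initial, char *out)
{
  strcpy (out, initial);
  for (size_t i = 0; i < doc->n_ops; i++)
    apply_op (out, &doc->log[i], 0);
}
```

With a real log file the undo depth is bounded only by disk space, which is where the "unlimited" comes from. The fixed caps here are just to keep the sketch small.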

One thing I don’t like about GtkTextBuffer today is that you cannot open “very large files” with it. It can certainly handle 100 MB text files, but it has to load all of that data into memory. And that means that opening a 10 GB SQL dump is going to be fairly difficult. But if we implemented on-demand, paginated data loading (reading from disk when that portion of the file is needed), we can get a fixed memory overhead for a file.
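A small sketch of what on-demand paging could look like, assuming a fixed number of page slots with least-recently-used eviction. The Pager type and its functions are invented for this example. It uses stdio and long offsets for brevity, where a real implementation would more likely use pread() with off_t.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { PAGER_PAGE_SIZE = 4096, PAGER_SLOTS = 2 };

typedef struct
{
  long          page_no;   /* -1 while the slot is empty */
  size_t        length;    /* valid bytes in data (short on the last page) */
  unsigned long last_used; /* logical clock value of the last access */
  unsigned char data[PAGER_PAGE_SIZE];
} Slot;

typedef struct
{
  FILE         *file;
  Slot          slots[PAGER_SLOTS]; /* memory use is fixed by this array */
  unsigned long clock;
  unsigned long loads;              /* how many pages were read from disk */
} Pager;

static void
pager_init (Pager *pager, FILE *file)
{
  memset (pager, 0, sizeof *pager);
  pager->file = file;
  for (int i = 0; i < PAGER_SLOTS; i++)
    pager->slots[i].page_no = -1;
}

/* Returns the byte at offset, or -1 past the end of file or on error. */
static int
pager_get_byte (Pager *pager, long offset)
{
  long page_no = offset / PAGER_PAGE_SIZE;
  size_t within = (size_t) (offset % PAGER_PAGE_SIZE);
  Slot *victim = &pager->slots[0];

  assert (offset >= 0);

  for (int i = 0; i < PAGER_SLOTS; i++)
    {
      Slot *slot = &pager->slots[i];

      if (slot->page_no == page_no)
        {
          /* Hit: no disk access needed. */
          slot->last_used = ++pager->clock;
          return within < slot->length ? slot->data[within] : -1;
        }

      /* Remember the least recently used slot in case of a miss. */
      if (slot->last_used < victim->last_used)
        victim = slot;
    }

  /* Miss: reuse the victim slot and read just this page from disk. */
  if (fseek (pager->file, page_no * PAGER_PAGE_SIZE, SEEK_SET) != 0)
    return -1;
  victim->length = fread (victim->data, 1, PAGER_PAGE_SIZE, pager->file);
  victim->page_no = page_no;
  victim->last_used = ++pager->clock;
  pager->loads++;

  return within < victim->length ? victim->data[within] : -1;
}
```

However large the file is, the pager never holds more than PAGER_SLOTS pages in memory.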

One downside to that approach is that if the file is modified behind the scenes, you are basically screwed. (A proper rename() would not affect things since the old FD would still be valid). One way to work around this is to copy the file before editing (a swap file). If your filesystem has reflink support, that copy is even “free”.

Some files are in encodings that require conversion to UTF-8. If an encoded character crossed the page/block boundary, we’d not be able to convert it without locating the neighboring page. So it seems somewhat at odds with this design. But if we just do the iconv (or similar) encoding conversion as part of the copy to our swap file, we can ensure we have valid UTF-8 to begin with. It’s also a convenient place to count newlines so that you can get a relatively accurate document height from the start (otherwise you have to scan the data and count newlines, which will jump the scrollbar around).
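The boundary problem is easy to show with UTF-8 itself when data is read in fixed-size blocks. The sketch below uses two hypothetical helpers, neither taken from any library: one reports how many bytes at the end of a block belong to a character that the boundary cut in half (so they can be carried over to the next block), and one counts newlines in the same pass. It does not do any encoding conversion.

```c
#include <assert.h>
#include <stddef.h>

/* Number of trailing bytes that begin a UTF-8 sequence cut off by the end
 * of the block. Returns 0 when the block ends on a character boundary or
 * when the tail is not something this helper can judge. */
static size_t
utf8_incomplete_tail (const unsigned char *block, size_t len)
{
  size_t back = 0;
  size_t need, have;
  unsigned char lead;

  assert (block != NULL || len == 0);

  /* Walk back over at most three continuation bytes (10xxxxxx). */
  while (back < len && back < 3 && (block[len - 1 - back] & 0xC0) == 0x80)
    back++;

  if (back == len)
    return 0; /* empty, or nothing but continuation bytes */

  lead = block[len - 1 - back];
  if (lead < 0x80)
    need = 1;
  else if ((lead & 0xE0) == 0xC0)
    need = 2;
  else if ((lead & 0xF0) == 0xE0)
    need = 3;
  else if ((lead & 0xF8) == 0xF0)
    need = 4;
  else
    return 0; /* invalid lead byte: leave it to a real validator */

  have = back + 1;
  return have < need ? have : 0;
}

/* Count '\n' bytes so a line total is known once the copy finishes. */
static size_t
count_newlines (const unsigned char *block, size_t len)
{
  size_t n = 0;

  for (size_t i = 0; i < len; i++)
    n += (block[i] == '\n');

  return n;
}
```

A copy loop would hold back the reported tail bytes, prepend them to the next block, and keep a running newline total as it goes.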

Another thing that might be an interesting trick is to keep around PangoLayouts for each of the cursor lines. It might allow us to mutate the layout immediately upon the key-press-event and render the content out of order (without going through layout cycles). This is somewhat similar to what other editors do to make things “feel” more interactive. It guarantees you render the key event on the next frame, even if slightly incorrect.

In short, writing a good, fast, modern text editor today is the combination of writing a database and a graphics engine. Btrees, low-cardinality indexes, page caches, write ahead logs, transaction replays, memory pooling, GPU dispatching, texture atlases, layout, and more.

https://github.com/chergert/pieceplustree

Implementing GActionGroup

Gtk applications are more and more using GAction and GActionGroup and it’s easy to see why. They are stateful, allow parameters when activating, and can be inserted into the widget hierarchy using gtk_widget_insert_action_group(). The latter is useful so that you only activate actions (or toggle button sensitivity) in the portion of the user interface that makes sense.

One thing to consider is what your strategy will be for using GActionGroup. One way is to encapsulate the GActionGroup by using GSimpleActionGroup. Another, which I prefer, is to implement the GActionGroupInterface, although this requires much more boilerplate code.

Until now…

In libdazzle, I added a header-only helper to ease creating action groups with much less effort on your part.

It goes something like this.

#include <dazzle.h>

#include "foo-bar.h"

static void foo_bar_action_frobnicate (FooBar   *self,
                                       GVariant *param);

DZL_DEFINE_ACTION_GROUP (FooBar, foo_bar, {
  { "frobnicate", foo_bar_action_frobnicate },
})

G_DEFINE_TYPE_WITH_CODE (FooBar, foo_bar, G_TYPE_OBJECT,
                         G_IMPLEMENT_INTERFACE (G_TYPE_ACTION_GROUP, foo_bar_init_action_group))

There are a few niceties when defining action groups this way.

  • Your function signatures get your proper class type as parameter 1.
  • You no longer need to instantiate a GSimpleAction for every action.
  • No unnecessary parameters (like user_data) need to be provided anymore.
  • It uses GActionGroupInterface.query_action to optimize for certain queries.
  • You can change action state with ${prefix}_set_action_state().
  • You can change if an action is enabled with ${prefix}_set_action_enabled().
  • You can take your new GActionGroup and gtk_widget_insert_action_group() directly.

That’s it for now, happy hacking!

A new gutter for Builder

The GtkSourceView library has this handy concept of a GtkSourceGutterRenderer. They are similar in concept to a GtkCellRenderer but for the gutter to the left or right of the text editor.

Like a GtkCellRenderer, you pack it into a container and they are placed one after another with some amount of optional spacing in-between. This is convenient because you can start quickly by mixing and matching what you need from existing components. Those include text (such as line numbers), pixbuf rendering, or even code folding regions.

However, there is a cost to this sort of composition. One is function call overhead, but that isn’t particularly interesting to me because there are ways to amortize that away (like we did with the pixel cache). The real problem is one of physical space. Each time a renderer is added, the width of the gutter is increased.

Builder 3.26.0 added a new column for breakpoints, and so we increased our width by another 18 pixels or so. Enough to be cumbersome. It looked like the following which has 4 renderers displayed.

  • Breakpoints renderer
  • Diagnostics renderer
  • Line numbers
  • Line changes (git)

Once you reach some level of complexity, you need to bite the bullet and implement a single renderer that has all the features you want in one place. It allows you to overlap content for density and use the background itself as a component. We just did that for Builder and here is what it looks like.

There are a couple of other nice points performance-wise to implementing the gutter as a single renderer. We can take a number of “shortcuts” in the render path that a generic renderer cannot without sacrificing flexibility. Since the gutter is not pixel cached, this has improved the performance of kinetic scrolling on various HiDPI displays. There is always more performance work to do, but I’m rather happy with the result so far.

You’ll find this in the upcoming 3.26.1 release of Builder, and it is already available in Builder’s Nightly flatpak.

Builder 3.26 has landed

We’ve updated our Wiki page to give people discovering our application more insight into what Builder can do. You can check it out at wiki.gnome.org/Apps/Builder.

Furthermore, you’ll see prominent links to download either our Stable (now 3.26) or Nightly (3.27) builds via Flatpak.

We have also continued to fill in gaps in Builder’s documentation. If Builder is missing a plugin to do something you need, it’s high time you started writing it. 😉

We want plugins upstream for the same reason the Linux kernel does. It helps us ship better software and avoid breaking API you use.

Builder 3.26 Sightings

We’re getting close to 3.26 and a number of features have landed. Let’s take a quick screenshot tour to see what you’re likely to see in 3.26.

Most of us have seen the new visual design by now

Visual refresh

A modest debugger

A debugger for Builder

Integrated Symbol Search by GSoC student Anoop Chandu

symbol search

Inline Documentation by GSoC student Lucie Charvát

Inline documentation

Word Completion based on distance from cursor by GSoC student Umang Jain

word completion

I expect the word completion to gain some fancy features like following #include files and custom sort ordering, which should look and feel familiar to Vim users.