Gtef is now hosted on gnome.org, and the 2.0 version has been released alongside GNOME 3.24. So it’s a good time for a new blog post on this new library.
The main goal of Gtef is to ease the development of text editors and IDEs based on GTK+ and GtkSourceView, by providing a higher-level API.
Some background information is written on the wiki.
In this blog post I’ll explain in more detail some aspects of Gtef: why a new library was needed, why it is called a framework, and one feature that I worked on during this cycle (a new file loader). There is more stuff already in the pipeline, which may be covered by future blog posts, so stay tuned (and see the roadmap) ;)
Iterative API design + stability guarantees
In Gtef, I want to be able to break the API at any time. API design is hard, so it needs an iterative process; sometimes we only see possible improvements several years later. But application developers want a stable API. So the solution is simple: bump the major version each time an API break is desirable, every 6 months if needed! Gtef 1.0 and Gtef 2.0 are parallel-installable, so an application depending on Gtef 1.0 still compiles fine.
Gtef is a small library, so it’s not a problem if there are e.g. 5 different gtef *.so loaded in memory at the same time. For a library like GTK+, releasing a new major version every 6 months would be more problematic for memory consumption and application startup time.
A concrete benefit of being able to break the API at any time: a contributor (David Rabel) wanted to implement code folding. In GtkSourceView there are several old branches for code folding, but nothing was merged because it was incomplete. In Gtef it is not a problem to merge the first iteration of a class. So even if the code folding API is not finished, there has been at least some progress: two classes have been merged in Gtef. The code will be maintained instead of bit-rotting in a branch. Unfortunately David Rabel doesn’t have the time anymore to continue contributing, but in the future if someone wants to implement code folding, the first steps are already done!
Gtef is the acronym for “GTK+ Text Editor Framework”, but the framework part is not yet finished. The idea is to provide the main application architecture for text editors and IDEs: a GtkApplication on top, containing GtkApplicationWindows, each containing a GtkNotebook, containing tabs (GtkGrids), with each tab containing a GtkSourceView widget. If you look at the current Gtef API, there is only one missing subclass: GtkNotebook. So the core of the framework is almost done, and I hope to finish it for GNOME 3.26. I’ll probably make the GtkNotebook part optional (if a text editor prefers only one GtkSourceView per window) or replaceable by something else (e.g. a GtkStack plus GtkStackSwitcher). Let’s see what I’ll come up with.
Of course once the core of the framework is finished, to be more useful it’ll need an implementation for common features: file loading and saving, search and replace, etc. With the framework in place, it’ll be possible to offer a much higher-level API for those features than what is currently available in GtkSourceView.
Also, it’s interesting to note that there is a (somewhat) clear boundary between GtkSourceView and Gtef: the top level object in GtkSourceView is the GtkSourceView widget, while the GtkSourceView widget is at the bottom of the containment hierarchy in Gtef. I said “somewhat” because there is also GtkSourceBuffer and GtefBuffer, and both libraries have other classes for peripheral, self-contained features.
New file loader based on uchardet
The file loading and saving API in GtkSourceView is quite low-level: it contains only the backend part. In case of error, the application needs to display the error (preferably in a GtkInfoBar) and, for some errors, provide actions like choosing another character encoding manually. One goal of Gtef will be to provide a simpler API, taking care of all kinds of errors, showing GtkInfoBars etc.
But how the backend works has an impact on the GUI. The file loading and saving classes in GtkSourceView come from gedit, and I’m not entirely happy with the gedit UI for file loading and saving. There are several problems; one of them is that GtkFileChooserNative cannot be used with the current gedit UI, so it’s problematic to sandbox the application with Flatpak.
With gedit, when we open a file from a GtkFileChooserDialog, there is a combobox for the encoding: by default the encoding is auto-detected from a configurable list of encodings, and it is possible to manually choose an encoding from that same list. I want to get rid of that combobox, to always auto-detect the encoding (it’s simpler for the user), and to be able to use GtkFileChooserNative (because custom widgets like the combobox cannot be added to a GtkFileChooserNative).
The problem with the file loader implementation in GtkSourceView is that the encoding auto-detection is not that good, hence the need for the combobox in the GtkFileChooserDialog in gedit. But to detect the encoding, there is now a simple-to-use library called uchardet, maintained by Jehan Pagès, and based on the Mozilla universal charset detection code. Since the encoding auto-detection is much better with uchardet, it will be possible to remove the combobox and use GtkFileChooserNative!
Jehan started to modify GtkSourceFileLoader (or, more precisely, the internal class GtkSourceBufferOutputStream) to use uchardet, but as a comment in GtkSourceBufferOutputStream explains, that code is a big headache… And the encoding detection is based only on the first 8KB of the file, which results in bugs if, for example, the first 8KB contain only ASCII characters and a strange character appears later. Changing that implementation to take into account the whole content of the file was not easily possible, so instead I decided to write a new implementation from scratch, in Gtef, called GtefFileLoader. It was done in Gtef and not in GtkSourceView so as not to break the GtkSourceView API, and to have the time in Gtef to write the implementation and API incrementally (while trying to keep the API as close as possible to the GtkSourceView API).
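To make the 8KB problem concrete, here is a minimal Python sketch. It is not the GtkSourceView code: the sample content, `SAMPLE_SIZE` and `decodes_as()` are made up for the illustration. It only shows why a decision taken on the first 8KB can be wrong for the rest of the file.

```python
# Hypothetical illustration, not GtkSourceView code: a file whose first
# 8 KB are pure ASCII, followed by one non-ASCII byte (0xE9 is "é" in
# ISO-8859-1, but it is not valid UTF-8 at that position).
SAMPLE_SIZE = 8 * 1024

content = b"a" * SAMPLE_SIZE + b"caf\xe9\n"


def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if data decodes in the given encoding without error."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Looking only at the first 8 KB, the file looks like valid UTF-8 ...
prefix_looks_utf8 = decodes_as(content[:SAMPLE_SIZE], "utf-8")
# ... but the whole content is not valid UTF-8 ...
whole_is_utf8 = decodes_as(content, "utf-8")
# ... while ISO-8859-1 accepts the whole content.
whole_is_latin1 = decodes_as(content, "iso-8859-1")

print(prefix_looks_utf8, whole_is_utf8, whole_is_latin1)
```

A detector that stops after the sample would pick UTF-8 here and then hit an invalid byte during the conversion.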
The new GtefFileLoader takes a simpler approach, doing things sequentially instead of doing everything at the same time (the reason for the headache): 1) loading the content in memory, 2) determining the encoding, 3) converting the content to UTF-8 and inserting the result into the GtkTextBuffer.
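The three steps can be sketched roughly as follows in Python. This is only an illustration of the sequential structure, not the GtefFileLoader implementation (which is written in C): the function names, the candidate list and the `bytearray` standing in for the GtkTextBuffer are all assumptions, and step 2 uses a naive "first encoding that decodes cleanly" rule instead of uchardet.

```python
from typing import Sequence

# Assumed limit for the sketch (the name MAX_SIZE is hypothetical).
MAX_SIZE = 50 * 1024 * 1024


def load_content(path: str, max_size: int = MAX_SIZE) -> bytes:
    """Step 1: load the whole content in memory, refusing huge files."""
    with open(path, "rb") as f:
        data = f.read(max_size + 1)
    if len(data) > max_size:
        raise ValueError("file is too big")
    return data


def determine_encoding(content: bytes, candidates: Sequence[str]) -> str:
    """Step 2 (naive stand-in for uchardet): first candidate that fits."""
    for encoding in candidates:
        try:
            content.decode(encoding)
            return encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("no candidate encoding fits the content")


def load_file(path: str, text_buffer: bytearray,
              candidates: Sequence[str] = ("utf-8", "iso-8859-15")) -> str:
    """Run the three steps in order and return the encoding that was used."""
    content = load_content(path)                          # step 1
    encoding = determine_encoding(content, candidates)    # step 2
    utf8 = content.decode(encoding).encode("utf-8")       # step 3: to UTF-8
    text_buffer.extend(utf8)   # stand-in for inserting into a GtkTextBuffer
    return encoding
```

Because each step only starts when the previous one has finished, step 2 sees the whole content, not just a sample of it.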
Note that step 2, determining the encoding, would have been entirely possible without uchardet, by counting the number of invalid characters for each candidate encoding and taking the first encoding for which there are no errors (or taking the one with the fewest errors, and escaping the invalid characters). And when uchardet is used, that method can serve as a nice fallback. Since all the content is in memory, it should be fast enough even if it is done on the whole content (GtkTextView doesn’t support very big files anyway; 50MB is the default maximum in GtefFileLoader).
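Here is a minimal Python sketch of that counting method, under stated assumptions: the function names and the candidate lists are hypothetical, and invalid sequences are counted through the U+FFFD replacement characters produced by Python’s `errors="replace"` decoding, which is only an approximation (a genuine U+FFFD in the file would be counted as an error too).

```python
from typing import Sequence, Tuple


def count_invalid(content: bytes, encoding: str) -> int:
    """Approximate the number of invalid sequences for one encoding.

    Each undecodable sequence becomes one U+FFFD with errors="replace".
    """
    return content.decode(encoding, errors="replace").count("\ufffd")


def guess_encoding(content: bytes,
                   candidates: Sequence[str]) -> Tuple[str, int]:
    """Return (encoding, error count).

    The first candidate with no errors wins; otherwise the candidate with
    the fewest errors is returned (the earliest one in case of a tie).
    """
    if not candidates:
        raise ValueError("at least one candidate encoding is needed")
    best_encoding = candidates[0]
    best_errors = None
    for encoding in candidates:
        errors = count_invalid(content, encoding)
        if errors == 0:
            return encoding, 0
        if best_errors is None or errors < best_errors:
            best_encoding, best_errors = encoding, errors
    return best_encoding, best_errors


print(guess_encoding(b"caf\xe9", ("utf-8", "iso-8859-1")))
```

The order of the candidates matters: some single-byte encodings, like ISO-8859-1, accept any byte sequence, so they only make sense at the end of the list.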
GtefFileLoader is usable and works well, but it is still missing quite a few features compared to GtkSourceFileLoader: escaping invalid characters, loading from a GInputStream (e.g. stdin) and gzip decompression support. And I would like to add more features: refusing to load very long lines (they are not well supported by GtkTextView), possibly asking to split the line, and detecting binary files.
The higher-level API is not yet created; GtefFileLoader is still “just” the backend part.
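Those last two checks are not implemented yet, so the following Python sketch is only one possible way to do them, not a description of Gtef: the NUL-byte heuristic and both function names are assumptions.

```python
def looks_binary(content: bytes) -> bool:
    """Heuristic (an assumption, not Gtef's method): NUL byte => binary.

    Note that UTF-16 and UTF-32 text also contains NUL bytes, so a real
    implementation would need to rule out those encodings first.
    """
    return b"\x00" in content


def longest_line_length(content: bytes) -> int:
    """Length in bytes of the longest line (split on \\n, \\r and \\r\\n)."""
    return max((len(line) for line in content.splitlines()), default=0)


print(looks_binary(b"hello\n"), longest_line_length(b"ab\ncdef\n"))
```

A loader could run both checks on the in-memory content before the conversion, and refuse the file or ask the user what to do.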