Smooth transition to new major versions of a set of libraries

With GTK+ 4 in development, it is a good time to reflect on some best practices for handling API breaks in a library, and for providing a smooth transition to the developers who will want to port their code.

But this is not just about one library breaking its API. It’s about a set of related libraries all breaking their API at the same time, as will happen in the near future with (at least a subset of) the GNOME libraries in addition to GTK+.

Smooth transition, you say?

What am I implying by “smooth transition”, exactly? If you know the principles behind code refactoring, the goal should be obvious: making small changes to the code, one step at a time, and, more importantly, being able to compile and test the code after each step. The port should not land as one huge commit, or as a branch with a lot of untestable commits.

So, how to achieve that?

Reducing API breaks to the minimum

When developing a non-trivial feature in a library, designing a good API is a hard problem. Often, several years after an API has been released and marked as stable, we see some possible improvements. What is usually done then is to add a new API (e.g. a new function) and deprecate the old one. For a new major version of the library, all the deprecated APIs are removed, to simplify the code. So far so good.

Note that a deprecated API still needs to work as advertised. In a lot of cases, we can just leave the code as-is. But in some other cases, the deprecated API needs to be re-implemented in terms of the new API, usually for a stateful API where the state is stored only in the form used by the new API.
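As a minimal sketch of that last case, assume a hypothetical FooBuffer object whose old API was a boolean “highlight” setting and whose new API is a richer highlight mode. All the names below (FooBuffer, FooHighlightMode, FOO_DEPRECATED_FOR and the foo_buffer_*() functions) are invented for illustration and do not come from any real library. The state is stored only as a mode, and the deprecated functions become thin wrappers that keep behaving as documented:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical deprecation macro: makes GCC and Clang warn at each call site. */
#if defined(__GNUC__) || defined(__clang__)
#define FOO_DEPRECATED_FOR(f) __attribute__((deprecated("Use " #f " instead")))
#else
#define FOO_DEPRECATED_FOR(f)
#endif

typedef enum {
    FOO_HIGHLIGHT_NONE,
    FOO_HIGHLIGHT_SYNTAX,
    FOO_HIGHLIGHT_SYNTAX_AND_BRACKETS
} FooHighlightMode;

typedef struct {
    /* The state is stored only in the form used by the new API. */
    FooHighlightMode highlight_mode;
} FooBuffer;

/* New API. */
void foo_buffer_set_highlight_mode(FooBuffer *buffer, FooHighlightMode mode)
{
    assert(buffer != NULL);
    buffer->highlight_mode = mode;
}

FooHighlightMode foo_buffer_get_highlight_mode(const FooBuffer *buffer)
{
    assert(buffer != NULL);
    return buffer->highlight_mode;
}

/* Old API: deprecated, but re-implemented on top of the new API so that
 * existing callers keep working until the next major version. */
FOO_DEPRECATED_FOR(foo_buffer_set_highlight_mode)
void foo_buffer_set_highlight(FooBuffer *buffer, bool highlight)
{
    foo_buffer_set_highlight_mode(buffer,
                                  highlight ? FOO_HIGHLIGHT_SYNTAX
                                            : FOO_HIGHLIGHT_NONE);
}

FOO_DEPRECATED_FOR(foo_buffer_get_highlight_mode)
bool foo_buffer_get_highlight(const FooBuffer *buffer)
{
    return foo_buffer_get_highlight_mode(buffer) != FOO_HIGHLIGHT_NONE;
}
```

With this layout, an application can replace its calls to the old functions one at a time, and each intermediate commit still compiles and behaves the same.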

And this is one case where library developers may be tempted to introduce the new API only in a new major version of the library, removing the old API at the same time to avoid the need to adapt its implementation. But please, if possible, don’t do that! An application would then be forced to migrate to the new API at the same time as dealing with other API breaks, which is what we want to avoid.

So, ideally, a new major version of a library should only remove the deprecated API, without any other API breaks. Or, at least, the list of the other, “real” API breaks should be reduced to the minimum.

Let’s look at another example: what if you want to change the signature of a function, for example by adding or removing a parameter? This is an API break, right? So you might be tempted to defer that API break to the next major version. But there is another solution! Just add a new function, with a different name, and deprecate the first one. Coming up with a good name for the new function can be hard, but it should just be seen as the function “version 2”. So why not just add a “2” at the end of the function name? Like some Linux system calls: umount() -> umount2() or renameat() -> renameat2(), etc. I admit such names are a little ugly, but a developer can port a piece of code to the new function with one (or several) small, testable commit(s). The new major version of the library can then rename the v2 function to the original name, since the function with the original name was deprecated and has thus been removed. It’s a small API break, but a trivial one to handle: it’s just renaming a function (git grep and the compiler are your friends).
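Here is a small sketch of the “version 2” approach, under the assumption that a hypothetical function foo_count_words() needs an extra max_len parameter. The function names and the FOO_DEPRECATED_FOR macro are invented for illustration. The old function stays available, marked as deprecated, and simply forwards to the new one:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical deprecation macro: makes GCC and Clang warn at each call site. */
#if defined(__GNUC__) || defined(__clang__)
#define FOO_DEPRECATED_FOR(f) __attribute__((deprecated("Use " #f " instead")))
#else
#define FOO_DEPRECATED_FOR(f)
#endif

/* "Version 2" of the function: same job, with the additional max_len
 * parameter. Counts the space-separated words found in the first max_len
 * bytes of text. */
size_t foo_count_words2(const char *text, size_t max_len)
{
    size_t count = 0;
    int in_word = 0;

    assert(text != NULL);

    for (size_t i = 0; i < max_len && text[i] != '\0'; i++) {
        if (text[i] == ' ') {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            count++;
        }
    }

    return count;
}

/* Original signature: deprecated, kept until the next major version, and
 * implemented in terms of the new function (no length limit). */
FOO_DEPRECATED_FOR(foo_count_words2)
size_t foo_count_words(const char *text)
{
    return foo_count_words2(text, SIZE_MAX);
}
```

In the next major version, foo_count_words() would be removed and foo_count_words2() could take over the original name, which leaves the application with a purely mechanical rename.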

GTK+ timing and relation to other GNOME libraries

The announcement that GTK+ 3.22 would be the last GTK+ 3 version came as a bit of a surprise. It was made quite late during the GTK+/GNOME 3.20 -> 3.22 development cycle. I don’t criticize the GTK+ project for doing that; the maintainers have good reasons behind that decision (experimenting with GSK, among other things). But, if we don’t pay attention, this could have a subtle negative fallout on higher-level GNOME libraries.

Those higher-level libraries will need to be ported to GTK+ 4, which will require a fair amount of code changes, and might in turn force them to break their own API. So a new major version will also be released for those libraries, removing their own share of deprecated API and doing other API breaks. Nothing abnormal so far.

If you are a maintainer of one of those higher-level libraries, you might have a list of things you want to improve in the API, some corners that you find a little ugly but for which you never took the time to add a better API. So you think, “now is a good time”, since you’ll release a new major version. This is where it can become problematic.

Let’s say you released libfoo 3.22 in September. If you follow the new GTK+ numbering scheme, you’ll release libfoo 3.90 in March (if everything goes well). But remember, porting an application to libfoo 3.90/4.0 should be as smooth as possible. So instead of introducing the new API directly in libfoo 3.90 (and removing the old, ugly API at the same time), you should release one more version based on GTK+ 3, libfoo 3.24, to reduce the API delta between libfoo-3 and libfoo-4.

So the unusual thing about this development cycle is that, for some libraries, there will be two new versions in March (excluding the micro/patch versions). Or, alternatively, one new version released in the middle of the development cycle. That’s what will be done for GtkSourceView, at least (the first option), and I encourage other library developers to do the same if they are in the same situation (wanting to get rid of APIs which were not yet marked as deprecated in GNOME 3.22).

Porting, one library at a time

If each library maintainer has reduced the real API breaks to the minimum, this greatly eases the work of porting an application (or a higher-level library).

But consider the case where (1) multiple libraries all break their API at the same time, (2) they are all based on the same main library (in our case GTK+), and (3) the new major versions of those other libraries all depend on the new major version of the main library (in our case, libfoo 3.90/4.0 can be used only with GTK+ 3.90/4.0, not with GTK+ 3.22). Then porting an application is a mess again, unless the libraries follow the good practice described next.

The solution is easy, but the steps must be done in a well-defined order. So imagine that libfoo 3.24 is ready to be released (you can either release it directly, or create a branch and wait until March to do the release, to follow the GNOME release schedule). What are the next steps?

  • Do not port libfoo to GTK+ 3.89/3.90 directly, stay at GTK+ 3.22.
  • Bump the major version of libfoo, making it parallel-installable with previous major versions.
  • Remove the deprecated API and then release libfoo 3.89.1 (development version). With a git tag and a tarball.
  • Do the (hopefully few) other API breaks and then release libfoo 3.89.2. If there are many API breaks, more than one release can be done for this step.
  • Port to GTK+ 3.89/3.90 for the subsequent releases (which may force other API breaks in libfoo).

The same for libbar.

Then, to port an application:

  • Make sure that the application doesn’t use any deprecated API (look at compilation warnings).
  • Test against libfoo 3.89.1.
  • Port to libfoo 3.89.2.
  • Test against libbar 3.89.1.
  • Port to libbar 3.89.2.
  • […]
  • Port to GTK+ 3.89/3.90/…/4.0.

This results in smaller and testable commits. You can compile the code, run the unit tests, run other small interactive/GUI tests, and run the final executable, all in finer-grained steps. It is not hard to do, provided that each library maintainer has followed the above steps in the right order, with git tags and tarballs so that application developers can compile the intermediate versions, alongside a comprehensive (and comprehensible) porting guide, of course.
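While an application sits between two of those intermediate versions, a compile-time version check can help keep each commit buildable. The sketch below assumes that libfoo exposes version macros in the style of GTK+’s GTK_CHECK_VERSION(). The FOO_* macros and the app_built_for_post_break_api() function are hypothetical, and in a real library the three version macros would come from an installed header rather than being defined by the application:

```c
#include <assert.h> /* static_assert (C11) */

/* Hypothetical version macros, with the values that the libfoo 3.89.1
 * headers would provide. */
#define FOO_MAJOR_VERSION 3
#define FOO_MINOR_VERSION 89
#define FOO_MICRO_VERSION 1

/* Evaluates to 1 if the libfoo headers are at least major.minor.micro. */
#define FOO_CHECK_VERSION(major, minor, micro) \
    (FOO_MAJOR_VERSION > (major) || \
     (FOO_MAJOR_VERSION == (major) && FOO_MINOR_VERSION > (minor)) || \
     (FOO_MAJOR_VERSION == (major) && FOO_MINOR_VERSION == (minor) && \
      FOO_MICRO_VERSION >= (micro)))

/* Refuse to build against anything older than libfoo 3.22. */
static_assert(FOO_CHECK_VERSION(3, 22, 0), "libfoo >= 3.22 is required");

/* Returns 1 if the application was compiled against the API that follows
 * the "real" API breaks of 3.89.2, and 0 if it was compiled against the
 * earlier API. */
int app_built_for_post_break_api(void)
{
#if FOO_CHECK_VERSION(3, 89, 2)
    return 1;
#else
    return 0;
#endif
}
```

Such guards are meant to be temporary: once the port to a given step is finished and tested, the branch for the older version can be deleted.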

For a practical example, see how it is done in GtkSourceView: Transition to GtkSourceView 4. (It is slightly more complicated, because we will change the namespace of the code from GtkSource to Gsv, to stop stomping on the Gtk namespace.)

And you, what is your list of library development best-practices when it comes to API breaks?

PS: This blog post doesn’t really touch on the subject of how to design a good API in the first place, to avoid the need to break it. It may be the subject of a future blog post. In the meantime, this LWN article (Designing better kernel ABIs) is interesting. But for a user-space library, there is more freedom: making a new major version parallel-installable (even every six months, if needed, as is done in the Gtef library, which can serve as an incubator for GtkSourceView); writing small copylibs/git submodules before integrating the feature into a shared library; and a few other ways. With a list of book references that help in designing object-oriented code and APIs.


7 Responses to Smooth transition to new major versions of a set of libraries

  1. Very interesting post. Unfortunately we still have applications that have not been ported from the 2.x era to 3.x. Any advice on how to handle those? Port to 3 then to 4? Skip 3 and try to go to 4 directly?

    • swilmet says:

      It is recommended to first port to gtk3, and then to gtk4 (once released as stable). A gtk2 application probably uses APIs that have been deprecated during gtk3 (like GtkUIManager, GtkAction, stock icons, etc). Those APIs are still present in gtk3, but have been removed in gtk4. So it’s easier to first port the application to gtk3 while still using a lot of deprecated API. Then port to the new gtk3 APIs (GAction, GMenu, for example). Then, when the application doesn’t use any deprecated API from gtk3, try to port to gtk4.

      That’s what the GTK+ porting guide recommends:
      https://git.gnome.org/browse/gtk+/tree/docs/reference/gtk/migrating-2to4.xml?h=3.89.1

      But for such an application that still uses gtk2 today, I would recommend waiting for GTK+ 4.0, the stable version, not 3.90, 3.92 etc.

    • Alex says:

      What happens when developers can’t keep up since there is a new major stable version of GTK+ every two years? Make users install dozens of GTK+ versions in parallel?

  2. How about you explain to people to just use http://semver.org ?

    From semver.org:

    How should I handle deprecating functionality?

    Deprecating existing functionality is a normal part of software development and is often required to make forward progress. When you deprecate part of your public API, you should do two things: (1) update your documentation to let users know about the change, (2) issue a new minor release with the deprecation in place. Before you completely remove the functionality in a new major release there should be at least one minor release that contains the deprecation so that users can smoothly transition to the new API.

    Except you are trying to achieve semver rules with just x.y instead of x.y.z, abusing z for minor increments here and there (for lack of proper rules on y?). An API break in semver is in fact really simple:

    given 1.0.0 needs an API break, you release a 1.1.0 with the API to be broken marked as deprecated and the new API being added. Then a release later you release a 2.0.0 with the deprecated API removed and the new API the same as 1.1.0.

    In your example: 3.22.0 which needs an API break becomes 3.23.0 with the API to be broken marked as deprecated and the new API being added. And then 4.0.0 with the deprecated API removed and the new API the same as 3.23.0.

    In the meantime when you have (security) bugs you just increment the z of x.y.z. For example if you found a (security) bug in 3.22.0 and it got propagated to 3.23.0 and 4.0.0, and you want to fix all three of those releases, then you’ll make a 3.22.1 where you JUST fixed the bug (you DID NOT change its API), you release a 3.23.0 where you JUST fixed the bug (you DID NOT change its API) and you release a 4.0.1 where you JUST fixed the bug (you DID NOT change its API).

    The value of this is that our awesome packagers can make dependency rules for the packages that use our libraries for all three kinds of versions.

    They can say, for example: 3.[>=22].[>0] to ensure that they get a backward compatible release of 3.23.0 that DOES NOT have the security bug. And both 3.22.1 and 3.23.1 can be selected from the package database. They can also say, for example, 4.[>=0].[>0] to get the release with API 4.0.0 that DOES NOT have the security bug.

    Of course it is hard to use sensible standards, like semver.org. It’s much easier to use not-invented-here syndromes.

    Meanwhile the rest of the world just does semver.org.

    • swilmet says:

      Instead of the -alpha, -beta etc suffixes, GNOME has the difference between even and odd minor versions. Other than that, it’s true that semver.org is a good reference, and it can be applied to GNOME.

      But real API breaks do happen (other than removing deprecated API), an example that I have given in the blog post is to rename foo2() -> foo(), just to get rid of the temporarily ugly name. Another example that happened in GtkSourceView is to make a GObject property construct-only; in theory another property could be created, and the first one deprecated/removed, but the API would look strange with only the new property name (because it’s hard to come up with a good name when the obvious one is already taken).

      This blog post explains a little more than semver.org with respect to API breaks, especially when multiple related libraries are involved. GTK+ releases a new major version -> this has an impact on higher-level libraries, not just on applications.

  3. I meant “you release a 3.23.1 where you JUST fixed the bug (you DID NOT change its API) and” instead of 3.23.0 for that security bug. Argh. You get the point :-)
