Rebasing downstream translations

At Endless, we maintain downstream translations for a number of GNOME projects, such as gnome-software, gnome-control-center and gnome-initial-setup. These are projects where our (large) downstream modifications introduce new user-facing strings. Sometimes, our translation for a string differs from the upstream translation for the same string. This may be due to:

  • a deliberate downstream style choice – such as tú vs. usted in Spanish
  • our fork of the project changing the UI so that the upstream translation does not fit in the space available in our UI – “Suspend” was previously translated into German as „In Bereitschaft versetzen“, which we changed to „Bereitschaft“ for this reason
  • the upstream translation being incorrect
  • the whim of a translator

Whenever we update to a new version of GNOME, we have to reconcile our downstream translations with the changes from upstream. We want to preserve our intentional downstream changes, and keep our translations for strings that don’t exist upstream; but we also want to pull in translations for new upstream strings, as well as improved translations for existing strings. Earlier this year, the translation-rebase baton was passed to me. My predecessor would manually reapply our downstream changes for a set of officially-supported languages, but unlike him, I can pretty much only speak English, so I needed something a bit more mechanical.

I spoke to various people from other distros about this problem. ((I’d love to credit the individuals I spoke to but my memory is awful. Please let me know if you remember being on the other side of these conversations and I’ll update this post!)) A common piece of advice was to not maintain downstream translation changes: appealing, but not really an option at the moment. I also heard that Ubuntu follows a straightforward rule: once the translation for a string has been changed downstream, all future upstream changes to the translation for that string are ignored. The assumption is that all downstream changes to a translation must have been made for a reason, and should be preserved. This is essentially a superset of what we’ve done manually in the past.

I wrote a little tool to implement this logic: translate-o-tron 3000 (or “t3k” for short), originally named pomerge. ((Thanks to Alexandre Franke for pointing out the existence of at least one existing tool called “pomerge”. In my defence, I originally wrote this script on a Eurostar with no internet connection, so couldn’t search for conflicts at the time.)) Its “rebase” mode takes the last common upstream ancestor, the last downstream commit, and a working copy with the newest downstream code. For each locale, and for each string in the working copy’s translation, it compares the old upstream and old downstream translations; if they differ, it merges the old downstream translation into the working copy. For example, Endless OS 3.5.x was based on GNOME 3.26; Endless OS 3.6.x is based on GNOME 3.32. I might rebase the translations for a module with:

$ cd src/endlessm/gnome-control-center

# The eos3.6 branch is based on the upstream 3.32.1 tag
$ git checkout eos3.6

# Update the .pot file
$ ninja -C build meson-gnome-control-center-2.0-pot

# Update source strings in .po files
$ ninja -C build meson-gnome-control-center-2.0-update-po

# The eos3.5 branch is based on the upstream 3.26.1 tag;
# merge downstream changes between those two into the working copy
$ t3k rebase "$(pwd)" 3.26.1 eos3.5

# Optional: Python's polib formats .po files slightly differently to gettext;
# reformat them back. This has no semantic effect.
$ ninja -C build meson-gnome-control-center-2.0-update-po

$ git commit -am 'Rebase downstream translations'
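
The per-string rule that rebase mode applies can be sketched in a few lines of Python. This is a minimal sketch and not t3k’s actual code: it models one locale’s .po file as a plain dict from source string to translation (with an empty string meaning “untranslated”) rather than using polib, and the function name and all example strings other than the „Bereitschaft“ one are made up for illustration.

```python
def rebase_translations(old_upstream, old_downstream, working):
    """Reapply deliberate downstream translations onto an updated catalogue.

    Each argument is a hypothetical stand-in for one locale's .po file:
    a dict mapping source string (msgid) to translation (msgstr),
    where "" means the string is untranslated.
    """
    merged = dict(working)  # leave the caller's dict untouched
    for msgid in working:
        ours = old_downstream.get(msgid, "")
        theirs = old_upstream.get(msgid, "")
        # A non-empty old downstream translation that differs from the old
        # upstream one is treated as a downstream change to preserve. This
        # also covers strings that never existed upstream.
        if ours and ours != theirs:
            merged[msgid] = ours
    return merged


# Illustrative data only.
old_upstream = {
    "Suspend": "In Bereitschaft versetzen",
    "Power Off": "Ausschalten",
}
old_downstream = {
    "Suspend": "Bereitschaft",               # changed downstream
    "Power Off": "Ausschalten",              # identical to old upstream
    "Example downstream string": "Beispiel",  # exists only downstream
}
working = {
    "Suspend": "In Bereitschaft versetzen",  # new upstream translation
    "Power Off": "Herunterfahren",           # upstream changed it since
    "Example downstream string": "",         # untranslated after update-po
    "New upstream string": "Neu",            # added upstream
}

merged = rebase_translations(old_upstream, old_downstream, working)
```

With this data, the downstream „Bereitschaft“ and the downstream-only string are restored, while “Power Off” and the new upstream string keep whatever the new upstream catalogue says, since downstream never changed them.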

It also has a simpler “copy” mode which copies translations from one .po file to another, either when the string is untranslated in the target (the default) or for all strings. In some cases, we’ve purchased translations for strings which have not been translated upstream; I’ve used this to submit some of those upstream, such as Arabic translations of the myriad OARS categories, and hope to do more of that in the future now I can make a computer do the hard work.

$ t3k copy \
    ~/src/endlessm/gnome-software/po/ar.po \
    ~/src/gnome/gnome-software/po/ar.po
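
The two behaviours of copy mode can be sketched in the same spirit. Again, this is an illustration under assumptions and not t3k’s real implementation: catalogues are plain dicts from source string to translation, and the function name and the `overwrite` parameter are hypothetical names chosen for this example.

```python
def copy_translations(source, target, overwrite=False):
    """Copy translations from `source` into a copy of `target`.

    Both arguments are hypothetical stand-ins for .po files: dicts mapping
    source string (msgid) to translation (msgstr), "" meaning untranslated.
    By default only strings untranslated in the target are filled in;
    with overwrite=True every translated source string replaces the
    target's translation.
    """
    result = dict(target)
    for msgid, msgstr in source.items():
        if msgid not in target or not msgstr:
            continue  # nothing to copy, or the target has no such string
        if overwrite or not target[msgid]:
            result[msgid] = msgstr
    return result


# Illustrative data only; the translations are placeholders.
downstream = {"Cartoon Violence": "A", "Fantasy Violence": "B", "Gone": "C"}
upstream = {"Cartoon Violence": "", "Fantasy Violence": "X"}

filled = copy_translations(downstream, upstream)
replaced = copy_translations(downstream, upstream, overwrite=True)
```

In the default mode, existing target translations win, which suits submitting purchased translations upstream without disturbing what upstream translators have already written.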

4 thoughts on “Rebasing downstream translations”

  1. I’d definitely agree with the recommendation to avoid downstream translation changes as much as possible – because if you’re going to have them, you need people who speak the appropriate languages to help maintain them.

    I mean, sure, you _can_ just always assume your overrides are correct… but then you run the risk of inconsistency if upstream introduces new strings that use the terms you’ve changed, and when you update, some of the UI uses one word and other parts use a different word…

  2. The Ubuntu policy of keeping downstream translations forever is a source of headaches for upstream maintainers. The translations are often of poor quality because there is no review process. I wouldn’t take them as an example.

    The choice of pomerge as a name is unfortunate as that is already the name of a commonly used tool in Free Software translation, as your favourite search engine will probably tell you in its first result.

    I’m curious about the credits and license compatibility for the purchased translations.

  3. I have the same worry about our translations’ (lack of) review. I would like to get our community translators involved in our translations, and I would like to get us out of the business of changing upstream strings. This is a chicken-and-egg problem; improving the tooling around our existing process frees up time to improve the process itself in future.

    I have renamed the tool to translate-o-tron 3000 (or t3k for short). Thanks for pointing out the existing pomerge tool.

    Credits – fair point, I need to teach t3k to propagate those. License compatibility: we mostly purchase translations from Gengo (via Transifex). Section 6 of their terms of service says:

    All intellectual property rights in the Translated Works will be assigned to Customer upon Customer’s Approval of the Translated Works and Your compliance with this TOS.

    So I don’t think there is a license compatibility problem: we release the translations under the same license as the project in question, as we do for all our modifications to free software projects.

  4. Well, all the changes being preserved were originally made by speakers of the appropriate languages. We do have (internal and community) speakers of the relevant languages in most cases. Yes, I do want to review the changes & decide whether to keep, upstream, or discard each one – tooling to help identify the changes makes this practical. One step at a time!
