Earlier this week, the W3C released the Internationalization Tag Set (ITS) 2.0 as a recommendation. This is a big leap from ITS 1.0 in terms of functionality, and I’m proud to have played a small part in the development of it. Today, I released ITS Tool 2.0.0 with full support for ITS 2.0. Notably, this release supports:

  • Parameters in selectors, including the ability for users to override parameters on the command line.
  • Preserve Space, a data category that allows you to specify which elements are space-preserving. This was based in part on a similar extension data category from ITS Tool 1
  • External Resource, a data category that allows you to locate referenced resources like images and videos. This was based in part on a similar extension data category from ITS Tool 1
  • Locale Filter, a data category that allows you to exclude content from localized copies based on locale. This replaces the considerably more limited dropRule extension from ITS Tool 1
  • ID Value, a data category that allows you to specify potentially complex IDs for elements.

This release also includes a number of features beyond ITS 2.0 support, such as an option to preserve entity references, an option to load external DTDs, and built-in rules for DocBook 5. This is the biggest release of ITS Tool since it was first released. I’d appreciate people trying it out and reporting bugs.

I do have plans for more features, including:

  • An option to follow XIncludes, automatically processing any included files. This is distinct from just merging the XIncludes in that the files are handled individually, as if you had specified them each on the command line. This is really useful in setups that have a large pool of common content that gets XIncluded in different deliverables so that a single PO file can easily reflect all the translatable strings for one deliverable.

  • Support for some sort of readiness data category that can specify whether individual segments are ready for translation. This would put information in the PO file about whether each PO message is finalized yet. This is useful for partial or slushy freezes, where you want to notify translators of finished content, but you can’t commit to freezing all your deliverables.

    We discussed a data category for this in ITS 2.0, but it wasn’t ready in time. Other implementers are interested in this, so we will probably use a shared extension that works in other tools.

  • The ability to specify repeatable elements for multi-lingual documents. In ITS Tool 1.2.0, I added the ability to join multiple translations into a single multi-lingual file, such as those that GNOME now uses for AppData files. Unfortunately, you currently have no control over which elements are repeated with distinct xml:lang attributes. Right now, ITS Tool assumes that whatever elements it uses for segmentation are the repeatable elements. But in files like AppData files, you may want to segment on the p elements, but repeat the description elements.

    I tried to bend the ITS 2.0 Target Pointer data category to support this use case, but ultimately decided it was too different and would needless complicate the specification.

  • HTML support. ITS 2.0 officially supports HTML5, and specifies exactly how ITS information is mapped to HTML. It does not, however, require implementations to support HTML5. For now, ITS Tool is just an XML tool, but I’d like to add support for HTML5. The libxml2 HTML parser doesn’t cut it, unfortunately, so I need to find something that does, and preferably something that I can map to libxml2′s data structures so I can use the same behind-the-scenes logic.

If you’re interested in using ITS Tool for your XML document translation, get in touch. Leave a comment or email me at shaunm at gnome dot org. I’m always happy to help people set up better translation processes.