Infographic: How Mallard helps cross-stream documentation workflows

One of the projects I’ve been working on lately is Ducktype, a lightweight syntax for Mallard. Mallard has a lot of strengths. Its automatic linking mechanisms make content organization easier. Its focus on independent topics makes content re-use possible. Its revision information and other metadata allow you to do status tracking on large content pools. It has a well-defined extension mechanism that allows you to add new functionality and embed external vocabularies like TTML, SVG, and ITS.

XML is the backbone that makes all of this possible. But XML is also what slows adoption. There’s a growing trend towards using lightweight formats to make it easier to contribute. But while lightweight formats make easy things easy, they tend to fall over when dealing with the issues that XML-based vocabularies are designed to solve.

The idea for a lightweight syntax for Mallard has floated around for a couple years. I even spent some time trying to repurpose an existing lightweight format like reStructuredText or AsciiDoc, but none of them are able to carry the level of semantic information that Mallard needs.

Before going into details, let’s look at a Mallard page written in Ducktype:

= My First Topic
@link[guide >index]
@desc A short description of this page
@revision[date=2014-11-13 status=draft]

This is the first paragraph.
The paragraph continues here, but ends with the blank line below.

[steps]
* This is a steps list, common in Mallard.
* Without the [steps] declaration above, we'd get a normal bullet list.
  Indentation is significant, so this is still in the second item.
* Indentation is so significant that you can actually nest block elements.

  So this is a new paragraph still inside the third item.

  [note]
  And this is a paragraph in a note in the third item.

* You can also nest list items, or literally anything else.

  * This is a basic bullet list.
  * It is in the fourth item of the steps list.

This paragraph is outside the steps list.

One of the most distinguishing features is that, like Python, indentation matters. Indentation is how you stay inside a block element. Ducktype also allows you to do everything you’d do inside a Mallard <info> element, which is crucial to pretty much all of the compelling features of Mallard.
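To make the indentation rule concrete, here is a minimal Python sketch of indentation-based nesting. It is not the Ducktype parser, and the `nest_by_indent` name and the dictionary layout are made up for illustration. It only shows how deeper indentation keeps a line inside the block above it.

```python
def nest_by_indent(lines):
    """Group non-blank lines into a tree based on leading spaces.

    Toy illustration only: a line indented deeper than an earlier line
    becomes that line's child. The real Ducktype syntax has many more rules.
    """
    root = {"text": None, "indent": -1, "children": []}
    stack = [root]
    for line in lines:
        if not line.strip():
            continue  # blank lines separate blocks but carry no nesting here
        indent = len(line) - len(line.lstrip(" "))
        node = {"text": line.strip(), "indent": indent, "children": []}
        # Leave every open block that this line is not indented inside.
        while stack[-1]["indent"] >= indent:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root


if __name__ == "__main__":
    tree = nest_by_indent([
        "* first item",
        "  still inside the first item",
        "outside the list",
    ])
    for child in tree["children"]:
        print(child["text"], len(child["children"]))
```

Dedenting is all it takes to close a block, which is why the last paragraph in the example page lands outside the steps list.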

Ducktype is guided by a few design principles, just as Mallard was so many years ago:

  1. It should be possible to do almost anything Mallard XML can do. You can arbitrarily nest block elements. You can have inline markup everywhere you need it, including in code blocks and in other inline markup. You can embed extensions and other vocabularies so that things like Mallard Conditionals and Mallard+TTML are possible. In fact, the only limitation I’ve yet encountered is that you can’t put attributes on page and section titles. This means that Ducktype is capable of serving as a non-XML syntax for virtually any XML vocabulary.
  2. The most commonly used Mallard features should be easy to use. Mallard pages tend to be short with rich content and a fair amount of metadata. Steps lists are common. Semantic inline content is common. Linking mechanisms are, unsurprisingly, extremely common. Credits are common. Revision info is common. Licenses are nearly always done with XInclude.
  3. There should be a minimal number of syntactical constructs. Most lightweight formats have shorthand, special-purpose syntax for everything. This makes it extremely difficult to support extension content without modifying the parser. And for any non-trivial content, it makes it difficult to remember which non-alphanumeric characters you have to escape when.
  4. For extra special bonus points, it should be possible to extend the syntax for special purposes. Lightweight syntaxes are popular in code comments for API documentation, and in API documentation you want shorthand syntax for the things you reference most often. For an object-oriented language, that’s classes and methods. For XSLT, it’s templates and parameters. By not gobbling up all the special characters in the core syntax, we make it possible to add shorthand inline notations by just loading a plugin into the parser.

There’s some discussion on the mallard-list mailing list, starting in August. And there’s a preliminary Ducktype parser up on Gitorious. You can also get it from PyPI with `pip install mallard-ducktype`. If you’re interested in docs, or ducks, or anything of the sort, please join the conversation. I always like getting more input.

Add this to the list of things I never expected to be doing: opening a grocery store.

At last year’s Open Help Conference, I gave a talk titled Community Lessons From IRL. I told the story of how I got involved in opening a grocery store, and what I’ve learned about community work when the community is your neighbors.

I live in Cincinnati, in a beautiful, historic, walkable neighborhood called Clifton. We pride ourselves on being able to walk to get everything we need. We have a hardware store, a pharmacy, and a florist. We have lots of great restaurants. We had a grocery store, but after generations of serving the people of Clifton, our neighborhood IGA closed its doors nearly four years ago.

The grocery store closing hurt our neighborhood. It hurt our way of life. Other shops saw their business decline. Quite a few even closed their doors. At restaurants and coffee houses and barber shops, all anybody talked about was the grocery store being closed. When will it reopen? Has anybody contacted Trader Joe’s/Whole Foods/Fresh Market? Somebody should do something.

“Somebody should do something” isn’t doing something.

If there’s one thing I’ve learned from over a decade of working in open source, it’s that things only get done when people get up and do them. Talk is cheap, whether it’s in online forums or in the barber shop. So a group of us got up and did something.

Last August, a concerned resident sent out a message that if anybody wanted to take action, she was hosting a gathering at her house. Sometimes just hosting a gathering at your house is all it takes to get the ball rolling. Out of that meeting came a team of people committed to bringing a full-service grocery store back to Clifton as a co-op, owned and controlled by the community.

Thus was born Clifton Market.

Clifton Market display in the window of the vacant IGA building

For the last 14 months, I’ve spent whatever free time I could muster trying to open a grocery store. Along with an ever-growing community of volunteers, I’ve surveyed the neighborhood, sold shares, created a business plan, talked to contractors, negotiated real estate, and learned far more about the grocery industry than I ever expected. In many ways, I’ve been well-served by my experience working with volunteer communities in GNOME and other projects. But a lot of things are different when the project is in your backyard staring you down each day.

Opening a grocery store costs money, and we’ve been working hard on raising the money through shares and owner loans. If you want to support our effort, you can buy a share too.

Passive Voice Day will again be observed Monday, April 28, 2014. This event is participated in by many each year. On Passive Voice Day, the passive voice is preferred in tweets, blogs, and casual conversation. Not only is this found to be amusing, but an opportunity is also created for people to be educated on writing well.

The hashtag #passivevoiceday should be used so that your passive voice sentences can be enjoyed by others.

If it’s not known how the passive voice is used, this information from Grammar Girl should be read.

The GNOME doc team has been having a doc sprint in sunny Norwich, England. I arrived a day late due to crazy snow in Cincinnati, but the team’s been hard at work all week. As usual, there’ve been plenty of feature requests for Mallard, which I’ve been trying to keep track of. I’ve tried to encourage people to join the Mallard mailing list and to file Mallard enhancement proposals. Kat asked for a feature in yelp-tools to check licenses of Mallard page files. I’ve added a “license” subcommand to yelp-check. This is in git master, and I’ll get docs up on the wiki after the next release.
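For a sense of what a license check involves, here is a rough Python sketch. It is not how yelp-check implements its subcommand. The `page_licenses` function is hypothetical, and it assumes licenses are recorded as license elements with an href attribute inside the page's info element.

```python
import xml.etree.ElementTree as ET

# Mallard 1.0 namespace in ElementTree's {uri}name notation.
MAL = "{http://projectmallard.org/1.0/}"


def page_licenses(xml_text):
    """Return the href of each info/license element in a Mallard page.

    Licenses without an href are reported as "unknown".
    """
    page = ET.fromstring(xml_text)
    return [lic.get("href", "unknown")
            for lic in page.findall(f"{MAL}info/{MAL}license")]


if __name__ == "__main__":
    sample = """<page xmlns="http://projectmallard.org/1.0/" id="demo">
    <info>
      <license href="http://creativecommons.org/licenses/by-sa/3.0/">
        <p>Creative Commons Attribution-ShareAlike 3.0</p>
      </license>
    </info>
    <title>Demo</title>
    </page>"""
    print(page_licenses(sample))
```

An empty list flags a page with no license information at all, which is usually the case you want to catch.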

I also wrote a new tutorial on Mallard conditionals. It’s hosted with the rest of the Mallard tutorials, and I hope we can adapt it to a chapter on conditionals in the Mallard book. In other Mallard news, Ryan’s been pushing me to resume my work on my non-XML Mallard markdown, so that might see some work soon.

We had our semi-annual discussion about the state of our developer docs and what we can do to fix them. As usual, a lot of problems were identified and a lot of ideas were tossed around, but we never seem to revamp things quite to the level I’d like. Maybe this time will be different.

Allan Day joined us for a few days. I enjoyed working with him on redesigning Yelp. Hopefully Yelp 3.12 will look more like a modern GNOME 3 app. Aside from the chrome of Yelp, Allan took a crack at designing a splash page for the desktop help that has a bit more visual appeal. I’ve been working on implementing this for the last two days. Everybody loves screenshots:


Following our tradition of naming special page styles after our hackfests, this is currently accomplished with the “norwich” style on links elements, along with some uix:thumbs elements on the target pages. Petr’s been pushing me to get thumbnails out of experimental and into Mallard UI proper. Hopefully I can find time to carry through on this after the hackfest.

Thanks to the GNOME Foundation for sponsoring my travel, and to the University of East Anglia for their wonderful hospitality.

Earlier this week, the W3C released the Internationalization Tag Set (ITS) 2.0 as a recommendation. This is a big leap from ITS 1.0 in terms of functionality, and I’m proud to have played a small part in the development of it. Today, I released ITS Tool 2.0.0 with full support for ITS 2.0. Notably, this release supports:

  • Parameters in selectors, including the ability for users to override parameters on the command line.
  • Preserve Space, a data category that allows you to specify which elements are space-preserving. This was based in part on a similar extension data category from ITS Tool 1.
  • External Resource, a data category that allows you to locate referenced resources like images and videos. This was based in part on a similar extension data category from ITS Tool 1.
  • Locale Filter, a data category that allows you to exclude content from localized copies based on locale. This replaces the considerably more limited dropRule extension from ITS Tool 1.
  • ID Value, a data category that allows you to specify potentially complex IDs for elements.

This release also includes a number of features beyond ITS 2.0 support, such as an option to preserve entity references, an option to load external DTDs, and built-in rules for DocBook 5. This is the biggest release of ITS Tool since it was first released. I’d appreciate people trying it out and reporting bugs.

I do have plans for more features, including:

  • An option to follow XIncludes, automatically processing any included files. This is distinct from just merging the XIncludes in that the files are handled individually, as if you had specified them each on the command line. This is really useful in setups that have a large pool of common content that gets XIncluded in different deliverables so that a single PO file can easily reflect all the translatable strings for one deliverable.

  • Support for some sort of readiness data category that can specify whether individual segments are ready for translation. This would put information in the PO file about whether each PO message is finalized yet. This is useful for partial or slushy freezes, where you want to notify translators of finished content, but you can’t commit to freezing all your deliverables.

    We discussed a data category for this in ITS 2.0, but it wasn’t ready in time. Other implementers are interested in this, so we will probably use a shared extension that works in other tools.

  • The ability to specify repeatable elements for multi-lingual documents. In ITS Tool 1.2.0, I added the ability to join multiple translations into a single multi-lingual file, such as those that GNOME now uses for AppData files. Unfortunately, you currently have no control over which elements are repeated with distinct xml:lang attributes. Right now, ITS Tool assumes that whatever elements it uses for segmentation are the repeatable elements. But in files like AppData files, you may want to segment on the p elements, but repeat the description elements.

    I tried to bend the ITS 2.0 Target Pointer data category to support this use case, but ultimately decided it was too different and would needlessly complicate the specification.

  • HTML support. ITS 2.0 officially supports HTML5, and specifies exactly how ITS information is mapped to HTML. It does not, however, require implementations to support HTML5. For now, ITS Tool is just an XML tool, but I’d like to add support for HTML5. The libxml2 HTML parser doesn’t cut it, unfortunately, so I need to find something that does, and preferably something that I can map to libxml2’s data structures so I can use the same behind-the-scenes logic.
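The XInclude-following idea in the list above can be sketched in a few lines of Python. This is only an illustration of collecting the files that make up one deliverable, not ITS Tool code. The `files_for_deliverable` name is made up, and the sketch assumes local file paths and ignores xpointer and fallback handling.

```python
import os
import xml.etree.ElementTree as ET

XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"


def files_for_deliverable(path, seen=None):
    """Return path plus every XML file it XIncludes, directly or indirectly."""
    seen = [] if seen is None else seen
    path = os.path.normpath(path)
    if path in seen:
        return seen  # already visited; this also guards against include loops
    seen.append(path)
    base = os.path.dirname(path)
    for inc in ET.parse(path).getroot().iter(XI_INCLUDE):
        href = inc.get("href")
        # Skip text includes and includes without an href (same-document).
        if href and inc.get("parse", "xml") == "xml":
            files_for_deliverable(os.path.join(base, href), seen)
    return seen
```

Each file in the returned list could then be processed individually, as if it had been named on the command line, with the results going into a single PO file.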

If you’re interested in using ITS Tool for your XML document translation, get in touch. Leave a comment or email me at shaunm at gnome dot org. I’m always happy to help people set up better translation processes.

How do you decide what to write about? How do you organize what you’ve written? Often, your view will be shaped by the technology you’re used to. Mallard users will think of topics and guides. DITA users will turn to tasks and concepts. DocBook users will line up chapters and sections. All of these have their pros and cons, but there are three types of documents you need to stop writing.

README: Read you? I’ll read what I like, thankyouverymuch. What’s in this README file? Instructions for how to use the software? How to install it? Develop plugins for it? Start contributing to it? The answer to all of these is yes, and much more. I just don’t know what’s in there. All I know is that somebody thinks I should read it. Maybe.

TODO: I want to know what I can do with your software today, not tomorrow, and certainly not in your imagined tomorrow. By all means, keep a TODO list. Better yet, use an issue tracker. Software development is hard work, and we all need help keeping track of what we need to do. But don’t put it in your user documentation.

FAQ: I don’t care if my question is frequently asked. There are many ways you could organize information, but an FAQ isn’t a taxonomy. It’s a brain dump of answers to some questions somebody had. Worse, it’s often a brain dump of answers to questions the developers think you should ask. (Did anybody really ask how your project got its witty name?) Identify the valuable information in your FAQ and take the time to work it into the organizational structure of your documentation.

All of these are failures to identify the audience and organize information for them. A writer’s job doesn’t end with writing some words and just putting them somewhere. When writing and organizing, always think of where your reader is likely to look for the information you’re providing.

It has been decided that the third annual Passive Voice Day will be observed on April 26, 2013. Though previous years were observed April 27, it is thought that more participants can be found if Passive Voice Day is observed on a week day.

Passive Voice Day is observed by people around the world. The absurdity of the English language being tortured is enjoyed by these people. For one day, the passive voice is used exclusively in tweets, blogs, and casual conversation.

The hashtag #passivevoiceday should be used on Twitter and other social media, so that your passive voice sentences can be enjoyed by others.

Is it not known how the passive voice is used? Is a refresher needed? The information provided by Grammar Girl should be read.

Mallard development has been a bit dormant lately. A few features have trickled in over the last year, but the backlog of things to improve has been steadily growing. But things are looking up. I’m nearly done moving to my Linode, which will allow me to finally fix the broken mailing list archives. Then I can finalize some specifications and release actual packages for the schemas, thus ending this two-year yak-shaving exercise.

This post will highlight some of the back-burner Mallard projects that I hope to get traction on. To help the progress, I’m considering having a virtual quackfest where a handful of people work on specifications, tutorials, and implementations. You don’t have to be a programmer to get involved. Sometimes all we need are experienced Mallard users to give input and try new ideas. If you’re interested, leave a comment, or email me at shaunm at gnome dot org.

Here’s an overview of what I hope to address in the near future:

Mallard 1.1

Mallard 1.0 is finished, despite the admonition on the specification that it’s still a draft. We’ve gotten a lot of feedback, and seen what works and what doesn’t for extensions. (It works more than it doesn’t.) Mallard 1.1 will address that.

  • Support a tagging mechanism. The very rough Facets extension defines a tagging mechanism that it uses to match pages. But tagging has uses outside faceted navigation, so we should move this into the Mallard core.
  • Allow info elements in formal block elements. This allows you, for example, to provide credits for code snippets and videos. It’s also necessary to support the next bullet item:
  • Let formal block elements participate in automatic linking. People have asked to be able to link with a finer granularity than pages and sections. There are good implementation reasons why Mallard doesn’t allow arbitrary anchors, but I believe we can link to certain well-defined endpoints.
  • Allow section IDs to be optional. This is a common gotcha, and I think it’s a restriction we can relax.
  • Allow comments after sections. This is another common gotcha. Comments are just block content, and it doesn’t make much sense to put block content after sections. I think we can special-case comments.
  • Allow the links element to override the link role. This is a bit esoteric, but very useful in some cases.
  • Let informational links be one-way only. Sometimes it’s handy to opt out of automatic link reciprocation.
  • Provide a sort of static informational link type. This would allow you to assemble groups of links with no other semantics that you can still format with the links element.
  • Move hi out of experimental. Yelp has supported an experimental element to highlight some text for a very long time. It’s useful. It should be standard.
  • Allow link grouping for section links. I’m not sure on the best implementation for this yet, but the feature is useful.
  • Provide a generic div element with an optional block title. This is useful for extensions. We’d want to slightly redefine block fallback behavior to make this really useful. This is a somewhat backwards-incompatible change, but I think the risk is minimal.
  • Provide a way to do automatic links through tags. Sometimes you have a collection of pages that you want to link together. Mallard’s automatic links are one-to-one, so they make this case only marginally better. We may be able to hook into the tagging mechanism to do automatic links to all pages with a matching tag.
  • Allow multiple desc elements, with the exact same semantics as multiple informational titles.

Mallard UI

The Mallard UI extension is intended to hold extensions that add some user interactivity without additional semantics. Currently, expanders are fully defined and implemented. We have experimental implementations for media overlays and link thumbnails, and a plan for tabbed sections.

Mallard Sync

The Mallard Sync extension is planned to allow you to synchronize videos with text content. There are only rough ideas at this point. It will allow things like action links to seek in a video, showing and highlighting parts of the document as a video plays, and tables of contents for videos.

Mallard Conditionals

The Mallard Conditionals extension provides a runtime conditionals mechanism. Content can be conditionally shown based on things like the target platform, the reading environment, the supported Mallard features of the processing tool, and the language of the content. This is well-defined and fully implemented as it is. It just needs a thorough audit to finalize it.

There are other test token schemes that I’d like to work on:

  • Check the current page or section ID.
  • Check for page or section IDs that exist in the document.
  • Check the tag values for the page.

All of these help with reuse. They allow you to XInclude standard content that can adapt itself to different pages and documents.
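As a rough illustration of how a test built from such tokens might be evaluated, here is a small Python sketch. It assumes a test is a space-separated list of tokens that must all hold, with a leading `!` negating a token. The `evaluate_test` function and the `true_tokens` set are hypothetical stand-ins for whatever a real processing tool knows about its environment, and real language matching is more involved than a set lookup.

```python
def evaluate_test(test, true_tokens):
    """Return True if every token in the test holds.

    A token holds when it is in true_tokens, or, if it starts with '!',
    when the rest of the token is not in true_tokens.
    """
    for token in test.split():
        negated = token.startswith("!")
        name = token[1:] if negated else token
        if (name in true_tokens) == negated:
            return False
    return True


if __name__ == "__main__":
    # A viewer that supports install links, showing a Czech translation.
    environment = {"action:install", "lang:cs"}
    print(evaluate_test("action:install !lang:C !lang:en", environment))
```

New token schemes, such as checks on page IDs or tags, would only need to add entries to the set of tokens that hold for the current page.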

Mallard API

I did some work on an extension that allows you to format automatic links as API synopses when doing API documentation. I briefly mentioned this in my blog post API Docs on Mobile. This still needs a lot of work, and it needs input from people who are used to working with API documentation in different programming languages.

Mallard Glossaries

I blogged before about an extension to do automatic glossaries in Mallard. It’s been collecting dust for a while.

Faceted Navigation

I also blogged before about an extension to do faceted navigation in Mallard. It’s been collecting dust for an even longer while.

Mallard+TTML, Mallard+SVG, Mallard+MathML, Mallard+ITS

You can add W3C-standard formats like TTML, SVG, MathML, and ITS to your Mallard document. I’ve blogged about Mallard+TTML Video Captions, and there’s a tutorial on Mallard and SVG. These are all implemented, and they work extremely well thanks to Mallard’s well-defined extension mechanism. But they’d all be a lot better with a specification and a schema.

As you can see, there’s a lot to work on. Mallard was designed to be a platform from which we could explore new ideas for help. I think it’s proven itself in that regard. But as with any open source project, it needs an investment from people to keep driving it forward.

I’ve recently been talking with Petr Kovar about how to make language packs for videos work well with Mallard. Petr, Jakub Steiner, and others have been working on a video-intensive “Getting Started” document for GNOME. Videos, of course, can take up a lot of disk space very quickly, and the problem is compounded when we localize into dozens of languages, as we do in GNOME.

I suggested making language packs for videos. So, for example, the Czech videos would be in a package called gnome-getting-started-cz. But you can’t expect people to use the software center to install the language pack on their own before viewing some introductory videos. Fortunately, we have a mechanism to install packages directly from a help document, using install action links.

<note>
<p>Install a language pack to view videos in your language.</p>
<p><link action="install:gnome-getting-started-cz" style="button">Install</link></p>
</note>

This works nicely when viewed locally in Yelp, but it doesn’t work so well when the document is built to HTML for the web. We can use Mallard Conditionals to make the note only visible when install action links are available.

<if:if test="action:install" xmlns:if="http://projectmallard.org/if/1.0/">
<note>
<p>Install a language pack to view videos in your language.</p>
<p><link action="install:gnome-getting-started-cz" style="button">Install</link></p>
</note>
</if:if>

And while we’re at it, we really don’t want this note showing up when you view the original English source document, so we can refine the conditional with some language tokens:

<if:if test="action:install !lang:C !lang:en" xmlns:if="http://projectmallard.org/if/1.0/">
<note>
<p>Install a language pack to view videos in your language.</p>
<p><link action="install:gnome-getting-started-cz" style="button">Install</link></p>
</note>
</if:if>

This is almost right, except that we’ve hard-coded the package name for the Czech language pack. We want to be able to translate the package name in the action attribute. If you use itstool to translate your Mallard document with PO files, it turns out the package name will be in a translatable message, but embedded in markup in a way that translators won’t like:

msgid "<link action=\"install:getting-started-cz\" style=\"button\">Install</link>"

Worse yet, if you use Okapi to translate your document with XLIFF files, it won’t appear at all. Okapi and itstool are both based on the W3C Internationalization Tag Set (ITS), and this is a case where I think ITS really shines. We can use local overrides and embedded ITS rules to instruct these tools on exactly what to offer for translation.

For convenience, define these two namespace prefixes on the page element:

xmlns:mal="http://projectmallard.org/1.0/"
xmlns:its="http://www.w3.org/2005/11/its"

To make segmentation clearer (especially for itstool), mark the link as non-translatable. This makes sure the action attribute doesn’t just get segmented with the rest of the containing paragraph. But we do want to translate the content of the link, so add a span that is translatable:

<p><link action="install:gnome-getting-started-cz" style="button" its:translate="no">
<span its:translate="yes">Install</span></link></p>

With itstool, you’ll now get the nicer no-markup message in your PO file:

msgid "Install"

But now we want to be able to translate the action attribute. Of course, we can’t add an its:translate attribute to the attribute. XML just doesn’t work that way. So we have to use embedded global rules to mark it as translatable. And while we’re at it, we can also provide a localization note for translators. Put this in the info element of the page:

<its:rules version="1.0">
<its:translateRule selector="//mal:link/@action" translate="yes"/>
<its:locNoteRule selector="//mal:link/@action" locNoteType="description">
<its:locNote>Translate this to install:getting-started-LL, replacing LL
with your locale, only if there is a video translation pack for your
locale.</its:locNote>
</its:locNoteRule>
</its:rules>

You’ll now get this in your PO file:

#. Translate this to install:getting-started-LL, replacing LL with your
#. locale, only if there is a video translation pack for your locale.
msgid "install:getting-started-cz"

This is the kind of thing that’s possible when you have a dynamic help format, an integrated local help viewer, a run-time conditional processing system, and a translation process based on powerful industry standards. And it’s why I still love XML.
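To show roughly how embedded global rules select an attribute and attach a localization note, here is a small Python sketch. It is not how itstool or Okapi work internally. The `translatable_attributes` function is made up, the prefix-to-namespace table is hard-coded, and it only understands selectors of the form //prefix:element/@attribute, since anything more general needs a real XPath engine.

```python
import re
import xml.etree.ElementTree as ET

ITS = "{http://www.w3.org/2005/11/its}"
# Assumption: prefixes used in selectors are known up front.
PREFIXES = {"mal": "http://projectmallard.org/1.0/"}


def translatable_attributes(page_xml):
    """List (note, value) pairs for attributes that rules mark translatable."""
    page = ET.fromstring(page_xml)
    notes, selected = {}, []
    for rules in page.iter(f"{ITS}rules"):
        for rule in rules:
            selector = rule.get("selector")
            if not selector:
                continue
            if rule.tag == f"{ITS}translateRule" and rule.get("translate") == "yes":
                selected.append(selector)
            elif rule.tag == f"{ITS}locNoteRule":
                note = rule.find(f"{ITS}locNote")
                if note is not None:
                    # Collapse line breaks in the note into single spaces.
                    notes[selector] = " ".join((note.text or "").split())
    messages = []
    for selector in selected:
        match = re.fullmatch(r"//(\w+):(\w+)/@(\w+)", selector)
        if not match or match.group(1) not in PREFIXES:
            continue  # selector too complex for this sketch
        prefix, element, attribute = match.groups()
        for node in page.iter(f"{{{PREFIXES[prefix]}}}{element}"):
            if attribute in node.attrib:
                messages.append((notes.get(selector), node.get(attribute)))
    return messages
```

Each returned pair corresponds to one PO message: the note becomes the translator comment and the attribute value becomes the msgid.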