Ducktype: A Lightweight Syntax for Mallard

One of the projects I’ve been working on lately is Ducktype, a lightweight syntax for Mallard. Mallard has a lot of strengths. Its automatic linking mechanisms make content organization easier. Its focus on independent topics makes content re-use possible. Its revision information and other metadata allow you to do status tracking on large content pools. It has a well-defined extension mechanism that allows you to add new functionality and embed external vocabularies like TTML, SVG, and ITS.

XML is the backbone that makes all of this possible. But XML is also what slows adoption. There’s a growing trend towards using lightweight formats to make it easier to contribute. But while lightweight formats make easy things easy, they tend to fall over when dealing with the issues that XML-based vocabularies are designed to solve.

The idea for a lightweight syntax for Mallard has floated around for a couple years. I even spent some time trying to repurpose an existing lightweight format like reStructuredText or AsciiDoc, but none of them are able to carry the level of semantic information that Mallard needs.

Before going into details, let’s look at a Mallard page written in Ducktype:

= My First Topic
@link[guide >index]
@desc A short description of this page
@revision[date=2014-11-13 status=draft]

This is the first paragraph.
The paragraph continues here, but ends with the blank line below.

[steps]
* This is a steps list, common in Mallard.
* Without the [steps] declaration above, we'd get a normal bullet list.
  Indentation is significant, so this is still in the second item.
* Indentation is so significant that you can actually nest block elements.

  So this is a new paragraph still inside the third item.

  [note]
  And this is a paragraph in a note in the third item.

* You can also nest list items, or literally anything else.

  * This is a basic bullet list.
  * It is in the fourth item of the steps list.

This paragraph is outside the steps list.

One of the most distinguishing features is that, like Python, indentation matters. Indentation is how you stay inside a block element. Ducktype also allows you to do everything you’d do inside a Mallard <info> element, which is crucial to pretty much all of the compelling features of Mallard.

Ducktype is guided by a few design principles, just as Mallard was so many years ago:

  1. It should be possible to do almost anything Mallard XML can do. You can arbitrarily nest block elements. You can have inline markup everywhere you need it, including in code blocks and in other inline markup. You can embed extensions and other vocabularies so that things like Mallard Conditionals and Mallard+TTML are possible. In fact, the only limitation I’ve yet encountered is that you can’t put attributes on page and section titles. This means that Ducktype is capable of serving as a non-XML syntax for virtually any XML vocabulary.
  2. The most commonly used Mallard features should be easy to use. Mallard pages tend to be short with rich content and a fair amount of metadata. Steps lists are common. Semantic inline content is common. Linking mechanisms are, unsurprisingly, extremely common. Credits are common. Revision info is common. Licenses are nearly always done with XInclude.
  3. There should be a minimal number of syntactical constructs. Most lightweight formats have shorthand, special-purpose syntax for everything. This makes it extremely difficult to support extension content without modifying the parser. And for any non-trivial content, it makes it difficult to remember which non-alphanumeric characters you have to escape when.
  4. For extra special bonus points, it should be possible to extend the syntax for special purposes. Lightweight syntaxes are popular in code comments for API documentation, and in API documentation you want shorthand syntax for the things you reference most often. For an object-oriented language, that’s classes and methods. For XSLT, it’s templates and parameters. By not gobbling up all the special characters in the core syntax, we make it possible to add shorthand inline notations by just loading a plugin into the parser.

There’s some discussion on the mallard-list mailing list, starting in August. And there’s a preliminary Ducktype parser up on Gitorious. You can also get it from PyPI with `pip install duck`. If you’re interested in docs, or ducks, or anything of the sort, please join the conversation. I always like getting more input.

Creative Commons Attribution 3.0 United States
This work by Shaun McCance is licensed under a Creative Commons Attribution 3.0 United States.