One of the projects I’ve been working on lately is Ducktype, a lightweight syntax for Mallard. Mallard has a lot of strengths. Its automatic linking mechanisms make content organization easier. Its focus on independent topics makes content re-use possible. Its revision information and other metadata allow you to do status tracking on large content pools. It has a well-defined extension mechanism that allows you to add new functionality and embed external vocabularies like TTML, SVG, and ITS.

XML is the backbone that makes all of this possible. But XML is also what slows adoption. There’s a growing trend towards using lightweight formats to make it easier to contribute. But while lightweight formats make easy things easy, they tend to fall over when dealing with the issues that XML-based vocabularies are designed to solve.

The idea for a lightweight syntax for Mallard has floated around for a couple of years. I even spent some time trying to repurpose an existing lightweight format like reStructuredText or AsciiDoc, but none of them can carry the level of semantic information that Mallard needs.

Before going into details, let’s look at a Mallard page written in Ducktype:

= My First Topic
@link[guide >index]
@desc A short description of this page
@revision[date=2014-11-13 status=draft]

This is the first paragraph.
The paragraph continues here, but ends with the blank line below.

[steps]
* This is a steps list, common in Mallard.
* Without the [steps] declaration above, we'd get a normal bullet list.
  Indentation is significant, so this is still in the second item.
* Indentation is so significant that you can actually nest block elements.

  So this is a new paragraph still inside the third item.

  [note]
  And this is a paragraph in a note in the third item.

* You can also nest list items, or literally anything else.

  * This is a basic bullet list.
  * It is in the fourth item of the steps list.

This paragraph is outside the steps list.

One of Ducktype’s most distinctive features is that, as in Python, indentation matters. Indentation is how you stay inside a block element. Ducktype also lets you do everything you’d do inside a Mallard <info> element, which is crucial to pretty much all of Mallard’s compelling features.
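
To make the indentation rule concrete, here is a minimal Python sketch of the general idea. It is not the actual Ducktype parser: the names `Node` and `nest_by_indent` are made up for illustration, and it ignores block declarations like [steps], list markers, and inline markup. It only shows how "more indented than the line above" can be read as "still inside that block".

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One non-blank source line plus the lines nested beneath it (hypothetical type)."""
    text: str
    indent: int
    children: list["Node"] = field(default_factory=list)


def nest_by_indent(source: str) -> Node:
    """Build a tree in which a more deeply indented line belongs to the
    nearest preceding line with shallower indentation."""
    root = Node(text="<root>", indent=-1)
    stack = [root]
    for raw in source.splitlines():
        if not raw.strip():
            continue  # blank lines only separate paragraphs in this sketch
        indent = len(raw) - len(raw.lstrip(" "))
        node = Node(text=raw.strip(), indent=indent)
        # Leave every block that is at least as deeply indented as this line.
        while stack[-1].indent >= indent:
            stack.pop()
        stack[-1].children.append(node)
        stack.append(node)
    return root


if __name__ == "__main__":
    sample = (
        "* First item.\n"
        "* Second item.\n"
        "  Still in the second item.\n"
        "* Third item.\n"
        "\n"
        "  A new paragraph inside the third item.\n"
        "\n"
        "This paragraph is outside the list.\n"
    )
    for item in nest_by_indent(sample).children:
        print(item.text, [child.text for child in item.children])
```

A stack is enough here because each line can only close blocks that are still open above it, which is the same reason indentation-based nesting works in Python itself.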

Ducktype is guided by a few design principles, just as Mallard was so many years ago:

  1. It should be possible to do almost anything Mallard XML can do. You can arbitrarily nest block elements. You can have inline markup everywhere you need it, including in code blocks and in other inline markup. You can embed extensions and other vocabularies so that things like Mallard Conditionals and Mallard+TTML are possible. In fact, the only limitation I’ve yet encountered is that you can’t put attributes on page and section titles.
  2. The most commonly used Mallard features should be easy to use. Mallard pages tend to be short with rich content and a fair amount of metadata. Steps lists are common. Semantic inline content is common. Linking mechanisms are, unsurprisingly, extremely common. Credits are common. Revision info is common. Licenses are nearly always done with XInclude.
  3. There should be a minimal number of syntactical constructs. Most lightweight formats have shorthand, special-purpose syntax for everything. This makes it extremely difficult to support extension content without modifying the parser. And for any non-trivial content, it makes it difficult to remember which non-alphanumeric characters you have to escape when.
  4. For extra special bonus points, it should be possible to extend the syntax for special purposes. Lightweight syntaxes are popular in code comments for API documentation, and in API documentation you want shorthand syntax for the things you reference most often. For an object-oriented language, that’s classes and methods. For XSLT, it’s templates and parameters. By not gobbling up all the special characters in the core syntax, we make it possible to add shorthand inline notations by just loading a plugin into the parser.

There’s some discussion on the mallard-list mailing list, starting in August. And there’s a preliminary Ducktype parser up on Gitorious. You can also get it from PyPI with `pip install mallard-ducktype`. If you’re interested in docs, or ducks, or anything of the sort, please join the conversation. I always like getting more input.

3 Responses to “Ducktype: A Lightweight Syntax for Mallard”

  1. Peter Shinners Says:

    So is the @link[section] syntax the way to do cross-referenced links? That might be heavy for inline links between sections. If it supported something like `section`, that would be more tempting.

    I’m not a big fan of ReST. Markdown comes pretty close to what I like, but it is missing good tools for linking between documents.

    Perhaps inline cross-referenced links are not that important when writing most GNOME user documentation.

  2. shaunm Says:

    Links in the info section (the way you populate guide and do seealso links and such) would look like this:

    @link[guide >some_page_id]

    Think of everything in the [] as XML attributes. Then make “>” a shorthand for “xref=”, and make bare words a shorthand for “type=”. You could write it out if you’re into that sort of thing:

    @link[type=guide xref=some_page_id]

    Inline linking would use the inline syntax. Right now I’m using $ as a marker for inline. That could change. Feedback is awesome. So an inline link would look like this:

    For more information, click $link[>some_page_id](here).

    And if you just want the link text to be auto-populated from the linked-to page title, just this:

    For more information, see $link[>some_page_id].

    The inline syntax is certainly heavier than other lightweight formats, though it is comparable to using macros in AsciiDoc or text roles in reStructuredText, which is what you have to do to use full semantics.

    Maybe it’s worth adding special syntax just for links. I do have different design goals than something like Markdown though. I’m trying to strike the right balance between being easy to write and allowing unfettered access to all of Mallard’s features. People who just want to write simple pages with minimal markup should just use Markdown. It’s a great tool for the problem space that it serves.

  3. shaunm Says:

    (Also, sorry for the late comment approval and reply. For some reason, I’m not getting email notifications from WordPress.)
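
The attribute shorthand and inline link syntax described in the second comment can be sketched in Python. This is an illustration only, not the real parser: `expand_attrs` and `inline_to_xml` are hypothetical names, attribute values are assumed to contain no spaces or quotes, nested inline markup is not handled, and the $ marker is the provisional one mentioned above.

```python
import re
from xml.sax.saxutils import escape, quoteattr


def expand_attrs(shorthand: str) -> dict[str, str]:
    """Expand the text between [ and ] into explicit attribute names and values."""
    attrs: dict[str, str] = {}
    for token in shorthand.split():
        if token.startswith(">"):
            attrs["xref"] = token[1:]      # ">page" is shorthand for xref=page
        elif "=" in token:
            name, _, value = token.partition("=")
            attrs[name] = value            # already written out in full
        else:
            attrs["type"] = token          # a bare word is shorthand for type=
    return attrs


# $name[attributes] optionally followed by (content); assumed form, no nesting.
INLINE = re.compile(r"\$(\w+)\[([^\]]*)\](?:\(([^)]*)\))?")


def inline_to_xml(text: str) -> str:
    """Rewrite $name[...](...) markers as XML elements, escaping everything else."""
    out = []
    pos = 0
    for match in INLINE.finditer(text):
        out.append(escape(text[pos:match.start()]))
        name, attr_src, content = match.groups()
        attr_text = "".join(
            f" {key}={quoteattr(value)}"
            for key, value in expand_attrs(attr_src).items()
        )
        if content is None:
            out.append(f"<{name}{attr_text}/>")  # no link text given
        else:
            out.append(f"<{name}{attr_text}>{escape(content)}</{name}>")
        pos = match.end()
    out.append(escape(text[pos:]))
    return "".join(out)


if __name__ == "__main__":
    print(expand_attrs("guide >some_page_id"))
    print(inline_to_xml("For more information, click $link[>some_page_id](here)."))
    print(inline_to_xml("For more information, see $link[>some_page_id]."))
```

Because the shorthand only ever expands to ordinary attributes, the written-out form `type=guide xref=some_page_id` and the short form `guide >some_page_id` produce the same result in this sketch.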

