I build a number of sites with Pintail, including projectmallard.org and yelp.io. Some of these sites I build and upload manually. Others are hooked up to continuous deployment. I’ve been using python-github-webhooks on my server, which is a very simple tool to receive GitHub notifications and do stuff in response. What I was doing in response was building sites with Pintail.

The problem with this approach is that GitHub wants endpoints to respond within 30 seconds. And although building Mallard with Pintail is fast, there are things you don’t want to block on. In particular, you don’t want network operations to hold things up. At the very least, building requires updating one git repository, and possibly more. The seemingly simple yelp.io configuration pulls in two more git repositories. (Yes, it’s that easy.)

So I needed a job queue. I don’t want to just background the build tasks, because then I could end up starting a new build before a previous build finished, and down that path lies madness. I looked into using AMQP queues or using a full-blown CI tool like Buildbot. But I wanted something simple that didn’t involve a lot of new software on my servers. (Side note: I’m building a handful of relatively small sites with fairly low traffic. If you’re doing more, go use a tool like Buildbot and ignore the rest of this post.)

What I finally decided to do was to manage a simple build queue with mkfifo. I have a program that creates a FIFO and reads from it indefinitely, triggering builds when it receives data. Slightly stripped down version:

wdir=/var/pintail

rm -f "$wdir/queue"
mkfifo "$wdir/queue"
chmod a+w "$wdir/queue"

while read repo <"$wdir/queue"; do
    if [ "x$repo" = "xyelp.io" ]; then
        git='https://github.com/projectmallard/yelp.io.git'
    # Other sites get elif statements here.
    else
        continue
    fi
    if [ ! -d "$wdir/$repo" ]; then
        (cd "$wdir" && git clone "$git")
    else
        (cd "$wdir/$repo" && git pull -r)
    fi

    outdir="$repo"-$(date +%Y-%m-%d)-$(uuidgen)
    mkdir -p "/var/www/$outdir"
    (cd "$wdir/$repo" &&
        LANG=en_US.utf-8 scl enable python33 -- pintail build -v -o "/var/www/$outdir" &&
        cd "/var/www" &&
        ln -sf "$outdir" "$repo".new &&
        mv -T "$repo".new "$repo"
    ) 2>&1 >> "$wdir/$repo"-log
done   

Now the only thing my hook endpoints actually do is write a line to the FIFO. Importantly, the build process only looks for known strings in the FIFO, and ignores any other input. It doesn’t, for example, execute arbitrary commands placed in the FIFO. So the worst an attacker could do is trigger builds (potentially resulting in a DoS).

This script has one other trick: It uses symlinks to atomically update sites. The actual built site is in a unique directory named with the actual site name, the date, and a uuid. The actual directory pointed to by my httpd config files is a symlink. Overwriting a symlink with mv -T is an atomic operation, so your site is never half-updated or half-broken. This is a trick I learned at a previous employer, where it was very very important that our very very large documentation site was updated exactly as our release announcement went out.

Lately I’ve been working on Pintail, a documentation site generator built on top of Mallard, Yelp, and the various other tools we’ve developed over the years. Pintail grew out of the tool that used to build projectmallard.org from Mallard sources. But it’s grown a lot to be able to handle general documentation sites. I want GNOME to be able to use Pintail for its documentation site. I want other projects to be able to use it too.

One of the more compelling features, and something many documentation site generators don’t handle, is that Pintail can pull in different git repositories for different directories. Small projects can get away with having all their docs in one repository. Large projects like GNOME can’t. Here’s a snippet of what the configuration for help.gnome.org might look like:

[/users/gnome-help/stable/]
git_repository = git://git.gnome.org/gnome-user-docs
git_branch = master
git_directory = gnome-help/C/

Pintail’s native format is Mallard, but you can add in support for other formats pretty easily. There’s Docbook support, for example, and I’d like to add AsciiDoc support using asciidoctor-mallard.

There are two major features I hope to have ready soon. Both of them are available as Summer of Code projects. First, documentation sites obviously need search. But it’s not enough to just search the whole site. You want to be able to search within specific documents, or specific versions of documents. I’ve actually already got some indexing code in Pintail using Elasticsearch as a backend.

Second, Pintail needs to be able to handle localizations. I’ve put a lot of work into documentation internationalization over the years. It’s important, and everything I work on will continue to support it. I have some ideas on how this will work.

If you need to build a documentation site, give Pintail a try. I’m building a few sites with it already, but I’d love to get input from people with different needs.