I build a number of sites with Pintail, including projectmallard.org and yelp.io. Some of these sites I build and upload manually. Others are hooked up to continuous deployment. I’ve been using python-github-webhooks on my server, which is a very simple tool to receive GitHub notifications and do stuff in response. What I was doing in response was building sites with Pintail.
The problem with this approach is that GitHub wants endpoints to respond within 30 seconds. And although building Mallard with Pintail is fast, there are things you don’t want to block on. In particular, you don’t want network operations to hold things up. At the very least, building requires updating one git repository, and possibly more. The seemingly simple yelp.io configuration pulls in two more git repositories. (Yes, it’s that easy.)
So I needed a job queue. I don’t want to just background the build tasks, because then I could end up starting a new build before a previous build finished, and down that path lies madness. I looked into using AMQP queues or using a full-blown CI tool like Buildbot. But I wanted something simple that didn’t involve a lot of new software on my servers. (Side note: I’m building a handful of relatively small sites with fairly low traffic. If you’re doing more, go use a tool like Buildbot and ignore the rest of this post.)
What I finally decided to do was to manage a simple build queue with mkfifo. I have a program that creates a FIFO and reads from it indefinitely, triggering builds when it receives data. Slightly stripped down version:
    wdir=/var/pintail
    # Recreate the FIFO on startup and let any user write to it.
    rm -f "$wdir/queue"
    mkfifo "$wdir/queue"
    chmod a+w "$wdir/queue"
    # Each read blocks until a writer sends a line.
    while read repo <"$wdir/queue"; do
        if [ "x$repo" = "xyelp.io" ]; then
            git='https://github.com/projectmallard/yelp.io.git'
        # Other sites get elif statements here.
        else
            # Unknown input is ignored.
            continue
        fi
        if [ ! -d "$wdir/$repo" ]; then
            (cd "$wdir" && git clone "$git")
        else
            (cd "$wdir/$repo" && git pull -r)
        fi
        outdir="$repo"-$(date +%Y-%m-%d)-$(uuidgen)
        mkdir -p "/var/www/$outdir"
        # Build into a fresh directory, then swap the symlink.
        # The file redirect comes first so 2>&1 sends stderr to the log too.
        (cd "$wdir/$repo" &&
            LANG=en_US.utf-8 scl enable python33 -- pintail build -v -o "/var/www/$outdir" &&
            cd "/var/www" &&
            ln -sf "$outdir" "$repo".new &&
            mv -T "$repo".new "$repo"
        ) >> "$wdir/$repo"-log 2>&1
    done
Now the only thing my hook endpoints actually do is write a line to the FIFO. Importantly, the build process only looks for known strings in the FIFO, and ignores any other input. It doesn’t, for example, execute arbitrary commands placed in the FIFO. So the worst an attacker could do is trigger builds (potentially resulting in a DoS).
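For illustration, here is a minimal sketch of what a hook endpoint's write to the FIFO could look like. This is not my actual hook code: the enqueue_build function and the QUEUE variable are names made up for this example, and the path is assumed to match the build script. Opening a FIFO for writing blocks until a reader has it open, so the sketch does the write in a background subshell to let the hook return right away.

```shell
#!/bin/sh
# Hypothetical helper for a webhook handler: queue a build by site name.
# QUEUE is an assumed variable; it defaults to the FIFO path used by
# the build script.
QUEUE="${QUEUE:-/var/pintail/queue}"

enqueue_build() {
    repo="$1"
    # Only write if the queue really is a FIFO. If the build program is
    # not running, a plain redirect would create a regular file instead.
    if [ ! -p "$QUEUE" ]; then
        echo "no build queue at $QUEUE" >&2
        return 1
    fi
    # Opening a FIFO for writing blocks until a reader opens it, so do
    # the write in a background subshell and return immediately.
    ( printf '%s\n' "$repo" > "$QUEUE" ) &
}
```

A hook for yelp.io would then just call enqueue_build yelp.io. Any other string would land in the FIFO too, but the build loop would skip it.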
This script has one other trick: It uses symlinks to atomically update sites. The actual built site is in a unique directory named with the actual site name, the date, and a uuid. The actual directory pointed to by my httpd config files is a symlink. Overwriting a symlink with mv -T is an atomic operation, so your site is never half-updated or half-broken. This is a trick I learned at a previous employer, where it was very very important that our very very large documentation site was updated exactly as our release announcement went out.
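The symlink swap can be pulled out into a small function to make the two steps easier to see. This is a sketch, assuming GNU coreutils (mv -T and ln -n are not in POSIX). The publish_site name and its arguments are hypothetical, not part of the build script.

```shell
#!/bin/sh
# Hypothetical helper: point the symlink $webroot/$site at a freshly
# built directory $outdir (a name relative to $webroot).
publish_site() {
    webroot="$1"
    site="$2"
    outdir="$3"
    # Create the new symlink under a temporary name first. -n keeps ln
    # from following a stale "$site.new" link left by a failed run.
    ln -sfn "$outdir" "$webroot/$site.new" || return 1
    # mv renames the temporary link over the live one in a single step.
    # -T treats the destination as a plain name, so mv replaces the
    # symlink instead of moving the new link into the directory it
    # points to.
    mv -T "$webroot/$site.new" "$webroot/$site"
}
```

Calling publish_site /var/www yelp.io "$outdir" would do the same swap as the last two commands in the build script. The previous build directory is left in place, so rolling back is a matter of pointing the symlink at it again.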