Websockets + socket.io on the ESP8266 w/ Micropython

I recently learned about the ESP8266 while at PyCon AU. It’s pretty nifty: it’s tiny, it has wifi, it has a reasonable amount of RAM (for a microcontroller) and, oh, it can run Python. Specifically Micropython. Anyway, I purchased a couple from Adafruit (specifically this one) and installed the Micropython UNIX port on my computer. (Be aware that the cheaper ESP8266 boards might not be very reflashable, or so I’ve been told; spend the extra money for one with decent flash.)

The first thing you learn is that the ports are all surprisingly different in terms of what functionality they support, and the docs don’t make this as clear as the CPython docs would. I learned the hard way that there is a set of docs per port, which is maybe why the method you’re looking for isn’t there.

The other thing is that even though you get to write in Python, and it has many Pythonic abstractions, many of those abstractions are based around POSIX and leak heavily on microcontrollers. Still, a number of them look implementable without actually reinventing UNIX (probably).

The biggest problem at the moment is that there’s no “platform independent” way to do asynchronous IO. On the microcontroller you can set top-half interrupt handlers for IO events (no malloc here, yay!), gate the CPU, and then execute the bottom halves from the main loop. However, that’s not going to work on UNIX. Or you can use select, but that’s not available on the ESP8266 (yet). Micropython does support Python 3.5 asyncio coroutines, so hopefully the port of asyncio to the ESP8266 happens soon. I’d be especially ecstatic if I could do await pin.trigger(Pin.FALLING).
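
For the curious, the top-half/bottom-half pattern on the microcontroller looks roughly like this (a minimal sketch only; the pin number and handler are illustrative, not from any real project):

import machine

button = machine.Pin(0, machine.Pin.IN, machine.Pin.PULL_UP)
pending = False

def top_half(pin):
    # Interrupt context: don't allocate, just record that the event happened
    global pending
    pending = True

button.irq(trigger=machine.Pin.IRQ_FALLING, handler=top_half)

while True:
    if pending:
        pending = False
        print("button pressed")  # bottom half: safe to do real work here
    machine.idle()  # gate the CPU until the next interrupt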

There are a few other things that could really help make it feel like Python. Why isn’t disabling interrupts a context manager/decorator? It’s great that you can try/finally your interrupt code, but the with keyword is so much more Pythonic. Perhaps this is because the code is being written by microprocessor people… which is why they’re so into protocols like MQTT for talking to their devices.
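
Something like this is what I have in mind (a sketch only, wrapping machine.disable_irq()/enable_irq(); it isn’t part of MicroPython itself):

import machine

class no_interrupts:
    """Disable interrupts for the duration of the with block."""

    def __enter__(self):
        self.state = machine.disable_irq()

    def __exit__(self, *exc):
        machine.enable_irq(self.state)

shared_counter = 0

with no_interrupts():
    # safe to touch data shared with an interrupt handler in here
    shared_counter += 1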

Don’t get me wrong, MQTT is a great protocol that you can cram onto all sorts of devices, with all sorts of crappy PHYs, but I have wifi, and working SSL. I want to do something more web 2.0. Something like websockets. In fact, I want to take another service’s REST API and websockets, and deliver that information to my device… I could build a HTTP server + MQTT broker, but that sounds like a pain. Maybe I can just build a web server with socket.io and connect to that directly from the device?!

The ESP8266 already has some very basic websocket support for its WebREPL, but it’s not very featureful and seems to implement only half of the spec. If we’re going to have Python on a device, maybe we can have something that looks like the great websockets module. Turns out we can!
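
Usage ends up looking something like this (a sketch in the style of the synchronous websockets API; see the uwebsockets README for the exact module layout, and the echo server URL is only for illustration):

import uwebsockets.client

websocket = uwebsockets.client.connect('ws://echo.websocket.org/')

websocket.send('Hello world')
print(websocket.recv())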

socket.io is a little harder: it requires a handshake which is not documented (I reversed it in the end), and decoding a HTTP payload, which is not very clearly documented (had to read the source). It’s not the most efficient protocol out there, but the chip is more than fast enough to deal with it. Also, fun times: it turns out there’s no platform independent way to return from waiting for IO. Basically it turned out there were a lot of yaks to shave.

Where it all comes into its own, though, is the ability to write what is pretty much everyday, beautiful Python; it’s worth it over Arduino sketches or whatever else takes your fancy.

uwebsockets/usocketio on Github.

Electronics breadboard with a project on it sitting on a laptop keyboard

Django and PostgreSQL composite types

PostgreSQL has this nifty feature called composite types that you can use to create your own types from the built-in PostgreSQL types. It’s a bit like hstore, only structured, which makes it great for structured data that you might reuse multiple times in a model, like addresses.

Unfortunately, to date they have been pretty much a pain to use in Django. There were some older implementations for versions of Django before 1.7, but they tended to do things like create surprise new objects in the namespace, not be migratable, and require a connection to the DB at all times (i.e. even during your build).

Anyway, after reading a bunch of their implementations and then the Django source code I wrote django-postgres-composite-types.

Install with:

pip install django-postgres-composite-types

Then you can define a composite type declaratively:

from django.db import models
from postgres_composite_types import CompositeType

class Address(CompositeType):
    """An address."""

    address_1 = models.CharField(max_length=255)
    address_2 = models.CharField(max_length=255)

    suburb = models.CharField(max_length=50)
    state = models.CharField(max_length=50)

    postcode = models.CharField(max_length=10)
    country = models.CharField(max_length=50)

    class Meta:
        db_type = 'x_address'  # Required

And use it in a model:

class Person(models.Model):
    """A person."""

    address = Address.Field()

The field should provide all of the things you need, including formfield etc., and you can even inherit this field to extend it in your own way:

class AddressField(Address.Field):
    def __init__(self, in_australia=True, **kwargs):
        self.in_australia = in_australia

        super(AddressField, self).__init__(**kwargs)

Finally, to set up the DB there is a migration operation that will create the type, which you can add:

import address
from django.db import migrations


class Migration(migrations.Migration):

    operations = [
        # Registers the type
        address.Address.Operation(),

        migrations.AddField(
            model_name='person',
            name='address',
            field=address.Address.Field(blank=True, null=True),
        ),
    ]

It’s not smart enough to add that operation itself (can you do that?). Nor would it be smart enough to write the operations to alter a type. That would be a pretty cool trick. But it’s useful functionality all the same, especially when the alternative is creating lots of 1:1 models that are hard to work with and hard to garbage collect.

It’s still pretty early days, so the APIs are subject to change. PRs accepted of course.

Filtering derived fields with Wagtail search

Wagtail’s built in search functionality has this nifty feature where it will index callables on your model, e.g. your model has a start and end date, but you want to search on duration. Hypothetically we could add a FilterField here to index this for Elasticsearch1.

from django.db import models
from wagtail.wagtailsearch import index  # wagtail.search in Wagtail 2+


class Job(models.Model, index.Indexed):
    """A job you can apply for."""

    start_date = models.DateField()
    end_date = models.DateField()

    search_fields = (
        index.FilterField('duration', type='IntegerField'),
    )

    def duration(self):
        """Duration of the assignment in months."""
        return 12 * (self.end_date.year - self.start_date.year) + \
            self.end_date.month - self.start_date.month

Wagtail is quite clever in that it takes a Django QuerySet and decomposes the filters.

queryset = queryset\
    .filter(duration__range=(data['duration'].lower or 0,
                             data['duration'].upper or 99999))
query = backend.search(keywords, queryset)

Of course Django will get upset about this, since that’s not a field you can filter on. So we can annotate the value in.

from django.db.models import F, Func, fields


class DurationInMonths(Func):  # pylint:disable=abstract-method
    """SQL function to calculate the duration of assignments in months."""
    template = \
        'EXTRACT(year FROM age(%(expressions)s)) * 12 + ' \
        'EXTRACT(month FROM age(%(expressions)s))'
    output_field = fields.IntegerField()


queryset = queryset\
    .annotate(duration=DurationInMonths(
        F('end_date'), F('start_date')
    ))\
    .filter(duration__range=(data['duration'].lower or 0,
                             data['duration'].upper or 99999))

Also Django will be upset it can’t annotate the duration, so you’ll need to add a setter to your model.

class Job(models.Model, index.Indexed):
    @property
    def duration(self):
        """Duration of the assignment in months."""
        return 12 * (self.end_date.year - self.start_date.year) + \
            self.end_date.month - self.start_date.month

    @duration.setter
    def duration(self, value):
        """Ignored to make Django annotations happy."""

And ideally you would be done, except now Wagtail gets upset that it can’t determine the attribute name for the field. You can kludge around this for now:

class DurationInMonths(Func):  # pylint:disable=abstract-method
    """SQL function to calculate the duration of assignments in months."""
    template = \
        'EXTRACT(year FROM age(%(expressions)s)) * 12 + ' \
        'EXTRACT(month FROM age(%(expressions)s))'
    output_field = fields.IntegerField()
    target = type('IntegerFieldKludge',
                  (fields.IntegerField,),
                  {'attname': 'duration'})()

  1. Note the requirement for a type, otherwise this will be indexed as a string.

Multiple choice using Django’s postgres ArrayField

There are a lot of times you want to have a multiple choice set of flags, and using a many-to-many database relation is massive overkill. Django 1.9 added a built-in model field to leverage Postgres’ built-in array support. Unfortunately the default formfield for ArrayField, SimpleArrayField, is not even a little bit useful (it’s a comma-delimited text input).

If you’re writing your own form, you can simply use the MultipleChoiceField formfield, but if you’re using something that builds forms using the ModelForm automagic factories with no overrides (e.g. Wagtail’s admin site), you need a way to specify the formfield.

Instead subclass ArrayField:

from django import forms
from django.contrib.postgres.fields import ArrayField

class ChoiceArrayField(ArrayField):
    """
    A field that allows us to store an array of choices.

    Uses Django 1.9's postgres ArrayField
    and a MultipleChoiceField for its formfield.
    """

    def formfield(self, **kwargs):
        defaults = {
            'form_class': forms.MultipleChoiceField,
            'choices': self.base_field.choices,
        }
        defaults.update(kwargs)
        # Skip our parent's formfield implementation completely as we don't
        # care for it.
        # pylint:disable=bad-super-call
        return super(ArrayField, self).formfield(**defaults)

You use this like ArrayField, except that choices is required.

FLAG_CHOICES = (('defect', 'Defect'),
                ('enhancement', 'Enhancement'))

flags = ChoiceArrayField(models.CharField(max_length=...,
                                          choices=FLAG_CHOICES),
                         default=list)

You can similarly use this with any other field that supports choices, e.g. IntegerField (but you’re not storing choices as integers… are you).
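
For instance, an IntegerField-backed version might look like this (a hypothetical sketch; the choices are made up):

PRIORITY_CHOICES = ((1, 'Low'),
                    (2, 'High'))

priorities = ChoiceArrayField(models.IntegerField(choices=PRIORITY_CHOICES),
                              default=list)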


Two new fixtures-related utilities for Django and Wagtail

During “constantly bettering” today I got around to forking two small tools I wrote out of their parent codebases and uploading them to GitHub/PyPI.

wagtailimporter is a Django app that reads Wagtail pages from a YAML file and imports them into the database. It’s designed to handle pages in a much neater way than DB fixtures, including things like foreign key lookups (via YAML objects) and inline support for StreamField. It’s sort of a work in progress, where I’m adding support for more things as needed.

django_loaddata_stdin is a very small utility that extends Django’s loaddata command to support reading from stdin. This is extremely useful when you have fragments of site config loaded into the database (e.g. Django sites, Allauth socialaccounts), but you don’t want this in a fixture file committed to revision control.

For instance, to deploy my site into Docker, where I don’t have site-config fixture files built into the containers:

docker-compose run web ./manage.py loaddata --format=yaml - << EOF
# (site-config fixture YAML goes here)
EOF

Redmine analytics with Python, Pandas and iPython Notebooks


We use Redmine at Infoxchange to manage our product backlog, using the Agile plugin from Redmine CRM. While this is generally good enough for making sure we don’t lose any cards, and has features like burndown charting, you still have to keep your own metrics. This is sort of annoying because you’re sure the data is in there somehow, and what do you do when you want to explore a new metric?

This is where iPython Notebooks and the data analysis framework Pandas come in. iPython Notebooks are an interactive web-based workspace where you can intermix documentation and Python code, and the graphs and tables you produce are output directly into the document. Individual “cells” of the notebook are cached, so you can work on part of the program and experiment (great for data analysis) without having to run big, slow number crunching or data download steps.

Pandas is a library for loading and manipulating data. It is based on the well-known numpy and scipy scientific packages and extends them to be able to load data from almost any file type or source (e.g. a Redmine CSV straight from your Redmine server) without having to know much programming. Your data becomes intuitively exposed. It also has built-in plotting and great integration with iPython Notebooks.

The first metric I wanted to collect was to study the relationship between our story estimates (in points), our tasking estimates (in hours) and reality. First let’s plot our velocity. It is a matter of saving a custom query whose CSV export URL I can copy into my notebook. This means I can re-run the notebook at any time to get the latest data.

Remember that iPython notebooks cache the results of cells, so I can put the Redmine access in its own cell and won’t have to constantly re-download the data. I can even work offline if I want.

import pandas
key = '...'  # your Redmine API key
data = pandas.read_csv('https://redmine/projects/devops/issues.csv'
                       '?query_id=111&key={key}'.format(key=key))

Pandas exposes the result as a data frame, which is something we can manipulate in meaningful ways. For instance, we can extract all of the values of a column with data['column name']. We can also select all of the rows that match a certain condition:

data[data['Target version'] != 'Backlog']

We can go a step further and group this data by the sprint it belongs to:

data[data['Target version'] != 'Backlog']\
    .groupby('Target version')

We can then even create a new series of data by summing the points column for each group to find our velocity (as_index means we wish to make the sprint version the row identifier for each row; this will be useful when plotting the data):

data[data['Target version'] != 'Backlog']\
    .groupby('Target version', as_index=True)\
    ['Story Points'].sum()

If you’ve entered each of these into an iPython Notebook and executed them you’ll notice that the table appeared underneath your code block.

We can use this data series to do calculations. For example to find the moving average of our sprint velocity we can do this:

velocity = data[data['Target version'] != 'Backlog']\
    .groupby('Target version', as_index=True)\
    ['Story Points'].sum()

avg_velocity = pandas.rolling_mean(velocity, window=5, min_periods=1)

We can then plot our series together (note the passing of ax to the second plot; this allows us to share the graphs on one plot):

ax = velocity.plot(kind='bar', color='steelblue', label="Velocity",
                   legend=True, title="Velocity", figsize=(12, 8))
avg_velocity.plot(ax=ax, color='r', style='.-',
                  label="Average velocity (window=5)", legend=True)

You can view the full notebook for this demo online. If you want to learn more about Pandas read 10 minutes to Pandas. Of course what makes iPython Notebooks so much more powerful than Excel, SPSS and R is our ability to use any 3rd party Python package we like. Another time I’ll show how we can use a 3rd party Python-Redmine API to load dataframes to extract metrics from the individual issues’ journals.

Using an SSL intermediate as your CA cert with Python Requests

Originally posted on ixa.io

We recently had to work around an issue integrating with a service that did not provide the full SSL certificate chain. That is to say it had the server certificate installed but not the intermediate certificate required to chain back up to the root. As a result we could not verify the SSL connection.

Not a concern, we thought: we can just pass the appropriate intermediate certificate as the CA cert using the verify keyword to Requests. This is, after all, what we do in test with our custom root CA.

response = requests.post(url, body, verify=settings.INTERMEDIATE_CA_FILE)

While this worked using curl and the openssl command, it continued not to work in Python. Python instead gave us a vague error about certificate validity. openssl was actually giving us the answer, though, by showing the root CA in the trust chain.

The problem, it turns out, is that you need to provide the path back to a root CA (the certificate not issued by somebody else). The openssl command does this by including the system root CAs when it considers the CA you supply, but Python considers only the CAs you provide in your new bundle. Concatenate the two certificates together into a bundle (the filenames here are illustrative):

cat intermediate.pem root.pem > ca-bundle.pem

Pass this bundle to the verify keyword.
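
That is, the same call as before, pointing verify at the combined bundle (reusing the illustrative filename from above):

response = requests.post(url, body, verify='ca-bundle.pem')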

You can read up about Intermediate Certificates on Wikipedia.

Running Django on Docker: a workflow and code

It has been an extremely long time between beers (10 months!). I’ve gotten out of the habit of blogging, and somehow I never blogged about the talk I co-presented at PyCon AU this year on Pallet and Forklift, the standard and tool we’ve developed at Infoxchange to help make it easier to develop web applications on Docker1.

Infoxchange is one of the few places I’m aware of that runs Docker in prod. If you’re looking at using Docker to do web development, it’s worth checking out what we’ve been doing over on the Infoxchange devops blog.

  1. There’s also Straddle Carrier, a set of Puppet manifests for loading Docker containers on real infrastructure, but they’ve not been released yet as they rely too much on our custom Puppet config.

lazy-loading class-based-views in Django

So one of the nice things with method-based views in Django is the ability to do this sort of thing to load a view at the path frontend.views.home:

urlpatterns = patterns(
    'frontend.views',

    url(r'^$', 'home', name='home'),
)

Unfortunately, if you’re using class-based-views, you can’t do this:

urlpatterns = patterns(
    'frontend.views',

    url(r'^$', 'HomeView', name='home'),
)

And instead you had to resort to importing the view and calling HomeView.as_view(). Sort of annoying when you didn’t want to import all of those views.

It turns out however that overloading the code to resolve HomeView is not that difficult, and we can do it with a pretty straightforward monkeypatch. This version uses the kwargs argument of url() to pass keyword arguments to as_view().

from django.conf import urls
from django.views.generic import View


class ClassBasedViewURLPattern(urls.RegexURLPattern):
    """
    A version of RegexURLPattern able to handle class-based-views.

    Monkey-patch it in to support class-based-views.
    """

    @property
    def callback(self):
        """Hook locating the view to handle class-based-views."""

        view = super(ClassBasedViewURLPattern, self).callback

        if isinstance(view, type) and issubclass(view, View):
            view = view.as_view(**self.default_args)
            self.default_args = {}

        return view

urls.RegexURLPattern = ClassBasedViewURLPattern
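
With the monkeypatch applied, the urlconf from earlier works as-is, and url()’s kwargs become keyword arguments to as_view() (HomeView and template_name here are purely illustrative):

urlpatterns = patterns(
    'frontend.views',

    url(r'^$', 'HomeView', name='home',
        kwargs={'template_name': 'home.html'}),
)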

Django utility methods (including New Relic deployment notification)

So we’ve moved to Github here at Infoxchange as our primary development platform because pull requests and Travis CI are much nicer than yelling across the room at each other1. To enable Travis to build our code, we’ve needed to move our little utility libraries to Github too. Since some of these were already on pip, it made sense to open source the rest of them too.

The most useful is a package called IXDjango, which includes a number of generally useful management commands for Django developers. Especially useful are deploy, which will run a sequence of other commands for deployment, and newrelic_notify_deploy, which you can use to notify New Relic of your deployment; this annotates all of your graphs with the version number.

We hope these are useful to people.

  1. Big shout out to both Github and Travis CI for supporting our not-for-profit mission with gratis private accounts.