Generating JSON from SQLAlchemy objects

I had to put together a small web app the other day, using SQLAlchemy and Flask. Because I hate writing the same code multiple times when there's a better way, I wanted to be able to serialise SQLAlchemy ORM objects straight to JSON.

I decided on an approach where, taking a leaf out of JavaScript's book, I would optionally implement a tojson() method on a class, which I would then attempt to call from my JSONEncoder.[1]

It turns out to be relatively simple to extend SQLAlchemy's declarative base class to add additional methods (we can also use this as an excuse to implement a general __repr__()).

from sqlalchemy.ext.declarative import declarative_base as real_declarative_base

# Let's make this a class decorator
declarative_base = lambda cls: real_declarative_base(cls=cls)

@declarative_base
class Base(object):
    """
    Add some default properties and methods to the SQLAlchemy declarative base.
    """

    @property
    def columns(self):
        return [ c.name for c in self.__table__.columns ]

    @property
    def columnitems(self):
        return dict([ (c, getattr(self, c)) for c in self.columns ])

    def __repr__(self):
        return '{}({})'.format(self.__class__.__name__, self.columnitems)

    def tojson(self):
        return self.columnitems

We can then define our tables in the usual way:

class Client(Base):
    __tablename__ = 'client'

    # column definitions go here, e.g. (hypothetically)
    # id = Column(Integer, primary_key=True)

You can obviously replace any of the methods in your subclass, if you don’t want to serialise the whole thing. Bonus points for anyone who wants to extend this to serialise one-to-many relationships.
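Taking my own bait on the one-to-many case, here is one possible sketch, reusing the Base class from above. The Order class, its columns and the nested orders list are all invented for illustration, not part of the original app:

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import relationship, sessionmaker

try:
    # SQLAlchemy >= 1.4
    from sqlalchemy.orm import declarative_base as real_declarative_base
except ImportError:
    from sqlalchemy.ext.declarative import declarative_base as real_declarative_base

declarative_base = lambda cls: real_declarative_base(cls=cls)

@declarative_base
class Base(object):
    @property
    def columns(self):
        return [c.name for c in self.__table__.columns]

    @property
    def columnitems(self):
        return dict([(c, getattr(self, c)) for c in self.columns])

    def tojson(self):
        return self.columnitems

class Order(Base):
    __tablename__ = 'order'
    id = Column(Integer, primary_key=True)
    item = Column(String)
    client_id = Column(Integer, ForeignKey('client.id'))

class Client(Base):
    __tablename__ = 'client'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    orders = relationship(Order, backref='client')

    def tojson(self):
        # start from the plain column dict, then nest the child rows
        d = self.columnitems
        d['orders'] = [o.tojson() for o in self.orders]
        return d
```

Overriding tojson() only on the parent keeps the default behaviour everywhere else; note that recursion needs care if the relationship is bidirectional, or the orders would serialise their client again.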

And what about calling the tojson() method? That’s easy, we can just provide our own JSONEncoder.

import datetime
import json

class JSONEncoder(json.JSONEncoder):
    """
    Wrapper class to try calling an object's tojson() method. This allows
    us to JSONify objects coming from the ORM. Also handles dates and datetimes.
    """

    def default(self, obj):
        if isinstance(obj, datetime.date):
            return obj.isoformat()
        try:
            return obj.tojson()
        except AttributeError:
            return json.JSONEncoder.default(self, obj)
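To sanity-check the encoder without a database, any object with a tojson() method will do; Point below is a made-up stand-in for an ORM object:

```python
import datetime
import json

class JSONEncoder(json.JSONEncoder):
    """Try an object's tojson() method, handle dates, else fall back."""

    def default(self, obj):
        if isinstance(obj, datetime.date):
            return obj.isoformat()
        try:
            return obj.tojson()
        except AttributeError:
            return json.JSONEncoder.default(self, obj)

class Point(object):
    # stand-in for an ORM object: anything with a tojson() works
    def __init__(self, x, y, when):
        self.x, self.y, self.when = x, y, when

    def tojson(self):
        return {'x': self.x, 'y': self.y, 'when': self.when}

print(json.dumps(Point(1, 2, datetime.date(2013, 4, 1)), cls=JSONEncoder))
# {"x": 1, "y": 2, "when": "2013-04-01"}
```

Note how the encoder is applied recursively: the dict returned by tojson() contains a date, which gets picked up by default() on the next pass.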

Cutting-edge Flask provides a way to replace the default JSON encoder, but the version I got out of pip does not. This is relatively easy to work around, though, by replacing jsonify with our own version.

from flask import Flask

app = Flask(__name__)

def jsonify(*args, **kwargs):
    """
    Workaround for Flask's jsonify not allowing replacement of the JSONEncoder
    in my version of Flask.
    """

    return app.response_class(json.dumps(dict(*args, **kwargs),
                                         cls=JSONEncoder),
                              mimetype='application/json')

If you do have a newer Flask, where you don’t have to replace jsonify, you can also inherit from Flask’s JSONEncoder, which already handles things like datetimes for you.

  1. The tojson() method actually returns a Python dict understandable by JSONEncoder

elevation data and APIs

So I found a bit of time to hack on my project today. Today's task was to load and validate Geoscience Australia's SRTM digital elevation model data, which I downloaded from their elevation data portal last week.[1] The data is Creative Commons, so I might just upload it somewhere, if I can find a place for 2GB of elevation data.

This let me load elevation data in a 3 arcsecond (about 100m) grid, which I did using the ubiquitous GDAL via its Python API. Initial code is here. It doesn't do anything super clever yet, like checking and normalising the projection, because I don't need to.[2]

Looking at plots of values can give you a gist of what’s what (oh look, it goes out to sea, and then the data is masked out) but it doesn’t really validate anything. I could do validation runs against my GPS tracks, but for a first pass, I decided it would be easier to validate using Google’s Elevation API. This is a pretty neat web service that you make a request to, and it gives you back some JSON (or XML). There are undoubtedly Python APIs to access this, but it’s pretty easy to do a simple call with urllib2 or httplib. I chose to reuse my httplib Client wrapper from my RunKeeper/HealthGraph API. I wrote it directly in the test.
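For the curious, the Elevation API call really does boil down to building a URL and decoding the JSON that comes back. Here's a sketch in modern Python; the sample response below is canned for illustration, and these days Google requires an API key on the request:

```python
import json
from urllib.parse import urlencode

ELEVATION_ENDPOINT = 'https://maps.googleapis.com/maps/api/elevation/json'

def elevation_url(points):
    # points is a list of (lat, lng) pairs; the API takes them '|'-separated
    locations = '|'.join('%.6f,%.6f' % p for p in points)
    return ELEVATION_ENDPOINT + '?' + urlencode({'locations': locations})

# a canned response in the API's format (not real output)
sample = json.loads("""
{"status": "OK",
 "results": [{"elevation": 89.5,
              "location": {"lat": -37.8, "lng": 144.96},
              "resolution": 152.7}]}
""")
elevations = [r['elevation'] for r in sample['results']]
```

Fetching the URL itself is a one-liner with urllib (or urllib2/httplib on the Python 2 of the era).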

For a real unit test, I would have probably calculated the residuals, and ensured they sat within some acceptable range, but I’m lazy, so instead I just plotted them together. Google’s data, you will notice, includes bathymetry, which is actually pretty neat.

SRTM v Google Elevation

  1. Note to the unwary: the portal seems to be buggy in Chrome?
  2. I did write a skeleton context manager for gdal.Open. I say skeleton because it doesn't actually do anything smart like turning errors from GDAL into Exceptions, because I didn't have any errors to handle.

Testing warnings with py.test

For those who like to add warnings to their Python code, and want to test those warnings actually happen in their unit tests, here are two techniques to do so, both based around fixtures/funcargs.

Firstly is the mechanism built into py.test using recwarn.

The second is to create a fixture that specifically enables warnings as exceptions, and combine that with pytest.raises, for instance:

import warnings

import pytest

@pytest.fixture
def warnings_as_errors(request):
    warnings.simplefilter('error')

    request.addfinalizer(lambda *args: warnings.resetwarnings())

def test_timers_warn(log, warnings_as_errors):
    # log is a fixture from the code under test; starting the same
    # timer twice is (illustratively) what triggers the warning
    log.start_timer('method')

    with pytest.raises(RuntimeWarning):
        log.start_timer('method')
The advantage of this second method is that you can pinpoint exactly which call raises the warning, without repeatedly having to check recwarn.
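Both techniques ultimately rest on the warnings filter being able to promote warnings to exceptions. Stripped of the pytest machinery, the mechanism looks like this (legacy_api() is a made-up example):

```python
import warnings

def legacy_api():
    warnings.warn('legacy_api() is deprecated', DeprecationWarning)
    return 42

# ordinarily the warning doesn't interrupt the call
with warnings.catch_warnings(record=True):
    warnings.simplefilter('always')
    assert legacy_api() == 42

# with the 'error' filter, the same warning raises as an exception
with warnings.catch_warnings():
    warnings.simplefilter('error')
    try:
        legacy_api()
        raised = False
    except DeprecationWarning:
        raised = True
    assert raised
```

The fixture above is just this 'error' filter, installed before the test and reset afterwards by the finalizer.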

Investigating cycling speed anomalies

So as I’ve spent the last year learning Melbourne as a cyclist, there’s been a few times where I’ve found myself on an absolutely staggering hill, only to go down it again, and worse find there was another way that totally avoided the hill; or I’ve chosen routes that subject me to staggering head winds only to be told I should have taken another route instead.

This got me thinking, with everyone tracking their cycles on their smartphones, why couldn’t I feed all of this data into a model, along with some data like NASA’s elevation grids, or the Bureau of Meteorology’s wind observations. As something to keep me occupied over Christmas, I started a little project on the plane to Perth.

It turns out RunKeeper has this handy API that lets you access everything stored there. Unfortunately it seems that no one has really written a good Python API for this, so I put one together.

Throw in a bit of NumPy, PyProj (to convert to rectilinear coordinates) and Matplotlib (to plot it) and you can get a graph that looks like this (which thankfully looks a lot like RunKeeper’s graph):
Speed Anomalies
If we do some long-window smoothing, we can get an idea of a cyclist's average speed and then calculate a percentage anomaly from this average speed. This lets us compensate for different cyclists, how tired they are, or whether they're riding with someone else.[1]
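The smoothing-and-anomaly step might look something like the following NumPy sketch; the moving-average baseline and the window length are illustrative choices, not necessarily what my code does:

```python
import numpy as np

def speed_anomaly(speeds, window=51):
    """Percentage anomaly of each sample from a long-window average speed."""
    kernel = np.ones(window) / window
    # moving-average baseline (edges are distorted by zero padding)
    baseline = np.convolve(speeds, kernel, mode='same')
    return (speeds - baseline) / baseline
```

A constant-speed ride gives an anomaly near zero away from the track's ends; a sprint shows up as a positive excursion against that rider's own baseline.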

If we then do this for lots of tracks and grid the results based on whether the velocity vector at each point is headed towards or away from Melbourne,[2] we can get spatial plots that look like this (blue is -1 and red is 1):
Directional Speed Anomalies
If you squint at the graphs you can sort of see that there are many places where the blue/red are inverted, which is promising: it means something was making us faster one way and slower the other (a hill, the wind or the pub). You can also see that I still don't really have enough data; I tend to always cycle the same routes. If I want to start considering factors that are highly temporally variable, like wind, I'm going to need a lot more data to keep the number of datapoints (I always want to call this fold) high in my temporal bins.

The next step I suppose is to set up the RunKeeper download as a web service, so people can submit their RunKeeper tracks to me. This means I’m going to have to fix up some hard coded assumptions in the code, like the UTM zone for rectilinear projection, and what constitutes an inbound or an outbound route. Unsurprisingly this has become a lot more ambitious than a summer project.

If you feel like having a play, there is source code.

  1. This does have the side effect of reducing the signal from head/tail winds, especially on straight trips, I need to think about this more.
  2. We need this, otherwise the velocity anomaly would average out depending on which direction we're headed up/down a hill.

Extending Selenium with jQuery

Last week I wrote about combining Selenium and py.test and I promised to also talk about my function find_elements_by_jquery().

Selenium by default can find elements by id, CSS selector and XPath, but I often find I already know the query as a jQuery selector, and so frequently it’s easiest just to use that.

We start by overloading the Selenium webdriver. Since the webdriver is exposed through several classes (one per web browser), we do this in a particularly meta way.

from selenium.webdriver.remote.webelement import WebElement
from selenium.common.exceptions import InvalidSelectorException

def MyWebDriver(base, **kwargs):
    return type('MyWebDriver', (_MyWebDriver, base), kwargs)

class _MyWebDriver(object):
    def create_web_element(self, element_id):
        return MyWebElement(self, element_id)

    def find_elements_by_jquery(self, jq):
        return self.execute_script('''return $('%s').get();''' % jq)

    def find_element_by_jquery(self, jq):
        elems = self.find_elements_by_jquery(jq)
        if len(elems) == 1:
            return elems[0]
        else:
            raise InvalidSelectorException(
                "jQuery selector returned %i elements, expected 1" % len(elems))
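The meta part is the type() call, which manufactures a subclass of both our mixin and whichever browser class you pick, at runtime. A browser-free illustration of the same trick, with FakeDriver standing in for webdriver.Firefox and friends:

```python
class _MyDriver(object):
    # mixin providing the extra behaviour
    def shout(self):
        return self.name.upper()

def MyDriver(base, **kwargs):
    # build a subclass of both the mixin and the chosen base at runtime
    return type('MyDriver', (_MyDriver, base), kwargs)

class FakeDriver(object):
    # stands in for webdriver.Firefox, webdriver.Chrome, etc.
    name = 'fake'

driver = MyDriver(FakeDriver)()
# driver has both FakeDriver's attributes and the mixin's methods
```

Because the mixin comes first in the bases tuple, its methods (like create_web_element above) take precedence over the browser class's own.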

We then do a similar implementation for the webelement:

class MyWebElement(WebElement):
    def __repr__(self):
        """Return a pretty name for an element"""

        id = self.get_attribute('id')
        class_ = self.get_attribute('class')

        if len(id) > 0:
            return '#' + id
        elif len(class_) > 0:
            return '.'.join([self.tag_name] + class_.split(' '))
        else:
            return self.tag_name

    def find_elements_by_jquery(self, jq):
        return self.parent.execute_script(
            '''return $(arguments[0]).find('%s').get();''' % jq, self)

    def find_element_by_jquery(self, jq):
        elems = self.find_elements_by_jquery(jq)
        if len(elems) == 1:
            return elems[0]
        else:
            raise InvalidSelectorException(
                "jQuery selector returned %i elements, expected 1" % len(elems))

We can now pass in jQuery selectors, for instance b.find_element_by_jquery('#region option:selected'), or form.find_elements_by_jquery(':input'). It's especially powerful when all of your DOM manipulation already works in terms of jQuery selectors.

As an added bonus, overloading the classes lets us add functionality like Firebug style element names (MyWebElement.__repr__) or wrap things like the Wait utility into the webdriver, e.g.

from selenium.webdriver.support.ui import WebDriverWait as Wait
from selenium.common.exceptions import TimeoutException

class FrontendError(Exception):
    pass

# class _MyWebDriver...
    def wait(self, event, timeout=10):
        try:
            Wait(self, timeout).until(event)
        except (TimeoutException, FrontendError) as e:
            # do we have an error dialog?
            dialog = self.find_element_by_id('error-dialog')
            if dialog.is_displayed():
                content = dialog.find_element_by_id('error-dialog-content')
                raise FrontendError(content.text)
            else:
                raise e

Combining py.test and Selenium to test webapps

Recently I started adding unit and acceptance tests to a webapp using Selenium, integrated into the existing py.test framework that tests the backend code.

py.test fixtures make using Selenium, via its Python bindings, really straightforward. Here’s how I did it.

First I put all the Selenium related tests in a tests/selenium/ directory. I then created tests/selenium/conftest.py and wrote a fixture to allow tests to access a single instance of the webdriver for the entire session:

import os

import pytest
from selenium import webdriver

browsers = {
    'firefox': webdriver.Firefox,
    'chrome': webdriver.Chrome,
}

@pytest.fixture(params=browsers.keys(), scope='session')
def driver(request):
    if 'DISPLAY' not in os.environ:
        pytest.skip('Test requires display server (export DISPLAY)')

    b = browsers[request.param]()

    request.addfinalizer(lambda *args: b.quit())

    return b

Note that we’re able to parameterise the fixture so that it runs with multiple browsers. We then add a per-function fixture that sets up the session for an individual test:

@pytest.fixture
def b(driver, url):
    b = driver
    b.set_window_size(1200, 800)
    b.get(url)  # load the app under test

    return b

A fixture can refer to other fixtures of more generic scope. So url is a fixture that accesses the optional --url command-line option:

def pytest_addoption(parser):
    parser.addoption('--url', action='store')

@pytest.fixture(scope='session')
def url(request):
    return request.config.option.url

These fixtures are available for all tests in that package. Tests have the form:

def test_badger(b):
    # test goes here
    pass

We can also create per-module fixtures, that optionally inherit our generic fixtures. Say for example we want to run a number of tests (e.g. for WCAG 2.0 compliance) on a number of parameterised instances of the set-up webapp. We might do this in the test module itself:

import pytest

@pytest.fixture(scope='module')
def wcag(driver, url):
    """
    Set up a single session for these tests.
    """

    b = driver
    b.set_window_size(1200, 800)

    # do stuff here with Selenium to set up webapp
    return b

We can now write tests[1] in this module, e.g.

def test_unique_ids(wcag):
    """
    All ids in the document should be unique.
    """

    elems = wcag.find_elements_by_jquery('[id]')
    ids = map(lambda e: e.get_attribute('id'), elems)

    assert len(elems) >= 1  # sanity check
    assert util.unique(ids)

Again, we can parameterise this fixture to set up the webapp in a number of different ways. Note that we have to use driver as our fixture, not b. This is because we can only refer to fixtures more general in scope than the one we are writing.

  1. find_elements_by_jquery() is a method I've added in an extension of Selenium's webdriver, and is a topic for another post.

Finding Ada — Elaine Miles

October 16th is Ada Lovelace Day, a day that showcases women in engineering, maths, science and technology by profiling a woman technologist, scientist, engineer or mathematician on your blog.

Elaine Miles portrait

This year I’m writing about Elaine Miles, a researcher at the Australian Bureau of Meteorology, who I met through mutual colleagues over lunch one day. She has since become one of my go-to people whenever I require a crash-course in something. She awesomely let me interview her for Ada Lovelace Day.

Miles is a physicist working at the Centre for Australian Weather and Climate Research (CAWCR), a joint project between the Bureau of Meteorology and the Commonwealth Scientific and Industrial Research Organisation (CSIRO), where she is investigating the use of dynamic models to predict sea level in the Western Pacific.

Miles studied Applied Mathematics and Physics at the University of Melbourne. She attributes her love of maths to primary school, where she recalls being chastised by her teacher for attempting the subtraction problems further on in the workbook before they had been taught subtraction. She says she would always lament when another class ran over and cut into the maths lesson. Her love of maths originally led her to enroll in Electrical Engineering, but she didn’t like the black box thinking that engineering encourages, preferring to understand concepts from first principles.

Completing a Bachelor in Applied Mathematics, she went on to do Honours in Physics, working on a project in art conservation, which it turns out is an extremely technical field. She built a laser interferometer using off-the-shelf parts (laser, CCD camera and a laptop) to monitor canvas artworks and detect the problems caused to art by changes in microclimate.

After teaching English in Japan for a year, Miles returned to Melbourne where she began her PhD (Miles is not related to Dr Elaine Miles the glass artist). Miles says she wanted to be learning or developing new things (plus she had unfinished business in art conservation) and so a PhD was the logical progression. Her PhD focused on two areas: the science of paint drying (literally watching paint dry she says) and subsurfacing imaging. She had a focus on south-east Asia, where Western art production techniques are prevalent but unsuitable because of the different climate.

Miles spent 3 months working with galleries in the Philippines where she used her own laser speckle interferometers to study artwork hanging in the gallery, in-situ. As far as she’s aware, studying art in-situ had never been done before. This work allows conservators to determine best practice and a course of action for storing and restoring works of art.

With her PhD close to being submitted, Miles began work at CAWCR where she first worked on data assimilation of weather balloon observations into weather forecasting models. She then moved on to verifying rainfall prediction models and getting weather radar data assimilated into the model.

For the last 10 months she has worked with the Pacific-Australia Climate Change Science and Adaptation Planning Program (PACCSAPP), where she investigates applying POAMA (Predictive Ocean Atmosphere Model for Australia), a dynamic, coupled ocean-atmospheric, multi-model ensemble global seasonal prediction model, to forecast global sea level anomalies 1-9 months in the future, specifically validating predictions with observations in the Western Pacific. This is the first time dynamic models have been used to predict medium-term sea level, and forms an extremely important part of helping the Pacific adapt to the immediate effects of climate change.

As for the future, Miles looks forward to getting her PhD submitted, but would like to continue working with sea level modelling. She hopes to start leading projects in Australia and around the world.

Elaine Miles verifying data

sort of in love with jQuery

After a bunch of years of hacking code with GTK+ and Clutter, doing a web app is pretty sweet. It's the first time I've ever really used jQuery properly (not just animating a little toggle), and I have to say, I'm sort of in love with it.[1]

I can’t show you the app I’m working on yet, there’s a beta online but it’s still undergoing scientific verification. Also my latest work, with animated transitions and everything, which feels like a much more high quality app, is not yet pushed to the production server.

Still, it’s so straightforward to use, I rewrote the skip next/prev post feature I have on Planet VegMel to be in jquery, which makes it about 50x smaller and much less brittle. If you want to add it to your own planet, grab the Javascript.

  1. I am, however, not in love with ExtJS

automatically protecting people with private browsing

A friend of mine recently suggested people donate to the Women’s Domestic Violence Crisis Service, whose website provides a quick escape button (like a boss button) for victims of abuse to get away from the site quickly. Unfortunately the button simply takes you to Google (via an image map, I wonder why?). It doesn’t manipulate your history.

For people in abusive relationships, leaving behind browser history saying they were accessing a site to get help could be dangerous (this is why there’s a quick escape button in the first place).

HTML5 includes a History API, which will let you manipulate the most recent history entry. You could use it to screw up all of the site's history, but that would be annoying for other people. It seems like the correct answer here is private browsing or incognito mode. While the warning page could include instructions on how to activate incognito mode for your browser (and offer to screw up the history), one wonders why HTML5 doesn't include a method to load a site privately/incognito, so websites could offer to stay out of your browser history. I'm sure it would prove very popular with porn sites, but it might also help to protect some people.