First thoughts on Red Hat OpenShift

I’m looking for a PaaS provider that isn’t going to cost me very much (or anything at all) and supports Flask and PostGIS. Based on J5’s recommendation in my blog the other day, I created an OpenShift account.

A free OpenShift account gives you three small gears[1], which are individual containers you can run an app on. You can either run an app on a single gear or have it scale to multiple gears with load balancing. You then install the components you need, which OpenShift refers to by the pleasingly retro name of cartridges. So for instance, Python 2.6 is one cartridge and PostgreSQL is another. You can either install all cartridges on one gear or on separate gears based on your resource needs[2].

You choose your base platform cartridge (e.g. Python-2.6) and you optionally give it a git URL to do an initial checkout from (which means you can deploy an app that is already arranged for OpenShift very fast). The base cartridge sets up all the hooks that run after a git push (you get a git remote that you can push to in order to redeploy your app). The two things you need are a root setup.py containing your pip requirements, and a wsgi/application file, which is a Python blob containing a WSGI object named application. For Python it uses virtualenv and all that awesome stuff. I assume for node.js you’d provide a package.json and it would use npm, similarly RubyGems for Ruby etc.
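As a rough illustration of what the wsgi/application file needs to contain, here is a minimal sketch of a WSGI callable named application. The response body is made up for the example, and for a Flask app you would more likely import your app object under that name instead (something like `from myapp import app as application`, where `myapp` is hypothetical).

```python
def application(environ, start_response):
    """Smallest useful WSGI app: always answers 200 with a plain-text body."""
    body = b'Hello from a gear\n'  # placeholder content for this sketch

    start_response('200 OK', [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ])

    # WSGI expects an iterable of byte strings
    return [body]
```

The server calls application once per request, passing the request environment and a callback for the status line and headers.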

There’s a nifty command line tool written in Ruby (what happened to Python-only Red Hat?) that lets you do all the sort of cloud managementy stuff, including reloading cartridges and gears, tailing app logs and SSHing into the gear. I think an equivalent of dbshell would be really useful based on your DB cartridge, but it’s not a big deal.

There are these deploy hooks you can add to your git repo to do things like create your databases. I haven’t used them yet, but again it would make deploying your app very fast.

There are also quickstart scripts for deploying things like WordPress, Rails and a Jenkins server onto a new gear. Speaking of Jenkins there’s also a Jenkins client cartridge which I think warrants experimentation.

So what’s a bit crap? Why isn’t my app running on OpenShift yet? Basically because the available cartridges are a little antique. The supported Python is Python 2.6, which I could port my app to; or there are community-supported 2.7 and 3.3 cartridges, so that’s fine for me (TBH, I thought my app would run on 2.6) but maybe annoying for others. There is no Celery cartridge, which is what I would have expected, ideally so you can farm tasks out to other gears, and although you apparently can use it, there’s very little documentation I could find on how to get it running.

Really though the big kick in the pants is there is no cartridge for Postgres 9.2/PostGIS 2.0. There is a community cartridge you can use on your own instance of OpenShift Origin, but that defeats the purpose. So either I’m waiting for new Postgres to be made available on OpenShift or backporting my code to Postgres 8.4.

Anyway, I’m going to keep an eye on it, so stay tuned.

  1. Small gears have 1GB of disk and 512MB of RAM allocated.
  2. I think if you have a load balancing (scalable) application, your database needs to be on its own gear so all the other gears can access it.
Posted in Uncategorized | 6 Comments

Extending geoalchemy through monkeypatching

I’ve been working on the data collection part of my cycle route modelling. I’m hoping that I can, as a first output, put together a map of where people are cycling in Melbourne. A crowd-sourced view of the best places to cycle, if you will. Given I will probably be running this in the cloud[1], I thought it was best to actually store the data in a GIS database, rather than lots and lots of flat files.

A quick Google turned up GeoAlchemy, which is a set of GIS extensions for SQLAlchemy. It provides lots of the standard things you want to do as methods on fields, but this is only a limited set of what you can do with PostGIS. Since I’m going to be wanting to do things like binning data, I thought it was worth figuring out how hard it was to call other PostGIS functions.

GeoAlchemy supports subclassing to create new dialects, but you have to subclass 3 classes, and it’s basically a pain in the neck when you just want to extend the functionality of the PostGIS dialect. Probably what I should do is submit a pull request with the rest of the PostGIS API as extensions, but I’m lazy. Hence, for the second time this week I am employing monkey patching to get the job done (and for the second time this week, kittens cry).

Functions in GeoAlchemy require two things, a method stub saying how we collect the arguments and the return (look at geoalchemy.postgis.pg_functions) and a mapping from this to the SQL function. Since we only care about one dialect, we can make this easier on ourselves by combining these two things. Firstly we monkeypatch in the method stubs:

from geoalchemy.functions import BaseFunction
from geoalchemy.postgis import pg_functions

@monkeypatchclass(pg_functions)
class more_pg_functions:
    """
    Additional functions to support for PostGIS
    """

    class length_spheroid(BaseFunction):
        _method = 'ST_Length_Spheroid'

Note the _method attribute which isn’t something used anywhere else. We can then patch in support for this:

from geoalchemy.dialect import SpatialDialect

@monkeypatch(SpatialDialect)
def get_function(self, function_cls):
    """
    Add support for the _method attribute
    """

    try:
        return function_cls._method
    except AttributeError:
        return self.__super__get_function(function_cls)

The monkeypatching functions look like this:

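def monkeypatch(*args):
    """
    Decorator to monkeypatch a function into one or more classes as a method
    """

    def inner(func):
        name = func.__name__

        for cls in args:
            # keep the original reachable as __super__<name> so the
            # replacement can still call it
            old = getattr(cls, name)
            setattr(cls, '__super__{}'.format(name), old)

            setattr(cls, name, func)

        # return the function so the decorated name isn't rebound to None
        return func

    return inner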


def monkeypatchclass(cls):
    """
    Decorator to monkeypatch a class as a baseclass of @cls
    """

    def inner(basecls):
        cls.__bases__ += (basecls,)

        return basecls

    return inner

Finally we can do queries like this:

>>> track = session.query(Track).get(1)
>>> session.scalar(track.points.length_spheroid('SPHEROID["WGS 84",6378137,298.257223563]'))
6791.87502950043
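If you want to see the two decorators at work without GeoAlchemy installed, here is a toy sketch. The classes (Plain, Shapes, MoreShapes, Greeter) are made up for the example and have nothing to do with GeoAlchemy. One assumption worth flagging: on Python 3, a class whose only base is object refuses to have its __bases__ reassigned, so the patched class here gets an intermediate base.

```python
def monkeypatch(*classes):
    """Patch a function into each class, keeping the old one as __super__<name>."""
    def inner(func):
        name = func.__name__
        for cls in classes:
            setattr(cls, '__super__{}'.format(name), getattr(cls, name))
            setattr(cls, name, func)
        return func
    return inner


def monkeypatchclass(cls):
    """Append the decorated class to the bases of cls."""
    def inner(basecls):
        cls.__bases__ += (basecls,)
        return basecls
    return inner


class Plain(object):
    """Intermediate base so Shapes.__bases__ can be reassigned."""


class Shapes(Plain):
    def area(self):
        return 'area'


@monkeypatchclass(Shapes)
class MoreShapes(object):
    def perimeter(self):
        return 'perimeter'


class Greeter(object):
    def greet(self, name):
        return 'Hello, ' + name


@monkeypatch(Greeter)
def greet(self, name):
    # special-case one name, otherwise defer to the original method
    if name == 'world':
        return 'Hi, world!'
    return self.__super__greet(name)
```

Shapes instances pick up perimeter() through the extra base class, and Greeter.greet falls back to the saved original for anything it doesn't special-case.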

Code on GitHub.

  1. Your recommendations for cloud-based services please: must be able to run Flask and PostGIS and be super cheap.

scratching my own itch [or: why I still love open source]

I’m visiting my parents for the long weekend. Sitting in the airport, I decided I should use my spare time to write some documentation, which meant connecting to work’s OpenVPN server for the first time. While it’s awesome that Network Manager can now import OpenVPN configs, it didn’t work because NM doesn’t support the crucial keysize parameter.

Rather than work around the problem, as some people have done (which would annoyingly break my other OpenVPNs), I used the fact that it’s open source to fix the problem properly.

My Dad asked if I was working. No, well, not really. I’m fixing the interface to my VPN client so I can connect to work’s VPN, I replied. Unglaublich! my father remarked. Not unbelievable, because it’s open source!


Review: GNOME 3 Application Development: Beginner’s Guide

The folk at Packt Publishing sent me an e-copy of GNOME 3 Application Development Beginner’s Guide a month or so ago.

I’ve been putting off this review because I don’t think this is a very good book and it’s hard to write bad reviews.

First off, the book’s JavaScript sections use Seed. I think this is an unconventional choice given that the shell and most of GNOME use gjs. It has been my experience with the JavaScript bindings that gjs is significantly more mature, a view confirmed by the fact that Seed has had very little development in the last 18 months.

The book does not seem to follow GTK+ best practice, such as using Gtk.Grid or Gtk.Application and not using the c_new constructor. It is full of things like use of Vala’s [CCode] pragma, but I don’t see why. I felt important and powerful facilities in GLib like properties were not properly explained, especially property binding. There was also a lack of understanding, for example, referring to Timeout objects, which don’t exist (the structure you’re looking for is a Source).

I do like that it uses Anjuta. It’s a shame that it requires unexplained hacks to get things building.

The Clutter section was very poor. Comparing Clutter to GTK+ is simply not reasonable. Clutter is a scene graph API, which doesn’t really have a comparison in the GTK+ stack, which goes from drawing layer to widget layer with no intermediate layer. I immediately noticed the Clutter examples hardcoded layout instead of using a layout manager.

The multimedia section had the user installing non-free codecs. Then it uses alsasink and not auto*sink. It spends a lot of time setting up GStreamer pipelines by hand, rather than using decodebin and playbin. Maybe this improves understanding, but I think it will mostly lead to the creation of very rigid apps.

I stopped reading and started skimming at this point. I did again notice weird things like creating JSON using append methods and not the handy JSON-GLib. The examples of HTML5 applications with WebKit would perhaps explain the choice of Seed, except the wrapper is written in Vala, so there’s no problem of conflicting JS engines (I think it works fine anyway, right?). Similarly the application finds installed applications by looking in /usr/share/applications rather than using libgnome-menus. Again, this will lead to very rigid apps that don’t work very well and doesn’t teach beginners the best practice for GNOME development.

There’s stuff that’s just weird: the system requirements are significantly more powerful than my last computer, on which I was doing GNOME development just fine. There is discussion of how to switch to GNOME Shell, as if it’s required, whereas you can develop GNOME apps in Unity and XFCE just fine.

The typesetting of the book is poor. The source code is weirdly indented and I feel like it lacked readability (there’s great syntax highlighting available for printed text). There are grammatical mistakes that really should have been picked up in editing. The screenshots are blurry in the PDF (looks like some kind of busted bilinear filtering?). Also I can see the resize indicator on the mouse. Generally these serve to make the book look unprofessional.

Finally I don’t think the book really leads the new developer into the community as the best source to get help, which they will undoubtedly need. Of course, the community has already produced some excellent tutorials, which I think new developers would be much better off with.

All up I’m giving it 1 star.

GNOME 3 Application Development: Beginner’s Guide: ★☆☆☆☆


IdentitiesOnly + ssh-agent

I’m really hoping that someone can provide me with some enlightenment.

I have a lot of ssh keys. 6 by today’s count. On my desktop I have my ssh configured with IdentitiesOnly yes and an IdentityFile for each host. This works great.

I then forward my agent to my dev VM. I can see the keys with ssh-add -l. So far so good. If I then ssh into a host, I can see it trying every key from the agent in sequence, which will sometimes fail with too many keys tried. However, if I set IdentitiesOnly yes in my dev VM config, it doesn’t offer any keys; and if I add IdentityFile it doesn’t work, because I don’t have those key files on my VM.

So what’s the solution? What I want is to specify identities by their identifier in the agent, e.g. danni@github, however I can’t see config to do that. Anyone got a nifty solution?


generic lettuce steps for Django models

After I left the Bureau approximately a month ago I’ve taken up a new role with Infoxchange Australia. My first project here is working on a rewrite of an application using Django.

People here are really into behaviour driven testing, and we’re using Lettuce to do it (using a branch with better Django integration).

I sort of dislike this sort of testing, because it creates an annoying abstraction layer on top of the code, with a poorly defined, quasi-real language. It’s like a bad knock-off of AppleScript. Anyway, I got sick of defining steps per model, so I put together some generic steps for manipulating Django models (that I’ll have to contribute back).

Anyway they look like this (examples of the step in the docstrings):

# imports assumed here; the original snippet leaves them out
from django.db.models import get_models  # older Django API
from lettuce import step
from nose.tools import assert_equals


# build a hash of model verbose names to models
# this is used by get_model()
def _models_generator():
    for model in get_models():
        yield (model._meta.verbose_name, model)
        yield (model._meta.verbose_name_plural, model)

MODELS = dict(_models_generator())


def get_model(model):
    """
    Convert a model's verbose name to the model class. This allows us to
    use the model's verbose name in steps.
    """

    name = model.lower()
    # look up the lower-cased name, not the raw text from the step
    model = MODELS.get(name, None)

    assert model, "Could not locate model by name '%s'" % name

    return model


def create_models(model, hashes):
    for hash_ in hashes:
        model.objects.create(**hash_)


def models_exist(model, hashes):
    for hash_ in hashes:
        assert \
            model.objects.filter(**hash_).exists(), \
            "Object does not exist"


@step(r'I have ([a-z][a-z0-9_ ]*) in the database:')
def create_models_generic(step, model):
    """
    And I have admin field values in the database:
    | name         | value   |
    | project_type | Twine   |

    The generic method can be overridden for a specific model by defining a
    function create_badgers(step), which creates the Badger model.
    """

    try:
        globals()['create_%s' % model](step)
    except KeyError:
        model = get_model(model)

        create_models(model, step.hashes)


@step(r'(?:Given|And|Then) ([A-Z][a-z0-9_ ]*) with ([a-z]+) "([^"]*)" has ([A-Z][a-z0-9_ ]*) in the database:')  # noqa
def create_models_for_relation(step, rel_model_name,
                               rel_key, rel_value, model):
    """
    And project with name "Ball Project" has goals in the database:
    | description                             |
    | To have fun playing with balls of twine |
    """

    lookup = {rel_key: rel_value}
    rel_model = get_model(rel_model_name).objects.get(**lookup)

    for hash_ in step.hashes:
        hash_['%s_id' % rel_model_name] = rel_model.id

    create_models_generic(step, model)


@step('(?:Given|And|Then) ([A-Z][a-z0-9_ ]*) should be present in the database')
def step_models_exist(step, model):
    """
    And objectives should be present in the database:
    | description      |
    | Make a mess      |
    """

    model = get_model(model)

    models_exist(model, step.hashes)


@step(r'There should be (\d+) ([a-z][a-z0-9_ ]*) in the database')
def model_count(step, count, model):
    """
    Then there should be 0 goals in the database
    """

    model = get_model(model)

    assert_equals(model.objects.count(), int(count))
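To make the matching a bit more concrete, here is a sketch using plain `re` of how a step sentence turns into a model lookup. It assumes lettuce matches step regexes case-insensitively with a search (the docstring examples above rely on that, e.g. "Then there should be..."), so I pass re.IGNORECASE explicitly. The MODELS dict holds placeholder strings rather than real Django model classes.

```python
import re

# stand-in for the verbose-name lookup table; values would be model classes
MODELS = {
    'goal': 'Goal',
    'goals': 'Goal',
    'admin field value': 'AdminFieldValue',
    'admin field values': 'AdminFieldValue',
}

CREATE = re.compile(r'I have ([a-z][a-z0-9_ ]*) in the database:',
                    re.IGNORECASE)
COUNT = re.compile(r'There should be (\d+) ([a-z][a-z0-9_ ]*) in the database',
                   re.IGNORECASE)


def model_for(sentence, pattern):
    """Return the model named in the sentence, or None if nothing matches."""
    match = pattern.search(sentence)
    if match is None:
        return None

    # the model name is always the last captured group in these patterns
    return MODELS.get(match.groups()[-1].lower())
```

The captured group is whatever sits between the fixed parts of the sentence, so multi-word verbose names like "admin field values" work without any extra effort.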

Reviewing GNOME3 App Development Beginners Guide

The folk at Packt Publishing sent me an e-copy of GNOME 3 Application Development Beginner’s Guide the other day. Since I find myself with a couple of weeks off (more on that another time) I’m going to be reading it and writing a review.

The book weighs in at 366 pages and purports to cover GLib, GTK+, GStreamer, E-D-S, WebKit, desktop D-Bus APIs, i18n and unit testing in both Javascript (via Seed) and Vala.

Hopefully I will get it read in the next couple of weeks and get my thoughts jotted down. I am not getting anything except an e-copy of the book for my trouble so you can trust me to be brutally honest :-P


Generating JSON from SQLAlchemy objects

I had to put together a small web app the other day, using SQLAlchemy and Flask. Because I hate writing the same code multiple times when there’s a better way, I wanted to be able to serialise SQLAlchemy ORM objects straight to JSON.

I decided on an approach where, taking a leaf out of JavaScript, I would optionally implement a tojson() method on a class, which I would attempt to call from my JSONEncoder[1].

It turns out to be relatively simple to extend SQLAlchemy’s declarative base class to add additional methods (we can also use this as an excuse to implement a general __repr__()).

from sqlalchemy.ext.declarative import declarative_base as real_declarative_base

# Let's make this a class decorator
declarative_base = lambda cls: real_declarative_base(cls=cls)

@declarative_base
class Base(object):
    """
    Add some default properties and methods to the SQLAlchemy declarative base.
    """

    @property
    def columns(self):
        return [ c.name for c in self.__table__.columns ]

    @property
    def columnitems(self):
        return dict([ (c, getattr(self, c)) for c in self.columns ])

    def __repr__(self):
        return '{}({})'.format(self.__class__.__name__, self.columnitems)

    def tojson(self):
        return self.columnitems

We can then define our tables in the usual way:

class Client(Base):
    __tablename__ = 'client'

    ...

You can obviously replace any of the methods in your subclass, if you don’t want to serialise the whole thing. Bonus points for anyone who wants to extend this to serialise one-to-many relationships.

And what about calling the tojson() method? That’s easy, we can just provide our own JSONEncoder.

import datetime
import json

class JSONEncoder(json.JSONEncoder):
    """
    Wrapper class to try calling an object's tojson() method. This allows
    us to JSONify objects coming from the ORM. Also handles dates and datetimes.
    """

    def default(self, obj):
        if isinstance(obj, datetime.date):
            return obj.isoformat()

        try:
            return obj.tojson()
        except AttributeError:
            return json.JSONEncoder.default(self, obj)
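As a quick check that doesn’t need SQLAlchemy or Flask, here is a sketch that runs the same encoder over a plain object with a tojson() method. FakeClient is a made-up stand-in for an ORM class, not anything from the app.

```python
import datetime
import json


class JSONEncoder(json.JSONEncoder):
    """Same idea as above: dates become ISO strings, objects may offer tojson()."""

    def default(self, obj):
        if isinstance(obj, datetime.date):
            return obj.isoformat()

        try:
            return obj.tojson()
        except AttributeError:
            return json.JSONEncoder.default(self, obj)


class FakeClient(object):
    """Hypothetical stand-in for a declarative model."""

    def __init__(self, name, joined):
        self.name = name
        self.joined = joined

    def tojson(self):
        # returns a dict; the encoder recurses into it, so the date is handled
        return {'name': self.name, 'joined': self.joined}


encoded = json.dumps({'client': FakeClient('Badger', datetime.date(2013, 6, 1))},
                     cls=JSONEncoder, sort_keys=True)
print(encoded)
# {"client": {"joined": "2013-06-01", "name": "Badger"}}
```

Note that datetime.datetime is a subclass of datetime.date, so the single isinstance check covers both.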

Cutting edge Flask provides a way to replace the default JSON encoder, but the version I got out of pip does not. This is relatively easy to work around though by replacing jsonify with our own version.

from flask import Flask

app = Flask(__name__)

def jsonify(*args, **kwargs):
    """
    Workaround for Flask's jsonify not allowing replacement of the JSONEncoder
    in my version of Flask.
    """

    return app.response_class(json.dumps(dict(*args, **kwargs),
                                         cls=JSONEncoder),
                              mimetype='application/json')

If you do have a newer Flask, where you don’t have to replace jsonify, you can also inherit from Flask’s JSONEncoder, which already handles things like datetimes for you.

  1. The tojson() method actually returns a Python dict understandable by JSONEncoder.

elevation data and APIs

So I found a bit of time to hack on my project today. Today’s task was to load and validate data coming from Geoscience Australia’s SRTM digital elevation model data, which I downloaded from their elevation data portal last week.[1] The data is Creative Commons, so I might just upload it somewhere, if I can find a place for 2GB of elevation data.

This let me load elevation data in a 3 arcsecond (about 100m) grid, which I did using the ubiquitous GDAL via its Python API. Initial code is here. It doesn’t do anything super clever yet, like check and normalise the projection, because I don’t need to.[2]

Looking at plots of values can give you a gist of what’s what (oh look, it goes out to sea, and then the data is masked out) but it doesn’t really validate anything. I could do validation runs against my GPS tracks, but for a first pass, I decided it would be easier to validate using Google’s Elevation API. This is a pretty neat web service that you make a request to, and it gives you back some JSON (or XML). There are undoubtedly Python APIs to access this, but it’s pretty easy to do a simple call with urllib2 or httplib. I chose to reuse my httplib Client wrapper from my RunKeeper/HealthGraph API. I wrote it directly in the test.
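For the curious, a request to the Elevation API can be sketched roughly as below. This is not the httplib wrapper I actually used: the endpoint, the pipe-separated locations parameter, the optional key parameter and the status/results response shape are my understanding of Google’s documentation, so treat them as assumptions. The sample response in the usage note is made up.

```python
import json

try:  # Python 3
    from urllib.parse import urlencode
except ImportError:  # Python 2
    from urllib import urlencode

# assumed endpoint for the JSON flavour of the API
ELEVATION_URL = 'https://maps.googleapis.com/maps/api/elevation/json'


def build_url(points, key=None):
    """Build a request URL for a list of (lat, lon) pairs."""
    locations = '|'.join('{},{}'.format(lat, lon) for lat, lon in points)
    params = [('locations', locations)]

    if key is not None:
        params.append(('key', key))

    return '{}?{}'.format(ELEVATION_URL, urlencode(params))


def parse_elevations(payload):
    """Pull the elevations (in metres) out of a JSON response body."""
    data = json.loads(payload)

    if data.get('status') != 'OK':
        raise ValueError('Elevation request failed: {}'.format(data.get('status')))

    return [result['elevation'] for result in data['results']]
```

Fetching the URL with urllib2 or httplib and passing the body to parse_elevations() gives a list you can compare point for point against the values read from the SRTM grid.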

For a real unit test, I would have probably calculated the residuals, and ensured they sat within some acceptable range, but I’m lazy, so instead I just plotted them together. Google’s data, you will notice, includes bathymetry, which is actually pretty neat.

SRTM v Google Elevation
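The residual check I skipped would look something like this sketch. The function names and the tolerance are made up for illustration, and it assumes both lists hold elevations for the same points in the same order, with any masked-out values already dropped.

```python
def residuals(measured, reference):
    """Point-by-point difference between two elevation profiles."""
    if len(measured) != len(reference):
        raise ValueError('profiles must have the same number of points')

    return [m - r for m, r in zip(measured, reference)]


def within_tolerance(measured, reference, tolerance):
    """True if every residual is no larger than tolerance (in metres)."""
    return all(abs(r) <= tolerance
               for r in residuals(measured, reference))
```

In a real unit test you would assert within_tolerance(srtm, google, tolerance) for some tolerance you trust; over water it would fail, since Google’s data includes bathymetry and the SRTM data is masked out.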

  1. Note to the unwary, seems to be buggy in Chrome?
  2. I did write a skeleton context manager for gdal.Open. I say skeleton because it doesn’t actually do anything smart like turning errors from GDAL into Exceptions, because I didn’t have any errors to handle.

Testing warnings with py.test

For those who like to add warnings to their Python code, and want to test in their unit tests that those warnings actually happen, here are two techniques to do so. Both are based around fixtures/funcargs.

The first is the mechanism built into py.test: the recwarn funcarg.

The second is to create a fixture that specifically enables warnings as exceptions and combine that with pytest.raises, for instance:

import warnings

import pytest

@pytest.fixture
def warnings_as_errors(request):
    warnings.simplefilter('error')

    request.addfinalizer(lambda *args: warnings.resetwarnings())

def test_timers_warn(log, warnings_as_errors):

    log.start_timer('method')

    with pytest.raises(RuntimeWarning):
        log.start_timer('method')

The advantage of this second method is you can guarantee exactly what method call raises the warning without repeatedly having to check recwarn.
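To see why the fixture works, here is a sketch using only the standard library: with the 'error' filter in place, warnings.warn() raises the warning as an exception at the exact call that triggered it. The Log class is a made-up stand-in for the object under test, and catch_warnings() plays the role of the fixture’s finalizer by restoring the filters afterwards.

```python
import warnings


class Log(object):
    """Hypothetical logger that warns if a timer is started twice."""

    def __init__(self):
        self.timers = set()

    def start_timer(self, name):
        if name in self.timers:
            warnings.warn('timer %r already started' % name, RuntimeWarning)

        self.timers.add(name)


def raises_warning(func, *args):
    """Call func with warnings turned into exceptions; report if one fired."""
    with warnings.catch_warnings():
        warnings.simplefilter('error')

        try:
            func(*args)
        except RuntimeWarning:
            return True

    return False
```

The first start_timer('method') call passes quietly and the second one raises, which is the same behaviour the pytest.raises block above is checking for.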
