Returning screenshots in Gitlab CI (and Travis CI)


Our code base includes a large suite of functional tests using Lettuce, Selenium and PhantomJS. When a test fails, we have a hook that captures the current screen contents and writes it to a file. In an ideal CI system these would be collected as a failure artifact (along with the stack trace, etc.) but that’s not currently possible withGitlab CI (hint hint Gitlab team, investigate Subunit for streaming test output).

Instead what we do on Gitlab is output our images as base64 encoded text:

if $SUCCESS; then
    echo "Success"
    echo "Failed ------------------------------------"
    for i in Test*.html; do echo $i; cat "$i"; done
    for i in Test*.png; do echo $i; base64 "$i"; echo "EOF"; done

Of course now you have test output full of meaningless base64’ed data. Enter Greasemonkey.

// ==UserScript==
// @name View CI Images
// @namespace
// @description View CI Images
// @version 1
// @match https://gitlabci/projects/*/builds/*
// ==/UserScript==

(function ($) {
    var text = $('#build-trace').html();
    text = text.replace(/(Test_.+\.png)\n([A-Za-z0-9\n\/\+=]+)\nEOF\n/g,
                        '<h2>$1</h2><img src="data:image/png;base64,$2" ' +
                        'style="display:block; max-width:800px; width: auto; height: auto;" ' +

Web browsers (handily) can already display base64’ed images. This little user script will match builds on the CI server you specify and then replace those huge chunks of base64 with the image they represent.

This technique could easily be replicated on Travis CI by updating the jQuery selector.

Testing sites with beforeunload and Lettuce/Cucumber

I recently added a beforeunload event handler to a site I’m working on, which instantly caused a regression of the entire Lettuce test suite before they got stuck on a “Leave this page?” dialog. We reuse the same Selenium browser session between tests in order to make our tests run in something approximating less than a decade.

Unfortunately Ghostdriver can’t see alerts and dialogs, which makes simply dismissing the dialog in Selenium kind of hard, but an easier way is at the end of the scenario to simply disable the event, and let it be reinstated with the next page load.

Add this hook to your Lettuce steps:

def disable_beforeunload(scenario):
    Disable before unload after a scenario so that the next scenario can
    reload the site.

try {
} catch (e) {

generic lettuce steps for Django models

After I left the Bureau approximately a month ago I’ve taken up a new role with Infoxchange Australia. My first project here is working on a rewrite of an application using Django.

People here are really into behaviour driven testing, and we’re using Lettuce to do it (using a branch with better Django integration).

I sort of dislike this sort of testing, because it creates an annoying abstraction layer on top of the code, with a poorly defined, quasi-real language. It’s like a bad knock off of Applescript. Anyway, I got sick of defining steps per model, so I put together some generic steps for manipulating Django models (that I’ll have to contribute back).

Anyway they look like this (examples of the step in the docstrings):

# build a hash of model verbose names to models
# this is used by get_model()
def _models_generator():
    for model in get_models():
        yield (model._meta.verbose_name, model)
        yield (model._meta.verbose_name_plural, model)

MODELS = dict(_models_generator())

def get_model(model):
    Convert a model's verbose name to the model class. This allows us to
    use the models verbose name in steps.

    name = model.lower()
    model = MODELS.get(model, None)

    assert model, "Could not locate model by name '%s'" % name

    return model

def create_models(model, hashes):
    for hash_ in hashes:

def models_exist(model, hashes):
    for hash_ in hashes:
        assert \
            model.objects.filter(**hash_).exists(), \
            "Object does not exist"

@step(r'I have ([a-z][a-z0-9_ ]*) in the database:')
def create_models_generic(step, model):
    And I have admin field values in the database:
    | name         | value   |
    | project_type | Twine   |

    The generic method can be overridden for a specific model by defining a
    function create_badgers(step), which creates the Badger model.

        globals()['create_%s' % model](step)
    except KeyError:
        model = get_model(model)

        create_models(model, step.hashes)

@step(r'(?:Given|And|Then) ([A-Z][a-z0-9_ ]*) with ([a-z]+) "([^"]*)" has ([A-Z][a-z0-9_ ]*) in the database:')  # noqa
def create_models_for_relation(step, rel_model_name,
                               rel_key, rel_value, model):
    And project with name "Ball Project" has goals in the database:
    | description                             |
    | To have fun playing with balls of twine |

    lookup = {rel_key: rel_value}
    rel_model = get_model(rel_model_name).objects.get(**lookup)

    for hash_ in step.hashes:
        hash_['%s_id' % rel_model_name] =

    create_models_generic(step, model)

@step('(?:Given|And|Then) ([A-Z][a-z0-9_ ]*) should be present in the database')
def step_models_exist(step, model):
    And objectives should be present in the database:
    | description      |
    | Make a mess      |

    model = get_model(model)

    models_exist(model, step.hashes)

@step(r'There should be (\d+) ([a-z][a-z0-9_ ]*) in the database')
def model_count(step, count, model):
    Then there should be 0 goals in the database

    model = get_model(model)

    assert_equals(model.objects.count(), int(count))

Testing warnings with py.test

For those who use like to add warnings to your Python code, and want to test those warnings actually happen in your unit tests, here are two techniques to do so, both are based around fixtures/funcargs.

Firstly is the mechanism built into py.test using recwarn.

The second is to create a fixture that specifically enables warnings as exceptions and combined that with pytest.raises, for instance:

import warnings

def warnings_as_errors(request):

    request.addfinalizer(lambda *args: warnings.resetwarnings())

def test_timers_warn(log, warnings_as_errors):


    with pytest.raises(RuntimeWarning):

The advantage of this second method is you can guarantee exactly what method call raises the warning without repeatedly having to check recwarn.

Extending Selenium with jQuery

Last week I wrote about combining Selenium and py.test and I promised to also talk about my function find_elements_by_jquery().

Selenium by default can find elements by id, CSS selector and XPath, but I often find I already know the query as a jQuery selector, and so frequently it’s easiest just to use that.

We start by overloading the Selenium webdriver. Since the webdriver is exposed through several classes (one per web browser), we do this in a particularly meta way.

from selenium.webdriver.remote.webdriver import WebElement
from selenium.common.exceptions import InvalidSelectorException

def MyWebDriver(base, **kwargs):
    return type('MyWebDriver', (_MyWebDriver, base), kwargs)

class _MyWebDriver(object):
    def create_web_element(self, element_id):
        return MyWebElement(self, element_id)

    def find_elements_by_jquery(self, jq):
        return self.execute_script('''return $('%s').get();''' % jq)

    def find_element_by_jquery(self, jq):
        elems = self.find_elements_by_jquery(jq)
        if len(elems) == 1:
            return elems[0]
            raise InvalidSelectorException(
                "jQuery selector returned %i elements, expected 1" % len(elems))

We then do a similar implementation for the webelement:

class MyWebElement(WebElement):
    def __repr__(self):
        """Return a pretty name for an element"""

        id = self.get_attribute('id')
        class_ = self.get_attribute('class')

        if len(id) > 0:
            return '#' + id
        elif len(class_) > 0:
            return '.'.join([self.tag_name] + class_.split(' '))
            return self.tag_name

    def find_elements_by_jquery(self, jq):
        return self.parent.execute_script(
            '''return $(arguments[0]).find('%s').get();''' % jq, self)

    def find_element_by_jquery(self, jq):
        elems = self.find_elements_by_jquery(jq)
        if len(elems) == 1:
            return elems[0]
            raise InvalidSelectorException(
                "jQuery selector returned %i elements, expected 1" % len(elems))

We can now pass in jQuery selectors for instance b.find_element_by_jquery('#region option:selected'). Or form.find_elements_by_jquery(':input'). It’s especially incredibly powerful when all of your DOM manipulation already works in terms of jQuery selectors.

As an added bonus, overloading the classes lets us add functionality like Firebug style element names (MyWebElement.__repr__) or wrap things like the Wait utility into the webdriver, e.g.

from import WebDriverWait as Wait
from selenium.common.exceptions import TimeoutException

class FrontendError(Exception):

# class _MyWebDriver...
    def wait(self, event, timeout=10):
            Wait(self, timeout).until(event)
        except (TimeoutException, FrontendError) as e:
            # do we have an error dialog
            dialog = self.find_element_by_id('error-dialog')
            if dialog.is_displayed():
                content = dialog.find_element_by_id('error-dialog-content')
                raise FrontendError(content.text)
                raise e

Combining py.test and Selenium to test webapps

Recently I started adding unit and acceptance tests to a webapp using Selenium, integrated into the existing py.test framework that tests the backend code.

py.test fixtures make using Selenium, via its Python bindings, really straightforward. Here’s how I did it.

First I put all the Selenium related tests in a tests/selenium/ directory. I then created tests/selenium/ and wrote a fixture to allow tests to access a single instance of the webdriver for the entire session:

import pytest
from selenium import webdriver

browsers = {
    'firefox': webdriver.Firefox,
    'chrome': webdriver.Chrome,

def driver(request):
    if 'DISPLAY' not in os.environ:
        pytest.skip('Test requires display server (export DISPLAY)')

    b = browsers[request.param]()

    request.addfinalizer(lambda *args: b.quit())

    return b

Note that we’re able to parameterise the fixture so that it runs with multiple browsers. We then add a per-function fixture that sets up the session for an individual test:

def b(driver, url):
    b = driver
    b.set_window_size(1200, 800)

    return b

A fixture can refer to other fixtures of more generic scope. So url is a fixture that accesses the optional --url property.

def pytest_addoption(parser):
    parser.addoption('--url', action='store',

def url(request):
    return request.config.option.url

These fixtures are available for all tests in that package. Tests have the form:

def test_badger(b):
    # test goes here

We can also create per-module fixtures, that optionally inherit our generic fixtures. Say for example we want to run a number of tests (e.g. for WCAG 2.0 compliance) on a number of parameterised instances of the set-up webapp. We might do this in

import pytest

def wcag(driver, url):
    Set up a single session for these tests.

    b = driver
    b.set_window_size(1200, 800)

    # do stuff here with Selenium to set up webapp
    return b

We can now write tests1 in this module, e.g.

def test_unique_ids(wcag):
    All ids in the document should be unique.

    elems = wcag.find_elements_by_jquery('[id]')
    ids = map(lambda e: e.get_attribute('id'), elems)

    assert len(elems) >= 1 # sanity check
    assert util.unique(ids)

Again, we can parameterise this fixture to set up the webapp in a number of different ways. Note that we have to use driver as our fixture, not b. This is because we can only refer to fixtures more general in scope than the one we are writing.

  1. find_elements_by_jquery() is a method I’ve added in an extension of Selenium’s webdriver, and is a topic for another post. []