Exploring Github Actions

To help keep myself honest, I wanted to set up automated test runs on a few personal projects I host on Github.  At first I gave Travis a try, since a number of projects I contribute to use it, but it felt a bit clunky.  When I found Github had a new CI system in beta, I signed up and was accepted a few weeks later.

While it is still in development, the configuration language feels lean and powerful.  In comparison, Travis’s configuration language has obviously evolved over time, with some features not interacting properly (e.g. matrix expansion only working on the first job in a workflow using build stages).  I’ve never felt like I had a complete grasp of the Travis configuration language, whereas the single-page description of the Actions configuration language feels complete.

The main differences I could see between the two systems are:

  1. A Github workflow is composed of multiple jobs right from the start.
  2. All jobs run in parallel by default.  It is possible to serialise jobs (similar to Travis’s stages) by declaring dependencies between jobs.
  3. Each job specifies which VM image it will run on, with a choice of Ubuntu, Windows, or MacOS versions.  If you choose Ubuntu, you can also specify a Docker container to run your build in, giving access to other Linux build environments.
  4. Each job can have a matrix attached, allowing the job to be duplicated according to a set of parameters.
  5. Jobs are composed of a sequence of steps.  Unlike Travis’s fixed set of build phases, these are generic.
  6. Steps can consist of either code executed by the shell or a reference to an external action.
  7. Actions are the primary extension mechanism, and are even used for basic tasks like checking out your repository.  Actions are either implemented in JavaScript or as a Docker container.  Only JavaScript actions are available for Windows and MacOS jobs.

The first project I converted over was asyncio-glib, where I was using Travis to run the test suite on a selection of Python versions.  My old Travis configuration can be seen here, and the new Actions workflow can be seen here.  Both versions are roughly equivalent, although the actions/setup-python@v1 action doesn’t currently make beta releases of Python available. The result of a run of the workflow can be seen here.

For a second project (videowhisk), I am running the tests against the default Python provided by the VM image.  For this project, I’m more interested in compatibility with the distro release’s GStreamer libraries than with different Python versions.  I suppose I could extend this using the matrix feature to test on multiple Ubuntu versions, or containers for other Linux releases.

While I’ve just been using this to run the test suite, it looks like Actions can be used for a lot more.  A project can have multiple workflows with different triggers, so it can also be used for automated triage of bugs or pull requests (e.g. request a review from a specific developer when a pull request is created that modifies files in a specific directory). It also looks like I could create a workflow to automatically publish to PyPI when I push a new tag to the repository that looks like a version number.

It will be interesting to see what this does to the larger ecosystem of “CI as a service” products built to work with Github.  On the one hand having a choice is nice, but on the other hand it’s nice to have something well integrated.  I really like Gitlab’s integrated CI system for projects I have hosted on various Gitlab instances, for example.

Building IoT projects with Ubuntu Core talk

Last week I gave a talk at Perth Linux Users Group about building IoT projects using Ubuntu Core and Snapcraft. The video is now available online. Unfortunately there were some problems with the audio setup leading to some background noise in the video, but it is still intelligible:

The slides used in the talk can be found here.

The talk was focused on how Ubuntu Core could be used to help with the ongoing security and maintenance of IoT projects. While it might be easy to buy a Raspberry Pi, install Linux and your application, how do you make sure the device remains up to date with security updates? How do you push out updates to your application in a reliable fashion?

I outlined a way of deploying a project using Ubuntu Core, including:

  1. Packaging a simple web server app using the snapcraft tool.
  2. Configuring automatic builds from git, published to the edge channel on the Snap Store. This is also an easy way to get ARM builds for a package, rather than trying to deal with cross compilation tool chains.
  3. Using the ubuntu-image command to create an Ubuntu Core image with the application preinstalled.

I gave a demo booting such an image in a virtual machine. This showed the application up and running ready to use. I also demonstrated how promoting a build from the edge channel to stable on the store would make it available to the system wide automatic update mechanism on the device.

Performing mounts securely on user owned directories

While working on a feature for snapd, we had a need to perform a “secure bind mount”. In this context, “secure” meant:

  1. The source and/or target of the mount is owned by a less privileged user.
  2. User processes will continue to run while we’re performing the mount (so solutions that involve suspending all user processes are out).
  3. While we can’t prevent the user from moving the mount point, they should not be able to trick us into mounting to locations they don’t control (e.g. by replacing the path with a symbolic link).

The main problem is that the mount system call uses string path names to identify the mount source and target. While we can perform checks on the paths before the mounts, we have no way to guarantee that the paths don’t point to another location when we move on to the mount() system call: a classic time-of-check-to-time-of-use race condition.

One suggestion was to modify the kernel to add an MS_NOFOLLOW flag to prevent symbolic link attacks. This turns out to be harder than it would appear, since the kernel is documented as ignoring any flags other than MS_BIND and MS_REC when performing a bind mount. So even if a patched kernel recognised an MS_NOFOLLOW flag, there would be no way to distinguish its behaviour from an unpatched kernel that silently ignored it. Fixing this properly would probably require a new system call, which is a rabbit hole I don’t want to dive down.

So what can we do using the tools the kernel gives us? The common way to reuse a reference to a file between system calls is the file descriptor. We can securely open a file descriptor for a path using the following algorithm:

  1. Break the path into segments, and check that none are empty, ".", or "..".
  2. Open the root directory with open("/", O_PATH|O_DIRECTORY).
  3. Open the first segment with openat(parent_fd, "segment", O_PATH|O_NOFOLLOW|O_DIRECTORY).
  4. Repeat for each of the remaining path segments, closing the parent descriptors as we go (a rough sketch of the whole procedure follows).
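
As a rough illustration, here is a minimal Python sketch of that algorithm (the secure_open name and the error handling are my own; a production implementation would need more care):

import os

def secure_open(path):
    # 1. Break the path into segments and reject anything suspicious.
    segments = path.lstrip("/").split("/")
    if any(segment in ("", ".", "..") for segment in segments):
        raise ValueError("unsafe path: {!r}".format(path))
    # 2. Start from the root directory.
    fd = os.open("/", os.O_PATH | os.O_DIRECTORY)
    # 3./4. Open each segment relative to its parent, refusing to follow
    # symbolic links, and close the parent descriptor as we go.
    for segment in segments:
        parent_fd = fd
        try:
            fd = os.open(segment,
                         os.O_PATH | os.O_NOFOLLOW | os.O_DIRECTORY,
                         dir_fd=parent_fd)
        finally:
            os.close(parent_fd)
    return fd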

Now we just need to find a way to use these file descriptors with the mount system call. I came up with two strategies to achieve this.

Use the current working directory

The first idea I tried was to make use of the fact that the mount system call accepts relative paths. We can use the fchdir system call to change to a directory identified by a file descriptor, and then refer to it as ".". Putting those together, we can perform a secure bind mount as a multi-step process:

  1. fchdir to the mount source directory.
  2. Perform a bind mount from "." to a private stash directory.
  3. fchdir to the mount target directory.
  4. Perform a bind mount from the private stash directory to ".".
  5. Unmount the private stash directory.

While this works, it has a few downsides. It requires a third intermediate location to stash the mount. It could interfere with anything else that relies on the working directory. It also only works for directory bind mounts, since you can’t fchdir to a regular file.

Faced with these downsides, I started thinking about whether there were any simpler options available.

Use magic /proc symbolic links

For every open file descriptor in a process, there is a corresponding file in /proc/self/fd/. These files appear to be symbolic links that point at the file associated with the descriptor. So what if we pass these /proc/self/fd/NNN paths to the mount system call?

The obvious question is: if these paths are symbolic links, is this any different from passing the real paths directly? It turns out that there is a difference, because the kernel does not resolve these symbolic links in the standard fashion. Rather than recursively resolving link targets, the kernel short-circuits the process and uses the path structure associated with the file descriptor. So it will follow file moves and can even refer to paths in other mount namespaces. Furthermore, the kernel keeps track of deleted paths, so we will get a clean error if the user deletes a directory after we’ve opened it.
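
As a quick illustration of this behaviour (a throwaway snippet of my own, not part of snapd), the magic symbolic link keeps tracking a directory even after it has been renamed:

import os
import tempfile

base = tempfile.mkdtemp()
os.mkdir(os.path.join(base, "before"))

# Hold an O_PATH descriptor for the directory, then move the directory.
fd = os.open(os.path.join(base, "before"), os.O_PATH)
os.rename(os.path.join(base, "before"), os.path.join(base, "after"))

# The magic symlink now reports the directory's new location.
print(os.readlink("/proc/self/fd/{}".format(fd)))
os.close(fd)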

So in pseudo-code, the secure mount operation becomes:

source_fd = secure_open("/path/to/source", O_PATH);
target_fd = secure_open("/path/to/target", O_PATH);
mount("/proc/self/fd/${source_fd}", "/proc/self/fd/${target_fd}",
      NULL, MS_BIND, NULL);

This resolves all the problems I had with the first solution: it doesn’t alter the working directory, it can do file bind mounts, and it doesn’t require a protected third location.
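
Putting this together with the secure_open sketch from earlier, a rough Python version of the operation might look like the following. Python’s standard library does not wrap mount(2), so this goes through ctypes; the MS_BIND value is taken from <sys/mount.h>, and secure_bind_mount is just an illustrative name of mine:

import ctypes
import os

MS_BIND = 4096  # from <sys/mount.h>

libc = ctypes.CDLL("libc.so.6", use_errno=True)
libc.mount.argtypes = (ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p,
                       ctypes.c_ulong, ctypes.c_void_p)

def secure_bind_mount(source, target):
    # secure_open() is the segment-by-segment open sketched above.
    source_fd = secure_open(source)
    target_fd = secure_open(target)
    try:
        ret = libc.mount("/proc/self/fd/{}".format(source_fd).encode(),
                         "/proc/self/fd/{}".format(target_fd).encode(),
                         None, MS_BIND, None)
        if ret != 0:
            errno = ctypes.get_errno()
            raise OSError(errno, os.strerror(errno))
    finally:
        os.close(source_fd)
        os.close(target_fd)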

The only downside I’ve encountered is that if I wanted to flip the bind mount to read-only, I couldn’t use "/proc/self/fd/${target_fd}" to refer to the mount point. My best guess is that it continues to refer to the directory shadowed by the mount point, rather than the mount point itself. So it is necessary to re-open the mount point, which is another opportunity for shenanigans. One possibility would be to read /proc/self/fdinfo/NNN to determine the mount ID associated with the file descriptor and then double check it against /proc/self/mountinfo.
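
A sketch of that check might look like the following, relying on the mnt_id field exposed in /proc/self/fdinfo and the mount ID in the first column of /proc/self/mountinfo (the helper names are mine):

import os

def mount_id_for_fd(fd):
    # The fdinfo file for an open descriptor includes a "mnt_id:" line
    # identifying the mount the file lives on.
    with open("/proc/self/fdinfo/{}".format(fd)) as f:
        for line in f:
            if line.startswith("mnt_id:"):
                return int(line.split()[1])
    raise RuntimeError("no mnt_id for descriptor {}".format(fd))

def fd_is_mount_point(fd, expected_mount_point):
    # Cross-check the descriptor's mount ID against /proc/self/mountinfo,
    # whose fifth column is the mount point path.
    mnt_id = mount_id_for_fd(fd)
    with open("/proc/self/mountinfo") as f:
        for line in f:
            fields = line.split()
            if int(fields[0]) == mnt_id:
                return fields[4] == expected_mount_point
    return False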

ThinkPad Infrared Camera

One of the options available when configuring my ThinkPad was an infrared camera, the main selling point being “Windows Hello” facial-recognition-based login. While I wasn’t planning on keeping Windows on the system, I was curious to see what I could do with it under Linux. Hopefully this is of use to anyone else trying to get it to work.

The camera is manufactured by Chicony Electronics (probably a CKFGE03 or similar), and shows up as two USB devices:

04f2:b5ce Integrated Camera
04f2:b5cf Integrated IR Camera

Both devices are bound by the uvcvideo driver, showing up as separate video4linux devices. Interestingly, the IR camera seems to be assigned /dev/video0, so it generally gets picked by apps in preference to the colour camera. Unfortunately, the image it produces comes up garbled:

So it wasn’t going to be quite so easy to get things working. Looking at the advertised capture modes, the camera supports Motion-JPEG and YUYV raw mode. So I tried capturing a few JPEG frames with the following GStreamer pipeline:

gst-launch-1.0 v4l2src device=/dev/video0 num-buffers=10 ! image/jpeg ! multifilesink location="frame-%02d.jpg"

Unlike in raw mode, the red illumination LEDs started flashing when in JPEG mode, which resulted in frames having alternating exposures. Here’s one of the better exposures:

What is interesting is that the JPEG frames have a different aspect ratio to the raw version: a more normal 640×480 rather than 400×480. So to start, I captured a few raw frames:

gst-launch-1.0 v4l2src device=/dev/video0 num-buffers=10 ! "video/x-raw,format=(string)YUY2" ! multifilesink location="frame-%02d.raw"

The illumination LEDs stayed on constantly while recording in raw mode. The contents of the raw frames show something strange:

00000000  11 48 30 c1 04 13 44 20  81 04 13 4c 20 41 04 13  |.H0...D ...L A..|
00000010  40 10 41 04 11 40 10 81  04 11 44 00 81 04 12 40  |@.A..@....D....@|
00000020  00 c1 04 11 50 10 81 04  12 4c 10 81 03 11 44 00  |....P....L....D.|
00000030  41 04 10 48 30 01 04 11  40 10 01 04 11 40 10 81  |A..H0...@....@..|
...

The advertised YUYV format encodes two pixels in four bytes, so you would expect any repeating patterns to occur at a period of four bytes. But the data in these frames seems to repeat at a period of five bytes.

Looking closer, the data is actually repeating with a period of 10 bits, or four packed values for every five bytes. Furthermore, the 800-byte rows work out to 640 pixels when interpreted as packed 10-bit values (800 ÷ 5 × 4 = 640, rather than the advertised 400 pixels), which matches the dimensions of the JPEG mode.

The following Python code can unpack the 10-bit pixel values:

def unpack(data):
    result = []
    for i in range(0, len(data), 5):
        block = (data[i] |
                 data[i+1] << 8 |
                 data[i+2] << 16 |
                 data[i+3] << 24 |
                 data[i+4] << 32)
        result.append((block >> 0) & 0x3ff)
        result.append((block >> 10) & 0x3ff)
        result.append((block >> 20) & 0x3ff)
        result.append((block >> 30) & 0x3ff)
    return result
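
As a rough usage sketch (assuming Pillow is installed and a frame captured with the raw pipeline above), the unpacked values can be scaled down to an 8-bit greyscale image:

from PIL import Image

WIDTH, HEIGHT = 640, 480  # the real dimensions implied by the 800-byte rows

with open("frame-00.raw", "rb") as f:
    raw = f.read(WIDTH * HEIGHT * 5 // 4)  # 10 bits per pixel

# Drop the low two bits of each value to get 8-bit greyscale; a real
# converter would also want to adjust the brightness.
grey = bytes(value >> 2 for value in unpack(raw))
Image.frombytes("L", (WIDTH, HEIGHT), grey).save("frame-00.png")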

After adjusting the brightness while converting to 8-bit greyscale, I get a usable image. Compare a fake YUYV frame with the decoded version:

I suppose this logic could be wrapped up in a GStreamer element to get usable infrared video capture.

I’m still not clear why the camera would lie about the pixel format it produces. My best guess is that they wanted to use the standard USB Video Class driver on Windows, and this let them get at the raw data to process in user space.

u1ftp: a demonstration of the Ubuntu One API

One of the projects I’ve been working on has been to improve aspects of the Ubuntu One Developer Documentation web site.  While there are still some layout problems we are working on, it is now in a state where it is a lot easier for us to update.

I have been working on updating our authentication/authorisation documentation and revising some of the file storage documentation (the API used by the mobile Ubuntu One clients).  To help verify that the documentation was useful, I wrote a small program to exercise those APIs.  The result is u1ftp: a program that exposes a user’s files via an FTP daemon running on localhost.  In conjunction with the OS file manager or a dedicated FTP client, this can be used to conveniently access your files on a system without the full Ubuntu One client installed.

You can download the program from:

https://launchpad.net/u1ftp/trunk/0.1/+download/u1ftp-0.1.zip

To make it easy to run on as many systems as possible, I packaged it up as a runnable zip file so it can be run directly by the Python interpreter.  As well as a Python interpreter, you will need the following installed to run it:

  • On Linux systems, either the gnomekeyring extension (if you are using a GNOME derived desktop), or PyKDE4 (if you have a KDE derived desktop).
  • On Windows, you will need pywin32.
  • On MacOS X, you shouldn’t need any additional modules.

These could not be included in the zip file because they are extension modules rather than pure Python.

Once you’ve downloaded the program, you can run it with the following command:

python u1ftp-0.1.zip

This will start the FTP server listening at ftp://localhost:2121/.  Pointing a file manager at that URL should prompt you to log in with your standard Ubuntu One credentials, after which you can start browsing your files.  The program verifies the credentials against the Ubuntu SSO service and issues an OAuth token that it stores in the keyring.  The OAuth token is then used to authenticate requests to the file storage REST API.
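
For example, a dedicated client session driven from Python’s standard ftplib module would look something like this (the credentials shown are placeholders):

from ftplib import FTP

ftp = FTP()
ftp.connect("localhost", 2121)
ftp.login("user@example.com", "password")  # your Ubuntu One credentials
ftp.retrlines("LIST")                      # list your files
ftp.quit()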

While I expect this program to be useful on its own, it was also intended to act as an example of how the Ubuntu One API can be used.  One way to browse the source is to simply unzip the package and poke around.  Alternatively, you can check out the source directly from Launchpad:

bzr branch lp:u1ftp

If you come up with an interesting extension to u1ftp, feel free to upload your changes as a branch on Launchpad.