August 2018 – Will Thompson

TL;DR: there’s now an rsync server at rsync://images-dl.endlessm.com/public from which mirror operators can pull Endless OS images, along with an instance of Mirrorbits to redirect downloaders to their nearest—and hopefully fastest!—mirror. Our installer for Windows and the eos-download-image tool baked into Endless OS both now fetch images via this redirector, and from the next release of Endless OS our mirrors will be used as BitTorrent web seeds too. This should improve the download experience for users who are near our mirrors.

If you’re interested in mirroring Endless OS, check out these instructions and get in touch. We’re particularly interested in mirrors in Southeast Asia, Latin America and Africa, since our mission is to improve access to technology for people in these areas.

Big thanks to Niklas Edmundsson, who administers the mirror at Academic Computer Club, Umeå University, and who recommended Mirrorbits and provided the nudge needed to get this work going; and to dotsrc.org and Mythic Beasts, who are also already mirroring Endless OS.

Read on if you are interested in the gory details of setting this up.

We’ve received a fair few offers of mirroring over the years, but without us providing an rsync server, mirror operators would have had to fetch over HTTPS using our custom JSON manifest listing the available images: extra setup and ongoing admin for organisations who are already generously providing storage and bandwidth. So, let’s set up an rsync server! One problem: our images are not stored on a traditional filesystem, but in Amazon S3. So we need some way to provide an rsync server which is backed by S3.

I decided to use an S3-backed FUSE filesystem to mount the bucket holding our images. It needs to provide a 1:1 mapping from paths in S3 to paths on the mounted filesystem (preserving the directory hierarchy), perform reasonably well, and ideally offer local caching of file contents. I looked at two implementations (out of the many that are out there) which have these features:

s3fs-fuse, which is packaged for Debian as s3fs. Debian is the predominant OS in our server infrastructure, as well as the base for Endless OS itself, so it’s convenient to have a package. ((Do not confuse s3fs-fuse with fuse-s3fs, a totally separate project packaged in Fedora. It uses its own flattened layout for the S3 bucket rather than mapping S3 paths to filesystem paths, so is not suitable for what we’re doing here.))
goofys, which claims to offer substantially better performance for file metadata than s3fs.

I went with s3fs first, but it is a bit rough around the edges:

Our S3 bucket name contains dots, which is not uncommon. By default, if you try to use one of these with s3fs, you’ll get TLS certificate errors. This turns out to be because s3fs accesses S3 buckets as $NAME.s3.amazonaws.com, and the certificate is for *.s3.amazonaws.com, which does not match foo.bar.s3.amazonaws.com. s3fs has a -o use_path_request_style flag which avoids this problem by putting the bucket name into the request path rather than the request domain, but this use of that parameter is only documented in a GitHub Issues comment.
If your bucket is in a non-default region, AWS serves up a redirect, but s3fs doesn’t follow it. Once again, there’s an option you can use to force it to use a different domain, which once again is documented in a comment on an issue.
Files created with s3fs have their permissions stored in an x-amz-meta-mode header. Files created by other tools (which is to say, all our files) do not have this header, so by default they get mode 0000 (i.e. unreadable by anybody), and the mounted filesystem is completely unusable (even by root, with FUSE’s default settings).
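To make the certificate problem concrete, here is a minimal Python sketch of the usual wildcard rule, in which a leading * stands for exactly one DNS label. It is a simplification for illustration, not the matching code that s3fs or any TLS library actually runs, and the bucket name "images" is hypothetical.

```python
from urllib.parse import urlsplit


def wildcard_matches(pattern, hostname):
    """Simplified check: a leading '*' covers exactly one DNS label."""
    pattern_labels = pattern.lower().split(".")
    host_labels = hostname.lower().split(".")
    if len(pattern_labels) != len(host_labels):
        return False
    first_ok = pattern_labels[0] == "*" or pattern_labels[0] == host_labels[0]
    return first_ok and pattern_labels[1:] == host_labels[1:]


def virtual_hosted_url(bucket, key):
    """Default style: the bucket name becomes part of the hostname."""
    return f"https://{bucket}.s3.amazonaws.com/{key}"


def path_style_url(bucket, key):
    """Path style: the bucket name moves into the request path."""
    return f"https://s3.amazonaws.com/{bucket}/{key}"


CERT_NAME = "*.s3.amazonaws.com"

plain_host = urlsplit(virtual_hosted_url("images", "a.img")).hostname
dotted_host = urlsplit(virtual_hosted_url("foo.bar", "a.img")).hostname
path_host = urlsplit(path_style_url("foo.bar", "a.img")).hostname

print(plain_host, wildcard_matches(CERT_NAME, plain_host))
print(dotted_host, wildcard_matches(CERT_NAME, dotted_host))
print(path_host)  # no bucket name in the hostname at all
```

With path-style requests the dots in the bucket name never reach the hostname, so they cannot affect certificate matching.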

There are two ways to fix this last problem, short of adding this custom header to all existing and future files:

The -o complement_stat option forces files without the magic header to have mode 0400 (user-readable) and directories 0500 (user-readable and -searchable).
The -o umask=0222 option (from FUSE) makes the files and directories world-readable (an improvement on complement_stat in my opinion) at the cost of marking all files executable (which they are not).
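The effect of the two options on permission bits can be checked with a little arithmetic. This sketch assumes, as FUSE's documentation describes it, that the umask option yields whichever bits of 0777 are missing from the given mask; the helper name is made up for illustration.

```python
import stat


def apply_fuse_umask(umask):
    """Permission bits left over after masking 0777 (assumed FUSE behaviour)."""
    return 0o777 & ~umask


# -o umask=0222: write bits are cleared, read and execute bits remain.
umask_mode = apply_fuse_umask(0o222)
print(oct(umask_mode))                               # 0o555
print(stat.filemode(stat.S_IFREG | umask_mode))      # -r-xr-xr-x
print(stat.filemode(stat.S_IFDIR | umask_mode))      # dr-xr-xr-x

# -o complement_stat: 0400 for files, 0500 for directories.
print(stat.filemode(stat.S_IFREG | 0o400))           # -r--------
print(stat.filemode(stat.S_IFDIR | 0o500))           # dr-x------
```

The execute bits on regular files are the cost mentioned above: a single mask applies to files and directories alike, and directories need the execute bit to be searchable.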

I think these are all things that s3fs could do by itself, by default, rather than requiring users to rummage through documentation and bug reports to work out what combination of flags to add. None of these were showstoppers; in the end it was a catastrophic memory leak (since fixed in a new release) that made me give up and switch to goofys.

Due to its relaxed attitude towards POSIX filesystem semantics where performance would otherwise suffer, goofys’ author refers to it as a “Filey System”. ((This inspired a terrible “joke”.)) In my testing, throughput is similar to s3fs, but walking the directory hierarchy is orders of magnitude faster. This is due to goofys making more assumptions about the bucket layout, not needing to HEAD each file to get its permissions (that x-amz-meta-mode thing is not free), and having heuristics to detect traversals of the directory tree and optimize for that case. ((I’ve implemented a similar optimization elsewhere in our codebase: since we have many "directories", it takes far fewer requests to ask S3 for the full contents of the bucket and transform that list into a tree locally than it does to list each directory individually.))
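The list-everything-then-build-a-tree idea from that footnote can be sketched in a few lines of Python. This is an illustration only, not the code from our codebase: the key names are hypothetical, and in practice the flat list would come from a paginated bucket listing rather than a literal.

```python
def build_tree(keys):
    """Turn a flat list of S3-style keys into a nested dict.

    Directories are dicts; files are leaves with the value None.
    """
    root = {}
    for key in keys:
        parts = key.split("/")
        node = root
        for directory in parts[:-1]:
            node = node.setdefault(directory, {})
        name = parts[-1]
        if name:  # keys ending in "/" are directory markers, not files
            node[name] = None
    return root


# Hypothetical keys, as one flat listing of the whole bucket might return them.
keys = [
    "README",
    "release/3.4.4/eos.img.xz",
    "release/3.4.5/eos.img.xz",
    "release/3.4.5/eos.img.xz.asc",
    "nightly/",
]

tree = build_tree(keys)
print(sorted(tree))                     # top-level entries
print(sorted(tree["release"]))          # "directories" under release/
print(sorted(tree["release"]["3.4.5"])) # files in one release
```

The tree is built with no further requests after the listing, whereas listing each directory individually costs at least one request per directory.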

For on-disk caching of file contents, goofys relies on catfs, a separate FUSE filesystem by the same author. It’s an interesting design: catfs just provides a write-through cache atop any other filesystem. The author has data showing that this arrangement performs pretty well. But catfs is very clearly labelled as alpha software (“Don’t use this if you value your data.”) and, as described in this bug report with a rather intemperate title, it was not hard to find cases where it DOSes itself or (worse) returns incorrect data. So we’re running without local caching of file data for now. This is not so bad, since this server only uses the file data for periodic synchronisation with mirrors: in day-to-day operation serving redirects, only the metadata is used.

With this set up, it’s plain sailing: write a little rsync configuration generator that uses the filter directive to publish only the last two releases (rather than forcing terabytes of archived images onto our mirrors), then set up Mirrorbits. Our Mirrorbits instance is configured with some extra “mirrors” for our CloudFront distribution so that users who are closer to a CloudFront-served region than to any real mirror are directed there; it could also have been configured to make the European mirrors (which they all are, as of today) only serve European users, and rely on its fallback mechanism to send the rest of the world to CloudFront. It’s a pretty nice piece of software.
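A generator along those lines might look like the following sketch. It is not our actual generator: the mount path /srv/images, the one-directory-per-release layout and the version-like directory names are all assumptions made for illustration. The module name public comes from the rsync URL above, and the filter parameter in rsyncd.conf takes a space-separated list of rules.

```python
import re


def version_key(name):
    """Sort key that compares the numeric parts of a release name as integers."""
    return tuple(int(number) for number in re.findall(r"\d+", name))


def latest_releases(names, count=2):
    """Return the `count` newest release names, oldest first."""
    return sorted(names, key=version_key)[-count:]


def render_module(module, path, releases, count=2):
    """Render one rsyncd.conf module that exposes only the newest releases."""
    keep = latest_releases(releases, count)
    # "+ /NAME/***" admits a directory and everything below it;
    # the final "- /*" hides every other top-level entry.
    rules = [f"+ /{name}/***" for name in keep] + ["- /*"]
    lines = [
        f"[{module}]",
        f"    path = {path}",
        "    read only = yes",
        f"    filter = {' '.join(rules)}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical release directories found under the mounted bucket.
releases = ["3.10", "3.2", "3.3"]
config = render_module("public", "/srv/images", releases)
print(config)
```

Sorting on integer tuples rather than strings matters here: a plain string sort would place 3.10 before 3.2 and publish the wrong pair.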

If you’ve made it this far, and you operate a mirror in Southeast Asia, South America or Africa, we’d love to hear from you: our mission is to improve access to technology for people in these areas.
