Continues Integration in Librsvg, Part 3

Caching stuff

Generally 5min/job does not seem like a terribly long time to wait, but it can add up really quickly when you add couple of jobs to the pipeline. First let’s take a look where most of the time is spent. First of jobs currently are spawned in a clean environment, which means each time we want to build the Rust part of librsvg, we download the whole cargo registry and all of the cargo dependencies each time. That’s our first low hanging fruit! Apart from that another side-effect of the clean environment is that we build librsvg from scratch each time, meaning we don’t make use of the incremental compilation that modern compilers offer. So let’s get started.

Cargo Registry

According to the cargo docs there’s a cache of the registry is stored in $CARGO_HOME(default under $HOME/.cargo). Gitlab-CI though only allows you to cache things that exists inside your projects root (it does not need to be tracked by git). So we have to somehow relocate $CARGO_HOME somewhere where we can extract it. Thankfully that’s as easy as setting $CARGO_HOME to our desired path.

.test_template: &distro_test
  before_script:
    - mkdir -p .cargo_cache
    # Only stuff inside the repo directory can be cached
    # Override the CARGO_HOME variable to force it location
    - export CARGO_HOME="${PWD}/.cargo_cache"
  script:
    - echo foo
 
  cache:
    paths:
      - .cargo_cache/

What’s new in our template above compared to part 1, are the before_script: and cache: blocks. In the before_script: first we create the .cacrgo_cache folder if it not exists (cargo is probably smart enough to not need this but ccache isn’t! So better be safe I guess). And then we export the new $CARGO_HOME location. Then in the cache: block we set what folder we want to cache. That’s it, now our cargo registry and downloaded crates should persist across builds!

Caching Rust Artifacts

The only thing needed to cache the rustc build artifacts is to add target/ in the cache: block. That’s it I am serious.

cache:
  paths:
    - target/

Caching C Artifacts with ccache

C and ccache on the other hand are a completely different story sadly. To that it did not contribute that my knowledge of C and build systems is approximating 0. Thankfully while searching I found another post from Ted Gould where he describes how ccache was setup for Inkscape.  The following config ended up working for librsvg‘s current autotools setup.

.test_template: &distro_test
  before_script:
    # ccache Config
    - mkdir -p ccache
    - export CCACHE_BASEDIR=${PWD}
    - export CCACHE_DIR=${PWD}/ccache
    - export CC="ccache gcc"

  script:
    - echo foo
  cache:
    paths:
      - ccache/

I got stuck on how to actually call gcc through ccache since it depends on the build system you use (see export CC). Shout out to Christian Hergert for showing me how to do it!

Cache behavior

One last thing, is that we want our each of our job to have an independent cache as opposed to a shared one across the pipeline. This can be achieved by using the key: directive. I am not sure how it works and I wish the jobs would elaborate a bit more. In practice the following line will make sure that each job on each branch will have it’s own cache. For more complex configurations I suggest looking at the gitlab docs.

cache:
    # JOB_NAME - Each job will have it's own cache
    # COMMIT_REF_SLUG = Lowercase name of the branch
    # ^ Keep different caches for each branch
    key: "$CI_JOB_NAME-$CI_COMMIT_REF_SLUG"

Final config and results

So here is the cache config as it exists today on Librsvg‘s master branch. This brought build of each job from ~4-5min, which we left it in part 2, to ~1-1:30min. Pretty damn fast! But what if you wanted to do a clean build or rule the out the possibility that the cache is causing bugs and failed runs? Well if you happen to use gitlab 10.4 or later (and GNOME is) you can do it from the Web GUI. If not you probably have to contact a gitlab administrator.

.test_template: &distro_test
  before_script:
    # CCache Config
    - mkdir -p ccache
    - mkdir -p .cargo_cache
    - export CCACHE_BASEDIR=${PWD}
    - export CCACHE_DIR=${PWD}/ccache
    - export CC="ccache gcc"

    # Only stuff inside the repo directory can be cached
    # Override the CARGO_HOME variable to force it location
    - export CARGO_HOME="${PWD}/.cargo_cache"
  script:
    - echo foo

  cache:
    # JOB_NAME - Each job will have it's own cache
    # COMMIT_REF_SLUG = Lowercase name of the branch
    # ^ Keep diffrerent caches for each branch
    key: "$CI_JOB_NAME-$CI_COMMIT_REF_SLUG"
    paths:
      - target/
      - .cargo_cache/
      - ccache/

Leave a Reply

Your email address will not be published. Required fields are marked *