Instantaneous CI

It's been a while since I thought deeply about CI/CD. Ever since a stint working on OpenDev, I've held the opinion that Zuul is far and away the best tool in the space in terms of its power and flexibility.

Unfortunately for me, I spend most of my time working with inferior tooling like Jenkins or, less hatefully, GitLab CI.

GitLab CI is fine, and I'd likely reach for it when looking for a CI system which makes it trivial to throw together a basic pipeline. However, it ties you to GitLab itself, which is increasingly a problem for me.

In general I dislike the concept of Git "forges", which is a topic for a different post, but GitLab specifically is well on the way to having entirely jumped the shark.

Earlier this year some discussions with colleagues got me thinking about CI again. Having spent a significant amount of time working on a Remote Execution API server implementation, my thoughts quickly headed down the road of how to utilise the REAPI in CI.

What I want in a CI system

Before going off into the REAPI weeds, I gave some thought to what I would want out of a CI system, coming up with the following list.

  • Declarative configuration, stored in the Git repo it applies to
  • A concept like a pipeline, to group a set of jobs in some kind of order
  • Simple config for a basic pipeline
  • Local-first design, so that jobs can be invoked locally just as easily as on some remote server
  • Some way to cache environment construction
  • Speculative execution of pipelines for parallel changes
  • Support for manual success/failure decisions for jobs
  • Job artifacts
  • Forge-agnostic (ideally usable without git at all)

Some of these are self-explanatory and obviously good, but others deserve a bit more detail.

Local-first design

In most existing CI systems, local debugging of failed jobs is a bit of a pain. With GitLab CI for example, you need to run your command in whatever container the job runs in, and hope that you didn't use any CI variables. Debugging config issues is similarly trying, with a lack of good linting tools and no good way to test jobs themselves pre-merge.

I think this is backwards; a CI system should be invokable by a developer before they've even pushed their code anywhere, in much the same way as any other developer tooling (e.g. just directly running the test suite, or building the project). This is the single most crucial aspect to me: if a CI system makes it hard to debug and reproduce issues locally, it is not an improvement on existing options.

An interesting exception to this common rule is Ambient, which also shares some of the other goals I set out above.

Caching environment construction

A common pattern in CI is a build job using a tool like BuildStream, which uses a cache to avoid expensive rebuilds. This speeds up the build job, ideally to the point where the build takes only as long as fetching a result from the cache.

However, the environment to execute this job is constructed outside of that build tool, meaning we can spend several minutes preparing an environment for a job which then takes less than a second to actually execute. This is obviously wasteful.

Speculative execution

Zuul supports speculative testing of approved changes: each approved change is tested as if all earlier approved but unmerged changes had passed their tests.

Whilst I didn't set out post-approval merge gating as a high-level requirement (mainly because in practice this feels like a feature I want in the review tool instead), this kind of speculative execution could still be useful to minimise CI duration when rebasing approved PRs.

Manual success/failure

Another use-case which isn't well served by existing tooling is CI jobs which produce some artifact which needs subsequent validation by a human. Obviously you can get some of the way to this by just having reviewers inspect job output, but I want an explicit validation step as part of gating merge behind a successful CI run.

Remote Execution API

Before we go any further, I think it is worthwhile touching on the basics of the Remote Execution API.

Originally from Bazel, this gRPC API provides a standardised approach to remote execution and caching of commands, usually build tool or compiler invocations. These days the API is used by many more tools than just Bazel, from similar tools like Buck2, through lower-level compiler wrappers like recc, to integration tools like BuildStream, and has a similarly wide variety of server implementations.

Concepts

There are three main services in the REAPI: Execution, ActionCache, and ContentAddressableStorage. The service names are pretty self-explanatory.

ContentAddressableStorage (CAS) provides content-addressed blob storage for storing command inputs and outputs, execution environments, serialised protobuf messages, and whatever else you want to put in there.

Execution is where the actual remote execution happens. An Execute request references an Action by its digest, which points to a serialised Action message in the CAS. This Action message contains all the information needed to execute a command: the command itself, the working directory, and other details about the required execution environment.

The service then arranges for the command described in that Action to be executed somewhere, and eventually returns the result in the form of an ActionResult message.

ActionCache provides a cache layer mapping the digest of Action messages to their corresponding ActionResult. This service provides a distributed cache for clients, allowing execution results to be widely reused when possible.
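To make that flow concrete, here is a minimal sketch of the client side in Python. It assumes stubs generated from the REAPI protos (remote_execution_pb2 and friends), a server on localhost:50051 with an empty instance name, and an Action whose serialised form and inputs have already been uploaded to CAS; the digest values are placeholders.

import grpc
from build.bazel.remote.execution.v2 import remote_execution_pb2 as re_pb2
from build.bazel.remote.execution.v2 import remote_execution_pb2_grpc as re_grpc

channel = grpc.insecure_channel("localhost:50051")

# Digest of a serialised Action message already in CAS (placeholder values).
action_digest = re_pb2.Digest(hash="0123...", size_bytes=142)

# First, ask the ActionCache whether this Action has already been executed.
action_cache = re_grpc.ActionCacheStub(channel)
try:
    result = action_cache.GetActionResult(
        re_pb2.GetActionResultRequest(action_digest=action_digest))
except grpc.RpcError as err:
    if err.code() != grpc.StatusCode.NOT_FOUND:
        raise
    # Cache miss: ask the Execution service to run the Action. Execute
    # returns a stream of long-running Operations; the final one carries
    # an ExecuteResponse containing the ActionResult.
    execution = re_grpc.ExecutionStub(channel)
    for operation in execution.Execute(
            re_pb2.ExecuteRequest(action_digest=action_digest)):
        last_operation = operation
    response = re_pb2.ExecuteResponse()
    last_operation.response.Unpack(response)
    result = response.result

print("exit code:", result.exit_code)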

Remote Asset API

In addition to the core REAPI, there is a companion API which we'll be making extensive use of later on.

The Remote Asset API defines a Fetch service and a Push service, which function kind of like a more generic version of the ActionCache from REAPI.

Fetch provides methods which take one or more URIs, plus some key/value pairs to qualify them further, and return a corresponding CAS digest which can be used to retrieve the blob or directory represented by the URI.

Push does the inverse, taking the URIs and qualifiers along with a CAS digest and storing a mapping between the two for later retrieval using Fetch.
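As a sketch of what Fetch looks like on the wire, here is a hypothetical Python client resolving a container image to a Directory in CAS. The module paths follow the remote-apis protos, and the docker:// URI scheme and the resource_type qualifier value are assumptions; real deployments may expect different conventions.

import grpc
from build.bazel.remote.asset.v1 import remote_asset_pb2 as asset_pb2
from build.bazel.remote.asset.v1 import remote_asset_pb2_grpc as asset_grpc

channel = grpc.insecure_channel("localhost:50051")
fetch = asset_grpc.FetchStub(channel)

# Ask the Fetch service to resolve a URI (plus qualifiers) to a CAS digest.
response = fetch.FetchDirectory(asset_pb2.FetchDirectoryRequest(
    uris=["docker://alpine:3.22.4"],
    qualifiers=[asset_pb2.Qualifier(
        name="resource_type", value="container-image")],
))
print("root directory digest:", response.root_directory_digest.hash)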

Mapping REAPI concepts onto CI

I started by thinking about integrating trexe into a CI pipeline, making a local trexe invocation identical to a CI job and giving cache hits for previously executed jobs with unchanged inputs.

That idea is a good start, but it feels a bit like shoehorning a tool in rather than a neat solution. In an ideal world, the CI tool itself would know how to use REAPI directly when constructing its pipeline graph.

Distilled down to the basic concepts, CI systems aren't dissimilar to build or integration tools like Bazel and BuildStream. All of these essentially build a graph of tasks to be executed, and attempt to optimally schedule their execution.

A job in a typical CI system isn't much different to a REAPI Action, specifying a command to execute in a particular execution environment. REAPI ActionResults give us everything we might want from the result of a CI job.
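To illustrate the correspondence, here is a rough sketch of a CI job rendered as REAPI messages in Python, again assuming generated proto stubs; digest_of is a small helper written here for the example, and sandbox_root_digest stands in for the digest of the job's execution environment in CAS.

import hashlib
from build.bazel.remote.execution.v2 import remote_execution_pb2 as re_pb2

def digest_of(message):
    # REAPI digests are (by default) the SHA-256 of the serialised message.
    data = message.SerializeToString()
    return re_pb2.Digest(hash=hashlib.sha256(data).hexdigest(),
                         size_bytes=len(data))

# A CI job's command and environment map directly onto a Command message...
command = re_pb2.Command(
    arguments=["echo", "hello"],
    environment_variables=[re_pb2.Command.EnvironmentVariable(
        name="PATH", value="/usr/local/bin:/usr/bin:/bin")],
)

# ...and the job as a whole onto an Action, whose input root describes the
# execution environment (placeholder digest here).
sandbox_root_digest = re_pb2.Digest(hash="abcd...", size_bytes=99)
action = re_pb2.Action(
    command_digest=digest_of(command),
    input_root_digest=sandbox_root_digest,
)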

CAS gives us something not dissimilar to GitLab's artifacts for free, especially when we can map a URI to the otherwise content-addressed blobs stored there.

REAPI caching gives us the environment construction caching I want, and also gives us something even better: the opportunity for instantaneous CI.

Instantaneous CI

Using a shared remote cache in a build tool lets us get an approximately instantaneous result when building something that somebody else already built.

What if we take the same idea and apply it to continuous integration?

With a shared remote cache of CI jobs, we get an instantaneous result for any job that has already run against a given set of input files. By modelling our CI jobs as REAPI Actions, we get both a way to schedule work across a compute farm and this remote caching for free.

This ties in with a local-first CI system. To achieve an instantaneous pass/fail in CI, we need to execute the jobs before we inspect their results to vote on an MR.

The obvious way to do this is to have users invoke them as part of development. I can see a world where a pre-commit hook invokes CI jobs, so that by the time the code is actually pushed and a pull request raised the actual CI run is mostly or all cached. We could even build a kind of test-on-save tool, moving the invocation as close to the time of change as possible.
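As a trivial sketch of the hook idea, a pre-push variant (assuming cicada is installed and on PATH) could be as simple as:

#!/usr/bin/env python3
# .git/hooks/pre-push: run the merge-request pipelines before pushing, so
# their results are already cached by the time the server-side CI runs.
import subprocess
import sys

sys.exit(subprocess.run(["cicada", "ci"]).returncode)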

This caching also means we get cache hits on post-merge pipelines, since the actual contents of the git repo are the same pre- and post-merge. This might not always be desired, e.g. a release job, but can be a nice way to speed things up.

Cicada

Having realised the potential of a REAPI-native CI system, I decided to build one. Cicada is the result: a minimal prototype which integrates with BuildGrid and Forgejo.

Cicada depends on the worker sandboxing functionality provided by BuildBox in order to construct execution environments, so isn't trivially portable between REAPI server implementations.

At the time of writing, the prototype of Cicada is hardcoded to expect a quickstart BuildGrid setup running locally.

How it works

Cicada is configured by adding a .cicada.yml file to a directory.

This configuration file defines sandboxes, jobs, and pipelines. These jobs and pipelines operate on the contents of the directory which contains the config file.

Let's look at a contrived example.

sandboxes:
  - name: alpine
    uri: alpine:3.22.4
    network: false
    environment:
      - "PATH=/usr/local/bin:/usr/bin:/bin"

jobs:
  - name: hello-world
    sandbox: alpine
    command: echo hello

pipelines:
  - name: validation
    trigger: merge-request
    jobs:
      - hello-world

sandboxes

This is a list of execution environments, and is how you provide the tools your jobs need. At the moment uri must be a resolvable container image name or URI. In the future, I want this to support arbitrary things, as long as there is a sensible way to resolve them to a Directory message stored in CAS.

jobs

The individual tasks that need to run. This might be a build command, a test command, or something else entirely. Jobs can also have outputs, which is a list of paths to upload to CAS after executing the command, and depends, which is a list of job names which must be executed before executing this job.

Outputs of dependencies are available in the working directory used to execute a job.
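Based on that description, a hypothetical build job feeding its output into a test job might look like this (job names, commands, and paths invented for illustration):

jobs:
  - name: build
    sandbox: alpine
    command: make dist
    outputs:
      - dist/app.tar.gz
  - name: test
    sandbox: alpine
    command: tar -tf dist/app.tar.gz
    depends:
      - build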

These jobs become REAPI Actions, with their results cached and reused as often as possible. Any job will ideally execute at most once against a given set of inputs.

pipelines

A way to group multiple jobs together, and define triggers to run them. The only trigger currently implemented is merge-request.

The "pipeline" metaphor doesn't fully make sense here; the jobs key only loosely implies ordering. In practice, jobs will be executed with the maximum possible parallelism based on their depends.

Using Cicada

Once you have a configuration file, you can invoke Cicada on the command line. The cicada run command takes the name of a job or pipeline and executes it and any dependencies remotely.

cicada run hello-world  # invoke `hello-world` and its depends
cicada run validation   # invoke all jobs in the `validation` pipeline

cicada ci simulates a CI invocation. This executes all pipelines with the merge-request trigger. If you run this before pushing your code, Cicada will be able to instantly approve or reject your PR when it is pushed.

# Equivalent to `cicada run validation` in this example
cicada ci

Finally, cicada listen starts an HTTP server which listens for webhook events from Forgejo. This is intended for use on your CI orchestrator, not for local invocation.

Two event types are handled: push and pull-request. On push, the repository is cloned and pushed into CAS to save on upload time later.

On pull-request, all pipelines with the merge-request trigger are executed, the same as a local cicada ci invocation.
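For a rough idea of the shape of such a listener, here is a minimal Python sketch. It assumes Forgejo-style webhooks with an X-Forgejo-Event header and JSON payloads, and elides the actual clone/execute logic; it is not how Cicada itself is implemented.

from http.server import BaseHTTPRequestHandler, HTTPServer
import json

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        # The payload carries repository details such as the clone URL.
        payload = json.loads(self.rfile.read(length))
        event = self.headers.get("X-Forgejo-Event")
        if event == "push":
            ...  # clone the repository and push its contents into CAS
        elif event == "pull_request":
            ...  # run every pipeline with the merge-request trigger
        self.send_response(204)
        self.end_headers()

HTTPServer(("", 8080), WebhookHandler).serve_forever()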

What's next?

Cicada is basically just a proof of concept for the idea of CI via REAPI.

Beyond obvious code cleanup, error handling, and configurability, there are quite a few ideas I want to explore in Cicada. In the rough order I currently plan to work on them:

  • Local replay/debugging/environment reconstruction/testing of jobs
  • Cacheability improvements via input restriction
  • Better artifact functionality
  • A web UI for viewing pipelines and jobs
  • Lazy environment construction
  • Speculative pre-merge CI execution when multiple PRs exist
  • Nested pipelines
  • Integration with other forges
  • Additional pipeline triggers
  • Conditional jobs
  • Job templating
  • Manual job pass/fail decisions

Hopefully I'll find time to write some more technically detailed posts as I experiment with some of these too.

I'd love to hear other people's thoughts on the ideas in this post. Feel free to contribute issues and PRs to the repository on Codeberg (no AI code please) or join me in #cicadaci on Libera.