Reproducible builds made easy: introducing StageX

Posted on 10/31/24 by Arnaud, Founding Engineer at Turnkey

This post is about Turnkey’s journey with reproducible builds. As mentioned in this other post, we don’t have a choice: our builds must be reproducible to secure TEE deployments and use remote attestations meaningfully. By reproducible, we mean that each time the build runs, given the same source code, it generates the same binary artifact, byte-for-byte, regardless of where or when it runs. Unfortunately, reproducible builds aren’t easy out of the box. After a brief reminder of why reproducible builds matter, we’ll survey the landscape of existing options available to us, show our first attempt at reproducible builds, which leveraged Debian containers, and explain why and how we’ve arrived at StageX: a new container-based, full-source-bootstrapped, reproducible, multi-party signed Linux distro. It simplifies reproducible builds considerably and supports all Turnkey builds today.

Reproducible builds transfer trust from code to binaries

When you pull a container image from DockerHub, who built it? How do we know this particular artifact matches the published source code and isn’t some malware pushed by someone who phished the credentials of a legit maintainer? It turns out we have no idea, and that is a problem.

A similar problem was highlighted by Ken Thompson in his 1984 paper “Reflections on Trusting Trust”, which describes how a malicious software compilation tool could tamper with any software compiled with that compiler, including other compilers.

The problem of having to trust compiled artifacts is everywhere: your operating system downloads packages, your code imports packages, your production servers pull docker images, and so on. When attackers succeed in sneaking bad third-party binaries or code into our systems, we talk about “Supply Chain Attacks”. It would be easy to fill pages of examples of real world supply chain attacks because they happen all the time. Here are a few famous examples: one, two, and most recently: three.

To avoid these attacks we need to verify binaries before using them. Unfortunately humans can’t directly verify them: they’re opaque! This is where reproducible builds help: they transfer trust from source code to binaries. Given an opaque binary and human-readable source code, anyone can:

  • Read the code and convince themselves it doesn’t do anything malicious.
    • Now they trust the code! ✅
  • Obtain a binary from the code with a reproducible build [1]
  • Compare this binary with the published binary (usually through a digest comparison)
    • Now they trust the binary! ✅

This wouldn’t be possible without a reproducible build. If the binary was different every time the build ran, only the person or machine who first published the binary would trust it. Without a reproducible build, trust can’t transfer from code to published binary. With a reproducible build, anyone can reproduce the binary and trust the published binary they’re about to download.
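The digest comparison in the last step can be sketched in a few lines of Python. This is a minimal illustration, not Turnkey's tooling: the function names and the stand-in artifact bytes are assumptions made up for the example, and a real check would read the rebuilt binary from disk.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def matches_published(rebuilt: bytes, published_digest: str) -> bool:
    """True if a locally rebuilt artifact hashes to the published digest."""
    return hmac.compare_digest(sha256_hex(rebuilt), published_digest.lower())


# Stand-in bytes for illustration only (hypothetical artifact contents).
rebuilt_artifact = b"\x7fELF example artifact bytes"
published = sha256_hex(b"\x7fELF example artifact bytes")
tampered = b"\x7fELF example artifact bytes with extra payload"

print(matches_published(rebuilt_artifact, published))  # True
print(matches_published(tampered, published))          # False
```

A single differing byte changes the digest, which is why the comparison only works when the build is reproducible byte-for-byte.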

Humble beginnings with Debian and Toolchain

The first version of reproducible builds at Turnkey used Debian containers as a base and Toolchain to build them in a reproducible way. The main idea behind Toolchain was to abstract away differences between build environments (such as user and group IDs, number of CPUs, timestamp and many others) with custom environment variables and system configuration baked into build processes via Makefile macros. This came with major downsides that slowed down developer productivity:

  • Repositories needed to keep costly snapshots of all dependencies in Git LFS or similar to be able to reproduce the exact build container. Otherwise the “latest” packages would shift over time, breaking reproducibility. This created a lot of friction for our team having to regularly archive, hash-lock, and sign hundreds of .deb files for every project.
  • Debian has very old versions of Rust, which we rely on heavily. This very frequently caused frustration when trying to upgrade external crates.
  • The builds themselves relied heavily on Makefile and macros. Most engineers are not familiar with this syntax; as a result debugging builds was really, really hard.
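To make the "differences between build environments" concrete, here is a small Python sketch of one such difference: archive metadata. It is not how Toolchain worked (Toolchain used Makefile macros and environment variables); the helper name `deterministic_tar` and the file contents are hypothetical. The idea is simply that timestamps, ownership, and file ordering must be pinned for two builds to produce identical bytes.

```python
import io
import tarfile


def deterministic_tar(files: dict[str, bytes], epoch: int = 0) -> bytes:
    """Pack `files` into a tar archive whose bytes depend only on the inputs."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):           # fixed ordering, independent of insertion order
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = epoch               # fixed timestamp instead of "now"
            info.uid = info.gid = 0          # fixed ownership instead of the builder's IDs
            info.uname = info.gname = ""
            info.mode = 0o644                # fixed permissions
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


first = deterministic_tar({"b.txt": b"two", "a.txt": b"one"})
second = deterministic_tar({"a.txt": b"one", "b.txt": b"two"})
print(first == second)  # True
```

Every piece of metadata left unpinned is a way for two otherwise identical builds to diverge, which is part of what made the Makefile-based approach laborious.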

After a few months with this setup, we concluded that something had to change. In the rest of this post we introduce StageX, a community effort which builds on classical Stage 0-3 compiler bootstrapping to produce a container-native, minimal, and reproducible toolchain.

Creating StageX: why not use X instead?

To achieve reliable reproducible builds we took a hard look at the available options around us to avoid building anything from scratch ourselves if we did not have to. This is a list of what we evaluated and why we ultimately rejected those options:

  • Alpine is the most popular distro in container-land and has made great strides in providing a minimal musl-based distro with reasonable security defaults. It is suitable for most use cases. However, in the interest of developer productivity and low friction for contributors, packages are only signed by centralized CI builder keys. This single point of failure makes it a non-starter for our own threat model.
  • Chainguard sounds great on paper (container-native!), but on closer inspection they built their framework on top of Alpine, which is neither signed nor reproducible, and Chainguard image authors do not sign commits or packages with their own keys. They double down on centralized signing with cosign and the SLSA framework to prove their centrally built images were built by a known trusted CI system. This is, however, only as good as those central signing keys and the people who manage them, which we have no way to trust independently.
  • Debian (and derivatives like Ubuntu) is one of the most popular options for servers, and also signs most packages. However, these distros are glibc-based with a focus on compatibility and desktop use cases. As a result they have a huge number of dependencies, partial code freezes for long periods of time between releases, and stale packages as various compatibility goals block updates.
  • Fedora (and RedHat-based distros) sign packages with a global signing key, similar to Chainguard, which is not great. They otherwise suffer from similar one-size-fits-all bloat problems as Debian with a different coat of paint. Their reliance on centralized builds has been used as justification for them to not pursue reproducibility, which makes them a non-starter for security-focused use cases.
  • Arch Linux has very fast updates as a rolling release distro. Package definitions are signed, and often reproducible, but they change from one minute to the next. Reproducible builds require pinning and archiving sets of dependencies that work well together for your own projects.
  • Nix is almost entirely reproducible by design and allows for lean and minimal output artifacts. It is also a big leap forward in having good separation of concerns between privileged immutable and unprivileged mutable spaces. However, Nix does not mandate contributor-level signing, so that any hobbyist can contribute with low friction.
  • Guix is reproducible by design, borrowing a lot from Nix [2]. It also does maintainer-level signing like Debian. It comes the closest to what we need overall (and this is what Bitcoin settled on!), but lacks the enforcement of multiple signatures for each package contribution. The dependency tree is large because of glibc, which makes retrofitting signature requirements or reproducibility an uphill battle.

Summarizing the above in a table:

Distro      OCI support [3]  Signatures  Libc [4]  Reproducible [5]  Bootstrapped
Alpine      Published        1 Bot       musl      No                No
Chainguard  Native           1 Bot       musl      No                No
Debian      Published        1 Human     glibc     Partial (96%)     No
Fedora      Published        1 Bot       glibc     No                No
Arch        Published        1+ Human    glibc     Partial (90%)     No
Nix         Exported         1 Bot       glibc     Partial (95%)     Partially
Guix        Exported         1+ Human    glibc     Partial (90%)     Yes
StageX      Native           2+ Humans   musl      Yes (100%)        Yes

This should speak for itself: the current candidates didn’t quite meet our bar. We wanted the musl-based container-ideal minimalism of Alpine, the obsessive reproducibility and full-source supply chain goals of Guix, and a step beyond the single-sig signed packages of Debian or Arch.
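To illustrate what "a step beyond single-sig" means in practice, here is a rough Python sketch of a threshold rule over independent build reports. It is an illustration only: the function name, the signer names, and the digests are hypothetical, and it deliberately skips cryptographic signature verification, which any real system would perform before counting a report.

```python
from typing import Optional


def agreed_digest(attestations: dict[str, str], threshold: int = 2) -> Optional[str]:
    """Return the digest all signers agree on, if at least `threshold` signers reported it.

    `attestations` maps a signer identity to the digest that signer obtained
    by rebuilding the package. Any disagreement is treated as a failure.
    """
    digests = set(attestations.values())
    if len(digests) != 1:            # no reports at all, or signers disagree
        return None
    if len(attestations) < threshold:
        return None
    return digests.pop()


# Hypothetical reports from independent maintainers.
print(agreed_digest({"alice": "abc123", "bob": "abc123"}))    # abc123
print(agreed_digest({"alice": "abc123"}))                     # None
print(agreed_digest({"alice": "abc123", "bob": "def456"}))    # None
```

With a single bot signer, compromising one key is enough to publish a malicious package. Requiring two or more humans to reproduce the same digest independently removes that single point of failure.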

How StageX works

StageX distributes packages as OCI containers. This allows hosting them just like any other images, on DockerHub [6], and allows for hash-locked pulls out of the gate. OCI is the only well-documented packaging standard with multiple competing toolchain implementations and multiple-signature support.

Because StageX packages are OCI images, using StageX’s reproducible Rust is a simple FROM away:

FROM stagex/rust@sha256:b7c834268a81bfcc473246995c55b47fe18414cc553e3293b6294fde4e579163

This forces a download of an exact image, pinned to a specific digest (b7c83426…). You can see existing signatures for this image at stagex:signatures/stagex/rust@sha256=b7c83426…, or reproduce it yourself from source with make rust. As a result you can trust that the Rust image you’re pulling comes from this Containerfile and contains nothing malicious, even if you pull it from an untrusted source. If the downloaded image is corrupted, its sha256 digest won’t match the pinned digest, and the build will error out.
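The pinning check that container tooling performs can be approximated in Python. This is a simplified sketch, not Docker's implementation: the function names are made up, and the "manifest" bytes are a stand-in. It relies on the fact that the digest after the @ in a pinned reference is the SHA-256 of the image manifest (or index) bytes.

```python
import hashlib


def parse_pinned_ref(ref: str) -> tuple[str, str, str]:
    """Split 'name[:tag]@sha256:<hex>' into (name, algorithm, hex digest)."""
    name, sep, digest = ref.partition("@")
    if not sep:
        raise ValueError(f"reference is not pinned to a digest: {ref!r}")
    algorithm, _, hex_digest = digest.partition(":")
    is_hex = all(c in "0123456789abcdef" for c in hex_digest)
    if algorithm != "sha256" or len(hex_digest) != 64 or not is_hex:
        raise ValueError(f"unsupported or malformed digest: {digest!r}")
    return name, algorithm, hex_digest


def verify_manifest(ref: str, manifest_bytes: bytes) -> None:
    """Raise ValueError if the downloaded manifest does not match the pinned digest."""
    _, _, expected = parse_pinned_ref(ref)
    actual = hashlib.sha256(manifest_bytes).hexdigest()
    if actual != expected:
        raise ValueError(f"digest mismatch: expected {expected}, got {actual}")


# The reference from the FROM line above parses cleanly.
name, algo, digest = parse_pinned_ref(
    "stagex/rust@sha256:b7c834268a81bfcc473246995c55b47fe18414cc553e3293b6294fde4e579163"
)
print(name, algo, digest[:8])  # stagex/rust sha256 b7c83426

# Stand-in manifest bytes (hypothetical), pinned by their own digest.
manifest = b'{"schemaVersion": 2}'
pinned = "example/image@sha256:" + hashlib.sha256(manifest).hexdigest()
verify_manifest(pinned, manifest)            # passes silently
try:
    verify_manifest(pinned, manifest + b" ")  # one extra byte
except ValueError:
    print("corrupted download rejected")
```

Because the manifest in turn lists the digests of every layer, checking this one hash transitively pins the whole image.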

StageX packages are all produced by a single Containerfile with multiple layers (build stages, in Containerfile terms):

  • base: sets environment variables, defines source code locations, and pins digests.
  • fetch: downloads source code in a hash-locked way over the network.
  • build: builds sources into artifacts, potentially bringing in dependencies (other StageX packages!) to do so. This is done with no network access.
  • install: places the binaries in the right location within the /rootfs directory
  • package: copies /rootfs to a final container. This is what StageX users import.

A good example to look at is the bash Containerfile: file locations and hashes are hardcoded in base, source code is downloaded in fetch (with --checksum), build untars the source code and calls ./configure and make, install calls install, and package exports the contents of /rootfs. If you’ve ever installed something from source on a Unix-based OS before, this should feel very familiar!

Creating Containerfiles for applications using StageX packages is no different than packaging applications with standard Docker images. The StageX README contains an example Containerfile to compile and run a basic Rust “hello, world!”, pasted here for convenience:

FROM scratch AS build

COPY --from=stagex/rust@sha256:b7c834268a81bfcc473246995c55b47fe18414cc553e3293b6294fde4e579163 . /
COPY --from=stagex/gcc:13.1.0@sha256:439bf36289ef036a934129d69dd6b4c196427e4f8e28bc1a3de5b9aab6e062f0 . /
COPY --from=stagex/binutils:2.43.1@sha256:30a1bd110273894fe91c3a4a2103894f53eaac43cf12a035008a6982cb0e6908 . /
COPY --from=stagex/libunwind:1.7.2@sha256:97ee6068a8e8c9f1c74409f80681069c8051abb31f9559dedf0d0d562d3bfc82 . /
COPY --from=stagex/musl:1.2.4@sha256:ad351b875f26294562d21740a3ee51c23609f15e6f9f0310e0994179c4231e1d . /
COPY --from=stagex/llvm:18.1.8@sha256:30517a41af648305afe6398af5b8c527d25545037df9d977018c657ba1b1708f . /
COPY --from=stagex/zlib:1.3.1@sha256:96b4100550760026065dac57148d99e20a03d17e5ee20d6b32cbacd61125dbb6 . /

COPY <<-EOF ./hello.rs
  fn main(){
    println!("Hello World!");
  }
EOF
RUN ["rustc","-C","target-feature=+crt-static","-o","hello","hello.rs"]

FROM scratch
COPY --from=build /hello .
ENTRYPOINT ["/hello"]

The structure of this file follows the “multi-layer” philosophy:

  • The build layer is responsible for compiling our source code (inlined with <<-EOF) into a “hello” binary
  • We then use a fresh FROM scratch layer to copy and expose this new binary as our default entry point.

Note the difference between build and the final layer: build requires llvm, binutils, zlib and many other dependencies to build our program, whereas our final container only contains the “hello” binary. This ensures the final image is as slim as possible.

To build it yourself, save this snippet as “Containerfile” somewhere, and run docker build . -t rust-hello -f Containerfile. This builds an image with the tag rust-hello. Once this is done, execute your hello world program by running the new image:

$ docker run -t rust-hello
Hello World!

Voilà! Turnkey applications are all built this way. As a result anyone can reproduce builds independently, and use remote attestations meaningfully when we deploy critical software into secure enclaves.

The invisible hard problems StageX resolved

The fact that StageX works is a miracle that could not have been possible without relying on other people’s work. Here we highlight a few of the big challenges.

Bootstrapping GCC

Do you know how to make yogurt? The first step is to add yogurt to milk!
Bootstrappable Builds Project

This was by far the thorniest issue to resolve. Many individuals and projects have contributed to solving it over the years. Carl Dong gave a talk about bootstrapping which rallied people to the effort started by the Bitcoin community. Guix recently proved it could bootstrap a modern Linux distribution, for which the Stage0 team and the GNU Mes team provided key ingredients. The bootstrappable builds and live-bootstrap projects glued it all together.

StageX follows in the footsteps of Guix and uses the same full-source bootstrap process, starting from hex0, a 190-byte seed of well-understood assembly code. This seed is used to compile kaem, “the world’s worst build tool”, in Stage 0. Stages 1, 2, and 3 build on this just enough to build gcc, which is used to build many other compilers and tools.

GCC to Golang

It is worth acknowledging the excellent work done by Google. They have documented this path well and provide all the tooling to do it. You only need 3 versions of golang to get all the way back to GCC. See stagex:packages/go.

Bootstrapping Rust

A given version of Rust can only ever be built with the immediately previous version. If you go down this chicken-and-egg problem far enough, you realize that in most distros the chicken comes first: most include a non-reproducible “seed” Rust binary presumably compiled by some member of the Rust team, use that to build the next version, and carry on from there. Even some of the distros that say their Rust builds are reproducible have a pretty major asterisk.

Thankfully John Hodge created mrustc, which implements a minimal semi-modern Rust 1.54 compiler in C++. It is missing a lot of critical features, but it supports enough of them to compile the official Rust 1.54 sources, which can compile Rust 1.55, and so on. This is the path Guix and Nix both went down, and StageX is following their lead, except using musl. A quick patch did the trick to make mrustc work with musl. See this in action for yourself at stagex:packages/rust.
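The "and so on" can be spelled out as a chain of builds. The Python sketch below only enumerates that chain; it does not reflect how stagex:packages/rust is actually scripted, and the target version 1.57 is an arbitrary stand-in chosen to keep the output short.

```python
def rust_bootstrap_chain(seed_minor: int, target_minor: int) -> list[str]:
    """List every Rust 1.x release from the seed version up to the target, inclusive."""
    if target_minor < seed_minor:
        raise ValueError("target version must not be older than the seed version")
    return [f"1.{minor}" for minor in range(seed_minor, target_minor + 1)]


# Rust 1.54 is the version built from source with mrustc; each release then builds the next.
chain = rust_bootstrap_chain(54, 57)
print(f"mrustc builds rustc {chain[0]}")
for previous, following in zip(chain, chain[1:]):
    print(f"rustc {previous} builds rustc {following}")
```

Each hop is a full compiler build, so the length of this chain is a large part of why bootstrapping Rust from source is expensive.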

Reproducible NodeJS (!!)

NodeJS was never designed with reproducible builds in mind. Through extensive discussion with the maintainers and a lot of effort, NodeJS is now packaged in StageX: packages/nodejs. This is (to our knowledge) an industry first.

Building a Community

StageX is only possible because a few dozen people around the world have collectively decided to address the massive supply chain risks that threaten everything we do on the internet.

While Turnkey, Mysten and Distrust provided the funding that brought StageX to life, it has only been possible to hit this level of quality by being open-source and receiving feedback from external entities and individuals that share similar requirements to ours.

For this reason all contributing parties agreed StageX should be a standalone project hosted by the open-source community. Anyone is free to add any packages useful to them that meet or exceed the current security standards in place today.

You can find the StageX repo at https://codeberg.org/stagex/stagex. The repository is hosted by Codeberg, a non-profit deployment of Forgejo, which is itself open source and has correct code signing enforcement, which GitHub currently lacks. Repo ownership is currently shared by contributing Turnkey engineers and trusted members of the open-source community.

Our Matrix room is #stagex:matrix.org and the team is actively looking for constructive feedback, improvements in various areas, and package maintainers. If you have access to beefy desktops or servers, consider building and co-signing all new packages to prove no one is tampering with them!

Acknowledgements

This blog post started as a document authored by Lance Vick, who founded StageX, got to the initial MVP and helped build the community that now runs it day-to-day. Plain and simple: StageX is Lance’s baby. Couldn’t have happened without him. StageX stands on the shoulders of many others, among them: Carl Dong and Bitcoin, the Stage0 team, the GNU Mes team, the bootstrappable builds and live-bootstrap projects, the Guix team, the Docker and OCI teams, and all the many maintainers and contributors that are constantly maintaining, reproducing, and improving StageX so it can build everything and anything reproducibly.

A big THANK YOU to Lance Vick, Michael Avrukin, and Andrew Min for reviewing drafts of this blog post and providing great comments and suggestions along the way!


  1. Of course it doesn’t help if the reproducible build process uses malicious tools! These tools must be built reproducibly as well, and the tools with which the tools are built must be built reproducibly as well, and so on. If you go far enough you’ll end up at GCC and the problem of bootstrapping. More on that in later sections of this blog post!

  2. It turns out Guix is not 100% reproducible either and is in a similar position to Nix. Packages that include binary blobs, such as firmware blobs, are just copied directly. Hitting 100% reproducibility may take ages, particularly with no forcing function.

  3. Whether a distro is natively based around the composability and layering of Containerfiles (“native”), can be used to create an OCI or Docker container from its package manager (“exported”), or has images published that can be used as the base for a Containerfile (“published”).

  4. Whether the distro and resulting artifacts are built against musl or glibc.

  5. See https://reproducible-builds.org/citests/. The statistic we care about the most is the distribution as a whole, meaning a combination of “core” packages as well as “extra” or “community”. Multiple architectures, however, are not yet considered. Fedora and Alpine were previously listed on the Reproducible Builds site, but their entries have not been maintained, and as such are marked not reproducible.

  6. We’re thankful for the Docker team’s help: they advised on finding a path to fully reproducible OCI images and offered unlimited free bandwidth to upload and host StageX images.