
Conversation

mjpieters
Contributor

The cache mounts are cached using the standard GitHub Actions cache when building in the CI pipeline.

Note that the build stage no longer contains the whole source tree; the sources are instead mounted into the build container when building, to avoid invalidating cached build container layers.
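For illustration, the bind mounts look roughly like the following sketch (it mirrors the review snippets further down in this thread; the exact Dockerfile lines in the PR may differ):

RUN \
  # Bind-mount the sources instead of COPYing them into the build stage, so the
  # stage never bakes in the whole source tree:
  --mount=type=bind,source=crates,target=crates \
  --mount=type=bind,source=Cargo.toml,target=Cargo.toml \
  --mount=type=bind,source=Cargo.lock,target=Cargo.lock \
  cargo zigbuild --release --bin uv --bin uvx --target "$(cat rust_target.txt)"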

@mjpieters mjpieters marked this pull request as ready for review January 30, 2025 18:07
@zanieb
Member

zanieb commented Jan 30, 2025

Is this just for Docker image build performance?

@zanieb
Member

zanieb commented Jan 30, 2025

See also, previous discussion at #3372

@mjpieters
Contributor Author

Is this just for Docker image build performance?

Yes, both locally and in CI workflows. I had completely missed the other PR 🤦 so thanks for the reference! I'll review the discussion and see whether this needs closing or updating, or if I should change the merge target to that PR.

@zanieb
Member

zanieb commented Jan 30, 2025

I think that one might have been a little more aggressive about refactoring. I'm more inclined to just accept cache mounts. The big caveat there is probably captured by #3372 (comment), with the notable update that we're now using Depot runners in our org! So... we could set up Depot for Docker builds.

@mjpieters
Contributor Author

mjpieters commented Jan 30, 2025

Yeah, the author of that PR clearly has a lot of Rust-in-Docker build knowledge, more than I have.

Bottom line: this PR makes an incremental change with the aim of improving build time here in the GH pipeline. If you are moving the build to a different platform, however, then there is not much value in merging this one.

Dockerfile Outdated
Collaborator

@samypr100 samypr100 Jan 31, 2025


Note: see #3372 for context as to why some of these changes haven't been done.

(edit: whoops, hadn't seen this was already mentioned in #11106 (comment))

@mjpieters

This comment was marked as outdated.

@mjpieters
Contributor Author

I've now refactored the cache mounts to be simpler and (hopefully) more effective in the GitHub pipeline.

All in-docker build tool caches now live in a single /buildkit-cache directory. This includes the global caches for cargo, rustup, pip, zig and cargo-zigbuild. This is keyed just on the Dockerfile hash, because these can trivially be re-used even if the Cargo.lock file changes.

The /root/target cache mount is no longer cached or restored in the GitHub pipeline, but it is still hugely helpful when building the Docker container locally, as it massively speeds up incremental builds in that case. Propagating the cache to the GitHub Actions cache was not effective, however: when the Cargo.lock, Cargo.toml and crates file tree are exactly the same, GitHub has already cached that specific Docker layer, so the whole build step is skipped anyway.
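For illustration, directing the tool caches into that single mount might look roughly like this (the environment variables and cache id below are assumptions based on the paths discussed later in the review, not a quote of the Dockerfile):

# Point each tool's global cache at a subdirectory of the shared mount,
# so a single cache id covers cargo, rustup and pip together.
ENV CARGO_HOME=/buildkit-cache/cargo \
    RUSTUP_HOME=/buildkit-cache/rustup \
    PIP_CACHE_DIR=/buildkit-cache/pip

RUN --mount=type=cache,target=/buildkit-cache,id="tool-caches" \
    pip install cargo-zigbuild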

@mjpieters
Contributor Author

mjpieters commented Jan 31, 2025

I force-pushed, and this does make the build faster, by about 90-120 seconds. Compare the times for the latest release:

I've since pushed another small update that adds a rustup self update, to use the cached version of rustup when available instead of re-downloading it and having to silence rustup's complaints about the cached cargo and rust toolchain.

@polarathene

Hello 👋 I only just got notified about this alternative PR to mine when mine was rejected 😓

The /root/target cache mount is no longer cached or restored in the GitHub pipeline, but it is still hugely helpful when building the Docker container locally, as it massively speeds up incremental builds in that case. Propagating the cache to the GitHub Actions cache was not effective

Last I recall, the target cache does not work well with CI systems because cargo relies on mtime for cache invalidation. IIRC it was related to the source files' mtime, so git checkouts (which don't record mtime in commits) would be a mismatch to the target cache each time, preventing cache re-use.

There was talk upstream about handling that differently, but last I heard this had not changed 🤔 (if it has, awesome).

So it may only be beneficial for local builds, not for the GHA runners. Build layers being skipped when there are no changes is a different mechanism: that's the Docker layer cache, where a layer is reused if its inputs haven't changed and a cached layer exists for them, and is invalidated otherwise.

So your cache mount just never got used in that scenario in the first place, and on GitHub, with or without that cache mount, the cargo caveat means the target cache won't be used; hence why it always seems ineffective there versus locally 😅

You could look into sccache to work around the issue in the meantime, but IIRC there are some caveats to be aware of where cache may accidentally be reused (unless something has changed since I last read about that issue).
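Purely as a hedged sketch of that sccache idea (none of this is in the PR; RUSTC_WRAPPER and SCCACHE_DIR are sccache's documented environment variables, the paths here are made up):

# Route rustc invocations through sccache so compiled artifacts can be reused
# even when cargo's mtime-based reuse of target/ fails in CI.
RUN cargo install sccache --locked
ENV RUSTC_WRAPPER=sccache \
    SCCACHE_DIR=/buildkit-cache/sccache

RUN --mount=type=cache,target=/buildkit-cache,id="tool-caches" \
    cargo zigbuild --release --bin uv --bin uvx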


@polarathene polarathene left a comment


My full review feedback is a bit verbose, but unless you need more details, here's a TLDR and summary of each concern in the feedback comments :)

  • Rustup cache mount is not compatible with concurrent builds unless you lock its access.
    • Alternatively (preferably) don't store the toolchain in a cache mount (where it can be evicted, breaking future builds), but without a minor refactor this will waste 1.4GB of disk when building both targets.
  • Use rustup toolchain install and install both targets (+135 MB weight for an AMD64 image to support ARM64 target).
    • This will also respect --profile minimal saving 600MB+, so it'll still use less disk overall.
  • Optionally improve troubleshooting build failures via setting SHELL.

Great to see your PR adopt the action support to persist cache mounts too btw 😎


The snippets below use HereDoc syntax as it's arguably far better to grok/maintain; feel free to keep the existing format instead for a smaller diff to assist review (HereDoc syntax could always be added as a follow-up PR).

Installing toolchain via rustup with concurrent builds

I don't think you should use a cache mount for rustup; at the very least, the toolchain + target(s) it installs should not be part of the cache mount. Keep them in the image like you do with zig and gcc.

If you do keep it with a cache mount, there is a not-so-obvious failure when doing concurrent platform builds that both want to write to the same location at once as they install their own copy of the toolchain. To prevent that you'd need to add ,sharing=locked to the cache mount options.

I'd personally just cache it in a layer and let updates to rust-toolchain.toml invalidate it:

RUN \
  <<HEREDOC
  curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --profile minimal --default-toolchain none

  echo 'targets = [ "aarch64-unknown-linux-musl", "x86_64-unknown-linux-musl" ]' >> rust-toolchain.toml
  rustup toolchain install
HEREDOC

If you're concerned about prior layers invalidating that, you could use a separate stage to minimize that concern and COPY --link the RUSTUP_HOME and any other necessary changes.
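A rough sketch of that separate-stage idea (the stage name and paths here are illustrative and assume the default /root/.rustup and /root/.cargo locations rather than a custom RUSTUP_HOME):

FROM --platform=$BUILDPLATFORM ubuntu AS rust-toolchain
RUN apt-get update && apt-get install -y --no-install-recommends ca-certificates curl
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs \
      | sh -s -- -y --profile minimal --default-toolchain none
# ... install the toolchain + musl targets here ...

FROM --platform=$BUILDPLATFORM ubuntu AS build
# Only these COPY layers are invalidated when the toolchain stage changes,
# rather than every layer following an inline rustup install.
COPY --link --from=rust-toolchain /root/.rustup /root/.rustup
COPY --link --from=rust-toolchain /root/.cargo /root/.cargo
ENV PATH="/root/.cargo/bin:${PATH}"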


Removing RUSTUP_HOME from cache mount can bloat disk usage by 1.4GB

If you remove the cache mount for rustup, you will encounter another concern: disk usage, since the layers for the AMD64 + ARM64 platforms diverge from the earlier ARG TARGETPLATFORM + RUN onwards.

Those two instructions can be removed so that the image only diverges by platform for the RUN that actually builds uv. All that changes is that instead of rust_target.txt, an ENV is used (CARGO_BUILD_TARGET, an actual cargo ENV) to store the build target; that's handled by updating the existing case statement.

Here's what that looks like:

ARG TARGETPLATFORM
RUN \
  # Use bind mounts to access Cargo config, lock, and sources; without needing to
  # copy them into a build layer (avoids bloating the docker build layer cache):
  --mount=type=bind,source=crates,target=crates \
  --mount=type=bind,source=Cargo.toml,target=Cargo.toml \
  --mount=type=bind,source=Cargo.lock,target=Cargo.lock \
  # Add cache mounts to speed up builds:
  --mount=type=cache,target=${HOME}/target/ \
  --mount=type=cache,target=/buildkit-cache,id="tool-caches" \
  <<HEREDOC
  # Handle platform differences like mapping target arch to naming convention used by cargo targets:
  # https://en.wikipedia.org/wiki/X86-64#Industry_naming_conventions
  case "${TARGETPLATFORM}" in
    ( 'linux/amd64' )
      export CARGO_BUILD_TARGET='x86_64-unknown-linux-musl'
      ;;
    ( 'linux/arm64' )
      export CARGO_BUILD_TARGET='aarch64-unknown-linux-musl'
      export JEMALLOC_SYS_WITH_LG_PAGE=16
      ;;
    ( * )
      echo "ERROR: Unsupported target platform: '${TARGETPLATFORM}'"
      exit 1
      ;;
  esac

  cargo zigbuild --release --bin uv --bin uvx --target "${CARGO_BUILD_TARGET}"
  cp "target/${CARGO_BUILD_TARGET}/release/uv" /uv
  cp "target/${CARGO_BUILD_TARGET}/release/uvx" /uvx
HEREDOC

Better troubleshooting with SHELL

Final note: an optional change that improves troubleshooting when stuff breaks is to add this SHELL instruction at the top of the Dockerfile:

FROM --platform=$BUILDPLATFORM ubuntu AS build
# Configure the shell to exit early if any command fails, or when referencing unset variables.
# Additionally `-x` outputs each command run, this is helpful for troubleshooting failures.
SHELL ["/bin/bash", "-eux", "-o", "pipefail", "-c"]

Comment on lines +45 to +50
( \
rustup self update \
|| curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --target $(cat rust_target.txt) --profile minimal --default-toolchain none \
) \
# Installs the correct toolchain version from rust-toolchain.toml and then the musl target
&& rustup target add $(cat rust_target.txt)

@polarathene polarathene Apr 29, 2025


Since no toolchain is installed at this point, the --target option seems redundant?

bash: rustup: command not found
info: downloading installer
info: profile set to 'minimal'
info: default host triple is x86_64-unknown-linux-gnu
info: skipping toolchain installation
warn: ignoring requested target: x86_64-unknown-linux-musl

Also, due to the copied rust-toolchain.toml, --profile minimal is ignored too. You could patch it like I did in my PR, since we don't need the extra components that'd otherwise be brought in (share/doc/rust/html is 600MB, for example). Granted, since this is going into a cache mount for you it's less noticeable, but it still contributes towards CI cache storage?


Given that you're using the same tool-caches mount and the image uses a FROM whose platform is tied to the native build host arch rather than the target, you might as well keep the layers shared between TARGETPLATFORM images the same here? (EDIT: I just noticed that, compared to my PR, your rust toolchain is stored in a cache mount, thus layer sharing won't improve much here.)

To do that, shift the earlier ARG TARGETPLATFORM block below this rustup one, and explicitly install both musl AMD64 + ARM64 targets. In fact, since the only usage for TARGETPLATFORM will be in that final RUN, you can completely avoid rust_target.txt.


@polarathene polarathene Apr 29, 2025


Suggested change
( \
rustup self update \
|| curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --target $(cat rust_target.txt) --profile minimal --default-toolchain none \
) \
# Installs the correct toolchain version from rust-toolchain.toml and then the musl target
&& rustup target add $(cat rust_target.txt)
<<HEREDOC
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --profile minimal --default-toolchain none
echo 'targets = [ "aarch64-unknown-linux-musl", "x86_64-unknown-linux-musl" ]' >> rust-toolchain.toml
rustup toolchain install
HEREDOC

NOTE: In my older PR I also set the profile to minimal:

echo 'profile = "minimal"' >> rust-toolchain.toml

This is required if you run another rustup command like rustup target add, but with the newer rustup toolchain install command, it actually respects the --profile minimal originally set as a fallback.

rustup toolchain install is intended to be the proper approach (requires Rustup 1.28.0+, released in March 2025) to installing the toolchain from rust-toolchain.toml, rather than implicitly installing when using other rustup commands. So this should help justify preferring the switch 👍

@polarathene

A minor improvement from my rejected PR was to fail early, such as with pipelines like curl ... | sh ....

You'd add this SHELL instruction at the top of the file:

FROM --platform=$BUILDPLATFORM ubuntu AS build
# Configure the shell to exit early if any command fails, or when referencing unset variables.
# Additionally `-x` outputs each command run, this is helpful for troubleshooting failures.
SHELL ["/bin/bash", "-eux", "-o", "pipefail", "-c"]

I had some build failures when building the image locally; for RUN instructions with multiple chained commands, -x would have been a bit useful. It took a while for me to realize that the rustup issue I encountered was only reproducible with a cache mount being accessed concurrently 😓

Comment on lines +62 to 64
case "${TARGETPLATFORM}" in \
"linux/arm64") export JEMALLOC_SYS_WITH_LG_PAGE=16;; \
esac && \


Adjusted RUN content that makes the earlier ARG TARGETPLATFORM block redundant (so the ARM64 + AMD64 builds only diverge from the common image layers at this point of the build instead).

ARG TARGETPLATFORM
RUN \
  # Use bind mounts to access Cargo config, lock, and sources; without needing to
  # copy them into a build layer (avoids bloating the docker build layer cache):
  --mount=type=bind,source=crates,target=crates \
  --mount=type=bind,source=Cargo.toml,target=Cargo.toml \
  --mount=type=bind,source=Cargo.lock,target=Cargo.lock \
  # Add cache mounts to speed up builds:
  --mount=type=cache,target=${HOME}/target/ \
  --mount=type=cache,target=/buildkit-cache,id="tool-caches" \
  <<HEREDOC
  # Handle platform differences like mapping target arch to naming convention used by cargo targets:
  # https://en.wikipedia.org/wiki/X86-64#Industry_naming_conventions
  case "${TARGETPLATFORM}" in
    ( 'linux/amd64' )
      export CARGO_BUILD_TARGET='x86_64-unknown-linux-musl'
      ;;
    ( 'linux/arm64' )
      export CARGO_BUILD_TARGET='aarch64-unknown-linux-musl'
      export JEMALLOC_SYS_WITH_LG_PAGE=16
      ;;
    ( * )
      echo "ERROR: Unsupported target platform: '${TARGETPLATFORM}'"
      exit 1
      ;;
  esac

  cargo zigbuild --release --bin uv --bin uvx --target "${CARGO_BUILD_TARGET}"
  cp "target/${CARGO_BUILD_TARGET}/release/uv" /uv
  cp "target/${CARGO_BUILD_TARGET}/release/uvx" /uvx
HEREDOC

Comment on lines +43 to +44
RUN \
--mount=type=cache,target=/buildkit-cache,id="tool-caches" \


This RUN does not play well with concurrent writers when that tool-caches cache mount is used, causing builds to fail:

1.499 info: downloading component 'cargo'
1.790 error: component download failed for cargo-x86_64-unknown-linux-gnu: could not rename downloaded file from '/buildkit-cache/rustup/downloads/c5c1590f7e9246ad9f4f97cfe26ffa92707b52a769726596a9ef81565ebd908b.partial' to '/buildkit-cache/rustup/downloads/c5c1590f7e9246ad9f4f97cfe26ffa92707b52a769726596a9ef81565ebd908b': No such file or directory (os error 2)

While cargo might manage lock files to avoid this type of scenario, you need to be mindful of cache mount usage when it's not compatible with the default sharing=shared mount option.

# When using a Buildx container driver:
docker buildx create --name=container --driver=docker-container --use --bootstrap

# You can now build for multiple platforms concurrently:
docker buildx build --builder=container --platform=linux/arm64,linux/amd64 --tag localhost/uv .

To prevent this problem, use sharing=locked to block another build from writing to the same cache mount id, or run two separate build commands to build one platform at a time.
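For example, the sequential variant of the buildx invocation above (same builder, just one --platform per build) would look like:

# Build one platform at a time so only one writer touches the cache mount:
docker buildx build --builder=container --platform=linux/amd64 --tag localhost/uv .
docker buildx build --builder=container --platform=linux/arm64 --tag localhost/uv .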


While on the topic of cache mounts: this is a non-issue for the CI of a project where you only build the single Dockerfile your project maintains.

However, on user systems, AFAIK if that id is used in another project's Dockerfile, it also shares that cache. Sometimes that's a non-issue, but be mindful of accidentally mixing/sharing with other projects that shouldn't share a cache mount, due to concerns like invalidating each other's storage, conflicting write access (as seen here), or sharing=locked blocking a build of another project.
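As a hypothetical illustration, namespacing the id keeps the mount project-scoped (the id value below is made up, not from this PR):

RUN --mount=type=cache,target=/buildkit-cache,id="uv-tool-caches",sharing=locked \
    echo "cache mount id namespaced to this project"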


Suggested change
RUN \
--mount=type=cache,target=/buildkit-cache,id="tool-caches" \
RUN \
--mount=type=cache,target=/buildkit-cache,id="tool-caches",sharing=locked \

EDIT: As per the feedback in the next comment, I'm really not sure that storing the toolchain in a cache mount is a good idea? Rather than apply this fix, it may be better to just avoid the cache mount entirely (you'd then also have the ability to build the build stage and shell into it to troubleshoot building if need be; actually, maybe not, due to CARGO_HOME if you need zigbuild).


@polarathene polarathene Apr 29, 2025


I am not sure why the rust toolchain is stored in a cache mount while Zig and the other toolchains are left in the image layers? Is it to pair an update of rust-toolchain.toml bumping the toolchain with triggering rustup self update?

The COPY for rust-toolchain.toml would invalidate the RUN layer, so it would be updated just the same, no?

  • I could understand if you were sharing this cache mount with other Dockerfiles without common base layer sharing, but if those projects were configured with different toolchains they would likewise accumulate in cache storage (which is more prone to GC than an actively used layer)? Cleaning up unused layers is probably preferable; cache should really be used for actual cache (I think it's possible for a cache mount to be cleared between RUN instructions, which is not ideal for a toolchain).
  • The other possibility is CI image caching and wanting to minimize storage.
    • The bulk of your build time with this Dockerfile is the actual cargo build later on, so pulling from a CI cache blob or from the remote source (rustup, package manager, etc.) is not likely to be that much faster. Regardless, you're configuring persistence in CI via cache mounts; is that beneficial vs standard caching of image layers?
    • If you lose CI time to large cache import/export delays (e.g. due to de/compression), it may be faster to just not cache that portion of the image at all and do a clean build of it. Cache only what's helpful.

You will, however, benefit from the cache mount when building multiple targets separately (rather than multiple cargo build invocations in the same RUN):

  • This is only because of the earlier ARG TARGETPLATFORM introducing a divergence in the layer cache (1.3GB + 1.4GB to support both targets without a cache mount, even though the actual diff is only approx 200MB).
  • Since both targets build from the same build host arch, there's no concern about conflict there with the cache mount either 👍
  • It should be rare for earlier layer cache invalidation to really matter, but that would be a win for cache mounts. Personally I prefer immutable/predictable layer content vs an accumulating cache mount that, if I'm not mistaken, can be cleared during a build between layers (as cache is intended to be disposable).

That concern is easily fixed as per my suggestion for avoiding divergence at this point. The two added targets are 354MB combined. Total layer weight with the minimal profile is 930MB (instead of 1.6GB), be that layer cache or a cache mount.


Breakdown:

# Build (without `tool-caches` cache mount):
docker buildx build --builder=container --platform=linux/amd64 --tag localhost/uv --load .

# Inspect:
docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock ghcr.io/wagoodman/dive:latest localhost/uv

Sizes (paths under /buildkit-cache are within a cache mount):

  • 1.6GB (930MB minimal profile) => Rustup toolchain /buildkit-cache/rustup (also adds 19MB to sibling dir cargo/):
    • lib/rustlib/aarch64-unknown-linux-musl/lib (135MB) / lib/rustlib/x86_64-unknown-linux-musl/lib (219MB)
    • lib/rustlib/x86_64-unknown-linux-gnu/bin (18MB) + lib/rustlib/x86_64-unknown-linux-gnu/lib (158MB)
    • lib/libLLVM.so.19.1-rust-1.86.0-stable (174MB) + lib/librustc_driver-ea2439778c0a32ac.so (141MB)
  • 85MB => Pip cache /buildkit-cache/pip/http-v2
  • 258MB => Apt cache /var/cache/apt (220MB) + /var/lib/apt (48MB)
  • 310MB => Zig toolchain at /root/.venv/lib/python3.12/site-packages/ziglang
  • 527MB => Base image (78MB) + 13MB (python venv setup) + /usr (base package layer adds 436MB)

Image build time:

On a budget VPS (Fedora 42 at Vultr, 1vCPU + 2GB RAM with 3GB more via zram swap):

  • apt layer built within 37s
  • cargo-zigbuild install 12s
  • rustup setup 32s
  • cargo release build (x86_64), 2 hours 25 minutes.

The build took excessively long, presumably due to the single CPU and quite possibly RAM; I didn't investigate that too extensively. Changing from lto = "fat" to lto = "thin" brought the build time down to 43 minutes, at the expense of a 25% larger binary (40MB => 50MB).

You're getting much better reported results for the build, but the bulk of the time is down to the actual build. I'd avoid wasting CI cache storage (causing evictions sooner than necessary for cache items that are actually helpful) on the rust toolchain; saving a minute at best isn't worth it when the cache could be better used to optimize the build time itself (which requires sccache to be decent IIRC, and that's not without quirks).

That said, you can use the cache mounts in CI without uploading/restoring them, to minimize the image layer cache, but presently there is very little benefit in caching image layers at all? You could instead just focus on the cache mount(s) for the cargo build itself.

The cargo target cache is 1GB alone when building this project, but as mentioned it's a bit of a hassle to actually leverage for the CI.


After a build

For reference, the cargo and zig caches are decent in size, but a good portion of the cargo one isn't relevant, nor is the zigbuild cache mount worthwhile?

# Cargo:
$ du -shx /buildkit-cache/cargo
298M    /buildkit-cache/cargo
# Bulk is from registry dir:
$ du -hx --max-depth=1 /buildkit-cache/cargo/registry/
217M    /buildkit-cache/cargo/registry/src
33M     /buildkit-cache/cargo/registry/cache
26M     /buildkit-cache/cargo/registry/index
275M    /buildkit-cache/cargo/registry/

# Zig:
du -shx /buildkit-cache/zig
164M    /buildkit-cache/zig

# Zigbuild:
# Nothing worthwhile to cache? (plus it created another folder for itself):
du -shx /buildkit-cache/cargo-zigbuild/cargo-zigbuild/0.20.0
24K     /buildkit-cache/cargo-zigbuild/cargo-zigbuild/0.20.0

# Rustup for reference (before optimization):
$ du -hx --max-depth=1 /buildkit-cache/rustup
4.0K    /buildkit-cache/rustup/tmp
1.5G    /buildkit-cache/rustup/toolchains
8.0K    /buildkit-cache/rustup/update-hashes
4.0K    /buildkit-cache/rustup/downloads
1.5G    /buildkit-cache/rustup
# This was built without minimal profile applied + only x86_64 musl target:
$ du -hx --max-depth=1 /buildkit-cache/rustup/toolchains/1.86-x86_64-unknown-linux-gnu
20K     /buildkit-cache/rustup/toolchains/1.86-x86_64-unknown-linux-gnu/etc
1.4M    /buildkit-cache/rustup/toolchains/1.86-x86_64-unknown-linux-gnu/libexec
728M    /buildkit-cache/rustup/toolchains/1.86-x86_64-unknown-linux-gnu/share
73M     /buildkit-cache/rustup/toolchains/1.86-x86_64-unknown-linux-gnu/bin
679M    /buildkit-cache/rustup/toolchains/1.86-x86_64-unknown-linux-gnu/lib
1.5G    /buildkit-cache/rustup/toolchains/1.86-x86_64-unknown-linux-gnu

As per my PR attempt, the bulk of the cargo cache mount there is data that is quick to generate/compute at build time, and thus not worth persisting. I used two separate tmpfs mounts to filter those out:

  # These are redundant as they're easily reconstructed from cache above. Use TMPFS mounts to exclude from cache mounts:
  # TMPFS mount is a better choice than `rm -rf` command (which is risky on a cache mount that is shared across concurrent builds).
  --mount=type=tmpfs,target="${CARGO_HOME}/registry/src" \
  --mount=type=tmpfs,target="${CARGO_HOME}/git/checkouts" \

This is only relevant if storage of the cache mount is a concern, which it may be to keep within CI cache limits; otherwise it's overkill :)
