This repository was archived by the owner on Jun 3, 2025. It is now read-only.

Adding cached layers for kaniko builds #300

@priyawadhwa

@mattmoor had this super cool idea, which I've copied below for reference:

tl;dr FTL-style caching for kaniko

Today FTL elides recomputing the dependency layer by publishing an image like:

  gcr.io/mattmoor-images/image-to-publish/cache/python-blah-blah:<hash-of-stuff>

... when asked to publish: gcr.io/mattmoor-images/image-to-publish:foo-bar

<hash-of-stuff> includes the contents of requirements.txt, should also include the base image version,
and could include a timestamp (e.g. the current day) to enable some level of freshness.
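
As a rough illustration of how such a tag could be derived, here is a minimal Python sketch. The function name `cache_tag`, the field order, and the length-prefix encoding are assumptions made for this example; they are not taken from FTL.

```python
import hashlib
from datetime import date
from typing import Optional


def cache_tag(requirements: bytes, base_digest: str,
              day: Optional[date] = None) -> str:
    """Hypothetical helper: hash the inputs named above into one tag."""
    h = hashlib.sha256()
    day_field = day.isoformat() if day is not None else ""
    # Length-prefix every field so adjacent fields cannot run together.
    for part in (requirements, base_digest.encode(), day_field.encode()):
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.hexdigest()
```

Passing a day makes the tag roll over daily, which forces a rebuild of the dependency layer at least once per day; leaving it out keeps the cached layer until requirements.txt or the base image changes.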


The idea here is that kaniko would, prior to materializing FROM, fast-forward as far as it has cached:

  FROM ubuntu:latest           # This would be resolved to digest (first step in pull anyways)

  RUN apt-get update           # Check cache for hash(^^ digest, hash("apt-get update"))
  RUN apt-get install foo bar  # Check cache for hash(^^ hash, hash("apt-get install foo bar"))

  ADD baz /blah                # Check cache for hash(^^ hash, hash(relevant files))
  USER sockpuppet              # ...
  WORKDIR /app                 # ...

  RUN echo Hello World         # ...


If at any point we miss the cache, we treat the prior hit as the new "FROM" 
and begin evaluating from the miss.
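
The chained keys and the fast-forward step could be sketched roughly as below. This is Python for brevity, not kaniko code: `chain_key` and `fast_forward` are hypothetical names, and a plain dict stands in for the registry lookup.

```python
import hashlib
from typing import Dict, List, Sequence, Tuple

Step = Tuple[str, Sequence[str]]  # (command text, digests of relevant files)


def chain_key(parent_key: str, command: str,
              file_digests: Sequence[str] = ()) -> str:
    """hash(parent key, hash(command and any files it reads))."""
    step = hashlib.sha256(command.encode())
    for digest in file_digests:
        step.update(b"\0" + digest.encode())
    combined = parent_key + ":" + step.hexdigest()
    return hashlib.sha256(combined.encode()).hexdigest()


def fast_forward(base_digest: str, steps: List[Step],
                 cache: Dict[str, str]) -> Tuple[int, str]:
    """Return (index of the first miss, image to treat as the new FROM)."""
    key = base_digest
    image = base_digest
    for index, (command, files) in enumerate(steps):
        key = chain_key(key, command, files)
        cached = cache.get(key)
        if cached is None:
            # Miss: build from here, starting at the last cached image.
            return index, image
        image = cached
    return len(steps), image
```

Because each key folds in its parent key, a change to any earlier command or to the base digest changes every later key, so stale layers further down the Dockerfile are never matched.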

Phase two of this would be to enable the caching layer to simulate non-RUN operations 
(e.g. ADD/COPY/USER/WORKDIR) against the registry API without downloading the base image.
This would enable Dockerfiles like the following to iterate *very* rapidly
without ever downloading the base or cache (a la FTL):

  FROM ubuntu:latest           # Same digest, different day

  RUN apt-get update           # No change
  RUN apt-get install foo bar  # No change

  ADD baz /blah                # Oh noes, a change, but upload the layer and continue
  USER sockpuppet              # Metadata-only, post a new config
  WORKDIR /app                 # Metadata-only, post a new config
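
A metadata-only step can be modelled as an edit to the image config JSON that leaves the layer list untouched. The Python sketch below assumes an OCI-style config (`config.User`, `config.WorkingDir`, `history`, `rootfs.diff_ids`); the helper name `apply_metadata_step` is hypothetical, and uploading the new config to the registry is not shown.

```python
import copy
import posixpath


def apply_metadata_step(image_config: dict, instruction: str, arg: str) -> dict:
    """Return a new config with USER or WORKDIR applied; layers are untouched."""
    new_config = copy.deepcopy(image_config)
    cfg = new_config.setdefault("config", {})
    if instruction == "USER":
        cfg["User"] = arg
    elif instruction == "WORKDIR":
        # A relative WORKDIR is resolved against the previous working dir.
        current = cfg.get("WorkingDir") or "/"
        cfg["WorkingDir"] = posixpath.normpath(posixpath.join(current, arg))
    else:
        raise ValueError("not a metadata-only instruction: " + instruction)
    # Record the step in history without adding a filesystem layer.
    new_config.setdefault("history", []).append(
        {"created_by": instruction + " " + arg, "empty_layer": True}
    )
    return new_config
```

Since `rootfs.diff_ids` is never modified, the resulting image reuses every existing layer blob and only the config and manifest would need to be posted.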

I agree with Matt that a prototype of the first phase is a good place to start. Once we have a prototype, we can do some basic benchmarking comparing no-cache kaniko, cached kaniko, and regular "docker build".
