
Commit b1920d3

authored
Merge pull request #189 from ktock/docs-esgz
Fix docs to focus more on eStargz
2 parents 3f3c69a + 55bbe91 commit b1920d3

10 files changed

+323
-103
lines changed

README.md

Lines changed: 55 additions & 18 deletions
@@ -5,19 +5,41 @@
55

66
Read also introductory blog: [Startup Containers in Lightning Speed with Lazy Image Distribution on Containerd](https://medium.com/nttlabs/startup-containers-in-lightning-speed-with-lazy-image-distribution-on-containerd-243d94522361)
77

8-
Pulling image is one of the time-consuming steps in the container lifecycle. Research shows that time to take for pull operation accounts for 76% of container startup time[[FAST '16]](https://www.usenix.org/node/194431). *Stargz Snapshotter* is an implementation of snapshotter which aims to solve this problem by *lazy pulling* leveraging [stargz image format by CRFS](https://github.com/google/crfs). The following histogram is the benchmarking result for startup time of several containers measured on Github Actions, using Docker Hub as a registry.
8+
Pulling an image is one of the time-consuming steps in the container lifecycle.
9+
Research shows that the pull operation accounts for 76% of container startup time[[FAST '16]](https://www.usenix.org/node/194431).
10+
*Stargz Snapshotter* is a snapshotter implementation that aims to solve this problem by *lazy pulling*.
11+
*Lazy pulling* here means that a container can start without waiting for the image pull to complete; the necessary chunks of the image are fetched *on demand*.
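To make the idea of on-demand fetching concrete, here is a minimal sketch in Python. It is an illustration only and not how this project implements it: the `LazyBlob` class, the `fetch_range` callback, and the tiny chunk size are hypothetical names and values chosen for this example, and the "remote" blob is simulated with an in-memory byte string instead of a registry.

```python
from typing import Callable, Dict, List, Tuple


class LazyBlob:
    """Read-only view of a remote blob; chunks are fetched on first access.

    Hypothetical helper for illustration, not part of stargz snapshotter.
    """

    def __init__(self, size: int, fetch_range: Callable[[int, int], bytes],
                 chunk_size: int = 4) -> None:
        self.size = size
        self.fetch_range = fetch_range  # assumed callback: (offset, length) -> bytes
        self.chunk_size = chunk_size
        self.cache: Dict[int, bytes] = {}  # chunk index -> chunk contents

    def read(self, offset: int, length: int) -> bytes:
        end = min(offset + length, self.size)
        if offset >= end:
            return b""
        first = offset // self.chunk_size
        last = (end - 1) // self.chunk_size
        out = bytearray()
        for idx in range(first, last + 1):
            if idx not in self.cache:
                # Only chunks that are actually touched are downloaded.
                start = idx * self.chunk_size
                n = min(self.chunk_size, self.size - start)
                self.cache[idx] = self.fetch_range(start, n)
            out += self.cache[idx]
        skip = offset - first * self.chunk_size
        return bytes(out[skip:skip + (end - offset)])


# Simulated remote blob and a fetcher that records every "download".
REMOTE = b"0123456789abcdef"
fetched: List[Tuple[int, int]] = []


def fake_fetch(offset: int, length: int) -> bytes:
    fetched.append((offset, length))
    return REMOTE[offset:offset + length]


blob = LazyBlob(len(REMOTE), fake_fetch)
data = blob.read(5, 4)  # touches chunks 1 and 2 only
```

Reading four bytes at offset 5 downloads only the two chunks that cover that range; the rest of the blob is never transferred, and a later read inside those chunks is served from the cache.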
912

10-
<img src="docs/images/benchmarking-result-288c338.png" width="600" alt="The benchmarking result on 288c338">
13+
[*eStargz*](/docs/stargz-estargz.md) is a lazily-pullable image format proposed by this project.
14+
It is compatible with [OCI](https://github.com/opencontainers/image-spec/)/[Docker](https://github.com/moby/moby/blob/master/image/spec/v1.2.md) images, so it can be pushed to standard container registries (e.g. ghcr.io) and is *still runnable* even on eStargz-agnostic runtimes, including Docker.
15+
The eStargz format is based on the [stargz image format by CRFS](https://github.com/google/crfs) but adds features such as runtime optimization and content verification.
1116

12-
`legacy` shows the startup performance when we use containerd's default snapshotter (`overlayfs`) with images copied from `docker.io/library` without optimization. For this configuration, containerd pulls entire image contents and `pull` operation takes accordingly. When we use stargz snapshotter with `stargz` images we are seeing performance improvement on the `pull` operation because containerd can start the container before the entire image contents locally available and fetches each file on-demand. But at the same time, we see the performance drawback for `run` operation because each access to files takes extra time for fetching them from the registry. When we use the further optimized version of images(`estargz`) we can mitigate the performance drawback observed in `stargz` images. This is because [stargz snapshotter prefetches and caches some files which will be most likely accessed during container workload](./docs/stargz-estargz.md). Stargz snapshotter waits for the first container creation until the prefetch completes so `create` sometimes takes longer than other types of image. But this wait only occurs just after the pull completion until the prefetch completion and it's shorter than waiting for downloading all files of all layers.
17+
The following histogram shows the benchmarking result for the startup time of several containers, measured on GitHub Actions using GitHub Container Registry.
1318

14-
The above histogram is [the benchmarking result on the commit `288c338`](https://github.com/containerd/stargz-snapshotter/actions/runs/50632674). We are constantly measuring the performance of this snapshotter so you can get the latest one through the badge shown top of this doc. Please note that we sometimes see dispersion among the results because of the NW condition on the internet and the location of the instance in the Github Actions, etc. Our benchmarking method is based on [HelloBench](https://github.com/Tintri/hello-bench).
19+
<img src="docs/images/benchmarking-result-ecdb227.png" width="600" alt="The benchmarking result on ecdb227">
20+
21+
`legacy` shows the startup performance when we use containerd's default snapshotter (`overlayfs`) with images copied from `docker.io/library` without optimization.
22+
For this configuration, containerd pulls the entire image contents, so the `pull` operation takes correspondingly long.
23+
When we use stargz snapshotter with eStargz-converted images but without any optimization (`estargz-noopt`), we see a performance improvement on the `pull` operation because containerd can start the container without waiting for the `pull` to complete and fetches the necessary chunks of the image on demand.
24+
At the same time, however, we see a performance drawback for the `run` operation because each file access takes extra time to fetch the file from the registry.
25+
When we use [eStargz with optimization](/docs/ctr-remote.md) (`estargz`), we can mitigate the performance drawback observed in `estargz-noopt` images.
26+
This is because [stargz snapshotter prefetches and caches the files that are *likely to be accessed* while the container runs](/docs/stargz-estargz.md).
27+
On the first container creation, stargz snapshotter waits for the prefetch to complete, so `create` sometimes takes longer than with other types of image.
28+
But this is still shorter than waiting for all files of all layers to download.
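The trade-off between `estargz-noopt` and `estargz` can be sketched with a toy model in Python. This is a simplified illustration under stated assumptions, not the snapshotter's code: `LayerCache`, `run_time_fetches`, and the file names are hypothetical, whole files stand in for chunks, and no real registry is involved.

```python
from typing import Dict, Iterable, List


class LayerCache:
    """Toy model of a lazily pulled layer (hypothetical, for illustration)."""

    def __init__(self, remote: Dict[str, bytes]) -> None:
        self.remote = remote                  # stands in for the registry
        self.local: Dict[str, bytes] = {}     # files already cached on the node
        self.remote_fetches: List[str] = []   # log of simulated downloads

    def _fetch(self, path: str) -> None:
        self.remote_fetches.append(path)
        self.local[path] = self.remote[path]

    def prefetch(self, paths: Iterable[str]) -> None:
        # Done once, before the container starts (the extra `create` time).
        for path in paths:
            if path not in self.local:
                self._fetch(path)

    def read(self, path: str) -> bytes:
        # A cache miss at run time costs a round trip to the registry.
        if path not in self.local:
            self._fetch(path)
        return self.local[path]


REMOTE = {
    "/bin/bash": b"bash",
    "/bin/ls": b"ls",
    "/usr/share/doc/README": b"doc",
}
WORKLOAD = ["/bin/bash", "/bin/ls"]  # files the container reads when it runs


def run_time_fetches(prioritized: List[str]) -> int:
    """Count downloads that happen while the workload runs."""
    cache = LayerCache(REMOTE)
    cache.prefetch(prioritized)
    before = len(cache.remote_fetches)
    for path in WORKLOAD:
        cache.read(path)
    return len(cache.remote_fetches) - before


noopt = run_time_fetches([])                        # like `estargz-noopt`
opt = run_time_fetches(["/bin/bash", "/bin/ls"])    # like optimized `estargz`
```

In this model the unoptimized image pays for every file during `run`, while the optimized one moves those downloads ahead of the first container start and never fetches the unused file at all.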
29+
30+
The above histogram is [the benchmarking result on the commit `ecdb227`](https://github.com/containerd/stargz-snapshotter/actions/runs/398606060).
31+
We are constantly measuring the performance of this snapshotter, so you can get the latest result through the badge shown at the top of this doc.
32+
Please note that we sometimes see dispersion among the results because of network conditions on the internet, the location of the instance in GitHub Actions, etc.
33+
Our benchmarking method is based on [HelloBench](https://github.com/Tintri/hello-bench).
1534

1635
Stargz Snapshotter is a **non-core** sub-project of containerd.
1736

1837
## Quick Start with Kubernetes
1938

20-
For using stargz snapshotter on kubernetes nodes, you need the following configuration to containerd as well as run stargz snapshotter daemon on the node. We assume that you are using containerd newer than at least [commit `d8506bf`](https://github.com/containerd/containerd/commit/d8506bfd7b407dcb346149bcec3ed3c19244e3f1) as a CRI runtime.
39+
- For more details about stargz snapshotter plugin and its configuration, refer to [Containerd Stargz Snapshotter Plugin Overview](/docs/overview.md).
40+
41+
To use stargz snapshotter on Kubernetes nodes, you need to apply the following configuration to containerd and run the stargz snapshotter daemon on the node.
42+
We assume that you are using containerd (> v1.4.2) as a CRI runtime.
2143

2244
```toml
2345
version = 2
@@ -36,16 +58,18 @@ version = 2
3658
disable_snapshot_annotations = false
3759
```
3860

39-
**Note that `disable_snapshot_annotations = false` is required since containerd > 1.4.2**
61+
**Note that `disable_snapshot_annotations = false` is required since containerd > v1.4.2**
4062

41-
This repo contains [a Dockerfile as a KinD node image](./Dockerfile) which includes the above configuration. You can use it with [KinD](https://github.com/kubernetes-sigs/kind) like the following,
63+
This repo contains [a Dockerfile as a KinD node image](/Dockerfile) which includes the above configuration.
64+
You can use it with [KinD](https://github.com/kubernetes-sigs/kind) as follows:
4265

4366
```console
4467
$ docker build -t stargz-kind-node https://github.com/containerd/stargz-snapshotter.git
4568
$ kind create cluster --name stargz-demo --image stargz-kind-node
4669
```
4770

48-
Then you can create stargz pods on the cluster. In this example, we create a stargz-converted Node.js pod (`ghcr.io/stargz-containers/node:13.13-esgz`) as a demo.
71+
Then you can create eStargz pods on the cluster.
72+
In this example, we create a stargz-converted Node.js pod (`ghcr.io/stargz-containers/node:13.13-esgz`) as a demo.
4973

5074
```yaml
5175
apiVersion: v1
@@ -77,18 +101,24 @@ $ curl 127.0.0.1:8080
77101
Hello World!
78102
```
79103

80-
Stargz snapshotter also supports further configuration including private registry authentication, mirror registries, etc.
81-
For more details, refer to the [overview doc](./docs/overview.md).
104+
Stargz snapshotter also supports [further configuration](/docs/overview.md) including private registry authentication, mirror registries, etc.
82105

83-
## Creating stargz images and further optimization
106+
## Creating eStargz images with optimization
84107

85-
For more examples and details about the image converter `ctr-remote`, refer to [this doc](./docs/ctr-remote.md).
108+
- For more examples and details about the image converter `ctr-remote`, refer to [Optimize Images with `ctr-remote image optimize`](/docs/ctr-remote.md).
109+
- For more details about the eStargz format, refer to [eStargz: Standard-Compatible Extensions to Tar.gz Layers for Lazy Pulling Container Images](/docs/stargz-estargz.md).
86110

87-
For lazy pulling images, you need to prepare stargz images first. You can use [CRFS-official `stargzify`](https://github.com/google/crfs/tree/master/stargz/stargzify) command or our `ctr-remote` command which has further optimization functionality. You can also try our pre-converted images listed in [this doc](./docs/pre-converted-images.md). For more details about stargz and the optimization, refer to [this doc](./docs/stargz-estargz.md)
111+
To pull images lazily, you need to prepare eStargz images first.
112+
You can use the [`ctr-remote`](/docs/ctr-remote.md) command to do this.
113+
You can also try our pre-converted images listed in [Trying pre-converted images](/docs/pre-converted-images.md).
88114

89-
In this section, we introduce `ctr-remote` command for converting images into stargz with further optimization for the performance of reading files. On-demand lazy pulling improves the performance of pull but it has runtime performance penalty because reading files induce remotely downloading contents. For solving this, `ctr-remote` has *workload-based* optimization for images. This section shows how to convert and pull an image lazily using `ctr-remote` command and briefly describes workload-based optimization.
115+
In this section, we introduce the `ctr-remote` command, which converts images into eStargz and optimizes them for reading files.
116+
As shown in the above benchmarking result, on-demand lazy pulling improves pull performance but incurs a runtime performance penalty because reading files triggers remote downloads of their contents.
117+
To solve this, `ctr-remote` provides *workload-based* optimization for images.
90118

91-
First, prepare the demo environment with the following command (put this repo on `${GOPATH}/src/github.com/containerd/stargz-snapshotter`).
119+
To try the examples described in this section, you can also use the docker-compose-based demo environment.
120+
You can set up this environment with the following commands (put this repo on `${GOPATH}/src/github.com/containerd/stargz-snapshotter`).
121+
*Note that this runs privileged containers on your host.*
92122

93123
```console
94124
$ cd ${GOPATH}/src/github.com/containerd/stargz-snapshotter/script/demo
@@ -98,15 +128,22 @@ $ docker exec -it containerd_demo /bin/bash
98128
(inside container) # ./script/demo/run.sh
99129
```
100130

101-
Generally, container images are built with purpose and the *workloads* are defined in the Dockerfile with some parameters including entrypoint command, environment variables and user. By default, `ctr-remote` optimizes the performance of reading files that are most likely accessed in the workload defined in the Dockerfile. You can also specify the custom workload using options.
131+
Generally, container images are built for a purpose, and the *workloads* are defined in the Dockerfile with some parameters (e.g. entrypoint, envvars and user).
132+
By default, `ctr-remote` optimizes the performance of reading files that are most likely accessed in the workload defined in the Dockerfile.
133+
[You can also specify the custom workload using options if needed](/docs/ctr-remote.md).
102134

103-
The following example converts the legacy `library/ubuntu:18.04` image into stargz. The command also optimizes the image for the workload of executing `ls` on `/bin/bash`. The thing actually done is it runs the specified workload in a sandboxed environment and profiles all file accesses. Then these files are marked in the image as likely accessed also in production. Then it pushes the converted image to the local registry (`registry2:5000`). The converted image is still __docker-compatible__ so you can run it with other runtimes (e.g. Docker).
135+
The following example converts the legacy `library/ubuntu:18.04` image into eStargz.
136+
The command also optimizes the image for the workload of executing `ls` on `/bin/bash`.
137+
Under the hood, it runs the specified workload in a temporary container, profiles all file accesses, and marks the accessed files as *likely accessed* at runtime as well.
138+
Then it pushes the converted image to the local container registry (`registry2:5000`).
139+
The converted image is still **docker-compatible** so you can run it with eStargz-agnostic runtimes (e.g. Docker).
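The profile-then-mark idea described above can be sketched roughly as follows in Python. This is only an illustration of the concept, not `ctr-remote`'s actual implementation: `profile_workload`, `mark_likely_accessed`, and the file list are hypothetical, and the "workload" is a plain function rather than a process in a temporary container.

```python
from typing import Callable, Dict, List, Tuple


def profile_workload(workload: Callable[[Callable[[str], bytes]], None],
                     files: Dict[str, bytes]) -> List[str]:
    """Run `workload` with a recording opener; return paths in first-access order."""
    accessed: List[str] = []

    def open_file(path: str) -> bytes:
        if path not in accessed:
            accessed.append(path)  # record each file only once
        return files[path]

    workload(open_file)
    return accessed


def mark_likely_accessed(files: Dict[str, bytes],
                         accessed: List[str]) -> List[Tuple[str, bool]]:
    """List profiled files first (flag True), then the rest (flag False)."""
    rest = sorted(path for path in files if path not in accessed)
    return [(path, path in accessed) for path in accessed + rest]


# Hypothetical image contents and a stand-in for running `ls` via bash.
FILES = {
    "/bin/bash": b"bash",
    "/bin/ls": b"ls",
    "/etc/passwd": b"root",
    "/usr/share/doc/README": b"doc",
}


def ls_workload(open_file: Callable[[str], bytes]) -> None:
    open_file("/bin/bash")
    open_file("/bin/ls")
    open_file("/bin/bash")  # repeated accesses are recorded once


accessed = profile_workload(ls_workload, FILES)
layout = mark_likely_accessed(FILES, accessed)
```

Here the two binaries the workload touches end up flagged and listed first, which is the information a snapshotter would need in order to decide what to prefetch.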
104140

105141
```console
106142
# ctr-remote image optimize --plain-http --entrypoint='[ "/bin/bash", "-c" ]' --args='[ "ls" ]' ubuntu:18.04 http://registry2:5000/ubuntu:18.04
107143
```
108144

109-
Finally, the following commands pull the stargz image lazily. Stargz snapshotter prefetches files that are most likely accessed in the optimized workload, which hopefully increases the cache hit rate for that workload and mitigates runtime overheads as shown in the benchmarking result shown top of this doc.
145+
Finally, the following commands pull the eStargz image lazily.
146+
Stargz snapshotter prefetches files that are most likely accessed in the optimized workload, which hopefully increases the cache hit rate for that workload and mitigates runtime overheads, as shown in the benchmarking result at the top of this doc.
110147

111148
```console
112149
# ctr-remote images rpull --plain-http registry2:5000/ubuntu:18.04

docs/ctr-remote.md

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ This optimization is done by baking the information about files that are likely
1313
On runtime, Stargz Snapshotter prefetches these prioritized files before mounting the layer for making sure these files are locally accessible.
1414
This can avoid downloading chunks on every file read and mitigate the runtime performance drawbacks.
1515

16-
For more details about eStargz and its optimization, refer also to the [doc about image formats](/docs/stargz-estargz.md).
16+
For more details about eStargz and its optimization, refer also to [eStargz: Standard-Compatible Extensions to Tar.gz Layers for Lazy Pulling Container Images](/docs/stargz-estargz.md).
1717

1818
## Requirements
1919

-13.2 KB
Binary file not shown.
14 KB

docs/images/estargz-landmark.png

135 KB

docs/images/estargz-structure.png

128 KB

docs/overview.md

Lines changed: 8 additions & 6 deletions
@@ -1,4 +1,4 @@
1-
# Containerd Stargz Snapshotter Overview
1+
# Containerd Stargz Snapshotter Plugin Overview
22

33
__Before reading this overview document, we recommend that you read the [README](README.md).__
44

@@ -15,8 +15,10 @@ The actual image contents can be fetched *lazily* so runtimes can startup contai
1515
We call these remotely mounted layers as *remote snapshots*.
1616

1717
*Stargz Snapshotter* is a remote snapshotter plugin implementation which supports standard compatible remote snapshots functionality.
18-
This leverages [*stargz* image format by Google](https://github.com/google/crfs) which enables lazy distribution but is backwards-compatible with container standards.
19-
When you run a container image and it is formatted by stargz, stargz snapshotter prepares container's rootfs layers as remote snapshots by mounting layers from [OCI](https://github.com/opencontainers/distribution-spec)/[Docker](https://docs.docker.com/registry/spec/api/) standard registries to the node, instead of pulling the entire image contents.
18+
This snapshotter leverages the [eStargz](/docs/stargz-estargz.md) image format, which is lazily-pullable and still standard-compatible.
19+
Because of this compatibility, eStargz images can be pushed to and lazily pulled from [OCI](https://github.com/opencontainers/distribution-spec)/[Docker](https://docs.docker.com/registry/spec/api/) registries (e.g. ghcr.io).
20+
Furthermore, images can run even on eStargz-agnostic runtimes (e.g. Docker).
21+
When you run a container image that is formatted as eStargz, stargz snapshotter prepares the container's rootfs layers as remote snapshots by mounting layers from the registry to the node, instead of pulling the entire image contents.
2022

2123
This document gives you a high-level overview of stargz snapshotter.
2224

@@ -26,11 +28,11 @@ This document gives you a high-level overview of stargz snapshotter.
2628

2729
Stargz snapshotter is implemented as a [proxy plugin](https://github.com/containerd/containerd/blob/04985039cede6aafbb7dfb3206c9c4d04e2f924d/PLUGINS.md#proxy-plugins) daemon (`containerd-stargz-grpc`) for containerd.
2830
When containerd starts a container, it queries the stargz snapshotter daemon for the rootfs snapshots through a Unix socket.
29-
This snapshotter remotely mounts queried stargz layers from registries to the node and provides these mount points as remote snapshots to containerd.
31+
This snapshotter remotely mounts queried eStargz layers from registries to the node and provides these mount points as remote snapshots to containerd.
3032

3133
Containerd recognizes this plugin through a Unix socket specified in the configuration file (e.g. `/etc/containerd/config.toml`).
3234
Stargz snapshotter can also be used through Kubernetes CRI by specifying the snapshotter name in the CRI plugin configuration.
33-
We assume that you are using containerd newer than at least [commit `d8506bf`](https://github.com/containerd/containerd/commit/d8506bfd7b407dcb346149bcec3ed3c19244e3f1)
35+
We assume that you are using containerd (> v1.4.2).
3436

3537
```toml
3638
version = 2
@@ -53,7 +55,7 @@ This repo contains [a Dockerfile as a KinD node image](/Dockerfile) which includ
5355

5456
## State directory
5557

56-
Stargz snapshotter mounts stargz layers from registries to the node using FUSE.
58+
Stargz snapshotter mounts eStargz layers from registries to the node using FUSE.
5759
The metadata of all files in the image is preserved on the filesystem, and file contents are fetched from registries on demand.
5860

5961
At the root of the filesystem, there is a *state directory* (`/.stargz-snapshotter`) for monitoring the status of the filesystem.
