Abstract
Implement a snapshotter or image-mounter that treats uncompressed OCI tar layers as raw block devices. By appending a minimal UDF (Universal Disk Format) metadata trailer to an existing uncompressed tarball, the kernel can mount the layer natively using udf.ko. This bypasses the need for extraction (untarring) and eliminates the performance overhead associated with FUSE-based solutions.
Motivation
Standard OCI image extraction is a major bottleneck for container startup and host performance, particularly when dealing with "small-file" metadata pressure on the host filesystem. While projects like stargz-snapshotter address "lazy-pulling," they typically rely on FUSE to bridge the gap between compressed blobs and the VFS.
This proposal offers a "Missing Middle": a way to achieve zero-extraction for local or pre-fetched uncompressed layers using a stable, legacy kernel driver already present in virtually every Linux distribution.
Technical Concept
The architecture treats the uncompressed tarball as the "data partition" of a UDF volume, while placing the metadata descriptors in a synthesized trailer at the end of the file.
- Block-Level Alignment: UDF supports 512-byte logical blocks. Since tar records are 512-byte aligned, every file's data begins on a block boundary, so the allocation descriptors in each UDF Information Control Block (ICB) can point directly at the logical block where that file's data starts within the tarball (byte offset divided by 512), effectively "skipping" the 512-byte tar headers.
- On-the-Fly Indexing: A single pass over the tar stream generates a UDF metadata trailer (File Identifier Descriptors, ICBs, and Directory structures).
- The "Trailer-Only" Mount: By leveraging specific `udf.ko` mount options, we can bypass standard volume identification and point the kernel directly to the trailer: `mount -t udf -o loop,novrs,anchor=<BLOCK_OFFSET>,bs=512 layer.tar /mnt`
  - `novrs`: Bypasses Volume Recognition Sequence (VRS) discovery.
  - `anchor`: Explicitly defines the location of the Anchor Volume Descriptor Pointer.
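- OCI Compatibility: OCI whiteouts (`.wh.*`) and opaque markers appear in the tar as literal files. The UDF mount serves as a standard `lowerdir` for OverlayFS. Note that OverlayFS itself recognizes whiteouts as 0/0 character devices and opaque directories via the `trusted.overlay.opaque` xattr, not by `.wh.` names, so the generated metadata would need to present the markers in that form for OverlayFS to apply the whiteout logic natively.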
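
As a rough illustration of the block-alignment and single-pass indexing ideas, the following Python sketch walks an uncompressed tar and records, for each regular file, the 512-byte block at which its data starts and its exact length. These are the values an allocation descriptor would need. The function names (`index_tar_extents`, `build_demo_tar`) are hypothetical and not part of any existing tool, and the sketch ignores hard links, sparse files, and directories, which a real generator would have to handle.

```python
import io
import tarfile

BLOCK = 512  # tar record size, and the UDF logical block size the proposal assumes


def index_tar_extents(fileobj):
    """Return (name, first_block, length_in_bytes) for each regular file.

    first_block is the file data's byte offset inside the tar divided by 512,
    i.e. the logical block number if the tar is read with bs=512.
    """
    extents = []
    # mode "r:" accepts only uncompressed tar archives
    with tarfile.open(fileobj=fileobj, mode="r:") as tar:
        for member in tar:
            if not member.isreg():
                continue  # directories, symlinks, etc. carry no data extent
            # tar guarantees this alignment; fail loudly if it ever does not hold
            assert member.offset_data % BLOCK == 0
            extents.append((member.name, member.offset_data // BLOCK, member.size))
    return extents


def build_demo_tar():
    """Build a tiny in-memory tar so the sketch runs without external files."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, payload in (("a.txt", b"hello"), ("b.bin", b"x" * 700)):
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for name, block, size in index_tar_extents(build_demo_tar()):
        print(f"{name}: data starts at block {block}, {size} bytes")
```

In the demo archive the first file's header occupies block 0 and its data block 1; the second header lands in block 2 and its data starts at block 3, which shows how the headers fall into gaps that the UDF metadata never references.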
Advantages
- Native Performance: Bypasses FUSE context-switching; utilizes native kernel page cache and the highly optimized `udf.ko` block-reading path.
- Zero Extraction: No per-file metadata/inode allocation on the host filesystem. One layer = one inode on the host.
- Standard Driver: Relies on `udf.ko`, which is ubiquitous and supports 512-byte blocks and anchor overrides on modern kernels.
- Atomic Operations: Image cleanup and management become O(1) file operations rather than O(N) recursive directory deletions.
Potential Implementation
This could be integrated as a specialized snapshotter for containerd. The implementation would require a lean tar2udf generator capable of synthesizing the required ECMA-167 descriptors (Anchor Volume Descriptor Pointer, Volume Descriptor Sequence, Logical Volume Descriptor, File Set Descriptor, and ICBs) and calculating the CRC-16 and tag checksum carried in each descriptor tag of the trailer.
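
To give a sense of what the descriptor synthesis involves, here is a minimal Python sketch that builds a 16-byte descriptor tag and wraps it around an Anchor Volume Descriptor Pointer. It assumes the tag layout and checksum rules as I understand them from ECMA-167 (CRC-ITU-T with polynomial 0x1021 and initial value 0 over the descriptor body, plus a one-byte sum over the tag itself); these should be verified against the standard before being relied on. The helper names (`crc_itu_t`, `descriptor_tag`, `build_avdp`) and the example block numbers are hypothetical.

```python
import struct

TAG_ID_AVDP = 2  # tag identifier for the Anchor Volume Descriptor Pointer


def crc_itu_t(data, crc=0):
    """Bitwise CRC-16 with polynomial 0x1021, MSB first, no final XOR."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc


def descriptor_tag(tag_id, location, body, version=3, serial=0):
    """Build the 16-byte tag that precedes a descriptor body.

    Layout assumed here: identifier, version, checksum, reserved byte,
    serial number, body CRC, CRC length, tag location (all little-endian).
    """
    fields = struct.pack(
        "<HHBBHHHI",
        tag_id, version, 0, 0, serial, crc_itu_t(body), len(body), location,
    )
    # The tag checksum is the byte sum of the tag, skipping the checksum byte (4).
    checksum = sum(fields[:4] + fields[5:]) & 0xFF
    return fields[:4] + bytes([checksum]) + fields[5:]


def build_avdp(location, main_vds, reserve_vds):
    """Build one 512-byte AVDP block.

    main_vds and reserve_vds are (length_in_bytes, first_block) extents for
    the main and reserve Volume Descriptor Sequences.
    """
    body = struct.pack("<IIII", *main_vds, *reserve_vds) + bytes(480)
    return descriptor_tag(TAG_ID_AVDP, location, body) + body


if __name__ == "__main__":
    # Hypothetical placement: anchor at block 1000, VDS extents just before it.
    avdp = build_avdp(1000, (16 * 512, 960), (16 * 512, 976))
    print(len(avdp), avdp[:16].hex())
```

The `location` passed to `build_avdp` would be the same block number given to the `anchor=` mount option, since the tag records the block at which the descriptor itself is stored.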