Is your feature request related to a problem? Please describe.
Currently, the FSx for Lustre CSI Driver mounts the FSx for Lustre file system individually for each pod. While this provides good isolation, it becomes a performance bottleneck in high-density EKS clusters, especially for read-heavy workloads where multiple pods on the same node access the same files. This behavior causes:
- Redundant mounts on the same EKS node.
- Repeated fetches of the same file from the FSx for Lustre file server.
- Excessive metadata and I/O load, leading to degraded performance at scale.
This is especially problematic in HPC and analytics workloads where thousands of pods may concurrently read identical datasets.
Describe the solution you'd like in detail
Enable a per-node shared mount (or per-node local cache) in the FSx for Lustre CSI Driver, so that the first pod on a node mounts the FSx for Lustre file system and subsequent pods on the same node bind-mount to the existing mountpoint instead of creating a new mount (see the sketch below).
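A minimal sketch of what this could look like inside the node plugin is below. It assumes a hypothetical `nodeMountManager` helper with a per-volume reference count, plus the `k8s.io/mount-utils` package commonly used by CSI drivers; none of these names exist in the driver today, they only illustrate the idea.

```go
// Hypothetical sketch: one real Lustre mount per (volume, node), with bind
// mounts handed out to individual pods. Names here are illustrative only.
package driver

import (
	"fmt"
	"os"
	"path/filepath"
	"sync"

	mount "k8s.io/mount-utils"
)

type nodeMountManager struct {
	mu       sync.Mutex
	refCount map[string]int // volumeID -> number of pods bind-mounted on this node
	mounter  mount.Interface
	baseDir  string // e.g. /var/lib/fsx-lustre-csi/shared (illustrative path)
}

func newNodeMountManager(baseDir string) *nodeMountManager {
	return &nodeMountManager{
		refCount: make(map[string]int),
		mounter:  mount.New(""),
		baseDir:  baseDir,
	}
}

// PublishShared mounts the FSx for Lustre file system once per node, then
// bind-mounts that shared mountpoint into the pod's target path.
func (m *nodeMountManager) PublishShared(volumeID, dnsName, mountName, target string) error {
	m.mu.Lock()
	defer m.mu.Unlock()

	shared := filepath.Join(m.baseDir, volumeID)
	if m.refCount[volumeID] == 0 {
		// First pod on this node: perform the real Lustre mount.
		if err := os.MkdirAll(shared, 0750); err != nil {
			return err
		}
		source := fmt.Sprintf("%s@tcp:/%s", dnsName, mountName)
		if err := m.mounter.Mount(source, shared, "lustre", nil); err != nil {
			return err
		}
	}

	// Every pod (including the first) gets a bind mount of the shared mountpoint.
	if err := os.MkdirAll(target, 0750); err != nil {
		return err
	}
	if err := m.mounter.Mount(shared, target, "", []string{"bind"}); err != nil {
		return err
	}
	m.refCount[volumeID]++
	return nil
}

// UnpublishShared removes the pod's bind mount and tears down the shared
// Lustre mount once the last pod on the node has released it.
func (m *nodeMountManager) UnpublishShared(volumeID, target string) error {
	m.mu.Lock()
	defer m.mu.Unlock()

	if err := m.mounter.Unmount(target); err != nil {
		return err
	}
	if m.refCount[volumeID] > 0 {
		m.refCount[volumeID]--
	}
	if m.refCount[volumeID] == 0 {
		return m.mounter.Unmount(filepath.Join(m.baseDir, volumeID))
	}
	return nil
}
```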
Optionally, the behavior could be made configurable (e.g., via a StorageClass parameter or driver flag) so operators can choose between:
- Per-pod isolation (current behavior).
- Per-node shared mount (optimized for read-heavy workloads).
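As one possible shape for such a knob, a hypothetical `sharedNodeMount` StorageClass parameter (surfaced to the node plugin through the volume context) could select between the two modes. The sketch below builds on the `nodeMountManager` example above; the parameter name and the `nodeService` wiring are assumptions for illustration, not existing driver options.

```go
// Hypothetical mode selection inside NodePublishVolume. "sharedNodeMount" is
// an illustrative parameter name, not an existing option of this driver.
package driver

import (
	"context"
	"os"
	"strings"

	csi "github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
	mount "k8s.io/mount-utils"
)

type nodeService struct {
	nodeMounts *nodeMountManager
}

func (d *nodeService) NodePublishVolume(ctx context.Context, req *csi.NodePublishVolumeRequest) (*csi.NodePublishVolumeResponse, error) {
	volCtx := req.GetVolumeContext()

	// Per-node shared mount is opt-in; the default remains per-pod isolation.
	if strings.EqualFold(volCtx["sharedNodeMount"], "true") {
		err := d.nodeMounts.PublishShared(req.GetVolumeId(), volCtx["dnsname"], volCtx["mountname"], req.GetTargetPath())
		if err != nil {
			return nil, status.Error(codes.Internal, err.Error())
		}
		return &csi.NodePublishVolumeResponse{}, nil
	}

	// Per-pod isolation (current behavior): mount the file system directly at
	// the pod's target path.
	if err := os.MkdirAll(req.GetTargetPath(), 0750); err != nil {
		return nil, status.Error(codes.Internal, err.Error())
	}
	source := volCtx["dnsname"] + "@tcp:/" + volCtx["mountname"]
	if err := mount.New("").Mount(source, req.GetTargetPath(), "lustre", nil); err != nil {
		return nil, status.Error(codes.Internal, err.Error())
	}
	return &csi.NodePublishVolumeResponse{}, nil
}
```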
This design is very similar to how the Mountpoint for Amazon S3 CSI Driver v2.0 handles caching and avoids redundant downloads of common files across pods; see the Mountpoint Pod Sharing feature of the Mountpoint for Amazon S3 CSI Driver.