Checks
Motivation
Currently, configuring S3 data connections in Kubeflow requires users to manually:
- Create Kubernetes Secrets with the correct keys (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_S3_ENDPOINT, etc.)
- Author PodDefault YAML resources that reference those Secrets
- Configure environment variable injection and optional S3 filesystem mounts (s3fs-FUSE or CSI S3)
- Ensure all labels and annotations are correct so the admission webhook picks them up
This is error-prone, requires kubectl access, and is not accessible to data scientists who primarily work through the Kubeflow UI. Other platforms (e.g., OpenShift AI / RHODS) already provide a first-class "Data Connections" UI that abstracts away this complexity.
We propose adding a Data Connections page to the Central Dashboard that lets users create, list, and delete S3 data connections entirely through the UI. A new s3-poddefault-controller would automatically reconcile labeled Secrets into PodDefaults, so notebooks can consume S3 credentials without any manual YAML authoring.
Implementation
Components Changed
- Central Dashboard: new "Data Connections" sidebar page
  - A new data-connections-view Polymer component added to the Central Dashboard
  - Sidebar entry added under the existing navigation
  - Users can:
    - View all S3 data connections in their namespace
    - Create new data connections via a form (name, bucket, access key, secret key, endpoint)
    - Enable an optional S3 filesystem mount with a configurable mount path
    - Set an optional environment variable prefix for namespacing env vars
    - Delete existing data connections
- Central Dashboard: backend S3 service (s3_service.ts)
  - New backend service that manages S3 credentials as Kubernetes Secrets
  - Secrets are labeled with kubeflow.org/s3-data-connection: true for controller discovery
  - API endpoints:
    - GET /api/s3credentials: list data connections in a namespace
    - POST /api/s3credentials: create a new data connection
    - DELETE /api/s3credentials/:id: delete a data connection
  - RBAC updates: ClusterRole and Role updated to allow Secrets read/write in user namespaces
- New component: s3-poddefault-controller
  - A Kubernetes controller (Go) that watches for Secrets with the kubeflow.org/s3-data-connection label
  - Automatically creates a corresponding PodDefault that injects S3 environment variables into pods
  - When mount is enabled, configures either an s3fs-FUSE sidecar or a CSI S3 volume mount
  - Full lifecycle management: PodDefaults are updated when Secrets change and garbage-collected when Secrets are deleted
  - Supports leader election for HA deployments
  - Includes unit tests and Kustomize manifests for deployment
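To make the flow above concrete, here is a minimal sketch of how a create request could become a labeled Secret, and how a PodDefault could then be derived from that Secret. It is an illustration under stated assumptions, not the proposed implementation. The controller itself is described as Go, and TypeScript is used here only to match s3_service.ts. The type and function names (S3ConnectionRequest, buildS3Secret, buildPodDefault), the AWS_S3_BUCKET key, and the PodDefault name and selector label are all hypothetical. How the env var prefix reaches the controller is not specified in this proposal, so it is passed as a plain argument.

```typescript
// Sketch only: builds plain-object manifests, no Kubernetes client involved.
const S3_LABEL = "kubeflow.org/s3-data-connection"; // label named in the proposal

// Hypothetical request shape mirroring the form fields listed above.
interface S3ConnectionRequest {
  name: string;
  bucket: string;
  accessKey: string;
  secretKey: string;
  endpoint: string;
}

interface SecretManifest {
  apiVersion: "v1";
  kind: "Secret";
  metadata: { name: string; namespace: string; labels: Record<string, string> };
  type: "Opaque";
  stringData: Record<string, string>;
}

interface EnvVar {
  name: string;
  valueFrom: { secretKeyRef: { name: string; key: string } };
}

interface PodDefaultManifest {
  apiVersion: "kubeflow.org/v1alpha1";
  kind: "PodDefault";
  metadata: { name: string; namespace: string };
  spec: { desc: string; selector: { matchLabels: Record<string, string> }; env: EnvVar[] };
}

// Backend side: turn a create request into a Secret carrying the discovery label.
function buildS3Secret(namespace: string, req: S3ConnectionRequest): SecretManifest {
  return {
    apiVersion: "v1",
    kind: "Secret",
    metadata: { name: req.name, namespace, labels: { [S3_LABEL]: "true" } },
    type: "Opaque",
    stringData: {
      AWS_ACCESS_KEY_ID: req.accessKey,
      AWS_SECRET_ACCESS_KEY: req.secretKey,
      AWS_S3_ENDPOINT: req.endpoint,
      AWS_S3_BUCKET: req.bucket, // assumed key name, not confirmed by the proposal
    },
  };
}

// Controller side: derive a PodDefault whose env vars reference the Secret's keys.
// Returns null for Secrets that do not carry the discovery label.
function buildPodDefault(secret: SecretManifest, envPrefix = ""): PodDefaultManifest | null {
  if (secret.metadata.labels[S3_LABEL] !== "true") {
    return null;
  }
  const { name, namespace } = secret.metadata;
  const env: EnvVar[] = Object.keys(secret.stringData).map((key) => ({
    name: `${envPrefix}${key}`,
    valueFrom: { secretKeyRef: { name, key } },
  }));
  return {
    apiVersion: "kubeflow.org/v1alpha1",
    kind: "PodDefault",
    metadata: { name: `s3-${name}`, namespace }, // assumed naming scheme
    spec: {
      desc: `S3 data connection: ${name}`,
      selector: { matchLabels: { [`s3-${name}`]: "true" } }, // assumed selector label
      env,
    },
  };
}

const secret = buildS3Secret("team-a", {
  name: "training-data",
  bucket: "datasets",
  accessKey: "EXAMPLE_ACCESS_KEY",
  secretKey: "EXAMPLE_SECRET_KEY",
  endpoint: "https://s3.example.com",
});
const podDefault = buildPodDefault(secret, "DATA_");
```

Referencing the Secret through secretKeyRef, rather than copying values into the PodDefault, keeps credentials in one place. A credential change then only needs the Secret to be updated, which fits the lifecycle behavior described for the controller.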
Are you willing & able to help?