feat: Add S3 Data Connections UI and automated PodDefault creation in Central Dashboard #272

@mishraa-pooja

Motivation

Currently, configuring S3 data connections in Kubeflow requires users to do all of the following by hand:

  1. Create Kubernetes Secrets with the correct keys (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_S3_ENDPOINT, etc.)
  2. Author PodDefault YAML resources that reference those Secrets
  3. Configure environment variable injection and optional S3 filesystem mounts (s3fs-FUSE or CSI S3)
  4. Ensure all labels and annotations are correct so the admission webhook picks them up

This workflow is error-prone, requires kubectl access, and is out of reach for data scientists who work primarily through the Kubeflow UI. Other platforms (e.g., OpenShift AI / RHODS) already provide a first-class "Data Connections" UI that abstracts away this complexity.

We propose adding a Data Connections page to the Central Dashboard that lets users create, list, and delete S3 data connections entirely through the UI. Alongside it, a new s3-poddefault-controller automatically reconciles labeled Secrets into PodDefaults, so notebooks can consume S3 credentials without any manual YAML authoring.

Implementation

Components Changed

  1. Central Dashboard — New "Data Connections" sidebar page
  • A new data-connections-view Polymer component added to the Central Dashboard
  • Sidebar entry added under the existing navigation
  • Users can:
    • View all S3 data connections in their namespace
    • Create new data connections via a form (name, bucket, access key, secret key, endpoint)
    • Enable optional S3 filesystem mount with configurable mount path
    • Set an optional environment variable prefix for namespacing env vars
    • Delete existing data connections
  2. Central Dashboard — Backend S3 Service (s3_service.ts)
  • New backend service that manages S3 credentials as Kubernetes Secrets
  • Secrets are labeled with kubeflow.org/s3-data-connection: true for controller discovery
  • API endpoints:
    • GET /api/s3credentials — list data connections in a namespace
    • POST /api/s3credentials — create a new data connection
    • DELETE /api/s3credentials/:id — delete a data connection
  • RBAC updates: ClusterRole and Role updated to allow secrets read/write in user namespaces
  3. New component: s3-poddefault-controller
  • A Kubernetes controller (Go) that watches for Secrets with the kubeflow.org/s3-data-connection label
  • Automatically creates a corresponding PodDefault that injects S3 environment variables into pods
  • When mount is enabled, configures either s3fs-FUSE sidecar or CSI S3 volume mount
  • Full lifecycle management: PodDefaults are updated when secrets change and garbage-collected when secrets are deleted
  • Supports leader election for HA deployments
  • Includes unit tests and Kustomize manifests for deployment
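
To make the Secret-to-PodDefault mapping concrete, here is a minimal sketch of what the controller's core derivation step might look like. It is not the proposed implementation: `Secret`, `EnvFromSecret`, and `PodDefault` are simplified stand-in types rather than the real Kubernetes or Kubeflow API types, and the `-s3` name suffix and the way the optional prefix is prepended to each key are assumptions made for illustration. Only the label key and the credential key names come from the proposal above.

```go
package main

import (
	"fmt"
	"sort"
)

// dataConnectionLabel is the label the proposal uses for controller discovery.
const dataConnectionLabel = "kubeflow.org/s3-data-connection"

// Secret is a simplified, hypothetical stand-in for a Kubernetes Secret.
type Secret struct {
	Name, Namespace string
	Labels          map[string]string
	Keys            []string // data keys, e.g. AWS_ACCESS_KEY_ID
}

// EnvFromSecret models an env var whose value is read from a Secret key.
type EnvFromSecret struct {
	Name       string // env var name as seen inside the pod
	SecretName string
	SecretKey  string
}

// PodDefault is a simplified, hypothetical stand-in for the PodDefault resource.
type PodDefault struct {
	Name, Namespace string
	Env             []EnvFromSecret
}

// isDataConnection reports whether a Secret carries the discovery label.
func isDataConnection(s Secret) bool {
	return s.Labels[dataConnectionLabel] == "true"
}

// buildPodDefault derives a PodDefault that exposes every Secret key as an
// env var. Prepending the prefix verbatim and the "-s3" suffix are assumptions.
func buildPodDefault(s Secret, prefix string) PodDefault {
	keys := append([]string(nil), s.Keys...) // copy so the input is not reordered
	sort.Strings(keys)                       // stable output across reconciles
	pd := PodDefault{Name: s.Name + "-s3", Namespace: s.Namespace}
	for _, k := range keys {
		pd.Env = append(pd.Env, EnvFromSecret{
			Name:       prefix + k,
			SecretName: s.Name,
			SecretKey:  k,
		})
	}
	return pd
}

func main() {
	s := Secret{
		Name:      "my-bucket",
		Namespace: "kubeflow-user",
		Labels:    map[string]string{dataConnectionLabel: "true"},
		Keys:      []string{"AWS_SECRET_ACCESS_KEY", "AWS_ACCESS_KEY_ID", "AWS_S3_ENDPOINT"},
	}
	if !isDataConnection(s) {
		return // unlabeled Secrets are ignored
	}
	pd := buildPodDefault(s, "TRAIN_")
	for _, e := range pd.Env {
		fmt.Printf("%s <- %s/%s\n", e.Name, e.SecretName, e.SecretKey)
	}
}
```

Sorting the keys keeps the derived object deterministic, so a reconcile loop could compare it against the existing PodDefault and skip the update when nothing has changed. A real controller would additionally set an owner reference (or equivalent) so the PodDefault is garbage-collected with its Secret, as described in the lifecycle bullet above.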

Are you willing & able to help?

  • I am able to submit a PR!
  • I can help test the feature!
