KFP support for pss restricted #3412

Merged
juliusvonkohout merged 24 commits into master from copilot/move-baseline-to-security-pss
Mar 20, 2026

Copilot AI commented Mar 20, 2026

Contributor

KFP v1 and v2 restricted support and test

@github-actions

Welcome to the Kubeflow Manifests Repository

Thanks for opening your first PR. Your contribution means a lot to the Kubeflow community.

Before making more PRs:
Please ensure your PR follows our Contributing Guide.
Please also be aware that many components are synchronized from upstream via the scripts in /scripts.
So in some cases you have to fix the problem in the upstream repositories first, but you can use a PR against kubeflow/manifests to test the platform integration.

Thanks again for helping to improve Kubeflow.

Copilot AI requested review from Copilot and removed request for Copilot March 20, 2026 16:27
@google-oss-prow google-oss-prow Bot added size/S and removed size/XS labels Mar 20, 2026
Copilot AI changed the title from "[WIP] Move baseline to common/security/PSS and update links" to "Relocate PSS component to common/security/PSS and switch enforce level to restricted" Mar 20, 2026
Copilot AI requested a review from juliusvonkohout March 20, 2026 16:28
Copilot AI requested review from Copilot and removed request for Copilot March 20, 2026 16:35
@google-oss-prow google-oss-prow Bot added size/M and removed size/S labels Mar 20, 2026
Copilot AI requested review from Copilot and removed request for Copilot March 20, 2026 16:42
Copilot AI changed the title from "Relocate PSS component to common/security/PSS and switch enforce level to restricted" to "Relocate PSS component, enforce restricted labels, and harden KFP pipeline run security context" Mar 20, 2026
Signed-off-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com>
@juliusvonkohout juliusvonkohout marked this pull request as ready for review March 20, 2026 17:57
@google-oss-prow google-oss-prow Bot requested a review from kimwnasptd March 20, 2026 17:57

This comment was marked as spam.

Copilot AI changed the title from "Enforce overlay-only Pipelines patching; remove illegal nested/upstream patch wiring" to "Use typed Kubernetes security context in KFP v1 pipeline test" Mar 20, 2026
Copilot AI changed the title from "Use typed Kubernetes security context in KFP v1 pipeline test" to "Align KFP v1 security context usage with SDK contract; simplify KFP v2 task security application" Mar 20, 2026
Copilot AI changed the title from "Align KFP v1 security context usage with SDK contract; simplify KFP v2 task security application" to "Simplify KFP v2 task security wiring and fix KFP v1 compile-time security context application" Mar 20, 2026
Updated security context values to empty strings.

Signed-off-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com>
.
Signed-off-by: juliusvonkohout <45896133+juliusvonkohout@users.noreply.github.com>
.
Signed-off-by: juliusvonkohout <45896133+juliusvonkohout@users.noreply.github.com>
.
Signed-off-by: juliusvonkohout <45896133+juliusvonkohout@users.noreply.github.com>
.
Signed-off-by: juliusvonkohout <45896133+juliusvonkohout@users.noreply.github.com>
Updated defaultSecurityContextRunAsNonRoot to an empty value with a TODO comment for future improvement.

Signed-off-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com>
@juliusvonkohout

Member

/approve

@google-oss-prow

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: juliusvonkohout

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@akagami-harsh

Member

/lgtm

@juliusvonkohout

juliusvonkohout commented Mar 20, 2026

Member

Force merging since the trainer test is broken

@akagami-harsh

akagami-harsh commented Mar 20, 2026

Member

@juliusvonkohout, the PyTorchJob at https://github.com/kubeflow/manifests/blob/master/tests/training_operator_job.yaml that the test uses also needs to be PSS restricted compliant, because the containers in this PyTorchJob are being blocked by PSS.

harshvir@msi:~/manifests$ k describe pytorchjobs.kubeflow.org -n kubeflow-user-example-com pytorch-simple 
Name:         pytorch-simple
Namespace:    kubeflow-user-example-com
Labels:       <none>
Annotations:  <none>
API Version:  kubeflow.org/v1
Kind:         PyTorchJob
Metadata:
  Creation Timestamp:  2026-03-20T21:48:10Z
  Generation:          1
  Resource Version:    13694
  UID:                 0ae7462d-c48d-47cd-96d0-b4ee3785e44a
Spec:
  Pytorch Replica Specs:
    Master:
      Replicas:        1
      Restart Policy:  OnFailure
      Template:
        Metadata:
          Labels:
            sidecar.istio.io/inject:  false
        Spec:
          Containers:
            Command:
              python3
              /opt/pytorch-mnist/mnist.py
              --epochs=1
              --no-cuda
              --batch-size=32
              --log-interval=50
              --lr=0.01
              --test-batch-size=1000
            Env:
              Name:             PYTHONUNBUFFERED
              Value:            1
              Name:             OMP_NUM_THREADS
              Value:            1
              Name:             CUDA_VISIBLE_DEVICES
              Value:            
              Name:             MALLOC_TRIM_THRESHOLD_
              Value:            0
              Name:             MALLOC_MMAP_MAX_
              Value:            0
            Image:              docker.io/kubeflowkatib/pytorch-mnist:v1beta1-45c5727
            Image Pull Policy:  Always
            Name:               pytorch
            Resources:
              Limits:
                Cpu:     4000m
                Memory:  1Gi
              Requests:
                Cpu:     300m
                Memory:  512Mi
    Worker:
      Replicas:        1
      Restart Policy:  OnFailure
      Template:
        Metadata:
          Labels:
            sidecar.istio.io/inject:  false
        Spec:
          Containers:
            Command:
              python3
              /opt/pytorch-mnist/mnist.py
              --epochs=1
              --no-cuda
              --batch-size=32
              --log-interval=50
              --lr=0.01
              --test-batch-size=1000
            Env:
              Name:             PYTHONUNBUFFERED
              Value:            1
              Name:             OMP_NUM_THREADS
              Value:            1
              Name:             CUDA_VISIBLE_DEVICES
              Value:            
              Name:             MALLOC_TRIM_THRESHOLD_
              Value:            0
              Name:             MALLOC_MMAP_MAX_
              Value:            0
            Image:              docker.io/kubeflowkatib/pytorch-mnist:v1beta1-45c5727
            Image Pull Policy:  Always
            Name:               pytorch
            Resources:
              Limits:
                Cpu:     4000m
                Memory:  1Gi
              Requests:
                Cpu:     300m
                Memory:  512Mi
Events:
  Type     Reason           Age                   From                   Message
  ----     ------           ----                  ----                   -------
  Warning  FailedCreatePod  6m9s (x3 over 6m50s)  pytorchjob-controller  Error creating: pods "pytorch-simple-worker-0" is forbidden: violates PodSecurity "restricted:latest": allowPrivilegeEscalation != false (containers "init-pytorch", "pytorch" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (containers "init-pytorch", "pytorch" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or containers "init-pytorch", "pytorch" must set securityContext.runAsNonRoot=true), seccompProfile (pod or containers "init-pytorch", "pytorch" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
  Warning  FailedCreatePod  82s (x14 over 6m50s)  pytorchjob-controller  Error creating: pods "pytorch-simple-master-0" is forbidden: violates PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "pytorch" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "pytorch" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "pytorch" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "pytorch" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
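
The four violations in the events above map directly to container-level securityContext fields. As a rough sketch only (not the fix that was applied in this repository), the following Python helper adds those fields to every container of a PyTorchJob manifest that has already been loaded into a dict. The function name `harden_pytorchjob` and the constant `RESTRICTED_CONTAINER_CONTEXT` are hypothetical names chosen for illustration, and `RuntimeDefault` is picked here as one of the two seccomp profile types the error message allows.

```python
import copy

# Container-level settings named in the PodSecurity "restricted" error messages.
# "RuntimeDefault" is an illustrative choice; the message also allows "Localhost".
RESTRICTED_CONTAINER_CONTEXT = {
    "allowPrivilegeEscalation": False,
    "capabilities": {"drop": ["ALL"]},
    "runAsNonRoot": True,
    "seccompProfile": {"type": "RuntimeDefault"},
}


def harden_pytorchjob(job: dict) -> dict:
    """Return a copy of a PyTorchJob manifest with restricted settings on each container.

    Hypothetical helper: it only touches containers declared in the manifest,
    so containers injected later by a controller are not covered.
    """
    hardened = copy.deepcopy(job)
    replica_specs = hardened.get("spec", {}).get("pytorchReplicaSpecs", {})
    for replica in replica_specs.values():
        pod_spec = replica.get("template", {}).get("spec", {})
        for container in pod_spec.get("containers", []):
            context = container.setdefault("securityContext", {})
            for key, value in RESTRICTED_CONTAINER_CONTEXT.items():
                # Overwrite any existing value so the result matches the policy.
                context[key] = copy.deepcopy(value)
    return hardened


if __name__ == "__main__":
    # Minimal stand-in for the Master/Worker layout shown in the describe output.
    example_job = {
        "apiVersion": "kubeflow.org/v1",
        "kind": "PyTorchJob",
        "spec": {
            "pytorchReplicaSpecs": {
                "Master": {"template": {"spec": {"containers": [{"name": "pytorch"}]}}},
                "Worker": {"template": {"spec": {"containers": [{"name": "pytorch"}]}}},
            }
        },
    }
    print(harden_pytorchjob(example_job))
```

Two caveats apply to this sketch. Setting `runAsNonRoot: true` only passes admission; if the image itself runs as root, the container will still fail to start unless a non-root user is also configured, and whether that is the case for the image above is not stated in this thread. The `init-pytorch` container named in the worker event is not part of the manifest, so a manifest-only change like this one would not address it.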

@akagami-harsh

Member

I also think we might need to make the init-pytorch container injected by the training operator PSS restricted compliant in the upstream repo.

@juliusvonkohout

Member

#3414
