Use v1 DRA APIs #732

varunrsekar · 2025-12-10T01:20:10Z

DRA is GA in k8s v1.34! This change bumps the DRA APIs consumed by the operator from api version v1beta2 to v1.

Changes:

Bump go mod dependencies for k8s libs to v0.34
Pin kserve libs to a commit in the master branch that supports k8s v0.34 libs (https://github.com/kserve/kserve/tree/1cb39eba0f1a/) - this is a temporary change until kserve v0.17.0 is released.
Tweaks in all kserve-related code to support the above version

Note:

With this change, NIM Operator will no longer support k8s versions below 1.34 for consuming DRA.
NIM Operator will no longer support DRA resourceclaims/resourceclaimtemplates of api version v1beta2.
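The version constraint in the note above can be sketched as a small check. This is only an illustration under stated assumptions: `supportsV1DRA` and `minDRAMinor` are hypothetical names that do not come from this PR, and the input is assumed to be a version string of the form reported by the cluster (for example `v1.34.1`).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// minDRAMinor is the lowest Kubernetes 1.x minor version for which this PR
// supports DRA (hypothetical constant, named for illustration only).
const minDRAMinor = 34

// supportsV1DRA reports whether a version string such as "v1.34.2" is new
// enough to consume the v1 DRA APIs. Hypothetical helper, not operator code.
func supportsV1DRA(gitVersion string) (bool, error) {
	parts := strings.SplitN(strings.TrimPrefix(gitVersion, "v"), ".", 3)
	if len(parts) < 2 {
		return false, fmt.Errorf("unrecognized version %q", gitVersion)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return false, fmt.Errorf("bad major version in %q: %w", gitVersion, err)
	}
	// Some distributions report the minor version with a trailing "+".
	minor, err := strconv.Atoi(strings.TrimSuffix(parts[1], "+"))
	if err != nil {
		return false, fmt.Errorf("bad minor version in %q: %w", gitVersion, err)
	}
	return major > 1 || (major == 1 && minor >= minDRAMinor), nil
}

func main() {
	for _, v := range []string{"v1.33.5", "v1.34.0"} {
		ok, err := supportsV1DRA(v)
		fmt.Println(v, ok, err)
	}
}
```

A check like this would only decide whether DRA can be consumed at all; it does not convert existing v1beta2 resourceclaims or resourceclaimtemplates.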

Testing:

Verified w/ DRA and w/o DRA with k8s cluster version v1.34
Verified w/o DRA with k8s cluster version v1.33

copy-pr-bot · 2025-12-10T01:20:13Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

varunrsekar · 2025-12-10T01:21:44Z

/cc @xieshenzh

xieshenzh · 2025-12-10T02:50:18Z

/cc @xieshenzh

I think this should be updated as well:

k8s-nim-operator/internal/webhook/apps/v1alpha1/nimservice_webhook_validation_helper.go

Lines 513 to 518 in f628f6c

    
    mode, annotated := spec.Annotations["serving.kserve.org/deploymentMode"]
    // If the annotation is absent, kserve defaults to serverless.
    serverless := !annotated || strings.EqualFold(mode, "serverless")
    // When Spec.InferencePlatform is "kserve" and used in "serverless" mode:
    if platformIsKServe && serverless {

By the way, I believe the annotation should be changed to serving.kserve.io/deploymentMode.
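A minimal sketch of the lookup with the key suggested above, to show why the key matters: an annotation stored under the `serving.kserve.org` key is simply not found when the lookup uses `serving.kserve.io`. The helper names `deploymentMode` and `isExplicitlyServerless` are hypothetical and are not taken from the operator code.

```go
package main

import (
	"fmt"
	"strings"
)

// deploymentModeAnnotation is the key proposed in the review comment.
const deploymentModeAnnotation = "serving.kserve.io/deploymentMode"

// deploymentMode returns the annotated mode and whether the annotation is set.
// Hypothetical helper for illustration.
func deploymentMode(annotations map[string]string) (string, bool) {
	mode, annotated := annotations[deploymentModeAnnotation]
	return mode, annotated
}

// isExplicitlyServerless is true only when the annotation is present and
// names serverless mode (compared case-insensitively).
func isExplicitlyServerless(annotations map[string]string) bool {
	mode, annotated := deploymentMode(annotations)
	return annotated && strings.EqualFold(mode, "serverless")
}

func main() {
	oldKey := map[string]string{"serving.kserve.org/deploymentMode": "RawDeployment"}
	newKey := map[string]string{deploymentModeAnnotation: "RawDeployment"}
	for _, a := range []map[string]string{oldKey, newKey} {
		mode, annotated := deploymentMode(a)
		fmt.Printf("mode=%q annotated=%v\n", mode, annotated)
	}
}
```

Unlike the snippet quoted above, this sketch does not treat a missing annotation as serverless; what the default should be is a separate question.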

xieshenzh · 2025-12-10T03:00:08Z

These test cases should be updated, since the default mode is supposed to be read from a configmap when the annotation is absent:

k8s-nim-operator/internal/webhook/apps/v1alpha1/nimservice_webhook_validation_helper_test.go

Lines 1002 to 1080 in f628f6c

    
    {
    	name: "standalone platform – no errors",
    	modify: func(ns *appsv1alpha1.NIMService) {
    		ns.Spec.InferencePlatform = appsv1alpha1.PlatformTypeStandalone
    	},
    	wantErrs:     0,
    	wantWarnings: 0,
    },
    {
    	name: "kserve serverless (annotation absent) – valid",
    	modify: func(ns *appsv1alpha1.NIMService) {
    		ns.Spec.InferencePlatform = appsv1alpha1.PlatformTypeKServe
    		// No annotation ⇒ serverless by default.
    	},
    	wantErrs:     0,
    	wantWarnings: 0,
    },
    {
    	name: "kserve serverless (annotation present) – autoscaling set",
    	modify: func(ns *appsv1alpha1.NIMService) {
    		ns.Spec.InferencePlatform = appsv1alpha1.PlatformTypeKServe
    		ns.Spec.Annotations = map[string]string{"serving.kserve.org/deploymentMode": "Serverless"}
    		ns.Spec.Scale.Enabled = &trueVal
    	},
    	wantErrs:     1,
    	wantWarnings: 0,
    },
    {
    	name: "kserve serverless – ingress set",
    	modify: func(ns *appsv1alpha1.NIMService) {
    		ns.Spec.InferencePlatform = appsv1alpha1.PlatformTypeKServe
    		ns.Spec.Expose.Router.Ingress = &appsv1alpha1.RouterIngress{
    			IngressClass: "nginx",
    		}
    	},
    	wantErrs:     1,
    	wantWarnings: 0,
    },
    {
    	name: "kserve serverless – servicemonitor set",
    	modify: func(ns *appsv1alpha1.NIMService) {
    		ns.Spec.InferencePlatform = appsv1alpha1.PlatformTypeKServe
    		ns.Spec.Metrics.Enabled = &trueVal
    	},
    	wantErrs:     1,
    	wantWarnings: 0,
    },
    {
    	name: "kserve serverless – all prohibited set",
    	modify: func(ns *appsv1alpha1.NIMService) {
    		ns.Spec.InferencePlatform = appsv1alpha1.PlatformTypeKServe
    		ns.Spec.Scale.Enabled = &trueVal
    		ns.Spec.Expose.Router.Ingress = &appsv1alpha1.RouterIngress{
    			IngressClass: "nginx",
    		}
    		ns.Spec.Metrics.Enabled = &trueVal
    	},
    	wantErrs:     3,
    	wantWarnings: 0,
    },
    {
    	name: "kserve rawdeployment – allowed autoscaling, but multidnode forbidden",
    	modify: func(ns *appsv1alpha1.NIMService) {
    		ns.Spec.InferencePlatform = appsv1alpha1.PlatformTypeKServe
    		ns.Spec.Annotations = map[string]string{"serving.kserve.org/deploymentMode": "RawDeployment"}
    		ns.Spec.Scale.Enabled = &trueVal // should be fine
    		ns.Spec.MultiNode = &appsv1alpha1.NimServiceMultiNodeConfig{Parallelism: &appsv1alpha1.ParallelismSpec{Pipeline: ptr.To(uint32(1))}}
    	},
    	wantErrs:     1, // only multiNode should trigger
    	wantWarnings: 0,
    },
    {
    	name: "kserve – multidnode alone",
    	modify: func(ns *appsv1alpha1.NIMService) {
    		ns.Spec.InferencePlatform = appsv1alpha1.PlatformTypeKServe
    		ns.Spec.MultiNode = &appsv1alpha1.NimServiceMultiNodeConfig{Parallelism: &appsv1alpha1.ParallelismSpec{Pipeline: ptr.To(uint32(2))}}
    	},
    	wantErrs:     1,
    	wantWarnings: 0,

xieshenzh · 2025-12-10T03:24:11Z

These examples should be updated as well, since Serverless is not supposed to be the default mode when the annotation is absent:
https://github.com/NVIDIA/k8s-nim-operator/tree/f628f6cf905082e0ce6a4db47586a1731c3abd60/config/samples/nim/serving/kserve

shivamerla · 2025-12-11T22:55:00Z

Overall looks good other than KServe comments. We need to update the sample here too: https://github.com/NVIDIA/k8s-nim-operator/blob/main/config/samples/nim/serving/advanced/dra/manual/llm.yaml

varunrsekar · 2025-12-16T21:19:39Z

These test cases should be updated, since the default mode is supposed to be read from a configmap when the annotation is absent:

@xieshenzh Do you mean that we need to set up a configmap with the default deployment mode as KNative in these tests?

What's the expected behavior if the configmap is empty and we don't set any annotation in the ISvc? This test case indicates that an empty configmap is valid.

Signed-off-by: Varun Ramachandra Sekar <[email protected]>

xieshenzh · 2025-12-16T22:41:37Z

These test cases should be updated, since the default mode is supposed to be read from a configmap when the annotation is absent:

@xieshenzh Do you mean that we need to set up a configmap with the default deployment mode as KNative in these tests?

What's the expected behavior if the configmap is empty and we don't set any annotation in the ISvc? This test case indicates that an empty configmap is valid.

As far as I know, serverless deployment is no longer supported on RHOAI 3.x, so the kserve community is changing the default deployment mode from serverless to standard/rawdeployment.
There is a constant for the default deployment mode, which is standard, but unfortunately it is not used anywhere in the code.

With the current kserve code, if the configmap is empty, the deploymentMode will be empty.
In that case, I don't think the inferenceservice will be reconciled properly, because the code sometimes expects knative/serverless to be set explicitly and sometimes assumes knative/serverless is the default deployment mode.

I think the expected behavior when the configmap is empty is to use the standard mode.
If standard mode is used, the NIMService reconciler should set the annotation explicitly when creating the inferenceservice, to avoid any potential issues in the kserve controller.

Otherwise, if the configmap is empty, it is also acceptable to return an error and not create an inferenceservice.

varunrsekar · 2025-12-17T01:28:13Z

I think the expected behavior when the configmap is empty is to use the standard mode.
If standard mode is used, the NIMService reconciler should set the annotation explicitly when creating the inferenceservice, to avoid any potential issues in the kserve controller.

I see here that the predictor reconciler defaults to knative even though we're expected to exit early in case of missing knative CRDs

There is a constant for the default deployment mode, which is standard, but unfortunately it is not used anywhere in the code.

isvcutils.GetDeploymentMode seems to provide the default deployment. Isn't this behavior sufficient? Or do you think we should set this as an annotation explicitly from the NIMService?

xieshenzh · 2025-12-17T02:45:20Z

isvcutils.GetDeploymentMode seems to provide the default deployment. Isn't this behavior sufficient? Or do you think we should set this as an annotation explicitly from the NIMService?

For RHOAI, it is sufficient. The configmap is not supposed to be empty if kserve is installed with RHOAI.

But I am not sure whether the configmap could be empty when installing the community version of kserve.
If the configmap is empty, which is what happens in the unit test you mentioned earlier, it is not sufficient.
In that case, it would be safer to set the annotation explicitly when isvcutils.GetDeploymentMode returns an empty string.
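The fallback discussed above could look roughly like the sketch below. It is not the operator's or kserve's implementation: `resolveDeploymentMode` is a hypothetical helper, `configMapDefault` stands in for whatever the kserve configmap (or isvcutils.GetDeploymentMode) yields, and the literal `"Standard"` is an assumed value for the standard mode.

```go
package main

import "fmt"

const (
	deploymentModeAnnotation = "serving.kserve.io/deploymentMode"
	// fallbackMode is an assumed literal for the standard deployment mode.
	fallbackMode = "Standard"
)

// resolveDeploymentMode picks the mode in priority order: explicit annotation,
// then the configmap default, then the fallback. Only in the last case is the
// annotation written, so the kserve controller never sees an empty mode.
// It returns the mode and a copy of the annotations; the input is not mutated.
func resolveDeploymentMode(annotations map[string]string, configMapDefault string) (string, map[string]string) {
	out := make(map[string]string, len(annotations)+1)
	for k, v := range annotations {
		out[k] = v
	}
	if mode := out[deploymentModeAnnotation]; mode != "" {
		return mode, out
	}
	if configMapDefault != "" {
		return configMapDefault, out
	}
	out[deploymentModeAnnotation] = fallbackMode
	return fallbackMode, out
}

func main() {
	// Empty configmap and no annotation: fall back and annotate explicitly.
	mode, annotations := resolveDeploymentMode(nil, "")
	fmt.Println(mode, annotations)

	// A configmap default is used as-is, without adding an annotation.
	mode, annotations = resolveDeploymentMode(nil, "Knative")
	fmt.Println(mode, annotations)
}
```

The alternative mentioned earlier in the thread, returning an error when the configmap is empty, would replace the last branch with an error return instead of a fallback.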

visheshtanksale · 2025-12-17T17:36:58Z

@varunrsekar What will the upgrade story be for users who are on k8s < v1.34 and just want to upgrade the NIM Operator?

varunrsekar requested review from ArangoGutierrez, shengnuo, shivamerla and visheshtanksale as code owners December 10, 2025 01:20

varunrsekar added 3 commits December 16, 2025 13:24

Use v1 DRA APIs

a4418ff

Signed-off-by: Varun Ramachandra Sekar <[email protected]>

address kserve review comments

0bb8c1b

Signed-off-by: Varun Ramachandra Sekar <[email protected]>

fix manual resourceclaim sample

dee9492

Signed-off-by: Varun Ramachandra Sekar <[email protected]>

varunrsekar force-pushed the v1-dra branch from f628f6c to dee9492 Compare December 16, 2025 21:45
