/kind feature
Currently, a kfserving-gateway is used for kfserving inference endpoints. It uses a LoadBalancer, and all endpoints are publicly available without protection.
This doesn't meet the requirements for a production-grade cluster. We need a way to protect these endpoints. It seems the probe is currently a blocker to running inference behind an authn layer.
Describe the solution you'd like
Long term, I see the following options (assuming the probe issue is fixed):
- Remove kfserving-gateway and reuse istio-ingressgateway, which means external requests need to be authenticated. I think both IAP (GCP) and Cognito/OIDC (AWS) support programmatic authentication. I am not sure about Dex. The advantage is that this solution reuses AuthN and AuthZ from existing infra.
- Keep a separate gateway for kfserving and leave the implementation to different vendors. Users could build authentication on top of it. For example, AWS can replace the service with an ingress and use LB-level authentication for it. The reason not to reuse istio-ingressgateway is that kfserving may need other authentication strategies. For example, each user could request different API keys for different models.
- Have a middleware for kfserving to manage API keys. I'm not sure whether there's an existing solution on Kubernetes. This sounds like a very common use case.
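To make the API key option more concrete, here is a minimal sketch of the check such a middleware might perform, assuming keys are issued per (user, model) pair. `ApiKeyStore`, `issue`, and `is_authorized` are hypothetical names for illustration, not an existing kfserving or Kubernetes API, and a real implementation would need persistent storage, rotation, and revocation.

```python
import hashlib
import hmac
import secrets


class ApiKeyStore:
    """Hypothetical in-memory store of per-user, per-model API keys."""

    def __init__(self):
        # Maps (user, model) to the SHA-256 hex digest of the issued key,
        # so the raw key is never kept after it is handed to the user.
        self._hashes = {}

    def issue(self, user, model):
        """Generate a new random key for this user and model and return it."""
        key = secrets.token_urlsafe(32)
        self._hashes[(user, model)] = hashlib.sha256(key.encode()).hexdigest()
        return key

    def is_authorized(self, user, model, presented_key):
        """Return True only if the presented key matches the one issued."""
        expected = self._hashes.get((user, model))
        if expected is None:
            return False
        presented = hashlib.sha256(presented_key.encode()).hexdigest()
        # Constant-time comparison to avoid leaking match position via timing.
        return hmac.compare_digest(expected, presented)
```

The lookup is a dictionary access plus one hash, so a check like this should add little latency per request, though that would need to be measured in a real gateway.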
Anything else you would like to add:
Pipeline SDK will have a similar issue; we can consider them together.
The user experience should be simple enough: a client can get a clientId and secret to refresh a token, or just use an assigned token to make calls directly.
The solution needs to be latency-optimized.
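As a rough illustration of the "assigned token" client experience, the sketch below builds (but does not send) an authenticated predict request. It assumes the token is passed as an `Authorization: Bearer` header and that the endpoint follows the `/v1/models/<name>:predict` path; `build_predict_request` and the example URL are hypothetical, and the actual header would depend on the AuthN layer chosen (IAP, Cognito/OIDC, Dex, or API keys).

```python
import json
import urllib.request


def build_predict_request(base_url, model_name, token, instances):
    """Build a POST request carrying a bearer token (hypothetical helper).

    base_url, model_name, and token are placeholders supplied by the caller.
    """
    url = f"{base_url}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    headers = {
        # Assumption: the gateway accepts the assigned token as a bearer token.
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    req = build_predict_request(
        "http://example.invalid", "mnist", "my-assigned-token", [[1, 2, 3]]
    )
    # Sending would be urllib.request.urlopen(req) against a real endpoint.
    print(req.get_method(), req.full_url)
```

The client only has to attach one header per call, which keeps the experience close to an unauthenticated request.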
Related Issue:
kubeflow/kfctl#140
kubeflow/kubeflow#4912
@yuzisun @ellis-bigelow @animeshsingh @jlewi @cliveseldon