Update README.md#901
Conversation
Updated spelling mistake
|
Hi @kunallimaye. Thanks for your PR. I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/assign @jlewi |
|
/ok-to-test |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jlewi The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/retest |
|
/test all |
Two bugs found in run kubeflow#901 (cancelled at 60m, all tests passed): 1. Log collection used if: failure() but timeout triggers cancelled(), not failure(). Logs were never collected. Fix: failure() || cancelled(). 2. Cleanup kubectl-delete commands waited for Kubernetes finalizer reconciliation (default --wait=true). Each InferenceService delete blocked 3-5 min for Knative revision teardown on the 2-vCPU runner, adding ~15 min to the test step. The runner is destroyed after the job, so waiting is pointless. Fix: --wait=false on all deletes. Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com>
…ization documentation, print prediction output (#3354) * test(kserve): add raw deployment test, trim auth docs, print prediction output Addresses follow-up items from PR #3352 as requested by @juliusvonkohout: 1. Trim KServe authentication documentation (README.md): - Remove verbose subsections: CI fallback YAML, routing table, Models Web App section, CI Test Coverage table (~70 lines removed) - Add concise 4-line note about CI rules: [{}] workaround - Preserve core architecture and Architecture Analysis sections 2. Add Test 7 for raw deployment mode (kserve_test.sh): - Deploy sklearn model with RawDeployment annotation - AuthorizationPolicy uses serving.kserve.io/inferenceservice label (works for both serverless and raw modes, unlike Knative labels) - Host-based routing only (path-based requires kserve/kserve#5090, not yet merged upstream) - Verifies auth enforcement: rejects unauthenticated, accepts authenticated requests - Reference: inferenceservice.yaml ingressClassName config 3. Print prediction results to console (kserve_test.sh): - All 12 curl commands now capture and print response body + HTTP status code for easier debugging of test failures - Pattern: curl -s -w '\n%{http_code}' + head/tail split 4. Fix pre-existing cleanup gap: - Add missing kubectl delete for allow-isvc-sklearn AuthorizationPolicy (was created but never cleaned up) References: - Follow-up to: #3352 - Raw deployment ref: kserve/kserve#5090 - Config ref: kserve/kserve@b8fa0fb/config/configmap/inferenceservice.yaml#L389 - Reviewer comment: #3352 (comment) Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * address review: remove Architecture Analysis, add securityContext, truncate response body Changes addressing juliusvonkohout review comments: - Remove speculative Architecture Analysis section and orphaned [^6] footnote - Add CI workflow and test script links to KServe Authentication heading - Add host-based and path-based routing sentence with kserve#5090 reference - Add securityContext to secure-model-predictor Knative Service fixture - Truncate response body to 200 characters in echo statements for readable CI logs - Log prediction results in kserve_sklearn_test.py before assertion - Fix pre-existing em dashes in comments (Unicode to ASCII) Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * fix(test): add runAsUser to secure-model-predictor securityContext The gcr.io/knative-samples/helloworld-go image runs as root UID 0 by default. Setting runAsNonRoot: true without runAsUser causes kubelet to reject the pod, leading to ksvc timeout in CI. Fix: runAsUser 65534 forces non-root execution. Go binaries are statically linked and run at any UID. Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * fix(test): expand ksvc abbreviation, enable live pytest logging Address review: replace ksvc with Knative Service in echo statements. Switch --log-level info to --log-cli-level=INFO so prediction results stream to console during CI runs instead of being captured by pytest. Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * fix(test): force-kill port-forward to prevent script hang The kubectl port-forward background process ignores SIGTERM when connections are active. Bash waits for all child processes before exiting, causing the test step to hang indefinitely after all tests pass. Fix by sending SIGTERM, waiting 2s for graceful shutdown, then SIGKILL as fallback, and finally reaping with wait. Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * ci(kserve): add 60-min timeout and log collection on failure Add timeout-minutes: 60 per maintainer guidance -- no test should run longer than 60 minutes. Add 'Collect Logs on Failure' and 'Upload Diagnostic Logs' steps modeled after full_kubeflow_integration_test.yaml to capture pod descriptions, events, and container logs when the job fails, enabling root-cause analysis instead of blind timeout bumps. Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * fix(ci): collect logs on timeout, skip cleanup waits Two bugs found in run #901 (cancelled at 60m, all tests passed): 1. Log collection used if: failure() but timeout triggers cancelled(), not failure(). Logs were never collected. Fix: failure() || cancelled(). 2. Cleanup kubectl-delete commands waited for Kubernetes finalizer reconciliation (default --wait=true). Each InferenceService delete blocked 3-5 min for Knative revision teardown on the 2-vCPU runner, adding ~15 min to the test step. The runner is destroyed after the job, so waiting is pointless. Fix: --wait=false on all deletes. Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * Reduce timeout for kserve test job to 25 minutes Signed-off-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com> * address review: rename UNAUTH_TOKEN, remove --wait=false - Rename UNAUTH_TOKEN to UNAUTHORIZED_TOKEN: this test validates authorization (RBAC), not authentication (JWT). The token is valid but the ServiceAccount lacks namespace-scoped permissions. - Remove --wait=false from cleanup deletes: deletion is part of the test -- verifying that resources tear down cleanly matters. Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * fix(ci): parallelize cleanup deletes, bump timeout to 30m Julius: cleanup must succeed and should complete in <60s. Sequential deletes took ~5 min (sum of finalizer waits). Parallel deletes bring total to ~30-45s (max of individual). All 6 resources are independent — no ordering dependencies. Bump kserve_test.yaml timeout from 25m to 30m per Julius. Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * fix(ci): reduce AuthorizationPolicy sleep 60s to 30s, bump models web app timeout to 30m Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * fix(kserve): restore 60s sleep for AuthorizationPolicy propagation The sleep was reduced from 60s to 30s in d31918d, but Envoy in the KinD CI environment needs ~45-60s to pick up the new AuthorizationPolicy. With only 30s, Test 2b (host-based with token) fires before the ALLOW rule propagates, causing 403 RBAC: access denied. Restore 60s for both Test 2 and Test 7 (raw deployment). Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * fix(kserve): replace blind sleep with active polling for AuthorizationPolicy The two sleep 60 calls add 2 minutes of dead time, causing the job to exceed the 30-minute timeout by ~15 seconds. Replace with polling loops that exit as soon as the AuthorizationPolicy propagates through Envoy (typically 10-30s). Each loop polls every 5 seconds for up to 120 seconds, providing both faster execution and more robust handling of slow environments. Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * fix(kserve): poll host-based URL to avoid false-positive 404 from path-based route The previous polling loop used the path-based URL, which returns 404 immediately in CI before the route is configured. Since 404 != 403, the loop exited after 5 seconds thinking the AuthorizationPolicy had propagated — but it had not. Test 2b host-based then failed with 403. Fix: poll the host-based URL instead, which correctly returns 403 until the AuthorizationPolicy propagates through Envoy. Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * ci(kserve): bump timeout to 35m for expanded test suite The test suite grew with Test 7 (raw deployment) and active polling loops. Total runtime is ~30m18s with 30m timeout. Bump to 35m to validate all tests pass before finalizing the timeout value. Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * Update kserve_test.sh Signed-off-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com> * Update kserve_test.sh Signed-off-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com> Signed-off-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com> * Apply suggestion from @juliusvonkohout Signed-off-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com> * Update kserve_test.sh Signed-off-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com> * Apply suggestion from @juliusvonkohout Signed-off-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com> * fix(kserve): use Standard instead of deprecated RawDeployment annotation KServe v0.16 renamed RawDeployment to Standard (and Serverless to Knative). Using the legacy name causes the validation webhook to reject status updates during deletion, leaving the finalizer in place and the ISVC stuck in a terminating state indefinitely. Update the kserve_test.sh annotation and comments to use Standard. Ref: kserve/kserve#4710, kserve/kserve#4798 Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> --------- Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> Signed-off-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com> Co-authored-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com>
…ization documentation, print prediction output (kubeflow#3354) * test(kserve): add raw deployment test, trim auth docs, print prediction output Addresses follow-up items from PR kubeflow#3352 as requested by @juliusvonkohout: 1. Trim KServe authentication documentation (README.md): - Remove verbose subsections: CI fallback YAML, routing table, Models Web App section, CI Test Coverage table (~70 lines removed) - Add concise 4-line note about CI rules: [{}] workaround - Preserve core architecture and Architecture Analysis sections 2. Add Test 7 for raw deployment mode (kserve_test.sh): - Deploy sklearn model with RawDeployment annotation - AuthorizationPolicy uses serving.kserve.io/inferenceservice label (works for both serverless and raw modes, unlike Knative labels) - Host-based routing only (path-based requires kserve/kserve#5090, not yet merged upstream) - Verifies auth enforcement: rejects unauthenticated, accepts authenticated requests - Reference: inferenceservice.yaml ingressClassName config 3. Print prediction results to console (kserve_test.sh): - All 12 curl commands now capture and print response body + HTTP status code for easier debugging of test failures - Pattern: curl -s -w '\n%{http_code}' + head/tail split 4. Fix pre-existing cleanup gap: - Add missing kubectl delete for allow-isvc-sklearn AuthorizationPolicy (was created but never cleaned up) References: - Follow-up to: kubeflow#3352 - Raw deployment ref: kserve/kserve#5090 - Config ref: kserve/kserve@b8fa0fb/config/configmap/inferenceservice.yaml#L389 - Reviewer comment: kubeflow#3352 (comment) Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * address review: remove Architecture Analysis, add securityContext, truncate response body Changes addressing juliusvonkohout review comments: - Remove speculative Architecture Analysis section and orphaned [^6] footnote - Add CI workflow and test script links to KServe Authentication heading - Add host-based and path-based routing sentence with kserve#5090 reference - Add securityContext to secure-model-predictor Knative Service fixture - Truncate response body to 200 characters in echo statements for readable CI logs - Log prediction results in kserve_sklearn_test.py before assertion - Fix pre-existing em dashes in comments (Unicode to ASCII) Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * fix(test): add runAsUser to secure-model-predictor securityContext The gcr.io/knative-samples/helloworld-go image runs as root UID 0 by default. Setting runAsNonRoot: true without runAsUser causes kubelet to reject the pod, leading to ksvc timeout in CI. Fix: runAsUser 65534 forces non-root execution. Go binaries are statically linked and run at any UID. Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * fix(test): expand ksvc abbreviation, enable live pytest logging Address review: replace ksvc with Knative Service in echo statements. Switch --log-level info to --log-cli-level=INFO so prediction results stream to console during CI runs instead of being captured by pytest. Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * fix(test): force-kill port-forward to prevent script hang The kubectl port-forward background process ignores SIGTERM when connections are active. Bash waits for all child processes before exiting, causing the test step to hang indefinitely after all tests pass. Fix by sending SIGTERM, waiting 2s for graceful shutdown, then SIGKILL as fallback, and finally reaping with wait. Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * ci(kserve): add 60-min timeout and log collection on failure Add timeout-minutes: 60 per maintainer guidance -- no test should run longer than 60 minutes. Add 'Collect Logs on Failure' and 'Upload Diagnostic Logs' steps modeled after full_kubeflow_integration_test.yaml to capture pod descriptions, events, and container logs when the job fails, enabling root-cause analysis instead of blind timeout bumps. Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * fix(ci): collect logs on timeout, skip cleanup waits Two bugs found in run kubeflow#901 (cancelled at 60m, all tests passed): 1. Log collection used if: failure() but timeout triggers cancelled(), not failure(). Logs were never collected. Fix: failure() || cancelled(). 2. Cleanup kubectl-delete commands waited for Kubernetes finalizer reconciliation (default --wait=true). Each InferenceService delete blocked 3-5 min for Knative revision teardown on the 2-vCPU runner, adding ~15 min to the test step. The runner is destroyed after the job, so waiting is pointless. Fix: --wait=false on all deletes. Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * Reduce timeout for kserve test job to 25 minutes Signed-off-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com> * address review: rename UNAUTH_TOKEN, remove --wait=false - Rename UNAUTH_TOKEN to UNAUTHORIZED_TOKEN: this test validates authorization (RBAC), not authentication (JWT). The token is valid but the ServiceAccount lacks namespace-scoped permissions. - Remove --wait=false from cleanup deletes: deletion is part of the test -- verifying that resources tear down cleanly matters. Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * fix(ci): parallelize cleanup deletes, bump timeout to 30m Julius: cleanup must succeed and should complete in <60s. Sequential deletes took ~5 min (sum of finalizer waits). Parallel deletes bring total to ~30-45s (max of individual). All 6 resources are independent — no ordering dependencies. Bump kserve_test.yaml timeout from 25m to 30m per Julius. Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * fix(ci): reduce AuthorizationPolicy sleep 60s to 30s, bump models web app timeout to 30m Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * fix(kserve): restore 60s sleep for AuthorizationPolicy propagation The sleep was reduced from 60s to 30s in d31918d, but Envoy in the KinD CI environment needs ~45-60s to pick up the new AuthorizationPolicy. With only 30s, Test 2b (host-based with token) fires before the ALLOW rule propagates, causing 403 RBAC: access denied. Restore 60s for both Test 2 and Test 7 (raw deployment). Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * fix(kserve): replace blind sleep with active polling for AuthorizationPolicy The two sleep 60 calls add 2 minutes of dead time, causing the job to exceed the 30-minute timeout by ~15 seconds. Replace with polling loops that exit as soon as the AuthorizationPolicy propagates through Envoy (typically 10-30s). Each loop polls every 5 seconds for up to 120 seconds, providing both faster execution and more robust handling of slow environments. Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * fix(kserve): poll host-based URL to avoid false-positive 404 from path-based route The previous polling loop used the path-based URL, which returns 404 immediately in CI before the route is configured. Since 404 != 403, the loop exited after 5 seconds thinking the AuthorizationPolicy had propagated — but it had not. Test 2b host-based then failed with 403. Fix: poll the host-based URL instead, which correctly returns 403 until the AuthorizationPolicy propagates through Envoy. Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * ci(kserve): bump timeout to 35m for expanded test suite The test suite grew with Test 7 (raw deployment) and active polling loops. Total runtime is ~30m18s with 30m timeout. Bump to 35m to validate all tests pass before finalizing the timeout value. Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> * Update kserve_test.sh Signed-off-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com> * Update kserve_test.sh Signed-off-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com> Signed-off-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com> * Apply suggestion from @juliusvonkohout Signed-off-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com> * Update kserve_test.sh Signed-off-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com> * Apply suggestion from @juliusvonkohout Signed-off-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com> * fix(kserve): use Standard instead of deprecated RawDeployment annotation KServe v0.16 renamed RawDeployment to Standard (and Serverless to Knative). Using the legacy name causes the validation webhook to reject status updates during deletion, leaving the finalizer in place and the ISVC stuck in a terminating state indefinitely. Update the kserve_test.sh annotation and comments to use Standard. Ref: kserve/kserve#4710, kserve/kserve#4798 Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> --------- Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com> Signed-off-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com> Co-authored-by: Julius von Kohout <45896133+juliusvonkohout@users.noreply.github.com>
Updated spelling mistake
Which issue is resolved by this Pull Request:
Resolves #900
Description of your changes:
Fixed a simple spelling mistake
Checklist:
Simple typo fix in README.md
cd manifests/testsmake generate-changed-onlymake testThis change is