fix(backend): avoid bucket refresh on metadata retry. Fixes #13501 #13503
MikeTomlin19 wants to merge 2 commits into
Signed-off-by: MikeTomlin19 <17487859+MikeTomlin19@users.noreply.github.com>
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR refines artifact upload retry behavior by refreshing the object-store bucket only for upload-related failures (not MLMD/metadata failures), and adds test coverage for the new behavior.
Changes:
- Add a typed upload error (`outputArtifactUploadError`) to distinguish upload failures from metadata failures.
- Refresh bucket credentials only when the failure is an upload error; allow injecting a custom bucket opener for testing.
- Extend tests to assert bucket refresh behavior on upload vs. metadata retry paths.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| backend/src/v2/component/launcher_v2.go | Adds upload error typing and conditional bucket refresh logic with injectable opener. |
| backend/src/v2/component/launcher_v2_test.go | Updates OpenBucketConfig construction and adds tests verifying refresh/no-refresh behavior. |
    if errors.As(err, &uploadErr) {
        glog.Info("Refreshing credentials before retrying artifacts upload.")
        openBucket := objectstore.OpenBucket
        if openBucketConfig.open != nil {
            openBucket = openBucketConfig.open
        }
        refreshedBucket, err := openBucket(
            openBucketConfig.ctx,
            openBucketConfig.k8sClient,
            openBucketConfig.namespace,
            openBucketConfig.config,
        )
Handled in cfa7726 by returning an explicit wrapped error when an upload failure needs a credential refresh but no OpenBucketConfig is available, instead of dereferencing nil. I also added a regression test for that path.
    refreshedBucket, err := openBucket(
        openBucketConfig.ctx,
        openBucketConfig.k8sClient,
        openBucketConfig.namespace,
        openBucketConfig.config,
    )
    if err != nil {
        glog.Infof("Failed to refresh credentials: %v", err)
        finalErr = err
        continue
    }
    opts.bucket = refreshedBucket
I left bucket closing unchanged here because uploadOutputArtifactsWithRetry receives the bucket through opts and does not own the caller-provided bucket lifecycle. Closing the old bucket inside this helper would be a behavior change for the caller. The retry now swaps to a refreshed bucket only after open succeeds, preserving the existing ownership semantics.
Signed-off-by: MikeTomlin19 <17487859+MikeTomlin19@users.noreply.github.com>
Summary
RecordArtifact failures during output artifact retry
Verification
- `go test -v -run 'Test_uploadOutputArtifactsWithRetry_refreshesBucketAfterUploadFailure|Test_executeV2_publishLogs/retry_required_-_component_success|Test_executeV2_publishLogs/retry_required_-_component_failure' ./backend/src/v2/component`
- `go test ./backend/src/v2/component ./backend/src/v2/objectstore ./backend/src/v2/metadata`
- `autoreview --mode local --engine codex --model gpt-5.3-codex-spark` returned clean
Fixes #13501