fix: Prevent excessive reconciliation when timeout disabled #9202
c99df33 to
70e919d
Compare
|
/hold |
twoGiants
left a comment
Great job on the fix 😸 👍. My brain got challenged understanding the timeout order... 🧠 ⚡. It looks good though! I would just keep the code a bit more consistent; it will aid future understanding.
See my comments below.
- Stop immediate requeue loops when default-timeout-minutes is "0"
- Remove redundant hasCondition checks (Knative already deduplicates)

This adds an e2e test that looks at the pipeline logs to count how many reconciler loops occur. If you run it before the fix, it counts more than 1500 reconciler loops, whereas with the fix it counts only about 10.

Signed-off-by: Vincent Demeester <[email protected]>
Co-Authored-By: Claude <[email protected]>
661ed4e to
9d13377
Compare
|
@twoGiants I changed the if/else chains a bit to make them less weird; hopefully it's more readable. |
|
/hold cancel |
|
/retest |
Looks good! 😸 👍
I would just add a category to the test (parallel) and return early in an if.
/approve
/lgtm
if timeout != config.NoTimeoutDuration {
	waitTime := timeout - elapsed
	if finallyWaitTime < waitTime {
		waitTime = finallyWaitTime
Suggested change:
- waitTime = finallyWaitTime
+ return controller.NewRequeueAfter(finallyWaitTime)
// With the fix, reconciliations should stay well below 20 (typically around 10 or less).
//
// This test validates the fix by counting actual reconciliations from controller logs.
func TestPipelineRunExcessiveReconciliation(t *testing.T) {
Categorize the test as a parallel test following the new categorization logic.
|
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: twoGiants

The full list of commands accepted by this bot can be found here.
The pull request process is described here.

Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Use early returns in timeout handling for better code consistency and readability throughout the reconciliation logic. Add missing @test:execution=parallel annotation to the excessive reconciliation e2e test for proper test categorization. Addresses review feedback from PR tektoncd#9202.
|
/cherry-pick release-v1.6.x release-v1.3.x release-v1.0.x |
|
✅ Cherry-pick to A new pull request has been created to cherry-pick this change to Please review and merge the cherry-pick PR. |
|
✅ Cherry-pick to A new pull request has been created to cherry-pick this change to Please review and merge the cherry-pick PR. |
|
✅ Cherry-pick to A new pull request has been created to cherry-pick this change to Please review and merge the cherry-pick PR. |
Bump Go version to 1.24.0 to enable t.Context() usage in tests, which was introduced in Go 1.24. This is required for the cherry-pick of tektoncd#9202 (excessive reconciliation fix) to work on the release-v1.0.x branch. Co-Authored-By: Claude Opus 4.5 <[email protected]>
|
/cherry-pick release-v1.0.x |
|
❌ Cherry-pick to The automatic cherry-pick to Output: Next steps:
|
|
/cherry-pick release-v1.3.x |
|
❌ Cherry-pick to The automatic cherry-pick to Output: Next steps:
|
|
/cherry-pick release-v1.3.x |
|
❌ Cherry-pick to The automatic cherry-pick to Output: Next steps:
|
|
arf, we probably need to do it manually ? 🤔 or something is fishy... |
Changes
This adds an e2e test that looks at the pipeline logs to count how many reconciler loops occur. If you run it before the fix, it counts more than 1500 reconciler loops, whereas with the fix it counts only about 10.
/cc @afrittoli @pritidesai @tektoncd/core-maintainers
It took me a while to figure out, and I got some help from Claude (AI) to write the tests. The previous behavior seemed very weird as well: with no timeout, we would re-queue instantly, which is... madness 🙃
/kind bug
Fixes #8495
This could be a good candidate to be backported.
Submitter Checklist
As the author of this PR, please check off the items in this checklist:
/kind <type>. Valid types are bug, cleanup, design, documentation, feature, flake, misc, question, tep
Release Notes