feat: resolve steps referencing StepActions concurrently#8925
Conversation
|
|
|
The following is the coverage report on the affected files.
|
|
/kind feature |
fbe1cd7 to
e48f5d1
Compare
|
The following is the coverage report on the affected files.
|
afrittoli
left a comment
There was a problem hiding this comment.
Thanks for this, it looks good!
A few minor comments, nothing blocking.
/approve
pkg/apis/config/testdata/config-defaults-step-action-parallelism-limit-err.yaml
Outdated
Show resolved
Hide resolved
pkg/apis/config/testdata/config-defaults-step-action-parallelism-limit.yaml
Outdated
Show resolved
Hide resolved
| } | ||
|
|
||
| // HasStepRefs provides a fast check to see if any steps in a TaskSpec contain a reference to a StepAction. | ||
| func HasStepRefs(taskSpec *v1.TaskSpec) bool { |
There was a problem hiding this comment.
Is this exported only so that it may have dedicated unit tests?
There was a problem hiding this comment.
Indeed, initially the function was private but not wanting to modify the test file too much, I made it public, thinking that it could be a function that could be useful in other contexts.
However, I introduced this commit to make it private if that makes more sense to you.
Let me know what your preference is and I'd be happy to squash or remove that commit.
There was a problem hiding this comment.
Thanks @Maximilien-R for the extra commit. Either way is probably ok.
We usually don't export functions unless we need to, but we also have a policy (which we don't always honour), to only tests for exported functions, meaning that other functions can only be tested indirectly through their calling function.
In this case, I'm not sure which of the two policies would win, it seems reasonable to have unit tests specifically for that function. @vdemeester any preference?
There was a problem hiding this comment.
@afrittoli @vdemeester should we document this under the contribution guide if not documented already ?
waveywaves
left a comment
There was a problem hiding this comment.
The default-step-action-parallelism-limit doesn't specify exactly what is being parallelized, and don't think these would be running in parallel as your PR adds changes to throttle the concurrency of StepAction resolution go routines (g.Go() usage) which doesn't guarantee parallel CPU execution. I believe using the term concurrency here is much better.
What do you think about updating the config key to another identifier which reflects this better default-step-ref-concurrency-limit (considering this is to resolve the references), or default-step-action-concurrency-limit (replacing the existing parallemism from the key mentioned in this PR to concurrency) or maybe something else ...
|
/retest |
c3c4e4a to
c2ae737
Compare
|
The following is the coverage report on the affected files.
|
|
The following is the coverage report on the affected files.
|
|
@afrittoli, I have applied the various proposed corrections as well as converting the public @waveywaves As suggested, I replaced the various occurrences of the notion of "parallelism" with "concurrency" and "step action" with "step ref", indeed, this seems more coherent to me. I also took the opportunity to expand the documentation around the configuration key to make things more explicit and detailed. Note: I've pushed unit commits to make it easier to review and identify the changes made. When you're satisfied with the results, I'd be glad to squash everything into one commit and update the body and message of my initial commit. |
|
/retest |
There was a problem hiding this comment.
Based on #8925 (comment), I see a -2.9% delta on one of the files wrt unit test coverage. Can we try having a smaller delta <0.3-0.5%?
Apart from that it looks good, I can lgtm after we have this one update 👼
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: afrittoli, waveywaves The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
c2ae737 to
72fc43f
Compare
|
The following is the coverage report on the affected files.
|
|
Hi @waveywaves, I added two more commits:
I also took the opportunity to rebase my branch on If the two added commits are fine with you I can squash them too 👍 |
|
@Maximilien-R thank you for your work, I'll review it soon, if you can squash it that would be great |
Avoids unnecessary DeepCopy operations on steps that do not reference a StepAction. Introduces concurrent resolution of steps that reference StepActions to improve the performance of TaskRun reconciliation, especially when using remote resolvers like git. The key changes include: - `hasStepRefs` function: A new function that quickly checks if a `TaskSpec` contains any steps referencing `StepActions`. This allows for an early exit if no resolution is needed, avoiding unnecessary work. - `resolveStepRef` function: This new function encapsulates the logic for resolving a single `StepAction` reference. It handles fetching the remote resource, merging the `StepAction` with the step's specification, and returning the resolved step - Two-phase resolution: The `GetStepActionsData` function is now split into two distinct phases: - Concurrent Resolution: All `StepAction` references are resolved concurrently using an `errgroup`. - Sequential Merging: The resolved steps and their provenance are merged into the final step list and the `TaskRun` status sequentially. - `updateTaskRunProvenance` function: A dedicated function for updating the TaskRun's status with provenance information. The maximum number of StepActions that can be resolved concurrently is defined by the default config and its `default-step-ref-concurrency-limit` key.
72fc43f to
9b1f2e7
Compare
|
The following is the coverage report on the affected files.
|
|
/ok-to-test |
|
/lgtm thank you for your work on this very useful feature! |
@waveywaves It seems the e2e-tests failed, is it possible to rerun it ? Thank you |
|
/retest |
This PR improves the performance of
TaskRunreconciliation by resolvingStepActionsconcurrently and refactors the resolution logic for better efficiency.The problem
Currently, when a
Taskcontains multiple steps that referenceStepActions, the resolution of these references is performed sequentially. This can lead to significant delays in starting aTaskRun, particularly when using remote resolvers like git, as each resolution adds to the total time.Additionally, the existing code performs a deep copy of every step, regardless of whether it references a
StepAction, leading to unnecessary memory allocations.The changes
This pull request introduces two main improvements to
StepActionresolution:Concurrent resolution:
StepActionsare now resolved concurrently using anerrgroup. This reduces the time required to processTaskRunsthat contain multiple steps with remoteStepActionreferences, such as those from a git repository.Code refactoring: The resolution logic in
taskspec.gohas been refactored for clarity and maintainability. This includes:HasStepRefsfunction for an early exit if noStepActionsneed to be resolved.resolveStepReffunction to encapsulate the logic of resolving a singleStepAction.updateTaskRunProvenancefunction to handle status updates cleanly.DeepCopyto only occur when astep.Refis present./kind feature