VTGate: fix warming reads timeout context#19674
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
@copilot please provide a review of this draft
Pull request overview
Fixes VTGate replica warming reads so --warming-reads-query-timeout actually bounds the warming queries by propagating the timeout context into the warming execution path.
Changes:
- Updated VCursor.CloneForReplicaWarming() to return both a cloned VCursor and the context to use for the warming query.
- Switched the warming read goroutine to use the returned context for routing and execution.
- Added/updated unit coverage to assert the warming read executes with a deadline-bearing context.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| go/vt/vtgate/executorcontext/vcursor_impl.go | Changes CloneForReplicaWarming to return the timeout context alongside the cloned VCursor. |
| go/vt/vtgate/engine/route.go | Uses the returned warming context for findRoute and ExecuteMultiShard. |
| go/vt/vtgate/engine/primitive.go | Updates VCursor interface signature for CloneForReplicaWarming. |
| go/vt/vtgate/engine/route_warming_reads_test.go | Extends test VCursor to return a timeout context and asserts a deadline is present. |
| go/vt/vtgate/engine/fake_vcursor_test.go | Adjusts test VCursor implementations to match the new interface signature. |
Comments suppressed due to low confidence (1)
go/vt/vtgate/executorcontext/vcursor_impl.go:306
CloneForReplicaWarming creates a context.WithTimeout but discards the returned cancel func. Now that this context is used to run the actual warming query, failing to call cancel will keep the timer/resources around until the timeout fires (and if the warming read is dropped, it still allocates a timer). Consider returning a context.CancelFunc (or encapsulating cancellation in the returned VCursor) and ensuring callers defer cancel() when done; also guard WarmingReadsTimeout <= 0 if "no timeout" is intended (since WithTimeout(..., 0) is immediately expired).
func (vc *VCursorImpl) CloneForReplicaWarming(ctx context.Context) (engine.VCursor, context.Context) {
	callerId := callerid.EffectiveCallerIDFromContext(ctx)
	immediateCallerId := callerid.ImmediateCallerIDFromContext(ctx)
	timedCtx, _ := context.WithTimeout(context.Background(), vc.config.WarmingReadsTimeout) //nolint
	clonedCtx := callerid.NewContext(timedCtx, callerId, immediateCallerId)
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Description
Fixes the warming reads timeout context so it actually works as intended.
CloneForReplicaWarming() was creating a context.WithTimeout using the --warming-reads-query-timeout flag value, but:
- The cancel func was discarded (suppressed with //nolint), leaking the timeout goroutine.
- The parent ctx was passed to findRoute() and ExecuteMultiShard(), so when the primary query completed and its context was cancelled, the warming read got a KILL QUERY.
The dedicated --warming-reads-query-timeout has been effectively unused since the introduction of warming reads in v16.
This PR introduces a WarmingReadsContext() method on the VCursor interface that creates the timeout context inside the warming goroutine using context.WithoutCancel(ctx) as the base (preserves tracing spans and caller IDs while detaching from parent cancellation). The clone is also moved inside the select case to avoid unnecessary work when the channel is full.
Test coverage includes:
- findRoute() receives the warming context (via ResolveDestinations hook)
- the DeadlineExceeded case
The justification for backports is that this timeout has not worked as designed since the introduction of warming reads in v16. cc @olyazavr
Related Issue(s)
Resolves: #19673
Related: #19656
Checklist
Deployment Notes
N/A
AI Disclosure
Claude Code assisted with development and testing. Claude Code prepared this PR summary.