onlineddl: context-scoped bulk migration operations#20147
onlineddl: context-scoped bulk migration operations#20147shlomi-noach wants to merge 13 commits into
Conversation
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
Tests
Documentation
New flags
If a workflow is added or modified:
Backward compatibility
|
There was a problem hiding this comment.
Pull request overview
This PR extends ALTER VITESS_MIGRATION ... CONTEXT 'ctx' support from just CANCEL to the full set of bulk migration operations, so bulk actions can be scoped to a specific migration_context without impacting other contexts.
Changes:
- Extends the SQL grammar/AST formatting to parse + round-trip
... CONTEXT 'ctx'for all bulk operations. - Threads
migrationContextthrough OnlineDDL executor “All” operations and filters pending migrations accordingly (plus a context-specific cleanup UPDATE). - Adds end-to-end helpers/tests to validate each context-scoped bulk operation.
Reviewed changes
Copilot reviewed 10 out of 14 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| go/vt/vttablet/tabletserver/query_executor.go | Passes alterMigration.Context to OnlineDDL executor bulk-operation dispatch. |
| go/vt/vttablet/onlineddl/schema.go | Adds a context-scoped cleanup UPDATE query constant. |
| go/vt/vttablet/onlineddl/executor.go | Adds migrationContext param to bulk executor methods and filters to matching contexts. |
| go/vt/sqlparser/sql.y | Adds CONTEXT token + grammar rules for context-scoped bulk migration operations. |
| go/vt/sqlparser/parse_test.go | Adds parser round-trip coverage for the new ... CONTEXT statements. |
| go/vt/sqlparser/keywords.go | Registers context keyword/token. |
| go/vt/sqlparser/cached_size.go | Updates cached-size accounting for AlterMigration to include Context. |
| go/vt/sqlparser/ast.go | Adds Context field to the AlterMigration AST node. |
| go/vt/sqlparser/ast_format.go | Formats bulk operations as ... context 'ctx' when Context is set. |
| go/vt/sqlparser/ast_format_fast.go | Adds fast formatting for the new Context field (see review comment about escaping). |
| go/vt/sqlparser/ast_equals.go | Includes Context in AlterMigration equality comparisons. |
| go/test/endtoend/onlineddl/vtgate_util.go | Adds VTGate helper functions for context-scoped operations (see review comment about query construction). |
| go/test/endtoend/onlineddl/scheduler/onlineddl_scheduler_test.go | Adds scheduler e2e tests validating context-scoped behavior across operations. |
Files not reviewed (3)
- go/vt/sqlparser/ast_equals.go: Language not supported
- go/vt/sqlparser/ast_format_fast.go: Language not supported
- go/vt/sqlparser/cached_size.go: Language not supported
Comments suppressed due to low confidence (7)
go/test/endtoend/onlineddl/vtgate_util.go:253
- This query is built via fmt.Sprintf with migrationContext embedded in quotes; contexts containing quotes/backslashes will break the SQL and can lead to unintended statements. Prefer sqlparser.ParseAndBind with %a + StringBindVariable for migrationContext.
// CheckPostponeCompleteContextMigrations postpones completion of all pending migrations with a given context and expects number of affected rows.
// A negative value for expectCount indicates "don't care, no need to check"
func CheckPostponeCompleteContextMigrations(t *testing.T, vtParams *mysql.ConnParams, migrationContext string, expectCount int) {
query := fmt.Sprintf("alter vitess_migration postpone complete context '%s'", migrationContext)
r := VtgateExecQuery(t, vtParams, query, "")
go/test/endtoend/onlineddl/vtgate_util.go:286
- This query is built via fmt.Sprintf with migrationContext embedded in quotes; contexts containing quotes/backslashes will break the SQL and can lead to unintended statements. Prefer sqlparser.ParseAndBind with %a + StringBindVariable for migrationContext.
// CheckCancelContextMigrations cancels all pending migrations with a given context and expect number of affected rows
// A negative value for expectCount indicates "don't care, no need to check"
func CheckCancelContextMigrations(t *testing.T, vtParams *mysql.ConnParams, migrationContext string, expectCount int) {
cancelQuery := fmt.Sprintf("alter vitess_migration cancel context '%s'", migrationContext)
r := VtgateExecQuery(t, vtParams, cancelQuery, "")
go/test/endtoend/onlineddl/vtgate_util.go:297
- This query is built via fmt.Sprintf with migrationContext embedded in quotes; contexts containing quotes/backslashes will break the SQL and can lead to unintended statements. Prefer sqlparser.ParseAndBind with %a + StringBindVariable for migrationContext.
// CheckCleanupContextMigrations cleans up terminal migrations with a given context and expects number of affected rows.
// A negative value for expectCount indicates "don't care, no need to check"
func CheckCleanupContextMigrations(t *testing.T, vtParams *mysql.ConnParams, migrationContext string, expectCount int) uint64 {
query := fmt.Sprintf("alter vitess_migration cleanup context '%s'", migrationContext)
r := VtgateExecQuery(t, vtParams, query, "")
go/test/endtoend/onlineddl/vtgate_util.go:322
- This query is built via fmt.Sprintf with migrationContext embedded in quotes; contexts containing quotes/backslashes will break the SQL and can lead to unintended statements. Prefer sqlparser.ParseAndBind with %a + StringBindVariable for migrationContext.
// CheckLaunchContextMigrations launches all queued postponed migrations with a given context and expects number of affected rows.
// A negative value for expectCount indicates "don't care, no need to check"
func CheckLaunchContextMigrations(t *testing.T, vtParams *mysql.ConnParams, migrationContext string, expectCount int) {
query := fmt.Sprintf("alter vitess_migration launch context '%s'", migrationContext)
r := VtgateExecQuery(t, vtParams, query, "")
go/test/endtoend/onlineddl/vtgate_util.go:343
- This query is built via fmt.Sprintf with migrationContext embedded in quotes; contexts containing quotes/backslashes will break the SQL and can lead to unintended statements. Prefer sqlparser.ParseAndBind with %a + StringBindVariable for migrationContext.
// CheckForceCutOverContextMigrations marks all pending migrations with a given context for forced cut-over and expects number of affected rows.
// A negative value for expectCount indicates "don't care, no need to check"
func CheckForceCutOverContextMigrations(t *testing.T, vtParams *mysql.ConnParams, migrationContext string, expectCount int) {
query := fmt.Sprintf("alter vitess_migration force_cutover context '%s'", migrationContext)
r := VtgateExecQuery(t, vtParams, query, "")
go/test/endtoend/onlineddl/vtgate_util.go:493
- This query is built via fmt.Sprintf with migrationContext embedded in quotes; contexts containing quotes/backslashes will break the SQL and can lead to unintended statements. Prefer sqlparser.ParseAndBind with %a + StringBindVariable for migrationContext.
// ThrottleContextMigrations throttles all pending migrations with a given context.
func ThrottleContextMigrations(t *testing.T, vtParams *mysql.ConnParams, migrationContext string) {
query := fmt.Sprintf("alter vitess_migration throttle context '%s' expire '24h' ratio 1", migrationContext)
_ = VtgateExecQuery(t, vtParams, query, "")
go/test/endtoend/onlineddl/vtgate_util.go:505
- This query is built via fmt.Sprintf with migrationContext embedded in quotes; contexts containing quotes/backslashes will break the SQL and can lead to unintended statements. Prefer sqlparser.ParseAndBind with %a + StringBindVariable for migrationContext.
// UnthrottleContextMigrations unthrottles all pending migrations with a given context.
func UnthrottleContextMigrations(t *testing.T, vtParams *mysql.ConnParams, migrationContext string) {
query := fmt.Sprintf("alter vitess_migration unthrottle context '%s'", migrationContext)
_ = VtgateExecQuery(t, vtParams, query, "")
3e170fd to
7700c65
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 10 out of 14 changed files in this pull request and generated no new comments.
Files not reviewed (3)
- go/vt/sqlparser/ast_equals.go: Language not supported
- go/vt/sqlparser/ast_format_fast.go: Language not supported
- go/vt/sqlparser/cached_size.go: Language not supported
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #20147 +/- ##
===========================================
- Coverage 69.67% 62.05% -7.62%
===========================================
Files 1614 135 -1479
Lines 216793 30725 -186068
===========================================
- Hits 151044 19067 -131977
+ Misses 65749 11658 -54091
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
5572440 to
f862bc0
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 10 out of 14 changed files in this pull request and generated 1 comment.
Files not reviewed (3)
- go/vt/sqlparser/ast_equals.go: Language not supported
- go/vt/sqlparser/ast_format_fast.go: Language not supported
- go/vt/sqlparser/cached_size.go: Language not supported
| | ALTER comment_opt VITESS_MIGRATION CANCEL CONTEXT STRING | ||
| { | ||
| $$ = &AlterMigration{ | ||
| Type: CancelAllMigrationType, | ||
| Context: $6, | ||
| } |
There was a problem hiding this comment.
OK OK you're right and I give up. Empty context is now rejected by the parser. I like this idea better than my original approach.
a707fc3 to
9a0d463
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 10 out of 14 changed files in this pull request and generated no new comments.
Files not reviewed (3)
- go/vt/sqlparser/ast_equals.go: Language not supported
- go/vt/sqlparser/ast_format_fast.go: Language not supported
- go/vt/sqlparser/cached_size.go: Language not supported
28f157f to
03b669b
Compare
Introduces `ALTER VITESS_MIGRATION CANCEL CONTEXT 'ctx'` syntax to cancel all pending migrations belonging to a given migration context, without affecting migrations from other contexts. Complements the existing CANCEL ALL. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
The test was accidentally placed inside testSingleton where t1uuid, t2uuid, trivialAlterT1Statement, and ddlStrategy are out of scope. Moved it to the end of testScheduler where those variables are declared. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
For every XxxAllMigrationType executor method, empty migrationContext is the sentinel for the ALL path. OP CONTEXT '' is therefore equivalent to OP ALL -- empty context is not a supported filter value. Document this in each method doc comment and add parse round-trip tests that confirm the formatter produces the expected ALL form. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
The refactoring to readPendingMigrations inadvertently widened the ALL path to include running/ready migrations. Restore sqlSelectQueuedMigrations as the data source (queued-only) and apply the CONTEXT filter in-loop on migration_context. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
ForceCutOverPendingMigrations, CompletePendingMigrations, and PostponeCompletePendingMigrations now track a matched counter and log "done iterating N migrations, matched M" — consistent with the CancelPendingMigrations pattern. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Previously the test held migrations in queued state via --postpone-launch and asserted queued status after POSTPONE COMPLETE CONTEXT, which proved nothing about postpone_completion semantics — the queued state was maintained by postpone_launch, not postpone_completion. New flow: set postpone_completion, then launch, wait for running, assert running persists (ensureStateNotChangedTime), then COMPLETE CONTEXT and wait for complete. This actually exercises the postpone_completion flag. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
LaunchMigrations filters via sqlSelectQueuedMigrations which requires reviewed_timestamp IS NOT NULL. The scheduler sets this in a later processing cycle after the migration enters queued state, so tests calling CheckLaunchContextMigrations immediately after WaitForMigrationStatus(queued) could race. Add WaitForMigrationReviewedTimestamp helper that polls SHOW VITESS_MIGRATIONS every second until the field is set, and use it in the two affected tests instead of a fixed sleep. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
|
Promptless prepared a documentation update related to this change. Triggered by PR #20147 Added documentation for the new context-scoped bulk migration operations ( |
mattlord
left a comment
There was a problem hiding this comment.
Shouldn't we have CLEANUP CONTEXT use the same exact context matching semantics as the other bulk operations in go/vt/vttablet/onlineddl/schema.go:191-197?
The new cleanup path filters with migration_context=%a in SQL, so it inherits the sidecar table's collation. schema_migrations is CHARSET=utf8mb4 without an explicit binary collation, and the supported MySQL versions today all default to a case-insensitive collation for utf8mb4. The other new context-scoped operations compare pm.migrationContext != migrationContext in Go, which is byte/case-sensitive. Since ValidateMigrationContext allows uppercase, contexts like ctx-a and CTX-a can be treated as distinct by COMPLETE/POSTPONE/CANCEL/LAUNCH/FORCE_CUTOVER/THROTTLE, but CLEANUP CONTEXT 'ctx-a' may mark both contexts for artifact cleanup.
Unless I'm missing something I think that we should either make the SQL cleanup predicate binary/exact, or define/implement context matching as case-insensitive consistently across all of these operations. A regression test with two terminal migrations whose contexts differ only by case would then cover this (maybe it will already pass as I'm missing something).
Otherwise it LGTM!
| for _, row := range rows { | ||
| uuid := row["migration_uuid"].ToString() | ||
| mc := row["migration_context"].ToString() | ||
| if migrationContext != "" && migrationContext != mc { |
There was a problem hiding this comment.
Might be worth trimming whitespace from both ends of the strings just in case.
There was a problem hiding this comment.
When looking to apply that, I realize the same could be said about dozens if not more other locations in the code where any migration context or uuid are involved. I'd say skip this.
main changed isOpen from atomic.LoadInt64(&e.isOpen) to e.isOpen.Load(); take that update while keeping the migrationContext parameter our branch added to CleanupAllMigrations, ForceCutOverPendingMigrations, CompletePendingMigrations, PostponeCompletePendingMigrations, and LaunchMigrations. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
sqlUpdateReadyForCleanupAllByContext filtered by migration_context using the table's default utf8mb4_0900_ai_ci collation (case-insensitive), while all other CONTEXT operations compare migration_context in Go with ==, which is byte-exact. Use COLLATE utf8mb4_bin in the WHERE clause to make the SQL predicate case-sensitive without changing the schema. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
That's a good catch. In choosing the best approach I saw two alternatives:
I picked the latter it's a simple enough change and I think worth it, even if this still means some inconsistency with existing logic. It's just so much simpler. Let me know if you disagree. |
| migration_status IN ('complete', 'cancelled', 'failed') | ||
| AND cleanup_timestamp IS NULL | ||
| AND retain_artifacts_seconds > 0 | ||
| AND migration_context COLLATE utf8mb4_bin=%a |
There was a problem hiding this comment.
This will prevent usage of the index on the migration_context column, and this is an update so it is taking X locks on rows it scans, but that is probably OK. We could alternatively make the column use this collation in the schema.
There was a problem hiding this comment.
Oh I forgot to explain, the index this query uses is:
KEY `cleanup_status_idx` (`cleanup_timestamp`, `migration_status`),and so migration_context is irrelevant to the index search.
There was a problem hiding this comment.
updated above comment which was poorly worded.
Followup to #20146, similar changes (same changes parser-wise, obviously different implementation onlineddl-executor-wise).
Fixes #20177.
Related: #20145
Extends the
ALTER VITESS_MIGRATION ... CONTEXT 'ctx'pattern to all bulk migration operations. Previously onlyCANCEL CONTEXT 'ctx'existed. Now the full set is supported:Each operation affects only migrations belonging to the given
migration_context, leaving unrelated contexts untouched.How it works
The
AlterMigrationAST node already carries aContext stringfield (added in the CANCEL CONTEXT PR). EachXxxAllMigrationTypeexecutor method now accepts an optionalmigrationContext stringparameter:CLEANUP CONTEXTuses a new SQL constant that addsAND migration_context=?to the existing cleanup query. All other operations iteratereadPendingMigrationsand skip non-matching contexts, matching the pattern established byCancelPendingMigrations.THROTTLE/UNTHROTTLE CONTEXTwork by iterating pending migrations in the context and throttling each by UUID (the individual-migration throttle path), since the whole-app throttle doesn't support context filtering.Changes
go/vt/sqlparser/— 7 new grammar rules, format branching for all ALL types, 11 parse round-trip testsgo/vt/vttablet/onlineddl/executor.go—migrationContextparam on all 7 "All" methodsgo/vt/vttablet/onlineddl/schema.go—sqlUpdateReadyForCleanupAllByContextconstantgo/vt/vttablet/tabletserver/query_executor.go— passalterMigration.Contextin all ALL-type dispatch casesgo/test/endtoend/onlineddl/— 7 new vtgate helpers + e2e scheduler tests for each operationMost of this was written by Claude Code — I provided direction.