chunked builtin backup engine #20167
Signed-off-by: Renan Rangel <rrangel@slack-corp.com>
Pull request overview
This PR adds optional chunking to the builtinbackupengine so that large MySQL files can be backed up and restored as independently-compressed pieces, enabling higher parallelism (especially beneficial for object stores like S3) and improving restore throughput.
Changes:
- Introduces chunk metadata in the backup manifest (`FileEntry.Chunks`) and new flags to control chunking (`--builtinbackup-file-chunk-threshold`, `--builtinbackup-file-chunk-size`).
- Updates builtin backup/restore to split large files into chunks for parallel backup and to restore chunked files via parallel `WriteAt` (pwrite-style) writes into a pre-sized destination.
- Adds unit and end-to-end tests validating chunk name parsing and verifying chunked vs non-chunked backups via MANIFEST inspection.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| go/vt/mysqlctl/file_close_test.go | Updates tests for the new backupFile(..., chunkIndex) signature. |
| go/vt/mysqlctl/builtinbackupengine.go | Core implementation: chunking flags, manifest schema, chunked backup work scheduling, and parallel chunk restore. |
| go/vt/mysqlctl/builtinbackupengine_test.go | Adds unit tests for parsing storage names (parseBackupName). |
| go/test/endtoend/backup/vtctlbackup/backup_utils.go | Adds helpers to verify chunking by reading MANIFEST and counting chunks. |
| go/test/endtoend/backup/vtctlbackup/backup_test.go | Adds end-to-end tests for chunked and non-chunked builtin backups with forced small thresholds/sizes. |
| go/flags/endtoend/vttestserver.txt | Documents new builtinbackup chunking flags in end-to-end flag snapshots. |
| go/flags/endtoend/vttablet.txt | Documents new builtinbackup chunking flags in end-to-end flag snapshots. |
| go/flags/endtoend/vtctld.txt | Documents new builtinbackup chunking flags in end-to-end flag snapshots. |
| go/flags/endtoend/vtcombo.txt | Documents new builtinbackup chunking flags in end-to-end flag snapshots. |
| go/flags/endtoend/vtbackup.txt | Documents new builtinbackup chunking flags in end-to-end flag snapshots. |
Comments suppressed due to low confidence (2)
go/vt/mysqlctl/builtinbackupengine.go:1372
- Chunk restore goroutines close over loop variables `j` and `fe` (and use `dest`/`fe.Name` inside the closure). This can result in writing the wrong chunk offset/data and misreporting errors/logs. Rebind the loop variables (e.g. `j := j`, `feLocal := fe`) before starting each goroutine.
for j := range fe.Chunks {
g.Go(func() error {
chunk := &fe.Chunks[j]
select {
go/vt/mysqlctl/builtinbackupengine.go:1394
- Non-chunked restore goroutine closes over `i`/`fe` from the enclosing for-loop. This can cause it to restore the wrong file index and log/record errors under the wrong name. Capture locals (e.g. `iLocal := i`, `feLocal := fe`) before `g.Go`.
// Non-chunked file: restore as before.
g.Go(func() error {
name := strconv.Itoa(i)
select {
    if backupFileChunkThreshold > 0 && fileSize > backupFileChunkThreshold {
        numChunks := (fileSize + backupFileChunkSize - 1) / backupFileChunkSize
        fe.Chunks = make([]FileChunk, numChunks)
        for j := range numChunks {
            offset := j * backupFileChunkSize
it does compile, this sounds like advice for an older Go version?
for _, wi := range workItems {
g.Go(func() error {
fe := &fes[i]
name := strconv.Itoa(i)
fe := &fes[wi.feIndex]

// Check for context cancellation explicitly because, the way semaphore code is written, theoretically we might
// end up not throwing an error even after cancellation. Please see https://cs.opensource.google/go/x/sync/+/refs/tags/v0.1.0:semaphore/semaphore.go;l=66,
// which suggests that if the context is already done, `Acquire()` may still succeed without blocking. This introduces
// unpredictability in my test cases, so in order to avoid that, I am adding this cancellation check.
select {
// Skip work if the context has been cancelled (e.g. another goroutine failed).
case <-ctxCancel.Done():
log.Error(fmt.Sprintf("Context canceled or timed out during %q backup", fe.Name))
bh.RecordError(name, vterrors.Errorf(vtrpcpb.Code_CANCELED, "context canceled"))
bh.RecordError(wi.name, vterrors.Errorf(vtrpcpb.Code_CANCELED, "context canceled"))
return nil
default:
}

// Backup the individual file.
var errBackupFile error
if errBackupFile = be.backupFile(ctxCancel, params, bh, fe, name); errBackupFile != nil {
bh.RecordError(name, vterrors.Wrapf(errBackupFile, "failed to backup file '%s'", name))
if errBackupFile := be.backupFile(ctxCancel, params, bh, fe, wi.name, wi.chunkIndex); errBackupFile != nil {
bh.RecordError(wi.name, vterrors.Wrapf(errBackupFile, "failed to backup '%s'", wi.name))
same here, this seems pre 1.22: https://go.dev/doc/go1.22#language
  if files := bh.GetFailedFiles(); len(files) > 0 {
      newFEs := make([]FileEntry, len(fes))
      for _, file := range files {
-         fileNb, err := strconv.Atoi(file)
-         if err != nil {
-             return "", vterrors.Wrapf(err, "failed to retry file '%s'", file)
+         feIdx, chunkIdx, parseErr := parseBackupName(file)
+         if parseErr != nil {
+             return "", parseErr
          }
-         oldFes := fes[fileNb]
-         newFEs[fileNb] = FileEntry{
-             Base: oldFes.Base,
-             Name: oldFes.Name,
-             ParentPath: oldFes.ParentPath,
-             Hash: oldFes.Hash,
-             RetryCount: 1,
+         oldFe := fes[feIdx]
+         if newFEs[feIdx].Name == "" {
+             newFEs[feIdx] = FileEntry{
+                 Base: oldFe.Base,
+                 Name: oldFe.Name,
+                 ParentPath: oldFe.ParentPath,
+                 Hash: oldFe.Hash,
+                 RetryCount: 1,
+             }
+         }
+         if chunkIdx >= 0 {
+             newFEs[feIdx].Chunks = append(newFEs[feIdx].Chunks, oldFe.Chunks[chunkIdx])
+         }
Promptless prepared a documentation update related to this change. Triggered by PR #20167 Added documentation for the new |
    fullPath, pathErr := fe.fullPath(params.Cnf)
    if pathErr != nil {
        return vterrors.Wrapf(pathErr, "can't get path for chunked file %v", fe.Name)
    }
    dest, openErr := os.OpenFile(fullPath, os.O_WRONLY, 0o644)
    if openErr != nil {
        return vterrors.Wrapf(openErr, "can't open destination for chunked file %v", fe.Name)
    }
Codecov Report
❌ Patch coverage is
Additional details and impacted files (coverage diff, main vs. #20167):
| Metric | main | #20167 | +/- |
|---|---|---|---|
| Coverage | 69.67% | 52.78% | -16.89% |
| Files | 1614 | 46 | -1568 |
| Lines | 216793 | 7290 | -209503 |
| Hits | 151044 | 3848 | -147196 |
| Misses | 65749 | 3442 | -62307 |
    type FileChunk struct {
        StorageName string
        Offset int64
        Size int64
        Hash string
    }
Let's add comments for each field, similar to the FileEntry struct
    func computeFileChunks(fileIndex int, fileSize, chunkSize int64) []FileChunk {
        numChunks := (fileSize + chunkSize - 1) / chunkSize
        chunks := make([]FileChunk, numChunks)
        for j := range numChunks {
            offset := j * chunkSize
            size := chunkSize
            if offset+size > fileSize {
                size = fileSize - offset
            }
            chunks[j] = FileChunk{
                StorageName: fmt.Sprintf("%d-%d", fileIndex, j),
                Offset: offset,
                Size: size,
            }
        }
        return chunks
    }
    for j := range fe.Chunks {
        g.Go(func() error {
            chunk := &fe.Chunks[j]

            select {
            // Skip work if the context has been cancelled (e.g. another goroutine failed).
            case <-ctx.Done():
                log.Error(fmt.Sprintf("Context canceled or timed out during %q chunk %d restore", fe.Name, j))
                bh.RecordError(chunk.StorageName, vterrors.Errorf(vtrpcpb.Code_CANCELED, "context canceled"))
                return nil
    if backupFileChunkSize <= 0 {
        return BackupUnusable, vterrors.Errorf(vtrpcpb.Code_FAILED_PRECONDITION, "builtinbackup-file-chunk-size can't be zero")
    }
    cleanup := func() error {
        params.Logger.Infof("closing decompressor")
        closeAt := time.Now()
        cerr := closeWithRetry(ctx, params.Logger, closer, "decompressor")
        if cerr != nil {
            cerr = vterrors.Wrapf(cerr, "failed to close decompressor %v", name)
            params.Logger.Error(cerr)
        }
        params.Stats.Scope(stats.Operation("Decompressor:Close")).TimedIncrement(time.Since(closeAt))
        return cerr
    }
mattlord left a comment:
go/vt/mysqlctl/builtinbackupengine.go:1386-1390 ignores final close errors for chunked restore destinations. Non-chunked restore propagates destination close failures because they can mean data was not safely flushed; chunked restore only logs them and can report success, then later attempt to start MySQL on an incomplete/corrupt file. Please collect these close errors and return them, and also check the dest.Close() in createChunkedDestinations at line 1367.
go/vt/mysqlctl/builtinbackupengine.go:196-198 / :691-694 has no bound on the chunk count. A small --builtinbackup-file-chunk-size typo, e.g. 1, can allocate one FileChunk and one work item per byte of a large InnoDB file before backup starts. Please enforce a sane minimum chunk size or a max chunks-per-file limit before allocating.
go/vt/mysqlctl/builtinbackupengine.go:281-283 validates chunk size even when chunking is disabled or when taking an incremental backup that may not use chunking. I’d validate chunk-size > 0 only when chunk-threshold > 0, reject negative thresholds explicitly, and fix the message to say must be > 0.
I agree with the compatibility caveat too: once chunking is enabled, those backups are not restorable by older Vitess versions because old restore code ignores Chunks and looks for whole-file objects. That should be called out in release notes summary.
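The validation rules requested in this review could be sketched roughly as follows. This is not the code that landed in the PR: `validateChunking` and `maxChunksPerFile` are hypothetical, and the limit value is an arbitrary placeholder because the review does not name one.

```go
package main

import (
	"errors"
	"fmt"
)

// maxChunksPerFile is a hypothetical cap; the review only asks for "a sane
// minimum chunk size or a max chunks-per-file limit" without giving a number.
const maxChunksPerFile = 10_000

// validateChunking checks the chunking settings for one file before any
// FileChunk slice is allocated.
func validateChunking(threshold, chunkSize, fileSize int64) error {
	if threshold < 0 {
		return errors.New("chunk threshold must be >= 0")
	}
	if threshold == 0 {
		return nil // chunking disabled, so the chunk size is irrelevant
	}
	if chunkSize <= 0 {
		return errors.New("chunk size must be > 0")
	}
	if fileSize > threshold {
		n := (fileSize + chunkSize - 1) / chunkSize
		if n > maxChunksPerFile {
			return fmt.Errorf("file would be split into %d chunks, limit is %d", n, maxChunksPerFile)
		}
	}
	return nil
}

func main() {
	// A chunk size of 1 byte on a 1 GiB file is rejected before allocation.
	fmt.Println(validateChunking(1, 1, 1<<30))
	// Chunking disabled: a zero chunk size is not an error.
	fmt.Println(validateChunking(0, 0, 1<<30))
}
```

Checking the chunk count before calling `make` is what addresses the allocation concern: the number of chunks is computed from two integers, so rejecting a typo costs nothing.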
thanks @mattlord, I have updated the PR! let me know if you spot any other issues or want me to make any changes.
Description
This is the first PR as part of #20159
This PR adds chunked parallel backup/restore to the builtin backup engine. Files larger than a configurable threshold are split into independently-compressed chunks during backup, which can then be restored in parallel using writes at known offsets.
Changes:
- `--builtinbackup-file-chunk-threshold` (default 0, disabled) and `--builtinbackup-file-chunk-size` (default 1GiB)
Related Issue(s)
Checklist
Deployment Notes
AI Disclosure
PR created by me with support of Claude, fully tested by me before publishing on our own branch and tested with unit tests and e2e on the `main` branch