RFC: incremental backup and point-in-time recovery #11227

@shlomi-noach

We wish to implement a native solution for (offline) incremental backup and compatible point-in-time recovery in Vitess. There is already a Work in Progress PR. But let's first describe the problem, the proposed solution, and how it differs from the existing prior implementation.

Background

Point-in-time recoveries make it possible to recover a database to a specific, or approximate, timestamp or position. The classic use case is a catastrophic change to the data, e.g. an unintentional DELETE FROM <table> or similar. Normally the damage only applies to a subset of the data, the database is generally still valid, and the app is still able to function. As such, we want to fix the specific damage inflicted. The flow is to restore the data on an offline/non-serving server, to a point in time immediately before the damage was done. It's then typically a manual process of salvaging the specific damaged records.

It's also possible to just throw away everything and roll back the entire database to that point in time, though that is an uncommon use case.

A point in time can be either an actual timestamp, or, more accurately, a position. Specifically in MySQL 5.7 and above, this will be a GTID set, the @@gtid_executed just before the damage. Since every transaction gets its own GTID value, it should be possible to restore at single-transaction granularity (whereas a timestamp is a coarser measurement).

A point in time recovery is possible by combining a full backup recovery, followed by an incremental stream of changes since that backup. There are two main techniques in three different forms:

  1. Using binary logs, stored offline
  2. Using a binary log live stream
  3. Using Xtrabackup incremental backup

This RFC wishes to address (1). There is already prior work for (2). Right now we do not wish to address (3).

The existing prior work addresses (2), and specifically assumes:

  • You have a binlog server in your topology
  • The binlog server still has all the required binary logs to perform the recovery
  • You are able to join your server into the live replication stream

Suggested solution, backup

We wish to implement a more general solution by actually backing up binary logs as part of the backup process. These can be stored on local disk, in S3, etc., the same way any Vitess backup is stored. In fact, an incremental backup will be listed just like any other backup, and this listing is also the key to performing a restore.

The user will take an incremental backup similarly to how they take a full backup:

  • Full backup: vtctlclient -- Backup zone1-0000000102
  • Incremental backup: vtctlclient -- Backup --incremental_from_pos "MySQL56/16b1039f-22b6-11ed-b765-0a43f95f28a3:1-615" zone1-0000000102
  • or, auto incremental backup: vtctlclient -- Backup --incremental_from_pos "auto" zone1-0000000102

An incremental backup needs a starting point, given by the --incremental_from_pos flag. The incremental backup must cover that position, but does not have to start exactly at that position: it can start at an earlier position. See diagram below. The backup ends at roughly the position at which the backup was requested. It will cover the exact point in time where the request was made, and possibly extend slightly beyond that.

An incremental backup is taken by copying binary logs. To do that, there is no need to shut down the MySQL server, and it is free to be fully operational and serve traffic while backup takes place. The backup process will rotate binary logs (FLUSH BINARY LOGS) so as to ensure the files it is backing up are safely immutable.

A manifest of an incremental backup may look like so:

{
  "BackupMethod": "builtin",
  "Position": "MySQL56/16b1039f-22b6-11ed-b765-0a43f95f28a3:1-883",
  "FromPosition": "MySQL56/16b1039f-22b6-11ed-b765-0a43f95f28a3:1-867",
  "Incremental": true,
  "BackupTime": "2022-08-25T12:55:05Z",
  "FinishedTime": "2022-08-25T12:55:05Z",
  "ServerUUID": "1ea0631b-22b6-11ed-933f-0a43f95f28a3",
  "TabletAlias": "zone1-0000000102",
  "CompressionEngine": "pargzip",
  "FileEntries": [
     ..
  ]
}
  • The above is an incremental backup's manifest, as clearly indicated by "Incremental": true
  • "FileEntries" will list binary log files
  • "FromPosition" indicates the first position covered by the backup. It is smaller than or equal to the requested --incremental_from_pos. This value is empty for a full backup.
  • ServerUUID is new and self explanatory, added for convenience
  • TabletAlias is new and self explanatory, added for convenience
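As an illustration of how these manifest fields can be read, here is a minimal Python sketch. It is not Vitess code: the helper names parse_position and summarize are hypothetical, the manifest is a trimmed copy of the one above with an empty "FileEntries" list, and the position parser assumes the simplified shape seen in this RFC (one server UUID, one interval starting at 1).

```python
import json

# Trimmed copy of the example manifest; "FileEntries" is left empty here.
MANIFEST = """
{
  "BackupMethod": "builtin",
  "Position": "MySQL56/16b1039f-22b6-11ed-b765-0a43f95f28a3:1-883",
  "FromPosition": "MySQL56/16b1039f-22b6-11ed-b765-0a43f95f28a3:1-867",
  "Incremental": true,
  "TabletAlias": "zone1-0000000102",
  "FileEntries": []
}
"""


def parse_position(pos):
    """Split 'MySQL56/<uuid>:1-<n>' into (flavor, uuid, n).

    Assumption: a single UUID with a single interval starting at 1.
    Real GTID sets can hold several UUIDs and several intervals.
    """
    flavor, gtid = pos.split("/", 1)
    uuid, interval = gtid.rsplit(":", 1)
    first, _, last = interval.partition("-")
    if first != "1":
        raise ValueError("sketch only handles intervals starting at 1")
    return flavor, uuid, int(last or first)


def summarize(manifest):
    """Return the backup kind and the transaction range it covers."""
    _, _, to_seq = parse_position(manifest["Position"])
    if not manifest.get("Incremental"):
        # A full backup has an empty FromPosition.
        return {"kind": "full", "from": None, "to": to_seq}
    _, _, from_seq = parse_position(manifest["FromPosition"])
    return {"kind": "incremental", "from": from_seq, "to": to_seq}


summary = summarize(json.loads(MANIFEST))
print(summary)
```

For the manifest above, the summary describes an incremental backup whose range starts at sequence 867 and ends at 883.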

Suggested solution, restore/recovery

Again, riding the familiar Restore command. A restore looks like:

vtctlclient -- RestoreFromBackup  --restore_to_pos  "MySQL56/16b1039f-22b6-11ed-b765-0a43f95f28a3:1-10000" zone1-0000000102

Vitess will attempt to find a path that recovers the database to that point in time. The path consists of exactly one full backup, followed by zero or more incremental restores. There could be exactly one such path, there could be multiple paths, or there could be no path. Consider the following scenarios:

Recovery scenario 1

point-in-time-recovery-path-1

This is the classic scenario. A full backup takes place at e.g. 12:10, then an incremental backup is taken from exactly that point and is valid to 13:20, then the next one is taken from exactly that point and is valid to 16:15, etc.

To restore the database to e.g. 20:00 (let's assume that's at position 16b1039f-22b6-11ed-b765-0a43f95f28a3:1-10000), we will restore the full backup, followed by incrementals 1 -> 2 -> 3 -> 4. Note that 4 extends beyond 20:00; Vitess will only apply changes up to 20:00, or to be more precise, up to 16b1039f-22b6-11ed-b765-0a43f95f28a3:1-10000.

Recovery scenario 2

point-in-time-recovery-path-2

The above is effectively equivalent to the first scenario. Notice how the first incremental backup precedes the full backup, and how backups 2 & 3 overlap. This is fine! We take strong advantage of MySQL's GTIDs. Because the overlapping transactions in 2 and 3 are consistently identified by the same GTIDs, MySQL is able to ignore the duplicates as we apply both restores one after the other.
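To make the duplicate-skipping behavior concrete, here is a toy Python model. It is only a sketch: transactions are reduced to integer sequence numbers under a single server UUID, the set named executed stands in for @@gtid_executed, and apply_incremental is a hypothetical helper, not a Vitess or MySQL API. The positions are made up for illustration.

```python
def apply_incremental(executed, from_seq, to_seq, restore_to):
    """Replay transactions from_seq+1..to_seq onto a toy server.

    - Transactions already present in `executed` are skipped, which is
      how a GTID-enabled server treats a transaction it has already applied.
    - Replay stops once the requested restore position is reached.
    Returns the number of transactions actually applied.
    """
    applied = 0
    for seq in range(from_seq + 1, to_seq + 1):
        if seq > restore_to:
            break  # never go past the requested point in time
        if seq in executed:
            continue  # duplicate from an overlapping backup
        executed.add(seq)
        applied += 1
    return applied


# Hypothetical positions: the full backup left the server at 1-100.
executed = set(range(1, 101))
restore_to = 250

# The first incremental starts before the full backup's position (90 < 100).
applied_a = apply_incremental(executed, 90, 200, restore_to)
# The second incremental overlaps the first one (150..200) and exceeds the target.
applied_b = apply_incremental(executed, 150, 300, restore_to)

print(applied_a, applied_b, max(executed))
```

In this toy run the first incremental contributes transactions 101 to 200, the second contributes only 201 to 250, and the server ends exactly at the requested position.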

Recovery scenario 3

point-in-time-recovery-path-3

In the above we have four different paths for recovery!

  • 1 -> 2 -> 3 -> 4
  • 1 -> 2 -> 6
  • 1 -> 5 -> 3 -> 4
  • 1 -> 5 -> 6

Any of these is valid, and Vitess may choose whichever it pleases, ideally using as few backups as possible (hence preferring the 2nd or 4th option).

Recovery scenario 4

If we wanted to restore up to 22:15, then there's no incremental backup that can take us there, and the operation must fail before it even begins.

Finding paths

Vitess should be able to determine the recovery path before actually applying anything. It is able to do so by reading the available manifests and finding the shortest valid path to the requested point in time. Using a greedy algorithm, it will seek the most recent full backup at or before the requested time, and then the shortest sequence of incremental backups to take us to that point.
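The greedy search described above can be sketched as follows. This is an illustrative Python sketch, not the implementation in the Work in Progress PR: the Backup class and find_restore_path function are hypothetical, positions are simplified to integers (the upper bound of a single GTID interval), and the example numbers are invented so that they reproduce the four paths of scenario 3.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Backup:
    name: str
    to_pos: int                      # last transaction covered (inclusive)
    from_pos: Optional[int] = None   # exclusive start; None means full backup


def find_restore_path(backups: List[Backup], restore_to: int) -> List[Backup]:
    """Pick one full backup, then greedily extend with incrementals."""
    fulls = [b for b in backups if b.from_pos is None and b.to_pos <= restore_to]
    if not fulls:
        raise LookupError("no full backup at or before the requested position")
    full = max(fulls, key=lambda b: b.to_pos)  # most recent usable full backup
    path = [full]
    current = full.to_pos
    incrementals = [b for b in backups if b.from_pos is not None]
    while current < restore_to:
        # An incremental is applicable if it starts at or before the current
        # position and takes us strictly further.
        candidates = [b for b in incrementals if b.from_pos <= current < b.to_pos]
        if not candidates:
            raise LookupError("no incremental backup reaches the requested position")
        best = max(candidates, key=lambda b: b.to_pos)  # greedy: go furthest
        path.append(best)
        current = best.to_pos
    return path


# Hypothetical positions laid out like scenario 3.
BACKUPS = [
    Backup("1", to_pos=100),
    Backup("2", to_pos=200, from_pos=100),
    Backup("3", to_pos=300, from_pos=200),
    Backup("4", to_pos=400, from_pos=300),
    Backup("5", to_pos=210, from_pos=90),
    Backup("6", to_pos=410, from_pos=195),
]

path_names = [b.name for b in find_restore_path(BACKUPS, 350)]
print(" -> ".join(path_names))
```

With these made-up positions the search returns 1 -> 5 -> 6. Requesting a position beyond 410 raises LookupError before anything is applied, which matches scenario 4. Always picking the applicable incremental that reaches furthest is the standard greedy approach to interval covering, so for the chosen full backup it uses the fewest incrementals.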

Backups from multiple sources

Scenario (3) looks imaginary, until you consider backups may be taken from different tablets. These have different binary logs, rotated at different times -- but all share the same sequence of GTIDs. Since an incremental backup consists of full binary log copies, there could be overlaps between binary logs backed up from different tablets/MySQL servers.

Vitess should not care about the identity of the sources, about the binary log names (one server's binlog.0000289 may come before another server's binlog.0000101), or about the binary log count. It should only care about the GTID range an incremental backup covers: from (exclusive) and to (inclusive).

Restore time

It should be noted that an incremental restore based on binary logs means sequentially applying changes to a server. This may take minutes or hours, depending on how many binary log events we need to apply.

Testing

As usual, testing is to take place in:

  • Unit tests (e.g. validate recovery path logic)
  • endtoend (validate incremental backup, validate point in time restore)

Thoughts welcome. Please see #11097 for Work In Progress.
