RFC: VTGate and Tablet Picker: prefer local cells in source tablet selection for vreplication workflows and the VStream API #11999

@rohit-nayak-ps

Feature Description

We propose updating the tablet picker (code) logic to make it easier to choose tablets that are local to the cell or availability zone of the tablet picker's caller/user.

This would include an enhancement to the vtgate VStream API (WiP docs here) to enable or disable this new behavior, along with new vtctl client flags that do the same for client commands such as MoveTables.

Related issues

#11770
#11771
#11579

Use Cases

  1. For cost/performance reasons it is usually preferable to pick a tablet from the same cell as the target tablet (in the case of vreplication) or vtgate (for the VStream API). Today the picker just takes a list of cells which are deemed equal. If the picker knows what the local cell is, it can use that information to prefer local tablets and fall back to the specified cells if none exist.

Note: the user does not typically want to specify ONLY a local cell (the default behavior if no cells are provided) since then the picker might not find any tablet to stream from, causing starvation.

  2. There are users who have multiple cells in the same failure domain (e.g. Availability Zone), all of which share a common cell alias. If the alias is passed to the picker, it will expand it into all cells which are part of the alias, resulting in an issue similar to the one in use case 1. It would be useful if the picker could be told to fall back to the other cells in an alias if no local tablets are found.

  3. In addition, the VStream API today only takes a single tablet type for the selection of a source.

  4. We currently have an "in_order" hint that can be specified as a prefix of the string of tablet types passed into the picker. We can deprecate that in favor of explicitly defining the strategy.

Note: the specified list of cells can also include cell aliases.

Proposed Changes

The main issue is that the picker doesn't currently know the default cell of its caller (vtgate, vttablet, or vtctld). We propose updating the tablet picker API to allow that to be specified.

As part of that, specific strategy options will be explicitly defined for both tablet types and cells.

Tablet type preference (specified using the in_order: clause today) will be given precedence over the local cell. That is, if a tablet type preference is specified with a value of REPLICA, then we will always choose a REPLICA tablet whenever one is available, irrespective of the cell (the local cell is only the first choice). Without an explicit tablet type preference, the cell strategy takes precedence: we first choose cells based on their specified priority and look for any appropriate tablet type within each, falling back to lower-priority cells if none exist.
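
The precedence rule above can be sketched as follows. This is a minimal illustration, not the picker's actual code: the `Tablet` struct and the `pick` function are hypothetical, and the sketch returns the first match deterministically, whereas a real picker would typically choose among equally ranked candidates.

```go
package main

import "fmt"

// Tablet is a hypothetical, simplified stand-in for a topo tablet record.
type Tablet struct {
	Alias string
	Cell  string
	Type  string // e.g. "PRIMARY", "REPLICA", "RDONLY"
}

// pick applies the proposed precedence. cells is ordered from highest to
// lowest priority (local cell first). types lists the acceptable tablet
// types; typesInOrder says whether that list expresses a preference.
func pick(tablets []Tablet, cells, types []string, typesInOrder bool) (Tablet, bool) {
	// find returns the first tablet in cell whose type is one of wanted.
	find := func(cell string, wanted []string) (Tablet, bool) {
		for _, t := range tablets {
			if t.Cell != cell {
				continue
			}
			for _, w := range wanted {
				if t.Type == w {
					return t, true
				}
			}
		}
		return Tablet{}, false
	}

	if typesInOrder {
		// Tablet type wins: search every cell for the most preferred type
		// before considering the next type.
		for _, typ := range types {
			for _, cell := range cells {
				if t, ok := find(cell, []string{typ}); ok {
					return t, true
				}
			}
		}
		return Tablet{}, false
	}

	// Cell wins: accept any listed type in the highest-priority cell.
	for _, cell := range cells {
		if t, ok := find(cell, types); ok {
			return t, true
		}
	}
	return Tablet{}, false
}

func main() {
	tablets := []Tablet{
		{Alias: "zone1-100", Cell: "zone1", Type: "RDONLY"},
		{Alias: "zone2-200", Cell: "zone2", Type: "REPLICA"},
	}
	cells := []string{"zone1", "zone2"} // zone1 plays the role of the local cell
	types := []string{"REPLICA", "RDONLY"}

	t, _ := pick(tablets, cells, types, true)
	fmt.Println("type preference:", t.Alias)
	t, _ = pick(tablets, cells, types, false)
	fmt.Println("cell preference:", t.Alias)
}
```

With a type preference the REPLICA in the remote cell is chosen even though the local cell has an RDONLY tablet; without one, the local RDONLY tablet is chosen.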

Cell selection options:

  • PreferLocal: prefer local cell, then specified cells
  • PreferLocalWithAlias: prefer local cell, then cell alias of local cell, then specified cells
  • OnlySpecified: use only the specified cells, no fallback
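
As a rough sketch of how these three options could translate into a cell search order: the `CellStrategy` type and the `cellSearchOrder` function below are hypothetical names, and `aliasCells` is assumed to be the already-resolved list of cells sharing the local cell's alias. The sketch flattens the result into a single priority list and drops duplicates.

```go
package main

import "fmt"

// CellStrategy is a hypothetical enum mirroring the options listed above.
type CellStrategy int

const (
	PreferLocal CellStrategy = iota
	PreferLocalWithAlias
	OnlySpecified
)

// cellSearchOrder returns the cells to try, highest priority first, with
// duplicates removed. aliasCells is assumed to hold the cells that share the
// local cell's alias; real code would resolve this from the topo server.
func cellSearchOrder(s CellStrategy, localCell string, aliasCells, specified []string) []string {
	var order []string
	seen := map[string]bool{}
	add := func(cells ...string) {
		for _, c := range cells {
			if c != "" && !seen[c] {
				seen[c] = true
				order = append(order, c)
			}
		}
	}

	switch s {
	case PreferLocal:
		add(localCell)
	case PreferLocalWithAlias:
		add(localCell)
		add(aliasCells...)
	case OnlySpecified:
		// No local or alias cells are added ahead of the specified ones.
	}
	add(specified...)
	return order
}

func main() {
	alias := []string{"zone1a", "zone1b"}
	specified := []string{"zone2", "zone1b"}

	fmt.Println(cellSearchOrder(PreferLocal, "zone1a", alias, specified))
	fmt.Println(cellSearchOrder(PreferLocalWithAlias, "zone1a", alias, specified))
	fmt.Println(cellSearchOrder(OnlySpecified, "zone1a", alias, specified))
}
```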

One question here is: can we combine the first two? If the local cell has an alias and we don't find tablets in it, we fall back on the alias. This seems to make sense, since specifying an alias itself means that the user has organized the cells in a particular fashion.

Tablet selection options:

  • Any: all provided types are equal
  • InOrder: provided tablet types are expected to be in order of priority
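
One way to bridge the existing `in_order:` prefix to an explicit strategy during a deprecation period is sketched below. The `TabletTypeStrategy` type and the `parseTabletTypes` function are hypothetical names used only for illustration, and the comma-separated input format is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// TabletTypeStrategy is a hypothetical enum mirroring the options above.
type TabletTypeStrategy int

const (
	Any TabletTypeStrategy = iota
	InOrder
)

// inOrderHint is the prefix used today to request ordered tablet types.
const inOrderHint = "in_order:"

// parseTabletTypes maps a legacy tablet type string onto an explicit
// strategy plus a list of types. Input is assumed to be comma separated.
func parseTabletTypes(s string) (TabletTypeStrategy, []string) {
	strategy := Any
	if strings.HasPrefix(s, inOrderHint) {
		strategy = InOrder
		s = strings.TrimPrefix(s, inOrderHint)
	}
	var types []string
	for _, t := range strings.Split(s, ",") {
		if t = strings.TrimSpace(t); t != "" {
			types = append(types, t)
		}
	}
	return strategy, types
}

func main() {
	strategy, types := parseTabletTypes("in_order:REPLICA,PRIMARY")
	fmt.Println(strategy == InOrder, types)

	strategy, types = parseTabletTypes("RDONLY, REPLICA")
	fmt.Println(strategy == Any, types)
}
```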

No new flags will be specified at the vttablet or vtgate level.

Code Changes

The following parts of the code will need to be modified for the refactor:

  • VStream API
  • VReplication
  • Wrangler
  • VDiff1
  • VDiff2

The following user-facing changes will happen:

  • Additional tablet picker strategy parameter to VReplication workflows and VDiff1/VDiff2
  • VStream API will allow a new flag to be defined to modify picker strategy

Breaking Changes

  • The default strategy might give a different result than what we get today. However, overall we think the new approach will result in a better default choice.
  • If VStream API users want to use any of these features then the client (including the Debezium adapter) will need to target the enhanced API. However, if we choose the PreferLocalWithAlias strategy as the default then, at least for the current use case from #11771 (tablet picker cell alias fallback with local cell preference), no changes should be needed.
