
Bug Report: Throttler Config Via Topo Is Broken #12523

@mattlord


Overview of the Issue

In #11604 we added support for dynamic tablet throttler configuration.

The implementation we landed on has a general problem: we stored the configuration directly in the cell-local SrvKeyspace records.

The Keyspace record is the persistent and (mostly) static keyspace configuration stored in the global topo. The "Keyspace [Serving] Graph", by contrast, is stored in the SrvKeyspace records across the cells; it is dynamic and ephemeral, regularly built and rebuilt from the global topo info and the live state. We build and rebuild it internally, per cell and across all cells, when certain operations happen, and a user can rebuild it as well using the RebuildKeyspaceGraph client command. Currently in release-16.0 and main, any time a SrvKeyspace record in a cell is built or rebuilt, it WILL NOT have any previously configured ThrottlerConfig value. As a result, which tablets have which config is undefined and chaotic (not to mention the Vitess Operator’s SrvKeyspace pruning behavior).

We can instead store the configuration in the global topo Keyspace record and propagate it to the SrvKeyspace records on build/rebuild, leaving the rest of the work largely unchanged (see draft PR for an example).
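To illustrate the proposed direction, here is a minimal sketch of a rebuild that derives each cell's SrvKeyspace from the global Keyspace record, so the throttler config survives any number of rebuilds. The types and the `rebuildSrvKeyspaces` function are simplified, hypothetical stand-ins and not the actual Vitess topo API or the code in the draft PR.

```go
package main

// ThrottlerConfig, Keyspace, and SrvKeyspace are simplified, hypothetical
// stand-ins for the real topo records; only the fields needed here are modeled.
type ThrottlerConfig struct {
	Enabled bool
}

// Keyspace models the persistent record kept in the global topo.
type Keyspace struct {
	Name            string
	ThrottlerConfig *ThrottlerConfig
}

// SrvKeyspace models the ephemeral, per-cell record that is rebuilt often.
type SrvKeyspace struct {
	ThrottlerConfig *ThrottlerConfig
}

// rebuildSrvKeyspaces builds a fresh SrvKeyspace for every cell and copies the
// throttler config from the global Keyspace record into each one. Because the
// source of truth is the Keyspace record, a rebuild never loses the config.
func rebuildSrvKeyspaces(ks *Keyspace, cells []string) map[string]*SrvKeyspace {
	out := make(map[string]*SrvKeyspace, len(cells))
	for _, cell := range cells {
		srv := &SrvKeyspace{}
		if ks.ThrottlerConfig != nil {
			cfg := *ks.ThrottlerConfig // copy so cells do not share one pointer
			srv.ThrottlerConfig = &cfg
		}
		out[cell] = srv
	}
	return out
}
```

In this sketch, updating the throttler config means writing to the Keyspace record and then triggering a rebuild; nothing is ever written directly to a SrvKeyspace record.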

There were also some more minor issues:

  1. The IsEnabled property is no longer a static property of the throttler set via tablet flags; it is now dynamic. However, there was no way to observe whether the throttler was currently active/enabled on a tablet.
  2. We did not wait for the throttler to become enabled on all relevant tablets in the endtoend tests before proceeding (which was difficult to do because of the first issue).
  3. We did not return the cells we actually updated when we had partial results for updating the topo.

Reproduction Steps

An easy way to demonstrate the general problem:

git checkout main
make build
cd examples/local

./101_initial_cluster.sh

vtctlclient TopoCat -- --cell=zone1 --decode_proto_json keyspaces/commerce/SrvKeyspace | jq '.[0].throttlerConfig'

vtctldclient UpdateThrottlerConfig --enable commerce

vtctlclient TopoCat -- --cell=zone1 --decode_proto_json keyspaces/commerce/SrvKeyspace | jq '.[0].throttlerConfig'

vtctldclient RebuildKeyspaceGraph commerce

vtctlclient TopoCat -- --cell=zone1 --decode_proto_json keyspaces/commerce/SrvKeyspace | jq '.[0].throttlerConfig'

You will see that the throttler config is wiped out after the rebuild:

$ vtctlclient TopoCat -- --cell=zone1 --decode_proto_json keyspaces/commerce/SrvKeyspace | jq '.[0].throttlerConfig'
null

$ vtctldclient UpdateThrottlerConfig --enable commerce

$ vtctlclient TopoCat -- --cell=zone1 --decode_proto_json keyspaces/commerce/SrvKeyspace | jq '.[0].throttlerConfig'
{
  "enabled": true
}

$ vtctldclient RebuildKeyspaceGraph commerce

$ vtctlclient TopoCat -- --cell=zone1 --decode_proto_json keyspaces/commerce/SrvKeyspace | jq '.[0].throttlerConfig'
null

Binary Version

v16.0.0 and v17.0.0-SNAPSHOT

Operating System and Environment details

N/A

Log Fragments

N/A
