-
Notifications
You must be signed in to change notification settings - Fork 237
feat: Enable sharding of ResourceGroup objects #3430
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
c8c56c3
to
cd0777c
Compare
aac6022
to
e5ad2bb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking MUCH cleaner and more readable now, thanks!
Just a few more suggestions and it needs some unit tests for all that sharding logic in Apply and ApplyWithPrune.
// shardedLabel genereates the formatted label name required for sharded ResourceGroups. | ||
// The label value should be formatted as `<resourcegroup_name>/id`. | ||
func (irg *InventoryResourceGroup) shardedLabel() string { | ||
return fmt.Sprintf("%s/id", irg.Name()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the trailing /id
for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The label format for sharded ResourceGroups is of the format <resourcegroup_name>/id
with an empty string value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this new label be namespaced?
kpt uses config.kubernetes.io/
: https://kpt.dev/reference/annotations/
cli-utils uses cli-utils.sigs.k8s.io/
/cc @nan-yu @janetkuo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, unprefixed keys are reserved for end-users; all labels/annotations need to be prefixed: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#label-selector-and-annotation-conventions
The proposed prefix is resourcegroup_name
. If we choose a different prefix, the resource group could be the value of the label.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This label format was selected to ensure we don't face character limits. resourcegroup_name
is suggested as a prefix since the prefix of a label key can contain up to 253 characters, while the name segment of a key can only contain up to 63 characters. Similarly, the value of the label can also only contain up to 63 characters. Using the resourcegroup_name within the name segment, or as a label value could result in the resourcegroup_name
being truncated and potentially resulting in collisions.
https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#syntax-and-character-set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest using a conventional pattern for truncation. If the name is longer than N characters, hash the full name with sha1, and add the hash as the suffix. So is 1234567890
is the name and the max is 9, then the name becomes 1-<hash>
, where the hash is the first 7 characters of the full hash (git --short default). This doesn't guarantee uniqueness. You still have to check for collisions with other names in the namespace, but it does make it much less likely there are collisions. and when there are, you log an error and set an error status condition that says there's a collision.
pkg/live/inventoryrg_test.go
Outdated
} | ||
|
||
clusterRGList, err := namespacedClient.List(context.TODO(), v1.ListOptions{ | ||
LabelSelector: irg.shardedLabel(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should use Get for NameStrategy and List for LabelStrategy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kpt only supports NameStrategy, and has always had strategy hard-coded as NameStrategy. LabelStrategy is only used in the destroy command since we want the destroyer to delete all sharded ResourceGroups by its common label.
This commit enables sharding of ResourceGroup objects to scale the number of GKNN resources tracked within an inventory. Prior to sharding, the number of objects tracked is limited to the size of the ResourceGroup object and is bottlenecked by etcd size limits. This modification allows us to distribute GKNNs to multiple 'paginated' ResourceGroups.
/hold Per offline discussion, the cli-utils ClusterClient needs refactoring to push more of the inventory lifecycle management into the provider/storage layer. |
This commit enables sharding of ResourceGroup objects to scale the
number of GKNN resources tracked within an inventory. Prior to sharding,
the number of objects tracked is limited to the size of the
ResourceGroup object and is bottlenecked by etcd size limits. This
modification allows us to distribute GKNNs to multiple 'paginated'
ResourceGroups.