-
Notifications
You must be signed in to change notification settings - Fork 794
Optimize ENI slot reservation for non-supported instance type #3250
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize ENI slot reservation for non-supported instance type #3250
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR prevents ENI slot reservation on instance types that don’t support ENI trunking by introducing a check in the IPAMD logic and wiring it through AWS utils and tests.
- Add lists of non-supported ENI trunking instance types/families and implement
IsENITrunkingSupported - Update
hasRoomForEnito account for trunking support - Extend mocks and unit tests in both
ipamdandawsutilspackages to cover the new trunking flag
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/ipamd/ipamd.go | Added IsENITrunkingSupported check in hasRoomForEni |
| pkg/ipamd/ipamd_test.go | Expanded test cases (testIncreaseIPPool) to include trunking |
| pkg/awsutils/no_eni_trunking_instance_types.go | Defined noENITrunkingInstanceTypes and noENITrunkingInstanceFamilies |
| pkg/awsutils/awsutils.go | Implemented IsENITrunkingSupported using new slices |
| pkg/awsutils/mocks/awsutils_mocks.go | Added mock recorder and implementation for IsENITrunkingSupported |
| pkg/awsutils/awsutils_test.go | Added unit tests for IsENITrunkingSupported |
Comments suppressed due to low confidence (3)
pkg/ipamd/ipamd_test.go:514
- The parameter name 'unschedulabeNode' is misspelled and may confuse readers. Consider renaming it to 'unschedulableNode'.
func testIncreaseIPPool(t *testing.T, useENIConfig bool, unschedulabeNode bool, subnetDiscovery bool, eniTrunking bool) {
pkg/ipamd/ipamd.go:2236
- Consider adding unit tests for
hasRoomForEnito verify its behavior when ENI trunking is supported versus not supported, ensuring thetrunkEnioffset is applied correctly.
if c.awsClient.IsENITrunkingSupported() && c.enablePodENI && c.dataStore.GetTrunkENI() == "" {
pkg/ipamd/ipamd_test.go:668
- [nitpick] This test sets
MY_NODE_NAMEbut doesn’t unset it; environment state may leak between tests. Consider usingdefer os.Unsetenv("MY_NODE_NAME")or explicitly callingos.Unsetenvin teardown.
func TestIncreasePrefixPoolDefault(t *testing.T) {
92b3109 to
d8372ce
Compare
|
Hi @jayanthvn @yash97, do you think this PR reasonable to go ahead ? |
|
@phuhung273 could you rebase your changes. |
d8372ce to
19f9fd3
Compare
|
Thanks for taking a look @jaydeokar, I just rebased the branch |
|
Thanks for making this change. Since the instance types and families which currently don't support trunking is fixed, lets add this to the scripts/gen_vpc_ip_limits.go and add a property IsTrunkingEnabled. The default value should be true, but only set false for the instance types and families which you've added in the new file |
|
For testing -
|
|
instead of maintaining list of non supported instances, can you explore using this file https://github.com/aws/amazon-vpc-resource-controller-k8s/blob/master/pkg/aws/vpc/limits.go instead. You can check this field |
pkg/awsutils/awsutils.go
Outdated
| } | ||
|
|
||
| // IsENITrunkingSupported return true if the instance type is not in non-supported list | ||
| func (cache *EC2InstanceMetadataCache) IsENITrunkingSupported() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use data from this file here https://github.com/aws/amazon-vpc-resource-controller-k8s/blob/master/pkg/aws/vpc/limits.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't even know this super helpful file exist. Thank you so much. I've updated the function to use it.
19f9fd3 to
089fc08
Compare
|
Thanks @jaydeokar for the test instruction. Let me try it. |
|
Hi team, I've included e2e manual test in PR description. Can also find a similar (require creating nodegroup) SNAT integration test amazon-vpc-cni-k8s/test/integration/snat/snat_suite_test.go Lines 94 to 106 in ee97808
Let me know if you want to add integration test @jaydeokar @yash97, I will just learn from the code base. Thanks team. |
|
Yes please could you add a small test suite ? You can create managed nodegroups as well, doesn't have to be self managed in this case. Just need to pass the right instance type. The change overall looks good to me.. Thanks for working on this |
089fc08 to
f0fe88e
Compare
|
I've added a new integration suite using Managed NodeGroup as discussed. Test output is in PR description. Let me know what you think @jaydeokar. Thank you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
f0fe88e to
dadd07a
Compare
jaydeokar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
|
@phuhung273, thanks for making the change. We'll be tracking this with the next CNI release |
|
Thanks so much team for your guidance @jaydeokar @yash97 @jayanthvn. |
What type of PR is this?
improvement
Which issue does this PR fix?:
Close #3094
What does this PR do / Why do we need it?:
CreateAndWaitTillManagedNGReadyTesting done on this change:
make unit-testIntegration test
Manual test
Prerequisites:
SGP enabled

Instance types for testing:
Reproducing
On latest CNI, no-trunking cannot scale to maxPods

Test new code
IMAGE=phuhung273/amazon-k8s-cni make docker: phuhung273/amazon-k8s-cni:089fc089ipamd.log
Will this PR introduce any new dependencies?:
No
Will this break upgrades or downgrades? Has updating a running cluster been tested?:
Can the team show me how to test this case ?
Does this change require updates to the CNI daemonset config files to work?:
No
Does this PR introduce any user-facing change?:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.