Skip to content

Conversation

@zhiying-lin
Copy link
Contributor

What type of PR is this?
/kind test

What this PR does / why we need it:

Adding a webhook can blocking cx to create a duplicate trafficManagerBackend, which involves more work considering there's no validation webhook today.

For now, we keep the implementation as it's today, and let the controller retry.
There is another rare case that customer will manually add the endpoint to the azure profile. In this case, the trafficManagerBackend will be unaccepted too.

Which issue(s) this PR fixes:

Fixes #

Requirements:

How has this code been tested

added e2e tests, https://github.com/Azure/fleet-networking/actions/runs/15068110856

Special notes for your reviewer

@kaito-pr-agent
Copy link

kaito-pr-agent bot commented May 19, 2025

Title

(Describe updated until commit 1dc41ad)

Add test for duplicate trafficManagerBackend and update documentation


Description

  • Added test for duplicate trafficManagerBackend creation

  • Updated documentation for troubleshooting duplicate public IPs

  • Improved README links for taints and tolerations


Changes walkthrough 📝

Relevant files
Tests
traffic_manager_test.go
Add test for duplicate trafficManagerBackend                         

test/e2e/traffic_manager_test.go

  • Added a new test case for creating another trafficManagerBackend with
    the same service
  • Modified deletion logic to ignore not found errors
  • +51/-1   
    Documentation
    README.md
    Update taints and tolerations link                                             

    docs/demos/TrafficManagerProfile/session-migrate-to-another-cluster/README.md

    • Updated link to taints and tolerations documentation
    +1/-1     
    README.md
    Update taints and tolerations link                                             

    docs/demos/TrafficManagerProfile/session-migrate-to-another-region/README.md

    • Updated link to taints and tolerations documentation
    +1/-1     
    DNSBasedGlobalLoadBalancing.md
    Add explanation for duplicate public IPs                                 

    docs/toubleshooting/DNSBasedGlobalLoadBalancing.md

  • Added explanation for duplicate public IP addresses in Azure Traffic
    Manager profile
  • Included sample status output for reference
  • +21/-1   

    Need help?
  • Type /help how to ... in the comments thread for any questions about PR-Agent usage.
  • Check out the documentation for more information.
  • @kaito-pr-agent
    Copy link

    kaito-pr-agent bot commented May 19, 2025

    PR Reviewer Guide 🔍

    (Review updated until commit 1dc41ad)

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 PR contains tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Possible Issue

    The test creates an invalidBackend but does not verify if it is indeed invalid before proceeding with the deletion and acceptance checks. Ensure that the backend is truly invalid before performing these operations.

    	By("Creating an invalid trafficManagerBackend")
    	invalidBackend := wm.TrafficManagerBackend()
    	invalidBackend.Name = fmt.Sprintf("%s-%s", backend.Name, "invalid")
    	invalidBackendName := types.NamespacedName{Namespace: invalidBackend.Namespace, Name: invalidBackend.Name}
    	Expect(hubClient.Create(ctx, &invalidBackend)).Should(Succeed(), "Failed to create the trafficManagerBackend")
    
    	By("Validating the invalidBackend trafficManagerBackend status")
    	status := validator.ValidateTrafficManagerBackendIfAcceptedAndIgnoringEndpointName(ctx, hubClient, invalidBackendName, false, nil, lightAzureOperationTimeout)
    	validator.ValidateTrafficManagerBackendStatusAndIgnoringEndpointNameConsistently(ctx, hubClient, invalidBackendName, status)
    
    	By("Validating the Azure traffic manager profile")
    	// No changes should be made to the profile.
    	atmValidator.ValidateProfile(ctx, atmProfileName, atmProfile)
    
    	By("Deleting existing trafficManagerBackend")
    	Expect(hubClient.Delete(ctx, &backend)).Should(Succeed(), "Failed to delete the trafficManagerBackend")
    	validator.IsTrafficManagerBackendDeleted(ctx, hubClient, backendName, lightAzureOperationTimeout)
    
    	By("Validating the invalid trafficManagerBackend status and should be accepted now")
    	wantEndpoints := []fleetnetv1beta1.TrafficManagerEndpointStatus{
    		{
    			Weight: ptr.To(int64(50)),
    			Target: ptr.To(fmt.Sprintf(azureDNSFormat, memberDNSLabels[0], clusterLocation)),
    			From: &fleetnetv1beta1.FromCluster{
    				ClusterStatus: fleetnetv1beta1.ClusterStatus{Cluster: memberClusters[0].Name()},
    				Weight:        ptr.To(int64(1)),
    			},
    		},
    		{
    			Weight: ptr.To(int64(50)),
    			Target: ptr.To(fmt.Sprintf(azureDNSFormat, memberDNSLabels[1], clusterLocation)),
    			From: &fleetnetv1beta1.FromCluster{
    				ClusterStatus: fleetnetv1beta1.ClusterStatus{Cluster: memberClusters[1].Name()},
    				Weight:        ptr.To(int64(1)),
    			},
    		},
    	}
    	status = validator.ValidateTrafficManagerBackendIfAcceptedAndIgnoringEndpointName(ctx, hubClient, invalidBackendName, true, wantEndpoints, heavyAzureOperationTimeout)
    	validator.ValidateTrafficManagerBackendStatusAndIgnoringEndpointNameConsistently(ctx, hubClient, invalidBackendName, status)
    
    	By("Validating the Azure traffic manager profile")
    	atmProfile = buildDesiredATMProfile(profile, status.Endpoints)
    	atmProfile = *atmValidator.ValidateProfile(ctx, atmProfileName, atmProfile)
    
    	By("Deleting invalid trafficManagerBackend")
    	Expect(hubClient.Delete(ctx, &invalidBackend)).Should(Succeed(), "Failed to delete the trafficManagerBackend")
    	validator.IsTrafficManagerBackendDeleted(ctx, hubClient, invalidBackendName, lightAzureOperationTimeout)
    })

    @kaito-pr-agent
    Copy link

    kaito-pr-agent bot commented May 19, 2025

    PR Code Suggestions ✨

    Latest suggestions up to 1dc41ad

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Impact
    General
    Ignore not found errors on deletion

    Handle potential errors during deletion gracefully.

    test/e2e/traffic_manager_test.go [421]

    -Expect(hubClient.Delete(ctx, &backend)).Should(Succeed(), "Failed to delete the trafficManagerBackend")
    +Expect(client.IgnoreNotFound(hubClient.Delete(ctx, &backend))).Should(Succeed(), "Failed to delete the trafficManagerBackend")
    Suggestion importance[1-10]: 8

    __

    Why: Handling not found errors gracefully improves robustness and prevents unnecessary test failures when the resource might have already been deleted.

    Medium
    Ensure creation succeeds

    Validate creation success and handle failures.

    test/e2e/traffic_manager_test.go [450]

     Expect(hubClient.Create(ctx, &invalidBackend)).Should(Succeed(), "Failed to create the trafficManagerBackend")
    +// Add validation or logging here if necessary
    Suggestion importance[1-10]: 3

    __

    Why: The suggestion is redundant as the existing code already checks for successful creation. Adding validation or logging would be beneficial but is not provided in the suggestion.

    Low

    Previous suggestions

    Suggestions up to commit e73e2e6
    CategorySuggestion                                                                                                                                    Impact
    General
    Improve error handling during deletion

    Handle potential errors when deleting the trafficManagerBackend.

    test/e2e/traffic_manager_test.go [421]

    -Expect(hubClient.Delete(ctx, &backend)).Should(Succeed(), "Failed to delete the trafficManagerBackend")
    +Expect(client.IgnoreNotFound(hubClient.Delete(ctx, &backend))).Should(Succeed(), "Failed to delete the trafficManagerBackend")
    Suggestion importance[1-10]: 8

    __

    Why: The suggestion improves error handling by ignoring not found errors during deletion, which is a critical enhancement for robustness.

    Medium
    Suggestions up to commit e73e2e6
    CategorySuggestion                                                                                                                                    Impact
    General
    Handle non-existent resources gracefully

    Use client.IgnoreNotFound to handle cases where the resource might not exist.

    test/e2e/traffic_manager_test.go [421]

    -Expect(hubClient.Delete(ctx, &backend)).Should(Succeed(), "Failed to delete the trafficManagerBackend")
    +Expect(client.IgnoreNotFound(hubClient.Delete(ctx, &backend))).Should(Succeed(), "Failed to delete the trafficManagerBackend")
    Suggestion importance[1-10]: 8

    __

    Why: Using client.IgnoreNotFound improves error handling by gracefully managing cases where the resource might not exist, which is a critical improvement for robustness.

    Medium
    Add validation for backend creation

    Validate the creation of invalidBackend to ensure it meets the expected criteria.

    test/e2e/traffic_manager_test.go [450]

     Expect(hubClient.Create(ctx, &invalidBackend)).Should(Succeed(), "Failed to create the trafficManagerBackend")
    +// Add validation logic here to ensure invalidBackend is correctly set up
    Suggestion importance[1-10]: 5

    __

    Why: While the suggestion is valid, it only suggests adding validation without specifying what kind of validation should be done, which offers limited improvement.

    Low
    Verify endpoint configuration

    Ensure wantEndpoints accurately reflects the expected state.

    test/e2e/traffic_manager_test.go [465-482]

     wantEndpoints := []fleetnetv1beta1.TrafficManagerEndpointStatus{
       {
         Weight: ptr.To(int64(50)),
         Target: ptr.To(fmt.Sprintf(azureDNSFormat, memberDNSLabels[0], clusterLocation)),
         From: &fleetnetv1beta1.FromCluster{
           ClusterStatus: fleetnetv1beta1.ClusterStatus{Cluster: memberClusters[0].Name()},
           Weight:        ptr.To(int64(1)),
         },
       },
       {
         Weight: ptr.To(int64(50)),
         Target: ptr.To(fmt.Sprintf(azureDNSFormat, memberDNSLabels[1], clusterLocation)),
         From: &fleetnetv1beta1.FromCluster{
           ClusterStatus: fleetnetv1beta1.ClusterStatus{Cluster: memberClusters[1].Name()},
           Weight:        ptr.To(int64(1)),
         },
       },
     }
    +// Verify that memberDNSLabels and memberClusters are correctly populated
    Suggestion importance[1-10]: 5

    __

    Why: The suggestion is relevant but vague, as it only suggests verifying the configuration without providing specific checks or assertions, which offers limited improvement.

    Low

    @codecov
    Copy link

    codecov bot commented May 19, 2025

    Codecov Report

    All modified and coverable lines are covered by tests ✅

    Project coverage is 80.29%. Comparing base (0076b8d) to head (1dc41ad).
    Report is 1 commits behind head on main.

    Additional details and impacted files
    @@            Coverage Diff             @@
    ##             main     #300      +/-   ##
    ==========================================
    - Coverage   80.54%   80.29%   -0.25%     
    ==========================================
      Files          29       29              
      Lines        4101     4101              
    ==========================================
    - Hits         3303     3293      -10     
    - Misses        633      643      +10     
      Partials      165      165              

    ☔ View full report in Codecov by Sentry.
    📢 Have feedback on the report? Share it here.

    🚀 New features to boost your workflow:
    • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

    @ryanzhang-oss ryanzhang-oss requested a review from Copilot May 20, 2025 21:41
    Copy link

    Copilot AI left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Pull Request Overview

    Adds an end-to-end test to handle cases where a service’s public IP already exists in an Azure Traffic Manager profile, updates cleanup logic to ignore not-found errors, and refreshes related documentation links and troubleshooting guidance.

    • Ignore not-found errors when cleaning up trafficManagerBackends
    • Introduce a new E2E scenario for duplicate-creation of trafficManagerBackend
    • Document the “public IP already exists” error and correct demo links

    Reviewed Changes

    Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

    File Description
    test/e2e/traffic_manager_test.go Wrap delete in IgnoreNotFound and add a duplicate-backend test
    docs/toubleshooting/DNSBasedGlobalLoadBalancing.md Add troubleshooting entry for existing public IP and sample output
    docs/demos/TrafficManagerProfile/session-migrate-to-another-region/README.md Fix taints-tolerations URL
    docs/demos/TrafficManagerProfile/session-migrate-to-another-cluster/README.md Fix taints-tolerations URL
    Comments suppressed due to low confidence (2)

    docs/toubleshooting/DNSBasedGlobalLoadBalancing.md:138

    • Remove the extra space between the backslash and the escaped quote so it reads "error" instead of \ "error" for valid YAML/JSON escaping.
    \ "error": {
    

    docs/troubleshooting/DNSBasedGlobalLoadBalancing.md:129

    • Indent this sub-bullet two spaces further under the numbered list item to ensure proper Markdown nesting and rendering.
    -   Please use the existing trafficManagerBackend to manage your endpoints exported by the service.
    

    @ryanzhang-oss ryanzhang-oss requested a review from Copilot May 20, 2025 22:54
    Copy link

    Copilot AI left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Pull Request Overview

    Adds an E2E test for handling an existing public IP in an Azure Traffic Manager profile and updates related documentation.

    • Enhanced test teardown to ignore not-found errors and introduced a new test case for duplicate TrafficManagerBackend.
    • Added a troubleshooting entry for when a public IP already exists in the Traffic Manager profile.
    • Updated demo README links pointing to the taints-tolerations documentation on the website.

    Reviewed Changes

    Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

    File Description
    test/e2e/traffic_manager_test.go Ignore not-found errors in cleanup and add a test for creating a second backend against the same service.
    docs/toubleshooting/DNSBasedGlobalLoadBalancing.md Add a new step explaining the “public IP already exists” error and its workaround.
    docs/demos/TrafficManagerProfile/session-migrate-to-another-region/README.md Update taints-tolerations link to the kubefleet-dev website.
    docs/demos/TrafficManagerProfile/session-migrate-to-another-cluster/README.md Update taints-tolerations link to the kubefleet-dev website.
    Comments suppressed due to low confidence (2)

    docs/demos/TrafficManagerProfile/session-migrate-to-another-region/README.md:166

    • Verify that the updated URL to the taints-tolerations doc is correct and matches the live site structure to avoid broken links.
    - Remove the env label from the `aks-member-1` & `aks-member-3` or add [a taint on these two members](https://kubefleet-dev.github.io/website/docs/how-tos/taints-tolerations/) so that the cluster won't be picked by the `clusterResourcePlacement` again.
    

    docs/demos/TrafficManagerProfile/session-migrate-to-another-cluster/README.md:155

    • Ensure this link to the taints-tolerations guide is valid and leads to the intended documentation on the live kubefleet-dev site.
    - Remove the env label from the `aks-member-1` or add [a taint on the member](https://kubefleet-dev.github.io/website/docs/how-tos/taints-tolerations/) so that the cluster won't be picked by the `clusterResourcePlacement` again.
    

    @zhiying-lin zhiying-lin merged commit 6607965 into Azure:main May 22, 2025
    10 of 11 checks passed
    @zhiying-lin zhiying-lin deleted the investigate-400 branch May 22, 2025 01:46
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

    Projects

    None yet

    Development

    Successfully merging this pull request may close these issues.

    2 participants