Description
Describe the bug
When multiple PlacementDecisions for the same Placement are updated sequentially, clusters can temporarily disappear from all PlacementDecisions during the update process.
This causes the addon-management-controller to delete and recreate ManagedClusterAddOns for those clusters, resulting in potential service disruption.
To Reproduce
- Create a ClusterManagementAddOn with installStrategy.type: Placements
- Create a Placement that selects >100 clusters (e.g., 150 clusters), resulting in multiple PlacementDecisions
- Trigger a change that causes clusters to move between PlacementDecisions (e.g., update placement scores or labels)
- Observe sequential PlacementDecision updates where:
  initial status:
    decision-1: c1, c2, ..., c100
    decision-2: c101, c102, ...
  update:
    decision-1: c0, c1, c2, ..., c99
    decision-2: c100, c101, c102, ...
- During the time window between these two updates, c100 exists in neither PlacementDecision
- The addon-management-controller reconciles during this window and deletes the ManagedClusterAddOn for c100.
- When decision-2 is updated to include c100, the ManagedClusterAddOn is re-created.
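The window described in the steps above can be illustrated with a small simulation. This is only a sketch of the set arithmetic, not the controller's actual code: the `union` helper and the dictionary layout are hypothetical, and the 150-cluster split (c1..c100, c101..c150) is assumed from the example in the reproduction steps.

```python
def union(decisions):
    """Return the set of clusters named across all PlacementDecisions."""
    clusters = set()
    for names in decisions.values():
        clusters.update(names)
    return clusters


# Assumed initial state: decision-1 holds c1..c100, decision-2 holds c101..c150.
decisions = {
    "decision-1": [f"c{i}" for i in range(1, 101)],
    "decision-2": [f"c{i}" for i in range(101, 151)],
}
before = union(decisions)

# First update: only decision-1 is rewritten (now c0..c99).
decisions["decision-1"] = [f"c{i}" for i in range(0, 100)]
during = union(decisions)

# Second update: decision-2 is rewritten (now c100..c150).
decisions["decision-2"] = [f"c{i}" for i in range(100, 151)]
after = union(decisions)

# Clusters that were selected before but are visible in no decision mid-update.
missing_during = sorted(before - during)
# Clusters that were selected before but are gone once both updates land.
missing_after = sorted(before - after)

print(missing_during)  # ['c100']
print(missing_after)   # []
```

A consumer that computes the selected set from a snapshot taken between the two updates sees c100 as removed, even though the final state still selects it.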
Expected behavior
The ManagedClusterAddOn for c100 should remain unchanged throughout the PlacementDecision updates. The addon should not be deleted and recreated.
Environment ie: OCM version, Kubernetes version and provider: