Conversation

@XinShuYang
Contributor

No description provided.

@XinShuYang XinShuYang marked this pull request as ready for review December 22, 2025 02:20
Comment on lines +32 to +35
--aws-region us-west-2
sudo ./ci/test-conformance-gke.sh --gc-cluster --gc-cluster-age-hours 3 \
--gke-zone us-west1-a
Contributor

Will the region and zone always have these values? I saw that both can be overridden in the jobs.

Contributor

Agreed with Lan.

Would it be possible to iterate over a list of regions and clean up each one?

Contributor Author

Thanks for the comments. It should be possible to implement this by enhancing the Jenkins builder. I will investigate and verify it.
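For illustration, the per-region loop suggested above could look roughly like the sketch below. This is not code from the PR: `gc_region` and `gc_all_regions` are hypothetical names, and `gc_region` is only a placeholder for whatever per-region cleanup the scripts already perform.

```shell
# Placeholder for the real per-region cleanup (hypothetical name).
# It only reports the region it was asked to clean.
gc_region() {
    echo "Cleaning up region: $1"
}

# Run the cleanup once per region passed as an argument and keep going
# when one region fails, so a single bad region does not block the rest.
gc_all_regions() {
    local region failed_regions=0
    for region in "$@"; do
        if ! gc_region "$region"; then
            echo "WARNING: cleanup failed in region $region"
            failed_regions=$((failed_regions + 1))
        fi
    done
    echo "Regions processed: $#, failed: $failed_regions"
    # Non-zero exit status if any region failed.
    [ "$failed_regions" -eq 0 ]
}
```

A job could then call `gc_all_regions us-west-2 us-east-1`, with the region list supplied by the job configuration rather than hard-coded in the script.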


if [ -z "$creation_epoch" ]; then
echo "WARNING: Could not parse creation time for cluster $cluster"
continue
Contributor

Maybe introduce a failed_count for such scenarios.

Contributor Author

Yes, I added failed_count to enhance the result.
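As a rough sketch of the idea discussed here (not the code that was actually added to the PR), a failure counter can be incremented next to the existing warning and reported at the end of the run. The `parse_epoch` and `summarize_clusters` helpers and the `name=timestamp` argument format are assumptions made for this example, and `parse_epoch` assumes GNU `date`.

```shell
# Convert a timestamp to epoch seconds; print nothing if parsing fails.
# Assumes GNU date ("date -d"). Hypothetical helper, not from the PR.
parse_epoch() {
    date -d "$1" +%s 2>/dev/null || true
}

# Each argument is "cluster_name=creation_timestamp" (format assumed here).
summarize_clusters() {
    local entry cluster created creation_epoch
    local parsed_count=0 failed_count=0
    for entry in "$@"; do
        cluster="${entry%%=*}"
        created="${entry#*=}"
        creation_epoch=$(parse_epoch "$created")
        if [ -z "$creation_epoch" ]; then
            echo "WARNING: Could not parse creation time for cluster $cluster"
            failed_count=$((failed_count + 1))
            continue
        fi
        parsed_count=$((parsed_count + 1))
    done
    echo "parsed=$parsed_count failed=$failed_count"
    # Non-zero exit status so the job can flag runs with unparsed clusters.
    [ "$failed_count" -eq 0 ]
}
```

Returning a non-zero status when `failed_count` is above zero is one possible design; the job could instead only print the count and still succeed.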

if [ $stack_age_seconds -gt $GC_CLUSTER_AGE_SECONDS ]; then
echo "Found old CloudFormation stack: $stack_name (age: ${stack_age_hours}h)"
echo "Deleting CloudFormation stack: $stack_name"
aws cloudformation delete-stack --region ${REGION} --stack-name ${stack_name}
Contributor

Can we also add a retry for deleting stacks?
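One way such a retry could be structured is a small generic wrapper, sketched below. The `retry` function, its argument order, and the attempt and delay values are assumptions for illustration, not part of the PR.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
# Hypothetical helper, not part of the PR.
retry() {
    local max_attempts="$1" delay="$2" attempt=1
    shift 2
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "ERROR: '$*' failed after $attempt attempts" >&2
            return 1
        fi
        echo "Attempt $attempt failed, retrying in ${delay}s" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Possible use with the command from the diff (not executed here):
# retry 3 10 aws cloudformation delete-stack --region "${REGION}" --stack-name "${stack_name}"
```

Note that `delete-stack` only starts the deletion, so a wrapper like this retries the request itself; confirming that the stack is actually gone would need a separate check such as `aws cloudformation wait stack-delete-complete`.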


if [[ "$RUN_GARBAGE_COLLECTION" == true ]]; then
trap "kill -9 $timeout_watcher_pid 2>/dev/null || true" EXIT
garbage_collection
Contributor

Should it be clusters_gc here?

Contributor Author

Yes, it should be clusters_gc, updated.

@edwardbadboy
Contributor

It would be good to explain why we need this in the pull request and commit message.

Add a periodic garbage collection job to handle orphaned cloud clusters
and prevent resource leakage when primary cleanup steps fail.

Signed-off-by: Shuyang Xin <[email protected]>
3 participants