feat(retry): add exponential backoff retry for 503 errors #375

Merged
miton18 merged 1 commit into master from fix/issue-339-retry-503-errors
Mar 19, 2026
Conversation

@miton18 miton18 commented Mar 19, 2026

Fixes #339

When creating multiple resources (~10+) with terraform apply, users experience 503 errors from the Clever Cloud API, requiring multiple retries to complete successfully. This creates "retry fatigue" and poor UX.

The Clever Cloud API has capacity limitations when handling concurrent requests. Terraform's default parallelism (10 resources) can overwhelm the API, causing 503 Service Unavailable errors.

Implemented automatic retry with exponential backoff for API calls that create resources (apps and addons). This approach:

  • Retries automatically on 503 errors only

  • Uses exponential backoff: 1s, 2s, 4s, 8s, 16s (max 30s)

  • Max 5 attempts before giving up

  • Logs retry attempts for debugging

  • Generic implementation that can be extended to other status codes

  • retry.go: Generic retry mechanism with exponential backoff

    • WithRetry(): Wraps any client.Response-returning function
    • WithRetryConfig(): Allows custom retry configuration
    • DefaultConfig(): 5 attempts, exponential backoff (1s→30s max)
  • retry_test.go: Unit tests for backoff calculation and retry logic

  • addon.go: Added CreateAddonWithRetry() wrapper

  • app.go: Added CreateAppWithRetry() wrapper

Replaced all direct calls to:

  • tmp.CreateAddon() → tmp.CreateAddonWithRetry()
  • tmp.CreateApp() → tmp.CreateAppWithRetry()

Affected resources:

  • All addon types (postgresql, mysql, redis, mongodb, elasticsearch, etc.)

  • All application types (via application/create.go)

  • Config providers

  • ✅ Unit tests for retry logic pass

  • ✅ Provider builds successfully

  • ✅ Exponential backoff calculation verified

Default retry config:

  • Max attempts: 5
  • Initial delay: 1 second
  • Max delay: 30 seconds
  • Multiplier: 2.0 (exponential)

Users can still pass the -parallelism=1 flag for extreme cases, but this should no longer be necessary for most scenarios.

@miton18 miton18 force-pushed the fix/issue-339-retry-503-errors branch from 7a8765d to eb1c822 Compare March 19, 2026 18:30
@miton18 miton18 force-pushed the fix/issue-339-retry-503-errors branch from eb1c822 to f53efd8 Compare March 19, 2026 18:33
@miton18 miton18 merged commit 5e41b18 into master Mar 19, 2026
29 of 39 checks passed
@miton18 miton18 deleted the fix/issue-339-retry-503-errors branch March 19, 2026 19:07

Development

Successfully merging this pull request may close these issues.

Retry fatigue - CC api disallow a full apply
