fix(BA-1840): Make cloud provider detection future-proof #5086
Pull Request Overview
This PR refactors cloud provider detection to be async and future-proof, replacing local heuristics with concurrent metadata endpoint probes and standardizing return values using a new `CloudProvider` enum.
- Introduces the `CloudProvider` enum and per-provider async detection helpers
- Replaces the synchronous `detect_cloud` with `asyncio.staggered.staggered_race` for the fastest metadata lookup
- Enhances the `curl` helper with `aiohttp` timeouts, `raise_for_status`, and type-safe overloads
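The staggered-race approach summarized above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the enum values, the `probe_*` helpers, and `detect_cloud_sketch` are hypothetical stand-ins, and the probes fake their outcome instead of querying real metadata endpoints. The one library behavior it relies on is that `staggered_race()` treats any normal return as a win, so a probe has to raise when its provider is not detected.

```python
import asyncio
import enum
from asyncio.staggered import staggered_race
from typing import Awaitable, Callable, Optional, Sequence


class CloudProvider(enum.Enum):
    # Hypothetical member values; the PR's actual enum may differ.
    AWS = "aws"
    AZURE = "azure"
    GCP = "gcp"


async def probe_aws() -> CloudProvider:
    # Stand-in for a metadata probe that fails right away.
    raise ConnectionError("AWS metadata endpoint unreachable")


async def probe_azure() -> CloudProvider:
    # Stand-in for a metadata probe that answers after a short delay.
    await asyncio.sleep(0.01)
    return CloudProvider.AZURE


async def probe_gcp() -> CloudProvider:
    raise ConnectionError("GCP metadata endpoint unreachable")


async def detect_cloud_sketch(
    probes: Sequence[Callable[[], Awaitable[CloudProvider]]],
    delay: float = 0.05,
) -> Optional[CloudProvider]:
    # staggered_race() starts the next probe when the previous one fails
    # or when `delay` seconds pass, whichever comes first. It returns
    # (winner_result, winner_index, exceptions); winner_result is None
    # when every probe raised.
    winner, _winner_index, _exceptions = await staggered_race(probes, delay)
    return winner


if __name__ == "__main__":
    print(asyncio.run(detect_cloud_sketch([probe_aws, probe_azure, probe_gcp])))
```

Because a failed probe immediately triggers the next one, a machine that is on none of the clouds only pays for the slowest failing probe rather than the sum of all timeouts.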
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
src/ai/backend/common/networking.py | Updated `curl` to use `aiohttp.ClientTimeout`, added overloads |
src/ai/backend/common/identity.py | Rewrote `detect_cloud` with async sub-detectors, added `CloudProvider` enum and `asyncio.staggered.staggered_race` |
Comments suppressed due to low confidence (2)
src/ai/backend/common/identity.py:97
- [nitpick] Consider adding unit tests or integration tests for `detect_cloud` to validate correct provider detection and fallback behavior across AWS, Azure, and GCP.
async def detect_cloud() -> Optional[CloudProvider]:
src/ai/backend/common/networking.py:23
- The runtime annotation references `yarl.URL` but `yarl` is only imported under `TYPE_CHECKING`. Either import `yarl` at runtime or enable postponed annotations (`from __future__ import annotations`) to avoid a `NameError`.
url: str | yarl.URL,
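The second suggestion in that comment (postponed annotations) can be illustrated with a small sketch. The `describe_url` function is a hypothetical stand-in for the annotated parameter quoted above, not the PR's `curl()` signature. With postponed annotations the annotation is stored as a string and never evaluated at definition time, so the module imports cleanly even though `yarl` is only visible to the type checker.

```python
from __future__ import annotations  # must be the first statement in the module

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only seen by static type checkers; never executed at runtime.
    import yarl


def describe_url(url: str | yarl.URL) -> str:
    # Hypothetical helper: the annotation above is kept as the string
    # "str | yarl.URL", so no NameError is raised when the function is
    # defined without a runtime import of yarl.
    return str(url)


if __name__ == "__main__":
    print(describe_url("http://169.254.169.254/"))
    print(describe_url.__annotations__["url"])
```

Without the `__future__` import, the `def` line would evaluate `yarl.URL` eagerly and fail, which is the `NameError` the comment warns about.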
Co-authored-by: Copilot <[email protected]>
470e1aa to 2a646f2
resolves #5085 (BA-1840)
This PR modernizes the `ai.backend.common.networking.curl()` and `ai.backend.common.identity.{detect_cloud(), get_instance_*()}` functions.
- […] (`i-my-vm-name-mfqwcylb`), as the VM name is only guaranteed to be unique within a resource group, and resource groups are just tags.
- `asyncio.staggered.staggered_race()` runs the happy-eyeballs logic over multiple cloud-provider detection routines.
- `aiohttp.ClientTimeout(connect=...)` is used instead of `async with asyncio.timeout()` or the classic `timeout=...` parameter, so that timeouts apply explicitly to the connection attempts only, excluding response parsing and processing.
- `raise_for_status=True` is set, and `aiohttp.ClientError` (which includes `aiohttp.ClientResponseError`) is caught.
- The `test_curl_*` test cases use `aioresponses` so that the `raise_for_status=True` option works in tests.
- The `ai.backend.common.identity` module works regardless of whether there is a running event loop.
- `assert` and `AssertionError` handling are avoided for mandated runtime checks, as assertions may be omitted under the Python runtime's optimization flags.
Example
AWS
Azure
GCP
Local VM
Checklist: (if applicable)
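The description's point about `assert` in mandated runtime checks can be illustrated with a short sketch. The `parse_instance_id` helper and its error message are hypothetical and not taken from the PR; the sketch only shows the general pattern of replacing an assertion with an explicit check that survives `python -O`.

```python
def parse_instance_id(raw: str) -> str:
    """Hypothetical helper that validates a metadata response body."""
    value = raw.strip()
    # An `assert value, "..."` here would be compiled away under
    # `python -O`, silently letting an empty ID through. An explicit
    # check with a regular exception is always executed.
    if not value:
        raise ValueError("metadata endpoint returned an empty instance ID")
    return value


if __name__ == "__main__":
    print(parse_instance_id(" i-0123456789abcdef \n"))
    try:
        parse_instance_id("   ")
    except ValueError as exc:
        print(f"rejected: {exc}")
```

Callers then handle `ValueError` (or a project-specific exception) rather than `AssertionError`, so behavior is the same with or without optimization flags.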