-
Notifications
You must be signed in to change notification settings - Fork 163
feat(BA-992): Offload health check capability to AppProxy #5147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors model service health checks by offloading them to AppProxy, updates route management via Redis, and replaces the old predicate checker with a unified helper.
- Introduce health check configuration reading and pass it to AppProxy
- Refactor ModelServicePredicateChecker into ModelServiceHelper with consolidated validation methods
- Persist route connection info in Redis (
redis_live
) and emit real-time update events
Reviewed Changes
Copilot reviewed 33 out of 33 changed files in this pull request and generated 2 comments.
Show a summary per file
File | Description |
---|---|
src/ai/backend/manager/services/processors.py | Add redis_live to ServiceArgs |
src/ai/backend/manager/services/model_serving/types.py | Define RouteConnectionInfo dataclass |
src/ai/backend/manager/services/model_serving/services/model_serving.py | Integrate redis_helper , health checks, notifications |
src/ai/backend/manager/registry.py | Add get_health_check_info , create_appproxy_endpoint , notify_endpoint_route_update_to_appproxy |
src/ai/backend/manager/models/endpoint.py | Implement generate_redis_route_info , rename helper |
src/ai/backend/manager/models/routing.py | Update default status_filter , add load_session |
src/ai/backend/manager/models/storage.py | Raise VFolderGone on 410 response |
src/ai/backend/manager/models/vfolder.py | Catch VFolderGone in deletion |
src/ai/backend/common/data/config/types.py | Add HealthCheckConfig Pydantic model |
Comments suppressed due to low confidence (4)
src/ai/backend/manager/registry.py:3659
- [nitpick] Consider adding a docstring to get_health_check_info to explain how it selects or reads health check settings for different runtime variants, including fallback behavior.
async def get_health_check_info(
src/ai/backend/manager/services/model_serving/types.py:64
- [nitpick] The field name
app
in RouteConnectionInfo is ambiguous. Consider renaming it toapp_name
orservice_name
for clarity.
class RouteConnectionInfo:
src/ai/backend/manager/registry.py:3692
- Consider adding unit or integration tests for create_appproxy_endpoint and get_health_check_info to verify correct serialization of health check configs, Redis writes, and error handling.
async def create_appproxy_endpoint(
src/ai/backend/manager/models/routing.py:125
- Changing the default status_filter to include all active_route_statuses may change behavior for callers that relied on the previous default (HEALTHY, UNHEALTHY, PROVISIONING). Consider explicitly specifying the intended filter in existing call sites or updating tests.
status_filter: list[RouteStatus] = list(RouteStatus.active_route_statuses()),
lambda r: r.set( | ||
f"endpoint.{action.service_id}.session.{route.session}.traffic_ratio", | ||
str(action.traffic_ratio), | ||
ex=3600, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The TTL value 3600
is hardcoded in multiple places. Extract this into a named constant or configuration parameter to improve maintainability.
ex=3600, | |
ex=DEFAULT_TTL_SECONDS, |
Copilot uses AI. Check for mistakes.
) -> HealthCheckConfig | None: | ||
_info: HealthCheckConfig | None = None | ||
|
||
if _path := MODEL_SERVICE_RUNTIME_PROFILES[endpoint.runtime_variant].health_check_endpoint: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indexing MODEL_SERVICE_RUNTIME_PROFILES by endpoint.runtime_variant may throw a KeyError for variants not registered in the map. Add a guard or default fallback to handle unexpected variants.
if _path := MODEL_SERVICE_RUNTIME_PROFILES[endpoint.runtime_variant].health_check_endpoint: | |
runtime_profile = MODEL_SERVICE_RUNTIME_PROFILES.get(endpoint.runtime_variant) | |
if runtime_profile and (_path := runtime_profile.health_check_endpoint): |
Copilot uses AI. Check for mistakes.
Resolves #3051 (BA-992).
Summary
Breaking Changes
Important: This PR breaks health check capability on OSS AppProxy. Future work will restore support, but for now disable the health check feature in
model-definition.yaml
to use model service on Open Source Backend.AI.Changes
API Specifications
ModelServiceStatusEventArgs
base classEndpointRouteListUpdatedEvent
for endpoint route synchronizationHealth Check System
ModelServiceHelper
class replacingModelServicePredicateChecker
Route Management
generate_redis_route_info
method for serializable connection datanotify_endpoint_route_update_to_appproxy
for real-time updatesDatabase Changes
Technical Details
The health check system now reads configuration from either:
Health check configuration is passed to AppProxy during endpoint creation, allowing AppProxy to handle health checking independently. Route connection information is stored in Redis for AppProxy consumption with the key pattern:
endpoint.{endpoint_id}.route_connection_info
Testing
Existing tests should continue to pass. New functionality requires model service integration testing with AppProxy components.