Advanced High-Availability considerations for model service

Currently, Model Service relies on the health check information provided by the kernel runner operating in each container. Because the container itself is the only source of this information, the health status cannot be determined when an entire GPU node shuts down. To guarantee the liveness of each model service, it is crucial to check whether the container itself has become unresponsive and, if it has, to reconcile the replica count. We suggest the following improvements to resolve the issue:

- Make AppProxy the health checker
- Add an option to automatically terminate unhealthy sessions after a certain grace period
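
To make the two suggestions concrete, here is a rough sketch of how an external checker with a grace period might behave. It is not existing AppProxy or Backend.AI code: `ExternalHealthTracker`, `reconcile`, the probe callables, and the session IDs are all hypothetical names invented for illustration. The idea is that the checker probes each replica from outside the container, treats a connection error or timeout (as would happen when the whole GPU node is down) as unhealthy, and only terminates a session once it has stayed unhealthy for the whole grace period.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class ExternalHealthTracker:
    """Hypothetical tracker for replica health observed from outside the container."""

    grace_period: float  # seconds a replica may stay unhealthy before termination
    # Maps session ID -> timestamp of the first consecutive failed probe.
    unhealthy_since: Dict[str, float] = field(init=False, default_factory=dict)

    def observe(self, session_id: str, healthy: bool, now: float) -> None:
        if healthy:
            # Any successful probe resets the grace period.
            self.unhealthy_since.pop(session_id, None)
        else:
            # Remember only the first failure of the current unhealthy streak.
            self.unhealthy_since.setdefault(session_id, now)

    def expired(self, now: float) -> List[str]:
        """Return sessions that have been unhealthy for at least the grace period."""
        return sorted(
            sid
            for sid, since in self.unhealthy_since.items()
            if now - since >= self.grace_period
        )


def reconcile(
    tracker: ExternalHealthTracker,
    probes: Dict[str, Callable[[], bool]],
    now: float,
    desired_replicas: int,
) -> Tuple[List[str], int]:
    """Probe every replica once and return (sessions to terminate, replicas to spawn)."""
    for sid, probe in probes.items():
        try:
            healthy = probe()
        except OSError:
            # Connection refused or timed out: the container (or its node) is unreachable.
            healthy = False
        tracker.observe(sid, healthy, now)

    to_terminate = tracker.expired(now)
    for sid in to_terminate:
        tracker.unhealthy_since.pop(sid, None)

    remaining = len(probes) - len(to_terminate)
    to_spawn = max(0, desired_replicas - remaining)
    return to_terminate, to_spawn
```

In a real deployment the probe would be an HTTP or TCP request issued by the checker, and `now` would come from a monotonic clock; both are passed in explicitly here so the logic can be exercised without a network.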

Advanced High-Availability considerations for model service #3051