This repository was archived by the owner on Jul 4, 2025. It is now read-only.
Add Multi-GPU Support for LlamaCpp Engine #1391
Closed
Description
We need to implement multi-GPU support in our LlamaCpp wrapper engine so that users can spread inference across several GPUs and get better performance.
Goals
- Allow users to choose which available GPUs to use for running the engine
- Implement load balancing across selected GPUs
- Maintain compatibility with single-GPU setups
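The load-balancing goal could be as simple as rotating requests across per-GPU engine instances. A minimal sketch, assuming one engine instance per selected GPU (the class and method names are illustrative, not part of Cortex):

```python
from itertools import cycle


class GpuRoundRobin:
    """Hypothetical sketch: rotate incoming requests across the
    user-selected GPU ids. A single-GPU selection degenerates to
    always returning that GPU, preserving single-GPU compatibility."""

    def __init__(self, selected_gpus):
        if not selected_gpus:
            raise ValueError("at least one GPU must be selected")
        self._cycle = cycle(selected_gpus)

    def next_gpu(self):
        # Return the GPU id that should serve the next request.
        return next(self._cycle)


balancer = GpuRoundRobin([0, 1])
```

With `[0, 1]` selected, successive calls yield `0, 1, 0, 1, ...`; with `[0]` alone, every call yields `0`.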
Proposed Implementation
- Detect available GPUs on the system
- Add a configuration option for users to specify which GPUs to use
- Modify the wrapper engine to distribute workload across selected GPUs
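One way to express "distribute workload across selected GPUs" is a llama.cpp-style `tensor_split` list, weighting each GPU by its free VRAM and zeroing out unselected devices. A sketch under those assumptions (the function name and the `gpu id -> free VRAM` input shape are illustrative):

```python
def make_tensor_split(vram_mb_by_gpu, selected):
    """Sketch: build a tensor-split ratio list over all detected GPUs,
    proportional to free VRAM for selected GPUs and 0.0 for the rest.

    vram_mb_by_gpu: dict mapping GPU id -> free VRAM in MB (assumed to
    come from a detection step, e.g. via NVML).
    selected: iterable of GPU ids the user chose in configuration.
    """
    selected = set(selected)
    total = sum(vram_mb_by_gpu[g] for g in selected)
    return [
        vram_mb_by_gpu[g] / total if g in selected else 0.0
        for g in sorted(vram_mb_by_gpu)
    ]
```

For example, with GPUs `{0: 8000, 1: 8000, 2: 16000}` and GPUs 0 and 2 selected, GPU 2 gets twice the share of GPU 0 and GPU 1 gets nothing.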
Acceptance Criteria
- Users can specify which GPUs to use via configuration
- The engine correctly utilizes all selected GPUs
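For the configuration criterion, the selection could live in `model.yml` if the open question below is answered yes. A hypothetical fragment, purely to anchor the discussion (none of these field names are an agreed schema):

```yaml
# Hypothetical model.yml fragment — field names are illustrative.
engine: llama-cpp
gpus: [0, 2]        # which GPU ids to use
main_gpu: 0         # GPU that hosts small tensors / scratch buffers
tensor_split: auto  # or an explicit per-GPU ratio list
```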
Additional Considerations
- Ensure proper error handling for scenarios where specified GPUs are unavailable
- Will we add this feature to model.yml for model management?
- Does this feature work for both the CLI and the API?
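The error-handling consideration above could be a validation step that runs before engine start: compare the configured GPU ids against the detected ones and fail fast with a clear message. A minimal sketch (names and error text are illustrative):

```python
def validate_gpu_selection(requested, available):
    """Sketch: reject a configuration that names GPUs the system
    does not actually have, instead of failing later at load time.

    requested: GPU ids from user configuration.
    available: GPU ids found by the detection step.
    """
    missing = sorted(set(requested) - set(available))
    if missing:
        raise ValueError(
            f"requested GPU(s) {missing} not available; "
            f"detected GPUs: {sorted(available)}"
        )
    return list(requested)
```

Valid selections pass through unchanged; an unavailable id raises immediately, which keeps the failure at configuration time rather than mid-inference.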