add support for mixtral-8x7b and mixtral-8x7b-instruct #408
Conversation
model-engine/model_engine_server/domain/use_cases/llm_model_endpoint_use_cases.py
@@ -58,6 +58,8 @@ def get_default_supported_models_info() -> Dict[str, ModelInfo]:
        ),
        "mistral-7b": ModelInfo("mistralai/Mistral-7B-v0.1", None),
        "mistral-7b-instruct": ModelInfo("mistralai/Mistral-7B-Instruct-v0.1", None),
Let's also update mistral-7b-instruct to use the newer version released today (https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2). We can do this in a follow-up PR, but it's good to do as well.
Should we add it as a separate model instead of replacing the current one? Also, we should do that as a follow-up PR.
Personally, I wouldn't make it a separate model; I would just add the weights to S3 and use them in favor of the v0.1 weights. I suppose there is some value in having both models for completeness, though. @yunfeng-scale, thoughts on this?
I don't think there's a need to add it as a new model.
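A minimal sketch of the follow-up change discussed above, assuming the existing entry is updated in place rather than a new key being added:

```python
# Assumed follow-up: repoint the existing key at the v0.2 weights
# instead of registering a separate "mistral-7b-instruct-v0.2" model.
"mistral-7b-instruct": ModelInfo("mistralai/Mistral-7B-Instruct-v0.2", None),
```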
Pull Request Summary
Bump vLLM to v0.2.4 to support the mixtral-8x7b models.
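The summary doesn't show the dependency change itself; presumably it's a version pin along these lines (the requirements file path is an assumption about this repo's layout):

```
# requirements file for the inference image (path assumed)
vllm==0.2.4
```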
Test Plan and Usage Guide
Spin up the model for local inference.
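The test plan doesn't list the exact commands; a minimal sketch of local inference against the new model using the vLLM 0.2.4 offline API directly (this bypasses the model-engine endpoint, and the tensor-parallel setting is an assumption about the test hardware):

```python
from vllm import LLM, SamplingParams

# Mixtral-8x7B is too large for a single typical GPU; tensor_parallel_size=2
# is an assumption about the local test machine.
llm = LLM(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    tensor_parallel_size=2,
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["What is a mixture-of-experts model?"], params)
for out in outputs:
    print(out.outputs[0].text)
```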