Model architectures #90
Conversation
…vLLM Inference session Signed-off-by: Alexandre Marques <[email protected]>
This update introduces a new section detailing the currently supported model architectures, including Llama-3 and Qwen3, along with their trained checkpoints, giving users clear information on the available models. Signed-off-by: Alexandre Marques <[email protected]>
Force-pushed from 75f13f5 to 2a13cc4.
@anmarques I'd like to adapt this further to better balance discoverability, clarity, and depth. Can we change this into a table format and add it as a new subsection, either in the Overview (after Key Features) or in Resources (### Models and then ### Research Implementations)?
As a quick example, I'm thinking something like the following table format so it is compact and informative:
| Model / Verifier | Training & Creation | Deployment (vLLM) | Pretrained Checkpoints |
|---|---|---|---|
| Qwen3 (8B) | Eagle 3 ✔️, HASS ⏳ | Eagle 3 ✔️ | Eagle 3 ✔️ |
| Llama-3.1 (70B) | Eagle 3 ⏳ | Eagle 3 ⏳ | ✖️ |

✔️ = Supported, ⏳ = In Progress, ✖️ = Not Available
I like having this central place for everything supported. LGTM once the existing comments are resolved!
I agree with the suggested changes. Rahul and I are putting together a similar table for an in-depth vLLM overview that we'll add in the next week or so.
This PR lists the model architectures that are currently supported, or planned to be supported, with the speculators format in vLLM. It also identifies the architectures for which trained checkpoints are available, from Red Hat AI, EAGLE, or HASS.
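To make "supported for deployment in vLLM" concrete, a speculators-format Eagle 3 draft model is typically attached to its verifier through vLLM's speculative decoding configuration. This is a hedged sketch, not taken from the PR: the model IDs are placeholders, and the exact flag name and accepted JSON keys depend on the vLLM version, so check the vLLM speculative decoding docs before using it.

```shell
# Illustrative only: serve a Qwen3-8B verifier with a hypothetical
# speculators-format Eagle 3 draft model. The checkpoint ID below is a
# placeholder, not a real artifact from this PR.
vllm serve Qwen/Qwen3-8B \
  --speculative-config '{
    "method": "eagle3",
    "model": "org/qwen3-8b-eagle3-speculator",
    "num_speculative_tokens": 3
  }'
```

Requests to the resulting OpenAI-compatible endpoint are then transparently accelerated: the draft model proposes tokens and the verifier accepts or rejects them, so output quality matches the verifier alone.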