[Roadmap] Primus-Turbo Roadmap H2 2025 #101
Open
This roadmap is the H2 2025 development plan of Primus-Turbo.
Note: The roadmap is flexible and will be updated over time based on project needs and community input.
Release Overview
| Version | Framework | Status | Date |
|---|---|---|---|
| v0.1.0 | PyTorch + ROCm6.4 | ✅ Released | 2025-09-11 |
| v0.1.1 | PyTorch + ROCm7.0 | ✅ Released | 2025-10-15 |
| v0.2.0 | PyTorch + ROCm7.1 | ✅ Released | 2025-12-05 |
Detailed Plans
v0.1.0 (Released)
Focus
- Build the foundational framework of Primus-Turbo.
- Provide core operators.
Features
- GEMM: Support FP16/BF16.
- FlashAttention: Support FP16/BF16.
- GroupedGEMM: Support FP16/BF16.
Framework
- Provide PyTorch APIs.
- Support ROCm 6.4.
v0.2.0 (Released)
Focus
- Introduce FP8 foundational support.
- Enable communication primitives with FP8, focusing on DeepEP.
Features
- GEMM: Support FP8 (E4M3/E5M2).
  - Support Tensorwise.
  - Support Rowwise.
  - Support Blockwise.
  - Support MX.
- FlashAttention: Support FP8 (E4M3/E5M2).
  - Support Blockwise.
- GroupedGEMM: Support FP8 (E4M3/E5M2).
  - Support Tensorwise.
  - Support Rowwise.
  - Support Blockwise.
  - Support MX.
- All2All: FP8 support.
  - Support Tensorwise.
- DeepEP:
  - Intra-Node Normal Kernel.
  - Inter-Node Normal Kernel.
  - Support NICs.
    - ConnectX-7
    - Thor2
    - Pensando
  - Support `internode_dispatch` with no GPU-CPU sync.
  - Support `torch.compile`.
- TokenDispatcher:
  - Integrate Permute/Unpermute.
  - Support Sync-Free `DeepEPTokenDispatcher`.
- Support MoE Fused Activations.
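
For readers unfamiliar with the FP8 scaling granularities listed above, here is a rough sketch of how tensorwise, rowwise, and blockwise scales differ. It is plain Python for illustration only and is not Primus-Turbo code: the function names are hypothetical, and the constant `E4M3_MAX = 448.0` assumes the OCP E4M3 format (the FNUZ variant has a maximum of 240.0). MX scaling is not covered here.

```python
# Illustrative only: hypothetical helpers, not the Primus-Turbo API.
E4M3_MAX = 448.0  # assumption: OCP E4M3 finite max; E4M3 FNUZ would be 240.0


def amax(values):
    """Largest absolute value, floored to avoid division by zero."""
    return max(max(abs(v) for v in values), 1e-12)


def tensorwise_scale(matrix, fp8_max=E4M3_MAX):
    """One scale shared by the whole matrix."""
    return fp8_max / amax([v for row in matrix for v in row])


def rowwise_scales(matrix, fp8_max=E4M3_MAX):
    """One scale per row."""
    return [fp8_max / amax(row) for row in matrix]


def blockwise_scales(matrix, block, fp8_max=E4M3_MAX):
    """One scale per (block x block) tile; dimensions must divide evenly."""
    rows, cols = len(matrix), len(matrix[0])
    assert rows % block == 0 and cols % block == 0
    scales = []
    for r0 in range(0, rows, block):
        scale_row = []
        for c0 in range(0, cols, block):
            tile = [
                matrix[r][c]
                for r in range(r0, r0 + block)
                for c in range(c0, c0 + block)
            ]
            scale_row.append(fp8_max / amax(tile))
        scales.append(scale_row)
    return scales


x = [
    [1.0, -2.0, 0.5, 0.25],
    [4.0, 0.5, -8.0, 1.0],
    [0.125, 0.25, 16.0, -2.0],
    [0.5, -1.0, 2.0, 32.0],
]
print(tensorwise_scale(x))     # 14.0
print(rowwise_scales(x))       # [224.0, 56.0, 28.0, 14.0]
print(blockwise_scales(x, 2))  # [[112.0, 56.0], [448.0, 14.0]]
```

In this toy input a single large value (32.0) forces the tensorwise scale down to 14.0 for every element, while rowwise and blockwise scaling let rows and tiles with smaller magnitudes use larger scales. The price is extra scale metadata to store and apply.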