[Megatron-LM] Create patch to update hybrid models TFLOPS calculation by clairesonglee · Pull Request #594 · AMD-AGI/Primus

clairesonglee · 2026-03-11T20:55:29Z

No description provided.

…32B Configs for MI300X & MI355X (#556)

YF: Only SFT related config and Doc changes, bypassing unit CI tests

## Summary

This PR introduces post-training documentation and updates Qwen3 32B model configuration files to support AMD MI300X and MI355X accelerators.

---

## Changes

### 📘 Documentation

- **Added `posttraining.md`**
  - New comprehensive guide for post-training workflows
  - Covers setup instructions, configuration details, and usage examples
- **Updated `docs/README.md`**
  - Added a new section referencing post-training documentation
  - Improved documentation organization and navigation

---

### ⚙️ Configuration Updates

- **Updated Qwen3_32B model YAML configs**
  - Added/modified configurations optimized for:
    - MI300X
    - MI355X
  - Adjusted parameters for compatibility and stable execution

---

## Validation

- Verified updated configs load and execute successfully on MI300X and MI355X environments
- Confirmed documentation links and structure render correctly

---

## Checklist

- [x] Added `posttraining.md`
- [x] Updated `docs/README.md`
- [x] Modified Qwen3_32B YAML configs
- [x] Verified changes locally

Co-authored-by: Mingyu Yang <Mingyu.Yang@amd.com>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
Co-authored-by: Kailash Gogineni <gkailashnath1998@gmail.com>
Co-authored-by: HuangWei-95 <Wei.Huang4@amd.com>
Co-authored-by: HuangWei-95 <weihuan@amd.com>
Co-authored-by: Xiaoming-AMD <Xiaoming.Peng@amd.com>
Co-authored-by: WangLingxun <linxwang@amd.com>

vidushi8 and others added 11 commits February 12, 2026 19:13

Update torchtitan batch size and enable CE fusion

43eacb6

update MI355 yaml for better perf

8db36dc

update yaml

143d593

tune hybrid model mi300x configs

365758c

tune hybrid model mi355x configs

06d8e1e

Expand projection.md with memory projection and performance details. (#…

fadaeb1

…578) Expand projection.md with memory projection and performance details.

Merge branch 'main' into release/v26.2

4262afd

update yamls to fix regressions and standardize

d32360e

fix(megatron): patch validate_args and add ROCM argument validation (#…

bc4c861

…581) Hook Megatron validate_args alongside parse_args so Primus-injected arguments are validated consistently, and run additional ROCM-specific argument checks during initialization.
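
The commit message above describes wrapping an existing validator so that extra platform checks run after it. A minimal sketch of that general pattern follows. It does not use Megatron or Primus code: `upstream_validate`, `rocm_check`, the `framework` object, and the `use_example_rocm_kernel` flag are all hypothetical stand-ins chosen for illustration, and the real patch may be structured differently.

```python
from types import SimpleNamespace


def wrap_validator(original, extra_checks):
    """Build a validator that runs `original` first, then every extra check."""

    def patched(args, *rest, **kwargs):
        result = original(args, *rest, **kwargs)
        # Some validators return the args object, others return None.
        validated = args if result is None else result
        for check in extra_checks:
            check(validated)
        return validated

    return patched


def upstream_validate(args):
    """Stand-in for a framework validator (not Megatron's real one)."""
    if args.micro_batch_size <= 0:
        raise ValueError("micro_batch_size must be positive")
    return args


def rocm_check(args):
    """Hypothetical platform check; the flag name is made up."""
    if args.use_example_rocm_kernel and args.micro_batch_size % 2 != 0:
        raise ValueError("example ROCm kernel needs an even micro_batch_size")


# Stand-in module object so the patch can be applied by attribute assignment.
framework = SimpleNamespace(validate_args=upstream_validate)
framework.validate_args = wrap_validator(framework.validate_args, [rocm_check])

ok = framework.validate_args(
    SimpleNamespace(micro_batch_size=16, use_example_rocm_kernel=True)
)
print(ok.micro_batch_size)
```

Wrapping rather than replacing the validator keeps the upstream checks intact, so any argument injected later still passes through the same path as arguments parsed from the command line.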

github-code-quality bot found potential problems Mar 11, 2026

primus/backends/megatron/core/models/hybrid/hybrid_mamba_mla_layer_specs.py (fixed)

clairesonglee changed the title ~~Create patch to update hybrid models TFLOPS calculation~~ [Megatron-LM] Create patch to update hybrid models TFLOPS calculation Mar 11, 2026

wenxie-amd and others added 7 commits March 13, 2026 09:46

Merge branch 'main' into release/v26.2

c6754d9

fix code-lint issue

0420f1a

[Megatron-LM] Update Mamba model tokenizer (#603)

3423cec

remove redundant params in mamba config

b4204f6

update config mi355 llama3 70b

df4c65e

fix turbo argument in mi355 dsv3

697efab

sync with release/v26.2 & add patch to calculate zebra-llama flops

2c26dd7

clairesonglee force-pushed the dev/clairlee/update-hybrid-throughput branch from 810cfd3 to 2c26dd7 Compare March 17, 2026 05:37

clairesonglee and others added 3 commits March 24, 2026 23:30

Merge branch 'main' into dev/clairlee/update-hybrid-throughput

1be8773

update mbs=16

aafa2f4

create MI355X config

2f738e6

clairesonglee marked this pull request as ready for review March 25, 2026 06:39

clairesonglee requested review from Xiaoming-AMD, limou102 and wenxie-amd as code owners March 25, 2026 06:39

clairesonglee marked this pull request as draft March 25, 2026 06:55

code lint with pre-commit

d271241

clairesonglee marked this pull request as ready for review March 25, 2026 07:03

clairesonglee wants to merge 22 commits into main from dev/clairlee/update-hybrid-throughput

clairesonglee commented Mar 11, 2026

6 participants
