[None] [feat] Add Tencent HunYuanMoEV1 model support #5521

qianbiaoxiang · 2025-06-26T14:46:32Z

Description

The Hunyuan inference team currently supports the Hunyuan-A13B model. This PR adds modeling_hunyuan_moe.py and related files so that the HunYuanMoEV1ForCausalLM model is supported.

We have validated the accuracy of this PR. HunYuan (a new MoE LLM from Tencent) will be open-sourced in the coming days.

Thanks~

Test Coverage

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provide a user friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

run [--disable-fail-fast --skip-test --stage-list "A10-1, xxx" --gpu-type "A30, H100_PCIe" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-[Post-Merge]-1, xxx"]

Launch build/test pipelines. All previously running jobs will be killed.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests. Will also run L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-[Post-Merge]-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-[Post-Merge]-1, xxx".

For guidance on mapping tests to stage names, see docs/source/reference/ci-overview.md.
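
As an illustration of how the run options above combine into a single PR comment, here is a minimal sketch. The helper name `build_bot_run_comment` and its parameters are hypothetical and not part of the CI tooling; only the flag names and the quoted, comma-separated value format are taken from the help text above.

```python
from typing import Optional, Sequence


def _quoted_list(values: Sequence[str]) -> str:
    """Format values the way the help text shows them: "A10-1, xxx"."""
    return '"{}"'.format(", ".join(values))


def build_bot_run_comment(
    stage_list: Optional[Sequence[str]] = None,
    gpu_types: Optional[Sequence[str]] = None,
    extra_stages: Optional[Sequence[str]] = None,
    disable_fail_fast: bool = False,
    skip_test: bool = False,
    post_merge: bool = False,
) -> str:
    """Hypothetical helper: assemble a `/bot run ...` comment string."""
    parts = ["/bot", "run"]
    # Boolean switches are emitted only when requested.
    if disable_fail_fast:
        parts.append("--disable-fail-fast")
    if skip_test:
        parts.append("--skip-test")
    if post_merge:
        parts.append("--post-merge")
    # List-valued options take one quoted, comma-separated argument.
    if stage_list:
        parts.append("--stage-list " + _quoted_list(stage_list))
    if gpu_types:
        parts.append("--gpu-type " + _quoted_list(gpu_types))
    if extra_stages:
        parts.append("--extra-stage " + _quoted_list(extra_stages))
    return " ".join(parts)


if __name__ == "__main__":
    print(build_bot_run_comment(gpu_types=["A30", "H100_PCIe"], disable_fail_fast=True))
```

Per the help text, options such as --stage-list and --gpu-type do not update the GitHub check status, so a plain `/bot run` is still needed before merging.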

kill

kill

Kill all running builds associated with pull request.

skip

skip --comment COMMENT

Skip testing for latest commit on pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

Summary by CodeRabbit

New Features
- Advanced LLM quickstart: new CLI flag to optionally apply chat-template formatting to prompts.
- New HunYuan MoE causal LM model exposed for Torch users.
- Tokenizer: ability to retrieve chat templates.
- RoPE: dynamic scaling mode with configurable alpha and added QK-normalization support for attention.
Bug Fixes
- Safer MLA detection before enabling flash MLA.
- Clearer tokenizer loading warnings with exception details.
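
To make the RoPE and QK-normalization items above more concrete, here is a minimal pure-Python sketch. It is not the code from this PR: it assumes the commonly used NTK-aware formula `base * alpha ** (d / (d - 2))` for the alpha-scaled RoPE base, the rotate-half RoPE convention, and an RMSNorm over the head dimension for QK-normalization. All function names are hypothetical, and whether normalization runs before or after RoPE is model-specific and not shown here.

```python
import math
from typing import List


def scaled_rope_base(base: float, head_dim: int, alpha: float) -> float:
    """Assumed NTK-aware scaling: enlarge the RoPE base by alpha ** (d / (d - 2))."""
    if head_dim <= 2 or head_dim % 2 != 0:
        raise ValueError("head_dim must be an even number greater than 2")
    return base * alpha ** (head_dim / (head_dim - 2))


def rope_inv_freq(base: float, head_dim: int) -> List[float]:
    """Inverse frequencies base ** (-2i / d) for i in [0, d / 2)."""
    return [base ** (-(2 * i) / head_dim) for i in range(head_dim // 2)]


def apply_rope(vec: List[float], position: int, inv_freq: List[float]) -> List[float]:
    """Rotate one head vector; element i is paired with element i + d / 2."""
    half = len(vec) // 2
    out = [0.0] * len(vec)
    for i in range(half):
        angle = position * inv_freq[i]
        c, s = math.cos(angle), math.sin(angle)
        x1, x2 = vec[i], vec[i + half]
        out[i] = x1 * c - x2 * s
        out[i + half] = x2 * c + x1 * s
    return out


def rms_norm(vec: List[float], weight: List[float], eps: float = 1e-6) -> List[float]:
    """RMSNorm over the head dimension, as typically used for QK-normalization."""
    mean_sq = sum(x * x for x in vec) / len(vec)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [w * x * scale for x, w in zip(vec, weight)]


if __name__ == "__main__":
    head_dim = 4
    base = scaled_rope_base(10000.0, head_dim, alpha=2.0)  # alpha value is illustrative
    inv_freq = rope_inv_freq(base, head_dim)
    q = rms_norm([1.0, 2.0, 3.0, 4.0], weight=[1.0] * head_dim)
    print(apply_rope(q, position=5, inv_freq=inv_freq))
```

With alpha = 1 the base is unchanged, so the sketch reduces to standard RoPE. Larger alpha values lower the rotation frequencies, which is the usual motivation for this kind of scaling when extending context length.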

wm2012011492 · 2025-06-30T02:48:19Z

/bot run

tensorrt-cicd · 2025-06-30T02:53:17Z

PR_Github #10260 [ run ] triggered by Bot

tensorrt-cicd · 2025-06-30T03:05:08Z

PR_Github #10260 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #7581 completed with status: 'FAILURE'

byshiue · 2025-08-13T03:00:23Z

/bot run

tensorrt-cicd · 2025-08-13T03:07:10Z

PR_Github #15060 [ run ] triggered by Bot

tensorrt-cicd · 2025-08-13T04:57:49Z

PR_Github #15060 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #11371 completed with status: 'FAILURE'

byshiue · 2025-08-13T07:09:45Z

/bot run

tensorrt-cicd · 2025-08-13T07:15:28Z

PR_Github #15089 [ run ] triggered by Bot

tensorrt-cicd · 2025-08-13T08:42:14Z

PR_Github #15089 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #11392 completed with status: 'FAILURE'

byshiue · 2025-08-13T08:49:48Z

/bot run

tensorrt-cicd · 2025-08-13T08:54:54Z

PR_Github #15102 [ run ] triggered by Bot

tensorrt-cicd · 2025-08-13T15:50:28Z

PR_Github #15102 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #11402 completed with status: 'FAILURE'

byshiue · 2025-08-13T23:05:40Z

/bot run

tensorrt-cicd · 2025-08-13T23:11:21Z

PR_Github #15193 [ run ] triggered by Bot

byshiue · 2025-08-14T13:01:21Z

/bot run

tensorrt-cicd · 2025-08-14T13:06:36Z

PR_Github #15298 [ run ] triggered by Bot

QiJune

LGTM

tensorrt-cicd · 2025-08-14T22:19:15Z

PR_Github #15298 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #11550 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

byshiue · 2025-08-14T22:57:21Z

There are too many conversations due to a wrong commit at the beginning. So I bypassed the merge restriction after checking the comments we gave, and ignored the bot's comments.

Signed-off-by: sorenwu <[email protected]> Co-authored-by: sorenwu <[email protected]> Co-authored-by: bhsueh_NV <[email protected]>

Signed-off-by: sorenwu <[email protected]> Co-authored-by: sorenwu <[email protected]> Co-authored-by: bhsueh_NV <[email protected]> Signed-off-by: Wangshanshan <[email protected]>


qianbiaoxiang requested review from a team as code owners June 26, 2025 14:46

qianbiaoxiang requested review from Naveassaf, suyoggupta and pcastonguay June 26, 2025 14:46

juney-nvidia added the "Community want to contribute" (PRs initiated from Community) and "Community Engagement" (help/insights needed from community) labels Jun 26, 2025

wm2012011492 force-pushed the support_hunyuan_moe branch from 99a91f8 to f81cd35 June 30, 2025 02:47

sorenwu requested review from a team as code owners August 12, 2025 03:03

byshiue requested review from a team and Wanli-Jiang and removed request for yechank-nvidia and a team August 13, 2025 02:58

Merge branch 'main' into support_hunyuan_moe

0d79fd9

byshiue approved these changes Aug 14, 2025


QiJune approved these changes Aug 14, 2025


byshiue merged commit 5c2f0fd into NVIDIA:main Aug 14, 2025
5 checks passed
