qianbiaoxiang
Contributor

@qianbiaoxiang qianbiaoxiang commented Jun 26, 2025

Description

The Hunyuan inference team is adding support for the Hunyuan-A13B model. This PR adds modeling_hunyuan_moe.py and related files to support the HunYuanMoEV1ForCausalLM model.

We have validated the accuracy of this PR. HunYuan (a new MoE LLM from Tencent) will be open-sourced in the coming days.

Thanks~

Test Coverage

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provide a user-friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

run [--disable-fail-fast --skip-test --stage-list "A10-1, xxx" --gpu-type "A30, H100_PCIe" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-[Post-Merge]-1, xxx"]

Launch build/test pipelines. All previously running jobs will be killed.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests. Will also run L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-[Post-Merge]-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-[Post-Merge]-1, xxx".

For guidance on mapping tests to stage names, see docs/source/reference/ci-overview.md.

kill

kill

Kill all running builds associated with pull request.

skip

skip --comment COMMENT

Skip testing for latest commit on pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.
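As a reading aid, the command grammar described in the help text above can be modeled with argparse. This is only a sketch: the function names `build_bot_parser` and `parse_bot_comment` are hypothetical, the flags are copied from the help text, and nothing here reflects how the real bot is implemented.

```python
import argparse
import shlex


def build_bot_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the /bot help text (not the real bot)."""
    parser = argparse.ArgumentParser(prog="/bot")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run", help="Launch build/test pipelines")
    # Boolean switches listed under `run` in the help text.
    for flag in ("--disable-fail-fast", "--skip-test", "--add-multi-gpu-test",
                 "--only-multi-gpu-test", "--disable-multi-gpu-test",
                 "--post-merge"):
        run.add_argument(flag, action="store_true")
    # Options that take a quoted, comma-separated value.
    for opt in ("--stage-list", "--gpu-type", "--extra-stage"):
        run.add_argument(opt, default=None)

    sub.add_parser("kill", help="Kill all running builds for the PR")

    skip = sub.add_parser("skip", help="Skip testing for the latest commit")
    skip.add_argument("--comment", required=True)  # help text says required

    sub.add_parser("reuse-pipeline", help="Reuse a previous pipeline")
    return parser


def parse_bot_comment(comment: str) -> argparse.Namespace:
    """Split a PR comment shell-style and parse it as a /bot command."""
    tokens = shlex.split(comment)
    if not tokens or tokens[0] != "/bot":
        raise ValueError("not a /bot command")
    return build_bot_parser().parse_args(tokens[1:])
```

For example, `parse_bot_comment('/bot run --gpu-type "A30, H100_PCIe"')` yields a namespace whose `command` is `run` and whose `gpu_type` is the quoted string.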

Summary by CodeRabbit

  • New Features

    • Advanced LLM quickstart: new CLI flag to optionally apply chat-template formatting to prompts.
    • New HunYuan MoE causal LM model exposed for Torch users.
    • Tokenizer: ability to retrieve chat templates.
    • RoPE: dynamic scaling mode with configurable alpha and added QK-normalization support for attention.
  • Bug Fixes

    • Safer MLA detection before enabling flash MLA.
    • Clearer tokenizer loading warnings with exception details.
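
The RoPE item in the summary above mentions a dynamic scaling mode with a configurable alpha. The PR's code is not shown in this conversation, so the following is a minimal sketch under one assumption: that alpha rescales the rotary base in the NTK-aware style, `base * alpha ** (d / (d - 2))`. The function name `rope_inv_freq` and its defaults are hypothetical.

```python
import math
from typing import List


def rope_inv_freq(head_dim: int, base: float = 10000.0,
                  alpha: float = 1.0) -> List[float]:
    """Inverse RoPE frequencies with an assumed NTK-style alpha scaling.

    alpha == 1.0 gives standard RoPE. Larger alpha enlarges the base,
    which lowers the slow frequencies and stretches the usable context.
    """
    if head_dim <= 2 or head_dim % 2 != 0:
        raise ValueError("head_dim must be an even number greater than 2")
    # Assumption: alpha is applied to the base, not to the positions.
    scaled_base = base * alpha ** (head_dim / (head_dim - 2))
    return [scaled_base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]


def rope_angles(position: int, inv_freq: List[float]) -> List[float]:
    """Rotation angle for each frequency pair at a given token position."""
    return [position * f for f in inv_freq]
```

With this exponent, the highest frequency stays at 1.0 and the lowest frequency is divided by exactly alpha, so short-range position information is preserved while the longest wavelength grows by a factor of alpha.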

@qianbiaoxiang qianbiaoxiang requested review from a team as code owners June 26, 2025 14:46
@juney-nvidia juney-nvidia added the "Community want to contribute" (PRs initiated from Community) and "Community Engagement" (help/insights needed from community) labels Jun 26, 2025
@wm2012011492 wm2012011492 force-pushed the support_hunyuan_moe branch from 99a91f8 to f81cd35 Compare June 30, 2025 02:47
@wm2012011492
Collaborator

/bot run

@tensorrt-cicd
Collaborator

PR_Github #10260 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #10260 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #7581 completed with status: 'FAILURE'

@sorenwu sorenwu requested review from a team as code owners August 12, 2025 03:03
@byshiue byshiue requested review from a team and Wanli-Jiang and removed request for yechank-nvidia and a team August 13, 2025 02:58
@byshiue
Collaborator

byshiue commented Aug 13, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #15060 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #15060 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #11371 completed with status: 'FAILURE'

@byshiue
Collaborator

byshiue commented Aug 13, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #15089 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #15089 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #11392 completed with status: 'FAILURE'

@byshiue
Collaborator

byshiue commented Aug 13, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #15102 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #15102 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #11402 completed with status: 'FAILURE'

@byshiue
Collaborator

byshiue commented Aug 13, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #15193 [ run ] triggered by Bot

@byshiue
Collaborator

byshiue commented Aug 14, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #15298 [ run ] triggered by Bot

Collaborator

@QiJune QiJune left a comment

LGTM

@tensorrt-cicd
Collaborator

PR_Github #15298 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #11550 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

@byshiue byshiue merged commit 5c2f0fd into NVIDIA:main Aug 14, 2025
5 checks passed
@byshiue
Collaborator

byshiue commented Aug 14, 2025

There are too many conversations because of a wrong commit at the beginning. So I bypassed the merge after checking the comments we gave, and ignored the bot's comments.

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Aug 17, 2025
Signed-off-by: sorenwu <[email protected]>
Co-authored-by: sorenwu <[email protected]>
Co-authored-by: bhsueh_NV <[email protected]>
Signed-off-by: Wangshanshan <[email protected]>
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Aug 18, 2025
Signed-off-by: sorenwu <[email protected]>
Co-authored-by: sorenwu <[email protected]>
Co-authored-by: bhsueh_NV <[email protected]>
Signed-off-by: Wangshanshan <[email protected]>
Labels
Community Engagement (help/insights needed from community), Community want to contribute (PRs initiated from Community)
8 participants