[TRTLLM-5974][feat] Support disaggregated serving in TRTLLM Sampler #5328

dcampora · 2025-06-18T10:59:30Z

Support disaggregated serving in TRTLLM Sampler

This PR brings disaggregated serving to the TRTLLM Sampler.

It also fixes the conversion from FinishedState to finish reason in the TRTLLM Sampler, which was previously incorrect.
A test that checks overlap scheduling with disaggregated serving is included.
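
For readers unfamiliar with this kind of conversion, here is a minimal sketch of mapping a bit-flag finished state to a single finish reason. It is not the code from this PR: the names `FinishedFlag`, `FinishReason` and `to_finish_reason`, the bit layout, and the precedence order are all assumptions made for illustration. The point it shows is that a state stored as a bitmask has to be tested flag by flag, since treating the raw integer as an ordinal gives wrong answers once more than one bit is set.

```python
from enum import Enum, IntFlag


class FinishReason(Enum):
    """Hypothetical user-facing finish reasons (illustrative only)."""
    NOT_FINISHED = "not_finished"
    END_ID = "end_id"
    STOP_WORDS = "stop_words"
    LENGTH = "length"


class FinishedFlag(IntFlag):
    """Hypothetical bit layout for a per-sequence finished state."""
    EOS = 1 << 0
    STOP_WORDS = 1 << 1
    MAX_LENGTH = 1 << 2


def to_finish_reason(state: int) -> FinishReason:
    """Map a bitmask state to one finish reason.

    Flags are checked individually; the precedence (EOS, then stop
    words, then max length) is an assumption of this sketch.
    """
    if state & FinishedFlag.EOS:
        return FinishReason.END_ID
    if state & FinishedFlag.STOP_WORDS:
        return FinishReason.STOP_WORDS
    if state & FinishedFlag.MAX_LENGTH:
        return FinishReason.LENGTH
    return FinishReason.NOT_FINISHED


if __name__ == "__main__":
    for raw in (0, 1, 2, 4, 5):
        print(raw, to_finish_reason(raw).value)
```

In this sketch a state with both the EOS and max-length bits set resolves to `END_ID`, which a direct integer-to-enum lookup could not express.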

dcampora · 2025-06-18T11:03:36Z

/bot run

tensorrt-cicd · 2025-06-18T11:08:47Z

PR_Github #9374 [ run ] triggered by Bot

dcampora · 2025-06-18T20:45:45Z

/bot run

tensorrt-cicd · 2025-06-18T20:53:48Z

PR_Github #9414 [ run ] triggered by Bot

tensorrt-cicd · 2025-06-18T20:53:49Z

PR_Github #9374 [ run ] completed with state ABORTED

tensorrt-cicd · 2025-06-19T02:38:04Z

PR_Github #9414 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #6907 completed with status: 'FAILURE'

dcampora · 2025-06-19T09:36:20Z

/bot run

tensorrt-cicd · 2025-06-19T09:47:48Z

PR_Github #9491 [ run ] triggered by Bot

tensorrt-cicd · 2025-06-19T12:46:03Z

PR_Github #9491 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #6965 completed with status: 'SUCCESS'

Copilot

Pull Request Overview

This PR adds support for disaggregated serving in the TRTLLM Sampler while addressing a bug in the conversion from FinishedState to finish reason. Key changes include:

Adding a new test configuration and test case for "trtllm_sampler" alongside the existing "overlap" configuration.
Refactoring the finish reason conversion in the sampler to use the FinishedState abstraction.
Updating the resource management in the py_executor to invoke seq_slot_manager during disaggregated generation initialization.

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

File	Description
tests/integration/defs/disaggregated/test_disaggregated.py	Updated test conditions to include the new trtllm_sampler, with new YAML config.
tests/integration/defs/disaggregated/test_configs/disagg_config_trtllm_sampler.yaml	Added a new configuration file for TRTLLM Sampler tests.
tensorrt_llm/_torch/pyexecutor/seq_slot_manager.py	Adjusted request slot allocation logic for disaggregated generation.
tensorrt_llm/_torch/pyexecutor/sampler.py	Updated the finish reason conversion logic using FinishedState.
tensorrt_llm/_torch/pyexecutor/py_executor.py	Added resource preparation for the seq_slot_manager.
tensorrt_llm/_torch/pyexecutor/finish_reason.py	Introduced the FinishedState class to better encapsulate finish reason logic.

dcampora · 2025-06-20T16:52:36Z

/bot run

tensorrt-cicd · 2025-06-20T16:57:58Z

PR_Github #9560 [ run ] triggered by Bot

tensorrt-cicd · 2025-06-20T19:11:21Z

PR_Github #9560 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7019 completed with status: 'FAILURE'

dcampora · 2025-06-21T17:59:38Z

/bot run

tensorrt-cicd · 2025-06-21T18:05:27Z

PR_Github #9577 [ run ] triggered by Bot

tensorrt-cicd · 2025-06-21T20:15:56Z

PR_Github #9577 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7033 completed with status: 'FAILURE'

dcampora · 2025-06-23T09:27:32Z

/bot run

tensorrt-cicd · 2025-06-23T09:32:46Z

PR_Github #9599 [ run ] triggered by Bot

tensorrt-cicd · 2025-06-23T10:01:38Z

PR_Github #9599 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7053 completed with status: 'FAILURE'

tensorrt-cicd · 2025-06-24T08:31:34Z

PR_Github #9663 [ run ] triggered by Bot

dcampora · 2025-06-24T11:56:26Z

/bot run

tensorrt-cicd · 2025-06-24T12:02:53Z

PR_Github #9707 [ run ] triggered by Bot

tensorrt-cicd · 2025-06-24T12:02:56Z

PR_Github #9663 [ run ] completed with state ABORTED
/LLM/main/L0_MergeRequest_PR pipeline #7105 completed with status: 'FAILURE'

tensorrt-cicd · 2025-06-24T13:35:51Z

PR_Github #9707 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7142 completed with status: 'FAILURE'

dcampora · 2025-06-24T19:58:14Z

/bot run

tensorrt-cicd · 2025-06-24T20:03:52Z

PR_Github #9749 [ run ] triggered by Bot

tensorrt-cicd · 2025-06-24T22:01:17Z

PR_Github #9749 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7183 completed with status: 'FAILURE'

Signed-off-by: Daniel Campora <[email protected]>

dcampora · 2025-06-25T06:39:25Z

/bot run

tensorrt-cicd · 2025-06-25T06:44:28Z

PR_Github #9825 [ run ] triggered by Bot

tensorrt-cicd · 2025-06-25T13:15:54Z

PR_Github #9825 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7250 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

…VIDIA#5328) Signed-off-by: Daniel Campora <[email protected]> Signed-off-by: Daniel Cámpora <[email protected]> Co-authored-by: Copilot <[email protected]>

dcampora requested a review from a team as a code owner June 18, 2025 10:59

dcampora requested review from Naveassaf, Funatiq, QiJune and netanel-haber June 18, 2025 10:59

chuangz0 requested a review from Shixiaowei02 June 19, 2025 01:52

Funatiq requested a review from Copilot June 20, 2025 07:06

Copilot AI reviewed Jun 20, 2025

tests/integration/defs/disaggregated/test_disaggregated.py

tests/integration/defs/disaggregated/test_disaggregated.py

tensorrt_llm/_torch/pyexecutor/seq_slot_manager.py (outdated)

dcampora enabled auto-merge (squash) June 20, 2025 16:52

netanel-haber approved these changes Jun 20, 2025

netanel-haber reviewed Jun 20, 2025

tensorrt_llm/_torch/pyexecutor/finish_reason.py

dcampora force-pushed the user/dcampora/support_ds_in_trtllm_sampler branch from 6418b03 to b78af95 Compare June 24, 2025 08:16

Funatiq reviewed Jun 24, 2025

tensorrt_llm/_torch/pyexecutor/sampler.py

dcampora force-pushed the user/dcampora/support_ds_in_trtllm_sampler branch from 283db74 to c37bbad Compare June 24, 2025 19:58

dcampora added 6 commits June 25, 2025 06:39

Added support for DS on TRTLLM Sampler.

45dc34b

Signed-off-by: Daniel Campora <[email protected]>

Formatting.

523de97

Signed-off-by: Daniel Campora <[email protected]>

Added missing file.

0c02df2

Signed-off-by: Daniel Campora <[email protected]>

Remove prints.

634b8c7

Signed-off-by: Daniel Campora <[email protected]>

Look for one more token in is_overlap.

6396845

Signed-off-by: Daniel Campora <[email protected]>

Fix prepare_resources key.

7cea569

Signed-off-by: Daniel Campora <[email protected]>

dcampora force-pushed the user/dcampora/support_ds_in_trtllm_sampler branch from c37bbad to 7cea569 Compare June 25, 2025 06:39

MartinMarciniszyn approved these changes Jun 25, 2025

dcampora merged commit 205c97a into NVIDIA:main Jun 25, 2025
3 checks passed
