dcampora
Collaborator

@dcampora dcampora commented Jun 18, 2025

Support disaggregated serving in TRTLLM Sampler

This PR adds disaggregated serving support to the TRTLLM Sampler.

  • It fixes the previously incorrect conversion from FinishedState to finish reason in the TRTLLM Sampler.
  • It includes a test that checks overlap scheduling with disaggregated serving.
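
For readers unfamiliar with the distinction, the following is a minimal sketch of what a FinishedState to finish reason conversion can look like. It is not the code from this PR: the enum names, the bit values, and the helper `to_finish_reason` are all illustrative assumptions. The point it shows is that a bit-flag state cannot simply be cast to a sequential reason code, because the numeric values of the two types need not line up.

```python
from enum import Enum, IntFlag


class FinishedState(IntFlag):
    """Hypothetical bit-flag state; the values are assumptions for illustration."""
    NOT_FINISHED = 0
    FINISHED_EOS = 1 << 0
    FINISHED_STOP_WORDS = 1 << 1
    FINISHED_MAX_LENGTH = 1 << 2


class FinishReason(Enum):
    """Hypothetical sequential reason codes; also assumptions."""
    NOT_FINISHED = 0
    END_ID = 1
    STOP_WORDS = 2
    LENGTH = 3


def to_finish_reason(state: FinishedState) -> FinishReason:
    """Map a bit-flag state to a reason by testing each flag explicitly."""
    if state & FinishedState.FINISHED_EOS:
        return FinishReason.END_ID
    if state & FinishedState.FINISHED_STOP_WORDS:
        return FinishReason.STOP_WORDS
    if state & FinishedState.FINISHED_MAX_LENGTH:
        return FinishReason.LENGTH
    return FinishReason.NOT_FINISHED


# The max-length flag has raw value 4, which is not a valid sequential code
# in this sketch, so the flag has to be decoded rather than cast.
max_len_reason = to_finish_reason(FinishedState.FINISHED_MAX_LENGTH)
```

In this sketch the flags are checked in a fixed priority order, so a state with several bits set resolves to a single reason; whether the real sampler needs such a priority is not stated in this conversation.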

@dcampora dcampora requested a review from a team as a code owner June 18, 2025 10:59
@dcampora
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #9374 [ run ] triggered by Bot

@dcampora
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #9414 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #9374 [ run ] completed with state ABORTED

@chuangz0 chuangz0 requested a review from Shixiaowei02 June 19, 2025 01:52
@tensorrt-cicd
Collaborator

PR_Github #9414 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #6907 completed with status: 'FAILURE'

@dcampora
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #9491 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #9491 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #6965 completed with status: 'SUCCESS'

@Funatiq Funatiq requested a review from Copilot June 20, 2025 07:06
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds support for disaggregated serving in the TRTLLM Sampler while addressing a bug in the conversion from FinishedState to finish reason. Key changes include:

  • Adding a new test configuration and test case for "trtllm_sampler" alongside the existing "overlap" configuration.
  • Refactoring the finish reason conversion in the sampler to use the FinishedState abstraction.
  • Updating the resource management in the py_executor to invoke seq_slot_manager during disaggregated generation initialization.
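
As a rough illustration of the last point, the sketch below shows why a request entering generation in a disaggregated setup needs a sequence slot prepared before its first step. It is a toy model under stated assumptions: `Request`, `ToySeqSlotManager`, and the method names `prepare_resources` and `free_resources` are hypothetical and are not taken from the files changed in this PR.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Request:
    """Hypothetical request record used only for this sketch."""
    request_id: int
    seq_slot: Optional[int] = None


class ToySeqSlotManager:
    """Toy slot manager: hands out one slot index per active request."""

    def __init__(self, max_num_sequences: int) -> None:
        # Reverse order so that pop() hands out slot 0 first.
        self._free: List[int] = list(range(max_num_sequences - 1, -1, -1))
        self._slots: Dict[int, int] = {}

    def prepare_resources(self, requests: List[Request]) -> None:
        """Assign a slot to every request that does not have one yet."""
        for req in requests:
            if req.request_id not in self._slots:
                if not self._free:
                    raise RuntimeError("no free sequence slots")
                self._slots[req.request_id] = self._free.pop()
            req.seq_slot = self._slots[req.request_id]

    def free_resources(self, req: Request) -> None:
        """Return the slot of a finished request to the free list."""
        slot = self._slots.pop(req.request_id, None)
        if slot is not None:
            self._free.append(slot)
            req.seq_slot = None


# Requests arriving from a context server get slots before generation starts.
manager = ToySeqSlotManager(max_num_sequences=2)
incoming = [Request(request_id=10), Request(request_id=11)]
manager.prepare_resources(incoming)
```

Calling `prepare_resources` again for the same requests is a no-op in this sketch, which makes it safe to invoke both at generation initialization and on later scheduling steps.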

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tests/integration/defs/disaggregated/test_disaggregated.py | Updated test conditions to include the new trtllm_sampler, with new YAML config. |
| tests/integration/defs/disaggregated/test_configs/disagg_config_trtllm_sampler.yaml | Added a new configuration file for TRTLLM Sampler tests. |
| tensorrt_llm/_torch/pyexecutor/seq_slot_manager.py | Adjusted request slot allocation logic for disaggregated generation. |
| tensorrt_llm/_torch/pyexecutor/sampler.py | Updated the finish reason conversion logic using FinishedState. |
| tensorrt_llm/_torch/pyexecutor/py_executor.py | Added resource preparation for the seq_slot_manager. |
| tensorrt_llm/_torch/pyexecutor/finish_reason.py | Introduced the FinishedState class to better encapsulate finish reason logic. |

@dcampora
Collaborator Author

/bot run

@dcampora dcampora enabled auto-merge (squash) June 20, 2025 16:52
@tensorrt-cicd
Collaborator

PR_Github #9560 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #9560 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7019 completed with status: 'FAILURE'

@dcampora
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #9577 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #9577 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7033 completed with status: 'FAILURE'

@dcampora
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #9599 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #9599 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7053 completed with status: 'FAILURE'

@dcampora dcampora force-pushed the user/dcampora/support_ds_in_trtllm_sampler branch from 6418b03 to b78af95 Compare June 24, 2025 08:16
@tensorrt-cicd
Collaborator

PR_Github #9663 [ run ] triggered by Bot

@dcampora
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #9707 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #9663 [ run ] completed with state ABORTED
/LLM/main/L0_MergeRequest_PR pipeline #7105 completed with status: 'FAILURE'

@tensorrt-cicd
Collaborator

PR_Github #9707 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7142 completed with status: 'FAILURE'

@dcampora dcampora force-pushed the user/dcampora/support_ds_in_trtllm_sampler branch from 283db74 to c37bbad Compare June 24, 2025 19:58
@dcampora
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #9749 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #9749 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7183 completed with status: 'FAILURE'

dcampora added 6 commits June 25, 2025 06:39
Signed-off-by: Daniel Campora <[email protected]>
Signed-off-by: Daniel Campora <[email protected]>
Signed-off-by: Daniel Campora <[email protected]>
Signed-off-by: Daniel Campora <[email protected]>
@dcampora dcampora force-pushed the user/dcampora/support_ds_in_trtllm_sampler branch from c37bbad to 7cea569 Compare June 25, 2025 06:39
@dcampora
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #9825 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #9825 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7250 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

@dcampora dcampora merged commit 205c97a into NVIDIA:main Jun 25, 2025
3 checks passed
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 9, 2025
…VIDIA#5328)

Signed-off-by: Daniel Campora <[email protected]>
Signed-off-by: Daniel Cámpora <[email protected]>
Co-authored-by: Copilot <[email protected]>
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 10, 2025
…VIDIA#5328)

Signed-off-by: Daniel Campora <[email protected]>
Signed-off-by: Daniel Cámpora <[email protected]>
Co-authored-by: Copilot <[email protected]>
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 11, 2025
…VIDIA#5328)

Signed-off-by: Daniel Campora <[email protected]>
Signed-off-by: Daniel Cámpora <[email protected]>
Co-authored-by: Copilot <[email protected]>