[BREAKING CHANGE]: change default backend to PyTorch in trtllm-serve #5717
Conversation
Pull Request Overview
This PR updates the default serving backend to PyTorch across the codebase and aligns all tests and CLI flags to always specify the backend explicitly.
- Switch test fixtures to use "tensorrt" instead of None/"trt" and always include a --backend argument
- Remove redundant conditional backend handling in tests, streamlining argument lists
- Update serve.py so that PyTorch is the default backend in both function defaults and the CLI, and extend the choice options
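The change described in the last bullet can be sketched as follows. This is an illustrative reconstruction only, using argparse rather than whatever CLI framework serve.py actually uses; the function and flag names are assumptions, not the merged code.

```python
import argparse

# Hypothetical sketch of aligning the CLI default and the function default
# on "pytorch", with an extended choice list. Not the actual trtllm-serve code.
SUPPORTED_BACKENDS = ("pytorch", "trt", "tensorrt")

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="trtllm-serve")
    # Default is now "pytorch" instead of None; callers can still request
    # the TensorRT path explicitly via --backend trt / --backend tensorrt.
    parser.add_argument(
        "--backend",
        choices=SUPPORTED_BACKENDS,
        default="pytorch",
        help="Serving backend. Default: pytorch.",
    )
    return parser

def serve(backend: str = "pytorch") -> str:
    # Function-level default mirrors the CLI default so both entry points agree.
    return f"serving with {backend}"
```

Keeping the two defaults in one place (or at least identical) avoids the drift the reviewer flags below, where help text and behavior disagree.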
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unittest/llmapi/apps/_test_openai_reasoning.py | Fixture params updated to ["tensorrt", "pytorch"] |
| tests/unittest/llmapi/apps/_test_openai_multi_nodes.py | Expanded args list and always add --backend |
| tests/unittest/llmapi/apps/_test_openai_multi_gpu.py | Unified backend handling, removed conditional branch |
| tests/unittest/llmapi/apps/_test_openai_misc.py | Simplified arg building, always include --backend |
| tests/unittest/llmapi/apps/_test_openai_metrics.py | Removed redundant backend="pytorch" in client setup |
| tests/unittest/llmapi/apps/_test_openai_completions.py | Conditionally add --max_beam_width only for TRT |
| tests/unittest/llmapi/apps/_test_openai_chat.py | Same as above for chat tests |
| tensorrt_llm/commands/serve.py | Default backend set to "pytorch", CLI updated |
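The argument-building pattern used by the completion and chat tests can be sketched like this. The helper name and flag spellings are assumptions for illustration; only the shape of the logic (always pass --backend, gate --max_beam_width on the TRT backend) comes from the table above.

```python
from typing import List

def build_server_args(backend: str, max_beam_width: int = 4) -> List[str]:
    # The backend is now always explicit instead of relying on a None default.
    args = ["--backend", backend]
    if backend == "tensorrt":
        # Beam-width configuration only applies to the TensorRT path,
        # so it is added conditionally rather than unconditionally.
        args += ["--max_beam_width", str(max_beam_width)]
    return args
```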
Comments suppressed due to low confidence (1)
tensorrt_llm/commands/serve.py:170
- The help text still refers to a "cpp path" default, but the default has been changed to "pytorch". Please update the help string to reflect the new default backend.
`help="Set to 'pytorch' for pytorch path. Default is cpp path.")`
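One possible rewording for the stale help string, shown side by side with the current one. The new wording is a suggestion only, not the text that was merged.

```python
# Current help text, which still claims the cpp path is the default:
BACKEND_HELP_OLD = "Set to 'pytorch' for pytorch path. Default is cpp path."

# Suggested replacement reflecting the new default (illustrative wording):
BACKEND_HELP_NEW = ("Set to 'trt' for the TensorRT path. "
                    "Default is the PyTorch path.")
```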
Force-pushed from 190c79a to a7a51b5
/bot run
PR_Github #11456 [ run ] triggered by Bot
PR_Github #11456 [ run ] completed with state
@LinPoly Please take care of this PR and make sure it gets merged on time, thanks!
Will run again after a code format fix; the CI failure seems irrelevant since it is not about
Force-pushed from a7a51b5 to 59f4d41
/bot run
PR_Github #11521 [ run ] triggered by Bot
PR_Github #11521 [ run ] completed with state
/bot run
PR_Github #11550 [ run ] triggered by Bot
PR_Github #11550 [ run ] completed with state
Force-pushed from 7698199 to a09e526
/bot run
PR_Github #11778 [ run ] triggered by Bot
PR_Github #11778 [ run ] completed with state
Signed-off-by: Pengyun Lin <[email protected]>
Force-pushed from a09e526 to e5b5ed4
/bot run
Walkthrough

The changes update backend handling across CLI, configuration, and test code. The default backend is now explicitly set to "pytorch" instead of None, and support for the "trt" backend is added throughout. Configuration files and test fixtures are updated for consistency, and redundant installation steps are removed from some tests.
Sequence Diagram(s)

    sequenceDiagram
        participant User
        participant CLI
        participant Config
        participant Server
        User->>CLI: Run serve command (with or without --backend)
        CLI->>Config: Parse backend (default "pytorch" if not specified)
        Config->>Server: Start server with backend ("pytorch" or "trt")
        Server-->>User: Serve model using selected backend
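The flow in the diagram can be condensed into two small functions. This is a minimal sketch of the control flow only; the function names and the return values are illustrative, not taken from the repository.

```python
def parse_backend(cli_value=None):
    # CLI -> Config: fall back to the new "pytorch" default when the user
    # does not pass --backend explicitly.
    return cli_value if cli_value is not None else "pytorch"

def start_server(backend):
    # Config -> Server: dispatch on the selected backend, rejecting
    # anything outside the supported set.
    if backend not in ("pytorch", "trt"):
        raise ValueError(f"unsupported backend: {backend}")
    return f"serving with {backend} backend"
```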
Estimated code review effort: 3 (90–240 minutes)
PR_Github #12423 [ run ] triggered by Bot
PR_Github #12423 [ run ] completed with state
…VIDIA#5717) Signed-off-by: Pengyun Lin <[email protected]>
…VIDIA#5717) Signed-off-by: Pengyun Lin <[email protected]> Signed-off-by: Shreyas Misra <[email protected]>
…VIDIA#5717) Signed-off-by: Pengyun Lin <[email protected]> Signed-off-by: Ransiki Zhang <[email protected]>
Description
Change the default backend to PyTorch in trtllm-serve.
Test Coverage
The existing tests are updated to exercise the new default.
GitHub Bot Help
/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provide a user-friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message. See details below for each supported subcommand.

run [--disable-fail-fast --skip-test --stage-list "A10-1, xxx" --gpu-type "A30, H100_PCIe" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-[Post-Merge]-1, xxx"]

Launch build/test pipelines. All previously running jobs will be killed.

- --disable-fail-fast (OPTIONAL): Disable fail fast on build/tests/infra failures.
- --skip-test (OPTIONAL): Skip all test stages, but still run build stages, package stages and sanity check stages. Note: does NOT update GitHub check status.
- --stage-list "A10-1, xxx" (OPTIONAL): Only run the specified test stages. Examples: "A10-1, xxx". Note: does NOT update GitHub check status.
- --gpu-type "A30, H100_PCIe" (OPTIONAL): Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: does NOT update GitHub check status.
- --only-multi-gpu-test (OPTIONAL): Only run the multi-GPU tests. Note: does NOT update GitHub check status.
- --disable-multi-gpu-test (OPTIONAL): Disable the multi-GPU tests. Note: does NOT update GitHub check status.
- --add-multi-gpu-test (OPTIONAL): Force run the multi-GPU tests. Will also run the L0 pre-merge pipeline.
- --post-merge (OPTIONAL): Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.
- --extra-stage "H100_PCIe-[Post-Merge]-1, xxx" (OPTIONAL): Run the ordinary L0 pre-merge pipeline and the specified test stages. Examples: --extra-stage "H100_PCIe-[Post-Merge]-1, xxx".

For guidance on mapping tests to stage names, see docs/source/reference/ci-overview.md.

kill

Kill all running builds associated with the pull request.

skip --comment COMMENT

Skip testing for the latest commit on the pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline

Reuse a previous pipeline to validate the current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.
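The subcommand grammar documented above can be sketched with a small argparse parser. This is an illustrative reimplementation of the documented comment interface only; the real bot is a Jenkins integration, and none of this code is taken from it.

```python
import argparse

def build_bot_parser() -> argparse.ArgumentParser:
    # Mirrors the /bot help text above: four subcommands, with the
    # documented optional flags on "run" and a required --comment on "skip".
    parser = argparse.ArgumentParser(prog="/bot")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run", help="Launch build/test pipelines.")
    run.add_argument("--disable-fail-fast", action="store_true")
    run.add_argument("--skip-test", action="store_true")
    run.add_argument("--stage-list")
    run.add_argument("--gpu-type")
    run.add_argument("--add-multi-gpu-test", action="store_true")
    run.add_argument("--only-multi-gpu-test", action="store_true")
    run.add_argument("--disable-multi-gpu-test", action="store_true")
    run.add_argument("--post-merge", action="store_true")
    run.add_argument("--extra-stage")

    sub.add_parser("kill", help="Kill all running builds for the pull request.")

    skip = sub.add_parser("skip", help="Skip testing for the latest commit.")
    skip.add_argument("--comment", required=True)

    sub.add_parser("reuse-pipeline", help="Reuse a previous pipeline.")
    return parser
```

For example, the comment `/bot run --stage-list "A10-1"` corresponds to parsing `["run", "--stage-list", "A10-1"]`.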