[Core][AMD] Propagate shutdown timeout to MultiprocExecutor #43154
Code Review
This pull request implements a configurable shutdown timeout for the V1 engine and multiprocess executor. It adds a shutdown_timeout attribute to BackgroundResources and updates the MultiprocExecutor to use this value, ensuring a minimum grace period during worker termination. A review comment correctly identified a potential TypeError in multiproc_executor.py that could occur if the timeout configuration is None, suggesting a default value to prevent the crash.
a947bd5 to d895571
/gemini review
Code Review
This pull request introduces a configurable shutdown timeout for the MultiprocExecutor in the V1 engine. Changes include adding a shutdown_timeout field to BackgroundResources, passing this value to the engine manager during shutdown, and updating MultiprocExecutor to use the configured timeout with a 4-second minimum. Unit tests were added to verify worker termination behavior. Feedback points out a potential TypeError in MultiprocExecutor if the shutdown_timeout is None and provides a suggestion to handle this case safely.
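The TypeError noted in the review can be illustrated with a small sketch. In Python 3, an expression such as `max(timeout, 4)` raises `TypeError` when `timeout` is `None`, because `None` cannot be compared with a number. The helper name `effective_shutdown_timeout` and the constant `MIN_GRACE_SECONDS` are hypothetical and are not taken from the PR; only the 4-second minimum comes from the review text.

```python
from typing import Optional

# Assumption: 4 seconds, matching the minimum mentioned in the review.
MIN_GRACE_SECONDS = 4.0


def effective_shutdown_timeout(configured: Optional[float]) -> float:
    """Return a grace period that is never None and never below the minimum.

    Hypothetical helper, not the PR's actual code.
    """
    if configured is None:
        # No timeout configured: fall back to the minimum grace period
        # instead of passing None into max().
        return MIN_GRACE_SECONDS
    return max(float(configured), MIN_GRACE_SECONDS)
```

With this guard, an unset timeout and a timeout below the minimum both resolve to 4 seconds, while larger values pass through unchanged.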
1a048aa to dbb1bf8
Hi @rjrock, the pre-commit checks have failed. Please run:
    uv pip install pre-commit>=4.5.1
    pre-commit install
    pre-commit run --all-files
Then, commit the changes and push to your branch.
dbb1bf8 to eaf54b2
cc @njhill PTAL
dllehr-amd left a comment
Can you take a quick peek at my note? I'm trying to confirm that we won't negatively impact the default operation mode if we don't set the time ourselves.
rocprofv3 requires a grace period during process shutdown in order to emit trace data. Signed-off-by: Ryan Rock <ryan.rock@amd.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Ryan Rock <ryan.rock@amd.com>
This reverts commit c20b9a8. Signed-off-by: Ryan Rock <ryan.rock@amd.com>
eaf54b2 to 0a59310
Added a |
Thanks @rjrock. The So I'm not sure we should use that value here. By the time we are shutting down the executor we are in tear-down mode and the 4 second timeout is just to allow the resources to be released/process to exit cleanly. Perhaps for this purpose it would be better to just add a new |
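The tear-down grace period described in this comment follows a common pattern: ask the processes to exit, wait a bounded time for them to release resources, then force-kill whatever remains. The sketch below shows that generic pattern with the standard-library `subprocess` module. It is not vLLM's `_ensure_worker_termination`; the function name `terminate_with_grace` and the polling interval are assumptions made for illustration.

```python
import subprocess
import time


def terminate_with_grace(procs, grace_seconds=4.0):
    """Terminate subprocess.Popen objects, allowing a bounded grace period.

    Returns True if every process exited within the grace period,
    False if at least one had to be force-killed.
    Hypothetical helper illustrating the pattern only.
    """
    # Only signal processes that are still running.
    active = [p for p in procs if p.poll() is None]
    for p in active:
        p.terminate()  # SIGTERM on POSIX

    # Poll until everything has exited or the grace period is over.
    deadline = time.monotonic() + grace_seconds
    while True:
        if all(p.poll() is not None for p in active):
            return True
        if time.monotonic() >= deadline:
            break
        time.sleep(0.05)

    # Grace period expired: force-kill the stragglers and reap them.
    for p in active:
        if p.poll() is None:
            p.kill()
            p.wait()
    return False
```

A longer `grace_seconds` gives a process that does work on exit, such as writing trace data, more time to finish before it is killed.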
This pull request has merge conflicts that must be resolved before it can be merged.
That makes sense. I rewrote it to use the env var |
Co-authored-by: Nick Hill <nickhill123@gmail.com> Signed-off-by: Ryan Rock <ryan.rock@amd.com>
Hi @rjrock, the pre-commit checks have failed. Please run:
    uv pip install pre-commit>=4.5.1
    pre-commit install
    pre-commit run --all-files
Then, commit the changes and push to your branch.
dllehr-amd left a comment
Thanks @rjrock! approving as well!
Co-authored-by: Claude
Signed-off-by: Nicholas Edelman <nedelman@nvidia.com>
[Core][AMD] Propagate shutdown timeout to MultiprocExecutor (vllm-project#43154)
Signed-off-by: Ryan Rock <ryan.rock@amd.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
[Refactor] Deprecate ResponsesParser wrapper, inline parsing into ParsableContext (vllm-project#45431)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
[ROCm] Bump Torch to 2.11 (vllm-project#45362)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
[Attention] Improve attention benchmarks: configs and profiling (vllm-project#39336)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Purpose
rocprofv3 requires a grace period during process shutdown in order to emit trace data. This PR adds the environment variable VLLM_WORKER_SHUTDOWN_TIMEOUT_SECONDS, which sets a shutdown grace period for the worker processes of MultiprocExecutor. The env var is also passed to the engine manager shutdown.

Previously, running a command like the one below would fail.

    rocprofv3 \
      --disable-signal-handlers \
      --output-format pftrace \
      -r -- \
      vllm \
      bench throughput \
      --shutdown-timeout 60 \
      --model Qwen/Qwen3-32B \
      --num-prompts=1 \
      --tensor-parallel-size 2

Similarly, any rocprofv3 trace command that took longer than the 4-second shutdown period in multiproc_executor.py::_ensure_worker_termination would fail.

With this change merged, a successful run would look like the one below.

    export VLLM_WORKER_SHUTDOWN_TIMEOUT_SECONDS=120
    rocprofv3 \
      --disable-signal-handlers \
      --output-format pftrace \
      -r -- \
      vllm \
      bench throughput \
      --shutdown-timeout 60 \
      --model Qwen/Qwen3-32B \
      --num-prompts=1 \
      --tensor-parallel-size 2

Test Plan

    pytest tests/v1/executor/test_executor.py::test_multiproc_executor_worker_termination_timeout
    pytest -s -v tests/v1/engine/test_core_engine_actor_manager.py::test_background_resources_passes_worker_shutdown_timeout

Test Result
Success
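For readers who want to see how an environment variable like the one described under Purpose might be turned into a grace period, here is a minimal sketch. Only the variable name VLLM_WORKER_SHUTDOWN_TIMEOUT_SECONDS comes from the PR description. The helper name `read_worker_shutdown_timeout`, the 4-second default, and the choice to floor the value at that default are assumptions for illustration and may not match what vLLM does.

```python
import math
import os

# Name taken from the PR description.
ENV_NAME = "VLLM_WORKER_SHUTDOWN_TIMEOUT_SECONDS"
# Assumption: default mirrors the previous hard-coded 4-second period.
DEFAULT_GRACE_SECONDS = 4.0


def read_worker_shutdown_timeout(environ=os.environ) -> float:
    """Parse the grace period from the environment (hypothetical helper)."""
    raw = environ.get(ENV_NAME)
    if raw is None or not raw.strip():
        return DEFAULT_GRACE_SECONDS
    try:
        value = float(raw)
    except ValueError:
        # Unparseable input: fall back rather than crash during shutdown.
        return DEFAULT_GRACE_SECONDS
    if not math.isfinite(value):
        return DEFAULT_GRACE_SECONDS
    # Assumption: never go below the default grace period.
    return max(value, DEFAULT_GRACE_SECONDS)
```

Passing a plain dict in place of `os.environ` makes the parsing easy to check without touching the real process environment, for example `read_worker_shutdown_timeout({ENV_NAME: "120"})`.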