Commit 1c63a16

micah-wil and gshtras authored
[Core] Run garbage collector after CUDA graph capture to fix throughput regression (vllm-project#24128)
Signed-off-by: Gregory Shtrasberg <[email protected]>
Co-authored-by: Gregory Shtrasberg <[email protected]>
1 parent 922d3b4 commit 1c63a16

1 file changed: +1 -0 lines changed

vllm/v1/worker/gpu_model_runner.py

Lines changed: 1 addition & 0 deletions
@@ -2885,6 +2885,7 @@ def freeze_gc():
             finally:
                 if should_freeze:
                     gc.unfreeze()
+                    gc.collect()
 
         # Trigger CUDA graph capture for specific shapes.
         # Capture the large shapes first so that the smaller shapes
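
For context, the pattern this hunk touches can be sketched in isolation. The following is a minimal, standalone approximation and not the actual vLLM code: only the `finally` branch is visible in the diff, so the body before `yield`, the `should_freeze` parameter, and the `capture_stub` helper are assumptions made for illustration. It uses only the standard-library `gc` calls `collect`, `freeze`, `unfreeze`, and `get_freeze_count`.

```python
import gc
from contextlib import contextmanager


@contextmanager
def freeze_gc(should_freeze: bool = True):
    """Sketch of a freeze/unfreeze guard; `should_freeze` is a hypothetical flag."""
    # Assumption: collect before freezing so pending garbage is not frozen.
    gc.collect()
    if should_freeze:
        # Move all currently tracked objects to the permanent generation,
        # so collections inside the block skip them.
        gc.freeze()
    try:
        yield
    finally:
        if should_freeze:
            # Return the frozen objects to the oldest generation.
            gc.unfreeze()
            # The line added by this commit: run a full collection
            # right after unfreezing.
            gc.collect()


def capture_stub():
    """Hypothetical stand-in for graph capture; reports freeze counts."""
    with freeze_gc():
        frozen_inside = gc.get_freeze_count()
    return frozen_inside, gc.get_freeze_count()


if __name__ == "__main__":
    inside, after = capture_stub()
    print(f"frozen inside block: {inside}, after block: {after}")
```

In this sketch the explicit `gc.collect()` after `gc.unfreeze()` means any garbage left over from the guarded block is reclaimed immediately rather than at a later full collection. The commit title attributes a throughput fix to that call; the sketch only illustrates the mechanics and does not reproduce the regression.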