Commit 6742730

Zero the KV cache after warmup to avoid NaN
Signed-off-by: Pengbo Wang <[email protected]>
1 parent a00ca11 commit 6742730

1 file changed: +8 -0 lines changed

tensorrt_llm/_torch/pyexecutor/model_engine.py

Lines changed: 8 additions & 0 deletions
@@ -648,6 +648,14 @@ def release_batch(result: ScheduledRequests | None):
             return
 
         with contextlib.ExitStack() as stack:
+
+            def clean_up_kv_cache():
+                # Zero the KV cache; NaNs may be introduced during warmup
+                for layer_idx in kv_cache_manager.layer_offsets.keys():
+                    kv_cache_manager.get_buffers(layer_idx).zero_()
+
+            stack.callback(clean_up_kv_cache)
+
             if self._torch_compile_enabled:
 
                 def disable_optimization(backend: Backend):
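
The change registers the cleanup with `stack.callback(...)`, so it runs when the `with contextlib.ExitStack()` block exits, after the warmup work inside it has finished. The pattern can be sketched in isolation as follows. This is a minimal illustration, not the project's code: `ToyKVCacheManager` and `warmup` are hypothetical stand-ins, plain Python lists replace GPU tensors, and the slice assignment mimics the in-place behavior of `Tensor.zero_()`.

```python
import contextlib
import math


class ToyKVCacheManager:
    """Hypothetical stand-in for the real KV cache manager (lists, no GPU)."""

    def __init__(self, num_layers: int, size: int):
        # Only the keys matter here; the real offsets are not modeled.
        self.layer_offsets = {i: i for i in range(num_layers)}
        self._buffers = {i: [0.0] * size for i in range(num_layers)}

    def get_buffers(self, layer_idx: int) -> list:
        return self._buffers[layer_idx]


def warmup(manager: ToyKVCacheManager) -> None:
    with contextlib.ExitStack() as stack:

        def clean_up_kv_cache():
            # Overwrite every layer's buffer in place, like Tensor.zero_().
            for layer_idx in manager.layer_offsets.keys():
                buf = manager.get_buffers(layer_idx)
                buf[:] = [0.0] * len(buf)

        # Registered first, executed when the with-block exits.
        stack.callback(clean_up_kv_cache)

        # Simulated warmup pass that leaves a NaN in each layer's cache.
        for layer_idx in manager.layer_offsets:
            manager.get_buffers(layer_idx)[0] = math.nan


manager = ToyKVCacheManager(num_layers=2, size=4)
warmup(manager)

all_zero = all(
    value == 0.0
    for layer_idx in manager.layer_offsets
    for value in manager.get_buffers(layer_idx)
)
nan_survives_zero_weight = math.isnan(0.0 * math.nan)
print(all_zero, nan_survives_zero_weight)
```

`ExitStack` runs its callbacks in reverse registration order and also runs them if the block exits through an exception, so a cleanup registered at the top of the block executes after any callbacks registered later. The last check shows that a NaN multiplied by zero is still NaN, which is one plausible reason stale NaNs in a cache could matter even when the affected slots receive zero weight; whether that is the mechanism behind this commit is an assumption.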
