Pull requests: vllm-project/vllm

Pull requests list

[TieredOffloading] Bound Secondary Tier lookup times
Labels: v1
#45765 opened Jun 16, 2026 by varun-sundar-rabindranath (Contributor)
[BugFix] Support MLA model identification for draft models Kimi(deeps…
Labels: bug
#45764 opened Jun 16, 2026 by baolongsun
3 of 4 tasks
[Bugfix] Fix Qwen3 prompt tool-call reasoning false positive
Labels: bug, qwen
#45763 opened Jun 16, 2026 by alexbi29
[Docs] Update stale LMCache examples
Labels: documentation, kv-connector
#45762 opened Jun 16, 2026 by sammshen (Contributor)
Revert "[Model Runner V2][Bugfix] Fix MRV2 LoRA warmup" (#35536)
Labels: bug, nvidia, qwen, v1
#45761 opened Jun 16, 2026 by vllm-agent Contributor Draft
[Frontend] Remove AsyncMicrobatchTokenizer.
Labels: ready
#45759 opened Jun 16, 2026 by noooop (Collaborator)
4 tasks
[XPU] Fix Triton attn fp8/bf16 check failing
Labels: intel-gpu, ready, v1
#45758 opened Jun 16, 2026 by zhenwei-intel (Contributor)
4 tasks
[CPUOffloading] Guard CPU eviction check
Labels: v1
#45757 opened Jun 16, 2026 by varun-sundar-rabindranath (Contributor)
[Frontend] [Parser] Migrate Nemotron V3 to streaming parser engine
#45755 opened Jun 16, 2026 by bbrowning (Collaborator)
[Bugfix] DiffusionGemma: only pop a request's logprobs when it commits (#45689)
Labels: bug
#45754 opened Jun 16, 2026 by waynehacking8 (Contributor)
[Rust Frontend] Add CORS support
Labels: rust
#45753 opened Jun 16, 2026 by tahsintunan (Contributor)
Turboquant native fp8 v4 store
Labels: ci/build, v1
#45748 opened Jun 16, 2026 by sladyn98 (Contributor)
[Bugfix][ROCm] Fix rocm_aiter_per_tensor_quant custom op aliasing
Labels: bug, rocm
#45747 opened Jun 15, 2026 by Rohan138 (Contributor)
DO NOT MERGE
Labels: ci/build, ready, rocm
#45746 opened Jun 15, 2026 by AndreasKaratzas (Member)
[Bugfix] Reset per-item content_index in gpt-oss Responses streaming
Labels: bug, frontend, gpt-oss
#45745 opened Jun 15, 2026 by ankrovv (Contributor)
[M3] Enable FP8 sparse GQA
Labels: ci/build
#45744 opened Jun 15, 2026 by gau-nernst (Contributor)
4 tasks
[M3] Tune Triton indexer score decode for spec-decode
#45743 opened Jun 15, 2026 by gau-nernst (Contributor)
4 tasks
Pre-Commit CI Speedup
Labels: ci/build, ready
#45740 opened Jun 15, 2026 by AndreasKaratzas Member Draft
[Quantization] Extend ModelOpt mixed precision and NVFP4 runtime formats
#45735 opened Jun 15, 2026 by baonudesifeizhai (Contributor)
4 tasks
docs: multi-server vLLM deployment issues and solutions
Labels: documentation
#45732 opened Jun 15, 2026 by hsjlyj