This issue serves as a living tracker for the current issues preventing us from upgrading vLLM to Transformers v5.
We will use sub-issues to track individual failures and PRs should be made against these sub-issues.
The solutions to these issues may need to be applied to either:
- Transformers in the form of:
  - Adding missing backward compatibility (usually for custom code models)
  - General bug fixes/improvements to new features of v5
- vLLM in the form of:
  - Forward compatibility with how something is now done in v5
  - Edge case handling for issues that v4 ignored (such as config validation)

Sometimes, the issue is simply with the model checkpoint itself, for example if it:
- Contains a malformed `config.json` that cannot be used to instantiate the newly input-validated `PreTrainedConfig` class
- Has custom code* that uses deprecated/removed APIs

In these situations, the best solution will likely be to skip these tests in vLLM and open a PR to Transformers to contribute this model. This will be faster and more sustainable than waiting for the model vendor to fix their custom model code, which sometimes never happens.
Contributing the new model should be done using the new Modular Transformers so that the implementation is easy to maintain and will remain maintained by the Transformers team.
*particularly in the parts of the model implementation that vLLM tries to directly reuse, such as config/tokenizer/multimodal processor
Comprehensive list of skips
Now that the parent PR is merged we have a comprehensive list of all tests that are currently skipped on main
Function-level / parametrized skips

- PR: TBD - `tests/models/language/pooling_mteb_test/test_jina.py::test_embed_models_correctness` (entire `@parametrize` block at line 759, covers all `EMBEDDING_MODELS` x `dtype=half` x `dimensions=[16, 32]`) - `jinaai/jina-embeddings-v3` custom XLMRobertaLoRA model incompatible with transformers v5 (missing `all_tied_weights_keys`)
- PR: TBD - `tests/models/multimodal/generation/test_voxtral.py::test_hf_reference` - `VoxtralProcessor.apply_chat_template()` in transformers v5 doesn't resolve `chat_template=None` to default
- PR: TBD - `tests/models/multimodal/processing/test_musicflamingo.py::test_musicflamingo_audio_feature_pipeline_matches_hf_small_config` (skipif transformers >= 5.5) - transformers v5.5 added native `MusicFlamingoForConditionalGeneration` with different `get_audio_features` signature
- PR: TBD - `tests/v1/e2e/spec_decode/test_spec_decode.py` - `("eagle3", "Qwen/Qwen3-8B", "AngelSlim/Qwen3-8B_eagle3", 1)` param of `test_eagle_correctness_*` - "Feature is experimental and uses too much memory in CI" (TODO from hmellor)

`tests/models/multimodal/generation/test_common.py` - VLMTestInfo entries newly marked `pytest.mark.skip`

- PR: TBD - `ultravox` (`fixie-ai/ultravox-v0_5-llama-3_2-1b`) - Custom model code is not compatible with Transformers v5
- PR: TBD - `intern_vl` image (`OpenGVLab/InternVL2-1B`, `OpenGVLab/InternVL2-2B`, `OpenGVLab/Mono-InternVL-2B`) - Custom model code tries to access data from meta-tensor
- PR: TBD - `intern_vl-video` (InternVL video models) - Custom model code tries to access data from meta-tensor
- PR: TBD - `isaac` (`PerceptronAI/Isaac-0.1-2B`) - Custom model imports deleted object
- PR: TBD - `intern_vl` custom-input case at line 854 (InternVL custom-input variant) - Custom model code tries to access data from meta-tensor
- PR: TBD - `paddleocr_vl` (`PaddlePaddle/PaddleOCR-VL`) - Model's custom code uses `ROPE_INIT_FUNCTIONS['default']` which was removed in transformers v5

`tests/models/language/pooling_mteb_test/` - `enable_test=False`

- PR: TBD - `test_baai.py` BAAI entry at line 729 - Custom tokenizer on HF hub incompatible with transformers v5 (sets attrs before `super().__init__`, causing `AttributeError` on `verbose`)
- PR: TBD - `test_gte.py` GTE entry at line 745 - Numerical regression with transformers v5

`tests/models/registry.py` - entries gated by `max_transformers_version`

- PR: TBD - `InternLM2VEForCausalLM` (`OpenGVLab/Mono-InternVL-2B`), cap `4.57` - Custom config can't be loaded with v5, `vision_config` not always set
- PR: TBD - `Step3VLForConditionalGeneration` (line ~530), cap `5.3` - `validate_rope()` no longer accepts `ignore_keys` param above v5.4
- PR: TBD - `XverseForCausalLM` (`xverse/XVERSE-7B-Chat`), cap `4.57` - XVERSE tokenizer incompatible with v5 (`add_prefix_space`/`prepend_scheme` mismatch)
- PR: TBD - `FireRedASR2ForConditionalGeneration` (`allendou/FireRedASR2-LLM-vllm`), cap `5.1` - Incompatible with v5.2+ (`dict object has no attribute '__name__'`)
- PR: TBD - `FireRedLIDForConditionalGeneration` (`PatchyTisa/FireRedLID-vllm`), cap `5.1` - Same as FireRedASR2 (`dict object has no attribute '__name__'`)
- PR: TBD - `FunASRForConditionalGeneration` (`allendou/Fun-ASR-Nano-2512-vllm`), cap `5.1` - Same as FireRedASR2 (`dict object has no attribute '__name__'`)

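
To make the `max_transformers_version` caps above concrete, here is a minimal sketch of how such a cap could be evaluated. It is not vLLM's actual registry code: the helper names `major_minor` and `exceeds_cap` are hypothetical, and it assumes that a cap such as `5.1` is meant to admit every 5.1.x patch release and reject 5.2 onward, which is how the "Incompatible with v5.2+" notes read.

```python
import re


def major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string such as '5.2.0.dev0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def exceeds_cap(installed: str, cap: str) -> bool:
    """Return True when the installed version is newer than the cap.

    Assumption: only major.minor is compared, so patch and dev suffixes
    of the capped minor release still count as within the cap.
    """
    return major_minor(installed) > major_minor(cap)


# Hypothetical usage: decide whether a capped registry entry should be skipped.
for installed in ("4.57.1", "5.1.0", "5.2.0.dev0"):
    print(installed, "exceeds cap 5.1:", exceeds_cap(installed, "5.1"))
```

In a test module, a boolean like this could be passed as the condition of `pytest.mark.skipif(...)` together with a `reason` string that names the incompatibility.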
This is a sub-issue forming part of the work in https://github.com/vllm-project/vllm/issues/38379. Please read the description of that issue before beginning to work on this one.
## Which test is failing?

```console
$ pytest tests/...
```

## How to configure my environment?

It's very important that you install both vLLM and Transformers from source so that your test results reflect the current state of both libraries.

```console
# Or your fork
git clone https://github.com/huggingface/transformers.git
git clone https://github.com/vllm-project/vllm.git
cd vllm
VLLM_USE_PRECOMPILED=1 uv pip install -e .
uv pip install -e ../transformers
```
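
Once both packages are installed, it can help to confirm that they really come from your source checkouts before you trust a test result. The sketch below is one possible check, not part of the template. It relies on the `direct_url.json` metadata file that installers record for direct and editable installs (PEP 610), and the helper name `install_origin` is hypothetical.

```python
import json
from importlib.metadata import PackageNotFoundError, distribution


def install_origin(package: str) -> str:
    """Describe where an installed package came from."""
    try:
        dist = distribution(package)
    except PackageNotFoundError:
        return "not installed"
    # direct_url.json is only present for installs from a path or URL.
    raw = dist.read_text("direct_url.json")
    if raw is None:
        return f"{dist.version} (from a package index, not from source)"
    info = json.loads(raw)
    editable = info.get("dir_info", {}).get("editable", False)
    kind = "editable source checkout" if editable else "direct URL install"
    return f"{dist.version} ({kind}: {info.get('url')})"


for name in ("vllm", "transformers"):
    print(f"{name}: {install_origin(name)}")
```

If either line reports a package index install, the test run may not reflect the current state of that library.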
Comprehensive list of skips
Now that the parent PR is merged we have a comprehensive list of all tests that are currently skipped on `main`

Module-level skips (skip everything in the file)

- `tests/lora/test_minicpmv_tp.py` (`pytestmark = pytest.mark.skipif(transformers >= 5.0)`) - MiniCPMV custom processor uses `tokenizer.im_start_id` not available on TokenizersBackend in transformers v5+
- `tests/models/multimodal/generation/test_phi4siglip.py` (`pytestmark = pytest.mark.skipif(transformers >= 5.0)`) - HF model custom code uses siglip2 internals (`filter_out_non_signature_kwargs`) removed by HF#43514
- `tests/models/multimodal/pooling/test_colqwen3.py` (`pytestmark = pytest.mark.skip(...)`) - ColQwen3 weight tying incompatible with transformers v5 (missing `all_tied_weights_keys`)
- `tests/models/multimodal/pooling/test_intern_vit.py` (`pytestmark = pytest.mark.skip(...)`) - InternVisionModel custom code incompatible with transformers v5 (missing `all_tied_weights_keys`)
- `tests/models/multimodal/pooling/test_jinavl_reranker.py` (`pytestmark = pytest.mark.skip(...)`) - `jinaai/jina-reranker-m0` custom code incompatible with transformers v5 (missing `all_tied_weights_keys`)

Function-level / parametrized skips

- `tests/models/language/pooling_mteb_test/test_jina.py::test_embed_models_correctness` (entire `@parametrize` block at line 759, covers all `EMBEDDING_MODELS` x `dtype=half` x `dimensions=[16, 32]`) - `jinaai/jina-embeddings-v3` custom XLMRobertaLoRA model incompatible with transformers v5 (missing `all_tied_weights_keys`)
- `tests/models/multimodal/generation/test_nemotron_parse.py` - `nvidia/NVIDIA-Nemotron-Parse-v1.1` parametrized test (entire `run_test` block at line 875) - Custom MBart decoder head-count mismatch with transformers v5 GQA-aware cross-attention (8 vs 16 heads)
- `tests/models/multimodal/generation/test_voxtral.py::test_hf_reference` - `VoxtralProcessor.apply_chat_template()` in transformers v5 doesn't resolve `chat_template=None` to default
- `tests/models/multimodal/processing/test_musicflamingo.py::test_musicflamingo_audio_feature_pipeline_matches_hf_small_config` (skipif transformers >= 5.5) - transformers v5.5 added native `MusicFlamingoForConditionalGeneration` with different `get_audio_features` signature
- `tests/v1/e2e/spec_decode/test_spec_decode.py` - `("eagle3", "Qwen/Qwen3-8B", "AngelSlim/Qwen3-8B_eagle3", 1)` param of `test_eagle_correctness_*` - "Feature is experimental and uses too much memory in CI" (TODO from hmellor)

`tests/models/multimodal/generation/test_common.py` - VLMTestInfo entries newly marked `pytest.mark.skip`

- `ultravox` (`fixie-ai/ultravox-v0_5-llama-3_2-1b`) - Custom model code is not compatible with Transformers v5
- `intern_vl` image (`OpenGVLab/InternVL2-1B`, `OpenGVLab/InternVL2-2B`, `OpenGVLab/Mono-InternVL-2B`) - Custom model code tries to access data from meta-tensor
- `intern_vl-video` (InternVL video models) - Custom model code tries to access data from meta-tensor
- `isaac` (`PerceptronAI/Isaac-0.1-2B`) - Custom model imports deleted object
- `intern_vl` custom-input case at line 854 (InternVL custom-input variant) - Custom model code tries to access data from meta-tensor
- `paddleocr_vl` (`PaddlePaddle/PaddleOCR-VL`) - Model's custom code uses `ROPE_INIT_FUNCTIONS['default']` which was removed in transformers v5

`tests/models/language/pooling_mteb_test/` - `enable_test=False`

- `test_baai.py` BAAI entry at line 729 - Custom tokenizer on HF hub incompatible with transformers v5 (sets attrs before `super().__init__`, causing `AttributeError` on `verbose`)
- `test_gte.py` GTE entry at line 745 - Numerical regression with transformers v5

`tests/models/registry.py` - entries gated by `max_transformers_version`

- `InternLM2VEForCausalLM` (`OpenGVLab/Mono-InternVL-2B`), cap `4.57` - Custom config can't be loaded with v5, `vision_config` not always set
- `Plamo2ForCausalLM` (`pfnet/plamo-2-1b`), cap `4.57` - Custom code uses `_tied_weight_keys: list[str]`; v5 expects `dict[str, str]`
- `Step3VLForConditionalGeneration` (line ~530), cap `5.3` - `validate_rope()` no longer accepts `ignore_keys` param above v5.4
- `XverseForCausalLM` (`xverse/XVERSE-7B-Chat`), cap `4.57` - XVERSE tokenizer incompatible with v5 (`add_prefix_space`/`prepend_scheme` mismatch)
- `FireRedASR2ForConditionalGeneration` (`allendou/FireRedASR2-LLM-vllm`), cap `5.1` - Incompatible with v5.2+ (`dict object has no attribute '__name__'`)
- `FireRedLIDForConditionalGeneration` (`PatchyTisa/FireRedLID-vllm`), cap `5.1` - Same as FireRedASR2 (`dict object has no attribute '__name__'`)
- `FunASRForConditionalGeneration` (`allendou/Fun-ASR-Nano-2512-vllm`), cap `5.1` - Same as FireRedASR2 (`dict object has no attribute '__name__'`)
- `HCXVisionForCausalLM` (`naver-hyperclovax/HyperCLOVAX-SEED-Vision-Instruct-3B`), cap `4.57` - Custom config can't be loaded with v5, `text_config` not always set
- `InternS1ForConditionalGeneration` (`internlm/Intern-S1`), cap `4.57` - Custom tokenizer code not compatible with v5
- `MiniCPMO` (`openbmb/MiniCPM-o-2_6`), cap `4.57` - Custom processor code not compatible with v5
- `MiniCPMV` (`openbmb/MiniCPM-Llama3-V-2_5` and 2.6/4.0/4.5 variants), cap `4.57` - `MiniCPMVBatchFeature` incompatible with its v5 base class
- `OpenCUAForConditionalGeneration` (`xlangai/OpenCUA-7B`), cap `4.57` - Tokenizer can't be initialised in v5
- `OpenPanguVLForConditionalGeneration` (`FreedomIntelligence/openPangu-VL-7B`), cap `4.57` - `OpenPanguVLVideoProcessorInitKwargs` doesn't specify `total=False`
- `Ovis2_5` (`AIDC-AI/Ovis2.5-2B`), cap `4.57` - Custom processor code not compatible with v5
- `Ovis2_6_MoeForCausalLM` (`AIDC-AI/Ovis2.6-30B-A3B`), cap `4.57` - Custom processor code not compatible with v5
- `Phi4ForCausalLMV` (`microsoft/Phi-4-reasoning-vision-15B`), cap `5.3` - siglip2 internals removed by HF#43514 above v5.4
- `Tarsier2ForConditionalGeneration` (line ~1267), cap `5.3` - `Qwen2VLConfig` split into `Qwen2VLConfig` + `Qwen2VLTextConfig` in v5

Sub-issue template