Upgrade to Transformers v5 · vllm-project/vllm#38379


help wanted

Repository metrics

Stars: 80,034
PR merge metrics: average merge time 3d 17h; 993 PRs merged in 30 days

Description

What is this issue?

This issue serves as a living tracker for the current issues preventing us from upgrading vLLM to Transformers v5.

We will use sub-issues to track individual failures and PRs should be made against these sub-issues.

The solutions to these issues may need to be applied to either:

- Transformers, in the form of:
  - Adding missing backward compatibility (usually for custom code models)
  - General bug fixes/improvements to new features of v5
- vLLM, in the form of:
  - Forward compatibility with how something is now done in v5
  - Edge case handling for issues that v4 ignored (such as config validation)

Sometimes, the issue is simply with the model checkpoint itself, for example if it:

- Contains a malformed config.json that cannot be used to instantiate the newly input-validated PreTrainedConfig class
- Has custom code* that uses deprecated/removed APIs

In these situations, the best solution will likely be to skip these tests in vLLM and open a PR to Transformers to contribute this model. This will be faster and more sustainable than waiting for the model vendor to fix their custom model code, which sometimes never happens.

Contributing the new model should be done using the new Modular Transformers so that the implementation is easy to maintain and will remain maintained by the Transformers team.

*particularly in the parts of the model implementation that vLLM tries to directly reuse, such as config/tokenizer/multimodal processor

Comprehensive list of skips

Now that the parent PR is merged, we have a comprehensive list of all tests that are currently skipped on main.

Module-level skips (skip everything in the file)

- PR: https://github.com/vllm-project/vllm/pull/44282. Test: tests/lora/test_minicpmv_tp.py (pytestmark = pytest.mark.skipif(transformers >= 5.0)). Reason: MiniCPMV custom processor uses tokenizer.im_start_id, which is not available on TokenizersBackend in transformers v5+.
- PR: TBD. Test: tests/models/multimodal/generation/test_phi4siglip.py (pytestmark = pytest.mark.skipif(transformers >= 5.0)). Reason: HF model custom code uses siglip2 internals (filter_out_non_signature_kwargs) removed by HF#43514.
- PR: TBD. Test: tests/models/multimodal/pooling/test_colqwen3.py (pytestmark = pytest.mark.skip(...)). Reason: ColQwen3 weight tying incompatible with transformers v5 (missing all_tied_weights_keys).
- PR: TBD. Test: tests/models/multimodal/pooling/test_intern_vit.py (pytestmark = pytest.mark.skip(...)). Reason: InternVisionModel custom code incompatible with transformers v5 (missing all_tied_weights_keys).
- PR: TBD. Test: tests/models/multimodal/pooling/test_jinavl_reranker.py (pytestmark = pytest.mark.skip(...)). Reason: jinaai/jina-reranker-m0 custom code incompatible with transformers v5 (missing all_tied_weights_keys).
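
The `skipif(transformers >= 5.0)` conditions in the list above are shorthand rather than literal code. As a rough sketch of a version check that could back such a module-level skip, the snippet below uses only the standard library. The helper names `release_tuple`, `at_least`, and `installed_at_least` are hypothetical (they are not vLLM utilities), and the parsing deliberately ignores pre-release suffixes such as `.dev0`.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string: str) -> tuple[int, ...]:
    """Return the leading numeric release, e.g. '5.0.0.dev0' -> (5, 0, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"no numeric release in {version_string!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def at_least(installed: str, minimum: str) -> bool:
    """Compare two version strings on their numeric release only."""
    have, want = release_tuple(installed), release_tuple(minimum)
    width = max(len(have), len(want))
    # Pad with zeros so that '5' and '5.0' compare as equal.
    have += (0,) * (width - len(have))
    want += (0,) * (width - len(want))
    return have >= want


def installed_at_least(package: str, minimum: str) -> bool:
    """True if `package` is installed and its release is >= `minimum`."""
    try:
        return at_least(version(package), minimum)
    except PackageNotFoundError:
        return False


# Hypothetical use at the top of a test module (requires pytest):
# pytestmark = pytest.mark.skipif(
#     installed_at_least("transformers", "5.0"),
#     reason="custom processor is incompatible with transformers v5",
# )
```

A real test suite would more likely rely on a full version parser such as the one in the `packaging` library; the hand-rolled comparison here only keeps the sketch dependency-free.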

Function-level / parametrized skips

- PR: TBD. Test: tests/models/language/pooling_mteb_test/test_jina.py::test_embed_models_correctness (entire @parametrize block at line 759, covers all EMBEDDING_MODELS x dtype=half x dimensions=[16, 32]). Reason: jinaai/jina-embeddings-v3 custom XLMRobertaLoRA model incompatible with transformers v5 (missing all_tied_weights_keys).
- PR: https://github.com/vllm-project/vllm/pull/42498. Test: tests/models/multimodal/generation/test_nemotron_parse.py, nvidia/NVIDIA-Nemotron-Parse-v1.1 parametrized test (entire run_test block at line 875). Reason: Custom MBart decoder head-count mismatch with transformers v5 GQA-aware cross-attention (8 vs 16 heads).
- PR: TBD. Test: tests/models/multimodal/generation/test_voxtral.py::test_hf_reference. Reason: VoxtralProcessor.apply_chat_template() in transformers v5 doesn't resolve chat_template=None to default.
- PR: TBD. Test: tests/models/multimodal/processing/test_musicflamingo.py::test_musicflamingo_audio_feature_pipeline_matches_hf_small_config (skipif transformers >= 5.5). Reason: transformers v5.5 added native MusicFlamingoForConditionalGeneration with different get_audio_features signature.
- PR: TBD. Test: tests/v1/e2e/spec_decode/test_spec_decode.py, ("eagle3", "Qwen/Qwen3-8B", "AngelSlim/Qwen3-8B_eagle3", 1) param of test_eagle_correctness_*. Reason: "Feature is experimental and uses too much memory in CI" (TODO from hmellor).

`tests/models/multimodal/generation/test_common.py` — VLMTestInfo entries newly marked `pytest.mark.skip`

- PR: TBD. Entry: ultravox (fixie-ai/ultravox-v0_5-llama-3_2-1b). Reason: Custom model code is not compatible with Transformers v5.
- PR: TBD. Entry: intern_vl image (OpenGVLab/InternVL2-1B, OpenGVLab/InternVL2-2B, OpenGVLab/Mono-InternVL-2B). Reason: Custom model code tries to access data from meta-tensor.
- PR: TBD. Entry: intern_vl-video (InternVL video models). Reason: Custom model code tries to access data from meta-tensor.
- PR: TBD. Entry: isaac (PerceptronAI/Isaac-0.1-2B). Reason: Custom model imports deleted object.
- PR: TBD. Entry: intern_vl custom-input case at line 854 (InternVL custom-input variant). Reason: Custom model code tries to access data from meta-tensor.
- PR: TBD. Entry: paddleocr_vl (PaddlePaddle/PaddleOCR-VL). Reason: Model's custom code uses ROPE_INIT_FUNCTIONS['default'], which was removed in transformers v5.

`tests/models/language/pooling_mteb_test/` — `enable_test=False`

- PR: TBD. Entry: test_baai.py BAAI entry at line 729. Reason: Custom tokenizer on HF hub incompatible with transformers v5 (sets attrs before super().__init__, causing AttributeError on verbose).
- PR: TBD. Entry: test_gte.py GTE entry at line 745. Reason: Numerical regression with transformers v5.
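
The BAAI failure above is described as a tokenizer that sets attributes before calling `super().__init__`, which then raises an `AttributeError` on `verbose`. The toy below shows one way this class of error can arise in plain Python. It is not the Transformers implementation: `ToyTokenizerBase`, `BrokenTokenizer`, and `FixedTokenizer` are hypothetical classes, and the assumption that the base class reads `self.verbose` inside `__setattr__` is made purely for illustration.

```python
class ToyTokenizerBase:
    """Hypothetical base class whose __setattr__ depends on state from __init__."""

    def __init__(self, verbose: bool = False) -> None:
        # Bypass our own __setattr__ while bootstrapping the instance.
        object.__setattr__(self, "verbose", verbose)
        object.__setattr__(self, "log", [])

    def __setattr__(self, name, value):
        # Reading self.verbose fails if __init__ has not run yet.
        if self.verbose:
            self.log.append(name)
        object.__setattr__(self, name, value)


class BrokenTokenizer(ToyTokenizerBase):
    def __init__(self) -> None:
        self.lang = "en"  # triggers __setattr__ before `verbose` exists
        super().__init__()


class FixedTokenizer(ToyTokenizerBase):
    def __init__(self) -> None:
        super().__init__()  # initialise base state first
        self.lang = "en"


if __name__ == "__main__":
    try:
        BrokenTokenizer()
    except AttributeError as exc:
        print(f"broken: {exc}")
    print(f"fixed: {FixedTokenizer().lang}")
```

In this toy, moving the `super().__init__()` call ahead of the attribute assignment is enough to avoid the error; whether the same reordering fixes the real hub tokenizer would need to be checked against its code.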

`tests/models/registry.py` — entries gated by `max_transformers_version`
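
As an illustration of what gating a registry entry on a maximum Transformers version can look like, here is a minimal sketch. The `GatedEntry` class, its `check` method, and the model ID `org/example-model` are hypothetical stand-ins and not the actual structures in `tests/models/registry.py`; only the field name `max_transformers_version` comes from the heading above.

```python
from dataclasses import dataclass
from typing import Optional


def _release(version_string: str) -> tuple[int, ...]:
    """Leading numeric components only: '4.57.1' -> (4, 57, 1)."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


@dataclass(frozen=True)
class GatedEntry:
    """Hypothetical registry entry with an optional version cap."""

    model_id: str
    max_transformers_version: Optional[str] = None

    def check(self, installed: str) -> Optional[str]:
        """Return a skip reason if `installed` exceeds the cap, else None."""
        if self.max_transformers_version is None:
            return None
        if _release(installed) > _release(self.max_transformers_version):
            return (
                f"{self.model_id} requires transformers<="
                f"{self.max_transformers_version}, found {installed}"
            )
        return None


if __name__ == "__main__":
    entry = GatedEntry("org/example-model", max_transformers_version="4.57")
    print(entry.check("4.56.2"))
    print(entry.check("5.0.0"))
```

Returning a reason string instead of a boolean lets the caller pass the message straight to a test skip, so the skip output records which cap was hit.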

Sub-issue template

This is a sub-issue forming part of the work in https://github.com/vllm-project/vllm/issues/38379. Please read the description of that issue before beginning to work on this one.

## Which test is failing?

```console
$ pytest tests/
...

```

## How to configure my environment?

It's very important that you install both vLLM and Transformers from source so that your test results reflect the current state of both libraries.

```console
# Or your fork
git clone https://github.com/huggingface/transformers.git
git clone https://github.com/vllm-project/vllm.git

cd vllm
VLLM_USE_PRECOMPILED=1 uv pip install -e .
uv pip install -e ../transformers
```
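
After installing, it can help to confirm which versions Python actually resolves before running any tests. The snippet below is a small sketch using only the standard library; the function name `installed_versions` is hypothetical, and the distribution names `vllm` and `transformers` are assumed to match what the editable installs above register.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(["vllm", "transformers"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

If the reported Transformers version is not a v5 release or development build, the test results will not reflect the state this tracker is concerned with.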

Contributor guide

Research direction: Pick a specific sub-issue from the list, such as one with a clear test failure and known root cause. Set up a development environment with vLLM and Transformers v5 from source. Reproduce the failure, then decide whether the fix belongs in Transformers (add backward compatibility) or in vLLM (adapt to v5 changes). Follow the sub-issue template to report progress and open a PR.
Tech stack: Python
Domain: backend, infrastructure
Issue type: Chore
Difficulty: 4
Estimated time: Over 1 week
Activity status: Active
Clarity: Mostly clear
Prerequisites: Git, Python, Hugging Face Transformers knowledge, vLLM internals
Newbie friendliness: 20