Description
This is a sub-issue of https://github.com/vllm-project/vllm/issues/38379. Please read the description of that issue before beginning to work on this one.
Which test is failing?
Transformers v5 creates the model on the meta device first and then loads the weights, similar to what vLLM does. The problem is that the custom model code shipped with the checkpoint reads concrete tensor values (via `Tensor.item()`) while constructing the model structure, which is not possible on meta tensors because they carry no data.
Since the failure occurs while generating the HF reference outputs, it cannot be fixed in vLLM (other than by skipping the tests until the model works with Transformers v5). The proper solution is to upstream this architecture to Transformers. This shouldn't be too hard with Modular Transformers, because the text backbone is Qwen2 and can be reused.
$ pytest tests/models/multimodal/generation/test_common.py::test_single_image_models[intern_vl-test_case25]
...
RuntimeError: Tensor.item() cannot be called on meta tensors
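The failure mode can be reproduced in isolation. The sketch below is illustrative only: the function names and the drop-path schedule are hypothetical stand-ins for whatever the checkpoint's custom code actually computes at construction time, and it assumes a PyTorch version with meta device support. It shows one construction-time pattern that raises on the meta device, next to a pure-Python equivalent that never touches tensor data.

```python
import torch

# Hypothetical hyperparameters, chosen only for illustration.
NUM_LAYERS = 4
DROP_PATH_RATE = 0.1


def rates_via_tensor(device):
    # Builds a tensor and then reads Python scalars out of it with .item().
    # This needs real data, so it cannot work on the meta device.
    schedule = torch.linspace(0, DROP_PATH_RATE, NUM_LAYERS, device=device)
    return [x.item() for x in schedule]


def rates_pure_python():
    # Same linear schedule computed with plain Python arithmetic,
    # so it is independent of the device the model is created on.
    if NUM_LAYERS == 1:
        return [0.0]
    step = DROP_PATH_RATE / (NUM_LAYERS - 1)
    return [i * step for i in range(NUM_LAYERS)]


# On CPU the tensor-based version works and matches the pure-Python one.
cpu_rates = rates_via_tensor(torch.device("cpu"))
py_rates = rates_pure_python()

# On the meta device the tensor-based version raises.
try:
    rates_via_tensor(torch.device("meta"))
    meta_failed = False
except RuntimeError:
    meta_failed = True

print("meta construction failed:", meta_failed)
```

Any fix in the upstreamed modeling code would presumably follow the second pattern, deriving structural values from the config with Python arithmetic rather than from tensor contents.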
How to configure my environment?
It's very important that you install both vLLM and Transformers from source so that your test results reflect the current state of both libraries.
# Or your fork
git clone https://github.com/huggingface/transformers.git
git clone https://github.com/vllm-project/vllm.git
cd vllm
VLLM_USE_PRECOMPILED=1 uv pip install -e .
uv pip install -e ../transformers
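To confirm that both packages really come from your source checkouts, a quick check along the following lines may help. It is a sketch, not part of the vLLM tooling: `describe_install` is a hypothetical helper, and it assumes the installer recorded a PEP 610 `direct_url.json` file for the editable installs.

```python
import json
from importlib import metadata


def describe_install(dist_name):
    """Return (version, editable, url) for a distribution, or None if missing.

    `editable` and `url` are read from PEP 610 direct_url.json and fall back
    to (False, None) when the installer did not record that file.
    """
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    editable, url = False, None
    raw = dist.read_text("direct_url.json")
    if raw:
        info = json.loads(raw)
        url = info.get("url")
        editable = bool(info.get("dir_info", {}).get("editable", False))
    return dist.version, editable, url


for name in ("vllm", "transformers"):
    print(name, describe_install(name))
```

If either entry prints `None` or reports `editable` as `False`, the environment is probably not using the source checkout for that package.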