unslothai/unsloth

[Bug] Cannot load qwen3-vl series with lora adapter on vllm.

Open

#3.560 aberto em 6 de nov. de 2025

Ver no GitHub
 (5 comments) (0 reactions) (0 assignees)Python (5.658 forks)batch import
good first issue

Métricas do repositório

Stars
 (64.271 stars)
Métricas de merge de PR
 (Mesclagem média 3d 15h) (525 fundiu PRs em 30d)

Description

I fine-tuned the Qwen3-VL-8B-Instruct model using Unsloth. My code is 99% identical to the official guide; the only change I made was replacing the 8B model in the guide with the 2B model for fine-tuning. After fine-tuning, I confirmed that the QLoRA adapter was saved correctly.

Excited and happy, I moved the saved QLoRA adapter and the Qwen3-VL-2B-Instruct model to my vLLM server. Then I ran a command to start model serving with vLLM as shown below. (For reference, the vLLM server has no issues—it was already serving official Qwen3-VL models.)

command = [
        sys.executable, 
        "-m", "vllm.entrypoints.openai.api_server",
        "--model", "./Qwen3-VL-2B-Instruct",
        "--max_model_len", "3500",
        "--gpu_memory_utilization", "0.85",
        "--trust-remote-code",
        "--host", "0.0.0.0",
        "--port", "8888",

        # for lora adapter
        "--enable-lora",
        "--max-lora-rank", "16",  # LoRA rank
        "--max-loras", "1", 
        "--max-cpu-loras", "1",
        "--lora-modules", "adapter0=./my_lora_adapter"
]

I waited for vLLM to properly load the QLoRA adapter, but the following problem occurred. This same issue happened even when I retrained LoRA using Unsloth with 2B, 4B, and 8B models.

When I was feeling hopeless, I tried merging the model instead of saving the LoRA adapter separately by using the save_pretrained_merged() function as shown below, and then vLLM was able to load and perform inference normally:

save_pretrained_merged( f"my_16bit_model", tokenizer, save_method="merged_16bit")

However, I don't want to merge the models—I want to load only the LoRA adapter. I’ve seen many posts from others experiencing the same error. As of now, what can I do to resolve this issue?

Guia do colaborador