vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Issues
Open
[Usage]: How to get query embeddings from ColBERT?
good first issue, usage
9 comments, 0 reactions, 1 assignee
Open
[Docs] Document NIXL KV connector metrics aggregation semantics
good first issue
4 comments, 1 reaction, 1 assignee
Open
[Feature]: Integrate fused `kMoEFinalizeARResidualRMSNorm` from FlashInfer
feature request, help wanted
3 comments, 1 reaction, 0 assignees
Open
5 comments, 0 reactions, 0 assignees
Open
[torch.compile] config hashing refactor follow-ups
feature request, good first issue, help wanted
15 comments, 0 reactions, 3 assignees
Open
[torch.compile] E2E correctness testing for fusions
help wanted, torch.compile
6 comments, 0 reactions, 0 assignees
Open
[Bug]: Certain Ranks Take a Long Time to Load Weights
bug, help wanted
3 comments, 0 reactions, 0 assignees
Open
[Transformers v5] Tarsier2ForConditionalGeneration
good first issue, help wanted
3 comments, 0 reactions, 0 assignees
Open
[Transformers v5] SarvamMLAForCausalLM
good first issue, help wanted
2 comments, 0 reactions, 1 assignee
Open
[Transformers v5] InternVL2
good first issue, help wanted
4 comments, 0 reactions, 0 assignees
Open
[Transformers v5] IsaacForConditionalGeneration
good first issue, help wanted
4 comments, 0 reactions, 0 assignees
Open
[Transformers v5] Base model and LoRA used in test has incorrect `tokenizer_config.json`
good first issue, help wanted
8 comments, 0 reactions, 1 assignee
Open
[Transformers v5] MiniCPMV cannot apply processor
good first issue, help wanted
8 comments, 0 reactions, 1 assignee
Open
Upgrade to Transformers v5
help wanted
1 comment, 10 reactions, 1 assignee
Open
[Feature]: Better FlashInfer compilation logging
feature request, help wanted
8 comments, 0 reactions, 0 assignees
Open
[RFC]: Support ViT Full CUDA Graph (Tracker)
RFC, help wanted, multi-modality
14 comments, 1 reaction, 0 assignees
Open
[Feature]: Unify MoE "Oracles" with Class Structure
feature request, good first issue, help wanted
6 comments, 0 reactions, 1 assignee
Open
[Feature]: Upstream DGX Spark improvements from Avarok-Cybersecurity/dgx-vllm
feature request, help wanted, nvidia, quantization
13 comments, 1 reaction, 0 assignees
Open
[Performance]: qknorm+rope fusion slower than unfused on H100
help wanted, performance, torch.compile
12 comments, 1 reaction, 1 assignee
Open
[Roadmap]: PD Disaggregation with `NixlConnector` Roadmap
feature request, help wanted
5 comments, 15 reactions, 0 assignees