Repositories

vllm-project repositories

This repo hosts code for vLLM CI & Performance Benchmark infrastructure.

Last commit 6 Jun 2026

(42 stars) (68 forks) (0 indexed issues) (0 open good first issues)

Fast and memory-efficient exact attention

Last commit 30 May 2026

(124 stars) (148 forks) (0 indexed issues) (0 open good first issues)

Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs

Last commit 22 May 2026

(1166 stars) (156 forks) (0 indexed issues) (0 open good first issues)

Common recipes to run vLLM

Last commit 7 Jun 2026

(833 stars) (292 forks) (0 indexed issues) (0 open good first issues)

System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge

Last commit 8 Jun 2026

(4293 stars) (699 forks) (0 indexed issues) (0 open good first issues)

TPU inference for vLLM, with unified JAX and PyTorch support.

Last commit 7 Jun 2026

(348 stars) (205 forks) (0 indexed issues) (0 open good first issues)

A high-throughput and memory-efficient inference and serving engine for LLMs

Last commit 15 May 2026

(80,034 stars) (16,816 forks) (61 indexed issues) (55 open good first issues)

Community-maintained hardware plugin for vLLM on Ascend

Last commit 2 Jun 2026

(2180 stars) (1318 forks) (5 indexed issues) (5 open good first issues)

Manages vllm-nccl dependency

Last commit 3 Jun 2024

(18 stars) (3 forks) (0 indexed issues) (0 open good first issues)

A framework for efficient model inference with omni-modality models

Last commit 8 Jun 2026

(4990 stars) (1067 forks) (0 indexed issues) (0 open good first issues)

Last commit 5 Jun 2026

(43 stars) (101 forks) (0 indexed issues) (0 open good first issues)