vllm-project repositories

This repo hosts code for vLLM CI & Performance Benchmark infrastructure.

Last commit: June 6, 2026

(42 stars) (68 forks) (0 indexed issues) (0 open good first issues)

Fast and memory-efficient exact attention

Last commit: May 30, 2026

(124 stars) (148 forks) (0 indexed issues) (0 open good first issues)

Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs

Last commit: May 22, 2026

(1,166 stars) (156 forks) (0 indexed issues) (0 open good first issues)

Common recipes to run vLLM

Last commit: June 7, 2026

(833 stars) (292 forks) (0 indexed issues) (0 open good first issues)

System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge

Last commit: June 8, 2026

(4,293 stars) (699 forks) (0 indexed issues) (0 open good first issues)

TPU inference for vLLM, with unified JAX and PyTorch support.

Last commit: June 7, 2026

(348 stars) (205 forks) (0 indexed issues) (0 open good first issues)

A high-throughput and memory-efficient inference and serving engine for LLMs

Last commit: May 15, 2026

(80,034 stars) (16,816 forks) (61 indexed issues) (55 open good first issues)

Community-maintained hardware plugin for vLLM on Ascend

Last commit: June 2, 2026

(2,180 stars) (1,318 forks) (5 indexed issues) (5 open good first issues)

Manages vllm-nccl dependency

Last commit: June 3, 2024

(18 stars) (3 forks) (0 indexed issues) (0 open good first issues)

A framework for efficient model inference with omni-modality models

Last commit: June 8, 2026

(4,990 stars) (1,067 forks) (0 indexed issues) (0 open good first issues)

Last commit: June 5, 2026

(43 stars) (101 forks) (0 indexed issues) (0 open good first issues)
