This repo hosts code for vLLM CI & Performance Benchmark infrastructure.
Repositories from vllm-project
(42 stars) (68 forks) (0 indexed issues) (0 open good first issues)
Fast and memory-efficient exact attention
(124 stars) (148 forks) (0 indexed issues) (0 open good first issues)
vllm-project/guidellm (Python)
Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs
(1,166 stars) (156 forks) (0 indexed issues) (0 open good first issues)
vllm-project/recipes (JavaScript)
Common recipes to run vLLM
(833 stars) (292 forks) (0 indexed issues) (0 open good first issues)
System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge
(4,293 stars) (699 forks) (0 indexed issues) (0 open good first issues)
TPU inference for vLLM, with unified JAX and PyTorch support.
(348 stars) (205 forks) (0 indexed issues) (0 open good first issues)
vllm-project/vllm (Python)
A high-throughput and memory-efficient inference and serving engine for LLMs
(80,034 stars) (16,816 forks) (61 indexed issues) (55 open good first issues)
Community maintained hardware plugin for vLLM on Ascend
(2,180 stars) (1,318 forks) (5 indexed issues) (5 open good first issues)
vllm-project/vllm-nccl (Python)
Manages vllm-nccl dependency
(18 stars) (3 forks) (0 indexed issues) (0 open good first issues)
vllm-project/vllm-omni (Python)
A framework for efficient model inference with omni-modality models
(4,990 stars) (1,067 forks) (0 indexed issues) (0 open good first issues)
(43 stars) (101 forks) (0 indexed issues) (0 open good first issues)