This repo hosts code for vLLM CI & Performance Benchmark infrastructure.
vllm-project repositories
(42 stars) (68 forks) (0 indexed issues) (0 open good first issues)
Fast and memory-efficient exact attention
(124 stars) (148 forks) (0 indexed issues) (0 open good first issues)
vllm-project/guidellm (Python)
Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs
(1166 stars) (156 forks) (0 indexed issues) (0 open good first issues)
vllm-project/recipes (JavaScript)
Common recipes to run vLLM
(833 stars) (292 forks) (0 indexed issues) (0 open good first issues)
System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge
(4293 stars) (699 forks) (0 indexed issues) (0 open good first issues)
TPU inference for vLLM, with unified JAX and PyTorch support.
(348 stars) (205 forks) (0 indexed issues) (0 open good first issues)
vllm-project/vllm (Python)
A high-throughput and memory-efficient inference and serving engine for LLMs
(80,034 stars) (16,816 forks) (61 indexed issues) (55 open good first issues)
Community maintained hardware plugin for vLLM on Ascend
(2180 stars) (1318 forks) (5 indexed issues) (5 open good first issues)
vllm-project/vllm-nccl (Python)
Manages vllm-nccl dependency
(18 stars) (3 forks) (0 indexed issues) (0 open good first issues)
vllm-project/vllm-omni (Python)
A framework for efficient model inference with omni-modality models
(4990 stars) (1067 forks) (0 indexed issues) (0 open good first issues)
(43 stars) (101 forks) (0 indexed issues) (0 open good first issues)