This repo hosts code for vLLM CI & Performance Benchmark infrastructure.
Repositories of vllm-project
(42 stars) (68 forks) (0 indexed issues) (0 open good first issues)
Fast and memory-efficient exact attention
(124 stars) (148 forks) (0 indexed issues) (0 open good first issues)
vllm-project/guidellm (Python)
Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs
(1,166 stars) (156 forks) (0 indexed issues) (0 open good first issues)
vllm-project/recipes (JavaScript)
Common recipes to run vLLM
(833 stars) (292 forks) (0 indexed issues) (0 open good first issues)
System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge
(4,293 stars) (699 forks) (0 indexed issues) (0 open good first issues)
TPU inference for vLLM, with unified JAX and PyTorch support.
(348 stars) (205 forks) (0 indexed issues) (0 open good first issues)
vllm-project/vllm (Python)
A high-throughput and memory-efficient inference and serving engine for LLMs
(80,034 stars) (16,816 forks) (61 indexed issues) (55 open good first issues)
Community maintained hardware plugin for vLLM on Ascend
(2,180 stars) (1,318 forks) (5 indexed issues) (5 open good first issues)
vllm-project/vllm-nccl (Python)
Manages vllm-nccl dependency
(18 stars) (3 forks) (0 indexed issues) (0 open good first issues)
vllm-project/vllm-omni (Python)
A framework for efficient model inference with omni-modality models
(4,990 stars) (1,067 forks) (0 indexed issues) (0 open good first issues)
(43 stars) (101 forks) (0 indexed issues) (0 open good first issues)