neuralmagic/nm-vllm (Python)
A high-throughput and memory-efficient inference and serving engine for LLMs
(266 stars) (10 forks) (0 indexed issues) (0 open good first issues)