neuralmagic Repositories

A high-throughput and memory-efficient inference and serving engine for LLMs

Last commit Sep 4, 2024

(266 stars) (10 forks) (0 indexed issues) (0 open good first issues)

A high-throughput and memory-efficient inference and serving engine for LLMs

Last commit Jun 4, 2026

(17 stars) (7 forks) (0 indexed issues) (0 open good first issues)
