Lightning-AI/lit-llama (Python)
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4-bit quantization, LoRA and LLaMA-Adapter fine-tuning, and pre-training. Apache 2.0-licensed.
(5,533 stars) (473 forks) (2 indexed issues) (2 open good first issues)