Lightning-AI/lit-llama (Python)
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4-bit quantization, LoRA and LLaMA-Adapter fine-tuning, and pre-training. Apache 2.0-licensed.
(5,533 stars) (473 forks) (2 indexed issues) (2 open good first issues)