lm-sys/FastChat

Support for 4-bit quantization from the transformers library

Open

#1798 opened on June 27, 2023

(7 comments) (2 reactions) (0 assignees) Python (38,959 stars) (4,736 forks)
enhancement, good first issue

Description

Loading Vicuna-13B with 4-bit quantization is already possible through the transformers library, using the load_in_4bit option. How difficult would it be for FastChat to support it?
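
For context, here is a minimal sketch of what the transformers-side loading might look like. It is not FastChat code. The helper names `build_4bit_kwargs` and `load_model_4bit` are hypothetical, and the sketch assumes a transformers release that accepts `load_in_4bit` in `from_pretrained`, with `bitsandbytes` and `accelerate` installed and a CUDA GPU available. Newer transformers releases may prefer passing a `BitsAndBytesConfig` instead.

```python
def build_4bit_kwargs(device_map="auto"):
    """Keyword arguments that ask transformers for 4-bit weights.

    Hypothetical helper: kept separate so the options can be inspected
    without downloading a model.
    """
    return {
        "load_in_4bit": True,      # quantize weights to 4 bits via bitsandbytes
        "device_map": device_map,  # let accelerate place layers on devices
    }


def load_model_4bit(model_path):
    """Load a causal LM and its tokenizer with 4-bit quantized weights.

    Hypothetical helper: `model_path` is a Hub id or local directory
    supplied by the caller.
    """
    # Imported lazily so the module can be imported without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
    model = AutoModelForCausalLM.from_pretrained(model_path, **build_4bit_kwargs())
    return model, tokenizer


# Example usage (assumption: the model id below is illustrative;
# substitute whichever Vicuna checkpoint you actually use):
# model, tokenizer = load_model_4bit("lmsys/vicuna-13b-v1.3")
```

Supporting this in FastChat would presumably mean exposing a comparable option in its model-loading path; how much work that involves depends on how that path handles its existing quantization and device-placement options.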
