lm-sys/FastChat
[Feature request] Support loading GGUF and GGML model format
Open
#2410 opened on Sep 13, 2023
5 comments | 7 reactions | 0 assignees | Python | 38,959 stars | 4,736 forks
good first issue
Description
This issue does not include a description.
Contributor guide
- Tech stack: python
- Domain: machine learning
- Issue type: feature
- Difficulty: 3 (estimated implementation difficulty for a new contributor, from 1 for very small changes to 5 for expert-level work)
- Estimated time: 3-5 days (a rough time range for an experienced contributor to investigate, implement, test, and prepare a pull request)
- Activity status: fresh (how available the issue appears right now: fresh, active, stale, blocked, or waiting on maintainer input)
- Clarity: clear (how clearly the issue explains the expected change, acceptance criteria, and next step)
- Prerequisites: Python; familiarity with the FastChat codebase; understanding of LLM model formats
- Newbie friendliness: 40 (a 1-100 score estimating how approachable this issue is for first-time contributors)
- Research direction: The issue requests support for loading GGUF and GGML model formats. Investigate the current model loading code in FastChat, particularly within the `fastchat/model` directory and any existing model adapter classes. Check the issue comments for any additional context or suggested approaches. Consider looking at how other projects like llama.cpp or transformers handle these formats to guide the implementation.
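As a possible starting point for the investigation, a loader first has to recognize that a path points at a GGUF file. The sketch below relies only on the GGUF container layout (a 4-byte `GGUF` magic followed by a little-endian 32-bit version). The function name `read_gguf_version` is hypothetical and is not part of FastChat. Legacy GGML files and the wiring into FastChat's model adapters are not covered here.

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def read_gguf_version(path):
    """Return the GGUF version number, or None if the file is not GGUF.

    Hypothetical helper for illustration; it only inspects the 8-byte
    header and does not parse metadata or tensors.
    """
    with open(path, "rb") as f:
        header = f.read(8)
    # Too short to hold magic + version, or magic mismatch: not GGUF.
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return None
    # The version is stored as an unsigned 32-bit little-endian integer.
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

A check like this could let an adapter decide early whether to hand the file to a GGUF-capable backend or fall back to the existing loading path; which backend to use is left open by the issue.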