Feature Request: Mapping model name to LoRA config · ggml-org/llama.cpp#11031

6 comments · 2 reactions · 0 assignees · C++ · 110,169 stars · 18,202 forks

Labels: enhancement, good first issue, server

Description

Prerequisites

I am running the latest code. Mention the version if possible as well.
I carefully followed the README.md.
I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

I came across this idea while working on #10994.

The idea is that the server maintains a table mapping each model name to a LoRA config, i.e. a scale for every loaded adapter. For example:

{
    "llama-base":               [{"id": 0, "scale": 0.0}, {"id": 1, "scale": 0.0}],
    "llama-story":              [{"id": 0, "scale": 1.0}, {"id": 1, "scale": 0.0}],
    "llama-abliteration":       [{"id": 0, "scale": 0.0}, {"id": 1, "scale": 1.0}],
    "llama-story-abliteration": [{"id": 0, "scale": 0.5}, {"id": 1, "scale": 0.5}]
}
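One way such a table could be held in memory is sketched below. This is only an illustration: `LoraScale`, `LoraPreset`, `lora_presets` and `find_lora_preset` are hypothetical names that do not come from llama.cpp, and the table is hard-coded here rather than parsed from JSON.

```cpp
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical types for illustration; not taken from llama.cpp.
struct LoraScale {
    int   id;     // index of a loaded LoRA adapter
    float scale;  // strength applied to that adapter
};

using LoraPreset = std::vector<LoraScale>;

// Model name -> LoRA preset table mirroring the JSON example above.
inline const std::unordered_map<std::string, LoraPreset> & lora_presets() {
    static const std::unordered_map<std::string, LoraPreset> presets = {
        {"llama-base",               {{0, 0.0f}, {1, 0.0f}}},
        {"llama-story",              {{0, 1.0f}, {1, 0.0f}}},
        {"llama-abliteration",       {{0, 0.0f}, {1, 1.0f}}},
        {"llama-story-abliteration", {{0, 0.5f}, {1, 0.5f}}},
    };
    return presets;
}

// Returns the preset registered for a model name, or std::nullopt if the
// name is not in the table.
inline std::optional<LoraPreset> find_lora_preset(const std::string & model) {
    const auto & presets = lora_presets();
    const auto it = presets.find(model);
    if (it == presets.end()) {
        return std::nullopt;
    }
    return it->second;
}
```

Returning `std::optional` leaves the policy for unknown names (reject the request, or fall back to a default preset) to the caller.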

A user can then switch between these configurations by setting the `model` field in the request. For example:

# first user:
{
    "model": "llama-story-abliteration",
    "messages": [
        {"role": "user", "content": "Write a NSFW story"}
    ]
}

# second user:
{
    "model": "llama-base",
    "messages": [
        {"role": "user", "content": "Is this NSFW?"}
    ]
}
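The per-request step could look roughly like the sketch below: look up the requested `model` name, then produce one scale per loaded adapter. The function `resolve_request_scales` and its types are hypothetical, not an existing llama.cpp API. Rejecting unknown names and out-of-range adapter ids with an exception is an assumption about the desired behavior, since the issue does not specify it.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical types for illustration; not taken from llama.cpp.
struct LoraScale {
    int   id;     // index of a loaded LoRA adapter
    float scale;  // strength applied to that adapter
};

using LoraPreset = std::vector<LoraScale>;

// Computes the scale of every loaded adapter for a single request.
// Adapters not mentioned in the preset stay disabled (scale 0).
std::vector<float> resolve_request_scales(
        const std::unordered_map<std::string, LoraPreset> & presets,
        const std::string & model,
        std::size_t n_adapters) {
    const auto it = presets.find(model);
    if (it == presets.end()) {
        // Assumption: unknown model names are treated as an error.
        throw std::invalid_argument("unknown model: " + model);
    }

    std::vector<float> scales(n_adapters, 0.0f);
    for (const LoraScale & entry : it->second) {
        if (entry.id < 0 || static_cast<std::size_t>(entry.id) >= n_adapters) {
            throw std::out_of_range("invalid LoRA adapter id in preset for " + model);
        }
        scales[static_cast<std::size_t>(entry.id)] = entry.scale;
    }
    return scales;
}
```

Because the scales are resolved per request instead of being stored globally, the two users in the example above could be served by the same server process with different adapter settings.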

Motivation

N/A

Possible Implementation

No response

Contributor Guide

Tech stack: C++, REST API
Domain: backend, API, machine learning
Issue type: feature
Difficulty: 3
Estimated time: 3-5 days
Activity status: fresh
Clarity: clear
Prerequisites: Experience with the llama.cpp server; understanding of LoRA configs
Beginner friendliness: 25
Research direction: Research the current LoRA adapter loading mechanism in llama.cpp's server code. Review the implementation in the server (likely in `examples/server/server.cpp` or similar) that handles model loading. Also examine PR #10994 for context on potential integration points. Investigate how to map model names to LoRA configurations and modify the server's API to accept a model field.