[Feature Request] Add a ctranslate2 model worker · lm-sys/FastChat#2133

2 comments · 1 reaction · 1 assignee · Python · 4,736 forks

Labels: enhancement, good first issue

Repository metrics

Stars: 38,959
PR merge metrics: no PRs merged in the last 30 days

Description

According to some recent analysis on Twitter, CTranslate2 can serve LLMs a little faster than vLLM and (maybe?) with a small quality increase, at least for Llama 2.

This could either be a model worker that's added directly to FastChat OR a doc with extensive documentation on how to write a custom model worker (with mostly working implementation code) that anyone can use in their own project. The second option might be best, so that people can extend FastChat with custom model workers without having to change the base project too much.

If this is appealing, I can get to it at some point.
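
For illustration, here is a minimal sketch of the generation step such a custom worker might wrap. It is not taken from FastChat or from any existing worker. The function name `generate_with_ctranslate2`, the `TokenizerLike` protocol, the shape of the `params` dict (`prompt`, `temperature`, `max_new_tokens`) and the returned dict are assumptions made for this example, and the top-k value of 40 is arbitrary. The only CTranslate2 call it relies on is `Generator.generate_batch` with its `max_length`, `sampling_topk`, `sampling_temperature` and `include_prompt_in_result` arguments. The generator is passed in, so the sketch itself does not import `ctranslate2`.

```python
from typing import Any, Dict, List, Protocol


class TokenizerLike(Protocol):
    """Hypothetical tokenizer shape; Hugging Face tokenizers expose these methods."""

    def encode(self, text: str) -> List[int]: ...
    def convert_ids_to_tokens(self, ids: List[int]) -> List[str]: ...
    def decode(self, ids: List[int]) -> str: ...


def generate_with_ctranslate2(
    generator: Any, tokenizer: TokenizerLike, params: Dict[str, Any]
) -> Dict[str, Any]:
    """Run one non-streaming completion and return a worker-style result dict.

    `generator` is expected to behave like `ctranslate2.Generator`.
    The `params` keys and the result layout are assumptions for this sketch.
    """
    prompt = params["prompt"]
    max_new_tokens = int(params.get("max_new_tokens", 256))
    temperature = float(params.get("temperature", 1.0))

    # CTranslate2 generators take token strings, not token ids.
    prompt_ids = tokenizer.encode(prompt)
    prompt_tokens = tokenizer.convert_ids_to_tokens(prompt_ids)

    # Greedy decoding when temperature is 0, otherwise top-k sampling.
    greedy = temperature <= 0
    results = generator.generate_batch(
        [prompt_tokens],
        # Used here as the cap on generation length; check the CTranslate2
        # docs for how the prompt is counted against this limit.
        max_length=max_new_tokens,
        sampling_topk=1 if greedy else 40,  # 40 is an arbitrary example value
        sampling_temperature=1.0 if greedy else temperature,
        include_prompt_in_result=False,
    )

    # One prompt in, one hypothesis out: take the first sequence of ids.
    output_ids = list(results[0].sequences_ids[0])
    return {
        "text": tokenizer.decode(output_ids),
        "error_code": 0,
        "usage": {
            "prompt_tokens": len(prompt_ids),
            "completion_tokens": len(output_ids),
            "total_tokens": len(prompt_ids) + len(output_ids),
        },
    }
```

In real use, `generator` would presumably be built with something like `ctranslate2.Generator(model_dir, device="cuda")`, where `model_dir` points to a model already converted to the CTranslate2 format, and `tokenizer` would be the matching Hugging Face tokenizer. Registering the worker with the FastChat controller and streaming partial output are left out of this sketch.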

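Contributor guide

Research direction: Study FastChat's model worker architecture and CTranslate2's inference API. Contribute a new worker or write documentation that includes example code.
Tech stack: Python
Domain: backend
Issue type: feature
Difficulty: 3
Estimated time: 1-2 days
Activity status: recently opened to contributors
Clarity: clear
Prerequisites: Python, FastChat, CTranslate2
Beginner friendliness: 60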
