Investigate the fastllm repository (https://github.com/ztxz16/fastllm) to understand its API and model-loading mechanism. Review the existing backend integrations in FastChat's `fastchat/serve` directory (the `fastchat.serve` package) to see how other inference backends (e.g., vLLM, Text Generation Inference) are wired in. Check the FastChat documentation for how models are registered and how serving endpoints are exposed. Because the issue does not state specific requirements, ask the issue author to clarify the intended use cases, then propose a design.
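
As a starting point for the design discussion, here is a minimal sketch of the kind of adapter a new backend might need. Everything in it is hypothetical: `StreamingBackend`, `EchoBackend`, and `generate_stream` are illustrative names, not FastChat or fastllm APIs, and the chunk framing (a JSON object with `text` and `error_code`, terminated by a NUL byte) is an assumption about how existing workers stream results that should be checked against the code in `fastchat/serve` before relying on it.

```python
import json
from typing import Dict, Iterator, Protocol


class StreamingBackend(Protocol):
    """Hypothetical interface; not part of FastChat or fastllm."""

    def stream(self, prompt: str, max_new_tokens: int) -> Iterator[str]:
        ...


class EchoBackend:
    """Stand-in backend that 'generates' by echoing prompt words back."""

    def stream(self, prompt: str, max_new_tokens: int) -> Iterator[str]:
        for word in prompt.split()[:max_new_tokens]:
            yield word


def generate_stream(backend: StreamingBackend, params: Dict) -> Iterator[bytes]:
    """Yield cumulative-text chunks, each a JSON object ending in a NUL byte.

    The framing is an assumption to verify against the existing workers.
    """
    pieces = []
    max_new_tokens = params.get("max_new_tokens", 16)
    for piece in backend.stream(params["prompt"], max_new_tokens):
        pieces.append(piece)
        chunk = {"text": " ".join(pieces), "error_code": 0}
        yield (json.dumps(chunk) + "\0").encode()


if __name__ == "__main__":
    request = {"prompt": "hello fastllm world", "max_new_tokens": 2}
    for raw in generate_stream(EchoBackend(), request):
        # Strip the trailing NUL delimiter before decoding the JSON payload.
        print(json.loads(raw[:-1].decode()))
```

Keeping the backend behind a small interface like this would let the fastllm-specific loading and generation calls be filled in once the issue author has confirmed the use cases, without changing the streaming code around them.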