sgl-project/sglang

[Feature] Enable EPLB in Draft Models

Open

#7893, created on July 9, 2025

6 comments, 0 reactions, 1 assignee
good first issue


Description

Checklist

Motivation

Currently, EPLB is not supported in draft models, which constrains the parallelism size. For example, with EPLB, we can set the EP size to 72 or 144 for the DeepSeek model, since it then has 288 experts in total. However, these parallelism settings cannot be adopted when MTP is enabled, because the draft model has 256 experts, and 256 is not divisible by 72 or 144.
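The arithmetic behind this constraint can be sketched as follows. This is only an illustration: it assumes the sole requirement is that the expert count divides evenly across EP ranks, and `experts_per_rank` is a hypothetical helper, not part of SGLang.

```python
from typing import Optional


def experts_per_rank(num_experts: int, ep_size: int) -> Optional[int]:
    """Return the number of experts per EP rank, or None if the split is uneven.

    Hypothetical helper: assumes an even split is the only requirement.
    """
    if ep_size <= 0:
        raise ValueError("ep_size must be positive")
    if num_experts % ep_size != 0:
        return None
    return num_experts // ep_size


# Expert counts quoted in this issue.
TARGET_EXPERTS_WITH_EPLB = 288  # target model with EPLB enabled
DRAFT_EXPERTS = 256             # MTP draft model, no EPLB support

for ep_size in (72, 144):
    target = experts_per_rank(TARGET_EXPERTS_WITH_EPLB, ep_size)
    draft = experts_per_rank(DRAFT_EXPERTS, ep_size)
    print(f"EP={ep_size}: target={target}, draft={draft}")
# EP=72: target=4, draft=None
# EP=144: target=2, draft=None
```

Under that assumption, both EP sizes work for the 288-expert target model but not for the 256-expert draft model, which is why the draft model would also need EPLB.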

Related resources

No response
