sgl-project/sglang

[Feature] Enable EPLB in Draft Models

Open

#7893 opened on Jul 9, 2025

good first issue

Description

Checklist

Motivation

Currently, EPLB is not supported in draft models, which constrains the parallelism size. For example, with EPLB we can set the EP size to 72 or 144 for the DeepSeek model, since it then has 288 experts in total. However, these parallelism settings cannot be used when MTP is enabled, because the draft model has only 256 experts, and 256 is not divisible by 72 or 144.
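To make the arithmetic concrete, here is a minimal sketch of the divisibility check. It assumes that expert parallelism needs the expert count to split evenly across EP ranks. The helper `experts_per_rank` is hypothetical and is not part of sglang; the expert counts are the ones quoted in this issue.

```python
from typing import Optional


def experts_per_rank(num_experts: int, ep_size: int) -> Optional[int]:
    """Return the number of experts per EP rank, or None if the split is uneven.

    Assumption (not taken from the sglang code base): an EP size is usable
    only when it divides the expert count exactly.
    """
    if num_experts % ep_size != 0:
        return None
    return num_experts // ep_size


# Expert counts as stated in the issue text.
MAIN_MODEL_EXPERTS = 288   # main model with EPLB
DRAFT_MODEL_EXPERTS = 256  # draft (MTP) model without EPLB

for ep_size in (72, 144):
    main = experts_per_rank(MAIN_MODEL_EXPERTS, ep_size)
    draft = experts_per_rank(DRAFT_MODEL_EXPERTS, ep_size)
    print(f"EP={ep_size}: main={main}, draft={draft}")
```

Under this assumption, both EP sizes work for the 288-expert main model but neither works for the 256-expert draft model, which is the mismatch this feature request aims to remove.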

Related resources

No response
