unslothai/unsloth

[Feature] Draft model / speculative decoding

Open

#4753 opened on Apr 1, 2026

Labels: feature request, good first issue, help wanted

Description

Could the UI offer an option to select a draft model for speculative decoding? This seems like an important feature. I wonder how fast Qwen3.5 27B would be with Qwen3.5 0.8B as the draft model.
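For a rough sense of what to expect, here is a back-of-the-envelope sketch using the standard expected-speedup estimate for speculative decoding. Everything in it is an assumption rather than a measurement: `alpha` (the chance the large model accepts a drafted token), `gamma` (tokens drafted per verification step), and `cost_ratio` (draft step time divided by target step time, crudely guessed here from the parameter counts 0.8 / 27). The function name `expected_speedup` is made up for illustration and is not part of Unsloth.

```python
def expected_speedup(alpha: float, gamma: int, cost_ratio: float) -> float:
    """Estimate the speedup of speculative decoding over plain decoding.

    alpha:      assumed probability that the target model accepts a drafted token
    gamma:      number of tokens the draft model proposes per verification step
    cost_ratio: draft-model step time divided by target-model step time
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    if gamma < 1:
        raise ValueError("gamma must be at least 1")
    # Expected tokens produced per verification step (accepted drafts plus
    # the one token the target model always contributes).
    expected_tokens = (1 - alpha ** (gamma + 1)) / (1 - alpha)
    # Cost of one step, in units of a single target-model forward pass.
    cost_per_step = gamma * cost_ratio + 1
    return expected_tokens / cost_per_step


if __name__ == "__main__":
    # Hypothetical cost ratio guessed from parameter counts only.
    ratio = 0.8 / 27
    for alpha in (0.5, 0.7, 0.9):
        print(f"alpha={alpha}: ~{expected_speedup(alpha, gamma=4, cost_ratio=ratio):.2f}x")
```

The real number depends heavily on how often the 0.8B model agrees with the 27B one on a given workload, and on memory bandwidth and batching, so this is only a way to reason about the trade-off, not a prediction.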
