sgl-project/sglang

[Feature] Profile the update weights from disk API of SGLang Diffusion

Open

Opened on Feb 18, 2026

(3 comments) (0 reactions) (0 assignees) Python (28,442 stars) (6,216 forks)
good first issue

Description

Checklist

Motivation

As we did in this comment:

https://github.com/sgl-project/sglang/pull/18306/#issuecomment-3898841774

We should profile the actual time breakdown of updating weights from disk.

Ideally, a weight update for a 7B model should finish within 1 s (not counting the time to save to disk), as in this write-up: https://github.com/zhaochenyang20/Awesome-ML-SYS-Tutorial/blob/main/sglang/latency-accelerate-for-weight-updates/readme.md
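
For whoever picks this up, a per-stage wall-clock timer is one simple way to get such a breakdown. The sketch below is only an illustration: `StageTimer` is a hypothetical helper, not part of SGLang, and the stage names (`read_from_disk`, `copy_to_device`, `load_into_model`) are assumed placeholders simulated with `time.sleep`, not the actual steps of the update path.

```python
import time
from collections import OrderedDict
from contextlib import contextmanager


class StageTimer:
    """Hypothetical helper: accumulates wall-clock time per named stage."""

    def __init__(self):
        # Insertion-ordered so the report follows execution order.
        self.totals = OrderedDict()

    @contextmanager
    def stage(self, name):
        # Note: for GPU work, synchronize the device before reading the clock
        # (e.g. torch.cuda.synchronize()); otherwise asynchronous kernels
        # are attributed to the wrong stage.
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            self.totals[name] = self.totals.get(name, 0.0) + elapsed

    def report(self):
        # One line per stage: milliseconds and share of the total.
        total = sum(self.totals.values())
        lines = []
        for name, secs in self.totals.items():
            share = 100.0 * secs / total if total > 0 else 0.0
            lines.append(f"{name:<20s} {secs * 1e3:10.2f} ms {share:6.1f}%")
        lines.append(f"{'total':<20s} {total * 1e3:10.2f} ms")
        return "\n".join(lines)


if __name__ == "__main__":
    timer = StageTimer()
    # Placeholder stages; replace the sleeps with the real work being measured.
    with timer.stage("read_from_disk"):
        time.sleep(0.02)
    with timer.stage("copy_to_device"):
        time.sleep(0.01)
    with timer.stage("load_into_model"):
        time.sleep(0.005)
    print(timer.report())
```

Wrapping each step of the update in `timer.stage(...)` and comparing the total against the 1 s target would show which step dominates.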

Related resources

No response
