vllm-project/vllm-ascend
[Misc]: Discussion on accuracy variance
Open
#6623 opened on Feb 9, 2026
Labels: help wanted, wait-feedback
Description
Anything you want to discuss about vLLM on Ascend.
Because of batch variance, we cannot guarantee that the same input will yield the same output when it is served as part of a multi-request batch. The resulting accuracy variance is more pronounced on certain datasets.
| model | dataset | accuracy range | accuracy variance (max - min) |
|---|---|---|---|
| deepseek-v3.1 | GPQA | 74~82 | 8 |
| qwen3-235b | GPQA | 64~71 | 7 |
| qwen3-480b | GPQA | 60~67 | 7 |
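
One commonly cited source of batch variance is that floating-point addition is not associative: if a kernel changes how it splits a reduction depending on batch size, the same inputs can produce slightly different sums. Whether this is what happens here has not been confirmed. The sketch below only illustrates the effect in plain Python, and `reduce_left` and `reduce_split` are hypothetical helper names, not vLLM or Ascend APIs.

```python
def reduce_left(values):
    """Add values strictly left to right, like a single sequential reduction."""
    total = 0.0
    for v in values:
        total += v
    return total


def reduce_split(values, split_at):
    """Reduce two halves independently, then add the partial sums.

    This mimics a kernel that splits a reduction differently
    (for example, when the batch size changes).
    """
    return reduce_left(values[:split_at]) + reduce_left(values[split_at:])


values = [0.1, 0.2, 0.3]
sequential = reduce_left(values)       # (0.1 + 0.2) + 0.3
regrouped = reduce_split(values, 1)    # 0.1 + (0.2 + 0.3)

print(sequential == regrouped)         # False
print(abs(sequential - regrouped))     # tiny, but not zero
```

A difference this small can still change which token has the highest logit when two candidates are nearly tied, and the generations can diverge from that point on.
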
We need to run these cases on GPU, or with another inference engine such as SGLang, to check whether this is an accuracy bug.
If GPU shows the same accuracy variance as we do, we believe the variance is reasonable for this dataset. Otherwise, we need to treat it as an accuracy bug and fix it.
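
The comparison could be sketched as follows, assuming per-run accuracies are collected for each backend. The helper names and the `tolerance` threshold are assumptions made for illustration, and the GPU numbers are placeholders rather than measurements.

```python
def spread(accuracies):
    """Return max - min over per-run accuracies (the variance column in the table)."""
    if not accuracies:
        raise ValueError("need at least one run")
    return max(accuracies) - min(accuracies)


def spreads_comparable(npu_runs, gpu_runs, tolerance=2.0):
    """True if the two backends' spreads differ by at most `tolerance` points.

    `tolerance` is an arbitrary illustrative threshold, not an agreed criterion.
    """
    return abs(spread(npu_runs) - spread(gpu_runs)) <= tolerance


# NPU endpoints come from the deepseek-v3.1 row of the table.
npu_runs = [74.0, 82.0]
# Placeholder GPU values, NOT real measurements.
gpu_runs = [75.0, 81.0]

print(spread(npu_runs))                        # 8.0
print(spreads_comparable(npu_runs, gpu_runs))  # True
```

With only a handful of runs per backend the max - min spread is itself noisy, so more runs on both sides would make the comparison more trustworthy.
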