vllm-project/vllm
[Feature]: Integrate RMS+fp4 fused kernel from FlashInfer
Open
#32612, created on January 19, 2026
Labels: feature request, good first issue, help wanted, stale, torch.compile
Description
🚀 The feature, motivation and pitch
Kernel: https://github.com/flashinfer-ai/flashinfer/blob/main/flashinfer/norm.py#L406-L409
This should be integrated into the existing RMS norm + quant fusion pass, using an approach similar to the silu+fp4 fusion in the activation quant fusion pass.
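For anyone picking this up, here is a rough NumPy sketch of the unfused computation that such a fusion would replace: an RMS norm followed by block-wise FP4 (E2M1) quantization. It is only an illustration of the semantics, not the FlashInfer kernel or vLLM's pass. The function names (`rms_norm`, `fake_quant_fp4`, `rms_norm_fp4_reference`), the block size of 16, and the plain float per-block scale are all assumptions made for this example; the real kernel's scale format and output layout (packed FP4, swizzled scales, global scale) may well differ.

```python
import numpy as np

# Magnitudes representable in the FP4 E2M1 format.
E2M1_VALUES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])


def rms_norm(x, weight, eps=1e-6):
    """RMS norm over the last dimension, scaled by a learned weight."""
    variance = np.mean(x * x, axis=-1, keepdims=True)
    return x / np.sqrt(variance + eps) * weight


def fake_quant_fp4(y, block_size=16):
    """Round each block to the nearest E2M1 value (hypothetical helper).

    Returns unpacked float values on the E2M1 grid plus one float scale
    per block. A real kernel would pack two FP4 values per byte and
    store the scales in a low-precision format; this sketch does not.
    """
    rows, cols = y.shape
    blocks = y.reshape(rows, cols // block_size, block_size)
    amax = np.abs(blocks).max(axis=-1, keepdims=True)
    # Map the largest magnitude in each block to 6.0, the E2M1 maximum.
    scale = np.where(amax > 0, amax / 6.0, 1.0)
    scaled = blocks / scale
    # Pick the nearest representable magnitude, then restore the sign.
    idx = np.abs(np.abs(scaled)[..., None] - E2M1_VALUES).argmin(axis=-1)
    q = np.sign(scaled) * E2M1_VALUES[idx]
    return q.reshape(rows, cols), scale.squeeze(-1)


def rms_norm_fp4_reference(x, weight, eps=1e-6, block_size=16):
    """The two-op pattern (norm, then quant) that a fused kernel covers."""
    return fake_quant_fp4(rms_norm(x, weight, eps), block_size)


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 32))
weight = np.ones(32)
q, scale = rms_norm_fp4_reference(x, weight)
```

A reference like this could serve as a numerical baseline when testing that the fused replacement matches the unfused pattern within quantization tolerance.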
Alternatives
No response
Additional context
No response
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.