[Feature][Help Wanted]: Add tuning script and config files for Mamba selective_state_update kernel
#33034 opened on Jan 25, 2026
Description
🚀 The feature, motivation and pitch
Background
The following PR introduces an optimization for selective_state_update on Blackwell GPUs:
https://github.com/vllm-project/vllm/pull/32873
The optimization is GPU-specific, and its values are hard-coded in the source.
Fused MoE tuning
Fused MoE offers a more generic tuning mechanism:
Configuration files for fused MoE can be generated using the script benchmarks/kernels/benchmark_moe.py
The benchmark_moe.py script generates JSON files like the following:
vllm/model_executor/layers/fused_moe/configs/E=128,N=1024,device_name=NVIDIA_H200.json
vLLM auto-detects the matching file at runtime and uses the config it contains for optimal performance.
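To make the lookup concrete, here is a minimal sketch of how a file named like the one above could be located and read. It is not vLLM's actual implementation: the helper names `moe_config_file_name` and `load_tuned_config` are hypothetical, and the JSON layout (batch size as key, kernel parameters as value, nearest batch size wins) is an assumption made for illustration.

```python
import json
from pathlib import Path
from typing import Optional


def moe_config_file_name(E: int, N: int, device_name: str) -> str:
    """Build a file name following the pattern shown above.

    Hypothetical helper: spaces in the device name become underscores.
    """
    return f"E={E},N={N},device_name={device_name.replace(' ', '_')}.json"


def load_tuned_config(config_dir: Path, file_name: str,
                      batch_size: int) -> Optional[dict]:
    """Return the tuned entry closest to `batch_size`, or None if no file exists.

    Assumed JSON layout: {"<batch size>": {<kernel parameters>}, ...}.
    """
    path = config_dir / file_name
    if not path.is_file():
        return None  # caller falls back to built-in defaults
    with path.open() as f:
        configs = {int(k): v for k, v in json.load(f).items()}
    nearest = min(configs, key=lambda m: abs(m - batch_size))
    return configs[nearest]
```

Returning `None` when the file is missing lets the caller keep its existing defaults, so a device without a tuned file still works.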
Requirements
The same capability should be added for selective_state_update:
- Support for JSON config files (instead of hard-coded values in the code).
- A new benchmark script (similar to benchmark_moe.py) that can generate the JSON files.
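As a starting point for discussion, the two requirements could be sketched as a small tuning loop that searches a parameter grid and writes the winners to JSON. Everything here is an assumption rather than an existing vLLM API: the function names, the `measure` callback, the tunable names `BLOCK_SIZE_M` and `num_warps`, and the output layout keyed by batch size are all hypothetical.

```python
import itertools
import json
import time
from pathlib import Path
from typing import Callable, Dict, Iterable


def time_config(run_kernel: Callable[[dict], None], config: dict,
                iters: int = 20) -> float:
    """Mean wall-clock seconds per call of `run_kernel(config)`.

    A real GPU benchmark must also synchronize the device before
    reading the clock; that step is omitted in this sketch.
    """
    run_kernel(config)  # warm-up call, e.g. to trigger JIT compilation
    start = time.perf_counter()
    for _ in range(iters):
        run_kernel(config)
    return (time.perf_counter() - start) / iters


def tune(measure: Callable[[dict], float],
         search_space: Dict[str, Iterable[int]]) -> dict:
    """Exhaustively search the grid and return the lowest-cost config."""
    keys = list(search_space)
    candidates = [
        dict(zip(keys, values))
        for values in itertools.product(*(search_space[k] for k in keys))
    ]
    return min(candidates, key=measure)


def save_config(path: Path, best_by_batch: Dict[int, dict]) -> None:
    """Write tuned configs keyed by batch size (JSON keys must be strings)."""
    payload = {str(k): v for k, v in best_by_batch.items()}
    path.write_text(json.dumps(payload, indent=4))
```

A benchmark script would call `tune(lambda cfg: time_config(run_kernel, cfg), space)` once per batch size, where `run_kernel` launches selective_state_update with the candidate parameters, and then pass the results to `save_config`.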
Alternatives
No response
Additional context
Related PR (not merged): https://github.com/vllm-project/vllm/pull/22728