vllm-project/vllm
[Feature]: Unify MoE "Oracles" with Class Structure
Open
#37753 opened on Mar 21, 2026
feature request, good first issue, help wanted
Description
🚀 The feature, motivation and pitch
We currently have a set of MoE "oracles", which select the right MoE kernel for each model. They live in:
model_executor/layers/fused_moe/oracle
We have oracles for:
- fp8
- nvfp4
- mxfp8
- unquantized
We will soon have mxfp4 as well.
Each of these has the following functions:
- select_XX_moe_backend: called by the quantization integration to get the backend
- convert_to_XX_moe_kernel_format: called by the quantization integration to shuffle the weights
- make_XX_moe_quant_config: called by the quantization integration to make the quant config
- make_fp8_moe_kernel: called by the quantization integration to construct the kernel
Now that we have the structure standardized, we need to create a generic class that implements this logic. Then, each oracle can inherit from this.
So, we would have the following:
class MoEKernelOracle(ABC):
    ...
class Fp8MoEKernelOracle(MoEKernelOracle):
    ...
and so on for each of the other oracles.
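A minimal sketch of what the base class might look like, assuming the four per-oracle functions listed above become abstract methods. The method names, signatures, the `build` helper, and the placeholder return values are all hypothetical and chosen for illustration; they are not the existing vLLM API.

```python
from abc import ABC, abstractmethod
from typing import Any


class MoEKernelOracle(ABC):
    """Hypothetical base class: one subclass per quantization scheme."""

    @abstractmethod
    def select_backend(self, moe_config: Any) -> str:
        """Pick the backend for this model (mirrors select_XX_moe_backend)."""

    @abstractmethod
    def convert_to_kernel_format(self, backend: str, weights: dict) -> dict:
        """Shuffle weights for the backend (mirrors convert_to_XX_moe_kernel_format)."""

    @abstractmethod
    def make_quant_config(self, backend: str, weights: dict) -> Any:
        """Build the quant config (mirrors make_XX_moe_quant_config)."""

    @abstractmethod
    def make_kernel(self, backend: str, quant_config: Any) -> Any:
        """Construct the kernel from the chosen backend and quant config."""

    def build(self, moe_config: Any, weights: dict) -> Any:
        """Assumed shared driver: runs the four steps in a fixed order."""
        backend = self.select_backend(moe_config)
        weights = self.convert_to_kernel_format(backend, weights)
        quant_config = self.make_quant_config(backend, weights)
        return self.make_kernel(backend, quant_config)


class Fp8MoEKernelOracle(MoEKernelOracle):
    """Placeholder logic only; real selection rules are not shown here."""

    def select_backend(self, moe_config: Any) -> str:
        return "placeholder_fp8_backend"

    def convert_to_kernel_format(self, backend: str, weights: dict) -> dict:
        return dict(weights)  # a real oracle would reorder weights here

    def make_quant_config(self, backend: str, weights: dict) -> Any:
        return {"scheme": "fp8", "backend": backend}

    def make_kernel(self, backend: str, quant_config: Any) -> Any:
        return {"backend": backend, "quant_config": quant_config}


kernel = Fp8MoEKernelOracle().build(moe_config={}, weights={"w13": [1, 2]})
```

With this shape, the quantization integration would only need to hold a `MoEKernelOracle` and call it, rather than importing four scheme-specific functions per oracle.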
Alternatives
Just use conventions. This is a bad idea because the implementations drift apart and code gets duplicated.
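To illustrate the drift concern: with an abstract base class, a subclass that forgets one of the required methods fails as soon as it is instantiated, whereas a naming convention gives no such check. The class and method names below are hypothetical and trimmed to two methods for brevity.

```python
from abc import ABC, abstractmethod


class MoEKernelOracle(ABC):
    """Hypothetical, trimmed-down base class with two required methods."""

    @abstractmethod
    def select_backend(self) -> str: ...

    @abstractmethod
    def make_kernel(self) -> object: ...


class IncompleteOracle(MoEKernelOracle):
    """Implements only one method, simulating an oracle that has drifted."""

    def select_backend(self) -> str:
        return "placeholder_backend"


error = None
try:
    IncompleteOracle()  # make_kernel is missing, so this raises TypeError
except TypeError as exc:
    error = exc
    print(f"rejected at construction time: {exc}")
```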
Additional context
No response
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.