vllm-project/vllm
[Feature]: Unify MoE "Oracles" with Class Structure
Open
#37753 opened on Mar 21, 2026
feature request, good first issue, help wanted
Description
🚀 The feature, motivation and pitch
We currently have a set of MoE "oracles", which select the right MoE kernel for each model, under:
model_executor/layers/fused_moe/oracle
We have:
- fp8
- nvfp4
- mxfp8
- unquantized
and will soon have mxfp4
Each of these has the following functions:
- select_XX_moe_backend - called by the quantization integration to get the backend
- convert_to_XX_moe_kernel_format - called by the quantization integration to shuffle the weights
- make_XX_moe_quant_config - called by the quantization integration to make the quant config
- make_XX_moe_kernel (e.g. make_fp8_moe_kernel) - called by the quantization integration to construct the kernel
Now that we have the structure standardized, we need to create a generic class that implements this logic. Then, each oracle can inherit from this.
So, we would have the following:
from abc import ABC

class MoEKernelOracle(ABC):
    ...

class Fp8MoEKernelOracle(MoEKernelOracle):
    ...
and so on and so forth
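To make the proposal concrete, here is a minimal sketch of what such a hierarchy could look like. Everything in it is an assumption for illustration: the method names are derived from the function names listed above, while the signatures, the `build` helper, the call order inside it, and the placeholder backend names are hypothetical and do not reflect vLLM's actual API.

```python
from abc import ABC, abstractmethod
from typing import Any


class MoEKernelOracle(ABC):
    """Hypothetical base class: one subclass per quantization scheme."""

    @abstractmethod
    def select_backend(self, layer_cfg: dict[str, Any]) -> str:
        """Stand-in for select_XX_moe_backend."""

    @abstractmethod
    def convert_to_kernel_format(
        self, backend: str, weights: dict[str, Any]
    ) -> dict[str, Any]:
        """Stand-in for convert_to_XX_moe_kernel_format."""

    @abstractmethod
    def make_quant_config(
        self, backend: str, layer_cfg: dict[str, Any]
    ) -> dict[str, Any]:
        """Stand-in for make_XX_moe_quant_config."""

    @abstractmethod
    def make_kernel(self, backend: str, quant_cfg: dict[str, Any]) -> Any:
        """Stand-in for make_fp8_moe_kernel and its per-format equivalents."""

    def build(self, layer_cfg: dict[str, Any], weights: dict[str, Any]):
        # Assumed call order, kept in one place so subclasses cannot drift.
        backend = self.select_backend(layer_cfg)
        converted = self.convert_to_kernel_format(backend, weights)
        quant_cfg = self.make_quant_config(backend, layer_cfg)
        return self.make_kernel(backend, quant_cfg), converted


class Fp8MoEKernelOracle(MoEKernelOracle):
    """Toy fp8 oracle; backend names and logic are placeholders."""

    def select_backend(self, layer_cfg):
        if layer_cfg.get("block_quant"):
            return "placeholder_block"
        return "placeholder_tensor"

    def convert_to_kernel_format(self, backend, weights):
        # Real code would shuffle tensors; here each entry is only tagged.
        return {name: (backend, w) for name, w in weights.items()}

    def make_quant_config(self, backend, layer_cfg):
        return {"dtype": "fp8", "backend": backend}

    def make_kernel(self, backend, quant_cfg):
        return {"kernel_for": backend, "quant_cfg": quant_cfg}
```

With this shape, a quantization integration would only call something like `Fp8MoEKernelOracle().build(layer_cfg, weights)`, and a subclass that forgets one of the four hooks fails at instantiation time with a `TypeError` instead of silently diverging.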
Alternatives
Just rely on conventions. This is a bad idea because the implementations drift apart and code gets duplicated.
Additional context
No response