vllm-project/vllm
[Feature]: Refactor Int8ScaledMMLinearLayerConfig to use QuantKey
Open
#32268 opened on Jan 13, 2026
feature request, good first issue, help wanted
Description
🚀 The feature, motivation and pitch
Replace the boolean configuration fields in Int8ScaledMMLinearLayerConfig with QuantKey objects to provide a more structured, type-safe quantization configuration API.
Ideally we should change this:
@dataclass
class Int8ScaledMMLinearLayerConfig(ScaledMMLinearLayerConfig):
    is_static_input_scheme: bool
    is_channelwise: bool
    input_symmetric: bool
to this:
@dataclass
class Int8ScaledMMLinearLayerConfig(ScaledMMLinearLayerConfig):
    weight_quant_key: QuantKey
    activation_quant_key: QuantKey
    input_symmetric: bool
Parallel work in #27814 has split the configuration into Int8 and FP8 config classes and already uses QuantKey for the FP8 config class.
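To make the intended mapping concrete, here is a minimal, self-contained sketch of how the three booleans could translate into the proposed fields. The `QuantKey` below is a simplified stand-in defined only for this example, not vLLM's actual class, and `LegacyInt8Config` and `from_legacy` are hypothetical names. It also assumes that `is_channelwise` describes the weight scale granularity and that `is_static_input_scheme` describes whether the activation scale is precomputed; the real refactor may map these differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantKey:
    """Simplified stand-in for illustration only, NOT vLLM's real QuantKey."""
    dtype: str         # quantized dtype, e.g. "int8"
    static: bool       # True: scale is precomputed; False: computed at runtime
    granularity: str   # "per_tensor", "per_channel", or "per_token"
    symmetric: bool = True


@dataclass
class LegacyInt8Config:  # hypothetical name for the current boolean layout
    is_static_input_scheme: bool
    is_channelwise: bool
    input_symmetric: bool


@dataclass
class Int8ScaledMMLinearLayerConfig:  # proposed layout from this issue
    weight_quant_key: QuantKey
    activation_quant_key: QuantKey
    input_symmetric: bool


def from_legacy(old: LegacyInt8Config) -> Int8ScaledMMLinearLayerConfig:
    """Hypothetical helper: translate the boolean flags into QuantKeys."""
    # Assumption: weights are quantized offline (static) and symmetric.
    weight_key = QuantKey(
        dtype="int8",
        static=True,
        granularity="per_channel" if old.is_channelwise else "per_tensor",
    )
    # Assumption: static activation scales are per-tensor, dynamic ones per-token.
    act_key = QuantKey(
        dtype="int8",
        static=old.is_static_input_scheme,
        granularity="per_tensor" if old.is_static_input_scheme else "per_token",
        symmetric=old.input_symmetric,
    )
    return Int8ScaledMMLinearLayerConfig(weight_key, act_key, old.input_symmetric)


cfg = from_legacy(LegacyInt8Config(False, True, True))
print(cfg.weight_quant_key.granularity)  # per_channel
print(cfg.activation_quant_key.static)   # False
```

With a key object per tensor, kernel selection can compare whole keys instead of checking several loosely related booleans, which is the type-safety benefit this issue is after.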
Alternatives
No response
Additional context
No response