vllm-project/vllm

[RFC]: Limit the use of envvars in vLLM

Open

#25700 opened on Sep 25, 2025

RFC, good first issue

Description

Motivation.

vLLM's envvars are on the verge of getting out of control. Many envvars should instead be configs: the attention backend, the all2all kernel backend, and even a flag that controls the KV cache layout. Envvars are evil because:

  1. Using them is equivalent to using global variables everywhere in the code, which is bad programming practice.
  2. Envvars have no advanced structure such as hierarchy or type checks.

In summary, I think envvars are very easy to add, so people tend to add them as the shortest path to implementing a feature, but this can quickly become unmanageable and make the project very hard to use.
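
To make the two points above concrete, here is a minimal Python sketch of how an untyped envvar can go wrong. The variable name `MYAPP_USE_FAST_PATH` and both helper functions are hypothetical and are not taken from vLLM; the sketch only illustrates that an envvar is a process-wide string that every reader has to parse on its own.

```python
import os

# Hypothetical variable name, used only for illustration (not a vLLM envvar).
ENV_NAME = "MYAPP_USE_FAST_PATH"


def use_fast_path_naive() -> bool:
    # Bug-prone: any non-empty string, including "0" or "false", is truthy.
    return bool(os.environ.get(ENV_NAME, ""))


def use_fast_path_parsed() -> bool:
    # Safer, but every call site must remember to parse the string the same way.
    return os.environ.get(ENV_NAME, "0").strip().lower() in ("1", "true")


# The user intends to disable the feature.
os.environ[ENV_NAME] = "0"

print(use_fast_path_naive())   # True: the string "0" is non-empty
print(use_fast_path_parsed())  # False
```

Both functions read the same global state, so any module can change the behavior of any other module by mutating `os.environ`, and nothing checks the value's type at the point where it is set.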

Proposed Change.

I think we should:

  1. Spend some effort reviewing the current envvars and move many of them to configs.
  2. Set a very strict bar on what can be an envvar and question every new envvar in vLLM.
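
As a rough sketch of what moving an envvar into a config could look like, the Python example below uses plain dataclasses. All class names, field names, and allowed values (`AttentionSettings`, `EngineSettings`, `backend_a`, `layout_x`, and so on) are hypothetical placeholders and do not reflect vLLM's actual config classes; the point is only that a config can offer hierarchy and validation at construction time.

```python
from dataclasses import dataclass, field
from typing import Literal, get_args

# Hypothetical option values, for illustration only.
Backend = Literal["backend_a", "backend_b"]
Layout = Literal["layout_x", "layout_y"]


@dataclass(frozen=True)
class AttentionSettings:
    backend: Backend = "backend_a"
    kv_cache_layout: Layout = "layout_x"

    def __post_init__(self) -> None:
        # Reject invalid values when the config is built, not deep inside a kernel.
        if self.backend not in get_args(Backend):
            raise ValueError(f"unknown backend: {self.backend!r}")
        if self.kv_cache_layout not in get_args(Layout):
            raise ValueError(f"unknown KV cache layout: {self.kv_cache_layout!r}")


@dataclass(frozen=True)
class EngineSettings:
    # Nested settings give the hierarchy that flat envvars lack.
    attention: AttentionSettings = field(default_factory=AttentionSettings)


settings = EngineSettings(attention=AttentionSettings(backend="backend_b"))
print(settings.attention.backend)  # backend_b
```

Because the settings object is passed explicitly and is frozen, a typo such as `AttentionSettings(backend="backnd_b")` raises a `ValueError` immediately instead of silently falling back to a default.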

Feedback Period.

No response

CC List.

No response

Any Other Things.

No response
