[Feature] Unified JIT / Precompilation Cache Directory · sgl-project/sglang#19612


good first issue

Description

Checklist

If this is not a feature request but a general question, please start a discussion at https://github.com/sgl-project/sglang/discussions. Otherwise, it will be closed.
Please use English. Otherwise, it will be closed.

Motivation

Summary

Request: Unify all JIT and precompilation cache paths under a single configurable root so that users and operators can manage cache location, size, and persistence in one place. The current situation is fragmented across multiple env vars and default paths (including /tmp), which makes it hard to reason about where compiled artifacts live and to reuse caches across runs or machines.

Current State (Fragmented)

1. Triton JIT cache

What	Env var / source	Default / behavior
Direct `@triton.jit` (e.g. allocator, MoE kernels)	`TRITON_CACHE_DIR` (Triton runtime)	Triton default: `~/.triton/cache`; often overwritten by Inductor to a path under `/tmp/torchinductor_*`
SGLang override	`SGLANG_TRITON_CACHE_DIR` (if implemented)	e.g. `~/.triton/cache` or `~/.cache/triton`

The SGLang override is applied only in some entry paths (engine, gRPC launcher, scheduler process); custom or PD launchers may not set it.
When PyTorch Inductor runs first, it can set `TRITON_CACHE_DIR` to its own subdir, so later Triton JIT compilation (including the allocator kernels) writes under `/tmp`.

2. PyTorch Inductor (torch.compile)

What	Env var / source	Default / behavior
Inductor cache	`TORCHINDUCTOR_CACHE_DIR` (PyTorch)	`/tmp/torchinductor_<user>` (or similar)
SGLang override	`SGLANG_TORCHINDUCTOR_CACHE_DIR` (if implemented)	e.g. `~/.cache/sglang/inductor`

The default lives in `/tmp`, so the cache is often non-persistent and can conflict with other users' caches on shared nodes.

3. DeepGEMM JIT cache

What	Env var / source	Default / behavior
DeepGEMM cache	`SGLANG_DG_CACHE_DIR` / `DG_JIT_CACHE_DIR`	`~/.cache/deep_gemm`

Set in layers/deep_gemm_wrapper/compile_utils.py; separate from Triton/Inductor.

Problems

No single root: Triton, Inductor, SGLang torch_compile, and DeepGEMM each have their own env var or default path; some write to `/tmp`, others to `~/.cache/...`.
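To make the fragmentation concrete, here is a minimal diagnostic sketch that reports what each cache env var named above is currently set to and whether it points under the system temp directory. The function name `report_cache_env` is hypothetical and not part of SGLang; the script only reads env vars and does not query Triton, PyTorch, or DeepGEMM for their built-in defaults.

```python
import os
import tempfile
from pathlib import Path

# Env vars named in this issue. Whether each one is honored depends on the
# installed Triton / PyTorch / SGLang versions (assumption: names as listed above).
CACHE_ENV_VARS = (
    "TRITON_CACHE_DIR",
    "TORCHINDUCTOR_CACHE_DIR",
    "SGLANG_DG_CACHE_DIR",
    "DG_JIT_CACHE_DIR",
)


def report_cache_env(environ=None):
    """Return {var: (value_or_None, is_under_tmp)} for each cache env var.

    Hypothetical helper for illustration; it inspects only the environment.
    """
    env = os.environ if environ is None else environ
    tmp_root = Path(tempfile.gettempdir()).resolve()
    report = {}
    for name in CACHE_ENV_VARS:
        value = env.get(name)
        under_tmp = False
        if value:
            path = Path(value).expanduser().resolve()
            # True when the path is the temp dir itself or anything below it.
            under_tmp = path == tmp_root or tmp_root in path.parents
        report[name] = (value, under_tmp)
    return report


if __name__ == "__main__":
    for name, (value, under_tmp) in report_cache_env().items():
        status = "unset" if value is None else value
        note = "  (under temp dir, likely non-persistent)" if under_tmp else ""
        print(f"{name}: {status}{note}")
```

An unset variable here means the owning library falls back to its own default, which is exactly the part that is hard to reason about today.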

Proposal

1. Introduce a single JIT cache root

New env var: SGLANG_JIT_CACHE_ROOT (or SGLANG_CACHE_ROOT if we want to align with existing SGLANG_CACHE_ROOT in custom_all_reduce_utils).
Default: ~/.cache/sglang (or $XDG_CACHE_HOME/sglang when set).
Semantics: All JIT/precompilation caches that SGLang controls should live under this root in fixed subdirs.
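The root resolution described above could look roughly like the following sketch. The function name `resolve_jit_cache_root` is hypothetical, and the precedence order (explicit `SGLANG_JIT_CACHE_ROOT`, then `$XDG_CACHE_HOME/sglang`, then `~/.cache/sglang`) is one reading of the proposal rather than a settled design.

```python
import os
from pathlib import Path


def resolve_jit_cache_root(environ=None):
    """Resolve the unified cache root (hypothetical helper, not an SGLang API).

    Assumed precedence:
      1. SGLANG_JIT_CACHE_ROOT, if set and non-empty
      2. $XDG_CACHE_HOME/sglang, if XDG_CACHE_HOME is set and non-empty
      3. ~/.cache/sglang
    """
    env = os.environ if environ is None else environ

    explicit = env.get("SGLANG_JIT_CACHE_ROOT")
    if explicit:
        return Path(explicit).expanduser()

    xdg_cache_home = env.get("XDG_CACHE_HOME")
    if xdg_cache_home:
        return Path(xdg_cache_home).expanduser() / "sglang"

    return Path.home() / ".cache" / "sglang"
```

Passing the environment as a parameter keeps the function easy to test and avoids reading `os.environ` at import time, which matters when launchers set variables late.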

2. Standard layout under the root

Suggested subdirs (all under SGLANG_JIT_CACHE_ROOT):

Subdir	Purpose	Maps from
`triton/`	Triton JIT (direct `@triton.jit` and, when possible, Triton used by Inductor)	`TRITON_CACHE_DIR`
`inductor/`	PyTorch Inductor (torch.compile)	`TORCHINDUCTOR_CACHE_DIR`
`torch_compile/`	SGLang torch.compile cache (hash-based, when using SGLangBackend)	`SGLANG_CACHE_DIR` + `torch_compile_cache`
`deep_gemm/`	DeepGEMM JIT	`SGLANG_DG_CACHE_DIR` / `DG_JIT_CACHE_DIR`

If we keep backward compatibility, existing env vars (SGLANG_TRITON_CACHE_DIR, SGLANG_TORCHINDUCTOR_CACHE_DIR, SGLANG_CACHE_DIR, SGLANG_DG_CACHE_DIR) could override the default subdir path when set; otherwise they are derived as {SGLANG_JIT_CACHE_ROOT}/{subdir}.
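A minimal sketch of that backward-compatible resolution follows. The names `LEGACY_OVERRIDES` and `resolve_cache_dirs` are hypothetical. Appending `torch_compile_cache` to `SGLANG_CACHE_DIR` follows the "Maps from" column above, but whether an override should keep that suffix is an assumption.

```python
import os
from pathlib import Path

# subdir -> (existing override env var, suffix appended to the override or None).
# Assumption: an override replaces the whole subdir path; only the torch_compile
# entry keeps the "torch_compile_cache" suffix from the table above.
LEGACY_OVERRIDES = {
    "triton": ("SGLANG_TRITON_CACHE_DIR", None),
    "inductor": ("SGLANG_TORCHINDUCTOR_CACHE_DIR", None),
    "torch_compile": ("SGLANG_CACHE_DIR", "torch_compile_cache"),
    "deep_gemm": ("SGLANG_DG_CACHE_DIR", None),
}


def resolve_cache_dirs(root, environ=None):
    """Map each subdir name to its effective path (hypothetical helper).

    An existing env var wins when set; otherwise the path is {root}/{subdir}.
    """
    env = os.environ if environ is None else environ
    root = Path(root).expanduser()
    dirs = {}
    for subdir, (override_var, suffix) in LEGACY_OVERRIDES.items():
        override = env.get(override_var)
        if override:
            path = Path(override).expanduser()
            if suffix:
                path = path / suffix
        else:
            path = root / subdir
        dirs[subdir] = path
    return dirs
```

A launcher could then export the resolved paths to `TRITON_CACHE_DIR`, `TORCHINDUCTOR_CACHE_DIR`, and `DG_JIT_CACHE_DIR` before Triton or PyTorch is imported. Doing this in one shared helper would also address the entry paths that currently skip the setup.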

CC @Fridge003 @hnyls2002

Related resources

No response

Contributor guide

Tech stack: Python
Domain: backend, build system, developer experience
Issue type: feature
Difficulty: 3
Estimated time: half day
Activity status: active
Clarity: clear
Prerequisites: Python, Git
Beginner friendliness: 60
Research direction: Implement an environment variable for the unified JIT cache root (e.g. `SGLANG_JIT_CACHE_ROOT`) and update all relevant cache paths (Triton, Inductor, DeepGEMM) to use subdirectories under this root. Preserve backward compatibility by letting the existing environment variables override the default subdirectory paths.