perf(tracking): enabled tracking adds ~7ms synchronous SQLite write to every command's hot path · rtk-ai/rtk#2208

1 comment · 0 reactions · 0 assignees · Rust · 2,914 forks · batch import

Labels: area:performance, enhancement, help wanted, priority:high

Repository metrics

Stars: 48,085
PR merge metrics: average time to merge 8d 17h; 49 PRs merged in the last 30 days

Description

Summary

With tracking enabled, every recorded command (read, ls, grep, git, …) spends ~7 ms in TimedExecution::track() on a synchronous SQLite write, while the actual work (read + filter + output) is <0.05 ms. The write dominates per-call latency on fsync-expensive filesystems.

Adjacent but distinct from existing work:

#1375 — hook-layer cold-start (~56 ms); explicitly scopes standalone-command latency out.
#1987 (open, not merged) and #1176 (open) — both touch this write path but only add ways to disable tracking (tracking.enabled gate / --no-track). Neither addresses why tracking is slow when enabled, which is what this issue is about.

Core claim — reproducible on stock rtk, no source changes

track() is called by read/ls/grep/… but not by --version. Diffing the two isolates the tracking cost from process spawn (stock 0.40.0, best-of-5 × 100):

rtk --version (no track): 0.83 ms
rtk read      (+track)  : 7.97 ms
net track cost          : 7.14 ms
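
For readers who want to reproduce the differential measurement without the gist, the following is a minimal sketch of a "best-of-5 × 100" harness. It is not the author's harness: `best_of`, `net_cost`, and `spawn_once` are hypothetical helpers, and taking the lowest per-call mean across rounds is an assumption about what "best-of" means here.

```rust
use std::process::{Command, Stdio};
use std::time::{Duration, Instant};

/// Runs `f` `iters` times per round for `rounds` rounds and returns the
/// lowest per-call mean observed across rounds.
fn best_of<F: FnMut()>(rounds: u32, iters: u32, mut f: F) -> Duration {
    assert!(rounds > 0 && iters > 0, "rounds and iters must be non-zero");
    let mut best = Duration::MAX;
    for _ in 0..rounds {
        let start = Instant::now();
        for _ in 0..iters {
            f();
        }
        // Mean wall time per call in this round.
        let mean = start.elapsed() / iters;
        best = best.min(mean);
    }
    best
}

/// Net cost of the tracked path: tracked time minus the untracked baseline,
/// clamped at zero in case noise makes the baseline slower.
fn net_cost(tracked: Duration, baseline: Duration) -> Duration {
    tracked.saturating_sub(baseline)
}

/// Spawns a command once, discarding its output and exit status.
/// Spawn failures are ignored, so check that the binary exists first.
fn spawn_once(program: &str, args: &[&str]) {
    let _ = Command::new(program)
        .args(args)
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status();
}
```

Under these assumptions, the baseline would be `best_of(5, 100, || spawn_once("rtk", &["--version"]))` and the tracked run the same call with a `read` invocation on a file of your choice; `net_cost` of the two gives the figure on the third line above.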

PRAGMA sweep — requires the patch in the gist

Net track cost (read − --version) across journal/sync settings, via an RTK_JOURNAL/RTK_SYNC env switch added to Tracker::new():

journal_mode	synchronous	net track cost	DB-corruption risk
WAL	FULL (current)	7.26 ms	none
WAL	NORMAL	6.77 ms	none
DELETE	NORMAL	4.53 ms	none
TRUNCATE	NORMAL	4.48 ms	none
MEMORY	OFF	0.29 ms	corruption on crash/power-loss
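
The actual env switch lives in the gist patch. As an illustration only, here is one way such a switch could turn RTK_JOURNAL / RTK_SYNC into PRAGMA statements. The helper names are hypothetical and nothing here is taken from the rtk source; the accepted values are the journal and synchronous modes documented by SQLite.

```rust
/// Maps an optional journal-mode override to a PRAGMA statement.
/// Only known SQLite journal modes are accepted; anything else is ignored
/// so that a stray environment value cannot inject arbitrary SQL.
fn journal_pragma(value: Option<&str>) -> Option<String> {
    let v = value?.trim().to_ascii_uppercase();
    match v.as_str() {
        "DELETE" | "TRUNCATE" | "PERSIST" | "MEMORY" | "WAL" | "OFF" => {
            Some(format!("PRAGMA journal_mode={};", v))
        }
        _ => None,
    }
}

/// Same idea for the synchronous level.
fn synchronous_pragma(value: Option<&str>) -> Option<String> {
    let v = value?.trim().to_ascii_uppercase();
    match v.as_str() {
        "OFF" | "NORMAL" | "FULL" | "EXTRA" => {
            Some(format!("PRAGMA synchronous={};", v))
        }
        _ => None,
    }
}

/// Reads both overrides from the environment. An empty result means
/// "keep whatever the tracker already configures".
fn pragmas_from_env() -> Vec<String> {
    let journal = std::env::var("RTK_JOURNAL").ok();
    let sync = std::env::var("RTK_SYNC").ok();
    journal_pragma(journal.as_deref())
        .into_iter()
        .chain(synchronous_pragma(sync.as_deref()))
        .collect()
}
```

The returned statements would be executed right after the connection is opened, before the first INSERT.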

In-process Instant probing puts ~all of track()'s time in the SQLite connection-open + INSERT-commit path; read/filter/output stay <0.05 ms every run.
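
A rough sketch of what in-process Instant probing can look like, for anyone repeating the measurement. `Probe` and the phase labels are hypothetical and are not the instrumentation used for the numbers above.

```rust
use std::time::{Duration, Instant};

/// Collects wall-clock timings for named phases of a single call.
struct Probe {
    phases: Vec<(&'static str, Duration)>,
}

impl Probe {
    fn new() -> Self {
        Probe { phases: Vec::new() }
    }

    /// Runs `f`, records how long it took under `label`, and returns its result.
    fn time<T>(&mut self, label: &'static str, f: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let out = f();
        self.phases.push((label, start.elapsed()));
        out
    }

    /// Sum of all recorded phases.
    fn total(&self) -> Duration {
        self.phases.iter().map(|(_, d)| *d).sum()
    }

    /// Fraction of the total spent in phases named `label` (0.0 if nothing
    /// measurable was recorded).
    fn share(&self, label: &str) -> f64 {
        let total = self.total().as_secs_f64();
        if total == 0.0 {
            return 0.0;
        }
        let part: Duration = self
            .phases
            .iter()
            .filter(|(l, _)| *l == label)
            .map(|(_, d)| *d)
            .sum();
        part.as_secs_f64() / total
    }
}
```

Wrapping the connection open, the INSERT commit, and the read/filter/output steps in separate `time` calls, then printing `share` per label to stderr, is enough to see which phase dominates.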

Repro harness + patch: https://gist.github.com/coseto6125/19855fc80e113bd5772dbb2fef263077

Notes / caveats

Measured on WSL2, where fsync is unusually expensive. On native Linux / APFS the absolute numbers are likely lower — independent reproduction on a native FS would be valuable before treating 7 ms as universal.
Per-call cost is independent of DB size (22 MB vs a fresh 52 KB DB → same ~8 ms wall), so it's the per-call open→write→sync cycle, not query/scan cost.
The persisted data (token-savings stats) is non-critical and reconstructible — rtk reads none of it for correctness.

Why config/PRAGMA tuning isn't enough

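Under WAL, switching synchronous from FULL to NORMAL saves only ~0.5 ms (7.26 → 6.77 ms), which is within noise. The only setting that reaches sub-millisecond cost (MEMORY + OFF) risks DB corruption and is unacceptable. The best safe setting (TRUNCATE + NORMAL, 4.48 ms) cuts the net track cost by only ~38%.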
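
The ~38% ceiling follows directly from the sweep table. A small sketch of the arithmetic, using the measured values from the table; the function and constant names are illustrative only:

```rust
/// (journal_mode, synchronous, net track cost in ms, safe against corruption)
const SWEEP: [(&str, &str, f64, bool); 5] = [
    ("WAL", "FULL", 7.26, true), // current setting, used as the baseline
    ("WAL", "NORMAL", 6.77, true),
    ("DELETE", "NORMAL", 4.53, true),
    ("TRUNCATE", "NORMAL", 4.48, true),
    ("MEMORY", "OFF", 0.29, false),
];

/// Percentage by which `candidate_ms` improves on `baseline_ms`.
fn reduction_pct(baseline_ms: f64, candidate_ms: f64) -> f64 {
    (baseline_ms - candidate_ms) / baseline_ms * 100.0
}

/// Journal mode of the fastest row that carries no corruption risk,
/// together with its reduction relative to the first (baseline) row.
fn best_safe() -> (&'static str, f64) {
    let baseline = SWEEP[0].2;
    SWEEP
        .iter()
        .filter(|row| row.3)
        .map(|row| (row.0, reduction_pct(baseline, row.2)))
        .fold(("", f64::MIN), |best, cur| if cur.1 > best.1 { cur } else { best })
}
```

With these inputs the best safe row is TRUNCATE + NORMAL at (7.26 − 4.48) / 7.26 ≈ 38.3%, while WAL + NORMAL yields about 6.7%.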

Suggested direction

The order-of-magnitude win needs an architectural change, not a PRAGMA: keep the synchronous disk write off the hot path — e.g. buffer to an append-only file and batch-import on rtk gain, a background writer thread, or amortize one connection across a session. Since #1987 / #1176 already refactor record() / track(), folding batched/async writes into that work would close this in the same pass.
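
To make the first option concrete, here is a minimal sketch of an append-only spool that is drained in one batch. It uses only the standard library; the file layout, the function names, and the `import` callback (which would wrap a single SQLite transaction) are assumptions for illustration, not rtk's design.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, BufRead, BufReader, Write};
use std::path::Path;

/// Hot path: append one record as a single line. No fsync is issued, so a
/// crash may lose the most recent records. `line` must not contain newlines.
fn append_record(spool: &Path, line: &str) -> io::Result<()> {
    let mut f = OpenOptions::new().create(true).append(true).open(spool)?;
    // Build the whole line first so it is handed to the OS in one write call.
    f.write_all(format!("{}\n", line).as_bytes())
}

/// Cold path (for example when `rtk gain` runs): claim the spool by renaming
/// it, then pass every line to `import` as one batch. Returns the number of
/// records imported.
///
/// Limitation of this sketch: a leftover `.importing` file from a failed
/// import is overwritten by the next rename on Unix and is not retried.
fn drain_spool<F>(spool: &Path, mut import: F) -> io::Result<usize>
where
    F: FnMut(&[String]) -> io::Result<()>,
{
    let claimed = spool.with_extension("importing");
    match fs::rename(spool, &claimed) {
        Ok(()) => {}
        // Nothing was recorded since the last drain.
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(0),
        Err(e) => return Err(e),
    }
    let lines: Vec<String> = BufReader::new(fs::File::open(&claimed)?)
        .lines()
        .collect::<io::Result<_>>()?;
    // One call for the whole batch, e.g. one transaction with many INSERTs.
    import(&lines)?;
    fs::remove_file(&claimed)?;
    Ok(lines.len())
}
```

The trade-off is durability: records written since the last drain can be lost on a crash, which seems tolerable only because the stats are described above as non-critical and reconstructible.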

Contributor guide

Research direction: Profile the synchronous SQLite write in track() and explore batching, asynchronous writes, or an append-only file to move the write off the hot path.
Tech stack: Rust
Domain: CLI
Issue type: Performance
Difficulty: 4
Estimated time: 3-5 days
Activity status: Active
Clarity: Clear
Prerequisites: Rust, SQLite
Beginner accessibility: 30