perf(tracking): enabled tracking adds ~7ms synchronous SQLite write to every command's hot path
#2208 opened on Jun 2, 2026
Summary
With tracking enabled, every recorded command (read, ls, grep, git, …) spends ~7 ms in TimedExecution::track() on a synchronous SQLite write, while the actual work (read + filter + output) is <0.05 ms. The write dominates per-call latency on fsync-expensive filesystems.
Adjacent but distinct from existing work:
- #1375: hook-layer cold-start (~56 ms); explicitly scopes standalone-command latency out.
- #1987 (open, not merged) and #1176 (open): both touch this write path but only add ways to disable tracking (tracking.enabled gate / --no-track). Neither addresses why tracking is slow when enabled, which is what this issue is about.
Core claim — reproducible on stock rtk, no source changes
track() is called by read/ls/grep/… but not by --version. Diffing the two isolates the tracking cost from process spawn (stock 0.40.0, best-of-5 × 100):
rtk --version (no track): 0.83 ms
rtk read (+track) : 7.97 ms
net track cost : 7.14 ms
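For anyone who wants to reproduce the diff without the gist, the method above can be approximated with a few lines of Python. This is only a sketch, not the harness from the gist: it assumes "best-of-5 × 100" means five rounds of 100 invocations each, keeping the fastest round's mean, and the function names are made up for illustration.

```python
import subprocess
import time


def mean_wall_ms(cmd, runs):
    """Mean wall-clock time per invocation of cmd, in milliseconds."""
    start = time.perf_counter()
    for _ in range(runs):
        # Discard output so terminal I/O does not pollute the timing.
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
    return (time.perf_counter() - start) * 1000 / runs


def best_of(cmd, rounds=5, runs=100):
    """Fastest round out of `rounds`, each round averaging `runs` invocations."""
    return min(mean_wall_ms(cmd, runs) for _ in range(rounds))


def net_track_cost(tracked_cmd, untracked_cmd, rounds=5, runs=100):
    """Tracked minus untracked wall time; process spawn cost cancels out."""
    return (best_of(tracked_cmd, rounds, runs)
            - best_of(untracked_cmd, rounds, runs))


# Example (assumes `rtk` is on PATH and README.md exists in the cwd):
#   net_track_cost(["rtk", "read", "README.md"], ["rtk", "--version"])
```

Spawning through Python adds its own overhead to both measurements, so the absolute numbers will be higher than those above; only the difference between the two commands is meaningful.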
PRAGMA sweep — requires the patch in the gist
Net track cost (read − --version) across journal/sync settings, via an RTK_JOURNAL/RTK_SYNC env switch added to Tracker::new():
| journal_mode | synchronous | net track cost | DB-corruption risk |
|---|---|---|---|
| WAL | FULL (current) | 7.26 ms | none |
| WAL | NORMAL | 6.77 ms | none |
| DELETE | NORMAL | 4.53 ms | none |
| TRUNCATE | NORMAL | 4.48 ms | none |
| MEMORY | OFF | 0.29 ms | corruption on crash/power-loss |
In-process probing with Instant attributes nearly all of track()'s time to the SQLite connection-open + INSERT-commit path; read/filter/output stay <0.05 ms on every run.
Repro harness + patch: https://gist.github.com/coseto6125/19855fc80e113bd5772dbb2fef263077
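The same sweep can be approximated outside rtk with Python's stdlib sqlite3 module, which may help with independent reproduction on other filesystems. This is a rough sketch under assumptions: the table schema is invented, and each call does a full open, INSERT, commit, close cycle to mimic one tracked command. It does not go through Tracker::new() or the RTK_JOURNAL/RTK_SYNC switch.

```python
import os
import sqlite3
import tempfile
import time

# The (journal_mode, synchronous) pairs from the table above.
SETTINGS = [("WAL", "FULL"), ("WAL", "NORMAL"), ("DELETE", "NORMAL"),
            ("TRUNCATE", "NORMAL"), ("MEMORY", "OFF")]


def one_call(db_path, journal, sync):
    """One open -> INSERT -> commit -> close cycle, like one tracked command."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(f"PRAGMA journal_mode={journal}")
        conn.execute(f"PRAGMA synchronous={sync}")
        # Hypothetical schema, not rtk's real one.
        conn.execute("CREATE TABLE IF NOT EXISTS commands (ts REAL, cmd TEXT)")
        conn.execute("INSERT INTO commands VALUES (?, ?)", (time.time(), "read"))
        conn.commit()
    finally:
        conn.close()


def sweep(calls=50, base_dir=None):
    """Mean milliseconds per cycle for each setting, using one DB file each."""
    results = {}
    with tempfile.TemporaryDirectory(dir=base_dir) as tmp:
        for journal, sync in SETTINGS:
            db_path = os.path.join(tmp, f"{journal}_{sync}.db")
            one_call(db_path, journal, sync)  # warm-up: creates file and schema
            start = time.perf_counter()
            for _ in range(calls):
                one_call(db_path, journal, sync)
            elapsed_ms = (time.perf_counter() - start) * 1000
            results[(journal, sync)] = elapsed_ms / calls
    return results
```

Pass base_dir pointing at a directory on the filesystem under test; the default temp directory is often tmpfs, where fsync is nearly free and every row of the sweep will look the same.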
Notes / caveats
- Measured on WSL2, where fsync is unusually expensive. On native Linux / APFS the absolute numbers are likely lower — independent reproduction on a native FS would be valuable before treating 7 ms as universal.
- Per-call cost is independent of DB size (22 MB vs a fresh 52 KB DB → same ~8 ms wall), so it's the per-call open→write→sync cycle, not query/scan cost.
- The persisted data (token-savings stats) is non-critical and reconstructible — rtk reads none of it for correctness.
Why config/PRAGMA tuning isn't enough
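Under WAL, switching to synchronous=NORMAL is within noise (7.26 → 6.77 ms). The only setting that reaches sub-millisecond cost (MEMORY + OFF, 0.29 ms) risks DB corruption on crash or power loss and is unacceptable. Safe PRAGMA changes cap at a ~38% reduction (7.26 → 4.48 ms with TRUNCATE + NORMAL).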
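The percentages follow directly from the sweep table. A quick check, using only the measured values above with the current WAL + FULL setting as the baseline:

```python
# Net track cost in ms, copied from the PRAGMA sweep table.
BASELINE_MS = 7.26  # WAL + FULL (current)
COSTS_MS = {
    ("WAL", "NORMAL"): 6.77,
    ("DELETE", "NORMAL"): 4.53,
    ("TRUNCATE", "NORMAL"): 4.48,
    ("MEMORY", "OFF"): 0.29,
}


def reduction_pct(baseline_ms, cost_ms):
    """Percentage of the baseline cost removed by a setting."""
    return (baseline_ms - cost_ms) / baseline_ms * 100


REDUCTIONS = {k: reduction_pct(BASELINE_MS, v) for k, v in COSTS_MS.items()}

for (journal, sync), pct in REDUCTIONS.items():
    print(f"{journal:8s} {sync:6s} {pct:5.1f}% lower than WAL+FULL")
```

The best crash-safe row (TRUNCATE + NORMAL) comes out at roughly 38%, while WAL + NORMAL saves under 7%.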
Suggested direction
The order-of-magnitude win needs an architectural change, not a PRAGMA: keep the synchronous disk write off the hot path. Options include buffering to an append-only file and batch-importing on rtk gain, handing writes to a background writer thread, or amortizing one connection across a session. Since #1987 / #1176 already refactor record() / track(), folding batched/async writes into that work would close this in the same pass.
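To make the first option concrete, here is a minimal sketch of the append-only buffer idea, written in Python for brevity rather than in rtk's Rust. Everything in it is hypothetical: the function names, the JSON-lines buffer format, and the table schema are illustrative assumptions, not rtk's actual record() / track() code.

```python
import json
import os
import sqlite3
import time


def record(buffer_path, cmd, saved_tokens):
    """Hot path: append one JSON line. No SQLite, no explicit fsync."""
    line = json.dumps({"ts": time.time(), "cmd": cmd,
                       "saved_tokens": saved_tokens})
    with open(buffer_path, "a", encoding="utf-8") as f:
        f.write(line + "\n")


def import_buffer(buffer_path, db_path):
    """Cold path (e.g. run by a stats command): import all buffered rows
    in one transaction, then delete the buffer. Returns rows imported."""
    if not os.path.exists(buffer_path):
        return 0
    rows = []
    with open(buffer_path, encoding="utf-8") as f:
        for line in f:
            try:
                e = json.loads(line)
                rows.append((e["ts"], e["cmd"], e["saved_tokens"]))
            except (ValueError, KeyError, TypeError):
                continue  # skip a torn or malformed line
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # single transaction: one commit for the whole batch
            conn.execute("CREATE TABLE IF NOT EXISTS commands "
                         "(ts REAL, cmd TEXT, saved_tokens INTEGER)")
            conn.executemany("INSERT INTO commands VALUES (?, ?, ?)", rows)
    finally:
        conn.close()
    os.remove(buffer_path)  # only reached after a successful commit
    return len(rows)
```

The per-command cost becomes one buffered append, and the SQLite open and sync happen once per import instead of once per command. The sketch ignores concurrent writers and crash recovery between commit and delete; a real implementation would need file locking or an atomic rename. Since the stats are non-critical and reconstructible, losing a few unsynced lines on power loss seems an acceptable trade.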