rtk-ai/rtk

perf(tracking): enabled tracking adds ~7ms synchronous SQLite write to every command's hot path

Open

Opened on 2 Jun 2026

Labels: area:performance, enhancement, help wanted, priority:high

Description

Summary

With tracking enabled, every recorded command (read, ls, grep, git, …) spends ~7 ms in TimedExecution::track() on a synchronous SQLite write, while the actual work (read + filter + output) is <0.05 ms. The write dominates per-call latency on fsync-expensive filesystems.

Adjacent but distinct from existing work:

  • #1375 — hook-layer cold-start (~56 ms); explicitly scopes standalone-command latency out.
  • #1987 (open, not merged) and #1176 (open) — both touch this write path but only add ways to disable tracking (tracking.enabled gate / --no-track). Neither addresses why tracking is slow when enabled, which is what this issue is about.

Core claim — reproducible on stock rtk, no source changes

track() is called by read/ls/grep/… but not by --version. Diffing the two isolates the tracking cost from process spawn (stock 0.40.0, best-of-5 × 100):

rtk --version (no track): 0.83 ms
rtk read      (+track)  : 7.97 ms
net track cost          : 7.14 ms
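
The subtraction above (7.97 ms − 0.83 ms = 7.14 ms) can be reproduced with a small timing loop. The following is a minimal sketch, not the harness from the gist: the helper names per_call_ms and net_track_cost_ms are hypothetical, and the file argument passed to rtk read in the usage comment is a placeholder. Spawning from Python adds its own overhead to both measurements, so absolute numbers will be higher than those quoted, but the overhead should largely cancel in the difference.

```python
import subprocess
import time


def per_call_ms(cmd, rounds=5, reps=100):
    """Best-of-`rounds` mean wall time per invocation of `cmd`, in ms."""
    best = float("inf")
    for _ in range(rounds):
        start = time.perf_counter()
        for _ in range(reps):
            # Discard output so terminal I/O does not pollute the timing.
            subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                check=False,
            )
        elapsed_ms = (time.perf_counter() - start) * 1000
        best = min(best, elapsed_ms / reps)
    return best


def net_track_cost_ms(tracked_cmd, untracked_cmd, rounds=5, reps=100):
    """Tracked minus untracked per-call time; process spawn cost cancels out."""
    tracked = per_call_ms(tracked_cmd, rounds, reps)
    untracked = per_call_ms(untracked_cmd, rounds, reps)
    return tracked - untracked


# Usage (assumes rtk is on PATH; "some_file.txt" is a placeholder):
# print(net_track_cost_ms(["rtk", "read", "some_file.txt"], ["rtk", "--version"]))
```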

PRAGMA sweep — requires the patch in the gist

Net track cost (rtk read minus rtk --version) across journal/sync settings, measured via an RTK_JOURNAL/RTK_SYNC environment-variable switch that the patch adds to Tracker::new():

journal_mode  synchronous     net track cost  DB-corruption risk
WAL           FULL (current)  7.26 ms         none
WAL           NORMAL          6.77 ms         none
DELETE        NORMAL          4.53 ms         none
TRUNCATE      NORMAL          4.48 ms         none
MEMORY        OFF             0.29 ms         corruption on crash/power-loss
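
For anyone who wants to check the shape of this sweep without patching rtk, the same open → PRAGMA → INSERT → commit → close cycle can be approximated with Python's stdlib sqlite3. This is a rough sketch under stated assumptions: the table name and columns are invented for illustration and are not rtk's schema, and the timings exclude rtk's own process startup, so only the relative ordering between settings is comparable to the table.

```python
import os
import sqlite3
import tempfile
import time

# (journal_mode, synchronous) pairs from the sweep above.
CONFIGS = [
    ("WAL", "FULL"),
    ("WAL", "NORMAL"),
    ("DELETE", "NORMAL"),
    ("TRUNCATE", "NORMAL"),
    ("MEMORY", "OFF"),  # unsafe: can corrupt the DB on crash/power loss
]


def one_call(path, journal, sync):
    """One per-command cycle: open, set PRAGMAs, insert one row, commit, close."""
    conn = sqlite3.connect(path)
    try:
        conn.execute(f"PRAGMA journal_mode={journal}")
        conn.execute(f"PRAGMA synchronous={sync}")
        # Hypothetical schema, for illustration only.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS commands (cmd TEXT, saved_tokens INTEGER)"
        )
        conn.execute("INSERT INTO commands VALUES (?, ?)", ("read", 123))
        conn.commit()
    finally:
        conn.close()


def sweep(reps=100):
    """Mean ms per open-write-commit cycle for each configuration."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for journal, sync in CONFIGS:
            path = os.path.join(tmp, f"{journal}_{sync}.db")
            one_call(path, journal, sync)  # warm-up: create the file and table
            start = time.perf_counter()
            for _ in range(reps):
                one_call(path, journal, sync)
            results[(journal, sync)] = (time.perf_counter() - start) * 1000 / reps
    return results


# Usage:
# for (journal, sync), ms in sweep().items():
#     print(f"{journal:9} {sync:7} {ms:6.2f} ms")
```

The temporary directory may sit on a different filesystem (for example tmpfs) than the real tracking DB, which would hide most of the fsync cost. Pointing the paths at the filesystem under test is needed for meaningful numbers.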

In-process Instant probing puts ~all of track()'s time in the SQLite connection-open + INSERT-commit path; read/filter/output stay <0.05 ms every run.

Repro harness + patch: https://gist.github.com/coseto6125/19855fc80e113bd5772dbb2fef263077

Notes / caveats

  • Measured on WSL2, where fsync is unusually expensive. On native Linux / APFS the absolute numbers are likely lower — independent reproduction on a native FS would be valuable before treating 7 ms as universal.
  • Per-call cost is independent of DB size (22 MB vs a fresh 52 KB DB → same ~8 ms wall), so it's the per-call open→write→sync cycle, not query/scan cost.
  • The persisted data (token-savings stats) is non-critical and reconstructible — rtk reads none of it for correctness.

Why config/PRAGMA tuning isn't enough

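Under WAL, switching synchronous from FULL to NORMAL saves only about 0.5 ms (7.26 → 6.77 ms, roughly 7%), which is within noise. The only setting that reaches sub-millisecond cost (MEMORY + OFF) risks DB corruption and is unacceptable. The best safe combination (TRUNCATE + NORMAL, 7.26 → 4.48 ms) caps the saving at ~38%.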
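
The percentages follow directly from the sweep table, taking WAL + FULL as the baseline. A short calculation, using only the numbers reported above:

```python
# Net track cost in ms per (journal_mode, synchronous), copied from the sweep.
NET_TRACK_MS = {
    ("WAL", "FULL"): 7.26,
    ("WAL", "NORMAL"): 6.77,
    ("DELETE", "NORMAL"): 4.53,
    ("TRUNCATE", "NORMAL"): 4.48,
    ("MEMORY", "OFF"): 0.29,
}
BASELINE = ("WAL", "FULL")


def reduction_pct(config):
    """Percentage saved relative to the current WAL + FULL setting."""
    base = NET_TRACK_MS[BASELINE]
    return (base - NET_TRACK_MS[config]) / base * 100


for config in NET_TRACK_MS:
    print(f"{config[0]:9} {config[1]:7} {reduction_pct(config):5.1f}%")
# WAL + NORMAL      ->  6.7%
# DELETE + NORMAL   -> 37.6%
# TRUNCATE + NORMAL -> 38.3%
# MEMORY + OFF      -> 96.0% (unsafe)
```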

Suggested direction

The order-of-magnitude win needs an architectural change, not a PRAGMA: keep the synchronous disk write off the hot path — e.g. buffer to an append-only file and batch-import on rtk gain, a background writer thread, or amortize one connection across a session. Since #1987 / #1176 already refactor record() / track(), folding batched/async writes into that work would close this in the same pass.
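
To make the first option concrete, here is a minimal sketch of the append-then-batch-import idea, written in Python for brevity even though rtk itself is Rust. Everything in it is an assumption for illustration: the function names record and batch_import, the JSON-lines buffer format, and the table schema are hypothetical and are not taken from rtk's code. The hot path does a single unsynced append. The cold path (run from something like rtk gain) renames the buffer aside so concurrent appends land in a fresh file, then inserts all rows in one transaction.

```python
import json
import os
import sqlite3


def record(buffer_path, cmd, saved_tokens):
    """Hot path: one small append, no fsync, no SQLite connection."""
    line = json.dumps({"cmd": cmd, "saved_tokens": saved_tokens})
    with open(buffer_path, "a", encoding="utf-8") as f:
        f.write(line + "\n")


def batch_import(buffer_path, db_path):
    """Cold path: move the buffer aside, then insert every row in one commit."""
    staging = buffer_path + ".importing"
    try:
        # Atomic rename: later record() calls start a fresh buffer file.
        os.replace(buffer_path, staging)
    except FileNotFoundError:
        return 0  # nothing buffered since the last import

    with open(staging, encoding="utf-8") as f:
        rows = [json.loads(line) for line in f if line.strip()]

    conn = sqlite3.connect(db_path)
    try:
        with conn:  # single transaction, so a single sync for the whole batch
            conn.execute(
                "CREATE TABLE IF NOT EXISTS commands (cmd TEXT, saved_tokens INTEGER)"
            )
            conn.executemany(
                "INSERT INTO commands VALUES (:cmd, :saved_tokens)", rows
            )
    finally:
        conn.close()

    os.remove(staging)
    return len(rows)
```

The sketch deliberately ignores crash recovery (a leftover staging file, or a torn final line) and cross-process write interleaving. Because the stats are described above as non-critical and reconstructible, dropping a malformed line may be an acceptable trade-off, but that is a design decision for the maintainers.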
