rtk-ai/rtk

perf(tracking): enabled tracking adds ~7ms synchronous SQLite write to every command's hot path

Open

#2208, opened on June 2, 2026

Labels: area:performance, enhancement, help wanted, priority:high

Description

Summary

With tracking enabled, every recorded command (read, ls, grep, git, …) spends ~7 ms in TimedExecution::track() on a synchronous SQLite write, while the actual work (read + filter + output) is <0.05 ms. The write dominates per-call latency on fsync-expensive filesystems.

Adjacent but distinct from existing work:

  • #1375 — hook-layer cold-start (~56 ms); explicitly scopes standalone-command latency out.
  • #1987 (open, not merged) and #1176 (open) — both touch this write path but only add ways to disable tracking (tracking.enabled gate / --no-track). Neither addresses why tracking is slow when enabled, which is what this issue is about.

Core claim — reproducible on stock rtk, no source changes

track() is called by read/ls/grep/… but not by --version. Diffing the two isolates the tracking cost from process spawn (stock 0.40.0, best-of-5 × 100):

rtk --version (no track): 0.83 ms
rtk read      (+track)  : 7.97 ms
net track cost          : 7.14 ms
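The measurement above can be sketched as a small timing helper. This is a minimal illustration, not the harness from the gist: it reads "best-of-5 × 100" as five rounds of 100 calls each, keeping the lowest per-call mean. That reading and the helper names `best_of_mean` and `net_track_cost_ms` are assumptions.

```rust
use std::time::{Duration, Instant};

/// Runs `f` `iters` times per round for `rounds` rounds and returns the
/// lowest per-call mean observed across rounds.
/// (Hypothetical helper; assumes "best-of-5 x 100" means 5 rounds of 100 calls.)
fn best_of_mean<F: FnMut()>(rounds: u32, iters: u32, mut f: F) -> Duration {
    assert!(rounds > 0 && iters > 0, "rounds and iters must be non-zero");
    let mut best = Duration::MAX;
    for _ in 0..rounds {
        let start = Instant::now();
        for _ in 0..iters {
            f();
        }
        // Mean wall time per call in this round.
        let mean = start.elapsed() / iters;
        best = best.min(mean);
    }
    best
}

/// Net tracking cost in ms: tracked command minus the untracked baseline.
fn net_track_cost_ms(tracked_ms: f64, baseline_ms: f64) -> f64 {
    tracked_ms - baseline_ms
}
```

In a real run, `f` would be a closure that spawns the command under test through `std::process::Command`, once for `rtk --version` and once for `rtk read`. Subtracting the two results cancels the process-spawn cost common to both.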

PRAGMA sweep — requires the patch in the gist

Net track cost (rtk read minus rtk --version) across journal/sync settings, via an RTK_JOURNAL/RTK_SYNC env switch added to Tracker::new():

journal_mode  synchronous     net track cost  DB-corruption risk
WAL           FULL (current)  7.26 ms         none
WAL           NORMAL          6.77 ms         none
DELETE        NORMAL          4.53 ms         none
TRUNCATE      NORMAL          4.48 ms         none
MEMORY        OFF             0.29 ms         corruption on crash/power-loss
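For readers who want to picture the env switch without opening the gist, here is a rough sketch of how two environment values could map to PRAGMA statements. It is not the patch itself: the function name `pragma_statements`, the whitelist, and the error handling are assumptions. Only the modes in the sweep are accepted, and the defaults follow the row marked "current".

```rust
/// Builds the PRAGMA statements for a journal/sync override.
/// `journal` and `sync` stand in for the values of RTK_JOURNAL / RTK_SYNC
/// (e.g. `std::env::var("RTK_JOURNAL").ok().as_deref()`); `None` falls back
/// to the settings the issue reports as current (WAL / FULL).
/// Hypothetical helper, not the code from the gist.
fn pragma_statements(journal: Option<&str>, sync: Option<&str>) -> Result<Vec<String>, String> {
    // Whitelist the modes used in the sweep so arbitrary text from the
    // environment can never end up inside a SQL statement.
    const JOURNAL_MODES: [&str; 4] = ["WAL", "DELETE", "TRUNCATE", "MEMORY"];
    const SYNC_MODES: [&str; 3] = ["FULL", "NORMAL", "OFF"];

    let journal = journal.unwrap_or("WAL").to_ascii_uppercase();
    let sync = sync.unwrap_or("FULL").to_ascii_uppercase();

    if !JOURNAL_MODES.contains(&journal.as_str()) {
        return Err(format!("unsupported journal_mode: {}", journal));
    }
    if !SYNC_MODES.contains(&sync.as_str()) {
        return Err(format!("unsupported synchronous mode: {}", sync));
    }

    Ok(vec![
        format!("PRAGMA journal_mode={};", journal),
        format!("PRAGMA synchronous={};", sync),
    ])
}
```

The returned strings would be executed on the connection right after it is opened. How `Tracker::new()` does this in the actual patch is not shown here.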

In-process Instant probing puts nearly all of track()'s time in the SQLite connection-open + INSERT-commit path; read/filter/output stay <0.05 ms on every run.
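The kind of probing described above could look like the following sketch, which wraps each phase in a closure and records its wall time with `std::time::Instant`. `PhaseTimer` and the phase labels are hypothetical; the instrumentation actually used for this issue may differ.

```rust
use std::time::{Duration, Instant};

/// Collects wall-clock durations for labeled phases of one command run.
/// (Hypothetical probe type, for illustration only.)
struct PhaseTimer {
    phases: Vec<(&'static str, Duration)>,
}

impl PhaseTimer {
    fn new() -> Self {
        PhaseTimer { phases: Vec::new() }
    }

    /// Runs `f`, records how long it took under `label`, and returns its result.
    fn run<T>(&mut self, label: &'static str, f: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let out = f();
        self.phases.push((label, start.elapsed()));
        out
    }

    /// Returns the phase that took the longest, if any phase was recorded.
    fn slowest(&self) -> Option<(&'static str, Duration)> {
        self.phases.iter().copied().max_by_key(|&(_, d)| d)
    }
}
```

Wrapping the read, filter, output, and track steps in separate `run` calls and printing `phases` to stderr is enough to see which step dominates.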

Repro harness + patch: https://gist.github.com/coseto6125/19855fc80e113bd5772dbb2fef263077

Notes / caveats

  • Measured on WSL2, where fsync is unusually expensive. On native Linux / APFS the absolute numbers are likely lower — independent reproduction on a native FS would be valuable before treating 7 ms as universal.
  • Per-call cost is independent of DB size (22 MB vs a fresh 52 KB DB → same ~8 ms wall), so it's the per-call open→write→sync cycle, not query/scan cost.
  • The persisted data (token-savings stats) is non-critical and reconstructible — rtk reads none of it for correctness.

Why config/PRAGMA tuning isn't enough

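Switching to synchronous=NORMAL alone (7.26 → 6.77 ms, about 7%) is within noise. The only setting that reaches sub-millisecond cost (MEMORY + OFF) risks DB corruption and is unacceptable. The safe PRAGMA changes top out at a ~38% reduction (TRUNCATE + NORMAL, 7.26 → 4.48 ms).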
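The percentages follow directly from the sweep numbers. The sketch below recomputes them, using WAL + FULL as the baseline; the constant and function names are illustrative.

```rust
/// Sweep results from the issue:
/// (journal_mode, synchronous, net track cost in ms, safe against corruption)
const SWEEP: [(&str, &str, f64, bool); 5] = [
    ("WAL", "FULL", 7.26, true), // current setting, used as the baseline
    ("WAL", "NORMAL", 6.77, true),
    ("DELETE", "NORMAL", 4.53, true),
    ("TRUNCATE", "NORMAL", 4.48, true),
    ("MEMORY", "OFF", 0.29, false),
];

/// Percentage reduction of `value_ms` relative to `baseline_ms`.
fn reduction_pct(baseline_ms: f64, value_ms: f64) -> f64 {
    (baseline_ms - value_ms) / baseline_ms * 100.0
}

/// Largest reduction achievable among the settings with no corruption risk.
fn best_safe_reduction_pct() -> f64 {
    let baseline = SWEEP[0].2;
    SWEEP
        .iter()
        .filter(|row| row.3)
        .map(|row| reduction_pct(baseline, row.2))
        .fold(0.0, f64::max)
}
```

`best_safe_reduction_pct()` comes out at about 38.3% (TRUNCATE + NORMAL), while `reduction_pct(7.26, 0.29)` for MEMORY + OFF is about 96%. That gap is why the unsafe setting is the only one that looks like an order-of-magnitude win.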

Suggested direction

The order-of-magnitude win needs an architectural change, not a PRAGMA: keep the synchronous disk write off the hot path, e.g. buffer to an append-only file and batch-import on rtk gain, use a background writer thread, or amortize one connection across a session. Since #1987 / #1176 already refactor record() / track(), folding batched/async writes into that work would close this in the same pass.
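As one possible shape for the append-only option, the sketch below appends each record as a line on the hot path and lets a later command claim the file and import everything in one batch. It is a std-only illustration under stated assumptions: the function names, the one-line-per-record format, and the ".importing" rename are hypothetical, and records are assumed to contain no newlines.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, BufRead, BufReader, Write};
use std::path::Path;

/// Hot path: append one record as a single line. No fsync, no SQLite.
/// (Hypothetical helper; assumes `record` contains no newline.)
fn append_record(log: &Path, record: &str) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(log)?;
    // Build the whole line first so a small record goes out in one write.
    file.write_all(format!("{}\n", record).as_bytes())
}

/// Cold path: claim the pending log by renaming it, then return its records
/// so the caller can insert them in a single database transaction.
fn drain_records(log: &Path) -> io::Result<Vec<String>> {
    let claimed = log.with_extension("importing");
    match fs::rename(log, &claimed) {
        Ok(()) => {}
        // Nothing was recorded since the last drain.
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(Vec::new()),
        Err(e) => return Err(e),
    }
    let records = BufReader::new(fs::File::open(&claimed)?)
        .lines()
        .collect::<io::Result<Vec<String>>>()?;
    // Simplification: a real importer would delete the claimed file only
    // after the batch has been committed to the database.
    fs::remove_file(&claimed)?;
    Ok(records)
}
```

With this split, the per-command cost is one buffered append, and the SQLite open, write, and sync cycle is paid once per batch, for example when rtk gain runs. Since the stats are described as non-critical and reconstructible, losing the last few unsynced lines on a crash is an acceptable trade-off in this sketch.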
