`rtk discover` reports false-positive missed savings for hook-rewritten commands · rtk-ai/rtk#1441


Labels: area:cli, area:docs, bug, documentation, effort-small, good first issue, help wanted, priority:high


Description

Summary

rtk discover reads the pre-hook command text from Claude Code JSONL transcripts and classifies every non-rtk prefix as "missed". The ~/.claude/hooks/rtk-rewrite.sh hook rewrites most of these at runtime (e.g. grep -n … → rtk grep -n …), but Claude Code never writes the post-hook command back to the transcript. Result: commands that actually ran through RTK get counted as missed, the adoption percentage is artificially deflated, and the user-facing doc actively misdiagnoses the cause.

The sibling command rtk session already solves this correctly in src/analytics/session_cmd.rs:28-51. rtk discover is the outlier.

Reproduction

On a machine with rtk ≥ 0.23.0 and the official hook installed, pick any recent Claude Code session JSONL and locate a bash tool_use whose command starts with grep -n (or any other hook-rewritten command).

CMD (from tool_use.input.command):
  grep -n "^version" crates/ccboard-core/Cargo.toml ...

tool_result.content (93 chars):
  1 matches in 1F:

  [file] crates/ccboard-core/Cargo.toml (1):
       4: version.workspace = true

The result is in the rtk grep filter format ([file] header, "X matches in YF:" summary). Raw grep would have produced path:line:content. The hook rewrote the command; the transcript still logs the original.
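The format difference described above can be checked mechanically. The following is a minimal sketch, assuming only the two markers visible in the sample (`[file]` header and the "X matches in YF:" summary); the function name `looks_like_rtk_grep_output` is hypothetical and is not part of rtk.

```rust
/// Heuristic check: does a tool_result body look like `rtk grep` filter output?
/// Assumption: only the two markers visible in the sample above are used;
/// the real filter format may have more variants than this sketch knows about.
fn looks_like_rtk_grep_output(content: &str) -> bool {
    // Summary line, e.g. "1 matches in 1F:"
    let has_summary = content.lines().any(|line| {
        let line = line.trim();
        line.contains(" matches in ") && line.ends_with("F:")
    });
    // Per-file header, e.g. "[file] crates/ccboard-core/Cargo.toml (1):"
    let has_file_header = content
        .lines()
        .any(|line| line.trim_start().starts_with("[file] "));
    has_summary && has_file_header
}
```

Run against the tool_result shown above it returns true, while a raw `path:line:content` line returns false.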

Now run:

rtk discover --all

The grep -n command appears in the MISSED SAVINGS column, even though the adjacent tool_result proves it ran through RTK.

Expected

Commands for which classify_command returns Classification::Supported should be bucketed as already covered, matching rtk session's behavior. Only RTK_DISABLED=... prefixes and unsupported base commands (pnpm build, node, python3, …) should appear as missed.

Actual

src/discover/mod.rs:169-175 counts only the literal rtk prefix:

Classification::Ignored => {
    if part.trim().starts_with("rtk ") {
        already_rtk += 1;
    }
}

Every Classification::Supported goes to the missed bucket regardless of hook behavior.

Root Cause

Claude Code's JSONL logs the assistant's tool_use.input.command — what the model authored before any PreToolUse hook runs. The post-hook command is never persisted.

src/discover/provider.rs:175 extracts block.pointer("/input/command") from assistant tool_use blocks. No alternative field carries the rewritten command.
src/discover/mod.rs:115-175 makes the already-RTK decision on this pre-hook text.
src/analytics/session_cmd.rs:28-51 (reference implementation) already handles this, with an explicit doc comment:

/// A command is "covered" if it either:
/// - starts with "rtk " (explicit rtk invocation), or
/// - would be rewritten by the hook (classify_command returns Supported)
fn count_rtk_commands(...) -> (usize, usize, usize) {
    ...
    if part.starts_with("rtk ")
        || matches!(classify_command(part), Classification::Supported { .. })
    {
        rtk += 1;
    }
    ...
}

The two commands disagree on the same question.
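To make the disagreement concrete, here is a self-contained sketch of the two rules side by side. The `Classification` enum and `classify_command` below are toy stand-ins (the real ones live in the rtk codebase and carry more data); only the two predicates mirror the snippets quoted in this issue.

```rust
// Toy stand-in for rtk's Classification; the real enum has richer variants.
enum Classification {
    Supported,
    Unsupported,
    Ignored,
}

// Toy classifier (assumption): `rtk ...` is Ignored, `grep ...` is Supported,
// everything else is Unsupported. The real classify_command covers far more.
fn classify_command(cmd: &str) -> Classification {
    let cmd = cmd.trim();
    if cmd.starts_with("rtk ") {
        Classification::Ignored
    } else if cmd.starts_with("grep ") {
        Classification::Supported
    } else {
        Classification::Unsupported
    }
}

/// Rule quoted from discover: only a literal `rtk ` prefix counts as covered.
fn discover_counts_as_rtk(part: &str) -> bool {
    matches!(classify_command(part), Classification::Ignored)
        && part.trim().starts_with("rtk ")
}

/// Rule quoted from session: literal prefix OR a command the hook would rewrite.
fn session_counts_as_rtk(part: &str) -> bool {
    part.starts_with("rtk ")
        || matches!(classify_command(part), Classification::Supported)
}
```

For the pre-hook text `grep -n foo bar`, the discover rule answers false and the session rule answers true; both agree on `rtk grep -n foo bar` (true) and `pnpm build` (false).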

Downstream impact

Inflated "Est. Savings" in the MISSED column — the user sees a phantom ~5M token opportunity that is already captured.
Deflated "Already using RTK" percentage — in my data, 8% reported vs a real adoption probably above 80%.
Misleading docs — docs/guide/analytics/discover.md:35:

"If commands appear in the missed list after installing RTK, it usually means the hook isn't active for that agent."

This is the inverse of reality. The hook is active; discover just can't see past the transcript. That line sends users on a non-productive troubleshooting path (line 58 in the same doc already describes rtk session correctly).

Proposed fix

Mirror rtk session's logic in discover:

src/discover/mod.rs:115-175 — treat Classification::Supported as already-covered. The RTK_DISABLED= branch at mod.rs:98-112 stays unchanged (genuine opt-out, should still surface).
src/discover/report.rs:52, 83-90 — optionally split into already_rtk_explicit vs rewritten_by_hook so no signal is lost, or keep one counter and rely on the existing "TOP UNHANDLED" section for the gap analysis.
docs/guide/analytics/discover.md:35 — replace the misleading troubleshooting hint with accurate behavior description.

Existing tests in src/discover/report.rs:218-269 only exercise percentage formatting and will not regress. A new test mirroring src/analytics/session_cmd.rs:233-261 would be the right regression net.
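A rough sketch of what the proposed bucketing could look like, using the split-counter option. This is not the actual discover code: `Buckets`, `bucket_commands`, and the toy `classify_command` are hypothetical names for illustration, and the real `Classification` enum differs from this stand-in.

```rust
// Toy stand-in for rtk's Classification; the real enum has richer variants.
enum Classification {
    Supported,
    Unsupported,
    Ignored,
}

// Toy classifier (assumption): `rtk ...` and `cd ...` are Ignored,
// `grep ...` is Supported, everything else is Unsupported.
fn classify_command(cmd: &str) -> Classification {
    if cmd.starts_with("rtk ") || cmd.starts_with("cd ") {
        Classification::Ignored
    } else if cmd.starts_with("grep ") {
        Classification::Supported
    } else {
        Classification::Unsupported
    }
}

/// Hypothetical counters mirroring the split suggested for report.rs.
#[derive(Debug, Default, PartialEq)]
struct Buckets {
    already_rtk_explicit: usize,
    rewritten_by_hook: usize,
    missed: usize,
}

fn bucket_commands(parts: &[&str]) -> Buckets {
    let mut buckets = Buckets::default();
    for part in parts {
        let part = part.trim();
        // Genuine opt-out: stays in the missed bucket, as today.
        if part.starts_with("RTK_DISABLED=") {
            buckets.missed += 1;
            continue;
        }
        match classify_command(part) {
            // The hook would have rewritten this, so it is already covered.
            Classification::Supported => buckets.rewritten_by_hook += 1,
            // Per the Expected section, unsupported base commands stay missed.
            Classification::Unsupported => buckets.missed += 1,
            Classification::Ignored => {
                if part.starts_with("rtk ") {
                    buckets.already_rtk_explicit += 1;
                }
            }
        }
    }
    buckets
}
```

A regression test along these lines would feed a mix such as `grep -n foo bar`, `rtk grep -n foo bar`, `RTK_DISABLED=1 grep -n foo bar`, and `pnpm build`, then assert one explicit, one hook-rewritten, and two missed commands.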

Environment

rtk: 0.37.2 (also reproduced on 0.36.0)
OS: macOS 15.6 (Darwin 24.6.0)
Shell: zsh
Hook: ~/.claude/hooks/rtk-rewrite.sh (rtk-hook-version: 3), confirmed active via rtk rewrite "grep -n foo bar" → exit 0 → rtk grep -n foo bar

Contributor guide

Research direction: Inspect the classification logic in src/discover/mod.rs around lines 169-175 and compare it with the reference implementation in src/analytics/session_cmd.rs lines 28-51. The fix involves treating Classification::Supported as already covered, as session_cmd.rs does.
Tech stack: Rust
Domain: CLI, developer experience
Issue type: Bug
Difficulty: 2
Estimated time: 1-3 hours
Activity status: Active
Clarity: Clear
Prerequisites: Git, Rust
Newbie friendliness: 75