Hook adoption (18%) + document-quality concerns + scope-limit-to-coding proposal from a document-production workload
#1698 opened on May 4, 2026
Description
Context
`rtk discover` over the last 30 days reports only 18.0% adoption (3,636 of 20,236 Bash commands routed through RTK), with ~857K tokens of recoverable savings sitting unrewritten. After looking at the unhandled-command list, I think there are actually three distinct issues bundled here, and the third one is the one I'd most like maintainer guidance on.
Issue 1 — hook-rewriting blind spots
The "missed savings" table is dominated by commands RTK already handles:
| Command | Count | RTK Equivalent | Est. Savings |
|---|---|---|---|
| `grep -n` | 3,241 | `rtk grep` | ~516K tokens |
| `ls "/abs/path"` | 2,225 | `rtk ls` | ~160K tokens |
| `cat "/abs/path"` | 248 | `rtk read` | ~89K tokens |
| `find "/abs/path"` | 547 | `rtk find` | ~62K tokens |
| `git log` | 244 | `rtk git` | ~14K tokens |
| `gh pr` | 102 | `rtk gh` | ~10K tokens |
| `wc -l` | 153 | `rtk wc` | ~3.5K tokens |
| Total | 6,765 | - | ~857K tokens |
Two patterns dominate the unrewritten commands and I suspect they're hook regex blind spots:
- Quoted absolute paths: `ls "/Users/imcapple/o..."`, `cat "/Users/imcapple/..."`, `find "/Users/imcapple/..."`. The hook's command-rewrite regex may not be matching when the first arg is a quoted absolute path.
- Heredocs and compound forms: `python3 << 'EOF'`, `\\npython3`, `echo "===`. These are commands wrapped in shell constructs (heredocs, line continuations, here-strings) that the hook's matcher commonly misses.
A diagnostic the maintainers can run:
echo '{"tool_name":"Bash","tool_input":{"command":"ls \"/Users/imcapple/test\""}}' | rtk hook claude
If the output isn't rewritten to rtk ls ..., that's the bug.
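To cover more than one command shape, the same diagnostic could be batched. This is a minimal sketch, assuming `rtk` is on PATH and that `rtk hook claude` reads the payload on stdin as in the one-liner above. The probe list and the helper names (`build_payload`, `probe`, `run_all`) are mine, and nothing is assumed about the hook's output format, so the script only prints raw stdout for manual comparison.

```python
import json
import shutil
import subprocess

# Command shapes taken from the patterns described in this issue.
# The paths are placeholders, not real files.
PROBE_COMMANDS = [
    "ls /tmp/test",                      # unquoted path (baseline)
    'ls "/Users/imcapple/test"',         # quoted absolute path
    'cat "/Users/imcapple/test"',
    'find "/Users/imcapple/test"',
    "python3 << 'EOF'\nprint(1)\nEOF",   # heredoc
]


def build_payload(command: str) -> str:
    """Serialize one Bash tool call in the shape used by the diagnostic."""
    return json.dumps({"tool_name": "Bash", "tool_input": {"command": command}})


def probe(command: str) -> str:
    """Pipe one payload into `rtk hook claude` and return its raw stdout."""
    result = subprocess.run(
        ["rtk", "hook", "claude"],
        input=build_payload(command),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


def run_all() -> None:
    """Print the hook's response for every probe command."""
    if shutil.which("rtk") is None:
        print("rtk not found on PATH; nothing to probe")
        return
    for cmd in PROBE_COMMANDS:
        print(f"--- {cmd!r}")
        print(probe(cmd))
```

Calling `run_all()` and comparing the baseline entry against the quoted-path entries should show whether quoting alone changes the hook's behaviour.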
Issue 2 — top unhandled commands and the workload-shape question
| Command | Count | Example |
|---|---|---|
| `python3` | 4,614 | `python3 ~/.claude/skills/.../course_tools.py ...` |
| `python3 <<` | 226 | `python3 << 'EOF' ...` |
| `stat` | 45 | `stat -f '%m %N' ...` |
| `git checkout` | 37 | `git checkout -b ...` |
| `unzip` | 19 | `unzip -o -q "/Users/imcapple/obsidian/..."` |
| `pandoc` | 16 | `pandoc -t markdown "M3_L3.1_..."` |
| `gh search` | 11 | `gh search issues ...` |
| `git tag` | 10 | `git tag --list --sort=-v:refname` |
Most of these (python3 script invocations against the Obsidian vault, pandoc conversions on book chapters, unzip of vault archives, heredocs operating on course content) are document production, not code work. That makes me wonder whether the 18% adoption rate reflects not just a hook-coverage gap but also the nature of the work.
Issue 3 (the big one) — document-quality concern from compression-by-default
I have concerns about the quality of the documents being emitted under RTK's compression. For pure git operations, build/test runs, and short-output commands, RTK's filtering is genuinely lossless-for-purpose. For prose / lesson content / quiz items / manuscript text, it's a riskier optimization with real failure modes:
- `rtk read` strips content. The agent treats the filtered view as the source of truth; an author would treat dropped footnotes / callouts / exact phrasing / transition paragraphs as load-bearing.
- `rtk grep` groups and truncates (default ~200 results / ~25 per file). A consistency check across a 10-volume series may silently miss material the agent should have seen, and the agent may then write something that contradicts what it didn't see.
- `rtk ls` collapses listings. Fine for code repos. For a vault where filenames carry content (lesson IDs, version markers, dates), losing detail in the listing means the agent may misidentify the next file to edit.
- The compression is invisible to the agent. It doesn't know it received a filtered view, so it has no instinct to re-fetch unfiltered. There's no signal that filtering happened.
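To make the grep case concrete, here is a small simulation of capped results. It is a sketch only: the two caps are the approximate defaults quoted above, treated as assumptions, and `cap_matches` is a hypothetical stand-in, not RTK's implementation. The input data is invented for illustration.

```python
from collections import defaultdict

# Caps quoted approximately in this issue; assumptions, not values read
# from RTK's source.
MAX_TOTAL = 200
MAX_PER_FILE = 25


def cap_matches(matches, max_total=MAX_TOTAL, max_per_file=MAX_PER_FILE):
    """Apply a per-file cap and a global cap; return (kept, dropped_count)."""
    per_file = defaultdict(int)
    kept = []
    for path, line in matches:
        if len(kept) >= max_total:
            break  # global cap reached: everything after this is lost
        if per_file[path] >= max_per_file:
            continue  # this file already hit its own cap
        per_file[path] += 1
        kept.append((path, line))
    return kept, len(matches) - len(kept)


# Invented example: one term appearing 40 times in each of 10 volumes.
matches = [(f"vol{v:02d}.md", n) for v in range(1, 11) for n in range(40)]
kept, dropped = cap_matches(matches)
files_seen = sorted({path for path, _ in kept})
print(f"kept {len(kept)} of {len(matches)} matches, dropped {dropped}")
print(f"files visible to the agent: {files_seen[0]} .. {files_seen[-1]}")
```

Under these assumed caps, the last two volumes never appear in the result at all, which is the shape of failure I'm worried about for series-wide consistency checks.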
This isn't speculation about my workload — I run a 10-volume crypto book series + paid courses, and most agent cycles are spent reading, editing, and emitting prose. The 18% adoption number may be a feature, not a bug, given the workload mix.
Proposed direction: scope-limit RTK rather than compound the rewriting
Rather than push for the hook to rewrite more commands (which would compound the document-quality risk), the right move may be to limit the RTK whitelist to GitHub and coding-related tasks and exclude document generation/review entirely. Concretely:
- Keep RTK on: `git`, `gh`, build/test commands (`npm`, `cargo`, `pytest`), short-output structural commands like `wc`, `du`.
- Exclude RTK from: `cat` / `rtk read`, `head` / `tail`, `grep` against vault paths, `ls` against vault paths, `find` against vault paths, `python3` invocations whose first argument is a script under a vault directory or whose argv contains a vault path.
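The keep/exclude split above can be sketched as a small policy check. This is illustrative only: the sets, the vault root, and the function names are hypothetical, and the function answers "is rewriting permitted under this policy", not "does RTK have a handler for this command". Unknown commands default to not being rewritten, which is my own conservative assumption.

```python
import shlex
from pathlib import PurePosixPath

# Hypothetical policy tables mirroring the keep/exclude lists above.
ALWAYS_REWRITE = {"git", "gh", "npm", "cargo", "pytest", "wc", "du"}
PATH_SENSITIVE = {"cat", "head", "tail", "grep", "ls", "find", "python3"}
VAULT_ROOTS = [PurePosixPath("/Users/imcapple/obsidian")]  # example root from this issue


def touches_vault(argv):
    """True if any argument is a path at or below one of the vault roots."""
    for arg in argv[1:]:
        candidate = PurePosixPath(arg)
        for root in VAULT_ROOTS:
            if candidate == root or root in candidate.parents:
                return True
    return False


def rewrite_allowed(command: str) -> bool:
    """Decide whether the hook may rewrite this command under the policy."""
    try:
        argv = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes etc.: leave the command alone
    if not argv:
        return False
    tool = argv[0]
    if tool in ALWAYS_REWRITE:
        return True
    if tool in PATH_SENSITIVE:
        return not touches_vault(argv)
    return False
```

For example, `rewrite_allowed("git log --oneline")` is true, while `rewrite_allowed('cat "/Users/imcapple/obsidian/notes/a.md"')` is false because the quoted argument resolves to a path under the vault root.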
Possible implementations to consider:
- Path-based exclusion in the hook config, e.g. `[hooks] exclude_paths = ["/Users/imcapple/obsidian/**", "/Users/imcapple/.claude/skills/**"]`, which bypasses rewriting for any command whose argv contains a matching path. This is the cleanest fit for my workload.
- Per-tool opt-out via `[hooks] exclude_commands = ["cat", "ls", "grep", "find"]`. If this already exists in 0.38+ / 0.39 RC, document it more prominently.
- Scope-aware adoption profile: a `--scope=code-only` or `--scope=docs-permitted` flag for `rtk init` that picks defaults appropriate to the user's workload mix (similar to how some linters ship workload presets).
- Per-command-class filter aggression: `rtk read` against `~/.claude/skills/**` could pass through unfiltered while `rtk read` against `/usr/lib/**` keeps current aggression.
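The last option could be sketched as a first-match rule table keyed on path globs. Everything here is hypothetical: the rule table, the level names `passthrough` and `default`, and the `filter_level` helper are invented for this sketch and are not RTK settings. Note that Python's `fnmatch` lets `*` (and therefore `**`) match across `/`, which may differ from whatever glob semantics RTK would use.

```python
import os.path
from fnmatch import fnmatchcase

# Hypothetical rule table: first matching glob wins.
FILTER_RULES = [
    ("~/.claude/skills/**", "passthrough"),
    ("/Users/imcapple/obsidian/**", "passthrough"),
    ("/usr/lib/**", "default"),
]


def expand(path: str, home: str) -> str:
    """Expand a leading '~/' against an explicit home directory."""
    return home + path[1:] if path.startswith("~/") else path


def filter_level(path, rules=FILTER_RULES, fallback="default", home=None):
    """Return the filter level for a path; unmatched paths use the fallback."""
    home = home or os.path.expanduser("~")
    target = expand(path, home)
    for pattern, level in rules:
        if fnmatchcase(target, expand(pattern, home)):
            return level
    return fallback


# Illustrative paths only; the file names are made up.
for p in ["~/.claude/skills/example/tool.py", "/usr/lib/python3/os.py"]:
    print(p, "->", filter_level(p, home="/Users/imcapple"))
```

The same table could back the `exclude_paths` idea as well: a `passthrough` result would simply mean the hook leaves the command unrewritten.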
What I'd find useful from maintainers
- Confirm or deny the quoted-absolute-path blind spot in the hook regex (testable via the diagnostic above).
- Document the existing `exclude_commands` / path-filtering options if they're already implemented in 0.38+ / 0.39 RC. The conversation in another session referenced `[hooks] exclude_commands` but I haven't found it in the published docs.
- Consider whether the python3-heredoc family of commands (4,614 + 226 invocations in my data) warrants a dedicated wrapper, a configurable opt-in, or simply explicit guidance that this class of command is out-of-scope.
- Guidance on the document-production workload trade-off: is the right answer "scope down RTK", "tune filters per-file-type", or "different filter aggression by command class"?
Filed as a single issue rather than three
Filing this as one because the three threads are interconnected: the missed savings and the unhandled commands are both about adoption, but the document-quality concern is about whether more adoption is even desirable for my workload. The right answer may be "yes, fix the regex so the hook catches more — AND add a path-based opt-out so vault paths bypass filtering entirely." Splitting into separate issues would lose that connection.
Happy to split if maintainers prefer.
Reproduction context: discover output captured 2026-05-04 from a heavy document-production workload (Obsidian vault, course-builder skill, multi-volume book series). Latest rtk version known to be in flight at time of filing: 0.38.0 (released April 29) with 0.39 RCs in development.