Built-in signal/bulk classifier for git/gh control-marker commands · rtk-ai/rtk#2121


Labels: area:cli, area:config, enhancement, help wanted, priority:high

Description

Problem

rtk hook claude (and the equivalent shim/wrapper paths) currently rewrites the stdout of every Bash tool invocation, including commands whose output is a small control signal that downstream agents grep for canonical markers. This is fine for bulk output (ls -la, find, git log -p, docker logs) but actively harmful for short signal output: rtk reshapes or inflates the canonical markers, and agents that gate next-step decisions on those markers misclassify successful operations as hung or failed.

Concrete measurements (rtk 0.40.0, macOS, our agent harness):

| Command | Raw bytes | After `rtk hook claude` (bytes) | Delta |
|---|---|---|---|
| `git status` (clean) | 59 | 123 | +108% |
| `git log --oneline -50` | 59 | 4144 | +6924% |
| `git push origin <branch>` (success) | ~80 | canonical `To <repo>\n abc..def main -> main` line truncated/reshaped | marker lost |

The user-visible incident that exposed this: an agent runs git push origin <branch>, the canonical To <repo> marker is gone from stdout, the agent treats the push as hung, and it re-runs the command in the background for 15+ minutes even though the first push succeeded.

Proposal

A built-in signal vs bulk classifier:

Ship a default allowlist of substring patterns whose stdout bypasses rtk hook ... (raw stdout reaches the caller).

Allow operator override via ~/.config/rtk/passthrough.toml:

# ~/.config/rtk/passthrough.toml
[signal]
patterns = [
  "git push", "git pull", "git fetch", "git merge",
  "git status", "git remote", "git rev-parse", "git branch",
  "gh pr", "gh issue", "gh release", "gh api", "gh run",
]

# Operator can extend per-machine:
[signal.extend]
patterns = ["glab mr", "kubectl get pods"]
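How the built-in list and the operator's `[signal.extend]` list combine is not pinned down above. A minimal sketch, assuming the extension patterns are simply appended to the defaults, with duplicates and empty strings dropped (an empty pattern would match every command under substring matching). The names `DEFAULT_SIGNAL_PATTERNS`, `push_unique`, and `effective_patterns` are hypothetical and not part of rtk; TOML parsing is left out so the example stays stdlib-only.

```rust
/// Built-in defaults, copied from the proposed `[signal]` table.
const DEFAULT_SIGNAL_PATTERNS: &[&str] = &[
    "git push", "git pull", "git fetch", "git merge",
    "git status", "git remote", "git rev-parse", "git branch",
    "gh pr", "gh issue", "gh release", "gh api", "gh run",
];

/// Append `pattern` unless it is empty or already present.
/// An empty pattern is skipped because `str::contains("")` is always true.
fn push_unique(out: &mut Vec<String>, pattern: &str) {
    if !pattern.is_empty() && !out.iter().any(|q| q.as_str() == pattern) {
        out.push(pattern.to_string());
    }
}

/// Hypothetical merge rule (an assumption, not specified in the proposal):
/// defaults first, then operator extensions, in order.
fn effective_patterns(extend: &[&str]) -> Vec<String> {
    let mut out = Vec::new();
    for p in DEFAULT_SIGNAL_PATTERNS {
        push_unique(&mut out, p);
    }
    for p in extend {
        push_unique(&mut out, p);
    }
    out
}

fn main() {
    // Values an operator might put under `[signal.extend]`.
    let patterns = effective_patterns(&["glab mr", "kubectl get pods"]);
    println!("{} effective patterns", patterns.len());
}
```

An alternative would be to let the operator's file replace the defaults outright; appending seemed the less surprising reading of "extend", but that is a design choice for the maintainers.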

Match algorithm: substring match against the reconstructed command line (Bash hook input or argv for shim wrappers). Case-sensitive.
Behaviour:
- Match → rtk forwards stdout untouched (or emits a passthrough decision in the Claude PreToolUse JSON shape).
- No match → existing rtk reduction pipeline.
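The match rule and the two behaviours above can be sketched as follows. This is an illustration under stated assumptions, not rtk code: `Route`, `reconstruct`, and `classify` are hypothetical names, and joining argv with single spaces is only one possible way to reconstruct the command line for shim wrappers.

```rust
/// What to do with a command's stdout.
#[derive(Debug, PartialEq, Eq)]
enum Route {
    /// Forward stdout untouched.
    Passthrough,
    /// Hand stdout to the existing reduction pipeline.
    Reduce,
}

/// Assumed reconstruction for shim wrappers: join argv with single spaces.
fn reconstruct(argv: &[&str]) -> String {
    argv.join(" ")
}

/// Case-sensitive substring match of each pattern against the command line.
/// Empty patterns are ignored, since they would match everything.
fn classify(command_line: &str, patterns: &[&str]) -> Route {
    let is_signal = patterns
        .iter()
        .any(|p| !p.is_empty() && command_line.contains(*p));
    if is_signal {
        Route::Passthrough
    } else {
        Route::Reduce
    }
}

fn main() {
    let patterns = ["git push", "git status", "gh pr"];
    let commands = [
        reconstruct(&["git", "push", "origin", "main"]),
        "git log -p".to_string(),
        "GIT PUSH origin main".to_string(),
        "echo \"git push\"".to_string(),
    ];
    for cmd in &commands {
        println!("{:?} <- {}", classify(cmd, &patterns), cmd);
    }
}
```

One consequence of plain substring matching is visible in the last command: anything that merely mentions `git push`, such as an `echo`, is also routed to passthrough. That errs on the side of not reducing output, which seems an acceptable trade-off for a default, but anchoring the match to the start of a command segment would be a stricter alternative.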

Reference implementation

We've shipped a workaround in our coworker plugin (https://github.com/Arcanada-one/coworker) — a PreToolUse guard that stands in front of rtk hook claude and short-circuits for the patterns above. Sources:

Vendored bash guard: coworker/plugins/rtk_signal_guard.sh
JSON allowlist store: coworker/plugins/rtk_passthrough.py
Codex CLI shim parity: coworker/plugins/rtk_codex_shims.py (passthrough check inside generated git/gh shims)

Happy to upstream this into rtk core (Rust port + TOML config) if the maintainers are interested. The wrapper is currently published under MIT, same as rtk.

Why this matters beyond our agent

Anyone running rtk with an automated downstream consumer (Claude Code agents, Cursor sub-agents, Codex CLI sub-agents, CI bots that parse git push output) hits the same class of false negative. The cost of that false negative is invisible to rtk's own metrics: rtk reports tokens saved, but the agent above it spent minutes spinning on a successful operation.

Happy to discuss design or open a PR if there's interest. Cheers.

Contributor guide

Research direction: Examine the current stdout rewriting mechanism in rtk hook claude, then implement substring matching against the command line and add TOML configuration parsing for the signal patterns.
Tech stack: rust
Domain: cli
Issue type: Feature
Difficulty: 3
Estimated time: Half a day
Task status: Active
Clarity: Clear
Prerequisites: Rust, Git
Beginner-friendly: 75