follow-up: preserve more structure than parsed_cmd.type="unknown" for shell-wrapper telemetry · rtk-ai/rtk#1460

(1 comment) (0 reactions) (0 assignees)Rust (2,914 forks)batch import

area:cieffort-mediumenhancementfilter-qualityhelp wantedpriority:medium

Repository metrics

Stars: (48,085 stars)
PR merge metrics: (Avg merge 11d 1h) (45 merged PRs in 30d)

Description

Summary

Follow-up to #1459.

The lower-level exec telemetry appears to collapse complex shell wrappers and multi-line shell programs into a single fallback record:

"parsed_cmd": [{ "type": "unknown", "cmd": "...entire shell program..." }]

That fallback is understandable as a safety valve, but it throws away structure that downstream analytics need in order to distinguish:

real top-level commands,
wrapper-generated internals,
shell syntax/builtins,
and orchestrator invocations.

Once everything becomes type="unknown", higher layers have no reliable way to avoid false positives.

Why this deserves its own issue

#1459 is about the user-facing false positives in top-unhandled-command reporting.

This issue is narrower: the telemetry/parser layer itself is losing too much structure before analytics even run.

Concrete examples

Two representative cases from local Codex session JSONL logs:

1) Multi-line shell wrapper recorded as one unknown blob

A wrapper script containing multiple commands (git rev-parse, gstack-slug, find, tail, etc.) was stored as:

"parsed_cmd": [
  {
    "type": "unknown",
    "cmd": "set -e\nROOT=$(git rev-parse --show-toplevel)\nBRANCH=$(git branch --show-current)\n..."
  }
]

2) Single higher-level command still recorded as unknown

Even a single orchestrator command like:

omx explore --prompt 'Find repo files/symbols related to trk caching context ...'

was stored as:

"parsed_cmd": [{ "type": "unknown", "cmd": "omx explore --prompt '...'" }]

Expected

Telemetry should preserve more structure when possible, even if full semantic parsing is not available.

For example, instead of flattening everything to unknown, it could emit something like:

source_kind: single_command | shell_script | wrapper | builtin | syntax_fragment
argv0: omx or argv0: bash / argv0: zsh
top_level_command: omx explore
is_multiline: true
contains_shell_control_flow: true
confidence: low

Even partial structure would let analytics avoid treating the entire blob as one missing command.

Actual

The fallback record only preserves the raw shell text and type="unknown", which makes downstream classification much harder than it needs to be.

Why this matters

Without structured fallback metadata, downstream consumers are forced to infer meaning from raw shell text. That leads directly to issues like:

false “top unhandled commands”
command-count inflation from wrapper internals
inability to separate shell syntax from real executable gaps
brittle heuristics in analytics/reporting layers

Potential fixes

A few possible directions:

Keep unknown, but enrich it with source-kind / top-level token / multiline flags.
Special-case simple single-command invocations so omx explore ... is not indistinguishable from a 20-line shell script.
Emit a parser confidence level so analytics can exclude low-confidence fallbacks from user-facing ranking.
Represent shell wrappers hierarchically (shell -> top-level exec -> nested tokens) instead of flattening to one opaque string.

Relation to existing issues

#1459 — user-facing false positives from shell-wrapper fallbacks
#1441 — false positives caused by pre-hook vs post-hook command text

This issue sits below both: preserve better telemetry structure so later layers can reason correctly.

Environment

Observed on: April 22, 2026
Context: Codex / OMX / gstack-heavy workflows
OS: macOS

Contributor guide

Research direction: Examine the telemetry parser in the codebase, specifically the part that generates the 'parsed cmd' with type 'unknown'. Understand how the shell wrapper collapses multiple commands into a single unknown record. Then implement one of the proposed fixes, such as adding 'source kind' and 'top level command' fields. Test with examples from the issue to ensure structure is preserved.
Tech stack: rust
Domain: clideveloper experience
Issue type: Feature
Difficulty: 3
Estimated time: 1-2 days
Activity status: Active
Clarity: Clear
Prerequisites: Rust
Newbie friendliness: 65