feat: Automated Issue & PR triage system (templates, labeling, classification) · rtk-ai/rtk#348

(4 comments) (0 reactions) (0 assignees)Rust (2.914 forks)batch import

area:cieffort-largeenhancementhelp wantedpriority:medium

Métricas do repositório

Stars: (48.085 stars)
Métricas de merge de PR: (Mesclagem média 11d 1h) (45 fundiu PRs em 30d)

Description

Context

With 88 open issues and 51 open PRs (as of March 2026), 86% of issues and 72% of PRs have no labels, forcing manual triage every time. This issue tracks a 3-sprint plan to automate the full triage pipeline.

The real bottleneck is PRs: 54% are 500+ lines, 32% have active discussions. Issues close fast (median 1.9 days) but arrive unlabeled.

Estimated time saved: ~3-5h/week of triage once fully implemented.

Target Contributor Experience

Max 2 automated messages per item. No "wall of bots" effect.

Issue Flow

Contributor opens issue
  -> Structured form (Bug/Feature/Question)              [Sprint 1]
  -> Labels auto-applied by template                     [Sprint 1]
  -> Action checks fields:
     - Missing version/OS -> label "needs-info" + 1 comment
     - Duplicate detected (Jaccard > 0.6) -> label "possible-duplicate" + mention
     - Complete -> labels applied, zero comment          [Sprint 2]
  -> If "needs-info" unanswered 14d -> 1 reminder        [Sprint 2]
  -> If still nothing 28d -> close                       [Sprint 2]

PR Flow

Contributor opens PR
  -> Template with RTK checklist                         [Sprint 1]
  -> Labels by path (area:git, area:go, etc.) + size     [Sprint 1]
  -> security-check.yml runs + RTK quality patterns
     = 1 combined "Security & Quality Scan" comment      [Sprint 2]
  -> CODEOWNERS auto-requests maintainer review          [Sprint 1]

Sprint 1 — Foundations (half-day, config only)

P1 — Issue Templates (YAML Forms)

3 templates with required fields:

Bug Report: version, OS, shell, command, repro steps (all required)
Feature Request: target command, use case, expected token savings
Question Auto-labeling at creation. Fixes the #1 problem: 86% of issues without labels.

P2 — PR Template

RTK checklist embedded:

lazy_static! for regex (not inline Regex::new)
anyhow + .context() on all ? operators
Fallback to raw command on filter failure
Exit codes preserved
Token savings ≥60% (with test assertion)
No async fn / tokio

P3 — PR Auto-Labeling by path

actions/labeler mapping all modules. Fixes 72% of PRs without labels.

Label taxonomy (to create):

Category	Labels
Type	`bug`, `enhancement`, `question`, `docs`
Area	`area:git`, `area:cargo`, `area:github`, `area:python`, `area:go`, `area:js`, `area:core`, `area:hooks`, `area:ci`
Size	`size:XS` (<10 lines), `size:S` (<50), `size:M` (<200), `size:L` (<500), `size:XL` (500+)
Priority	`P0-critical`, `P1-high`, `P2-medium`, `P3-low`
Status	`needs-info`, `possible-duplicate`, `stale`
Effort	`effort:XS`, `effort:S`, `effort:M`, `effort:L`, `effort:XL`

P4 — CODEOWNERS

Auto-request review. Sensitive files (runner.rs, tracking.rs, Cargo.toml, workflows) → maintainer required.

Sprint 2 — Automation (3-5 days, GitHub Actions)

P8a — Auto-classification of issues (keyword matching)

Trigger: issues.opened

Logic (bash script, no LLM):

Field check: If body doesn't contain rtk --version or OS mention → label needs-info + comment
Keyword classification:
- Bug: crash, error, fail, panic, broken, regression, unexpected
- Feature: add, support, implement, new command, would be nice
- Question: how to, how do I, is it possible, can I, does rtk
- Doc: documentation, readme, typo, spelling
- Priority: title wins over body on ambiguous matches
Duplicate detection: Jaccard similarity on title (lowercase, stop words removed) vs all open issues. Score > 0.6 → label possible-duplicate + comment
Auto-labeling via gh issue edit --add-label

Accepting ~20-30% false positive rate — label is a signal, not a verdict. Better than 86% with no labels at all.

P9 — Combined Security + Quality scan on PRs

Extend existing security-check.yml (not a new workflow). Contributors receive ONE comment.

Already in security-check.yml:

Cargo audit (CVEs)
Critical files check
Dangerous patterns (Command::new, unsafe, unwrap, network ops)
New dependencies audit
Clippy security lints

Adding: 6. RTK quality patterns: Regex::new( without lazy_static, missing .context( after ?, async fn / tokio::, use std::thread 7. PR size classification: additions/deletions → apply size:* label 8. Overlap detection: files changed vs other open PRs, mention if >50% overlap

P10 — needs-info follow-up

Schedule (daily cron) + issue_comment trigger to reset timer.

14 days without response → reminder comment
28 days → close

P5 — Dependabot

Auto PRs for Cargo deps (weekly) and GitHub Actions (monthly).

Sprint 3 — Intelligence (when volume > 20 issues/week)

P8b — Upgrade to Claude API classification

Replace keyword matching with Claude Haiku. Better classification + semantic duplicate detection (not just Jaccard on words). ~$0.01-0.05/issue. Trigger when keyword matching error rate becomes problematic.

P12 — Dynamic contributor assignment

git log --format='%an' -- src/relevant.rs to find who knows the area. Mention in P8 comment. Useful when 3+ regular contributors.

Out of scope (removed from plan):

~~Welcome Bot~~: Templates already give new contributors the info they need
~~Stale Bot~~: Oldest item is 17 days. No staleness problem yet
~~Weekly Digest~~: The /repo-recap skill covers this manually

Maintenance Cost

Workflow	Expected maintenance
labeler.yml (P3)	Near-zero. Update when adding a new `*_cmd.rs` module
security-check.yml extended (P9)	Low. Patterns are stable
issue-classify.yml (P8a)	Medium. Keywords/Jaccard thresholds may need tuning first weeks
needs-info-followup.yml (P10)	Near-zero. Fixed timer, simple logic
dependabot.yml (P5)	Zero. GitHub maintains it

Total: ~30 min/month once stabilized.

Verification Checklist

Sprint	Test	Success criterion
1	Open issue via each template	Labels auto-applied, required fields work
1	Open PR touching `src/git.rs`	Label `area:git` + `size:*` applied, CODEOWNERS requests review
2	Open bug issue without version/OS	Label `needs-info` + comment requesting missing info
2	Open issue with near-identical title to existing	Label `possible-duplicate` + mention of similar issue
2	Open PR with `unwrap()` in diff	Step Summary contains "RTK Quality Patterns" section
2	Leave `needs-info` issue 14 days without reply	Reminder posted automatically
3	Compare Claude vs keyword on 20 historical issues	Claude > 85% accuracy vs keyword ~70%

Files to Reuse

File	Reuse
`.claude/skills/issue-triage/SKILL.md`	Classification keywords, Jaccard logic → extract for P8a
`.claude/skills/issue-triage/templates/issue-comment.md`	Templates 1 (needs-info) and 2 (duplicate) → adapt for P8a and P10
`.github/workflows/security-check.yml`	Base to extend for P9
`.claude/skills/pr-triage/SKILL.md`	RTK checklist → copy into PR template (P2)
`.claude/skills/pr-triage/templates/review-comment.md`	Review comment format → reuse in P9

MAUI Feature Mapping

MAUI Feature	RTK Equivalent	Sprint
AI analysis tags each issue	P8a (keyword) then P8b (Claude API)	2 → 3
doc/question/bug categories	P1 (templates force category) + P8a (auto-classify)	1 + 2
Checks questions are answered, tags needs-precision	P8a (missing fields detection) + P10 (follow-up timer)	2
Checks for duplicates	P8a (Jaccard > 0.6 on title)	2
Tags contributor on relevant code area	P4 (static CODEOWNERS) + P12 (dynamic via git log)	1 + 3

Guia do colaborador

Direção de pesquisa: Implemente modelos de issue com formulários YAML para relatórios de bug, solicitações de recursos e perguntas. Crie um fluxo de trabalho de rotulagem para rotular automaticamente PRs por caminhos de arquivo. Configure CODEOWNERS para solicitações de revisão automáticas.
Pilha de tecnologia: rust
Domain: clidevops
Tipo Issue: Funcionalidade
Difficulty: 3
Tempo estimado: 1-2 dias
Status da atividade: Recente
Clarity: Claro
Prerequisites: GitGitHub Actions
Simpatia para novatos: 80