feat: Automated Issue & PR triage system (templates, labeling, classification) · rtk-ai/rtk#348

(4 commentaires) (0 réactions) (0 assignés)Rust (2 914 forks)batch import

area:cieffort-largeenhancementhelp wantedpriority:medium

Métriques du dépôt

Stars: (48 085 stars)
Métriques de merge PR: (Merge moyen 11j 1h) (45 PRs mergées en 30 j)

Description

Context

With 88 open issues and 51 open PRs (as of March 2026), 86% of issues and 72% of PRs have no labels, forcing manual triage every time. This issue tracks a 3-sprint plan to automate the full triage pipeline.

The real bottleneck is PRs: 54% are 500+ lines, 32% have active discussions. Issues close fast (median 1.9 days) but arrive unlabeled.

Estimated time saved: ~3-5h/week of triage once fully implemented.

Target Contributor Experience

Max 2 automated messages per item. No "wall of bots" effect.

Issue Flow

Contributor opens issue
  -> Structured form (Bug/Feature/Question)              [Sprint 1]
  -> Labels auto-applied by template                     [Sprint 1]
  -> Action checks fields:
     - Missing version/OS -> label "needs-info" + 1 comment
     - Duplicate detected (Jaccard > 0.6) -> label "possible-duplicate" + mention
     - Complete -> labels applied, zero comment          [Sprint 2]
  -> If "needs-info" unanswered 14d -> 1 reminder        [Sprint 2]
  -> If still nothing 28d -> close                       [Sprint 2]

PR Flow

Contributor opens PR
  -> Template with RTK checklist                         [Sprint 1]
  -> Labels by path (area:git, area:go, etc.) + size     [Sprint 1]
  -> security-check.yml runs + RTK quality patterns
     = 1 combined "Security & Quality Scan" comment      [Sprint 2]
  -> CODEOWNERS auto-requests maintainer review          [Sprint 1]

Sprint 1 — Foundations (half-day, config only)

P1 — Issue Templates (YAML Forms)

3 templates with required fields:

Bug Report: version, OS, shell, command, repro steps (all required)
Feature Request: target command, use case, expected token savings
Question Auto-labeling at creation. Fixes the #1 problem: 86% of issues without labels.

P2 — PR Template

RTK checklist embedded:

lazy_static! for regex (not inline Regex::new)
anyhow + .context() on all ? operators
Fallback to raw command on filter failure
Exit codes preserved
Token savings ≥60% (with test assertion)
No async fn / tokio

P3 — PR Auto-Labeling by path

actions/labeler mapping all modules. Fixes 72% of PRs without labels.

Label taxonomy (to create):

Category	Labels
Type	`bug`, `enhancement`, `question`, `docs`
Area	`area:git`, `area:cargo`, `area:github`, `area:python`, `area:go`, `area:js`, `area:core`, `area:hooks`, `area:ci`
Size	`size:XS` (<10 lines), `size:S` (<50), `size:M` (<200), `size:L` (<500), `size:XL` (500+)
Priority	`P0-critical`, `P1-high`, `P2-medium`, `P3-low`
Status	`needs-info`, `possible-duplicate`, `stale`
Effort	`effort:XS`, `effort:S`, `effort:M`, `effort:L`, `effort:XL`

P4 — CODEOWNERS

Auto-request review. Sensitive files (runner.rs, tracking.rs, Cargo.toml, workflows) → maintainer required.

Sprint 2 — Automation (3-5 days, GitHub Actions)

P8a — Auto-classification of issues (keyword matching)

Trigger: issues.opened

Logic (bash script, no LLM):

Field check: If body doesn't contain rtk --version or OS mention → label needs-info + comment
Keyword classification:
- Bug: crash, error, fail, panic, broken, regression, unexpected
- Feature: add, support, implement, new command, would be nice
- Question: how to, how do I, is it possible, can I, does rtk
- Doc: documentation, readme, typo, spelling
- Priority: title wins over body on ambiguous matches
Duplicate detection: Jaccard similarity on title (lowercase, stop words removed) vs all open issues. Score > 0.6 → label possible-duplicate + comment
Auto-labeling via gh issue edit --add-label

Accepting ~20-30% false positive rate — label is a signal, not a verdict. Better than 86% with no labels at all.

P9 — Combined Security + Quality scan on PRs

Extend existing security-check.yml (not a new workflow). Contributors receive ONE comment.

Already in security-check.yml:

Cargo audit (CVEs)
Critical files check
Dangerous patterns (Command::new, unsafe, unwrap, network ops)
New dependencies audit
Clippy security lints

Adding: 6. RTK quality patterns: Regex::new( without lazy_static, missing .context( after ?, async fn / tokio::, use std::thread 7. PR size classification: additions/deletions → apply size:* label 8. Overlap detection: files changed vs other open PRs, mention if >50% overlap

P10 — needs-info follow-up

Schedule (daily cron) + issue_comment trigger to reset timer.

14 days without response → reminder comment
28 days → close

P5 — Dependabot

Auto PRs for Cargo deps (weekly) and GitHub Actions (monthly).

Sprint 3 — Intelligence (when volume > 20 issues/week)

P8b — Upgrade to Claude API classification

Replace keyword matching with Claude Haiku. Better classification + semantic duplicate detection (not just Jaccard on words). ~$0.01-0.05/issue. Trigger when keyword matching error rate becomes problematic.

P12 — Dynamic contributor assignment

git log --format='%an' -- src/relevant.rs to find who knows the area. Mention in P8 comment. Useful when 3+ regular contributors.

Out of scope (removed from plan):

~~Welcome Bot~~: Templates already give new contributors the info they need
~~Stale Bot~~: Oldest item is 17 days. No staleness problem yet
~~Weekly Digest~~: The /repo-recap skill covers this manually

Maintenance Cost

Workflow	Expected maintenance
labeler.yml (P3)	Near-zero. Update when adding a new `*_cmd.rs` module
security-check.yml extended (P9)	Low. Patterns are stable
issue-classify.yml (P8a)	Medium. Keywords/Jaccard thresholds may need tuning first weeks
needs-info-followup.yml (P10)	Near-zero. Fixed timer, simple logic
dependabot.yml (P5)	Zero. GitHub maintains it

Total: ~30 min/month once stabilized.

Verification Checklist

Sprint	Test	Success criterion
1	Open issue via each template	Labels auto-applied, required fields work
1	Open PR touching `src/git.rs`	Label `area:git` + `size:*` applied, CODEOWNERS requests review
2	Open bug issue without version/OS	Label `needs-info` + comment requesting missing info
2	Open issue with near-identical title to existing	Label `possible-duplicate` + mention of similar issue
2	Open PR with `unwrap()` in diff	Step Summary contains "RTK Quality Patterns" section
2	Leave `needs-info` issue 14 days without reply	Reminder posted automatically
3	Compare Claude vs keyword on 20 historical issues	Claude > 85% accuracy vs keyword ~70%

Files to Reuse

File	Reuse
`.claude/skills/issue-triage/SKILL.md`	Classification keywords, Jaccard logic → extract for P8a
`.claude/skills/issue-triage/templates/issue-comment.md`	Templates 1 (needs-info) and 2 (duplicate) → adapt for P8a and P10
`.github/workflows/security-check.yml`	Base to extend for P9
`.claude/skills/pr-triage/SKILL.md`	RTK checklist → copy into PR template (P2)
`.claude/skills/pr-triage/templates/review-comment.md`	Review comment format → reuse in P9

MAUI Feature Mapping

MAUI Feature	RTK Equivalent	Sprint
AI analysis tags each issue	P8a (keyword) then P8b (Claude API)	2 → 3
doc/question/bug categories	P1 (templates force category) + P8a (auto-classify)	1 + 2
Checks questions are answered, tags needs-precision	P8a (missing fields detection) + P10 (follow-up timer)	2
Checks for duplicates	P8a (Jaccard > 0.6 on title)	2
Tags contributor on relevant code area	P4 (static CODEOWNERS) + P12 (dynamic via git log)	1 + 3

Guide contributeur

Direction de recherche: Implémentez des modèles de problème avec des formulaires YAML pour les rapports de bug, les demandes de fonctionnalités et les questions. Créez un workflow d'étiquetage pour étiqueter automatiquement les PR en fonction des chemins de fichiers. Configurez CODEOWNERS pour les demandes de révision automatiques.
Stack technique: rust
Domaine: clidevops
Type d'issue: Fonctionnalité
Difficulté: 3
Temps estimé: 1-2 jours
Statut d'activité: Récente
Clarté: Claire
Prérequis: GitGitHub Actions
Accessibilité débutant: 80