kornia/kornia

Proposal: Incremental, torch-only SAM-3 integration (inference-first)

Open

#3,406 opened on December 28, 2025

enhancement :rocket:, help wanted

Description

This issue proposes an incremental integration of SAM-3 (Segment Anything Model v3) into Kornia, following existing contrib model patterns (e.g. SAM v1, RT-DETR).

The goal is to avoid a single large PR and instead iterate via small, reviewable PRs, starting with inference-only functionality, as discussed with maintainers.

Proposed solution:

Constraints:

  • Torch-only dependencies (PyTorch + existing Kornia deps)
  • Inference-first (no training code, datasets, or CLI tools)
  • Incremental PRs, each self-contained
  • CI-safe from the first PR
  • No breaking changes to existing SAM (v1)

Design direction: Kornia already integrates SAM v1 using a modular structure (image encoder, prompt encoder, mask decoder, model wrapper). SAM-3 would follow a similar design under a separate sam3 namespace, reusing existing patterns such as ModelBase, configs, and SegmentationResults where applicable.
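To make the design direction concrete, here is a minimal sketch of how the four pieces could compose. Everything in it is an assumption for illustration: the names `Sam3Config`, `ImageEncoder`, `PromptEncoder`, `MaskDecoder`, and `Sam3` are hypothetical, the layers are tiny placeholders rather than the real SAM-3 architecture, and plain `nn.Module` stands in for Kornia's `ModelBase` so the snippet depends only on PyTorch.

```python
from dataclasses import dataclass

import torch
from torch import nn


@dataclass
class Sam3Config:
    """Hypothetical config; field names are illustrative, not a proposed API."""
    embed_dim: int = 32
    patch_size: int = 16


class ImageEncoder(nn.Module):
    """Placeholder encoder: a single patchify convolution."""

    def __init__(self, cfg: Sam3Config) -> None:
        super().__init__()
        self.proj = nn.Conv2d(3, cfg.embed_dim, cfg.patch_size, stride=cfg.patch_size)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        # (B, 3, H, W) -> (B, C, H / patch, W / patch)
        return self.proj(images)


class PromptEncoder(nn.Module):
    """Placeholder prompt encoder: embeds (x, y) points only."""

    def __init__(self, cfg: Sam3Config) -> None:
        super().__init__()
        self.point_proj = nn.Linear(2, cfg.embed_dim)

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # (B, N, 2) -> (B, N, C)
        return self.point_proj(points)


class MaskDecoder(nn.Module):
    """Placeholder decoder: one mask logit map per prompt token."""

    def __init__(self, cfg: Sam3Config) -> None:
        super().__init__()
        self.query_proj = nn.Linear(cfg.embed_dim, cfg.embed_dim)

    def forward(self, image_emb: torch.Tensor, prompt_emb: torch.Tensor) -> torch.Tensor:
        queries = self.query_proj(prompt_emb)  # (B, N, C)
        # Dot product of each prompt token with every spatial feature.
        return torch.einsum("bnc,bchw->bnhw", queries, image_emb)


class Sam3(nn.Module):
    """Hypothetical wrapper wiring the three sub-modules together."""

    def __init__(self, cfg: Sam3Config) -> None:
        super().__init__()
        self.image_encoder = ImageEncoder(cfg)
        self.prompt_encoder = PromptEncoder(cfg)
        self.mask_decoder = MaskDecoder(cfg)

    def forward(self, images: torch.Tensor, points: torch.Tensor) -> torch.Tensor:
        image_emb = self.image_encoder(images)
        prompt_emb = self.prompt_encoder(points)
        return self.mask_decoder(image_emb, prompt_emb)


if __name__ == "__main__":
    model = Sam3(Sam3Config()).eval()
    with torch.inference_mode():
        logits = model(torch.rand(2, 3, 64, 64), torch.rand(2, 5, 2))
    print(logits.shape)  # one low-resolution logit map per prompt
```

Keeping each sub-module independently constructible is what would let the encoder land in one PR and the prompt encoder and decoder in the next, without the wrapper existing yet.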

Proposed components:

  • Image encoder
  • Prompt encoder (points / boxes / masks)
  • Mask decoder
  • Model wrapper + config
  • Optional pretrained weight loading
  • Minimal docs and examples
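As a rough illustration of what "optional" weight loading could mean, the sketch below leaves the model at its random initialization unless a local checkpoint path is given. The helper name `load_optional_checkpoint` is hypothetical, and the checkpoint is assumed to be a plain `state_dict` saved with `torch.save`; no download URL or official SAM-3 checkpoint format is implied.

```python
from typing import Optional

import torch
from torch import nn


def load_optional_checkpoint(model: nn.Module, checkpoint: Optional[str] = None) -> nn.Module:
    """Hypothetical helper: load weights only when a path is provided."""
    if checkpoint is None:
        # No checkpoint requested: keep the random initialization (CI-friendly).
        return model
    # Assumes the file holds a plain state_dict; weights_only avoids unpickling code.
    state_dict = torch.load(checkpoint, map_location="cpu", weights_only=True)
    model.load_state_dict(state_dict)
    return model
```

With this shape, tests and examples can run without any network access, and pretrained weights become an opt-in that can be added in a later PR.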

Incremental plan:

  • Phase 1: Core architecture (image encoder only)
  • Phase 2: Prompt encoder + mask decoder
  • Phase 3: Model wrapper & end-to-end inference (random weights)
  • Phase 4: Pretrained weights (optional)
  • Phase 5: Docs & minimal examples

Each phase would be proposed as a separate PR.
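For the early phases, "CI-safe" could be as simple as a random-weight smoke test that needs no downloads. The sketch below shows the idea on a stand-in module; `build_tiny_encoder` is a hypothetical placeholder, not the proposed encoder, and the sizes are chosen only to keep the test fast on CPU.

```python
import torch
from torch import nn


def build_tiny_encoder(embed_dim: int = 8, patch_size: int = 16) -> nn.Module:
    """Hypothetical stand-in for a Phase 1 image encoder with random weights."""
    return nn.Conv2d(3, embed_dim, kernel_size=patch_size, stride=patch_size)


def test_encoder_smoke() -> None:
    torch.manual_seed(0)  # fixed seed so failures are reproducible
    encoder = build_tiny_encoder().eval()
    images = torch.rand(1, 3, 32, 32)
    with torch.inference_mode():
        first = encoder(images)
        second = encoder(images)
    # Shape contract: 32 / 16 = 2 patches per side, 8 channels.
    assert first.shape == (1, 8, 2, 2)
    # Outputs are finite and stable across repeated calls in eval mode.
    assert torch.isfinite(first).all()
    assert torch.allclose(first, second)
    # Inference-first: no autograd graph is attached to the output.
    assert not first.requires_grad
```

A test of this kind checks only shapes and numerical sanity, so it can ship with Phase 1 and be extended once prompt encoding and decoding are added.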
