Proposal: Incremental, torch-only SAM-3 integration (inference-first)
#3,406 created on December 28, 2025
Description
This issue proposes an incremental integration of SAM-3 (Segment Anything Model v3) into Kornia, following existing contrib model patterns (e.g. SAM v1, RT-DETR).
The goal is to avoid a single large PR and instead iterate via small, reviewable PRs, starting with inference-only functionality, as discussed with maintainers.
Proposed solution:
Constraints:
- Torch-only dependencies (PyTorch + existing Kornia deps)
- Inference-first (no training code, datasets, or CLI tools)
- Incremental PRs, each self-contained
- CI-safe from the first PR
- No breaking changes to existing SAM (v1)
Design direction:
Kornia already integrates SAM v1 using a modular structure (image encoder, prompt encoder, mask decoder, model wrapper).
SAM-3 would follow a similar design under a separate sam3 namespace, reusing existing patterns such as ModelBase, configs, and SegmentationResults where applicable.
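To make the modular split above concrete, here is a minimal torch-only sketch of how the three parts could compose behind one wrapper. Everything in it is an assumption for illustration: the class names (`TinyImageEncoder`, `TinyPromptEncoder`, `TinyMaskDecoder`, `Sam3Sketch`), the layer choices, and the sizes are hypothetical stand-ins. It reflects neither the actual SAM-3 architecture nor Kornia's existing `ModelBase` / `SegmentationResults` APIs, and only shows the wiring and tensor shapes.

```python
import torch
from torch import nn


class TinyImageEncoder(nn.Module):
    """Hypothetical stand-in: patchify the image into an embedding grid."""

    def __init__(self, embed_dim: int = 32, patch_size: int = 16) -> None:
        super().__init__()
        self.proj = nn.Conv2d(3, embed_dim, kernel_size=patch_size, stride=patch_size)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        # (B, 3, H, W) -> (B, C, H // patch_size, W // patch_size)
        return self.proj(images)


class TinyPromptEncoder(nn.Module):
    """Hypothetical stand-in: embed normalized (x, y) point prompts."""

    def __init__(self, embed_dim: int = 32) -> None:
        super().__init__()
        self.point_proj = nn.Linear(2, embed_dim)

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # (B, N, 2) -> (B, N, C)
        return self.point_proj(points)


class TinyMaskDecoder(nn.Module):
    """Hypothetical stand-in: one low-resolution mask logit map per prompt."""

    def forward(self, image_emb: torch.Tensor, prompt_emb: torch.Tensor) -> torch.Tensor:
        # Dot product of every prompt token with every spatial location:
        # (B, N, C) x (B, C, h, w) -> (B, N, h, w)
        return torch.einsum("bnc,bchw->bnhw", prompt_emb, image_emb)


class Sam3Sketch(nn.Module):
    """Hypothetical wrapper that wires the three components together."""

    def __init__(self, embed_dim: int = 32, patch_size: int = 16) -> None:
        super().__init__()
        self.image_encoder = TinyImageEncoder(embed_dim, patch_size)
        self.prompt_encoder = TinyPromptEncoder(embed_dim)
        self.mask_decoder = TinyMaskDecoder()

    @torch.no_grad()  # inference-only, matching the inference-first constraint
    def forward(self, images: torch.Tensor, points: torch.Tensor) -> torch.Tensor:
        image_emb = self.image_encoder(images)
        prompt_emb = self.prompt_encoder(points)
        return self.mask_decoder(image_emb, prompt_emb)


if __name__ == "__main__":
    model = Sam3Sketch().eval()  # randomly initialized, no weights downloaded
    masks = model(torch.rand(1, 3, 64, 64), torch.rand(1, 2, 2))
    print(masks.shape)
```

Because the sketch runs on random weights and tiny inputs, a shape check like this could double as a CI smoke test that needs no network access.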
Proposed components:
- Image encoder
- Prompt encoder (points / boxes / masks)
- Mask decoder
- Model wrapper + config
- Optional pretrained weight loading
- Minimal docs and examples
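As a rough illustration of the "Model wrapper + config" and "Optional pretrained weight loading" items, a config object could look something like the sketch below. The class name `Sam3ConfigSketch`, its fields, and the default values are all hypothetical. They are not taken from SAM-3 or from Kornia's existing config classes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sam3ConfigSketch:
    """Hypothetical config; field names and defaults are illustrative only."""

    image_size: int = 1024
    patch_size: int = 16
    embed_dim: int = 256
    # None means "build with random weights"; a path or URL would be opt-in.
    checkpoint: Optional[str] = None

    def __post_init__(self) -> None:
        # Fail early on sizes the patch embedding could not tile evenly.
        if self.image_size % self.patch_size != 0:
            raise ValueError("image_size must be divisible by patch_size")

    @property
    def grid_size(self) -> int:
        # Side length of the image-embedding grid.
        return self.image_size // self.patch_size


if __name__ == "__main__":
    cfg = Sam3ConfigSketch(image_size=64, patch_size=16)
    print(cfg.grid_size, cfg.checkpoint)
```

Keeping `checkpoint` optional with a `None` default means the earlier phases can construct and test the model without any download step.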
Incremental plan:
- Phase 1: Core architecture (image encoder only)
- Phase 2: Prompt encoder + mask decoder
- Phase 3: Model wrapper & end-to-end inference (random weights)
- Phase 4: Pretrained weights (optional)
- Phase 5: Docs & minimal examples
Each phase would be proposed as a separate PR.