Making `transformation` an optional parameter in FasterRCNN · pytorch/vision#2263

(9 comments) (4 reactions) (0 assignees) Python (6,858 forks)

enhancement, help wanted, module: models, topic: object detection

Repository metrics

Stars: (15,050 stars)
PR merge metrics: (average merge time 12 days 8 hours) (14 PRs merged in the last 30 days)

Description

Hello,

🚀 Feature

I think it would be more generic to have `transform` (https://github.com/pytorch/vision/blob/3d65fc6723f1e0709916f24d819d6e17a925b394/torchvision/models/detection/faster_rcnn.py#L231) as a function that users can modify, rather than a fixed default one.
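To make the request concrete, here is a minimal sketch of the optional-parameter pattern being asked for. It is plain Python and not torchvision code: `Detector`, `default_transform`, `identity_transform`, and the `MEAN`/`STD` values are hypothetical names chosen for illustration, and images are simplified to flat lists of pixel values.

```python
from typing import Callable, List, Optional

Batch = List[List[float]]  # simplified stand-in for a batch of images

MEAN, STD = 0.5, 0.25  # hypothetical normalization constants


def default_transform(images: Batch) -> Batch:
    """Stand-in for the built-in preprocessing: normalize every pixel."""
    return [[(p - MEAN) / STD for p in img] for img in images]


def identity_transform(images: Batch) -> Batch:
    """For callers who already preprocessed their data elsewhere."""
    return images


class Detector:
    """Hypothetical model wrapper, not the real FasterRCNN class."""

    def __init__(self, transform: Optional[Callable[[Batch], Batch]] = None):
        # Fall back to the default only when the caller supplies nothing,
        # so existing behavior is unchanged for current users.
        self.transform = default_transform if transform is None else transform

    def forward(self, images: Batch) -> Batch:
        # A real model would run the backbone and heads after this step.
        return self.transform(images)


batch = [[0.5, 1.0]]
with_default = Detector().forward(batch)
with_custom = Detector(transform=identity_transform).forward(batch)
```

With this shape, users who augment in their own pipeline pass their own callable, and everyone else keeps the default preprocessing.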

Motivation

I am applying transformations separately as part of data augmentation, which includes cropping and resizing. Hence I would prefer not to do them twice while retraining FasterRCNN.

Pitch

I would like to feed a fixed-size input into the network even when the source images vary in size. At present, I do this by resizing the images separately in the DataLoader and adjusting the parameters of GeneralizedRCNNTransform accordingly.

Alternatives

My present way of using FasterRCNN is an alternative. Since my set of transformations is predefined, I have to apply hacks such as setting mean to 0., std to 1., and altering the min and max sizes to my default value (this means that scale=1 and the interpolation returns the same image).
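The arithmetic behind this workaround can be sketched in plain Python. The resize rule below is my understanding of how GeneralizedRCNNTransform picks its scale factor (scale the shorter side to `min_size`, then cap the longer side at `max_size`), so treat it as an assumption rather than a copy of the library code. The 300x400 input size and the helper names `resize_scale` and `normalize` are hypothetical.

```python
def resize_scale(height: int, width: int, min_size: int, max_size: int) -> float:
    """Scale factor under a shorter-side / longer-side resize rule."""
    scale = min_size / min(height, width)
    # Cap the scale if the longer side would exceed max_size.
    if max(height, width) * scale > max_size:
        scale = max_size / max(height, width)
    return scale


def normalize(pixel: float, mean: float, std: float) -> float:
    """Per-pixel normalization as applied before the backbone."""
    return (pixel - mean) / std


H, W = 300, 400  # hypothetical fixed input size produced by the DataLoader

# Workaround: match min/max sizes to the input so the resize is a no-op.
hack_scale = resize_scale(H, W, min_size=min(H, W), max_size=max(H, W))

# For comparison, min_size=800 and max_size=1333 would upscale the image.
default_scale = resize_scale(H, W, min_size=800, max_size=1333)

# mean=0 and std=1 turn normalization into the identity.
unchanged_pixel = normalize(0.25, mean=0.0, std=1.0)
```

Here `hack_scale` comes out as 1.0 and `unchanged_pixel` stays 0.25, which is why the workaround leaves already-preprocessed images untouched, at the cost of still running the resize and normalization steps.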

Additional context

While the input to the network is a fixed size, I apply many other augmentations such as mirroring, random cropping, etc., inspired by SSD-based networks. Hence I would prefer to do all augmentation once, in a single place, instead of twice.

Thank you!

Edit: If you think this would be a meaningful change, I will be happy to send a Pull Request.

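Contributor guide

Research direction: This issue asks for the `transform` argument in FasterRCNN to be made optional. The relevant file is torchvision/models/detection/faster_rcnn.py, in particular the `FasterRCNN` class and `GeneralizedRCNNTransform`. The user suggests allowing a custom transform function instead of the default one. Investigate how the transform is currently hardcoded, and consider making it optional while keeping the default behavior. Check the comments for maintainer feedback (none seen). The user has offered to submit a PR, so the implementation approach is open.
Tech stack: python
Domain: backend
Issue type: feature
Difficulty: 2
Estimated time: half a day
Activity status: active
Clarity: clear
Prerequisites: Python, Git
Beginner friendliness: 70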