Bug: torchvision/transforms/functional/to_pil_image always converts 1-channel (gray) FloatTensor images to 8-bit unsigned int · pytorch/vision#448

(16 comments) (2 reactions) (2 assignees) Python (6,858 forks)

bug, help wanted, module: documentation

Repository metrics

Stars: 15,050
PR merge metrics: average time to merge 12 days 8 hours; 14 PRs merged in the last 30 days

Description

OS: Ubuntu 16.04.4 LTS x64
PyTorch version: 0.3.0
Torchvision version: 0.2.0
How you installed PyTorch (conda, pip, source): conda
Python version: 3
CUDA/cuDNN version: 8.0
GPU models and configuration: Titan X (Maxwell)

ERROR: ValueError: Incorrect mode (<class 'float'>) supplied for input type <class 'numpy.dtype'>. Should be L

The torchvision transform ToPILImage(mode=float) always fails for input of type torch.FloatTensor. ToPILImage() uses the internal function to_pil_image, found in torchvision/transforms/functional.py.

In https://github.com/pytorch/vision/blob/master/torchvision/transforms/functional.py:
Line 104 checks whether the input is of type torch.FloatTensor.
If so, line 105 scales the input by 255, but then converts it to byte.
Lines 113-127 check whether the user-specified mode matches the expected mode, and raise an error if it does not.
The expected mode is derived from npimg.dtype, which is np.uint8 whenever line 105 is executed.
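The control flow described above can be modeled without torch or PIL. The following is a simplified pure-Python sketch, not torchvision's source: the function name `simulate_to_pil_mode`, the string dtype labels, and the two-entry mapping are all hypothetical, chosen only to mirror the steps listed in the issue.

```python
# Simplified model of the mode check described in the issue.
# Hypothetical names throughout; this is not torchvision's actual code.

# Assumed subset of the single-channel dtype -> PIL mode mapping.
EXPECTED_MODE = {"uint8": "L", "float32": "F"}


def simulate_to_pil_mode(input_dtype, requested_mode=None):
    """Return the PIL mode the modeled function would end up using."""
    dtype = input_dtype
    if dtype == "float32":
        # Models `pic = pic.mul(255).byte()`: the data become uint8.
        dtype = "uint8"
    expected = EXPECTED_MODE[dtype]
    if requested_mode is not None and requested_mode != expected:
        raise ValueError(
            "Incorrect mode ({}) supplied. Should be {}".format(
                requested_mode, expected
            )
        )
    return expected
```

In this model a float input can never reach mode 'F': `simulate_to_pil_mode("float32")` returns 'L', and `simulate_to_pil_mode("float32", "F")` raises ValueError, which matches the behavior reported above.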

I believe the bug can be fixed by changing line 105 from:
    pic = pic.mul(255).byte()
to:
    pic = pic.mul(255)
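To illustrate what the byte conversion discards, here is a small pure-Python sketch. It assumes the float-to-byte cast truncates toward zero and that inputs lie in [0, 1]; the helper name `to_byte` is hypothetical and only stands in for the `.mul(255).byte()` step.

```python
def to_byte(x):
    """Model scaling a [0, 1] float to 8 bits (assumes truncation toward zero)."""
    return int(x * 255)


value = 0.5
quantized = to_byte(value)        # 0.5 * 255 = 127.5, truncated to 127
recovered = quantized / 255       # close to, but not equal to, 0.5
error = abs(value - recovered)    # information lost by the conversion

# Nearby float values collapse onto the same 8-bit level.
collapsed = to_byte(0.5) == to_byte(0.501)
```

Under these assumptions, any two floats that fall within the same 1/255-wide bin become indistinguishable after the conversion, which is the precision a mode 'F' image would otherwise preserve.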

Test script:
    import torch
    from torchvision import transforms

    a = torch.FloatTensor(1, 64, 64)
    tform = transforms.Compose([transforms.ToPILImage(mode='F')])
    b = tform(a)

Please let me know if I am in error. Thank you.

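Contributor guide

Research direction: Examine the to_pil_image function in torchvision/transforms/functional.py and trace the path taken by FloatTensor input. Focus on line 105, where multiplying by 255 and converting to byte causes a loss of precision.
Tech stack: Python
Domain: computer vision
Issue type: bug
Difficulty: 2
Estimated time: 1-3 hours
Activity status: active
Clarity: clear
Prerequisites: Python
Beginner friendliness: 75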
