
Bug with transforms.Resize when used with transforms.ConvertImageDtype #4880

Open
@numpee

Description

🐛 Describe the bug

Recent releases of torchvision and the accompanying documentation seem to suggest that we can use io.read_image + transforms.ConvertImageDtype instead of the traditional PIL.Image.open + transforms.ToTensor. However, I have found two issues:

  1. io.read_image + transforms.ConvertImageDtype does not actually return the same tensor values as PIL + transforms.ToTensor, even though the two pipelines are supposed to provide the same functionality (see the sketch right after this list).
  2. While io.read_image + transforms.ConvertImageDtype is itself significantly faster than using PIL, combining it with transforms.Resize, specifically when upsampling, makes the whole operation much slower than the PIL alternative.
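
A minimal sketch of the comparison in point 1, assuming an arbitrary RGB JPEG at a hypothetical path img.jpg; both pipelines should produce identical float32 tensors in [0, 1], but in my tests the values differ:

```python
import torch
from PIL import Image
from torchvision import io, transforms

path = "img.jpg"  # hypothetical path to any RGB JPEG

# Traditional pipeline: PIL read + ToTensor (uint8 HWC -> float32 CHW in [0, 1])
pil_tensor = transforms.ToTensor()(Image.open(path).convert("RGB"))

# New pipeline: torchvision native read (uint8 CHW) + ConvertImageDtype
tv_tensor = transforms.ConvertImageDtype(torch.float)(io.read_image(path))

print(pil_tensor.dtype, tv_tensor.dtype)       # torch.float32 torch.float32
print(torch.equal(pil_tensor, tv_tensor))      # False in my runs
print((pil_tensor - tv_tensor).abs().max())    # size of the discrepancy
```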

To expand on point 2: both pipelines return tensors of the same type, torch.float32. However, applying transforms.Resize to the tensor produced by io.read_image + transforms.ConvertImageDtype is much slower than applying the same resize to the output of PIL read + transforms.ToTensor. I can't see why this happens, since both Resize calls operate on torch.FloatTensor inputs. Also, the slowdown only occurs when upsampling.
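
A rough timing sketch for point 2, again assuming a hypothetical img.jpg that is smaller than the target size so Resize upsamples; the absolute numbers don't matter, only the relative gap between the two inputs:

```python
import time

import torch
from PIL import Image
from torchvision import io, transforms

path = "img.jpg"                         # hypothetical small RGB image (e.g. 224x224)
resize = transforms.Resize((512, 512))   # upsampling target

pil_tensor = transforms.ToTensor()(Image.open(path).convert("RGB"))
tv_tensor = transforms.ConvertImageDtype(torch.float)(io.read_image(path))

def bench(t, n=100):
    # Average wall-clock time of one Resize call over n repetitions, in ms
    start = time.perf_counter()
    for _ in range(n):
        resize(t)
    return (time.perf_counter() - start) / n * 1e3

print(f"Resize on PIL + ToTensor output:                  {bench(pil_tensor):.3f} ms")
print(f"Resize on read_image + ConvertImageDtype output:  {bench(tv_tensor):.3f} ms")
```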

Please refer to my post on the PyTorch Forums here for the full analysis.

Versions

Collecting environment information...
PyTorch version: 1.10.0+cu113
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A

OS: Ubuntu 16.04.7 LTS (x86_64)
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.23

Python version: 3.9.4 (default, Apr 9 2021, 01:15:05) [GCC 5.4.0 20160609] (64-bit runtime)
Python platform: Linux-4.15.0-142-generic-x86_64-with-glibc2.23
Is CUDA available: True
CUDA runtime version: 10.0.130
GPU models and configuration:
GPU 0: NVIDIA GeForce RTX 2080 Ti
GPU 1: NVIDIA GeForce RTX 2080 Ti
GPU 2: NVIDIA GeForce RTX 2080 Ti
GPU 3: NVIDIA GeForce RTX 2080 Ti

Nvidia driver version: 465.19.01
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.2.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.2.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.2.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.2.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.2.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.2.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.2.1
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.21.3
[pip3] torch==1.10.0+cu113
[pip3] torchaudio==0.10.0+cu113
[pip3] torchvision==0.11.1+cu113
[conda] Could not collect

cc @vfdev-5 @datumbox
