
There is a Difference between Torchvision Shear and PIL Shear #5204

Closed
@SamuelGabriel

Description


🐛 Describe the bug

As @agaldran pointed out in the TrivialAugment repo (automl/trivialaugment#6), there seems to be a difference between the behavior of a shear implemented with the PIL and the TorchVision affine transforms.

This is an issue for the auto-augmentation algorithms, as it might yield different results compared to other implementations of these algorithms; it might be an issue for other applications as well. Both the AutoAugment (https://github.com/tensorflow/models/blob/fd34f711f319d8c6fe85110d9df6e1784cc5a6ca/research/autoaugment) and the TrivialAugment (https://github.com/automl/trivialaugment) reference implementations use PIL, while RandAugment has no reference implementation.

from PIL import Image
import math
import torchvision # >= 0.11

from torchvision.transforms import functional as F
interpolation = torchvision.transforms.InterpolationMode.NEAREST
fill = None

img = Image.new('RGB', (32,32), (255,255,0))
magnitude = .7

# shear_x as seen in torchvision https://github.com/pytorch/vision/blob/b5aa0915fe16e82ee4c24919032b4e7afae3ae1b/torchvision/transforms/autoaugment.py#L17
im_torch = F.affine(img, angle=0.0, translate=[0, 0], scale=1.0, shear=[math.degrees(magnitude), 0.0],
                    interpolation=interpolation, fill=fill)

# shear_x as seen in https://github.com/automl/trivialaugment/blob/3bfd06552336244b23b357b2c973859500328fbb/aug_lib.py#L156 and https://github.com/tensorflow/models/blob/fd34f711f319d8c6fe85110d9df6e1784cc5a6ca/research/autoaugment/augmentation_transforms.py#L290
im_pil = img.transform(img.size, Image.AFFINE, (1, magnitude, 0, 0, 1, 0))

im_torch.show()
im_pil.show()

This script will yield the following images:

[image: output of the torchvision shear]
[image: output of the PIL shear]

These should be the same image, but they are not.

It looks like torchvision shears about a fixed center, while PIL shears about the fixed top-left origin. I have not dug into the code much yet, though. Could someone who implemented affine explain the reasoning behind the different shearing behavior?
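If that hypothesis is right, the two results should be the same shear composed with a constant horizontal translation. A minimal sketch of the geometry, assuming torchvision shears about a vertical center `cy = h/2` (the exact center convention is an assumption and may differ by half a pixel) while PIL's `Image.transform` shears about the top-left origin:

```python
from fractions import Fraction

# Assumption: torchvision shears about the image center, PIL about the
# top-left origin. Exact rationals avoid float rounding noise.

def pil_shear_x_src(x, y, s):
    """PIL AFFINE coefficients (1, s, 0, 0, 1, 0): output (x, y) samples source x + s*y."""
    return x + s * y

def centered_shear_x_src(x, y, s, cy):
    """Shear about a fixed vertical center cy: output (x, y) samples source x + s*(y - cy)."""
    return x + s * (y - cy)

s = Fraction(7, 10)   # shear factor, same magnitude 0.7 as the repro script
h = 32                # image height
cy = Fraction(h, 2)   # assumed center convention, for illustration only

# For every output pixel the two mappings differ by the same constant
# horizontal offset s*cy, so the two outputs are identical shears that
# are merely translated horizontally relative to each other.
diffs = {pil_shear_x_src(x, y, s) - centered_shear_x_src(x, y, s, cy)
         for x in range(h) for y in range(h)}
assert diffs == {s * cy}
print(float(s * cy))  # prints 11.2, the constant offset in pixels
```

Under these assumptions, the visual difference in the two images above is exactly this constant shift, not a different shear.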

Versions

Collecting environment information...
PyTorch version: 1.10.1+cu102
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.3 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: 10.0.0-4ubuntu1
CMake version: version 3.16.3
Libc version: glibc-2.31

Python version: 3.9.7 (default, Sep 16 2021, 13:09:58) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.4.0-94-generic-x86_64-with-glibc2.31
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.22.0
[pip3] torch==1.10.1
[pip3] torchvision==0.11.2
[conda] mypy-extensions 0.4.3 pypi_0 pypi
[conda] numpy 1.22.0 pypi_0 pypi
[conda] torch 1.10.1 pypi_0 pypi
[conda] torchvision 0.11.2 pypi_0 pypi

cc @vfdev-5 @datumbox
