Opening a file with torchvision.io.decode_jpeg() creates a tensor with different strides than opening it with PIL and converting it to a tensor (see the snippet below). This can cause a very significant difference in training time because of pytorch/pytorch#83840. Let's see whether pytorch/pytorch#83840 deserves a "fix" or not; regardless, Resize() might not be the only transform that is sensitive to strides, and it might be worth changing what we output in decode_jpeg().
@datumbox @vfdev-5 this is something to keep in mind when you benchmark the new transforms: the strides matter a ton.
import torch
from torchvision.io import decode_jpeg, read_file
from torchvision.transforms import ToTensor
from PIL import Image

filepath = "./test/assets/encode_jpeg/grace_hopper_517x606.jpg"  # 606 (H) x 517 (W)

print(torch.randint(0, 256, (3, 606, 517), dtype=torch.uint8).stride())  # (313302, 517, 1)
print(ToTensor()(Image.open(filepath)).stride())  # (313302, 517, 1)
print(decode_jpeg(read_file(filepath)).stride())  # (1, 1551, 3) -- this makes Resize() 8X slower because antialias is False by default.
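
For reference, a minimal workaround sketch (not a vetted fix, and the extra copy has its own cost): calling .contiguous() on the decoded tensor, continuing from the snippet above, reproduces the channels-first strides of the PIL / ToTensor path.

img = decode_jpeg(read_file(filepath))
print(img.stride())     # (1, 1551, 3) -- channels-last memory layout
img = img.contiguous()  # copies the data into the default channels-first layout
print(img.stride())     # (313302, 517, 1) -- same strides as the PIL / ToTensor path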