decode_jpeg() creates tensors with different stride from PIL.Image.open() #6465

Closed
@NicolasHug

Decoding a file with torchvision.io.decode_jpeg() creates a tensor with different strides than opening it with PIL and converting the result to a tensor (see snippet below).

This can cause a very significant difference in training time because of pytorch/pytorch#83840. Whether or not pytorch/pytorch#83840 deserves a "fix", Resize() might not be the only transform that is sensitive to strides, so it might be worth changing what decode_jpeg() outputs.

@datumbox @vfdev-5 this is something to keep in mind when you benchmark the new transforms: the strides matter a ton.

import torch
from torchvision.io import decode_jpeg, read_file
from torchvision.transforms import ToTensor
from PIL import Image

filepath = "./test/assets/encode_jpeg/grace_hopper_517x606.jpg"  # 606 x 517

print(torch.randint(0, 256, (3, 606, 517), dtype=torch.uint8).stride())  # (313302, 517, 1)
print(ToTensor()(Image.open(filepath)).stride())  # (313302, 517, 1)
print(decode_jpeg(read_file(filepath)).stride())  # (1, 1551, 3)  -- this makes Resize() 8X slower because antialias is False by default.
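
For anyone who wants to reproduce the layout without the test asset: strides of (1, 1551, 3) on a (3, 606, 517) tensor are what you get when an interleaved (H, W, C) buffer is viewed as (C, H, W). The sketch below is only an illustration under that assumption. It uses a random tensor in place of a decoded image, and it does not claim to show how decode_jpeg() is implemented internally. The variable names are made up for the example.

```python
import torch

# Assumption: stand-in for a decoder that fills an interleaved (H, W, C) buffer.
height, width, channels = 606, 517, 3
hwc = torch.randint(0, 256, (height, width, channels), dtype=torch.uint8)

# permute() changes only the view, not the underlying memory.
# The shape becomes (C, H, W) while the data stays interleaved.
chw_view = hwc.permute(2, 0, 1)
print(chw_view.stride(), chw_view.is_contiguous())  # (1, 1551, 3) False

# contiguous() copies the data into the standard planar (C, H, W) layout,
# which is the layout reported for the PIL + ToTensor() path in the snippet.
chw_contig = chw_view.contiguous()
print(chw_contig.stride(), chw_contig.is_contiguous())  # (313302, 517, 1) True

# Same values either way; only the memory layout differs.
assert torch.equal(chw_view, chw_contig)
```

Calling .contiguous() on the decoded tensor before stride-sensitive transforms is one possible workaround, at the cost of an extra copy. Whether it pays off would need to be measured on the actual pipeline.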
