Description
🐛 Describe the bug
torchvision.io.read_image reads JPEG images at very different speeds depending on whether torchvision was installed from conda or from pip.
In a benchmark reading 1000 images from a folder, the pip version is more than 2x faster than the conda version.
For the test I created two new conda environments using:
conda create --name tvpip python=3.10
In one environment I installed torchvision using conda:
conda install pytorch torchvision cudatoolkit=11.3 -c pytorch
and in the other using pip:
pip3 install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu113
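To confirm which build each environment actually picked up, a small check like the following can be run in both. It is only a sketch: `describe_distribution` is a hypothetical helper name, and it assumes the packages ship an `INSTALLER` metadata file (pip writes one; other installers may or may not).

```python
from importlib import metadata


def describe_distribution(name):
    """Return version and recorded installer for a distribution, or None if absent."""
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        return None
    # INSTALLER is optional metadata; read_text returns None when it is missing.
    installer = dist.read_text("INSTALLER")
    return {
        "name": name,
        "version": dist.version,
        "installer": installer.strip() if installer else None,
    }


if __name__ == "__main__":
    for pkg in ("torch", "torchvision", "Pillow"):
        print(pkg, describe_distribution(pkg))
```

Running it in both environments and comparing the output makes it easier to rule out a mixed pip/conda install before looking at timings.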
Then I used the following code to benchmark torchvision.io.read_image, Pillow and accimage:
import os, torchvision
from time import time as t
f = "test"  # folder containing the test images
files = [file for file in os.listdir(f)]
test_images = len(files)
# Time how long fct needs to load every file in the folder
def test(files, fct):
    s = t()
    for file in files:
        image = fct(os.path.join(f, file))
    return t() - s
# torchvision.io.read_image with the PIL backend selected
torchvision.set_image_backend("PIL")
time_needed = test(files, torchvision.io.read_image)
print(f"Torchvision {torchvision.get_image_backend():13s} Loading {test_images} files took {time_needed:.1f}s")
# torchvision.io.read_image with the accimage backend selected
torchvision.set_image_backend("accimage")
time_needed = test(files, torchvision.io.read_image)
print(f"Torchvision {torchvision.get_image_backend():13s} Loading {test_images} files took {time_needed:.1f}s")
# Plain Pillow for comparison
from PIL import Image
s = t()
for file in files:
    image = Image.open(os.path.join(f, file)).convert("RGB")
time_needed = t() - s
print(f"{'Pillow':25s} Loading {test_images} files took {time_needed:.1f}s")
# accimage used directly for comparison
import accimage
time_needed = test(files, accimage.Image)
print(f"{'AccImage':25s} Loading {test_images} files took {time_needed:.1f}s")
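A single timed pass can be skewed by disk reads on the first run, so a slightly more defensive timing helper may be useful when reproducing this. The sketch below is not the code used for the numbers in this report; `time_loader` and its `repeats` parameter are hypothetical names, and the stand-in loader only exists so the snippet runs without any image library installed.

```python
import time


def time_loader(paths, loader, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` full passes."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for path in paths:
            loader(path)
        # Keep the fastest pass; later passes are less affected by cold disk reads.
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Stand-in loader so the sketch runs anywhere; swap in
    # torchvision.io.read_image or PIL.Image.open for a real measurement.
    elapsed = time_loader(["a.jpg", "b.jpg"], len)
    print(f"best pass took {elapsed:.6f}s")
```

Taking the minimum over a few passes is one common choice; reporting the mean or median would be equally reasonable as long as both environments are measured the same way.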
Findings:
- In the conda environment torchvision.io.read_image takes 4.6s; in the pip environment it takes 1.9s. These should be the same. I couldn't figure out where the speed difference comes from, but from the timings it looks like the pip build is somehow using pillow-simd or libjpeg-turbo.
- When using the accimage backend with torchvision (torchvision.set_image_backend), the time to load the images doesn't change at all, which suggests the same backend is used either way. This behavior is the same in the pip and conda environments.
- Installing pillow-simd and accimage in the environment before installing torchvision doesn't change anything apart from the pillow time.
- When installing accimage in the conda environment, the time for torchvision.io.read_image with the accimage backend doesn't change, which in my understanding it should.
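One way to test the libjpeg-turbo suspicion from the first finding is to look at which JPEG library torchvision's compiled image extension links against in each environment. The following is a rough sketch for Linux only: the helper names are hypothetical, and the `image*.so` file name pattern is an assumption about how the extension is named inside the installed package.

```python
import glob
import importlib.util
import os
import shutil
import subprocess


def find_extension_libs(package, pattern="image*.so"):
    """List shared objects matching `pattern` in an installed package's directory."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return []
    libs = []
    for location in spec.submodule_search_locations:
        libs.extend(glob.glob(os.path.join(location, pattern)))
    return sorted(libs)


def jpeg_lines(ldd_output):
    """Keep only the `ldd` output lines that mention a JPEG library."""
    return [line.strip() for line in ldd_output.splitlines() if "jpeg" in line.lower()]


if __name__ == "__main__":
    if shutil.which("ldd") is None:
        print("ldd not available on this system")
    else:
        for lib in find_extension_libs("torchvision"):
            result = subprocess.run(["ldd", lib], capture_output=True, text=True)
            print(lib)
            for line in jpeg_lines(result.stdout):
                print("   ", line)
```

If the two environments resolve to different JPEG libraries, that would be a plausible place to look for the speed difference, though it would not prove it on its own.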
I hope you can reproduce the behavior or give some insight into why this might be the case. Thanks in advance.
Versions
Environment pip
Collecting environment information...
PyTorch version: 1.12.1+cu113
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.5 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31
Python version: 3.10.6 (main, Oct 7 2022, 20:19:58) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.15.0-50-generic-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to:
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1080 Ti
Nvidia driver version: 515.76
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Versions of relevant libraries:
[pip3] numpy==1.23.4
[pip3] torch==1.12.1+cu113
[pip3] torchvision==0.13.1+cu113
[conda] numpy 1.23.4 pypi_0 pypi
[conda] torch 1.12.1+cu113 pypi_0 pypi
[conda] torchvision 0.13.1+cu113 pypi_0 pypi
Environment conda
Collecting environment information...
PyTorch version: 1.12.1
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.5 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31
Python version: 3.10.6 (main, Oct 7 2022, 20:19:58) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.15.0-50-generic-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to:
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1080 Ti
Nvidia driver version: 515.76
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Versions of relevant libraries:
[pip3] numpy==1.23.1
[pip3] torch==1.12.1
[pip3] torchvision==0.13.1
[conda] blas 1.0 mkl
[conda] cudatoolkit 11.3.1 h2bc3f7f_2
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-service 2.4.0 py310h7f8727e_0
[conda] mkl_fft 1.3.1 py310hd6ae3a3_0
[conda] mkl_random 1.2.2 py310h00e6091_0
[conda] numpy 1.23.1 py310h1794996_0
[conda] numpy-base 1.23.1 py310hcba007f_0
[conda] pytorch 1.12.1 py3.10_cuda11.3_cudnn8.3.2_0 pytorch
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] torchvision 0.13.1 py310_cu113 pytorch