In addition to a lot of other goodies that transforms v2 will bring, we are also actively working on improving the performance. This is a tracker / overview issue of our progress.
Performance was measured with this benchmark script. Unless noted otherwise, the performance improvements reported above were computed on uint8 RGB images and videos while running single-threaded on CPU. You can find the full benchmark results alongside the benchmark script. The results will be updated whenever a merged PR affects the kernels.
Kernels:
- color
  - `adjust_brightness`: [proto] Speed up adjust color ops #6784
  - `adjust_contrast`: [proto] Speed up adjust color ops #6784, [prototype] Speed up `adjust_contrast_image_tensor` #6933
  - `adjust_gamma`: [prototype] Speed improvement for adjust gamma op #6820, replace tensor division with scalar division and tensor multiplication #6903
  - `adjust_hue`: [proto] Speed improvements for adjust hue op #6805, replace tensor division with scalar division and tensor multiplication #6903, [prototype] Speed up `adjust_hue_image_tensor` #6938
  - `adjust_saturation`: [proto] Speed up adjust color ops #6784, [prototype] Minor change on `adjust_saturation_image_tensor` uint8 #6940
  - `adjust_sharpness`: [proto] Speed up adjust color ops #6784, [prototype] Speed up `adjust_sharpness_image_tensor` #6930
  - `autocontrast`: [proto] Speed improvement for autocontrast op #6811, [prototype] Speed up `autocontrast_image_tensor` #6935, [prototype] Port elastic and minor cleanups #6942
  - `equalize`: [proto] Small improvement for tensor equalize op #6738, [proto] Performance improvements for equalize op #6757, another round of perf improvements for equalize #6776
  - `invert`: improve performance of {invert, solarize}_image_tensor #6819
  - `posterize`: remove unneccesary checks from posterize_image_tensor #6823, extend support of posterize to all integer and floating dtypes #6847
  - `solarize`: improve performance of {invert, solarize}_image_tensor #6819
- geometry
  - `affine`: Fix bug on prototype `pad` #6949
  - `center_crop`: [prototype] Optimize Center Crop performance #6880, Fix bug on prototype `pad` #6949
  - `crop`: Fix bug on prototype `pad` #6949
  - `elastic`: [prototype] Port elastic and minor cleanups #6942
  - `erase`: [prototype] Remove `_FT` aliases from functional #6983
  - `five_crop`: Composite kernel. Fix bug on prototype `pad` #6949
  - `pad`: Fix bug on prototype `pad` #6949
  - `perspective`: [proto] Small optim for perspective op on images #6907, Fix bug on prototype `pad` #6949
  - `resize`: [prototype] Clean up and port the resize kernel in V2 #6892
  - `resized_crop`: Composite kernel. [prototype] Clean up and port the resize kernel in V2 #6892, Fix bug on prototype `pad` #6949
  - `rotate`: Fix bug on prototype `pad` #6949
  - `ten_crop`: Composite kernel. Fix bug on prototype `pad` #6949
- meta
  - `convert_color_space`: [proto] Speed up adjust color ops #6784, [prototype] Minor improvements on functional #6832
  - `convert_dtype`: improve perf on convert_image_dtype and add tests #6795, replace tensor division with scalar division and tensor multiplication #6903
    - There is still some performance gain left for `int` to `int` conversion. Currently, we are using a multiplication, but theoretically bit shifts are faster. However, in PyTorch core the CPU kernels for bit shifts are not vectorized, which makes them slower than a multiplication for regular-sized images. See Vectorized CPU code implementing left shift operator pytorch#88607.
- misc
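
To illustrate the `int` to `int` note under `convert_dtype`: when upscaling between integer ranges, the scale factor is a power of two, so the multiplication and a left shift give the same values. The sketch below uses plain Python integers only. The function names are made up for this example, it is not the torchvision kernel, and it says nothing about speed; it only shows why the two formulations are interchangeable.

```python
# Hypothetical helpers, plain Python ints only (not the torchvision implementation).

def max_value(value_bits: int) -> int:
    """Largest representable value for a range with `value_bits` value bits."""
    return (1 << value_bits) - 1


def upscale_by_multiplication(pixels, in_bits, out_bits):
    # factor = (out_max + 1) // (in_max + 1), which is always a power of two
    factor = (max_value(out_bits) + 1) // (max_value(in_bits) + 1)
    return [p * factor for p in pixels]


def upscale_by_shift(pixels, in_bits, out_bits):
    # Multiplying by 2**k is the same as shifting left by k bits
    shift = out_bits - in_bits
    return [p << shift for p in pixels]


# uint8 has 8 value bits, int16 has 15 value bits (one bit is the sign)
pixels = [0, 1, 127, 128, 255]
by_mul = upscale_by_multiplication(pixels, in_bits=8, out_bits=15)
by_shift = upscale_by_shift(pixels, in_bits=8, out_bits=15)

assert by_mul == by_shift
```

Which of the two is faster in practice depends on how the underlying tensor kernels are implemented, which is the point of the linked PyTorch issue.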
Transform Classes:
- MixUp/CutMix: [prototype] Speed up Augment Transform Classes #6835
- ColorJitter, RandomPhotometricDistort: [prototype] Minor speed and nit optimizations on Transform Classes #6837
C++ (PyTorch core):