[PyTorch][Winograd] Selecting a Winograd kernel causes test_Conv2d_naive_groups_cuda_float16 to fail #2492

@junliume

Description

[Summary]

Winograd kernels are designed to trade numerical accuracy for performance.
However, in this very small and impractical case, selecting a Winograd kernel causes test_Conv2d_naive_groups_cuda_float16 to fail.

Questions:

  • test_Conv2d_naive_groups_cuda_float16 has the keyword "naive" in its name. Does it expect a naive implementation to begin with?
  • @Kirpich30000 Should Winograd kernels have issues with cases like this one, i.e. -H 6 -W 6 -k 2?

MIOpenDriver convfp16 -n 2 -c 2 -H 6 -W 6 -k 2 -y 3 -x 3 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -F 1 -t 1 -S 0
Forward Conv solutions available: 2
- id: 84 algo: 3, time: 10 ms, ws: 0, name: ConvBinWinogradRxSf2x3g1
- id: 107 algo: 5, time: 20 ms, ws: 1280, name: ConvAsmImplicitGemmGTCDynamicFwdXdlopsNHWC
MIOpen Forward Conv. Algorithm: 3, Solution: 84/ConvBinWinogradRxSf2x3g1
GPU Kernel Time Forward Conv. Elapsed: 0.015378 ms (average)
stats: name, n, c, ho, wo, x, y, k, flopCnt, bytesRead, bytesWritten, GFLOPs, GB/s, timeMs
stats: fwd-conv3x3u1, 2, 2, 4, 4, 3, 3, 2,  2304, 360, 128, 0, 0, 0.015378
Forward Convolution Verifies OK on CPU reference (0.000340009)
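
The figures in the stats line above can be cross-checked with a little arithmetic. The following is a minimal Python sketch, not MIOpen code. It assumes the usual reading of the driver flags (-n batch, -c input channels, -k output channels, -y/-x filter height/width, -p/-q padding, -u/-v stride, -l/-j dilation, -g groups) and 2 bytes per fp16 element; the helper name conv_out_size is made up for this illustration.

```python
def conv_out_size(in_size, pad, dilation, kernel, stride):
    """Standard convolution output-size formula along one spatial axis."""
    return (in_size + 2 * pad - dilation * (kernel - 1) - 1) // stride + 1

# Values copied from the MIOpenDriver command line (flag meanings are assumed).
batch, in_ch, in_h, in_w, out_ch = 2, 2, 6, 6, 2   # -n -c -H -W -k
filt_h, filt_w = 3, 3                               # -y -x
pad_h, pad_w = 0, 0                                 # -p -q
stride_h, stride_w = 1, 1                           # -u -v
dil_h, dil_w = 1, 1                                 # -l -j
groups = 1                                          # -g
BYTES_PER_FP16 = 2

out_h = conv_out_size(in_h, pad_h, dil_h, filt_h, stride_h)
out_w = conv_out_size(in_w, pad_w, dil_w, filt_w, stride_w)

# One multiply and one add per filter tap per output element.
flops = 2 * batch * (in_ch // groups) * out_ch * out_h * out_w * filt_h * filt_w

input_elems = batch * in_ch * in_h * in_w
weight_elems = out_ch * (in_ch // groups) * filt_h * filt_w
output_elems = batch * out_ch * out_h * out_w

bytes_read = (input_elems + weight_elems) * BYTES_PER_FP16
bytes_written = output_elems * BYTES_PER_FP16

print(out_h, out_w, flops, bytes_read, bytes_written)
```

Under these assumptions the sketch gives a 4x4 output, 2304 FLOPs, 360 bytes read, and 128 bytes written, which agrees with the ho, wo, flopCnt, bytesRead, and bytesWritten columns reported by the driver.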

[Observation and Steps to reproduce]:

To Reproduce:

PYTORCH_TEST_WITH_ROCM=1 python3 nn/test_convolution.py --use-pytest --verbose -k test_Conv2d_naive_groups_cuda_float16
Docker Images:

ROCM 5.6: rocm/pytorch:rocm5.6_ubuntu20.04_py3.8_pytorch_2.0.1
PyTorch Installed at /var/lib/jenkins/pytorch/test
ROCM 5.7:  compute-artifactory.amd.com:5000/rocm-plus-docker/framework/compute-rocm-rel-5.7:86_ubuntu20.04_py3.9_pytorch_rocm5.7_internal_testing_55fbbdf
Original image: rocm/pytorch-private:86_ubuntu20.04_py3.9_pytorch_rocm5.7_internal_testing_55fbbdf
PyTorch Installed at /var/lib/jenkins/pytorch/test

NOTE: The tolerance has already been raised to 1e-1. Run git revert e9b273df57b240f14ead07b5fda97bdf2be6673a to see the error.
Expected Output:

nn/test_convolution.py::TestConvolutionNNDeviceTypeCUDA::test_Conv2d_naive_groups_cuda_float16 PASSED

Actual Output:

Mismatched elements: 47 / 128 (36.7%)
Greatest absolute difference: 0.0009765625 at index (0, 2, 2, 1) (up to 1e-05 allowed)
Greatest relative difference: 0.0999755859375 at index (0, 0, 2, 0) (up to 0.001 allowed)
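
To make the failure message concrete, here is a minimal Python sketch of an elementwise closeness check of the form |actual - expected| <= atol + rtol * |expected|, which is the criterion documented for torch.testing.assert_close. The values atol = 1e-05 and rtol = 0.001 are taken from the "allowed" figures above, and 2**-10 = 0.0009765625 is the greatest absolute difference reported. The function name within_tolerance and the expected value 0.5 are hypothetical, chosen only for illustration. The note above does not say whether the raised 1e-1 tolerance is absolute or relative, so both readings are tried.

```python
def within_tolerance(actual, expected, atol, rtol):
    """Elementwise closeness check: |actual - expected| <= atol + rtol * |expected|."""
    return abs(actual - expected) <= atol + rtol * abs(expected)

# Greatest absolute difference from the failure message (exactly 2**-10).
reported_abs_diff = 2.0 ** -10

# Hypothetical reference value, not taken from the test.
expected = 0.5
actual = expected + reported_abs_diff

# Tolerances shown in the failure message.
original = within_tolerance(actual, expected, atol=1e-5, rtol=1e-3)

# Two possible readings of "tolerance raised to 1e-1" (an assumption either way).
raised_atol = within_tolerance(actual, expected, atol=1e-1, rtol=1e-3)
raised_rtol = within_tolerance(actual, expected, atol=1e-5, rtol=1e-1)

print(original, raised_atol, raised_rtol)
```

For this hypothetical element the original tolerances allow a deviation of about 0.00051, so a difference of 0.0009765625 is rejected, while either reading of the raised tolerance accepts it.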
