
Dense model inference with TensorRT on the A30 gives wrong results #86

@zhaohb

Description


Environment

  • TensorRT version: 8.6.1

  • CUDA version: 12.1; cuBLAS version: 12.1.3.1

  • Container used: tensorrt:23.07-py3

  • NVIDIA driver version: 510.47.03

Model: gs_concat.onnx

Reproduction Steps

  1. On an A30, run:
polygraphy run gs_concat.onnx --onnxrt --trt --tf32 --atol 1e-4 --pool-limit workspace:10G

Output:
[screenshot of the Polygraphy comparison output]
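
To localize the mismatch, Polygraphy can compare every intermediate tensor instead of only the final outputs. The following command is a suggested diagnostic using the standard --onnx-outputs/--trt-outputs options (not something run in the original report):

polygraphy run gs_concat.onnx --onnxrt --trt --tf32 --atol 1e-4 --pool-limit workspace:10G \
    --onnx-outputs mark all --trt-outputs mark all

Marking all tensors as outputs in both the ONNX Runtime and TensorRT runs makes it possible to identify the first layer whose output diverges.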

Expected Behavior

  • Inference with TensorRT produces correct results, i.e. the TensorRT output matches the ONNX Runtime output within the given tolerance.

Actual Behavior

  • There is a clear gap between the TensorRT output and the ONNX Runtime output, so the TensorRT inference result is wrong.

Additional Notes

  • We also ran the same comparison on an A6000 and the results were correct:
polygraphy run gs_concat.onnx --onnxrt --trt --tf32 --atol 1e-4 --pool-limit workspace:10G

Output:
[screenshot of the Polygraphy comparison output, showing the comparison passing]

So the issue may be hardware-related.
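
One further check that could separate TF32 rounding effects from a genuine kernel bug (a suggestion, not something we ran) is to repeat the A30 comparison with TF32 disabled, i.e. the same command without --tf32:

polygraphy run gs_concat.onnx --onnxrt --trt --atol 1e-4 --pool-limit workspace:10G

If the comparison passes without --tf32, the gap is coming from TF32 math in a specific kernel on the A30 rather than from the comparison setup.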

The bug has been confirmed by my mentor; the internal NVIDIA bug ID is 4259240.
