`quantize-ort.py` doesn't reproduce the quantized models in the repos

Environment: onnxruntime version 1.17.1, model `PP-ResNet50`.

Running `quantize-ort.py` does not reproduce the quantized model in the repo. The current script produces an int8-quantized ppresnet50 of over 120 MB, far larger than the existing quantized model in the repo (~26 MB). After some investigation, I think the cause is a missing preprocessing step. [The ONNX Runtime documentation](https://onnxruntime.ai/docs/performance/model-optimizations/quantization.html#qdqdebug) strongly encourages preprocessing the model before quantization.
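To illustrate the step I believe is missing, here is a minimal sketch that runs ONNX Runtime's preprocessing (`quant_pre_process`) before `quantize_static`. It is not the repo's actual script: the helper names `preprocessed_path` and `preprocess_then_quantize`, the `-infer` file suffix, and the `data_reader` argument (any `CalibrationDataReader`) are assumptions made for illustration.

```python
from pathlib import Path


def preprocessed_path(model_path):
    """Hypothetical naming helper: model.onnx -> model-infer.onnx."""
    p = Path(model_path)
    return p.with_name(p.stem + "-infer" + p.suffix)


def preprocess_then_quantize(fp32_path, int8_path, data_reader, **quant_kwargs):
    """Preprocess the float32 model, then statically quantize the result.

    `data_reader` is assumed to be a CalibrationDataReader instance;
    `quant_kwargs` are passed through to quantize_static unchanged.
    """
    # Imported lazily so preprocessed_path() works without onnxruntime installed.
    from onnxruntime.quantization import quantize_static
    from onnxruntime.quantization.shape_inference import quant_pre_process

    prepped = preprocessed_path(fp32_path)
    # Shape inference plus graph optimization on the float32 model.
    quant_pre_process(str(fp32_path), str(prepped))
    # Quantize the preprocessed model instead of the raw float32 export.
    quantize_static(str(prepped), str(int8_path), data_reader, **quant_kwargs)
    return prepped
```

Whether this alone accounts for the full size difference would still need to be confirmed by running it against the ppresnet50 model.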

Left: Computation graph of already quantized models in the repo or models quantized by the updated script.
Right: Computation graph of the model quantized by the original script.<br>
<img width="500" alt="Screenshot 2024-02-26 at 23 15 36" src="https://github.com/opencv/opencv_zoo/assets/61866948/a7955877-bf40-47a9-812e-7233d8b0fa4d">

The screenshot shows that the current script produces a model with an unoptimized computation graph containing redundant nodes.
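The same comparison can be made without a graph viewer by checking file size and counting nodes per operator type. The sketch below assumes the `onnx` package is installed; `op_histogram` and `summarize` are hypothetical helper names, and no output is shown because I have not included measured counts here.

```python
import os
from collections import Counter


def op_histogram(nodes):
    """Count nodes by operator type (each node must expose .op_type)."""
    return Counter(node.op_type for node in nodes)


def summarize(model_path):
    """Return (size in MB, operator histogram) for an ONNX model file."""
    # Imported lazily so op_histogram() works without onnx installed.
    import onnx

    model = onnx.load(model_path)
    size_mb = os.path.getsize(model_path) / 1e6
    return size_mb, op_histogram(model.graph.node)
```

Running `summarize` on the repo's quantized model and on the script's output should make any extra nodes in the latter visible as differences in the two histograms.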

`quantize-ort.py` doesn't reproduce the quantized models in the repos #239

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

quantize-ort.py doesn't reproduce the quantized models in the repos #239

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

`quantize-ort.py` doesn't reproduce the quantized models in the repos #239