Don't constant fold Quantize/DequantizeLinear nodes by default #2713
Conversation
Codecov Report

✅ All modified and coverable lines are covered by tests.

@@ Coverage Diff @@
##             main    #2713   +/-   ##
=======================================
  Coverage   70.11%   70.11%
=======================================
  Files         226      226
  Lines       27228    27230    +2
  Branches     2747     2748    +1
=======================================
+ Hits        19090    19092    +2
  Misses       7193     7193
  Partials      945      945

☔ View full report in Codecov by Sentry.
Pull request overview

This PR prevents constant folding of `QuantizeLinear` and `DequantizeLinear` nodes by default to preserve quantization metadata that inference engines need for optimized execution. The change extends the existing blacklist mechanism that was previously used only for `ConstantOfShape`.

Key Changes:
- Introduced a `DEFAULT_CONSTANT_FOLD_BLACKLIST` constant containing ops that should not be constant folded (see the sketch below)
- Refactored the constant folding logic to check op types against the blacklist instead of checking a single op type
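A minimal sketch of how that mechanism could look (`DEFAULT_CONSTANT_FOLD_BLACKLIST` is the name from the PR; the exact contents and the helper function shown here are assumptions, not the PR's actual code):

```python
# Sketch only: ops that should never be constant folded by default.
DEFAULT_CONSTANT_FOLD_BLACKLIST = (
    "ConstantOfShape",
    "QuantizeLinear",
    "DequantizeLinear",
)

def _should_skip_folding(node) -> bool:
    """Assumed helper: return True if the node's op type is blacklisted."""
    return node.op_type in DEFAULT_CONSTANT_FOLD_BLACKLIST
```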
titaiwangms left a comment:
@justinchuby @gramalingam @xadupre Merging this, as it seems pretty reasonable to me and the fix is simple. We can revisit this PR if it obviously breaks anything and that's spotted in the future (observed from benchmarks, perhaps?).
Prefer a more efficient way of checking membership: turn the list into a frozenset, first check that the domain is an ONNX domain ("" or "ai.onnx"), then check whether the op_type is in the set.
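A sketch of what that suggestion could look like (the names here are illustrative, not the PR's final code):

```python
# Illustrative only: frozenset membership preceded by a domain check,
# per the reviewer's suggestion above.
NON_FOLDABLE_OPS = frozenset({"ConstantOfShape", "QuantizeLinear", "DequantizeLinear"})

def is_non_foldable(node) -> bool:
    # The standard ONNX domain is spelled "" or "ai.onnx".
    return node.domain in ("", "ai.onnx") and node.op_type in NON_FOLDABLE_OPS
```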
I added support for exporting `QuantizeLinear`/`DequantizeLinear` nodes (from `fake_quantize_per_*_affine` torch operators) in a previous PR.

Unfortunately, the current default onnxscript optimizer settings tend to automatically remove any weight quantization. This is because the `Weight -> QDQ -> ...` pattern looks like it can simply be constant folded into `QDQ(Weight) -> ...`.

I believe this behavior is not desirable, since the presence of `QDQ` nodes in the graph is what allows inference engines to run the supported computations using quantized data types. The purpose of `QDQ` nodes is to hold the relevant quantization "metadata", so they normally shouldn't be constant folded.
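To make this concrete, here is a plain-NumPy illustration (made-up values, not onnxscript code) of what folding a QDQ pair over a weight discards:

```python
import numpy as np

w = np.array([0.49, -1.2, 3.3], dtype=np.float32)   # original float weight
scale, zero_point = np.float32(0.1), np.int8(0)

# QuantizeLinear: saturate(round(w / scale) + zero_point) as int8
q = np.clip(np.round(w / scale) + zero_point, -128, 127).astype(np.int8)
# DequantizeLinear: (q - zero_point) * scale back to float32
dq = ((q.astype(np.float32) - zero_point) * scale).astype(np.float32)

# Constant folding replaces the Q/DQ nodes with the precomputed tensor `dq`,
# discarding `q`, `scale`, and `zero_point` -- exactly the metadata an
# inference engine needs to execute with int8 weights.
```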
I have extended the existing logic in `FoldConstantsPass` that was used to exclude `ConstantOfShape` from constant folding.

I haven't found any tests verifying this behavior for `ConstantOfShape`, and I'm not sure how to set up such a unit test, so I have left this code untested for now. If adding tests is mandatory, please give me a hint on where I should add such a test and what would be the best way to check/assert that the optimized graph matches the expectations (hopefully without reinventing the wheel or manually introspecting the `ir.Model` object).
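For reference, one possible shape for such a test, assuming onnxscript's public `ir` and `optimizer` entry points (the fixture path and the exact calls are assumptions on my part):

```python
import onnx
from onnxscript import ir, optimizer

def test_qdq_nodes_survive_constant_folding():
    # Hypothetical fixture: a model whose weights are wrapped in Q/DQ nodes.
    model = ir.serde.deserialize_model(onnx.load("qdq_model.onnx"))
    optimizer.optimize_ir(model)
    # The Q/DQ pair must still be present after optimization.
    op_types = {node.op_type for node in model.graph}
    assert {"QuantizeLinear", "DequantizeLinear"} <= op_types
```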