ONNX Quantization Framework built on top of ONNXScript and ONNX IR.
⚠️ This project is under active development.
Install directly from PyPI:
```bash
pip install onnx-quantize
```

Here's a minimal example to quantize an ONNX model:
```python
import onnx

from onnx_quantize import QConfig, QuantType, quantize

# Load your model
model = onnx.load("your_model.onnx")

# Define the quantization configuration
qconfig = QConfig(
    is_static=False,
    activations_dtype=QuantType.QInt8,
    activations_symmetric=False,
    weights_dtype=QuantType.QInt8,
    weights_symmetric=True,
    weights_per_channel=False,
)

# Quantize the model
qmodel = quantize(model, qconfig)

# Save the quantized model
onnx.save(qmodel, "qmodel.onnx")
```

## 🧩 Features (planned)
The goal is to provide everything Neural Compressor offers, but built on ONNXScript and ONNX IR.
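For intuition about what the symmetric int8 weight scheme in the quickstart config computes, here is a minimal NumPy sketch of per-tensor symmetric quantization. This is an illustrative sketch only, not this project's implementation; the function name is made up for the example.

```python
import numpy as np

def quantize_symmetric_int8(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Illustrative per-tensor symmetric int8 quantization: q = round(w / scale)."""
    # The scale maps the largest absolute weight onto the int8 range [-127, 127].
    max_abs = float(np.abs(w).max())
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

w = np.array([0.5, -1.27, 0.0, 1.0], dtype=np.float32)
q, scale = quantize_symmetric_int8(w)
# Dequantization recovers an approximation of the originals: w ≈ q * scale
w_hat = q.astype(np.float32) * scale
```

Symmetric quantization keeps the zero point at 0, which is why only a scale is returned; asymmetric schemes (as configured for activations above) additionally track a zero point.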