How to convert an AMP-trained model to get the best performance and speed? #3786

@JohnHerry

Description

According to the docs (https://docs.pytorch.org/TensorRT/user_guide/mixed_precision.html), a model can be converted with this project when the precision of each parameter is stated explicitly in the model code. However, when a model is trained with torch AMP and GradScaler, no value precision is tagged in the model code. Can we still use this method to get a converted checkpoint with the best performance and inference speedup?

In fact, we tried the pt -> onnx -> tensorrt fp16 pipeline to convert the AMP-trained PyTorch checkpoint into TensorRT model format, but the inference results are noisy. The pt -> onnx -> tensorrt fp32 pipeline produces a TensorRT fp32 model, but its inference is slower than what we need.
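
One plausible source of this kind of noise, stated here as an assumption rather than a diagnosis of this model, is that torch autocast keeps some numerically sensitive ops (for example reductions such as `sum`, and `softmax`) in float32, while a blanket fp16 conversion runs them in half precision. The minimal pure-Python sketch below only simulates that effect with the standard library; the helper names `to_fp16`, `to_fp32`, and `accumulate` are hypothetical and are not part of PyTorch, ONNX, or TensorRT.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def accumulate(values, round_fn) -> float:
    """Sum values, rounding the inputs and every partial sum with round_fn."""
    acc = 0.0
    for v in values:
        acc = round_fn(acc + round_fn(v))
    return acc


# Illustrative data only: 10,000 small activations whose exact sum is 10.
values = [0.001] * 10_000

fp16_sum = accumulate(values, to_fp16)  # reduction done entirely in fp16
fp32_sum = accumulate(values, to_fp32)  # reduction done in fp32, as AMP would

print("fp16 accumulation:", fp16_sum)
print("fp32 accumulation:", fp32_sum)
print("absolute error (fp16):", abs(fp16_sum - 10.0))
print("absolute error (fp32):", abs(fp32_sum - 10.0))
```

In this sketch the fp16 accumulator stops growing once the gap between adjacent fp16 values becomes more than twice the addend, so the result ends up far from 10, while the fp32 accumulator stays close. If something similar happens inside a real network, keeping only the sensitive layers in fp32 and the rest in fp16 may recover accuracy without giving up the full fp16 speedup.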
