How to convert an AMP-trained model to get the best performance and speed? #3786

According to the docs (https://docs.pytorch.org/TensorRT/user_guide/mixed_precision.html), a model can be converted with this project when the parameter precisions are stated explicitly in the model code. But when I train a model with torch AMP and GradScaler, no value precision is tagged in the model code. Can we still use this method to get a converted checkpoint with the best performance and inference speedup?

In fact, we tried the PyTorch pt -> ONNX -> TensorRT FP16 pipeline to convert the AMP-trained PyTorch checkpoint into the TRT model format, but the inference results are noisy. The pt -> ONNX -> TensorRT FP32 pipeline produces a TRT FP32 model, but its inference is slower than what we need.
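
To illustrate one possible source of the noise (an assumption on my side, not something verified on our model): under AMP, torch autocast keeps some ops such as sums and softmax in FP32, while a blanket FP16 conversion may run them in FP16. The standard-library sketch below only emulates FP16 rounding with `struct`; the helper names are hypothetical and it does not call TensorRT or PyTorch.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_fp16_accumulator(values):
    """Sum FP16 inputs while also rounding the running total to FP16."""
    acc = 0.0
    for v in values:
        acc = to_fp16(acc + to_fp16(v))
    return acc


def sum_fp32_accumulator(values):
    """Sum the same FP16 inputs, but keep the running total in FP32."""
    acc = 0.0
    for v in values:
        acc = to_fp32(acc + to_fp16(v))
    return acc


# Hypothetical input: 4096 activations that are all 1.0.
values = [1.0] * 4096
print("FP16 accumulator:", sum_fp16_accumulator(values))
print("FP32 accumulator:", sum_fp32_accumulator(values))
```

In this toy case the FP16 running total stops growing once the spacing between neighbouring FP16 values exceeds the addend, while the FP32 total stays exact. If something similar happens in our reductions, keeping only those layers in FP32 and the rest in FP16 might recover accuracy without paying the full FP32 cost, but that is a guess.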
