🐛 [Bug] Forcing to_copy to insert ICast Layer reduces perf on Unet #3723

@zewenli98

Bug Description

Forcing to_copy to insert an ICast Layer reduces performance by about 10% on Unet.
It's not necessary to insert a Cast Layer if the dtype doesn't change, e.g., from DataType.HALF to DataType.HALF:

Forced Cast ITensor [NORMALIZATION]-[aten_ops.native_group_norm.default]-[model.1.submodule.1.submodule.conv.unit0.adn.N/native_group_norm_4]_output from DataType.HALF to DataType.HALF - [aten_ops.torch.ops.aten.clone.default]-[model.1.submodule.1.submodule.conv.unit0.adn.D/clone_4], type: LayerType.CAST, inputs: 1, outputs: 1

Currently, all copy-related ops insert a Cast Layer, and TensorRT doesn't remove these layers for us during optimization. We need to think carefully about when inserting a Cast Layer is actually required.
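
One possible direction, shown only as a minimal sketch: check the dtype before adding the layer and return the input tensor unchanged when the source and target dtypes match. The `DataType`, `Tensor`, and `Network` classes below are stand-ins so the snippet runs without TensorRT, and `cast_if_needed` is a hypothetical helper, not an existing Torch-TensorRT function. It does not settle whether some copy ops still need a forced layer for other reasons.

```python
from dataclasses import dataclass, field
from enum import Enum


class DataType(Enum):
    # Stand-in for the TensorRT dtype enum; only the members used here.
    FLOAT = "float32"
    HALF = "float16"


@dataclass
class Tensor:
    # Stand-in for an ITensor: a name plus a dtype.
    name: str
    dtype: DataType


@dataclass
class Network:
    # Stand-in for a network definition; records every cast layer it adds.
    cast_layers: list = field(default_factory=list)

    def add_cast(self, tensor: Tensor, to_type: DataType) -> Tensor:
        self.cast_layers.append((tensor.name, tensor.dtype, to_type))
        return Tensor(f"{tensor.name}_cast", to_type)


def cast_if_needed(network: Network, tensor: Tensor, target: DataType) -> Tensor:
    """Hypothetical helper: add a cast layer only when the dtype changes."""
    if tensor.dtype == target:
        # HALF -> HALF (and similar) is a no-op, so skip the extra layer.
        return tensor
    return network.add_cast(tensor, target)


net = Network()
x = Tensor("native_group_norm_4_output", DataType.HALF)

same = cast_if_needed(net, x, DataType.HALF)      # no layer added
changed = cast_if_needed(net, x, DataType.FLOAT)  # one cast layer added

print(len(net.cast_layers))
```

With this guard, the HALF to HALF case from the log above would add no layer, while a genuine dtype change still goes through a cast.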

Labels: bug
