fix: tex_ts::te_gemm_ts missing dtype by olehtika · Pull Request #142 · AMD-AGI/TraceLens

olehtika · 2025-05-08T13:12:06Z

Fix missing dtype in bytes calculation when calculating output dtype

jasainio · 2025-05-08T13:42:23Z

TraceLens/PerfModel/perf_model.py

        4: ['float', 'scalar'],
        2: ['c10::half', 'c10::bfloat16'],
-        1: ['c10::float8_e4m3fnuz'],
+        1: ['c10::float8_e4m3fnuz', 'unsigned char'],


Do we have any other data types that we could already add here?
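
For readers following the question above, here is a minimal sketch of how a bytes-per-element lookup of this shape could work. The table entries are taken from the diff context in this PR; the inverted dictionary, the `name2bpe_sketch` name, and the choice to return `None` for unknown names are assumptions made for illustration, not the actual body of `name2bpe` in perf_model.py.

```python
# Hypothetical sketch of a dtype-name -> bytes-per-element lookup.
# Table entries mirror the diff in this PR; everything else is an assumption.
BPE_TO_NAMES = {
    8: ['double', 'long int'],
    4: ['float', 'scalar'],
    2: ['c10::half', 'c10::bfloat16'],
    1: ['c10::float8_e4m3fnuz', 'unsigned char'],
}

# Invert the table once so each lookup is a single dict access.
NAME_TO_BPE = {
    name: bpe
    for bpe, names in BPE_TO_NAMES.items()
    for name in names
}


def name2bpe_sketch(name):
    """Return bytes per element for a dtype name, or None if it is unknown."""
    return NAME_TO_BPE.get(name)
```

Adding another dtype then only requires appending its name to the list for the matching byte width.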

ajassani

I don't think we need to make an assumption about the output dtype, as the
output dtype is at index 10 in event["args"]["Input type"].

ajassani · 2025-05-08T13:29:37Z

TraceLens/PerfModel/perf_model.py

+        else:
+            # assume output dtype lowest of inputs, ignore scalars alpha and beta for now
+            # TODO: correct later if better way found
+            self.bpe_output = min(self.bpe_mat1, self.bpe_mat2, self.bpe_bias)


I don't think we need to make an assumption here, as the
output dtype is at index 10 in event["args"]["Input type"].

Or is it bias dtype?

Here we use index 10 to get the shape of the output matrix:
https://github.com/AMD-AIG-AIMA/TraceLens/blob/main/TraceLens/PerfModel/perf_model.py#L474

We can also confirm here that arg index 10 is the output matrix:
https://github.com/ROCm/TransformerEngine/blob/e9772d4d18b2980e8e0643c94591a94cad9bb8b7/transformer_engine/pytorch/cpp_extensions/gemm.py#L254

Now included output dtype
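
To illustrate why this thread matters, the sketch below compares the earlier heuristic (the minimum of the input widths) with reading the output dtype directly from index 10. The `event` dictionary and its dtype values are made up for the example, and the local `BPE` table is a stand-in for `name2bpe`; only the indices 0, 5, 10, and 18 come from the discussion in this PR.

```python
# Stand-in for name2bpe; the dtype names follow the diff in this PR.
BPE = {
    'float': 4,
    'c10::half': 2,
    'c10::bfloat16': 2,
    'c10::float8_e4m3fnuz': 1,
    'unsigned char': 1,
}

# Hypothetical 'Input type' list. Only positions 0, 5, 10, and 18 are used.
input_types = [''] * 19
input_types[0] = 'c10::float8_e4m3fnuz'   # matrix A
input_types[5] = 'c10::float8_e4m3fnuz'   # matrix B
input_types[10] = 'c10::bfloat16'         # output matrix
input_types[18] = 'c10::bfloat16'         # bias

event = {'args': {'Input type': input_types}}
types = event['args']['Input type']

bpe_mat1 = BPE[types[0]]
bpe_mat2 = BPE[types[5]]
bpe_bias = BPE[types[18]]

# Earlier heuristic: assume the output is as narrow as the narrowest input.
bpe_output_guess = min(bpe_mat1, bpe_mat2, bpe_bias)

# Approach agreed on in this thread: read the output dtype from index 10.
bpe_output_actual = BPE[types[10]]

print(bpe_output_guess, bpe_output_actual)
```

With FP8 inputs and a bfloat16 output, as assumed here, the heuristic yields 1 byte per element while the recorded dtype yields 2, so the estimated output bytes would be off by a factor of two.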

Copilot

Pull Request Overview

This PR fixes the missing dtype issue in the calculation of bytes per element by updating the dtype mappings and how input types are handled.

Extend dtype mapping to include 'unsigned char'
Update input type tuple to include output and bias dtypes
Adjust the bytes computation logic to use the new tuple structure

Comments suppressed due to low confidence (2)

TraceLens/PerfModel/perf_model.py:503

Verify that the ordering in the input type tuple accurately reflects the intended mapping for A, B, output, and bias, particularly ensuring that index 10 corresponds to the output dtype and index 18 to the bias dtype.

dtype_A_B = (event['args']['Input type'][0], event['args']['Input type'][5], event['args']['Input type'][10], event['args']['Input type'][18])

TraceLens/PerfModel/perf_model.py:519

Confirm that the removal of the min() function and the direct mapping for output dtype are correct and consistent with the intended behavior, ensuring no additional logic is now required.

self.bpe_output = name2bpe(dtype_A_B[2])
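
One way to make the index assumptions raised in these two comments fail loudly is to check the list length before indexing. This is a sketch only: `REQUIRED_INDICES` and `select_dtypes_checked` are hypothetical names that do not exist in TraceLens, and the index-to-operand mapping is the one stated in the comment above.

```python
# Hypothetical helper; the index-to-operand mapping follows the review comment.
REQUIRED_INDICES = {'A': 0, 'B': 5, 'output': 10, 'bias': 18}


def select_dtypes_checked(input_types):
    """Return (A, B, output, bias) dtype names, or raise if the list is too short."""
    needed = max(REQUIRED_INDICES.values()) + 1
    if len(input_types) < needed:
        raise ValueError(
            f"expected at least {needed} entries in 'Input type', "
            f"got {len(input_types)}"
        )
    # Dicts preserve insertion order, so the tuple is (A, B, output, bias).
    return tuple(input_types[i] for i in REQUIRED_INDICES.values())
```

A length check cannot prove that index 10 really is the output dtype; that still rests on the TransformerEngine call signature linked earlier in the thread.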

Copilot · 2025-05-08T19:06:13Z

TraceLens/PerfModel/perf_model.py

@@ -39,7 +39,7 @@ def name2bpe(name):
        8: ['double', 'long int'],
        4: ['float', 'scalar'],
        2: ['c10::half', 'c10::bfloat16'],


[nitpick] Consider adding a comment to clarify the inclusion of 'unsigned char' in the dtype mapping to improve future maintainability.

Suggested change

-        2: ['c10::half', 'c10::bfloat16'],
+        2: ['c10::half', 'c10::bfloat16'],
+        # 'unsigned char' is included as a representation for 1-byte data types, often used as a fallback or for specific low-precision formats.

fix: tex_ts::te_gemm_ts missing dtype

db17d75

olehtika requested review from ajassani and jasainio May 8, 2025 13:12

fix: extend name2bpe with unsigned char

0a9f4d7

jasainio reviewed May 8, 2025

ajassani requested changes May 8, 2025

fix: output vs bias dtype

01f0361

ajassani approved these changes May 8, 2025

ajassani requested a review from Copilot May 8, 2025 19:05

Copilot AI reviewed May 8, 2025

ajassani merged commit fd7dc4b into main May 8, 2025

ajassani deleted the fix/te_ops_byte_missing_dtype branch May 8, 2025 19:08

lauri9 pushed a commit that referenced this pull request Jun 11, 2025

fix: tex_ts::te_gemm_ts missing dtype (#142)

661b91d

Fix missing dtype in bytes calculation when calculating output dtype

ajassani merged 3 commits into main from fix/te_ops_byte_missing_dtype

4 participants
