
add support for grouped gemm #254

Merged
ajassani merged 1 commit into main from feat/megatron_grouped_gemm on Aug 17, 2025

Conversation

@ajassani
Collaborator

No description provided.

@ajassani ajassani requested a review from Copilot August 17, 2025 23:54
Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds support for grouped GEMM (General Matrix Multiplication) operations to the TraceLens performance modeling framework. Grouped GEMM applies group-specific weight matrices to partitions of input tensors, which is useful for certain neural network architectures.

  • Implements a new GroupedGemm performance model class with forward/backward FLOP and byte calculations
  • Adds a custom implementation custom_grouped_gemm in the Megatron extension for parsing grouped GEMM events
  • Updates the extension mappings to register the new grouped GEMM operation
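To make the cost model concrete, here is a minimal sketch of how forward FLOPs and bytes could be counted for a grouped GEMM. The function names, argument names, and shapes are illustrative assumptions for this sketch, not the actual GroupedGemm implementation in this PR:

```python
# Hypothetical sketch of grouped-GEMM forward FLOP/byte counting.
# Assumed shapes: X is (M, K) split row-wise into groups, each group g
# has m_g rows and its own (K, N) weight; output Y is (M, N).
def grouped_gemm_fwd_flops(group_rows, K, N):
    """Each group multiplies an (m_g, K) slice by a (K, N) weight:
    2 * m_g * K * N FLOPs (multiply + add). Total is the sum over groups,
    which equals 2 * M * K * N since the group rows sum to M."""
    return sum(2 * m * K * N for m in group_rows)

def grouped_gemm_fwd_bytes(group_rows, K, N, bpe_in, bpe_out):
    """Read X (M, K) and one (K, N) weight per group; write Y (M, N)."""
    M = sum(group_rows)
    num_groups = len(group_rows)
    return (M * K + num_groups * K * N) * bpe_in + M * N * bpe_out

# Example: 3 groups with 128/256/64 rows, K=512, N=1024, bf16 in and out.
flops = grouped_gemm_fwd_flops([128, 256, 64], 512, 1024)
nbytes = grouped_gemm_fwd_bytes([128, 256, 64], 512, 1024, bpe_in=2, bpe_out=2)
```

Note the FLOP count is independent of how rows are partitioned across groups; only the byte count changes, because each group contributes its own weight-matrix read.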

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

  • TraceLens/PerfModel/perf_model.py: implements the core GroupedGemm performance model class with comprehensive documentation and computation methods
  • examples/example_megatron_extension.py: adds the custom_grouped_gemm implementation and registers it in the performance model mappings


if bpe_in == 1 or bpe_in == 2:
    bpe_out = 2
else:
    raise ValueError(f"Expected bpe_in to be 1 or 2, got {bpe_in}")

Copilot AI Aug 17, 2025

The error message should be more informative about what data types are supported. Consider mentioning the supported data types or referring to the name2bpe function documentation.

Suggested change
raise ValueError(f"Expected bpe_in to be 1 or 2, got {bpe_in}")
raise ValueError(
    f"Unsupported bpe_in value: {bpe_in}. Supported values are 1 (1-byte types such as fp8/int8) "
    "and 2 (2-byte types such as float16/bfloat16). See the name2bpe function documentation for details."
)

Y : tensor, shape (M, N)
The concatenated result of the groupwise multiplications.

Computation is functionally equivalent to (implementation detail will ofcourse be efficient):

Copilot AI Aug 17, 2025

There's a spelling error: 'ofcourse' should be 'of course'.

Suggested change
Computation is functionally equivalent to (implementation detail will ofcourse be efficient):
Computation is functionally equivalent to (implementation detail will of course be efficient):

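The docstring's "functionally equivalent" computation can be sketched as a plain NumPy loop. This is an illustrative reference only (the names grouped_gemm_reference and group_sizes are assumptions, not identifiers from the PR); a real kernel fuses these multiplications rather than looping:

```python
import numpy as np

def grouped_gemm_reference(X, weights, group_sizes):
    """Reference semantics of grouped GEMM: split X's M rows into
    consecutive groups and multiply each group by its own (K, N) weight,
    then concatenate the per-group results into Y with shape (M, N)."""
    outputs, start = [], 0
    for W, m in zip(weights, group_sizes):
        outputs.append(X[start:start + m] @ W)  # (m, K) @ (K, N) -> (m, N)
        start += m
    return np.concatenate(outputs, axis=0)

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))                          # M=6, K=4
weights = [rng.standard_normal((4, 3)) for _ in range(2)]  # two (K=4, N=3) groups
Y = grouped_gemm_reference(X, weights, [2, 4])           # Y has shape (6, 3)
```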
@ajassani ajassani merged commit 55def0b into main Aug 17, 2025
1 check passed
@ajassani ajassani deleted the feat/megatron_grouped_gemm branch August 17, 2025 23:56
