Pull Request Overview
This PR adds support for grouped-query attention (GQA) by updating parameter naming and computation logic in performance-related functions.
- Renames parameters (e.g., H → H_Q, d_k → d_h, N_K → N_KV) to align with GQA semantics.
- Updates flops, bytes, and backward flops calculations and adjusts the extraction of tensor shapes in both forward and backward methods.
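To make the renaming concrete, here is a minimal sketch of how forward FLOPs and memory traffic could be counted for GQA. It is not the code in perf_model.py: the function names, the non-causal assumption, the "each tensor is moved once" assumption, and the bytes_per_elem parameter are all illustrative. Only the symbols B, H_Q, H_KV, N_Q, N_KV, and d_h are taken from the PR.

```python
def gqa_sdpa_fwd_flops(B, H_Q, N_Q, N_KV, d_h):
    """Matmul FLOPs for non-causal attention (softmax cost ignored).

    Hypothetical helper, not the perf_model.py implementation.
    """
    # Q @ K^T: each of the B * H_Q heads does an (N_Q x d_h) @ (d_h x N_KV) matmul.
    flops_qk = 2 * B * H_Q * N_Q * N_KV * d_h
    # P @ V: each head does an (N_Q x N_KV) @ (N_KV x d_h) matmul.
    flops_pv = 2 * B * H_Q * N_Q * N_KV * d_h
    return flops_qk + flops_pv


def gqa_sdpa_fwd_bytes(B, H_Q, H_KV, N_Q, N_KV, d_h, bytes_per_elem=2):
    """Idealized memory traffic: Q, K, V read once and the output written once.

    Hypothetical helper; bytes_per_elem=2 assumes fp16/bf16 tensors.
    """
    q_elems = B * H_Q * N_Q * d_h
    kv_elems = 2 * B * H_KV * N_KV * d_h  # K and V use the smaller head count
    out_elems = B * H_Q * N_Q * d_h
    return (q_elems + kv_elems + out_elems) * bytes_per_elem


flops = gqa_sdpa_fwd_flops(B=1, H_Q=8, N_Q=4, N_KV=4, d_h=2)
nbytes = gqa_sdpa_fwd_bytes(B=1, H_Q=8, H_KV=2, N_Q=4, N_KV=4, d_h=2)
```

Under these assumptions the matmul FLOPs depend only on H_Q, while the K and V traffic scales with H_KV, which is why a single H parameter is no longer enough once the two head counts differ.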
Comments suppressed due to low confidence (3)
TraceLens/PerfModel/perf_model.py:873
- The dimension ordering in backward get_param_details appears inconsistent with the forward function, where Q shape is expected as (B, N_Q, H_Q, d_h). Verify that the reordering to (B, H_Q, N_Q, d_h) is intentional.
B, H_Q, N_Q, d_h = q_shape
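The concern above can be illustrated with a small sketch: the same 4-tuple yields different H_Q and N_Q depending on which layout the unpacking assumes. The helper name and the layout strings "bnhd" and "bhnd" are hypothetical and are not taken from perf_model.py.

```python
def unpack_q_shape(q_shape, layout):
    """Unpack a 4-D Q shape under an explicitly named layout.

    Hypothetical helper; "bnhd" means (B, N_Q, H_Q, d_h) and
    "bhnd" means (B, H_Q, N_Q, d_h).
    """
    if len(q_shape) != 4:
        raise ValueError(f"expected a 4-D shape, got {q_shape!r}")
    if layout == "bnhd":
        B, N_Q, H_Q, d_h = q_shape
    elif layout == "bhnd":
        B, H_Q, N_Q, d_h = q_shape
    else:
        raise ValueError(f"unknown layout: {layout!r}")
    return {"B": B, "H_Q": H_Q, "N_Q": N_Q, "d_h": d_h}


q_shape = (2, 128, 32, 64)
as_bnhd = unpack_q_shape(q_shape, "bnhd")  # sequence length 128, 32 heads
as_bhnd = unpack_q_shape(q_shape, "bhnd")  # 128 heads, sequence length 32
```

If the forward and backward paths really do receive tensors in different layouts, naming the layout at the call site makes that explicit; if they do not, any term that uses H_Q or N_Q on its own (for example H_Q // H_KV) would be computed from the wrong dimension.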
TraceLens/PerfModel/perf_model.py:798
- [nitpick] H_Q // H_KV is integer floor division, so the result is truncated whenever H_Q is not an exact multiple of H_KV. Please confirm that this is the intended behavior for the group-size scaling in the GQA computations.
flops_vgrad += B * N_KV * d_h * (H_Q//H_KV -1 )
TraceLens/PerfModel/perf_model.py:809
- [nitpick] The same H_Q // H_KV floor division appears here in the K-gradient term. Double-check that it is intended, as with flops_vgrad.
flops_k_grad += B * N_KV * d_h * (H_Q//H_KV -1 )
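One way to address both nitpicks is to compute the group size in a single place and fail loudly when H_Q is not a multiple of H_KV, rather than letting // truncate silently. The sketch below is an assumption, not the PR's code: both helper names are hypothetical, and the second function only mirrors the quoted expression. Reading the (group size - 1) factor as the number of additions needed to accumulate gradients from the query heads that share one KV head is one plausible interpretation, not something the PR states.

```python
def gqa_group_size(H_Q, H_KV):
    """Number of query heads per KV head (hypothetical helper).

    Raises instead of truncating when H_Q is not a multiple of H_KV.
    """
    if H_KV <= 0 or H_Q % H_KV != 0:
        raise ValueError(f"H_Q={H_Q} must be a positive multiple of H_KV={H_KV}")
    return H_Q // H_KV


def kv_grad_reduction_term(B, N_KV, d_h, H_Q, H_KV):
    """Mirrors the quoted term: B * N_KV * d_h * (H_Q // H_KV - 1)."""
    return B * N_KV * d_h * (gqa_group_size(H_Q, H_KV) - 1)


group = gqa_group_size(32, 8)                          # exact division
term = kv_grad_reduction_term(B=1, N_KV=4, d_h=2, H_Q=32, H_KV=8)
truncated = 30 // 8                                    # floor division drops the remainder
```

With H_Q = H_KV (plain multi-head attention) the term is zero, which matches the idea that no cross-head accumulation is needed in that case.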
lauri9 pushed a commit that referenced this pull request on Jun 11, 2025