MHA fusion cleanup #2481
Signed-off-by: Ganesan Ramalingam <[email protected]>
❌ 9 Tests Failed
Pull Request Overview
This PR cleans up MHA (Multi-Head Attention) fusion rules by simplifying pattern variations and improving scale attribute handling. The main improvements include removing redundant rule variations, introducing proper scale attribute handling through the new AttrVar pattern, and adding common subexpression elimination.
Key changes:
- Eliminates redundant MHA fusion rule variations by removing the `transpose_4d` and `pre_scale_q` parameters
- Introduces the `AttrVar` pattern with `can_match_none` support for optional attributes like scale
- Fixes scale attribute handling across MHA fusion patterns
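
The idea of an attribute variable that may also match a missing attribute can be illustrated with a small standalone sketch. This is not the `AttrVar` implementation from `onnxscript/rewriter/_pattern_ir.py`; the class `AttrVarSketch`, its `matches` method, and the helper `match_scale` are hypothetical names chosen for illustration, and node attributes are modeled as a plain dictionary.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class AttrVarSketch:
    """Hypothetical stand-in for an attribute pattern variable (not the onnxscript class)."""

    name: str
    can_match_none: bool = False

    def matches(
        self, attributes: dict[str, Any], attr_name: str, bindings: dict[str, Any]
    ) -> bool:
        """Bind the attribute value to this variable; fail on a missing
        attribute unless can_match_none is set."""
        value = attributes.get(attr_name)
        if value is None and not self.can_match_none:
            return False
        bindings[self.name] = value
        return True


def match_scale(node_attributes: dict[str, Any]) -> tuple[bool, dict[str, Any]]:
    """Match an optional 'scale' attribute, binding None when it is absent."""
    bindings: dict[str, Any] = {}
    var = AttrVarSketch("scale", can_match_none=True)
    ok = var.matches(node_attributes, "scale", bindings)
    return ok, bindings
```

With `can_match_none=True`, one pattern can cover both nodes that set `scale` explicitly and nodes that omit it, so a rewrite built on the match can pass the bound value (or `None`) through to the fused node without a second rule variation.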
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| onnxscript/rewriter/pattern.py | Exports new AttrVar pattern for use in fusion rules |
| onnxscript/rewriter/_pattern_ir.py | Implements AttrVar pattern with can_match_none support for optional attributes |
| onnxscript/rewriter/ort_fusions/mha.py | Removes redundant rule variations and simplifies MHA pattern generation |
| onnxscript/rewriter/ort_fusions/mha_bias.py | Updates to use AttrVar for scale attribute handling |
| onnxscript/rewriter/ort_fusions/attention.py | Extracts scale from matched nodes and uses named outputs |
| onnxscript/rewriter/ort_fusions/_core.py | Adds common subexpression elimination pass |
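
For readers unfamiliar with common subexpression elimination, the following is a minimal sketch of what such a pass does in general. It is not the pass added in `_core.py`: the function name `eliminate_common_subexpressions` and the tuple-based node representation are assumptions made for this example, and graph outputs are deliberately not handled.

```python
def eliminate_common_subexpressions(nodes):
    """Remove duplicate nodes from a topologically sorted node list.

    Each node is a hypothetical tuple:
        (output_name, op_type, input_names, attrs)
    where input_names and attrs are hashable tuples.
    Returns the surviving nodes and a map from removed outputs
    to the outputs that replace them.
    """
    seen = {}     # (op_type, inputs, attrs) -> canonical output name
    renamed = {}  # removed output name -> canonical output name
    kept = []
    for output, op_type, inputs, attrs in nodes:
        # Rewrite inputs first so duplicates of duplicates are also detected.
        inputs = tuple(renamed.get(name, name) for name in inputs)
        key = (op_type, inputs, attrs)
        if key in seen:
            renamed[output] = seen[key]
        else:
            seen[key] = output
            kept.append((output, op_type, inputs, attrs))
    return kept, renamed


# Two identical Transpose nodes feed two MatMul nodes that become identical
# once the transposes are merged.
perm = (("perm", (0, 2, 1, 3)),)
example_nodes = [
    ("t1", "Transpose", ("q",), perm),
    ("t2", "Transpose", ("q",), perm),
    ("m1", "MatMul", ("t1", "k"), ()),
    ("m2", "MatMul", ("t2", "k"), ()),
]
kept_nodes, replacements = eliminate_common_subexpressions(example_nodes)
```

After the pass, only `t1` and `m1` remain, and `replacements` maps `t2` to `t1` and `m2` to `m1`.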