Write extra state for KV quantizer #673
base: main
Conversation
Signed-off-by: jenchen13 <[email protected]>
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@ Coverage Diff @@
##             main     #673      +/-   ##
==========================================
+ Coverage   74.50%   74.80%   +0.29%
==========================================
  Files         183      192       +9
  Lines       18400    18814     +414
==========================================
+ Hits        13709    14073     +364
- Misses       4691     4741      +50

☔ View full report in Codecov by Sentry.
TODO: fix the unit test to work while commenting on quantization of the test model: https://github.com/NVIDIA/Model-Optimizer/blob/main/tests/gpu/torch/quantization/plugins/test_megatron.py#L870
TODO currently
What does this PR do?
Type of change: Bug fix
Overview: Fix a bug when resuming training from a KV-cache-quantized checkpoint by writing extra state for core_attention to the checkpoint.
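For background (and not this PR's actual implementation, which targets Megatron checkpointing): PyTorch modules can persist non-parameter state through the `get_extra_state` / `set_extra_state` hooks, which `state_dict()` and `load_state_dict()` call automatically. A minimal sketch of that mechanism, using hypothetical names (`CoreAttentionStub`, `k_amax`, `v_amax`):

```python
import torch

class CoreAttentionStub(torch.nn.Module):
    """Hypothetical stand-in for a core_attention module whose KV-cache
    quantizers hold calibration state (e.g. amax values)."""

    def __init__(self):
        super().__init__()
        # Plain tensor attributes: not parameters or buffers, so they would
        # silently be lost on save/load without the extra-state hooks below.
        self.k_amax = torch.tensor(1.0)
        self.v_amax = torch.tensor(1.0)

    def get_extra_state(self):
        # state_dict() stores this return value under "<prefix>_extra_state".
        return {"k_amax": self.k_amax, "v_amax": self.v_amax}

    def set_extra_state(self, state):
        # load_state_dict() passes the saved value back when resuming.
        self.k_amax = state["k_amax"]
        self.v_amax = state["v_amax"]


# Round trip: the amax values survive the checkpoint.
m = CoreAttentionStub()
m.k_amax = torch.tensor(3.5)
sd = m.state_dict()

m2 = CoreAttentionStub()
m2.load_state_dict(sd)
assert m2.k_amax.item() == 3.5
```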
Usage
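The template's usage snippet was left unfilled; below is a minimal sketch of the scenario this fix targets: quantize a model, checkpoint it, and resume from that checkpoint. The toy model, calibration loop, and config choice are placeholders, and the real Megatron training flow uses distributed checkpointing rather than `mto.save` / `mto.restore`.

```python
import torch
import modelopt.torch.opt as mto
import modelopt.torch.quantization as mtq

# Placeholder model and calibration loop; a real run would use a Megatron
# GPT model and a proper calibration dataset.
def build_model():
    return torch.nn.Sequential(
        torch.nn.Linear(16, 16), torch.nn.ReLU(), torch.nn.Linear(16, 16)
    )

def calib_loop(m):
    for _ in range(8):
        m(torch.randn(4, 16))

# FP8_DEFAULT_CFG is used as a stand-in; a KV-cache-specific config would
# additionally place quantizers on the attention (core_attention) outputs.
model = build_model()
model = mtq.quantize(model, mtq.FP8_DEFAULT_CFG, calib_loop)

# Save the ModelOpt state and weights; this PR is about making sure the
# core_attention quantizers' extra state lands in the checkpoint as well.
mto.save(model, "quantized_ckpt.pt")

# Resume: rebuild the model and restore the quantized state from disk.
model2 = build_model()
model2 = mto.restore(model2, "quantized_ckpt.pt")
```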
Testing
Before your PR is "Ready for review"
Additional Information