IQ1_M: 1.75 bpw quantization #6302
Merged
Commits
24 commits
2a2d66d
iq1_m: basics
ac8b3dd
iq1_m: basics-2
1df37b6
iq1_m: CUDA dequantize works
282f278
iq1_m: separate shifts for each group of 8 in a block
308c50d
iq1_m: go to 3-bit scales
64b9dfd
iq1_m: scalar dot product
a139de5
iq1_m: AVX2 dot product
379fdb6
iq1_m: very slightly faster AVX2 dot product
8009b6d
iq1_m: ARM_NEON dot product
0e36afa
iq1_m: Metal - dequantize works, dot product does not
19fb974
iq1_m: Metal now works
abc1d4f
iq1_m: minor
dff85a8
iq1_m: checking pure iq1_m quantization
f664692
iq1_m: slightly faster ARM_NEON dot product
b1d1c26
iq1_m: faster ARM_NEON dot product
78ce561
iq1_m: another minor ARM_NEON dot product improvement
3d9c21f
iq1_m: small PPL improvement via super-block scale adjustment
480d6d6
iq1_m: adapt to CUDA refactoring
62dd11f
iq1_m: remove unused variable
22fa121
iq1_m: add to backend-ops tests
b68f32b
iq1_m: fix Windows ARM
9a5786e
iq1_m: use common definition of iq1m_scale_t
cdb2d65
cuda: assert -> NO_DEVICE_CODE
6e4cef5
iq1_M: PR comments