Commit 04aa9b1
ggml webgpu: faster normal quant and some k-quant matrix operations, better shader parameter handling (ggml-org#20173)
* K quant speedup (ggml-org#20)
* Basic JIT compilation for mul_mat, get_rows, and scale (ggml-org#17)
* scale jit working
* preliminary working jit for getrows and mulmat, needs refining
* simplified mul_mat preprocessing switch statement
* get_rows fixes, mul_mat refinement
* formatted + last edits
* removed some extraneous prints
* fixed get_rows, fixed workgroup dispatch in mul_mat. no gibberish
* small fix
* some changes, working
* get_rows and mul_mat jit fixed and working
* Update formatting
* formatting
* Add header
---------
Co-authored-by: Neha Abbas <nehaabbas@ReeseLevines-MacBook-Pro.local>
Co-authored-by: Reese Levine <reeselevine1@gmail.com>
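
As a rough illustration of the JIT idea named in the block above (mul_mat, get_rows, scale): shader source can be specialized at runtime from a template and cached per configuration, so each variant is generated only once. This is a minimal sketch and not the code from this commit; `ShaderCache`, `replace_all`, and the `{{TYPE}}` / `{{WG_SIZE}}` placeholders are hypothetical names chosen for the example.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical helper: replace every occurrence of `key` in `text` with `value`.
static std::string replace_all(std::string text, const std::string & key, const std::string & value) {
    if (key.empty()) {
        return text;
    }
    std::size_t pos = text.find(key);
    while (pos != std::string::npos) {
        text.replace(pos, key.size(), value);
        // Continue searching after the inserted value so it is never re-scanned.
        pos = text.find(key, pos + value.size());
    }
    return text;
}

// Hypothetical cache of specialized shader sources, keyed by "op:type:wg_size".
class ShaderCache {
  public:
    // Returns the specialized source, generating it only on the first request for a key.
    const std::string & get(const std::string & op,
                            const std::string & templ,
                            const std::string & type,
                            unsigned            wg_size) {
        const std::string key = op + ":" + type + ":" + std::to_string(wg_size);
        auto it = cache_.find(key);
        if (it == cache_.end()) {
            std::string src = replace_all(templ, "{{TYPE}}", type);
            src             = replace_all(std::move(src), "{{WG_SIZE}}", std::to_string(wg_size));
            it              = cache_.emplace(key, std::move(src)).first;
            ++misses_;
        }
        return it->second;
    }

    // Number of times a new variant had to be generated.
    std::size_t misses() const { return misses_; }

  private:
    std::unordered_map<std::string, std::string> cache_;
    std::size_t                                  misses_ = 0;
};
```

Requesting the same (op, type, workgroup size) twice returns the cached string; only a new combination triggers another substitution pass. In a real backend the cached value would more likely be a compiled pipeline than a source string; that part is left out here.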
* Start work on all-encompassing shader library
* refactor argmax, set_rows
* Refactor all but flashattention, mat mul
* no gibberish, all k quants added, merged
* vec memory fix
* q6_k matching metal on my machine, tests passing
* Set tile size for q6_k separately
* Separate out fast shaders
---------
Co-authored-by: neha-ha <137219201+neha-ha@users.noreply.github.com>
* Move towards writeBuffer for params
* Move away from multiple buffers for set_rows errors, remove host buffer for parameter buffers, minor cleanups
* Remove extra file
* Formatting
---------
Co-authored-by: neha-ha <137219201+neha-ha@users.noreply.github.com>
1 parent 3ac9e6c commit 04aa9b1
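
The message above mentions a workgroup dispatch fix in mul_mat and a separate tile size for q6_k. The arithmetic that links tile size to dispatch size can be sketched as below, assuming each workgroup computes one `tile_m` x `tile_n` block of the output. `TileConfig`, `ceil_div`, and `mul_mat_workgroups` are hypothetical names, and the actual tiling scheme in the commit may differ.

```cpp
#include <cstdint>

// Hypothetical tile shape: one workgroup computes a tile_m x tile_n block of the output.
struct TileConfig {
    uint32_t tile_m;
    uint32_t tile_n;
};

// Ceiling division, written to avoid overflow in a + b - 1.
constexpr uint32_t ceil_div(uint32_t a, uint32_t b) {
    return a / b + (a % b != 0 ? 1u : 0u);
}

// Workgroups needed to cover an m x n output for `batches` independent matrices.
// Partial tiles at the edges still need a full workgroup, hence the rounding up.
constexpr uint32_t mul_mat_workgroups(uint32_t m, uint32_t n, uint32_t batches, TileConfig t) {
    return ceil_div(m, t.tile_m) * ceil_div(n, t.tile_n) * batches;
}
```

For example, a 100 x 33 output with 16 x 16 tiles and 2 batches needs 7 * 3 * 2 = 42 workgroups. Rounding down instead would leave the edge rows and columns unwritten, and changing the tile size for one quantization type changes the dispatch count for that type only.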
5 files changed
Lines changed: 1237 additions & 256 deletions
File tree
- ggml/src/ggml-webgpu
  - wgsl-shaders