cmd/compile/mips: intrinsify bits.RotateLeft32 on MIPS #45028
stffabi wants to merge 1 commit into golang:master
This PR (HEAD: d438ffc) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/go/+/301711 to see it.
Message from Go Bot: Patch Set 1: Congratulations on opening your first change. Thank you for your contribution! Next steps: Most changes in the Go project go through a few rounds of revision. This can be […] During May-July and Nov-Jan the Go project is in a code freeze, during which […] Please don’t reply on this GitHub thread. Visit golang.org/cl/301711.
Message from Keith Randall: Patch Set 1: (5 comments) Please don’t reply on this GitHub thread. Visit golang.org/cl/301711.
Message from Keith Randall: Patch Set 1: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/301711.
Message from Keith Randall: Patch Set 1: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/301711.
Message from stffabi: Patch Set 1: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/301711.
Message from Dragan Mladjenovic: Patch Set 1: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/301711.
Message from Meng Zhuo: Patch Set 1: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/301711.
Message from stffabi: Patch Set 1: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/301711.
Message from Agniva De Sarker: Patch Set 1: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/301711.
Message from stffabi: Patch Set 1: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/301711.
Message from Meng Zhuo: Patch Set 1: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/301711.
Message from stffabi: Patch Set 1: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/301711.
Message from Meng Zhuo: Patch Set 1: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/301711.
Message from Keith Randall: Patch Set 1: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/301711.
Message from stffabi: Patch Set 1: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/301711.
Message from Filippo Valsorda: Patch Set 1: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/301711.
This CL implements the ROTR and ROTRV instructions for
MIPS and MIPS64, which are mips32r2 instructions.
Additionally, bits.RotateLeft32 is now an intrinsic and will be
rewritten to ROTR during the SSA phase.
This brings roughly a 65-70% throughput improvement for mipsle
code running Chacha20Poly1305 on an MT7688:
goos: linux
goarch: mipsle
pkg: golang.org/x/crypto/chacha20poly1305
name old time/op new time/op delta
Chacha20Poly1305/Open-16 56.2µs ±20% 38.5µs ±40% -31.45% (p=0.001 n=8+10)
Chacha20Poly1305/Seal-16 68.3µs ±49% 30.6µs ±13% -55.14% (p=0.000 n=10+10)
Chacha20Poly1305/Open-64 67.5µs ±22% 37.8µs ±19% -43.98% (p=0.000 n=9+9)
Chacha20Poly1305/Seal-64 64.7µs ±10% 37.6µs ± 8% -41.96% (p=0.000 n=9+8)
Chacha20Poly1305/Open-256 151µs ±13% 89µs ±20% -41.03% (p=0.000 n=9+10)
Chacha20Poly1305/Seal-256 148µs ±19% 93µs ±35% -37.15% (p=0.000 n=10+10)
Chacha20Poly1305/Open-1024 456µs ±16% 260µs ±23% -42.95% (p=0.000 n=10+10)
Chacha20Poly1305/Seal-1024 469µs ±14% 254µs ±15% -45.88% (p=0.000 n=10+9)
Chacha20Poly1305/Open-8192 3.59ms ±23% 1.94ms ±15% -45.86% (p=0.000 n=10+10)
Chacha20Poly1305/Seal-8192 3.47ms ±20% 2.03ms ±22% -41.60% (p=0.000 n=9+10)
Chacha20Poly1305/Open-16384 7.01ms ± 9% 4.22ms ±22% -39.89% (p=0.000 n=9+10)
Chacha20Poly1305/Seal-16384 7.43ms ±19% 4.23ms ±11% -43.04% (p=0.000 n=10+9)
name old speed new speed delta
Chacha20Poly1305/Open-16 258kB/s ±46% 431kB/s ±32% +67.05% (p=0.000 n=10+10)
Chacha20Poly1305/Seal-16 246kB/s ±35% 527kB/s ±13% +114.23% (p=0.000 n=10+10)
Chacha20Poly1305/Open-64 927kB/s ±31% 1664kB/s ±22% +79.50% (p=0.000 n=10+10)
Chacha20Poly1305/Seal-64 993kB/s ±10% 1709kB/s ± 8% +72.02% (p=0.000 n=9+8)
Chacha20Poly1305/Open-256 1.70MB/s ±13% 2.90MB/s ±18% +70.88% (p=0.000 n=9+10)
Chacha20Poly1305/Seal-256 1.74MB/s ±17% 2.81MB/s ±28% +61.16% (p=0.000 n=10+10)
Chacha20Poly1305/Open-1024 2.26MB/s ±15% 3.99MB/s ±20% +76.38% (p=0.000 n=10+10)
Chacha20Poly1305/Seal-1024 2.20MB/s ±13% 3.92MB/s ±32% +78.82% (p=0.000 n=10+10)
Chacha20Poly1305/Open-8192 2.31MB/s ±19% 4.24MB/s ±14% +83.72% (p=0.000 n=10+10)
Chacha20Poly1305/Seal-8192 2.30MB/s ±29% 4.09MB/s ±19% +77.66% (p=0.000 n=10+10)
Chacha20Poly1305/Open-16384 2.34MB/s ±10% 3.93MB/s ±19% +68.04% (p=0.000 n=9+10)
Chacha20Poly1305/Seal-16384 2.23MB/s ±17% 3.79MB/s ±23% +70.00% (p=0.000 n=10+10)
Fixes #39139