Closed
Description
On the Arm-based A64FX (i.e. aarch64), the official release (Julia 1.6) converts Float16 values to Float32 for arithmetic, although the A64FX natively supports Float16
julia> a,b = rand(Float16,2);
julia> @code_llvm a+b
; @ float.jl:324 within `+'
define half @"julia_+_174"(half %0, half %1) {
top:
%2 = fpext half %0 to float
%3 = fpext half %1 to float
%4 = fadd float %2, %3
%5 = fptrunc float %4 to half
ret half %5
}
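The same check can be done programmatically instead of reading the IR by eye. The following is a minimal sketch, assuming only the `InteractiveUtils` standard library; `llvm_ir` and `widens_float16` are hypothetical helper names introduced here for illustration, and the string match on `fpext half` is a heuristic, not an official API.

```julia
using InteractiveUtils

# Hypothetical helper: return the LLVM IR of `f` for the given argument types as a String.
function llvm_ir(f, types)
    io = IOBuffer()
    code_llvm(io, f, types; debuginfo=:none)
    return String(take!(io))
end

# Hypothetical helper: true if the IR extends a `half` operand to a wider type,
# which is the pattern seen in the output above.
widens_float16(ir::AbstractString) = occursin("fpext half", ir)

ir = llvm_ir(+, Tuple{Float16,Float16})
println(widens_float16(ir) ? "Float16 + goes through Float32" : "Float16 + is native")
```

Which message is printed depends on the Julia build and the target hardware.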
So @vchuravy suggested commenting out
Lines 622 to 623 in 91c297b
PM->add(createDemoteFloat16Pass());
PM->add(createGVNPass());
and building 1.6 from source. This worked fine, and now Float16s are no longer converted
julia> @code_llvm a+b
; @ float.jl:324 within `+'
define half @"julia_+_174"(half %0, half %1) {
top:
%2 = fadd half %0, %1
ret half %2
}
Native Float16 arithmetic gives a typical 4x speed-up over Float64, unless subnormals are hit (see #40151).
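A rough way to measure such a ratio on a given machine is sketched below. This is not the benchmark used in the issue; `axpy!` and `time_axpy` are hypothetical helpers, the array size and repetition count are arbitrary assumptions, and the measured ratio will vary with hardware and Julia build. The last line shows how to check whether a value falls into the subnormal range of Float16.

```julia
# Hypothetical kernel: y = a*x + y, written generically so it can be timed for any float type.
function axpy!(y, a, x)
    @inbounds @simd for i in eachindex(x, y)
        y[i] = a * x[i] + y[i]
    end
    return y
end

# Hypothetical helper: minimum wall-clock time of the kernel over `reps` runs.
function time_axpy(::Type{T}; n=2^16, reps=100) where {T}
    x, y = rand(T, n), rand(T, n)
    a = T(0.5)
    axpy!(y, a, x)  # warm-up call so compilation is not timed
    return minimum(@elapsed(axpy!(y, a, x)) for _ in 1:reps)
end

t16, t64 = time_axpy(Float16), time_axpy(Float64)
println("Float64 / Float16 time ratio: ", t64 / t16)

# Values with magnitude below floatmin(Float16) (and nonzero) are subnormal.
println("Float16(1e-7) subnormal? ", issubnormal(Float16(1e-7)))
```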