Hardware Float16 on A64fx #40216

On Arm's A64fx (i.e. aarch64), the official release (Julia 1.6) converts Float16 values to Float32 for arithmetic, although A64fx natively supports Float16:
```julia
julia> a,b = rand(Float16,2);

julia> @code_llvm a+b
;  @ float.jl:324 within `+'
define half @"julia_+_174"(half %0, half %1) {
top:
  %2 = fpext half %0 to float
  %3 = fpext half %1 to float
  %4 = fadd float %2, %3
  %5 = fptrunc float %4 to half
  ret half %5
}
```
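To check for this widening without reading the IR by eye, the output of `code_llvm` can be searched for `fpext half`. This is only a rough sketch: `widens_float16` is a hypothetical helper name, and the string match assumes the IR is printed as in the listing above.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

# Hypothetical helper: returns true if the LLVM IR generated for `f` on
# the given argument types contains a half -> wider float extension.
function widens_float16(f, types)
    io = IOBuffer()
    code_llvm(io, f, types; debuginfo=:none)
    ir = String(take!(io))
    return occursin("fpext half", ir)
end

# The result depends on the Julia build and the target hardware.
println("Float16 + widened: ", widens_float16(+, Tuple{Float16,Float16}))
```

A plain string search is crude, but it is enough to tell the two IR listings in this issue apart.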
So @vchuravy suggested commenting out https://github.com/JuliaLang/julia/blob/91c297b6c983aed2828600bcd9b8ba6494f747b6/src/aotcompile.cpp#L622-L623
and building 1.6 from source. This worked fine, and now Float16s are no longer converted:
```julia
julia> @code_llvm a+b
;  @ float.jl:324 within `+'
define half @"julia_+_174"(half %0, half %1) {
top:
  %2 = fadd half %0, %1
  ret half %2
}
```
This gives a typical 4x speed-up on arithmetic operations compared to Float64, unless subnormals are hit (see #40151).
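A speed-up like this could be measured roughly as follows. This is a minimal sketch, not the benchmark behind the number above: `scaled_add!` and `time_scaled_add` are hypothetical helpers, the array size and repetition count are arbitrary, and `@elapsed` from Base stands in for a proper benchmarking package. The ratio it prints depends entirely on the hardware and on whether Julia was built with the change described here.

```julia
# Hypothetical kernel: y[i] = a * x[i] + y[i], written so LLVM can vectorize it.
function scaled_add!(y, a, x)
    @inbounds @simd for i in eachindex(x, y)
        y[i] = a * x[i] + y[i]
    end
    return y
end

# Average time per call for element type T (sizes are arbitrary assumptions).
function time_scaled_add(::Type{T}; n=2^16, reps=1000) where {T}
    x = rand(T, n)
    y = rand(T, n)
    a = T(0.5)
    scaled_add!(y, a, x)  # warm-up call so compilation is not timed
    t = @elapsed for _ in 1:reps
        scaled_add!(y, a, x)
    end
    return t / reps
end

t16 = time_scaled_add(Float16)
t64 = time_scaled_add(Float64)
println("Float64 / Float16 time ratio: ", t64 / t16)
```

The inputs are drawn from `rand`, so the values stay away from the subnormal range mentioned above.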

The two lines at the permalink above (`src/aotcompile.cpp`, lines 622-623) are:

	PM->add(createDemoteFloat16Pass());
	PM->add(createGVNPass());