Description
The current compiler implementation of fp128 is confusing. `__float128` is a GCC extension with inconsistent support (e.g., LoongArch64 doesn't support `Q` suffix literals), while `_Float128` is standardized but may lack support in some compilers (e.g., Clang lacks the C23 `_Float128` type and the `f128` suffix; see llvm/llvm-project#80195).
We might implement dual support for compatibility: detect `_Float128` first, then fall back to `__float128` if it is unavailable.
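
A minimal sketch of that detection order in C, not a final design. The names `fp128_t`, `FP128_C`, `FP128_MANT_DIG`, and `FP128_BACKEND` are hypothetical. It assumes that the GCC-style predefined macro `__FLT128_MANT_DIG__` is a reliable signal for `_Float128` plus the `f128` suffix, and that `__SIZEOF_FLOAT128__` signals `__float128`; both assumptions would need checking on every target we care about. A `long double` branch is included only so the sketch compiles everywhere.

```c
#include <float.h>

/* All FP128_* / fp128_* names below are illustrative, not an existing API. */
#if defined(__FLT128_MANT_DIG__) && !defined(__cplusplus)
  /* Preferred: the standardized type. Assumes the compiler predefines
     __FLT128_MANT_DIG__ only when _Float128 and the f128 suffix work. */
  typedef _Float128 fp128_t;
  #define FP128_C(x)      x##f128
  #define FP128_MANT_DIG  __FLT128_MANT_DIG__
  #define FP128_BACKEND   "_Float128"
#elif defined(__SIZEOF_FLOAT128__)
  /* Fallback: the GCC extension. The Q suffix is avoided because some
     targets reject it; the cast from a long double literal may lose
     precision for constants that long double cannot represent exactly. */
  typedef __float128 fp128_t;
  #define FP128_C(x)      ((fp128_t)x##L)
  #define FP128_MANT_DIG  113
  #define FP128_BACKEND   "__float128"
#else
  /* Last resort so the sketch still builds: this is NOT guaranteed
     to be a 128-bit IEEE type. */
  typedef long double fp128_t;
  #define FP128_C(x)      x##L
  #define FP128_MANT_DIG  LDBL_MANT_DIG
  #define FP128_BACKEND   "long double"
#endif

/* Reports whether the selected type has the binary128 layout
   (16 bytes of storage, 113 bits of precision). */
static int fp128_is_binary128(void) {
    return sizeof(fp128_t) == 16 && FP128_MANT_DIG == 113;
}

/* Tiny arithmetic helper showing the literal macro in use. */
static fp128_t fp128_half(void) {
    return FP128_C(1.0) / FP128_C(2.0);
}
```

Callers would write constants as `FP128_C(0.1)` and never use a suffix directly, so the choice of backend stays in one place. Whether the `__float128` branch should use the `Q` suffix where it is available (to keep full-precision literals) is an open design question.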
See also:
- https://en.cppreference.com/w/cpp/types/floating-point
- https://gcc.gnu.org/onlinedocs/gcc/Floating-Types.html
- Related issue on GCC: GCC Bugzilla: LoongArch: Q Suffix for `__float128` Literals Not Supported
- Related issue on Clang: [clang] missing support for `_Float128` (C23), llvm/llvm-project#80195
- The support status table on cppreference:

| Type (defined in header `<stdfloat>`) | Literal suffix | Predefined macro | C language type | Bits of storage | Bits of precision | Bits of exponent | Max exponent |
|---|---|---|---|---|---|---|---|
| `float16_t` | `f16` or `F16` | `__STDCPP_FLOAT16_T__` | `_Float16` | 16 | 11 | 5 | 15 |
| `float32_t` | `f32` or `F32` | `__STDCPP_FLOAT32_T__` | `_Float32` | 32 | 24 | 8 | 127 |
| `float64_t` | `f64` or `F64` | `__STDCPP_FLOAT64_T__` | `_Float64` | 64 | 53 | 11 | 1023 |
| `float128_t` | `f128` or `F128` | `__STDCPP_FLOAT128_T__` | `_Float128` | 128 | 113 | 15 | 16383 |
| `bfloat16_t` | `bf16` or `BF16` | `__STDCPP_BFLOAT16_T__` | (N/A) | 16 | 8 | 8 | 127 |