Float::integer_decode cannot be implemented for quadruple precision floating types. #98

@jkarns275

This is the definition of integer_decode:

fn integer_decode(self) -> (u64, i16, i8)

The 64 bits allocated for the mantissa are too few: an IEEE 754 quadruple-precision float stores 112 mantissa bits (113 once the implicit leading bit is counted). The exponent requires 15 bits, so i16 is enough, and naturally i8 is enough for the sign.

I ran into this issue while working on bindings to the GCC __float128 type (aka libquadmath). Would changing the u64 meant for the mantissa to a u128 be too much of a breaking change? If not, I will implement it in the next couple of days.
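
For illustration, here is a minimal sketch of what a u128-based decode could look like. It is not an existing API: the name integer_decode_binary128 is hypothetical, and the function takes the raw 128-bit pattern as a u128 on the assumption that the value is only available as bits (for example, copied out of a __float128). It follows the same (mantissa, exponent, sign) convention as the signature above, with the implicit leading bit folded into the mantissa.

```rust
/// Hypothetical helper: decodes the raw bits of an IEEE 754 binary128 value
/// into (mantissa, exponent, sign), so that for finite inputs
/// value = sign * mantissa * 2^exponent.
fn integer_decode_binary128(bits: u128) -> (u128, i16, i8) {
    // Bit 127 is the sign bit.
    let sign: i8 = if bits >> 127 == 0 { 1 } else { -1 };

    // Bits 112..=126 hold the 15-bit biased exponent (0..=32767, fits in i16).
    let mut exponent: i16 = ((bits >> 112) & 0x7fff) as i16;

    // Bits 0..=111 hold the 112 explicitly stored mantissa bits.
    let fraction_mask: u128 = (1u128 << 112) - 1;
    let mantissa: u128 = if exponent == 0 {
        // Zero or subnormal: no implicit leading bit; shift to keep the
        // same exponent adjustment as for normal numbers.
        (bits & fraction_mask) << 1
    } else {
        // Normal number: restore the implicit leading 1 (bit 112).
        (bits & fraction_mask) | (1u128 << 112)
    };

    // Remove the bias (16383) and account for the 112 fraction bits.
    exponent -= 16383 + 112;

    (mantissa, exponent, sign)
}
```

With this sketch, the bit pattern of 1.0 (biased exponent 0x3FFF, zero fraction) decodes to a mantissa of 1 << 112, which already exceeds u64::MAX, while the adjusted exponent stays within i16.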
