|
4995 | 4995 | The types
|
4996 | 4996 | \keyword{float}, \keyword{double}, and \tcode{\keyword{long} \keyword{double}},
|
4997 | 4997 | and cv-qualified versions\iref{basic.type.qualifier} thereof,
|
| 4998 | +are collectively termed |
| 4999 | +\defnx{standard floating-point types}{type!floating-point!standard}. |
| 5000 | +An implementation may also provide additional types |
| 5001 | +that represent floating-point values and define them (and cv-qualified versions thereof) to be |
| 5002 | +\defnx{extended floating-point types}{type!floating-point!extended}. |
| 5003 | +The standard and extended floating-point types |
4998 | 5004 | are collectively termed \defnx{floating-point types}{type!floating-point}.
|
4999 |
| -The value |
5000 |
| -representation of floating-point types is \impldef{value representation of |
5001 |
| -floating-point types}. |
5002 |
| -\indextext{floating-point type!implementation-defined}% |
5003 | 5005 | \begin{note}
|
5004 |
| -This document imposes no requirements on the accuracy of |
5005 |
| -floating-point operations; see also~\ref{support.limits}. |
| 5006 | +Any additional implementation-specific types representing floating-point values |
| 5007 | +that are not defined by the implementation to be extended floating-point types |
| 5008 | +are not considered to be floating-point types, and |
| 5009 | +this document imposes no requirements on them or |
| 5010 | +their interactions with floating-point types. |
5006 | 5011 | \end{note}
|
| 5012 | +Except as specified in \ref{basic.extended.fp}, |
| 5013 | +the object and value representations and accuracy of operations |
| 5014 | +of floating-point types are \impldef{representation of floating-point types}. |
5007 | 5015 |
|
5008 | 5016 | \pnum
|
5009 | 5017 | Integral and floating-point types are collectively
|
|
5049 | 5057 | same value representation, they are nevertheless different types.
|
5050 | 5058 | \end{note}
|
5051 | 5059 |
|
| 5060 | +\rSec2[basic.extended.fp]{Optional extended floating-point types} |
| 5061 | + |
| 5062 | +\pnum |
| 5063 | +If the implementation supports an extended floating-point type\iref{basic.fundamental} |
| 5064 | +whose properties are specified by |
| 5065 | +the ISO/IEC/IEEE 60559 floating-point interchange format binary16, |
| 5066 | +then the \grammarterm{typedef-name} \tcode{std::float16_t} |
| 5067 | +is defined in the header \libheaderref{stdfloat} and names such a type, |
| 5068 | +the macro \mname{STDCPP_FLOAT16_T} is defined\iref{cpp.predefined}, and |
| 5069 | +the floating-point literal suffixes \tcode{f16} and \tcode{F16} |
| 5070 | +are supported\iref{lex.fcon}. |
| 5071 | + |
| 5072 | +\pnum |
| 5073 | +If the implementation supports an extended floating-point type |
| 5074 | +whose properties are specified by |
| 5075 | +the ISO/IEC/IEEE 60559 floating-point interchange format binary32, |
| 5076 | +then the \grammarterm{typedef-name} \tcode{std::float32_t} |
| 5077 | +is defined in the header \libheader{stdfloat} and names such a type, |
| 5078 | +the macro \mname{STDCPP_FLOAT32_T} is defined, and |
| 5079 | +the floating-point literal suffixes \tcode{f32} and \tcode{F32} are supported. |
| 5080 | + |
| 5081 | +\pnum |
| 5082 | +If the implementation supports an extended floating-point type |
| 5083 | +whose properties are specified by |
| 5084 | +the ISO/IEC/IEEE 60559 floating-point interchange format binary64, |
| 5085 | +then the \grammarterm{typedef-name} \tcode{std::float64_t} |
| 5086 | +is defined in the header \libheader{stdfloat} and names such a type, |
| 5087 | +the macro \mname{STDCPP_FLOAT64_T} is defined, and |
| 5088 | +the floating-point literal suffixes \tcode{f64} and \tcode{F64} are supported. |
| 5089 | + |
| 5090 | +\pnum |
| 5091 | +If the implementation supports an extended floating-point type |
| 5092 | +whose properties are specified by |
| 5093 | +the ISO/IEC/IEEE 60559 floating-point interchange format binary128, |
| 5094 | +then the \grammarterm{typedef-name} \tcode{std::float128_t} |
| 5095 | +is defined in the header \libheader{stdfloat} and names such a type, |
| 5096 | +the macro \mname{STDCPP_FLOAT128_T} is defined, and |
| 5097 | +the floating-point literal suffixes \tcode{f128} and \tcode{F128} are supported. |
| 5098 | + |
| 5099 | +\pnum |
| 5100 | +If the implementation supports an extended floating-point type |
| 5101 | +with the properties, as specified by ISO/IEC/IEEE 60559, of |
| 5102 | +radix ($b$) of 2, |
| 5103 | +storage width in bits ($k$) of 16, |
| 5104 | +precision in bits ($p$) of 8, |
| 5105 | +maximum exponent ($emax$) of 127, and |
| 5106 | +exponent field width in bits ($w$) of 8, then |
| 5107 | +the \grammarterm{typedef-name} \tcode{std::bfloat16_t} |
| 5108 | +is defined in the header \libheader{stdfloat} and names such a type, |
| 5109 | +the macro \mname{STDCPP_BFLOAT16_T} is defined, and |
| 5110 | +the floating-point literal suffixes \tcode{bf16} and \tcode{BF16} are supported. |
| 5111 | + |
| 5112 | +\pnum |
| 5113 | +\begin{note} |
| 5114 | +A summary of the parameters for each type is given in \tref{basic.extended.fp}. |
| 5115 | +The precision $p$ includes the implicit 1 bit at the beginning of the mantissa, |
| 5116 | +so the storage used for the mantissa is $p-1$ bits. |
| 5117 | +ISO/IEC/IEEE 60559 does not assign a name for a type |
| 5118 | +having the parameters specified for \tcode{std::bfloat16_t}. |
| 5119 | +\end{note} |
| 5120 | +\begin{floattable} |
| 5121 | +{Properties of named extended floating-point types}{basic.extended.fp}{llllll} |
| 5122 | +\topline |
| 5123 | +\lhdr{Parameter} & \chdr{\tcode{float16_t}} & \chdr{\tcode{float32_t}} & |
| 5124 | +\chdr{\tcode{float64_t}} & \chdr{\tcode{float128_t}} & |
| 5125 | +\rhdr{\tcode{bfloat16_t}} \\ |
| 5126 | +\capsep |
| 5127 | +ISO/IEC/IEEE 60559 name & binary16 & binary32 & binary64 & binary128 & \\ |
| 5128 | +$k$, storage width in bits & 16 & 32 & 64 & 128 & 16 \\ |
| 5129 | +$p$, precision in bits & 11 & 24 & 53 & 113 & 8 \\ |
| 5130 | +$emax$, maximum exponent & 15 & 127 & 1023 & 16383 & 127 \\ |
| 5131 | +$w$, exponent field width in bits & 5 & 8 & 11 & 15 & 8 \\ |
| 5132 | +\end{floattable} |
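
The following is an illustrative sketch only, not part of the wording. It cross-checks the table in C++: each column satisfies $k = 1 + w + (p - 1)$ and $emax = 2^{w-1} - 1$. The helper names \tcode{row_is_consistent} and \tcode{matches_row} are hypothetical. The sketch assumes the implementation predefines the macros as \tcode{__STDCPP_FLOAT16_T__} and so on, and it compiles to the table checks alone where no extended type is available. Note that \tcode{std::numeric_limits<T>::max_exponent} is one greater than $emax$.

```cpp
#include <limits>
#if __has_include(<stdfloat>)
#include <stdfloat>  // C++23 header; only included where the implementation ships it
#endif

// Hypothetical helper: a column (k, p, emax, w) is internally consistent when
// k = 1 sign bit + w exponent bits + (p - 1) stored significand bits,
// and emax = 2^(w-1) - 1.
constexpr bool row_is_consistent(int k, int p, int emax, int w) {
    return k == 1 + w + (p - 1) && emax == (1 << (w - 1)) - 1;
}

static_assert(row_is_consistent(16, 11, 15, 5));         // float16_t
static_assert(row_is_consistent(32, 24, 127, 8));        // float32_t
static_assert(row_is_consistent(64, 53, 1023, 11));      // float64_t
static_assert(row_is_consistent(128, 113, 16383, 15));   // float128_t
static_assert(row_is_consistent(16, 8, 127, 8));         // bfloat16_t

// Hypothetical helper: compares a type's numeric_limits against a table row.
// numeric_limits::max_exponent is emax + 1 by its definition.
template <class T>
constexpr bool matches_row(int p, int emax) {
    using lim = std::numeric_limits<T>;
    return lim::radix == 2 && lim::digits == p && lim::max_exponent == emax + 1;
}

// Each check is compiled only if the implementation advertises the type.
#if defined(__STDCPP_FLOAT16_T__)
static_assert(matches_row<std::float16_t>(11, 15));
constexpr std::float16_t half_one = 1.0f16;      // literal suffix f16
#endif

#if defined(__STDCPP_FLOAT32_T__)
static_assert(matches_row<std::float32_t>(24, 127));
constexpr std::float32_t single_one = 1.0f32;    // literal suffix f32
#endif

#if defined(__STDCPP_BFLOAT16_T__)
static_assert(matches_row<std::bfloat16_t>(8, 127));
constexpr std::bfloat16_t brain_one = 1.0bf16;   // literal suffix bf16
#endif
```

Testing the macro before naming the type keeps the translation unit well-formed on implementations that provide none of the optional types.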
| 5133 | + |
| 5134 | +\pnum |
| 5135 | +\recommended |
| 5136 | +Any names that the implementation provides for |
| 5137 | +the extended floating-point types described in this subsection |
| 5138 | +that are in addition to the names defined in the \libheader{stdfloat} header |
| 5139 | +should be chosen to increase compatibility and interoperability |
| 5140 | +with the interchange types |
| 5141 | +\tcode{_Float16}, \tcode{_Float32}, \tcode{_Float64}, and \tcode{_Float128} |
| 5142 | +defined in ISO/IEC TS 18661-3 and with future versions of the C standard. |
| 5143 | + |
5052 | 5144 | \rSec2[basic.compound]{Compound types}
|
5053 | 5145 |
|
5054 | 5146 | \pnum
|
|
5337 | 5429 | has the top-level cv-qualifier \keyword{volatile}.
|
5338 | 5430 | \end{example}
|
5339 | 5431 |
|
5340 |
| -\rSec2[conv.rank]{Integer conversion rank}% |
| 5432 | +\rSec2[conv.rank]{Conversion ranks}% |
5341 | 5433 | \indextext{conversion!integer rank}
|
5342 | 5434 |
|
5343 | 5435 | \pnum
|
|
5394 | 5486 | conversions\iref{expr.arith.conv}.
|
5395 | 5487 | \end{note}
|
5396 | 5488 |
|
| 5489 | +\pnum |
| 5490 | +Every floating-point type has a \defnadj{floating-point}{conversion rank} |
| 5491 | +defined as follows: |
| 5492 | +\begin{itemize} |
| 5493 | +\item |
| 5494 | +The rank of a floating-point type \tcode{T} is greater than |
| 5495 | +the rank of any floating-point type |
| 5496 | +whose set of values is a proper subset of the set of values of \tcode{T}. |
| 5497 | +\item |
| 5498 | +The rank of \tcode{\keyword{long} \keyword{double}} is greater than |
| 5499 | +the rank of \keyword{double}, |
| 5500 | +which is greater than the rank of \keyword{float}. |
| 5501 | +\item |
| 5502 | +Two extended floating-point types with the same set of values have equal ranks. |
| 5503 | +\item |
| 5504 | +An extended floating-point type with the same set of values as |
| 5505 | +exactly one cv-unqualified standard floating-point type |
| 5506 | +has a rank equal to the rank of that standard floating-point type. |
| 5507 | +\item |
| 5508 | +An extended floating-point type with the same set of values as |
| 5509 | +more than one cv-unqualified standard floating-point type |
| 5510 | +has a rank equal to the rank of \keyword{double}. |
| 5511 | +\end{itemize} |
| 5512 | +\begin{note} |
| 5513 | +The conversion ranks of floating-point types \tcode{T1} and \tcode{T2} |
| 5514 | +are unordered if the set of values of \tcode{T1} is |
| 5515 | +neither a subset nor a superset of the set of values of \tcode{T2}. |
| 5516 | +This can happen when one type has both a larger range and a lower precision |
| 5517 | +than the other. |
| 5518 | +\end{note} |
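
As a hedged illustration (not part of the wording), the rank rules above can be observed through the result type of a binary \tcode{+}, since the usual arithmetic conversions convert the operand of lesser rank to the type of the other. The alias \tcode{arith_result_t} is a hypothetical helper. The checks on extended types are compiled only if the implementation predefines the corresponding macros.

```cpp
#include <type_traits>
#include <utility>
#if __has_include(<stdfloat>)
#include <stdfloat>  // C++23 header; only included where the implementation ships it
#endif

// Hypothetical helper: the type of A + B after the usual arithmetic conversions.
template <class A, class B>
using arith_result_t = decltype(std::declval<A>() + std::declval<B>());

// Second bullet: rank(long double) > rank(double) > rank(float).
static_assert(std::is_same_v<arith_result_t<float, double>, double>);
static_assert(std::is_same_v<arith_result_t<double, long double>, long double>);

#if defined(__STDCPP_FLOAT16_T__) && defined(__STDCPP_FLOAT32_T__)
// First bullet: the binary16 values are a proper subset of the binary32 values,
// so std::float32_t has the greater rank and is the result type.
static_assert(std::is_same_v<arith_result_t<std::float16_t, std::float32_t>,
                             std::float32_t>);
static_assert(std::is_same_v<arith_result_t<std::float32_t, std::float16_t>,
                             std::float32_t>);
#endif

// Unordered ranks (see the note above): binary16 has more precision while the
// bfloat16 format has more range, so neither set of values contains the other.
// An expression mixing std::float16_t and std::bfloat16_t operands therefore
// has no common type under these rules unless one operand is cast explicitly.
```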
| 5519 | + |
| 5520 | +\pnum |
| 5521 | +Floating-point types that have equal floating-point conversion ranks |
| 5522 | +are ordered by floating-point conversion subrank. |
| 5523 | +The subrank forms a total order among types with equal ranks. |
| 5524 | +The types |
| 5525 | +\tcode{std::float16_t}, |
| 5526 | +\tcode{std::float32_t}, |
| 5527 | +\tcode{std::float64_t}, and |
| 5528 | +\tcode{std::float128_t}\iref{stdfloat.syn} |
| 5529 | +have a greater conversion subrank than any standard floating-point type |
| 5530 | +with equal conversion rank. |
| 5531 | +Otherwise, the conversion subrank order is |
| 5532 | +\impldef{floating-point conversion subrank}. |
| 5533 | + |
| 5534 | +\pnum |
| 5535 | +\begin{note} |
| 5536 | +The floating-point conversion rank and subrank are used in |
| 5537 | +the definition of the usual arithmetic conversions\iref{expr.arith.conv}. |
| 5538 | +\end{note} |
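
A minimal sketch of the subrank rule, again illustrative only: when two operand types have equal rank, the usual arithmetic conversions convert the operand with the lesser subrank to the type of the other. The names \tcode{usual_arith_t}, \tcode{float_looks_like_binary32}, and \tcode{subrank_prefers_float32_t} are hypothetical. The predicate on \tcode{float} is an assumption standing in for "\tcode{float} has the same set of values as binary32", which the language does not guarantee.

```cpp
#include <limits>
#include <type_traits>
#include <utility>
#if __has_include(<stdfloat>)
#include <stdfloat>  // C++23 header; only included where the implementation ships it
#endif

// Hypothetical helper: the type of A + B after the usual arithmetic conversions.
template <class A, class B>
using usual_arith_t = decltype(std::declval<A>() + std::declval<B>());

// Assumption: these properties are taken to mean that float has the same set
// of values as binary32, and hence the same rank as std::float32_t.
constexpr bool float_looks_like_binary32 =
    std::numeric_limits<float>::is_iec559 &&
    std::numeric_limits<float>::radix == 2 &&
    std::numeric_limits<float>::digits == 24 &&
    std::numeric_limits<float>::max_exponent == 128;

// Returns true when std::float32_t wins over float in either operand order,
// or when there is nothing to check on this implementation.
constexpr bool subrank_prefers_float32_t() {
#if defined(__STDCPP_FLOAT32_T__)
    return !float_looks_like_binary32 ||
           (std::is_same_v<usual_arith_t<float, std::float32_t>, std::float32_t> &&
            std::is_same_v<usual_arith_t<std::float32_t, float>, std::float32_t>);
#else
    return true;  // std::float32_t is not provided; nothing to check
#endif
}

static_assert(subrank_prefers_float32_t());
```

The result type does not depend on operand order, because the subrank is a total order among types of equal rank.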
| 5539 | + |
5397 | 5540 | \rSec1[basic.exec]{Program execution}
|
5398 | 5541 |
|
5399 | 5542 | \rSec2[intro.execution]{Sequential execution}
|
|