
Commit 7b0efef

Support hexadecimal floating point literals
This adds hexadecimal floating point literals (IEEE 754-2008 §5.12.3) and supports constructing floats from hexadecimal strings. Note that the float constructor accepts a more permissive syntax than the literal form: everything that is currently accepted by float.fromhex, but with a mandatory base specifier. It also allows grouping digits with underscores.

Examples:

```pycon
>>> 0x1.1p-1
0.53125
>>> float('0x1.1')
1.0625
>>> 0x1.1
  File "<stdin>", line 1
    0x1.1
    ^
SyntaxError: invalid floating point literal
```

Minor changes:

* Py_ISDIGIT/ISXDIGIT macros were transformed to functions
* cherry-picked sphinx workaround from python#108184
1 parent 4c4b08d commit 7b0efef
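
The values quoted in the commit message can be checked on an interpreter that does not have this change, because `float.fromhex` already parses the same notation. The following is only a sketch under that assumption; it does not exercise the new literal syntax or the new `float('0x...')` behaviour.

```python
# float.fromhex() exists in current Python releases, so this runs without
# the literal syntax or the float('0x...') support added by the commit.
examples = {
    "0x1.1p-1": 0.53125,  # (1 + 1/16) * 2**-1
    "0x1.1": 1.0625,      # no exponent: equivalent to p0
    "1.1p-1": 0.53125,    # fromhex tolerates a missing "0x" prefix
}

parsed = {text: float.fromhex(text) for text in examples}
for text, value in parsed.items():
    print(f"{text:>10} -> {value}")
```

The last entry illustrates why the commit message calls the base specifier "mandatory" for `float()`: `float.fromhex` accepts the prefix-less form, whereas the new constructor syntax is described as requiring `0x`.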

14 files changed, with 253 additions and 71 deletions.

Doc/library/functions.rst

Lines changed: 10 additions & 3 deletions
@@ -656,7 +656,8 @@ are always available. They are listed here in alphabetical order.

    Return a floating point number constructed from a number or string *x*.

-   If the argument is a string, it should contain a decimal number, optionally
+   If the argument is a string, it should contain a decimal number
+   or a hexadecimal number, optionally
    preceded by a sign, and optionally embedded in whitespace. The optional
    sign may be ``'+'`` or ``'-'``; a ``'+'`` sign has no effect on the value
    produced. The argument may also be a string representing a NaN
@@ -671,13 +672,16 @@ are always available. They are listed here in alphabetical order.
       digitpart: `!digit` (["_"] `!digit`)*
       number: [`digitpart`] "." `digitpart` | `digitpart` ["."]
       exponent: ("e" | "E") ["+" | "-"] `digitpart`
-      floatnumber: number [`exponent`]
+      hexfloatnumber: `~python-grammar:hexinteger` | `~python-grammar:hexfraction` | `~python-grammar:hexfloat`
+      floatnumber: (`number` [`exponent`]) | `hexfloatnumber`
       floatvalue: [`sign`] (`floatnumber` | `infinity` | `nan`)

    Here ``digit`` is a Unicode decimal digit (character in the Unicode general
    category ``Nd``). Case is not significant, so, for example, "inf", "Inf",
    "INFINITY", and "iNfINity" are all acceptable spellings for positive
-   infinity.
+   infinity. Note also that the exponent of a hexadecimal floating point number
+   is written in decimal, and that it gives the power of 2 by which to multiply
+   the coefficient.

    Otherwise, if the argument is an integer or a floating point number, a
    floating point number with the same value (within Python's floating point
@@ -714,6 +718,9 @@ are always available. They are listed here in alphabetical order.
    .. versionchanged:: 3.8
       Falls back to :meth:`~object.__index__` if :meth:`~object.__float__` is not defined.

+   .. versionchanged:: 3.13
+      Added support for hexadecimal floating-point numbers.
+

 .. index::
    single: __format__
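
The documentation text above says that the exponent is written in decimal and gives the power of 2 by which to multiply the coefficient. A minimal sketch of that arithmetic follows. `hex_float_value` is a hypothetical helper written for illustration, not the parser added by this commit; it assumes well-formed input and strips underscores without validating their placement.

```python
from fractions import Fraction


def hex_float_value(text: str) -> Fraction:
    """Evaluate a string such as ' -0x3.p-1 ' exactly (illustrative helper)."""
    body = text.strip().lower()
    sign = 1
    if body[:1] in ("+", "-"):
        sign = -1 if body[0] == "-" else 1
        body = body[1:]
    if not body.startswith("0x"):
        raise ValueError("a base specifier (0x) is required")
    # Underscores are dropped here without checking where they appear.
    body = body[2:].replace("_", "")
    coefficient, _, exponent = body.partition("p")
    integer, _, fraction = coefficient.partition(".")
    digits = integer + fraction
    if not digits:
        raise ValueError("no hexadecimal digits in coefficient")
    # Coefficient: hexadecimal digits, scaled by 16 per fractional digit.
    mantissa = Fraction(int(digits, 16), 16 ** len(fraction))
    # Exponent: decimal digits, interpreted as a power of 2.
    power = int(exponent) if exponent else 0
    return sign * mantissa * Fraction(2) ** power


for text in (" 0x3.1 ", " -0x3.p-1 ", "0x1.1p-1"):
    print(repr(text), "->", float(hex_float_value(text)))
```

Using `Fraction` keeps the intermediate result exact, so the only rounding happens in the final `float()` conversion.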

Doc/reference/lexical_analysis.rst

Lines changed: 13 additions & 2 deletions
@@ -951,25 +951,36 @@ Floating point literals
 Floating point literals are described by the following lexical definitions:

 .. productionlist:: python-grammar
-   floatnumber: `pointfloat` | `exponentfloat`
+   floatnumber: `pointfloat` | `exponentfloat` | `hexfloat`
    pointfloat: [`digitpart`] `fraction` | `digitpart` "."
    exponentfloat: (`digitpart` | `pointfloat`) `exponent`
+   hexfloat: ("0x" | "0X") ["_"] (`hexdigitpart` | `hexpointfloat`) `hexexponent`
    digitpart: `digit` (["_"] `digit`)*
    fraction: "." `digitpart`
    exponent: ("e" | "E") ["+" | "-"] `digitpart`
+   hexpointfloat: [`hexdigitpart`] `hexfraction` | `hexdigitpart` "."
+   hexfraction: "." `hexdigitpart`
+   hexdigitpart: `hexdigit` (["_"] `hexdigit`)*
+   hexexponent: ("p" | "P") ["+" | "-"] `digitpart`

-Note that the integer and exponent parts are always interpreted using radix 10.
+Note that the exponent parts are always interpreted using radix 10.
 For example, ``077e010`` is legal, and denotes the same number as ``77e10``. The
 allowed range of floating point literals is implementation-dependent. As in
 integer literals, underscores are supported for digit grouping.

+The exponent of a hexadecimal floating point literal is written in decimal, and
+it gives the power of 2 by which to multiply the coefficient.
+
 Some examples of floating point literals::

    3.14    10.    .001    1e100    3.14e-10    0e0    3.14_15_93

 .. versionchanged:: 3.6
    Underscores are now allowed for grouping purposes in literals.

+.. versionchanged:: 3.13
+   Added support for hexadecimal floating-point literals.
+

 .. index::
    single: j; in numeric literal
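
The productions above can be transcribed into a regular expression to see which spellings they admit. This is a sketch, not the tokenizer's implementation. It assumes the optional integer part of `hexpointfloat` is a full `hexdigitpart`, which is what accepted literals such as `0x001.1p2` and `0x1_1.p1` in the test suite require. The sample literals are taken from the test lists in this commit.

```python
import re

# One pattern per production; names mirror the grammar.
HEXDIGITPART = r"[0-9a-fA-F](?:_?[0-9a-fA-F])*"
DIGITPART = r"[0-9](?:_?[0-9])*"
HEXFRACTION = rf"\.{HEXDIGITPART}"
HEXPOINTFLOAT = rf"(?:(?:{HEXDIGITPART})?{HEXFRACTION}|{HEXDIGITPART}\.)"
HEXEXPONENT = rf"[pP][-+]?{DIGITPART}"
HEXFLOAT = re.compile(
    rf"0[xX]_?(?:{HEXPOINTFLOAT}|{HEXDIGITPART}){HEXEXPONENT}"
)

valid = ["0x_.1p1", "0X_.1p1", "0x1_1.p1", "0x_1_1.p1", "0x1.1_1p1",
         "0x1.p1_1", "0xa.p1", "0x.ap1", "0xa_c.p1", "0x.a_cp1",
         "0x1p0", "0x1ap1", "0x001.1p2"]
invalid = ["0x1p1_", "0x1.1p1_", "0x1__1.1p1", "0x1_.p1", "0x1._p1",
           "0x1.1p+_1", "0x1_p1", "0x1.1_P1", "0x1p_1", "0x1.1p_1",
           "0x1.", "0x1.1", "0x.p"]

for literal in valid:
    print(f"{literal:<12} accepted: {HEXFLOAT.fullmatch(literal) is not None}")
for literal in invalid:
    print(f"{literal:<12} accepted: {HEXFLOAT.fullmatch(literal) is not None}")
```

Note that `0x1.1` is rejected: unlike decimal literals, the exponent is mandatory in the literal grammar, which matches the `SyntaxError` shown in the commit message.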

Doc/tools/extensions/pyspecific.py

Lines changed: 6 additions & 0 deletions
@@ -48,6 +48,12 @@
 Body.enum.converters['lowerroman'] = \
     Body.enum.converters['upperroman'] = lambda x: None

+# monkey-patch the productionlist directive to allow hyphens in group names
+# see sphinx-doc/sphinx#11854
+from sphinx.domains import std
+
+std.token_re = re.compile(r'`((~?[\w-]*:)?\w+)`')
+

 # Support for marking up and linking to bugs.python.org issues
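
The effect of the patched pattern can be seen on the cross-group token used in `functions.rst`. In the sketch below, `patched` is the expression from this diff; `unpatched` is an assumption about the pattern it replaces (the same expression without the hyphen in the group-name class) and is not shown anywhere in the commit.

```python
import re

# Pattern installed by the monkey-patch in this diff.
patched = re.compile(r'`((~?[\w-]*:)?\w+)`')
# ASSUMPTION: the previous pattern, with no hyphen allowed in the group name.
unpatched = re.compile(r'`((~?\w*:)?\w+)`')

token = '`~python-grammar:hexinteger`'
plain = '`hexdigit`'

print("patched, hyphenated group:  ", patched.fullmatch(token) is not None)
print("unpatched, hyphenated group:", unpatched.fullmatch(token) is not None)
print("patched, plain token:       ", patched.fullmatch(plain) is not None)
```

With the hyphen admitted, the whole of `~python-grammar:hexinteger` is captured as one token, which is what the `hexfloatnumber` production needs.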

Doc/tutorial/floatingpoint.rst

Lines changed: 1 addition & 1 deletion
@@ -210,7 +210,7 @@ the float value exactly:

 .. doctest::

-   >>> x == float.fromhex('0x1.921f9f01b866ep+1')
+   >>> x == 0x1.921f9f01b866ep+1
    True

 Since the representation is exact, it is useful for reliably porting values
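
The exactness claim in the tutorial can be checked with `float.hex` and `float.fromhex`, both of which exist in current releases. This sketch deliberately avoids the new literal syntax so that it runs on interpreters without this commit; the sample values are arbitrary choices for illustration.

```python
# Round-trip a few floats through their hexadecimal representation.
samples = [3.14159, 0.1, 1e-310, 2.0 ** -1074, 1.7976931348623157e308]

round_tripped = []
for x in samples:
    text = x.hex()                 # exact textual form of the binary value
    y = float.fromhex(text)        # parse it back
    round_tripped.append(y)
    print(f"{x!r:>24} -> {text} -> equal: {x == y}")
```

Because no decimal rounding is involved, the comparison holds for subnormal and near-overflow values as well as for ordinary ones.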

Include/cpython/pyctype.h

Lines changed: 8 additions & 2 deletions
@@ -21,11 +21,17 @@ PyAPI_DATA(const unsigned int) _Py_ctype_table[256];
 #define Py_ISLOWER(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_LOWER)
 #define Py_ISUPPER(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_UPPER)
 #define Py_ISALPHA(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_ALPHA)
-#define Py_ISDIGIT(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_DIGIT)
-#define Py_ISXDIGIT(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_XDIGIT)
 #define Py_ISALNUM(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_ALNUM)
 #define Py_ISSPACE(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_SPACE)

+static inline int Py_ISDIGIT(char c) {
+    return _Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_DIGIT;
+}
+
+static inline int Py_ISXDIGIT(char c) {
+    return _Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_XDIGIT;
+}
+
 PyAPI_DATA(const unsigned char) _Py_ctype_tolower[256];
 PyAPI_DATA(const unsigned char) _Py_ctype_toupper[256];

Include/internal/pycore_floatobject.h

Lines changed: 1 addition & 0 deletions
@@ -73,6 +73,7 @@ extern PyObject* _Py_string_to_number_with_underscores(

 extern double _Py_parse_inf_or_nan(const char *p, char **endptr);

+extern double _Py_dg_strtod_hex(const char *str, char **ptr);

 #ifdef __cplusplus
 }

Lib/test/test_float.py

Lines changed: 6 additions & 6 deletions
@@ -38,9 +38,9 @@ def test_float(self):
         self.assertEqual(float(3.14), 3.14)
         self.assertEqual(float(314), 314.0)
         self.assertEqual(float(" 3.14 "), 3.14)
-        self.assertRaises(ValueError, float, " 0x3.1 ")
-        self.assertRaises(ValueError, float, " -0x3.p-1 ")
-        self.assertRaises(ValueError, float, " +0x3.p-1 ")
+        self.assertEqual(float(" 0x3.1 "), 3.0625)
+        self.assertEqual(float(" -0x3.p-1 "), -1.5)
+        self.assertEqual(float(" +0x3.p-1 "), 1.5)
         self.assertRaises(ValueError, float, "++3.14")
         self.assertRaises(ValueError, float, "+-3.14")
         self.assertRaises(ValueError, float, "-+3.14")
@@ -70,13 +70,13 @@ def test_noargs(self):

     def test_underscores(self):
         for lit in VALID_UNDERSCORE_LITERALS:
-            if not any(ch in lit for ch in 'jJxXoObB'):
+            if not any(ch in lit for ch in 'jJoObB'):
                 self.assertEqual(float(lit), eval(lit))
                 self.assertEqual(float(lit), float(lit.replace('_', '')))
         for lit in INVALID_UNDERSCORE_LITERALS:
             if lit in ('0_7', '09_99'):  # octals are not recognized here
                 continue
-            if not any(ch in lit for ch in 'jJxXoObB'):
+            if not any(ch in lit for ch in 'jJoObB'):
                 self.assertRaises(ValueError, float, lit)
         # Additional test cases; nan and inf are never valid as literals,
         # only in the float() constructor, but we don't allow underscores
@@ -1483,7 +1483,7 @@ def roundtrip(x):
             except OverflowError:
                 pass
             else:
-                self.identical(x, fromHex(toHex(x)))
+                self.identical(x, roundtrip(x))

     def test_subclass(self):
         class F(float):

Lib/test/test_grammar.py

Lines changed: 54 additions & 7 deletions
@@ -19,8 +19,11 @@

 # These are shared with test_tokenize and other test modules.
 #
-# Note: since several test cases filter out floats by looking for "e" and ".",
-# don't add hexadecimal literals that contain "e" or "E".
+# Note:
+# 1) several test cases filter out floats by looking for "e" and ".":
+#    don't add hexadecimal literals that contain "e" or "E".
+# 2) several tests also filter out binary integers by looking for "b" or "B":
+#    so, don't add hexadecimal floating point literals with those digits.
 VALID_UNDERSCORE_LITERALS = [
     '0_0_0',
     '4_2',
@@ -43,6 +46,16 @@
     '.1_4j',
     '(1_2.5+3_3j)',
     '(.5_6j)',
+    '0x_.1p1',
+    '0X_.1p1',
+    '0x1_1.p1',
+    '0x_1_1.p1',
+    '0x1.1_1p1',
+    '0x1.p1_1',
+    '0xa.p1',
+    '0x.ap1',
+    '0xa_c.p1',
+    '0x.a_cp1',
 ]
 INVALID_UNDERSCORE_LITERALS = [
     # Trailing underscores:
@@ -54,6 +67,8 @@
     '0xf_',
     '0o5_',
     '0 if 1_Else 1',
+    '0x1p1_',
+    '0x1.1p1_',
     # Underscores in the base selector:
     '0_b0',
     '0_xf',
@@ -71,28 +86,41 @@
     '0o5__77',
     '1e1__0',
     '1e1__0j',
+    '0x1__1.1p1',
     # Underscore right before a dot:
     '1_.4',
     '1_.4j',
+    '0x1_.p1',
+    '0xa_.p1',
     # Underscore right after a dot:
     '1._4',
     '1._4j',
     '._5',
     '._5j',
+    '0x1._p1',
+    '0xa._p1',
     # Underscore right after a sign:
     '1.0e+_1',
     '1.0e+_1j',
+    '0x1.1p+_1',
     # Underscore right before j:
     '1.4_j',
     '1.4e5_j',
-    # Underscore right before e:
+    '0x1.1p1_j',
+    # Underscore right before e or p:
     '1_e1',
     '1.4_e1',
     '1.4_e1j',
-    # Underscore right after e:
+    '0x1_p1',
+    '0x1_P1',
+    '0x1.1_p1',
+    '0x1.1_P1',
+    # Underscore right after e or p:
     '1e_1',
     '1.4e_1',
     '1.4e_1j',
+    '0x1p_1',
+    '0x1.1p_1',
     # Complex cases with parens:
     '(1+1.5_j_)',
     '(1+1.5_j)',
@@ -173,6 +201,18 @@ def test_floats(self):
         x = 3.e14
         x = .3e14
         x = 3.1e4
+        x = 0x1.2p1
+        x = 0x1.2p+1
+        x = 0x1.p1
+        x = 0x1.p-1
+        x = 0x1p0
+        x = 0x1ap1
+        x = 0x1P1
+        x = 0x1cp2
+        x = 0x1.p1
+        x = 0x1.P1
+        x = 0x001.1p2
+        x = 0X1p1

     def test_float_exponent_tokenization(self):
         # See issue 21642.
@@ -210,20 +250,27 @@ def test_bad_numerical_literals(self):
               "use an 0o prefix for octal integers")
         check("1.2_", "invalid decimal literal")
         check("1e2_", "invalid decimal literal")
-        check("1e+", "invalid decimal literal")
+        check("1e+", "invalid floating point literal")
+        check("0x.p", "invalid floating point literal")
+        check("0x_.p", "invalid floating point literal")
+        check("0x1.", "invalid floating point literal")
+        check("0x1.1", "invalid floating point literal")
+        check("0x1.1p", "invalid floating point literal")
+        check("0xp", "invalid hexadecimal literal")
+        check("0xP", "invalid hexadecimal literal")

     def test_end_of_numerical_literals(self):
         def check(test, error=False):
             with self.subTest(expr=test):
                 if error:
                     with warnings.catch_warnings(record=True) as w:
                         with self.assertRaisesRegex(SyntaxError,
-                                                    r'invalid \w+ literal'):
+                                                    r'invalid [ \w]+ literal'):
                             compile(test, "<testcase>", "eval")
                     self.assertEqual(w, [])
                 else:
                     self.check_syntax_warning(test,
-                                              errtext=r'invalid \w+ literal')
+                                              errtext=r'invalid [ \w]+ literal')

         for num in "0xf", "0o7", "0b1", "9", "0", "1.", "1e3", "1j":
             compile(num, "<testcase>", "eval")
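
The change from `\w+` to `[ \w]+` in the expected error pattern follows from the new message containing spaces. A small sketch, using only the message strings and patterns that appear in this diff:

```python
import re

old_pattern = re.compile(r'invalid \w+ literal')
new_pattern = re.compile(r'invalid [ \w]+ literal')

messages = [
    "invalid decimal literal",
    "invalid hexadecimal literal",
    "invalid floating point literal",
]

for message in messages:
    old_ok = old_pattern.search(message) is not None
    new_ok = new_pattern.search(message) is not None
    print(f"{message:<32} old: {old_ok!s:<5} new: {new_ok}")
```

Only the multi-word "floating point" message fails the old pattern, since `\w+` cannot cross the space between the two words.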

Lib/test/test_tokenize.py

Lines changed: 10 additions & 0 deletions
@@ -265,6 +265,16 @@ def test_float(self):
     NAME       'x'           (1, 0) (1, 1)
     OP         '='           (1, 2) (1, 3)
     NUMBER     '3.14e159'    (1, 4) (1, 12)
+    """)
+        self.check_tokenize("x = 0x1p1", """\
+    NAME       'x'           (1, 0) (1, 1)
+    OP         '='           (1, 2) (1, 3)
+    NUMBER     '0x1p1'       (1, 4) (1, 9)
+    """)
+        self.check_tokenize("x = 0x.1p1", """\
+    NAME       'x'           (1, 0) (1, 1)
+    OP         '='           (1, 2) (1, 3)
+    NUMBER     '0x.1p1'      (1, 4) (1, 10)
     """)

     def test_underscore_literals(self):

Lib/tokenize.py

Lines changed: 4 additions & 1 deletion
@@ -77,7 +77,10 @@ def maybe(*choices): return group(*choices) + '?'
 Pointfloat = group(r'[0-9](?:_?[0-9])*\.(?:[0-9](?:_?[0-9])*)?',
                    r'\.[0-9](?:_?[0-9])*') + maybe(Exponent)
 Expfloat = r'[0-9](?:_?[0-9])*' + Exponent
-Floatnumber = group(Pointfloat, Expfloat)
+HexExponent = r'[pP][-+]?[0-9](?:_?[0-9])*'
+Hexfloat = group(r'0[xX]_?[0-9a-f](?:_?[0-9a-f])*\.(?:[0-9a-f](?:_?[0-9a-f])*)?',
+                 r'0[xX]_?\.[0-9a-f](?:_?[0-9a-f])*') + HexExponent
+Floatnumber = group(Pointfloat, Expfloat, Hexfloat)
 Imagnumber = group(r'[0-9](?:_?[0-9])*[jJ]', Floatnumber + r'[jJ]')
 Number = group(Imagnumber, Floatnumber, Intnumber)
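
The `Hexfloat` expression can be tried in isolation to see what it accepts. The sketch below copies `HexExponent` and `Hexfloat` from the diff and re-creates `group` as a one-line helper, which is an assumption about its definition. It exercises only this regular expression, compiled without flags, and says nothing about how the C tokenizer behaves.

```python
import re


def group(*choices):
    # ASSUMPTION: same shape as the helper of that name in Lib/tokenize.py.
    return '(' + '|'.join(choices) + ')'


HexExponent = r'[pP][-+]?[0-9](?:_?[0-9])*'
Hexfloat = group(r'0[xX]_?[0-9a-f](?:_?[0-9a-f])*\.(?:[0-9a-f](?:_?[0-9a-f])*)?',
                 r'0[xX]_?\.[0-9a-f](?:_?[0-9a-f])*') + HexExponent
pattern = re.compile(Hexfloat)

for literal in ("0x1.2p1", "0x.1p1", "0x_.1p1", "0x1_1.p1", "0x1p1", "0xA.p1"):
    print(f"{literal:<10} matches: {pattern.fullmatch(literal) is not None}")
```

As written, both alternatives require a dot and use a lowercase-only digit class, so `0x1p1` and `0xA.p1` do not match this expression even though the grammar in `lexical_analysis.rst` admits them. Whether that matters depends on where the pattern is actually used, which the diff does not show.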
