-
Notifications
You must be signed in to change notification settings - Fork 61
Closed
Labels
Description
Lines 104 to 182 in c71459e
def wcwidth(wc): | |
r""" | |
Given one unicode character, return its printable length on a terminal. | |
The wcwidth() function returns 0 if the wc argument has no printable effect | |
on a terminal (such as NUL '\0'), -1 if wc is not printable, or has an | |
indeterminate effect on the terminal, such as a control character. | |
Otherwise, the number of column positions the character occupies on a | |
graphic terminal (1 or 2) is returned. | |
The following have a column width of -1: | |
- C0 control characters (U+001 through U+01F). | |
- C1 control characters and DEL (U+07F through U+0A0). | |
The following have a column width of 0: | |
- Non-spacing and enclosing combining characters (general | |
category code Mn or Me in the Unicode database). | |
- NULL (U+0000, 0). | |
- COMBINING GRAPHEME JOINER (U+034F). | |
- ZERO WIDTH SPACE (U+200B) through | |
RIGHT-TO-LEFT MARK (U+200F). | |
- LINE SEPERATOR (U+2028) and | |
PARAGRAPH SEPERATOR (U+2029). | |
- LEFT-TO-RIGHT EMBEDDING (U+202A) through | |
RIGHT-TO-LEFT OVERRIDE (U+202E). | |
- WORD JOINER (U+2060) through | |
INVISIBLE SEPARATOR (U+2063). | |
The following have a column width of 1: | |
- SOFT HYPHEN (U+00AD) has a column width of 1. | |
- All remaining characters (including all printable | |
ISO 8859-1 and WGL4 characters, Unicode control characters, | |
etc.) have a column width of 1. | |
The following have a column width of 2: | |
- Spacing characters in the East Asian Wide (W) or East Asian | |
Full-width (F) category as defined in Unicode Technical | |
Report #11 have a column width of 2. | |
""" | |
# pylint: disable=C0103 | |
# Invalid argument name "wc" | |
ucs = ord(wc) | |
# NOTE: created by hand, there isn't anything identifiable other than | |
# general Cf category code to identify these, and some characters in Cf | |
# category code are of non-zero width. | |
# pylint: disable=too-many-boolean-expressions | |
# Too many boolean expressions in if statement (7/5) | |
if (ucs == 0 or | |
ucs == 0x034F or | |
0x200B <= ucs <= 0x200F or | |
ucs == 0x2028 or | |
ucs == 0x2029 or | |
0x202A <= ucs <= 0x202E or | |
0x2060 <= ucs <= 0x2063): | |
return 0 | |
# C0/C1 control characters | |
if ucs < 32 or 0x07F <= ucs < 0x0A0: | |
return -1 | |
# combining characters with zero width | |
if _bisearch(ucs, ZERO_WIDTH): | |
return 0 | |
return 1 + _bisearch(ucs, WIDE_EASTASIAN) |
if wc
is an empty string, an TypeError: ord() expected a character...
exception will be raised
if there would be a statement if len(wc) == 0: return 0
before ord(wc)
, it will be better, I think.