Skip to content

wc might be an empty string #24

Closed
Closed
@choldrim

Description

@choldrim

wcwidth/wcwidth/wcwidth.py

Lines 104 to 182 in c71459e

def wcwidth(wc):
r"""
Given one unicode character, return its printable length on a terminal.
The wcwidth() function returns 0 if the wc argument has no printable effect
on a terminal (such as NUL '\0'), -1 if wc is not printable, or has an
indeterminate effect on the terminal, such as a control character.
Otherwise, the number of column positions the character occupies on a
graphic terminal (1 or 2) is returned.
The following have a column width of -1:
- C0 control characters (U+001 through U+01F).
- C1 control characters and DEL (U+07F through U+0A0).
The following have a column width of 0:
- Non-spacing and enclosing combining characters (general
category code Mn or Me in the Unicode database).
- NULL (U+0000, 0).
- COMBINING GRAPHEME JOINER (U+034F).
- ZERO WIDTH SPACE (U+200B) through
RIGHT-TO-LEFT MARK (U+200F).
- LINE SEPERATOR (U+2028) and
PARAGRAPH SEPERATOR (U+2029).
- LEFT-TO-RIGHT EMBEDDING (U+202A) through
RIGHT-TO-LEFT OVERRIDE (U+202E).
- WORD JOINER (U+2060) through
INVISIBLE SEPARATOR (U+2063).
The following have a column width of 1:
- SOFT HYPHEN (U+00AD) has a column width of 1.
- All remaining characters (including all printable
ISO 8859-1 and WGL4 characters, Unicode control characters,
etc.) have a column width of 1.
The following have a column width of 2:
- Spacing characters in the East Asian Wide (W) or East Asian
Full-width (F) category as defined in Unicode Technical
Report #11 have a column width of 2.
"""
# pylint: disable=C0103
# Invalid argument name "wc"
ucs = ord(wc)
# NOTE: created by hand, there isn't anything identifiable other than
# general Cf category code to identify these, and some characters in Cf
# category code are of non-zero width.
# pylint: disable=too-many-boolean-expressions
# Too many boolean expressions in if statement (7/5)
if (ucs == 0 or
ucs == 0x034F or
0x200B <= ucs <= 0x200F or
ucs == 0x2028 or
ucs == 0x2029 or
0x202A <= ucs <= 0x202E or
0x2060 <= ucs <= 0x2063):
return 0
# C0/C1 control characters
if ucs < 32 or 0x07F <= ucs < 0x0A0:
return -1
# combining characters with zero width
if _bisearch(ucs, ZERO_WIDTH):
return 0
return 1 + _bisearch(ucs, WIDE_EASTASIAN)

if wc is an empty string, an TypeError: ord() expected a character... exception will be raised

if there would be a statement if len(wc) == 0: return 0 before ord(wc), it will be better, I think.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions