What would wcwidth look like if it were built-in to Python? #94

@jquast

Something like P1868R2, "🦄 width: clarifying units of width and precision in std::format", Published Proposal, 2020-02-11: https://fmt.dev/papers/p1868.html

Why can't Python just do the right thing? For example, here it gets it wrong:

>>> print(f'|{"\u231a":x<5s}|\n'
...       f'|{"watch":x<5s}|\n')
|⌚xxxx|
|watch|

This emoji is measured as a width of 1, but it actually occupies 2 cells, so the `<` alignment (like str.ljust()) adds one fill character too many and the two rows no longer line up. The same mismeasurement happens with zero-width characters, ZWJ sequences, and variation selectors. Counting codepoints does not get this measurement "right" for any kind of display device at all, but I think it goes without saying that the only purpose of this kind of padding is for monospace character displays such as terminals.

I believe the built-in format string alignment, str.rjust, str.ljust, str.center, and textwrap.wrap should measure Unicode characters by their printable width, and not just by the "number of codepoints".
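As a rough sketch of what width-aware padding could look like, the snippet below approximates the display width using only the standard library `unicodedata` module. The names `display_width` and `ljust_cells` are hypothetical and are not part of Python or wcwidth. The rules are an assumption for illustration: East Asian Wide and Fullwidth characters count as 2 cells, and combining marks and format characters (which covers ZWJ and variation selectors) count as 0. This is cruder than what wcwidth's `wcswidth()` does; for instance, it does not collapse a ZWJ emoji sequence into a single glyph.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate the number of terminal cells needed to print text."""
    width = 0
    for ch in text:
        # Assumption: combining marks (Mn, Me) and format characters (Cf),
        # which include ZWJ and variation selectors, occupy no cell.
        if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue
        # East Asian Wide (W) and Fullwidth (F) characters occupy two cells.
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width


def ljust_cells(text: str, width: int, fillchar: str = " ") -> str:
    """Left-justify text to `width` cells rather than `width` codepoints."""
    padding = max(0, width - display_width(text))
    return text + fillchar * padding


watch_emoji = "\u231a"
print(f"|{ljust_cells(watch_emoji, 5, 'x')}|")  # |⌚xxx|
print(f"|{ljust_cells('watch', 5, 'x')}|")      # |watch|
```

With the emoji measured as 2 cells, only three fill characters are added, so both rows occupy five cells in a monospace terminal.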

The built-in REPL also gets this wrong in its readline-like input handling. Editing strings that contain these characters becomes impossible: the cursor position and the resulting input are unpredictable and disorienting.

IPython, which uses wcwidth, does a better job and should fare better with #91 closed, but nobody should have to install a large project like IPython as their REPL just to work around this.

It would be good to experiment with the source code of Python, to see which parts of the codebase need changing. See #93 for the basic high-level functions.

Better still would be to draft and submit a PEP.
