Skip to content

consider making 'ascii' the default/recommended casefolding #1718

@slingamn

Description

@slingamn

I'm not at all certain about this and would appreciate input from all stakeholders.

Right now, the default/recommended casefolding is 'precis', i.e. RFC 8264, allowing the use of non-ASCII characters in nicknames and channel names. However, it seems like we've given up hope of the wider IRCv3 community adopting internationalized identifiers at the protocol level; the community seems to be going in the direction of "display names" instead. (Although: I haven't seen a proposal for assigning display names to channels.)

This creates a problem for client developers, who won't have a reliable algorithm for determining whether Ergo considers two identifiers to be equivalent under case normalization. The workaround we've been pushing is #1083, i.e., always publishing the canonical form of the identifier. This approach is vulnerable to bugs and edge cases.

The other problem with PRECIS is confusable characters, which are only imperfectly addressed by the skeleton algorithm.

So the proposal here is to change the default/recommended value of casefolding from 'precis' to 'ascii'. ('precis' would remain fully supported in the codebase, especially because operators can't safely switch between casefoldings, at the risk of making account or channel registrations unusable.)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions