four-byte unicode characters confuse ' #28851

Closed
@steveklabnik

Description

This Rust program:

fn main() {
    let len = 'ஶ்ரீ'.len_utf8();
}

contains TAMIL SYLLABLE SHRII (śrī), aka U+0BB6 U+0BCD U+0BB0 U+0BC0. When trying to compile this program, I get this error:

2:20: 2:22 error: unterminated character constant: '.
2     let len = 'ஶ்ரீ'.len_utf8();

I know that it isn't a copy-paste issue, because I used vim's C-V u to type in the four code points manually.
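For reference, here is a minimal sketch of how the four code points listed above look to Rust when held in a string rather than a character literal. It assumes the relevant fact is that a `char` holds exactly one Unicode scalar value, so a syllable made of four code points cannot fit in one. The names `SHRII`, `char_and_byte_counts`, and `utf8_lens` are hypothetical helpers invented for this illustration. They are not part of the report or of the compiler.

```rust
/// The four code points from the report, written as escapes so the
/// example does not depend on how an editor renders the syllable.
const SHRII: &str = "\u{0BB6}\u{0BCD}\u{0BB0}\u{0BC0}";

/// Returns (number of `char`s, number of UTF-8 bytes) in `s`.
fn char_and_byte_counts(s: &str) -> (usize, usize) {
    (s.chars().count(), s.len())
}

/// UTF-8 encoded length of each `char` in `s`, in order.
fn utf8_lens(s: &str) -> Vec<usize> {
    s.chars().map(|c| c.len_utf8()).collect()
}
```

Calling `char_and_byte_counts(SHRII)` shows that the syllable is four `char`s. Each code point lies in the range U+0800 to U+FFFF and so takes three bytes in UTF-8. If that assumption about `char` is the cause, the literal is rejected for containing more than one code point, and the "unterminated character constant" wording looks like a misleading diagnostic.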

Metadata

    Labels

    A-diagnostics (Area: Messages for errors, warnings, and lints)