[Misc]: Missing charset handling #213

@Cyrille37

Describe the bug

Hi.
Thanks a lot for sharing this useful and great work. 💌

version: JohannesKaufmann/html-to-markdown/v2 v2.5.0

The library does not seem to handle the charset declared in an HTML page. If a page declares a non-UTF-8 charset such as iso-8859-1, the resulting .md is iso-8859-1 encoded, but there is no way to know that from the output. The output should probably always be UTF-8 encoded, which requires the library to honor the charset the page declares.

Regards & cheers.

HTML Input

Libérez votre créativité

Generated Markdown

Lib�rez votre cr�ativit�

Expected Markdown

Libérez votre créativité

What plugins did you use?

base, commonmark

Labels: bug (Something isn't working), v2 (version v2.x.x)