Uncaught exceptions/crashes when parsing malformed <![ marked sections in Python-Markdown 3.8 with extra extension #1534

Closed
@keshavgoyal1744

Description

While fuzz-testing the Python-Markdown library (version 3.8) with the extra extension enabled, I found that certain malformed inputs containing <![ sequences cause the parser to throw uncaught exceptions and crash. These inputs appear to break the parser's handling of XML-style marked sections, leading to errors like "expected name token" or "unknown status keyword."

Steps to Reproduce

  1. Run the following test script (using Python-Markdown 3.8 with extra extension):
import markdown

print(markdown.__version__)

crash_inputs = [
    "<![",
    "<![>og))/uw_     f{tv+pAr$Ss+[6;^{=<:>g2oV|.pdTMu(Q-E#",
    "<![ g'\"7z5r7cojSO;2LAo0(1Vv5G>,-P",
]

for i, crash_input in enumerate(crash_inputs, 1):
    print(f"\nTesting crash input #{i}:\n{repr(crash_input)}")
    try:
        output = markdown.markdown(crash_input, extensions=["extra"])
        print("No crash, output:")
        print(output)
    except Exception as e:
        print(f"Crash confirmed! Exception:\n{e}")
  2. Observe that all three inputs raise exceptions, crashing the parser.
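
The script above prints only the exception message, which hides the exception class. A minimal variant that also records the class name might look like the following. `probe` and its `render` parameter are hypothetical names introduced here so the helper can be exercised without Python-Markdown installed; the commented usage line assumes the same `markdown.markdown(..., extensions=["extra"])` call as the script above.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# (input, exception class name or None, exception message or rendered output)
Result = Tuple[str, Optional[str], str]


def probe(inputs: Iterable[str], render: Callable[[str], str]) -> List[Result]:
    """Run render() on each input and record what happened (hypothetical helper)."""
    results: List[Result] = []
    for text in inputs:
        try:
            results.append((text, None, render(text)))
        except Exception as exc:  # broad on purpose: every failure is of interest
            results.append((text, type(exc).__name__, str(exc)))
    return results


# Usage with Python-Markdown (assumes the package is installed):
#   import markdown
#   for row in probe(["<!["], lambda s: markdown.markdown(s, extensions=["extra"])):
#       print(row)
```

Knowing the exception class helps locate which layer raises the error, since the message alone does not say where it originates.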

Expected Behavior

The parser should gracefully handle or sanitize malformed <![ marked sections without raising uncaught exceptions.

Actual Behavior

Uncaught exceptions are raised, such as:

3.8

Testing crash input #1:
'<!['
Crash confirmed! Exception:
expected name token at '<![\n\n'

Testing crash input #2:
'<![>og))/uw_     f{tv+pAr$Ss+[6;^{=<:>g2oV|.pdTMu(Q-E#'
Crash confirmed! Exception:
expected name token at '<![>og))/uw_     f{t'

Testing crash input #3:
'<![ g\'"7z5r7cojSO;2LAo0(1Vv5G>,-P'
Crash confirmed! Exception:
expected name token at '<![ g\'"7z5r7cojSO;2L'
  • expected name token at '<![\n\n'
  • expected name token at '<![>og))/uw_     f{t'
  • expected name token at '<![ g\'"7z5r7cojSO;2L'

Impact

  • Potential Denial of Service by crashing apps parsing untrusted Markdown.
  • Stability risk in applications relying on Python-Markdown.
  • Possible information leakage via stack traces if errors are exposed.
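
Until the parser handles these inputs itself, applications that render untrusted Markdown could contain the failure at the call site. The following is only a sketch of that idea, not a fix for the underlying bug. `render_untrusted` and its `render` parameter are hypothetical names, and falling back to HTML-escaped source text is an assumed policy that may not suit every application.

```python
import html
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def render_untrusted(text: str, render: Callable[[str], str]) -> str:
    """Render text, falling back to escaped plain text if the renderer raises."""
    try:
        return render(text)
    except Exception:
        # Log the traceback server-side only; never return it to the client.
        logger.exception("Markdown rendering failed; returning escaped source")
        return "<pre>" + html.escape(text) + "</pre>"
```

With Python-Markdown installed, the call would be along the lines of `render_untrusted(user_text, lambda s: markdown.markdown(s, extensions=["extra"]))`. Catching `Exception` broadly is deliberate here, because the report does not state which exception class is raised.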

Environment

  • Python-Markdown version: 3.8
  • Python version: 3.11
  • OS: Linux and Windows 11

Additional Notes

I performed fuzz testing and isolated these inputs causing crashes. I'm happy to provide more test cases or logs if needed.
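
For readers who want to reproduce this kind of search, a fuzz loop over inputs that begin with `<![` can be sketched as below. This is not the harness used for this report. `fuzz_marked_sections`, its parameters, and the choice of `string.printable` as the alphabet are assumptions made for illustration.

```python
import random
import string
from typing import Callable, List, Tuple


def fuzz_marked_sections(
    render: Callable[[str], str],
    iterations: int = 1000,
    max_len: int = 40,
    seed: int = 0,
) -> List[Tuple[str, str]]:
    """Return (input, exception class name) pairs for inputs that make render() raise."""
    rng = random.Random(seed)  # fixed seed so any failure can be replayed
    alphabet = string.printable
    failures: List[Tuple[str, str]] = []
    for _ in range(iterations):
        tail = "".join(rng.choice(alphabet) for _ in range(rng.randint(0, max_len)))
        candidate = "<![" + tail
        try:
            render(candidate)
        except Exception as exc:
            failures.append((candidate, type(exc).__name__))
    return failures
```

Passing `lambda s: markdown.markdown(s, extensions=["extra"])` as `render` would target the configuration described in this issue, assuming Python-Markdown is installed.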

Thank you for your attention to this issue!

Metadata

    Labels

    bug: Bug report.
    confirmed: Confirmed bug report or approved feature request.
    core: Related to the core parser code.
