Description
While fuzz-testing the Python-Markdown library (version 3.8) with the extra extension enabled, I found that certain malformed inputs containing <![ sequences cause the parser to raise uncaught exceptions and crash. These inputs appear to break the parser's handling of XML-style marked sections, leading to errors like "expected name token" or "unknown status keyword."
Steps to Reproduce
- Run the following test script (using Python-Markdown 3.8 with the extra extension):
import markdown

print(markdown.__version__)

crash_inputs = [
    "<![",
    "<![>og))/uw_ f{tv+pAr$Ss+[6;^{=<:>g2oV|.pdTMu(Q-E#",
    "<![ g'\"7z5r7cojSO;2LAo0(1Vv5G>,-P",
]

for i, crash_input in enumerate(crash_inputs, 1):
    print(f"\nTesting crash input #{i}:\n{repr(crash_input)}")
    try:
        output = markdown.markdown(crash_input, extensions=["extra"])
        print("No crash, output:")
        print(output)
    except Exception as e:
        print(f"Crash confirmed! Exception:\n{e}")
- Observe that all three inputs raise exceptions, crashing the parser.
Expected Behavior
The parser should gracefully handle or sanitize malformed <![ marked sections without raising uncaught exceptions.
Actual Behavior
Uncaught exceptions are raised, such as:
3.8
Testing crash input #1:
'<!['
Crash confirmed! Exception:
expected name token at '<![\n\n'
Testing crash input #2:
'<![>og))/uw_ f{tv+pAr$Ss+[6;^{=<:>g2oV|.pdTMu(Q-E#'
Crash confirmed! Exception:
expected name token at '<![>og))/uw_ f{t'
Testing crash input #3:
'<![ g\'"7z5r7cojSO;2LAo0(1Vv5G>,-P'
Crash confirmed! Exception:
expected name token at '<![ g\'"7z5r7cojSO;2L'
Impact
- Potential Denial of Service by crashing apps parsing untrusted Markdown.
- Stability risk in applications relying on Python-Markdown.
- Possible information leakage via stack traces if errors are exposed.
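As a stopgap for the impact points above, an application that renders untrusted Markdown could contain the failure at the call site instead of letting the exception propagate. The following is only a minimal sketch of that idea, not a fix for the parser itself. The names render_untrusted and render are hypothetical and not part of Python-Markdown; the renderer is passed in as a callable so the wrapper does not depend on any particular library version.

```python
import html
from typing import Callable


def render_untrusted(text: str, render: Callable[[str], str]) -> str:
    """Render `text` with `render`; fall back to escaped text if it raises.

    `render_untrusted` and `render` are hypothetical names used for
    illustration. They are not part of Python-Markdown.
    """
    try:
        return render(text)
    except Exception:
        # The broad catch is deliberate: the goal is that no parser error
        # reaches the user as a crash or a stack trace. A real application
        # would log the exception here before falling back.
        return "<pre>" + html.escape(text) + "</pre>"
```

With Python-Markdown installed, this could be called as render_untrusted(user_input, lambda s: markdown.markdown(s, extensions=["extra"])). The fallback shows the raw input HTML-escaped, which avoids both the crash and any exposure of stack traces, at the cost of losing formatting for the affected input.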
Environment
- Python-Markdown version: 3.8
- Python version: 3.11
- OS: Linux and Windows 11
Additional Notes
I performed fuzz testing and isolated these inputs causing crashes. I'm happy to provide more test cases or logs if needed.
Thank you for your attention to this issue!