Closed
Labels: 3.13, 3.14, 3.15, docs, type-bug
Description
When parsing &gt;&#62;&#x3E; using html.parser, the actual output differs from the output shown in the documentation.
Run the following code:
from html.parser import HTMLParser
from html.entities import name2codepoint

class MyHTMLParser(HTMLParser):
    def handle_starttag(self, tag, attrs):
        print("Start tag:", tag)
        for attr in attrs:
            print(" attr:", attr)
    def handle_endtag(self, tag):
        print("End tag :", tag)
    def handle_data(self, data):
        print("Data :", data)
    def handle_comment(self, data):
        print("Comment :", data)
    def handle_entityref(self, name):
        c = chr(name2codepoint[name])
        print("Named ent:", c)
    def handle_charref(self, name):
        if name.startswith('x'):
            c = chr(int(name[1:], 16))
        else:
            c = chr(int(name))
        print("Num ent :", c)
    def handle_decl(self, data):
        print("Decl :", data)

parser = MyHTMLParser()
parser.feed('&gt;&#62;&#x3E;')
According to the documentation, the expected output should be:
Named ent: >
Num ent : >
Num ent : >
The actual output is:
Data : >>>
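A likely explanation for the mismatch is the convert_charrefs constructor argument of HTMLParser. It defaults to True, and in that mode character references are decoded and delivered through handle_data, so handle_entityref and handle_charref are not called. The sketch below is only an illustration of that difference, not the fix adopted for this issue: the CharRefRecorder class and the parse helper are hypothetical names introduced here, and events are collected in a list instead of printed so the two modes can be compared directly.

```python
from html.parser import HTMLParser
from html.entities import name2codepoint


class CharRefRecorder(HTMLParser):
    """Hypothetical helper: records which handler receives each reference."""

    def __init__(self, *, convert_charrefs=True):
        super().__init__(convert_charrefs=convert_charrefs)
        self.events = []

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_entityref(self, name):
        # Only reached when convert_charrefs is False.
        self.events.append(("entityref", chr(name2codepoint[name])))

    def handle_charref(self, name):
        # Only reached when convert_charrefs is False.
        if name.startswith(("x", "X")):
            char = chr(int(name[1:], 16))
        else:
            char = chr(int(name))
        self.events.append(("charref", char))


def parse(text, convert_charrefs):
    """Feed text to a fresh parser and return the recorded events."""
    parser = CharRefRecorder(convert_charrefs=convert_charrefs)
    parser.feed(text)
    parser.close()
    return parser.events


if __name__ == "__main__":
    sample = "&gt;&#62;&#x3E;"
    print(parse(sample, convert_charrefs=True))
    print(parse(sample, convert_charrefs=False))
```

With the default setting the three references arrive as a single data event, which matches the actual output reported above. Passing convert_charrefs=False routes them to the entity and character reference handlers, which matches the output shown in the documentation.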