Batch parsing of inline scripts/stylesheets #626

MegaCorn · 2025-06-23T07:39:28Z

Before: html5ever produces a token for each line of inline content
After: html5ever batches inline content until it encounters '<', which marks an end tag in most cases
Fixes: servo/servo#34502

Signed-off-by: MegaCorn <[email protected]>
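
To make the before/after difference concrete, here is a minimal sketch of the two chunking strategies on a plain `&str`. It is not html5ever's tokenizer code: `take_until_lt`, `take_line_or_until_lt`, and `count_chunks` are hypothetical helpers invented for illustration, and the real tokenizer works on buffer queues and a state machine rather than string slices.

```rust
/// Hypothetical helper modelling the "After" behavior: take everything up to
/// (but not including) the first '<', and return the remainder.
fn take_until_lt(input: &str) -> (&str, &str) {
    match input.find('<') {
        Some(idx) => input.split_at(idx),
        None => (input, ""),
    }
}

/// Hypothetical helper modelling the "Before" behavior: additionally stop
/// right after every newline, so each line becomes its own chunk.
fn take_line_or_until_lt(input: &str) -> (&str, &str) {
    match input.find(|c: char| c == '<' || c == '\n') {
        // Newline found first: include it in the chunk.
        Some(idx) if input[idx..].starts_with('\n') => input.split_at(idx + 1),
        // '<' found first: stop before it.
        Some(idx) => input.split_at(idx),
        None => (input, ""),
    }
}

/// Counts how many chunks a strategy produces before reaching '<' or the end
/// of the input. Each chunk stands in for one emitted character token.
fn count_chunks(mut input: &str, step: fn(&str) -> (&str, &str)) -> usize {
    let mut chunks = 0;
    while !input.is_empty() && !input.starts_with('<') {
        let (_chunk, rest) = step(input);
        chunks += 1;
        input = rest;
    }
    chunks
}
```

With an input such as `"var a = 1;\nvar b = 2;\n</script>"`, the line-based strategy yields two chunks while the batched one yields a single chunk ending right before `</script>`.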

simonwuelker · 2025-06-25T08:30:53Z

html5ever emits the line number for each token it parses. I think this change will break the line counting, because we no longer hit

html5ever/html5ever/src/tokenizer/mod.rs

Lines 258 to 260 in a7c9d98

    if c == '\n' {
        self.current_line.set(self.current_line.get() + 1);
    }

for each newline in the input data. That's probably why we interrupt the tokenizer when we see a newline in the first place.

I'd like to get rid of the line count at some point, because it also has a significant performance overhead in other places (#601 (comment)), but I have not investigated the implications.
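
One conceivable way to keep the count correct while batching is to count the newlines inside each batched run instead of relying on the per-character check. The sketch below is only an illustration of that idea, not code from html5ever or from this PR: `LineCounter` and `consume_batch` are hypothetical names, and only the `current_line` field with its `Cell`-based update mirrors the quoted snippet.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for the tokenizer state that tracks the line number.
struct LineCounter {
    current_line: Cell<u64>,
}

impl LineCounter {
    /// Starts counting at line 1 (an assumption made for this sketch).
    fn new() -> Self {
        Self { current_line: Cell::new(1) }
    }

    /// Hypothetical hook called once per batched run of inline content.
    /// It advances the line number by the number of '\n' bytes in the run,
    /// which is what the per-character check would have added one by one.
    fn consume_batch(&self, run: &str) {
        let newlines = run.bytes().filter(|&b| b == b'\n').count() as u64;
        self.current_line.set(self.current_line.get() + newlines);
    }
}
```

This keeps the total in sync after each batch, but a token covering several lines would still carry a single line number, so whether this is acceptable depends on how consumers use the reported lines.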

Batch parsing of inline scripts/stylesheets

6987128

Signed-off-by: MegaCorn <[email protected]>
