Hash: 93ce33c2
Author:
Date: 2020-07-23T17:34:08
Fix several quadratic runtime issues in HTML push parser

Fix a few remaining cases where the HTML push parser would scan more content during lookahead than being parsed later.

Make sure that htmlParseDocTypeDecl consumes all content up to the final '>' in case of errors. The old comment said "We shouldn't try to resynchronize", but ignoring invalid content is also what the HTML5 spec mandates.

Likewise, make htmlParseEndTag skip to the final '>' in invalid end tags even if not in recovery mode. This is probably the most visible change in practice and leads to different output for some tests but is also more in line with HTML5.

Make sure that htmlParsePI and htmlParseComment don't abort if invalid characters are encountered but log an error and ignore the character.

Change some other end-of-buffer checks to test for a zero byte instead of relying on IS_CHAR.

Fix usage of IS_CHAR macro in htmlParseScript.
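The "skip to the final '>'" behaviour combined with a zero-byte end-of-buffer check can be illustrated with a minimal sketch. This is not the code from the commit: skipToTagEnd is a hypothetical helper that works on a plain NUL-terminated string, whereas the real parser operates on its own input structure.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of libxml2: return the number of bytes
 * to consume so that everything up to and including the next '>' is
 * skipped. A zero byte is treated as the end of the buffer, so the scan
 * stops there instead of relying on a character-class check.
 */
static size_t
skipToTagEnd(const char *cur) {
    size_t n = 0;

    /* Advance over invalid content until '>' or the end of the buffer. */
    while (cur[n] != 0 && cur[n] != '>')
        n++;

    /* Consume the closing '>' itself if one was found. */
    if (cur[n] == '>')
        n++;

    return n;
}
```

Under these assumptions, calling skipToTagEnd on the text following an invalid end tag name consumes the remainder of the tag in one pass, so the same bytes do not have to be rescanned when more data is pushed.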
./test/HTML/doc3.htm:10: HTML parser error : Misplaced DOCTYPE declaration
<!-- END Naviscope Javascript --><!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN
^
./test/HTML/doc3.htm:52: HTML parser error : htmlParseEntityRef: expecting ';'
href="http://ads.gamesquad.net/addclick.exe/adclick.cgi?REGION=game|tech|ent&id
^
./test/HTML/doc3.htm:52: HTML parser error : htmlParseEntityRef: expecting ';'
_top"><img src="http://ads.gamesquad.net/addclick.exe/adcycle.cgi?group=52&media
^
./test/HTML/doc3.htm:52: HTML parser error : htmlParseEntityRef: expecting ';'
><img src="http://ads.gamesquad.net/addclick.exe/adcycle.cgi?group=52&media=1&id
^
./test/HTML/doc3.htm:145: HTML parser error : error parsing attribute name
width=70 Gentus?.?></A><BR><A
^
./test/HTML/doc3.htm:148: HTML parser error : Unexpected end tag : p
</P></TD></TR></TBODY></TABLE></CENTER></TD></TR></TBODY></TABLE></CENTER></P>
^
./test/HTML/doc3.htm:236: HTML parser error : Unexpected end tag : font
Specials<BR><BR></FONT></A><BR></FONT></A><B><FONT color=yellow
^
./test/HTML/doc3.htm:236: HTML parser error : Unexpected end tag : a
Specials<BR><BR></FONT></A><BR></FONT></A><B><FONT color=yellow
^
./test/HTML/doc3.htm:747: HTML parser error : htmlParseEntityRef: expecting ';'
er=0 alt="Advertisement" src="http://ads.adflight.com/ad_static.asp?pid=2097&sid
^
./test/HTML/doc3.htm:747: HTML parser error : htmlParseEntityRef: expecting ';'
Advertisement" src="http://ads.adflight.com/ad_static.asp?pid=2097&sid=1881&asid
^
./test/HTML/doc3.htm:747: HTML parser error : Unexpected end tag : li
light.com/ad_static.asp?pid=2097&sid=1881&asid=7708"></a></IFRAME></CENTER></LI>
^
./test/HTML/doc3.htm:747: HTML parser error : Unexpected end tag : font
om/ad_static.asp?pid=2097&sid=1881&asid=7708"></a></IFRAME></CENTER></LI></FONT>
^
./test/HTML/doc3.htm:747: HTML parser error : Unexpected end tag : p
=7708"></a></IFRAME></CENTER></LI></FONT></TD></TR></TBODY></TABLE></CENTER></P>
^
./test/HTML/doc3.htm:772: HTML parser error : Unexpected end tag : form
archive</A></FONT> </FORM></CENTER></TD></TR></TBODY></TABLE><!--
^
./test/HTML/doc3.htm:795: HTML parser error : Unexpected end tag : iframe
document.write("42DF8478957377></IFRAME>");
^
./test/HTML/doc3.htm:803: HTML parser error : End tag : expected '>'
document.write("DF8478957377></SC");
^
./test/HTML/doc3.htm:804: HTML parser error : Unexpected end tag : sc
document.write("RIPT>");
^
./test/HTML/doc3.htm:811: HTML parser error : Unexpected end tag : a
document.write("ype=gif&size=100x90></A>");
^
./test/HTML/doc3.htm:820: HTML parser error : Unexpected end tag : a
</A></A></B><B></NOSCRIPT></B><B><!-- END GoTo.com Search Box --></B
^
./test/HTML/doc3.htm:820: HTML parser error : Unexpected end tag : noscript
</A></A></B><B></NOSCRIPT></B><B><!-- END GoTo.com Search Box --></B
^
./test/HTML/doc3.htm:826: HTML parser error : Opening and ending tag mismatch: form and center
</FORM><!-- Pricewatch Search Box --><A
^
./test/HTML/doc3.htm:833: HTML parser error : Unexpected end tag : p
Special<BR>Code:BP6-hd</FONT></A> </P></CENTER></TD></TR></TBODY></T
^
./test/HTML/doc3.htm:833: HTML parser error : Opening and ending tag mismatch: center and td
Special<BR>Code:BP6-hd</FONT></A> </P></CENTER></TD></TR></TBODY></T
^
./test/HTML/doc3.htm:839: HTML parser error : Unexpected end tag : p
width="100%"> </TD></TR></TBODY></TABLE></P></CENTER></TR></TBODY></TABLE><
^
./test/HTML/doc3.htm:840: HTML parser error : Unexpected end tag : td
<CENTER></CENTER></TD></TR><TR><TD COLSPAN="3" VALIGN="TOP"
^
./test/HTML/doc3.htm:840: HTML parser error : Unexpected end tag : tr
<CENTER></CENTER></TD></TR><TR><TD COLSPAN="3" VALIGN="TOP"
^
./test/HTML/doc3.htm:841: HTML parser error : Unexpected end tag : table
HEIGHT="70"> </TD> </TR></TABLE>
^