|
76d6b0d7
|
2022-11-14T21:02:15
|
|
html: Don't escape ASCII chars in href attributes
In several cases, href attributes can contain ASCII characters which are
illegal in URIs. Escaping them often does more harm than good.
Fixes #321.
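
One possible reading of this change, sketched as a standalone helper rather than the actual libxml2 code: when writing an href value, leave every ASCII byte untouched (even ones that are not legal in a URI) and percent-encode only bytes outside the ASCII range. The function name and the buffer-based interface are assumptions made for illustration.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not libxml2 code): copy src to dst, percent-encoding
 * only bytes >= 0x80. ASCII bytes, including ones that are illegal in a URI,
 * pass through unchanged.
 * Returns 0 on success, -1 if dst (capacity dstlen, including NUL) is too small.
 */
static int escape_non_ascii(const char *src, char *dst, size_t dstlen)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t used = 0;

    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;
        size_t need = (c >= 0x80) ? 3 : 1;

        if (used + need + 1 > dstlen)
            return -1;                  /* keep room for the final NUL */
        if (c >= 0x80) {
            dst[used++] = '%';
            dst[used++] = hex[c >> 4];
            dst[used++] = hex[c & 0x0F];
        } else {
            dst[used++] = (char)c;      /* ASCII: copied as-is */
        }
    }
    if (dstlen == 0)
        return -1;
    dst[used] = '\0';
    return 0;
}
```

With this sketch a space or a brace inside an href survives serialization unchanged, while a UTF-8 encoded character is written as a sequence of %XX escapes.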
|
|
93ce33c2
|
2020-07-23T17:34:08
|
|
Fix several quadratic runtime issues in HTML push parser
Fix a few remaining cases where the HTML push parser would scan more
content during lookahead than was parsed later.
Make sure that htmlParseDocTypeDecl consumes all content up to the
final '>' in case of errors. The old comment said "We shouldn't try to
resynchronize", but ignoring invalid content is also what the HTML5
spec mandates.
Likewise, make htmlParseEndTag skip to the final '>' in invalid end
tags even if not in recovery mode. This is probably the most visible
change in practice and leads to different output for some tests but is
also more in line with HTML5.
Make sure that htmlParsePI and htmlParseComment don't abort if invalid
characters are encountered but log an error and ignore the character.
Change some other end-of-buffer checks to test for a zero byte instead
of relying on IS_CHAR.
Fix usage of IS_CHAR macro in htmlParseScript.
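
The recovery rule described above (skip invalid content up to the final '>', and detect the end of the buffer by its zero byte) can be sketched in isolation. This is a minimal illustration, not the parser's code; the helper name is hypothetical and the input is assumed to be a NUL-terminated buffer.

```c
/*
 * Hypothetical helper (not libxml2 code): advance past the next '>' in a
 * NUL-terminated buffer. The loop stops on the zero byte itself, so the
 * end-of-buffer test does not depend on a character-class macro.
 */
static const char *skip_past_gt(const char *cur)
{
    while (*cur != '\0' && *cur != '>')
        cur++;              /* ignore invalid content */
    if (*cur == '>')
        cur++;              /* consume the closing '>' */
    return cur;
}
```

Because each byte is visited once and the caller resumes after the '>', a malformed end tag costs time linear in its length rather than being rescanned on every push.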
|
|
f933c898
|
2012-09-07T19:32:12
|
|
Keep non-significant blanks node in HTML parser
For https://bugzilla.gnome.org/show_bug.cgi?id=681822
Regardless of whether the option HTML_PARSE_NOBLANKS is set, blank nodes
are removed from an HTML document, for example:
<html>
<head>
<title>This is a test.</title>
</head>
<body>
<p>This is a test.</p>
</body>
</html>
is read as:
<html><head><title>This is a test.</title></head><body>
<p>This is a test.</p>
</body></html>
This changes the default behaviour but the old behaviour is available
as expected when using the parser flag HTML_PARSE_NOBLANKS.
Based on original patch from Igor Ignatyuk <igor_ignatiouk@hotmail.com>
* HTMLparser.c: change various places in the parser where ignorable_space
SAX callback was called without checking for the parser flag preference
* xmllint.c: make sure we use the new flag even for HTML parsing
* result/HTML/*: this modifies the output of a number of tests
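
The behaviour after this change can be summarised as a small decision function. The sketch below is standalone and does not call libxml2; SKETCH_NOBLANKS is a hypothetical stand-in for HTML_PARSE_NOBLANKS and both helper names are assumptions.

```c
#include <ctype.h>

#define SKETCH_NOBLANKS 1  /* hypothetical stand-in for HTML_PARSE_NOBLANKS */

/* Return 1 if the text consists only of whitespace (or is empty). */
static int is_blank(const char *text)
{
    for (; *text != '\0'; text++)
        if (!isspace((unsigned char)*text))
            return 0;
    return 1;
}

/* Return 1 if a text node should be kept in the tree. */
static int keep_text_node(const char *text, int options)
{
    if (is_blank(text) && (options & SKETCH_NOBLANKS))
        return 0;   /* blanks are dropped only when the flag asks for it */
    return 1;
}
```

Under this rule a whitespace-only node between </head> and <body> is kept by default and dropped only when the caller passes the flag.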
|
|
42720248
|
2007-04-16T07:02:31
|
|
change the way script/style are parsed to not try to detect comments,
* HTMLparser.c: change the way script/style are parsed to
not try to detect comments, reported by Mike Day
* result/HTML/doc3.*: affects the result of that test
Daniel
svn path=/trunk/; revision=3598
|
|
36d73403
|
2005-09-01T09:52:30
|
|
Applied the last patch from Gary Coady for #304637 changing the behaviour
* HTMLparser.c: Applied the last patch from Gary Coady for #304637
changing the behaviour when text nodes are found in body
* result/HTML/*: this changes the output of some tests
Daniel
|
|
18a65095
|
2004-05-11T15:57:42
|
|
fix to the fix for #141864 from Paul Elseth apply fix from David Gatwood
* xmlIO.c: fix to the fix for #141864 from Paul Elseth
* HTMLparser.c result/HTML/doc3.htm: apply fix from David Gatwood for
#141195 about text between comments.
Daniel
|
|
42fd4126
|
2003-11-04T08:47:48
|
|
change --html to make sure we use the HTML serialization rule by default
* xmllint.c: change --html to make sure we use the HTML serialization
rule by default when the HTML parser is used, add --xmlout to allow
forcing the XML serializer on HTML.
* HTMLtree.c: ugly tweak to fix the output on <p> element and
solve #125093
* result/HTML/*: this changes the output of some tests
Daniel
|
|
20aa0fb4
|
2003-08-04T19:43:15
|
|
fixed a small problem in the patch for #118763 this reverts back to the
* tree.c: fixed a small problem in the patch for #118763
* result/HTML/doc3.htm*: this reverts back to the previous result
Daniel
|
|
39057f40
|
2003-08-04T01:33:43
|
|
fixing HTML attribute serialization bug #118763 applying a modified
* tree.c: fixing HTML attribute serialization bug #118763
applying a modified version of the patch from Bacek
* result/HTML/doc3.htm*: this modifies the output from one test
Daniel
|
|
8265a18a
|
2003-06-13T10:05:56
|
|
do not generate &quot; for " outside of attributes this changes the output
* entities.c: do not generate &quot; for " outside of attributes
* result//*: this changes the output of some tests
Daniel
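
A minimal sketch of context-dependent quote escaping, assuming the rule is that a double quote only needs an entity inside an attribute value. This is not the entities.c implementation; the function name, the in_attr parameter and the buffer interface are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: escape src into dst (capacity dstlen, including NUL).
 * '<', '>' and '&' are always escaped; '"' only when in_attr is non-zero.
 * Returns 0 on success, -1 if dst is too small.
 */
static int escape_text(const char *src, int in_attr, char *dst, size_t dstlen)
{
    size_t used = 0;

    for (; *src != '\0'; src++) {
        char one[2] = { *src, '\0' };
        const char *rep = one;          /* default: copy the character */
        size_t n;

        if (*src == '<')
            rep = "&lt;";
        else if (*src == '>')
            rep = "&gt;";
        else if (*src == '&')
            rep = "&amp;";
        else if (*src == '"' && in_attr)
            rep = "&quot;";

        n = strlen(rep);
        if (used + n + 1 > dstlen)
            return -1;                  /* keep room for the final NUL */
        memcpy(dst + used, rep, n);
        used += n;
    }
    if (dstlen == 0)
        return -1;
    dst[used] = '\0';
    return 0;
}
```

Calling it with in_attr set to 0 leaves quotes in element content readable, which is the kind of output change the test results would reflect.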
|
|
ef0b4501
|
2003-03-24T13:57:34
|
|
fixed some problems related to #75813 about handling of Result Value Trees
* xpath.c: fixed some problems related to #75813 about handling
of Result Value Trees
Daniel
|
|
8c9872ca
|
2002-07-05T18:17:10
|
|
trying to fix 87235 about discarded white spaces in the HTML parser. this
* HTMLparser.c: trying to fix 87235 about discarded white
spaces in the HTML parser.
* result/HTML/*: this changes the output of a number of HTML
regression tests
Daniel
|
|
6231e845
|
2002-04-18T11:54:04
|
|
fixed & serialization bug introduced in 2.4.20 this changes a few things
* HTMLtree.c: fixed & serialization bug introduced in 2.4.20
* result/HTML/*: this changes a few things in the results
Daniel
|
|
eb475a37
|
2002-04-14T22:00:22
|
|
fixing bug #78662 i.e. add proper escaping of URI when saving HTML files.
* HTMLtree.c uri.c: fixing bug #78662 i.e. add proper
escaping of URI when saving HTML files.
* result/HTML/*: this impacted some tests
Daniel
|
|
c1f78343
|
2001-11-10T11:43:05
|
|
fix comment in scripts element parsing. updated the results. Daniel
* HTMLparser.c: fix comment in scripts element parsing.
* result/HTML/doc3*: updated the results.
Daniel
|
|
16698281
|
2001-09-14T10:29:27
|
|
do not output hexadecimal charrefs when serializing HTML since some
* encoding.c entities.c: do not output hexadecimal charrefs
when serializing HTML since some versions of Netscape can't
grok it, generate decimal ones.
* result/HTML/doc3.htm: output changed due to previous test
* parserInternals.c: repair xmlKeepBlanksDefault() broken in 2.4.4
Daniel
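
For illustration, the difference is only in how the code point is formatted: a decimal reference uses the form &#233; where a hexadecimal one would use &#xE9;. The helper below is a hypothetical sketch, not the serializer's code.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write a decimal character reference such as "&#233;"
 * for code point cp into buf. Returns the snprintf result, i.e. the length
 * the reference needs (excluding the NUL), or a negative value on error.
 */
static int format_decimal_charref(unsigned long cp, char *buf, size_t buflen)
{
    return snprintf(buf, buflen, "&#%lu;", cp);
}
```

A caller can compare the return value with the buffer size to detect truncation.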
|
|
02bb170a
|
2001-06-13T21:11:59
|
|
- HTMLparser.[ch] HTMLtree.c: stored the inline/block property
of element and use it to avoid outputting formatting spaces at
the wrong place. Implemented the format parameter for HTML save.
- result/HTML/doc2.htm result/HTML/doc3.htm result/HTML/fp40.htm
result/HTML/script.html result/HTML/test2.html result/HTML/test3.html
result/HTML/wired.html: of course this impacts the result of a
number of HTML tests
Daniel
|
|
0a2a163d
|
2001-05-11T14:18:03
|
|
- HTMLparser.c: Patch from Jonas Borgström
(htmlGetEndPriority): New function, returns
the priority of a certain element.
(htmlAutoCloseOnClose): Only close inline elements if they
all have lower or equal priority.
- result/HTML: this of course changed a number of test results.
Daniel
|
|
56098d4f
|
2001-04-24T12:51:09
|
|
- HTMLparser.c : HTML parsing still sucks ... trying to deal
with madness
- result/HTML/ : this modified the result of the regression tests
a lot.
Daniel
|
|
a3bfca59
|
2001-04-12T15:42:58
|
|
parsing real HTML is a nightmare.
- HTMLparser.c result/HTML/*: revamped the way the HTML
parser handles end of tags or end of input
Daniel
|
|
760f4426
|
2001-02-15T14:59:48
|
|
Couple of fixes, getting ready for 2.3.1:
- configure.in: applied patch from Daniel van Balen for OpenBSD
and bumped version to 2.3.1
- HTMLtree.c result/HTML/doc3.htm result/HTML/wired.html: the
attempt to find autoclosing was simply broken, removed it,
updated the examples, this is better
Daniel
|
|
c4f4f0b7
|
2000-10-29T17:46:30
|
|
- xpath.c: fixed the root evaluation problems
- HTMLparser.c result/HTML/doc3.htm: fixed the problem of non
ignorable spaces with <b> <bold> <em>
- tree.c: fixed a loop in xmlSearchNsByHref()
Daniel
|
|
126f2799
|
2000-10-24T17:10:12
|
|
Bunch of fixes, finishing moving datastructures to the hash stuff:
- hash.[ch] debugXML.c: expanded/enhanced the API, added
multikey tuples, made hash structure opaque
- valid.[ch]: moved elements, attributes, notations declarations
as well as ID and refs to hash tables.
- entities.c: hash cleanup
- xmlmemory.c: fixed a dump problem in debug mode
- include/Makefile.am: problem passing in DESTDIR= values patch
from Marc Christensen <marc@calderasystems.com>
- nanohttp.c: removed debugging remains
- HTMLparser.c: the bogus tag should be ignored (Wayne)
- HTMLparser.c parser.c: fixing a number of problems with the
macros in the *parser.c files (Wayne).
- HTMLparser.c: close the previous option when opening a new one
(Marc Sanfacon).
- result/HTML/*: updated the HTML results accordingly
Daniel
|
|
7eda8452
|
2000-10-14T23:38:43
|
|
- HTMLparser.c HTMLtree.[ch] SAX.c testHTML.c tree.c: fixed HTML
support for SCRIPT and STYLE with help from Bjorn Reese
- test/HTML/* result/HTML/*: added simple testcase and updated
the existing ones.
Daniel
|
|
aa4f649b
|
2000-10-10T23:54:49
|
|
Fixed the HTML tests output, Daniel.
|
|
970112a9
|
2000-10-03T09:33:21
|
|
Stupid bug fix on the HTML parser:
- HTMLparser.c: Doohhh, attribute name parsing was still case
sensitive ! Fixed this ...
- result/HTML/* : updated the tests results accordingly
Daniel
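
HTML attribute names are case-insensitive, so name matching has to ignore ASCII case. A minimal standalone comparison might look like the following; the helper name is hypothetical and this is not the code from HTMLparser.c.

```c
#include <ctype.h>

/* Hypothetical helper: return 1 if two ASCII names match ignoring case. */
static int name_equals_nocase(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;    /* both strings must end at the same point */
}
```

With such a comparison HREF, Href and href all resolve to the same attribute.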
|
|
b8f25c91
|
2000-08-19T19:52:36
|
|
work done on auto-opening of <p> tags and cleanup of SAX output, Daniel.
|
|
87b95395
|
2000-08-12T21:12:04
|
|
Large sync between my W3C base and Gnome's one:
- parser.[ch]: added xmlGetFeaturesList() xmlGetFeature() and xmlAddFeature()
- tree.[ch]: added xmlAddChildList()
- xmllint.c: MAP_FAILED macro test
- parser.h: added xmlParseCtxtExternalEntity()
- valid.c: applied bug fixes removed warning
- tree.c: added CDATA block to elements content
- testSAX.c: cleanup of output
- testHTML.c: added SAX testing
- encoding.c: better error recovery
- SAX.c, parser.c: fixed one of the external entity processing issues of the OASIS test suite
- Makefile.am: added HTML SAX regression tests
- configure.in: bumped to 2.2.2
- test/HTML/ result/HTML: added a few of HTML tests, and added the SAX results
Daniel
|