|
76d6b0d7
|
2022-11-14T21:02:15
|
|
html: Don't escape ASCII chars in href attributes
In several cases, href attributes can contain ASCII characters which are
illegal in URIs. Escaping them often does more harm than good.
Fixes #321.
|
|
93ce33c2
|
2020-07-23T17:34:08
|
|
Fix several quadratic runtime issues in HTML push parser
Fix a few remaining cases where the HTML push parser would scan more
content during lookahead than being parsed later.
Make sure that htmlParseDocTypeDecl consumes all content up to the
final '>' in case of errors. The old comment said "We shouldn't try to
resynchronize", but ignoring invalid content is also what the HTML5
spec mandates.
Likewise, make htmlParseEndTag skip to the final '>' in invalid end
tags even if not in recovery mode. This is probably the most visible
change in practice and leads to different output for some tests but is
also more in line with HTML5.
Make sure that htmlParsePI and htmlParseComment don't abort if invalid
characters are encountered but log an error and ignore the character.
Change some other end-of-buffer checks to test for a zero byte instead
of relying on IS_CHAR.
Fix usage of IS_CHAR macro in htmlParseScript.
|
|
f933c898
|
2012-09-07T19:32:12
|
|
Keep non-significant blanks node in HTML parser
For https://bugzilla.gnome.org/show_bug.cgi?id=681822
Regardless if the option HTML_PARSE_NOBLANKS is set or not, blank nodes
are removed from a HTML document, for example:
<html>
<head>
<title>This is a test.</title>
</head>
<body>
<p>This is a test.</p>
</body>
</html>
is read as:
<html><head><title>This is a test.</title></head><body>
<p>This is a test.</p>
</body></html>
This changes the default behaviour but the old behaviour is available
as expected when using the parser flag HTML_PARSE_NOBLANKS
Based on original patch from Igor Ignatyuk <igor_ignatiouk@hotmail.com>
* HTMLparser.c: change various places in the parser where ignorable_space
SAX callback was called without checking for the parser flag preference
* xmllint.c: make sure we use the new flag even for HTML parsing
* result/HTML/*: this modifies the output of a number of tests
|
|
36d73403
|
2005-09-01T09:52:30
|
|
Applied the last patch from Gary Coady for #304637 changing the behaviour
* HTMLparser.c: Applied the last patch from Gary Coady for #304637
changing the behaviour when text nodes are found in body
* result/HTML/*: this changes the output of some tests
Daniel
|
|
42fd4126
|
2003-11-04T08:47:48
|
|
change --html to make sure we use the HTML serialization rule by default
* xmllint.c: change --html to make sure we use the HTML serialization
rule by default when HTML parser is used, add --xmlout to allow to
force the XML serializer on HTML.
* HTMLtree.c: ugly tweak to fix the output on <p> element and
solve #125093
* result/HTML/*: this changes the output of some tests
Daniel
|
|
4b1577f1
|
2003-09-03T13:10:37
|
|
removing the SAXresults tree, keeping result in the same tree, added
* Makefile.am results/*.sax SAXResult/*: removing the SAXresults
tree, keeping result in the same tree, added SAXtests to the
default "make tests"
Daniel
|
|
8265a18a
|
2003-06-13T10:05:56
|
|
do not generate " for " outside of attributes this changes the output
* entities.c: do not generate " for " outside of attributes
* result//*: this changes the output of some tests
Daniel
|
|
ef0b4501
|
2003-03-24T13:57:34
|
|
fixed some problems related to #75813 about handling of Result Value Trees
* xpath.c: fixed some problems related to #75813 about handling
of Result Value Trees
Daniel
|
|
8c9872ca
|
2002-07-05T18:17:10
|
|
trying to fix 87235 about discarded white spaces in the HTML parser. this
* HTMLparser.c: trying to fix 87235 about discarded white
spaces in the HTML parser.
* result/HTML/*: this changes the output of a number of HTML
regression tests
Daniel
|
|
6231e845
|
2002-04-18T11:54:04
|
|
fixed & serialization bug introduced in 2.4.20 this changes a few things
* HTMLtree.c: fixed & serialization bug introduced in 2.4.20
* result/HTML/*: this changes a few things in the results
Daniel
|
|
eb475a37
|
2002-04-14T22:00:22
|
|
fixing bug #78662 i.e. add proper escaping of URI when saving HTML files.
* HTMLtree.c uri.c: fixing bug #78662 i.e. add proper
escaping of URI when saving HTML files.
* result/HTML/*: this impacted some tests
Daniel
|
|
02bb170a
|
2001-06-13T21:11:59
|
|
- HTMLparser.[ch] HTMLtree.c: stored the inline/block property
of element and use it to avoid outputting formatting spaces at
the wrong place. Implemented the format parameter for HTML save.
- result/HTML/doc2.htm result/HTML/doc3.htm result/HTML/fp40.htm
result/HTML/script.html result/HTML/test2.html result/HTML/test3.html
result/HTML/wired.html: of course this impact the result of a
number of HTML tests
Daniel
|
|
0a2a163d
|
2001-05-11T14:18:03
|
|
- HTMLparser.c: Patch from Jonas Borgström
(htmlGetEndPriority): New function, returns
the priority of a certain element.
(htmlAutoCloseOnClose): Only close inline elements if they
all have lower or equal priority.
- result/HTML: this of course changed a number of tests results.
Daniel
|
|
56098d4f
|
2001-04-24T12:51:09
|
|
- HTMLparser.c : HTML parsing still sucks ... trying to deal
with madness
- result/HTML/ : this modified the result of the regression tests
a lot.
Daniel
|
|
a3bfca59
|
2001-04-12T15:42:58
|
|
parsing real HTML is a nightmare.
- HTMLparser.c result/HTML/*: revamped the way the HTML
parser handles end of tags or end of input
Daniel
|
|
760f4426
|
2001-02-15T14:59:48
|
|
Couple of fixes, getting ready for 2.3.1:
- configure.in: applied patch from Daniel van Balen for OpenBSD
and bumped version to 2.3.1
- HTMLtree.c result/HTML/doc3.htm result/HTML/wired.html: the
attempt to find autoclosing was simply broken, removed it,
updated the examples, this is better
Daniel
|
|
f41fbbf6
|
2001-02-13T17:05:35
|
|
testing and bug fixing related to XSLT:
- xpath.c result/XPath/tests/chaptersprefol: bugfixes on order and
on predicate
- HTMLparser.[ch] HTMLtree.c result/HTML/doc3.htm.err
result/HTML/doc3.htm.sax result/HTML/wired.html: sometimes one
really want to have tags closed on output even if we accept
unclosed ones on input
Daniel
|
|
126f2799
|
2000-10-24T17:10:12
|
|
Bunch of fixes, finishing moving datastructures to the hash stuff:
- hash.[ch] debugXML.c: expanded/enhanced the API, added
multikey tuples, made hash structure opaque
- valid.[ch]: moved elements, attributes, notations decalarations
as well as ID and refs to hash tables.
- entities.c: hash cleanup
- xmlmemory.c: fixed a dump problem in debug mode
- include/Makefile.am: problem passing in DESTDIR= values patch
from Marc Christensen <marc@calderasystems.com>
- nanohttp.c: removed debugging remains
- HTMLparser.c: the bogus tag should be ignored (Wayne)
- HTMLparser.c parser.c: fixing a number of problems with the
macros in the *parser.c files (Wayne).
- HTMLparser.c: close the previous option when opening a new one
(Marc Sanfacon).
- result/HTML/*: updated the HTML results accordingly
Daniel
|
|
970112a9
|
2000-10-03T09:33:21
|
|
Stupid bug fix on the HTML parser:
- HTMLparser.c: Doohhh, attribute name parsing was still case
sensitive ! Fixed this ...
- result/HTML/* : updated the tests results accordingly
Daniel
|
|
32bc74ef
|
2000-07-14T14:49:25
|
|
- doc/encoding.html doc/xml.html: added I18N doc
- encoding.[ch] HTMLtree.[ch] parser.c HTMLparser.c: I18N encoding
improvements, both parser and filters, added ASCII & HTML,
fixed the ISO-Latin-1 one
- xmllint.c testHTML.c: added/made visible --encode
- debugXML.c : cleanup
- most .c files: applied patches due to warning on Windows and
when using Sun Pro cc compiler
- xpath.c : cleanup memleaks
- nanoftp.c : added a TESTING preprocessor flag for standalong
compile so that people can report bugs more easilly
- nanohttp.c : ditched socklen_t which was a portability mess
and replaced it with unsigned int.
- tree.[ch]: added xmlHasProp()
- TODO: updated
- test/ : added more test for entities, NS, encoding, HTML, wap
- configure.in: preparing for 2.2.0 release
Daniel
|
|
be803967
|
2000-06-28T23:40:59
|
|
- Large resync between W3C and Gnome tree
- configure.in: 2.1.0 prerelease
- example/Makefile.am example/gjobread.c tree.h: work on
libxml1 libxml2 convergence.
- nanoftp, nanohttp.c: fixed stalled connections probs
- HTMLtree.c SAX.c : support for attribute without values in
HTML for andersca
- valid.c: Fixed most validation + namespace problems
- HTMLparser.c: start document callback for andersca
- debugXML.c xpath.c: lots of XPath fixups from Picdar Technology
- parser.h, SAX.c: serious speed improvement for large
CDATA blocks
- encoding.[ch] xmlIO.[ch]: Improved seriously saving to
different encoding
- config.h.in parser.c xmllint.c: added xmlCheckVersion()
and the LIBXML_TEST_VERSION macro
Daniel
|
|
71b656e0
|
2000-01-05T14:46:17
|
|
- added xmlRemoveID() and xmlRemoveRef()
- added check and handling when possibly removing an ID
- fixed some entities problems
- added xmlParseTryOrFinish()
- changed the way struct aredeclared to allow gtk-doc to expose those
- closed #4960
- fixes to libs detection from Albert Chin-A-Young
- preparing 1.8.3 release
Daniel
|
|
5cb5ab8d
|
1999-12-21T15:35:29
|
|
- release 1.8.2 - HTML handling improvement - new tree handling functions
- release 1.8.2
- HTML handling improvement
- new tree handling functions
- default namespace on attribute bug fixed
- libxml use for C++ fixed (for good this time !)
Daniel
|
|
4a53eca2
|
1999-12-12T13:03:50
|
|
- Updated HTML test outputs
- Fixed taht f....g problem with C++ and includes,
Daniel
|
|
af78a0e1
|
1999-12-12T13:03:50
|
|
Large commit of changes done while travelling to XML'99
- cleanups on memory use and parsers
- start of Link interfaces HTML and XLink
- rebuild the doc
- released as 1.8.0
Daniel
|
|
3500838f
|
1999-10-25T13:15:52
|
|
BUG FIXED #2784 HTML parsing/output improvements Rebuilt, updated the docs
BUG FIXED #2784
HTML parsing/output improvements
Rebuilt, updated the docs
Improvement of regression scripts, make testall should look clean
Released as 1.7.4
|