kmx git

Commit	Date	Message
e986d09c	2022-07-15T14:02:26	Skip incorrectly opened HTML comments Commit 4fd69f3e fixed handling of '<' characters not followed by an ASCII letter. But a '<!' sequence followed by invalid characters should be treated as bogus comment and skipped. Fixes #380.
14517012	2022-04-23T19:19:33	Fix parsing of subtracted regex character classes Fixes #370.
4612ce30	2022-04-21T03:52:52	Implement xpath1() XPointer scheme See https://www.w3.org/2005/04/xpointer-schemes/
41afa89f	2022-04-10T14:09:29	Fix short-lived regression in xmlStaticCopyNode Commit 7618a3b1 didn't account for coalesced text nodes. I think it would be better if xmlStaticCopyNode didn't try to coalesce text nodes at all. This code path can only be triggered if some other code doesn't coalesce text nodes properly. In this case, OSS-Fuzz found such behavior in xinclude.c.
57b81c20	2022-03-05T18:20:29	Normalize XPath strings in-place Simplify the code and fix a potential memory leak. Fixes #343.
bc06a522	2022-03-02T02:57:49	Fix recursion check in xinclude.c Compare the included URL with the document's URL to detect local inclusions. Fixes #348.
24cdc890	2021-07-17T14:06:49	test coverage for abruptly-closed comments These establish baseline behavior so that the subsequent commit is clear about the behavior it will modify.
966b0f21	2021-08-19T02:46:32	Add whitespace folding for some atomic data types that it's missing on. XSD validation fails when some atomic types contain surrounding whitespace even though XML Schema Part 2: Datatypes Second Edition, section 4.3.6 says they should be collapsed. Fix this. (I am not sure whether the test is correct.) Issue: #278
ea6e8f99	2021-12-20T00:34:58	Fix certain combinations of regex range quantifiers Fix regex transitions that have both min/max and a counter. In this case, we want to save the regex state before incrementing the counter. Fixes #301 and the issue reported here: https://mail.gnome.org/archives/xml/2016-April/msg00017.html
382fb056	2021-12-20T00:31:41	Fix range quantifier on subregex Make sure to add counted exit transitions before other counter transitions. Otherwise, we won't backtrack correctly. Fixes #65.
ce0871e1	2022-02-20T16:44:41	Only warn on invalid redeclarations of predefined entities Downgrade the error message to a warning since the error was ignored, anyway. Also print the name of redeclared entity. For a proper fix that also shows filename and line number of the invalid redeclaration, we'd have to - pass the parser context to the entity functions somehow, or - make these functions return distinct error codes. Partial fix for #308.
9edc20c1	2022-02-07T20:38:30	Fix double counting of CRLF in comments Fixes #151.
5408c10c	2022-02-04T14:00:09	Don't normalize namespace URIs in XPointer xmlns() scheme Namespace URIs should be compared without escaping or unescaping: https://www.w3.org/TR/REC-xml-names/#NSNameComparison Fixes #289.
1c7d91ab	2022-02-03T23:31:19	Fix handling of XSD with empty namespace An empty namespace means no default namespace. Fixes #303.
f480f750	2022-02-03T14:43:17	Update NewsML DTD in test suite Switch to version 1.2 which has a clearer license. Fixes #291.
d85245f9	2022-01-16T21:39:04	Fix regression with PEs in external DTD Fix a regression introduced with commit a28f7d87. In some cases, parameter entity references in external DTDs wouldn't be expanded. Fixes #306.
03bb9293	2021-07-07T18:23:18	Fix parse failure when 4-byte character in UTF-16 BE is split across a chunk This makes the logic in UTF16BEToUTF8() match UTF16LEToUTF8(). * encoding.c: (UTF16LEToUTF8): - Fix comment to describe what the code does. (UTF16BEToUTF8): - Fix undefined behavior which was applied to UTF16LEToUTF8() in 2f9382033e. - Add bounds check to while() loop which was applied to UTF16LEToUTF8() in be803967db. - Do not return -2 when (in >= inend) to fix the bug. This was applied to UTF16LEToUTF8() in 496a1cf592. - Inline (<< 8) statements to match UTF16LEToUTF8(). Add the following tests and results: test/text-4-byte-UTF-16-BE-offset.xml test/text-4-byte-UTF-16-BE.xml test/text-4-byte-UTF-16-LE-offset.xml test/text-4-byte-UTF-16-LE.xml
2732b234	2022-01-10T13:32:14	Fix regression parsing public IDs literals in HTML Fix regression introduced when reworking htmlParsePubidLiteral in commit 93ce33c2. Fixes #318.
01411e7c	2021-02-08T20:58:32	Check for invalid redeclarations of predefined entities Implement section "4.6 Predefined Entities" of the XML 1.0 spec and check whether redeclarations of predefined entities match the original definitions. Note that some test cases declared <!ENTITY lt "<"> But the XML spec clearly states that this is illegal: > If the entities lt or amp are declared, they MUST be declared as > internal entities whose replacement text is a character reference to > the respective character (less-than sign or ampersand) being escaped; > the double escaping is REQUIRED for these entities so that references > to them produce a well-formed result. Also fixes #217 but the connection is only tangential. The integer overflow discovered by fuzzing was more related to the fact that various parts of the parser disagreed on whether to prefer predefined entities over their redeclarations. The whole situation is a mess and even depends on legacy parser options. But now that redeclarations are validated, it shouldn't make a difference. As noted in the added comment, this is also one of the cases where overly defensive checks can hide interesting logic bugs from fuzzers.
e28d9347	2020-08-04T14:53:19	add test coverage for incorrectly-closed comments this establishes the baseline behavior so that subsequent commits which modify this behavior are clear about what's being changed.
87d20b55	2020-08-19T13:52:08	Fix regression introduced with commit 74dcc10b The code wasn't dead after all, but I can see no reason in delaying the XPointer evaluation. This could lead to nodes included earlier appearing in XPointer results.
d88df4bd	2020-08-16T23:38:48	Fix corner case with empty xi:fallback xi:fallback could become empty after recursive expansion. Use a flag to track whether nodes should be skipped.
1abf2967	2020-08-06T17:51:57	Fix exponential runtime and memory in xi:fallback processing When creating XML_XINCLUDE_START nodes, the children of the original xi:include node must be freed, otherwise fallback content is copied twice, doubling runtime and memory consumption for each nested xi:fallback/xi:include pair. Found with libFuzzer.
0f9817c7	2020-06-10T16:34:52	Don't recurse into xi:include children in xmlXIncludeDoProcess Otherwise, nested xi:include nodes might result in a use-after-free if XML_PARSE_NOXINCNODE is specified. Found with libFuzzer and ASan.
6b4717d6	2020-07-06T12:36:27	Add regexp regression tests - Bug 757711: heap-buffer-overflow in xmlFAParsePosCharGroup <https://bugzilla.gnome.org/show_bug.cgi?id=757711> - Bug 783015 - Integer-overflow in xmlFAParseQuantExact <https://bugzilla.gnome.org/show_bug.cgi?id=783015> (Regexptests): Add support for checking stderr output when running regexp tests. This makes it possible to check in test cases that fail and not see false-positive error output when running the tests. Unlike other libxml2 test suites, if there is no stderr output, no *.err file needs to be created.
477c7f6a	2020-06-28T15:54:23	Fix quadratic runtime in HTML parser Commit eeb99329 removed an important optimization avoiding quadratic runtime when repeatedly scanning the input buffer for terminating characters in the HTML push parser. The related bug is https://bugzilla.gnome.org/show_bug.cgi?id=444994 Make sure that ctxt->checkIndex is always written and store additional parser state in ctxt->inSubset which is unused in the HTML parser. Found by OSS-Fuzz.
32cb5dcc	2020-02-11T13:16:10	Add test case for recursive external parsed entities
2a350ee9	2019-09-30T17:04:54	Large batch of typo fixes Closes #109.
c51e38cb	2019-09-30T13:50:02	Make xmlParseConditionalSections non-recursive Avoid call stack overflow in deeply nested conditional sections. Found by OSS-Fuzz.
c2b0a184	2019-09-25T13:57:42	Fix empty branch in regex Fixes bug 649244: https://bugzilla.gnome.org/show_bug.cgi?id=649244 Closes #57.
6705f4d2	2019-09-16T15:45:27	Remove executable bit from non-executable files
e8c9cd5c	2019-09-16T15:36:02	Fix Schema determinism check of ##other namespaces Non-compound (##local) and compound string atoms are always disjoint regardless of whether the compound atom is negated (##other). Closes #40.
8efc5b28	2019-09-13T12:24:23	14:00 is a valid timezone for xs:dateTime Closes #100
ea695ac0	2019-08-09T15:09:22	Fix unability to RelaxNG-validate grammar with choice-based name class Previously, test/relaxng/ambig_name-class2.xml would fail to validate against test/relaxng/ambig_name-class2.rng: > test/relaxng/ambig_name-class2.rng:4: > element attribute: Relax-NG parser error : > Found anyName attribute without oneOrMore ancestor > Relax-NG schema test/relaxng/ambig_name-class2.rng failed to compile Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
8074b881	2019-08-08T23:33:48	Fix unability to validate ambiguously constructed interleave for RelaxNG Previously, test/relaxng/ambig_name-class.xml would fail to validate for a simple reason -- interleave within "open-name-class" context is supposed to be fine with whatever else is pending the consumption, since effectively, it's unrelated from a higher parsing perspective. Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
c2f4da1a	2017-05-21T22:08:50	Improve XPath predicate and filter evaluation Consolidate code paths evaluating XPath predicates and filters. Don't push context node on stack when evaluating predicates. I have no idea why this was done. It seems completely useless and trying to pop the context node from a corrupted stack has already caused security issues. Filter nodesets in-place and don't create node sets with NULL gaps which allows to simplify merging a great deal. Simply move matched nodes backward and create a compact node set. Merge xmlXPathCompOpEvalPositionalPredicate into xmlXPathCompOpEvalPredicate.
30a6533e	2019-03-08T12:15:17	Fix float casts in xmlXPathSubstringFunction Rewrite conversion of double to int in xmlXPathSubstringFunction, adding range checks to avoid undefined behavior. Make sure to add start and length as floating-point numbers before converting to int. Fix a bug when rounding negative start indices. Remove unneeded calls to xmlXPathIs{Inf,NaN} and rely on IEEE math instead. Avoid computing the string length. xmlUTF8Strsub works as expected if the length of the requested substring exceeds the input. Found with libFuzzer and UBSan.
c64d4efb	2018-10-13T00:12:12	Remove redefined starts and defines inside include elements When including a grammar from another grammar, we need to make sure that any redefines of starts and includes that that grammar does inside any of its include elements are also removed.
46da8fc5	2018-10-12T23:46:24	Allow choice within choice in nameClass in RELAX NG The pattern nameClass allows for nested choice elements, for example <name> <choice> <choice> <name>a</name> <name>b</name> </choice> <name>c</name> </choice> </name> which is semantically equivalent to <name> <choice> <name>a</name> <name>b</name> <name>c</name> </choice> </name> The old code didn’t handle this correctly, as it never expected a choice inside another choice. This patch fixes this by flattening any nested choices. This pattern of nested choice elements comes up in RELAX NG simplification, where all choice elements are rewritten in this nested manner, see section 4.12 of the RELAX NG specification.
4338c310	2018-10-12T22:30:26	Look inside divs for starts and defines inside include RELAX NG allows for div elements inside of include elements. We need to look inside those div elements for start and define elements that may be redefining start and define elements in the included grammar.
72182550	2017-11-04T15:38:58	Add test for ICU flush and pivot buffer
5af594d8	2017-10-07T14:54:45	Fix comparison of nodesets to strings Fix two bugs in xmlXPathNodeValHash which could lead to errors when comparing nodesets to strings: - Only use contents of text nodes to compute the hash for element nodes. Comments, PIs, and other node types don't affect the string-value and must be ignored. - Reset `string` to NULL for node types other than text. Reported by Aleksei on the mailing list: https://mail.gnome.org/archives/xml/2017-September/msg00016.html
69936b12	2017-08-30T14:16:01	Revert "Print error messages for truncated UTF-8 sequences" This reverts commit 79c8a6b which caused a serious regression in streaming mode. Also reverts part of commit 52ceced "Fix infinite loops with push parser in recovery mode". Fixes bug 786554.
899a5d9f	2017-07-25T14:59:49	Detect infinite recursion in parameter entities When expanding a parameter entity in a DTD, infinite recursion could lead to an infinite loop or memory exhaustion. Thanks to Wei Lei for the first of many reports. Fixes bug 759579.
5f440d8c	2017-06-12T14:32:34	Rework entity boundary checks Make sure to finish all entities in the internal subset. Nevertheless, readd a sanity check in xmlParseStartTag2 that was lost in my previous commit. Also add a sanity check in xmlPopInput. Popping an input unexpectedly was the source of many recent memory bugs. The check doesn't mitigate such issues but helps with diagnosis. Always base entity boundary checks on the input ID, not the input pointer. The pointer could have been reallocated to the old address. Always throw a well-formedness error if a boundary check fails. In a few places, a validity error was thrown. Fix a few error codes and improve indentation.
85c112a0	2017-06-12T18:26:11	Add test cases for bug 758518 test/HTML/758518-entity.html exposed a bug in pushParseTest() in runtest.c which assumed that an input file was at least 4 bytes long. That test case is only 3 bytes, so we now take the minimum of 4 bytes or the length of the test input. We also now use 'chunkSize' in place of the hard-coded value '1024' later in the function.
79c8a6b1	2017-06-10T17:01:27	Print error messages for truncated UTF-8 sequences Before, truncated UTF-8 sequences at the end of a file were treated as EOF. Create an error message containing the offending bytes. xmlStringCurrentChar would also print characters from the input stream, not the string it's working on.
932cc989	2017-06-03T02:01:29	Fix buffer size checks in xmlSnprintfElementContent xmlSnprintfElementContent failed to correctly check the available buffer space in two locations. Fixes bug 781333 (CVE-2017-9047) and bug 781701 (CVE-2017-9048). Thanks to Marcel Böhme and Thuan Pham for the report.
e2663054	2017-06-05T15:37:17	Fix handling of parameter-entity references There were two bugs where parameter-entity references could lead to an unexpected change of the input buffer in xmlParseNameComplex and xmlDictLookup being called with an invalid pointer. Percent sign in DTD Names ========================= The NEXTL macro used to call xmlParserHandlePEReference. When parsing "complex" names inside the DTD, this could result in entity expansion which created a new input buffer. The fix is to simply remove the call to xmlParserHandlePEReference from the NEXTL macro. This is safe because no users of the macro require expansion of parameter entities. - xmlParseNameComplex - xmlParseNCNameComplex - xmlParseNmtoken The percent sign is not allowed in names, which are grammatical tokens. - xmlParseEntityValue Parameter-entity references in entity values are expanded but this happens in a separate step in this function. - xmlParseSystemLiteral Parameter-entity references are ignored in the system literal. - xmlParseAttValueComplex - xmlParseCharDataComplex - xmlParseCommentComplex - xmlParsePI - xmlParseCDSect Parameter-entity references are ignored outside the DTD. - xmlLoadEntityContent This function is only called from xmlStringLenDecodeEntities and entities are replaced in a separate step immediately after the function call. This bug could also be triggered with an internal subset and double entity expansion. This fixes bug 766956 initially reported by Wei Lei and independently by Chromium's ClusterFuzz, Hanno Böck, and Marco Grassi. Thanks to everyone involved. xmlParseNameComplex with XML_PARSE_OLD10 ======================================== When parsing Names inside an expanded parameter entity with the XML_PARSE_OLD10 option, xmlParseNameComplex would call xmlGROW via the GROW macro if the input buffer was exhausted. At the end of the parameter entity's replacement text, this function would then call xmlPopInput which invalidated the input buffer. There should be no need to invoke GROW in this situation because the buffer is grown periodically every XML_PARSER_CHUNK_SIZE characters and, at least for UTF-8, in xmlCurrentChar. This also matches the code path executed when XML_PARSE_OLD10 is not set. This fixes bugs 781205 (CVE-2017-9049) and 781361 (CVE-2017-9050). Thanks to Marcel Böhme and Thuan Pham for the report. Additional hardening ==================== A separate check was added in xmlParseNameComplex to validate the buffer size.
7482f41f	2017-06-01T22:00:19	Check for integer overflow in xmlXPathFormatNumber Check for overflow before casting double to int. Found with afl-fuzz and UBSan.
f4029cd4	2016-04-21T16:37:26	Check XPath exponents for overflow Avoid undefined behavior and wrong results with huge exponents. Found with afl-fuzz and UBSan.
a58331a6	2017-05-29T21:02:21	Check for overflow in xmlXPathIsPositionalPredicate Avoid undefined behavior when casting from double to int. Found with afl-fuzz and UBSan.
a851868a	2017-05-29T20:14:42	Parse small XPath numbers more accurately Don't count leading zeros towards the fraction size limit. This allows to parse numbers like 0.0000000000000000000000000000000000000000000000000000000001 which is the only standard-conformant way to represent such numbers, as scientific notation isn't allowed in XPath 1.0. (It is allowed in XPath 2.0 and in libxml2 as an extension, though.) Overall accuracy is still bad, see bug 783238.
4bebb030	2016-04-21T13:41:09	Rework XPath rounding functions Use the C library's floor and ceil functions. The old code was overly complicated for no apparent reason and could result in undefined behavior when handling NaNs (found with afl-fuzz and UBSan). Fix wrong comment in xmlXPathRoundFunction. The implementation was already following the spec and rounding half up.
40f58521	2017-05-26T20:16:35	Fix axis traversal from attribute and namespace nodes When traversing the "preceding" axis from an attribute node, we must first go up to the attribute's containing element. Otherwise, text children of other attributes could be returned. This made it possible to hit a code path in xmlXPathNextAncestor which contained another bug: The attribute node was initialized with the context node instead of the current node. Normally, this code path is only hit via xmlXPathNextAncestorOrSelf in which case the current and context node are the same. The combination of the two bugs could result in an infinite loop, found with libFuzzer. Traversing the "following" and the "preceding" axis from namespace nodes should be handled similarly. This wasn't supported at all previously.
9ab01a27	2016-06-28T14:22:23	Fix XPointer paths beginning with range-to The old code would invoke the broken xmlXPtrRangeToFunction. range-to isn't really a function but a special kind of location step. Remove this function and always handle range-to in the XPath code. The old xmlXPtrRangeToFunction could also be abused to trigger a use-after-free error with the potential for remote code execution. Found with afl-fuzz. Fixes CVE-2016-5131.
d8083bf7	2016-06-25T12:35:50	Fix NULL pointer deref in XPointer range-to - Check for errors after evaluating first operand. - Add sanity check for empty stack. Found with afl-fuzz.
0bcd05c5	2016-03-01T15:18:04	Heap-based buffer overread in htmlCurrentChar For https://bugzilla.gnome.org/show_bug.cgi?id=758606 * parserInternals.c: (xmlNextChar): Add an test to catch other issues on ctxt->input corruption proactively. For non-UTF-8 charsets, xmlNextChar() failed to check for the end of the input buffer and would continuing reading. Fix this by pulling out the check for the end of the input buffer into common code, and return if we reach the end of the input buffer prematurely. * result/HTML/758606.html: Added. * result/HTML/758606.html.err: Added. * result/HTML/758606.html.sax: Added. * result/HTML/758606_2.html: Added. * result/HTML/758606_2.html.err: Added. * result/HTML/758606_2.html.sax: Added. * test/HTML/758606.html: Added test case. * test/HTML/758606_2.html: Added test case.
00906759	2016-01-26T16:57:03	Heap-based buffer-underreads due to xmlParseName For https://bugzilla.gnome.org/show_bug.cgi?id=759573 * parser.c: (xmlParseElementDecl): Return early on invalid input to fix non-minimized test case (759573-2.xml). Otherwise the parser gets into a bad state in SKIP(3) at the end of the function. (xmlParseConditionalSections): Halt parsing when hitting invalid input that would otherwise caused xmlParserHandlePEReference() to recurse unexpectedly. This fixes the minimized test case (759573.xml). * result/errors/759573-2.xml: Add. * result/errors/759573-2.xml.err: Add. * result/errors/759573-2.xml.str: Add. * result/errors/759573.xml: Add. * result/errors/759573.xml.err: Add. * result/errors/759573.xml.str: Add. * test/errors/759573-2.xml: Add. * test/errors/759573.xml: Add.
38eae571	2016-03-07T14:04:08	Heap use-after-free in xmlSAX2AttributeNs For https://bugzilla.gnome.org/show_bug.cgi?id=759020 * parser.c: (xmlParseStartTag2): Attribute strings are only valid if the base does not change, so add another check where the base may change. Make sure to set 'attvalue' to NULL after freeing it. * result/errors/759020.xml: Added. * result/errors/759020.xml.err: Added. * result/errors/759020.xml.str: Added. * test/errors/759020.xml: Added test case.
45752d2c	2016-03-03T11:50:34	Bug 759398: Heap use-after-free in xmlDictComputeFastKey <https://bugzilla.gnome.org/show_bug.cgi?id=759398> * parser.c: (xmlParseNCNameComplex): Store start position instead of a pointer to the name since the underlying buffer may change, resulting in a stale pointer being used. * result/errors/759398.xml: Added. * result/errors/759398.xml.err: Added. * result/errors/759398.xml.str: Added. * test/errors/759398.xml: Added test case.
a820dbea	2016-03-01T11:34:04	Bug 758605: Heap-based buffer overread in xmlDictAddString <https://bugzilla.gnome.org/show_bug.cgi?id=758605> Reviewed by David Kilzer. * HTMLparser.c: (htmlParseName): Add bounds check. (htmlParseNameComplex): Ditto. * result/HTML/758605.html: Added. * result/HTML/758605.html.err: Added. * result/HTML/758605.html.sax: Added. * runtest.c: (pushParseTest): The input for the new test case was so small (4 bytes) that htmlParseChunk() was never called after htmlCreatePushParserCtxt(), thereby creating a false positive test failure. Fixed by using a do-while loop so we always call htmlParseChunk() at least once. * test/HTML/758605.html: Added.
db07dd61	2016-02-12T09:58:29	Bug 758588: Heap-based buffer overread in xmlParserPrintFileContextInternal <https://bugzilla.gnome.org/show_bug.cgi?id=758588> * parser.c: (xmlParseEndTag2): Add bounds checks before dereferencing ctxt->input->cur past the end of the buffer, or incrementing the pointer past the end of the buffer. * result/errors/758588.xml: Add test result. * result/errors/758588.xml.err: Ditto. * result/errors/758588.xml.str: Ditto. * test/errors/758588.xml: Add regression test.
6eb0894a	2016-05-05T16:49:00	Fix memory leak with XPath namespace nodes Set hasNsNodes to 1 when adding namespace nodes via XP_TEST_HIT.
82b73039	2016-04-30T17:53:10	Fix namespace axis traversal When the namespace axis is traversed in "toBool" mode, the traversal can exit early, before visiting all nodes. In this case, the XPath context still contains a non-NULL tmpNsList. This means that - the check when to start a new traversal was wrong and - the tmpNsList could be leaked. Fixes bug #750037 and, by accident, bug #756075: https://bugzilla.gnome.org/show_bug.cgi?id=750037 https://bugzilla.gnome.org/show_bug.cgi?id=756075
839689a9	2016-04-27T18:00:12	Don't recurse into OP_VALUEs in xmlXPathOptimizeExpression The ch1 slot of OP_VALUEs contains an invalid value. Ignore it. Fixes bug #760325: https://bugzilla.gnome.org/show_bug.cgi?id=760325
f39fd66e	2016-04-27T03:01:16	Fix namespace::node() XPath expression Make sure that xmlXPathNodeSetAddNs is called for namespace nodes when matched with a namespace::node() step. This correctly sets the parent of namespace nodes. Note that xmlXPathNodeSetAddNs must only be called if working on the namespace axis. Otherwise, the context node is not the parent of the namespace node and the standard XP_TEST_HIT macro must be invoked. This explains the errors in the C14N tests that the old TODO comment mentioned.
e2893903	2016-04-21T19:19:23	Fix parsing of NCNames in XPath The NCName parser would allow any NameChar as start character. For example, the following XPath expressions would compile: self::-abc self::0abc self::.abc
cad102b8	2016-04-15T22:41:24	Do normalize string-based datatype value in RelaxNG facet checking Original patch is from Jan Pokorný <jpokorny redhat com> https://mail.gnome.org/archives/xml/2013-November/msg00028.html Improve it according to reviews and add test files.
4f8606c1	2016-01-05T13:38:09	Bug 760183: REGRESSION (v2.9.3): XML push parser fails with bogus UTF-8 encoding error when multi-byte character in large CDATA section is split across buffer <https://bugzilla.gnome.org/show_bug.cgi?id=760183> * parser.c: (xmlCheckCdataPush): Add 'complete' argument to describe whether the buffer passed in is the whole CDATA buffer, or if there is more data to parse. If there is more data to parse, don't return a negative value for an invalid multi-byte UTF-8 character that is split between buffers. (xmlParseTryOrFinish): Pass 'complete' argument to xmlCheckCdataPush() as appropriate. * result/cdata-2-byte-UTF-8.xml: Added. * result/cdata-2-byte-UTF-8.xml.rde: Added. * result/cdata-2-byte-UTF-8.xml.rdr: Added. * result/cdata-2-byte-UTF-8.xml.sax: Added. * result/cdata-2-byte-UTF-8.xml.sax2: Added. * result/cdata-3-byte-UTF-8.xml: Added. * result/cdata-3-byte-UTF-8.xml.rde: Added. * result/cdata-3-byte-UTF-8.xml.rdr: Added. * result/cdata-3-byte-UTF-8.xml.sax: Added. * result/cdata-3-byte-UTF-8.xml.sax2: Added. * result/cdata-4-byte-UTF-8.xml: Added. * result/cdata-4-byte-UTF-8.xml.rde: Added. * result/cdata-4-byte-UTF-8.xml.rdr: Added. * result/cdata-4-byte-UTF-8.xml.sax: Added. * result/cdata-4-byte-UTF-8.xml.sax2: Added. * result/noent/cdata-2-byte-UTF-8.xml: Added. * result/noent/cdata-3-byte-UTF-8.xml: Added. * result/noent/cdata-4-byte-UTF-8.xml: Added. * test/cdata-2-byte-UTF-8.xml: Added. * test/cdata-3-byte-UTF-8.xml: Added. * test/cdata-4-byte-UTF-8.xml: Added. - Add tests and results. Only 'make Readertests XMLPushtests' fails prior to the fix.
4a5d80ad	2015-09-18T15:06:46	Fix a bug in CData error handling in the push parser For https://bugzilla.gnome.org/show_bug.cgi?id=754947 The checking function was returning incorrect args in some cases Adds the test to teh reg suite and fix one of the existing test output
51f02b0a	2015-09-15T16:50:32	Fix a bug on name parsing at the end of current input buffer For https://bugzilla.gnome.org/show_bug.cgi?id=754946 When hitting the end of the current input buffer while parsing a name we could end up loosing the beginning of the name, which led to various issues.
ef709ce2	2015-09-10T19:41:41	Fix the spurious ID already defined error For https://bugzilla.gnome.org/show_bug.cgi?id=737840 the fix for 724903 introduced a regression on external entities carrying IDs, revert that patch in part and add a specific test to avoid readding it
2fab235d	2015-03-16T08:38:36	Fix support for except in nameclasses For https://bugzilla.gnome.org/show_bug.cgi?id=565219 The code was imply missing even if simple, added a few regression tests.
02b252d7	2015-03-08T17:00:37	Regression test for bug #695699
342658a1	2015-03-08T16:46:04	Add a couple of XPath tests
f6aaabce	2015-03-08T16:05:26	Allow attributes on descendant-or-self axis If the context node is an attribute, the attribute itself is on the descendant-or-self axis. The principal node type of this axis is element, so the only node test that can return the attribute is "node()". In other words, "@attr/descendant-or-self::node()" is equivalent to "@attr". This matches the behavior of Saxon-CE.
df23f584	2014-10-23T13:52:47	Adding example from bugs 738805 to regression tests For https://bugzilla.gnome.org/show_bug.cgi?id=738805 Tortuous test case provided by pierre.labastie@neuf.fr
6473a41a	2013-10-23T14:51:33	Implement choice for name classes on attributes https://bugzilla.gnome.org/show_bug.cgi?id=710744
dcc19503	2013-05-22T22:56:45	Fix a parsing bug on non-ascii element and CR/LF usage https://bugzilla.gnome.org/show_bug.cgi?id=698550 Somehow the behaviour of the internal parser routine changed slightly when encountering CR/LF, which led to a bug when parsing document with non-ascii Names
483272f3	2013-03-27T13:37:14	Added a regression tests from bug 694228 data Provided by Mark Rowe <mrowe@apple.com>
4629ee02	2012-07-23T14:15:40	Do not fetch external parsed entities Unless explicietely asked for when validating or replacing entities with their value. Problem pointed out by Tom Lane <tgl@redhat.com> * parser.c: do not load external parsed entities unless needed * test/errors/extparsedent.xml result/errors/extparsedent.xml*: add a regression test to avoid change of the behaviour in the future
a0cd075d	2012-05-11T19:31:12	HTML parser error with <noscript> in the <head> For https://bugzilla.gnome.org/show_bug.cgi?id=615785 When the <noscript> is found, <head> is closed and a <body> element is created. The real <body id="xxx"> gets skipped over, so I can't see any of the body's attributes. Just don't close <head> when encountering a <noscript> Add a regression test too
4609e6c9	2012-05-11T15:31:05	XSD: optional element in complex type extension For https://bugzilla.gnome.org/show_bug.cgi?id=609796 Libxml2 fails to validate an instance document against a schema if an element whose type is a complex extension of some base type with an optional child element and that child element is not specified in the instance document. For example, suppose I have some complex type BaseType that is defined to have one child element in a sequence group that has minOccurs set to 0
868d92da	2012-05-10T15:34:57	Add HTML parser support for HTML5 meta charset encoding declaration For https://bugzilla.gnome.org/show_bug.cgi?id=655218 http://www.w3.org/TR/2011/WD-html5-20110525/semantics.html#the-meta-element """ The charset attribute specifies the character encoding used by the document. This is a character encoding declaration. If the attribute is present in an XML document, its value must be an ASCII case-insensitive match for the string "UTF-8" (and the document is therefore forced to use UTF-8 as its encoding). """ However, while <meta http-equiv="Content-Type" content="text/html; charset=utf8"> works, <meta charset="utf8"> does not. While libxml2 HTML parser is not tuned for HTML5, this is a simple addition Also added a testcase
aa422d92	2009-09-24T11:31:48	595792 fixing a RelaxNG bug introduced in 2.7.4 * relaxng.c: refs definitions added from inported schemas should not be processed as refs from the main schemas * test/relaxng/595792* result/relaxng/595792*: add the test to the regression suite
9332b48f	2009-09-23T18:28:43	Fix a Relaxng bug raised by libvirt test suite * xmlregexp.c: other fixes in 2.7.4 raised this internal error when comparing ranges, this affects among others detection of the determinism * test/relaxng/libvirt* result/relaxng/libvirt*: add a test case based on libvirt schemas and tests
1ba2aca3	2009-08-31T16:47:39	492317 Fix Relax-NG validation problems * relaxng.c xmlregexp.c: a subtle problem when checking for compileable content model, if using the same elements in cases of choices. Handled by adding a special flag to the regexp compilation to detect transitions with different atoms using same strings. * test/relaxng/492317* result/relaxng/492317*: add the test to the regression suite
ec18c960	2009-08-26T18:37:43	558452 fight with reg test and error report * relaxng.c: tiny fix and provide more context on some errors * result/relaxng/558452_0* test/relaxng/558452: add some regression tests for the bugs Makefile.am runtest.c: fight with the fact streaming error messages can differ due to missing node context
4013e83e	2009-08-26T17:24:31	579746 XSD validation not correct / nilable groups * xmlschemas.c: when a particle need to be processed via counted transition, if the group is nillable, the counting won't work, so keep track of nillable subset as they are built and generate the appropriate epsilon transitions as needed * test/schemas/579746* result/schemas/579746*: add related test cases based on the bug report
a6c76a26	2009-08-26T14:37:00	566012 part 2 fix regresion tests and push mode * test/utf16bebom.xml: regression test showed that this test case was broken but previous behaviour would not detect it ! * parser.c: fix 566012 for the push mode of the parser, tricky ! * test/ebcdic_566012.xml result//ebcdic_566012.xml*: add the test to the regression suite
d80d0728	2009-08-22T18:56:01	559410 - Regexp bug on (...)? constructs * xmlregexp.c: fix a regexp bug on some (...)? constructs * test/schemas/nvdcve* result/schemas/nvdcve*: add the tests to the regression suite
a721612e	2009-08-21T18:22:58	446613 small validation bug mixed content with NS * valid.c: fix a bug when valdating mixed content lists and some name use namespaces prefixes. * result/valid/notes.xml* test/valid/dtds/notes.dtd * test/valid/notes.xml: add the test case to the regression suite
be390ed0	2009-08-12T12:40:39	Test case for 570702
edc68aad	2009-08-07T20:29:33	582906 XSD validating multiple imports of the same schema * xmlschemas.c: When validating a schema that includes the same file that has no targetNamespace defined an internal erro was thrown, depending on the orig namespace that should be allowed though * test/schemas/582906-* result/schemas/582906-*: 2 tests case, one where this is allowed, and one where this is forbidden
d9960720	2009-08-07T19:01:32	Bug 582887 – problems validating complex schemas * xmlschemas.c: fixes the problem faced when importing the same schemas multiple times but from different places which is allowed * test/schemas/582887* result/schemas/582887*: adding the specific test to the regressions
83868247	2009-07-09T10:26:22	Aleksey Sanin support for c14n 1.1 * c14n.c include/libxml/c14n.h: adds support for C14N 1.1, new flags at the API level * runtest.c Makefile.am testC14N.c xmllint.c: add support in CLI tools and test binaries * result/c14n/1-1-without-comments/* test/c14n/1-1-without-comments/*: add a new batch of tests
7f4547cd	2008-10-03T07:58:23	preparing the release of 2.7.2 fix the Solaris portability issue * configure.in doc/* NEWS: preparing the release of 2.7.2 * dict.c: fix the Solaris portability issue * parser.c: additional cleanup on #554660 fix * test/ent13 result/ent13* result/noent/ent13: added the example in the regression test suite. HTMLparser.c: handle leading BOM in htmlParseElement() Daniel svn path=/trunk/; revision=3799
a57ba4ce	2008-09-25T16:06:18	fix an HTML parsing error on large data sections reported by Mike Day add * HTMLparser.c: fix an HTML parsing error on large data sections reported by Mike Day * test/HTML/utf8bug.html result/HTML/utf8bug.html.err result/HTML/utf8bug.html.sax result/HTML/utf8bug.html: add the reproducer to the test suite daniel svn path=/trunk/; revision=3797
0161e638	2008-08-28T15:36:32	completely different fix for the recursion detection based on entity * parser.c include/libxml/parser.h: completely different fix for the recursion detection based on entity density, big cleanups in the entity parsing code too * result/.sax: the parser should not ask for used defined versions of the predefined entities * testrecurse.c: automatic test for entity recursion checks * Makefile.am: added testrecurse * test/recurse/lol* test/recurse/good*: a first set of tests for the recursion Daniel svn path=/trunk/; revision=3783

e986d09c

2022-07-15T14:02:26

Skip incorrectly opened HTML comments Commit 4fd69f3e fixed handling of '<' characters not followed by an ASCII letter. But a '<!' sequence followed by invalid characters should be treated as bogus comment and skipped. Fixes #380.

14517012

2022-04-23T19:19:33

Fix parsing of subtracted regex character classes Fixes #370.

4612ce30

2022-04-21T03:52:52

Implement xpath1() XPointer scheme See https://www.w3.org/2005/04/xpointer-schemes/

41afa89f

2022-04-10T14:09:29

Fix short-lived regression in xmlStaticCopyNode Commit 7618a3b1 didn't account for coalesced text nodes. I think it would be better if xmlStaticCopyNode didn't try to coalesce text nodes at all. This code path can only be triggered if some other code doesn't coalesce text nodes properly. In this case, OSS-Fuzz found such behavior in xinclude.c.

57b81c20

2022-03-05T18:20:29

Normalize XPath strings in-place Simplify the code and fix a potential memory leak. Fixes #343.

bc06a522

2022-03-02T02:57:49

Fix recursion check in xinclude.c Compare the included URL with the document's URL to detect local inclusions. Fixes #348.

24cdc890

2021-07-17T14:06:49

test coverage for abruptly-closed comments These establish baseline behavior so that the subsequent commit is clear about the behavior it will modify.

966b0f21

2021-08-19T02:46:32

Add whitespace folding for some atomic data types that it's missing on. XSD validation fails when some atomic types contain surrounding whitespace even though XML Schema Part 2: Datatypes Second Edition, section 4.3.6 says they should be collapsed. Fix this. (I am not sure whether the test is correct.) Issue: #278

ea6e8f99

2021-12-20T00:34:58

Fix certain combinations of regex range quantifiers Fix regex transitions that have both min/max and a counter. In this case, we want to save the regex state before incrementing the counter. Fixes #301 and the issue reported here: https://mail.gnome.org/archives/xml/2016-April/msg00017.html

382fb056

2021-12-20T00:31:41

Fix range quantifier on subregex Make sure to add counted exit transitions before other counter transitions. Otherwise, we won't backtrack correctly. Fixes #65.

ce0871e1

2022-02-20T16:44:41

Only warn on invalid redeclarations of predefined entities Downgrade the error message to a warning since the error was ignored, anyway. Also print the name of redeclared entity. For a proper fix that also shows filename and line number of the invalid redeclaration, we'd have to - pass the parser context to the entity functions somehow, or - make these functions return distinct error codes. Partial fix for #308.

9edc20c1

2022-02-07T20:38:30

Fix double counting of CRLF in comments Fixes #151.

5408c10c

2022-02-04T14:00:09

Don't normalize namespace URIs in XPointer xmlns() scheme Namespace URIs should be compared without escaping or unescaping: https://www.w3.org/TR/REC-xml-names/#NSNameComparison Fixes #289.

1c7d91ab

2022-02-03T23:31:19

Fix handling of XSD with empty namespace An empty namespace means no default namespace. Fixes #303.

f480f750

2022-02-03T14:43:17

Update NewsML DTD in test suite Switch to version 1.2 which has a clearer license. Fixes #291.

d85245f9

2022-01-16T21:39:04

Fix regression with PEs in external DTD Fix a regression introduced with commit a28f7d87. In some cases, parameter entity references in external DTDs wouldn't be expanded. Fixes #306.

03bb9293

2021-07-07T18:23:18

Fix parse failure when 4-byte character in UTF-16 BE is split across a chunk This makes the logic in UTF16BEToUTF8() match UTF16LEToUTF8(). * encoding.c: (UTF16LEToUTF8): - Fix comment to describe what the code does. (UTF16BEToUTF8): - Fix undefined behavior which was applied to UTF16LEToUTF8() in 2f9382033e. - Add bounds check to while() loop which was applied to UTF16LEToUTF8() in be803967db. - Do not return -2 when (in >= inend) to fix the bug. This was applied to UTF16LEToUTF8() in 496a1cf592. - Inline (<< 8) statements to match UTF16LEToUTF8(). Add the following tests and results: test/text-4-byte-UTF-16-BE-offset.xml test/text-4-byte-UTF-16-BE.xml test/text-4-byte-UTF-16-LE-offset.xml test/text-4-byte-UTF-16-LE.xml

2732b234

2022-01-10T13:32:14

Fix regression parsing public IDs literals in HTML Fix regression introduced when reworking htmlParsePubidLiteral in commit 93ce33c2. Fixes #318.

01411e7c

2021-02-08T20:58:32

Check for invalid redeclarations of predefined entities Implement section "4.6 Predefined Entities" of the XML 1.0 spec and check whether redeclarations of predefined entities match the original definitions. Note that some test cases declared <!ENTITY lt "<"> But the XML spec clearly states that this is illegal: > If the entities lt or amp are declared, they MUST be declared as > internal entities whose replacement text is a character reference to > the respective character (less-than sign or ampersand) being escaped; > the double escaping is REQUIRED for these entities so that references > to them produce a well-formed result. Also fixes #217 but the connection is only tangential. The integer overflow discovered by fuzzing was more related to the fact that various parts of the parser disagreed on whether to prefer predefined entities over their redeclarations. The whole situation is a mess and even depends on legacy parser options. But now that redeclarations are validated, it shouldn't make a difference. As noted in the added comment, this is also one of the cases where overly defensive checks can hide interesting logic bugs from fuzzers.

e28d9347

2020-08-04T14:53:19

add test coverage for incorrectly-closed comments this establishes the baseline behavior so that subsequent commits which modify this behavior are clear about what's being changed.

87d20b55

2020-08-19T13:52:08

Fix regression introduced with commit 74dcc10b The code wasn't dead after all, but I can see no reason in delaying the XPointer evaluation. This could lead to nodes included earlier appearing in XPointer results.

d88df4bd

2020-08-16T23:38:48

Fix corner case with empty xi:fallback xi:fallback could become empty after recursive expansion. Use a flag to track whether nodes should be skipped.

1abf2967

2020-08-06T17:51:57

Fix exponential runtime and memory in xi:fallback processing When creating XML_XINCLUDE_START nodes, the children of the original xi:include node must be freed, otherwise fallback content is copied twice, doubling runtime and memory consumption for each nested xi:fallback/xi:include pair. Found with libFuzzer.

0f9817c7

2020-06-10T16:34:52

Don't recurse into xi:include children in xmlXIncludeDoProcess Otherwise, nested xi:include nodes might result in a use-after-free if XML_PARSE_NOXINCNODE is specified. Found with libFuzzer and ASan.

6b4717d6

2020-07-06T12:36:27

Add regexp regression tests - Bug 757711: heap-buffer-overflow in xmlFAParsePosCharGroup <https://bugzilla.gnome.org/show_bug.cgi?id=757711> - Bug 783015 - Integer-overflow in xmlFAParseQuantExact <https://bugzilla.gnome.org/show_bug.cgi?id=783015> (Regexptests): Add support for checking stderr output when running regexp tests. This makes it possible to check in test cases that fail and not see false-positive error output when running the tests. Unlike other libxml2 test suites, if there is no stderr output, no *.err file needs to be created.

477c7f6a

2020-06-28T15:54:23

Fix quadratic runtime in HTML parser Commit eeb99329 removed an important optimization avoiding quadratic runtime when repeatedly scanning the input buffer for terminating characters in the HTML push parser. The related bug is https://bugzilla.gnome.org/show_bug.cgi?id=444994 Make sure that ctxt->checkIndex is always written and store additional parser state in ctxt->inSubset which is unused in the HTML parser. Found by OSS-Fuzz.

32cb5dcc

2020-02-11T13:16:10

Add test case for recursive external parsed entities

2a350ee9

2019-09-30T17:04:54

Large batch of typo fixes Closes #109.

c51e38cb

2019-09-30T13:50:02

Make xmlParseConditionalSections non-recursive Avoid call stack overflow in deeply nested conditional sections. Found by OSS-Fuzz.

c2b0a184

2019-09-25T13:57:42

Fix empty branch in regex Fixes bug 649244: https://bugzilla.gnome.org/show_bug.cgi?id=649244 Closes #57.

6705f4d2

2019-09-16T15:45:27

Remove executable bit from non-executable files

e8c9cd5c

2019-09-16T15:36:02

Fix Schema determinism check of ##other namespaces Non-compound (##local) and compound string atoms are always disjoint regardless of whether the compound atom is negated (##other). Closes #40.

8efc5b28

2019-09-13T12:24:23

14:00 is a valid timezone for xs:dateTime Closes #100

ea695ac0

2019-08-09T15:09:22

Fix unability to RelaxNG-validate grammar with choice-based name class Previously, test/relaxng/ambig_name-class2.xml would fail to validate against test/relaxng/ambig_name-class2.rng: > test/relaxng/ambig_name-class2.rng:4: > element attribute: Relax-NG parser error : > Found anyName attribute without oneOrMore ancestor > Relax-NG schema test/relaxng/ambig_name-class2.rng failed to compile Signed-off-by: Jan Pokorný <jpokorny@redhat.com>

8074b881

2019-08-08T23:33:48

Fix unability to validate ambiguously constructed interleave for RelaxNG Previously, test/relaxng/ambig_name-class.xml would fail to validate for a simple reason -- interleave within "open-name-class" context is supposed to be fine with whatever else is pending the consumption, since effectively, it's unrelated from a higher parsing perspective. Signed-off-by: Jan Pokorný <jpokorny@redhat.com>

c2f4da1a

2017-05-21T22:08:50

Improve XPath predicate and filter evaluation Consolidate code paths evaluating XPath predicates and filters. Don't push context node on stack when evaluating predicates. I have no idea why this was done. It seems completely useless and trying to pop the context node from a corrupted stack has already caused security issues. Filter nodesets in-place and don't create node sets with NULL gaps which allows to simplify merging a great deal. Simply move matched nodes backward and create a compact node set. Merge xmlXPathCompOpEvalPositionalPredicate into xmlXPathCompOpEvalPredicate.

30a6533e

2019-03-08T12:15:17

Fix float casts in xmlXPathSubstringFunction Rewrite conversion of double to int in xmlXPathSubstringFunction, adding range checks to avoid undefined behavior. Make sure to add start and length as floating-point numbers before converting to int. Fix a bug when rounding negative start indices. Remove unneeded calls to xmlXPathIs{Inf,NaN} and rely on IEEE math instead. Avoid computing the string length. xmlUTF8Strsub works as expected if the length of the requested substring exceeds the input. Found with libFuzzer and UBSan.

c64d4efb

2018-10-13T00:12:12

Remove redefined starts and defines inside include elements When including a grammar from another grammar, we need to make sure that any redefines of starts and includes that that grammar does inside any of its include elements are also removed.

46da8fc5

2018-10-12T23:46:24

Allow choice within choice in nameClass in RELAX NG The pattern nameClass allows for nested choice elements, for example <name> <choice> <choice> <name>a</name> <name>b</name> </choice> <name>c</name> </choice> </name> which is semantically equivalent to <name> <choice> <name>a</name> <name>b</name> <name>c</name> </choice> </name> The old code didn’t handle this correctly, as it never expected a choice inside another choice. This patch fixes this by flattening any nested choices. This pattern of nested choice elements comes up in RELAX NG simplification, where all choice elements are rewritten in this nested manner, see section 4.12 of the RELAX NG specification.

4338c310

2018-10-12T22:30:26

Look inside divs for starts and defines inside include RELAX NG allows for div elements inside of include elements. We need to look inside those div elements for start and define elements that may be redefining start and define elements in the included grammar.

72182550

2017-11-04T15:38:58

Add test for ICU flush and pivot buffer

5af594d8

2017-10-07T14:54:45

Fix comparison of nodesets to strings Fix two bugs in xmlXPathNodeValHash which could lead to errors when comparing nodesets to strings: - Only use contents of text nodes to compute the hash for element nodes. Comments, PIs, and other node types don't affect the string-value and must be ignored. - Reset `string` to NULL for node types other than text. Reported by Aleksei on the mailing list: https://mail.gnome.org/archives/xml/2017-September/msg00016.html

69936b12

2017-08-30T14:16:01

Revert "Print error messages for truncated UTF-8 sequences" This reverts commit 79c8a6b which caused a serious regression in streaming mode. Also reverts part of commit 52ceced "Fix infinite loops with push parser in recovery mode". Fixes bug 786554.

899a5d9f

2017-07-25T14:59:49

Detect infinite recursion in parameter entities When expanding a parameter entity in a DTD, infinite recursion could lead to an infinite loop or memory exhaustion. Thanks to Wei Lei for the first of many reports. Fixes bug 759579.

5f440d8c

2017-06-12T14:32:34

Rework entity boundary checks Make sure to finish all entities in the internal subset. Nevertheless, readd a sanity check in xmlParseStartTag2 that was lost in my previous commit. Also add a sanity check in xmlPopInput. Popping an input unexpectedly was the source of many recent memory bugs. The check doesn't mitigate such issues but helps with diagnosis. Always base entity boundary checks on the input ID, not the input pointer. The pointer could have been reallocated to the old address. Always throw a well-formedness error if a boundary check fails. In a few places, a validity error was thrown. Fix a few error codes and improve indentation.

85c112a0

2017-06-12T18:26:11

Add test cases for bug 758518 test/HTML/758518-entity.html exposed a bug in pushParseTest() in runtest.c which assumed that an input file was at least 4 bytes long. That test case is only 3 bytes, so we now take the minimum of 4 bytes or the length of the test input. We also now use 'chunkSize' in place of the hard-coded value '1024' later in the function.

79c8a6b1

2017-06-10T17:01:27

Print error messages for truncated UTF-8 sequences Before, truncated UTF-8 sequences at the end of a file were treated as EOF. Create an error message containing the offending bytes. xmlStringCurrentChar would also print characters from the input stream, not the string it's working on.

932cc989

2017-06-03T02:01:29

Fix buffer size checks in xmlSnprintfElementContent xmlSnprintfElementContent failed to correctly check the available buffer space in two locations. Fixes bug 781333 (CVE-2017-9047) and bug 781701 (CVE-2017-9048). Thanks to Marcel Böhme and Thuan Pham for the report.

e2663054

2017-06-05T15:37:17

Fix handling of parameter-entity references There were two bugs where parameter-entity references could lead to an unexpected change of the input buffer in xmlParseNameComplex and xmlDictLookup being called with an invalid pointer. Percent sign in DTD Names ========================= The NEXTL macro used to call xmlParserHandlePEReference. When parsing "complex" names inside the DTD, this could result in entity expansion which created a new input buffer. The fix is to simply remove the call to xmlParserHandlePEReference from the NEXTL macro. This is safe because no users of the macro require expansion of parameter entities. - xmlParseNameComplex - xmlParseNCNameComplex - xmlParseNmtoken The percent sign is not allowed in names, which are grammatical tokens. - xmlParseEntityValue Parameter-entity references in entity values are expanded but this happens in a separate step in this function. - xmlParseSystemLiteral Parameter-entity references are ignored in the system literal. - xmlParseAttValueComplex - xmlParseCharDataComplex - xmlParseCommentComplex - xmlParsePI - xmlParseCDSect Parameter-entity references are ignored outside the DTD. - xmlLoadEntityContent This function is only called from xmlStringLenDecodeEntities and entities are replaced in a separate step immediately after the function call. This bug could also be triggered with an internal subset and double entity expansion. This fixes bug 766956 initially reported by Wei Lei and independently by Chromium's ClusterFuzz, Hanno Böck, and Marco Grassi. Thanks to everyone involved. xmlParseNameComplex with XML_PARSE_OLD10 ======================================== When parsing Names inside an expanded parameter entity with the XML_PARSE_OLD10 option, xmlParseNameComplex would call xmlGROW via the GROW macro if the input buffer was exhausted. At the end of the parameter entity's replacement text, this function would then call xmlPopInput which invalidated the input buffer. There should be no need to invoke GROW in this situation because the buffer is grown periodically every XML_PARSER_CHUNK_SIZE characters and, at least for UTF-8, in xmlCurrentChar. This also matches the code path executed when XML_PARSE_OLD10 is not set. This fixes bugs 781205 (CVE-2017-9049) and 781361 (CVE-2017-9050). Thanks to Marcel Böhme and Thuan Pham for the report. Additional hardening ==================== A separate check was added in xmlParseNameComplex to validate the buffer size.

7482f41f

2017-06-01T22:00:19

Check for integer overflow in xmlXPathFormatNumber Check for overflow before casting double to int. Found with afl-fuzz and UBSan.

f4029cd4

2016-04-21T16:37:26

Check XPath exponents for overflow Avoid undefined behavior and wrong results with huge exponents. Found with afl-fuzz and UBSan.

a58331a6

2017-05-29T21:02:21

Check for overflow in xmlXPathIsPositionalPredicate Avoid undefined behavior when casting from double to int. Found with afl-fuzz and UBSan.

a851868a

2017-05-29T20:14:42

Parse small XPath numbers more accurately Don't count leading zeros towards the fraction size limit. This allows to parse numbers like 0.0000000000000000000000000000000000000000000000000000000001 which is the only standard-conformant way to represent such numbers, as scientific notation isn't allowed in XPath 1.0. (It is allowed in XPath 2.0 and in libxml2 as an extension, though.) Overall accuracy is still bad, see bug 783238.

4bebb030

2016-04-21T13:41:09

Rework XPath rounding functions Use the C library's floor and ceil functions. The old code was overly complicated for no apparent reason and could result in undefined behavior when handling NaNs (found with afl-fuzz and UBSan). Fix wrong comment in xmlXPathRoundFunction. The implementation was already following the spec and rounding half up.

40f58521

2017-05-26T20:16:35

Fix axis traversal from attribute and namespace nodes When traversing the "preceding" axis from an attribute node, we must first go up to the attribute's containing element. Otherwise, text children of other attributes could be returned. This made it possible to hit a code path in xmlXPathNextAncestor which contained another bug: The attribute node was initialized with the context node instead of the current node. Normally, this code path is only hit via xmlXPathNextAncestorOrSelf in which case the current and context node are the same. The combination of the two bugs could result in an infinite loop, found with libFuzzer. Traversing the "following" and the "preceding" axis from namespace nodes should be handled similarly. This wasn't supported at all previously.

9ab01a27

2016-06-28T14:22:23

Fix XPointer paths beginning with range-to The old code would invoke the broken xmlXPtrRangeToFunction. range-to isn't really a function but a special kind of location step. Remove this function and always handle range-to in the XPath code. The old xmlXPtrRangeToFunction could also be abused to trigger a use-after-free error with the potential for remote code execution. Found with afl-fuzz. Fixes CVE-2016-5131.

d8083bf7

2016-06-25T12:35:50

Fix NULL pointer deref in XPointer range-to - Check for errors after evaluating first operand. - Add sanity check for empty stack. Found with afl-fuzz.

0bcd05c5

2016-03-01T15:18:04

Heap-based buffer overread in htmlCurrentChar For https://bugzilla.gnome.org/show_bug.cgi?id=758606 * parserInternals.c: (xmlNextChar): Add an test to catch other issues on ctxt->input corruption proactively. For non-UTF-8 charsets, xmlNextChar() failed to check for the end of the input buffer and would continuing reading. Fix this by pulling out the check for the end of the input buffer into common code, and return if we reach the end of the input buffer prematurely. * result/HTML/758606.html: Added. * result/HTML/758606.html.err: Added. * result/HTML/758606.html.sax: Added. * result/HTML/758606_2.html: Added. * result/HTML/758606_2.html.err: Added. * result/HTML/758606_2.html.sax: Added. * test/HTML/758606.html: Added test case. * test/HTML/758606_2.html: Added test case.

00906759

2016-01-26T16:57:03

Heap-based buffer-underreads due to xmlParseName For https://bugzilla.gnome.org/show_bug.cgi?id=759573 * parser.c: (xmlParseElementDecl): Return early on invalid input to fix non-minimized test case (759573-2.xml). Otherwise the parser gets into a bad state in SKIP(3) at the end of the function. (xmlParseConditionalSections): Halt parsing when hitting invalid input that would otherwise caused xmlParserHandlePEReference() to recurse unexpectedly. This fixes the minimized test case (759573.xml). * result/errors/759573-2.xml: Add. * result/errors/759573-2.xml.err: Add. * result/errors/759573-2.xml.str: Add. * result/errors/759573.xml: Add. * result/errors/759573.xml.err: Add. * result/errors/759573.xml.str: Add. * test/errors/759573-2.xml: Add. * test/errors/759573.xml: Add.

38eae571

2016-03-07T14:04:08

Heap use-after-free in xmlSAX2AttributeNs For https://bugzilla.gnome.org/show_bug.cgi?id=759020 * parser.c: (xmlParseStartTag2): Attribute strings are only valid if the base does not change, so add another check where the base may change. Make sure to set 'attvalue' to NULL after freeing it. * result/errors/759020.xml: Added. * result/errors/759020.xml.err: Added. * result/errors/759020.xml.str: Added. * test/errors/759020.xml: Added test case.

45752d2c

2016-03-03T11:50:34

Bug 759398: Heap use-after-free in xmlDictComputeFastKey <https://bugzilla.gnome.org/show_bug.cgi?id=759398> * parser.c: (xmlParseNCNameComplex): Store start position instead of a pointer to the name since the underlying buffer may change, resulting in a stale pointer being used. * result/errors/759398.xml: Added. * result/errors/759398.xml.err: Added. * result/errors/759398.xml.str: Added. * test/errors/759398.xml: Added test case.

a820dbea

2016-03-01T11:34:04

Bug 758605: Heap-based buffer overread in xmlDictAddString <https://bugzilla.gnome.org/show_bug.cgi?id=758605> Reviewed by David Kilzer. * HTMLparser.c: (htmlParseName): Add bounds check. (htmlParseNameComplex): Ditto. * result/HTML/758605.html: Added. * result/HTML/758605.html.err: Added. * result/HTML/758605.html.sax: Added. * runtest.c: (pushParseTest): The input for the new test case was so small (4 bytes) that htmlParseChunk() was never called after htmlCreatePushParserCtxt(), thereby creating a false positive test failure. Fixed by using a do-while loop so we always call htmlParseChunk() at least once. * test/HTML/758605.html: Added.

db07dd61

2016-02-12T09:58:29

Bug 758588: Heap-based buffer overread in xmlParserPrintFileContextInternal <https://bugzilla.gnome.org/show_bug.cgi?id=758588> * parser.c: (xmlParseEndTag2): Add bounds checks before dereferencing ctxt->input->cur past the end of the buffer, or incrementing the pointer past the end of the buffer. * result/errors/758588.xml: Add test result. * result/errors/758588.xml.err: Ditto. * result/errors/758588.xml.str: Ditto. * test/errors/758588.xml: Add regression test.

6eb0894a

2016-05-05T16:49:00

Fix memory leak with XPath namespace nodes Set hasNsNodes to 1 when adding namespace nodes via XP_TEST_HIT.

82b73039

2016-04-30T17:53:10

Fix namespace axis traversal When the namespace axis is traversed in "toBool" mode, the traversal can exit early, before visiting all nodes. In this case, the XPath context still contains a non-NULL tmpNsList. This means that - the check when to start a new traversal was wrong and - the tmpNsList could be leaked. Fixes bug #750037 and, by accident, bug #756075: https://bugzilla.gnome.org/show_bug.cgi?id=750037 https://bugzilla.gnome.org/show_bug.cgi?id=756075

839689a9

2016-04-27T18:00:12

Don't recurse into OP_VALUEs in xmlXPathOptimizeExpression The ch1 slot of OP_VALUEs contains an invalid value. Ignore it. Fixes bug #760325: https://bugzilla.gnome.org/show_bug.cgi?id=760325

f39fd66e

2016-04-27T03:01:16

Fix namespace::node() XPath expression Make sure that xmlXPathNodeSetAddNs is called for namespace nodes when matched with a namespace::node() step. This correctly sets the parent of namespace nodes. Note that xmlXPathNodeSetAddNs must only be called if working on the namespace axis. Otherwise, the context node is not the parent of the namespace node and the standard XP_TEST_HIT macro must be invoked. This explains the errors in the C14N tests that the old TODO comment mentioned.

e2893903

2016-04-21T19:19:23

Fix parsing of NCNames in XPath The NCName parser would allow any NameChar as start character. For example, the following XPath expressions would compile: self::-abc self::0abc self::.abc

cad102b8

2016-04-15T22:41:24

Do normalize string-based datatype value in RelaxNG facet checking Original patch is from Jan Pokorný <jpokorny redhat com> https://mail.gnome.org/archives/xml/2013-November/msg00028.html Improve it according to reviews and add test files.

4f8606c1

2016-01-05T13:38:09

Bug 760183: REGRESSION (v2.9.3): XML push parser fails with bogus UTF-8 encoding error when multi-byte character in large CDATA section is split across buffer <https://bugzilla.gnome.org/show_bug.cgi?id=760183> * parser.c: (xmlCheckCdataPush): Add 'complete' argument to describe whether the buffer passed in is the whole CDATA buffer, or if there is more data to parse. If there is more data to parse, don't return a negative value for an invalid multi-byte UTF-8 character that is split between buffers. (xmlParseTryOrFinish): Pass 'complete' argument to xmlCheckCdataPush() as appropriate. * result/cdata-2-byte-UTF-8.xml: Added. * result/cdata-2-byte-UTF-8.xml.rde: Added. * result/cdata-2-byte-UTF-8.xml.rdr: Added. * result/cdata-2-byte-UTF-8.xml.sax: Added. * result/cdata-2-byte-UTF-8.xml.sax2: Added. * result/cdata-3-byte-UTF-8.xml: Added. * result/cdata-3-byte-UTF-8.xml.rde: Added. * result/cdata-3-byte-UTF-8.xml.rdr: Added. * result/cdata-3-byte-UTF-8.xml.sax: Added. * result/cdata-3-byte-UTF-8.xml.sax2: Added. * result/cdata-4-byte-UTF-8.xml: Added. * result/cdata-4-byte-UTF-8.xml.rde: Added. * result/cdata-4-byte-UTF-8.xml.rdr: Added. * result/cdata-4-byte-UTF-8.xml.sax: Added. * result/cdata-4-byte-UTF-8.xml.sax2: Added. * result/noent/cdata-2-byte-UTF-8.xml: Added. * result/noent/cdata-3-byte-UTF-8.xml: Added. * result/noent/cdata-4-byte-UTF-8.xml: Added. * test/cdata-2-byte-UTF-8.xml: Added. * test/cdata-3-byte-UTF-8.xml: Added. * test/cdata-4-byte-UTF-8.xml: Added. - Add tests and results. Only 'make Readertests XMLPushtests' fails prior to the fix.

4a5d80ad

2015-09-18T15:06:46

Fix a bug in CData error handling in the push parser For https://bugzilla.gnome.org/show_bug.cgi?id=754947 The checking function was returning incorrect args in some cases Adds the test to teh reg suite and fix one of the existing test output

51f02b0a

2015-09-15T16:50:32

Fix a bug on name parsing at the end of current input buffer For https://bugzilla.gnome.org/show_bug.cgi?id=754946 When hitting the end of the current input buffer while parsing a name we could end up loosing the beginning of the name, which led to various issues.

ef709ce2

2015-09-10T19:41:41

Fix the spurious ID already defined error For https://bugzilla.gnome.org/show_bug.cgi?id=737840 the fix for 724903 introduced a regression on external entities carrying IDs, revert that patch in part and add a specific test to avoid readding it

2fab235d

2015-03-16T08:38:36

Fix support for except in nameclasses For https://bugzilla.gnome.org/show_bug.cgi?id=565219 The code was imply missing even if simple, added a few regression tests.

02b252d7

2015-03-08T17:00:37

Regression test for bug #695699

342658a1

2015-03-08T16:46:04

Add a couple of XPath tests

f6aaabce

2015-03-08T16:05:26

Allow attributes on descendant-or-self axis If the context node is an attribute, the attribute itself is on the descendant-or-self axis. The principal node type of this axis is element, so the only node test that can return the attribute is "node()". In other words, "@attr/descendant-or-self::node()" is equivalent to "@attr". This matches the behavior of Saxon-CE.

df23f584

2014-10-23T13:52:47

Adding example from bugs 738805 to regression tests For https://bugzilla.gnome.org/show_bug.cgi?id=738805 Tortuous test case provided by pierre.labastie@neuf.fr

6473a41a

2013-10-23T14:51:33

Implement choice for name classes on attributes https://bugzilla.gnome.org/show_bug.cgi?id=710744

dcc19503

2013-05-22T22:56:45

Fix a parsing bug on non-ascii element and CR/LF usage https://bugzilla.gnome.org/show_bug.cgi?id=698550 Somehow the behaviour of the internal parser routine changed slightly when encountering CR/LF, which led to a bug when parsing document with non-ascii Names

483272f3

2013-03-27T13:37:14

Added a regression tests from bug 694228 data Provided by Mark Rowe <mrowe@apple.com>

4629ee02

2012-07-23T14:15:40

Do not fetch external parsed entities Unless explicietely asked for when validating or replacing entities with their value. Problem pointed out by Tom Lane <tgl@redhat.com> * parser.c: do not load external parsed entities unless needed * test/errors/extparsedent.xml result/errors/extparsedent.xml*: add a regression test to avoid change of the behaviour in the future

a0cd075d

2012-05-11T19:31:12

HTML parser error with <noscript> in the <head> For https://bugzilla.gnome.org/show_bug.cgi?id=615785 When the <noscript> is found, <head> is closed and a <body> element is created. The real <body id="xxx"> gets skipped over, so I can't see any of the body's attributes. Just don't close <head> when encountering a <noscript> Add a regression test too

4609e6c9

2012-05-11T15:31:05

XSD: optional element in complex type extension For https://bugzilla.gnome.org/show_bug.cgi?id=609796 Libxml2 fails to validate an instance document against a schema if an element whose type is a complex extension of some base type with an optional child element and that child element is not specified in the instance document. For example, suppose I have some complex type BaseType that is defined to have one child element in a sequence group that has minOccurs set to 0

868d92da

2012-05-10T15:34:57

Add HTML parser support for HTML5 meta charset encoding declaration For https://bugzilla.gnome.org/show_bug.cgi?id=655218 http://www.w3.org/TR/2011/WD-html5-20110525/semantics.html#the-meta-element """ The charset attribute specifies the character encoding used by the document. This is a character encoding declaration. If the attribute is present in an XML document, its value must be an ASCII case-insensitive match for the string "UTF-8" (and the document is therefore forced to use UTF-8 as its encoding). """ However, while <meta http-equiv="Content-Type" content="text/html; charset=utf8"> works, <meta charset="utf8"> does not. While libxml2 HTML parser is not tuned for HTML5, this is a simple addition Also added a testcase

aa422d92

2009-09-24T11:31:48

595792 fixing a RelaxNG bug introduced in 2.7.4 * relaxng.c: refs definitions added from inported schemas should not be processed as refs from the main schemas * test/relaxng/595792* result/relaxng/595792*: add the test to the regression suite

9332b48f

2009-09-23T18:28:43

Fix a Relaxng bug raised by libvirt test suite * xmlregexp.c: other fixes in 2.7.4 raised this internal error when comparing ranges, this affects among others detection of the determinism * test/relaxng/libvirt* result/relaxng/libvirt*: add a test case based on libvirt schemas and tests

1ba2aca3

2009-08-31T16:47:39

492317 Fix Relax-NG validation problems * relaxng.c xmlregexp.c: a subtle problem when checking for compileable content model, if using the same elements in cases of choices. Handled by adding a special flag to the regexp compilation to detect transitions with different atoms using same strings. * test/relaxng/492317* result/relaxng/492317*: add the test to the regression suite

ec18c960

2009-08-26T18:37:43

558452 fight with reg test and error report * relaxng.c: tiny fix and provide more context on some errors * result/relaxng/558452_0* test/relaxng/558452*: add some regression tests for the bugs * Makefile.am runtest.c: fight with the fact streaming error messages can differ due to missing node context

4013e83e

2009-08-26T17:24:31

579746 XSD validation not correct / nilable groups * xmlschemas.c: when a particle need to be processed via counted transition, if the group is nillable, the counting won't work, so keep track of nillable subset as they are built and generate the appropriate epsilon transitions as needed * test/schemas/579746* result/schemas/579746*: add related test cases based on the bug report

a6c76a26

2009-08-26T14:37:00

566012 part 2 fix regresion tests and push mode * test/utf16bebom.xml: regression test showed that this test case was broken but previous behaviour would not detect it ! * parser.c: fix 566012 for the push mode of the parser, tricky ! * test/ebcdic_566012.xml result//ebcdic_566012.xml*: add the test to the regression suite

d80d0728

2009-08-22T18:56:01

559410 - Regexp bug on (...)? constructs * xmlregexp.c: fix a regexp bug on some (...)? constructs * test/schemas/nvdcve* result/schemas/nvdcve*: add the tests to the regression suite

a721612e

2009-08-21T18:22:58

446613 small validation bug mixed content with NS * valid.c: fix a bug when valdating mixed content lists and some name use namespaces prefixes. * result/valid/notes.xml* test/valid/dtds/notes.dtd * test/valid/notes.xml: add the test case to the regression suite

be390ed0

2009-08-12T12:40:39

Test case for 570702

edc68aad

2009-08-07T20:29:33

582906 XSD validating multiple imports of the same schema * xmlschemas.c: When validating a schema that includes the same file that has no targetNamespace defined an internal erro was thrown, depending on the orig namespace that should be allowed though * test/schemas/582906-* result/schemas/582906-*: 2 tests case, one where this is allowed, and one where this is forbidden

d9960720

2009-08-07T19:01:32

Bug 582887 – problems validating complex schemas * xmlschemas.c: fixes the problem faced when importing the same schemas multiple times but from different places which is allowed * test/schemas/582887* result/schemas/582887*: adding the specific test to the regressions

83868247

2009-07-09T10:26:22

Aleksey Sanin support for c14n 1.1 * c14n.c include/libxml/c14n.h: adds support for C14N 1.1, new flags at the API level * runtest.c Makefile.am testC14N.c xmllint.c: add support in CLI tools and test binaries * result/c14n/1-1-without-comments/* test/c14n/1-1-without-comments/*: add a new batch of tests

7f4547cd

2008-10-03T07:58:23

preparing the release of 2.7.2 fix the Solaris portability issue * configure.in doc/* NEWS: preparing the release of 2.7.2 * dict.c: fix the Solaris portability issue * parser.c: additional cleanup on #554660 fix * test/ent13 result/ent13* result/noent/ent13*: added the example in the regression test suite. * HTMLparser.c: handle leading BOM in htmlParseElement() Daniel svn path=/trunk/; revision=3799

a57ba4ce

2008-09-25T16:06:18

fix an HTML parsing error on large data sections reported by Mike Day add * HTMLparser.c: fix an HTML parsing error on large data sections reported by Mike Day * test/HTML/utf8bug.html result/HTML/utf8bug.html.err result/HTML/utf8bug.html.sax result/HTML/utf8bug.html: add the reproducer to the test suite daniel svn path=/trunk/; revision=3797

0161e638

2008-08-28T15:36:32

completely different fix for the recursion detection based on entity * parser.c include/libxml/parser.h: completely different fix for the recursion detection based on entity density, big cleanups in the entity parsing code too * result/*.sax*: the parser should not ask for used defined versions of the predefined entities * testrecurse.c: automatic test for entity recursion checks * Makefile.am: added testrecurse * test/recurse/lol* test/recurse/good*: a first set of tests for the recursion Daniel svn path=/trunk/; revision=3783

kc3-lang/libxml2/test

test

Log