kmx git

Commit	Date	Message
b90d8989	2017-09-19T15:45:35	Fix regression with librsvg Instead of using xmlCreateIOParserCtxt, librsvg pushes its own xmlParserInput on top of a memory push parser. This incorrect use of the API confuses several parser checks and, since 2.9.5, completely breaks documents with internal subsets. Work around the problem with internal subsets. Thanks to Petr Sumbera for the report: https://mail.gnome.org/archives/xml/2017-September/msg00011.html Also see https://bugzilla.gnome.org/show_bug.cgi?id=787895
abbda93c	2017-09-11T01:14:16	Handle more invalid entity values in recovery mode In attribute content, don't emit entity references if there are problems with the entity value. Otherwise some illegal entity values like <!ENTITY a '&#x123456789;'> would later cause problems like integer overflow. Make xmlStringLenDecodeEntities return NULL on more error conditions including invalid char refs and errors from recursive calls. Remove some fragile error checks based on lastError that shouldn't be needed now. Clear the entity content in xmlParseAttValueComplex if an error was found. Found by OSS-Fuzz. Should fix bug 783052. Also see https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3343
0fcab658	2017-09-07T18:25:11	Handle illegal entity values in recovery mode Make xmlParseEntityValue always return NULL on error. Otherwise some illegal entity values like <!ENTITY e '&%#4294967298;'> would later cause problems like integer overflow. Found by OSS-Fuzz. Should fix bug 783052. Also see https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=592 https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=2732
69936b12	2017-08-30T14:16:01	Revert "Print error messages for truncated UTF-8 sequences" This reverts commit 79c8a6b which caused a serious regression in streaming mode. Also reverts part of commit 52ceced "Fix infinite loops with push parser in recovery mode". Fixes bug 786554.
454e397e	2017-08-28T14:30:43	Porting libxml2 on zOS encoding of code First set of patches for zOS - entities.c parser.c tree.c xmlschemas.c xmlschemastypes.c xpath.c xpointer.c: ask conversion of code to ISO Latin 1 to avoid having the compiler assume EBCDIC codepoint for characters. - xmlmodule.c: make sure we have support for modules - xmlIO.c: zOS path names are special, avoid some of the expectations from Unix/Windows
899a5d9f	2017-07-25T14:59:49	Detect infinite recursion in parameter entities When expanding a parameter entity in a DTD, infinite recursion could lead to an infinite loop or memory exhaustion. Thanks to Wei Lei for the first of many reports. Fixes bug 759579.
52ceced6	2017-07-01T17:49:30	Fix infinite loops with push parser in recovery mode Make sure that the input pointer advances in case of errors. Otherwise, the push parser can loop infinitely. Found with libFuzzer.
3eef3f39	2017-06-20T16:13:57	Fix NULL deref in xmlParseExternalEntityPrivate If called from xmlParseExternalEntity, oldctxt is NULL which leads to a NULL deref if an error occurs. This only affects external code that calls xmlParseExternalEntity. Patch from David Kilzer with minor changes. Fixes bug 780159.
872fea94	2017-06-19T00:24:12	Get rid of "blanks wrapper" for parameter entities Now that replacement of parameter entities goes exclusively through xmlSkipBlankChars, we can account for the surrounding space characters there and remove the "blanks wrapper" hack.
d9e43c7d	2017-06-19T18:01:23	Make sure not to call IS_BLANK_CH when parsing the DTD This is required to get rid of the "blanks wrapper" hack. Checking the return value of xmlSkipBlankChars is more efficient, too.
453dff1e	2017-06-19T17:55:20	Remove unnecessary calls to xmlPopInput It's enough if xmlPopInput is called from xmlSkipBlankChars. Since the replacement text of a parameter entity is surrounded with space characters, that's the only place where the replacement can end in a well-formed document. This is also required to get rid of the "blanks wrapper" hack.
aa267cd1	2017-06-18T23:29:51	Simplify handling of parameter entity references There are only two places where parameter entity references must be handled. For the internal subset in xmlParseInternalSubset. For the external subset or content from other external PEs in xmlSkipBlankChars. Make sure that xmlSkipBlankChars skips over sequences of PEs and whitespace. Rely on xmlSkipBlankChars instead of calling xmlParsePEReference directly when in the external subset or a conditional section. xmlParserHandlePEReference is unused now.
24246c76	2017-06-20T12:56:36	Fix xmlHaltParser Pop all extra input streams before resetting the input. Otherwise, a call to xmlPopInput could make input available again. Also set input->end to input->cur. Changes the test output for some error tests. Unfortunately, some fuzzed test cases were added to the test suite without manual cleanup. This makes it almost impossible to review the impact of later changes on the test output.
8bbe4508	2017-06-17T16:15:09	Spelling and grammar fixes Fixes bug 743172, bug 743489, bug 769632, bug 782400 and a few other misspellings.
5f440d8c	2017-06-12T14:32:34	Rework entity boundary checks Make sure to finish all entities in the internal subset. Nevertheless, readd a sanity check in xmlParseStartTag2 that was lost in my previous commit. Also add a sanity check in xmlPopInput. Popping an input unexpectedly was the source of many recent memory bugs. The check doesn't mitigate such issues but helps with diagnosis. Always base entity boundary checks on the input ID, not the input pointer. The pointer could have been reallocated to the old address. Always throw a well-formedness error if a boundary check fails. In a few places, a validity error was thrown. Fix a few error codes and improve indentation.
46dc9890	2017-06-08T02:24:56	Don't switch encoding for internal parameter entities This is only needed for external entities. Trying to switch the encoding for internal entities could also cause a memory leak in recovery mode.
03904159	2017-06-05T21:16:00	Merge duplicate code paths handling PE references xmlParsePEReference is essentially a subset of xmlParserHandlePEReference, so make xmlParserHandlePEReference call xmlParsePEReference. The code paths in these functions differed slightly, but the code from xmlParserHandlePEReference seems more solid and tested.
3f0627a1	2017-06-16T21:30:42	Fix duplicate SAX callbacks for entity content Reset 'was_checked' to prevent entity from being parsed twice and SAX callbacks being invoked twice if XML_PARSE_NOENT was set. This regressed in version 2.9.3 and caused problems with WebKit. Fixes bug 760367.
fb2f518c	2017-06-10T17:06:16	Fix potential infinite loop in xmlStringLenDecodeEntities Make sure that xmlParseStringPEReference advances the "str" pointer even if the parser was stopped. Otherwise xmlStringLenDecodeEntities can loop infinitely.
4ba8cc85	2017-06-10T02:33:58	Remove useless check in xmlParseAttributeListDecl Since we already successfully parsed the attribute name and other items, it is guaranteed that we made progress in the input stream. Comparing the input pointer to a previous value also looks fragile to me. What if the input buffer was reallocated and the new "cur" pointer happens to be the same as the old one? There are a couple of similar checks which also take "consumed" into account. This seems to be safer but I'm not convinced that it couldn't lead to false alarms in rare situations.
bedbef80	2017-06-09T15:10:13	Fix memory leak in xmlParseEntityDecl error path When parsing the entity value, it can happen that an external entity with an unsupported encoding is loaded and the parser is stopped. This would lead to a memory leak. A custom SAX callback could also stop the parser. Found with libFuzzer and ASan.
030b1f7a	2017-06-06T15:53:42	Revert "Add an XML_PARSE_NOXXE flag to block all entities loading even local" This reverts commit 2304078555896cf1638c628f50326aeef6f0e0d0. The new flag doesn't work and the change even broke the XML_PARSE_NONET option.
e2663054	2017-06-05T15:37:17	Fix handling of parameter-entity references There were two bugs where parameter-entity references could lead to an unexpected change of the input buffer in xmlParseNameComplex and xmlDictLookup being called with an invalid pointer. Percent sign in DTD Names ========================= The NEXTL macro used to call xmlParserHandlePEReference. When parsing "complex" names inside the DTD, this could result in entity expansion which created a new input buffer. The fix is to simply remove the call to xmlParserHandlePEReference from the NEXTL macro. This is safe because no users of the macro require expansion of parameter entities. - xmlParseNameComplex - xmlParseNCNameComplex - xmlParseNmtoken The percent sign is not allowed in names, which are grammatical tokens. - xmlParseEntityValue Parameter-entity references in entity values are expanded but this happens in a separate step in this function. - xmlParseSystemLiteral Parameter-entity references are ignored in the system literal. - xmlParseAttValueComplex - xmlParseCharDataComplex - xmlParseCommentComplex - xmlParsePI - xmlParseCDSect Parameter-entity references are ignored outside the DTD. - xmlLoadEntityContent This function is only called from xmlStringLenDecodeEntities and entities are replaced in a separate step immediately after the function call. This bug could also be triggered with an internal subset and double entity expansion. This fixes bug 766956 initially reported by Wei Lei and independently by Chromium's ClusterFuzz, Hanno Böck, and Marco Grassi. Thanks to everyone involved. xmlParseNameComplex with XML_PARSE_OLD10 ======================================== When parsing Names inside an expanded parameter entity with the XML_PARSE_OLD10 option, xmlParseNameComplex would call xmlGROW via the GROW macro if the input buffer was exhausted. At the end of the parameter entity's replacement text, this function would then call xmlPopInput which invalidated the input buffer. There should be no need to invoke GROW in this situation because the buffer is grown periodically every XML_PARSER_CHUNK_SIZE characters and, at least for UTF-8, in xmlCurrentChar. This also matches the code path executed when XML_PARSE_OLD10 is not set. This fixes bugs 781205 (CVE-2017-9049) and 781361 (CVE-2017-9050). Thanks to Marcel Böhme and Thuan Pham for the report. Additional hardening ==================== A separate check was added in xmlParseNameComplex to validate the buffer size.
855c19ef	2017-06-01T01:04:08	Avoid reparsing in xmlParseStartTag2 The code in xmlParseStartTag2 must handle the case that the input buffer was grown and reallocated which can invalidate pointers to attribute values. Before, this was handled by detecting changes of the input buffer "base" pointer and, in case of a change, jumping back to the beginning of the function and reparsing the start tag. The major problem of this approach is that whether an input buffer is reallocated is nondeterministic, resulting in seemingly random test failures. See the mailing list thread "runtest mystery bug: name2.xml error case regression test" from 2012, for example. If a reallocation was detected, the code also made no attempts to continue parsing in case of errors which makes a difference in the lax "recover" mode. Now we store the current input buffer "base" pointer for each (not separately allocated) attribute in the namespace URI field, which isn't used until later. After the whole start tag was parsed, the pointers to the attribute values are reconstructed using the offset between the new and the old input buffer. This relies on arithmetic on dangling pointers which is technically undefined behavior. But it seems like the easiest and most efficient fix and a similar approach is used in xmlParserInputGrow. This changes the error output of several tests, typically making it more verbose because we try harder to continue parsing in case of errors. (Another possible solution is to check not only the "base" pointer but the size of the input buffer as well. But this would result in even more reparsing.)
07b7428b	2017-06-01T00:19:14	Simplify control flow in xmlParseStartTag2 Remove some goto labels and deduplicate a bit of code after handling namespaces. Before: loop { parseAttribute if (ok) { if (defaultNamespace) { handleDefaultNamespace if (error) goto skip_default_ns; handleDefaultNamespace skip_default_ns: freeAttr nextAttr continue; } if (namespace) { handleNamespace if (error) goto skip_ns; handleNamespace skip_ns: freeAttr nextAttr; continue; } handleAttr } else { freeAttr } nextAttr } After: loop { parseAttribute if (!ok) goto next_attr; if (defaultNamespace) { handleDefaultNamespace if (error) goto next_attr; handleDefaultNamespace } else if (namespace) { handleNamespace if (error) goto next_attr; handleNamespace } else { handleAttr } next_attr: freeAttr nextAttr }
47496724	2017-05-31T16:46:39	Avoid spurious UBSan errors in parser.c If available, use a C99 flexible array member to avoid spurious UBSan errors.
8627e4ed	2017-05-23T18:11:08	Fix memory leak in parser error path Triggered in mixed content ELEMENT declarations if there's an invalid name after the first valid name: <!ELEMENT para (#PCDATA|a|<invalid>)*> Found with libFuzzer and ASan.
90ccb582	2017-04-07T17:43:02	Prevent unwanted external entity reference For https://bugzilla.gnome.org/show_bug.cgi?id=780691 * parser.c: add a specific check to avoid PE reference
23040785	2017-04-07T16:45:56	Add an XML_PARSE_NOXXE flag to block all entities loading even local For https://bugzilla.gnome.org/show_bug.cgi?id=772726 * include/libxml/parser.h: Add a new parser flag XML_PARSE_NOXXE * elfgcchack.h, xmlIO.h, xmlIO.c: associated loading routine * include/libxml/xmlerror.h: new error raised * xmllint.c: adds --noxxe flag to activate the option
bdd66182	2016-05-23T12:27:58	Avoid building recursive entities For https://bugzilla.gnome.org/show_bug.cgi?id=762100 When we detect a recursive entity we should really not build the associated data, moreover if someone bypasses libxml2 fatal errors and still tries to serialize a broken entity make sure we don't risk getting into a recursion * parser.c: xmlParserEntityCheck() don't build if entity loop were found and remove the associated text content * tree.c: xmlStringGetNodeList() avoid a potential recursion
00906759	2016-01-26T16:57:03	Heap-based buffer-underreads due to xmlParseName For https://bugzilla.gnome.org/show_bug.cgi?id=759573 * parser.c: (xmlParseElementDecl): Return early on invalid input to fix non-minimized test case (759573-2.xml). Otherwise the parser gets into a bad state in SKIP(3) at the end of the function. (xmlParseConditionalSections): Halt parsing when hitting invalid input that would otherwise caused xmlParserHandlePEReference() to recurse unexpectedly. This fixes the minimized test case (759573.xml). * result/errors/759573-2.xml: Add. * result/errors/759573-2.xml.err: Add. * result/errors/759573-2.xml.str: Add. * result/errors/759573.xml: Add. * result/errors/759573.xml.err: Add. * result/errors/759573.xml.str: Add. * test/errors/759573-2.xml: Add. * test/errors/759573.xml: Add.
38eae571	2016-03-07T14:04:08	Heap use-after-free in xmlSAX2AttributeNs For https://bugzilla.gnome.org/show_bug.cgi?id=759020 * parser.c: (xmlParseStartTag2): Attribute strings are only valid if the base does not change, so add another check where the base may change. Make sure to set 'attvalue' to NULL after freeing it. * result/errors/759020.xml: Added. * result/errors/759020.xml.err: Added. * result/errors/759020.xml.str: Added. * test/errors/759020.xml: Added test case.
4472c3a5	2016-05-13T15:13:17	Fix some format string warnings with possible format string vulnerability For https://bugzilla.gnome.org/show_bug.cgi?id=761029 Decorate every method in libxml2 with the appropriate LIBXML_ATTR_FORMAT(fmt,args) macro and add some cleanups following the reports.
b1d34de4	2016-03-14T17:19:44	Fix inappropriate fetch of entities content For https://bugzilla.gnome.org/show_bug.cgi?id=761430 libfuzzer regression testing exposed another case where the parser would fetch content of an external entity while not in validating mode. Plug that hole
45752d2c	2016-03-03T11:50:34	Bug 759398: Heap use-after-free in xmlDictComputeFastKey <https://bugzilla.gnome.org/show_bug.cgi?id=759398> * parser.c: (xmlParseNCNameComplex): Store start position instead of a pointer to the name since the underlying buffer may change, resulting in a stale pointer being used. * result/errors/759398.xml: Added. * result/errors/759398.xml.err: Added. * result/errors/759398.xml.str: Added. * test/errors/759398.xml: Added test case.
db07dd61	2016-02-12T09:58:29	Bug 758588: Heap-based buffer overread in xmlParserPrintFileContextInternal <https://bugzilla.gnome.org/show_bug.cgi?id=758588> * parser.c: (xmlParseEndTag2): Add bounds checks before dereferencing ctxt->input->cur past the end of the buffer, or incrementing the pointer past the end of the buffer. * result/errors/758588.xml: Add test result. * result/errors/758588.xml.err: Ditto. * result/errors/758588.xml.str: Ditto. * test/errors/758588.xml: Add regression test.
8f30bdff	2016-04-15T11:56:55	Add missing increments of recursion depth counter to XML parser. For https://bugzilla.gnome.org/show_bug.cgi?id=765207 CVE-2016-3705 The functions xmlParserEntityCheck() and xmlParseAttValueComplex() used to call xmlStringDecodeEntities() in a recursive context without incrementing the 'depth' counter in the parser context. Because of that omission, the parser failed to detect attribute recursions in certain documents before running out of stack space.
bb654feb	2016-04-13T16:56:07	Fix typos: dictio{ nn -> n }ar{y,ies} Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
4f8606c1	2016-01-05T13:38:09	Bug 760183: REGRESSION (v2.9.3): XML push parser fails with bogus UTF-8 encoding error when multi-byte character in large CDATA section is split across buffer <https://bugzilla.gnome.org/show_bug.cgi?id=760183> * parser.c: (xmlCheckCdataPush): Add 'complete' argument to describe whether the buffer passed in is the whole CDATA buffer, or if there is more data to parse. If there is more data to parse, don't return a negative value for an invalid multi-byte UTF-8 character that is split between buffers. (xmlParseTryOrFinish): Pass 'complete' argument to xmlCheckCdataPush() as appropriate. * result/cdata-2-byte-UTF-8.xml: Added. * result/cdata-2-byte-UTF-8.xml.rde: Added. * result/cdata-2-byte-UTF-8.xml.rdr: Added. * result/cdata-2-byte-UTF-8.xml.sax: Added. * result/cdata-2-byte-UTF-8.xml.sax2: Added. * result/cdata-3-byte-UTF-8.xml: Added. * result/cdata-3-byte-UTF-8.xml.rde: Added. * result/cdata-3-byte-UTF-8.xml.rdr: Added. * result/cdata-3-byte-UTF-8.xml.sax: Added. * result/cdata-3-byte-UTF-8.xml.sax2: Added. * result/cdata-4-byte-UTF-8.xml: Added. * result/cdata-4-byte-UTF-8.xml.rde: Added. * result/cdata-4-byte-UTF-8.xml.rdr: Added. * result/cdata-4-byte-UTF-8.xml.sax: Added. * result/cdata-4-byte-UTF-8.xml.sax2: Added. * result/noent/cdata-2-byte-UTF-8.xml: Added. * result/noent/cdata-3-byte-UTF-8.xml: Added. * result/noent/cdata-4-byte-UTF-8.xml: Added. * test/cdata-2-byte-UTF-8.xml: Added. * test/cdata-3-byte-UTF-8.xml: Added. * test/cdata-4-byte-UTF-8.xml: Added. - Add tests and results. Only 'make Readertests XMLPushtests' fails prior to the fix.
a7a94612	2016-02-09T12:55:29	Heap-based buffer overread in xmlNextChar For https://bugzilla.gnome.org/show_bug.cgi?id=759671 when the end of the internal subset isn't properly detected xmlParseInternalSubset should just return instead of trying to process input further.
f1063fdb	2015-11-20T16:06:59	CVE-2015-7500 Fix memory access error due to incorrect entities boundaries For https://bugzilla.gnome.org/show_bug.cgi?id=756525 handle properly the case where we popped out of the current entity while processing a start tag Reported by Kostya Serebryany @ Google This slightly modifies the output of 754946 in regression tests
3bd6ae14	2015-11-20T15:06:02	Fix some loop issues embedding NEXT Next can switch the parser back to XML_PARSER_EOF state, we need to consider those in loops consuming input
35bcb1d7	2015-11-20T15:04:09	Detect incoherency on GROW the current pointer to the input has to be between the base and end if not stop everything we have an internal state error.
e3b15974	2015-11-20T14:59:30	Reuse xmlHaltParser() where it makes sense Unify the various place where either xmlStopParser was called (which resets the error as a side effect) and places where we used ctxt->instate = XML_PARSER_EOF to stop further processing
28cd9cb7	2015-11-20T14:55:30	Add xmlHaltParser() to stop the parser The problem is doing it in a consistent and safe fashion It's more complex than just setting ctxt->instate = XML_PARSER_EOF Update the public function to reuse that new internal routine
69030714	2015-11-20T11:13:45	CVE-2015-5312 Another entity expansion issue For https://bugzilla.gnome.org/show_bug.cgi?id=756733 It is one case where the code in place to detect entities expansions failed to exit when the situation was detected, leading to DoS Problem reported by Kostya Serebryany @ Google Patch provided by David Drysdale @ Google
53ac9c96	2015-11-09T18:16:00	xmlStopParser reset errNo I had used it in contexts where that information ought to be preserved
afd27c21	2015-11-09T18:07:18	Avoid processing entities after encoding conversion failures For https://bugzilla.gnome.org/show_bug.cgi?id=756527 and was also raised by Chromium team in the past When we hit a conversion failure when switching encoding it is better to stop parsing there, this was treated as a fatal error but the parser was continuing to process to extract more errors, unfortunately that makes little sense as the data is obviously corrupt and can potentially lead to unexpected behaviour.
ab2b9a93	2015-11-03T20:40:49	Avoid extra processing of MarkupDecl when EOF For https://bugzilla.gnome.org/show_bug.cgi?id=756263 One place where ctxt->instate == XML_PARSER_EOF which was set up by entity detection issues doesn't get noticed, and even overridden
41ac9049	2015-10-27T10:53:44	Fix an error in previous Conditional section patch an off by one mistake in the change, led to error on correct document where the end of the included entity was exactly the end of the conditional section, leading to regtest failure
bd0526e6	2015-10-23T19:02:28	Another variation of overflow in Conditional sections Which happen after the previous fix to https://bugzilla.gnome.org/show_bug.cgi?id=756456 But stopping the parser and exiting we didn't pop the intermediary entities and doing the SKIP there applies on an input which may be too small
cf77e605	2015-09-30T14:46:29	Add missing Null check in xmlParseExternalEntityPrivate For https://bugzilla.gnome.org/show_bug.cgi?id=755857 a case where we check for NULL but not everywhere
4a5d80ad	2015-09-18T15:06:46	Fix a bug in CData error handling in the push parser For https://bugzilla.gnome.org/show_bug.cgi?id=754947 The checking function was returning incorrect args in some cases Adds the test to the reg suite and fix one of the existing test output
51f02b0a	2015-09-15T16:50:32	Fix a bug on name parsing at the end of current input buffer For https://bugzilla.gnome.org/show_bug.cgi?id=754946 When hitting the end of the current input buffer while parsing a name we could end up losing the beginning of the name, which led to various issues.
709a9521	2015-06-29T16:10:26	Fail parsing early on if encoding conversion failed For https://bugzilla.gnome.org/show_bug.cgi?id=751631 If we fail converting the current input stream while processing the encoding declaration of the XMLDecl then it's safer to just abort there and not try to report further errors.
9aa37588	2015-06-29T09:08:25	Do not process encoding values if the declaration is broken For https://bugzilla.gnome.org/show_bug.cgi?id=751603 If the string is not properly terminated do not try to convert to the given encoding.
9b851233	2015-02-23T11:29:20	Cleanup conditional section error handling For https://bugzilla.gnome.org/show_bug.cgi?id=744980 The error handling of Conditional Section also need to be straightened as the structure of the document can't be guessed on a failure there and it's better to stop parsing as further errors are likely to be irrelevant.
a7dfab74	2015-02-23T11:17:35	Stop parsing on entities boundaries errors For https://bugzilla.gnome.org/show_bug.cgi?id=744980 There are times, like on unterminated entities that it's preferable to stop parsing, even if that means less error reporting. Entities are feeding the parser on further processing, and if they are ill defined then it's possible to get the parser to bug. Also do the same on Conditional Sections if the input is broken, as the structure of the document can't be guessed.
72a46a51	2014-10-23T11:35:36	Fix missing entities after CVE-2014-3660 fix For https://bugzilla.gnome.org/show_bug.cgi?id=738805 The fix for CVE-2014-3660 introduced a regression in some case where entity substitution is required and the entity is used first in another entity referenced from an attribute value
f65128f3	2014-10-17T17:13:41	Revert "Missing initialization for the catalog module" This reverts commit 054c716ea1bf001544127a4ab4f4346d1b9947e7. As this break xmlcatalog command https://bugzilla.redhat.com/show_bug.cgi?id=1153753
be2a7eda	2014-10-16T13:59:47	Fix for CVE-2014-3660 Issues related to the billion laugh entity expansion which happened to escape the initial set of fixes
500c54ef	2014-10-16T12:17:20	fix memory leak xml header encoding field with XML_PARSE_IGNORE_ENC When the xml parser encounters an xml encoding in an xml header while configured with option XML_PARSE_IGNORE_ENC, it fails to free memory allocated for storing the encoding. The patch below fixes this. How to reproduce: 1. Change doc/examples/parse4.c to add xmlCtxtUseOptions(ctxt, XML_PARSE_IGNORE_ENC); after the call to xmlCreatePushParserCtxt. 2. Rebuild 3. run the following command from the top libxml2 directory: LD_LIBRARY_PATH=.libs/ valgrind --leak-check=full ./doc/examples/.libs/parse4 ./test.xml , where test.xml contains following input: <?xml version="1.0" encoding="UTF-81" ?><hi/> valgrind will report: ==1964== 10 bytes in 1 blocks are definitely lost in loss record 1 of 1 ==1964== at 0x4C272DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==1964== by 0x4E88497: xmlParseEncName (parser.c:10224) ==1964== by 0x4E888FE: xmlParseEncodingDecl (parser.c:10295) ==1964== by 0x4E89630: xmlParseXMLDecl (parser.c:10534) ==1964== by 0x4E8B737: xmlParseTryOrFinish (parser.c:11293) ==1964== by 0x4E8E775: xmlParseChunk (parser.c:12283) Signed-off-by: Bart De Schuymer <bart at amplidata com>
7cf57380	2014-10-08T16:09:56	Parser error on repeated recursive entity expansion containing < For https://bugzilla.gnome.org/show_bug.cgi?id=736417 basically a weird side effect and a failure to properly parenthesize a boolean expression led to this bug
7e9bbdf8	2014-10-06T20:34:14	parser bug on misformed namespace attributes For https://bugzilla.gnome.org/show_bug.cgi?id=672539 Reported by Axel Miller <axel.miller@ppi.de> Consider the following start-tag: <x xmlns=""version=""> The start-tag does not conform to the rule [40] STag ::= '<' Name (S Attribute)* S? '>' since there is no whitespace in front of the attribute "version". Thus, libxml2 should reject the start-tag. But it doesn't: $ echo '<x xmlns=""version=""/>' | xmllint - <?xml version="1.0"?> <x xmlns="" version=""/> The error seems to happen only if there is a namespace declaration in front of the attribute. A missing whitespace between other attributes is handled correctly: $ echo '<x someattr=""version=""/>' | xmllint - -:1: parser error : attributes construct error <x someattr=""version=""/> ^ [...]
24fb4c32	2014-10-06T18:19:12	wrong error column in structured error when parsing end tag For https://bugzilla.gnome.org/show_bug.cgi?id=734283 libxml2 reports wrong error column numbers (field int2 in xmlError) in structured error handler, after parsing an end tag.
33f658c9	2014-08-07T17:30:36	wrong error column in structured error when parsing attribute values For https://bugzilla.gnome.org/show_bug.cgi?id=734280 libxml2 reports wrong error column numbers (field int2 in xmlError) in structured error handler, after parsing XML attribute values. Example XML: <?xml version="1.0" encoding="UTF-8"?> <root xmlns="urn:colbug">&</root> <!-- 1 2 3 4 1234567890123456789012345678901234567890 --> Expected location of the error would be line 3, column 21. The actual location of the error is line 3, column 9: $ ./xmlparse colbug2.xml colbug2.xml:3:9: xmlParseEntityRef: no name The 12 characters of the xmlns attribute value "urn:colbug" are not accounted for in the error column value.
5d4310af	2014-08-07T16:28:09	wrong error column in structured error when skipping whitespace in xml decl For https://bugzilla.gnome.org/show_bug.cgi?id=734276 libxml2 reports wrong error column numbers (field int2 in xmlError) in structured error handler, after an XML declaration containing whitespace. Example XML: <?xml version="1.0" encoding="UTF-8" ?><root>&</root> <!-- 1 2 3 4 5 6 123456789012345678901234567890123456789012345678901234567890 --> Expected location of the error would be line 1, column 53. The actual location of the error is line 1, column 44: $ ./xmlparse colbug1.xml colbug1.xml:1:44: xmlParseEntityRef: no name
2f9b126a	2014-07-26T20:29:36	typo in error messages "colon are forbidden from..." For https://bugzilla.gnome.org/show_bug.cgi?id=731511 Pointed out by Vincent Lefevre
c836ba66	2014-07-14T16:39:50	Fix a potential NULL dereference For https://bugzilla.gnome.org/show_bug.cgi?id=733040 xmlDictLookup() may return NULL in case of allocation error, though very unlikely it need to be checked.
dd8367da	2014-06-11T16:54:32	Fix regressions introduced by CVE-2014-0191 patch A number of issues have been raised after the fix, and this patch tries to correct all of them, though most were related to postvalidation. https://bugzilla.gnome.org/show_bug.cgi?id=730290 and other reports on list, off-list and on Red Hat bugzilla
9cd1c3cf	2014-04-22T15:30:56	Do not fetch external parameter entities Unless explicitly asked for when validating or replacing entities with their value. Problem pointed out by Daniel Berrange <berrange@redhat.com>
6faa126f	2014-03-21T17:05:51	Fix xmlParseInNodeContext() if node is not element We really need to have ctxt->instate == XML_PARSER_CONTENT when jumping in content parsing Bug reported by Frank Gross
190a0b89	2014-02-06T10:58:17	Fix a portability issue on Windows Apparently an overflow when comparing macro and unsigned long
054c716e	2014-01-26T15:02:25	Missing initialization for the catalog module
4e1476c5	2013-12-09T15:23:40	adding init calls to xml and html Read parsing entry points As pointed out by "Tassyns, Bram <BramT@enfocus.com>" on the list some call had it other didn't, clean it up and add to all missing ones
9a85d40c	2013-11-29T23:26:25	Fix incorrect spelling entites->entities Partially, a follow-up of 81d7a8245cf9a31a49499a5a195c2b89e6f91180. Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
dcc19503	2013-05-22T22:56:45	Fix a parsing bug on non-ascii element and CR/LF usage https://bugzilla.gnome.org/show_bug.cgi?id=698550 Somehow the behaviour of the internal parser routine changed slightly when encountering CR/LF, which led to a bug when parsing document with non-ascii Names
63588f47	2013-05-10T14:01:46	Fix a regression in xmlGetDocCompressMode() The switch to xzlib had for consequence that the compression level of the input was not gathered anymore in ctxt->input->buf, then the parser compression flags was left to -1 and propagated to the resulting document. Fix the I/O layer to get compression detection in xzlib, then carry it in the input buffer and the resulting document This should fix https://lsbbugs.linuxfoundation.org/show_bug.cgi?id=3456
d4a5d981	2013-04-30T17:45:36	Cast encoding name to char pointer to match arg type
704d8c5e	2013-04-23T13:02:11	Fix an error in xmlCleanupParser https://bugzilla.gnome.org/show_bug.cgi?id=698582 xmlCleanupParser calls xmlCleanupGlobals() and then xmlResetLastError(), but the latter reallocates the global data freed by the previous call. Just swap the two calls.
9ca816b3	2013-04-16T22:00:13	Fix a couple of return without value Error introduced in previous commit !
e50ba816	2013-04-11T15:54:51	Improve handling of xmlStopParser() Add a specific parser error Try to stop parsing as quickly as possible
cff2546f	2013-03-11T15:57:55	Cache presence of '<' in entities content slightly modify how ent->checked is used, and use the lowest bit to keep the information
a3f1e3e5	2013-03-11T13:57:53	Avoid extra processing on entities If an entity has already been checked for correctness no need to check it on every reference
23f05e0c	2013-02-19T10:21:49	Detect excessive entities expansion upon replacement If entities expansion in the XML parser is asked for, it is possible to craft a relatively small input document leading to excessive on-the-fly content generation. This patch accounts for those replacements and stops parsing after a given threshold. It can be bypassed as usual with the HUGE parser option.
bf058dce	2013-02-13T18:19:42	Fix the flushing out of raw buffers on encoding conversions https://bugzilla.gnome.org/show_bug.cgi?id=692915 the new set of converting functions tried to limit the encoding conversion of the raw buffer to the consumption one to work in a more progressive fashion. Unfortunately this was bad for performances and led to errors on progressive parsing when a very large chunk was close to the end of the document. Fix the new internal function and switch back to the old way of converting. Fix another bug in the process.
de0cc20c	2013-02-12T16:55:34	Fix some buffer conversion issues https://bugzilla.gnome.org/show_bug.cgi?id=690202 Buffer overflow errors originating from xmlBufGetInputBase in 2.9.0 The pointers from the context input were not properly reset after that call which can do reallocations.
9c8eaabe	2013-01-04T12:41:53	Fix compiler warning after 153cf15905cf4ec080612ada6703757d10caba1e Add missing cast for xmlNop to silence a compiler warning.
cf8f0424	2012-12-21T11:13:31	Fix an error in the progressive DTD parsing code For https://bugzilla.gnome.org/show_bug.cgi?id=689958 We were looking for the wrong character in the input stream
fb27e2cd	2012-09-28T08:59:33	Fix spelling of "length".
6a36fbe3	2012-10-29T10:39:55	Fix potential out of bound access
153cf159	2012-10-26T13:50:47	Fix large parse of file from memory https://bugzilla.redhat.com/show_bug.cgi?id=862969 The new code trying to detect excessive input lookup would just get wrong sometimes in the case of very large file parsed directly from memory.
711b15d5	2012-10-25T19:23:26	Fix a bug in the nsclean option of the parser Raised as a side effect of: https://bugzilla.gnome.org/show_bug.cgi?id=663844
6c91aa38	2012-10-25T15:33:59	Fix a regression in 2.9.0 breaking validation while streaming https://bugzilla.gnome.org/show_bug.cgi?id=684774 with help from Kjell Ahlstedt <kjell.ahlstedt@bredband.net>
81d7a824	2012-09-13T15:56:51	Fix typos in parser comments Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
f8e3db04	2012-09-11T13:26:36	Big space and tab cleanup Remove all space before tabs and space and tabs at end of lines.
28f5e1a2	2012-09-04T11:18:39	Fix potential crash on entities errors Related to https://bugs.launchpad.net/lxml/+bug/502959 Basically the core of the issue is that if an entity references another entity, then in case we are replacing entities content, we should always do so by copying the referenced content as long as the reference is done within the entity. Otherwise, if for some reason there is a later parsing error that entity content may be freed. Complex scenario exposed by command: thinkpad:~/XML/diveintopython-5.4/xml -> valgrind --db-attach=yes ../../xmllint --loaddtd --noout --noent diveintopython.xml Document references &a; a references &b; we reference b content directly by linking it in the a content a has an error further down we free a, freeing the chunk from b Document references &b; after &a; we try to copy b content, but it was freed already => segfault * parser.c: never reference directly entity content without copying if we aren't in the document main entity
1f972e9f	2012-08-15T10:16:37	Cleanup some of the parser code Prefetching assumptions about the amount of data read in GROW should be backed up with test for 0 termination when at the end of the buffer.
968a03a2	2012-08-13T12:41:33	Add support for big line numbers in error reporting Fix the lack of line number as reported by Johan Corveleyn <jcorvel@gmail.com> * parser.c include/libxml/parser.h: add an XML_PARSE_BIG_LINES parser option not switched on by default, it's an opt-in * SAX2.c: if XML_PARSE_BIG_LINES is set store the long line numbers in the psvi field of text nodes * tree.c: expand xmlGetLineNo to extract that information, also make sure we can't fail on recursive behaviour * error.c: in __xmlRaiseError, if a node is provided, call xmlGetLineNo() if we can't get a valid line number. * xmllint.c: switch on XML_PARSE_BIG_LINES in xmllint
5353bbf7	2012-08-03T12:03:31	More fixups on the push parser behaviour

899a5d9f

2017-07-25T14:59:49

Detect infinite recursion in parameter entities When expanding a parameter entity in a DTD, infinite recursion could lead to an infinite loop or memory exhaustion. Thanks to Wei Lei for the first of many reports. Fixes bug 759579.

52ceced6

2017-07-01T17:49:30

Fix infinite loops with push parser in recovery mode Make sure that the input pointer advances in case of errors. Otherwise, the push parser can loop infinitely. Found with libFuzzer.

3eef3f39

2017-06-20T16:13:57

Fix NULL deref in xmlParseExternalEntityPrivate If called from xmlParseExternalEntity, oldctxt is NULL which leads to a NULL deref if an error occurs. This only affects external code that calls xmlParseExternalEntity. Patch from David Kilzer with minor changes. Fixes bug 780159.

872fea94

2017-06-19T00:24:12

Get rid of "blanks wrapper" for parameter entities Now that replacement of parameter entities goes exclusively through xmlSkipBlankChars, we can account for the surrounding space characters there and remove the "blanks wrapper" hack.

d9e43c7d

2017-06-19T18:01:23

Make sure not to call IS_BLANK_CH when parsing the DTD This is required to get rid of the "blanks wrapper" hack. Checking the return value of xmlSkipBlankChars is more efficient, too.

453dff1e

2017-06-19T17:55:20

Remove unnecessary calls to xmlPopInput It's enough if xmlPopInput is called from xmlSkipBlankChars. Since the replacement text of a parameter entity is surrounded with space characters, that's the only place where the replacement can end in a well-formed document. This is also required to get rid of the "blanks wrapper" hack.

aa267cd1

2017-06-18T23:29:51

Simplify handling of parameter entity references There are only two places where parameter entity references must be handled. For the internal subset in xmlParseInternalSubset. For the external subset or content from other external PEs in xmlSkipBlankChars. Make sure that xmlSkipBlankChars skips over sequences of PEs and whitespace. Rely on xmlSkipBlankChars instead of calling xmlParsePEReference directly when in the external subset or a conditional section. xmlParserHandlePEReference is unused now.

24246c76

2017-06-20T12:56:36

Fix xmlHaltParser Pop all extra input streams before resetting the input. Otherwise, a call to xmlPopInput could make input available again. Also set input->end to input->cur. Changes the test output for some error tests. Unfortunately, some fuzzed test cases were added to the test suite without manual cleanup. This makes it almost impossible to review the impact of later changes on the test output.

8bbe4508

2017-06-17T16:15:09

Spelling and grammar fixes Fixes bug 743172, bug 743489, bug 769632, bug 782400 and a few other misspellings.

5f440d8c

2017-06-12T14:32:34

Rework entity boundary checks Make sure to finish all entities in the internal subset. Nevertheless, readd a sanity check in xmlParseStartTag2 that was lost in my previous commit. Also add a sanity check in xmlPopInput. Popping an input unexpectedly was the source of many recent memory bugs. The check doesn't mitigate such issues but helps with diagnosis. Always base entity boundary checks on the input ID, not the input pointer. The pointer could have been reallocated to the old address. Always throw a well-formedness error if a boundary check fails. In a few places, a validity error was thrown. Fix a few error codes and improve indentation.

46dc9890

2017-06-08T02:24:56

Don't switch encoding for internal parameter entities This is only needed for external entities. Trying to switch the encoding for internal entities could also cause a memory leak in recovery mode.

03904159

2017-06-05T21:16:00

Merge duplicate code paths handling PE references xmlParsePEReference is essentially a subset of xmlParserHandlePEReference, so make xmlParserHandlePEReference call xmlParsePEReference. The code paths in these functions differed slightly, but the code from xmlParserHandlePEReference seems more solid and tested.

3f0627a1

2017-06-16T21:30:42

Fix duplicate SAX callbacks for entity content Reset 'was_checked' to prevent entity from being parsed twice and SAX callbacks being invoked twice if XML_PARSE_NOENT was set. This regressed in version 2.9.3 and caused problems with WebKit. Fixes bug 760367.

fb2f518c

2017-06-10T17:06:16

Fix potential infinite loop in xmlStringLenDecodeEntities Make sure that xmlParseStringPEReference advances the "str" pointer even if the parser was stopped. Otherwise xmlStringLenDecodeEntities can loop infinitely.

4ba8cc85

2017-06-10T02:33:58

Remove useless check in xmlParseAttributeListDecl Since we already successfully parsed the attribute name and other items, it is guaranteed that we made progress in the input stream. Comparing the input pointer to a previous value also looks fragile to me. What if the input buffer was reallocated and the new "cur" pointer happens to be the same as the old one? There are a couple of similar checks which also take "consumed" into account. This seems to be safer but I'm not convinced that it couldn't lead to false alarms in rare situations.

bedbef80

2017-06-09T15:10:13

Fix memory leak in xmlParseEntityDecl error path When parsing the entity value, it can happen that an external entity with an unsupported encoding is loaded and the parser is stopped. This would lead to a memory leak. A custom SAX callback could also stop the parser. Found with libFuzzer and ASan.

030b1f7a

2017-06-06T15:53:42

Revert "Add an XML_PARSE_NOXXE flag to block all entities loading even local" This reverts commit 2304078555896cf1638c628f50326aeef6f0e0d0. The new flag doesn't work and the change even broke the XML_PARSE_NONET option.

e2663054

2017-06-05T15:37:17

Fix handling of parameter-entity references There were two bugs where parameter-entity references could lead to an unexpected change of the input buffer in xmlParseNameComplex and xmlDictLookup being called with an invalid pointer.
Percent sign in DTD Names
=========================
The NEXTL macro used to call xmlParserHandlePEReference. When parsing "complex" names inside the DTD, this could result in entity expansion which created a new input buffer. The fix is to simply remove the call to xmlParserHandlePEReference from the NEXTL macro. This is safe because no users of the macro require expansion of parameter entities.
- xmlParseNameComplex
- xmlParseNCNameComplex
- xmlParseNmtoken
  The percent sign is not allowed in names, which are grammatical tokens.
- xmlParseEntityValue
  Parameter-entity references in entity values are expanded but this happens in a separate step in this function.
- xmlParseSystemLiteral
  Parameter-entity references are ignored in the system literal.
- xmlParseAttValueComplex
- xmlParseCharDataComplex
- xmlParseCommentComplex
- xmlParsePI
- xmlParseCDSect
  Parameter-entity references are ignored outside the DTD.
- xmlLoadEntityContent
  This function is only called from xmlStringLenDecodeEntities and entities are replaced in a separate step immediately after the function call.
This bug could also be triggered with an internal subset and double entity expansion.
This fixes bug 766956 initially reported by Wei Lei and independently by Chromium's ClusterFuzz, Hanno Böck, and Marco Grassi. Thanks to everyone involved.
xmlParseNameComplex with XML_PARSE_OLD10
========================================
When parsing Names inside an expanded parameter entity with the XML_PARSE_OLD10 option, xmlParseNameComplex would call xmlGROW via the GROW macro if the input buffer was exhausted. At the end of the parameter entity's replacement text, this function would then call xmlPopInput which invalidated the input buffer.
There should be no need to invoke GROW in this situation because the buffer is grown periodically every XML_PARSER_CHUNK_SIZE characters and, at least for UTF-8, in xmlCurrentChar. This also matches the code path executed when XML_PARSE_OLD10 is not set.
This fixes bugs 781205 (CVE-2017-9049) and 781361 (CVE-2017-9050). Thanks to Marcel Böhme and Thuan Pham for the report.
Additional hardening
====================
A separate check was added in xmlParseNameComplex to validate the buffer size.

855c19ef

2017-06-01T01:04:08

Avoid reparsing in xmlParseStartTag2 The code in xmlParseStartTag2 must handle the case that the input buffer was grown and reallocated which can invalidate pointers to attribute values. Before, this was handled by detecting changes of the input buffer "base" pointer and, in case of a change, jumping back to the beginning of the function and reparsing the start tag. The major problem of this approach is that whether an input buffer is reallocated is nondeterministic, resulting in seemingly random test failures. See the mailing list thread "runtest mystery bug: name2.xml error case regression test" from 2012, for example. If a reallocation was detected, the code also made no attempts to continue parsing in case of errors which makes a difference in the lax "recover" mode. Now we store the current input buffer "base" pointer for each (not separately allocated) attribute in the namespace URI field, which isn't used until later. After the whole start tag was parsed, the pointers to the attribute values are reconstructed using the offset between the new and the old input buffer. This relies on arithmetic on dangling pointers which is technically undefined behavior. But it seems like the easiest and most efficient fix and a similar approach is used in xmlParserInputGrow. This changes the error output of several tests, typically making it more verbose because we try harder to continue parsing in case of errors. (Another possible solution is to check not only the "base" pointer but the size of the input buffer as well. But this would result in even more reparsing.)

07b7428b

2017-06-01T00:19:14

Simplify control flow in xmlParseStartTag2 Remove some goto labels and deduplicate a bit of code after handling namespaces.
Before:
    loop {
        parseAttribute
        if (ok) {
            if (defaultNamespace) {
                handleDefaultNamespace
                if (error)
                    goto skip_default_ns;
                handleDefaultNamespace
    skip_default_ns:
                freeAttr
                nextAttr
                continue;
            }
            if (namespace) {
                handleNamespace
                if (error)
                    goto skip_ns;
                handleNamespace
    skip_ns:
                freeAttr
                nextAttr;
                continue;
            }
            handleAttr
        } else {
            freeAttr
        }
        nextAttr
    }
After:
    loop {
        parseAttribute
        if (!ok)
            goto next_attr;
        if (defaultNamespace) {
            handleDefaultNamespace
            if (error)
                goto next_attr;
            handleDefaultNamespace
        } else if (namespace) {
            handleNamespace
            if (error)
                goto next_attr;
            handleNamespace
        } else {
            handleAttr
        }
    next_attr:
        freeAttr
        nextAttr
    }

47496724

2017-05-31T16:46:39

Avoid spurious UBSan errors in parser.c If available, use a C99 flexible array member to avoid spurious UBSan errors.

8627e4ed

2017-05-23T18:11:08

Fix memory leak in parser error path Triggered in mixed content ELEMENT declarations if there's an invalid name after the first valid name: <!ELEMENT para (#PCDATA|a|<invalid>)*> Found with libFuzzer and ASan.

90ccb582

2017-04-07T17:43:02

Prevent unwanted external entity reference For https://bugzilla.gnome.org/show_bug.cgi?id=780691 * parser.c: add a specific check to avoid PE reference

23040785

2017-04-07T16:45:56

Add an XML_PARSE_NOXXE flag to block all entities loading even local For https://bugzilla.gnome.org/show_bug.cgi?id=772726 * include/libxml/parser.h: Add a new parser flag XML_PARSE_NOXXE * elfgcchack.h, xmlIO.h, xmlIO.c: associated loading routine * include/libxml/xmlerror.h: new error raised * xmllint.c: adds --noxxe flag to activate the option

bdd66182

2016-05-23T12:27:58

Avoid building recursive entities For https://bugzilla.gnome.org/show_bug.cgi?id=762100 When we detect a recursive entity we should really not build the associated data, moreover if someone bypasses libxml2 fatal errors and still tries to serialize a broken entity make sure we don't risk getting into a recursion * parser.c: xmlParserEntityCheck() don't build if entity loop were found and remove the associated text content * tree.c: xmlStringGetNodeList() avoid a potential recursion

00906759

2016-01-26T16:57:03

Heap-based buffer-underreads due to xmlParseName For https://bugzilla.gnome.org/show_bug.cgi?id=759573 * parser.c: (xmlParseElementDecl): Return early on invalid input to fix non-minimized test case (759573-2.xml). Otherwise the parser gets into a bad state in SKIP(3) at the end of the function. (xmlParseConditionalSections): Halt parsing when hitting invalid input that would otherwise cause xmlParserHandlePEReference() to recurse unexpectedly. This fixes the minimized test case (759573.xml). * result/errors/759573-2.xml: Add. * result/errors/759573-2.xml.err: Add. * result/errors/759573-2.xml.str: Add. * result/errors/759573.xml: Add. * result/errors/759573.xml.err: Add. * result/errors/759573.xml.str: Add. * test/errors/759573-2.xml: Add. * test/errors/759573.xml: Add.

38eae571

2016-03-07T14:04:08

Heap use-after-free in xmlSAX2AttributeNs For https://bugzilla.gnome.org/show_bug.cgi?id=759020 * parser.c: (xmlParseStartTag2): Attribute strings are only valid if the base does not change, so add another check where the base may change. Make sure to set 'attvalue' to NULL after freeing it. * result/errors/759020.xml: Added. * result/errors/759020.xml.err: Added. * result/errors/759020.xml.str: Added. * test/errors/759020.xml: Added test case.

4472c3a5

2016-05-13T15:13:17

Fix some format string warnings with possible format string vulnerability For https://bugzilla.gnome.org/show_bug.cgi?id=761029 Decorate every method in libxml2 with the appropriate LIBXML_ATTR_FORMAT(fmt,args) macro and add some cleanups following the reports.

b1d34de4

2016-03-14T17:19:44

Fix inappropriate fetch of entities content For https://bugzilla.gnome.org/show_bug.cgi?id=761430 libfuzzer regression testing exposed another case where the parser would fetch content of an external entity while not in validating mode. Plug that hole

45752d2c

2016-03-03T11:50:34

Bug 759398: Heap use-after-free in xmlDictComputeFastKey <https://bugzilla.gnome.org/show_bug.cgi?id=759398> * parser.c: (xmlParseNCNameComplex): Store start position instead of a pointer to the name since the underlying buffer may change, resulting in a stale pointer being used. * result/errors/759398.xml: Added. * result/errors/759398.xml.err: Added. * result/errors/759398.xml.str: Added. * test/errors/759398.xml: Added test case.

db07dd61

2016-02-12T09:58:29

Bug 758588: Heap-based buffer overread in xmlParserPrintFileContextInternal <https://bugzilla.gnome.org/show_bug.cgi?id=758588> * parser.c: (xmlParseEndTag2): Add bounds checks before dereferencing ctxt->input->cur past the end of the buffer, or incrementing the pointer past the end of the buffer. * result/errors/758588.xml: Add test result. * result/errors/758588.xml.err: Ditto. * result/errors/758588.xml.str: Ditto. * test/errors/758588.xml: Add regression test.

8f30bdff

2016-04-15T11:56:55

Add missing increments of recursion depth counter to XML parser. For https://bugzilla.gnome.org/show_bug.cgi?id=765207 CVE-2016-3705 The functions xmlParserEntityCheck() and xmlParseAttValueComplex() used to call xmlStringDecodeEntities() in a recursive context without incrementing the 'depth' counter in the parser context. Because of that omission, the parser failed to detect attribute recursions in certain documents before running out of stack space.

bb654feb

2016-04-13T16:56:07

Fix typos: dictio{ nn -> n }ar{y,ies} Signed-off-by: Jan Pokorný <jpokorny@redhat.com>

4f8606c1

2016-01-05T13:38:09

Bug 760183: REGRESSION (v2.9.3): XML push parser fails with bogus UTF-8 encoding error when multi-byte character in large CDATA section is split across buffer <https://bugzilla.gnome.org/show_bug.cgi?id=760183> * parser.c: (xmlCheckCdataPush): Add 'complete' argument to describe whether the buffer passed in is the whole CDATA buffer, or if there is more data to parse. If there is more data to parse, don't return a negative value for an invalid multi-byte UTF-8 character that is split between buffers. (xmlParseTryOrFinish): Pass 'complete' argument to xmlCheckCdataPush() as appropriate. * result/cdata-2-byte-UTF-8.xml: Added. * result/cdata-2-byte-UTF-8.xml.rde: Added. * result/cdata-2-byte-UTF-8.xml.rdr: Added. * result/cdata-2-byte-UTF-8.xml.sax: Added. * result/cdata-2-byte-UTF-8.xml.sax2: Added. * result/cdata-3-byte-UTF-8.xml: Added. * result/cdata-3-byte-UTF-8.xml.rde: Added. * result/cdata-3-byte-UTF-8.xml.rdr: Added. * result/cdata-3-byte-UTF-8.xml.sax: Added. * result/cdata-3-byte-UTF-8.xml.sax2: Added. * result/cdata-4-byte-UTF-8.xml: Added. * result/cdata-4-byte-UTF-8.xml.rde: Added. * result/cdata-4-byte-UTF-8.xml.rdr: Added. * result/cdata-4-byte-UTF-8.xml.sax: Added. * result/cdata-4-byte-UTF-8.xml.sax2: Added. * result/noent/cdata-2-byte-UTF-8.xml: Added. * result/noent/cdata-3-byte-UTF-8.xml: Added. * result/noent/cdata-4-byte-UTF-8.xml: Added. * test/cdata-2-byte-UTF-8.xml: Added. * test/cdata-3-byte-UTF-8.xml: Added. * test/cdata-4-byte-UTF-8.xml: Added. - Add tests and results. Only 'make Readertests XMLPushtests' fails prior to the fix.

a7a94612

2016-02-09T12:55:29

Heap-based buffer overread in xmlNextChar For https://bugzilla.gnome.org/show_bug.cgi?id=759671 when the end of the internal subset isn't properly detected xmlParseInternalSubset should just return instead of trying to process input further.

f1063fdb

2015-11-20T16:06:59

CVE-2015-7500 Fix memory access error due to incorrect entities boundaries For https://bugzilla.gnome.org/show_bug.cgi?id=756525 handle properly the case where we popped out of the current entity while processing a start tag Reported by Kostya Serebryany @ Google This slightly modifies the output of 754946 in regression tests

3bd6ae14

2015-11-20T15:06:02

Fix some loop issues embedding NEXT Next can switch the parser back to XML_PARSER_EOF state, we need to consider those in loops consuming input

35bcb1d7

2015-11-20T15:04:09

Detect incoherency on GROW The current pointer to the input has to be between the base and end; if not, stop everything: we have an internal state error.

e3b15974

2015-11-20T14:59:30

Reuse xmlHaltParser() where it makes sense Unify the various places where either xmlStopParser was called (which resets the error as a side effect) and places where we used ctxt->instate = XML_PARSER_EOF to stop further processing

28cd9cb7

2015-11-20T14:55:30

Add xmlHaltParser() to stop the parser The problem is doing it in a consistent and safe fashion It's more complex than just setting ctxt->instate = XML_PARSER_EOF Update the public function to reuse that new internal routine

69030714

2015-11-20T11:13:45

CVE-2015-5312 Another entity expansion issue For https://bugzilla.gnome.org/show_bug.cgi?id=756733 It is one case where the code in place to detect entities expansions failed to exit when the situation was detected, leading to DoS Problem reported by Kostya Serebryany @ Google Patch provided by David Drysdale @ Google

53ac9c96

2015-11-09T18:16:00

xmlStopParser reset errNo I had used it in contexts where that information ought to be preserved

afd27c21

2015-11-09T18:07:18

Avoid processing entities after encoding conversion failures For https://bugzilla.gnome.org/show_bug.cgi?id=756527 and was also raised by Chromium team in the past When we hit a conversion failure when switching encoding it is better to stop parsing there, this was treated as a fatal error but the parser was continuing to process to extract more errors, unfortunately that makes little sense as the data is obviously corrupt and can potentially lead to unexpected behaviour.

ab2b9a93

2015-11-03T20:40:49

Avoid extra processing of MarkupDecl when EOF For https://bugzilla.gnome.org/show_bug.cgi?id=756263 One place where ctxt->instate == XML_PARSER_EOF which was set up by entity detection issues doesn't get noticed, and even overridden

41ac9049

2015-10-27T10:53:44

Fix an error in previous Conditional section patch an off by one mistake in the change, led to error on correct document where the end of the included entity was exactly the end of the conditional section, leading to regtest failure

bd0526e6

2015-10-23T19:02:28

Another variation of overflow in Conditional sections Which happen after the previous fix to https://bugzilla.gnome.org/show_bug.cgi?id=756456 But stopping the parser and exiting we didn't pop the intermediary entities and doing the SKIP there applies on an input which may be too small

cf77e605

2015-09-30T14:46:29

Add missing Null check in xmlParseExternalEntityPrivate For https://bugzilla.gnome.org/show_bug.cgi?id=755857 a case where we check for NULL but not everywhere

4a5d80ad

2015-09-18T15:06:46

Fix a bug in CData error handling in the push parser For https://bugzilla.gnome.org/show_bug.cgi?id=754947 The checking function was returning incorrect args in some cases Adds the test to the reg suite and fixes one of the existing test outputs

51f02b0a

2015-09-15T16:50:32

Fix a bug on name parsing at the end of current input buffer For https://bugzilla.gnome.org/show_bug.cgi?id=754946 When hitting the end of the current input buffer while parsing a name we could end up losing the beginning of the name, which led to various issues.

709a9521

2015-06-29T16:10:26

Fail parsing early on if encoding conversion failed For https://bugzilla.gnome.org/show_bug.cgi?id=751631 If we fail converting the current input stream while processing the encoding declaration of the XMLDecl then it's safer to just abort there and not try to report further errors.

9aa37588

2015-06-29T09:08:25

Do not process encoding values if the declaration is broken For https://bugzilla.gnome.org/show_bug.cgi?id=751603 If the string is not properly terminated do not try to convert to the given encoding.

9b851233

2015-02-23T11:29:20

Cleanup conditional section error handling For https://bugzilla.gnome.org/show_bug.cgi?id=744980 The error handling of Conditional Section also needs to be straightened as the structure of the document can't be guessed on a failure there and it's better to stop parsing as further errors are likely to be irrelevant.

a7dfab74

2015-02-23T11:17:35

Stop parsing on entities boundaries errors For https://bugzilla.gnome.org/show_bug.cgi?id=744980 There are times, like on unterminated entities that it's preferable to stop parsing, even if that means less error reporting. Entities are feeding the parser on further processing, and if they are ill defined then it's possible to get the parser to bug. Also do the same on Conditional Sections if the input is broken, as the structure of the document can't be guessed.

72a46a51

2014-10-23T11:35:36

Fix missing entities after CVE-2014-3660 fix For https://bugzilla.gnome.org/show_bug.cgi?id=738805 The fix for CVE-2014-3660 introduced a regression in some cases where entity substitution is required and the entity is used first in another entity referenced from an attribute value

f65128f3

2014-10-17T17:13:41

Revert "Missing initialization for the catalog module" This reverts commit 054c716ea1bf001544127a4ab4f4346d1b9947e7. As this breaks the xmlcatalog command https://bugzilla.redhat.com/show_bug.cgi?id=1153753

be2a7eda

2014-10-16T13:59:47

Fix for CVE-2014-3660 Issues related to the billion laugh entity expansion which happened to escape the initial set of fixes

500c54ef

2014-10-16T12:17:20

fix memory leak xml header encoding field with XML_PARSE_IGNORE_ENC When the xml parser encounters an xml encoding in an xml header while configured with option XML_PARSE_IGNORE_ENC, it fails to free memory allocated for storing the encoding. The patch below fixes this.
How to reproduce:
1. Change doc/examples/parse4.c to add xmlCtxtUseOptions(ctxt, XML_PARSE_IGNORE_ENC); after the call to xmlCreatePushParserCtxt.
2. Rebuild
3. run the following command from the top libxml2 directory: LD_LIBRARY_PATH=.libs/ valgrind --leak-check=full ./doc/examples/.libs/parse4 ./test.xml , where test.xml contains following input: <?xml version="1.0" encoding="UTF-81" ?><hi/>
valgrind will report:
==1964== 10 bytes in 1 blocks are definitely lost in loss record 1 of 1
==1964== at 0x4C272DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1964== by 0x4E88497: xmlParseEncName (parser.c:10224)
==1964== by 0x4E888FE: xmlParseEncodingDecl (parser.c:10295)
==1964== by 0x4E89630: xmlParseXMLDecl (parser.c:10534)
==1964== by 0x4E8B737: xmlParseTryOrFinish (parser.c:11293)
==1964== by 0x4E8E775: xmlParseChunk (parser.c:12283)
Signed-off-by: Bart De Schuymer <bart at amplidata com>

7cf57380

2014-10-08T16:09:56

Parser error on repeated recursive entity expansion containing < For https://bugzilla.gnome.org/show_bug.cgi?id=736417 basically a weird side effect and a failure to properly parenthesize a boolean expression led to this bug

7e9bbdf8

2014-10-06T20:34:14

parser bug on misformed namespace attributes For https://bugzilla.gnome.org/show_bug.cgi?id=672539 Reported by Axel Miller <axel.miller@ppi.de> Consider the following start-tag: <x xmlns=""version=""> The start-tag does not conform to the rule [40] STag ::= '<' Name (S Attribute)* S? '>' since there is no whitespace in front of the attribute "version". Thus, libxml2 should reject the start-tag. But it doesn't: $ echo '<x xmlns=""version=""/>' | xmllint - <?xml version="1.0"?> <x xmlns="" version=""/> The error seems to happen only if there is a namespace declaration in front of the attribute. A missing whitespace between other attributes is handled correctly: $ echo '<x someattr=""version=""/>' | xmllint - -:1: parser error : attributes construct error <x someattr=""version=""/> ^ [...]

24fb4c32

2014-10-06T18:19:12

wrong error column in structured error when parsing end tag For https://bugzilla.gnome.org/show_bug.cgi?id=734283 libxml2 reports wrong error column numbers (field int2 in xmlError) in structured error handler, after parsing an end tag.

33f658c9

2014-08-07T17:30:36

wrong error column in structured error when parsing attribute values For https://bugzilla.gnome.org/show_bug.cgi?id=734280 libxml2 reports wrong error column numbers (field int2 in xmlError) in structured error handler, after parsing XML attribute values. Example XML: <?xml version="1.0" encoding="UTF-8"?> <root xmlns="urn:colbug">&</root>  Expected location of the error would be line 3, column 21. The actual location of the error is line 3, column 9: $ ./xmlparse colbug2.xml colbug2.xml:3:9: xmlParseEntityRef: no name The 12 characters of the xmlns attribute value "urn:colbug" are not accounted for in the error column value.
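The column bugs in this group share one pattern: some code path moves the input pointer forward without moving the column counter with it. A minimal sketch of keeping the two in sync follows. `Cursor` and `cursor_advance` are hypothetical names for illustration and do not correspond to libxml2's internal structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical input cursor: position plus 1-based line and column. */
typedef struct {
    const char *cur;
    int line;
    int col;
} Cursor;

/* Advance by up to n characters, updating line and column for each one.
 * Any fast path that bumps `cur` directly would leave `col` stale. */
void cursor_advance(Cursor *c, size_t n)
{
    assert(c != NULL && c->cur != NULL);
    while (n > 0 && *c->cur != '\0') {
        if (*c->cur == '\n') {
            c->line++;
            c->col = 1;
        } else {
            c->col++;
        }
        c->cur++;
        n--;
    }
}
```

Routing every skip (attribute values, whitespace, end tags) through a single helper like this is one way to avoid the class of error described here. Whether the actual fixes took that shape is not stated in the commit messages.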

5d4310af

2014-08-07T16:28:09

wrong error column in structured error when skipping whitespace in xml decl For https://bugzilla.gnome.org/show_bug.cgi?id=734276 libxml2 reports wrong error column numbers (field int2 in xmlError) in structured error handler, after an XML declaration containing whitespace. Example XML: <?xml version="1.0" encoding="UTF-8" ?><root>&</root>  Expected location of the error would be line 1, column 53. The actual location of the error is line 1, column 44: $ ./xmlparse colbug1.xml colbug1.xml:1:44: xmlParseEntityRef: no name

2f9b126a

2014-07-26T20:29:36

typo in error messages "colon are forbidden from..." For https://bugzilla.gnome.org/show_bug.cgi?id=731511 Pointed out by Vincent Lefevre

c836ba66

2014-07-14T16:39:50

Fix a potential NULL dereference For https://bugzilla.gnome.org/show_bug.cgi?id=733040 xmlDictLookup() may return NULL in case of allocation error; though very unlikely, it needs to be checked.

dd8367da

2014-06-11T16:54:32

Fix regressions introduced by CVE-2014-0191 patch A number of issues have been raised after the fix, and this patch tries to correct all of them, though most were related to postvalidation. https://bugzilla.gnome.org/show_bug.cgi?id=730290 and other reports on list, off-list and on Red Hat bugzilla

9cd1c3cf

2014-04-22T15:30:56

Do not fetch external parameter entities Unless explicitly asked for when validating or replacing entities with their value. Problem pointed out by Daniel Berrange <berrange@redhat.com>

6faa126f

2014-03-21T17:05:51

Fix xmlParseInNodeContext() if node is not element We really need to have ctxt->instate == XML_PARSER_CONTENT when jumping in content parsing Bug reported by Frank Gross

190a0b89

2014-02-06T10:58:17

Fix a portability issue on Windows Apparently an overflow when comparing macro and unsigned long

054c716e

2014-01-26T15:02:25

Missing initialization for the catalog module

4e1476c5

2013-12-09T15:23:40

adding init calls to xml and html Read parsing entry points As pointed out by "Tassyns, Bram <BramT@enfocus.com>" on the list, some calls had it and others didn't; clean it up and add it to all the missing ones

9a85d40c

2013-11-29T23:26:25

Fix incorrect spelling entites->entities Partially, a follow-up of 81d7a8245cf9a31a49499a5a195c2b89e6f91180. Signed-off-by: Jan Pokorný <jpokorny@redhat.com>

dcc19503

2013-05-22T22:56:45

Fix a parsing bug on non-ascii element and CR/LF usage https://bugzilla.gnome.org/show_bug.cgi?id=698550 Somehow the behaviour of the internal parser routine changed slightly when encountering CR/LF, which led to a bug when parsing document with non-ascii Names

63588f47

2013-05-10T14:01:46

Fix a regression in xmlGetDocCompressMode() The switch to xzlib had the consequence that the compression level of the input was no longer gathered in ctxt->input->buf, so the parser compression flag was left at -1 and propagated to the resulting document. Fix the I/O layer to get compression detection in xzlib, then carry it in the input buffer and the resulting document. This should fix https://lsbbugs.linuxfoundation.org/show_bug.cgi?id=3456

d4a5d981

2013-04-30T17:45:36

Cast encoding name to char pointer to match arg type

704d8c5e

2013-04-23T13:02:11

Fix an error in xmlCleanupParser https://bugzilla.gnome.org/show_bug.cgi?id=698582 xmlCleanupParser calls xmlCleanupGlobals() and then xmlResetLastError(), but the latter reallocates the global data freed by the previous call. Just swap the two calls.
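The ordering problem can be reproduced in miniature with any lazily allocated global. The sketch below is an assumption-laden model, not libxml2's implementation: `GlobalState`, `get_state`, `reset_last_error`, `cleanup_globals` and the two `cleanup_*` variants are hypothetical stand-ins for the functions named in the message.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-process state, allocated on first use. */
typedef struct {
    int last_error;
} GlobalState;

static GlobalState *g_state = NULL;

static GlobalState *get_state(void)
{
    if (g_state == NULL) {
        g_state = calloc(1, sizeof *g_state);
        assert(g_state != NULL);  /* sketch only: no out-of-memory handling */
    }
    return g_state;
}

/* Touches the global state, so it allocates it when missing. */
void reset_last_error(void)
{
    get_state()->last_error = 0;
}

void cleanup_globals(void)
{
    free(g_state);
    g_state = NULL;
}

int globals_allocated(void)
{
    return g_state != NULL;
}

/* Buggy order: the reset re-creates what was just freed. */
void cleanup_buggy(void)
{
    cleanup_globals();
    reset_last_error();
}

/* Fixed order: reset first, free last. */
void cleanup_fixed(void)
{
    reset_last_error();
    cleanup_globals();
}
```

After `cleanup_buggy()` the state is allocated again and would be reported as leaked at exit; after `cleanup_fixed()` nothing remains allocated.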

9ca816b3

2013-04-16T22:00:13

Fix a couple of return without value Error introduced in previous commit !

e50ba816

2013-04-11T15:54:51

Improve handling of xmlStopParser() Add a specific parser error Try to stop parsing as quickly as possible

cff2546f

2013-03-11T15:57:55

Cache presence of '<' in entities content slightly modify how ent->checked is used, and use the lowest bit to keep the information
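Packing a boolean into the lowest bit of an existing counter can be sketched as follows. The helper names are hypothetical, and the exact encoding libxml2 uses for `ent->checked` is not given in the message, so treat the doubling scheme here as an assumption.

```c
#include <assert.h>
#include <string.h>

/* Scan the entity content once for a '<'. */
int scan_has_lt(const char *content)
{
    assert(content != NULL);
    return strchr(content, '<') != NULL;
}

/* Pack a non-negative cost and the flag into one field:
 * the cost lives in the upper bits, the flag in bit 0. */
unsigned int pack_checked(unsigned int cost, int has_lt)
{
    return cost * 2u + (has_lt ? 1u : 0u);
}

unsigned int checked_cost(unsigned int checked)
{
    return checked / 2u;
}

int checked_has_lt(unsigned int checked)
{
    return (int)(checked & 1u);
}
```

Later references to the same entity can then read the flag from the cached field instead of rescanning the content.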

a3f1e3e5

2013-03-11T13:57:53

Avoid extra processing on entities If an entity has already been checked for correctness no need to check it on every reference

23f05e0c

2013-02-19T10:21:49

Detect excessive entities expansion upon replacement If entities expansion in the XML parser is asked for, it is possible to craft a relatively small input document leading to excessive on-the-fly content generation. This patch accounts for those replacements and stops parsing after a given threshold. It can be bypassed as usual with the HUGE parser option.

bf058dce

2013-02-13T18:19:42

Fix the flushing out of raw buffers on encoding conversions https://bugzilla.gnome.org/show_bug.cgi?id=692915 The new set of converting functions tried to limit the encoding conversion of the raw buffer to the consumption one to work in a more progressive fashion. Unfortunately this was bad for performance and led to errors on progressive parsing when a very large chunk was close to the end of the document. Fix the new internal function and switch back to the old way of converting. Fix another bug in the process.

de0cc20c

2013-02-12T16:55:34

Fix some buffer conversion issues https://bugzilla.gnome.org/show_bug.cgi?id=690202 Buffer overflow errors originating from xmlBufGetInputBase in 2.9.0 The pointers from the context input were not properly reset after that call which can do reallocations.

9c8eaabe

2013-01-04T12:41:53

Fix compiler warning after 153cf15905cf4ec080612ada6703757d10caba1e Add missing cast for xmlNop to silence a compiler warning.

cf8f0424

2012-12-21T11:13:31

Fix an error in the progressive DTD parsing code For https://bugzilla.gnome.org/show_bug.cgi?id=689958 We were looking for the wrong character in the input stream

fb27e2cd

2012-09-28T08:59:33

Fix spelling of "length".

6a36fbe3

2012-10-29T10:39:55

Fix potential out of bound access

153cf159

2012-10-26T13:50:47

Fix large parse of file from memory https://bugzilla.redhat.com/show_bug.cgi?id=862969 The new code trying to detect excessive input lookup would just get wrong sometimes in the case of very large file parsed directly from memory.

711b15d5

2012-10-25T19:23:26

Fix a bug in the nsclean option of the parser Raised as a side effect of: https://bugzilla.gnome.org/show_bug.cgi?id=663844

6c91aa38

2012-10-25T15:33:59

Fix a regression in 2.9.0 breaking validation while streaming https://bugzilla.gnome.org/show_bug.cgi?id=684774 with help from Kjell Ahlstedt <kjell.ahlstedt@bredband.net>

81d7a824

2012-09-13T15:56:51

Fix typos in parser comments Signed-off-by: Jan Pokorný <jpokorny@redhat.com>

f8e3db04

2012-09-11T13:26:36

Big space and tab cleanup Remove all space before tabs and space and tabs at end of lines.

28f5e1a2

2012-09-04T11:18:39

Fix potential crash on entities errors Related to https://bugs.launchpad.net/lxml/+bug/502959 Basically the core of the issue is that if an entity references another entity, then in case we are replacing entities content, we should always do so by copying the referenced content as long as the reference is done within the entity. Otherwise, if for some reason there is a later parsing error, that entity content may be freed. Complex scenario exposed by command: thinkpad:~/XML/diveintopython-5.4/xml -> valgrind --db-attach=yes ../../xmllint --loaddtd --noout --noent diveintopython.xml 1. Document references &a; 2. a references &b; 3. we reference b content directly by linking it into the a content 4. a has an error further down 5. we free a, freeing the chunk from b 6. Document references &b; after &a; 7. we try to copy b content, but it was freed already => segfault * parser.c: never reference directly entity content without copying if we aren't in the document main entity

1f972e9f

2012-08-15T10:16:37

Cleanup some of the parser code Prefetching assumptions about the amount of data read in GROW should be backed up with test for 0 termination when at the end of the buffer.

968a03a2

2012-08-13T12:41:33

Add support for big line numbers in error reporting Fix the lack of line number as reported by Johan Corveleyn <jcorvel@gmail.com> * parser.c include/libxml/parser.h: add an XML_PARSE_BIG_LINES parser option, not switched on by default, it's an opt-in * SAX2.c: if XML_PARSE_BIG_LINES is set store the long line numbers in the psvi field of text nodes * tree.c: expand xmlGetLineNo to extract that information, also make sure we can't fail on recursive behaviour * error.c: in __xmlRaiseError, if a node is provided, call xmlGetLineNo() if we can't get a valid line number. * xmllint.c: switch on XML_PARSE_BIG_LINES in xmllint
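The storage trick described for SAX2.c and tree.c can be modelled with a small struct. The sketch assumes a 16-bit line field that saturates at 65535 and a spare pointer-sized field. `TextNode`, `node_set_line` and `node_get_line` are hypothetical names, and the real xmlGetLineNo logic may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node: short line field plus a spare pointer slot. */
typedef struct {
    unsigned short line;
    void *psvi;
} TextNode;

/* Store a line number; when it does not fit in the short field and
 * big_lines is enabled, stash the full value in the spare slot. */
void node_set_line(TextNode *n, long line, int big_lines)
{
    assert(n != NULL && line >= 0);
    if (line < 65535) {
        n->line = (unsigned short)line;
        n->psvi = NULL;
    } else {
        n->line = 65535;  /* saturated marker */
        n->psvi = big_lines ? (void *)(intptr_t)line : NULL;
    }
}

/* Read the line number back, preferring the stashed full value. */
long node_get_line(const TextNode *n)
{
    assert(n != NULL);
    if (n->line == 65535 && n->psvi != NULL)
        return (long)(intptr_t)n->psvi;
    return n->line;
}
```

Without the opt-in, a node on line 70000 reports 65535. With it, the full value survives the round trip, which matches the opt-in nature of the option described in the message.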

5353bbf7

2012-08-03T12:03:31

More fixups on the push parser behaviour

kc3-lang/libxml2/parser.c
