|
4472c3a5
|
2016-05-13T15:13:17
|
|
Fix some format string warnings with possible format string vulnerability
For https://bugzilla.gnome.org/show_bug.cgi?id=761029
Decorate every method in libxml2 with the appropriate
LIBXML_ATTR_FORMAT(fmt,args) macro and add some cleanups
following the reports.
|
|
b1d34de4
|
2016-03-14T17:19:44
|
|
Fix inappropriate fetch of entities content
For https://bugzilla.gnome.org/show_bug.cgi?id=761430
libfuzzer regression testing exposed another case where the parser would
fetch content of an external entity while not in validating mode.
Plug that hole
|
|
45752d2c
|
2016-03-03T11:50:34
|
|
Bug 759398: Heap use-after-free in xmlDictComputeFastKey <https://bugzilla.gnome.org/show_bug.cgi?id=759398>
* parser.c:
(xmlParseNCNameComplex): Store start position instead of a
pointer to the name since the underlying buffer may change,
resulting in a stale pointer being used.
* result/errors/759398.xml: Added.
* result/errors/759398.xml.err: Added.
* result/errors/759398.xml.str: Added.
* test/errors/759398.xml: Added test case.
|
|
db07dd61
|
2016-02-12T09:58:29
|
|
Bug 758588: Heap-based buffer overread in xmlParserPrintFileContextInternal <https://bugzilla.gnome.org/show_bug.cgi?id=758588>
* parser.c:
(xmlParseEndTag2): Add bounds checks before dereferencing
ctxt->input->cur past the end of the buffer, or incrementing the
pointer past the end of the buffer.
* result/errors/758588.xml: Add test result.
* result/errors/758588.xml.err: Ditto.
* result/errors/758588.xml.str: Ditto.
* test/errors/758588.xml: Add regression test.
|
|
8f30bdff
|
2016-04-15T11:56:55
|
|
Add missing increments of recursion depth counter to XML parser.
For https://bugzilla.gnome.org/show_bug.cgi?id=765207
CVE-2016-3705
The functions xmlParserEntityCheck() and xmlParseAttValueComplex() used to call
xmlStringDecodeEntities() in a recursive context without incrementing the
'depth' counter in the parser context. Because of that omission, the parser
failed to detect attribute recursions in certain documents before running out
of stack space.
|
|
bb654feb
|
2016-04-13T16:56:07
|
|
Fix typos: dictio{ nn -> n }ar{y,ies}
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
|
|
4f8606c1
|
2016-01-05T13:38:09
|
|
Bug 760183: REGRESSION (v2.9.3): XML push parser fails with bogus UTF-8 encoding error when multi-byte character in large CDATA section is split across buffer <https://bugzilla.gnome.org/show_bug.cgi?id=760183>
* parser.c:
(xmlCheckCdataPush): Add 'complete' argument to describe whether
the buffer passed in is the whole CDATA buffer, or if there is
more data to parse. If there is more data to parse, don't
return a negative value for an invalid multi-byte UTF-8
character that is split between buffers.
(xmlParseTryOrFinish): Pass 'complete' argument to
xmlCheckCdataPush() as appropriate.
* result/cdata-2-byte-UTF-8.xml: Added.
* result/cdata-2-byte-UTF-8.xml.rde: Added.
* result/cdata-2-byte-UTF-8.xml.rdr: Added.
* result/cdata-2-byte-UTF-8.xml.sax: Added.
* result/cdata-2-byte-UTF-8.xml.sax2: Added.
* result/cdata-3-byte-UTF-8.xml: Added.
* result/cdata-3-byte-UTF-8.xml.rde: Added.
* result/cdata-3-byte-UTF-8.xml.rdr: Added.
* result/cdata-3-byte-UTF-8.xml.sax: Added.
* result/cdata-3-byte-UTF-8.xml.sax2: Added.
* result/cdata-4-byte-UTF-8.xml: Added.
* result/cdata-4-byte-UTF-8.xml.rde: Added.
* result/cdata-4-byte-UTF-8.xml.rdr: Added.
* result/cdata-4-byte-UTF-8.xml.sax: Added.
* result/cdata-4-byte-UTF-8.xml.sax2: Added.
* result/noent/cdata-2-byte-UTF-8.xml: Added.
* result/noent/cdata-3-byte-UTF-8.xml: Added.
* result/noent/cdata-4-byte-UTF-8.xml: Added.
* test/cdata-2-byte-UTF-8.xml: Added.
* test/cdata-3-byte-UTF-8.xml: Added.
* test/cdata-4-byte-UTF-8.xml: Added.
- Add tests and results. Only 'make Readertests XMLPushtests'
fails prior to the fix.
|
|
a7a94612
|
2016-02-09T12:55:29
|
|
Heap-based buffer overread in xmlNextChar
For https://bugzilla.gnome.org/show_bug.cgi?id=759671
when the end of the internal subset isn't properly detected
xmlParseInternalSubset should just return instead of trying
to process input further.
|
|
f1063fdb
|
2015-11-20T16:06:59
|
|
CVE-2015-7500 Fix memory access error due to incorrect entities boundaries
For https://bugzilla.gnome.org/show_bug.cgi?id=756525
handle properly the case where we popped out of the current entity
while processing a start tag
Reported by Kostya Serebryany @ Google
This slightly modifies the output of 754946 in regression tests
|
|
3bd6ae14
|
2015-11-20T15:06:02
|
|
Fix some loop issues embedding NEXT
Next can switch the parser back to XML_PARSER_EOF state, we
need to consider those in loops consuming input
|
|
35bcb1d7
|
2015-11-20T15:04:09
|
|
Detect incoherency on GROW
the current pointer to the input has to be between the base and end
if not stop everything we have an internal state error.
|
|
e3b15974
|
2015-11-20T14:59:30
|
|
Reuse xmlHaltParser() where it makes sense
Unify the various place where either xmlStopParser was called
(which resets the error as a side effect) and places where we
used ctxt->instate = XML_PARSER_EOF to stop further processing
|
|
28cd9cb7
|
2015-11-20T14:55:30
|
|
Add xmlHaltParser() to stop the parser
The problem is doing it in a consistent and safe fashion
It's more complex than just setting ctxt->instate = XML_PARSER_EOF
Update the public function to reuse that new internal routine
|
|
69030714
|
2015-11-20T11:13:45
|
|
CVE-2015-5312 Another entity expansion issue
For https://bugzilla.gnome.org/show_bug.cgi?id=756733
It is one case where the code in place to detect entities expansions
failed to exit when the situation was detected, leading to DoS
Problem reported by Kostya Serebryany @ Google
Patch provided by David Drysdale @ Google
|
|
53ac9c96
|
2015-11-09T18:16:00
|
|
xmlStopParser reset errNo
I had used it in contexts where that information ought to be preserved
|
|
afd27c21
|
2015-11-09T18:07:18
|
|
Avoid processing entities after encoding conversion failures
For https://bugzilla.gnome.org/show_bug.cgi?id=756527
and was also raised by Chromium team in the past
When we hit a conversion failure when switching encoding
it is better to stop parsing there, this was treated as a
fatal error but the parser was continuing to process to extract
more errors, unfortunately that makes little sense as the data
is obviously corrupt and can potentially lead to unexpected behaviour.
|
|
ab2b9a93
|
2015-11-03T20:40:49
|
|
Avoid extra processing of MarkupDecl when EOF
For https://bugzilla.gnome.org/show_bug.cgi?id=756263
One place where ctxt->instate == XML_PARSER_EOF which was set up
by entity detection issues doesn't get noticed, and even overridden
|
|
41ac9049
|
2015-10-27T10:53:44
|
|
Fix an error in previous Conditional section patch
an off by one mistake in the change, led to error on correct
document where the end of the included entity was exactly
the end of the conditional section, leading to regtest failure
|
|
bd0526e6
|
2015-10-23T19:02:28
|
|
Another variation of overflow in Conditional sections
Which happen after the previous fix to
https://bugzilla.gnome.org/show_bug.cgi?id=756456
But stopping the parser and exiting we didn't pop the intermediary entities
and doing the SKIP there applies on an input which may be too small
|
|
cf77e605
|
2015-09-30T14:46:29
|
|
Add missing Null check in xmlParseExternalEntityPrivate
For https://bugzilla.gnome.org/show_bug.cgi?id=755857
a case where we check for NULL but not everywhere
|
|
4a5d80ad
|
2015-09-18T15:06:46
|
|
Fix a bug in CData error handling in the push parser
For https://bugzilla.gnome.org/show_bug.cgi?id=754947
The checking function was returning incorrect args in some cases
Adds the test to the reg suite and fix one of the existing test output
|
|
51f02b0a
|
2015-09-15T16:50:32
|
|
Fix a bug on name parsing at the end of current input buffer
For https://bugzilla.gnome.org/show_bug.cgi?id=754946
When hitting the end of the current input buffer while parsing
a name we could end up losing the beginning of the name, which
led to various issues.
|
|
709a9521
|
2015-06-29T16:10:26
|
|
Fail parsing early on if encoding conversion failed
For https://bugzilla.gnome.org/show_bug.cgi?id=751631
If we fail conversing the current input stream while
processing the encoding declaration of the XMLDecl
then it's safer to just abort there and not try to
report further errors.
|
|
9aa37588
|
2015-06-29T09:08:25
|
|
Do not process encoding values if the declaration is broken
For https://bugzilla.gnome.org/show_bug.cgi?id=751603
If the string is not properly terminated do not try to convert
to the given encoding.
|
|
9b851233
|
2015-02-23T11:29:20
|
|
Cleanup conditional section error handling
For https://bugzilla.gnome.org/show_bug.cgi?id=744980
The error handling of Conditional Section also need to be
straightened as the structure of the document can't be
guessed on a failure there and it's better to stop parsing
as further errors are likely to be irrelevant.
|
|
a7dfab74
|
2015-02-23T11:17:35
|
|
Stop parsing on entities boundaries errors
For https://bugzilla.gnome.org/show_bug.cgi?id=744980
There are times, like on unterminated entities that it's preferable to
stop parsing, even if that means less error reporting. Entities are
feeding the parser on further processing, and if they are ill defined
then it's possible to get the parser to bug. Also do the same on
Conditional Sections if the input is broken, as the structure of
the document can't be guessed.
|
|
72a46a51
|
2014-10-23T11:35:36
|
|
Fix missing entities after CVE-2014-3660 fix
For https://bugzilla.gnome.org/show_bug.cgi?id=738805
The fix for CVE-2014-3660 introduced a regression in some case
where entity substitution is required and the entity is used
first in another entity referenced from an attribute value
|
|
f65128f3
|
2014-10-17T17:13:41
|
|
Revert "Missing initialization for the catalog module"
This reverts commit 054c716ea1bf001544127a4ab4f4346d1b9947e7.
As this breaks the xmlcatalog command
https://bugzilla.redhat.com/show_bug.cgi?id=1153753
|
|
be2a7eda
|
2014-10-16T13:59:47
|
|
Fix for CVE-2014-3660
Issues related to the billion laugh entity expansion which happened to
escape the initial set of fixes
|
|
500c54ef
|
2014-10-16T12:17:20
|
|
fix memory leak xml header encoding field with XML_PARSE_IGNORE_ENC
When the xml parser encounters an xml encoding in an xml header while
configured with option XML_PARSE_IGNORE_ENC, it fails to free memory
allocated for storing the encoding.
The patch below fixes this.
How to reproduce:
1. Change doc/examples/parse4.c to add xmlCtxtUseOptions(ctxt,
XML_PARSE_IGNORE_ENC); after the call to xmlCreatePushParserCtxt.
2. Rebuild
3. run the following command from the top libxml2 directory:
LD_LIBRARY_PATH=.libs/ valgrind --leak-check=full
./doc/examples/.libs/parse4 ./test.xml , where test.xml contains
following
input:
<?xml version="1.0" encoding="UTF-81" ?><hi/>
valgrind will report:
==1964== 10 bytes in 1 blocks are definitely lost in loss record 1 of 1
==1964== at 0x4C272DB: malloc (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1964== by 0x4E88497: xmlParseEncName (parser.c:10224)
==1964== by 0x4E888FE: xmlParseEncodingDecl (parser.c:10295)
==1964== by 0x4E89630: xmlParseXMLDecl (parser.c:10534)
==1964== by 0x4E8B737: xmlParseTryOrFinish (parser.c:11293)
==1964== by 0x4E8E775: xmlParseChunk (parser.c:12283)
Signed-off-by: Bart De Schuymer <bart at amplidata com>
|
|
7cf57380
|
2014-10-08T16:09:56
|
|
Parser error on repeated recursive entity expansion containing <
For https://bugzilla.gnome.org/show_bug.cgi?id=736417
basically a weird side effect and a failure
to properly parenthesize a boolean expression led to this bug
|
|
7e9bbdf8
|
2014-10-06T20:34:14
|
|
parser bug on misformed namespace attributes
For https://bugzilla.gnome.org/show_bug.cgi?id=672539
Reported by Axel Miller <axel.miller@ppi.de>
Consider the following start-tag:
<x xmlns=""version="">
The start-tag does not conform to the rule
[40] STag ::= '<' Name (S Attribute)* S? '>'
since there is no whitespace in front of the attribute "version".
Thus, libxml2 should reject the start-tag.
But it doesn't:
$ echo '<x xmlns=""version=""/>' | xmllint -
<?xml version="1.0"?>
<x xmlns="" version=""/>
The error seems to happen only if there is a namespace declaration in
front of
the attribute. A missing whitespace between other attributes is handled
correctly:
$ echo '<x someattr=""version=""/>' | xmllint -
-:1: parser error : attributes construct error
<x someattr=""version=""/>
^
[...]
|
|
24fb4c32
|
2014-10-06T18:19:12
|
|
wrong error column in structured error when parsing end tag
For https://bugzilla.gnome.org/show_bug.cgi?id=734283
libxml2 reports wrong error column numbers (field int2 in xmlError)
in structured error handler, after parsing an end tag.
|
|
33f658c9
|
2014-08-07T17:30:36
|
|
wrong error column in structured error when parsing attribute values
For https://bugzilla.gnome.org/show_bug.cgi?id=734280
libxml2 reports wrong error column numbers (field int2 in xmlError)
in structured error handler, after parsing XML attribute values.
Example XML:
<?xml version="1.0" encoding="UTF-8"?>
<root
xmlns="urn:colbug">&</root>
<!--
1 2 3 4
1234567890123456789012345678901234567890
-->
Expected location of the error would be line 3, column 21.
The actual location of the error is line 3, column 9:
$ ./xmlparse colbug2.xml
colbug2.xml:3:9: xmlParseEntityRef: no name
The 12 characters of the xmlns attribute value "urn:colbug" are
not accounted for in the error column value.
|
|
5d4310af
|
2014-08-07T16:28:09
|
|
wrong error column in structured error when skipping whitespace in xml decl
For https://bugzilla.gnome.org/show_bug.cgi?id=734276
libxml2 reports wrong error column numbers (field int2 in xmlError)
in structured error handler, after an XML declaration containing
whitespace.
Example XML:
<?xml version="1.0" encoding="UTF-8" ?><root>&</root>
<!--
1 2 3 4 5 6
123456789012345678901234567890123456789012345678901234567890
-->
Expected location of the error would be line 1, column 53.
The actual location of the error is line 1, column 44:
$ ./xmlparse colbug1.xml
colbug1.xml:1:44: xmlParseEntityRef: no name
|
|
2f9b126a
|
2014-07-26T20:29:36
|
|
typo in error messages "colon are forbidden from..."
For https://bugzilla.gnome.org/show_bug.cgi?id=731511
Pointed out by Vincent Lefevre
|
|
c836ba66
|
2014-07-14T16:39:50
|
|
Fix a potential NULL dereference
For https://bugzilla.gnome.org/show_bug.cgi?id=733040
xmlDictLookup() may return NULL in case of allocation error,
though very unlikely it needs to be checked.
|
|
dd8367da
|
2014-06-11T16:54:32
|
|
Fix regressions introduced by CVE-2014-0191 patch
A number of issues have been raised after the fix, and this patch
tries to correct all of them, though most were related to
postvalidation.
https://bugzilla.gnome.org/show_bug.cgi?id=730290
and other reports on list, off-list and on Red Hat bugzilla
|
|
9cd1c3cf
|
2014-04-22T15:30:56
|
|
Do not fetch external parameter entities
Unless explicitly asked for when validating or replacing entities
with their value. Problem pointed out by Daniel Berrange <berrange@redhat.com>
|
|
6faa126f
|
2014-03-21T17:05:51
|
|
Fix xmlParseInNodeContext() if node is not element
We really need to have ctxt->instate == XML_PARSER_CONTENT when
jumping in content parsing
Bug reported by Frank Gross
|
|
190a0b89
|
2014-02-06T10:58:17
|
|
Fix a portability issue on Windows
Apparently an overflow when comparing macro and unsigned long
|
|
054c716e
|
2014-01-26T15:02:25
|
|
Missing initialization for the catalog module
|
|
4e1476c5
|
2013-12-09T15:23:40
|
|
adding init calls to xml and html Read parsing entry points
As pointed out by "Tassyns, Bram <BramT@enfocus.com>" on the list
some call had it other didn't, clean it up and add to all missing
ones
|
|
9a85d40c
|
2013-11-29T23:26:25
|
|
Fix incorrect spelling entites->entities
Partially, a follow-up of 81d7a8245cf9a31a49499a5a195c2b89e6f91180.
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
|
|
dcc19503
|
2013-05-22T22:56:45
|
|
Fix a parsing bug on non-ascii element and CR/LF usage
https://bugzilla.gnome.org/show_bug.cgi?id=698550
Somehow the behaviour of the internal parser routine changed
slightly when encountering CR/LF, which led to a bug when
parsing document with non-ascii Names
|
|
63588f47
|
2013-05-10T14:01:46
|
|
Fix a regression in xmlGetDocCompressMode()
The switch to xzlib had for consequence that the compression
level of the input was not gathered anymore in ctxt->input->buf,
then the parser compression flags was left to -1 and propagated
to the resulting document.
Fix the I/O layer to get compression detection in xzlib,
then carry it in the input buffer and the resulting document
This should fix
https://lsbbugs.linuxfoundation.org/show_bug.cgi?id=3456
|
|
d4a5d981
|
2013-04-30T17:45:36
|
|
Cast encoding name to char pointer to match arg type
|
|
704d8c5e
|
2013-04-23T13:02:11
|
|
Fix an error in xmlCleanupParser
https://bugzilla.gnome.org/show_bug.cgi?id=698582
xmlCleanupParser calls xmlCleanupGlobals() and then
xmlResetLastError() but the latter reallocates the global
data freed by previous call. Just swap the two calls.
|
|
9ca816b3
|
2013-04-16T22:00:13
|
|
Fix a couple of return without value
Error introduced in previous commit !
|
|
e50ba816
|
2013-04-11T15:54:51
|
|
Improve handling of xmlStopParser()
Add a specific parser error
Try to stop parsing as quickly as possible
|
|
cff2546f
|
2013-03-11T15:57:55
|
|
Cache presence of '<' in entities content
slightly modify how ent->checked is used, and use the lowest bit to
keep the information
|
|
a3f1e3e5
|
2013-03-11T13:57:53
|
|
Avoid extra processing on entities
If an entity has already been checked for correctness no
need to check it on every reference
|
|
23f05e0c
|
2013-02-19T10:21:49
|
|
Detect excessive entities expansion upon replacement
If entities expansion in the XML parser is asked for,
it is possible to craft relatively small input document leading
to excessive on-the-fly content generation.
This patch accounts for those replacement and stop parsing
after a given threshold. it can be bypassed as usual with the
HUGE parser option.
|
|
bf058dce
|
2013-02-13T18:19:42
|
|
Fix the flushing out of raw buffers on encoding conversions
https://bugzilla.gnome.org/show_bug.cgi?id=692915
the new set of converting functions tried to limit the encoding
conversion of the raw buffer to the consumption one to work in
a more progressive fashion. Unfortunately this was bad for
performances and led to errors on progressive parsing when
a very large chunk was close to the end of the document. Fix
the new internal function and switch back to the old way of
converting. Fix another bug in the process.
|
|
de0cc20c
|
2013-02-12T16:55:34
|
|
Fix some buffer conversion issues
https://bugzilla.gnome.org/show_bug.cgi?id=690202
Buffer overflow errors originating from xmlBufGetInputBase in 2.9.0
The pointers from the context input were not properly reset after
that call which can do reallocations.
|
|
9c8eaabe
|
2013-01-04T12:41:53
|
|
Fix compiler warning after 153cf15905cf4ec080612ada6703757d10caba1e
Add missing cast for xmlNop to silence a compiler warning.
|
|
cf8f0424
|
2012-12-21T11:13:31
|
|
Fix an error in the progressive DTD parsing code
For https://bugzilla.gnome.org/show_bug.cgi?id=689958
We were looking for the wrong character in the input stream
|
|
fb27e2cd
|
2012-09-28T08:59:33
|
|
Fix spelling of "length".
|
|
6a36fbe3
|
2012-10-29T10:39:55
|
|
Fix potential out of bound access
|
|
153cf159
|
2012-10-26T13:50:47
|
|
Fix large parse of file from memory
https://bugzilla.redhat.com/show_bug.cgi?id=862969
The new code trying to detect excessive input lookup would
just get wrong sometimes in the case of very large file parsed
directly from memory.
|
|
711b15d5
|
2012-10-25T19:23:26
|
|
Fix a bug in the nsclean option of the parser
Raised as a side effect of:
https://bugzilla.gnome.org/show_bug.cgi?id=663844
|
|
6c91aa38
|
2012-10-25T15:33:59
|
|
Fix a regression in 2.9.0 breaking validation while streaming
https://bugzilla.gnome.org/show_bug.cgi?id=684774
with help from Kjell Ahlstedt <kjell.ahlstedt@bredband.net>
|
|
81d7a824
|
2012-09-13T15:56:51
|
|
Fix typos in parser comments
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
|
|
f8e3db04
|
2012-09-11T13:26:36
|
|
Big space and tab cleanup
Remove all space before tabs and space and tabs at end of lines.
|
|
28f5e1a2
|
2012-09-04T11:18:39
|
|
Fix potential crash on entities errors
Related to https://bugs.launchpad.net/lxml/+bug/502959
Basically the core of the issue is that if an entity references another
entity, then in case we are replacing entities content, we should always
do so by copying the referenced content as long as the reference is
done within the entity. Otherwise, if for some reason there is a later
parsing error that entity content may be freed.
Complex scenario exposed by command:
thinkpad:~/XML/diveintopython-5.4/xml -> valgrind --db-attach=yes
../../xmllint --loaddtd --noout --noent diveintopython.xml
Document references &a;
a references &b;
we references b content directly in by linking in the a content
a has an error further down
we free a, freeing the chunk from b
Document references &b; after &a;
we try to copy b content, but it was freed already => segfault
* parser.c: never reference directly entity content without copying if
we aren't in the document main entity
|
|
1f972e9f
|
2012-08-15T10:16:37
|
|
Cleanup some of the parser code
Prefetching assumptions about the amount of data read in GROW
should be backed up with test for 0 termination when at the
end of the buffer.
|
|
968a03a2
|
2012-08-13T12:41:33
|
|
Add support for big line numbers in error reporting
Fix the lack of line number as reported by Johan Corveleyn <jcorvel@gmail.com>
* parser.c include/libxml/parser.h: add an XML_PARSE_BIG_LINES parser
option not switch on by default, it's an opt-in
* SAX2.c: if XML_PARSE_BIG_LINES is set store the long line numbers
in the psvi field of text nodes
* tree.c: expand xmlGetLineNo to extract those informations, also
make sure we can't fail on recursive behaviour
* error.c: in __xmlRaiseError, if a node is provided, call
xmlGetLineNo() if we can't get a valid line number.
* xmllint.c: switch on XML_PARSE_BIG_LINES in xmllint
|
|
5353bbf7
|
2012-08-03T12:03:31
|
|
More fixups on the push parser behaviour
|
|
2b52aa00
|
2012-07-31T10:53:47
|
|
Strengthen behaviour of the push parser in problematic situations
Implement the maximum lookahead strategy, and fix some handling
of DTD to speed up processing.
|
|
e7bf892d
|
2012-07-30T20:09:25
|
|
Improve error reporting on parser errors
The extra string was being dismissed when provided.
* parser.c: handle both cases properly
* result/: this changes a few error reports
|
|
48b4cdde
|
2012-07-30T16:16:04
|
|
Enforce XML_PARSER_EOF state handling through the parser
That condition is one raised when the parser should positively stop
processing further even to report errors. Best is to test is after
most GROW call especially within loops
|
|
0df83cae
|
2012-07-30T15:41:10
|
|
Fixup limits parser
|
|
52d8ade7
|
2012-07-30T10:08:45
|
|
Introduce some default parser limits
Those can be overridden by the XML_PARSE_HUGE option, they
are just default limits for Name length, dictionary size limits
and maximum amount of parser lookup.
* include/libxml/parserInternals.h: define the limits
* include/libxml/xmlerror.h: add a new error
* parser.c parserInternals.c: implements the new limits
|
|
f572a78d
|
2012-07-19T20:36:25
|
|
More avoid quadratic behaviour
|
|
51304816
|
2012-07-19T20:34:26
|
|
Impose a reasonable limit on PI size
Unless the XML_PARSE_HUGE option is given to the parser,
the value is XML_MAX_TEXT_LENGTH, i.e. the same as for a
text node within content.
Also cleanup some unsigned int used for memory size.
|
|
65686451
|
2012-07-19T18:25:01
|
|
Avoid quadratic behaviour in some push parsing cases
avoid rescanning over and over a very long input, just check
the incoming chunks
|
|
58f73aca
|
2012-07-19T11:58:47
|
|
Impose a reasonable limit on comment size
Unless the XML_PARSE_HUGE option is given to the parser,
the value is XML_MAX_TEXT_LENGTH, i.e. the same as for a
text node within content.
Also cleanup some unsigned int used for memory size.
|
|
e17db994
|
2012-07-19T11:25:16
|
|
Impose a reasonable limit on attribute size
Unless the XML_PARSE_HUGE option is given to the parser,
the value is XML_MAX_TEXT_LENGTH, i.e. the same as for a
text node within content.
|
|
00ac0d3b
|
2012-07-16T18:03:01
|
|
More cleanups for input/buffers code
When calling xmlParserInputBufferPush, the buffer may be reallocated
and at the input level the pointers for base, cur and end need to
be reevaluated.
* buf.c buf.h: add two new functions, one to get the base from the
input of the buffer, and another one to reset the pointers based
on the cur and base index
* HTMLparser.c parser.c: cleanup to use the new helper functions
as well as making sure size_t is used for the indexes computations
|
|
61551a1e
|
2012-07-16T16:28:47
|
|
Cleanup function xmlBufResetInput() to set input from Buffer
This was scattered in a number of modules, xmlParserInputPtr
have usually their base, cur and end pointer set from an
xmlBuf used as input.
* buf.c buf.h: add a new function implementing this setup
* parser.c HTMLparser.c catalog.c parserInternals.c xmlreader.c
use the new function instead of digging into the buffer in
all those modules
|
|
768eb3b8
|
2012-07-16T14:19:49
|
|
Convert XML parser to the new input buffers
The main changes are when the internal of the buffers structure
were addressed directly, we now use routines coming from buf.h
The routine xmlParserInputRead() which wasn't used anywhere is
deprecated too.
|
|
4629ee02
|
2012-07-23T14:15:40
|
|
Do not fetch external parsed entities
Unless explicitly asked for when validating or replacing entities
with their value. Problem pointed out by Tom Lane <tgl@redhat.com>
* parser.c: do not load external parsed entities unless needed
* test/errors/extparsedent.xml result/errors/extparsedent.xml*:
add a regression test to avoid change of the behaviour in the future
|
|
459eeb9d
|
2012-07-17T16:19:17
|
|
Fix parser local buffers size problems
|
|
379ebc1d
|
2012-05-18T15:41:31
|
|
Cleanup on randomization
tsan reported that rand() is not thread safe, so create
a thread safe wrapper, use rand_r() if available.
Consolidate the function, initialization and cleanup in
dict.c and make sure it is initialized in xmlInitParser()
|
|
ed35d3d7
|
2012-05-11T10:52:27
|
|
Fix an uninitialized variable use
When compiled without SAX1 support
|
|
24464be6
|
2012-05-10T16:14:55
|
|
Avoid memory leak if xmlParserInputBufferCreateIO fails
For https://bugzilla.gnome.org/show_bug.cgi?id=643949
In case of error on an IO creation input the given context
is terminated with the given close function, except if the
error happened in xmlParserInputBufferCreateIO. This can
lead to a resource leak which is fixed by this patch.
|
|
8658d27d
|
2012-05-08T16:39:05
|
|
wrong message for double hyphen in comment XML error
The error message when you have a double hyphen in a comment is "comment
not terminated" and should be "double hyphen in comment".
|
|
288bb627
|
2012-05-07T15:01:29
|
|
Fix an error in comment
nsWarn handler is not about parser fatal errors
|
|
4aa68abb
|
2012-04-02T17:50:54
|
|
Try to fix a problem with entities in SAX mode
this is a problem which hit the raptor code and that small
patch should be a reliable workaround
|
|
eae52617
|
2011-09-18T16:59:13
|
|
add lzma compression support
|
|
5bd3c061
|
2011-12-16T18:53:35
|
|
Fix an allocation error when copying entities
|
|
77404b8b
|
2011-12-14T16:18:25
|
|
Make sure the parser returns when getting a Stop order
patch backported from Chromium bug fixes, assuming author is Chris
|
|
5825ebb2
|
2011-11-10T13:50:22
|
|
Fix some potential problems on reallocation failures(parser.c)
This problem is the same as d7958b21e7f8c447a26bb2436f08402b2c308be4.
The operation "ctxt->nameMax *= 2;" should be placed after the function
call of xmlRealloc().
|
|
4c4653e5
|
2011-06-05T11:29:29
|
|
Add exception for new W3C PI xml-model
|
|
c62efc84
|
2011-05-16T16:03:50
|
|
Add options to ignore the internal encoding
For both XML and HTML, the document can provide an encoding
either in XMLDecl in XML, or as a meta element in HTML head.
This adds options to ignore those encodings if the encoding
is known in advance for example if the content had been converted
before being passed to the parser.
* parser.c include/libxml/parser.h: add XML_PARSE_IGNORE_ENC option
for XML parsing
* include/libxml/HTMLparser.h HTMLparser.c: adds the
HTML_PARSE_IGNORE_ENC for HTML parsing
* HTMLtree.c: fix the handling of saving when an unknown encoding is
defined in meta document header
* xmllint.c: add a --noenc option to activate the new parser options
|
|
c794eb5b
|
2011-02-18T12:17:17
|
|
Fix memory corruption
when xmlParseBalancedChunkMemoryInternal is called from xmlParseBalancedChunk
|
|
48f7dcb7
|
2010-11-04T17:42:42
|
|
480323 add code to plug in ICU converters by default
This is not configured in by default but after some serious massaging
incorporate that patch from Chromium/Chrome.
|
|
60587d6e
|
2010-11-04T15:16:27
|
|
606592 update language ID parser to RFC 5646
Mostly except we keep support for some older constructs and
don't implement extension or privateuse. It's messy because
it's used mostly by XSD datatype which itself reference RFC 3066
and suggests a lexical space completely different from what
5646 defines.
|
|
e6ad10a5
|
2010-11-01T11:35:14
|
|
Cleanup encoding pointer comparison
* parser.c: Compare encoding pointer with a NULL instead of
xmlCharEncoding enum value 0 then casted to char * !
|
|
e6f05099
|
2010-10-15T19:50:03
|
|
Fix a potential segfault due to weak symbols on pthreads
In xmlInitParser, both __xmlGlobalInitMutexLock and xmlInitGlobals are
called before xmlInitThreads, and both use pthread symbols.
__xmlGlobalInitMutexLock does so directly, without checking if the symbol
exists, and xmlInitGlobals calls xmlNewMutex, which correctly depends on
libxml_is_threaded... except libxml_is_threaded is still -1 by then...
And again, when releasing the global mutex in __xmlGlobalInitMutexUnlock,
the pthread function is called directly.
The patch changes the initialization order and make sure the functions
are available before calling them
|