parser.c


Log

Author Commit Date CI Message
David Kilzer 4472c3a5 2016-05-13T15:13:17 Fix some format string warnings with possible format string vulnerability For https://bugzilla.gnome.org/show_bug.cgi?id=761029 Decorate every method in libxml2 with the appropriate LIBXML_ATTR_FORMAT(fmt,args) macro and add some cleanups following the reports.
Daniel Veillard b1d34de4 2016-03-14T17:19:44 Fix inappropriate fetch of entities content For https://bugzilla.gnome.org/show_bug.cgi?id=761430 libfuzzer regression testing exposed another case where the parser would fetch content of an external entity while not in validating mode. Plug that hole
Pranjal Jumde 45752d2c 2016-03-03T11:50:34 Bug 759398: Heap use-after-free in xmlDictComputeFastKey <https://bugzilla.gnome.org/show_bug.cgi?id=759398> * parser.c: (xmlParseNCNameComplex): Store start position instead of a pointer to the name since the underlying buffer may change, resulting in a stale pointer being used. * result/errors/759398.xml: Added. * result/errors/759398.xml.err: Added. * result/errors/759398.xml.str: Added. * test/errors/759398.xml: Added test case.
David Kilzer db07dd61 2016-02-12T09:58:29 Bug 758588: Heap-based buffer overread in xmlParserPrintFileContextInternal <https://bugzilla.gnome.org/show_bug.cgi?id=758588> * parser.c: (xmlParseEndTag2): Add bounds checks before dereferencing ctxt->input->cur past the end of the buffer, or incrementing the pointer past the end of the buffer. * result/errors/758588.xml: Add test result. * result/errors/758588.xml.err: Ditto. * result/errors/758588.xml.str: Ditto. * test/errors/758588.xml: Add regression test.
Peter Simons 8f30bdff 2016-04-15T11:56:55 Add missing increments of recursion depth counter to XML parser. For https://bugzilla.gnome.org/show_bug.cgi?id=765207 CVE-2016-3705 The functions xmlParserEntityCheck() and xmlParseAttValueComplex() used to call xmlStringDecodeEntities() in a recursive context without incrementing the 'depth' counter in the parser context. Because of that omission, the parser failed to detect attribute recursions in certain documents before running out of stack space.
Jan Pokorný bb654feb 2016-04-13T16:56:07 Fix typos: dictio{ nn -> n }ar{y,ies} Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
David Kilzer 4f8606c1 2016-01-05T13:38:09 Bug 760183: REGRESSION (v2.9.3): XML push parser fails with bogus UTF-8 encoding error when multi-byte character in large CDATA section is split across buffer <https://bugzilla.gnome.org/show_bug.cgi?id=760183> * parser.c: (xmlCheckCdataPush): Add 'complete' argument to describe whether the buffer passed in is the whole CDATA buffer, or if there is more data to parse. If there is more data to parse, don't return a negative value for an invalid multi-byte UTF-8 character that is split between buffers. (xmlParseTryOrFinish): Pass 'complete' argument to xmlCheckCdataPush() as appropriate. * result/cdata-2-byte-UTF-8.xml: Added. * result/cdata-2-byte-UTF-8.xml.rde: Added. * result/cdata-2-byte-UTF-8.xml.rdr: Added. * result/cdata-2-byte-UTF-8.xml.sax: Added. * result/cdata-2-byte-UTF-8.xml.sax2: Added. * result/cdata-3-byte-UTF-8.xml: Added. * result/cdata-3-byte-UTF-8.xml.rde: Added. * result/cdata-3-byte-UTF-8.xml.rdr: Added. * result/cdata-3-byte-UTF-8.xml.sax: Added. * result/cdata-3-byte-UTF-8.xml.sax2: Added. * result/cdata-4-byte-UTF-8.xml: Added. * result/cdata-4-byte-UTF-8.xml.rde: Added. * result/cdata-4-byte-UTF-8.xml.rdr: Added. * result/cdata-4-byte-UTF-8.xml.sax: Added. * result/cdata-4-byte-UTF-8.xml.sax2: Added. * result/noent/cdata-2-byte-UTF-8.xml: Added. * result/noent/cdata-3-byte-UTF-8.xml: Added. * result/noent/cdata-4-byte-UTF-8.xml: Added. * test/cdata-2-byte-UTF-8.xml: Added. * test/cdata-3-byte-UTF-8.xml: Added. * test/cdata-4-byte-UTF-8.xml: Added. - Add tests and results. Only 'make Readertests XMLPushtests' fails prior to the fix.
Daniel Veillard a7a94612 2016-02-09T12:55:29 Heap-based buffer overread in xmlNextChar For https://bugzilla.gnome.org/show_bug.cgi?id=759671 when the end of the internal subset isn't properly detected xmlParseInternalSubset should just return instead of trying to process input further.
Daniel Veillard f1063fdb 2015-11-20T16:06:59 CVE-2015-7500 Fix memory access error due to incorrect entities boundaries For https://bugzilla.gnome.org/show_bug.cgi?id=756525 handle properly the case where we popped out of the current entity while processing a start tag Reported by Kostya Serebryany @ Google This slightly modifies the output of 754946 in regression tests
Daniel Veillard 3bd6ae14 2015-11-20T15:06:02 Fix some loop issues embedding NEXT Next can switch the parser back to XML_PARSER_EOF state, we need to consider those in loops consuming input
Daniel Veillard 35bcb1d7 2015-11-20T15:04:09 Detect incoherency on GROW the current pointer to the input has to be between the base and end if not stop everything we have an internal state error.
Daniel Veillard e3b15974 2015-11-20T14:59:30 Reuse xmlHaltParser() where it makes sense Unify the various place where either xmlStopParser was called (which resets the error as a side effect) and places where we used ctxt->instate = XML_PARSER_EOF to stop further processing
Daniel Veillard 28cd9cb7 2015-11-20T14:55:30 Add xmlHaltParser() to stop the parser The problem is doing it in a consistent and safe fashion It's more complex than just setting ctxt->instate = XML_PARSER_EOF Update the public function to reuse that new internal routine
David Drysdale 69030714 2015-11-20T11:13:45 CVE-2015-5312 Another entity expansion issue For https://bugzilla.gnome.org/show_bug.cgi?id=756733 It is one case where the code in place to detect entities expansions failed to exit when the situation was detected, leading to DoS Problem reported by Kostya Serebryany @ Google Patch provided by David Drysdale @ Google
Daniel Veillard 53ac9c96 2015-11-09T18:16:00 xmlStopParser reset errNo I had used it in contexts where that information ought to be preserved
Daniel Veillard afd27c21 2015-11-09T18:07:18 Avoid processing entities after encoding conversion failures For https://bugzilla.gnome.org/show_bug.cgi?id=756527 and was also raised by Chromium team in the past When we hit a convwersion failure when switching encoding it is bestter to stop parsing there, this was treated as a fatal error but the parser was continuing to process to extract more errors, unfortunately that makes little sense as the data is obviously corrupt and can potentially lead to unexpected behaviour.
Hugh Davenport ab2b9a93 2015-11-03T20:40:49 Avoid extra processing of MarkupDecl when EOF For https://bugzilla.gnome.org/show_bug.cgi?id=756263 One place where ctxt->instate == XML_PARSER_EOF whic was set up by entity detection issues doesn't get noticed, and even overrided
Daniel Veillard 41ac9049 2015-10-27T10:53:44 Fix an error in previous Conditional section patch an off by one mistake in the change, led to error on correct document where the end of the included entity was exactly the end of the conditional section, leading to regtest failure
Daniel Veillard bd0526e6 2015-10-23T19:02:28 Another variation of overflow in Conditional sections Which happen after the previous fix to https://bugzilla.gnome.org/show_bug.cgi?id=756456 But stopping the parser and exiting we didn't pop the intermediary entities and doing the SKIP there applies on an input which may be too small
Gaurav Gupta cf77e605 2015-09-30T14:46:29 Add missing Null check in xmlParseExternalEntityPrivate For https://bugzilla.gnome.org/show_bug.cgi?id=755857 a case where we check for NULL but not everywhere
Daniel Veillard 4a5d80ad 2015-09-18T15:06:46 Fix a bug in CData error handling in the push parser For https://bugzilla.gnome.org/show_bug.cgi?id=754947 The checking function was returning incorrect args in some cases Adds the test to teh reg suite and fix one of the existing test output
Daniel Veillard 51f02b0a 2015-09-15T16:50:32 Fix a bug on name parsing at the end of current input buffer For https://bugzilla.gnome.org/show_bug.cgi?id=754946 When hitting the end of the current input buffer while parsing a name we could end up loosing the beginning of the name, which led to various issues.
Daniel Veillard 709a9521 2015-06-29T16:10:26 Fail parsing early on if encoding conversion failed For https://bugzilla.gnome.org/show_bug.cgi?id=751631 If we fail conversing the current input stream while processing the encoding declaration of the XMLDecl then it's safer to just abort there and not try to report further errors.
Daniel Veillard 9aa37588 2015-06-29T09:08:25 Do not process encoding values if the declaration if broken For https://bugzilla.gnome.org/show_bug.cgi?id=751603 If the string is not properly terminated do not try to convert to the given encoding.
Daniel Veillard 9b851233 2015-02-23T11:29:20 Cleanup conditional section error handling For https://bugzilla.gnome.org/show_bug.cgi?id=744980 The error handling of Conditional Section also need to be straightened as the structure of the document can't be guessed on a failure there and it's better to stop parsing as further errors are likely to be irrelevant.
Daniel Veillard a7dfab74 2015-02-23T11:17:35 Stop parsing on entities boundaries errors For https://bugzilla.gnome.org/show_bug.cgi?id=744980 There are times, like on unterminated entities that it's preferable to stop parsing, even if that means less error reporting. Entities are feeding the parser on further processing, and if they are ill defined then it's possible to get the parser to bug. Also do the same on Conditional Sections if the input is broken, as the structure of the document can't be guessed.
Daniel Veillard 72a46a51 2014-10-23T11:35:36 Fix missing entities after CVE-2014-3660 fix For https://bugzilla.gnome.org/show_bug.cgi?id=738805 The fix for CVE-2014-3660 introduced a regression in some case where entity substitution is required and the entity is used first in anotther entity referenced from an attribute value
Daniel Veillard f65128f3 2014-10-17T17:13:41 Revert "Missing initialization for the catalog module" This reverts commit 054c716ea1bf001544127a4ab4f4346d1b9947e7. As this break xmlcatalog command https://bugzilla.redhat.com/show_bug.cgi?id=1153753
Daniel Veillard be2a7eda 2014-10-16T13:59:47 Fix for CVE-2014-3660 Issues related to the billion laugh entity expansion which happened to escape the initial set of fixes
Bart De Schuymer 500c54ef 2014-10-16T12:17:20 fix memory leak xml header encoding field with XML_PARSE_IGNORE_ENC When the xml parser encounters an xml encoding in an xml header while configured with option XML_PARSE_IGNORE_ENC, it fails to free memory allocated for storing the encoding. The patch below fixes this. How to reproduce: 1. Change doc/examples/parse4.c to add xmlCtxtUseOptions(ctxt, XML_PARSE_IGNORE_ENC); after the call to xmlCreatePushParserCtxt. 2. Rebuild 3. run the following command from the top libxml2 directory: LD_LIBRARY_PATH=.libs/ valgrind --leak-check=full ./doc/examples/.libs/parse4 ./test.xml , where test.xml contains following input: <?xml version="1.0" encoding="UTF-81" ?><hi/> valgrind will report: ==1964== 10 bytes in 1 blocks are definitely lost in loss record 1 of 1 ==1964== at 0x4C272DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==1964== by 0x4E88497: xmlParseEncName (parser.c:10224) ==1964== by 0x4E888FE: xmlParseEncodingDecl (parser.c:10295) ==1964== by 0x4E89630: xmlParseXMLDecl (parser.c:10534) ==1964== by 0x4E8B737: xmlParseTryOrFinish (parser.c:11293) ==1964== by 0x4E8E775: xmlParseChunk (parser.c:12283) Signed-off-by: Bart De Schuymer <bart at amplidata com>
Daniel Veillard 7cf57380 2014-10-08T16:09:56 Parser error on repeated recursive entity expansion containing &lt; For https://bugzilla.gnome.org/show_bug.cgi?id=736417 basically a weird side effect and a failure to properly parenthesize a boolean expression led to this bug
Dennis Filder 7e9bbdf8 2014-10-06T20:34:14 parser bug on misformed namespace attributes For https://bugzilla.gnome.org/show_bug.cgi?id=672539 Reported by Axel Miller <axel.miller@ppi.de> Consider the following start-tag: <x xmlns=""version=""> The start-tag does not conform to the rule [40] STag ::= '<' Name (S Attribute)* S? '>' since there is no whitespace in front of the attribute "version". Thus, libxml2 should reject the start-tag. But it doesn't: $ echo '<x xmlns=""version=""/>' | xmllint - <?xml version="1.0"?> <x xmlns="" version=""/> The error seems to happen only if there is a namespace declaration in front of the attribute. A missing whitespace between other attributes is handled correctly: $ echo '<x someattr=""version=""/>' | xmllint - -:1: parser error : attributes construct error <x someattr=""version=""/> ^ [...]
Juergen Keil 24fb4c32 2014-10-06T18:19:12 wrong error column in structured error when parsing end tag For https://bugzilla.gnome.org/show_bug.cgi?id=734283 libxml2 reports wrong error column numbers (field int2 in xmlError) in structured error handler, after parsing an end tag.
Juergen Keil 33f658c9 2014-08-07T17:30:36 wrong error column in structured error when parsing attribute values For https://bugzilla.gnome.org/show_bug.cgi?id=734280 libxml2 reports wrong error column numbers (field int2 in xmlError) in structured error handler, after parsing XML attribute values. Example XML: <?xml version="1.0" encoding="UTF-8"?> <root xmlns="urn:colbug">&</root> <!-- 1 2 3 4 1234567890123456789012345678901234567890 --> Expected location of the error would be line 3, column 21. The actual location of the error is line 3, column 9: $ ./xmlparse colbug2.xml colbug2.xml:3:9: xmlParseEntityRef: no name The 12 characters of the xmlns attribute value "urn:colbug" are not accounted for in the error column value.
Juergen Keil 5d4310af 2014-08-07T16:28:09 wrong error column in structured error when skipping whitespace in xml decl For https://bugzilla.gnome.org/show_bug.cgi?id=734276 libxml2 reports wrong error column numbers (field int2 in xmlError) in structured error handler, after an XML declaration containing whitespace. Example XML: <?xml version="1.0" encoding="UTF-8" ?><root>&</root> <!-- 1 2 3 4 5 6 123456789012345678901234567890123456789012345678901234567890 --> Expected location of the error would be line 1, column 53. The actual location of the error is line 1, column 44: $ ./xmlparse colbug1.xml colbug1.xml:1:44: xmlParseEntityRef: no name
Daniel Veillard 2f9b126a 2014-07-26T20:29:36 typo in error messages "colon are forbidden from..." For https://bugzilla.gnome.org/show_bug.cgi?id=731511 Pointed byt vincent Lefevre
Daniel Veillard c836ba66 2014-07-14T16:39:50 Fix a potential NULL dereference For https://bugzilla.gnome.org/show_bug.cgi?id=733040 xmlDictLookup() may return NULL in case of allocation error, though very unlikely it need to be checked.
Daniel Veillard dd8367da 2014-06-11T16:54:32 Fix regressions introduced by CVE-2014-0191 patch A number of issues have been raised after the fix, and this patch tries to correct all of them, though most were related to postvalidation. https://bugzilla.gnome.org/show_bug.cgi?id=730290 and other reports on list, off-list and on Red Hat bugzilla
Daniel Veillard 9cd1c3cf 2014-04-22T15:30:56 Do not fetch external parameter entities Unless explicitely asked for when validating or replacing entities with their value. Problem pointed out by Daniel Berrange <berrange@redhat.com>
Daniel Veillard 6faa126f 2014-03-21T17:05:51 Fix xmlParseInNodeContext() if node is not element We really need to have ctxt->instate == XML_PARSER_CONTENT when jumping in content parsing Bug reported by Frank Gross
Longstreth Jon 190a0b89 2014-02-06T10:58:17 Fix a portability issue on Windows Apparently an verflow when comparing macro and unsigned long
Daniel Veillard 054c716e 2014-01-26T15:02:25 Missing initialization for the catalog module
Daniel Veillard 4e1476c5 2013-12-09T15:23:40 adding init calls to xml and html Read parsing entry points As pointed out by "Tassyns, Bram <BramT@enfocus.com>" on the list some call had it other didn't, clean it up and add to all missing ones
Jan Pokorný 9a85d40c 2013-11-29T23:26:25 Fix incorrect spelling entites->entities Partially, a follow-up of 81d7a8245cf9a31a49499a5a195c2b89e6f91180. Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
Daniel Veillard dcc19503 2013-05-22T22:56:45 Fix a parsing bug on non-ascii element and CR/LF usage https://bugzilla.gnome.org/show_bug.cgi?id=698550 Somehow the behaviour of the internal parser routine changed slightly when encountering CR/LF, which led to a bug when parsing document with non-ascii Names
Daniel Veillard 63588f47 2013-05-10T14:01:46 Fix a regression in xmlGetDocCompressMode() The switch to xzlib had for consequence that the compression level of the input was not gathered anymore in ctxt->input->buf, then the parser compression flags was left to -1 and propagated to the resulting document. Fix the I/O layer to get compression detection in xzlib, then carry it in the input buffer and the resulting document This should fix https://lsbbugs.linuxfoundation.org/show_bug.cgi?id=3456
Nikolay Sivov d4a5d981 2013-04-30T17:45:36 Cast encoding name to char pointer to match arg type
Alexander Pastukhov 704d8c5e 2013-04-23T13:02:11 Fix an error in xmlCleanupParser https://bugzilla.gnome.org/show_bug.cgi?id=698582 xmlCleanupParser calls xmlCleanupGlobals() and then xmlResetLastError() but the later reallocate the global data freed by previous call. Just swap the two calls.
Jüri Aedla 9ca816b3 2013-04-16T22:00:13 Fix a couple of return without value Error introduced in previous commit !
Daniel Veillard e50ba816 2013-04-11T15:54:51 Improve handling of xmlStopParser() Add a specific parser error Try to stop parsing as quickly as possible
Daniel Veillard cff2546f 2013-03-11T15:57:55 Cache presence of '<' in entities content slightly modify how ent->checked is used, and use the lowest bit to keep the information
Daniel Veillard a3f1e3e5 2013-03-11T13:57:53 Avoid extra processing on entities If an entity has already been checked for correctness no need to check it on every reference
Daniel Veillard 23f05e0c 2013-02-19T10:21:49 Detect excessive entities expansion upon replacement If entities expansion in the XML parser is asked for, it is possble to craft relatively small input document leading to excessive on-the-fly content generation. This patch accounts for those replacement and stop parsing after a given threshold. it can be bypassed as usual with the HUGE parser option.
Daniel Veillard bf058dce 2013-02-13T18:19:42 Fix the flushing out of raw buffers on encoding conversions https://bugzilla.gnome.org/show_bug.cgi?id=692915 the new set of converting functions tried to limit the encoding conversion of the raw buffer to the consumption one to work in a more progressive fashion. Unfortunately this was bad for performances and led to errors on progressive parsing when a very large chunk was close to the end of the document. Fix the new internal function and switch back to the old way of converting. Fix another bug in the process.
Daniel Veillard de0cc20c 2013-02-12T16:55:34 Fix some buffer conversion issues https://bugzilla.gnome.org/show_bug.cgi?id=690202 Buffer overflow errors originating from xmlBufGetInputBase in 2.9.0 The pointers from the context input were not properly reset after that call which can do reallocations.
Patrick Gansterer 9c8eaabe 2013-01-04T12:41:53 Fix compiler warning after 153cf15905cf4ec080612ada6703757d10caba1e Add missing cast for xmlNop to silence a compiler warning.
Dan Winship cf8f0424 2012-12-21T11:13:31 Fix an error in the progressive DTD parsing code For https://bugzilla.gnome.org/show_bug.cgi?id=689958 We were looking for the wrong character in the input stream
Michael Wood fb27e2cd 2012-09-28T08:59:33 Fix spelling of "length".
Daniel Veillard 6a36fbe3 2012-10-29T10:39:55 Fix potential out of bound access
Daniel Veillard 153cf159 2012-10-26T13:50:47 Fix large parse of file from memory https://bugzilla.redhat.com/show_bug.cgi?id=862969 The new code trying to detect excessive input lookup would just get wrong sometimes in the case of very large file parsed directly from memory.
Daniel Veillard 711b15d5 2012-10-25T19:23:26 Fix a bug in the nsclean option of the parser Raised as a side effect of: https://bugzilla.gnome.org/show_bug.cgi?id=663844
Daniel Veillard 6c91aa38 2012-10-25T15:33:59 Fix a regression in 2.9.0 breaking validation while streaming https://bugzilla.gnome.org/show_bug.cgi?id=684774 with help from Kjell Ahlstedt <kjell.ahlstedt@bredband.net>
Jan Pokorný 81d7a824 2012-09-13T15:56:51 Fix typos in parser comments Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
Daniel Veillard f8e3db04 2012-09-11T13:26:36 Big space and tab cleanup Remove all space before tabs and space and tabs at end of lines.
Daniel Veillard 28f5e1a2 2012-09-04T11:18:39 Fix potential crash on entities errors Related to https://bugs.launchpad.net/lxml/+bug/502959 Basically the core of the issue is that if an entity references another entity, then in case we are replacing entities content, we should always do so by copying the referenced content as long as the reference is done within the entity. Otherwise, if for some reason there is a later parsing error that entity content may be freed. Complex scenario exposed by command: thinkpad:~/XML/diveintopython-5.4/xml -> valgrind --db-attach=yes ../../xmllint --loaddtd --noout --noent diveintopython.xml Document references &a; a references &b; we references b content directly in by linking in the a content a has an error further down we free a, freeing the chunk from b Document references &b; after &a; we try to copy b content, but it was freed already => segfault * parser.c: never reference directly entity content without copying if we aren't in the document main entity
Daniel Veillard 1f972e9f 2012-08-15T10:16:37 Cleanup some of the parser code Prefetching assumptions about the amount of data read in GROW should be backed up with test for 0 termination when at the end of the buffer.
Daniel Veillard 968a03a2 2012-08-13T12:41:33 Add support for big line numbers in error reporting Fix the lack of line number as reported by Johan Corveleyn <jcorvel@gmail.com> * parser.c include/libxml/parser.h: add an XML_PARSE_BIG_LINES parser option not switch on by default, it's an opt-in * SAX2.c: if XML_PARSE_BIG_LINES is set store the long line numbers in the psvi field of text nodes * tree.c: expand xmlGetLineNo to extract those informations, also make sure we can't fail on recursive behaviour * error.c: in __xmlRaiseError, if a node is provided, call xmlGetLineNo() if we can't get a valid line number. * xmllint.c: switch on XML_PARSE_BIG_LINES in xmllint
Daniel Veillard 5353bbf7 2012-08-03T12:03:31 More fixups on the push parser behaviour
Daniel Veillard 2b52aa00 2012-07-31T10:53:47 Strengthen behaviour of the push parser in problematic situations Implement the maximum lookahead stategy, and fix some handling of DTD to speed up processing.
Daniel Veillard e7bf892d 2012-07-30T20:09:25 Improve error reporting on parser errors The extra string was being dismissed when provided. * parser.c: handle bot case properly * result/: this changes a few error reports
Daniel Veillard 48b4cdde 2012-07-30T16:16:04 Enforce XML_PARSER_EOF state handling through the parser That condition is one raised when the parser should positively stop processing further even to report errors. Best is to test is after most GROW call especially within loops
Daniel Veillard 0df83cae 2012-07-30T15:41:10 Fixup limits parser
Daniel Veillard 52d8ade7 2012-07-30T10:08:45 Introduce some default parser limits Those can be overrided by the XML_PARSE_HUGE option, they are just default limits for Name lenght, dictionary size limits and maximum amount of parser lookup. * include/libxml/parserInternals.h: define the limits * include/libxml/xmlerror.h: add a new error * parser.c parserInternals.c: implements the new limits
Daniel Veillard f572a78d 2012-07-19T20:36:25 More avoid quadratic behaviour
Daniel Veillard 51304816 2012-07-19T20:34:26 Impose a reasonable limit on PI size Unless the XML_PARSE_HUGE option is given to the parser, the value is XML_MAX_TEXT_LENGTH, i.e. the same than for a text node within content. Also cleanup some unsigned int used for memory size.
Daniel Veillard 65686451 2012-07-19T18:25:01 Avoid quadratic behaviour in some push parsing cases avoid rescanning over and over a very long input, just check the incoming chunks
Daniel Veillard 58f73aca 2012-07-19T11:58:47 Impose a reasonable limit on comment size Unless the XML_PARSE_HUGE option is given to the parser, the value is XML_MAX_TEXT_LENGTH, i.e. the same than for a text node within content. Also cleanup some unsigned int used for memory size.
Daniel Veillard e17db994 2012-07-19T11:25:16 Impose a reasonable limit on attribute size Unless the XML_PARSE_HUGE option is given to the parser, the value is XML_MAX_TEXT_LENGTH, i.e. the same than for a text node within content.
Daniel Veillard 00ac0d3b 2012-07-16T18:03:01 More cleanups for input/buffers code When calling xmlParserInputBufferPush, the buffer may be reallocated and at the input level the pointers for base, cur and end need to be reevaluated. * buf.c buf.h: add two new functions, one to get the base from the input of the buffer, and another one to reset the pointers based on the cur and base inded * HTMLparser.c parser.c: cleanup to use the new helper functions as well as making sure size_t is used for the indexes computations
Daniel Veillard 61551a1e 2012-07-16T16:28:47 Cleanup function xmlBufResetInput() to set input from Buffer This was scattered in a number of modules, xmlParserInputPtr have usually their base, cur and end pointer set from an xmlBuf used as input. * buf.c buf.h: add a new function implementing this setup * parser.c HTMLparser.c catalog.c parserInternals.c xmlreader.c use the new function instead of digging into the buffer in all those modules
Daniel Veillard 768eb3b8 2012-07-16T14:19:49 Convert XML parser to the new input buffers The main changes are when the internal of the buffers structure were adressed directly, we now use routines coming from buf.h The routine xmlParserInputRead() which wasn't used anywhere is deprecated too.
Daniel Veillard 4629ee02 2012-07-23T14:15:40 Do not fetch external parsed entities Unless explicietely asked for when validating or replacing entities with their value. Problem pointed out by Tom Lane <tgl@redhat.com> * parser.c: do not load external parsed entities unless needed * test/errors/extparsedent.xml result/errors/extparsedent.xml*: add a regression test to avoid change of the behaviour in the future
Daniel Veillard 459eeb9d 2012-07-17T16:19:17 Fix parser local buffers size problems
Daniel Veillard 379ebc1d 2012-05-18T15:41:31 Cleanup on randomization tsan reported that rand() is not thread safe, so create a thread safe wrapper, use rand_r() if available. Consolidate the function, initialization and cleanup in dict.c and make sure it is initialized in xmlInitParser()
Daniel Veillard ed35d3d7 2012-05-11T10:52:27 Fix an uninitialized variable use When compiled without SAX1 support
Lin Yi-Li 24464be6 2012-05-10T16:14:55 Avoid memory leak if xmlParserInputBufferCreateIO fails For https://bugzilla.gnome.org/show_bug.cgi?id=643949 In case of error on an IO creation input the given context is terminated with the given close function, except if the error happened in xmlParserInputBufferCreateIO. This can lead to a resource leak which is fixed by this patch.
Bryan Henderson 8658d27d 2012-05-08T16:39:05 wrong message for double hyphen in comment XML error The error message when you have a double hyphen in a comment is "comment not terminated" and should be "double hyphen in comment".
Daniel Veillard 288bb627 2012-05-07T15:01:29 Fix an error in comment nsWarn handler is not about parser fatal errors
Daniel Veillard 4aa68abb 2012-04-02T17:50:54 Try to fix a problem with entities in SAX mode this is a problem which hit the raptor code and that small patch should be a reliable workaround
Anders F Bjorklund eae52617 2011-09-18T16:59:13 add lzma compression support
Daniel Veillard 5bd3c061 2011-12-16T18:53:35 Fix an allocation error when copying entities
Chris Evans 77404b8b 2011-12-14T16:18:25 Make sure the parser returns when getting a Stop order patch backported from chromiun bug fixes, assuming author is Chris
Xia Xinfeng 5825ebb2 2011-11-10T13:50:22 Fix some potential problems on reallocation failures(parser.c) This problem is the same as d7958b21e7f8c447a26bb2436f08402b2c308be4. The operation "ctxt->nameMax * = 2;" should be placed after the function call of xmlRealloc().
Daniel Veillard 4c4653e5 2011-06-05T11:29:29 Add exception for new W3C PI xml-model
Daniel Veillard c62efc84 2011-05-16T16:03:50 Add options to ignore the internal encoding For both XML and HTML, the document can provide an encoding either in XMLDecl in XML, or as a meta element in HTML head. This adds options to ignore those encodings if the encoding is known in advace for example if the content had been converted before being passed to the parser. * parser.c include/libxml/parser.h: add XML_PARSE_IGNORE_ENC option for XML parsing * include/libxml/HTMLparser.h HTMLparser.c: adds the HTML_PARSE_IGNORE_ENC for HTML parsing * HTMLtree.c: fix the handling of saving when an unknown encoding is defined in meta document header * xmllint.c: add a --noenc option to activate the new parser options
Rob Richards c794eb5b 2011-02-18T12:17:17 Fix memory corruption when xmlParseBalancedChunkMemoryInternal is called from xmlParseBalancedChunk
Giuseppe Iuculano 48f7dcb7 2010-11-04T17:42:42 480323 add code to plug in ICU converters by default This is not configured in by default but after some serious massaging incorporate that patch from Chromium/Chrome.
Daniel Veillard 60587d6e 2010-11-04T15:16:27 606592 update language ID parser to RFC 5646 Mostly except we keep support for some older constructs and don't implement extension or privateuse. It's messy because it's used mostly by XSD datatype which itself reference RFC 3066 and suggests a lexical space completely different from what 5646 defines.
Nikolay Sivov e6ad10a5 2010-11-01T11:35:14 Cleanup encoding pointer comparison * parser.c: Compare encoding pointer with a NULL instead of xmlCharEncoding enum value 0 then casted to char * !
Mike Hommey e6f05099 2010-10-15T19:50:03 Fix a potential segfault due to weak symbols on pthreads In xmlInitParser, both __xmlGlobalInitMutexLock and xmlInitGlobals are called before xmlInitThreads, and both use pthread symbols. __xmlGlobalInitMutexLock does so directly, without checking if the symbol exists, and xmlInitGlobals calls xmlNewMutex, which correctly depends on libxml_is_threaded... except libxml_is_threaded is still -1 by then... And again, when releasing the global mutex in __xmlGlobalInitMutexUnlock, the pthread function is called directly. The patch changes the initialization order and make sure the functions are available before calling them