include/libxml


Log

Author Commit Date CI Message
Nick Wellnhofer 1fe38530 2020-12-16T15:27:13 Remove temporary members from struct _xmlXPathContext These values are hardcoded now and the struct members, while public, were recently introduced and never part of an official release.
Nick Wellnhofer acdc2ff3 2020-06-04T23:02:08 Simplify xmlexports.h All the compiler switches essentially set the same macros. The only exception was MSVC which omitted the "extern" keyword for exported variables. This in turn broke clang-cl. This commit rewrites and simplifies the whole header. Closes #163.
Nick Wellnhofer 438e595a 2020-08-09T14:43:53 Stop counting nbChars in parser context The value was inaccurate and never used.
Nick Wellnhofer 20c60886 2020-03-08T17:19:42 Fix typos Resolves #133.
Nick Wellnhofer c2e09f44 2020-02-11T11:32:23 Add xmlPopOutputCallbacks Add function to pop a single set of output callbacks from the stack. This was only implemented for input callbacks before. Fixes #135.
Nick Wellnhofer 74a8a91f 2019-09-30T17:58:59 Fix a few more typos ("fonction")
Jared Yanovich 2a350ee9 2019-09-30T17:04:54 Large batch of typo fixes Closes #109.
Nick Wellnhofer d56184a0 2019-09-26T12:11:39 Disable xmlExp regex code This is apparently another regex engine that was never used, see commit 81a8ec6.
Nick Wellnhofer 37189c08 2019-07-08T12:18:24 dict.h: gcc 2.95 doesn't allow multiple storage classes This is a partial revert of commit c71f9305. I'm not sure what issue this commit was trying to solve but it seems to be related to a circular dependency. It might be related to tree.h being included from dict.h which is unnecessary. Resolves !22.
Nick Wellnhofer 2d97a97a 2019-03-15T16:27:58 Optional recursion limit when parsing XPath expressions Useful to avoid call stack overflows when fuzzing. Note that parsing a parenthesized expression currently consumes more than 10 stack frames, so this limit should be set rather low.
Nick Wellnhofer 64115ed6 2019-03-18T11:34:26 Optional recursion limit when evaluating XPath expressions Useful to avoid call stack overflows when fuzzing.
Nick Wellnhofer 852c93a2 2019-03-12T16:12:05 Optional XPath operation limit Optionally limit the maximum number of XPath operations when evaluating an expression. Useful to avoid timeouts when fuzzing. The following operations count towards the limit: - XPath operations - Location step iterations - Union operations Enabled by setting opLimit to a non-zero value. Note that it's the user's responsibility to reset opCount. This allows enforcing the operation limit across multiple reuses of an XPath context.
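The opLimit/opCount contract described in the entry above can be sketched in plain C. This is a minimal model, not libxml2 code: the `OpBudget` struct and `op_budget_charge` function are hypothetical names, and only the two field names and the "zero disables, caller resets" rules come from the commit message.

```c
/* Hypothetical stand-in for the two context fields named in the commit. */
typedef struct {
    unsigned long opLimit;  /* 0 disables the limit */
    unsigned long opCount;  /* running total; the caller resets it */
} OpBudget;

/* Charge `ops` operations against the budget.
 * Returns 0 while within the limit, -1 once the limit would be exceeded. */
int op_budget_charge(OpBudget *b, unsigned long ops) {
    if (b->opLimit == 0)
        return 0;                         /* limit disabled */
    if (b->opCount > b->opLimit ||
        ops > b->opLimit - b->opCount) {  /* overflow-safe comparison */
        b->opCount = b->opLimit;          /* saturate instead of wrapping */
        return -1;
    }
    b->opCount += ops;
    return 0;
}
```

Because the counter is never cleared by the model itself, reusing one `OpBudget` across several evaluations enforces a single shared limit; setting `opCount` back to 0 between evaluations gives each one a fresh budget.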
Nick Wellnhofer 9a82ae30 2019-02-28T12:18:37 Stop defining _REENTRANT on some Win32 platforms The _REENTRANT macro was defined unconditionally on some Win32 builds using the Microsoft C runtime. It shouldn't have an effect under MSVCRT and was presumably only defined because of the LIBXML_THREAD_ENABLED issue fixed with the previous commit.
Michael Haubenwallner cf68fe3d 2019-02-27T15:00:14 Always define LIBXML_THREAD_ENABLED when enabled When libxml2 is compiled with threads enabled, have the header file define LIBXML_THREAD_ENABLED even if the subsequent application by itself does not enable threads. Otherwise, the application would see the unthreaded API functions, but these are not exported (where it does make a difference, like on Win32 based platforms).
Nick Wellnhofer ee501f54 2018-10-13T15:23:35 Stop using doc->charset outside parser code doc->charset does not specify the in-memory encoding which is always UTF-8.
Michael Haubenwallner 73b2417c 2018-09-22T15:45:02 Variables need 'extern' in static lib on Cygwin While the dllimport/dllexport macros now work for Cygwin, using the static library still requires variables to be declared as 'extern'. This is a regression of c65c9e8ee07e2dab0647392c2bd1795a5bc99829, found+fixed by Bruno Haible using static libxml embedded in gettext.
Nick Wellnhofer 1dafb427 2018-09-03T15:29:50 Don't include SAX.h from globals.h SAX.h contains a legacy interface with several unprefixed symbols like `reference`, causing severe namespace pollution. The globals.h header doesn't need any of these symbols, so remove the #include.
Michael Haubenwallner c65c9e8e 2018-08-31T11:42:14 Really declare dllexport/dllimport for Cygwin Cygwin does not define _WIN32, but still requires dllexport/dllimport tags for when applications use the --disable-auto-import linker flag, probably set by the gl_WOE32_DLL autoconf macro in woe32-dll.m4 file.
Nick Wellnhofer ff628d46 2017-11-13T18:35:51 Stop including ansidecl.h This seems to be an undocumented, internal GCC header added a long time ago. I don't know why it was included, but I think it can be safely removed.
Nick Wellnhofer 4dd6d7a5 2017-11-09T17:28:00 Fix list callback signatures Make sure that all parameters and return values of list callback functions exactly match the callback function type. This is required to pass clang's Control Flow Integrity checks and to allow compilation to asm.js with Emscripten. Also change the `user` parameter type from `const void *` to `void *`.
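Calling a function through a pointer of an incompatible function type is undefined behaviour in C, which is what checks such as clang's CFI reject. The sketch below shows a callback whose signature matches the callback type exactly. The names `walker_fn`, `sum_walker` and `walk_ints` are hypothetical and only modelled on a list walker; they are not the libxml2 declarations.

```c
#include <stddef.h>

/* Hypothetical callback type, modelled on a list walker. */
typedef int (*walker_fn)(const void *data, void *user);

/* Callback whose parameters and return type match walker_fn exactly,
 * so no function-pointer cast is needed at the call site. */
int sum_walker(const void *data, void *user) {
    const int *item = data;   /* convert inside the callback instead */
    long *total = user;
    *total += *item;
    return 1;                 /* non-zero: keep walking */
}

/* Walk an int array; stop early when the callback returns 0.
 * Returns the number of items visited. */
size_t walk_ints(const int *items, size_t n, walker_fn fn, void *user) {
    size_t i;
    for (i = 0; i < n; i++) {
        if (fn(&items[i], user) == 0)
            return i + 1;
    }
    return n;
}
```

The design choice is to keep the generic `void *` parameters in the callback's own signature and do the typed conversion in its body, rather than declaring typed parameters and casting the function pointer.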
Nick Wellnhofer e03f0a19 2017-11-09T16:42:47 Fix hash callback signatures Make sure that all parameters and return values of hash callback functions exactly match the callback function type. This is required to pass clang's Control Flow Integrity checks and to allow compilation to asm.js with Emscripten. Fixes bug 784861.
Joel Hockey 0b19f236 2017-10-25T18:11:12 Fixed ICU to set flush correctly and provide pivot buffer. By always setting flush=TRUE when doing multiple reads, ICU will not correctly handle truncated utf8 chars across read boundaries. The fix is to set flush=TRUE only on final read, and to provide a pivot buffer which is maintained by libxml between calls to ucnv_convertEx.
J. Peter Mugaas 882a165a 2017-10-21T14:04:20 Fix preprocessor conditional in threads.h Make sure that the preprocessor conditions and types for xmlDllMain match exactly in threads.h and threads.c.
Nick Wellnhofer e3890546 2017-10-09T00:20:01 Fix the Windows header mess Don't include windows.h and wsockcompat.h from config.h but only when needed. Don't define _WINSOCKAPI_ manually. This was apparently done to stop windows.h from including winsock.h which is a problem if winsock2.h wasn't included first. But on MinGW, this causes compiler warnings. Define WIN32_LEAN_AND_MEAN instead which has the same effect. Always use the compiler-defined _WIN32 macro instead of WIN32.
Nick Wellnhofer 8bbe4508 2017-06-17T16:15:09 Spelling and grammar fixes Fixes bug 743172, bug 743489, bug 769632, bug 782400 and a few other misspellings.
Nick Wellnhofer 576912fa 2017-06-17T15:59:13 Make HTML parser functions take const pointers The 'cur' parameter of htmlParseDoc and htmlSAXParseDoc should be 'const xmlChar *'. Fixes bug 770650.
Nick Wellnhofer 030b1f7a 2017-06-06T15:53:42 Revert "Add an XML_PARSE_NOXXE flag to block all entities loading even local" This reverts commit 2304078555896cf1638c628f50326aeef6f0e0d0. The new flag doesn't work and the change even broke the XML_PARSE_NONET option.
Doran Moppert 23040785 2017-04-07T16:45:56 Add an XML_PARSE_NOXXE flag to block all entities loading even local For https://bugzilla.gnome.org/show_bug.cgi?id=772726 * include/libxml/parser.h: Add a new parser flag XML_PARSE_NOXXE * elfgcchack.h, xmlIO.h, xmlIO.c: associated loading routine * include/libxml/xmlerror.h: new error raised * xmllint.c: adds --noxxe flag to activate the option
David Kilzer 4472c3a5 2016-05-13T15:13:17 Fix some format string warnings with possible format string vulnerability For https://bugzilla.gnome.org/show_bug.cgi?id=761029 Decorate every method in libxml2 with the appropriate LIBXML_ATTR_FORMAT(fmt,args) macro and add some cleanups following the reports.
Patrick Monnerat c71f9305 2016-05-02T16:21:47 dict.h: Move xmlDictPtr definition before includes to allow direct inclusion.
Nick Wellnhofer 91ac664f 2016-04-26T14:47:56 Fix OOB write in xmlXPathEmptyNodeSet xmlXPathEmptyNodeSet would write a NULL pointer just beyond the end of the nodeTab array. This macro isn't used in libxml2, but in some of the math functions in libexslt where it can result in heap corruption and denial of service. Found by afl-fuzz and ASan.
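A write "just beyond the end" of an array while clearing it is typically an index that is used before it is decremented. The following is a small model of that class of bug and its fix, assuming a post-decrement versus pre-decrement mix-up; `NodeSetModel` and `node_set_empty` are hypothetical names and this is not the text of the libxml2 macro.

```c
#include <stddef.h>

/* Hypothetical model of a node set: a counted array of pointers. */
typedef struct {
    int nodeNr;       /* number of entries in use */
    int nodeMax;      /* allocated capacity of nodeTab */
    void **nodeTab;
} NodeSetModel;

/* Clear the set without touching memory past the last used slot.
 * Pre-decrement keeps the index within [0, nodeNr - 1]; writing
 * nodeTab[nodeNr--] instead would store to nodeTab[nodeNr] first,
 * one slot past the end when the array is full. */
void node_set_empty(NodeSetModel *ns) {
    while (ns->nodeNr > 0)
        ns->nodeTab[--ns->nodeNr] = NULL;
}
```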
Audric Schiltknecht cad102b8 2016-04-15T22:41:24 Do normalize string-based datatype value in RelaxNG facet checking Original patch is from Jan Pokorný <jpokorny redhat com> https://mail.gnome.org/archives/xml/2013-November/msg00028.html Improve it according to reviews and add test files.
Jan Pokorný bb654feb 2016-04-13T16:56:07 Fix typos: dictio{ nn -> n }ar{y,ies} Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
Michael Catanzaro b02a167a 2015-04-14T13:51:01 Silence clang's -Wunknown-attribute Clang doesn't have perfect feature compatibility with GCC, unfortunately. https://bugzilla.gnome.org/show_bug.cgi?id=747870
Daniel Veillard 213f1fe0 2015-04-14T17:41:48 CVE-2015-1819 Enforce the reader to run in constant memory One of the operations on the reader could resolve entities, leading to the classic expansion issue. Make sure the buffer used for xmlreader operations is bounded. Introduce a new allocation type for the buffers to this effect.
Daniel Veillard 7a72f4af 2014-10-13T16:23:24 Fix a couple of issues raised by make dist
Kurt Roeckx 95ebe53b 2014-10-13T16:06:21 Fix and add const qualifiers For https://bugzilla.gnome.org/show_bug.cgi?id=689483 It seems there are functions that do use the const qualifier for some of the arguments, but it seems that there are a lot of functions that don't use it and probably should. So I created a patch against 2.9.0 that makes as much as possible const in tree.h, and changed other files as needed. There were a lot of cases like "const xmlNodePtr node". This doesn't actually do anything, there the *pointer* is constant not the object it points to. So I changed those to "const xmlNode *node". I also removed some consts, mostly in the Copy functions, because those functions can actually modify the doc or node they copy from
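The distinction drawn in the entry above between `const xmlNodePtr node` and `const xmlNode *node` can be reproduced with a stand-in type. `Node` and `NodePtr` below are hypothetical substitutes for the libxml2 types, used only so the sketch compiles on its own.

```c
/* Hypothetical stand-ins for xmlNode and xmlNodePtr. */
typedef struct Node { int value; } Node;
typedef Node *NodePtr;

/* `const NodePtr` makes the pointer itself const; the object it
 * points to stays writable, so this compiles and modifies it. */
int bump_through_const_ptr(const NodePtr node) {
    node->value += 1;
    return node->value;
}

/* `const Node *` makes the object read-only through this pointer. */
int read_only(const Node *node) {
    /* node->value += 1;   would be rejected by the compiler */
    return node->value;
}
```

Only the second form documents, and lets the compiler check, that a function does not modify the node it receives.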
Nicolas Le Cam 77b5b464 2014-02-10T10:32:45 Legacy needs xmlSAX2StartElement() and xmlSAX2EndElement(). Fix compilation with minimum and legacy.
Patrick Monnerat 44313c0a 2013-12-12T14:59:18 Shortening lines in headers no change of semantic
Jan Pokorný 9a85d40c 2013-11-29T23:26:25 Fix incorrect spelling entites->entities Partially, a follow-up of 81d7a8245cf9a31a49499a5a195c2b89e6f91180. Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
Daniel Veillard e50ba816 2013-04-11T15:54:51 Improve handling of xmlStopParser() Add a specific parser error Try to stop parsing as quickly as possible
Alex Bligh 28876afb 2013-03-23T17:23:27 Add xmlXPathSetContextNode and xmlXPathNodeEval This patch adds xmlXPathSetContextNode and xmlXPathNodeEval, which make it easier to evaluate XPath expressions with a context node other than the document root without poking about inside the internals of the context. This patch is compile-tested only, and is my first libxml2 contribution, so please go easy. Signed-off-by: Alex Bligh <alex@alex.org.uk>
Daniel Veillard cff2546f 2013-03-11T15:57:55 Cache presence of '<' in entities content slightly modify how ent->checked is used, and use the lowest bit to keep the information
Daniel Veillard 23f05e0c 2013-02-19T10:21:49 Detect excessive entities expansion upon replacement If entity expansion in the XML parser is requested, it is possible to craft a relatively small input document leading to excessive on-the-fly content generation. This patch accounts for those replacements and stops parsing after a given threshold. It can be bypassed as usual with the HUGE parser option.
Tim Starling 0ad948ed 2012-10-29T13:41:55 Define LIBXML_THREAD_ALLOC_ENABLED via xmlversion.h Otherwise, direct calls to xmlFree() etc. from the application will use a different set of allocation functions to what was used to allocate the memory internally.
Daniel Richard bbe19451 2012-09-18T11:15:06 Windows build fixes Building 2.9.0 on MSVC7.1 was failing This is because HAVE_CONFIG_H is not #defined The patch addresses the above, adds testrecurse.exe and the standard "make check" suite of tests to the MSVC makefile, and also fixes the following (MSVC7.1) warnings: buf.c(674) : warning C4028: formal parameter 1 different from declaration libxml2\timsort.h(71) : warning C4028: formal parameter 1 different from declaration
Daniel Veillard f8e3db04 2012-09-11T13:26:36 Big space and tab cleanup Remove all space before tabs and space and tabs at end of lines.
Csaba Raduly 429d3a0a 2012-09-11T11:50:25 Allow to set the quoting character of an xmlWriter It's otherwise impossible to set the quoting character of attribute values of an xmlWriter.
Daniel Veillard 47881284 2012-09-07T14:24:50 Add a forbidden variable error number and message to XPath Related to https://bugzilla.gnome.org/show_bug.cgi?id=680938 When the XML_XPATH_NOVAR flags is being used it means that variables are forbidden, not that they are missing
Daniel Veillard 1bd45d13 2012-09-05T15:35:19 Change the XPath code to percolate allocation errors Looping 1000 times on an error stating that a nodeset has grown out of control is useless; make sure we percolate errors up to the various loops and break when errors occur.
Daniel Veillard 857104cd 2012-09-04T14:25:23 Remove all .cvsignore as they are not used anymore For https://bugzilla.gnome.org/show_bug.cgi?id=682985 suggested by Adrian Bunk <bunk@stusta.de>
Daniel Veillard 8880170e 2012-08-27T16:20:05 Fix the XPath arity check to also check the XPath stack limits For example, xmlXPathNormalizeFunction() would do CHECK_ARITY(1) and then expect valuePop(ctxt); to return an object, except now valuePop() looks at the XPath stack frames and fails, returning NULL, and we end up crashing when dereferencing the object. The real solution is to extend CHECK_ARITY() and recompile all XPath functions using it.
Daniel Veillard 82cdfc4e 2012-08-22T11:05:09 Expose xmlBufShrink in the public tree API As suggested by Andrew W. Nosenko: Proposal: expose the new xmlBufShrink() to the "public" API for compatibility with xmlBufUse(). Reason: the following scenario: 1. Read something into xmlParserInputBuffer (e.g. using xmlParserInputBufferRead()) 2. Extract content through xmlBufContent() 3. Extract content length through xmlBufUse(). Result have type 'size_t'. 4. Use this content 5. Now, you need to shrink the buffer. How to do it? Doing that through legacy xmlBufferShrink() is unsafe because it uses 'unsigned int' and the whole point of introducing the new API was handling the cases, when 'unsigned int' is not enough. Therefore, need to use the new xmlBufShrink(). But it is "private". Therefore, I propose to expose the new xmlBufShrink() in the same way, as xmlBufContent() and xmlBufUse() are exposed.
Daniel Veillard 97fa5b3c 2012-08-14T11:01:07 Fix file and line report for XSD SAX and reader streaming validation Things now work correctly at the xmllint level: thinkpad:~/XML -> xmllint --sax --noout --schema test_schema.xsd test_xml.xml test_xml.xml:72721: Schemas validity error : Element 'level1': Missing child element(s). Expected is ( level2 ). test_xml.xml fails to validate thinkpad:~/XML -> xmllint --stream --schema test_schema.xsd test_xml.xml test_xml.xml:72721: Schemas validity error : Element 'level1': Missing child element(s). Expected is ( level2 ). test_xml.xml fails to validate thinkpad:~/XML -> * error.c: fix a corner case of not reporting lines when we should * include/libxml/xmlschemas.h doc/symbols.xml: had to add new entry points to set the filename on a validation context and a locator callback used to fetch the line and file from the context * xmlschemas.c: add the new entry points xmlSchemaValidateSetFilename() and xmlSchemaValidateSetLocator(), plus make sure the error reporting routine gets the information if available. Add a locator for SAX. * xmlreader.c: add and plug a locator for readers.
Daniel Veillard 3b666224 2012-08-13T17:49:15 Fix const qualifier to definition of xmlBufferDetach For https://bugzilla.gnome.org/show_bug.cgi?id=676629 As the buffer is being modified by the call, the const doesn't make sense.
Daniel Veillard 968a03a2 2012-08-13T12:41:33 Add support for big line numbers in error reporting Fix the lack of line number as reported by Johan Corveleyn <jcorvel@gmail.com> * parser.c include/libxml/parser.h: add an XML_PARSE_BIG_LINES parser option, not switched on by default, it's an opt-in * SAX2.c: if XML_PARSE_BIG_LINES is set store the long line numbers in the psvi field of text nodes * tree.c: expand xmlGetLineNo to extract that information, also make sure we can't fail on recursive behaviour * error.c: in __xmlRaiseError, if a node is provided, call xmlGetLineNo() if we can't get a valid line number. * xmllint.c: switch on XML_PARSE_BIG_LINES in xmllint
Daniel Veillard 28cc42d0 2012-08-10T10:00:18 Regenerating docs and API files Various cleanups * configure.in: force regeneration of APIs in my environment * buf.c buf.h enc.h encoding.c include/libxml/tree.h include/libxml/xmlerror.h save.h tree.c: various comment cleanups pointed by apibuild * doc/apibuild.py: added the 3 new internal headers in the excludes * doc/libxml2-api.xml doc/libxml2-refs.xml: regenerated the API * doc/symbols.xml: listing new entry points for 2.9.0 * doc/devhelp/*: regenerated
Daniel Richard G 5706b6d8 2012-08-06T11:32:54 Various "make distcheck" and portability fixups Makefile.am: * Don't use @VAR@, use $(VAR). Autoconf's AC_SUBST provides us the Make variable, it allows overriding the value at the command line, and (notably) it avoids a Make parse error in the libxml2_la_LDFLAGS assignment when @MODULE_PLATFORM_LIBS@ is empty * Changed how the THREADS_W32 mechanism switches the build between testThreads.c and testThreadsWin32.c as appropriate; using AM_CONDITIONAL allows this to work cleanly and plays well with dependencies * testapi.c should be specified as BUILT_SOURCES * Create symlinks to the test/ and result/ subdirs so that the runtests target is usable in out-of-source-tree builds * Don't do MAKEFLAGS+=--silent as this is not portable to non-GNU Makes * Fixed incorrect find(1) syntax in the "cleanup" rule, and doing "rm -f" instead of just "rm" is good form * (DIST)CLEANFILES needed a bit more coverage to allow "make distcheck" to pass configure.in: * Need AC_PROG_LN_S to create test/ and result/ symlinks in Makefile.am * AC_LIBTOOL_WIN32_DLL and AM_PROG_LIBTOOL are obsolete; these have been superceded by LT_INIT * Don't rebuild docs by default, as this requires GNU Make (as implemented) * Check for uint32_t as some platforms don't provide it * Check for some more functions, and undefine HAVE_MMAP if we don't also HAVE_MUNMAP (one system I tested on actually needed this) * Changed THREADS_W32 from a filename insert into an Automake conditional * The "Copyright" file will not be in the current directory if builddir != srcdir doc/Makefile.am: * EXTRA_DIST cannot use wildcards when they refer to generated files; this breaks dependencies. What I did was define EXTRA_DIST_wc, which uses GNU Make $(wildcard) directives to build up a list of files, and EXTRA_DIST, as a literal expansion of EXTRA_DIST_wc. I also added a new rule, "check-extra-dist", to simplify checking that the two variables are equivalent. 
(Note that this works only when builddir == srcdir) (I can implement this differently if desired; this is just one way of doing it) * Don't define an "all" target; this steps on Automake's toes * Fixed up the "libxml2-api.xml ..." rule by using $(wildcard) for dependencies (as Make doesn't process the wildcards otherwise) and qualifying appropriate files with $(srcdir) (Note that $(srcdir) is not needed in the dependencies, thanks to VPATH, which we can count on as this is GNU-Make-only code anyway) doc/devhelp/Makefile.am: * Qualified appropriate files with $(srcdir) * Added an "uninstall-local" rule so that "make distcheck" passes doc/examples/Makefile.am: * Rather than use a wildcard that doesn't work, use a substitution that most Make programs can handle doc/examples/index.py: * Do the same here include/libxml/nanoftp.h: * Some platforms (e.g. MSVC 6) already #define INVALID_SOCKET: user@host:/cygdrive/c/Program Files/Microsoft Visual Studio/VC98/\ Include$ grep -R INVALID_SOCKET . ./WINSOCK.H:#define INVALID_SOCKET (SOCKET)(~0) ./WINSOCK2.H:#define INVALID_SOCKET (SOCKET)(~0) include/libxml/xmlversion.h.in: * Support ancient GCCs (I was actually able to build the library with 2.5 but for this bit) python/Makefile.am: * Expanded CLEANFILES to allow "make distcheck" to pass python/tests/Makefile.am: * Define CLEANFILES instead of a "clean" rule, and added tmp.xml to allow "make distcheck" to pass testRelax.c: * Use HAVE_MMAP instead of the less explicit HAVE_SYS_MMAN_H (as some systems have the header but not the function) testSchemas.c: * Use HAVE_MMAP instead of the less explicit HAVE_SYS_MMAN_H testapi.c: * Don't use putenv() if it's not available threads.c: * This fixes the following build error on Solaris 8: libtool: compile: cc -DHAVE_CONFIG_H -I. 
-I./include -I./include \ -D_REENTRANT -D__EXTENSIONS__ -D_REENTRANT -Dsparc -Xa -mt -v \ -xarch=v9 -xcrossfile -xO5 -c threads.c -KPIC -DPIC -o threads.o "threads.c", line 442: controlling expressions must have scalar type "threads.c", line 512: controlling expressions must have scalar type cc: acomp failed for threads.c *** Error code 1 trio.c: * Define isascii() if the system doesn't provide it trio.h: * The trio library's HAVE_CONFIG_H header is not the same as LibXML2's HAVE_CONFIG_H header; this change is needed to avoid a double-inclusion win32/configure.js: * Added support for the LZMA compression option win32/Makefile.{bcb,mingw,msvc}: * Added appropriate bits to support WITH_LZMA=1 * Install the header files under $(INCPREFIX)\libxml2\libxml instead of $(INCPREFIX)\libxml, to mirror the install location on Unix+Autotools xml2-config.in: * @MODULE_PLATFORM_LIBS@ (usually "-ldl") needs to be in there in order for `xml2-config --libs` to provide a complete set of dependencies xmllint.c: * Use HAVE_MMAP instead of the less-explicit HAVE_SYS_MMAN_H
Daniel Veillard e258adec 2012-08-06T11:16:30 Provide new accessors for xmlOutputBuffer To avoid digging into the buf->buffer internal structure, the two new entry points xmlOutputBufferGetContent() and xmlOutputBufferGetSize() should make the code cleaner. * include/libxml/xmlIO.h: add two new functions * xmlIO.c: implement the 2 functions based on the new buffer entry points
Daniel Veillard 18e1f1f1 2012-08-06T10:16:41 Improvements for old buffer compatibility Now tree.h exports LIBXML2_NEW_BUFFER macro indicating that the API uses the new buffers, important to keep code working with both versions. * tree.h buf.h: also export xmlBufContent(), xmlBufEnd(), and xmlBufUse() to help port the old code * buf.c: make sure the compatibility counters are updated on buffer usage, to keep proper working of application compiled against the old structures, but take care of int overflow
Daniel Veillard 52d8ade7 2012-07-30T10:08:45 Introduce some default parser limits Those can be overridden by the XML_PARSE_HUGE option, they are just default limits for Name length, dictionary size limits and maximum amount of parser lookup. * include/libxml/parserInternals.h: define the limits * include/libxml/xmlerror.h: add a new error * parser.c parserInternals.c: implements the new limits
Daniel Veillard 7c693dad 2012-07-25T16:32:18 Cleanups and new limit APIs for dictionaries * include/libxml/dict.h dict.c: adding 2 new functions xmlDictGetUsage and xmlDictSetLimit allowing to review the amount of memory allocated for dictionary strings. Also cleanup of various signed int used as size values in the code.
Daniel Veillard 57560386 2012-07-24T11:44:23 Cleanup URI module memory allocation code * uri.c: cleanup the code doing the allocations, set up a structured error handler to report memory errors, and set up an arbitrary limit on URI saving size * error.c include/libxml/xmlerror.h: add a new FROM_URI indication for structured error reporting, also adding strings for schematron and buffer which were missing
Daniel Veillard dddeede0 2012-07-16T14:44:26 Provide new xmlBuf based saving functions * include/libxml/tree.h: adds xmlBufGetNodeContent and xmlBufNodeDump as xmlBuf based equivalents of xmlNodeGetContent and xmlNodeDump * tree.c: implements one new routine and converts xmlNodeBufGetContent to use the xmlBuf equivalent. It should behave better as a result in case of data larger than 2GB.
Daniel Veillard 65c7d3b2 2012-07-16T14:13:58 Incompatible change to the Input and Output buffers Since the whole set of structures was public, the only way to switch to size_t clean buffer is to introduce an incompatible API change. Modifying the xmlParserInputBuffer and xmlOutputBuffer structures is the best place to make this change as those structures are deep into the parser feeding data, and no public API suggest to build those manually.
Daniel Veillard bca22f40 2012-07-11T16:48:47 Adding a new buf module for buffers This also add converter functions between xmlBuf and xmlBuffer * buf.c buf.h: the old xmlBuffer routines but modified for size_t and using xmlBuf instead of xmlBuffer * Makefile.am: add the 2 new files * include/libxml/xmlerror.h: add an entry for the new module * include/libxml/tree.h: expose the xmlBufPtr type but not the structure which stay private
Daniel Veillard 379ebc1d 2012-05-18T15:41:31 Cleanup on randomization tsan reported that rand() is not thread safe, so create a thread safe wrapper, use rand_r() if available. Consolidate the function, initialization and cleanup in dict.c and make sure it is initialized in xmlInitParser()
Daniel Veillard 0d51cfeb 2012-05-15T11:18:40 Fix a race in xmlNewInputStream For https://bugzilla.gnome.org/show_bug.cgi?id=643148 Reported by Bill Clarke <llib@computer.org>, it used a global variable as a counter for the input id and this was not thread safe. To avoid the race without adding unneeded locking in the parser path, move the id to the parser context instead.
Conrad Irwin 7d0d2a50 2012-05-14T14:18:58 Use a hybrid allocation scheme in xmlNodeSetContent On Fri, May 11, 2012 at 9:10 AM, Daniel Veillard <veillard@redhat.com> wrote: >  Hi Conrad, > > that's interesting ! I was initially afraid of a sudden explosion of > memory allocations for building a tree since by default buffers tend to > "waste" memory by using doubling allocations, but that's not the case. >  xmllint --noout doc/libxml2-api.xml > when compiled with memory debug produce > > paphio:~/XML -> cat .memdump >      MEMORY ALLOCATED : 0, MAX was 12756699 > > and without your patch 12755657, i.e. the increase is minimal. Heh, I thought that too. Actually you're looking at the result with XML_ALLOC_EXACT! This is because EXACT adds 10bytes "spare" on each alloc, and that interestingly wastes about the same amount of space as XML_ALLOC_DOUBLEIT on this example (see below). So it turns out that the default realloc() on my system actually handles this case really well — and I guess that all the time in xmlRealloc() was actually in xmlStrlen, not the underlying realloc() after all (sorry for misleading you). If you replace the realloc() with a bad one (like valgrind's), then the performance degrades severely. This patch implements a HYBRID allocator which has the behaviour you describe (it's like EXACT to start with, though without the spare 10 bytes; and switches to DOUBLEIT after 4kb) — that gets the memory back down to 12755657, with no noticeable impact on the performance of the synthetic pathological example under valgrind. In summary: max_memory on ./xmllint --noout doc/libxml2-api.xml, valgrind time on https://gist.github.com/2656940 max_memory valgrind time before | 12755657 | 29:18.2 EXACT | 12756699 | 2:58.6 <-- this is the state after the first patch. DOUBLEIT | 12756727 | 0:02.7 HYBRID | 12755754 | 0:02.7 <-- this is the state with both patches. > > There is also the cost of creating the buffers all the time. 
> I need to read the code and check but I may be interested in an hybrid > approach where we switch to buffer only when the text node starts to > become too big (4k would remove nearly all usuall types of "document" > usage, i.e. not blocks of data) I tried to avoid too much buffer creation by introducing the xmlBufferDetach function, which allows re-using one buffer to construct many strings. It's maybe a bit of a "hack" in API terms though I thought the gains would be worth it. Conrad ------8<------ To keep memory usage tight in normal conditions it's desirable to only allocate as much space as is needed. Unfortunately this can lead to problems when constructing a long string out of small chunks, because every chunk you add will need to resize the buffer. To fix this XML_ALLOC_HYBRID will switch (when the buffer is 4kb big) from using exact allocations to doubling buffer size every time it is full. This limits the number of buffer resizes to O(log n) (down from O(n)), and thus greatly increases the performance of constructing very large strings in this manner.
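The hybrid growth policy described at the end of the entry above (exact sizing below 4kb, doubling afterwards) can be sketched as a capacity function. This is a minimal model under stated assumptions: `hybrid_next_capacity` and `HYBRID_THRESHOLD` are hypothetical names, and the real buffer code may differ in details such as rounding or spare bytes.

```c
#include <stddef.h>

#define HYBRID_THRESHOLD 4096  /* the "4kb" switch point from the message */

/* Return the capacity to allocate so that `needed` bytes fit,
 * given the buffer's present capacity `current`. */
size_t hybrid_next_capacity(size_t current, size_t needed) {
    size_t cap = current;

    if (needed <= current)
        return current;              /* already fits, no resize */
    if (current < HYBRID_THRESHOLD)
        return needed;               /* small buffer: exact allocation */
    while (cap < needed) {
        if (cap > (size_t)-1 / 2)
            return needed;           /* doubling would overflow: go exact */
        cap *= 2;                    /* large buffer: double until it fits */
    }
    return cap;
}
```

With doubling, appending n bytes in small chunks to a large buffer triggers on the order of log n resizes instead of one per chunk, which is the O(n) to O(log n) reduction the message refers to.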
Conrad Irwin 7d553f83 2012-05-10T20:17:25 Use buffers when constructing string node lists. Hi Veillard and all, Firstly, thanks for libxml: it's awesome! I noticed recently that libxml was taking a surprisingly long time to perform some operations (many minutes instead of milliseconds), and so I did some digging. It turns out that the problem was caused by the realloc()ing done in xmlNodeAddContentLen() which can be called many (many) times when assigning some content into a node. For background, I'm dealing with XML that contains emails, these can have large attachments (~6MB) which are base-64 encoded, line-wrapped at 78 chars, and each line ends with &#13;. This means that xmlNodeAddContentLen() is being called about 200,000 times, and so there are 200,000 reallocs of a 6MB string, which takes a while... (I put a synthetic example of this at https://gist.github.com/2656940) The attached patch works around that problem by using the existing buffer API to merge the strings together before even creating the text node, this keeps the number of realloc()s at a managable level. I'd love feedback on the patch, and am happy to fix problems with it, or explore other solutions if you think that this is barking up the wrong tree :). Thanks, Conrad P.S. Should I create a bug for this too? ------8<------ Before this change xmlStringGetNodeList would perform a realloc() of the entire new content for every XML entity in the assigned text in order to merge together adjacent text nodes. This had the effect of making xmlSetNodeContent O(n^2), which led to unexpectedly bad performance on inputs that contained a large number of XML entities. After this change the memory management is done by the buffer API, avoiding the need to continually re-measure and realloc() the string. For my test data (6MB of 80 character lines, each ending with &#13;) this takes the time to xmlSetNodeContent from about 500 seconds to around 50ms. 
I have not profiled smaller cases, though I tried to minimize the performance impact of my change by avoiding unnecessary string copying. Signed-off-by: Conrad Irwin <conrad.irwin@gmail.com>
Michael Cronenworth 1eabc314 2012-05-10T11:25:38 Fix library problems with mingw-w64 For https://bugzilla.gnome.org/show_bug.cgi?id=663588 Fix a windows only issue when compiling the library with MingW (64 bits) using Fedora cross-compiler chain. Change the dllexport for data
Noam Postavsky 15794990 2012-03-19T16:08:16 add function xmlTextReaderRelaxNGValidateCtxt() Since there is xmlTextReaderSchemaValidateCtxt() it seems like there should be an equivalent RelaxNG function. The attached patch adds it. The code is essentially the same as Schema implementation, but I'm uncertain as to how to add things to the documentation and test suite: there seems to be a lot of auto-generation going on.
Anders F Bjorklund eae52617 2011-09-18T16:59:13 add lzma compression support
Daniel Veillard f5048b3e 2011-08-18T17:10:13 Hardening of XPath evaluation Add a mechanism of frame for XPath evaluation when entering a function or a scoped evaluation, also fix a potential problem in predicate evaluation.
Daniel Veillard c62efc84 2011-05-16T16:03:50 Add options to ignore the internal encoding For both XML and HTML, the document can provide an encoding either in XMLDecl in XML, or as a meta element in HTML head. This adds options to ignore those encodings if the encoding is known in advance, for example if the content had been converted before being passed to the parser. * parser.c include/libxml/parser.h: add XML_PARSE_IGNORE_ENC option for XML parsing * include/libxml/HTMLparser.h HTMLparser.c: adds the HTML_PARSE_IGNORE_ENC for HTML parsing * HTMLtree.c: fix the handling of saving when an unknown encoding is defined in meta document header * xmllint.c: add a --noenc option to activate the new parser options
Daniel Veillard 4c2e7c65 2010-11-04T18:35:57 Release of libxml2-2.7.8
Giuseppe Iuculano 48f7dcb7 2010-11-04T17:42:42 480323 add code to plug in ICU converters by default This is not configured in by default but after some serious massaging incorporate that patch from Chromium/Chrome.
Ozkan Sezer f99d2223 2010-11-04T12:08:08 614087 Fix Socket API usage to allow Windows64 compilation In Windows 64 a socket is no longer represented by an int; this breaks the nanoftp API and nanoftp/nanohttp. The patch changes this and fixes the API for Win64. Regenerated the XML and documentation as a result too.
Adrian Bunk 64b0d60c 2010-11-04T09:43:31 Switch from the obsolete mkinstalldirs to AC_PROG_MKDIR_P This was obsoleted in 2005 so we should be safe. But keep AC_PREREQ to 2.59 as it's still widely deployed.
Adam Spragg d2e62311 2010-11-03T15:33:40 Add xmlSaveOption XML_SAVE_WSNONSIG non destructive indentation option using spaces within markup constructs and hence not modifying content * include/libxml/xmlsave.h: new option * xmlsave.c: some refactoring and new code for the new option * xmllint.c: adds --pretty option where option 2 uses the new formatting
Daniel Veillard f1121c48 2010-07-26T14:02:42 Add an HTML parser option to avoid a default doctype - include/libxml/HTMLparser.h: defines the new HTML parser option HTML_PARSE_NODEFDTD - HTMLparser.c: if option is set don't add a default DTD - xmllint.c: add the corresponding --nodefdtd option in xmllint
Eugene Pimenov 615904f5 2010-03-15T15:16:02 Switch the HTML parser to be non-recursive * HTMLparser.c: new htmlParseElementInternal non recursive, with htmlParseContentInternal and new function to handle node info and element end. * include/libxml/parser.h: add new stack for element info in parser context * parserInternals.c: free element info stack
Roumen Petrov 120a2699 2010-03-10T10:07:49 Fix build with mingw - include/libxml/xmlexports.h: restore export decoration otherwise xsltproc and xmlsec crash - libxml.h: define LIBXML_STATIC for static build - configure.in: enable modules support for mingw* builds - Makefile.am: flags for testdso if modules support enabled
Daniel Veillard e20fb5a7 2010-01-29T20:47:08 Fix xmlParseInNodeContext for HTML content xmlParseInNodeContext notices that the enclosing document is an HTML document, so it invokes the HTML parser for that fragment, and the HTML parser, finding a "<p>hello world!</p>" document, automatically augments it with defaulted <html> and <body>. This defaulting should be turned off in the HTML parser for this to work, but there is no such HTML parser option. There is an htmlOmittedDefaultValue global variable that you could use, but really we should not rely on global variables for processing options anymore; best is to add an HTML_PARSE_NOIMPLIED. * include/libxml/HTMLparser.h: add the HTML_PARSE_NOIMPLIED parser flag * HTMLparser.c: do not add implied elements if HTML_PARSE_NOIMPLIED is set * parser.c: add HTML_PARSE_NOIMPLIED to options for xmlParseInNodeContext on HTML documents
Daniel Veillard 57f71aed 2009-09-09T18:57:26 594250 rename ATTRIBUTE_ALLOC_SIZE to avoid clashes * include/libxml/xmlmemory.h include/libxml/xmlversion.h.in: rename it to LIBXML_ATTR_ALLOC_SIZE to avoid conflicts in public headers
Paul Smith 65d359e3 2009-09-07T15:24:24 Fix the globals.h to use XMLPUBFUN * include/libxml/globals.h: in addition to the extern extern Paul Smith noted that XMLPUBFUN should be used instead of LIBXML_DLL_IMPORT
Daniel Veillard 82cf412d 2009-09-07T15:20:24 Problem with extern extern in header * include/libxml/globals.h: LIBXML_DLL_IMPORT should not be followed by extern * include/libxml/xmlmemory.h: fix the same problem but in a comment
Stefan Behnel b9590e9c 2009-08-24T19:45:54 440226 Add xmlXIncludeProcessTreeFlagsData API * xinclude.c include/libxml/xinclude.h: new function similar to xmlXIncludeProcessFlagsData but operating on a subtree
Wang Lam 1de382eb 2009-08-24T17:34:25 Fix SetGenericErrorFunc and SetStructured clash * include/libxml/globals.h globals.c global.data: define a new global variable (per thread) for structured error reporting, to not conflict with generic one * error.c: when defined use the structured error report over any generic one
Daniel Veillard 029a04d2 2009-08-24T12:50:23 541335 HTML avoid creating 2 head or 2 body elements * HTMLparser.c: check when we see a head or a body tag and avoid autogenerating them * include/libxml/parser.h: the values for ctxt->html change depending on the head or body tags being seen
Daniel Veillard f39eafaa 2009-08-20T19:15:08 Make xmlRecoverDoc const (Martin Trappel) * include/libxml/parser.h parser.c: just make the parameter a const
Daniel Veillard fcf2457d 2009-08-12T23:02:08 Both args of xmlStrcasestr are const * include/libxml/xmlstring.h xmlstring.c: fix the constness of the second arg of xmlStrcasestr()
Daniel Veillard a194ccb8 2009-08-10T10:08:41 Try to avoid __imp__xmlFree link trouble on msys * include/libxml/xmlexports.h: when compiling with mingw/MSYS or linking to a precompiled library this _imp__xmlFree missing at runtime is a common problem. Igor and various people faced it and this seems the minimal fix for it, should resolve 590302 and 561340
Aleksey Sanin 175beba0 2009-07-09T22:54:00 Fix a couple of ABI issues with C14N 1.1 * include/libxml/c14n.h c14n.c: fix API to not include enum xmlC14NMode in the arguments, and do a bit more check on input
Aleksey Sanin 83868247 2009-07-09T10:26:22 Aleksey Sanin support for c14n 1.1 * c14n.c include/libxml/c14n.h: adds support for C14N 1.1, new flags at the API level * runtest.c Makefile.am testC14N.c xmllint.c: add support in CLI tools and test binaries * result/c14n/1-1-without-comments/* test/c14n/1-1-without-comments/*: add a new batch of tests
Daniel Veillard f076f348 2009-04-15T09:20:25 change ATTRIBUTE_PRINTF into LIBXML_ATTR_FORMAT to avoid macro name * include/libxml/parser.h include/libxml/xmlwriter.h include/libxml/relaxng.h include/libxml/xmlversion.h.in include/libxml/xmlwin32version.h.in include/libxml/valid.h include/libxml/xmlschemas.h include/libxml/xmlerror.h: change ATTRIBUTE_PRINTF into LIBXML_ATTR_FORMAT to avoid macro name collisions with other packages and headers as reported by Belgabor and Mike Hommey daniel svn path=/trunk/; revision=3827
Daniel Veillard 97ff9b36 2009-01-18T21:43:30 preparing 0.7.3 release fix a typo in a name Daniel * configure.in doc/xml.html doc/*: preparing 0.7.3 release * include/libxml/parserInternals.h SAX2.c: fix a typo in a name Daniel svn path=/trunk/; revision=3814
Daniel Veillard f63085de 2009-01-18T20:53:59 port patch from Marcus Meissner to add gcc checking for printf like * include/libxml/parser.h include/libxml/xmlwriter.h include/libxml/relaxng.h include/libxml/xmlversion.h.in include/libxml/xmlwin32version.h.in include/libxml/valid.h include/libxml/xmlschemas.h include/libxml/xmlerror.h: port patch from Marcus Meissner to add gcc checking for printf like functions parameters, should fix #65068 * doc/apibuild.py doc/*: modified the script accordingly and regenerated * xpath.c xmlmemory.c threads.c: fix a few warnings Daniel svn path=/trunk/; revision=3813
Daniel Veillard d032a5bc 2009-01-18T19:41:26 windows header should get the same define Daniel * include/libxml/xmlwin32version.h.in: windows header should get the same define Daniel svn path=/trunk/; revision=3812
Daniel Veillard d4d47057 2009-01-18T17:26:02 apply patch from Marcus Meissner to add gcc attribute alloc_size should * include/libxml/xmlversion.h.in include/libxml/xmlmemory.h: apply patch from Marcus Meissner to add gcc attribute alloc_size should fix #552505 * doc/apibuild.py doc/* testapi.c: regenerate the API * include/libxml/parserInternals.h: fix a comment problem raised by apibuild.py daniel svn path=/trunk/; revision=3811