kmx git

Commit	Date	Message
8f5ccada	2021-07-07T19:24:36	xmlAddChild() and xmlAddNextSibling() may not attach their second argument Use the return value of xmlAddChild() and xmlAddNextSibling() instead of the second argument directly. Found by OSS-Fuzz. Fixes #316
53983804	2022-01-25T03:08:22	Run CI tests with UBSan implicit-conversion checks This enables the remaining checks from the "integer" group: - implicit-unsigned-integer-truncation - implicit-signed-integer-truncation - implicit-integer-sign-change These checks can find all kinds of bugs and only require explicit casts if integer truncation or sign change is really intended.
a647e430	2022-01-25T02:59:40	Fix casting of line numbers in SAX2.c The line member is an unsigned short. Avoids integer conversion warnings with UBSan. Also use USHRT_MAX instead of hard-coded constant.
67c2e78b	2022-01-25T02:44:37	Fix integer conversion warnings in hash.c Use unsigned long for temporary variable to avoid integer conversion warnings with UBSan. Note that this does change the computation of hash values for input bytes larger than 0x7F. Before, these bytes were first converted to a (typically) signed char with a negative value, then to a large unsigned long near ULONG_MAX. I doubt that this was intentional. Input bytes larger than 0x7F are now converted to unsigned long unchanged.
21217dd9	2022-01-25T02:34:40	Add explicit casts in runtest.c Avoids integer conversion warnings with UBSan.
7abc6e6a	2022-01-25T02:27:53	Fix integer conversion warning in xmlIconvWrapper Use size_t for return value of iconv(3) to avoid an UBSan integer conversion warning.
f4a74bf0	2022-01-25T02:21:05	Add suffix to unsigned constant in xmlmemory.c Avoids an integer conversion warning with UBSan.
5948abfe	2022-01-25T01:59:03	Add explicit casts in testchar.c Avoids integer conversion warnings with UBSan.
6f95273e	2022-01-25T01:46:59	Fix integer conversion warnings in xmlstring.c Use an int to avoid an integer conversion warning with UBSan when left-shifting a char.
0596d67d	2022-01-25T01:39:41	Add explicit cast in xmlURIUnescapeString Avoids an integer conversion warning with UBSan.
f872aa18	2022-01-25T01:16:00	Fix handling of ctxt->base in xmlXPtrEvalXPtrPart Also set ctxt->base when updating ctxt->cur. Always restore ctxt->cur on error. Avoids integer truncation and wrong column numbers in xmlXPathErr. Stop hiding modification of ctxt members behind a macro. Found with UBSan.
97fe1279	2022-01-20T16:08:35	Remove wrong tarname from AC_INIT Remove the "tarname" added in commit 7c0253aa. Having a tarname including a version number would result in tarballs named libxml2-2.9.12-2.9.12.tar.gz. This change also means that documentation will now be installed in $(datadir)/doc/libxml2 instead of $(datadir)/doc/libxml2-$(version). Having a version number in the documentation directory doesn't seem helpful. The new location also matches the default autotools $(docdir).
00e618eb	2022-01-17T21:39:27	Remove old devhelp format See #295.
d85245f9	2022-01-16T21:39:04	Fix regression with PEs in external DTD Fix a regression introduced with commit a28f7d87. In some cases, parameter entity references in external DTDs wouldn't be expanded. Fixes #306.
9f4cb84c	2022-01-16T18:39:51	Fix xmllint --maxmem xmlMemSetup must be called before initializing the parser, otherwise some data structures will be allocated with system malloc instead of our custom allocator. This throws off built-in memory debugging and sanitizers.
e4c91f74	2021-11-03T11:41:11	Fix Null-deref-in-xmlSchemaGetComponentTargetNs
9277abe2	2022-01-16T15:50:56	Fix libxml2.doap Add description. Change category to "infrastructure". Apparently, "platform" isn't allowed anymore. Add programming language.
87a99270	2021-08-26T11:50:41	Added regression tests for xmlReadFd() and htmlReadFd()
fe6890e2	2021-07-27T13:20:20	Fix htmlReadFd, which was using a mix of xml and html context functions
67953a9f	2022-01-16T15:30:02	Fix memory leak in xmlXPathCompNodeTest Found by Coverity.
3cc64a88	2021-07-22T15:46:38	setup.py.in: Try to import setuptools This way, we can build binary wheels easily if needed
dbfe6151	2021-07-22T15:36:15	Python distutils: Make DLL packaging more flexible This updates setup.py.in to pack the DLLs according to the options we specified to configure.js or CMake (or, even configure, although autotools builds are not likely to build the libxml2 Python module via distutils). At this point, we can pack only the DLLs that libxml2 really depends on, and pack the libxslt DLLs only if we really built the libxslt Python modules. Also make the DLL filenames more easily configured
1b7d4e2b	2021-07-22T14:46:48	tstmem.py: Try importing from libxmlmods.libxml2mod if needed Distutils builds place libxml2mod.pyd under the libxmlmods subdir, so try this directory if 'import libxml2mod' failed.
6e169c14	2021-03-30T16:11:13	python: Port python 3.x module to Windows On Windows, we don't have fcntl() which helps us to find out how a file was opened, so we need to resort to the Windows API NtQueryInformationFile() in ntdll.dll to help us, and compare the file access modes as appropriate to deduce the modes we want to pass into fdopen(). As all official Python 3.x releases are built against newer Windows CRTs that toughen checks on the validity of the file descriptor when we convert the fd to a native Windows File Handle using _get_osfhandle(), we need to define an empty handler so that the program does not abort if the fd that was passed in was invalid; instead, we just return NULL if _get_osfhandle() could not return us a valid Windows File Handle.
eb4c1bf8	2021-11-03T09:48:13	Fix random dropping of characters on dumping ASCII encoded XML Fix a bug in xmlCharEncOutput return value which will cause xmlNodeDumpOutput to drop characters randomly. xmlCharEncOutput returns zero if the length of the input buffer is zero but ignores the fact that it may have already encoded the input buffer and the input's length is zero due to the fact that xmlEncOutputChunk returned -2 errors and underlying code tries to fix the error by encoding the input. xmlCharEncOutput is collecting the number of bytes written to the output buffer but is returning zero instead of the total number of bytes in this situation. This commit will fix this issue by returning the total number of bytes instead. So the xmlNodeDumpOutput will also continue writing and will not stop due to the fact that it mistakenly thinks the output buffer is not changed in that iteration. Fixes #314
66fb340a	2021-10-14T15:01:24	Update URL for libxml++ C++ binding Fixes #267
ae728bb8	2022-01-16T15:05:41	Fix null pointer deref in xmlStringGetNodeList Check for malloc failure to avoid null deref.
46c658b0	2021-08-06T08:48:24	move current position before possible calling of ctxt->sax->characters.
96753450	2021-07-29T12:14:03	Correctly install the HTML examples into their subdirectory. Previous to this commit, the examples were installed haphazardly within all the other html documents, also overwriting index.html, for example. Signed-off-by: Mattia Rizzolo <mattia@mapreri.org>
7c0253aa	2021-07-29T12:11:08	Refactor the settings of $docdir This is a completely noop change for this project, since before this commit nothing was using $docdir nor PROGRAM_TARNAME. Setting the fourth parameter of AC_INIT() makes it set PROGRAM_TARNAME, which is then used as the last path component of the default docdir, effectively making $docdir be the same as the previous $BASE_DIR/$DOC_MODULE. Signed-off-by: Mattia Rizzolo <mattia@mapreri.org>
1a013ba7	2021-07-26T20:11:56	configure: remove unused checks for libraries These libraries are queried for, but no code cares about the results, so remove the checks.
0aad075c	2021-07-26T20:10:52	cmake: remove unused checks Even the configured `config.h` did not forward the results of these checks.
9669bd68	2021-07-26T20:09:32	configure: remove unused checks for headers These headers are checked for at configure time, but the code never cares about the results of these checks, so skip them.
f8608235	2021-07-26T20:06:18	cmake: fix `ATTRIBUTE_DESTRUCTOR` definition The code expects it to be set to the attribute for `xmlDestructor`, but in CMake, it is only ever available as `1` or undefined. Instead, match the behavior of autoconf.
51c88c6f	2021-07-26T20:12:45	configure: remove unused checks for functions Nothing uses the results from these checks, so remove the checks. There are some "uses" in order to suppress macro shadowing in MSVC's implementation of `isinf` and `isnan` as macros, but those are hard-coded and do not require checks to manage.
3ba59b93	2021-07-23T22:34:29	Generate devhelp2 index file The devhelp2 format was introduced in 2005, and the devhelp format was deprecated in 2017. Fixes: https://gitlab.gnome.org/GNOME/libxml2/-/issues/295
91b3d3f9	2021-07-14T17:12:11	Remove duplicated code in xmlcatalog Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806
d7f11fd0	2021-07-14T17:03:46	Fix leak in __xmlOutputBufferCreateFilename Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806
477f6de3	2021-07-14T15:35:31	Fix memory leak in xmlRelaxNGNewDocParserCtxt Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806
483de2c2	2021-07-14T15:31:55	Fix memory leak in xmlRelaxNGParseData Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806
9a9dd31b	2021-07-14T15:28:56	Fix memory leak in libxml_C14NDocSaveTo Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806
d68c1637	2021-07-14T15:23:11	Fix memory leak in libxml_saveNodeTo Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806
328456bf	2021-07-14T14:43:59	Fix memory leak in xmlNewInputFromFile Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806
fe564967	2021-07-14T14:35:17	Fix memory leak in xmlCreateIOParserCtxt Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806
f0904f32	2021-07-14T14:14:34	Fix memory leak in xmlParseSGMLCatalog Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806
2510f43c	2021-07-14T14:03:44	Fix memory leak in xmlParseCatalogFile Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806
92bce68c	2021-07-14T11:37:07	Fix memory leak in xmlSAX2AttributeDecl Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806
e7d1c53a	2021-07-14T11:32:57	Fix memory leak in xmlFreeParserInputBuffer Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806
03bb9293	2021-07-07T18:23:18	Fix parse failure when 4-byte character in UTF-16 BE is split across a chunk This makes the logic in UTF16BEToUTF8() match UTF16LEToUTF8(). * encoding.c: (UTF16LEToUTF8): - Fix comment to describe what the code does. (UTF16BEToUTF8): - Fix undefined behavior which was applied to UTF16LEToUTF8() in 2f9382033e. - Add bounds check to while() loop which was applied to UTF16LEToUTF8() in be803967db. - Do not return -2 when (in >= inend) to fix the bug. This was applied to UTF16LEToUTF8() in 496a1cf592. - Inline (<< 8) statements to match UTF16LEToUTF8(). Add the following tests and results: test/text-4-byte-UTF-16-BE-offset.xml test/text-4-byte-UTF-16-BE.xml test/text-4-byte-UTF-16-LE-offset.xml test/text-4-byte-UTF-16-LE.xml
e6adc19f	2021-07-05T13:40:54	man: Mention XML_CATALOG_FILES is space-separated Fixes: https://bugzilla.gnome.org/show_bug.cgi?id=781274
bdd482c2	2021-07-05T18:48:10	add documentation for xmllint exit code 10 Closes: https://gitlab.gnome.org/GNOME/libxml2/-/issues/280
a0f9211b	2021-06-28T02:03:15	python/Makefile.am: use _LIBADD, not _LDFLAGS for LIBS This fixes over-linking in the built Python modules with various libraries. _LIBADD is intended for adding additional libraries for linking, while _LDFLAGS is for miscellaneous extra flags (possibly user-supplied). If using -Wl,-as-needed within user-supplied LDFLAGS, it is passed too late (after the library link line) and therefore has no effect. Notes: * Noticed while working on Gentoo's migration to libxcrypt because libxml2's Python modules were linking to libcrypt (and other libraries) unexpectedly. * It was suggested we could actually stop linking explicitly with all of Python's libraries / don't copy its LDFLAGS, but this resolves the original issue downstream and is a separate discussion. I couldn't find any clear documentation for/against such a change. Bug: https://bugs.gentoo.org/798942 Signed-off-by: Sam James <sam@gentoo.org>
ff05c94a	2022-01-16T13:56:17	Fix check for libtool in autogen.sh libtoolize is named glibtoolize on some macOS systems.
343bf0d3	2022-01-16T13:52:21	Add myself to maintainers Fixes #319.
c35628a2	2022-01-15T18:18:22	Revert "Make schema validation fail with multiple top-level elements" This reverts commit 4f2aee18f6e2d40e58eb224f4f7935dc2400fe25. Fixes #305.
798bdf13	2022-01-10T14:50:20	Different approach to fix quadratic behavior in HTML push parser The old approach introduced a regression, see issue #312 and the previous commit. Disable code that tries to recover from invalid start tags. This only affects "recovery" mode. Add a comment outlining a better fix in accordance with the HTML5 spec.
094fc08a	2022-01-10T14:02:10	Fix regression when parsing invalid HTML tags in push mode Revert part of commit 173a0830 that changed behavior when parsing malformed start tags with the push parser. This reintroduces quadratic behavior in recovery mode which will be worked around in the next commit. Fixes #312.
2732b234	2022-01-10T13:32:14	Fix regression parsing public IDs literals in HTML Fix regression introduced when reworking htmlParsePubidLiteral in commit 93ce33c2. Fixes #318.
dea91c97	2021-07-27T16:12:54	Fix buffering in xmlOutputBufferWrite Fix a regression introduced with commit a697ed1e which caused xmlOutputBufferWrite to flush internal buffers too late. Fixes #296.
ec6e3efb	2021-07-06T21:56:04	Patch to forbid epsilon-reduction of final states When building the internal representation of a regexp, it is possible that a lot of empty transitions are created. Therefore there is a step to reduce them in the function xmlFAEliminateSimpleEpsilonTransitions. There is an error there for this case: * State 1 has a transition with an atom (in this case "a") to state 2. * State 2 is final and has an epsilon transition to state 1. After reduction it looked like: * State 1 has a transition with an atom (in this case "a") to itself and is final. In other words, the empty string is accepted when it shouldn't be. The attached patch skips the reduction step for final states. An alternative would be to insert or increment counters when reducing a final state, but this seemed error prone and unnecessary, since there aren't that many final states. Fixes #282
22f15211	2021-06-04T09:57:46	Use version in configure.ac for CMake Now CMake script reads version from configure.ac to prevent unsynchronized versions
92d9ab4c	2021-06-07T15:09:53	Fix whitespace when serializing empty HTML documents The old, non-recursive HTML serialization code would always terminate the output with a newline. The new implementation omitted the newline if the document node had no children. Readd the newline when serializing empty documents. Fixes #266.
3e1aad4f	2021-06-02T17:31:49	Fix XPath recursion limit Fix accounting of recursion depth when parsing XPath expressions. This silly bug introduced in commit 804c5297 could lead to spurious errors when parsing larger expressions or XSLT documents. Should fix #264.
13ad8736	2021-05-25T10:55:25	Fix regression in xmlNodeDumpOutputInternal Commit 85b1792e could cause additional whitespace if xmlNodeDump was called with a non-zero starting level.
a46e85f6	2021-05-22T15:20:46	Update CMake project version
a1cac3bb	2021-05-22T14:51:26	Add CMake alias targets for embedded projects
2c0f2f03	2021-05-18T09:52:55	Fix some validation errors in the FAQ Move paragraphs inside li elements.
b92b16f6	2021-05-19T10:15:54	Remove unused variable in xmlCharEncOutFunc Fixes a compiler warning: encoding.c: In function 'xmlCharEncOutFunc__internal_alias': encoding.c:2632:9: warning: unused variable 'output' [-Wunused-variable] 2632 \| int output = 0; https://gitlab.gnome.org/GNOME/libxml2/-/issues/254
7d4060d2	2021-05-16T18:00:21	Add missing file xmlwin32version.h.in to EXTRA_DIST
4fc473d7	2021-05-16T17:48:07	Add instructions on how to use CMake to compile libxml
85b1792e	2021-05-18T20:08:28	Work around lxml API abuse Make xmlNodeDumpOutput and htmlNodeDumpFormatOutput work with corrupted parent pointers. This used to work with the old recursive code but the non-recursive rewrite required parent pointers to be set correctly. Unfortunately, lxml relies on the old behavior and passes subtrees with a corrupted structure. Fall back to a recursive function call if an invalid parent pointer is detected. Fixes #255.
a7b9f3eb	2021-05-20T13:38:54	fix: avoid segfault at exit when using custom memory functions This extends the fix introduced by 956534e to Windows processes dynamically loading libxml2. Closes #256.
b48e77cf	2021-05-13T20:56:16	Release of libxml2-2.9.12 Brown paper bag release, some recently added sources were missing from the 2.9.11 tarball: - configure.ac: bump version - fuzz/Makefile.am: add fuzz.h and seed/regexp to EXTRA_DIST
e1bcffea	2021-05-13T15:35:21	Release of libxml2-2.9.11 Prompted by CVE-2021-3541, but this includes an awful lot of serious bug fixes by Nick and others. - configure.ac: bumped to new release - doc/* updated and regenerated
8598060b	2021-05-13T14:55:12	Patch for security issue CVE-2021-3541 This is related to parameter entities expansion and following the line of the billion laugh attack. Somehow in that path the counting of parameters was missed and the normal algorithm based on entities "density" was useless.
bfd2f430	2021-05-09T18:56:57	Fix null deref in legacy SAX1 parser Always call nameNsPush instead of namePush. The latter is unused now and should probably be removed from the public API. I can't see how it could be used reasonably from client code and the unprefixed name has always polluted the global namespace. Fixes a null pointer dereference introduced with de5b624f when parsing in SAX1 mode. Found by OSS-Fuzz.
ce00c36e	2021-05-08T21:20:05	Store per-element parser state in a struct Make the parser context's "pushTab" point to an array of structs instead of void pointers. This avoids casting unrelated types to void pointers, improving readability and portability, and allows for more efficient packing. Ultimately, the struct could be extended to include the contents of "nameTab" and "spaceTab", further simplifying the code. Historically, "pushTab" was only used by the push parser (hence the name), so the change to the public headers should be safe. Also remove an unused parameter from xmlParseEndTag2.
de5b624f	2021-05-08T20:21:29	Fix handling of unexpected EOF in xmlParseContent Readd the XML_ERR_TAG_NOT_FINISHED error on unexpected EOF which was removed in commit 62150ed2. This commit also introduced a regression for direct users of xmlParseContent. Unclosed tags weren't checked.
3e80560d	2021-05-07T10:51:38	Fix line numbers in error messages for mismatched tags Commit 62150ed2 introduced a small regression in the error messages for mismatched tags. This typically only affected messages after the first mismatch, but with custom SAX handlers all line numbers would be off. This also fixes line numbers in the SAX push parser which were never handled correctly.
7279d236	2021-05-06T10:37:07	Fix htmlTagLookup Fix regression introduced with b25acce8. Some users like libxslt may call the HTML output functions on documents with uppercase tag names, so we must keep case-insensitive string comparison. Fixes #248.
33468d7e	2021-05-03T16:09:44	update for xsd:language type check Fixes #242.
babe7503	2021-05-01T16:53:33	Propagate error in xmlParseElementChildrenContentDeclPriv Check return value of recursive calls to xmlParseElementChildrenContentDeclPriv and return immediately in case of errors. Otherwise, struct xmlElementContent could contain unexpected null pointers, leading to a null deref when post-validating documents which aren't well-formed and parsed in recovery mode. Fixes #243.
5465a8e5	2021-04-25T21:19:59	Update INSTALL.libxml2 Fixes #238.
1098c30a	2021-04-22T19:26:28	Fix use-after-free with `xmllint --xinclude --dropdtd` The --dropdtd option can leave dangling pointers in entity reference nodes. Make sure to skip these nodes when processing XIncludes. This also avoids scanning entity declarations and even modifying them inadvertently during XInclude processing. Move from a block list to an allow list approach to avoid descending into other node types that can't contain elements. Fixes #237.
72b3c067	2021-04-22T19:24:50	Fix dangling pointer with `xmllint --dropdtd` Reset doc->intSubset when dropping the DTD.
bf227135	2020-08-16T17:19:35	Validate UTF8 in xmlEncodeEntities Code is currently assuming UTF-8 without validating. Truncated UTF-8 input can cause out-of-bounds array access. Adds further checks to partial fix in 50f06b3e. Fixes #178
1358d157	2021-04-21T13:23:27	Fix use-after-free with `xmllint --html --push` Call htmlCtxtUseOptions to make sure that names aren't stored in dictionaries. Note that this issue only affects xmllint using the HTML push parser. Fixes #230.
fb08d9fe	2021-03-20T22:02:26	Fix include order in c14n.h - Include xmlversion.h before testing feature flags. - Include libxml headers before extern "C". Fixes #226.
d3a02679	2021-03-15T13:44:34	CMake: Only add postfixes if MSVC Currently, it catches mingw-w64 in there as well, but mingw-w64 follows linux-like naming with no weird postfixes Signed-off-by: Christopher Degawa <ccom@randomderp.com>
868e49cf	2021-03-16T10:36:04	Allow FP division by zero in xmlXPathInit
d25460da	2021-03-13T19:12:00	Fix XPath NaN/Inf for older GCC versions The DBL_MAX approach could lead to errors caused by excess precision. Switch back to the division-by-zero approach with a work-around for MSVC and use the extern globals instead of macro expressions.
e20c9c14	2021-03-13T18:41:47	Fix xmlGetNodePath with invalid node types Make xmlGetNodePath return NULL instead of invalid XPath when hitting unsupported node types like DTD content. Reported here: https://mail.gnome.org/archives/xml/2021-January/msg00012.html Original report: https://bugs.php.net/bug.php?id=80680
c3fd8c42	2021-03-13T17:19:32	Fix exponential behavior with recursive entities Fix another case where only recursion depth was limited, but entities would still be expanded over and over again. The test case discovered by fuzzing only affected parsing in recovery mode with XML_PARSE_RECOVER. Found by OSS-Fuzz.
683de7ef	2021-03-04T19:06:04	Fix duplicate xmlStrEqual calls in htmlParseEndTag
8095365b	2021-03-04T18:46:11	Speed up htmlCheckAutoClose Switch to binary search.
b25acce8	2021-03-04T17:44:45	Speed up htmlTagLookup Switch to binary search. This is the first time bsearch is used in the libxml2 code base. But it's a standard library function since C89 and should be portable.
ad101bb5	2021-03-02T13:32:53	Clarify xmlNewDocProp documentation
a6e6498f	2021-03-02T13:09:06	Stop checking attributes for UTF-8 validity I can't see a reason to check attribute content for UTF-8 validity. Other parts of the API like xmlNewText have always assumed valid UTF-8 as extra checks only slow down processing. Besides, setting doc->encoding to "ISO-8859-1" seems pointless, and not freeing the old encoding would cause a memory leak. Note that this was last changed in 2008 with commit 6f8611fd which removed unnecessary encoding/decoding steps. Setting attributes should be even faster now. Found by OSS-Fuzz.
8446d459	2021-03-01T20:56:40	Reduce some fuzzer timeouts OSS-Fuzz has been fuzzing the HTML parser with inputs up to 1 MB for several hundred hours without hitting the 20s timeout. It seems that most timeouts resulting from accidentally quadratic behavior in the HTML parser have been fixed. Start to gradually reduce the timeout to find new performance issues.
688b41a0	2021-03-01T14:17:42	Fix quadratic behavior when looking up xml:* attributes Add a special case for the predefined XML namespace when looking up DTD attribute defaults in xmlGetPropNodeInternal to avoid calling xmlGetNsList. This fixes quadratic behavior in - xmlNodeGetBase - xmlNodeGetLang - xmlNodeGetSpacePreserve Found by OSS-Fuzz.
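
The epsilon-reduction bug described in commit ec6e3efb can be illustrated with a toy automaton. The sketch below is not libxml2's regexp code: the `accepts` helper, the dictionary layout, and the state numbering are hypothetical and chosen only to mirror the two-state example in the commit message. It shows that folding the final state 2 into state 1 makes the empty string match, which the original automaton rejects.

```python
# Toy NFA, for illustration only. This does not reproduce
# xmlFAEliminateSimpleEpsilonTransitions; all names here are hypothetical.

def accepts(transitions, epsilons, start, finals, text):
    """Return True if the NFA accepts `text`, following epsilon transitions."""

    def closure(states):
        # Collect every state reachable through epsilon transitions alone.
        stack, seen = list(states), set(states)
        while stack:
            state = stack.pop()
            for target in epsilons.get(state, ()):
                if target not in seen:
                    seen.add(target)
                    stack.append(target)
        return seen

    current = closure({start})
    for ch in text:
        moved = {
            target
            for state in current
            for (atom, target) in transitions.get(state, ())
            if atom == ch
        }
        current = closure(moved)
    return bool(current & finals)


# Before reduction: state 1 --"a"--> state 2, state 2 is final,
# and state 2 has an epsilon transition back to state 1.
ORIGINAL = dict(transitions={1: [("a", 2)]}, epsilons={2: [1]},
                start=1, finals={2})

# After the faulty reduction: state 1 loops to itself on "a" and is final.
REDUCED = dict(transitions={1: [("a", 1)]}, epsilons={},
               start=1, finals={1})

print(accepts(**ORIGINAL, text=""))  # False: at least one "a" is required
print(accepts(**REDUCED, text=""))   # True: the empty string is now accepted
```

Both automata agree on "a", "aa", and so on. They differ only on the empty string, which is why the commit skips the reduction step for final states.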

8f5ccada

2021-07-07T19:24:36

xmlAddChild() and xmlAddNextSibling() may not attach their second argument Use the return value of xmlAddChild() and xmlAddNextSibling() instead of the second argument directly. Found by OSS-Fuzz. Fixes #316

53983804

2022-01-25T03:08:22

Run CI tests with UBSan implicit-conversion checks This enables the remaining checks from the "integer" group: - implicit-unsigned-integer-truncation - implicit-signed-integer-truncation - implicit-integer-sign-change These checks can find all kinds of bugs and only require explicit casts if integer truncation or sign change is really intended.

a647e430

2022-01-25T02:59:40

Fix casting of line numbers in SAX2.c The line member is an unsigned short. Avoids integer conversion warnings with UBSan. Also use USHRT_MAX instead of hard-coded constant.

67c2e78b

2022-01-25T02:44:37

Fix integer conversion warnings in hash.c Use unsigned long for temporary variable to avoid integer conversion warnings with UBSan. Note that this does change the computation of hash values for input bytes larger than 0x7F. Before, these bytes were first converted to a (typically) signed char with a negative value, then to a large unsigned long near ULONG_MAX. I doubt that this was intentional. Input bytes larger than 0x7F are now converted to unsigned long unchanged.

21217dd9

2022-01-25T02:34:40

Add explicit casts in runtest.c Avoids integer conversion warnings with UBSan.

7abc6e6a

2022-01-25T02:27:53

Fix integer conversion warning in xmlIconvWrapper Use size_t for return value of iconv(3) to avoid an UBSan integer conversion warning.

f4a74bf0

2022-01-25T02:21:05

Add suffix to unsigned constant in xmlmemory.c Avoids an integer conversion warning with UBSan.

5948abfe

2022-01-25T01:59:03

Add explicit casts in testchar.c Avoids integer conversion warnings with UBSan.

6f95273e

2022-01-25T01:46:59

Fix integer conversion warnings in xmlstring.c Use an int to avoid an integer conversion warning with UBSan when left-shifting a char.

0596d67d

2022-01-25T01:39:41

Add explicit cast in xmlURIUnescapeString Avoids an integer conversion warning with UBSan.

f872aa18

2022-01-25T01:16:00

Fix handling of ctxt->base in xmlXPtrEvalXPtrPart Also set ctxt->base when updating ctxt->cur. Always restore ctxt->cur on error. Avoids integer truncation and wrong column numbers in xmlXPathErr. Stop hiding modification of ctxt members behind a macro. Found with UBSan.

97fe1279

2022-01-20T16:08:35

Remove wrong tarname from AC_INIT Remove the "tarname" added in commit 7c0253aa. Having a tarname including a version number would result in tarballs named libxml2-2.9.12-2.9.12.tar.gz. This change also means that documentation will now be installed in $(datadir)/doc/libxml2 instead of $(datadir)/doc/libxml2-$(version). Having a version number in the documentation directory doesn't seem helpful. The new location also matches the default autotools $(docdir).

00e618eb

2022-01-17T21:39:27

Remove old devhelp format See #295.

d85245f9

2022-01-16T21:39:04

Fix regression with PEs in external DTD Fix a regression introduced with commit a28f7d87. In some cases, parameter entity references in external DTDs wouldn't be expanded. Fixes #306.

9f4cb84c

2022-01-16T18:39:51

Fix xmllint --maxmem xmlMemSetup must be called before initializing the parser, otherwise some data structures will be allocated with system malloc instead of our custom allocator. This throws off built-in memory debugging and sanitizers.

e4c91f74

2021-11-03T11:41:11

Fix Null-deref-in-xmlSchemaGetComponentTargetNs

9277abe2

2022-01-16T15:50:56

Fix libxml2.doap Add description. Change category to "infrastructure". Apparently, "platform" isn't allowed anymore. Add programming language.

87a99270

2021-08-26T11:50:41

Added regression tests for xmlReadFd() and htmlReadFd()

fe6890e2

2021-07-27T13:20:20

Fix htmlReadFd, which was using a mix of xml and html context functions

67953a9f

2022-01-16T15:30:02

Fix memory leak in xmlXPathCompNodeTest Found by Coverity.

3cc64a88

2021-07-22T15:46:38

setup.py.in: Try to import setuptools This way, we can build binary wheels easily if needed

dbfe6151

2021-07-22T15:36:15

Python distutils: Make DLL packaging more flexible This updates setup.py.in to pack the DLLs according to the options we specified to configure.js or CMake (or, even configure, although autotools builds are not likely to build the libxml2 Python module via distutils). At this point, we can pack only the DLLs that libxml2 really depends on, and pack the libxslt DLLs only if we really built the libxslt Python modules. Also make the DLL filenames more easily configured

1b7d4e2b

2021-07-22T14:46:48

tstmem.py: Try importing from libxmlmods.libxml2mod if needed Distutils builds place libxml2mod.pyd under the libxmlmods subdir, so try this directory if 'import libxml2mod' failed.

6e169c14

2021-03-30T16:11:13

python: Port python 3.x module to Windows On Windows, we don't have fcntl() which helps us to find out how a file was opened, so we need to resort to the Windows API NtQueryInformationFile() in ntdll.dll to help us, and compare the file access modes as appropriate to deduce the modes we want to pass into fdopen(). As all official Python 3.x releases are built against newer Windows CRTs that toughen checks on the validity of the file descriptor when we convert the fd to a native Windows File Handle using _get_osfhandle(), we need to define an empty handler so that the program does not abort if the fd that was passed in was invalid; instead, we just return NULL if _get_osfhandle() could not return us a valid Windows File Handle.

eb4c1bf8

2021-11-03T09:48:13

Fix random dropping of characters on dumping ASCII encoded XML Fix a bug in xmlCharEncOutput return value which will cause xmlNodeDumpOutput to drop characters randomly. xmlCharEncOutput returns zero if the length of the input buffer is zero but ignores the fact that it may already encoded the input buffer and the input's length is zero due to the fact that xmlEncOutputChunk returned -2 errors and underlying code tries to fix the error by encoding the input. xmlCharEncOutput is collecting the number of bytes written to the output buffer but is returning zero instead of the total number of bytes in this situation. This commit will fix this issue by returning the total number of bytes instead. So the xmlNodeDumpOutput will also continue writing and will not stop due to the fact that it mistakenly thinks the output buffer is not changed in that iteration. Fixes #314

66fb340a

2021-10-14T15:01:24

Update URL for libxml++ C++ binding Fixes #267

ae728bb8

2022-01-16T15:05:41

Fix null pointer deref in xmlStringGetNodeList Check for malloc failure to avoid null deref.

46c658b0

2021-08-06T08:48:24

move current position before possible calling of ctxt->sax->characters.

96753450

2021-07-29T12:14:03

Correctly install the HTML examples into their subdirectory. Previous to this commit, the examples where installed haphazardly within all the other html documents, also overwriting index.html, for example. Signed-off-by: Mattia Rizzolo <mattia@mapreri.org>

7c0253aa

2021-07-29T12:11:08

Refactor the settings of $docdir This is a completely noop change for this project, since before this commit nothing was using $docdir nor PROGRAM_TARNAME. Setting the fourth parameter of AC_INIT() makes it set PROGRAM_TARNAME, which then used as the last path component of the default docdir, effectively making $docdir be the same as the previous $BASE_DIR/$DOC_MODULE. Signed-off-by: Mattia Rizzolo <mattia@mapreri.org>

1a013ba7

2021-07-26T20:11:56

configure: remove unused checks for libraries These libraries are queried for, but no code cares about the results, so remove the checks.

0aad075c

2021-07-26T20:10:52

cmake: remove unused checks Even the configured `config.h` did not forward the results of these checks.

9669bd68

2021-07-26T20:09:32

configure: remove unused checks for headers These headers are checked for at configure time, but the code never cares about the results of these checks, so skip them.

f8608235

2021-07-26T20:06:18

cmake: fix `ATTRIBUTE_DESTRUCTOR` definition The code expects it to be set to the attribute for `xmlDestructor`, but in CMake, it is only ever available as `1` or undefined. Instead, match the behavior of autoconf.

51c88c6f

2021-07-26T20:12:45

configure: remove unused checks for functions Nothing uses the results from these checks, so remove the checks. There are some "uses" in order to suppress macro shadowing in MSVC's implementation of `isinf` and `isnan` as macros, but those are hard-coded and do not require checks to manage.

3ba59b93

2021-07-23T22:34:29

Generate devhelp2 index file The devhelp2 format was introduced in 2005, and the devhelp format was deprecated in 2017. Fixes: https://gitlab.gnome.org/GNOME/libxml2/-/issues/295

91b3d3f9

2021-07-14T17:12:11

Remove duplicated code in xmlcatalog Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806

d7f11fd0

2021-07-14T17:03:46

Fix leak in __xmlOutputBufferCreateFilename Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806

477f6de3

2021-07-14T15:35:31

Fix memory leak in xmlRelaxNGNewDocParserCtxt Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806

483de2c2

2021-07-14T15:31:55

Fix memory leak in xmlRelaxNGParseData Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806

9a9dd31b

2021-07-14T15:28:56

Fix memory leak in libxml_C14NDocSaveTo Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806

d68c1637

2021-07-14T15:23:11

Fix memory leak in libxml_saveNodeTo Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806

328456bf

2021-07-14T14:43:59

Fix memory leak in xmlNewInputFromFile Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806

fe564967

2021-07-14T14:35:17

Fix memory leak in xmlCreateIOParserCtxt Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806

f0904f32

2021-07-14T14:14:34

Fix memory leak in xmlParseSGMLCatalog Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806

2510f43c

2021-07-14T14:03:44

Fix memory leak in xmlParseCatalogFile Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806

92bce68c

2021-07-14T11:37:07

Fix memory leak in xmlSAX2AttributeDecl Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806

e7d1c53a

2021-07-14T11:32:57

Fix memory leak in xmlFreeParserInputBuffer Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806

03bb9293

2021-07-07T18:23:18

Fix parse failure when 4-byte character in UTF-16 BE is split across a chunk This makes the logic in UTF16BEToUTF8() match UTF16LEToUTF8(). * encoding.c: (UTF16LEToUTF8): - Fix comment to describe what the code does. (UTF16BEToUTF8): - Fix undefined behavior which was applied to UTF16LEToUTF8() in 2f9382033e. - Add bounds check to while() loop which was applied to UTF16LEToUTF8() in be803967db. - Do not return -2 when (in >= inend) to fix the bug. This was applied to UTF16LEToUTF8() in 496a1cf592. - Inline (<< 8) statements to match UTF16LEToUTF8(). Add the following tests and results: test/text-4-byte-UTF-16-BE-offset.xml test/text-4-byte-UTF-16-BE.xml test/text-4-byte-UTF-16-LE-offset.xml test/text-4-byte-UTF-16-LE.xml
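The chunk-boundary behavior described in this commit can be illustrated with a minimal, self-contained sketch. It is not the libxml2 implementation: the function name `utf16be_to_utf8` and its exact contract are hypothetical. The point it shows is that an incomplete trailing surrogate pair is left unconsumed rather than reported as an error, so the caller can retry once the next chunk arrives.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not the libxml2 API): decode big-endian UTF-16 from
 * in[0..*inlen) into UTF-8 in out[0..*outlen).
 * On return, *inlen holds the input bytes consumed and *outlen the output
 * bytes written. Returns 0 on success and -2 on malformed input.
 * A code unit or surrogate pair cut off by the end of the chunk is not an
 * error: it simply stays unconsumed.
 */
static int
utf16be_to_utf8(unsigned char *out, size_t *outlen,
                const unsigned char *in, size_t *inlen)
{
    const size_t n = *inlen, cap = *outlen;
    size_t i = 0, o = 0;
    int ret = 0;

    while (n - i >= 2) {
        unsigned long c = ((unsigned long) in[i] << 8) | in[i + 1];
        size_t used = 2, need;

        if (c >= 0xD800 && c <= 0xDBFF) {           /* high surrogate */
            unsigned long d;
            if (n - i < 4)
                break;                              /* pair split across chunks */
            d = ((unsigned long) in[i + 2] << 8) | in[i + 3];
            if (d < 0xDC00 || d > 0xDFFF) {
                ret = -2;                           /* missing low surrogate */
                break;
            }
            c = 0x10000 + ((c - 0xD800) << 10) + (d - 0xDC00);
            used = 4;
        } else if (c >= 0xDC00 && c <= 0xDFFF) {
            ret = -2;                               /* lone low surrogate */
            break;
        }

        need = (c < 0x80) ? 1 : (c < 0x800) ? 2 : (c < 0x10000) ? 3 : 4;
        if (cap - o < need)
            break;                                  /* output buffer full */

        if (need == 1) {
            out[o++] = (unsigned char) c;
        } else if (need == 2) {
            out[o++] = (unsigned char) (0xC0 | (c >> 6));
            out[o++] = (unsigned char) (0x80 | (c & 0x3F));
        } else if (need == 3) {
            out[o++] = (unsigned char) (0xE0 | (c >> 12));
            out[o++] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
            out[o++] = (unsigned char) (0x80 | (c & 0x3F));
        } else {
            out[o++] = (unsigned char) (0xF0 | (c >> 18));
            out[o++] = (unsigned char) (0x80 | ((c >> 12) & 0x3F));
            out[o++] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
            out[o++] = (unsigned char) (0x80 | (c & 0x3F));
        }
        i += used;
    }

    *inlen = i;
    *outlen = o;
    return ret;
}
```

A caller would keep the unconsumed tail of one chunk and prepend it to the next one before calling the function again.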

e6adc19f

2021-07-05T13:40:54

man: Mention XML_CATALOG_FILES is space-separated Fixes: https://bugzilla.gnome.org/show_bug.cgi?id=781274

bdd482c2

2021-07-05T18:48:10

Add documentation for xmllint exit code 10 Closes: https://gitlab.gnome.org/GNOME/libxml2/-/issues/280

a0f9211b

2021-06-28T02:03:15

python/Makefile.am: use *_LIBADD, not *_LDFLAGS for LIBS This fixes over-linking in the built Python modules with various libraries. *_LIBADD is intended for adding additional libraries for linking, while *_LDFLAGS is for miscellaneous extra flags (possibly user-supplied). If using -Wl,-as-needed within user-supplied LDFLAGS, it is passed too late (after the library link line) and therefore has no effect. Notes: * Noticed while working on Gentoo's migration to libxcrypt because libxml2's Python modules were linking to libcrypt (and other libraries) unexpectedly. * It was suggested we could actually stop linking explicitly with all of Python's libraries / don't copy its LDFLAGS, but this resolves the original issue downstream and is a separate discussion. I couldn't find any clear documentation for/against such a change. Bug: https://bugs.gentoo.org/798942 Signed-off-by: Sam James <sam@gentoo.org>

ff05c94a

2022-01-16T13:56:17

Fix check for libtool in autogen.sh libtoolize is named glibtoolize on some macOS systems.

343bf0d3

2022-01-16T13:52:21

Add myself to maintainers Fixes #319.

c35628a2

2022-01-15T18:18:22

Revert "Make schema validation fail with multiple top-level elements" This reverts commit 4f2aee18f6e2d40e58eb224f4f7935dc2400fe25. Fixes #305.

798bdf13

2022-01-10T14:50:20

Different approach to fix quadratic behavior in HTML push parser The old approach introduced a regression, see issue #312 and the previous commit. Disable code that tries to recover from invalid start tags. This only affects "recovery" mode. Add a comment outlining a better fix in accordance with the HTML5 spec.

094fc08a

2022-01-10T14:02:10

Fix regression when parsing invalid HTML tags in push mode Revert part of commit 173a0830 that changed behavior when parsing malformed start tags with the push parser. This reintroduces quadratic behavior in recovery mode which will be worked around in the next commit. Fixes #312.

2732b234

2022-01-10T13:32:14

Fix regression parsing public IDs literals in HTML Fix regression introduced when reworking htmlParsePubidLiteral in commit 93ce33c2. Fixes #318.

dea91c97

2021-07-27T16:12:54

Fix buffering in xmlOutputBufferWrite Fix a regression introduced with commit a697ed1e which caused xmlOutputBufferWrite to flush internal buffers too late. Fixes #296.

ec6e3efb

2021-07-06T21:56:04

Patch to forbid epsilon-reduction of final states When building the internal representation of a regexp, it is possible that a lot of empty transitions are created. Therefore there is a step to reduce them in the function xmlFAEliminateSimpleEpsilonTransitions. There is an error there for this case: * State 1 has a transition with an atom (in this case "a") to state 2. * State 2 is final and has an epsilon transition to state 1. After reduction it looked like: * State 1 has a transition with an atom (in this case "a") to itself and is final. In other words, the empty string is accepted when it shouldn't be. The attached patch skips the reduction step for final states. An alternative would be to insert or increment counters when reducing a final state, but this seemed error prone and unnecessary, since there aren't that many final states. Fixes #282

22f15211

2021-06-04T09:57:46

Use version in configure.ac for CMake Now CMake script reads version from configure.ac to prevent unsynchronized versions

92d9ab4c

2021-06-07T15:09:53

Fix whitespace when serializing empty HTML documents The old, non-recursive HTML serialization code would always terminate the output with a newline. The new implementation omitted the newline if the document node had no children. Readd the newline when serializing empty documents. Fixes #266.

3e1aad4f

2021-06-02T17:31:49

Fix XPath recursion limit Fix accounting of recursion depth when parsing XPath expressions. This silly bug introduced in commit 804c5297 could lead to spurious errors when parsing larger expressions or XSLT documents. Should fix #264.

13ad8736

2021-05-25T10:55:25

Fix regression in xmlNodeDumpOutputInternal Commit 85b1792e could cause additional whitespace if xmlNodeDump was called with a non-zero starting level.

a46e85f6

2021-05-22T15:20:46

Update CMake project version

a1cac3bb

2021-05-22T14:51:26

Add CMake alias targets for embedded projects

2c0f2f03

2021-05-18T09:52:55

Fix some validation errors in the FAQ Move paragraphs inside li elements.

b92b16f6

2021-05-19T10:15:54

Remove unused variable in xmlCharEncOutFunc Fixes a compiler warning: encoding.c: In function 'xmlCharEncOutFunc__internal_alias': encoding.c:2632:9: warning: unused variable 'output' [-Wunused-variable] 2632 | int output = 0; https://gitlab.gnome.org/GNOME/libxml2/-/issues/254

7d4060d2

2021-05-16T18:00:21

Add missing file xmlwin32version.h.in to EXTRA_DIST

4fc473d7

2021-05-16T17:48:07

Add instructions on how to use CMake to compile libxml

85b1792e

2021-05-18T20:08:28

Work around lxml API abuse Make xmlNodeDumpOutput and htmlNodeDumpFormatOutput work with corrupted parent pointers. This used to work with the old recursive code but the non-recursive rewrite required parent pointers to be set correctly. Unfortunately, lxml relies on the old behavior and passes subtrees with a corrupted structure. Fall back to a recursive function call if an invalid parent pointer is detected. Fixes #255.

a7b9f3eb

2021-05-20T13:38:54

fix: avoid segfault at exit when using custom memory functions This extends the fix introduced by 956534e to Windows processes dynamically loading libxml2. Closes #256.

b48e77cf

2021-05-13T20:56:16

Release of libxml2-2.9.12 Brown paper bag release, some recently added sources were missing from the 2.9.11 tarball: - configure.ac: bump version - fuzz/Makefile.am: add fuzz.h and seed/regexp to EXTRA_DIST

e1bcffea

2021-05-13T15:35:21

Release of libxml2-2.9.11 Prompted by CVE-2021-3541, but this includes an awful lot of serious bug fixes by Nick and others. - configure.ac: bumped to new release - doc/* updated and regenerated

8598060b

2021-05-13T14:55:12

Patch for security issue CVE-2021-3541 This is related to parameter entities expansion and following the line of the billion laughs attack. Somehow in that path the counting of parameters was missed and the normal algorithm based on entities "density" was useless.

bfd2f430

2021-05-09T18:56:57

Fix null deref in legacy SAX1 parser Always call nameNsPush instead of namePush. The latter is unused now and should probably be removed from the public API. I can't see how it could be used reasonably from client code and the unprefixed name has always polluted the global namespace. Fixes a null pointer dereference introduced with de5b624f when parsing in SAX1 mode. Found by OSS-Fuzz.

ce00c36e

2021-05-08T21:20:05

Store per-element parser state in a struct Make the parser context's "pushTab" point to an array of structs instead of void pointers. This avoids casting unrelated types to void pointers, improving readability and portability, and allows for more efficient packing. Ultimately, the struct could be extended to include the contents of "nameTab" and "spaceTab", further simplifying the code. Historically, "pushTab" was only used by the push parser (hence the name), so the change to the public headers should be safe. Also remove an unused parameter from xmlParseEndTag2.

de5b624f

2021-05-08T20:21:29

Fix handling of unexpected EOF in xmlParseContent Readd the XML_ERR_TAG_NOT_FINISHED error on unexpected EOF which was removed in commit 62150ed2. This commit also introduced a regression for direct users of xmlParseContent. Unclosed tags weren't checked.

3e80560d

2021-05-07T10:51:38

Fix line numbers in error messages for mismatched tags Commit 62150ed2 introduced a small regression in the error messages for mismatched tags. This typically only affected messages after the first mismatch, but with custom SAX handlers all line numbers would be off. This also fixes line numbers in the SAX push parser which were never handled correctly.

7279d236

2021-05-06T10:37:07

Fix htmlTagLookup Fix regression introduced with b25acce8. Some users like libxslt may call the HTML output functions on documents with uppercase tag names, so we must keep case-insensitive string comparison. Fixes #248.
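A binary search over a sorted tag table can stay case-insensitive as long as the table order and the comparator agree. The sketch below illustrates that idea with the standard `bsearch` function; the table `tag_names` and the helpers `ci_strcmp`, `cmp_tag`, and `tag_lookup` are hypothetical names for illustration and are not taken from the libxml2 source.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical tag table; it must be sorted in the order ci_strcmp defines. */
static const char *const tag_names[] = {
    "a", "body", "div", "html", "p", "span", "table"
};

/* Case-insensitive comparison for ASCII names, written out for portability. */
static int
ci_strcmp(const char *a, const char *b)
{
    for (;; a++, b++) {
        int ca = tolower((unsigned char) *a);
        int cb = tolower((unsigned char) *b);
        if (ca != cb || ca == 0)
            return ca - cb;
    }
}

/* bsearch comparator: key is a name, elem points at a table entry. */
static int
cmp_tag(const void *key, const void *elem)
{
    return ci_strcmp((const char *) key, *(const char *const *) elem);
}

/* Return the canonical table entry for name, or NULL if it is unknown. */
static const char *
tag_lookup(const char *name)
{
    const char *const *hit = bsearch(name, tag_names,
                                     sizeof tag_names / sizeof tag_names[0],
                                     sizeof tag_names[0], cmp_tag);
    return hit ? *hit : NULL;
}
```

With this comparator, `tag_lookup("DIV")` and `tag_lookup("div")` resolve to the same entry, which is the property callers with uppercase tag names depend on.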

33468d7e

2021-05-03T16:09:44

update for xsd:language type check Fixes #242.

babe7503

2021-05-01T16:53:33

Propagate error in xmlParseElementChildrenContentDeclPriv Check return value of recursive calls to xmlParseElementChildrenContentDeclPriv and return immediately in case of errors. Otherwise, struct xmlElementContent could contain unexpected null pointers, leading to a null deref when post-validating documents which aren't well-formed and parsed in recovery mode. Fixes #243.

5465a8e5

2021-04-25T21:19:59

Update INSTALL.libxml2 Fixes #238.

1098c30a

2021-04-22T19:26:28

Fix use-after-free with `xmllint --xinclude --dropdtd` The --dropdtd option can leave dangling pointers in entity reference nodes. Make sure to skip these nodes when processing XIncludes. This also avoids scanning entity declarations and even modifying them inadvertently during XInclude processing. Move from a block list to an allow list approach to avoid descending into other node types that can't contain elements. Fixes #237.

72b3c067

2021-04-22T19:24:50

Fix dangling pointer with `xmllint --dropdtd` Reset doc->intSubset when dropping the DTD.

bf227135

2020-08-16T17:19:35

Validate UTF8 in xmlEncodeEntities Code is currently assuming UTF-8 without validating. Truncated UTF-8 input can cause out-of-bounds array access. Adds further checks to partial fix in 50f06b3e. Fixes #178
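The class of bug described here, reading past the end of a buffer when a multi-byte sequence is cut short, can be avoided by checking the remaining length before touching continuation bytes. The following is a minimal sketch under that assumption; `utf8_seq_len` is a hypothetical helper, not a libxml2 function, and it deliberately does not reject overlong encodings or surrogate code points.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the length (1..4) of the UTF-8 sequence that
 * starts at s, or 0 if the lead byte is invalid, a continuation byte is
 * malformed, or the sequence is truncated before end.
 */
static int
utf8_seq_len(const unsigned char *s, const unsigned char *end)
{
    int len, i;

    if (s >= end)
        return 0;

    if (s[0] < 0x80)
        return 1;
    else if ((s[0] & 0xE0) == 0xC0)
        len = 2;
    else if ((s[0] & 0xF0) == 0xE0)
        len = 3;
    else if ((s[0] & 0xF8) == 0xF0)
        len = 4;
    else
        return 0;                       /* stray continuation or invalid lead */

    if (end - s < len)
        return 0;                       /* truncated: do not read past end */

    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                   /* not a continuation byte */
    }
    return len;
}
```

The length check comes before the loop, so no byte at or beyond `end` is ever read.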

1358d157

2021-04-21T13:23:27

Fix use-after-free with `xmllint --html --push` Call htmlCtxtUseOptions to make sure that names aren't stored in dictionaries. Note that this issue only affects xmllint using the HTML push parser. Fixes #230.

fb08d9fe

2021-03-20T22:02:26

Fix include order in c14n.h - Include xmlversion.h before testing feature flags. - Include libxml headers before extern "C". Fixes #226.

d3a02679

2021-03-15T13:44:34

CMake: Only add postfixes if MSVC Currently, it catches mingw-w64 in there as well, but mingw-w64 follows linux-like naming with no weird postfixes Signed-off-by: Christopher Degawa <ccom@randomderp.com>

868e49cf

2021-03-16T10:36:04

Allow FP division by zero in xmlXPathInit

d25460da

2021-03-13T19:12:00

Fix XPath NaN/Inf for older GCC versions The DBL_MAX approach could lead to errors caused by excess precision. Switch back to the division-by-zero approach with a work-around for MSVC and use the extern globals instead of macro expressions.
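As a rough illustration of the division-by-zero approach, the sketch below computes NaN and the infinities once at initialization time and stores them in globals. It assumes IEEE 754 floating point. The names `xp_nan`, `xp_pinf`, `xp_ninf`, and `xp_init` are hypothetical, and the use of `volatile` to keep the compiler from folding or diagnosing the division is an assumption of this sketch, not necessarily the work-around used in libxml2.

```c
#include <math.h>

/* Hypothetical globals holding the special values after xp_init(). */
static double xp_nan;
static double xp_pinf;
static double xp_ninf;

/* Compute NaN and +/-Inf at run time; assumes IEEE 754 semantics. */
static void
xp_init(void)
{
    /* volatile: assumption of this sketch, prevents constant folding. */
    volatile double zero = 0.0;

    xp_pinf = 1.0 / zero;
    xp_ninf = -xp_pinf;
    xp_nan = zero / zero;
}
```

Reading the globals after `xp_init()` avoids repeating a macro expression at every use site.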

e20c9c14

2021-03-13T18:41:47

Fix xmlGetNodePath with invalid node types Make xmlGetNodePath return NULL instead of invalid XPath when hitting unsupported node types like DTD content. Reported here: https://mail.gnome.org/archives/xml/2021-January/msg00012.html Original report: https://bugs.php.net/bug.php?id=80680

c3fd8c42

2021-03-13T17:19:32

Fix exponential behavior with recursive entities Fix another case where only recursion depth was limited, but entities would still be expanded over and over again. The test case discovered by fuzzing only affected parsing in recovery mode with XML_PARSE_RECOVER. Found by OSS-Fuzz.

683de7ef

2021-03-04T19:06:04

Fix duplicate xmlStrEqual calls in htmlParseEndTag

8095365b

2021-03-04T18:46:11

Speed up htmlCheckAutoClose Switch to binary search.

b25acce8

2021-03-04T17:44:45

Speed up htmlTagLookup Switch to binary search. This is the first time bsearch is used in the libxml2 code base. But it's a standard library function since C89 and should be portable.

ad101bb5

2021-03-02T13:32:53

Clarify xmlNewDocProp documentation

a6e6498f

2021-03-02T13:09:06

Stop checking attributes for UTF-8 validity I can't see a reason to check attribute content for UTF-8 validity. Other parts of the API like xmlNewText have always assumed valid UTF-8 as extra checks only slow down processing. Besides, setting doc->encoding to "ISO-8859-1" seems pointless, and not freeing the old encoding would cause a memory leak. Note that this was last changed in 2008 with commit 6f8611fd which removed unnecessary encoding/decoding steps. Setting attributes should be even faster now. Found by OSS-Fuzz.

8446d459

2021-03-01T20:56:40

Reduce some fuzzer timeouts OSS-Fuzz has been fuzzing the HTML parser with inputs up to 1 MB for several hundred hours without hitting the 20s timeout. It seems that most timeouts resulting from accidentally quadratic behavior in the HTML parser have been fixed. Start to gradually reduce the timeout to find new performance issues.

688b41a0

2021-03-01T14:17:42

Fix quadratic behavior when looking up xml:* attributes Add a special case for the predefined XML namespace when looking up DTD attribute defaults in xmlGetPropNodeInternal to avoid calling xmlGetNsList. This fixes quadratic behavior in - xmlNodeGetBase - xmlNodeGetLang - xmlNodeGetSpacePreserve Found by OSS-Fuzz.

kc3-lang/libxml2