tree.c


Log

Author Commit Date CI Message
Nick Wellnhofer 8677f547 2024-03-05T03:24:45 malloc-fail: Fix erroneous report in xmlNodeGetBaseSafe
Nick Wellnhofer 9b3750c6 2024-03-04T03:49:23 malloc-fail: Avoid use-after-free in xmlAddChild Returning NULL doesn't signal that the node was freed.
Nick Wellnhofer 702f2e46 2024-03-04T01:39:34 malloc-fail: Fix memory leak in xmlNewNodeEatName
Nick Wellnhofer e4e90961 2024-03-07T13:00:14 tree: Prefer xmlGetPropNodeInternal over xmlHasNsProp xmlHasNsProp can cause unreported malloc failures when looking up default attributes. Switch to xmlGetPropNodeInternal when moving attributes. We don't care about default attributes in this case.
Nick Wellnhofer 7d9ffd40 2024-03-06T19:44:00 tree: Report malloc failure in xmlAddNextSibling
Nick Wellnhofer bc7ab5a2 2024-03-02T18:59:51 tree: Rewrite xmlSetTreeDoc Report malloc failures. Fix when called directly on attribute node. Clear 'content' and 'last' and look up new entity for entity reference nodes.
Nick Wellnhofer 2ba690a7 2024-03-05T16:34:22 tree: Remove more unused node types
Nick Wellnhofer fc9a2ca0 2024-03-06T16:02:24 tree: Report more malloc failures
Nick Wellnhofer 536aa2cd 2024-03-04T16:55:32 tree: Fix adding ids in xmlNewPropInternal Don't try to add ids to NULL document. Report malloc failure from xmlIsID.
Nick Wellnhofer d0d6174e 2024-02-29T19:38:29 valid: Rework xmlAddID
Nick Wellnhofer d57c57ed 2024-03-05T14:53:35 tree: Improve argument check in xmlTextConcat
Nick Wellnhofer 16c29557 2024-03-05T14:52:34 tree: Remove unused node types
Nick Wellnhofer f960c60d 2024-03-05T03:25:16 tree: Make namespace comparison more consistent The API allows NULL namespace URIs, so we should match them consistently. Simply use xmlStrEqual which already takes NULL strings into account.
Nick Wellnhofer d1cc6f7d 2024-03-05T04:34:59 tree: Don't allow NULL name in xmlSetNsProp
Nick Wellnhofer 2840e33c 2024-03-04T07:34:25 tree: Allocate XML namespace statically
Nick Wellnhofer 696faeb4 2024-03-05T16:17:57 tree: Rework xmlNodeListGetString Use string buffer to avoid quadratic complexity. Handle entities with xmlBufGetNodeContent. Report malloc failures.
Nick Wellnhofer 41964548 2024-02-28T12:17:57 tree: Rework xmlTextMerge Return NULL on error. Check for malloc failure. Check that nodes are distinct.
Nick Wellnhofer a3713f78 2024-02-28T11:44:46 tree: Rework xmlNodeSetName Disallow xmlNodeSetName on DTD nodes. DTD nodes don't store the name in a dictionary. Calling xmlNodeSetName with a DTD node could result in an invalid free. This function doesn't report errors but we can make sure that name isn't set to NULL.
Nick Wellnhofer 77c71350 2024-02-27T20:21:48 tree: Simplify xmlAddChild with text parent
Nick Wellnhofer 7e462425 2024-02-27T20:18:42 tree: Don't allow misuse of xmlAddChild xmlAddChild assumes that the child is unlinked. If the child is already linked, return an error instead of corrupting the tree.
Nick Wellnhofer b043d959 2024-03-08T12:40:12 tree: Check return value of xmlNodeAddContent
Nick Wellnhofer 18ebdacf 2024-03-07T13:02:46 tree: Fix error return in xmlGetPropNodeValueInternal
Nick Wellnhofer 2c214a50 2024-02-27T16:29:52 tree: Fix xmlAddPropSibling with duplicate attributes Look up existing attribute before unlinking new attribute. This makes it easier for the fuzzer to detect which attribute will de deleted if there are multiple attributes with the same name.
Nick Wellnhofer 2e765083 2024-02-27T16:23:44 tree: Fix indentation in xmlAddPropSibling
Nick Wellnhofer 16c0374a 2024-02-27T15:31:33 tree: Fix xmlAddSibling with last sibling If the node to be added was already at the correct position, the tree could be corrupted.
Nick Wellnhofer 74ca2f59 2024-02-27T13:44:54 tree: Move type check in xmlAddChild Avoid aborting halfway after changing parent pointer if node types don't match when adding attributes.
Nick Wellnhofer 29db9881 2024-02-23T16:59:40 tree: Fix xmlDocSetRootElement with multiple top-level elements Fix xmlDocSetRootElement when setting the original root if multiple top-level elements are present.
Nick Wellnhofer 4b698dba 2024-02-22T18:13:53 tree: Only allow elements in xmlDocSetRootElement
Nick Wellnhofer d5f50602 2024-02-22T16:12:07 tree: Disallow setting content of entity reference nodes The content of entity reference nodes points to the entity declaration and isn't freed. Changing the content would result in a memory leak.
Nick Wellnhofer 77f2012c 2024-02-22T15:25:05 tree: Rework xmlReconciliateNs
Nick Wellnhofer af66a6b5 2024-02-22T13:03:59 tree: Unlink DTD in xmlStaticCopyNodeList Avoid tree corruption when copying within a document.
Nick Wellnhofer bb22cfb9 2024-02-22T12:39:42 tree: Unlink DTD in xmlFreeNodeList Avoid dangling next/prev pointers.
Nick Wellnhofer a581f651 2024-02-21T12:09:10 tree: Check for integer overflow in xmlStringGetNodeList This function is called with unvalidated strings from functions like xmlNewDocProp, xmlNewDocNode or xmlNodeSetContent, so we have to check for integer overflow after all.
Nick Wellnhofer 6aae1767 2024-02-01T15:18:26 tree: Fix error condition in xmlNodeListGetString Don't return NULL in case of undeclared entities.
Nick Wellnhofer d025cfbb 2023-12-27T03:53:24 parser: Always copy content from entity to target. Make sure that references from IDs are updated. Note that if there are IDs with the same value in a document, the last one will now be returned. IDs should be unique, but maybe this should be addressed.
Nick Wellnhofer c49572e5 2023-12-23T15:03:22 malloc-fail: Fix erroneous report in xmlStringGetNodeList The parser can produce invalid attribute content in recovery mode. Unless this is fixed, xmlStringGetNodeList should ignore such errors silently.
Nick Wellnhofer 0ea47327 2023-12-13T14:44:29 malloc-fail: Fix memory leak in xmlNodeGetBaseSafe Short-lived regression.
Nick Wellnhofer 5c06f4e3 2023-12-12T14:37:17 malloc-fail: Fix erroneous reports in xmlNodeListGetString Short-lived regression.
Nick Wellnhofer aca16fb3 2023-12-10T16:37:43 tree: Report malloc failures Fix many places where malloc failures aren't reported. Make some API function return an error code. Changing the return type from void to int is technically an ABI break but should be safe on most platforms. - xmlNodeSetContent - xmlNodeSetContentLen - xmlNodeAddContent - xmlNodeAddContentLen - xmlNodeSetBase Introduce new API functions that return a separate error code if a memory allocation fails. - xmlNodeGetAttrValue - xmlNodeGetBaseSafe - xmlGetNsListSafe Introduce private functions xmlTreeEnsureXMLDecl and xmlSplitQName4. Don't report low-level errors to the global error handler. Fix tree Introduce xmlGetNsListSafe Fix tree
Nick Wellnhofer 502971cc 2023-12-01T17:49:48 tree: Another fix related to #538 Should fix #639.
Nick Wellnhofer 8707838e 2023-11-28T13:27:25 tree: Fix #583 again Only set doc->intSubset after successful copy to avoid dangling pointers in error case.
Nick Wellnhofer de3f7014 2023-11-28T13:01:38 tree: Fix regression when copying DTDs This reverts commit d39f78069dff496ec865c73aa44d7110e429bce9. Fixes #634.
Nick Wellnhofer 97e99f41 2023-10-05T17:11:24 parser: Acknowledge that entities with namespaces are broken Entities which reference out-of-scope namespace have always been broken. xmlParseBalancedChunkMemoryInternal tried to reuse the namespaces currently in scope but these namespaces were ignored by the SAX handler. Besides, there could be different namespaces in scope when expanding the entity again. For example: <!DOCTYPE doc [ <!ENTITY ent "<ns:elem/>"> ]> <doc> <decl1 xmlns:ns="urn:ns1"> &ent; </decl1> <decl2 xmlns:ns="urn:ns2"> &ent; </decl2> </doc> Add some comments outlining possible solutions to this problem. For now, we stop copying namespaces to the temporary parser context in xmlParseBalancedChunkMemoryInternal. This has never really worked and the recent changes contained a partial fix which uncovered other problems like a use-after-free with the XML Reader interface, found by OSS-Fuzz.
Nick Wellnhofer 8c084ebd 2023-09-21T22:57:33 doc: Make apibuild.py happy
Nick Wellnhofer 9b5cce7a 2023-09-21T00:44:50 include: Remove more unnecessary includes
Nick Wellnhofer 11a1839d 2023-09-20T17:54:48 globals: Move remaining globals back to correct header files This undoes a lot of damage.
Nick Wellnhofer dc3382ef 2023-09-20T12:58:03 globals: Move xmlRegisterNodeDefault to tree.c Code in globals.c must not try to access globals itself since the accessor macros aren't defined and we would only see the main variable.
Nick Wellnhofer 4e1c13eb 2023-09-18T14:45:10 debug: Remove debugging code This is barely useful these days and only clutters the code base.
Nick Wellnhofer d39f7806 2023-08-23T20:24:24 tree: Fix copying of DTDs - Don't create multiple DTD nodes. - Fix UAF if malloc fails. - Skip DTD nodes if tree module is disabled. Fixes #583.
Nick Wellnhofer b8961df6 2023-05-09T03:25:24 SAX: Always validate xml:ids The behavior shouldn't depend on mostly random configuration options.
Nick Wellnhofer dbc893f5 2023-03-03T13:02:11 malloc-fail: Fix memory leak in xmlCopyNamespaceList Found with libFuzzer, see #344.
Nick Wellnhofer a442d16a 2023-02-26T14:48:23 malloc-fail: Fix memory leak in xmlGetNsList Found with libFuzzer, see #344.
Nick Wellnhofer bc7740b3 2023-02-16T11:45:58 malloc-fail: Fix memory leak in xmlCopyPropList Found with libFuzzer, see #344.
Nick Wellnhofer e6401b68 2023-01-17T14:01:23 tree: Fix recursion check in xmlStringGetNodeList Use the new entity flag to check for recursion.
Nick Wellnhofer 481d79d4 2022-12-19T15:26:46 entities: Add XML_ENT_PARSED flag To check whether an entity was already parsed, the code previously tested whether "checked" was non-zero or "children" was non-null. The "children" check could be unreliable because an empty entity also results in an empty (NULL) node list. Use a separate flag to make this check more reliable.
Nick Wellnhofer 2059df53 2022-11-14T22:27:58 buf: Deprecate static/immutable buffers
Nick Wellnhofer b4592709 2022-11-02T16:22:54 malloc-fail: Fix memory leak in xmlStringGetNodeList Also make sure to return NULL on error instead of a partial node list. Found with libFuzzer, see #344.
Nick Wellnhofer dd50cfeb 2022-11-02T15:58:31 malloc-fail: Fix memory leak in xmlNewDocNodeEatName Found with libFuzzer, see #344.
Nick Wellnhofer fa361de0 2022-11-02T15:53:52 malloc-fail: Fix memory leak in xmlNewPropInternal Also fixes a memory leak if called with a non-element node. Found with libFuzzer, see #344.
Nick Wellnhofer a22bd982 2022-11-02T15:44:42 malloc-fail: Fix memory leak in xmlStaticCopyNodeList Found with libFuzzer, see #344.
Nick Wellnhofer 2fc8d123 2022-10-22T19:08:43 xinclude: Make xmlXIncludeCopyNode non-recursive Avoid call stack overflows. Also switch to xmlStaticCopyNode which avoids duplicate namespace definitions.
Nick Wellnhofer 59f2f60e 2022-09-02T00:27:57 Remove "runtime debugging" This doesn't seem useful as configuration option.
Nick Wellnhofer bdcf842c 2022-09-01T20:45:35 Move xmlIsXHTML to tree.c It's declared in tree.h and not guarded by LIBXML_OUTPUT_ENABLED like the other functions in xmlsave.c.
Nick Wellnhofer 2cac6269 2022-09-01T03:14:13 Don't use sizeof(xmlChar) or sizeof(char)
Nick Wellnhofer ad338ca7 2022-09-01T01:18:30 Remove explicit integer casts Remove explicit integer casts as final operation - in assignments - when passing arguments - when returning values Remove casts - to the same type - from certain range-bound values The main motivation is that these explicit casts don't change the result of operations and only render UBSan's implicit-conversion checks useless. Removing these casts allows UBSan to detect cases where truncation or sign-changes occur unexpectedly. Document some explicit casts as truncating and add a few missing ones.
Nick Wellnhofer d7a334f2 2022-08-26T14:43:28 Silence -Warray-bounds warning This is a hack, but works for now. Fixes #389.
Nick Wellnhofer 0f568c0b 2022-08-26T01:22:33 Consolidate private header files Private functions were previously declared - in header files in the root directory - in public headers guarded with IN_LIBXML - in libxml.h - redundantly in source files that used them. Consolidate all private header files in include/private.
Nick Wellnhofer 39745c92 2022-07-19T21:23:44 Improve documentation of tree manipulation API - Discourage use of node constructors without document. - Mention that xmlReconciliateNs is crucial when moving nodes from one document to another.
Nick Wellnhofer 3e7b4f37 2022-05-20T23:28:25 Avoid calling xmlSetTreeDoc Create text nodes with xmlNewDocText or set the document directly to avoid xmlSetTreeDoc being called when the node is inserted.
Nick Wellnhofer 823bf161 2022-05-20T22:38:38 Simplify xmlFreeNode
Nick Wellnhofer a17a1f56 2022-05-18T02:17:31 Don't reset nsDef when changing node content nsDef is only used for element nodes.
Nick Wellnhofer 24646525 2022-05-18T02:16:34 Fix unintended fall-through in xmlNodeAddContentLen
David Kilzer 6ef16dee 2022-05-13T14:43:33 Reserve byte for NUL terminator and report errors consistently in xmlBuf and xmlBuffer This is a follow-up to commit 6c283d83. * buf.c: (xmlBufGrowInternal): - Call xmlBufMemoryError() when the buffer size would overflow. - Account for NUL terminator byte when using XML_MAX_TEXT_LENGTH. - Do not include NUL terminator byte when returning length. (xmlBufAdd): - Call xmlBufMemoryError() when the buffer size would overflow. * tree.c: (xmlBufferGrow): - Call xmlTreeErrMemory() when the buffer size would overflow. - Do not include NUL terminator byte when returning length. (xmlBufferResize): - Update error message in xmlTreeErrMemory() to be consistent with other similar messages. (xmlBufferAdd): - Call xmlTreeErrMemory() when the buffer size would overflow. (xmlBufferAddHead): - Add overflow checks similar to those in xmlBufferAdd().
David Kilzer 4ce2abf6 2022-05-29T09:46:00 Fix missing NUL terminators in xmlBuf and xmlBuffer functions * buf.c: (xmlBufAddLen): - Change check for remaining space to account for the NUL terminator. When adding a length exactly equal to the number of unused bytes, a NUL terminator was not written. (xmlBufResize): - Set `buf->use` and NUL terminator when allocating a new buffer. * tree.c: (xmlBufferResize): - Set `buf->use` and NUL terminator when allocating a new buffer. (xmlBufferAddHead): - Set NUL terminator before returning early when shifting contents.
David Kilzer a6df42e6 2022-05-28T08:08:29 Fix integer overflow in xmlBufferDump() * tree.c: (xmlBufferDump): - Cap the return value to INT_MAX.
David Kilzer 461ef8ac 2022-05-25T14:19:10 Fix double colon typos in xmlBufferResize() Introduced in commit 6c283d83e.
David Kilzer 4bc3ebf3 2022-03-19T17:17:40 Fix ownership of xmlNodePtr & xmlAttrPtr fields in xmlSetTreeDoc() When changing `doc` on an xmlNodePtr or xmlAttrPtr, certain fields must either be a free-standing string, or they must be owned by `doc->dict`. The code to make this change was simply missing, so the crash happened when an xmlAttrPtr was being torn down after `doc` changed from non-NULL to NULL, but the `name` field was not copied. This is scenario 1 below. The xmlNodePtr->name and xmlNodePtr->content fields are also fixed at the same time. Note that xmlNodePtr->content is never added to the dictionary, so NULL is used instead of `newDict` to force a free-standing copy. This change covers all cases of dictionary changes: 1. Owned by old dictionary -> NULL new dictionary - Create free-standing copy of string. 2. Owned by old dictionary -> Non-NULL new dictionary - Get string from new dictionary pool. 3. Not owned by old dictionary -> Non-NULL new dictionary - No action necessary (already a free-standing string). 4. Not owned by old dictionary -> NULL new dictionary - No action necessary (already a free-standing string). * tree.c: (_copyStringForNewDictIfNeeded): Add. (xmlSetTreeDoc): - Update xmlNodePtr->name, xmlNodePtr->content and xmlAttrPtr->name when changing the document, if needed. Found by OSS-Fuzz Issue 45132.
Nick Wellnhofer 6c283d83 2022-03-08T20:10:02 [CVE-2022-29824] Fix integer overflows in xmlBuf and xmlBuffer In several places, the code handling string buffers didn't check for integer overflow or used wrong types for buffer sizes. This could result in out-of-bounds writes or other memory errors when working on large, multi-gigabyte buffers. Thanks to Felix Wilhelm for the report.
Nick Wellnhofer d314046f 2022-04-23T17:41:44 Don't try to copy children of entity references This would result in an error, aborting the whole copy operation. Regressed in commit 7618a3b1. Fixes #371.
Nick Wellnhofer 41afa89f 2022-04-10T14:09:29 Fix short-lived regression in xmlStaticCopyNode Commit 7618a3b1 didn't account for coalesced text nodes. I think it would be better if xmlStaticCopyNode didn't try to coalesce text nodes at all. This code path can only be triggered if some other code doesn't coalesce text nodes properly. In this case, OSS-Fuzz found such behavior in xinclude.c.
Nick Wellnhofer 7618a3b1 2022-02-06T21:11:38 Make xmlStaticCopyNode non-recursive
Nick Wellnhofer d99ddd9b 2022-03-05T21:46:40 Improve buffer allocation scheme In most places, we really need the double-it scheme to avoid quadratic behavior. The hybrid scheme still can cause many reallocations and the bounded scheme doesn't seem to provide meaningful protection in xmlreader.c.
Nick Wellnhofer 4a8c71eb 2022-03-04T03:35:57 Remove DOCBparser This code has been broken and deprecated since version 2.6.0, released in 2003. Because of a bug in commit 961b535c, DOCBparser.c was never compiled since 2012. I couldn't find a Debian package using any of its symbols, so it seems safe to remove this module.
Nick Wellnhofer 776d15d3 2022-03-02T00:29:17 Don't check for standard C89 headers Don't check for - ctype.h - errno.h - float.h - limits.h - math.h - signal.h - stdarg.h - stdlib.h - string.h - time.h Stop including non-standard headers - malloc.h - strings.h
Nick Wellnhofer c41bc10d 2022-02-22T19:57:12 Fix unused variable warnings with disabled features
Nick Wellnhofer 346c3a93 2022-02-20T18:46:42 Remove elfgcchack.h The same optimization can be enabled with -fno-semantic-interposition since GCC 5. clang has always used this option by default.
Nick Wellnhofer 57b3abd5 2022-02-07T22:09:25 Fix xmlSetTreeDoc with entity references The children member of entity reference nodes points to the entity declaration and must never be followed when traversing a tree. In the worst case, this could lead to an infinite loop. It's somewhat unclear how moving entity references to other documents should work exactly. For now we simply set the children pointer to NULL to avoid a reference to the original document. Fixes #42.
Nick Wellnhofer ea53fc18 2022-02-07T18:24:03 Properly handle nested documents in xmlFreeNode Client code should never add document nodes as children of other nodes, but even our own XPointer code has a bug that can produce such trees. Make sure to really free nested documents. Also see commits 0815302d and 0762c9b6. Should fix #269.
Nick Wellnhofer ae728bb8 2022-01-16T15:05:41 Fix null pointer deref in xmlStringGetNodeList Check for malloc failure to avoid null deref.
Nick Wellnhofer e20c9c14 2021-03-13T18:41:47 Fix xmlGetNodePath with invalid node types Make xmlGetNodePath return NULL instead of invalid XPath when hitting unsupported node types like DTD content. Reported here: https://mail.gnome.org/archives/xml/2021-January/msg00012.html Original report: https://bugs.php.net/bug.php?id=80680
Nick Wellnhofer ad101bb5 2021-03-02T13:32:53 Clarify xmlNewDocProp documentation
Nick Wellnhofer a6e6498f 2021-03-02T13:09:06 Stop checking attributes for UTF-8 validity I can't see a reason to check attribute content for UTF-8 validity. Other parts of the API like xmlNewText have always assumed valid UTF-8 as extra checks only slow down processing. Besides, setting doc->encoding to "ISO-8859-1" seems pointless, and not freeing the old encoding would cause a memory leak. Note that this was last changed in 2008 with commit 6f8611fd which removed unnecessary encoding/decoding steps. Setting attributes should be even faster now. Found by OSS-Fuzz.
Nick Wellnhofer 688b41a0 2021-03-01T14:17:42 Fix quadratic behavior when looking up xml:* attributes Add a special case for the predefined XML namespace when looking up DTD attribute defaults in xmlGetPropNodeInternal to avoid calling xmlGetNsList. This fixes quadratic behavior in - xmlNodeGetBase - xmlNodeGetLang - xmlNodeGetSpacePreserve Found by OSS-Fuzz.
Nick Wellnhofer 01411e7c 2021-02-08T20:58:32 Check for invalid redeclarations of predefined entities Implement section "4.6 Predefined Entities" of the XML 1.0 spec and check whether redeclarations of predefined entities match the original definitions. Note that some test cases declared <!ENTITY lt "<"> But the XML spec clearly states that this is illegal: > If the entities lt or amp are declared, they MUST be declared as > internal entities whose replacement text is a character reference to > the respective character (less-than sign or ampersand) being escaped; > the double escaping is REQUIRED for these entities so that references > to them produce a well-formed result. Also fixes #217 but the connection is only tangential. The integer overflow discovered by fuzzing was more related to the fact that various parts of the parser disagreed on whether to prefer predefined entities over their redeclarations. The whole situation is a mess and even depends on legacy parser options. But now that redeclarations are validated, it shouldn't make a difference. As noted in the added comment, this is also one of the cases where overly defensive checks can hide interesting logic bugs from fuzzers.
SVGAnimate 07920b43 2021-01-26T05:42:48 Add the copy of type from original xmlDoc in xmlCopyDoc() A bug related to php DOMDocument: https://bugs.php.net/bug.php?id=80665 When copy/clone an html document, the xmlDoc->type goes from XML_HTML_DOCUMENT_NODE to XML_DOCUMENT_NODE.
Nick Wellnhofer 1d73f07d 2020-12-18T00:55:00 Fix null deref in xmlStringGetNodeList Check for malloc failure to avoid null deref. Found with libFuzzer.
Nick Wellnhofer 20c60886 2020-03-08T17:19:42 Fix typos Resolves #133.
Nick Wellnhofer b0725121 2020-01-10T15:55:07 Fix integer overflow in xmlBufferResize Found by OSS-Fuzz.
Nick Wellnhofer 0815302d 2019-12-06T12:27:29 Fix freeing of nested documents Apparently, some libxslt RVTs can contain nested document nodes, see issue #132. I'm not sure how this happens exactly but it can cause a segfault in xmlFreeNodeList after the changes in commit 0762c9b6. Make sure not to touch the (nonexistent) `content` member of xmlDocs.
Nick Wellnhofer db0c0450 2019-11-02T15:14:10 Enable more undefined behavior sanitizers Minor fix to xmlStringLenGetNodeList to avoid a pointer overflow during API test. Enable pointer-overflow and unsigned-integer-overflow sanitizers in CI tests. Technically, unsigned integer overflows aren't undefined behavior, but they typically indicate programming errors. Some hash functions that really require unsigned integer overflows have already been annotated.