tree.c


Log

Author Commit Date CI Message
Nick Wellnhofer 944cc23c 2024-07-03T15:54:32 tree: Fix handling of empty strings in xmlNodeParseContent We shouldn't create an empty text node to match the old behavior. Fixes #759.
Nick Wellnhofer 842a0448 2024-07-03T11:46:06 valid: Restore ID lookup Revert a change from d025cfbb and don't overwrite ID table entries, so that the first attribute will be returned if there are duplicate IDs. This requires two other changes: - Attributes in entity content are never added to the ID table. This seems reasonable. - Remove the optimization to skip ID lookup when copying and the target document has an empty ID table. This also seems more correct since the document could have ID declarations nevertheless or we could be copying xml:ids into the document for the first time. Fixes #757.
Nick Wellnhofer f505dcae 2024-06-26T14:11:34 tree: Remove underscores from xmlRegisterCallbacks
Nick Wellnhofer b0fc67aa 2024-06-15T22:53:55 build: Remove --with-tree configuration option This option would allow for a smaller, but mostly useless minimal build. But it complicates the symbol availability logic in an insane way and requires specialized tools like our custom C parser in doc/apibuild.py. See #717.
Nick Wellnhofer 2f128096 2024-06-14T16:44:09 tree: Fix freeing entities via xmlFreeNode Call xmlFreeEntity to free all entity members. Fixes #731.
Nick Wellnhofer 5198de4b 2024-05-31T13:42:08 fuzz: Make allocation in xmlBuildQName more likely Limit size of static buffer in fuzzing mode.
Nick Wellnhofer e75e878e 2024-05-20T13:58:22 doc: Update and fix documentation
Nick Wellnhofer b8597f46 2024-04-30T15:58:01 tree: Handle predefined entities in xmlBufGetEntityRefContent It's possible to create references to predefined entities using the tree API. This edge case was exposed by making predefined entities const in commit 63ce5f9a.
Nick Wellnhofer 619e2808 2024-04-30T15:53:08 tree: Don't call xmlNewCharRef in xmlNodeParseContent xmlNewCharRef also tries to handle strings like '&name;' but in xmlNodeParseContentInternal, we really want to use the possibly invalid name without modification. Otherwise, content like '&"' could create a reference to a predefined entity.
Nick Wellnhofer 5e80f438 2024-04-28T17:33:19 tree: Deprecate xmlRegisterNodeDefault This rarely used feature should be phased out.
Nick Wellnhofer 88169bfd 2024-04-28T17:54:36 tree: Deprecate xmlSetCompressMode
Niels Dossche 6053f1ff 2023-11-02T13:57:54 Remove redundant size check The condition size > UINT_MAX - 10 is already checked earlier, so the check is always false.
Nick Wellnhofer fbea03f3 2024-04-19T15:22:30 tree: Remove another redundant check in xmlDOMWrapCloneNode The node type was already checked earlier.
Niels Dossche 1a865567 2023-11-02T14:07:00 Remove redundant NULL check on cur This variable is already NULL checked in the previous if condition.
Niels Dossche 6fadd798 2023-11-02T14:05:31 Remove always-false check old == cur This case is already checked at the start of the function. There it returns NULL, which seems more correct.
Niels Dossche 27665200 2023-11-02T13:59:54 Remove redundant NULL check on cur cur = node, and node cannot be NULL as it is checked at the start of the function.
Nick Wellnhofer a0341ac8 2024-04-18T12:08:30 tree: Don't return empty localname in xmlSplitQName{2,3} Match the behavior of xmlSplitQName and xmlSplitQName4.
Nick Wellnhofer 5c553325 2024-03-29T13:45:19 Revert "tree: Only allow elements in xmlDocSetRootElement" This reverts commit 4b698dbaec9bc6775fc8341ef8a3f0d8321f8548. lxml assumes that xmlDocSetRootElement works with non-elements.
Nick Wellnhofer 7c5daa37 2024-03-29T14:35:07 tree: Ignore namespace with NULL href in xmlSearchNs Some users set href to NULL to unset a namespace without deleting it. Also change the duplicate check in xmlNewNs which must agree with xmlSearchNs. Short-lived regression from f960c60d.
Nick Wellnhofer f43197fc 2024-03-29T11:16:45 tree: Don't coalesce text nodes in xmlAdd{Prev,Next}Sibling Commit 9e1c72da from 2001 introduced a bug where xmlAddPrevSibling and xmlAddNextSibling would only try to merge text nodes with one of its new siblings. Commit 4ccd3eb8 fixed this bug but unfortunately, lxml and possibly other downstream code depend on text nodes not being merged. To avoid breaking downstream code while still having somewhat consistent API behavior, it's probably best to make these functions never coalesce text nodes.
Nick Wellnhofer 2a713a80 2024-03-28T15:09:46 tree: Document behavior if xmlSetTreeDoc fails
Nick Wellnhofer f1e9c7bd 2024-03-28T14:54:18 tree: Optimize xmlInsertNode Relink the node directly without calling xmlUnlinkNodeInternal.
Nick Wellnhofer ea0ee365 2024-03-28T12:38:43 tree: Align xmlAddChild with other node insertion functions Make xmlAddChild unlink the child before insertion. Originally, linked children would most likely cause tree corruption. The first fix disallowed linked nodes, but there are cases where insertion of such nodes could succeed. Don't abort if the node is already a child of parent. In this case, the node will be moved to the end of the child list.
Nick Wellnhofer e5cdb23f 2024-03-28T14:09:10 tree: Introduce xmlUnlinkNodeInternal xmlUnlinkNode also removes references to DTD nodes which shouldn't be done when moving nodes within a document. Introduce a new function xmlUnlinkNodeInternal which only unlinks a node from the tree. Remove references to DTD nodes in xmlNodeSetDoc. Note that moving element and attribute declarations to another document will still leave references in the source document.
Nick Wellnhofer 23a81841 2024-03-25T20:51:14 tree: Work on documentation
Nick Wellnhofer ad9a5637 2024-03-22T19:37:12 tree: Fix uninitialized value in xmlSearchNsSafe Short-lived regression.
Nick Wellnhofer 7b316c11 2024-03-22T12:15:23 tree: Fix uninitialized value in xmlSearchNsByHrefSafe Short-lived regression.
Nick Wellnhofer 3f05508a 2024-03-18T14:14:00 tree: Report malloc failures in attribute setters
Nick Wellnhofer 6a49bb77 2024-03-17T17:16:55 tree: Introduce xmlSearchNsSafe After the failed experiment with a static XML namespace, introduce versions of xmlSearchNs that report malloc failures. Optimize the no-document case by only adding the XML namespace declaration if it wasn't found in an ancestor.
Nick Wellnhofer 047ea3ec 2024-03-17T16:23:31 Revert "tree: Allocate XML namespace statically" This reverts commit 2840e33c5e4b51589a0b96e8102638eeaea6df72.
Nick Wellnhofer 2469d5d0 2024-03-15T02:55:11 tree: Tighten source doc check in xmlDOMWrapAdoptNode sourceDoc must match even if node->doc is NULL.
Nick Wellnhofer 37556eb3 2024-03-14T16:32:58 tree: Check destParent->doc in xmlDOMWrapCloneNode The document must match destDoc to avoid tree corruption.
Nick Wellnhofer 7c48c01b 2024-03-13T12:42:43 tree: Switch to xmlNodeSetDoc in xmlDOMWrapAdoptNode Report malloc failures. Also fixes an issue where xmlDOMWrapAdoptAttr would descend into entity references.
Nick Wellnhofer be2c26fb 2024-03-13T12:15:30 tree: Fix tree iteration in xmlDOMWrapRemoveNode We didn't descend into elements having attributes.
Nick Wellnhofer 4a90ce08 2024-03-12T22:30:43 tree: Don't abort early if malloc fails in DOM functions If malloc fails halfway through updating a subtree, we must process the rest of the tree to avoid tree corruption.
Nick Wellnhofer ad019ba1 2024-03-12T19:50:45 tree: Fix reallocation in xmlDOMWrapNSNormAddNsMapItem2
Nick Wellnhofer e321eba0 2024-03-12T17:42:28 tree: Set parent->last early in xmlDOMWrapCloneNode Avoids a corrupted tree in error case.
Nick Wellnhofer 84e6dc9e 2024-03-12T17:41:30 tree: Declare namespace on clone in xmlDOMWrapCloneNode The new namespace must be declared on the cloned node, not the source node.
Nick Wellnhofer 09905670 2024-03-12T17:40:30 tree: Don't free linked DOM namespaces in error case
Nick Wellnhofer 27f07f10 2024-03-12T16:49:10 tree: Report malloc failure in xmlDOMWrapCloneNode Also don't store text content in dictionaries.
Nick Wellnhofer 8d04f0ee 2024-03-11T20:44:47 tree: Refactor text node updates
Nick Wellnhofer 4ccd3eb8 2024-03-11T19:43:56 tree: Refactor node insertion Also fixes a text coalescing bug.
Nick Wellnhofer 9f049afa 2024-03-11T15:57:14 tree: Refactor element creation and parsing of attribute values Replace xmlStringGetNodeList and xmlStringLenGetNodeList with xmlNodeParseContentInternal which also updates an optional parent node. Don't look up entities a second time via xmlNewReference.
Nick Wellnhofer 9991fae4 2024-03-05T16:16:31 tree: Simplify xmlNodeGetContent, xmlBufGetNodeContent Factor out xmlBufGetEntityRefContent and xmlBufGetChildContent. Also allow entity declarations. Optimize single text children. Ignore missing or recursive entities silently. Prefer xmlNodeGetContent over xmlNodeListGetString. Check for entity cycles in xmlBufGetNodeContent. Use children pointer of entity reference nodes if available to look up entities.
Nick Wellnhofer 05adfbf8 2024-03-11T13:42:15 buf: Don't use default buffer size for small strings Detaching strings from a buffer with a default size of 4096 can waste a lot of memory.
Nick Wellnhofer e3342f73 2024-03-07T17:47:06 tree: Work on documentation
Nick Wellnhofer 8677f547 2024-03-05T03:24:45 malloc-fail: Fix erroneous report in xmlNodeGetBaseSafe
Nick Wellnhofer 9b3750c6 2024-03-04T03:49:23 malloc-fail: Avoid use-after-free in xmlAddChild Returning NULL doesn't signal that the node was freed.
Nick Wellnhofer 702f2e46 2024-03-04T01:39:34 malloc-fail: Fix memory leak in xmlNewNodeEatName
Nick Wellnhofer b043d959 2024-03-08T12:40:12 tree: Check return value of xmlNodeAddContent
Nick Wellnhofer 18ebdacf 2024-03-07T13:02:46 tree: Fix error return in xmlGetPropNodeValueInternal
Nick Wellnhofer e4e90961 2024-03-07T13:00:14 tree: Prefer xmlGetPropNodeInternal over xmlHasNsProp xmlHasNsProp can cause unreported malloc failures when looking up default attributes. Switch to xmlGetPropNodeInternal when moving attributes. We don't care about default attributes in this case.
Nick Wellnhofer 7d9ffd40 2024-03-06T19:44:00 tree: Report malloc failure in xmlAddNextSibling
Nick Wellnhofer bc7ab5a2 2024-03-02T18:59:51 tree: Rewrite xmlSetTreeDoc Report malloc failures. Fix when called directly on attribute node. Clear 'content' and 'last' and look up new entity for entity reference nodes.
Nick Wellnhofer 2ba690a7 2024-03-05T16:34:22 tree: Remove more unused node types
Nick Wellnhofer fc9a2ca0 2024-03-06T16:02:24 tree: Report more malloc failures
Nick Wellnhofer 536aa2cd 2024-03-04T16:55:32 tree: Fix adding ids in xmlNewPropInternal Don't try to add ids to NULL document. Report malloc failure from xmlIsID.
Nick Wellnhofer d0d6174e 2024-02-29T19:38:29 valid: Rework xmlAddID
Nick Wellnhofer d57c57ed 2024-03-05T14:53:35 tree: Improve argument check in xmlTextConcat
Nick Wellnhofer 16c29557 2024-03-05T14:52:34 tree: Remove unused node types
Nick Wellnhofer f960c60d 2024-03-05T03:25:16 tree: Make namespace comparison more consistent The API allows NULL namespace URIs, so we should match them consistently. Simply use xmlStrEqual which already takes NULL strings into account.
Nick Wellnhofer d1cc6f7d 2024-03-05T04:34:59 tree: Don't allow NULL name in xmlSetNsProp
Nick Wellnhofer 2840e33c 2024-03-04T07:34:25 tree: Allocate XML namespace statically
Nick Wellnhofer 696faeb4 2024-03-05T16:17:57 tree: Rework xmlNodeListGetString Use string buffer to avoid quadratic complexity. Handle entities with xmlBufGetNodeContent. Report malloc failures.
Nick Wellnhofer 41964548 2024-02-28T12:17:57 tree: Rework xmlTextMerge Return NULL on error. Check for malloc failure. Check that nodes are distinct.
Nick Wellnhofer a3713f78 2024-02-28T11:44:46 tree: Rework xmlNodeSetName Disallow xmlNodeSetName on DTD nodes. DTD nodes don't store the name in a dictionary. Calling xmlNodeSetName with a DTD node could result in an invalid free. This function doesn't report errors but we can make sure that name isn't set to NULL.
Nick Wellnhofer 77c71350 2024-02-27T20:21:48 tree: Simplify xmlAddChild with text parent
Nick Wellnhofer 7e462425 2024-02-27T20:18:42 tree: Don't allow misuse of xmlAddChild xmlAddChild assumes that the child is unlinked. If the child is already linked, return an error instead of corrupting the tree.
Nick Wellnhofer 2c214a50 2024-02-27T16:29:52 tree: Fix xmlAddPropSibling with duplicate attributes Look up existing attribute before unlinking new attribute. This makes it easier for the fuzzer to detect which attribute will de deleted if there are multiple attributes with the same name.
Nick Wellnhofer 2e765083 2024-02-27T16:23:44 tree: Fix indentation in xmlAddPropSibling
Nick Wellnhofer 16c0374a 2024-02-27T15:31:33 tree: Fix xmlAddSibling with last sibling If the node to be added was already at the correct position, the tree could be corrupted.
Nick Wellnhofer 74ca2f59 2024-02-27T13:44:54 tree: Move type check in xmlAddChild Avoid aborting halfway after changing parent pointer if node types don't match when adding attributes.
Nick Wellnhofer 29db9881 2024-02-23T16:59:40 tree: Fix xmlDocSetRootElement with multiple top-level elements Fix xmlDocSetRootElement when setting the original root if multiple top-level elements are present.
Nick Wellnhofer 4b698dba 2024-02-22T18:13:53 tree: Only allow elements in xmlDocSetRootElement
Nick Wellnhofer d5f50602 2024-02-22T16:12:07 tree: Disallow setting content of entity reference nodes The content of entity reference nodes points to the entity declaration and isn't freed. Changing the content would result in a memory leak.
Nick Wellnhofer 77f2012c 2024-02-22T15:25:05 tree: Rework xmlReconciliateNs
Nick Wellnhofer af66a6b5 2024-02-22T13:03:59 tree: Unlink DTD in xmlStaticCopyNodeList Avoid tree corruption when copying within a document.
Nick Wellnhofer bb22cfb9 2024-02-22T12:39:42 tree: Unlink DTD in xmlFreeNodeList Avoid dangling next/prev pointers.
Nick Wellnhofer a581f651 2024-02-21T12:09:10 tree: Check for integer overflow in xmlStringGetNodeList This function is called with unvalidated strings from functions like xmlNewDocProp, xmlNewDocNode or xmlNodeSetContent, so we have to check for integer overflow after all.
Nick Wellnhofer 6aae1767 2024-02-01T15:18:26 tree: Fix error condition in xmlNodeListGetString Don't return NULL in case of undeclared entities.
Nick Wellnhofer d025cfbb 2023-12-27T03:53:24 parser: Always copy content from entity to target. Make sure that references from IDs are updated. Note that if there are IDs with the same value in a document, the last one will now be returned. IDs should be unique, but maybe this should be addressed.
Nick Wellnhofer c49572e5 2023-12-23T15:03:22 malloc-fail: Fix erroneous report in xmlStringGetNodeList The parser can produce invalid attribute content in recovery mode. Unless this is fixed, xmlStringGetNodeList should ignore such errors silently.
Nick Wellnhofer 0ea47327 2023-12-13T14:44:29 malloc-fail: Fix memory leak in xmlNodeGetBaseSafe Short-lived regression.
Nick Wellnhofer 5c06f4e3 2023-12-12T14:37:17 malloc-fail: Fix erroneous reports in xmlNodeListGetString Short-lived regression.
Nick Wellnhofer aca16fb3 2023-12-10T16:37:43 tree: Report malloc failures Fix many places where malloc failures aren't reported. Make some API function return an error code. Changing the return type from void to int is technically an ABI break but should be safe on most platforms. - xmlNodeSetContent - xmlNodeSetContentLen - xmlNodeAddContent - xmlNodeAddContentLen - xmlNodeSetBase Introduce new API functions that return a separate error code if a memory allocation fails. - xmlNodeGetAttrValue - xmlNodeGetBaseSafe - xmlGetNsListSafe Introduce private functions xmlTreeEnsureXMLDecl and xmlSplitQName4. Don't report low-level errors to the global error handler. Fix tree Introduce xmlGetNsListSafe Fix tree
Nick Wellnhofer 502971cc 2023-12-01T17:49:48 tree: Another fix related to #538 Should fix #639.
Nick Wellnhofer 8707838e 2023-11-28T13:27:25 tree: Fix #583 again Only set doc->intSubset after successful copy to avoid dangling pointers in error case.
Nick Wellnhofer de3f7014 2023-11-28T13:01:38 tree: Fix regression when copying DTDs This reverts commit d39f78069dff496ec865c73aa44d7110e429bce9. Fixes #634.
Nick Wellnhofer 97e99f41 2023-10-05T17:11:24 parser: Acknowledge that entities with namespaces are broken Entities which reference out-of-scope namespace have always been broken. xmlParseBalancedChunkMemoryInternal tried to reuse the namespaces currently in scope but these namespaces were ignored by the SAX handler. Besides, there could be different namespaces in scope when expanding the entity again. For example: <!DOCTYPE doc [ <!ENTITY ent "<ns:elem/>"> ]> <doc> <decl1 xmlns:ns="urn:ns1"> &ent; </decl1> <decl2 xmlns:ns="urn:ns2"> &ent; </decl2> </doc> Add some comments outlining possible solutions to this problem. For now, we stop copying namespaces to the temporary parser context in xmlParseBalancedChunkMemoryInternal. This has never really worked and the recent changes contained a partial fix which uncovered other problems like a use-after-free with the XML Reader interface, found by OSS-Fuzz.
Nick Wellnhofer 8c084ebd 2023-09-21T22:57:33 doc: Make apibuild.py happy
Nick Wellnhofer 9b5cce7a 2023-09-21T00:44:50 include: Remove more unnecessary includes
Nick Wellnhofer 11a1839d 2023-09-20T17:54:48 globals: Move remaining globals back to correct header files This undoes a lot of damage.
Nick Wellnhofer dc3382ef 2023-09-20T12:58:03 globals: Move xmlRegisterNodeDefault to tree.c Code in globals.c must not try to access globals itself since the accessor macros aren't defined and we would only see the main variable.
Nick Wellnhofer 4e1c13eb 2023-09-18T14:45:10 debug: Remove debugging code This is barely useful these days and only clutters the code base.
Nick Wellnhofer d39f7806 2023-08-23T20:24:24 tree: Fix copying of DTDs - Don't create multiple DTD nodes. - Fix UAF if malloc fails. - Skip DTD nodes if tree module is disabled. Fixes #583.
Nick Wellnhofer b8961df6 2023-05-09T03:25:24 SAX: Always validate xml:ids The behavior shouldn't depend on mostly random configuration options.
Nick Wellnhofer dbc893f5 2023-03-03T13:02:11 malloc-fail: Fix memory leak in xmlCopyNamespaceList Found with libFuzzer, see #344.
Nick Wellnhofer a442d16a 2023-02-26T14:48:23 malloc-fail: Fix memory leak in xmlGetNsList Found with libFuzzer, see #344.
Nick Wellnhofer bc7740b3 2023-02-16T11:45:58 malloc-fail: Fix memory leak in xmlCopyPropList Found with libFuzzer, see #344.
Nick Wellnhofer e6401b68 2023-01-17T14:01:23 tree: Fix recursion check in xmlStringGetNodeList Use the new entity flag to check for recursion.