kmx git

Commit	Date	Message
45fe9924	2024-04-22T17:12:54	parser: Don't create reference in xmlLookupGeneralEntity This should only be done in xmlParseReference. The handling of undeclared entities is still somewhat inconsistent. In element content we create references even if entity substitution is enabled. In attribute values undeclared entities are always ignored.
b717abdd	2024-04-22T15:42:39	parser: Consolidate error handling for undeclared entities Always use XML_WAR_UNDECLARED_ENTITY with warning error level in documents with external subset or parameter entities. Use XML_ERR_UNDECLARED_ENTITY otherwise.
f506ec66	2024-04-15T11:27:44	parser: Always decode entities in namespace URIs Also decode entities in namespace URIs if entity substitution wasn't requested. This should fix some corner cases when comparing namespace URIs. The Namespaces in XML 1.0 spec says: > In a namespace declaration, the URI reference is the normalized value > of the attribute, so replacement of XML character and entity > references has already been done before any comparison. Make the serialization code escape special characters in namespace URIs like in attribute values. This fixes serialization if entities were substituted when parsing. Fixes https://gitlab.gnome.org/GNOME/libxslt/-/issues/106
5bb84b47	2024-04-04T11:55:28	relaxng: Fix tree corruption in xmlRelaxNGParseNameClass Don't create cycles in tree structure. This will lead to an infinite loop or call stack overflow later. Closes: https://gitlab.gnome.org/GNOME/libxml2/-/issues/711
f43197fc	2024-03-29T11:16:45	tree: Don't coalesce text nodes in xmlAdd{Prev,Next}Sibling Commit 9e1c72da from 2001 introduced a bug where xmlAddPrevSibling and xmlAddNextSibling would only try to merge text nodes with one of its new siblings. Commit 4ccd3eb8 fixed this bug but unfortunately, lxml and possibly other downstream code depend on text nodes not being merged. To avoid breaking downstream code while still having somewhat consistent API behavior, it's probably best to make these functions never coalesce text nodes.
4ccd3eb8	2024-03-11T19:43:56	tree: Refactor node insertion Also fixes a text coalescing bug.
186562a1	2024-03-12T19:55:33	parser: Fix detection of duplicate attributes in XML namespace Fixes a regression from commit e0dd330b, resulting in duplicate attributes in the predefined XML namespace not being detected or extraneous default attributes being passed. Fixes #704.
63986c45	2024-01-22T21:02:16	parser: Report fatal error if document entity couldn't be loaded Only lower error level when loading entities. Fixes #667.
29beef65	2024-01-02T21:50:38	parser: Pop inputs if parsing DTD failed This should provide some statistics in ctxt->sizeentcopy even in the error or recovery case.
f237e5b9	2024-01-05T15:40:23	parser: Avoid duplicate namespace errors Don't report an extra attribute uniqueness error if a namespace is undeclared. This matches old behavior.
07c05546	2024-01-04T02:48:02	error: Make xmlFormatError public This is a useful function to get a verbose error report. Allows to remove duplicated code from runtest.c. Also reactivate check for schema parser failures.
d0eb5a7e	2024-01-03T18:12:29	parser: Remove xmlErrEncodingInt Convert the last user to xmlFatalErr.
e8fb3d63	2024-01-02T17:45:54	parser: Convert some "internal errors" to meaningful codes
37c6618b	2023-12-30T02:50:34	parser: Rework parsing of attribute and entity values Don't use a separate function to handle "complex" attributes. Validate UTF-8 byte sequences without decoding. This should improve performance considerably when parsing multi-byte UTF-8 sequences. Use a string buffer to avoid unnecessary allocations and copying when expanding entities. Normalize attribute values in a single pass while expanding entities. Be more lenient in recovery mode. If no entity substitution was requested, validate entities without expanding. Fixes #596. Also fixes #655.
f0dc52d0	2023-12-29T06:00:20	parser: Move cleanup of element stacks to xmlParseContent
d025cfbb	2023-12-27T03:53:24	parser: Always copy content from entity to target. Make sure that references from IDs are updated. Note that if there are IDs with the same value in a document, the last one will now be returned. IDs should be unique, but maybe this should be addressed.
4ecc85d2	2023-12-27T00:44:16	parser: Push general entity input streams on the stack This allows the error handler to give more context.
d944a415	2023-12-26T02:10:35	parser: Fix in-parameter-entity and in-external-dtd checks Use in ctxt->input->entity instead of ctxt->inputNr to determine whether we are inside a parameter entity. Stop using ctxt->external to check whether we're in an external DTD. This is signaled by ctxt->inSubset == 2.
b8313b58	2023-12-26T21:59:08	xpath: Rewrite substring-before and substring-after Don't use buffers. Check malloc failures.
f3fa34dc	2023-12-26T22:37:26	parser: Fix general entity parsing Clear namespace database. Ignore non-fatal errors.
ecfbcc8a	2023-12-25T04:33:00	parser: Rework general entity parsing Don't create a new parser context but reuse the existing one. This exposes bug #601 in a more obvious way.
6e3a2ac6	2023-12-22T21:38:50	xinclude: Rework xml:base fixup The xml:base fixup was broken in more complex cases. Also avoid parsing and building the included URI multiple times.
f0df3e6d	2023-12-21T14:35:18	tests: Try to fix RelaxNG test cases These were added recently in ea695ac0 and 8074b881 but were a total mess of symbolic links and apparently mixed up files. Symbolic links don't work on Windows. Try to salvage one of the tests.
8d0aaf4b	2023-12-19T20:47:36	parser: Remove xmlErrEncoding Use xmlFatalErr or xmlCtxtErrIO.
7e511f35	2023-12-19T15:41:37	io: Pass error codes from xmlFileOpenReal to xmlNewInputFromFile This allows to report the reason why opening a file failed to the parser context and improve error messages. Now we can also remove the stat call before opening a file.
83c6aeef	2023-12-18T21:12:29	relaxng: Improve error handling Pass RelaxNG structured error handler to XML parser. Handle malloc failure from xmlRaiseError. Remove argument from memory error handler. Use xmlRaiseMemoryError. Don't use xmlGenericError. Remove TODO macro.
157df344	2023-12-10T18:23:53	xmlreader: Report malloc failures Fix many places where malloc failures aren't reported. Introduce a new API function xmlTextReaderGetLastError.
e58ea29f	2023-12-10T18:10:42	SAX2: Report malloc failures Fix many places where malloc failures aren't reported. Improve error handling when parsing entity declarations. Fixes #308.
a1f7ecae	2023-12-10T15:25:42	entities: Report malloc failures Fix places where malloc failures aren't reported. Introduce new API function xmlAddEntity that returns separate error codes. Don't invoke global error handler for low-level errors which should be handled by higher layers. Invalid redeclaration warnings will be fixed later.
7d446e97	2023-12-08T12:13:49	parser: Fix namespaces redefined from default attributes This regressed in commit e0dd330b. Also fixes a long-standing issue where namespaces from default attributes weren't added if they match an existing namespace. Fixes #643.
e3959461	2023-11-30T16:15:46	html: Reenable buggy detection of XML declarations Switch to UTF-8 if a document starts with '<?xm' to match old behavior. Also enable this check in the push parser. Fixes #637.
43b511fa	2023-11-26T14:31:39	parser: Make CRLF increment line number Partial revert of cb927e85 fixing CRLFs not incrementing the line number. This requires to rework xmlParseQNameHashed. The original implementation prompted the change to xmlCurrentChar which really shouldn't modify the 'cur' pointer as side effect. But the NEXTL macro relies on this behavior. Ultimately, we should reintroduce the change to xmlCurrentChar and fix the NEXTL macro. This will lead to single CRs incrementing the line number as well which seems more consistent. Fixes #628.
a2b5c90a	2023-11-21T14:35:54	hash: Fix deletion of entries during scan Functions like xmlCleanSpecialAttr scan a hash table and possibly delete entries in the callback. xmlHashScanFull must detect such deletions and rescan the entry. This regressed when rewriting the hash table code in 4a513d56. Fixes #626.
7a2d412f	2023-10-31T20:15:38	parser: Copy default namespace in xmlParseBalancedChunkMemory
e0c2f14d	2023-10-31T13:53:15	parser: Copy namespaces in xmlParseBalancedChunkMemory Reenable copying of namespaces but don't set SAX data. This should match the old behavior.
b76d81da	2023-10-06T11:50:29	parser: Fix regression when push parsing parameter entities Short-lived regression from 834b8123. Also shrink parameter entity buffers when push parsing.
134d2ad8	2023-10-06T00:31:44	parser: Protect against quadratic default attribute expansion
0ba22c05	2023-10-05T22:05:04	parser: Support encoded external PEs in entity values Corner case which was never supported.
6337a14a	2023-10-06T10:44:38	tests: Handle entities in SAX tests
e48f3d8e	2023-09-27T16:47:37	tests: Add more tests for redefined attributes
a873191c	2023-09-25T14:51:35	parser: Introduce xmlParseQNameHashed
53050b1d	2023-08-29T20:06:43	parser: More fixes to push parser error handling
bbd918b2	2023-08-29T15:56:37	parser: Fix detection of null bytes Also suppress misleading extra errors. Fixes #122.
c6083a32	2023-08-29T16:30:22	parser: Improve error handling in push parser - Report errors earlier - Align error messages with pull parser
855818bd	2023-08-08T15:21:37	parser: Check for truncated multi-byte sequences When decoding input data, check whether the "raw" buffer is empty after parsing the document. Otherwise, the input ends with a truncated multi-byte sequence which shouldn't be silently ignored.
0ffc2d82	2023-04-30T20:28:47	runtest: Skip element name in schema error messages This makes sure that memory and streaming tests will report the same messages.
e4f85f1b	2023-04-07T11:46:35	[CVE-2023-28484] Fix null deref in xmlSchemaFixupComplexType Fix a null pointer dereference when parsing (invalid) XML schemas. Thanks to Robby Simpson for the report! Fixes #491.
cb1b8b85	2023-04-10T13:06:18	xmlValidatePopElement() can return invalid value (-1) Covered by: test/VC/ElementValid5 This only affects XML Reader API with LIBXML_REGEXP_ENABLED and LIBXML_VALID_ENABLED turned on. * result/VC/ElementValid5.rdr: - Update result to add missing error message. * python/tests/reader2.py: * result/VC/ElementValid6.rdr: * result/VC/ElementValid7.rdr: * result/valid/781333.xml.err.rdr: - Update result to fix grammar issue. * valid.c: (xmlValidatePopElement): - Check return value of xmlRegExecPushString() to handle -1, and assign 'ret = 0;' to return 0 from xmlValidatePopElement(). This change affects xmlTextReaderValidatePop() from xmlreader.c. - Fix grammar of error message by changing 'child' to 'children'.
d7d0bc65	2023-03-31T16:47:48	SAX2: Ignore namespaces in HTML documents In commit 21ca8829, we started to ignore namespaces in HTML element names but we still called xmlSplitQName, effectively stripping the namespace prefix. This would cause elements like <o:p> being parsed as <p>. Now we leave the name untouched. Fixes #508.
e20f4d7a	2023-02-13T14:38:05	xinclude: Fix quadratic behavior in xmlXIncludeLoadTxt Also make text inclusions work with memory buffers, for example when using a custom entity loader, and fix a memory leak in case of invalid characters. Fixes #483.
be0ec005	2023-02-03T14:37:49	xinclude: Abort immediately if max depth was exceeded Avoids resource exhaustion if the maximum recursion depth was exceeded. Note that the XInclude engine offers no protection against other "billion laughs"-style amplification attacks as long as they stay below the maximum depth.
74aa61e0	2023-01-22T13:09:03	parser: Halt parser on DTD errors If we try to continue parsing after an error in the internal or external subset, entity expansion accounting gets more complicated. Simply halt the parser. Found with libFuzzer.
608c65bb	2023-01-18T15:15:41	xpath: number('-') should return NaN Fixes https://gitlab.gnome.org/GNOME/libxslt/-/issues/81
d320a683	2023-01-17T13:50:51	parser: Fix entity check in attributes Don't set the "checked" flag when checking entities in default attribute values. These entities could reference other entities which weren't defined yet, so the check isn't reliable. This fixes a short-lived regression which could lead to a call stack overflow later in xmlStringGetNodeList.
a41b09c7	2022-12-23T21:29:28	parser: Improve detection of entity loops Set a flag to detect entity loops at once instead of processing until the depth limit is exceeded.
d972393f	2022-12-23T21:01:20	parser: Only report a single entity error Don't report errors multiple times for nested entity references.
ae0c9cfa	2022-12-12T23:54:39	uri: Fix handling of port numbers Allow port number without host, real fix for #71. Also compare port numbers in xmlBuildRelativeURI. Fix handling of port numbers in xmlUriEscape.
76c6da42	2022-12-04T23:01:00	error: Make sure that error messages are valid UTF-8 This has caused issues with the Python bindings for a long time. Should fix #64.
9c63cea5	2022-11-20T15:36:41	test: Add test for push parser boundaries
68a6518c	2022-11-15T18:23:33	parser: Rewrite push parser boundary checks Remove inaccurate xmlParseCheckTransition check. Remove non-incremental xmlParseGetLasts check. Add functions that check for several boundary constructs more accurately, keeping track of progress in ctxt->checkIndex. Fixes #439.
76d6b0d7	2022-11-14T21:02:15	html: Don't escape ASCII chars in href attributes In several cases, href attributes can contain ASCII characters which are illegal in URIs. Escaping them often does more harm than good. Fixes #321.
f61b8a62	2022-11-13T21:47:03	parser: Fix DTD parser progress checks This is another attempt at fixing parser progress checks. Instead of relying on in->consumed, which could overflow, change some DTD parser functions to make guaranteed progress on certain byte sequences.
b456e3bb	2022-10-30T20:28:20	xinclude: Always allow XPtr expressions in external documents
eef0a739	2022-10-30T12:21:20	xinclude: Implement "streaming" mode When using xmlreader, XPointer expressions in XIncludes simply cannot work. Expressions can reference nodes which weren't parsed yet or which were already deleted. After fixing nested XIncludes, we reference includes which were parsed previously. When streaming, these nodes could have been deleted, leading to use-after-free errors. Disallow XPointer expressions and truncate the include table in streaming mode.
20e2fb4c	2022-10-23T17:52:29	xinclude: Avoid creation of subcontexts Don't create subcontext in xmlXIncludeRecurseDoc. Save and restore 'doc' and 'incTab' instead. Make xmlXIncludeLoadFallback call xmlXIncludeCopyNode which seems safer than xmlXIncludeDoProcess since the latter may modify the document. This should also be more performant since we need to copy the whole fallback subtree anyway. Also make sure to avoid replacements in fallback elements in xmlXIncludeDoProcess.
d2ed1e4f	2022-10-22T16:50:18	xinclude: Limit recursion depth This avoids call stack overflows.
ea7c9fb5	2022-10-22T16:48:58	xinclude: Don't create result doc for test with errors
34496f26	2022-10-22T16:09:21	xinclude: Test for inclusion loops
bc267cb9	2022-10-22T02:19:22	xinclude: Expand includes in xmlXIncludeCopyNode This should make nested includes work reliably. Fixes #424.
c99cde3f	2022-10-22T16:59:35	xinclude: Also test error messages The reader interface with XIncludes is somewhat broken and can generate different error messages. Start to move tests which are sketchy with reader to a separate directory.
938105b5	2022-10-21T15:56:12	Revert "xinclude: Fix regression with nested includes" This reverts commit 7f04e297318b1b908cec20711f74f75625afed7f which caused memory errors. See #424.
7f04e297	2022-10-18T18:40:00	xinclude: Fix regression with nested includes This reverts commits 74dcc10b and 87d20b55. Fixes #424.
1d4f5d24	2022-09-13T16:40:31	schemas: Fix null-pointer-deref in xmlSchemaCheckCOSSTDerivedOK Found by OSS-Fuzz.
c7149792	2022-09-01T23:15:35	Fix --with-valid --without-regexps build This build config resulted in segfaults in 'runtest' because a special xmlElementContentPtr showed up in a few places. I'm not sure if this is the right fix. An error message was changed to conform to the --with-regexps build. There are still a few missing validity errors, so the tests don't pass.
e986d09c	2022-07-15T14:02:26	Skip incorrectly opened HTML comments Commit 4fd69f3e fixed handling of '<' characters not followed by an ASCII letter. But a '<!' sequence followed by invalid characters should be treated as bogus comment and skipped. Fixes #380.
14517012	2022-04-23T19:19:33	Fix parsing of subtracted regex character classes Fixes #370.
4612ce30	2022-04-21T03:52:52	Implement xpath1() XPointer scheme See https://www.w3.org/2005/04/xpointer-schemes/
41afa89f	2022-04-10T14:09:29	Fix short-lived regression in xmlStaticCopyNode Commit 7618a3b1 didn't account for coalesced text nodes. I think it would be better if xmlStaticCopyNode didn't try to coalesce text nodes at all. This code path can only be triggered if some other code doesn't coalesce text nodes properly. In this case, OSS-Fuzz found such behavior in xinclude.c.
4de7f2ac	2022-04-04T03:28:21	Remove unused result files
f1c32b4c	2020-07-09T03:19:13	Allow missing result files in runtest Treat missing files as empty.
95c7f315	2022-04-03T21:39:14	Move SVG tests to runtest.c Also update the test results for the first time since 2000.
48b03c84	2022-04-03T20:36:38	Remove major parts of old test suite Remove all the parts of the old test suite which are covered by runtest.c for quite some time. The following test programs are removed: - testC14N - testHTML - testReader - testRelax - testSAX - testSchemas - testURI - testXPath This also removes a few results of unimportant tests only run by the old test suite.
57b81c20	2022-03-05T18:20:29	Normalize XPath strings in-place Simplify the code and fix a potential memory leak. Fixes #343.
bc06a522	2022-03-02T02:57:49	Fix recursion check in xinclude.c Compare the included URL with the document's URL to detect local inclusions. Fixes #348.
d7b287b9	2021-07-17T14:36:53	htmlParseComment: handle abruptly-closed comments See guidance provided on abruptly-closed comments here: https://html.spec.whatwg.org/multipage/parsing.html#parse-error-abrupt-closing-of-empty-comment
24cdc890	2021-07-17T14:06:49	test coverage for abruptly-closed comments These establish baseline behavior so that the subsequent commit is clear about the behavior it will modify.
ea6e8f99	2021-12-20T00:34:58	Fix certain combinations of regex range quantifiers Fix regex transitions that have both min/max and a counter. In this case, we want to save the regex state before incrementing the counter. Fixes #301 and the issue reported here: https://mail.gnome.org/archives/xml/2016-April/msg00017.html
382fb056	2021-12-20T00:31:41	Fix range quantifier on subregex Make sure to add counted exit transitions before other counter transitions. Otherwise, we won't backtrack correctly. Fixes #65.
ce0871e1	2022-02-20T16:44:41	Only warn on invalid redeclarations of predefined entities Downgrade the error message to a warning since the error was ignored, anyway. Also print the name of redeclared entity. For a proper fix that also shows filename and line number of the invalid redeclaration, we'd have to - pass the parser context to the entity functions somehow, or - make these functions return distinct error codes. Partial fix for #308.
652dd12a	2022-02-08T03:29:24	[CVE-2022-23308] Use-after-free of ID and IDREF attributes If a document is parsed with XML_PARSE_DTDVALID and without XML_PARSE_NOENT, the value of ID attributes has to be normalized after potentially expanding entities in xmlRemoveID. Otherwise, later calls to xmlGetID can return a pointer to previously freed memory. ID attributes which are empty or contain only whitespace after entity expansion are affected in a similar way. This is fixed by not storing such attributes in the ID table. The test to detect streaming mode when validating against a DTD was broken. In connection with the defects above, this could result in a use-after-free when using the xmlReader interface with validation. Fix detection of streaming mode to avoid similar issues. (This changes the expected result of a test case. But as far as I can tell, using the XML reader with XIncludes referencing the root document never worked properly, anyway.) All of these issues can result in denial of service. Using xmlReader with validation could result in disclosure of memory via the error channel, typically stderr. The security impact of xmlGetID returning a pointer to freed memory depends on the application. The typical use case of calling xmlGetID on an unmodified document is not affected.
9edc20c1	2022-02-07T20:38:30	Fix double counting of CRLF in comments Fixes #151.
5408c10c	2022-02-04T14:00:09	Don't normalize namespace URIs in XPointer xmlns() scheme Namespace URIs should be compared without escaping or unescaping: https://www.w3.org/TR/REC-xml-names/#NSNameComparison Fixes #289.
1c7d91ab	2022-02-03T23:31:19	Fix handling of XSD with empty namespace An empty namespace means no default namespace. Fixes #303.
f480f750	2022-02-03T14:43:17	Update NewsML DTD in test suite Switch to version 1.2 which has a clearer license. Fixes #291.
d85245f9	2022-01-16T21:39:04	Fix regression with PEs in external DTD Fix a regression introduced with commit a28f7d87. In some cases, parameter entity references in external DTDs wouldn't be expanded. Fixes #306.
03bb9293	2021-07-07T18:23:18	Fix parse failure when 4-byte character in UTF-16 BE is split across a chunk This makes the logic in UTF16BEToUTF8() match UTF16LEToUTF8(). * encoding.c: (UTF16LEToUTF8): - Fix comment to describe what the code does. (UTF16BEToUTF8): - Fix undefined behavior which was applied to UTF16LEToUTF8() in 2f9382033e. - Add bounds check to while() loop which was applied to UTF16LEToUTF8() in be803967db. - Do not return -2 when (in >= inend) to fix the bug. This was applied to UTF16LEToUTF8() in 496a1cf592. - Inline (<< 8) statements to match UTF16LEToUTF8(). Add the following tests and results: test/text-4-byte-UTF-16-BE-offset.xml test/text-4-byte-UTF-16-BE.xml test/text-4-byte-UTF-16-LE-offset.xml test/text-4-byte-UTF-16-LE.xml
2732b234	2022-01-10T13:32:14	Fix regression parsing public IDs literals in HTML Fix regression introduced when reworking htmlParsePubidLiteral in commit 93ce33c2. Fixes #318.
de5b624f	2021-05-08T20:21:29	Fix handling of unexpected EOF in xmlParseContent Readd the XML_ERR_TAG_NOT_FINISHED error on unexpected EOF which was removed in commit 62150ed2. This commit also introduced a regression for direct users of xmlParseContent. Unclosed tags weren't checked.
3e80560d	2021-05-07T10:51:38	Fix line numbers in error messages for mismatched tags Commit 62150ed2 introduced a small regression in the error messages for mismatched tags. This typically only affected messages after the first mismatch, but with custom SAX handlers all line numbers would be off. This also fixes line numbers in the SAX push parser which were never handled correctly.
01411e7c	2021-02-08T20:58:32	Check for invalid redeclarations of predefined entities Implement section "4.6 Predefined Entities" of the XML 1.0 spec and check whether redeclarations of predefined entities match the original definitions. Note that some test cases declared <!ENTITY lt "<"> But the XML spec clearly states that this is illegal: > If the entities lt or amp are declared, they MUST be declared as > internal entities whose replacement text is a character reference to > the respective character (less-than sign or ampersand) being escaped; > the double escaping is REQUIRED for these entities so that references > to them produce a well-formed result. Also fixes #217 but the connection is only tangential. The integer overflow discovered by fuzzing was more related to the fact that various parts of the parser disagreed on whether to prefer predefined entities over their redeclarations. The whole situation is a mess and even depends on legacy parser options. But now that redeclarations are validated, it shouldn't make a difference. As noted in the added comment, this is also one of the cases where overly defensive checks can hide interesting logic bugs from fuzzers.
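
The double-escaping rule quoted in the last entry (01411e7c) can be tried out with a small sketch. It uses Python's standard-library parser (expat, through xml.etree.ElementTree), not libxml2, so it illustrates only the XML 1.0 rule and says nothing about libxml2's own checks or error codes. The entity name `e` and the helper `expand` are hypothetical, chosen to keep any built-in handling of `lt` out of the picture.

```python
import xml.etree.ElementTree as ET


def expand(entity_value):
    """Declare a general entity `e` with the given literal value in the
    internal subset, reference it once, and return the root's text."""
    doc = '<!DOCTYPE r [<!ENTITY e "%s">]><r>a &e; b</r>' % entity_value
    return ET.fromstring(doc).text


# Double-escaped: the literal "&#38;#60;" is stored as the replacement
# text "&#60;", which is parsed again at the point of reference and
# produces the less-than character as plain character data.
print(repr(expand("&#38;#60;")))

# Single-escaped: the replacement text is a bare "<", which is treated as
# the start of markup when the reference is expanded, so parsing fails.
try:
    expand("&#60;")
except ET.ParseError as err:
    print("not well-formed:", err)
```

The second call shows why the spec requires the extra level of escaping for `lt` and `amp`: without it, every reference would inject raw markup characters into the content.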


[CVE-2023-28484] Fix null deref in xmlSchemaFixupComplexType Fix a null pointer dereference when parsing (invalid) XML schemas. Thanks to Robby Simpson for the report! Fixes #491.

cb1b8b85

2023-04-10T13:06:18

xmlValidatePopElement() can return invalid value (-1) Covered by: test/VC/ElementValid5 This only affects XML Reader API with LIBXML_REGEXP_ENABLED and LIBXML_VALID_ENABLED turned on. * result/VC/ElementValid5.rdr: - Update result to add missing error message. * python/tests/reader2.py: * result/VC/ElementValid6.rdr: * result/VC/ElementValid7.rdr: * result/valid/781333.xml.err.rdr: - Update result to fix grammar issue. * valid.c: (xmlValidatePopElement): - Check return value of xmlRegExecPushString() to handle -1, and assign 'ret = 0;' to return 0 from xmlValidatePopElement(). This change affects xmlTextReaderValidatePop() from xmlreader.c. - Fix grammar of error message by changing 'child' to 'children'.

d7d0bc65

2023-03-31T16:47:48

SAX2: Ignore namespaces in HTML documents In commit 21ca8829, we started to ignore namespaces in HTML element names but we still called xmlSplitQName, effectively stripping the namespace prefix. This would cause elements like <o:p> being parsed as <p>. Now we leave the name untouched. Fixes #508.

e20f4d7a

2023-02-13T14:38:05

xinclude: Fix quadratic behavior in xmlXIncludeLoadTxt Also make text inclusions work with memory buffers, for example when using a custom entity loader, and fix a memory leak in case of invalid characters. Fixes #483.

be0ec005

2023-02-03T14:37:49

xinclude: Abort immediately if max depth was exceeded Avoids resource exhaustion if the maximum recursion depth was exceeded. Note that the XInclude engine offers no protection against other "billion laughs"-style amplification attacks as long as they stay below the maximum depth.

74aa61e0

2023-01-22T13:09:03

parser: Halt parser on DTD errors If we try to continue parsing after an error in the internal or external subset, entity expansion accounting gets more complicated. Simply halt the parser. Found with libFuzzer.

608c65bb

2023-01-18T15:15:41

xpath: number('-') should return NaN Fixes https://gitlab.gnome.org/GNOME/libxslt/-/issues/81

d320a683

2023-01-17T13:50:51

parser: Fix entity check in attributes Don't set the "checked" flag when checking entities in default attribute values. These entities could reference other entities which weren't defined yet, so the check isn't reliable. This fixes a short-lived regression which could lead to a call stack overflow later in xmlStringGetNodeList.

a41b09c7

2022-12-23T21:29:28

parser: Improve detection of entity loops Set a flag to detect entity loops at once instead of processing until the depth limit is exceeded.

d972393f

2022-12-23T21:01:20

parser: Only report a single entity error Don't report errors multiple times for nested entity references.

ae0c9cfa

2022-12-12T23:54:39

uri: Fix handling of port numbers Allow port number without host, real fix for #71. Also compare port numbers in xmlBuildRelativeURI. Fix handling of port numbers in xmlUriEscape.

76c6da42

2022-12-04T23:01:00

error: Make sure that error messages are valid UTF-8 This has caused issues with the Python bindings for a long time. Should fix #64.

9c63cea5

2022-11-20T15:36:41

test: Add test for push parser boundaries

68a6518c

2022-11-15T18:23:33

parser: Rewrite push parser boundary checks Remove inaccurate xmlParseCheckTransition check. Remove non-incremental xmlParseGetLasts check. Add functions that check for several boundary constructs more accurately, keeping track of progress in ctxt->checkIndex. Fixes #439.

76d6b0d7

2022-11-14T21:02:15

html: Don't escape ASCII chars in href attributes In several cases, href attributes can contain ASCII characters which are illegal in URIs. Escaping them often does more harm than good. Fixes #321.

f61b8a62

2022-11-13T21:47:03

parser: Fix DTD parser progress checks This is another attempt at fixing parser progress checks. Instead of relying on in->consumed, which could overflow, change some DTD parser functions to make guaranteed progress on certain byte sequences.

b456e3bb

2022-10-30T20:28:20

xinclude: Always allow XPtr expressions in external documents

eef0a739

2022-10-30T12:21:20

xinclude: Implement "streaming" mode When using xmlreader, XPointer expressions in XIncludes simply cannot work. Expressions can reference nodes which weren't parsed yet or which were already deleted. After fixing nested XIncludes, we reference includes which were parsed previously. When streaming, these nodes could have been deleted, leading to use-after-free errors. Disallow XPointer expressions and truncate the include table in streaming mode.

20e2fb4c

2022-10-23T17:52:29

xinclude: Avoid creation of subcontexts Don't create subcontext in xmlXIncludeRecurseDoc. Save and restore 'doc' and 'incTab' instead. Make xmlXIncludeLoadFallback call xmlXIncludeCopyNode which seems safer than xmlXIncludeDoProcess since the latter may modify the document. This should also be more performant since we need to copy the whole fallback subtree anyway. Also make sure to avoid replacements in fallback elements in xmlXIncludeDoProcess.

d2ed1e4f

2022-10-22T16:50:18

xinclude: Limit recursion depth This avoids call stack overflows.

ea7c9fb5

2022-10-22T16:48:58

xinclude: Don't create result doc for test with errors

34496f26

2022-10-22T16:09:21

xinclude: Test for inclusion loops

bc267cb9

2022-10-22T02:19:22

xinclude: Expand includes in xmlXIncludeCopyNode This should make nested includes work reliably. Fixes #424.

c99cde3f

2022-10-22T16:59:35

xinclude: Also test error messages The reader interface with XIncludes is somewhat broken and can generate different error messages. Start to move tests which are sketchy with reader to a separate directory.

938105b5

2022-10-21T15:56:12

Revert "xinclude: Fix regression with nested includes" This reverts commit 7f04e297318b1b908cec20711f74f75625afed7f which caused memory errors. See #424.

7f04e297

2022-10-18T18:40:00

xinclude: Fix regression with nested includes This reverts commits 74dcc10b and 87d20b55. Fixes #424.

1d4f5d24

2022-09-13T16:40:31

schemas: Fix null-pointer-deref in xmlSchemaCheckCOSSTDerivedOK Found by OSS-Fuzz.

c7149792

2022-09-01T23:15:35

Fix --with-valid --without-regexps build This build config resulted in segfaults in 'runtest' because a special xmlElementContentPtr showed up in a few places. I'm not sure if this is the right fix. An error message was changed to conform to the --with-regexps build. There are still a few missing validity errors, so the tests don't pass.

e986d09c

2022-07-15T14:02:26

Skip incorrectly opened HTML comments Commit 4fd69f3e fixed handling of '<' characters not followed by an ASCII letter. But a '<!' sequence followed by invalid characters should be treated as bogus comment and skipped. Fixes #380.

14517012

2022-04-23T19:19:33

Fix parsing of subtracted regex character classes Fixes #370.

4612ce30

2022-04-21T03:52:52

Implement xpath1() XPointer scheme See https://www.w3.org/2005/04/xpointer-schemes/

41afa89f

2022-04-10T14:09:29

Fix short-lived regression in xmlStaticCopyNode Commit 7618a3b1 didn't account for coalesced text nodes. I think it would be better if xmlStaticCopyNode didn't try to coalesce text nodes at all. This code path can only be triggered if some other code doesn't coalesce text nodes properly. In this case, OSS-Fuzz found such behavior in xinclude.c.

4de7f2ac

2022-04-04T03:28:21

Remove unused result files

f1c32b4c

2020-07-09T03:19:13

Allow missing result files in runtest Treat missing files as empty.

95c7f315

2022-04-03T21:39:14

Move SVG tests to runtest.c Also update the test results for the first time since 2000.

48b03c84

2022-04-03T20:36:38

Remove major parts of old test suite Remove all the parts of the old test suite which are covered by runtest.c for quite some time. The following test programs are removed: - testC14N - testHTML - testReader - testRelax - testSAX - testSchemas - testURI - testXPath This also removes a few results of unimportant tests only run by the old test suite.

57b81c20

2022-03-05T18:20:29

Normalize XPath strings in-place Simplify the code and fix a potential memory leak. Fixes #343.

bc06a522

2022-03-02T02:57:49

Fix recursion check in xinclude.c Compare the included URL with the document's URL to detect local inclusions. Fixes #348.

d7b287b9

2021-07-17T14:36:53

htmlParseComment: handle abruptly-closed comments See guidance provided on abruptly-closed comments here: https://html.spec.whatwg.org/multipage/parsing.html#parse-error-abrupt-closing-of-empty-comment

24cdc890

2021-07-17T14:06:49

test coverage for abruptly-closed comments These establish baseline behavior so that the subsequent commit is clear about the behavior it will modify.

ea6e8f99

2021-12-20T00:34:58

Fix certain combinations of regex range quantifiers Fix regex transitions that have both min/max and a counter. In this case, we want to save the regex state before incrementing the counter. Fixes #301 and the issue reported here: https://mail.gnome.org/archives/xml/2016-April/msg00017.html

382fb056

2021-12-20T00:31:41

Fix range quantifier on subregex Make sure to add counted exit transitions before other counter transitions. Otherwise, we won't backtrack correctly. Fixes #65.

ce0871e1

2022-02-20T16:44:41

Only warn on invalid redeclarations of predefined entities Downgrade the error message to a warning since the error was ignored, anyway. Also print the name of redeclared entity. For a proper fix that also shows filename and line number of the invalid redeclaration, we'd have to - pass the parser context to the entity functions somehow, or - make these functions return distinct error codes. Partial fix for #308.

652dd12a

2022-02-08T03:29:24

[CVE-2022-23308] Use-after-free of ID and IDREF attributes If a document is parsed with XML_PARSE_DTDVALID and without XML_PARSE_NOENT, the value of ID attributes has to be normalized after potentially expanding entities in xmlRemoveID. Otherwise, later calls to xmlGetID can return a pointer to previously freed memory. ID attributes which are empty or contain only whitespace after entity expansion are affected in a similar way. This is fixed by not storing such attributes in the ID table. The test to detect streaming mode when validating against a DTD was broken. In connection with the defects above, this could result in a use-after-free when using the xmlReader interface with validation. Fix detection of streaming mode to avoid similar issues. (This changes the expected result of a test case. But as far as I can tell, using the XML reader with XIncludes referencing the root document never worked properly, anyway.) All of these issues can result in denial of service. Using xmlReader with validation could result in disclosure of memory via the error channel, typically stderr. The security impact of xmlGetID returning a pointer to freed memory depends on the application. The typical use case of calling xmlGetID on an unmodified document is not affected.

9edc20c1

2022-02-07T20:38:30

Fix double counting of CRLF in comments Fixes #151.

5408c10c

2022-02-04T14:00:09

Don't normalize namespace URIs in XPointer xmlns() scheme Namespace URIs should be compared without escaping or unescaping: https://www.w3.org/TR/REC-xml-names/#NSNameComparison Fixes #289.

1c7d91ab

2022-02-03T23:31:19

Fix handling of XSD with empty namespace An empty namespace means no default namespace. Fixes #303.

f480f750

2022-02-03T14:43:17

Update NewsML DTD in test suite Switch to version 1.2 which has a clearer license. Fixes #291.

d85245f9

2022-01-16T21:39:04

Fix regression with PEs in external DTD Fix a regression introduced with commit a28f7d87. In some cases, parameter entity references in external DTDs wouldn't be expanded. Fixes #306.

03bb9293

2021-07-07T18:23:18

Fix parse failure when 4-byte character in UTF-16 BE is split across a chunk This makes the logic in UTF16BEToUTF8() match UTF16LEToUTF8(). * encoding.c: (UTF16LEToUTF8): - Fix comment to describe what the code does. (UTF16BEToUTF8): - Fix undefined behavior which was applied to UTF16LEToUTF8() in 2f9382033e. - Add bounds check to while() loop which was applied to UTF16LEToUTF8() in be803967db. - Do not return -2 when (in >= inend) to fix the bug. This was applied to UTF16LEToUTF8() in 496a1cf592. - Inline (<< 8) statements to match UTF16LEToUTF8(). Add the following tests and results: test/text-4-byte-UTF-16-BE-offset.xml test/text-4-byte-UTF-16-BE.xml test/text-4-byte-UTF-16-LE-offset.xml test/text-4-byte-UTF-16-LE.xml

2732b234

2022-01-10T13:32:14

Fix regression parsing public IDs literals in HTML Fix regression introduced when reworking htmlParsePubidLiteral in commit 93ce33c2. Fixes #318.

de5b624f

2021-05-08T20:21:29

Fix handling of unexpected EOF in xmlParseContent Readd the XML_ERR_TAG_NOT_FINISHED error on unexpected EOF which was removed in commit 62150ed2. This commit also introduced a regression for direct users of xmlParseContent. Unclosed tags weren't checked.

3e80560d

2021-05-07T10:51:38

Fix line numbers in error messages for mismatched tags Commit 62150ed2 introduced a small regression in the error messages for mismatched tags. This typically only affected messages after the first mismatch, but with custom SAX handlers all line numbers would be off. This also fixes line numbers in the SAX push parser which were never handled correctly.

01411e7c

2021-02-08T20:58:32

Check for invalid redeclarations of predefined entities Implement section "4.6 Predefined Entities" of the XML 1.0 spec and check whether redeclarations of predefined entities match the original definitions. Note that some test cases declared <!ENTITY lt "<"> But the XML spec clearly states that this is illegal: > If the entities lt or amp are declared, they MUST be declared as > internal entities whose replacement text is a character reference to > the respective character (less-than sign or ampersand) being escaped; > the double escaping is REQUIRED for these entities so that references > to them produce a well-formed result. Also fixes #217 but the connection is only tangential. The integer overflow discovered by fuzzing was more related to the fact that various parts of the parser disagreed on whether to prefer predefined entities over their redeclarations. The whole situation is a mess and even depends on legacy parser options. But now that redeclarations are validated, it shouldn't make a difference. As noted in the added comment, this is also one of the cases where overly defensive checks can hide interesting logic bugs from fuzzers.
