Log

Author Commit Date CI Message
Nick Wellnhofer 7ec314ef 2023-01-30T15:59:55 malloc-fail: Add error checks in xmlXPathEqualValuesCommon Avoid null deref. Found with libFuzzer, see #344.
Nick Wellnhofer 08695683 2023-01-30T15:52:00 malloc-fail: Add error check in xmlXPathEqualNodeSetFloat Avoid null deref. Found with libFuzzer, see #344.
Nick Wellnhofer 75534401 2023-01-30T15:40:23 malloc-fail: Record malloc failure in xmlXPathCompLiteral Avoid OOB array access. Found with libFuzzer, see #344.
Nick Wellnhofer 0e4421e7 2023-01-30T15:05:58 malloc-fail: Check return value of xmlXPathNodeSetDupNs Avoid null deref if allocation fails. Found with libFuzzer, see #344.
Nick Wellnhofer 621c222e 2023-01-30T15:48:11 malloc-fail: Fix error check in xmlXPathCompareValues Avoid null deref. Found with libFuzzer, see #344.
Nick Wellnhofer 6fd89041 2023-01-22T19:42:41 malloc-fail: Fix use-after-free in xmlParseStartTag2 Fix error handling in xmlCtxtGrowAttrs. Found with libFuzzer, see #344.
Nick Wellnhofer c7260a47 2023-01-23T10:19:59 malloc-fail: Don't call xmlErrMemory in xmlstring.c Functions like xmlStrdup are called in the error handling code (__xmlRaiseError) which can cause problems like use-after-free or infinite loops when invoked recursively. Calling xmlErrMemory without a context argument isn't helpful anyway. Found with libFuzzer, see #344.
Nick Wellnhofer e6d22f92 2023-01-23T01:48:37 malloc-fail: Fix reallocation in inputPush Store xmlRealloc result in temporary variable to avoid null deref in error handler. Found with libFuzzer, see #344.
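The "store the realloc result in a temporary variable" pattern from the entry above can be sketched as follows. This is only an illustration, not the actual inputPush code: the `IntStack` type and the `intStackPush` function are hypothetical stand-ins for the parser's input stack.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable stack, standing in for the parser's input stack. */
typedef struct {
    int *items;
    size_t len;
    size_t cap;
} IntStack;

/*
 * Push a value, growing the array if needed.
 * Returns 0 on success, -1 on allocation failure. On failure the stack
 * is left unchanged, so an error handler can still inspect it safely.
 */
static int
intStackPush(IntStack *s, int value) {
    if (s->len == s->cap) {
        size_t newCap = s->cap ? s->cap * 2 : 4;
        int *tmp;

        /* Guard against overflow of the byte count. */
        if (newCap > SIZE_MAX / sizeof(*tmp))
            return -1;
        /* Keep the old pointer until realloc is known to have succeeded. */
        tmp = realloc(s->items, newCap * sizeof(*tmp));
        if (tmp == NULL)
            return -1;
        s->items = tmp;
        s->cap = newCap;
    }
    s->items[s->len++] = value;
    return 0;
}
```

Assigning the result of realloc directly to `s->items` would overwrite the only reference to the old block with NULL on failure, which is what leads to a null dereference in later error handling.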
Nick Wellnhofer 33d4a0fe 2023-01-22T15:41:00 parser: Fix progress check in xmlParseExternalSubset Avoid infinite loop. Short-lived regression from f61b8a62. Found with libFuzzer.
Nick Wellnhofer f65133fc 2023-01-22T14:13:56 uri: Add explicit cast in xmlSaveUri Fix -fsanitize=implicit-conversion error. We should probably percent-escape the host name here.
Nick Wellnhofer c266a220 2023-01-22T18:18:00 malloc-fail: Handle memory errors in xmlTextReaderEntPush Unfortunately, there's no way to properly report memory errors. Found with libFuzzer, see #344.
Nick Wellnhofer f8c5e7fb 2023-01-22T13:49:19 buf: Fix return value of xmlBufGetInputBase Don't return (size_t) -1 in error case. Found with libFuzzer and -fsanitize=implicit-conversion.
Nick Wellnhofer 74aa61e0 2023-01-22T13:09:03 parser: Halt parser on DTD errors If we try to continue parsing after an error in the internal or external subset, entity expansion accounting gets more complicated. Simply halt the parser. Found with libFuzzer.
Nick Wellnhofer d1b87856 2023-01-22T17:42:09 malloc-fail: Fix infinite loop in xmlParseTextDecl Memory errors can set `instate` to `XML_PARSER_EOF` which results in `NEXT` making no progress. Found with libFuzzer, see #344.
Nick Wellnhofer bd9de3a3 2023-01-22T16:52:39 malloc-fail: Fix null deref in xmlAddDefAttrs Found with libFuzzer, see #344.
Nick Wellnhofer 2355eac5 2023-01-22T14:52:06 malloc-fail: Fix null deref if growing input buffer fails Also add some error checks. Found with libFuzzer, see #344.
Nick Wellnhofer 0c5f40b7 2023-01-22T13:27:41 malloc-fail: Fix null deref in xmlSAX2AttributeInternal Found with libFuzzer, see #344.
Nick Wellnhofer 1aabc9db 2023-01-22T13:20:15 malloc-fail: Fix null deref in xmlBufResize Found with libFuzzer, see #344.
Nick Wellnhofer b3b53dcc 2023-01-22T11:28:46 malloc-fail: Fix null deref in xmlSAX2Text Found with libFuzzer, see #344.
Nick Wellnhofer d9a8dab3 2023-01-22T12:00:59 error: Don't move past current position Make sure that we never move past the current position in xmlParserPrintFileContextInternal. Found with libFuzzer and -fsanitize=implicit-conversion.
Nick Wellnhofer 608c65bb 2023-01-18T15:15:41 xpath: number('-') should return NaN Fixes https://gitlab.gnome.org/GNOME/libxslt/-/issues/81
Nick Wellnhofer bbb2b8f1 2023-01-17T16:08:06 Remove symbols from version script The version script didn't account for symbols disabled by configuration options. This has caused problems on some OSs in the past and breaks lld 16 which enables --no-undefined-version by default. A proper fix would be rather involved, so we simply remove all symbols from the version script. This is an ELF-only feature and libxml2 never made use of symbol versioning anyway. Ultimately, this removes the need for a lot of bookkeeping without tangible benefits. We have to keep the version nodes to avoid errors when running binaries linked against older versions of libxml2. Fixes #473.
Nick Wellnhofer e6401b68 2023-01-17T14:01:23 tree: Fix recursion check in xmlStringGetNodeList Use the new entity flag to check for recursion.
Nick Wellnhofer d320a683 2023-01-17T13:50:51 parser: Fix entity check in attributes Don't set the "checked" flag when checking entities in default attribute values. These entities could reference other entities which weren't defined yet, so the check isn't reliable. This fixes a short-lived regression which could lead to a call stack overflow later in xmlStringGetNodeList.
Nick Wellnhofer 59b33661 2022-12-27T14:15:51 error: Limit number of parser errors Reporting errors is expensive and some abusive test cases can generate an error for each invalid input byte. This causes the parser to spend most of the time with error handling. Limit the number of errors and warnings to 100.
Nick Wellnhofer ba910d34 2022-12-26T17:58:33 fuzz: Add test/recurse to seed corpus
Nick Wellnhofer 09dac45a 2022-12-26T17:49:27 fuzz: Add separate XInclude fuzzer XIncludes involve XPath processing which can still lead to timeouts when fuzzing. This will probably take a while to fix. The rest of the XML parsing code should hopefully run without timeouts now. OSS-Fuzz only shows a single timeout test case, so separate the XInclude from the core XML fuzzer.
Nick Wellnhofer 66e9fd66 2022-12-25T21:26:17 parser: Fix infinite loop with push parser in recovery mode Short-lived regression from commit b1f9c193. Found by OSS-Fuzz.
Nick Wellnhofer 49b54d7e 2022-12-25T15:06:51 parser: Fix null deref in xmlStringDecodeEntitiesInt Short-lived regression.
Nick Wellnhofer c885bebb 2022-12-23T23:06:32 fuzz: Remove size limit, disable XInclude Now that entity expansion issues should be fixed, we should get more interesting timeout errors from OSS-Fuzz. Disable XInclude for now, since it often times out in XPath computations. The XInclude tests should be moved to a separate fuzz target.
Nick Wellnhofer 1865668b 2022-12-23T22:44:40 parser: Fix accounting of consumed input bytes Only add consumed bytes if - we're not parsing an entity - we're parsing external parameter entities for the first time. Always ignore internal parameter entities.
Nick Wellnhofer bc18f4a6 2022-12-23T21:55:38 parser: Lower entity nesting limit with XML_PARSE_HUGE The old limit of 1024 could lead to excessively deep call stacks. This could probably be set much lower without causing issues.
Nick Wellnhofer dd62e541 2022-12-23T21:53:30 parser: Don't increase depth twice when parsing internal entities Fix xmlParseBalancedChunkMemoryInternal.
Nick Wellnhofer a41b09c7 2022-12-23T21:29:28 parser: Improve detection of entity loops Set a flag to detect entity loops at once instead of processing until the depth limit is exceeded.
Nick Wellnhofer d972393f 2022-12-23T21:01:20 parser: Only report a single entity error Don't report errors multiple times for nested entity references.
Nick Wellnhofer 28b3777e 2022-12-22T15:35:28 runsuite: Some errors are expected
Nick Wellnhofer 077df27e 2022-12-22T15:22:01 parser: Fix integer overflow of input ID Applies a patch from Chromium. Also stop incrementing input ID of subcontexts. This isn't necessary. Fixes #465.
David Kilzer 0bd4e4e0 2022-12-21T19:21:30 xmlParseStartTag2() contains typo when checking for default definitions for an attribute in a namespace * parser.c: (xmlParseStartTag2): - Fix index into defaults->values. It is only correct the first time through the loop when i == 0. Fixes #467.
Nick Wellnhofer 78c4430f 2022-12-22T00:03:10 doc: Remove ancient files
Nick Wellnhofer 4c763dd0 2022-12-21T22:20:43 gitlab-ci: Revert accidental change to setup_mingw.sh Commit 3aaaf5ca shouldn't have changed this line. We need these libraries for a full libxml2 build.
Nick Wellnhofer c74e5903 2022-12-21T22:24:50 Remove ancient TODOs
Nick Wellnhofer 101a542e 2022-12-21T21:47:10 Remove RPM build, Makefile.tests, README.tests
Nick Wellnhofer b47ebf04 2022-12-21T00:02:47 parser: Deprecate xmlString*DecodeEntities These are internal functions.
Nick Wellnhofer ec6633af 2022-12-20T03:09:11 parser: Remove useless ent->etype test in xmlParseReference If ent->etype is invalid, ret can't equal XML_ERR_OK.
Nick Wellnhofer 7ee7f036 2022-12-20T02:06:38 parser: Remove useless ent->children tests in xmlParseReference The if-block before always returns if ent->children == NULL.
Nick Wellnhofer cfc036bd 2022-12-21T19:27:45 testrecurse: Test parameter entity accounting
Nick Wellnhofer 106c4cdd 2022-12-21T17:05:54 testrecurse: Support multiple huge docs
Nick Wellnhofer 079da5b2 2022-12-21T03:26:31 testrecurse: Add external entities to huge test
Nick Wellnhofer 01bcb23d 2022-12-21T01:01:36 testrecurse: Add test cases for external entities Add test cases for external general and parameter entities.
Nick Wellnhofer 046f99c5 2022-12-21T05:15:51 testrecurse: Add lol_param.xml Add test case contributed by Sebastian Pipping for CVE-2021-3541.
Nick Wellnhofer fafa0252 2022-12-21T01:01:07 testrecurse: Rename test files
Nick Wellnhofer 69aeff53 2022-12-20T22:33:28 testrecurse: Also test without entity substitution
Nick Wellnhofer 4c7cb8f4 2022-12-20T22:42:24 testrecurse: Also test SAX parser
Nick Wellnhofer 583cd2f6 2022-12-21T05:13:23 testrecurse: Start to test entity expansion stats
Nick Wellnhofer ce76ebfd 2022-12-19T20:56:23 entities: Stop counting entities This was only used in the old version of xmlParserEntityCheck.
Nick Wellnhofer a3c8b180 2022-12-19T20:51:52 entities: Add entity flag for loop check
Nick Wellnhofer 463bbeec 2022-12-19T18:39:45 entities: Rework entity amplification checks This commit implements robust detection of entity amplification attacks, better known as the "billion laughs" attack. We now limit the size of the document after substitution of entities to 10 times the size before expansion. This guarantees linear behavior by definition. There already was a similar check before, but the accounting of "sizeentities" (size of external entities) and "sizeentcopy" (size of all copies created by entity references) wasn't accurate. We also need saturation arithmetic since we're historically limited to "unsigned long" which is 32-bit on many platforms. A maximum of 10 MB of substitutions is always allowed. This should make use cases like DITA work which have caused problems in the past. The old checks based on the number of entities were removed. This is accounted for by adding a fixed cost to each entity reference. Entity amplification checks are now enabled even if XML_PARSE_HUGE is set. This option is mainly used to allow larger text nodes. Most users were unaware that it also disabled entity expansion checks. Some of the limits might be adjusted later. If this change turns out to affect legitimate use cases, we can add a separate parser option to disable the checks. Fixes #294. Fixes #345.
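The accounting described in the entry above (a 10x amplification limit, a fixed allowance of 10 MB, a fixed cost per entity reference, and saturation arithmetic on "unsigned long") might look roughly like the sketch below. All names and the value of `FIXED_COST` are assumptions made for illustration; they are not taken from the libxml2 sources, and "10 MB" is assumed to mean 10,000,000 bytes.

```c
#include <limits.h>

#define MAX_AMPLIFICATION 10UL          /* ratio named in the commit message */
#define ALLOWED_EXPANSION 10000000UL    /* assumed reading of "10 MB" */
#define FIXED_COST        20UL          /* hypothetical per-reference cost */

/* Add two counters, clamping at ULONG_MAX instead of wrapping around. */
static unsigned long
satAdd(unsigned long a, unsigned long b) {
    return (b > ULONG_MAX - a) ? ULONG_MAX : a + b;
}

/* Account for one entity reference that expands to `size` bytes. */
static unsigned long
recordExpansion(unsigned long expanded, unsigned long size) {
    return satAdd(expanded, satAdd(size, FIXED_COST));
}

/*
 * Return 1 if the expanded size is over the fixed allowance and more than
 * MAX_AMPLIFICATION times the consumed input, 0 otherwise. Dividing the
 * expanded size avoids overflow in consumed * MAX_AMPLIFICATION; a
 * saturated counter is always treated as exceeding the limit.
 */
static int
expansionExceeded(unsigned long consumed, unsigned long expanded) {
    if (expanded <= ALLOWED_EXPANSION)
        return 0;
    if (expanded == ULONG_MAX)
        return 1;
    return expanded / MAX_AMPLIFICATION > consumed;
}
```

Because the expanded size is bounded by a constant multiple of the input size (plus a constant), total work stays linear in the input, which is the property the commit message relies on.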
Nick Wellnhofer 7e3f469b 2022-12-19T15:59:49 entities: Use flags to store '<' check results Instead of abusing the LSB of the "checked" member, store the result of testing for occurrence of '<' character in "flags". Also use the flags in xmlParseStringEntityRef instead of rescanning every time.
Nick Wellnhofer 481d79d4 2022-12-19T15:26:46 entities: Add XML_ENT_PARSED flag To check whether an entity was already parsed, the code previously tested whether "checked" was non-zero or "children" was non-null. The "children" check could be unreliable because an empty entity also results in an empty (NULL) node list. Use a separate flag to make this check more reliable.
Nick Wellnhofer f34f184f 2022-12-19T15:24:53 entities: Add "flags" member to struct xmlEntity This will hold various flags and eventually replace the "checked" member.
Nick Wellnhofer f67dc618 2022-12-17T00:14:56 xmlreader: Try to fix regression when reading from memory This reverts a change from commit 2059df53, see #462.
Nick Wellnhofer ae0c9cfa 2022-12-12T23:54:39 uri: Fix handling of port numbers Allow port number without host, real fix for #71. Also compare port numbers in xmlBuildRelativeURI. Fix handling of port numbers in xmlUriEscape.
Nick Wellnhofer 8ed40c62 2022-12-13T00:51:33 Revert "uri: Allow port without host" This reverts commit f30adb54f55e4e765d58195163f2a21f7ac759fb. Fixes #460.
Nick Wellnhofer a77e3273 2022-12-08T19:45:40 xmlmemory.c: Remove xmlMemContentShow This debug function was always unsafe and hard-coded pointer sizes to 32 bits. Instead of attempting a fix, remove it completely. These days, tools like ASan are much better to debug memory issues. Fixes #214.
Nick Wellnhofer 25ea7b6a 2022-12-08T19:44:09 testapi.c: Initialize catalog early Avoid leak reports when testing --with-mem-debug.
Nick Wellnhofer eaebf37f 2022-12-08T18:38:45 gentest.py: Fix memory leak in API tests Regressed in commit ff34ba3e.
Nick Wellnhofer 785cfcff 2022-12-08T19:05:12 doc/libxml2-api.xml: Regenerate
Nick Wellnhofer 0f54af74 2022-12-08T18:36:45 encoding.c: Fix for documentation generator Top-level macro invocations throw off the documentation parser.
Lukáš Tyrychtr 85c6cacd 2022-12-08T13:32:49 catalog.c: Silence a cast warning on VS 2022 Fixes #457.
Nick Wellnhofer 93a01c46 2022-12-08T03:58:41 libxml.h: Add comments and indentation
Nick Wellnhofer 92b8ffad 2022-12-08T03:56:40 libxml.h: Remove dubious definition of LIBXML_STATIC This macro is supposed to be set by the build system.
Nick Wellnhofer 60d457be 2022-12-08T03:45:37 libxml.h: Don't include stdio.h
Nick Wellnhofer 924ed827 2022-12-08T03:41:36 libxml.h: Remove ancient LynxOS setup
Nick Wellnhofer a6debffd 2022-12-08T03:37:24 xmlexports.h: Disable docs for internal macro XMLPUBLIC
Nick Wellnhofer 3b6cc47a 2022-12-08T02:51:52 xmlexports.h: Remove LIBXML_FASTCALL optimization This was an experimental and undocumented micro-optimization for Windows which apparently required different calling conventions for variable-argument functions, making it impossible to maintain without domain knowledge.
Nick Wellnhofer ce9baf94 2022-12-08T02:48:27 Remove XMLCALL and XMLCDECL macros from public headers
Nick Wellnhofer dd3569ea 2022-12-08T02:43:17 Remove XMLDECL macro from .c files
Nick Wellnhofer 06b7a7e0 2022-12-08T00:54:13 Update README.md Mention official releases and Git repo prominently. Remove links to old mailing list.
Nick Wellnhofer b92768cd 2022-12-08T00:24:53 tests: Enable "runsuite" test This enables some tests with testcases in - test/xsdtest - test/relaxng/OASIS/spectest.xml - test/relaxng/testsuite.xml The XML Schema Test Suite will also be run if it was downloaded, see xstc/Makefile.am. Gitlab CI should be updated to fetch these files. There are 10 expected errors in the XSD test suite. This seems to be the case since at least version 2.9.0 from 2012.
Ross Burton 4762c856 2022-12-06T21:40:01 Use python3 not python As per https://peps.python.org/pep-0394/, the python binary can be one of the following options: - Python 2 - Python 3 - Not exist All of the scripts in libxml2 use 'python', which may not exist. As Python 2 reached EOL on the 1st January 2020, it's safe to move the scripts to use python3 explicitly.
Ross Burton ff49041c 2022-12-07T12:04:41 xstc/fixup-tests.py: port to Python 3
Ross Burton 7640362e 2022-12-07T12:01:39 xstc/fixup-tests.py: unify whitespace The source contains a mix of tabs and spaces, so unify on spaces.
Ross Burton d598d8af 2022-12-05T16:18:14 libxml.m4: deprecate AM_PATH_XML2, wrap PKG_CHECK_MODULES instead pkg-config has been around for a very long time now, so deprecate the hand-written libxml.m4 fragment providing AM_PATH_XML2 and simply change it to a wrapper around PKG_CHECK_MODULES.
Ross Burton 0ac8c15e 2022-12-06T15:48:55 python/tests/reader2: use absolute paths everywhere The expected errors contain a relative path, but the messages from the parser contain absolute paths. However, due to the tests not actually failing if there was an error this wasn't noticed. Instead of putting relative paths in the expected messages use format() to embed the correct absolute path. Also use os.path.join() consistently when constructing paths to ensure uniformly formatted paths.
Ross Burton b9ba5e1d 2022-12-06T15:34:04 python/tests/reader2: always exit(1) if a test fails Batch up the errors in the first parse tests and ensure that the last tests exit with an error if they fail. Also remove an unused import.
Ross Burton 21f2ce71 2022-12-06T14:33:32 testModule: exit if the module can't be opened Instead of silently exiting with success when the module cannot be found, emit a message and fail the test.
Ross Burton b1b0df6e 2022-12-06T17:00:03 CI: disable modules in gcc:static build When shared libraries are disabled we can't build loadable modules either, so the testModule test can't work as the testdso.la target doesn't build a module.
Ross Burton 3aaaf5ca 2022-12-06T17:05:14 CI: fix CI on MinGW builds The XML test case tarball isn't actually compressed: the published URL is a .tar and fetches of the .tar.gz redirect silently to the .tar, which is then passed to gzip which refuses to decompress uncompressed data. Fetch the .tar as that is the documented URL, and remove the decompression.
Nick Wellnhofer 76c6da42 2022-12-04T23:01:00 error: Make sure that error messages are valid UTF-8 This has caused issues with the Python bindings for a long time. Should fix #64.
Alex Richardson 4b959ee1 2022-12-01T13:23:09 Remove hacky heuristic from b2dc5675e94aa6b5557ba63f7d66b0f08dd17e4d Checking whether the context is close to the parent context by hardcoding 250 is not portable (I noticed tests were failing on Morello since the value is 288 there due to pointers being 128 bits). Instead we should ensure that the XML_VCTXT_USE_PCTXT flag is not set in cases where the user data is not actually a parser context (or ideally add a separate field but that would be an ABI break). From what I can see in the source, the XML_VCTXT_USE_PCTXT is only set if the userData field points to a valid context, and if this is not the case the flag should be cleared when changing userData rather than relying on the offset between the two. Looking at the history, I think d7cb33cf44aa688f24215c9cd398c1a26f0d25ff fixed most of the need for this workaround, but it looks like there are a few more locations that need updating. This commit changes two more places to set/clear/copy the XML_VCTXT_USE_PCTXT flag, so this heuristic should not be needed anymore. I've also dropped two = NULL assignments in xmllint since these are not needed after a call to memset(). There was also an uninitialized vctxt.flags (and other fields) in `xmlShellValidate()`, which I've fixed by adding a memset() call.
Alex Richardson c715ded0 2022-12-01T12:53:15 Avoid creating an out-of-bounds pointer by rewriting a check Creating more than one-past-the-end pointers is undefined behaviour in C and while this code is unlikely to be miscompiled, I discovered that an out-of-bounds pointer is being created using UBSan on a CHERI-enabled system.
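The kind of rewrite described in the entry above can be illustrated with a bounds check that compares lengths instead of forming a pointer. This is a generic sketch, not the code changed by the commit; `hasRoom` is a hypothetical helper, and it assumes `cur` and `end` point into the same buffer with `cur <= end`.

```c
#include <stddef.h>

/*
 * Return 1 if at least n bytes are available in [cur, end), 0 otherwise.
 * The tempting form `cur + n <= end` computes cur + n first, which is
 * undefined behaviour as soon as it goes beyond one past the end of the
 * buffer. Comparing n against the remaining length never creates such a
 * pointer.
 */
static int
hasRoom(const char *cur, const char *end, size_t n) {
    return n <= (size_t) (end - cur);
}
```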
Alex Richardson c62c0d82 2022-12-01T12:58:11 Correctly relocate internal pointers after realloc() Adding an offset to a deallocated pointer and assuming that it can be dereferenced is undefined behaviour. When running libxml2 on CHERI-enabled systems such as Arm Morello this results in the creation of an out-of-bounds pointer that cannot be dereferenced and therefore crashes at runtime. The effect of this UB is not limited to architectures such as CHERI: incorrect relocation of pointers after realloc can in fact cause FORTIFY_SOURCE errors with recent GCC: https://developers.redhat.com/articles/2022/09/17/gccs-new-fortification-level
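One way to relocate an internal pointer without touching the old allocation is to save its offset before calling realloc and rebuild it from the new base afterwards. The sketch below assumes a hypothetical `Buf` type with a single interior pointer; it is not the structure patched by the commit.

```c
#include <stdlib.h>

/* Hypothetical buffer with one interior pointer into its own storage. */
typedef struct {
    char *base;   /* start of the allocation */
    char *cur;    /* current position, base <= cur <= base + size */
    size_t size;  /* allocated size in bytes */
} Buf;

/*
 * Resize the buffer to newSize bytes.
 * Returns 0 on success, -1 on failure (buffer left unchanged).
 */
static int
bufGrow(Buf *b, size_t newSize) {
    /* Compute the offset while the old pointers are still valid. */
    size_t curOff = (size_t) (b->cur - b->base);
    char *tmp;

    if (newSize == 0 || newSize < curOff)
        return -1;
    tmp = realloc(b->base, newSize);
    if (tmp == NULL)
        return -1;
    /* Derive the interior pointer from the new base, never the old one. */
    b->base = tmp;
    b->cur = tmp + curOff;
    b->size = newSize;
    return 0;
}
```

The variant this avoids is `b->cur = tmp + (b->cur - oldBase)` style arithmetic performed after realloc, where the old pointers may already refer to freed memory.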
Nick Wellnhofer c7a9b85c 2022-11-30T17:11:33 html: Improve parsing of nested lists Allow ul/ol as immediate children of ul/ol. This is more in line with the HTML5 spec. Fixes #447.
Nick Wellnhofer ccb6d544 2022-11-27T02:09:27 Hide internal functions These functions were never declared in public headers, so it should be safe to hide them. Fixes #139.
Nick Wellnhofer 82bd2c37 2022-11-25T18:09:15 python: Fix memory leak checks xmlInitParser doesn't allocate memory anymore, so the checks can be simplified.
Nick Wellnhofer 1966382b 2022-11-25T17:39:01 memory: Don't use locks in xmlMemUsed The Python tests call xmlMemUsed after xmlCleanupParser which doesn't work with statically allocated mutexes. This is only used for debugging, so a lock isn't necessary.
Nick Wellnhofer e414f825 2022-11-25T15:01:22 html: Fix htmlInitAutoClose documentation
Nick Wellnhofer c16fd705 2022-11-25T14:52:37 xpath: Make init function private
Nick Wellnhofer 53ab3840 2022-11-25T14:26:59 encoding: Make init function private
Nick Wellnhofer 3e9d5e4f 2022-11-25T14:19:36 encoding: Remove unused variable xmlDefaultCharEncodingHandler