Log

Author Commit Date CI Message
Nick Wellnhofer c885bebb 2022-12-23T23:06:32 fuzz: Remove size limit, disable XInclude Now that entity expansion issues should be fixed, we should get more interesting timeout errors from OSS-Fuzz. Disable XInclude for now, since it often times out in XPath computations. The XInclude tests should be moved to a separate fuzz target.
Nick Wellnhofer 1865668b 2022-12-23T22:44:40 parser: Fix accounting of consumed input bytes Only add consumed bytes if - we're not parsing an entity - we're parsing external parameter entities for the first time. Always ignore internal parameter entities.
Nick Wellnhofer dd62e541 2022-12-23T21:53:30 parser: Don't increase depth twice when parsing internal entities Fix xmlParseBalancedChunkMemoryInternal.
Nick Wellnhofer a41b09c7 2022-12-23T21:29:28 parser: Improve detection of entity loops Set a flag to detect entity loops at once instead of processing until the depth limit is exceeded.
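The flag-based loop detection described in a41b09c7 can be illustrated with a minimal sketch. The struct, the flag name and the function below are hypothetical and are not taken from libxml2. The idea shown is only the general one: mark an entity while it is being expanded, and treat a second visit to a marked entity as a loop instead of recursing until a depth limit is hit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag: set while an entity is being expanded. */
#define SKETCH_ENT_EXPANDING (1 << 0)

/* Hypothetical entity that references other entities. */
typedef struct SketchEntity {
    struct SketchEntity **refs; /* entities referenced by this one */
    size_t nRefs;               /* number of entries in refs */
    int flags;                  /* bit field, see SKETCH_ENT_EXPANDING */
} SketchEntity;

/* Returns 0 if expansion terminates, -1 if an entity loop is found. */
static int
sketchExpandEntity(SketchEntity *ent) {
    size_t i;
    int ret = 0;

    assert(ent != NULL);

    /* Seeing the flag again means we re-entered an entity that is
     * still being expanded, so this is a loop. */
    if (ent->flags & SKETCH_ENT_EXPANDING)
        return -1;

    ent->flags |= SKETCH_ENT_EXPANDING;
    for (i = 0; i < ent->nRefs; i++) {
        if (sketchExpandEntity(ent->refs[i]) < 0) {
            ret = -1;
            break;
        }
    }
    /* Clear the flag so later, unrelated references remain legal. */
    ent->flags &= ~SKETCH_ENT_EXPANDING;

    return ret;
}
```

With two entities that reference each other, the second visit to the first entity is rejected immediately, at a recursion depth of two.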
Nick Wellnhofer bc18f4a6 2022-12-23T21:55:38 parser: Lower entity nesting limit with XML_PARSE_HUGE The old limit of 1024 could lead to excessively deep call stacks. This could probably be set much lower without causing issues.
Nick Wellnhofer d972393f 2022-12-23T21:01:20 parser: Only report a single entity error Don't report errors multiple times for nested entity references.
Nick Wellnhofer 28b3777e 2022-12-22T15:35:28 runsuite: Some errors are expected
Nick Wellnhofer 077df27e 2022-12-22T15:22:01 parser: Fix integer overflow of input ID Applies a patch from Chromium. Also stop incrementing input ID of subcontexts. This isn't necessary. Fixes #465.
David Kilzer 0bd4e4e0 2022-12-21T19:21:30 xmlParseStartTag2() contains typo when checking for default definitions for an attribute in a namespace * parser.c: (xmlParseStartTag2): - Fix index into defaults->values. It is only correct the first time through the loop when i == 0. Fixes #467.
Nick Wellnhofer 78c4430f 2022-12-22T00:03:10 doc: Remove ancient files
Nick Wellnhofer 4c763dd0 2022-12-21T22:20:43 gitlab-ci: Revert accidental change to setup_mingw.sh Commit 3aaaf5ca shouldn't have changed this line. We need these libraries for a full libxml2 build.
Nick Wellnhofer c74e5903 2022-12-21T22:24:50 Remove ancient TODOs
Nick Wellnhofer 101a542e 2022-12-21T21:47:10 Remove RPM build, Makefile.tests, README.tests
Nick Wellnhofer b47ebf04 2022-12-21T00:02:47 parser: Deprecate xmlString*DecodeEntities These are internal functions.
Nick Wellnhofer 7ee7f036 2022-12-20T02:06:38 parser: Remove useless ent->children tests in xmlParseReference The if-block before always returns if ent->children == NULL.
Nick Wellnhofer cfc036bd 2022-12-21T19:27:45 testrecurse: Test parameter entity accounting
Nick Wellnhofer ec6633af 2022-12-20T03:09:11 parser: Remove useless ent->etype test in xmlParseReference If ent->etype is invalid, ret can't equal XML_ERR_OK.
Nick Wellnhofer 106c4cdd 2022-12-21T17:05:54 testrecurse: Support multiple huge docs
Nick Wellnhofer 079da5b2 2022-12-21T03:26:31 testrecurse: Add external entities to huge test
Nick Wellnhofer 01bcb23d 2022-12-21T01:01:36 testrecurse: Add test cases for external entities Add test cases for external general and parameter entities.
Nick Wellnhofer 046f99c5 2022-12-21T05:15:51 testrecurse: Add lol_param.xml Add test case contributed by Sebastian Pipping for CVE-2021-3541.
Nick Wellnhofer fafa0252 2022-12-21T01:01:07 testrecurse: Rename test files
Nick Wellnhofer 69aeff53 2022-12-20T22:33:28 testrecurse: Also test without entity substitution
Nick Wellnhofer 4c7cb8f4 2022-12-20T22:42:24 testrecurse: Also test SAX parser
Nick Wellnhofer 583cd2f6 2022-12-21T05:13:23 testrecurse: Start to test entity expansion stats
Nick Wellnhofer ce76ebfd 2022-12-19T20:56:23 entities: Stop counting entities This was only used in the old version of xmlParserEntityCheck.
Nick Wellnhofer a3c8b180 2022-12-19T20:51:52 entities: Add entity flag for loop check
Nick Wellnhofer 463bbeec 2022-12-19T18:39:45 entities: Rework entity amplification checks This commit implements robust detection of entity amplification attacks, better known as the "billion laughs" attack. We now limit the size of the document after substitution of entities to 10 times the size before expansion. This guarantees linear behavior by definition. There already was a similar check before, but the accounting of "sizeentities" (size of external entities) and "sizeentcopy" (size of all copies created by entity references) wasn't accurate. We also need saturation arithmetic since we're historically limited to "unsigned long" which is 32-bit on many platforms. A maximum of 10 MB of substitutions is always allowed. This should make use cases like DITA work which have caused problems in the past. The old checks based on the number of entities were removed. This is accounted for by adding a fixed cost to each entity reference. Entity amplification checks are now enabled even if XML_PARSE_HUGE is set. This option is mainly used to allow larger text nodes. Most users were unaware that it also disabled entity expansion checks. Some of the limits might be adjusted later. If this change turns out to affect legitimate use cases, we can add a separate parser option to disable the checks. Fixes #294. Fixes #345.
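The accounting described in 463bbeec can be sketched roughly as follows. This is not the libxml2 implementation. The function names and the per-reference cost of 20 are assumptions made for illustration. Only the factor of 10, the 10 MB allowance and the use of saturation arithmetic on "unsigned long" come from the commit message.

```c
#include <assert.h>
#include <limits.h>

/* Values from the commit message. */
#define SKETCH_MAX_AMPLIFICATION 10UL       /* expanded size vs. input size */
#define SKETCH_ALLOWED_EXPANSION 10000000UL /* 10 MB is always allowed */
/* Assumed value, for illustration only. */
#define SKETCH_ENT_FIXED_COST    20UL       /* cost of one entity reference */

/* Saturating add: clamp at ULONG_MAX instead of wrapping around. */
static unsigned long
sketchSaturatedAdd(unsigned long a, unsigned long b) {
    if (b > ULONG_MAX - a)
        return ULONG_MAX;
    return a + b;
}

/* Charge one entity reference: its size plus a fixed cost. */
static unsigned long
sketchAccountReference(unsigned long expanded, unsigned long entitySize) {
    assert(SKETCH_ENT_FIXED_COST > 0);
    expanded = sketchSaturatedAdd(expanded, entitySize);
    return sketchSaturatedAdd(expanded, SKETCH_ENT_FIXED_COST);
}

/* Returns 1 if the expansion looks like an amplification attack. */
static int
sketchAmplificationExceeded(unsigned long consumed, unsigned long expanded) {
    if (expanded <= SKETCH_ALLOWED_EXPANSION)
        return 0;
    /* Divide instead of computing consumed * 10, which could overflow.
     * Integer division makes this check slightly lenient. */
    return expanded / SKETCH_MAX_AMPLIFICATION > consumed;
}
```

Because every reference costs at least the fixed amount, a document made of many references to tiny entities still accumulates cost, which is how a fixed cost can stand in for the removed entity-count checks.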
Nick Wellnhofer 7e3f469b 2022-12-19T15:59:49 entities: Use flags to store '<' check results Instead of abusing the LSB of the "checked" member, store the result of testing for occurrence of '<' character in "flags". Also use the flags in xmlParseStringEntityRef instead of rescanning every time.
Nick Wellnhofer 481d79d4 2022-12-19T15:26:46 entities: Add XML_ENT_PARSED flag To check whether an entity was already parsed, the code previously tested whether "checked" was non-zero or "children" was non-null. The "children" check could be unreliable because an empty entity also results in an empty (NULL) node list. Use a separate flag to make this check more reliable.
Nick Wellnhofer f34f184f 2022-12-19T15:24:53 entities: Add "flags" member to struct xmlEntity This will hold various flags and eventually replace the "checked" member.
Nick Wellnhofer f67dc618 2022-12-17T00:14:56 xmlreader: Try to fix regression when reading from memory This reverts a change from commit 2059df53, see #462.
Nick Wellnhofer ae0c9cfa 2022-12-12T23:54:39 uri: Fix handling of port numbers Allow port number without host, real fix for #71. Also compare port numbers in xmlBuildRelativeURI. Fix handling of port numbers in xmlUriEscape.
Nick Wellnhofer 8ed40c62 2022-12-13T00:51:33 Revert "uri: Allow port without host" This reverts commit f30adb54f55e4e765d58195163f2a21f7ac759fb. Fixes #460.
Nick Wellnhofer a77e3273 2022-12-08T19:45:40 xmlmemory.c: Remove xmlMemContentShow This debug function was always unsafe and hard-coded pointer sizes to 32 bits. Instead of attempting a fix, remove it completely. These days, tools like ASan are much better to debug memory issues. Fixes #214.
Nick Wellnhofer 25ea7b6a 2022-12-08T19:44:09 testapi.c: Initialize catalog early Avoid leak reports when testing --with-mem-debug.
Nick Wellnhofer eaebf37f 2022-12-08T18:38:45 gentest.py: Fix memory leak in API tests Regressed in commit ff34ba3e.
Nick Wellnhofer 785cfcff 2022-12-08T19:05:12 doc/libxml2-api.xml: Regenerate
Nick Wellnhofer 0f54af74 2022-12-08T18:36:45 encoding.c: Fix for documentation generator Top-level macro invocations throw off the documentation parser.
Lukáš Tyrychtr 85c6cacd 2022-12-08T13:32:49 catalog.c: Silence a cast warning on VS 2022 Fixes #457.
Nick Wellnhofer 93a01c46 2022-12-08T03:58:41 libxml.h: Add comments and indentation
Nick Wellnhofer 92b8ffad 2022-12-08T03:56:40 libxml.h: Remove dubious definition of LIBXML_STATIC This macro is supposed to be set by the build system.
Nick Wellnhofer 60d457be 2022-12-08T03:45:37 libxml.h: Don't include stdio.h
Nick Wellnhofer 924ed827 2022-12-08T03:41:36 libxml.h: Remove ancient LynxOS setup
Nick Wellnhofer a6debffd 2022-12-08T03:37:24 xmlexports.h: Disable docs for internal macro XMLPUBLIC
Nick Wellnhofer 3b6cc47a 2022-12-08T02:51:52 xmlexports.h: Remove LIBXML_FASTCALL optimization This was an experimental and undocumented micro-optimization for Windows which apparently required different calling conventions for variable-argument functions, making it impossible to maintain without domain knowledge.
Nick Wellnhofer ce9baf94 2022-12-08T02:48:27 Remove XMLCALL and XMLCDECL macros from public headers
Nick Wellnhofer dd3569ea 2022-12-08T02:43:17 Remove XMLDECL macro from .c files
Nick Wellnhofer 06b7a7e0 2022-12-08T00:54:13 Update README.md Mention official releases and Git repo prominently. Remove links to old mailing list.
Nick Wellnhofer b92768cd 2022-12-08T00:24:53 tests: Enable "runsuite" test This enables some tests with testcases in - test/xsdtest - test/relaxng/OASIS/spectest.xml - test/relaxng/testsuite.xml The XML Schema Test Suite will also be run if it was downloaded, see xstc/Makefile.am. Gitlab CI should be updated to fetch these files. There are 10 expected errors in the XSD test suite. This seems to be the case since at least version 2.9.0 from 2012.
Ross Burton 4762c856 2022-12-06T21:40:01 Use python3 not python As per https://peps.python.org/pep-0394/, the python binary can be one of the following options: - Python 2 - Python 3 - Not exist All of the scripts in libxml2 use 'python', which may not exist. As Python 2 reached EOL on the 1st January 2020, it's safe to move the scripts to use python3 explicitly.
Ross Burton ff49041c 2022-12-07T12:04:41 xstc/fixup-tests.py: port to Python 3
Ross Burton 7640362e 2022-12-07T12:01:39 xstc/fixup-tests.py: unify whitespace The source contains a mix of tabs and spaces, so unify on spaces.
Ross Burton d598d8af 2022-12-05T16:18:14 libxml.m4: deprecate AM_PATH_XML2, wrap PKG_CHECK_MODULES instead pkg-config has been around for a very long time now, so deprecate the hand-written libxml.m4 fragment providing AM_PATH_XML2 and simply change it to a wrapper around PKG_CHECK_MODULES.
Ross Burton 0ac8c15e 2022-12-06T15:48:55 python/tests/reader2: use absolute paths everywhere The expected errors contain a relative path, but the messages from the parser contain absolute paths. However, because the tests did not actually fail when there was an error, this wasn't noticed. Instead of putting relative paths in the expected messages use format() to embed the correct absolute path. Also use os.path.join() consistently when constructing paths to ensure uniformly formatted paths.
Ross Burton b9ba5e1d 2022-12-06T15:34:04 python/tests/reader2: always exit(1) if a test fails Batch up the errors in the first parse tests and ensure that the last tests exit with an error if they fail. Also remove an unused import.
Ross Burton 21f2ce71 2022-12-06T14:33:32 testModule: exit if the module can't be opened Instead of silently exiting with success when the module cannot be found, emit a message and fail the test.
Ross Burton b1b0df6e 2022-12-06T17:00:03 CI: disable modules in gcc:static build When shared libraries are disabled we can't build loadable modules either, so the testModule test can't work as the testdso.la target doesn't build a module.
Ross Burton 3aaaf5ca 2022-12-06T17:05:14 CI: fix CI on MinGW builds The XML test case tarball isn't actually compressed: the published URL is a .tar and fetches of the .tar.gz redirect silently to the .tar, which is then passed to gzip which refuses to decompress uncompressed data. Fetch the .tar as that is the documented URL, and remove the decompression.
Nick Wellnhofer 76c6da42 2022-12-04T23:01:00 error: Make sure that error messages are valid UTF-8 This has caused issues with the Python bindings for a long time. Should fix #64.
Alex Richardson 4b959ee1 2022-12-01T13:23:09 Remove hacky heuristic from b2dc5675e94aa6b5557ba63f7d66b0f08dd17e4d Checking whether the context is close to the parent context by hardcoding 250 is not portable (I noticed tests were failing on Morello since the value is 288 there due to pointers being 128 bits). Instead we should ensure that the XML_VCTXT_USE_PCTXT flag is not set in cases where the user data is not actually a parser context (or ideally add a separate field but that would be an ABI break). From what I can see in the source, the XML_VCTXT_USE_PCTXT is only set if the userData field points to a valid context, and if this is not the case the flag should be cleared when changing userData rather than relying on the offset between the two. Looking at the history, I think d7cb33cf44aa688f24215c9cd398c1a26f0d25ff fixed most of the need for this workaround, but it looks like there are a few more locations that need updating. This commit changes two more places to set/clear/copy the XML_VCTXT_USE_PCTXT flag, so this heuristic should not be needed anymore. I've also dropped two = NULL assignments in xmllint since they are not needed after a call to memset(). There was also an uninitialized vctxt.flags (and other fields) in `xmlShellValidate()`, which I've fixed by adding a memset() call.
Alex Richardson c715ded0 2022-12-01T12:53:15 Avoid creating an out-of-bounds pointer by rewriting a check Creating more than one-past-the-end pointers is undefined behaviour in C and while this code is unlikely to be miscompiled, I discovered that an out-of-bounds pointer is being created using UBSan on a CHERI-enabled system.
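The message for c715ded0 does not show the check that was rewritten, so the following is only a generic sketch of the pattern, with a hypothetical function name. Rather than computing `cur + needed` and comparing it with `end`, which can form a pointer more than one past the end of the buffer, the remaining size is computed from two valid pointers and compared as an integer.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if `needed` more bytes fit between `cur` and `end`.
 * Both pointers must point into (or one past the end of) the same
 * buffer, with cur <= end. */
static int
sketchHasRoom(const char *cur, const char *end, size_t needed) {
    assert(cur != NULL && end != NULL && cur <= end);

    /* `end - cur` only involves valid pointers, so no out-of-bounds
     * pointer is ever created, unlike `cur + needed <= end`. */
    return needed <= (size_t) (end - cur);
}
```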
Alex Richardson c62c0d82 2022-12-01T12:58:11 Correctly relocate internal pointers after realloc() Adding an offset to a deallocated pointer and assuming that it can be dereferenced is undefined behaviour. When running libxml2 on CHERI-enabled systems such as Arm Morello this results in the creation of an out-of-bounds pointer that cannot be dereferenced and therefore crashes at runtime. The effect of this UB is not just limited to architectures such as CHERI, incorrect relocation of pointers after realloc can in fact cause FORTIFY_SOURCE errors with recent GCC: https://developers.redhat.com/articles/2022/09/17/gccs-new-fortification-level
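A minimal sketch of the relocation pattern that c62c0d82 refers to, using a hypothetical buffer type rather than any libxml2 structure: the offset of the internal pointer is saved while the old allocation is still valid and is reapplied to the pointer returned by realloc(), so no arithmetic is ever done on the deallocated pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical buffer with a cursor pointing into `base`. */
typedef struct {
    char *base;   /* start of the allocation */
    char *cur;    /* internal pointer into the allocation */
    size_t size;  /* allocated size in bytes */
} SketchBuf;

/* Grows the buffer. Returns 0 on success, -1 on allocation failure,
 * in which case the buffer is left unchanged. */
static int
sketchBufGrow(SketchBuf *buf, size_t newSize) {
    size_t curOffset;
    char *tmp;

    assert(buf != NULL && buf->base != NULL);
    assert(newSize >= buf->size);

    /* Save the offset before realloc, while buf->base is still valid. */
    curOffset = (size_t) (buf->cur - buf->base);

    tmp = realloc(buf->base, newSize);
    if (tmp == NULL)
        return -1;

    /* Rebuild the internal pointer from the new base, never from the
     * old one, which may have been freed. */
    buf->base = tmp;
    buf->cur = tmp + curOffset;
    buf->size = newSize;
    return 0;
}
```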
Nick Wellnhofer c7a9b85c 2022-11-30T17:11:33 html: Improve parsing of nested lists Allow ul/ol as immediate children of ul/ol. This is more in line with the HTML5 spec. Fixes #447.
Nick Wellnhofer ccb6d544 2022-11-27T02:09:27 Hide internal functions These functions were never declared in public headers, so it should be safe to hide them. Fixes #139.
Nick Wellnhofer 82bd2c37 2022-11-25T18:09:15 python: Fix memory leak checks xmlInitParser doesn't allocate memory anymore, so the checks can be simplified.
Nick Wellnhofer 1966382b 2022-11-25T17:39:01 memory: Don't use locks in xmlMemUsed The Python tests call xmlMemUsed after xmlCleanupParser which doesn't work with statically allocated mutexes. This is only used for debugging, so a lock isn't necessary.
Nick Wellnhofer e414f825 2022-11-25T15:01:22 html: Fix htmlInitAutoClose documentation
Nick Wellnhofer c16fd705 2022-11-25T14:52:37 xpath: Make init function private
Nick Wellnhofer 53ab3840 2022-11-25T14:26:59 encoding: Make init function private
Nick Wellnhofer 3e9d5e4f 2022-11-25T14:19:36 encoding: Remove unused variable xmlDefaultCharEncodingHandler
Nick Wellnhofer 05c3a458 2022-11-25T14:15:43 tests: Check that xmlInitParser doesn't allocate memory
Nick Wellnhofer 78c0391b 2022-11-25T13:55:39 parser: Register atexit handler in locked section
Nick Wellnhofer 71931233 2022-10-24T21:50:34 threads: Use __libc_single_threaded if available Fixes #427
Nick Wellnhofer c73d464a 2022-11-24T15:00:03 threads: Deprecate some internal functions
Nick Wellnhofer 65d381f3 2022-11-24T20:54:18 threads: Allocate mutexes statically
Nick Wellnhofer 9ef80ff1 2022-11-25T12:33:25 memory: Remove xmlDictInitialized Call xmlInitParser when creating dicts instead.
Nick Wellnhofer ed053c50 2022-11-25T12:27:14 dict: Make init/cleanup functions private
Nick Wellnhofer 2e9aeecb 2022-11-25T12:21:49 memory: Remove xmlMemInitialized Call xmlInitParser instead of xmlInitMemoryInternal.
Nick Wellnhofer 7010d877 2022-11-25T12:06:27 threads: Rework initialization Make init/cleanup functions private. Merge xmlOnceInit into xmlInitThreadsInternal.
Nick Wellnhofer 9dbf1374 2022-11-24T20:52:57 parser: Make some module init/cleanup functions private
Nick Wellnhofer cecd364d 2022-11-24T16:38:47 parser: Don't call *DefaultSAXHandlerInit from xmlInitParser Change the default handler definitions to match the result after calling the initialization functions. This makes sure that no thread-local variables are accessed when calling xmlInitParser.
Nick Wellnhofer 1406b20f 2022-11-24T19:14:33 encoding: Allocate default handlers statically
Sam James 278e7874 2022-11-23T03:07:56 libxml.m4: fix -Wstrict-prototypes Signed-off-by: Sam James <sam@gentoo.org>
Chun-wei Fan 707ade22 2022-11-22T14:56:58 Visual Studio builds: Allow silencing deprecation warnings Define XML_IGNORE_DEPRECATION_WARNINGS and the corresponding XML_POP_WARNINGS for Visual Studio, and consequently define XML_IGNORE_FPTR_CAST_WARNINGS so that we do not get a compiler warning on Visual Studio by doing a __pragma(warning(pop)) without a corresponding __pragma(warning(push)). Also correct the documentation a bit for XML_POP_WARNINGS.
Chun-wei Fan b9590d5d 2022-11-18T11:23:23 Visual Studio: Define XML_DEPRECATED We can mark APIs as deprecated using __declspec(deprecated) with Visual Studio 2005 and later, so add a definition of that so that we can help users avoid using deprecated APIs when using Visual Studio as well. For the existing GCC definition, check whether we are on GCC 3.1+ before enabling the definition.
Nick Wellnhofer b1f9c193 2022-11-22T21:39:01 parser: Fix push parser with unterminated CDATA sections Short-lived regression found by OSS-Fuzz.
Nick Wellnhofer 97c0a9cf 2022-11-22T17:01:39 tests: Fix use-after-free in Python tests The nodeset must be freed before the document. Fixes #443.
Nick Wellnhofer 55034505 2022-11-22T17:01:21 Fix .editorconfig
Nick Wellnhofer 34a5a4a5 2022-11-22T15:40:51 tests: Remove unneeded #includes
Nick Wellnhofer 701beb4e 2022-11-22T15:37:49 xmllint: Include <io.h> on Windows
Nick Wellnhofer b9689d13 2022-11-22T15:37:12 gitlab-ci: Make Test-Msvc exit if ctest fails
Nick Wellnhofer 138c897d 2022-11-22T14:57:58 gitlab-ci: Treat compiler warnings as errors on MSVC
Nick Wellnhofer d725addd 2022-11-22T14:50:14 warnings: Work around MSVC bug MSVC apparently complains when passing a `const char **` to memset. Unlike `const char *const *`, this isn't a pointer to const memory.
Chun-wei Fan cfbe68e4 2022-11-22T15:20:53 sources: Silence C4013 warnings on Visual Studio The read(), close(), open(), lseek() functions are found in io.h on Visual Studio, which does not ship unistd.h, so include io.h on Windows if unistd.h is not found. C4013 (aka implicit declaration of ...) warnings can often ring alarm bells.
Nick Wellnhofer 0e193f0d 2022-11-21T22:09:19 parser: Remove dangerous check in xmlParseCharData If this check succeeds, xmlParseCharData could be called over and over again without making progress, resulting in an infinite loop. It's only important to check for XML_PARSER_EOF which is done later. Related to #441.
Nick Wellnhofer 94ca36c2 2022-11-21T22:07:11 parser: Restore parser state in xmlParseCDSect Fixes #441.
Nick Wellnhofer a8b31e68 2022-11-21T21:35:01 parser: Fix progress check when parsing character data Skip over zero bytes to guarantee progress. Short-lived regression.
Nick Wellnhofer 23491536 2022-11-21T20:11:53 Fix .editorconfig
Nick Wellnhofer c63900fb 2022-11-21T20:11:35 parser: Check terminate flag when push parsing CDATA sections Found by OSS-Fuzz.