|
eed1a07d
|
2025-03-04T13:32:52
|
|
build: Remove version script
|
|
cdc5cfed
|
2025-03-04T13:26:51
|
|
legacy: Remove legacy symbols
|
|
3250a01d
|
2025-03-04T13:15:42
|
|
error: Convert initGenericErrorDefaultFunc to macro
|
|
c42b3227
|
2025-03-04T13:11:18
|
|
parser: Convert inputPush and inputPop to macros
|
|
361f7bff
|
2025-03-04T13:02:36
|
|
parser: Make nodePush, nodePop, namePush, namePop private
|
|
0b27097a
|
2025-03-04T12:55:25
|
|
encoding: Rename unprefixed public functions
|
|
66fdf94c
|
2025-03-03T10:12:18
|
|
cmake: Fix WITH_RELAXNG option
Dependent options must come after dependencies.
|
|
a0f156ff
|
2025-03-02T13:21:29
|
|
io: Fix `compressed` flag for uncompressed stdin
This could cause xmlstarlet to generate compressed output unexpectedly.
Regressed with a78843be. Should fix #869.
|
|
05bd1720
|
2025-03-01T10:25:29
|
|
parser: Fix parsing of DTD content
Regressed in 2.11. Fixes #868.
|
|
552864f1
|
2025-02-25T23:10:46
|
|
Remove os400 port
This is based on an ancient version and completely outdated.
|
|
e60f0712
|
2025-02-25T23:07:55
|
|
Update NEWS
|
|
e50d314a
|
2025-02-25T23:07:19
|
|
build: Add separate configuration option for RELAX NG
Support for RELAX NG used to be enabled together with XML Schema support
(--with-schemas). Now there's a separate option and a new feature macro
LIBXML_RELAXNG_ENABLED.
|
|
ce1b704e
|
2025-02-25T20:09:36
|
|
doc: Regenerate libxml2-api.xml
|
|
6ab430ca
|
2025-02-22T21:17:42
|
|
Remove unnecessary #includes
|
|
7ae8e8ac
|
2025-02-22T21:06:34
|
|
schemas: Make xmlSchemaDump depend on DEBUG_ENABLED
|
|
6fc26076
|
2025-02-22T20:31:45
|
|
regexp: Hide debugging code behind DEBUG_REGEXP
xmlRegexpPrint is now a deprecated no-op.
|
|
4649f28f
|
2025-02-22T19:29:07
|
|
xmlregexp: add support for compact form of automata in xmlRegexpPrint
|
|
c82270a9
|
2025-02-22T18:51:38
|
|
regexp: Avoid dangling start/stop pointers in atom
States could be eliminated later, so set start/stop pointers to NULL
after they're used in xmlFAGenerateTransitions.
|
|
5ed4eafd
|
2025-02-22T14:51:39
|
|
html: Don't invoke SAX callbacks if parser was stopped
|
|
6dfa68ac
|
2025-02-22T14:49:51
|
|
SAX2: Fix ctxt->nodemem check
In some error cases and maybe other situations, nodemem can have a
value of -1.
|
|
73514f2d
|
2025-02-20T18:50:58
|
|
gitlab-ci: Stop downloading and installing CMake for MSVC
CMake should already be installed.
|
|
064a0211
|
2025-02-20T13:52:40
|
|
meson: Fix Python module build
|
|
c2e2d762
|
2025-02-20T13:51:26
|
|
python: Pass destination dir to generator.py
Simplify usage across build systems.
|
|
82fb5cae
|
2025-02-20T13:49:39
|
|
meson: Use project_name instead of 'libxml2'
|
|
e649c972
|
2024-12-18T12:49:24
|
|
fuzz: Add utility scripts
Add scripts to minimize a corpus and generate HTML coverage reports.
|
|
63dfcca6
|
2024-12-16T01:34:29
|
|
fuzz: Reduce initial array size
|
|
6f903d43
|
2024-12-13T19:15:38
|
|
fuzz: Rework fixed parser options
Remove XML_PARSE_XINCLUDE. This is only honored by the XML Reader
interface which is now fuzzed in reader.c.
Don't validate in XInclude fuzzer. This doesn't increase coverage after
moving the Reader fuzzer.
|
|
44628d45
|
2024-12-13T15:23:30
|
|
fuzz: Harden leak check in lint fuzzer
Check for undetected memory leaks from previous iterations. This also
makes sure that the maxmem limit is checked deterministically.
|
|
c6c6d8af
|
2024-12-11T16:24:23
|
|
fuzz: Mutate fuzz data chunks separately
Implement a custom mutator that takes a list of fixed-size chunks which
are mutated with a given probability. This makes sure that values like
parser options or failure position are mutated regularly even as the
fuzz data grows large. Values can also be adjusted temporarily to make
the fuzzer focus on failure injection, for example.
Thanks to David Kilzer for the idea.
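
The chunk idea can be sketched in isolation. The code below is a simplified illustration and not the fuzzer's actual mutator: the names `demoRand` and `demoMutateChunks`, the xorshift generator, and the single bit flip used as the "mutation" are all assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Small xorshift32 PRNG; *state must be non-zero. */
static uint32_t
demoRand(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/*
 * Treat data as numChunks chunks of chunkSize bytes (chunkSize > 0).
 * Each chunk is mutated with probability prob256/256 by flipping one
 * random bit. Returns the number of chunks that were mutated.
 */
static size_t
demoMutateChunks(uint8_t *data, size_t numChunks, size_t chunkSize,
                 unsigned prob256, uint32_t *state) {
    size_t mutated = 0;

    for (size_t i = 0; i < numChunks; i++) {
        /* Skip this chunk unless the random draw falls below prob256. */
        if ((demoRand(state) & 0xFF) >= prob256)
            continue;
        size_t byte = demoRand(state) % chunkSize;
        unsigned bit = demoRand(state) % 8;
        data[i * chunkSize + byte] ^= (uint8_t) (1u << bit);
        mutated++;
    }
    return mutated;
}
```

Because every chunk gets its own draw, a small chunk holding parser options is mutated about as often as a large one, regardless of how much the total input grows.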
|
|
f5257d92
|
2024-12-11T16:24:43
|
|
fuzz: Fix failure injection in schema fuzzer
|
|
9f86dae9
|
2024-12-15T14:27:05
|
|
test: Add test case for UAF in xmlSchemaIDCFillNodeTables
|
|
fd359a7e
|
2024-12-10T15:54:12
|
|
fuzz: Start to fuzz XML Schema validator
|
|
fe7f835f
|
2025-02-20T10:24:50
|
|
Fix C4296 warning: Resolve comparison of unsigned int with 0
|
|
b8234e8c
|
2025-02-19T12:53:32
|
|
html: Fix check for partial named character references
Digits are allowed after the first character.
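
As an illustration of the rule, names such as `frac12` or `sup2` start with a letter and continue with letters or digits. A minimal sketch of such a check follows; `demoIsPartialRefName` and its helpers are hypothetical names and do not reproduce the parser's actual code.

```c
#include <stddef.h>

static int
demoIsAsciiAlpha(int c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

static int
demoIsAsciiDigit(int c) {
    return c >= '0' && c <= '9';
}

/*
 * Returns 1 if the len bytes at s could be the beginning of a character
 * reference name: one ASCII letter followed by letters or digits.
 */
static int
demoIsPartialRefName(const char *s, size_t len) {
    if (len == 0 || !demoIsAsciiAlpha((unsigned char) s[0]))
        return 0;
    for (size_t i = 1; i < len; i++) {
        unsigned char c = (unsigned char) s[i];
        if (!demoIsAsciiAlpha(c) && !demoIsAsciiDigit(c))
            return 0;
    }
    return 1;
}
```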
|
|
f68c70d2
|
2025-02-19T12:20:57
|
|
html: Remove htmlSaveErr
This function is useless now.
|
|
0315ac93
|
2025-02-19T12:18:50
|
|
html: Handle error from htmlFindOutputEncoder
|
|
22ada0a0
|
2025-02-18T23:27:40
|
|
tests: Look for xmlconf in source directory
Add -d option to runxmlconf for automake.
Fix extraction of xmlconf.tar.gz on Windows.
Make runxmlconf work with Meson CI.
|
|
aedc1f3d
|
2025-02-18T23:15:20
|
|
gitlab-ci: Run meson tests verbosely
|
|
9037dce9
|
2025-02-18T19:38:28
|
|
fuzz: Add dictionary for lint fuzzer
Mostly a combination of xml.dict and xpath.dict. This should help with
fuzzing pattern.c.
|
|
51622c05
|
2025-02-18T17:27:16
|
|
doc: Update release instructions
|
|
8c8753ad
|
2025-02-11T17:30:40
|
|
[CVE-2025-24928] Fix stack-buffer-overflow in xmlSnprintfElements
Fixes #847.
|
|
5880a9a6
|
2024-12-10T16:52:05
|
|
[CVE-2024-56171] Fix use-after-free after xmlSchemaItemListAdd
xmlSchemaItemListAdd can reallocate the items array. Update local
variables after adding item in
- xmlSchemaIDCFillNodeTables
- xmlSchemaBubbleIDCNodeTables
Fixes #828.
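
The general pattern behind this kind of bug can be shown without any libxml2 code. In the sketch below, `ItemList`, `itemListAdd` and `addAndSumFirst` are hypothetical names invented for illustration. The point is only that a pointer cached before an append must be reloaded afterwards, because `realloc` may move the array.

```c
#include <stdlib.h>

/* Hypothetical growable list; not the libxml2 xmlSchemaItemList type. */
typedef struct {
    int *items;
    size_t nbItems;
    size_t sizeItems;
} ItemList;

/* Append a value. May move list->items via realloc. Returns 0 on success. */
static int
itemListAdd(ItemList *list, int value) {
    if (list->nbItems == list->sizeItems) {
        size_t newSize = list->sizeItems ? list->sizeItems * 2 : 4;
        int *tmp = realloc(list->items, newSize * sizeof(tmp[0]));
        if (tmp == NULL)
            return -1;
        list->items = tmp;
        list->sizeItems = newSize;
    }
    list->items[list->nbItems++] = value;
    return 0;
}

/* Append value, then return first item + last item (-1 on failure). */
static int
addAndSumFirst(ItemList *list, int value) {
    int *items = list->items;   /* cached before the append */

    if (itemListAdd(list, value) < 0)
        return -1;
    items = list->items;        /* refresh: the old pointer may be stale */
    return items[0] + items[list->nbItems - 1];
}
```

Dropping the refresh line would leave `items` pointing at freed memory whenever the append triggers a reallocation.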
|
|
06b39650
|
2025-02-17T12:19:23
|
|
fuzz: Stop testing xmllint --memory option
The --memory option mmaps files directly, bypassing the resource loader.
We'd need a temp file to make it work when fuzzing.
|
|
25ae533b
|
2025-02-17T11:27:30
|
|
xmllint: Fix SIGBUS with --memory option
If the input file size is a multiple of page size, the byte after the
file's content is on a new page and accessing it will lead to SIGBUS.
Remove XML_INPUT_BUF_ZERO_TERMINATED hint for mmapped files.
Regressed with a221cd78.
Fixes #864.
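
The arithmetic behind the failure is simple. A file-backed mapping is rounded up to whole pages, and only the unused tail of the last page is zero-filled. The helper below is a hypothetical sketch (`byteAfterFileIsMapped` is not a libxml2 function) that tells whether the byte after the file's content still falls inside the last mapped page.

```c
#include <stddef.h>

/*
 * Returns 1 if the byte just past the end of a file of fileSize bytes
 * lies in the zero-filled tail of the last mapped page, 0 if it would
 * be on a new page. pageSize must be non-zero.
 */
static int
byteAfterFileIsMapped(size_t fileSize, size_t pageSize) {
    return fileSize % pageSize != 0;
}
```

With 4096-byte pages, a 4095-byte file has one spare zero byte after its content, while a 4096-byte or 8192-byte file has none, so a zero terminator cannot be assumed in general.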
|
|
7a61c32b
|
2025-02-13T23:09:28
|
|
html: Use enum instead of magic values for insertion modes
|
|
3793eaad
|
2025-02-16T13:54:56
|
|
fuzz: Fix build
|
|
69b91da3
|
2025-02-13T19:45:41
|
|
Revert "xpath: Make contextSize and proximityPosition default to 1"
This reverts commit afbc0a0405236de4ab8cbac94745e9885db0a198.
|
|
9c16a153
|
2025-02-13T18:41:33
|
|
Revert "include: Make most IS_* macros private"
This reverts commit 84a6c82ff83d04963d6e1c5cd18ded68ea02d99f.
|
|
6c716d49
|
2025-02-13T16:48:53
|
|
pattern: Fix compilation of explicit child axis
The child axis is the default axis and should generate XML_OP_ELEM like
the case without an axis.
|
|
8cf6129b
|
2025-02-13T18:20:46
|
|
html: Stop implying <p> start tags
Only <html>, <head> or <body> should be implied. Opening extra <p> tags
has always been a libxml2 quirk.
|
|
71122421
|
2025-02-13T14:04:10
|
|
html: Make implied <p> tags more deterministic
libxml2's HTML parser adds <p> start tags in some situations. This
behavior, which doesn't follow any standard, was added in 2000, see
here: http://veillard.com/XML/messages/0655.html
Text nodes that only contain whitespace don't imply a <p> tag, but the
whitespace check cannot work reliably if we're parsing partial text data,
which can happen with both the pull and push parsers.
The logic in `areBlanks` is hard to follow. The checks involving `CUR`
depend on the position of the input pointer and seem dubious. It's also
possible that the behavior changed inadvertently with a later commit.
As a result, it's hard to come up with good test cases.
We now process leading whitespace before creating implied tags. This is
more in line with HTML5 and should avoid at least some issues with
partial text data.
For example, parsing the string "<head> x" used to result in:
<html>
<head></head>
<body><p> x</p></body>
</html>
And now results in:
<html>
<head> </head>
<body><p>x</p></body>
</html>
Except for the implied <p> tag, this matches HTML5.
|
|
ebbc31cc
|
2025-02-13T12:09:58
|
|
malloc-fail: Check for malloc failure in xhtmlNodeDumpOutput
|
|
79ab721c
|
2025-02-11T11:39:08
|
|
tests: Fix error return in testHugeEncodedChunk
Fixes #859.
|
|
cfc854b8
|
2025-02-11T00:21:12
|
|
fuzz: Work around glibc iconv() bug
|
|
3a1526a5
|
2025-02-10T19:32:32
|
|
xpath: Don't raise OOM error on long names
Short-lived regression.
|
|
3dcde736
|
2025-02-05T15:18:48
|
|
Use __has_attribute to check for __counted_by__ support
The initial clang patch to support __counted_by__ was landed and
reverted several times. There are some clang toolchains (e.g. the
Android toolchain) that report themselves as version 18 but do not
support __counted_by__. While it is debatable if Android should be
shipping a pre-release clang, using __has_attribute should be a bit
simpler overall.
Note that this doesn't migrate everything else to use __has_attribute:
while clang has always supported __has_attribute, gcc didn't support
it until a bit later.
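
A feature check of this kind might look like the sketch below. The macro name `DEMO_COUNTED_BY`, the `DemoArray` type and the fallback definition of `__has_attribute` are assumptions for illustration and are not taken from the libxml2 headers.

```c
#include <stddef.h>
#include <stdlib.h>

/* Compilers without __has_attribute: treat every attribute as missing. */
#ifndef __has_attribute
  #define __has_attribute(x) 0
#endif

/* Ask the compiler directly instead of comparing version numbers. */
#if __has_attribute(__counted_by__)
  #define DEMO_COUNTED_BY(member) __attribute__((__counted_by__(member)))
#else
  #define DEMO_COUNTED_BY(member)
#endif

/* Hypothetical struct with an annotated flexible array member. */
typedef struct {
    size_t len;
    int values[] DEMO_COUNTED_BY(len);
} DemoArray;

/* Allocate a zeroed array with len elements; returns NULL on failure. */
static DemoArray *
demoArrayNew(size_t len) {
    DemoArray *arr = calloc(1, sizeof(*arr) + len * sizeof(arr->values[0]));

    if (arr != NULL)
        arr->len = len;
    return arr;
}
```

On toolchains that lack the attribute, the macro expands to nothing and the struct compiles unchanged.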
|
|
35d8a230
|
2025-02-06T10:14:56
|
|
tests: Fix expected errors in runxmlconf
The extra failure if regexps weren't enabled was actually a regression
fixed by the previous commit.
|
|
b466e70a
|
2025-02-05T14:11:04
|
|
Fix early return in vstateVPush in valid.c
While looking over the code in the fallback method for `vstateVPush` in
valid.c when `LIBXML_REGEXP_ENABLED` is not defined, I noticed that
there is an ungated `return(-1)` after attempting to allocate memory.
I believe this should be inside a check for malloc failure.
|
|
62d4697d
|
2025-02-02T16:43:25
|
|
gitlab-ci: Disable cmake:mingw for now
Executing /mingw64/bin/cmake.exe with any arguments fails without error
message and exit code 127 since 2025-01-21. I have no idea why.
|
|
a25dc439
|
2025-02-02T15:01:50
|
|
Debug CI failure
|
|
cd491ac0
|
2025-02-02T13:13:20
|
|
dict: Handle ENOSYS from getentropy gracefully
Also add some comments.
Should fix #854.
|
|
8d7e38d5
|
2025-02-01T22:41:53
|
|
fuzz: Ignore encodings when fuzzing on Apple
Not long ago, Apple decided to replace GNU libiconv with a patched up
version of FreeBSD's iconv implementation in their operating systems.
Unfortunately, the quality of both the original implementation as well
as Apple's patches is so abysmal that you routinely find issues when
fuzzing your own code.
|
|
68be036f
|
2025-02-01T22:09:18
|
|
fuzz: Disable HTML encoding detection for now
This doesn't work with the push parser.
|
|
b4d3d87e
|
2025-02-01T22:02:33
|
|
parser: Fix parsing of doctype declarations
Fix some long-standing issues.
Fixes #504.
|
|
c13fcc19
|
2025-02-01T19:36:06
|
|
html: Chunk text data in push parser
Follow the logic of the XML parser and chunk large text nodes.
|
|
08028572
|
2025-02-01T18:21:47
|
|
html: Make data parsing modes work with push parser
This can't be solved with a simple scan for a terminator. Instead, we
make htmlParseCharData handle incomplete data if the "partial" flag is
set.
|
|
4be1e8be
|
2025-02-01T15:00:26
|
|
html: Simplify htmlParseTryOrFinish a little
|
|
12732592
|
2025-02-01T00:36:12
|
|
html: Remove unused epilog state
|
|
70bf754e
|
2025-02-01T00:17:01
|
|
html: Fix pull-parsing of incomplete end tags
Handle this HTML5 quirk in htmlParseEndTag.
|
|
4a776c78
|
2025-01-31T23:57:44
|
|
html: Use htmlParseElementInternal in push parser
|
|
ba153737
|
2025-01-31T22:51:59
|
|
html: Fix corner case when push-parsing HTML5 comments
|
|
e48fb5e4
|
2025-01-31T22:08:13
|
|
html: Handle incomplete UTF-8 when push-parsing
For now, incomplete UTF-8 is always an error in push mode.
Eventually, we could pass chunked data to the character handler when
push-parsing. Then we'd have to handle incomplete sequences.
|
|
bc437868
|
2025-01-31T23:11:55
|
|
fuzz: Improve HTML fuzzer
Verify that pull and push parser produce the same result.
Fixes #849.
|
|
c4f760be
|
2025-02-01T15:29:56
|
|
encoding: Handle iconv() returning EOPNOTSUPP on Apple
iconv() really shouldn't return undocumented error codes.
|
|
6bb2ea8e
|
2025-02-01T14:58:06
|
|
html: Adjust xmlDetectEncoding for HTML
Don't check for UTF-32 or EBCDIC.
We now perform BOM sniffing and the first step of the HTML5 prescan
algorithm (detect UTF-16 XML declarations). The rest of the algorithm
still has to be implemented.
|
|
227d8f73
|
2025-01-31T21:05:22
|
|
html: Support encoding auto-detection in push parser
Align with pull parser.
|
|
641fb1ac
|
2025-01-31T20:41:28
|
|
html: Fix state update in push parser
|
|
a86a8ae9
|
2025-01-31T20:09:54
|
|
html: Fix push-parsing of empty documents
Also simplify end-of-document handling in push parser.
Align with pull parser.
|
|
d2fb68ed
|
2025-01-31T19:02:33
|
|
fuzz: Make large chunk size more likely
This now detects issues like 3eced32e in about 30 seconds.
|
|
cdfb54ff
|
2025-01-31T18:38:40
|
|
Fix typos
|
|
57e4bbd8
|
2025-01-31T16:45:35
|
|
parser: Improve handling of NOCDATA option
Don't modify the callback structure. This makes sure that unsetting the
option works.
|
|
1f5b5371
|
2025-01-31T16:21:20
|
|
parser: Improve handling of NOBLANKS option
Don't change the SAX handler.
Use a helper function to invoke "characters" SAX callback.
The old code didn't advance the input pointer consistently before
invoking the callback. There was also some inconsistency with respect to
ctxt->space handling. I don't understand the ctxt->space thing, but
now we always behave like the non-complex case before.
|
|
7a8722f5
|
2025-01-31T14:55:29
|
|
parser: Document that XML_PARSE_NOBLANKS is broken
Long text content can generate multiple "characters" callbacks which can
lead to NOBLANKS removing whitespace in non-whitespace text nodes. So
the NOBLANKS option doesn't even work reliably with the pull parser.
This would be extremely hard to fix.
Unfortunately, `xmllint --format` relies on this option which is another
reason why this feature never really worked.
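
The problem can be reproduced with a toy model. The sketch below assumes a naive filter that drops any chunk consisting only of whitespace. `demoChunkIsBlank` and `demoFilterChunks` are hypothetical names, and the real parser logic is more involved.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if all len bytes at s are XML whitespace. */
static int
demoChunkIsBlank(const char *s, size_t len) {
    for (size_t i = 0; i < len; i++) {
        if (s[i] != ' ' && s[i] != '\t' && s[i] != '\n' && s[i] != '\r')
            return 0;
    }
    return 1;
}

/*
 * Naive per-chunk filter: concatenates all non-blank chunks into out
 * (always zero-terminated if outSize > 0). Returns the bytes written.
 */
static size_t
demoFilterChunks(const char *const *chunks, size_t numChunks,
                 char *out, size_t outSize) {
    size_t used = 0;

    if (outSize == 0)
        return 0;
    for (size_t i = 0; i < numChunks; i++) {
        size_t len = strlen(chunks[i]);

        if (demoChunkIsBlank(chunks[i], len))
            continue;               /* whitespace-only chunk is dropped */
        if (len > outSize - 1 - used)
            break;                  /* no room left in out */
        memcpy(out + used, chunks[i], len);
        used += len;
    }
    out[used] = '\0';
    return used;
}
```

If the text `foo   bar` happens to arrive as the three chunks `foo`, three spaces, and `bar`, the filter yields `foobar`: the whitespace inside a non-whitespace text node is lost.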
|
|
40e423d6
|
2025-01-30T19:30:44
|
|
fuzz: Improve fuzzing of push parser
Also serialize the result of push-parsing and compare whether pull and
push parser produce the same result (differential fuzzing).
We lose the ability to inject IO errors when serializing for now, but
this isn't too important.
Use variable chunk size for push parser.
Fixes #849.
|
|
9efe1414
|
2025-01-31T13:07:35
|
|
parser: Fix detection of ']]>' when push-parsing
Fixes #850.
|
|
115b13f9
|
2025-01-30T23:18:56
|
|
parser: Document push parser limitations
|
|
53a48468
|
2025-01-30T15:15:30
|
|
xmllint: Make --push report parse errors
The push parser leaves documents in ctxt->myDoc even if they're invalid.
Also fix documentation.
Regressed with f8ff4d86.
|
|
5535721f
|
2025-01-30T01:27:03
|
|
parser: Grow input buffer after lots of whitespace
Make sure that the input buffer is grown after consuming large amounts
of whitespace.
Also move a comment.
|
|
218264fa
|
2025-01-30T01:26:01
|
|
parser: Always shrink input buffer
Shrinking the input buffer is cheap now and should be done as soon as
possible.
|
|
0de90f51
|
2025-01-30T01:25:31
|
|
parser: Define SIZE_MAX
|
|
3eced32e
|
2025-01-29T23:49:56
|
|
parser: Fix push parser with encoding and single chunk
When push-parsing with an encoding handler, we must convert the whole
buffer in the initial conversion. Otherwise, parsing a single chunk
larger than ~4KB would fail.
Regressed with commit 34c9108f.
|
|
4bd66d45
|
2025-01-29T13:11:38
|
|
Mention contributors in Copyright
To clarify that libxml2 is the work of many people, add the following
copyright notice to Copyright:
Copyright (C) The Libxml2 Contributors.
|
|
fdc73dd0
|
2025-01-29T12:58:31
|
|
README: Fix CMake example options
zlib is disabled by default now.
|
|
64bfe1f7
|
2025-01-29T12:48:50
|
|
README: Add note about security issues
|
|
93506d41
|
2025-01-29T00:17:01
|
|
parser: Make catalog PIs opt-in
This is an obscure feature that shouldn't be enabled by default.
|
|
1082d813
|
2025-01-28T23:21:34
|
|
parser: Prepare to make decompression opt-in
Add a new parser option XML_PARSE_UNZIP that enables decompression.
xmlReadFile, xmlCtxtReadFile and xmlCreateURLParserCtxt always set
this option currently, but downstream users should start to set the
option if they really need it.
|
|
a78843be
|
2025-01-28T20:13:58
|
|
xmllint: Support compressed input from stdin
Another regression related to reading from stdin.
Making a "-" filename read from stdin was deeply baked into the core
IO code but is inherently insecure. I really want to reenable this
dangerous feature as sparingly as possible.
This now enables compressed input when using the "Fd" API functions
which wasn't supported before. But XML_PARSE_NO_UNZIP will be
inverted later.
Allow compressed stdin in xmlReadFile to support xmlstarlet and older
versions of xsltproc. So far, these are the only known command-line
tools that rely on "-" meaning stdin.
|
|
a8d8a70c
|
2025-01-27T13:31:08
|
|
uri: Fix handling of Windows drive letters
Allow drive letters in URI paths. Technically, these should be treated
as URI schemes, but this is not what users expect. This also makes sure
that paths with drive letters are resolved as filesystem paths and
unescaped, for example when used in libxslt's document() function.
Should fix #832.
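
To make the ambiguity concrete: `C:/dir/file.xml` parses as a URI with scheme `C`, although users mean a filesystem path. The heuristic below is only a sketch of how such paths could be recognized. `demoHasDriveLetter` is a hypothetical helper, and the exact rule used by the URI code may differ.

```c
/*
 * Returns 1 if path starts with an ASCII letter and a colon, followed
 * by a slash, a backslash or the end of the string.
 */
static int
demoHasDriveLetter(const char *path) {
    char c = path[0];

    if (!((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')))
        return 0;
    if (path[1] != ':')
        return 0;
    return path[2] == '/' || path[2] == '\\' || path[2] == '\0';
}
```

Under this rule, `http://example.org/` is still treated as a URI with a scheme because its scheme is longer than one letter.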
|
|
6904d4c2
|
2025-01-25T13:54:15
|
|
fuzz: Fix OSS-Fuzz build of lint fuzzer
|
|
cd7299a8
|
2025-01-24T18:59:12
|
|
meson: Fix setup with ICU as sibling subproject
Meson wrapdb provides a wrap for ICU, so libxml2 and ICU could both be
built as subprojects of the same Meson parent project. In this case, with
the icu option enabled, setup was failing with:
subprojects/libxml2-2.13.5/meson.build:603:22: ERROR: Could not get an internal variable and no default provided for <InternalDependency dep228908115162702543524838879388991448872: True>
This is because we can't get a dependency variable from a subproject that
hasn't been built yet. Fall back to assuming DEFS is empty, as it is on
my system.
|