|
19cae17f
|
2020-08-19T13:07:28
|
|
Revert "Fix quadratic runtime in xi:fallback processing"
This reverts commit 27119ec33c9f6b9830efa1e0da0acfa353dfa55a.
Not copying fallback children didn't fix up namespaces and could lead
to use-after-free errors.
Found by OSS-Fuzz.
|
|
d63cfeca
|
2020-08-17T15:40:06
|
|
Add TODO comment in xinclude.c
Add some thoughts on the major remaining problems with the XInclude
implementation.
|
|
804c5297
|
2020-08-17T03:37:18
|
|
Stop using maxParserDepth in xpath.c
Only use a single maxDepth value.
|
|
74dcc10b
|
2020-08-17T03:24:56
|
|
Remove dead code in xinclude.c
'doc' is checked for NULL in xmlXIncludeLoadDoc, so several code
paths can be eliminated.
|
|
0ff52748
|
2020-08-17T02:54:28
|
|
Fix autotools warnings
|
|
d88df4bd
|
2020-08-16T23:38:48
|
|
Fix corner case with empty xi:fallback
xi:fallback could become empty after recursive expansion. Use a flag
to track whether nodes should be skipped.
|
|
00a86d41
|
2020-08-16T23:38:00
|
|
Don't add formatting newlines to XInclude nodes
|
|
dba82a8c
|
2020-08-16T23:02:20
|
|
Fix XInclude regression introduced with recent commit
The change to xmlXIncludeLoadFallback in commit 11b57459 could
process already freed nodes if text nodes were merged after deleting
nodes with an empty fallback.
Found by OSS-Fuzz.
|
|
e1c2d0ad
|
2020-08-16T22:22:57
|
|
Fix memory leak in runtest.c
|
|
2c747129
|
2020-08-17T00:54:12
|
|
Fix error reporting with xi:fallback
When reporting errors, don't use href of xi:include if xi:fallback
was used. I think this can only be reproduced with
"xmllint --postvalid", see the original bug report:
https://bugzilla.gnome.org/show_bug.cgi?id=152623
|
|
2b4769a6
|
2020-08-16T22:02:04
|
|
Make "xmllint --push --recovery" work
|
|
99fc048d
|
2020-08-14T14:18:50
|
|
Don't use SAX1 if all element handlers are NULL
Running xmllint with "--sax --noout" installs a SAX2 handler with all
callbacks set to NULL. In this case or similar situations, we don't want
to switch to SAX1 parsing.
|
|
27119ec3
|
2020-08-17T00:05:19
|
|
Fix quadratic runtime in xi:fallback processing
Copying the tree would lead to runtime quadratic in nested fallback
depth, similar to naive string concatenation.
|
|
c1ba6f54
|
2020-08-15T18:32:29
|
|
Revert "Do not URI escape in server side includes"
This reverts commit 960f0e275616cadc29671a218d7fb9b69eb35588.
This commit introduced
- an infinite loop, found by OSS-Fuzz, which could be easily fixed.
- an algorithm with quadratic runtime
- a security issue, see
https://bugzilla.gnome.org/show_bug.cgi?id=769760
A better approach is to add an option not to escape URLs at all,
which libxml2 possibly should have done in the first place.
|
|
b82fa3dd
|
2020-08-09T14:50:46
|
|
Fix column number accounting in xmlParse*NameAndCompare
Thanks to Frederic Vancraeyveldt for the report.
|
|
438e595a
|
2020-08-09T14:43:53
|
|
Stop counting nbChars in parser context
The value was inaccurate and never used.
|
|
f6a9541f
|
2020-08-09T14:29:35
|
|
Remove unneeded progress checks in HTML parser
The HTML parser should now be guaranteed to make progress, so the
checks became unnecessary.
|
|
9de7b94d
|
2020-08-08T20:37:30
|
|
Use strcmp when fuzzing
This should improve data-flow-guided fuzzing.
|
|
10a07948
|
2020-08-08T17:46:11
|
|
Fix XPath fuzzer
|
|
6c128fd5
|
2020-06-05T13:43:45
|
|
Fuzz XInclude engine
|
|
50f06b3e
|
2020-08-07T21:54:27
|
|
Fix out-of-bounds read with 'xmllint --htmlout'
Make sure that truncated UTF-8 sequences don't cause an out-of-bounds
array access.
Thanks to @SuhwanSong and the Agency for Defense Development (ADD) for
the report.
Fixes #178.
|
|
1abf2967
|
2020-08-06T17:51:57
|
|
Fix exponential runtime and memory in xi:fallback processing
When creating XML_XINCLUDE_START nodes, the children of the original
xi:include node must be freed, otherwise fallback content is copied
twice, doubling runtime and memory consumption for each nested
xi:fallback/xi:include pair.
Found with libFuzzer.
|
|
11b57459
|
2020-08-07T18:39:19
|
|
Don't process siblings of root in xmlXIncludeProcess
xmlXIncludeDoProcess would follow the siblings of the tree root and
also expand these nodes. When using an XML reader, this could lead to
siblings of the current node being expanded without having been parsed
completely.
|
|
0f9817c7
|
2020-06-10T16:34:52
|
|
Don't recurse into xi:include children in xmlXIncludeDoProcess
Otherwise, nested xi:include nodes might result in a use-after-free
if XML_PARSE_NOXINCNODE is specified.
Found with libFuzzer and ASan.
|
|
5725c115
|
2020-06-10T15:11:40
|
|
Fix memory leak in xmlXIncludeIncludeNode error paths
Found with libFuzzer and ASan.
|
|
ad26a60f
|
2020-08-06T13:20:01
|
|
Add XPath and XPointer fuzzer
|
|
956534e0
|
2020-08-04T19:27:13
|
|
Check for custom free function in global destructor
Calling a custom deallocation function in the global destructor could
cause all kinds of unexpected problems. See for example
https://github.com/sparklemotion/nokogiri/issues/2059
Only clean up if memory is managed with malloc/free.
|
|
8e7c20a1
|
2020-08-03T17:30:41
|
|
Fix integer overflow when comparing schema dates
Found by OSS-Fuzz.
|
|
905820a4
|
2020-07-12T22:59:39
|
|
Update fuzzing code
- Shorten timeouts
- Align options from Makefile and options files
- Add section headers to Makefile
- Skip invalid UTF-8 in regexp fuzzer
- Update regexp.dict
- Generate HTML seed corpus in correct format
|
|
68eadabd
|
2020-07-11T21:32:10
|
|
Fix exponential runtime in xmlFARecurseDeterminism
In order to prevent visiting a state twice, states must be marked as
visited for the whole duration of graph traversal because states might
be reached by different paths. Otherwise state graphs like the
following can lead to exponential runtime:
    ->O-->O-->O-->O-->O->
       \ / \ / \ / \ /
        O   O   O   O
Reset the "visited" flag only after the graph was traversed.
xmlFAComputesDeterminism still has massive performance problems when
handling fuzzed input. By design, it has quadratic time complexity in
the number of reachable states. Some issues might also stem from
redundant epsilon transitions. With this fix, fuzzing regexes with a
maximum length of 100 becomes feasible at least.
Found with libFuzzer.
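
The marking strategy described above can be sketched as follows. This is a
minimal illustration and not the libxml2 code: the State type and the
functions traverse, reset_visited and build_diamonds are hypothetical names.
The graph has the diamond shape drawn in the message. Because the mark is
kept until the whole traversal is done, each state is entered once; if the
mark were cleared on the way out, the last state of an n-stage chain would
be entered 2^n times.

```c
#include <stddef.h>

#define MAX_EDGES 2

/* Hypothetical state type; not libxml2's regexp state structure. */
typedef struct State {
    struct State *next[MAX_EDGES]; /* outgoing transitions, NULL if unused */
    int visited;                   /* stays set until the whole traversal ends */
} State;

/* Depth-first traversal that returns how many states were entered. */
static size_t traverse(State *s) {
    size_t count = 1;

    if (s == NULL || s->visited)
        return 0;
    s->visited = 1;
    for (int i = 0; i < MAX_EDGES; i++)
        count += traverse(s->next[i]);
    /* Deliberately no "s->visited = 0" here: clearing the mark on the
     * way out would let every distinct path re-enter s. */
    return count;
}

/* Clear all marks once, after the traversal has finished. */
static void reset_visited(State *states, size_t n) {
    for (size_t i = 0; i < n; i++)
        states[i].visited = 0;
}

/* Build a chain of `stages` diamonds in `states` (needs 2*stages+1 slots):
 * top state 2k leads to top state 2k+2 directly and via bottom state 2k+1. */
static void build_diamonds(State *states, size_t stages) {
    for (size_t i = 0; i < 2 * stages + 1; i++) {
        states[i].next[0] = NULL;
        states[i].next[1] = NULL;
        states[i].visited = 0;
    }
    for (size_t k = 0; k < stages; k++) {
        states[2 * k].next[0] = &states[2 * k + 2];
        states[2 * k].next[1] = &states[2 * k + 1];
        states[2 * k + 1].next[0] = &states[2 * k + 2];
    }
}
```

With 30 stages the graph has 61 states and 2^30 distinct paths, yet traverse
enters only 61 states.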
|
|
1a360c1c
|
2020-07-29T00:39:15
|
|
More *NodeDumpOutput fixes
When leaving nodes, restrict more operations to XML_ELEMENT_NODEs.
|
|
7b2e5172
|
2020-07-28T21:52:55
|
|
Fix *NodeDumpOutput functions
Only output end tag for elements. Should fix serialization of document
fragments.
|
|
dc6f0092
|
2020-07-28T19:07:19
|
|
Make xmlNodeDumpOutputInternal non-recursive
Fixes stack overflow with deeply nested documents.
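
A non-recursive tree walk of this kind can be sketched as below. It is an
illustration under assumptions, not the actual serializer: Node, Buf, buf_add
and dump_tree are hypothetical, and the node carries only the parent,
children and next links needed for the walk. Start tags are written when a
node is entered and end tags when it is left, so stack usage does not grow
with nesting depth.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical minimal node; a real xmlNode has many more fields. */
typedef struct Node {
    const char *name;
    struct Node *parent, *children, *next;
} Node;

/* Fixed-size output buffer; data must start out as an empty string. */
typedef struct {
    char *data;
    size_t len, cap;
} Buf;

/* Append s, silently dropping it if it does not fit. */
static void buf_add(Buf *b, const char *s) {
    size_t n = strlen(s);

    if (b->len + n < b->cap) {
        memcpy(b->data + b->len, s, n + 1);
        b->len += n;
    }
}

/* Serialize the subtree rooted at `root` without recursion. */
static void dump_tree(const Node *root, Buf *out) {
    const Node *cur = root;

    while (cur != NULL) {
        /* Entering a node: write the start tag. */
        buf_add(out, "<");
        buf_add(out, cur->name);
        buf_add(out, ">");
        if (cur->children != NULL) {
            cur = cur->children;        /* descend to the first child */
            continue;
        }
        /* Leaving nodes: write end tags while climbing back up. */
        for (;;) {
            buf_add(out, "</");
            buf_add(out, cur->name);
            buf_add(out, ">");
            if (cur == root)
                return;                 /* never follow root's siblings */
            if (cur->next != NULL) {
                cur = cur->next;        /* continue with the next sibling */
                break;
            }
            cur = cur->parent;          /* last child, go up one level */
        }
    }
}
```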
|
|
5330153d
|
2020-07-28T18:33:50
|
|
Make xhtmlNodeDumpOutput non-recursive
Fixes stack overflow with deeply nested documents.
|
|
b79ab6e6
|
2020-07-28T02:42:37
|
|
Make htmlNodeDumpFormatOutput non-recursive
Fixes stack overflow with deeply nested HTML documents.
Found by OSS-Fuzz.
|
|
21ca8829
|
2020-07-25T17:57:29
|
|
Don't try to handle namespaces when building HTML documents
Don't try to resolve namespace in xmlSAX2StartElement when parsing
HTML documents. This useless operation could slow down the parser
considerably.
Found by OSS-Fuzz.
|
|
93ce33c2
|
2020-07-23T17:34:08
|
|
Fix several quadratic runtime issues in HTML push parser
Fix a few remaining cases where the HTML push parser would scan more
content during lookahead than was actually parsed afterwards.
Make sure that htmlParseDocTypeDecl consumes all content up to the
final '>' in case of errors. The old comment said "We shouldn't try to
resynchronize", but ignoring invalid content is also what the HTML5
spec mandates.
Likewise, make htmlParseEndTag skip to the final '>' in invalid end
tags even if not in recovery mode. This is probably the most visible
change in practice and leads to different output for some tests but is
also more in line with HTML5.
Make sure that htmlParsePI and htmlParseComment don't abort if invalid
characters are encountered but log an error and ignore the character.
Change some other end-of-buffer checks to test for a zero byte instead
of relying on IS_CHAR.
Fix usage of IS_CHAR macro in htmlParseScript.
|
|
10d09472
|
2020-07-23T19:16:21
|
|
Fix .gitattributes
The files in 'test' and 'result' have mixed line endings, so disable
end-of-line conversion.
|
|
173a0830
|
2020-07-22T23:15:35
|
|
Fix quadratic runtime when push parsing HTML start tags
Make sure that htmlParseStartTag doesn't terminate on characters for
which IS_CHAR_CH is false like control chars.
In htmlParseTryOrFinish, only switch to START_TAG if the next character
starts a valid name. Otherwise, htmlParseStartTag might return without
consuming all characters up to the final '>'.
Found by OSS-Fuzz.
|
|
0e5c4fec
|
2020-07-13T15:20:45
|
|
Reset XML parser input before reporting errors
Apply changes to htmlParseChunk() in 13ba5b61 and 3f18e748 to
xmlParseChunk().
|
|
6995eed0
|
2020-07-19T13:54:52
|
|
Fix quadratic runtime when push parsing HTML entity refs
The HTML push parser would look ahead for characters in "; >/" to
terminate an entity reference but actual parsing could stop earlier,
potentially resulting in quadratic runtime.
Parse char data and references alternately in htmlParseTryOrFinish
and only look ahead once for a terminating '<' character.
Found by OSS-Fuzz.
|
|
8e219b15
|
2020-07-12T21:43:44
|
|
Fix HTML push parser lookahead
The parsing rules when looking for terminating chars or sequences in
the push parser differed from the actual parsing code. This could
cause the lookahead to overshoot and data to be rescanned,
potentially leading to quadratic runtime.
Comments must never be handled during lookahead. Attribute values must
only be skipped for start tags and doctype declarations, not for end
tags, comments, PIs and script content.
|
|
e050062c
|
2020-07-15T14:38:55
|
|
Make htmlCurrentChar always translate U+0000
The general assumption is that htmlCurrentChar only returns 0 if the
end of the input buffer is reached. The UTF-8 path already logged an
error if a zero byte U+0000 was found and returned a space character
instead. Make the ASCII code path do the same.
htmlParseTryOrFinish skips zero bytes at the beginning of a buffer, so
even if 0 was returned from htmlCurrentChar, the push parser would make
progress. But rescanning the input could cause performance problems.
The pull parser would abort parsing and now handles zero bytes in ASCII
mode the same way as the push parser or as in UTF-8 mode.
It would be better to return the replacement character U+FFFD instead,
but some of the client code assumes that the UTF-8 length of input and
output matches.
|
|
dfd4e330
|
2020-07-15T14:22:08
|
|
Rework control flow in htmlCurrentChar
Don't call xmlCurrentChar after switching encodings. Rearrange code
blocks and fall through to normal UTF-8 handling.
|
|
922bebcc
|
2020-07-15T14:20:42
|
|
Make 'xmllint --html --push -' read from stdin
|
|
1493130e
|
2020-07-15T12:54:25
|
|
Fix UTF-8 decoder in HTML parser
Reject sequences starting with a continuation byte as well as overlong
sequences like the XML parser.
Also fixes an infinite loop in connection with previous commit 50078922
since htmlCurrentChar would return 0 even if not at the end of the
buffer.
Found by OSS-Fuzz.
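
The checks named above (leading continuation bytes, overlong sequences) can
be sketched as a small standalone decoder. This is not the HTML parser's
code: utf8_decode is a hypothetical function, and the sketch also rejects
truncated input but does not check for surrogates.

```c
#include <stddef.h>

/* Decode one UTF-8 sequence from buf[0..len). On success, store the code
 * point in *out and return the number of bytes consumed (1 to 4).
 * Return -1 for invalid, overlong or truncated input. */
static int utf8_decode(const unsigned char *buf, size_t len, unsigned *out) {
    unsigned c, min;
    int n;

    if (len == 0)
        return -1;
    c = buf[0];
    if (c < 0x80) {                     /* plain ASCII */
        *out = c;
        return 1;
    }
    if (c < 0xC0)
        return -1;                      /* starts with a continuation byte */
    if (c < 0xE0) {
        n = 2; min = 0x80; c &= 0x1F;
    } else if (c < 0xF0) {
        n = 3; min = 0x800; c &= 0x0F;
    } else if (c < 0xF8) {
        n = 4; min = 0x10000; c &= 0x07;
    } else {
        return -1;                      /* 0xF8..0xFF never start a sequence */
    }
    if (len < (size_t) n)
        return -1;                      /* truncated: do not read past the end */
    for (int i = 1; i < n; i++) {
        if ((buf[i] & 0xC0) != 0x80)
            return -1;                  /* expected a continuation byte */
        c = (c << 6) | (buf[i] & 0x3F);
    }
    if (c < min || c > 0x10FFFF)
        return -1;                      /* overlong encoding or out of range */
    *out = c;
    return n;
}
```

Returning an error instead of 0 for bad bytes keeps "0" free to mean "end of
buffer", which is the distinction the infinite loop fix relies on.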
|
|
beb7d71a
|
2020-07-13T12:41:19
|
|
Remove misleading comments in xpath.c
Fixes #169
|
|
50078922
|
2020-07-12T20:28:47
|
|
Fix quadratic runtime when parsing HTML script content
If htmlParseScript returns upon hitting an invalid character,
htmlParseLookupSequence will be called again with checkIndex reset to
zero, potentially resulting in quadratic runtime. Make sure that
htmlParseScript consumes all input in one go and simply skips over
invalid characters similar to htmlParseCharDataInternal.
Found by OSS-Fuzz.
|
|
d6761e70
|
2020-07-13T11:59:45
|
|
Update to Devhelp index file format version 2
Fixes #89
|
|
d514e2bd
|
2020-07-12T18:42:49
|
|
Set project language to C
|
|
5ddf02f2
|
2020-06-07T16:06:17
|
|
Update config.h.cmake.in
|
|
8bec210d
|
2020-06-04T17:37:21
|
|
Add variable for working directory of XML Conformance Test Suite
|
|
270e1655
|
2020-06-04T14:45:48
|
|
Add additional tests and XML Conformance Test Suite
|
|
e6ba4bd7
|
2020-06-04T11:58:04
|
|
Add command line option for temp directory in runtest
|
|
40e7ceaa
|
2020-06-04T11:57:28
|
|
Ensure LF line endings for test files
|
|
9ecf5ad6
|
2020-06-04T00:16:15
|
|
Enable runtests and testThreads
|
|
3f18e748
|
2020-07-11T14:34:57
|
|
Reset HTML parser input before reporting error
Avoid use-after-free, similar to 13ba5b61. Also make sure that
xmlBufSetInputBaseCur sets valid pointers in case of buffer errors.
Found by OSS-Fuzz.
|
|
3da8d947
|
2020-07-09T16:08:38
|
|
Fix more quadratic runtime issues in HTML push parser
Make sure that checkIndex is set when returning without match from
inside a comment. Also track parser state in htmlParseLookupChars.
Found by OSS-Fuzz.
|
|
741b0d0a
|
2020-07-07T12:54:34
|
|
Fix regression introduced with 477c7f6a
The 'inSubset' member is actually used by the SAX2 handlers. Store
extra parser state in 'hasPErefs'.
|
|
fc842f6e
|
2020-07-06T15:22:12
|
|
Limit regexp nesting depth
Enforce a maximum nesting depth of 50 for regular expressions. Avoids
stack overflows with deeply nested regexes.
Found by OSS-Fuzz.
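
A depth limit of this kind can be sketched for a toy grammar of nested
groups. Only the limit of 50 comes from the message; Parser, parse_seq and
check_nesting are hypothetical and much simpler than the real regexp parser.

```c
#include <stddef.h>

#define MAX_NESTING 50  /* limit named in the commit message */

/* Hypothetical parser state for a toy grammar of nested groups. */
typedef struct {
    const char *cur;    /* next character to parse */
    int depth;          /* number of currently open groups */
} Parser;

/* Parse atoms and groups up to ')' or end of input.
 * Returns 0 on success, -1 on unbalanced input or excessive nesting. */
static int parse_seq(Parser *p) {
    while (*p->cur != '\0' && *p->cur != ')') {
        if (*p->cur == '(') {
            if (p->depth >= MAX_NESTING)
                return -1;          /* refuse instead of recursing further */
            p->cur++;
            p->depth++;
            if (parse_seq(p) != 0)
                return -1;
            p->depth--;
            if (*p->cur != ')')
                return -1;          /* group was never closed */
        }
        p->cur++;                   /* consume ')' or an ordinary character */
    }
    return 0;
}

/* Returns 0 if `re` is balanced and nests at most MAX_NESTING groups. */
static int check_nesting(const char *re) {
    Parser p = { re, 0 };

    if (parse_seq(&p) != 0)
        return -1;
    return (*p.cur == '\0') ? 0 : -1;   /* stray ')' at top level */
}
```

Since the recursion depth of parse_seq is bounded by MAX_NESTING, stack use
is bounded no matter how long the input is.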
|
|
1e41e4fa
|
2020-06-30T02:43:57
|
|
Fix return values and documentation in encoding.c
Make xmlEncInputChunk and xmlEncOutputChunk return 0 on success and
never a positive value.
Make xmlCharEncFirstLineInt, xmlCharEncFirstLineInt and
xmlCharEncOutFunc return the number of bytes written.
|
|
6b4717d6
|
2020-07-06T12:36:27
|
|
Add regexp regression tests
- Bug 757711: heap-buffer-overflow in xmlFAParsePosCharGroup
<https://bugzilla.gnome.org/show_bug.cgi?id=757711>
- Bug 783015 - Integer-overflow in xmlFAParseQuantExact
<https://bugzilla.gnome.org/show_bug.cgi?id=783015>
(Regexptests): Add support for checking stderr output when
running regexp tests. This makes it possible to check in test
cases that fail and not see false-positive error output when
running the tests. Unlike other libxml2 test suites, if there
is no stderr output, no *.err file needs to be created.
|
|
477c7f6a
|
2020-06-28T15:54:23
|
|
Fix quadratic runtime in HTML parser
Commit eeb99329 removed an important optimization avoiding quadratic
runtime when repeatedly scanning the input buffer for terminating
characters in the HTML push parser. The related bug is
https://bugzilla.gnome.org/show_bug.cgi?id=444994
Make sure that ctxt->checkIndex is always written and store additional
parser state in ctxt->inSubset which is unused in the HTML parser.
Found by OSS-Fuzz.
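
The optimization can be sketched as a lookup that remembers how far it has
already scanned. Only the field name checkIndex is taken from the message;
PushCtx, lookup_char and the scanned counter are hypothetical and exist
purely for illustration.

```c
#include <stddef.h>

/* Hypothetical push-parser state. */
typedef struct {
    const char *buf;     /* all data received so far */
    size_t len;          /* number of valid bytes in buf */
    size_t checkIndex;   /* bytes already scanned without finding a match */
    size_t scanned;      /* total bytes examined, for illustration only */
} PushCtx;

/* Look for `term` in the buffered data. Returns its index, or -1 if more
 * data is needed. On failure the scan position is remembered, so the next
 * call does not rescan bytes that were already rejected. */
static long lookup_char(PushCtx *ctx, char term) {
    for (size_t i = ctx->checkIndex; i < ctx->len; i++) {
        ctx->scanned++;
        if (ctx->buf[i] == term) {
            ctx->checkIndex = 0;   /* match found, start fresh next time */
            return (long) i;
        }
    }
    ctx->checkIndex = ctx->len;    /* always written on the no-match path */
    return -1;
}
```

If data arrives one byte at a time, the total work stays linear in the input
size. Leaving checkIndex unwritten on any no-match path would restart every
scan at offset 0 and make the total quadratic.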
|
|
f8329fdc
|
2020-07-02T11:51:31
|
|
Report error for invalid regexp quantifiers
|
|
13ba5b61
|
2020-06-28T13:16:46
|
|
Reset HTML parser input before reporting encoding error
If charset conversion fails, reset the input pointers before reporting
the error and bailing out. Otherwise, the input pointers are left in an
invalid state which could lead to use-after-free and other memory
errors.
Similar to f9e7997e. Found by OSS-Fuzz.
|
|
1e7851b5
|
2020-06-25T12:17:50
|
|
Fix integer overflow in xmlFAParseQuantExact
Found by OSS-Fuzz.
|
|
84bab955
|
2020-06-24T20:07:32
|
|
Fix return value of xmlC14NDocDumpMemory
Make sure to return -1 in case of buffer errors.
Fixes #174.
|
|
43a8836c
|
2020-05-31T18:46:21
|
|
Fix rebuilding docs, by hiding __attribute__((...)) behind a macro.
When enabled via `./configure --enable-rebuild-docs`,
`make -C doc libxml2-api.xml` will invoke apibuild.py
to rebuild libxml2-api.xml from the sources.
But the code added in
9fa3200cb366c726f7c8ef234282603bb9e8816d made it error out with
```
Parsing ../parser.c
Parse Error: parsing type : expecting a name
('Got token ', ('sep', '('))
('Last token: ', ('sep', '('))
('Token queue: ', [('name', 'destructor'), ('sep', ')'), ('sep', ')')])
('Line 14689 end: ', '')
```
|
|
9f42f6ba
|
2020-06-24T15:33:38
|
|
Don't follow next pointer on documents in xmlXPathRunStreamEval
RVTs from libxslt are document nodes which are linked using the 'next'
pointer. These pointers must never be used to navigate the document
tree. Otherwise, random content from other RVTs could be returned
when evaluating XPath expressions.
It's interesting that this seemingly long-standing bug wasn't
discovered earlier. This issue could also cause severe performance
degradation.
Fixes https://gitlab.gnome.org/GNOME/libxslt/-/issues/37
|
|
c0440868
|
2020-06-22T13:08:11
|
|
Copy xs:duration parser from libexslt
The duration parser in libexslt checks for integer overflows.
|
|
18425d3a
|
2020-06-21T19:14:23
|
|
Fix integer overflow in _xmlSchemaParseGYear
Found with libFuzzer and UBSan.
|
|
070d635e
|
2020-06-21T16:26:38
|
|
Fix integer overflow when parsing {min,max}Occurs
Clamp value to INT_MAX.
Found with libFuzzer and UBSan.
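
Clamping while parsing can be sketched as below. parse_occurs is a
hypothetical helper, not the schema parser's function; it tests for overflow
before multiplying, so no signed overflow ever occurs.

```c
#include <limits.h>

/* Parse a run of decimal digits, saturating at INT_MAX instead of
 * overflowing. Parsing stops at the first non-digit. */
static int parse_occurs(const char *s) {
    int ret = 0;

    while (*s >= '0' && *s <= '9') {
        int digit = *s - '0';

        /* Equivalent to "ret * 10 + digit > INT_MAX", but evaluated
         * without performing the overflowing multiplication. */
        if (ret > (INT_MAX - digit) / 10)
            ret = INT_MAX;
        else
            ret = ret * 10 + digit;
        s++;
    }
    return ret;
}
```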
|
|
50f18830
|
2020-06-21T15:21:45
|
|
Fix another memory leak in xmlSchemaValAtomicType
Don't collapse language IDs twice.
Found with libFuzzer and ASan.
|
|
eac1c7e2
|
2020-06-21T14:42:00
|
|
Fuzz target for XML Schemas
This only tests the schema parser for now.
|
|
ffd31dbe
|
2020-06-21T12:14:19
|
|
Move entity recorder to fuzz.c
|
|
681f094e
|
2020-06-15T15:23:05
|
|
Fix unsigned integer overflow in htmlParseTryOrFinish
Cast to signed type before subtraction to avoid unsigned integer
overflow. Also use ptrdiff_t to avoid potential integer truncation.
Found with libFuzzer and UBSan.
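
The idea can be sketched with a hypothetical helper, bytes_available, which
is not the code from htmlParseTryOrFinish. Each operand is converted to a
signed type before the subtraction, so a result below zero is represented
as a negative number instead of wrapping around to a huge unsigned value.
The sketch assumes both sizes fit in ptrdiff_t.

```c
#include <stddef.h>

/* Bytes remaining after `consumed` in a buffer of `size` bytes.
 * Negative if `consumed` lies past the end. Using ptrdiff_t instead of
 * int avoids truncation where int is narrower than size_t. */
static ptrdiff_t bytes_available(size_t size, size_t consumed) {
    return (ptrdiff_t) size - (ptrdiff_t) consumed;
}
```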
|
|
31ca4a72
|
2020-06-15T18:47:53
|
|
Fix integer overflow in htmlParseCharRef
Fixes #115.
|
|
2f938203
|
2020-06-15T15:45:47
|
|
Fix undefined behavior in UTF16LEToUTF8
Don't perform arithmetic on null pointer.
Found with libFuzzer and UBSan.
|
|
536f421d
|
2020-06-15T12:20:54
|
|
Fuzz target for HTML parser
|
|
a697ed1e
|
2020-06-15T14:49:22
|
|
Fix return value of xmlCharEncOutput
Commit 407b393d introduced a regression caused by xmlCharEncOutput
returning 0 in case of success instead of the number of bytes written.
Always use its return value for nbchars in xmlOutputBufferWrite.
Fixes #166.
|
|
af893a58
|
2020-06-11T16:08:16
|
|
Update GitLab CI container
|
|
a28f7d87
|
2020-06-10T13:41:13
|
|
Never expand parameter entities in text declaration
When parsing the text declaration of external DTDs or entities, make
sure that parameter entities are not expanded. This also fixes a memory
leak in certain error cases.
The change to xmlSkipBlankChars assumes that the parser state is
maintained correctly when parsing external DTDs or parameter entities,
and might expose bugs in the code that were hidden previously.
Found by OSS-Fuzz.
|
|
487871b0
|
2020-06-10T13:23:43
|
|
Fix undefined behavior in xmlXPathTryStreamCompile
&NULL[0] is undefined behavior.
|
|
e98150d4
|
2020-06-09T13:45:31
|
|
Add options file for xml fuzzer
This will be picked up by OSS-Fuzz, limiting the maximum input size to
80 KB and hopefully avoiding timeouts. Some of the timeouts seem to be
related to our suboptimal handling of excessive entity expansion.
The new fuzzers support external entities and make this problem even
more prominent.
|
|
2af3c2a8
|
2020-06-08T12:49:51
|
|
Fix use-after-free with validating reader
Just like IDs, IDREF attributes must be removed from the document's
refs table when they're freed by a reader. This bug is often hidden
because xmlAttr structs are reused and strings are stored in a
dictionary unless XML_PARSE_NODICT is specified.
Found by OSS-Fuzz.
|
|
00ed736e
|
2020-06-05T12:49:25
|
|
Add a couple of libFuzzer targets
- XML fuzzer
Currently tests the pull parser, push parser and reader, as well as
serialization. Supports splitting fuzz data into multiple documents
for things like external DTDs or entities. The seed corpus is built
from parts of the test suite.
- Regexp fuzzer
Seed corpus was statically generated from test suite.
- URI fuzzer
Tests parsing and most other functions from uri.c.
|
|
2e8cc66d
|
2020-05-30T15:40:08
|
|
xmlParseBalancedChunkMemory must not be called with NULL doc
There is no way to avoid memory leaks without a document to hold the
namespace list.
|
|
a0a8059b
|
2020-05-30T15:33:03
|
|
Revert "Fix memory leak in xmlParseBalancedChunkMemoryRecover"
This reverts commit 5a02583c7e683896d84878bd90641d8d9b0d0549.
Fixes #161.
|
|
ff009f99
|
2020-05-30T15:32:25
|
|
Fix memory leak in xmlXIncludeLoadDoc error path
Found by OSS-Fuzz.
|
|
a230b728
|
2020-04-10T19:22:07
|
|
win32: allow passing *FLAGS on command line
nmake is a primitive tool, so this is a primitive implementation:
append EXTRA_CFLAGS etc. variables.
Command line variables should be appended to allow overriding flags set
in the makefile.
It doesn't work to pass in CFLAGS like in make because that always
overrides the assignments in the makefile.
|
|
4f2aee18
|
2020-05-04T14:03:52
|
|
Make schema validation fail with multiple top-level elements
Closes #126.
|
|
106757e8
|
2020-04-10T14:52:03
|
|
Guard new calls to xmlValidatePopElement in xmlreader.c
Closes #154.
|
|
386fb276
|
2020-04-28T17:00:37
|
|
Add LIBXML_VALID_ENABLED to xmlreader
This file already has LIBXML_VALID_ENABLED guards for builds configured
with "./configure --without-valid", but they were missing here.
|
|
e7ff2efc
|
2020-04-21T21:16:07
|
|
Configure file xmlwin32version.h.in on MSVC
|
|
e2f10494
|
2020-04-21T21:04:23
|
|
List headers individually
|
|
2a2c38f3
|
2020-04-21T00:53:12
|
|
Add CMake build files
Closes #24.
|
|
9fa3200c
|
2020-03-31T23:18:25
|
|
Call xmlCleanupParser on ELF destruction
Fixes #153.
|
|
e4fb3684
|
2020-02-28T12:48:14
|
|
Parenthesize Py<type>_Check() in ifs
In C, if expressions should be parenthesized.
PyLong_Check, PyUnicode_Check etc. happened to expand to a parenthesized
expression before, but that's not API to rely on.
Since Python 3.9.0a4 it needs to be parenthesized explicitly.
Fixes https://gitlab.gnome.org/GNOME/libxml2/issues/149
|
|
20c60886
|
2020-03-08T17:19:42
|
|
Fix typos
Resolves #133.
|
|
2a7b6684
|
2020-03-02T11:52:52
|
|
Disable LeakSanitizer
The GitLab runner doesn't run in privileged mode anymore [1], at least
for projects outside the GNOME group. Disable LeakSanitizer for now
as it needs the ptrace capability.
[1] https://gitlab.gnome.org/Infrastructure/Infrastructure/issues/251
|