|
4fefba4c
|
2024-05-15T17:52:20
|
|
parser: Rework handling of undeclared entities
Throw an error if entity substitution was requested.
Now we only downgrade to a warning if
- XML_PARSE_DTDLOAD wasn't specified, and
- entities aren't substituted or XML_PARSE_NO_XXE was specified.
Should fix #724.
|
|
b717abdd
|
2024-04-22T15:42:39
|
|
parser: Consolidate error handling for undeclared entities
Always use XML_WAR_UNDECLARED_ENTITY with warning error level in
documents with external subset or parameter entities. Use
XML_ERR_UNDECLARED_ENTITY otherwise.
|
|
186562a1
|
2024-03-12T19:55:33
|
|
parser: Fix detection of duplicate attributes in XML namespace
Fixes a regression from commit e0dd330b, resulting in duplicate
attributes in the predefined XML namespace not being detected or
extraneous default attributes being passed.
Fixes #704.
|
|
29beef65
|
2024-01-02T21:50:38
|
|
parser: Pop inputs if parsing DTD failed
This should provide some statistics in ctxt->sizeentcopy even in the
error or recovery case.
|
|
f237e5b9
|
2024-01-05T15:40:23
|
|
parser: Avoid duplicate namespace errors
Don't report an extra attribute uniqueness error if a namespace is
undeclared. This matches old behavior.
|
|
d0eb5a7e
|
2024-01-03T18:12:29
|
|
parser: Remove xmlErrEncodingInt
Convert the last user to xmlFatalErr.
|
|
e8fb3d63
|
2024-01-02T17:45:54
|
|
parser: Convert some "internal errors" to meaningful codes
|
|
37c6618b
|
2023-12-30T02:50:34
|
|
parser: Rework parsing of attribute and entity values
Don't use a separate function to handle "complex" attributes. Validate
UTF-8 byte sequences without decoding. This should improve performance
considerably when parsing multi-byte UTF-8 sequences.
Use a string buffer to avoid unnecessary allocations and copying when
expanding entities.
Normalize attribute values in a single pass while expanding entities.
Be more lenient in recovery mode.
If no entity substitution was requested, validate entities without
expanding. Fixes #596.
Also fixes #655.
|
|
4ecc85d2
|
2023-12-27T00:44:16
|
|
parser: Push general entity input streams on the stack
This allows the error handler to give more context.
|
|
f3fa34dc
|
2023-12-26T22:37:26
|
|
parser: Fix general entity parsing
Clear namespace database.
Ignore non-fatal errors.
|
|
ecfbcc8a
|
2023-12-25T04:33:00
|
|
parser: Rework general entity parsing
Don't create a new parser context but reuse the existing one.
This exposes bug #601 in a more obvious way.
|
|
8d0aaf4b
|
2023-12-19T20:47:36
|
|
parser: Remove xmlErrEncoding
Use xmlFatalErr or xmlCtxtErrIO.
|
|
7e511f35
|
2023-12-19T15:41:37
|
|
io: Pass error codes from xmlFileOpenReal to xmlNewInputFromFile
This allows reporting the reason why opening a file failed to the parser
context and improves error messages. Now we can also remove the stat call
before opening a file.
|
|
e58ea29f
|
2023-12-10T18:10:42
|
|
SAX2: Report malloc failures
Fix many places where malloc failures aren't reported.
Improve error handling when parsing entity declarations.
Fixes #308.
|
|
a1f7ecae
|
2023-12-10T15:25:42
|
|
entities: Report malloc failures
Fix places where malloc failures aren't reported.
Introduce new API function xmlAddEntity that returns separate error
codes.
Don't invoke global error handler for low-level errors which should be
handled by higher layers.
Invalid redeclaration warnings will be fixed later.
|
|
43b511fa
|
2023-11-26T14:31:39
|
|
parser: Make CRLF increment line number
Partial revert of cb927e85 fixing CRLFs not incrementing the line
number.
This requires reworking xmlParseQNameHashed. The original implementation
prompted the change to xmlCurrentChar which really shouldn't modify the
'cur' pointer as a side effect. But the NEXTL macro relies on this
behavior.
Ultimately, we should reintroduce the change to xmlCurrentChar and fix
the NEXTL macro. This will lead to single CRs incrementing the line
number as well which seems more consistent.
Fixes #628.
|
|
134d2ad8
|
2023-10-06T00:31:44
|
|
parser: Protect against quadratic default attribute expansion
|
|
e48f3d8e
|
2023-09-27T16:47:37
|
|
tests: Add more tests for redefined attributes
|
|
53050b1d
|
2023-08-29T20:06:43
|
|
parser: More fixes to push parser error handling
|
|
bbd918b2
|
2023-08-29T15:56:37
|
|
parser: Fix detection of null bytes
Also suppress misleading extra errors.
Fixes #122.
|
|
c6083a32
|
2023-08-29T16:30:22
|
|
parser: Improve error handling in push parser
- Report errors earlier
- Align error messages with pull parser
|
|
855818bd
|
2023-08-08T15:21:37
|
|
parser: Check for truncated multi-byte sequences
When decoding input data, check whether the "raw" buffer is empty after
parsing the document. Otherwise, the input ends with a truncated
multi-byte sequence which shouldn't be silently ignored.
|
|
74aa61e0
|
2023-01-22T13:09:03
|
|
parser: Halt parser on DTD errors
If we try to continue parsing after an error in the internal or external
subset, entity expansion accounting gets more complicated. Simply halt
the parser.
Found with libFuzzer.
|
|
d320a683
|
2023-01-17T13:50:51
|
|
parser: Fix entity check in attributes
Don't set the "checked" flag when checking entities in default attribute
values. These entities could reference other entities which weren't
defined yet, so the check isn't reliable.
This fixes a short-lived regression which could lead to a call stack
overflow later in xmlStringGetNodeList.
|
|
a41b09c7
|
2022-12-23T21:29:28
|
|
parser: Improve detection of entity loops
Set a flag to detect entity loops at once instead of processing until
the depth limit is exceeded.
|
|
d972393f
|
2022-12-23T21:01:20
|
|
parser: Only report a single entity error
Don't report errors multiple times for nested entity references.
|
|
76c6da42
|
2022-12-04T23:01:00
|
|
error: Make sure that error messages are valid UTF-8
This has caused issues with the Python bindings for a long time.
Should fix #64.
|
|
68a6518c
|
2022-11-15T18:23:33
|
|
parser: Rewrite push parser boundary checks
Remove inaccurate xmlParseCheckTransition check.
Remove non-incremental xmlParseGetLasts check.
Add functions that check for several boundary constructs more
accurately, keeping track of progress in ctxt->checkIndex.
Fixes #439.
|
|
f61b8a62
|
2022-11-13T21:47:03
|
|
parser: Fix DTD parser progress checks
This is another attempt at fixing parser progress checks. Instead of
relying on in->consumed, which could overflow, change some DTD parser
functions to make guaranteed progress on certain byte sequences.
|
|
f1c32b4c
|
2020-07-09T03:19:13
|
|
Allow missing result files in runtest
Treat missing files as empty.
|
|
ce0871e1
|
2022-02-20T16:44:41
|
|
Only warn on invalid redeclarations of predefined entities
Downgrade the error message to a warning since the error was ignored,
anyway. Also print the name of the redeclared entity. For a proper fix that
also shows filename and line number of the invalid redeclaration, we'd
have to
- pass the parser context to the entity functions somehow, or
- make these functions return distinct error codes.
Partial fix for #308.
|
|
9edc20c1
|
2022-02-07T20:38:30
|
|
Fix double counting of CRLF in comments
Fixes #151.
|
|
de5b624f
|
2021-05-08T20:21:29
|
|
Fix handling of unexpected EOF in xmlParseContent
Readd the XML_ERR_TAG_NOT_FINISHED error on unexpected EOF which was
removed in commit 62150ed2.
That commit also introduced a regression for direct users of
xmlParseContent. Unclosed tags weren't checked.
|
|
3e80560d
|
2021-05-07T10:51:38
|
|
Fix line numbers in error messages for mismatched tags
Commit 62150ed2 introduced a small regression in the error messages for
mismatched tags. This typically only affected messages after the first
mismatch, but with custom SAX handlers all line numbers would be off.
This also fixes line numbers in the SAX push parser which were never
handled correctly.
|
|
01411e7c
|
2021-02-08T20:58:32
|
|
Check for invalid redeclarations of predefined entities
Implement section "4.6 Predefined Entities" of the XML 1.0 spec and
check whether redeclarations of predefined entities match the original
definitions.
Note that some test cases declared
<!ENTITY lt "<">
But the XML spec clearly states that this is illegal:
> If the entities lt or amp are declared, they MUST be declared as
> internal entities whose replacement text is a character reference to
> the respective character (less-than sign or ampersand) being escaped;
> the double escaping is REQUIRED for these entities so that references
> to them produce a well-formed result.
Also fixes #217 but the connection is only tangential. The integer
overflow discovered by fuzzing was more related to the fact that various
parts of the parser disagreed on whether to prefer predefined entities
over their redeclarations. The whole situation is a mess and even
depends on legacy parser options. But now that redeclarations are
validated, it shouldn't make a difference.
As noted in the added comment, this is also one of the cases where
overly defensive checks can hide interesting logic bugs from fuzzers.
|
|
79301d3d
|
2020-12-18T12:50:21
|
|
Fix timeout when handling recursive entities
Abort parsing early to avoid an almost infinite loop in certain error
cases involving recursive entities.
Found with libFuzzer.
|
|
32cb5dcc
|
2020-02-11T13:16:10
|
|
Add test case for recursive external parsed entities
|
|
f20daa9e
|
2020-02-11T13:13:52
|
|
Enable error tests with entity substitution
|
|
c2f209c0
|
2019-09-30T14:13:21
|
|
Disallow conditional sections in internal subset
Conditional sections are only allowed in *external* parameter entities
referenced from the internal subset.
|
|
c51e38cb
|
2019-09-30T13:50:02
|
|
Make xmlParseConditionalSections non-recursive
Avoid call stack overflow in deeply nested conditional sections.
Found by OSS-Fuzz.
|
|
62150ed2
|
2019-09-23T14:46:41
|
|
Make xmlParseContent and xmlParseElement non-recursive
Split xmlParseElement into subfunctions. Use nameNsPush to store prefix,
URI and nsNr on the heap, similar to the push parser.
Closes #84.
|
|
f9fce963
|
2019-05-16T21:16:01
|
|
Fix unsigned integer overflow
It's defined behavior but -fsanitize=unsigned-integer-overflow is
useful to discover bugs.
|
|
123234f2
|
2018-09-11T14:52:07
|
|
Free input buffer in xmlHaltParser
This avoids miscalculation of available bytes.
Thanks to Yunho Kim for the report.
Closes: #26
|
|
69936b12
|
2017-08-30T14:16:01
|
|
Revert "Print error messages for truncated UTF-8 sequences"
This reverts commit 79c8a6b which caused a serious regression in
streaming mode.
Also reverts part of commit 52ceced "Fix infinite loops with push
parser in recovery mode".
Fixes bug 786554.
|
|
899a5d9f
|
2017-07-25T14:59:49
|
|
Detect infinite recursion in parameter entities
When expanding a parameter entity in a DTD, infinite recursion could
lead to an infinite loop or memory exhaustion.
Thanks to Wei Lei for the first of many reports.
Fixes bug 759579.
|
|
872fea94
|
2017-06-19T00:24:12
|
|
Get rid of "blanks wrapper" for parameter entities
Now that replacement of parameter entities goes exclusively through
xmlSkipBlankChars, we can account for the surrounding space characters
there and remove the "blanks wrapper" hack.
|
|
24246c76
|
2017-06-20T12:56:36
|
|
Fix xmlHaltParser
Pop all extra input streams before resetting the input. Otherwise,
a call to xmlPopInput could make input available again.
Also set input->end to input->cur.
Changes the test output for some error tests. Unfortunately, some
fuzzed test cases were added to the test suite without manual cleanup.
This makes it almost impossible to review the impact of later changes
on the test output.
|
|
8bbe4508
|
2017-06-17T16:15:09
|
|
Spelling and grammar fixes
Fixes bug 743172, bug 743489, bug 769632, bug 782400 and a few other
misspellings.
|
|
5f440d8c
|
2017-06-12T14:32:34
|
|
Rework entity boundary checks
Make sure to finish all entities in the internal subset. Nevertheless,
readd a sanity check in xmlParseStartTag2 that was lost in my previous
commit. Also add a sanity check in xmlPopInput. Popping an input
unexpectedly was the source of many recent memory bugs. The check
doesn't mitigate such issues but helps with diagnosis.
Always base entity boundary checks on the input ID, not the input
pointer. The pointer could have been reallocated to the old address.
Always throw a well-formedness error if a boundary check fails. In a
few places, a validity error was thrown.
Fix a few error codes and improve indentation.
|
|
79c8a6b1
|
2017-06-10T17:01:27
|
|
Print error messages for truncated UTF-8 sequences
Before, truncated UTF-8 sequences at the end of a file were treated as
EOF. Create an error message containing the offending bytes.
xmlStringCurrentChar would also print characters from the input stream,
not the string it's working on.
|
|
855c19ef
|
2017-06-01T01:04:08
|
|
Avoid reparsing in xmlParseStartTag2
The code in xmlParseStartTag2 must handle the case that the input
buffer was grown and reallocated which can invalidate pointers to
attribute values. Before, this was handled by detecting changes of
the input buffer "base" pointer and, in case of a change, jumping
back to the beginning of the function and reparsing the start tag.
The major problem of this approach is that whether an input buffer is
reallocated is nondeterministic, resulting in seemingly random test
failures. See the mailing list thread "runtest mystery bug: name2.xml
error case regression test" from 2012, for example.
If a reallocation was detected, the code also made no attempts to
continue parsing in case of errors which makes a difference in
the lax "recover" mode.
Now we store the current input buffer "base" pointer for each (not
separately allocated) attribute in the namespace URI field, which isn't
used until later. After the whole start tag was parsed, the pointers
to the attribute values are reconstructed using the offset between the
new and the old input buffer. This relies on arithmetic on dangling
pointers which is technically undefined behavior. But it seems like
the easiest and most efficient fix and a similar approach is used in
xmlParserInputGrow.
This changes the error output of several tests, typically making it
more verbose because we try harder to continue parsing in case of
errors.
(Another possible solution is to check not only the "base" pointer
but the size of the input buffer as well. But this would result in
even more reparsing.)
|
|
00906759
|
2016-01-26T16:57:03
|
|
Heap-based buffer-underreads due to xmlParseName
For https://bugzilla.gnome.org/show_bug.cgi?id=759573
* parser.c:
(xmlParseElementDecl): Return early on invalid input to fix
non-minimized test case (759573-2.xml). Otherwise the parser
gets into a bad state in SKIP(3) at the end of the function.
(xmlParseConditionalSections): Halt parsing when hitting invalid
input that would otherwise cause xmlParserHandlePEReference()
to recurse unexpectedly. This fixes the minimized test case
(759573.xml).
* result/errors/759573-2.xml: Add.
* result/errors/759573-2.xml.err: Add.
* result/errors/759573-2.xml.str: Add.
* result/errors/759573.xml: Add.
* result/errors/759573.xml.err: Add.
* result/errors/759573.xml.str: Add.
* test/errors/759573-2.xml: Add.
* test/errors/759573.xml: Add.
|
|
38eae571
|
2016-03-07T14:04:08
|
|
Heap use-after-free in xmlSAX2AttributeNs
For https://bugzilla.gnome.org/show_bug.cgi?id=759020
* parser.c:
(xmlParseStartTag2): Attribute strings are only valid if the
base does not change, so add another check where the base may
change. Make sure to set 'attvalue' to NULL after freeing it.
* result/errors/759020.xml: Added.
* result/errors/759020.xml.err: Added.
* result/errors/759020.xml.str: Added.
* test/errors/759020.xml: Added test case.
|
|
45752d2c
|
2016-03-03T11:50:34
|
|
Bug 759398: Heap use-after-free in xmlDictComputeFastKey <https://bugzilla.gnome.org/show_bug.cgi?id=759398>
* parser.c:
(xmlParseNCNameComplex): Store start position instead of a
pointer to the name since the underlying buffer may change,
resulting in a stale pointer being used.
* result/errors/759398.xml: Added.
* result/errors/759398.xml.err: Added.
* result/errors/759398.xml.str: Added.
* test/errors/759398.xml: Added test case.
|
|
db07dd61
|
2016-02-12T09:58:29
|
|
Bug 758588: Heap-based buffer overread in xmlParserPrintFileContextInternal <https://bugzilla.gnome.org/show_bug.cgi?id=758588>
* parser.c:
(xmlParseEndTag2): Add bounds checks before dereferencing
ctxt->input->cur past the end of the buffer, or incrementing the
pointer past the end of the buffer.
* result/errors/758588.xml: Add test result.
* result/errors/758588.xml.err: Ditto.
* result/errors/758588.xml.str: Ditto.
* test/errors/758588.xml: Add regression test.
|
|
a7a94612
|
2016-02-09T12:55:29
|
|
Heap-based buffer overread in xmlNextChar
For https://bugzilla.gnome.org/show_bug.cgi?id=759671
When the end of the internal subset isn't properly detected,
xmlParseInternalSubset should just return instead of trying
to process input further.
|
|
f1063fdb
|
2015-11-20T16:06:59
|
|
CVE-2015-7500 Fix memory access error due to incorrect entities boundaries
For https://bugzilla.gnome.org/show_bug.cgi?id=756525
Properly handle the case where we popped out of the current entity
while processing a start tag.
Reported by Kostya Serebryany @ Google
This slightly modifies the output of 754946 in regression tests
|
|
4a5d80ad
|
2015-09-18T15:06:46
|
|
Fix a bug in CData error handling in the push parser
For https://bugzilla.gnome.org/show_bug.cgi?id=754947
The checking function was returning incorrect args in some cases
Adds the test to the regression suite and fixes one of the existing test outputs
|
|
51f02b0a
|
2015-09-15T16:50:32
|
|
Fix a bug on name parsing at the end of current input buffer
For https://bugzilla.gnome.org/show_bug.cgi?id=754946
When hitting the end of the current input buffer while parsing
a name we could end up losing the beginning of the name, which
led to various issues.
|
|
e7bf892d
|
2012-07-30T20:09:25
|
|
Improve error reporting on parser errors
The extra string was being dismissed when provided.
* parser.c: handle both cases properly
* result/: this changes a few error reports
|
|
4629ee02
|
2012-07-23T14:15:40
|
|
Do not fetch external parsed entities
Unless explicitly asked for when validating or replacing entities
with their value. Problem pointed out by Tom Lane <tgl@redhat.com>
* parser.c: do not load external parsed entities unless needed
* test/errors/extparsedent.xml result/errors/extparsedent.xml*:
add a regression test to avoid change of the behaviour in the future
|
|
d7af5553
|
2008-08-04T15:29:44
|
|
rewrite the URI parser to update to rfc3986 (from 2396) removed the error
* uri.c include/libxml/uri.h: rewrite the URI parser to update to
rfc3986 (from 2396)
* test/errors/webdav.xml result/errors/webdav.xml*: removed the
error test, 'DAV:' is a correct URI under 3986
* Makefile.am: small cleanup in make check
Daniel
svn path=/trunk/; revision=3763
|
|
37334576
|
2008-07-31T08:20:02
|
|
added a skipped list, insert rmt-ns10-035 improve 'make check' clean up
* runxmlconf.c: added a skipped list, insert rmt-ns10-035
* Makefile.am: improve 'make check'
* include/libxml/xmlerror.h parser.c: clean up namespace errors
checking and reporting, errors when a document is labelled
as UTF-16 while it is parsed as UTF-8 and no encoding was given
explicitly.
* result/errors/webdav.xml.*: some warnings are now recategorized
as Namespace errors
Daniel
svn path=/trunk/; revision=3761
|
|
c707d0b7
|
2008-01-24T14:48:54
|
|
fix a memory leak in internal subset parsing with a fix from Ashwin add
* parser.c: fix a memory leak in internal subset parsing with
a fix from Ashwin
* test/errors/content1.xml result/errors/content1.xml*:
add test to regressions
Daniel
svn path=/trunk/; revision=3680
|
|
da629347
|
2007-08-01T07:49:06
|
|
fixed a parser bug where invalid char in comment may not be detected,
* parser.c: fixed a parser bug where invalid char in comment may
not be detected, reported by Ashwin Sinha
* test/errors/comment1.xml result/errors/comment1.xml*: added
the example to the regression suite
Daniel
svn path=/trunk/; revision=3647
|
|
b9e5acc4
|
2007-06-12T13:43:00
|
|
fix bug #414846 where invalid characters in attributes would sometimes not
* parser.c: fix bug #414846 where invalid characters in attributes
would sometimes not be detected.
* test/errors/attr4.xml result/errors/attr4.xml*: added a specific
test case to the regression tests
Daniel
svn path=/trunk/; revision=3634
|
|
dcec6724
|
2006-10-15T20:32:53
|
|
fix the patch for unreproducible #343000 but also fix a line/column
* parser.c: fix the patch for unreproducible #343000 but
also fix a line/column keeping error
* result/errors/attr1.xml.err result/errors/attr2.xml.err
result/errors/name.xml.err result/errors/name2.xml.err
result/schemas/anyAttr-processContents-err1_0_0.err
result/schemas/bug312957_1_0.err: affected lines in error output
of the regression tests
Daniel
|
|
f810de04
|
2005-07-06T22:48:41
|
|
fixed problem with free on dupl attribute in dtd (bug309637). added
* parser.c: fixed problem with free on dupl attribute in
dtd (bug309637).
* test/errors/attr3.xml, result/errors/attr3.*: added
regression test for this
|
|
3fa5e7e4
|
2005-07-04T11:12:25
|
|
fixed a bug failing to detect UTF-8 violations in CData in push mode.
* parser.c: fixed a bug failing to detect UTF-8 violations in
CData in push mode.
* result/errors/cdata.xml* test/errors/cdata.xml: added the test
to the regressions
Daniel
|
|
b8590d4c
|
2005-01-21T15:10:23
|
|
fixed bug #164556 where non-fatal errors stopped push parsing and
* parser.c: fixed bug #164556 where non-fatal errors stopped
push parsing and xmlreader.
* Makefile.am: fixup
* test/errors/webdav.xml result/errors/webdav*: adding regression
test for this problem.
Daniel
|
|
4a14fb8f
|
2004-06-14T19:58:20
|
|
fix from Steve Ball and update of the comment. William pointed out that
* xmlreader.c: fix from Steve Ball and update of the comment.
* Makefile.am result/errors/*.str: William pointed out that
the streaming error checking part wasn't streaming, fixing
Daniel
|
|
37fd3074
|
2004-06-03T11:22:31
|
|
fixed a bug where invalid charrefs may not be detected sometimes as
* parser.c: fixed a bug where invalid charrefs may not be detected
sometimes as pointed out by Morus Walter.
* test/errors/charref1.xm result/errors/charref1.xml*: added the
test in the regression suite.
Daniel
|
|
6c662996
|
2004-02-21T11:55:44
|
|
Beuah ! Daniel
Beuah !
Daniel
|