|
fdfeecfe
|
2024-07-02T21:54:26
|
|
parser: Reenable ctxt->directory
Unused internally, but used in downstream code.
Should fix #753.
|
|
606f4108
|
2024-07-02T20:57:15
|
|
parser: Allow to disable catalogs with parser options
Implement XML_PARSE_NO_SYS_CATALOG and XML_PARSE_NO_CATALOG_PI.
Fixes #735.
|
|
866be54e
|
2024-07-02T04:27:53
|
|
parser: Don't use deprecated xmlSplitQName
|
|
bc793390
|
2024-06-27T16:23:14
|
|
parser: Update documentation
|
|
eca972e6
|
2024-06-26T02:22:04
|
|
parser: Add getters for XML declaration to parser context
Access to struct members will be deprecated.
|
|
bbbbbb46
|
2024-06-20T03:19:48
|
|
parser: implement xmlCtxtGetOptions
In 712a31ab, the `options` struct member was deprecated. To allow
callers to check the status of options bits, introduce
xmlCtxtGetOptions.
|
|
217e9b7a
|
2024-06-08T12:27:45
|
|
clang-tidy: don't return in void functions
Found with readability-redundant-control-flow
Signed-off-by: Rosen Penev <rosenp@gmail.com>
|
|
32cac377
|
2024-06-17T17:59:49
|
|
parser: Selectively reenable reading from "-"
Make filename "-" mean stdin for legacy SAX1 functions and xmlReadFile.
This should hopefully fix most command line utilities.
See #737.
|
|
33a1f897
|
2024-06-16T19:16:47
|
|
legacy: Merge SAX.c into legacy.c
|
|
10d60d15
|
2024-06-16T00:04:46
|
|
regexp: Stop using LIBXML_AUTOMATA_ENABLED
This macro always equals LIBXML_REGEXP_ENABLED.
|
|
b0fc67aa
|
2024-06-15T22:53:55
|
|
build: Remove --with-tree configuration option
This option would allow for a smaller, but mostly useless minimal build.
But it complicates the symbol availability logic in an insane way and
requires specialized tools like our custom C parser in doc/apibuild.py.
See #717.
|
|
039ce1e8
|
2024-06-14T16:41:43
|
|
parser: Pass global object to sax->setDocumentLocator
Revert part of commit c011e760.
Fixes #732.
|
|
dba1ed85
|
2024-06-12T18:19:55
|
|
ftp: Remove FTP support
Remove the built-in FTP client. If you configure --with-legacy, old
symbols are retained for ABI compatibility.
|
|
52384043
|
2024-06-11T19:10:41
|
|
parser: Pass resource type to resource loader
|
|
89fcae4d
|
2024-06-11T16:19:58
|
|
parser: Don't report malloc failures when creating context
We don't want messages to stderr before an error handler could be set on
a parser context.
|
|
410931e3
|
2024-06-11T00:55:38
|
|
parser: Only set input ID for PE refs
Other input streams don't require IDs.
|
|
ff3b0919
|
2024-06-11T00:00:32
|
|
parser: Implement XML_PARSE_NO_UNZIP option
|
|
47cbb6bb
|
2024-06-10T14:04:00
|
|
doc: Don't mention xmlNewInputURL
|
|
8318b5a6
|
2024-06-09T14:22:53
|
|
parser: Fix NULL checks for output arguments
|
|
0cde1b78
|
2024-06-06T23:50:03
|
|
parser: Fix "Truncated multi-byte sequence" error
Don't raise the error if decoding failed.
|
|
122b6130
|
2024-06-04T16:33:02
|
|
parser: Fix performance regression when parsing namespaces
The namespace hash table didn't reuse deleted buckets, leading to
quadratic behavior.
Also ignore deleted buckets when resizing.
Fixes #726.
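The fix above can be illustrated with a minimal, self-contained sketch of an open-addressing table that reuses deleted buckets (tombstones) on insert. This is not the libxml2 namespace table: all names here (DemoTable, demoInsert, demoRemove, DEMO_SIZE) are hypothetical, the table size is fixed, and resizing is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_SIZE 8 /* hypothetical fixed size, must be a power of two */

typedef enum { SLOT_EMPTY = 0, SLOT_USED, SLOT_DELETED } DemoSlotState;

typedef struct {
    DemoSlotState state;
    unsigned key;
} DemoSlot;

typedef struct {
    DemoSlot slots[DEMO_SIZE];
    size_t used;    /* live entries */
    size_t deleted; /* tombstones left behind by removals */
} DemoTable;

static void demoInit(DemoTable *t) {
    assert(t != NULL);
    memset(t, 0, sizeof(*t)); /* SLOT_EMPTY is 0 */
}

/* Returns 1 if inserted, 0 if already present, -1 if the table is full. */
static int demoInsert(DemoTable *t, unsigned key) {
    size_t pos = key & (DEMO_SIZE - 1);
    size_t target = DEMO_SIZE; /* sentinel: no reusable slot seen yet */
    size_t i;

    for (i = 0; i < DEMO_SIZE; i++) {
        DemoSlot *s = &t->slots[pos];

        if (s->state == SLOT_EMPTY) {
            if (target == DEMO_SIZE)
                target = pos;
            break; /* key cannot be stored past an empty slot */
        }
        if (s->state == SLOT_DELETED) {
            /* Remember the first tombstone so it can be reused. */
            if (target == DEMO_SIZE)
                target = pos;
        } else if (s->key == key) {
            return 0;
        }
        pos = (pos + 1) & (DEMO_SIZE - 1);
    }

    if (target == DEMO_SIZE)
        return -1;
    if (t->slots[target].state == SLOT_DELETED)
        t->deleted--; /* tombstone is recycled instead of piling up */
    t->slots[target].state = SLOT_USED;
    t->slots[target].key = key;
    t->used++;
    return 1;
}

/* Returns 1 if the key was removed, 0 if it was not found. */
static int demoRemove(DemoTable *t, unsigned key) {
    size_t pos = key & (DEMO_SIZE - 1);
    size_t i;

    for (i = 0; i < DEMO_SIZE; i++) {
        DemoSlot *s = &t->slots[pos];

        if (s->state == SLOT_EMPTY)
            return 0;
        if (s->state == SLOT_USED && s->key == key) {
            s->state = SLOT_DELETED;
            t->used--;
            t->deleted++;
            return 1;
        }
        pos = (pos + 1) & (DEMO_SIZE - 1);
    }
    return 0;
}
```

With this scheme, repeatedly declaring and undeclaring the same key leaves at most one tombstone behind, so probe sequences stay short. If inserts always skipped tombstones, the chain of deleted buckets would keep growing, which is the kind of quadratic behavior the commit describes.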
|
|
a7e26707
|
2024-06-03T14:04:44
|
|
parser: Don't overwrite OOM errors in xmlSBuf
|
|
e75e878e
|
2024-05-20T13:58:22
|
|
doc: Update and fix documentation
|
|
4fefba4c
|
2024-05-15T17:52:20
|
|
parser: Rework handling of undeclared entities
Throw an error if entity substitution was requested.
Now we only downgrade to a warning if
- XML_PARSE_DTDLOAD wasn't specified, and
- entities aren't substituted or XML_PARSE_NO_XXE was specified.
Should fix #724.
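The downgrade rule listed above can be written as a small decision function. This is a sketch of only the two conditions named in this message, not the parser's actual code: the function name, the enum and the boolean parameters are hypothetical stand-ins for the XML_PARSE_DTDLOAD, entity substitution and XML_PARSE_NO_XXE settings, and any other preconditions the parser checks are ignored.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { DEMO_SEVERITY_WARNING, DEMO_SEVERITY_ERROR } DemoSeverity;

/*
 * dtdLoad:    XML_PARSE_DTDLOAD was specified
 * substitute: entity substitution was requested
 * noXxe:      XML_PARSE_NO_XXE was specified
 */
static DemoSeverity
demoUndeclaredEntitySeverity(bool dtdLoad, bool substitute, bool noXxe) {
    /* Downgrade to a warning only if both listed conditions hold. */
    if (!dtdLoad && (!substitute || noXxe))
        return DEMO_SEVERITY_WARNING;
    return DEMO_SEVERITY_ERROR;
}
```

For example, requesting entity substitution without XML_PARSE_NO_XXE yields an error, while the same request combined with XML_PARSE_NO_XXE (and no DTD loading) yields a warning.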
|
|
4ff2dccf
|
2024-05-10T02:04:52
|
|
SAX2: Warn if URI resolution failed
|
|
4fe116eb
|
2024-05-10T00:05:44
|
|
parser: Don't report error on invalid URI
Only fragment identifiers are an error.
This removes the last user of xmlErrMsg*. Now every error reported by
the parser should result in one of ctxt->wellFormed, ctxt->nsWellFormed
or ctxt->valid being set to zero.
|
|
a4c2b723
|
2024-05-05T17:26:31
|
|
io: Don't set close callback in xmlParserInputBufferCreateFd
|
|
fdc5ff36
|
2024-05-02T16:23:04
|
|
parser: Always throw entity errors if external DTD is loaded
When parsing with XML_PARSE_DTDLOAD, missing entities are always an
error.
Also consolidate behavior when validating. See b717abdd.
|
|
39e5b35b
|
2024-05-02T22:06:19
|
|
parser: Don't create undeclared entity refs in substitution mode
We never want to create entity reference nodes if entity substitution
is enabled. This also applies to undeclared entities.
|
|
1cdfece1
|
2024-04-28T18:33:40
|
|
memory: Remove memory debugging
This is useless compared to sanitizers or valgrind and has a
considerable performance impact if enabled accidentally.
|
|
45fe9924
|
2024-04-22T17:12:54
|
|
parser: Don't create reference in xmlLookupGeneralEntity
This should only be done in xmlParseReference.
The handling of undeclared entities is still somewhat inconsistent. In
element content we create references even if entity substitution is
enabled. In attribute values undeclared entities are always ignored.
|
|
b717abdd
|
2024-04-22T15:42:39
|
|
parser: Consolidate error handling for undeclared entities
Always use XML_WAR_UNDECLARED_ENTITY with warning error level in
documents with external subset or parameter entities. Use
XML_ERR_UNDECLARED_ENTITY otherwise.
|
|
f506ec66
|
2024-04-15T11:27:44
|
|
parser: Always decode entities in namespace URIs
Also decode entities in namespace URIs if entity substitution wasn't
requested. This should fix some corner cases when comparing namespace
URIs. The Namespaces in XML 1.0 spec says:
> In a namespace declaration, the URI reference is the normalized value
> of the attribute, so replacement of XML character and entity
> references has already been done before any comparison.
Make the serialization code escape special characters in namespace URIs
like in attribute values. This fixes serialization if entities were
substituted when parsing.
Fixes https://gitlab.gnome.org/GNOME/libxslt/-/issues/106
|
|
2840e33c
|
2024-03-04T07:34:25
|
|
tree: Allocate XML namespace statically
|
|
186562a1
|
2024-03-12T19:55:33
|
|
parser: Fix detection of duplicate attributes in XML namespace
Fixes a regression from commit e0dd330b, resulting in duplicate
attributes in the predefined XML namespace not being detected or
extraneous default attributes being passed.
Fixes #704.
|
|
4d774612
|
2024-02-13T11:35:12
|
|
parser: Fix column number in attribute values
Short-lived regression from 37c6618b.
|
|
95f2a174
|
2024-01-30T13:25:17
|
|
parser: Fix crash in xmlParseInNodeContext with HTML documents
Ignore namespaces if we have an HTML document with namespaces added
manually.
Fixes #672.
|
|
6dc2fdb2
|
2024-01-07T14:30:57
|
|
parser: Account for full size of non-well-formed entities
Account for the full size of the entity if parsing stops because of
errors. In our cost model, we have to assume that the entity loader
processes the whole entity regardless of its content.
|
|
29beef65
|
2024-01-02T21:50:38
|
|
parser: Pop inputs if parsing DTD failed
This should provide some statistics in ctxt->sizeentcopy even in the
error or recovery case.
|
|
02a2038d
|
2024-01-10T14:17:49
|
|
parser: Handle NOCDATA properly when expanding entities
Short-lived regression from e1153832.
|
|
e1153832
|
2024-01-07T01:29:37
|
|
parser: Fix quadratic behavior when copying entities
Process the first and last text node with the SAX handler to make the
text merging optimization kick in.
Fixes #657.
|
|
f237e5b9
|
2024-01-05T15:40:23
|
|
parser: Avoid duplicate namespace errors
Don't report an extra attribute uniqueness error if a namespace is
undeclared. This matches old behavior.
|
|
02cc5c36
|
2024-01-05T04:17:14
|
|
parser: Add XML_PARSE_NO_XXE parser option
|
|
12f0bb94
|
2024-01-05T01:14:28
|
|
parser: Synchronize more options
|
|
3efbe916
|
2024-01-05T00:11:29
|
|
parser: Mark 'token' member as unused in xmlParserCtxt
|
|
b82fd81d
|
2024-01-04T23:25:06
|
|
parser: Rework xmlCtxtParseDocument
Make xmlCtxtParseDocument take a parser input which can be popped after
parsing.
|
|
d7d300ba
|
2024-01-04T17:50:11
|
|
parser: Remove remnants of runtime debugging feature
Apparently, this feature was removed long ago.
Fixes #651.
|
|
8c5848bd
|
2024-01-04T17:14:31
|
|
parser: Make xmlParseContent more useful
This is an internal function which isn't really usable without some
hacks. See WebKit/Chromium trying to recreate the effects of
xmlDetectSAX2 manually, for example.
Make xmlParseContent perform late initialization and check whether the
content was fully parsed.
Also rename xmlDetectSAX2 and document why it's needed.
|
|
a7356dfe
|
2024-01-03T18:02:46
|
|
parser: Clear invalid entity content
This was removed in earlier commits, but we really want to make sure
that entity content is syntactically valid.
|
|
30d83977
|
2024-01-04T15:18:14
|
|
fuzz: Disable catalogs
The catalogs API doesn't report OOM errors. It's basically impossible
to use it safely in its current form.
|
|
85f99023
|
2024-01-02T17:52:43
|
|
parser: Fix buffer size checks
Don't test size of remaining data. This causes false positives with
memory buffers.
Also impose XML_MAX_HUGE_LENGTH limit when parsing with XML_PARSE_HUGE.
|
|
e8fb3d63
|
2024-01-02T17:45:54
|
|
parser: Convert some "internal errors" to meaningful codes
|
|
5cb4b05c
|
2024-01-02T17:16:22
|
|
parser: Lower maximum entity nesting depth
Limit entity nesting depth to 20 or 40 with XML_PARSE_HUGE.
Change error code to XML_ERR_RESOURCE_LIMIT.
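As a rough sketch of the limit described above, the check could look like the following. The limits 20 and 40 come from this message; the function names, the return codes and the exact comparison (whether a depth equal to the limit is still accepted) are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result codes standing in for the real error codes. */
enum { DEMO_OK = 0, DEMO_ERR_RESOURCE_LIMIT = 1 };

/* hugeOption models whether XML_PARSE_HUGE was specified. */
static int demoMaxEntityDepth(bool hugeOption) {
    return hugeOption ? 40 : 20;
}

static int demoCheckEntityDepth(int depth, bool hugeOption) {
    assert(depth >= 0);
    /* Assumption: a depth equal to the limit is still allowed. */
    if (depth > demoMaxEntityDepth(hugeOption))
        return DEMO_ERR_RESOURCE_LIMIT;
    return DEMO_OK;
}
```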
|
|
a2cc7f5f
|
2024-01-02T17:02:21
|
|
parser: Set depth limit to 2048 with XML_PARSE_HUGE
Deeply nested documents can cause performance problems, so the nesting
depth should always be limited to a reasonable value.
Also remove the global xmlParserMaxDepth setting which isn't thread-safe
and seems unused.
|
|
875bb084
|
2023-09-07T03:25:45
|
|
parser: Implement xmlCtxtSetOptions
Surprisingly, some options can only be enabled with xmlCtxtUseOptions
and it's impossible to unset them. Add a new API function
xmlCtxtSetOptions which sets or clears all options.
Finally document all parser options.
Make sure to synchronize option bits and struct members.
|
|
33ec407a
|
2023-09-07T03:33:09
|
|
parser: Always prefer option members over bitmask
If an option has an extra member in xmlParserCtxt, it takes precedence
over the value from the options bitmask. Fix a few places where this was
ignored.
|
|
22fd571f
|
2023-09-06T22:15:20
|
|
parser: Don't modify SAX2 handler if XML_PARSE_SAX1 is set
It's a bad idea to modify members of the SAX handler struct for option
state management. Ideally, ctxt->options should be the preferred source
of truth.
|
|
37c6618b
|
2023-12-30T02:50:34
|
|
parser: Rework parsing of attribute and entity values
Don't use a separate function to handle "complex" attributes. Validate
UTF-8 byte sequences without decoding. This should improve performance
considerably when parsing multi-byte UTF-8 sequences.
Use a string buffer to avoid unnecessary allocations and copying when
expanding entities.
Normalize attribute values in a single pass while expanding entities.
Be more lenient in recovery mode.
If no entity substitution was requested, validate entities without
expanding. Fixes #596.
Also fixes #655.
|
|
2b79f106
|
2023-12-29T21:07:04
|
|
parser: Simplify entity size accounting
|
|
08d9b258
|
2023-12-29T15:20:56
|
|
parser: Support namespace scope in NsData struct
The previous approach of recreating the NsData struct was flawed.
|
|
5de48d12
|
2023-12-29T14:41:40
|
|
parser: Simplify error handling when parsing entities
|
|
f0dc52d0
|
2023-12-29T06:00:20
|
|
parser: Move cleanup of element stacks to xmlParseContent
|
|
a1ed589b
|
2023-12-29T23:12:06
|
|
parser: Avoid unwanted expansion of parameter entities
Remove PE handling from xmlSkipBlankChars and add a separate version
that handles PEs. Only call xmlSkipBlankCharsPE when parsing DTD
constructs. This should make sure that PEs don't get expanded
accidentally, for example in text declarations.
|
|
a73483ed
|
2023-12-29T00:22:02
|
|
parser: Remove extraneous error message
This is not an "internal error" but some other error reported elsewhere.
|
|
7e0bbbc1
|
2023-12-27T18:33:30
|
|
parser: New input API
Provide a new set of functions to create xmlParserInputs. These can be
used for the document entity or from external entity loaders.
- Don't require xmlParserInputBuffer.
- All functions take a base URI.
- All functions take an encoding as string.
- xmlNewInputURL also takes a public ID.
- xmlNewInputMemory takes a size_t.
- Optimization hints for memory buffers.
Improve documentation.
Only call xmlInitParser before allocating a new parser context.
Call xmlCtxtUseOptions as early as possible.
|
|
45157261
|
2023-12-27T21:30:13
|
|
parser: Downgrade XML_ERR_UNSUPPORTED_ENCODING to warning
If the actual encoding is UTF-8 or ASCII, we don't want to fail.
|
|
24b7144f
|
2023-12-27T15:50:58
|
|
parser: More refactoring of entity parsing
Remove xmlCreateEntityParserCtxtInternal.
Rework xmlNewEntityInputStream.
|
|
d3ceea0b
|
2023-12-27T15:18:09
|
|
parser: Fix encoding handling in xmlParserInputBufferCreateIO
Don't pass encoding to xmlParserInputBufferCreateIO but use
xmlSwitchEncoding to make sure that the encoding sticks.
|
|
d025cfbb
|
2023-12-27T03:53:24
|
|
parser: Always copy content from entity to target.
Make sure that references from IDs are updated.
Note that if there are IDs with the same value in a document, the last
one will now be returned. IDs should be unique, but maybe this should be
addressed.
|
|
6337ff79
|
2023-12-27T03:29:13
|
|
parser: Simplify control flow in xmlParseReference
|
|
579186f2
|
2023-12-27T03:03:26
|
|
parser: Remove xmlSetEntityReferenceFunc feature
This has been deprecated for a long time.
|
|
b848338c
|
2023-12-27T01:46:40
|
|
parser: More refactoring of entity loading
This sets input->entity also for general entities.
|
|
4ecc85d2
|
2023-12-27T00:44:16
|
|
parser: Push general entity input streams on the stack
This allows the error handler to give more context.
|
|
6a9a88a1
|
2023-12-26T03:13:05
|
|
parser: Move progressive flag into input struct
|
|
4f14fe9c
|
2023-12-26T02:44:38
|
|
parser: Remove remaining ctxt->instate checks
Now ctxt->instate is only used for push parser states.
|
|
d944a415
|
2023-12-26T02:10:35
|
|
parser: Fix in-parameter-entity and in-external-dtd checks
Use ctxt->input->entity instead of ctxt->inputNr to determine whether
we are inside a parameter entity.
Stop using ctxt->external to check whether we're in an external DTD.
This is signaled by ctxt->inSubset == 2.
|
|
f3fa34dc
|
2023-12-26T22:37:26
|
|
parser: Fix general entity parsing
Clear namespace database.
Ignore non-fatal errors.
|
|
ecfbcc8a
|
2023-12-25T04:33:00
|
|
parser: Rework general entity parsing
Don't create a new parser context but reuse the existing one.
This exposes bug #601 in a more obvious way.
|
|
955c177f
|
2023-12-23T00:58:36
|
|
parser: Stop using 'directory' struct member
This was only used as a pointless fallback for URI resolution.
|
|
e8de3401
|
2023-12-22T02:57:19
|
|
parser: Also set document properties when push parsing
Add new function xmlFinishDocument which invokes the endDocument SAX
handler and sets the document's properties.
|
|
13043691
|
2023-12-20T00:33:34
|
|
parser: Rename xmlErrParser to xmlCtxtErr
|
|
8d0aaf4b
|
2023-12-19T20:47:36
|
|
parser: Remove xmlErrEncoding
Use xmlFatalErr or xmlCtxtErrIO.
|
|
23345a1c
|
2023-12-19T19:52:28
|
|
io: Report IO errors through xmlCtxtErrIO
This is also a new public API function to be used in external entity
loaders.
|
|
531d06ad
|
2023-12-18T22:48:24
|
|
error: Stop printing some errors by default
Unfortunately, it's long-standing behavior for libxml2 to print all
reported errors to stderr by default. This default behavior is now
partially disabled. If no error handler is set, only parser and
validation errors are passed to a generic error handler or printed to
stderr. Other errors are still available via xmlGetLastError and can be
captured with a structured error handler.
|
|
54c70ed5
|
2023-12-18T19:31:29
|
|
parser: Improve error handling
Introduce xmlCtxtSetErrorHandler, which allows setting a structured
error handler for a parser context. There already was the "serror" SAX
handler but this always receives the parser context as argument.
Start to use xmlRaiseMemoryError.
Remove useless arguments from memory error functions. Rename
xmlErrMemory to xmlCtxtErrMemory.
Remove a few calls to xmlGenericError.
Remove support for runtime entity debugging.
|
|
1c106edf
|
2023-12-13T23:56:19
|
|
parser: Allow recovery in xmlParseInNodeContext
Should fix #645.
|
|
862e9ce0
|
2023-12-13T14:53:44
|
|
malloc-fail: Fix use-of-uninitialized-value in xmlParseConditionalSections
Short-lived regression.
|
|
c2bbeed1
|
2023-12-12T23:51:32
|
|
io: Fix memory lifetime issue with input buffers
xmlParserInputBufferCreateMem must make a copy of the buffer.
This fixes a regression from 2.11 which could cause reads from freed
memory depending on the use case.
Undeprecate xmlParserInputBufferCreateStatic which can avoid copying
the whole buffer.
|
|
f19a9510
|
2023-12-10T17:50:22
|
|
parser: Report malloc failures
Fix many places where malloc failures aren't reported.
Make xmlErrMemory public. This is useful for custom external entity
loaders.
Introduce new API function xmlSwitchEncodingName.
Change how we store whether the parser is stopped. This used
to be signaled by setting ctxt->instate to XML_PARSER_EOF which was
misdesigned and error-prone. Set ctxt->disableSAX to 2 instead and
introduce a macro PARSER_STOPPED. Also stop removing parser inputs in
xmlHaltParser. This allows removing many checks of ctxt->instate.
Introduce xmlErrParser to handle errors if a parser context is
available.
|
|
7d446e97
|
2023-12-08T12:13:49
|
|
parser: Fix namespaces redefined from default attributes
This regressed in commit e0dd330b.
Also fixes a long-standing issue where namespaces from default
attributes weren't added if they match an existing namespace.
Fixes #643.
|
|
c011e760
|
2023-12-06T01:09:31
|
|
globals: Remove unused globals from thread storage
Setting these deprecated globals hasn't had an effect for a long time.
Make them constants. This reduces the size of per-thread storage from
~700 to ~250 bytes.
|
|
7f00273c
|
2023-12-01T19:21:17
|
|
parser: Fix invalid free in xmlParseBalancedChunkMemoryRecover
Set the dictionary for newDoc in xmlParseBalancedChunkMemoryRecover.
This is a long-standing bug which was masked by
- xmlParseBalancedChunkMemoryRecover changing the document of the root
  node. This is a really bad idea, resulting in a mismatch between
  ctxt->myDoc and ctxt->node->doc.
- SAX2.c preferring ctxt->node->doc over ctxt->myDoc until commit
  a31e1b06.
Fixes #641.
|
|
c7629c9e
|
2023-11-30T16:52:34
|
|
parser: Clarify documentation regarding xmlReadMemory buffer size
Fixes #638.
|
|
43b511fa
|
2023-11-26T14:31:39
|
|
parser: Make CRLF increment line number
Partial revert of cb927e85 fixing CRLFs not incrementing the line
number.
This requires reworking xmlParseQNameHashed. The original implementation
prompted the change to xmlCurrentChar which really shouldn't modify the
'cur' pointer as a side effect. But the NEXTL macro relies on this
behavior.
Ultimately, we should reintroduce the change to xmlCurrentChar and fix
the NEXTL macro. This will lead to single CRs incrementing the line
number as well which seems more consistent.
Fixes #628.
|
|
aca37d8c
|
2023-11-20T15:20:37
|
|
parser: Only enable SAX2 if there are SAX2 element handlers
This reverts part of commit 235b15a5 for backward compatibility and
adds some comments trying to clarify the whole mess.
Fixes #623.
|
|
529df196
|
2023-11-15T12:10:25
|
|
parser: Don't overwrite error state in xmlParseTextDecl
Fixes a null deref in xmlLoadEntityContent found by OSS-Fuzz.
|
|
70cc45b8
|
2023-11-05T00:49:40
|
|
parser: Improve attribute hash table
There's no need to grow the hash table dynamically. The size is known
which simplifies the implementation.
|
|
58598494
|
2023-11-04T23:47:33
|
|
parser: Fix combination of hash values
This bug resulted in a stuck bit in hash values which can have a severe
performance impact.
|
|
7a2d412f
|
2023-10-31T20:15:38
|
|
parser: Copy default namespace in xmlParseBalancedChunkMemory
|
|
e0c2f14d
|
2023-10-31T13:53:15
|
|
parser: Copy namespaces in xmlParseBalancedChunkMemory
Reenable copying of namespaces but don't set SAX data. This should
match the old behavior.
|