|
38ea8fa9
|
2025-05-06T18:31:45
|
|
doc: Fix varargs
|
|
4a010875
|
2025-05-03T15:38:15
|
|
doc: Move parser option docs to enum
|
|
9bbffec5
|
2025-05-06T17:42:46
|
|
doc: Move brief to top, params to bottom of doc comments
|
|
e6cfd049
|
2025-05-04T14:52:42
|
|
doc: Misc fixes to tree docs
|
|
1bf44f09
|
2025-05-04T02:15:25
|
|
doc: Misc fixes to parser docs
|
|
f7c41287
|
2025-05-02T15:57:17
|
|
doc: Remove more comment block headers
|
|
1eca6e34
|
2025-04-30T00:54:00
|
|
parser: Deprecate xmlClearParserCtxt
|
|
fd6ab89b
|
2025-04-28T15:58:19
|
|
doc: Adjust documentation of public structs
|
|
8816f267
|
2025-04-28T14:55:47
|
|
doc: Adjust documentation of enums
|
|
e549622b
|
2025-04-28T15:11:24
|
|
doc: Convert documentation to Doxygen
Automated conversion based on a few regexes.
|
|
61890e39
|
2025-04-27T21:50:15
|
|
doc: Prepare for conversion to Doxygen
Fix many params in internal functions (not really necessary but Doxygen
warns about that in XML mode).
Fix formatting in a few corner cases that automatic conversion can't
handle.
Rearrange some DOC_DISABLE blocks.
|
|
fc8899d4
|
2025-04-27T12:59:41
|
|
parser: Make xmlCtxtGetValidCtxt depend on VALID_ENABLED
|
|
aa4ef773
|
2025-04-17T19:53:14
|
|
parser: Deprecate output-related globals
|
|
b3492259
|
2025-03-14T00:01:11
|
|
include: Change some return types from int to enum
This also affects some new functions from 2.13.
|
|
fd1b9391
|
2025-03-13T23:20:16
|
|
include: Convert some macros to enums
|
|
03a8f1dd
|
2025-03-11T18:53:24
|
|
doc: Document SAX handlers a little more
|
|
ba9148d8
|
2025-03-09T20:30:49
|
|
parser: Undeprecate input->consumed
Should be deprecated after fixing #762.
|
|
a0dbf030
|
2025-03-09T20:24:06
|
|
parser: Undeprecate ctxt->loadsubset
Should be deprecated after fixing #873.
|
|
d96911f1
|
2025-03-08T23:00:29
|
|
doc: Documentation fixes
|
|
5f0b1378
|
2025-03-08T22:07:15
|
|
parser: Add more parser context accessors
Fixes #763.
|
|
69657224
|
2025-03-04T20:32:02
|
|
globals: Remove unused globals
- xmlBufferAllocScheme
- xmlDefaultBufferSize
- xmlParserDebugEntities
|
|
3d37ff84
|
2025-03-04T15:10:09
|
|
globals: Also use global state struct if threads are disabled
|
|
a15ad9b2
|
2025-03-04T14:06:50
|
|
parser: Remove compatibility symbols
|
|
8e871162
|
2025-03-04T13:36:55
|
|
parser: Remove oldXMLWDcompatibility
|
|
cdc5cfed
|
2025-03-04T13:26:51
|
|
legacy: Remove legacy symbols
|
|
e50d314a
|
2025-02-25T23:07:19
|
|
build: Add separate configuration option for RELAX NG
Support for RELAX NG used to be enabled together with XML Schema support
(--with-schemas). Now there's a separate option and a new feature macro
LIBXML_RELAXNG_ENABLED.
|
|
93506d41
|
2025-01-29T00:17:01
|
|
parser: Make catalog PIs opt-in
This is an obscure feature that shouldn't be enabled by default.
|
|
1082d813
|
2025-01-28T23:21:34
|
|
parser: Prepare to make decompression opt-in
Add a new parser option XML_PARSE_UNZIP that enables decompression.
xmlReadFile, xmlCtxtReadFile and xmlCreateURLParserCtxt always set
this option currently, but downstream users should start to set the
option if they really need it.
|
|
0dc26910
|
2024-11-20T21:04:19
|
|
parser: Deprecate more internal functions
|
|
5a51f085
|
2024-11-17T13:50:15
|
|
valid: Implement xmlCtxtValidateDocument
This allows using the error handler or resource loader of a parser
context.
|
|
7f8c436c
|
2024-11-15T16:30:52
|
|
parser: Implement xmlCtxtParseDtd and xmlCtxtValidateDtd
This allows using the context's error handler, options and other
settings.
Fixes #808.
|
|
eb66d03e
|
2024-07-07T23:15:54
|
|
io: Deprecate a few functions
|
|
69f12d6d
|
2024-07-13T00:17:18
|
|
encoding: Deprecate xmlByteConsumed
This was only used by Chromium/WebKit to detect whether xmlParseContent
really succeeded. It's a horrible, overcomplicated hack.
See 8c5848bd and #767.
|
|
8af55c8d
|
2024-07-06T22:14:21
|
|
parser: Rename new input API functions
These weren't made public yet.
|
|
4f329dc5
|
2024-07-10T03:27:47
|
|
parser: Implement xmlCtxtParseContent
This implements xmlCtxtParseContent, a better alternative to
xmlParseInNodeContext or xmlParseBalancedChunkMemory. It accepts a
parser context and a parser input, making it a lot more versatile.
xmlParseInNodeContext is now implemented in terms of
xmlCtxtParseContent. This makes sure that xmlParseInNodeContext never
modifies the target document, improving thread safety.
xmlParseInNodeContext is also more lenient now with regard to undeclared
entities.
Fixes #727.
|
|
82e0455c
|
2024-07-06T19:48:07
|
|
Undeprecate some symbols for now
- xmlKeepBlanksDefault is needed as a work-around for
xmlParseBalancedChunk, see issue #727.
- ctxt->options already has an accessor and will be deprecated
later.
- input->cur, input->base, input->end: See #762.
|
|
205e56da
|
2024-07-02T22:32:43
|
|
parser: Undeprecate ctxt->directory
|
|
606f4108
|
2024-07-02T20:57:15
|
|
parser: Allow to disable catalogs with parser options
Implement XML_PARSE_NO_SYS_CATALOG and XML_PARSE_NO_CATALOG_PI.
Fixes #735.
|
|
221df375
|
2024-06-28T00:34:52
|
|
parser: Support custom charset conversion implementations
Implement xmlCtxtSetCharEncConvImpl. I agree that the name is terrible.
|
|
044ddf07
|
2024-06-28T03:14:12
|
|
parser: Undeprecate some parser context members
|
|
193f4653
|
2024-06-26T19:28:28
|
|
parser: Implement xmlCtxtGetStatus
This allows access to ctxt->wellFormed, ctxt->nsWellFormed and
ctxt->valid. It also detects several fatal non-parser errors which
really should be another error level.
|
|
cc0cc2d3
|
2024-06-26T04:32:49
|
|
parser: Add more parser context accessors
|
|
eca972e6
|
2024-06-26T02:22:04
|
|
parser: Add getters for XML declaration to parser context
Access to struct members will be deprecated.
|
|
f9c33a55
|
2024-06-21T18:25:11
|
|
parser: Undeprecate some xmlParserInput members
|
|
1228b4e0
|
2024-06-21T18:22:04
|
|
parser: Deprecate xmlParserCtxt->lastError
We already have xmlCtxtGetLastError().
|
|
f82ca02b
|
2024-06-21T18:17:11
|
|
parser: Undeprecate some xmlParserCtxt members
These are essential for SAX parsers.
|
|
bbbbbb46
|
2024-06-20T03:19:48
|
|
parser: implement xmlCtxtGetOptions
In 712a31ab, the `options` struct member was deprecated. To allow
callers to check the status of options bits, introduce
xmlCtxtGetOptions.
|
|
1112699c
|
2024-06-17T02:42:18
|
|
legacy: Remove most legacy functions from public headers
Also remove warning messages.
|
|
5fca9498
|
2024-06-16T19:56:08
|
|
doc: Hide internal macro
|
|
387f0c78
|
2023-12-06T18:35:30
|
|
include: Readd circular dependency between tree.h and parser.h
There are dozens of downstream projects that only include tree.h but use
declarations from parser.h. This broke after the recent cleanup of
circular dependencies.
Make tree.h include parser.h again. This is a hack but doesn't change
the include directory structure.
This commit only made it into the 2.12 branch but wasn't applied to
master, so the issue turned up in 2.13.0 again.
Should fix #734.
|
|
712a31ab
|
2024-06-10T23:06:13
|
|
parser: Deprecate most public struct members
This will probably cause many warnings in downstream code abusing
libxml2 internals, but we can always undeprecate some members later.
|
|
52384043
|
2024-06-11T19:10:41
|
|
parser: Pass resource type to resource loader
|
|
64ad2725
|
2024-06-11T03:51:43
|
|
parser: Introduce per-context resource loader
|
|
ff3b0919
|
2024-06-11T00:00:32
|
|
parser: Implement XML_PARSE_NO_UNZIP option
|
|
5b1d7ff0
|
2024-05-20T22:51:44
|
|
parser: Remove redefinitions for legacy globals
|
|
8961056f
|
2024-01-23T00:47:44
|
|
parser: Make experimental input API private
This needs to be reworked.
|
|
02cc5c36
|
2024-01-05T04:17:14
|
|
parser: Add XML_PARSE_NO_XXE parser option
|
|
12f0bb94
|
2024-01-05T01:14:28
|
|
parser: Synchronize more options
|
|
3efbe916
|
2024-01-05T00:11:29
|
|
parser: Mark 'token' member as unused in xmlParserCtxt
|
|
b82fd81d
|
2024-01-04T23:25:06
|
|
parser: Rework xmlCtxtParseDocument
Make xmlCtxtParseDocument take a parser input which can be popped after
parsing.
|
|
d7d300ba
|
2024-01-04T17:50:11
|
|
parser: Remove remnants of runtime debugging feature
Apparently, this feature was removed long ago.
Fixes #651.
|
|
875bb084
|
2023-09-07T03:25:45
|
|
parser: Implement xmlCtxtSetOptions
Surprisingly, some options can only be enabled with xmlCtxtUseOptions
and it's impossible to unset them. Add a new API function
xmlCtxtSetOptions which sets or clears all options.
Finally document all parser options.
Make sure to synchronize option bits and struct members.
|
|
2b79f106
|
2023-12-29T21:07:04
|
|
parser: Simplify entity size accounting
|
|
7e0bbbc1
|
2023-12-27T18:33:30
|
|
parser: New input API
Provide a new set of functions to create xmlParserInputs. These can be
used for the document entity or from external entity loaders.
- Don't require xmlParserInputBuffer.
- All functions take a base URI.
- All functions take an encoding as string.
- xmlNewInputURL also takes a public ID.
- xmlNewInputMemory takes a size_t.
- Optimization hints for memory buffers.
Improve documentation.
Only call xmlInitParser before allocating a new parser context.
Call xmlCtxtUseOptions as early as possible.
|
|
a5dcf0f4
|
2023-12-26T03:27:23
|
|
parser: Mark more parser context members as unused
|
|
6a9a88a1
|
2023-12-26T03:13:05
|
|
parser: Move progressive flag into input struct
|
|
d944a415
|
2023-12-26T02:10:35
|
|
parser: Fix in-parameter-entity and in-external-dtd checks
Use ctxt->input->entity instead of ctxt->inputNr to determine whether
we are inside a parameter entity.
Stop using ctxt->external to check whether we're in an external DTD.
This is signaled by ctxt->inSubset == 2.
|
|
c1bddd4c
|
2023-12-23T01:09:17
|
|
parser: Mark 'length' member of xmlParserInput as unused
|
|
955c177f
|
2023-12-23T00:58:36
|
|
parser: Stop using 'directory' struct member
This was only used as a pointless fallback for URI resolution.
|
|
54c70ed5
|
2023-12-18T19:31:29
|
|
parser: Improve error handling
Introduce xmlCtxtSetErrorHandler, which allows setting a structured error
handler for a parser context. There already was the "serror" SAX handler, but
it always receives the parser context as argument.
Start to use xmlRaiseMemoryError.
Remove useless arguments from memory error functions. Rename
xmlErrMemory to xmlCtxtErrMemory.
Remove a few calls to xmlGenericError.
Remove support for runtime entity debugging.
|
|
5d2dbe79
|
2023-12-14T13:37:25
|
|
parser: Fix build --without-output
Fixes #647
|
|
df0b540b
|
2023-12-07T14:40:13
|
|
include: Rename XML_EMPTY helper macro
Avoid name clash with downstream projects.
|
|
a9738e31
|
2023-12-07T14:15:29
|
|
include: Move declaration of xmlInitGlobals
Fix downstream build issues after reworking globals.h.
|
|
9122ad0c
|
2023-12-06T19:56:50
|
|
include: Move globals from xmlsave.h to parser.h
Fix downstream build issues after reworking globals.h.
|
|
c011e760
|
2023-12-06T01:09:31
|
|
globals: Remove unused globals from thread storage
Setting these deprecated globals hasn't had an effect for a long time.
Make them constants. This reduces the size of per-thread storage from
~700 to ~250 bytes.
|
|
ff6c3188
|
2023-11-23T15:22:59
|
|
include: Remove useless 'const' from function arguments
|
|
aca37d8c
|
2023-11-20T15:20:37
|
|
parser: Only enable SAX2 if there are SAX2 element handlers
This reverts part of commit 235b15a5 for backward compatibility and
adds some comments trying to clarify the whole mess.
Fixes #623.
|
|
e0dd330b
|
2023-09-29T00:18:44
|
|
parser: Use hash tables to avoid quadratic behavior
Use a hash table to lookup namespaces by prefix. The hash table stores
an index into the namespace table. Auxiliary data for namespaces is
stored in a separate array along the main namespace table.
Use a hash table to verify attribute uniqueness. The hash table stores
an index into the attribute table.
Reuse hash value from the dictionary to avoid computing them twice.
See #346.
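
The attribute uniqueness check described above can be sketched as follows.
This is a minimal illustration, not the libxml2 code: every name prefixed
with "sketch" is hypothetical, the table size is fixed for simplicity, and
the hash is computed locally instead of being reused from the dictionary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed size: a power of two larger than the attribute count. */
#define SKETCH_TABLE_SIZE 16

typedef struct {
    const char *name;
    unsigned hashValue;   /* cached so it is computed only once per attribute */
} sketchAttr;

/* Simple string hash (assumption: any reasonable hash would do here). */
static unsigned
sketchHash(const char *s) {
    unsigned h = 5381;

    while (*s != '\0')
        h = h * 33 + (unsigned char) *s++;
    return h;
}

/*
 * Returns the index of the first attribute that repeats an earlier name,
 * or -1 if all names are unique. The hash table stores indices into the
 * attribute array, so each lookup is expected constant time instead of a
 * linear scan over all previous attributes.
 */
static int
sketchFindDuplicate(sketchAttr *attrs, int nbAttrs) {
    int table[SKETCH_TABLE_SIZE];
    int i;

    assert(attrs != NULL && nbAttrs >= 0 && nbAttrs < SKETCH_TABLE_SIZE);

    for (i = 0; i < SKETCH_TABLE_SIZE; i++)
        table[i] = -1;    /* -1 marks an empty slot */

    for (i = 0; i < nbAttrs; i++) {
        unsigned slot;

        attrs[i].hashValue = sketchHash(attrs[i].name);
        slot = attrs[i].hashValue & (SKETCH_TABLE_SIZE - 1);

        /* Linear probing; terminates because the table is never full. */
        while (table[slot] >= 0) {
            const sketchAttr *other = &attrs[table[slot]];

            if (other->hashValue == attrs[i].hashValue &&
                strcmp(other->name, attrs[i].name) == 0)
                return i;
            slot = (slot + 1) & (SKETCH_TABLE_SIZE - 1);
        }
        table[slot] = i;
    }

    return -1;
}
```

Storing indices rather than pointers keeps the table valid if the attribute
array is reallocated, which is presumably why the message mentions indices.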
|
|
8c084ebd
|
2023-09-21T22:57:33
|
|
doc: Make apibuild.py happy
|
|
72262030
|
2023-09-21T14:52:14
|
|
parser: Readd some includes to parser.h and xmlreader.h
Fix backward compatibility.
|
|
da274bfa
|
2023-09-21T01:29:40
|
|
build: Fix build when certain modules are disabled
|
|
d6ba4033
|
2023-09-20T20:49:59
|
|
globals: Move remaining declarations to correct places
globals.h is now deprecated. Sanity is restored.
|
|
11a1839d
|
2023-09-20T17:54:48
|
|
globals: Move remaining globals back to correct header files
This undoes a lot of damage.
|
|
d1336fd3
|
2023-09-20T17:00:50
|
|
globals: Move malloc hooks back to xmlmemory.h
|
|
2e6c49a7
|
2023-09-20T14:43:14
|
|
globals: Don't store xmlParserVersion in global state
This is a constant.
|
|
db8b9722
|
2023-09-20T13:56:16
|
|
parser: Deprecate global parser options
Note that setting global options has no effect anyway when using any of
the modern parser API functions which take an option argument like
xmlReadMemory or when using xmlCtxtUseOptions.
Global options only have an effect when using old API functions
xmlParse* or xmlSAXParse* or when using an xmlParserCtxt without calling
xmlCtxtUseOptions.
Unfortunately, many downstream projects still modify global parser
options, often without realizing that it has no effect. If necessary,
switch to the modern API. Then you can safely remove all code that
changes global options.
Here's a list of deprecated functions and global variables together with
the corresponding parser options.
- xmlSubstituteEntitiesDefault, xmlSubstituteEntitiesDefaultValue
Parser option XML_PARSE_NOENT
- xmlKeepBlanksDefault, xmlKeepBlanksDefaultValue
Inverse of parser option XML_PARSE_NOBLANKS
- xmlPedanticParserDefault, xmlPedanticParserDefaultValue
Parser option XML_PARSE_PEDANTIC
- xmlLineNumbersDefault, xmlLineNumbersDefaultValue
Always enabled by new API
- xmlDoValidityCheckingDefaultValue
Parser option XML_PARSE_DTDVALID
- xmlGetWarningsDefaultValue
Inverse of parser option XML_PARSE_NOWARNING
- xmlLoadExtDtdDefaultValue
Parser options XML_PARSE_DTDLOAD and XML_PARSE_DTDATTR
|
|
ed3bd052
|
2023-08-20T20:48:10
|
|
parser: Allow to set maximum amplification factor
|
|
ec7be506
|
2023-08-08T15:19:46
|
|
parser: Rework encoding detection
Introduce XML_INPUT_HAS_ENCODING flag for xmlParserInput which is set
when xmlSwitchEncoding is called. The parser can use the flag to
reliably detect whether an encoding was already set via user override,
BOM or other auto-detection. In this case, the encoding declaration
won't be used to switch the encoding.
Before, an inscrutable mix of ctxt->charset, ctxt->input->encoding
and ctxt->input->buf->encoder was used.
Introduce private helper functions to switch encodings used by both the
XML and HTML parser:
- xmlDetectEncoding which skips over the BOM, making it possible to remove
the BOM checks from other encoding functions.
- xmlSetDeclaredEncoding, replacing htmlCheckEncodingDirect, which warns
about encoding mismatches.
If users override the encoding, store the declared instead of the actual
encoding in xmlDoc. In this case, the actual encoding is known and the
raw value from the doc is more useful.
Also use the input flags to store the ISO-8859-1 fallback state.
Restrict the fallback to cases where no encoding was specified. (The
fallback is only useful in recovery mode and these days broken UTF-8 is
probably more likely than ISO-8859-1, so it might eventually be removed
completely.)
The 'charset' member of xmlParserCtxt is now unused. The 'encoding'
member of xmlParserInput is now unused.
The 'standalone' member of xmlParserInput is renamed to 'flags'.
A new parser state XML_PARSER_XML_DECL is added for the push parser.
|
|
e7c3a4ca
|
2023-03-13T19:19:46
|
|
parser: Deprecate some parser input functions
|
|
59b33661
|
2022-12-27T14:15:51
|
|
error: Limit number of parser errors
Reporting errors is expensive and some abusive test cases can generate
an error for each invalid input byte. This causes the parser to spend
most of the time with error handling. Limit the number of errors and
warnings to 100.
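
A minimal sketch of such a cap, assuming separate counters for errors and
warnings (the message does not say whether they share one). All names are
hypothetical and not taken from libxml2.

```c
#include <assert.h>
#include <stddef.h>

/* Limit taken from the message above. */
#define SKETCH_MAX_ERRORS 100

typedef struct {
    int nbErrors;     /* errors reported so far */
    int nbWarnings;   /* warnings reported so far (assumed separate counter) */
} sketchErrorState;

/*
 * Returns 1 if the message should be reported, 0 if it should be dropped
 * because the limit was reached. Dropping early avoids the cost of
 * formatting and dispatching a message for every invalid input byte.
 */
static int
sketchShouldReport(sketchErrorState *st, int isWarning) {
    int *counter;

    assert(st != NULL);

    counter = isWarning ? &st->nbWarnings : &st->nbErrors;
    if (*counter >= SKETCH_MAX_ERRORS)
        return 0;

    *counter += 1;
    return 1;
}
```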
|
|
ce76ebfd
|
2022-12-19T20:56:23
|
|
entities: Stop counting entities
This was only used in the old version of xmlParserEntityCheck.
|
|
463bbeec
|
2022-12-19T18:39:45
|
|
entities: Rework entity amplification checks
This commit implements robust detection of entity amplification attacks,
better known as the "billion laughs" attack.
We now limit the size of the document after substitution of entities to
10 times the size before expansion. This guarantees linear behavior by
definition. There already was a similar check before, but the accounting
of "sizeentities" (size of external entities) and "sizeentcopy" (size of
all copies created by entity references) wasn't accurate.
We also need saturation arithmetic since we're historically limited to
"unsigned long" which is 32-bit on many platforms.
A maximum of 10 MB of substitutions is always allowed. This should make
use cases like DITA work which have caused problems in the past.
The old checks based on the number of entities were removed. This is
accounted for by adding a fixed cost to each entity reference.
Entity amplification checks are now enabled even if XML_PARSE_HUGE is
set. This option is mainly used to allow larger text nodes. Most users
were unaware that it also disabled entity expansion checks.
Some of the limits might be adjusted later. If this change turns out to
affect legitimate use cases, we can add a separate parser option to
disable the checks.
Fixes #294.
Fixes #345.
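
The accounting described above can be sketched roughly as follows. This is
an illustration under stated assumptions, not the actual implementation:
all names are hypothetical, the fixed per-reference cost is an invented
value, and "10 MB" is taken to mean 10^7 bytes.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Factor from the message above; the other two values are assumptions. */
#define SKETCH_MAX_AMPLIFICATION  10UL
#define SKETCH_ALLOWED_EXPANSION  (10UL * 1000UL * 1000UL)
#define SKETCH_ENTITY_FIXED_COST  20UL

/* Addition that clamps at ULONG_MAX instead of wrapping around. */
static unsigned long
sketchSaturatedAdd(unsigned long a, unsigned long b) {
    return (b > ULONG_MAX - a) ? ULONG_MAX : a + b;
}

typedef struct {
    unsigned long inputSize;      /* bytes of input before expansion */
    unsigned long expandedSize;   /* bytes produced by entity substitution */
} sketchAccounting;

/*
 * Account for one entity reference expanding to entitySize bytes.
 * Returns 0 if parsing may continue, -1 if the amplification limit is hit.
 */
static int
sketchAccountEntity(sketchAccounting *acc, unsigned long entitySize) {
    assert(acc != NULL);

    /* Each reference costs its size plus a fixed amount. */
    acc->expandedSize = sketchSaturatedAdd(acc->expandedSize, entitySize);
    acc->expandedSize = sketchSaturatedAdd(acc->expandedSize,
                                           SKETCH_ENTITY_FIXED_COST);

    /* A fixed amount of substitution is always allowed. */
    if (acc->expandedSize <= SKETCH_ALLOWED_EXPANSION)
        return 0;

    /*
     * Divide instead of multiplying so the comparison itself cannot
     * overflow. Integer division rounds down, which is slightly lenient.
     */
    if (acc->expandedSize / SKETCH_MAX_AMPLIFICATION <= acc->inputSize)
        return 0;

    return -1;
}
```

Saturating matters on platforms where "unsigned long" is 32 bits wide: a
wrapped counter would make a huge expansion look small again.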
|
|
ce9baf94
|
2022-12-08T02:48:27
|
|
Remove XMLCALL and XMLCDECL macros from public headers
|
|
68a6518c
|
2022-11-15T18:23:33
|
|
parser: Rewrite push parser boundary checks
Remove inaccurate xmlParseCheckTransition check.
Remove non-incremental xmlParseGetLasts check.
Add functions that check for several boundary constructs more
accurately, keeping track of progress in ctxt->checkIndex.
Fixes #439.
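
One way to picture an incremental boundary check: remember how far the
buffer was already scanned so that each new chunk only triggers a scan of
the new data. The sketch below is an assumption about the general
technique, with hypothetical names; only the idea of a stored check index
comes from the message above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    size_t checkIndex;   /* offset up to which the buffer was already scanned */
} sketchPushState;

/*
 * Look for the terminator "term" in buf[0..len). Returns the offset where
 * it starts, or -1 if it has not arrived yet. On failure, the scan position
 * is saved so the next call resumes there instead of starting over.
 */
static long
sketchLookupString(sketchPushState *st, const char *buf, size_t len,
                   const char *term) {
    size_t termLen;
    size_t i;

    assert(st != NULL && buf != NULL && term != NULL);
    termLen = strlen(term);
    assert(termLen > 0);

    for (i = st->checkIndex; i + termLen <= len; i++) {
        if (memcmp(buf + i, term, termLen) == 0) {
            st->checkIndex = 0;   /* reset for the next construct */
            return (long) i;
        }
    }

    /* i is the first offset where a match could still begin later. */
    st->checkIndex = i;
    return -1;
}
```

For example, with the data "<!-- comment -->" delivered in two chunks,
the first call only sees part of the comment and returns -1; the second
call resumes at the saved offset and finds "-->".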
|
|
65dc8a63
|
2022-09-01T00:13:19
|
|
Make xmlNewSAXParserCtxt take a const sax handler
Also improve documentation.
|
|
51035c53
|
2022-08-25T19:53:04
|
|
Generate deprecation warnings for old SAX API
|
|
9a82b94a
|
2022-08-24T04:21:58
|
|
Introduce xmlNewSAXParserCtxt and htmlNewSAXParserCtxt
Add API functions to create a parser context with a custom SAX handler
without having to mess with ctxt->sax manually.
|
|
4a8c71eb
|
2022-03-04T03:35:57
|
|
Remove DOCBparser
This code has been broken and deprecated since version 2.6.0, released
in 2003. Because of a bug in commit 961b535c, DOCBparser.c was never
compiled since 2012. I couldn't find a Debian package using any of its
symbols, so it seems safe to remove this module.
|
|
ebb17970
|
2022-03-04T02:31:59
|
|
Remove unneeded #includes
|
|
cf4893f7
|
2022-02-20T19:56:41
|
|
Deprecate legacy functions
|