include/libxml/parser.h


Log

Author Commit Date CI Message
Nick Wellnhofer 84369160 2025-07-27T12:55:11 doc: Add another warning to XML_PARSE_DTDVALID While most parts of libxml2, including the parser, are still vulnerable to such attacks, it is unlikely that DTD validation will ever be fixed.
Nick Wellnhofer 8689523a 2025-07-22T23:57:03 parser: Implement xmlCtxtGetInputWindow See #762.
Nick Wellnhofer 6c018854 2025-07-23T02:15:40 io: Deprecate xmlParserInputBuffer members
Nick Wellnhofer 8aaa53d7 2025-07-22T22:38:50 parser: Implement xmlCtxtGetInputPosition See #762.
Nick Wellnhofer a7fc9e1a 2025-07-22T20:50:13 parser: Add more parser context accessors The only thing remaining is access to parser input, see #762.
Nick Wellnhofer c7a9ef1d 2025-07-04T16:20:28 doc: Document struct typedefs Unfortunately, Doxygen's TYPDEF_HIDES_STRUCT option is too broken. Document struct typedefs to make autolinks work.
Nick Wellnhofer 7c913850 2025-06-22T20:12:48 parser: Remove unnecessary dict checks when freeing strings The following strings are never allocated from a dict: - xmlParserCtxt.version - xmlParserCtxt.encoding - xmlParserCtxt.extSubURI - xmlParserCtxt.extSubSystem - xmlDoc.version - xmlDoc.encoding - xmlDoc.URL - xmlDTD.ExternalID - xmlDTD.SystemID - xmlID.value Also make the struct members point to non-const chars to avoid casts when freeing.
Nick Wellnhofer e7802738 2025-06-22T14:39:28 parser: Don't load external content if only XML_SKIP_IDS is set At some point, the `loadsubset` member was augmented to also control handling of ID attributes in addition to loading of external DTDs. These two features are unrelated and shouldn't have been mixed. This mistake was probably inspired by the misnamed XML_DETECT_IDS flag. As a side effect, setting XML_SKIP_IDS always enabled loading of external DTDs and parameter entities. This change makes it possible to ignore IDs without loading external content. This is a deliberate API change that improves security and is unlikely to affect users. This also makes sure that the new XML_PARSE_SKIP_IDS option doesn't enable unsafe behavior.
Nick Wellnhofer a4d25b3d 2025-06-18T16:00:57 doc: Small fixes
Michael Mann cf4f9672 2025-06-21T11:16:39 Add XML_PARSE_SKIP_IDS to replace XML_SKIP_IDS Mark loadset member as deprecated Fixes #873
Nick Wellnhofer 7bd8d1d9 2025-05-28T15:53:38 doc: Prefix autolinks with '#' Use `#func` instead of `func()` to ignore parameters and make all autolinks work.
Nick Wellnhofer 30cf6d09 2025-05-26T01:13:24 parser: Add XML_INPUT_USE_SYS_CATALOG Also clean up catalog resolution and add error handling using the global error. Don't try to look up the resolved URI a second time. Add some comments. Fix documentation.
Nick Wellnhofer 4dc44c83 2025-05-21T20:21:32 parser: Rework entity boundary check for element content Only use depth of input stack. This makes the input ID unused internally.
Nick Wellnhofer c5b45fbc 2025-05-16T16:54:09 doc: Misc fixes
Nick Wellnhofer 6f4b4527 2025-05-15T23:43:32 parser: Stop using ctxt->linenumbers I think this was used to avoid setting the `line` member before it was added (20+ years ago).
Nick Wellnhofer a40f36e7 2025-05-14T04:04:28 include: Stop using *Ptr typedefs in public headers
Nick Wellnhofer 2d83a84c 2025-05-14T00:29:19 doc: Misc improvements
Nick Wellnhofer 38ea8fa9 2025-05-06T18:31:45 doc: Fix varargs
Nick Wellnhofer 9bbffec5 2025-05-06T17:42:46 doc: Move brief to top, params to bottom of doc comments
Nick Wellnhofer e6cfd049 2025-05-04T14:52:42 doc: Misc fixes to tree docs
Nick Wellnhofer 1bf44f09 2025-05-04T02:15:25 doc: Misc fixes to parser docs
Nick Wellnhofer 4a010875 2025-05-03T15:38:15 doc: Move parser option docs to enum
Nick Wellnhofer f7c41287 2025-05-02T15:57:17 doc: Remove more comment block headers
Nick Wellnhofer 1eca6e34 2025-04-30T00:54:00 parser: Deprecate xmlClearParserCtxt
Nick Wellnhofer fd6ab89b 2025-04-28T15:58:19 doc: Adjust documentation of public structs
Nick Wellnhofer 8816f267 2025-04-28T14:55:47 doc: Adjust documentation of enums
Nick Wellnhofer e549622b 2025-04-28T15:11:24 doc: Convert documentation to Doxygen Automated conversion based on a few regexes.
Nick Wellnhofer 61890e39 2025-04-27T21:50:15 doc: Prepare for conversion to Doxygen Fix many params in internal functions (not really necessary but Doxygen warns about that in XML mode). Fix formatting in a few corner cases that automatic conversion can't handle. Rearrange some DOC_DISABLE blocks.
Nick Wellnhofer fc8899d4 2025-04-27T12:59:41 parser: Make xmlCtxtGetValidCtxt depend on VALID_ENABLED
Nick Wellnhofer aa4ef773 2025-04-17T19:53:14 parser: Deprecate output-related globals
Nick Wellnhofer b3492259 2025-03-14T00:01:11 include: Change some return types from int to enum This also affects some new functions from 2.13.
Nick Wellnhofer fd1b9391 2025-03-13T23:20:16 include: Convert some macros to enums
Nick Wellnhofer 03a8f1dd 2025-03-11T18:53:24 doc: Document SAX handlers a little more
Nick Wellnhofer ba9148d8 2025-03-09T20:30:49 parser: Undeprecate input->consumed Should be deprecated after fixing #762.
Nick Wellnhofer a0dbf030 2025-03-09T20:24:06 parser: Undeprecate ctxt->loadsubset Should be deprecated after fixing #873.
Nick Wellnhofer d96911f1 2025-03-08T23:00:29 doc: Documentation fixes
Nick Wellnhofer 5f0b1378 2025-03-08T22:07:15 parser: Add more parser context accessors Fixes #763.
Nick Wellnhofer 69657224 2025-03-04T20:32:02 globals: Remove unused globals - xmlBufferAllocScheme - xmlDefaultBufferSize - xmlParserDebugEntities
Nick Wellnhofer 3d37ff84 2025-03-04T15:10:09 globals: Also use global state struct if threads are disabled
Nick Wellnhofer a15ad9b2 2025-03-04T14:06:50 parser: Remove compatibility symbols
Nick Wellnhofer 8e871162 2025-03-04T13:36:55 parser: Remove oldXMLWDcompatibility
Nick Wellnhofer cdc5cfed 2025-03-04T13:26:51 legacy: Remove legacy symbols
Nick Wellnhofer e50d314a 2025-02-25T23:07:19 build: Add separate configuration option for RELAX NG Support for RELAX NG used to be enabled together with XML Schema support (--with-schemas). Now there's a separate option and a new feature macro LIBXML_RELAXNG_ENABLED.
Nick Wellnhofer 93506d41 2025-01-29T00:17:01 parser: Make catalog PIs opt-in This is an obscure feature that shouldn't be enabled by default.
Nick Wellnhofer 1082d813 2025-01-28T23:21:34 parser: Prepare to make decompression opt-in Add a new parser option XML_PARSE_UNZIP that enables decompression. xmlReadFile, xmlCtxtReadFile and xmlCreateURLParserCtxt always set this option currently, but downstream users should start to set the option if they really need it.
Nick Wellnhofer 0dc26910 2024-11-20T21:04:19 parser: Deprecate more internal functions
Nick Wellnhofer 5a51f085 2024-11-17T13:50:15 valid: Implement xmlCtxtValidateDocument This allows to use the error handler or resource loader of a parser context.
Nick Wellnhofer 7f8c436c 2024-11-15T16:30:52 parser: Implement xmlCtxtParseDtd and xmlCtxtValidateDtd This allows to use the context's error handler, options and other settings. Fixes #808.
Nick Wellnhofer eb66d03e 2024-07-07T23:15:54 io: Deprecate a few functions
Nick Wellnhofer 69f12d6d 2024-07-13T00:17:18 encoding: Deprecate xmlByteConsumed This was only used by Chromium/WebKit to detect whether xmlParseContent really succeeded. It's a horrible, overcomplicated hack. See 8c5848bd and #767.
Nick Wellnhofer 8af55c8d 2024-07-06T22:14:21 parser: Rename new input API functions These weren't made public yet.
Nick Wellnhofer 4f329dc5 2024-07-10T03:27:47 parser: Implement xmlCtxtParseContent This implements xmlCtxtParseContent, a better alternative to xmlParseInNodeContext or xmlParseBalancedChunkMemory. It accepts a parser context and a parser input, making it a lot more versatile. xmlParseInNodeContext is now implemented in terms of xmlCtxtParseContent. This makes sure that xmlParseInNodeContext never modifies the target document, improving thread safety. xmlParseInNodeContext is also more lenient now with regard to undeclared entities. Fixes #727.
Nick Wellnhofer 82e0455c 2024-07-06T19:48:07 Undeprecate some symbols for now - xmlKeepBlanksDefault is needed as a work-around for xmlParseBalancedChunk, see issue #727. - ctxt->options already has an accessor and will be deprecated later. - input->cur, input->base, input->end: See #762.
Nick Wellnhofer 205e56da 2024-07-02T22:32:43 parser: Undeprecate ctxt->directory
Nick Wellnhofer 606f4108 2024-07-02T20:57:15 parser: Allow to disable catalogs with parser options Implement XML_PARSE_NO_SYS_CATALOG and XML_PARSE_NO_CATALOG_PI. Fixes #735.
Nick Wellnhofer 221df375 2024-06-28T00:34:52 parser: Support custom charset conversion implementations Implement xmlCtxtSetCharEncConvImpl. I agree that the name is terrible.
Nick Wellnhofer 044ddf07 2024-06-28T03:14:12 parser: Undeprecate some parser context members
Nick Wellnhofer 193f4653 2024-06-26T19:28:28 parser: Implement xmlCtxtGetStatus This allows access to ctxt->wellFormed, ctxt->nsWellFormed and ctxt->valid. It also detects several fatal non-parser errors which really should be another error level.
Nick Wellnhofer cc0cc2d3 2024-06-26T04:32:49 parser: Add more parser context accessors
Nick Wellnhofer eca972e6 2024-06-26T02:22:04 parser: Add getters for XML declaration to parser context Access to struct members will be deprecated.
Nick Wellnhofer f9c33a55 2024-06-21T18:25:11 parser: Undeprecate some xmlParserInput members
Nick Wellnhofer 1228b4e0 2024-06-21T18:22:04 parser: Deprecate xmlParserCtxt->lastError We alredy have xmlCtxtGetLastError().
Nick Wellnhofer f82ca02b 2024-06-21T18:17:11 parser: Undeprecate some xmlParserCtxt members These are essential for SAX parsers.
Mike Dalessio bbbbbb46 2024-06-20T03:19:48 parser: implement xmlCtxtGetOptions In 712a31ab, the `options` struct member was deprecated. To allow callers to check the status of options bits, introduce xmlCtxtGetOptions.
Nick Wellnhofer 1112699c 2024-06-17T02:42:18 legacy: Remove most legacy functions from public headers Also remove warning messages.
Nick Wellnhofer 5fca9498 2024-06-16T19:56:08 doc: Hide internal macro
Nick Wellnhofer 387f0c78 2023-12-06T18:35:30 include: Readd circular dependency between tree.h and parser.h There are dozens of downstream projects that only include tree.h but use declarations from parser.h. This broke after the recent cleanup of circular dependencies. Make tree.h include parser.h again. This is a hack but doesn't change the include directory struture. This commit only made it into the 2.12 branch but wasn't applied to master, so the issue turned up in 2.13.0 again. Should fix #734.
Nick Wellnhofer 712a31ab 2024-06-10T23:06:13 parser: Deprecate most public struct members This will probably cause many warnings in downstream code abusing libxml2 internals, but we can always undeprecate some members later.
Nick Wellnhofer 52384043 2024-06-11T19:10:41 parser: Pass resource type to resource loader
Nick Wellnhofer 64ad2725 2024-06-11T03:51:43 parser: Introduce per-context resource loader
Nick Wellnhofer ff3b0919 2024-06-11T00:00:32 parser: Implement XML_PARSE_NO_UNZIP option
Nick Wellnhofer 5b1d7ff0 2024-05-20T22:51:44 parser: Remove redefinitions for legacy globals
Nick Wellnhofer 8961056f 2024-01-23T00:47:44 parser: Make experimental input API private This needs to be reworked.
Nick Wellnhofer 02cc5c36 2024-01-05T04:17:14 parser: Add XML_PARSE_NO_XXE parser option
Nick Wellnhofer 12f0bb94 2024-01-05T01:14:28 parser: Synchronize more options
Nick Wellnhofer 3efbe916 2024-01-05T00:11:29 parser: Mark 'token' member as unused in xmlParserCtxt
Nick Wellnhofer b82fd81d 2024-01-04T23:25:06 parser: Rework xmlCtxtParseDocument Make xmlCtxtParseDocument take a parser input which can be popped after parsing.
Nick Wellnhofer d7d300ba 2024-01-04T17:50:11 parser: Remove remnants of runtime debugging feature Apparently, this feature was remove long ago. Fixes #651.
Nick Wellnhofer 875bb084 2023-09-07T03:25:45 parser: Implement xmlCtxtSetOptions Surprisingly, some options can only be enabled with xmlCtxtUseOptions and it's impossible to unset them. Add a new API function xmlCtxtSetOptions which sets or clears all options. Finally document all parser options. Make sure to synchronize option bits and struct members.
Nick Wellnhofer 2b79f106 2023-12-29T21:07:04 parser: Simplify entity size accounting
Nick Wellnhofer 7e0bbbc1 2023-12-27T18:33:30 parser: New input API Provide a new set of functions to create xmlParserInputs. These can be used for the document entity or from external entity loaders. - Don't require xmlParserInputBuffer. - All functions take a base URI. - All functions take an encoding as string. - xmlNewInputURL also takes a public ID. - xmlNewInputMemory takes a size_t. - Optimization hints for memory buffers. Improve documentation. Only call xmlInitParser before allocating a new parser context. Call xmlCtxtUseOptions as early as possible.
Nick Wellnhofer a5dcf0f4 2023-12-26T03:27:23 parser: Mark more parser context members as unused
Nick Wellnhofer 6a9a88a1 2023-12-26T03:13:05 parser: Move progressive flag into input struct
Nick Wellnhofer d944a415 2023-12-26T02:10:35 parser: Fix in-parameter-entity and in-external-dtd checks Use in ctxt->input->entity instead of ctxt->inputNr to determine whether we are inside a parameter entity. Stop using ctxt->external to check whether we're in an external DTD. This is signaled by ctxt->inSubset == 2.
Nick Wellnhofer c1bddd4c 2023-12-23T01:09:17 parser: Mark 'length' member of xmlParserInput as unused
Nick Wellnhofer 955c177f 2023-12-23T00:58:36 parser: Stop using 'directory' struct member This was only used as a pointless fallback for URI resolution.
Nick Wellnhofer 54c70ed5 2023-12-18T19:31:29 parser: Improve error handling Introduce xmlCtxtSetErrorHandler allowing to set a structured error for a parser context. There already was the "serror" SAX handler but this always receives the parser context as argument. Start to use xmlRaiseMemoryError. Remove useless arguments from memory error functions. Rename xmlErrMemory to xmlCtxtErrMemory. Remove a few calls to xmlGenericError. Remove support for runtime entity debugging.
Nick Wellnhofer 5d2dbe79 2023-12-14T13:37:25 parser: Fix build --without-output Fixes #647
Nick Wellnhofer df0b540b 2023-12-07T14:40:13 include: Rename XML_EMPTY helper macro Avoid name clash with downstream projects.
Nick Wellnhofer a9738e31 2023-12-07T14:15:29 include: Move declaration of xmlInitGlobals Fix downstream build issues after reworking globals.h.
Nick Wellnhofer 9122ad0c 2023-12-06T19:56:50 include: Move globals from xmlsave.h to parser.h Fix downstream build issues after reworking globals.h.
Nick Wellnhofer c011e760 2023-12-06T01:09:31 globals: Remove unused globals from thread storage Setting these deprecated globals hasn't had an effect for a long time. Make them constants. This reduces the size of per-thread storage from ~700 to ~250 bytes.
Nick Wellnhofer ff6c3188 2023-11-23T15:22:59 include: Remove useless 'const' from function arguments
Nick Wellnhofer aca37d8c 2023-11-20T15:20:37 parser: Only enable SAX2 if there are SAX2 element handlers This reverts part of commit 235b15a5 for backward compatibility and adds some comments trying to clarify the whole mess. Fixes #623.
Nick Wellnhofer e0dd330b 2023-09-29T00:18:44 parser: Use hash tables to avoid quadratic behavior Use a hash table to lookup namespaces by prefix. The hash table stores an index into the namespace table. Auxiliary data for namespaces is stored in a separate array along the main namespace table. Use a hash table to verify attribute uniqueness. The hash table stores an index into the attribute table. Reuse hash value from the dictionary to avoid computing them twice. See #346.
Nick Wellnhofer 8c084ebd 2023-09-21T22:57:33 doc: Make apibuild.py happy
Nick Wellnhofer 72262030 2023-09-21T14:52:14 parser: Readd some includes to parser.h and xmlreader.h Fix backward compatibility.
Nick Wellnhofer da274bfa 2023-09-21T01:29:40 build: Fix build when certain modules are disabled
Nick Wellnhofer d6ba4033 2023-09-20T20:49:59 globals: Move remaining declarations to correct places globals.h is now deprecated. Sanity is restored.
Nick Wellnhofer 11a1839d 2023-09-20T17:54:48 globals: Move remaining globals back to correct header files This undoes a lot of damage.