Log

Author Commit Date CI Message
Nick Wellnhofer 3d71ab4f 2025-06-03T00:17:03 doc: Small fixes
Nick Wellnhofer 0ab5d7c5 2025-06-03T00:13:26 entities: Deprecate internal DTD-related functions
Nick Wellnhofer 347c2b2e 2025-06-02T23:26:19 valid: Deprecate a few functions and `xmllint --insert`
Nick Wellnhofer f6eec6f4 2025-06-02T22:45:19 globals: Remove lineNumbersDefaultValue from globals struct
Nick Wellnhofer ac6ae391 2025-06-02T14:33:15 valid: Readd more argument checks in xmlAddElementDecl Fix crashes with API fuzzer.
Nick Wellnhofer 0f8543e1 2025-06-02T14:19:01 parser: Fix error reporting in xmlSkipBlankCharsPEBalanced Short-lived regression.
Nick Wellnhofer 6a6a46f0 2025-05-28T16:02:41 doc: Fix autolink errors Fix links, remove links to internal functions.
Nick Wellnhofer 7bd8d1d9 2025-05-28T15:53:38 doc: Prefix autolinks with '#' Use `#func` instead of `func()` to ignore parameters and make all autolinks work.
Nick Wellnhofer 6e33d136 2025-05-28T14:57:37 error: Fix initGenericErrorDefaultFunc compatibility macro again Now it really should work as before.
Nick Wellnhofer 8baa5de1 2025-05-27T17:51:50 parser: Avoid integer overflow in xmlParseCharDataInternal `nbchar` could overflow with memory buffers larger than 2 GB, which some new APIs allow. This shouldn't affect memory safety. Limit the maximum number of bytes passed to the character callback to XML_MAX_ITEMS (1e9).
Nick Wellnhofer 35d04a08 2025-05-27T17:05:05 README: Set expectations straight Fixes #913.
Nick Wellnhofer ad346c9a 2025-05-27T12:53:17 tree: Fix integer overflow in xmlBuildQName This issue affects memory safety and might receive a CVE ID later. Fixes #926.
Nick Wellnhofer 77c583e0 2025-05-27T12:19:25 valid: Readd argument check in xmlAddElementDecl Fix crashes with API fuzzer.
Nick Wellnhofer 78454e30 2025-05-25T16:53:41 io: Remove xmlInputDefaultOpen Not necessary after removal of HTTP client.
Nick Wellnhofer b79cf474 2025-05-26T15:34:48 globals: Fix memory leak on Windows DLL unload Thread-local memory must be freed for the main thread as well after 4f08a1a2. Fixes #925.
Nick Wellnhofer ab06bfa1 2025-05-26T15:03:07 parser: Fix error return in xmlParseElementContentDecl Avoid internal error later in xmlValidBuildAContentModel after 2a60ca06c. Also avoids some unnecessary error messages.
Nick Wellnhofer 30cf6d09 2025-05-26T01:13:24 parser: Add XML_INPUT_USE_SYS_CATALOG Also clean up catalog resolution and add error handling using the global error. Don't try to look up the resolved URI a second time. Add some comments. Fix documentation.
Nick Wellnhofer 34bafa14 2025-05-25T20:56:40 parser: Use parser context as default in resource loader This allows access to the original context, for example when using modules like XInclude or schemas.
Nick Wellnhofer 997830a3 2025-05-16T22:07:39 tests: Always use xmlMalloc/xmlFree
Nick Wellnhofer 4aa7192f 2025-05-21T16:32:17 tests: Add dtor for xmlElementContent in testapi.c
Nick Wellnhofer fc1cabc8 2025-05-25T14:03:50 valid: Also raise duplicate ID error without validation support Whether an error is raised should not depend on config options.
Nick Wellnhofer dd1961e0 2025-05-20T16:37:18 valid: Skip more validity checks if not validating
Nick Wellnhofer 6c2bd975 2025-05-20T15:51:18 valid: Don't validate unused default attributes See erratum E9 of XML 1.0 Second Edition. See #120.
Nick Wellnhofer fca0860d 2025-05-19T21:17:39 tree: Deprecate public struct members related to DTDs Let's deprecate these members for now. If these are really used, they can be undeprecated later.
Nick Wellnhofer c136118d 2025-05-25T13:59:50 README: Mention CMake single/multi-config
Nick Wellnhofer 4dc44c83 2025-05-21T20:21:32 parser: Rework entity boundary check for element content Only use depth of input stack. This makes the input ID unused internally.
Nick Wellnhofer 74ea6b48 2025-05-21T17:44:27 parser: Start using input depth for entity boundary check Now that we make sure that PEs starting markup won't be popped implicitly, it's enough to check that no new entities are on the stack when checking boundaries.
Nick Wellnhofer 6f8ac953 2025-05-20T23:17:55 valid: Don't use element namespace for attributes This makes no sense. Unprefixed attributes never have a namespace.
Nick Wellnhofer 5ec83f77 2025-05-20T03:21:27 valid: Remove duplicate #FIXED check for namespaces Unlike the comment indicates, this is already checked.
Nick Wellnhofer 7c10fff2 2025-05-20T22:48:25 valid: Don't validate twice in xmlAddAttributeDecl This should only be done in xmlValidateAttributeDecl.
Nick Wellnhofer db65b2fc 2025-05-20T22:41:08 SAX1: Align handling of default attributes with SAX2 The SAX1 parser is legacy code, but it seems more maintainable to align it with SAX2.
Nick Wellnhofer e4cbc295 2025-05-20T21:57:01 parser: Check attribute normalization standalone constraint To fully implement "VC: Standalone Document Declaration", we have to check for normalization changes caused by non-CDATA attribute types declared externally. Fixes #119.
Nick Wellnhofer 682195c8 2025-05-20T22:00:57 parser: Fix "Proper Declaration/PE Nesting" validity constraint Now that we handle "WFC: PE Between Declarations" correctly, we can turn "Proper Declaration/PE Nesting" from a WFC into VC as specified. Fixes #118.
Nick Wellnhofer 74ff6c00 2025-05-20T22:00:29 error: Fix line number in entities Allow line numbers from more domains, see code above.
Nick Wellnhofer 2f3655c9 2025-05-20T19:40:06 parser: Pop PEs that start markup declarations explicitly We currently only handle "Validity constraint: Proper Declaration/PE Nesting", but we must detect "Well-formedness constraint: PE Between Declarations" separately: "The replacement text of a parameter entity reference in a DeclSep must match the production extSubsetDecl." PEs in DeclSeps are PEs that start with a full markup declaration (or another PE). These are handled in xmlParse{Internal|External}Subset. We set a flag on these PEs and don't close them implicitly in xmlSkipBlankCharsPE. This will make unterminated declarations in such PEs cause a parser error. The PEs are closed explicitly in xmlParse{Internal|External}Subset, the only location where they are allowed to end.
Nick Wellnhofer 2a60ca06 2025-05-20T16:50:32 valid: Don't check enum values Rely on the parser to pass valid arguments.
Dag-Erling Smørgrav 3ab040c2 2025-05-24T01:12:15 Fix unidiomatic use of vsnprintf(). * Don't terminate an already-terminated buffer. * Consistently use 1024-byte buffers. * While here, consistently use ap for a va_list.
Dag-Erling Smørgrav 8ea253b8 2025-05-24T01:00:25 Remove bogus casts. * Casting a string literal to `char *` and then immediately passing or assigning the result to a `const char *` makes no sense. * There is no need to cast `int` to `Py_ssize_t` as they have the same sign and the latter is at least as wide as the former.
Nick Wellnhofer 7c9b5535 2025-05-19T19:10:55 doc: Document unused error domains
Nick Wellnhofer 47aca2c6 2025-05-19T18:43:14 parser: Only check validity constraints when validating
Nick Wellnhofer 3a68d0b7 2025-05-19T18:59:51 SAX2: Handle xml:id errors separately
Nick Wellnhofer 172550d2 2025-05-18T17:45:11 parser: Only validate EnumerationTypes when requested This has quadratic behavior and is only a validity constraint.
Nick Wellnhofer 7008740a 2025-05-18T01:52:38 parser: Consolidate scanning of XML Names Use new productions by default. Fixes #194. Fixes #364. See #707.
Nick Wellnhofer 657254a8 2025-05-18T01:21:43 parser: Factor out xmlIsNameCharNew/Old
Nick Wellnhofer 315bd443 2025-05-17T18:59:52 meson: Switch to cfg_data.set10()
Nick Wellnhofer 4e5945fc 2025-05-17T14:41:28 cmake: Avoid overlinking with non-CMake libxml2-config.cmake Align libxml2-config.cmake generated by Autotools and Meson with the CMake version and only add dependencies to libraries when linking statically. Also set LIBXML_STATIC for static builds. Fixes #918.
Nick Wellnhofer faaa01b8 2025-05-17T12:20:32 cmake: Make iconv a private dependency This was only needed for the headers before 2.14.
Nick Wellnhofer 70e5d664 2025-05-17T01:30:41 doc: Don't document deprecated headers
Nick Wellnhofer 7c82391c 2025-05-17T01:01:03 codegen: Factor out code to generate range tables
Nick Wellnhofer 502c5f65 2025-05-17T00:11:03 meson: Dependency on directory doesn't work
Nick Wellnhofer 210f5a37 2025-05-16T21:18:16 chvalid: Mark functions as deprecated
Nick Wellnhofer 954aae90 2025-05-16T21:13:17 doc: Improve regexp documentation
Nick Wellnhofer cbad60ff 2025-05-16T18:31:16 xmllint: Remove unused macros
Nick Wellnhofer 2132150d 2025-05-16T18:27:00 xmllint: Switch to xmlCtxtGetDocument
Nick Wellnhofer c5b45fbc 2025-05-16T16:54:09 doc: Misc fixes
Nick Wellnhofer c4926b19 2025-05-16T02:12:23 codegen: Merge xmlunicode.c into xmlregexp.c Include generated parts. Generate xmlChRangeGroups instead of functions for Unicode blocks.
Nick Wellnhofer 4cb767e9 2025-05-16T01:52:44 codegen: Only generate tables for character ranges The rest can be easily maintained manually.
Nick Wellnhofer 258d8706 2025-05-15T17:49:49 codegen: Consolidate tools for code generation Move tools, source files and output tables into codegen directory. Rename some files. Adjust tools to match modified files. Remove generation date and source files from output. Distribute all tools and sources.
Nick Wellnhofer 0d34d690 2025-05-15T17:11:33 README: Update configuration options Python is disabled by default now. Mention --prefix.
Nick Wellnhofer adfbeb7e 2025-05-14T04:58:21 doc: Stop using *Ptr typedefs in documentation
Nick Wellnhofer a40f36e7 2025-05-14T04:04:28 include: Stop using *Ptr typedefs in public headers
Nick Wellnhofer 0da20b83 2025-05-14T04:20:07 autotools: Quote filenames in doc/Makefile.am
Nick Wellnhofer 2d83a84c 2025-05-14T00:29:19 doc: Misc improvements
Nick Wellnhofer 770c6dec 2025-05-16T01:19:19 buf: Remove ABI compatibility hack I think this was required when some struct members like xmlParserInputBuffer::buffer were changed from xmlBuffer to xmlBuf (20+ years ago). Unfortunately, I missed the opportunity to align xmlBuffer with xmlBuf before the ABI break.
Nick Wellnhofer 344190db 2025-05-16T00:54:51 doc: Document deprecated xmlThrDef* functions
Nick Wellnhofer 6f4b4527 2025-05-15T23:43:32 parser: Stop using ctxt->linenumbers I think this was used to avoid setting the `line` member before it was added (20+ years ago).
Nick Wellnhofer 5ce48ec1 2025-05-15T22:51:54 SAX2: Rework xmlSAX2Text Simplify and make more readable.
Nick Wellnhofer d834437b 2025-05-15T19:12:25 python: Add deprecation warning
Nick Wellnhofer a05fa9a9 2025-05-15T18:41:35 codegen: Rerun codegen scripts
Nick Wellnhofer 87087def 2025-05-13T16:19:42 tests: Remove result files committed by accident
Nick Wellnhofer d6151c23 2025-05-13T13:28:28 libxml2.doap: Remove inactive maintainer
Nick Wellnhofer af4fae5a 2025-05-13T12:05:15 html: Add some comments regarding HTML5 serialization It seems that the specification of the HTML output method in XSLT 1.0 had a lot of influence on how the HTML serializer in libxml2 ended up: https://www.w3.org/TR/xslt-10/#section-HTML-Output-Method There are two remaining behaviors suggested by XSLT 1.0 that don't match the HTML5 fragment serialization algorithm: We escape non-ASCII characters in URI attributes (the list of which is probably outdated). This was originally recommended in appendix B of the HTML 4.01 spec, but only for user agents: https://www.w3.org/TR/html401/appendix/notes.html#h-B.2.1 From my experience, any tool that processes HTML should escape as little as possible. For example, we used to escape many more characters which are invalid in URIs, but often used in template languages. (Note that we still escape whitespace and control chars.) Nevertheless, I guess that some libxslt users continue to expect this behavior from libxml2. Then we collapse Boolean attributes using an outdated list. This is mostly a cosmetic issue, but a somewhat important one for libxslt users. We probably need a serialization option for the xmlsave module that enables fully HTML5-conformant output.
Nick Wellnhofer b0234633 2025-05-13T20:19:39 encoding: Preserve original encoding label When using built-in encodings, the label would be normalized which causes various issues. We now create a copy of the handler with the original name. This is somewhat dangerous as it will require users to free built-in encodings with xmlCharEncCloseFunc. But to handle the general case, this was already required. Fixes #916 in another way than originally proposed.
Nick Wellnhofer fcb7a777 2025-05-13T22:38:15 io: Make xmlOutputBufferCreate* not free encoder on error Revert a530ff12 which was an inadvertent API change.
Nick Wellnhofer 5b71dca6 2025-05-12T21:39:54 Fix -Wunterminated-string-initialization warnings Don't use strings for table.
Nick Wellnhofer cdce17c3 2025-05-12T21:21:25 html: Only map HTML encodings from meta tag
Nick Wellnhofer 19b99311 2025-05-12T21:07:41 encoding: Fix -Wswitch warning
Nick Wellnhofer 39ae5d12 2025-05-12T21:04:41 save: Add NULL check in xmlBufDumpEntityContent Short-lived regression.
Nick Wellnhofer c2929b5d 2025-05-12T21:01:35 html: Ignore namespaces when handling meta tags Revert to old behavior to fix issues with XHTML documents.
Nick Wellnhofer 4df8d557 2025-05-12T17:31:14 io: Fix stack use after scope Short-lived regression.
Nick Wellnhofer f0983199 2025-05-12T13:00:20 html: Map some encodings according to HTML5 Windows-1252 is a superset of ISO-8859-1 and should be used instead. Same for ASCII. Also map UCS-2 and UTF-16 to UTF-16LE.
Nick Wellnhofer 93f67106 2025-05-12T12:27:54 encoding: Add HTML5 aliases
Nick Wellnhofer 628006f4 2025-05-12T11:47:40 encoding: Add windows-1252 Fixes #915.
Nick Wellnhofer a7016bae 2025-05-12T02:40:36 tools: Remove unnecessary data from iso8859x.inc
Nick Wellnhofer c92374f1 2025-05-12T02:15:11 tools: Recreate script to generate iso8859x.inc The script to create these tables was never committed to version control.
Nick Wellnhofer f602c0c1 2025-05-12T00:04:22 html: Rework serialization of meta encoding attributes Don't allocate memory.
Nick Wellnhofer 7654c2ef 2025-05-11T23:37:38 html: Rework serialization of URIs Don't allocate memory.
Nick Wellnhofer bd777e4f 2025-05-11T22:18:31 html: Speed up htmlIsBooleanAttr This is used when serializing.
Nick Wellnhofer 825f3a9d 2025-05-11T21:38:16 html: Always serialize attributes with double quotes Align with HTML5.
Nick Wellnhofer 5c4cc456 2025-05-11T21:19:22 html: Escape encoding in meta tags
Nick Wellnhofer 0674ccb7 2025-05-11T20:55:57 html: Stop omitting end tags when serializing Align with HTML5.
Nick Wellnhofer 05b8fe0a 2025-04-12T23:10:40 html: Don't escape RAWTEXT and PLAINTEXT Align with HTML5.
Nick Wellnhofer 809ded58 2025-04-12T22:50:56 html: Add more empty elements Add empty HTML5 elements <bgsound>, <keygen>, <source>, <track> and <wbr>. Make <embed> an empty element.
Nick Wellnhofer 5f8ebc88 2025-05-10T00:56:18 save: Avoid xmlOutputBufferWriteQuotedString xmlOutputBufferWriteQuotedString should be reserved for things like system IDs.
Nick Wellnhofer 0d81d6f8 2025-05-10T00:52:22 html: Use xmlOutputBufferWrite if possible
Nick Wellnhofer 89fcfe3a 2025-05-10T00:14:05 html: Start to use xmlSerializeText Avoid temporary copy to speed up serialization.
Nick Wellnhofer 777e2adf 2025-05-09T23:53:03 io: Consolidate escaping code Use generated table approach of xmlSerializeText for xmlEscapeText. Move most code to xmlIO.c.
Nick Wellnhofer cdaf657f 2025-05-09T23:02:32 html: Don't escape < and > when serializing attribute values Align with HTML5. This will break some test suites.
Nick Wellnhofer e0e0a1f0 2025-05-09T22:44:54 html: Remove special handling of &{...} when serializing See https://www.w3.org/TR/html401/appendix/notes.html#h-B.7.1 Align with HTML5.
Nick Wellnhofer dad11630 2025-05-09T22:05:38 entities: Always replace invalid chars when escaping The previous refactor painstakingly recreated the different behavior of separate functions that were merged. It makes more sense to always replace invalid chars. Optimize IS_CHAR check for non-ASCII chars.
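
Note on 3ab040c2 (idiomatic vsnprintf): the three points in that commit message can be illustrated with a small standalone sketch. This is not code from the repository. The names `formatMessage` and `MSG_BUF_SIZE` are hypothetical; only the 1024-byte buffer size, the `ap` naming, and the "don't terminate an already-terminated buffer" rule come from the message.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical name; the commit only says "1024-byte buffers". */
#define MSG_BUF_SIZE 1024

/*
 * Format a message and copy it into out (capacity outSize).
 * Returns the number of bytes stored, excluding the terminating NUL.
 */
static size_t
formatMessage(char *out, size_t outSize, const char *fmt, ...) {
    char buf[MSG_BUF_SIZE];
    va_list ap;               /* consistently named "ap" */
    size_t len;
    int res;

    assert(out != NULL && outSize > 0 && fmt != NULL);

    va_start(ap, fmt);
    /*
     * With a non-zero size, vsnprintf always NUL-terminates its output,
     * so writing buf[sizeof(buf) - 1] = 0 afterwards is redundant.
     */
    res = vsnprintf(buf, sizeof(buf), fmt, ap);
    va_end(ap);

    if (res < 0) {            /* output error: return an empty string */
        out[0] = '\0';
        return 0;
    }

    len = strlen(buf);        /* at most sizeof(buf) - 1, even if truncated */
    if (len >= outSize)
        len = outSize - 1;
    memcpy(out, buf, len);
    out[len] = '\0';
    return len;
}
```

A call such as `formatMessage(out, sizeof(out), "%s:%d", "line", 42)` stores `line:42` in `out`. Longer output is truncated without overrunning either buffer.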
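
Note on ad346c9a (integer overflow in xmlBuildQName): the log does not show the actual fix, so the following is only a generic sketch of the usual guard for this class of bug, namely checking the operands before adding them instead of inspecting the sum afterwards. The helper `qnameBufferSize` is hypothetical and is not a libxml2 function. It assumes the result must fit in an `int`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Compute the buffer size for "prefix:local" plus the terminating NUL.
 * Returns 0 and stores the size on success, or -1 if the size would
 * not fit in an int. Hypothetical helper, not part of libxml2.
 */
static int
qnameBufferSize(size_t prefixLen, size_t localLen, int *size) {
    assert(size != NULL);

    /*
     * We need prefixLen + localLen + 2 <= INT_MAX. Rearranged so that
     * no intermediate expression can wrap around.
     */
    if (prefixLen > (size_t) INT_MAX - 2 ||
        localLen > (size_t) INT_MAX - 2 - prefixLen)
        return -1;

    *size = (int) (prefixLen + localLen + 2);
    return 0;
}
```

The comparison is done in `size_t` against `INT_MAX - 2`, so the addition only runs once it is known to be safe.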
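
Note on f0983199 (HTML5 encoding mapping): the mapping described in that commit message can be pictured as a small lookup table. The sketch below is an assumption-laden illustration, not the library's implementation. The function names are hypothetical, the label spellings are taken from the commit message, and matching is assumed to be ASCII case-insensitive.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* ASCII case-insensitive string equality. */
static int
asciiCaseEqual(const char *a, const char *b) {
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;          /* equal only if both strings ended */
}

/*
 * Map an encoding label to the encoding the commit message says should
 * be used for HTML. Unknown labels are returned unchanged.
 * Hypothetical helper, not part of libxml2.
 */
static const char *
mapHtmlEncoding(const char *label) {
    static const struct {
        const char *from;
        const char *to;
    } map[] = {
        { "ISO-8859-1", "windows-1252" },   /* superset of ISO-8859-1 */
        { "ASCII",      "windows-1252" },
        { "UCS-2",      "UTF-16LE" },
        { "UTF-16",     "UTF-16LE" },
    };
    size_t i;

    assert(label != NULL);

    for (i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
        if (asciiCaseEqual(label, map[i].from))
            return map[i].to;
    }
    return label;
}
```

With this table, `mapHtmlEncoding("iso-8859-1")` yields `windows-1252`, while a label such as `UTF-8` passes through unchanged.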