Log

Author Commit Date CI Message
Nick Wellnhofer a40f36e7 2025-05-14T04:04:28 include: Stop using *Ptr typedefs in public headers
Nick Wellnhofer 0da20b83 2025-05-14T04:20:07 autotools: Quote filenames in doc/Makefile.am
Nick Wellnhofer 2d83a84c 2025-05-14T00:29:19 doc: Misc improvements
Nick Wellnhofer 87087def 2025-05-13T16:19:42 tests: Remove result files committed by accident
Nick Wellnhofer d6151c23 2025-05-13T13:28:28 libxml2.doap: Remove inactive maintainer
Nick Wellnhofer af4fae5a 2025-05-13T12:05:15 html: Add some comments regarding HTML5 serialization It seems that the specification of the HTML output method in XSLT 1.0 had a lot of influence on how the HTML serializer in libxml2 ended up: https://www.w3.org/TR/xslt-10/#section-HTML-Output-Method There are two remaining behaviors suggested by XSLT 1.0 that don't match the HTML5 fragment serialization algorithm: We escape non-ASCII characters in URI attributes (the list of which is probably outdated). This was originally recommended in appendix B of the HTML 4.01 spec, but only for user agents: https://www.w3.org/TR/html401/appendix/notes.html#h-B.2.1 From my experience, any tool that processes HTML should escape as little as possible. For example, we used to escape many more characters which are invalid in URIs, but often used in template languages. (Note that we still escape whitespace and control chars.) Nevertheless, I guess that some libxslt users continue to expect this behavior from libxml2. Then we collapse Boolean attributes using an outdated list. This is mostly a cosmetic issue, but a somewhat important one for libxslt users. We probably need a serialization option for the xmlsave module that enables fully HTML5-conformant output.
Nick Wellnhofer b0234633 2025-05-13T20:19:39 encoding: Preserve original encoding label When using built-in encodings, the label would be normalized which causes various issues. We now create a copy of the handler with the original name. This is somewhat dangerous as it will require users to free built-in encodings with xmlCharEncCloseFunc. But to handle the general case, this was already required. Fixes #916 in another way than originally proposed.
Nick Wellnhofer fcb7a777 2025-05-13T22:38:15 io: Make xmlOutputBufferCreate* not free encoder on error Revert a530ff12 which was an inadvertent API change.
Nick Wellnhofer 5b71dca6 2025-05-12T21:39:54 Fix -Wunterminated-string-initialization warnings Don't use strings for table.
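For readers unfamiliar with this warning, here is a minimal sketch of the pattern the commit message describes. The table name `hexDigits` and the helper `byteToHex` are hypothetical and are not taken from the libxml2 sources. The idea is that a fixed-size `char` array initialized from a string literal of exactly the same length has no room for the terminating NUL, which newer compilers may flag. Spelling the table out as character constants avoids the string literal altogether.

```c
#include <stddef.h>

/*
 * A declaration like the following may trigger
 * -Wunterminated-string-initialization, because the literal has
 * 16 characters plus a NUL that does not fit into the array:
 *
 *     static const char hexDigits[16] = "0123456789ABCDEF";
 *
 * Hypothetical table, written without a string literal.
 */
static const char hexDigits[16] = {
    '0', '1', '2', '3', '4', '5', '6', '7',
    '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'
};

/* Write the two uppercase hex digits of `byte` into `out` (no NUL added). */
void
byteToHex(unsigned char byte, char out[2]) {
    out[0] = hexDigits[byte >> 4];
    out[1] = hexDigits[byte & 0x0F];
}
```

Calling `byteToHex(0xAF, buf)` stores `'A'` and `'F'` in `buf`.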
Nick Wellnhofer cdce17c3 2025-05-12T21:21:25 html: Only map HTML encodings from meta tag
Nick Wellnhofer 19b99311 2025-05-12T21:07:41 encoding: Fix -Wswitch warning
Nick Wellnhofer 39ae5d12 2025-05-12T21:04:41 save: Add NULL check in xmlBufDumpEntityContent Short-lived regression.
Nick Wellnhofer c2929b5d 2025-05-12T21:01:35 html: Ignore namespaces when handling meta tags Revert to old behavior to fix issues with XHTML documents.
Nick Wellnhofer 4df8d557 2025-05-12T17:31:14 io: Fix stack use after scope Short-lived regression.
Nick Wellnhofer f0983199 2025-05-12T13:00:20 html: Map some encodings according to HTML5 Windows-1252 is a superset of ISO-8859-1 and should be used instead. Same for ASCII. Also map UCS-2 and UTF-16 to UTF-16LE.
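The mapping described in this commit message can be sketched as a small lookup table. This is an illustration only: the function name `mapHtmlEncoding`, the helper `labelEquals` and the exact set of accepted spellings (for example `US-ASCII`) are assumptions, not the code in libxml2.

```c
#include <ctype.h>
#include <stddef.h>

/* ASCII case-insensitive comparison of two encoding labels. */
static int
labelEquals(const char *a, const char *b) {
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/*
 * Hypothetical helper: return the encoding an HTML parser would use
 * for `label`, following the mapping in the commit message. Labels
 * without a mapping are returned unchanged.
 */
const char *
mapHtmlEncoding(const char *label) {
    static const struct {
        const char *from;
        const char *to;
    } map[] = {
        { "ISO-8859-1", "windows-1252" },
        { "ASCII",      "windows-1252" },
        { "US-ASCII",   "windows-1252" },  /* assumed alias */
        { "UCS-2",      "UTF-16LE" },
        { "UTF-16",     "UTF-16LE" }
    };
    size_t i;

    for (i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
        if (labelEquals(label, map[i].from))
            return map[i].to;
    }
    return label;
}
```

With this sketch, `mapHtmlEncoding("iso-8859-1")` yields `windows-1252`, while a label such as `UTF-8` passes through untouched.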
Nick Wellnhofer 93f67106 2025-05-12T12:27:54 encoding: Add HTML5 aliases
Nick Wellnhofer 628006f4 2025-05-12T11:47:40 encoding: Add windows-1252 Fixes #915.
Nick Wellnhofer a7016bae 2025-05-12T02:40:36 tools: Remove unnecessary data from iso8859x.inc
Nick Wellnhofer c92374f1 2025-05-12T02:15:11 tools: Recreate script to generate iso8859x.inc The script to create these tables was never committed to version control.
Nick Wellnhofer f602c0c1 2025-05-12T00:04:22 html: Rework serialization of meta encoding attributes Don't allocate memory.
Nick Wellnhofer 7654c2ef 2025-05-11T23:37:38 html: Rework serialization of URIs Don't allocate memory.
Nick Wellnhofer bd777e4f 2025-05-11T22:18:31 html: Speed up htmlIsBooleanAttr This is used when serializing.
Nick Wellnhofer 825f3a9d 2025-05-11T21:38:16 html: Always serialize attributes with double quotes Align with HTML5.
Nick Wellnhofer 5c4cc456 2025-05-11T21:19:22 html: Escape encoding in meta tags
Nick Wellnhofer 0674ccb7 2025-05-11T20:55:57 html: Stop omitting end tags when serializing Align with HTML5.
Nick Wellnhofer 05b8fe0a 2025-04-12T23:10:40 html: Don't escape RAWTEXT and PLAINTEXT Align with HTML5.
Nick Wellnhofer 809ded58 2025-04-12T22:50:56 html: Add more empty elements Add empty HTML5 elements <bgsound>, <keygen>, <source>, <track> and <wbr>. Make <embed> an empty element.
Nick Wellnhofer cdaf657f 2025-05-09T23:02:32 html: Don't escape < and > when serializing attribute values Align with HTML5. This will break some test suites.
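To illustrate what the attribute-related commits in this log amount to for output, here is a rough sketch of attribute value escaping in which the value is always wrapped in double quotes by the caller and `<` and `>` are written as-is. The function `escapeAttrValue` is hypothetical. It escapes only `&` and `"`, and it leaves out anything else the real serializer does, such as no-break space handling and URI escaping.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy `value` into `out`, escaping only '&' and '"'.
 * '<' and '>' are copied unchanged.
 * Returns the number of bytes written (excluding the NUL), or -1 if
 * `out` is too small.
 */
int
escapeAttrValue(const char *value, char *out, size_t outSize) {
    size_t used = 0;

    for (; *value != '\0'; value++) {
        char single[2];
        const char *rep;
        size_t len;

        if (*value == '&') {
            rep = "&amp;";
        } else if (*value == '"') {
            rep = "&quot;";
        } else {
            single[0] = *value;
            single[1] = '\0';
            rep = single;
        }

        len = strlen(rep);
        /* Keep one byte in reserve for the terminating NUL. */
        if (used + len + 1 > outSize)
            return -1;
        memcpy(out + used, rep, len);
        used += len;
    }

    if (used + 1 > outSize)
        return -1;
    out[used] = '\0';
    return (int) used;
}
```

A caller would then emit `name="` followed by the escaped value and a closing `"`.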
Nick Wellnhofer e0e0a1f0 2025-05-09T22:44:54 html: Remove special handling of &{...} when serializing See https://www.w3.org/TR/html401/appendix/notes.html#h-B.7.1 Align with HTML5.
Nick Wellnhofer dad11630 2025-05-09T22:05:38 entities: Always replace invalid chars when escaping The previous refactor painstakingly recreated the different behavior of separate functions that were merged. It makes more sense to always replace invalid chars. Optimize IS_CHAR check for non-ASCII chars.
Nick Wellnhofer c8cea39d 2025-05-09T21:31:07 save: Fix serialization of attribute defaults containing &lt; Long-standing bug that produced invalid XML.
Nick Wellnhofer 971038e5 2025-05-09T20:26:33 html: Call lower-level escaping functions Removes the need to pass a document around.
Nick Wellnhofer 63535d39 2025-05-09T20:13:43 tree: Make xmlNodeListGetStringInternal work with escape flags
Nick Wellnhofer 442c1903 2025-05-09T18:52:36 doc: Fix some damage from automated conversions Add some newlines, fix returns.
Nick Wellnhofer 98a61c9d 2025-05-09T16:48:09 doc: Fix briefs in tree docs
Nick Wellnhofer 4b4bc15a 2025-05-09T16:24:35 doc: Misc fixes to buffer docs
Nick Wellnhofer ad390a5d 2025-05-09T15:34:53 parser: Set doc properties in endDocument SAX handler
Nick Wellnhofer c7c49643 2025-05-09T15:26:15 html: Move DTD creation to endDocument SAX callback
Nick Wellnhofer 46f05ea4 2025-05-09T00:21:47 html: Rework meta charset handling Don't use encoding from meta tags when serializing. Only use the value in `doc->encoding`, matching the XML serializer. This is the actual encoding used when parsing. Stop modifying the input document by setting meta tags before serializing. Meta tags are now injected during serialization. Add full support for <meta charset=""> which is also used when adding meta tags. Align with HTML5 and implement the "algorithm for extracting a character encoding from a meta element". Only modify the encoding substring in Content-Type meta tags. Only switch encoding once when parsing. Fix htmlSaveFileFormat with a NULL encoding not to declare a misleading UTF-8 charset. Fixes #909.
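The "algorithm for extracting a character encoding from a meta element" named in this commit message comes from the HTML standard. The following is a simplified sketch of it, written from the standard's description rather than from libxml2. The names `extractMetaCharset`, `isHtmlSpace` and `matchesCharset` are hypothetical. The sketch returns the raw label found in a Content-Type value and does not resolve it to an actual encoding.

```c
#include <stddef.h>
#include <string.h>

/* ASCII whitespace as defined by HTML: TAB, LF, FF, CR, SPACE. */
static int
isHtmlSpace(int c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\f' || c == '\r';
}

/* ASCII case-insensitive test whether `s` starts with "charset". */
static int
matchesCharset(const char *s) {
    static const char word[] = "charset";
    size_t i;

    for (i = 0; i < 7; i++) {
        int c = (unsigned char) s[i];

        if (c >= 'A' && c <= 'Z')
            c += 'a' - 'A';
        if (c != word[i])   /* also stops at the end of `s` */
            return 0;
    }
    return 1;
}

/*
 * Hypothetical helper: find the charset label in a Content-Type value.
 * Returns a pointer into `content` and stores the label length in
 * `*len`, or returns NULL if no label is found.
 */
const char *
extractMetaCharset(const char *content, size_t *len) {
    const char *p = content;

    while (*p != '\0') {
        const char *end;

        if (!matchesCharset(p)) {
            p++;
            continue;
        }
        p += 7;
        while (isHtmlSpace((unsigned char) *p))
            p++;
        if (*p != '=')
            continue;           /* resume the search at this character */
        p++;
        while (isHtmlSpace((unsigned char) *p))
            p++;

        if (*p == '"' || *p == '\'') {
            /* Quoted value: requires a matching closing quote. */
            end = strchr(p + 1, *p);
            if (end == NULL)
                return NULL;
            *len = (size_t) (end - (p + 1));
            return p + 1;
        }
        if (*p == '\0')
            return NULL;

        /* Unquoted value: ends at whitespace, ';' or end of string. */
        end = p;
        while (*end != '\0' && *end != ';' && !isHtmlSpace((unsigned char) *end))
            end++;
        *len = (size_t) (end - p);
        return p;
    }
    return NULL;
}
```

For the input `text/html; charset=utf-8`, the sketch returns a pointer to `utf-8` with a length of 5. Rewriting only that substring corresponds to the commit's note that only the encoding part of Content-Type meta tags is modified.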
Nick Wellnhofer 9aaa52fe 2025-05-08T22:49:20 tree: Make xmlNodeAddContent work with attributes
Nick Wellnhofer 655ac5f8 2025-05-07T16:35:09 html: Add comment regarding hack for XML documents
Nick Wellnhofer f3a080bc 2025-05-07T14:32:42 html: Ignore U+0000 in body text Align with HTML5. Fixes #908.
Nick Wellnhofer a1e83b24 2025-05-07T20:16:17 io: Fix negation of potentially unsigned value
Nick Wellnhofer b3854fe9 2025-05-07T20:20:31 reader: Fix null deref on malloc failure Short-lived regression from 177067ea.
Nick Wellnhofer 6684eb93 2025-05-07T20:13:59 fuzz: Fix out-of-tree build
Nick Wellnhofer 6bd380ce 2025-05-07T14:32:26 fuzz: Update README
Nick Wellnhofer 967df734 2025-05-07T13:03:11 malloc-fail: Handle malloc failure in xmlSchemaCopyValue Avoid null pointer dereference. Fixes #905.
Nick Wellnhofer 5f8ebc88 2025-05-10T00:56:18 save: Avoid xmlOutputBufferWriteQuotedString xmlOutputBufferWriteQuotedString should be reserved for things like system IDs.
Nick Wellnhofer 0d81d6f8 2025-05-10T00:52:22 html: Use xmlOutputBufferWrite if possible
Nick Wellnhofer 89fcfe3a 2025-05-10T00:14:05 html: Start to use xmlSerializeText Avoid temporary copy to speed up serialization.
Nick Wellnhofer 777e2adf 2025-05-09T23:53:03 io: Consolidate escaping code Use generated table approach of xmlSerializeText for xmlEscapeText. Move most code to xmlIO.c.
Pavel Kopylov 4ed71574 2025-05-09T11:58:01 python: fix use-after-free in functions xmlPythonFileReadRaw(), xmlPythonFileRead() with python2. Fixes #910.
Nick Wellnhofer 714decd6 2025-05-04T17:50:26 doc: Misc fixes to entities docs
Nick Wellnhofer f38f3e7b 2025-05-04T16:49:49 doc: Misc fixes to IO documentation
Nick Wellnhofer e6cfd049 2025-05-04T14:52:42 doc: Misc fixes to tree docs
Nick Wellnhofer 1bf44f09 2025-05-04T02:15:25 doc: Misc fixes to parser docs
Nick Wellnhofer b7274fb0 2025-05-03T16:34:02 doc: Misc fixes to HTML parser docs
Nick Wellnhofer 411f30ef 2025-05-03T16:21:15 doc: Don't document legacy HTML parser macros
Nick Wellnhofer 4a010875 2025-05-03T15:38:15 doc: Move parser option docs to enum
Nick Wellnhofer 0173fac7 2025-05-03T02:12:46 gitlab-ci: Only build documentation once per CMake platform
Nick Wellnhofer a449c5fd 2025-05-03T01:31:09 catalog: Deprecate some functions
Nick Wellnhofer 306b8bf2 2025-05-03T01:30:44 autotools: Remove -DSYSCONFDIR This is handled in config.h now.
Nick Wellnhofer 075283d4 2025-05-03T00:17:39 xlink: Deprecate remaining public function This was never finished.
Nick Wellnhofer 38ea8fa9 2025-05-06T18:31:45 doc: Fix varargs
Nick Wellnhofer 9bbffec5 2025-05-06T17:42:46 doc: Move brief to top, params to bottom of doc comments
Nick Wellnhofer 7bc7ae9d 2025-05-06T15:30:46 doc: Enable Doxygen autobrief
Nick Wellnhofer ab13fbfd 2025-05-06T14:06:43 doc: Misc fixes to error docs
Nick Wellnhofer b1685459 2025-05-06T12:50:52 doc: Misc fixes to xmlsave docs
Nick Wellnhofer 7d689fab 2025-05-06T10:54:46 doc: Fix doc installation with Autotools
Nick Wellnhofer 7b59e74c 2025-05-06T10:54:18 doc: Always use case sensitive filenames with Doxygen Avoid platform-specific behavior.
Nick Wellnhofer 298f70b3 2025-05-05T21:36:36 doc: Misc fixes to HTML tree docs
Nick Wellnhofer 18d20a68 2025-05-05T18:26:16 doc: More fine-grained redirects for old pages
Nick Wellnhofer 80b6429f 2025-05-04T19:13:24 doc: Misc fixes to encoding docs
Nick Wellnhofer 81ac2e27 2025-05-04T18:41:44 doc: Misc fixes to valid docs
Nick Wellnhofer 05d0f592 2025-05-06T19:47:00 python: Skip __xml thread-local accessors So we can remove conditional directives for Doxygen.
Nick Wellnhofer 9f496fdb 2025-05-03T14:29:27 xmllint: Return early on invalid args At this point, no memory was allocated and xmllintOom wasn't initialized. Return immediately on invalid args to avoid triggering false positive unreported OOM errors when fuzzing.
Nick Wellnhofer 488939b6 2025-05-02T23:05:35 gitlab-ci: Enable documentation in more tests
Nick Wellnhofer 8c032073 2025-05-02T23:04:48 doc: More Doxygen cleanup - Move Doxyfile into doc directory - Add files to EXTRA_DIST - Remove conversion script - Add docs to Meson summary
Nick Wellnhofer e9366ffb 2025-05-02T22:26:06 tests: Remove XSTC Python tests I think this has been ported to runsuite.c. Convert part of Makefile.am into a script to download the test suite.
Nick Wellnhofer e0c7a929 2025-05-02T21:03:05 doc: Add custom main page for API docs
Nick Wellnhofer c8d1b7ba 2025-05-02T20:32:57 gitlab-ci: Treat Doxygen warnings as error
Nick Wellnhofer 2c150e62 2025-05-02T20:18:34 doc: Formatting fixes
Nick Wellnhofer 08a282f9 2025-05-02T20:12:52 doc: Doxygen fixes for xmlversion.h
Nick Wellnhofer cb1635a6 2025-05-02T19:05:25 doc: Use @since command
Nick Wellnhofer e78e05c9 2025-05-02T17:32:51 doc: Fix autolinks to functions Unfortunately, autolinks in .c files aren't converted by Doxygen for some reason.
Nick Wellnhofer b76286de 2025-05-02T17:30:21 doc: Remove # character for autolinks
Nick Wellnhofer 4d1e82ce 2025-05-02T17:26:08 doc: Fix xmlTextWriter struct name
Nick Wellnhofer e6d6fa6f 2025-05-02T17:23:30 doc: Fix xmlsave format hint Don't recommend deprecated symbols.
Nick Wellnhofer f7c41287 2025-05-02T15:57:17 doc: Remove more comment block headers
Nick Wellnhofer 103f0203 2025-05-02T15:29:10 doc: Add project slug to redirects
Nick Wellnhofer a5898c2a 2025-05-02T15:08:19 doc: Add redirects for GitLab pages
Nick Wellnhofer 0ffa7dd8 2025-05-02T14:52:03 include: Add hyperlink to deprecation warnings Doxygen creates a nice "deprecated list" for us.
Nick Wellnhofer 18c446a5 2025-05-02T14:41:29 python: Remove libxml2-python-api.xml Should have been removed with commit ed850ec1.
Nick Wellnhofer 1eca6e34 2025-04-30T00:54:00 parser: Deprecate xmlClearParserCtxt
Nick Wellnhofer 76531cee 2025-04-29T01:00:19 doc: Remove libxml2-api.xml This huge file can finally be removed.
Nick Wellnhofer 321aa356 2025-04-28T21:42:08 python: Make generator.py use Doxygen XML
Nick Wellnhofer ed850ec1 2025-04-28T20:04:19 python: Merge libxml2-python-api.xml into generator.py
Nick Wellnhofer 97f3ec77 2025-04-28T19:05:38 test: Make gentest.py use Doxygen XML This adds Python code to look up the required feature macros for a symbol in tools/xmlmod.py.
Nick Wellnhofer bbe5827c 2025-04-28T17:21:05 doc: Build docs with Doxygen and xsltproc Build the documentation as part of the build process with support for all build systems. This adds a new configuration option --with-docs to build documentation. Required tools are Doxygen, xsltproc and the DocBook 4 XSLT stylesheets. Doxygen will also be required to build the Python bindings.
Nick Wellnhofer e525564f 2025-05-01T19:20:06 doc: Remove empty lines at start of block These lines were left over after automatic conversion.