Log

Author Commit Date CI Message
Nick Wellnhofer 9cd47487 2024-11-22T19:51:32 doc: Add example for ICU with xmlCtxtSetCharEncConvImpl See #819.
Nick Wellnhofer 2ef1beb3 2024-11-22T18:52:24 cmake: Fix compatibility in package version file See https://github.com/microsoft/vcpkg/issues/42315
Nick Wellnhofer de478472 2024-11-22T00:45:44 build: Remove unused variables
Nick Wellnhofer e9d941f9 2024-11-21T23:38:52 cmake: Only build required source files
Nick Wellnhofer 799104de 2024-11-21T23:38:22 build: Schema doesn't require XPath anymore
Nick Wellnhofer 1dc5e50a 2024-11-21T23:22:40 catalog: Only use XML_SYSCONFDIR if catalogs are enabled
Nick Wellnhofer 52afde07 2024-11-21T23:17:07 build: Only build xmlcatalog executable if enabled
Nick Wellnhofer a5764b56 2024-11-21T22:18:36 build: Define XML_SYSCONFDIR in config.h Rename SYSCONFDIR macro to XML_SYSCONFDIR. Use AX_RECURSIVE_EVAL with Autotools. This is GPL v2 with Autoconf exception which shouldn't be a problem. Finally support meson.
Nick Wellnhofer 0dc26910 2024-11-20T21:04:19 parser: Deprecate more internal functions
Nick Wellnhofer a227a71a 2024-11-20T17:03:11 regexp: Deprecate internal functions
Nick Wellnhofer 84a6eece 2024-11-18T20:40:47 parser: Remove unneeded call to xmlDetectEncoding
Nick Wellnhofer 497081ba 2024-11-17T20:25:07 parser: Remove remaining calls to xml{Push|Pop}Input
Nick Wellnhofer 0f4f8900 2024-11-17T20:13:14 parser: Rename inputPush to xmlCtxtPushInput
Nick Wellnhofer e2ad249c 2024-11-17T19:48:44 parser: Deprecate more internal symbols - xmlParseExternalSubset - xmlPushInput - xmlPopInput - xmlCopyCharMultiByte - xmlCreateEntityParserCtxt - xmlStringComment
Nick Wellnhofer 2fcdc5f7 2024-11-18T20:41:09 globals: More comments on future directions
Nick Wellnhofer 4d1f35b0 2024-11-17T19:45:16 valid: Deprecate more internal functions
Nick Wellnhofer de0c7791 2024-11-17T13:56:19 fuzz: Switch to xmlCtxtValidateDocument This allows to check malloc failure reports during post-validation.
Nick Wellnhofer 5a51f085 2024-11-17T13:50:15 valid: Implement xmlCtxtValidateDocument This allows to use the error handler or resource loader of a parser context.
Nick Wellnhofer 1e1731a4 2024-11-17T13:20:06 valid: Add NULL check in xmlCtxtValidateDtd
Nick Wellnhofer 631778f6 2024-11-17T12:11:41 parser: Check for malloc failure in xmlCtxtParseDtd
Nick Wellnhofer 7f8c436c 2024-11-15T16:30:52 parser: Implement xmlCtxtParseDtd and xmlCtxtValidateDtd This allows to use the context's error handler, options and other settings. Fixes #808.
Nick Wellnhofer 764b8086 2024-11-13T20:22:32 tests: Fix sanitizer version check on old Apple clang See #669.
Nick Wellnhofer b57e022d 2024-11-13T19:08:47 build: Check for icu-uc instead of icu-i18n This should be the ICU component we actually need.
Ruslan Garipov aaecdc92 2024-11-12T16:42:36 parser: Assign value without if-statement This avoids an if-statement, because effectively it does nothing. And, for example, binary artifact generated by GCC with -O2 optimization settings does not contain that if-statement -- the code just uses the hprefix->name field explicitly. No functional changes intended. Signed-off-by: Ruslan Garipov <ruslanngaripov@gmail.com>
Nick Wellnhofer 1e4d8c55 2024-11-06T16:42:05 xmlIO: Fix reading from non-regular files like pipes Commit 7e14c05d removed unnecessary copying of uncompressed input through zlib or xzlib. This broke input from non-regular files like pipes which can't be reopened. Try to detect such files by checking whether they're seekable and always pipe them through zlib or xzlib. Also remove seemingly unnecessary calls to gzread and gzrewind to support unseekable files. Fixes https://gitlab.gnome.org/GNOME/libxslt/-/issues/124.
Nick Wellnhofer 45914614 2024-11-05T12:05:14 xpath: Fix parsing of non-ASCII names Fix a long-standing issue where QNames starting with a non-ASCII character would be rejected. This became more visible after "streaming" XPath evaluation was disabled since the latter handled non-ASCII names correctly. Fixes #818.
Nick Wellnhofer 9201173c 2024-11-05T11:41:28 xmlreader: Fix return value of xmlTextReaderReadString Return NULL if the node has no children or the children were already deleted to match the 2.12 behavior. Fixes #817.
Nick Wellnhofer 869e3fd4 2024-11-01T16:52:31 parser: Fix loading of parameter entities in external DTDs Regressed with commit 12f0bb94. Fixes #816.
Nick Wellnhofer 36117723 2024-10-31T17:38:37 Update README
Nick Wellnhofer 467f4445 2024-10-30T14:03:39 SAX2: Add NULL check for ctxt->myDoc
Nick Wellnhofer efb57ddb 2024-10-30T14:02:36 parser: Fix downstream code that swaps DTDs Downstream code like the nginx xslt module can change the document's DTD pointers in a SAX callback. If an entity from a separate DTD is parsed lazily, its content must not reference the current document. Regressed with commit d025cfbb. Fixes #815.
Nick Wellnhofer 0ec5687e 2024-10-28T20:41:56 parser: Rework xmlCtxtGrowAttrs Remove unneeded argument. Check for integer overflow. We probably hit the buffer size limit in xmlParserGrow before, but better be safe.
Nick Wellnhofer ffb058f4 2024-10-28T20:12:52 parser: Fix detection of duplicate attributes We really need a second scan if more than one namespace clash was detected.
Nick Wellnhofer 89b9f457 2024-10-25T18:02:58 entities: Allow control chars when serializing HTML
Nick Wellnhofer b52a3044 2024-10-24T18:18:47 parser: Use counted_by attribute if supported We only have a single struct with a flexible array member.
Nick Wellnhofer 944e5fe8 2024-10-23T16:46:03 nanohttp: Fix another stdout file descriptor
Nick Wellnhofer 607ada90 2024-10-23T14:19:01 nanohttp: Fix stdout file descriptor Fixes #813.
Nick Wellnhofer b7c0f9d2 2024-10-19T14:26:39 string: Fix va_copy fallback Fix va_copy fallback reworked in 5cffba83. Should fix #812.
Nick Wellnhofer a870088f 2024-10-14T19:58:23 xpath: Hide internal sort functions
Yegor Yefremov 51394929 2024-10-15T11:11:38 python/tests: fix typos Typos were found with codespell.
Nick Wellnhofer f9a6469a 2024-10-14T16:14:55 Update NEWS
Satadru Pramanik c7b27866 2024-10-12T11:55:50 Avoid Python 'licence' distribution option is deprecated; use 'license' error
Nick Wellnhofer bf3619c3 2024-10-10T12:14:47 fuzz: Don't unlink DTD when replacing nodes OP_XML_REPLACE_NODE needs the same check as OP_XML_UNLINK_NODE.
Nick Wellnhofer 16de1346 2024-09-11T19:05:38 parser: Make new options actually work
Nick Wellnhofer 42c3823d 2024-09-11T19:05:09 html: Update comment
Nick Wellnhofer 9f04cce6 2024-09-11T17:43:07 html: Remove unused or useless return codes htmlParseStartTag should always succeed (except for malloc failures).
Nick Wellnhofer e179f3ec 2024-09-11T17:29:59 html: Stop reporting syntax errors It doesn't make much sense to keep the old syntax error handling which doesn't conform to HTML5. Handling HTML5 parser errors is rather involved and not essential for parsers.
Nick Wellnhofer a4c16a14 2024-09-27T23:49:02 xmllint: Improve --memory and --testIO options Support --memory and --testIO in SAX mode. Keep memory-mapped file across repetitions. Options `--sax --memory --noout --repeat` can now be used to benchmark the core parser without building a DOM tree or repeatedly reading files from disk.
Nick Wellnhofer 3ac214f0 2024-09-27T22:54:14 xmllint: Support --html --sax
Nick Wellnhofer 225ed707 2024-09-26T22:38:24 html: Accelerate htmlParseCharData
Nick Wellnhofer 74dfc49b 2024-09-26T21:24:00 parser: Clarify logic in xmlParseStartTag2
Nick Wellnhofer 20799979 2024-09-26T17:09:40 html: Handle numeric character references directly
Nick Wellnhofer 0bc4608c 2024-09-15T20:28:49 html: Use hash table to check for duplicate attributes
Nick Wellnhofer 24a6149f 2024-09-15T19:18:40 html: Make sure that character data mode is reset
Nick Wellnhofer c32397d5 2024-09-12T22:39:05 html: Improve character class macros
Nick Wellnhofer e8406554 2024-09-12T15:21:03 html: Rewrite parsing of most data
Nick Wellnhofer f77ec16d 2024-09-12T01:45:34 html: Optimize htmlParseCharData
Nick Wellnhofer 440bd64c 2024-09-12T04:01:38 html: Optimize htmlParseHTMLName
Nick Wellnhofer c34d0ae9 2024-09-12T23:50:20 html: Deprecate htmlIsBooleanAttr
Nick Wellnhofer 6040785a 2024-09-12T23:12:01 html: Deprecate AutoClose API
Nick Wellnhofer 188cad68 2024-09-12T02:51:20 html: Remove obsolete content model
Nick Wellnhofer 0144f662 2024-09-12T02:30:10 html: Remove obsolete code
Nick Wellnhofer 0ce7bfe5 2024-09-12T01:44:18 html: Try to avoid passing XML options to HTML parser
Nick Wellnhofer 76cc6394 2024-09-12T01:43:42 test: Fix XML_PARSE_HTML constant
Nick Wellnhofer 575be6c1 2024-09-12T01:40:07 html: Fix line numbers with CRs
Nick Wellnhofer be874d78 2024-09-11T19:47:07 html: Ignore unexpected DOCTYPE declarations
Nick Wellnhofer 462bf0b7 2024-09-11T19:06:06 html: Rework options Introduce htmlCtxtSetOptions, see similar changes made to XML parser. Add HTML_PARSE_HUGE alias. Support HTML_PARSE_BIG_LINES.
Nick Wellnhofer c6af1017 2024-09-08T20:45:48 html: Test tokenizer against html5lib test suite
Nick Wellnhofer 27752f75 2024-09-11T15:06:55 html: Fix EOF handling in start tags
Nick Wellnhofer b19d3539 2024-09-11T15:03:49 html: Fix EOF handling in comments
Nick Wellnhofer 17e56ac5 2024-09-11T14:24:58 html: Fix parsing of end tags
Nick Wellnhofer 24a09033 2024-09-09T02:53:14 html: Fix bogus end tags
Nick Wellnhofer bca64854 2024-09-09T02:30:18 html: Allow U+000C FORM FEED as whitespace
Nick Wellnhofer 6edf1a64 2024-09-09T02:09:20 html: Fix DOCTYPE parsing
Nick Wellnhofer 9678163f 2024-09-09T02:01:19 html: Don't check for valid XML characters
Nick Wellnhofer a6955c13 2024-09-08T23:19:49 html: Parse numeric character references according to HTML5
Nick Wellnhofer 4eeac309 2024-09-08T22:20:20 html: Start to fix EOF and U+0000 handling
Nick Wellnhofer e062a4a9 2024-09-08T20:40:36 html: Add HTML5 parser option This option passes tokenizer output directly to the SAX callbacks, making it possible to test the tokenizer against the html5lib test suite. This will produce unbalanced calls to the startElement and endElement callbacks, but it's the only way to support a SAX like interface for HTML5. It can be used for filtering or rewriting HTML5, for example. A HTML5 tree builder could then be implemented on top of the SAX callbacks.
Nick Wellnhofer 17da54c5 2024-09-08T19:16:12 html: Normalize newlines
Nick Wellnhofer 341dc78f 2024-09-08T19:11:14 html: Deduplicate code in htmlCurrentChar
Nick Wellnhofer 3adb396d 2024-09-07T15:18:13 html: Parse bogus comments instead of ignoring them Also treat XML processing instructions as bogus comments.
Nick Wellnhofer 84440175 2024-09-07T14:21:12 html: Add missing calls to htmlCheckParagraph()
Nick Wellnhofer 86d6b9b0 2024-09-07T04:18:06 html: Deduplicate some code
Nick Wellnhofer 0d324bde 2024-09-07T03:45:09 html: Simplify node info accounting
Nick Wellnhofer ccb61f59 2024-09-07T03:15:50 html: Remove duplicate calls to htmlAutoClose
Nick Wellnhofer e1834745 2024-09-07T00:54:25 html: Add character data tests
Nick Wellnhofer f9ed30e9 2024-09-06T17:49:04 html: HTML5 character data states
Nick Wellnhofer 59511792 2024-09-03T15:52:44 html: Parse named character references according to HTML5
Nick Wellnhofer d5cd0f07 2022-07-15T17:00:36 html: Prefer SKIP(1) over NEXT in HTML parser Use SKIP(1) where it's safe to avoid a function call.
Nick Wellnhofer dc2d4983 2023-05-04T17:47:38 html: Rework htmlLookupSequence Rename to htmlLookupString and use strstr for increased performance.
Nick Wellnhofer 637215a4 2023-05-04T17:16:51 html: Always terminate doctype declarations on '>' Align with HTML5 spec. This allows to remove the old quote handling in htmlLookupSequence.
Nick Wellnhofer 72e29f9a 2023-05-04T17:03:22 html: Fix quadratic behavior in push parser Fix quadratic behavior related to unquoted attribute values. We really have to replicate parts of the HTML5 state machine to find the end of tags reliably. Fixes #533.
Nick Wellnhofer a80f8b64 2023-05-04T15:59:31 html: Allow attributes in end tags Attributes are syntactically allowed in HTML5 end tags but otherwise ignored.
Nick Wellnhofer f2272c23 2023-05-04T15:33:27 html: Handle unexpected-solidus-in-tag according to HTML5
Nick Wellnhofer 939b53ee 2023-05-04T15:25:24 html: Stop skipping tag content Tag and attribute names should always be parsed successfully now.
Nick Wellnhofer dcb2abb2 2023-05-04T15:16:29 html: Parse tag and attribute names according to HTML5 HTML5 allows basically all characters in tag and attribute names.
Nick Wellnhofer d67833a3 2024-09-26T19:21:24 xmllint: Use proper type to store seconds since epoch Should avoid year 2038 problem. Fixes #801.
correctmost 81d38ed0 2024-09-25T07:52:10 meson: Fix duplicate listing of libxml2.devhelp2 The duplication caused a warning when uninstalling.
Nick Wellnhofer b1c5aa65 2024-09-19T12:50:59 xpath: Deprecate xmlXPathNAN and xmlXPath*INF Users should simply use the C99 macros.
Nick Wellnhofer 55ddccb6 2024-09-14T00:03:56 io: Make sure not to pass partial UTF-8 to write callback We cannot split UTF-8 at arbitrary boundaries.
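
Note on 55ddccb6: the message says UTF-8 cannot be split at arbitrary boundaries before it is handed to a write callback. The sketch below illustrates the general idea only and is not the code from that commit. The helper name utf8_complete_prefix is hypothetical, and it assumes the buffer is otherwise valid UTF-8. It returns how many leading bytes can be written without cutting a multi-byte sequence in half. The remaining bytes would be kept for the next write.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of libxml2): return the length of the
 * longest prefix of buf[0..len) that does not end in the middle of a
 * UTF-8 sequence. Assumes the input is otherwise valid UTF-8. Malformed
 * tails are passed through unchanged (the full length is returned).
 */
static size_t
utf8_complete_prefix(const unsigned char *buf, size_t len) {
    size_t pos = len;
    size_t need;

    /* Walk back (at most 4 bytes) to the last non-continuation byte. */
    while (pos > 0 && len - pos < 4) {
        pos--;
        if ((buf[pos] & 0xC0) != 0x80)
            break;
    }

    /* Empty buffer, or no lead byte found within the last 4 bytes. */
    if (len == 0 || (buf[pos] & 0xC0) == 0x80)
        return len;

    /* Derive the expected sequence length from the lead byte. */
    if (buf[pos] < 0x80)
        need = 1;
    else if ((buf[pos] & 0xE0) == 0xC0)
        need = 2;
    else if ((buf[pos] & 0xF0) == 0xE0)
        need = 3;
    else if ((buf[pos] & 0xF8) == 0xF0)
        need = 4;
    else
        return len; /* invalid lead byte: do not try to repair it */

    /* If the last sequence is incomplete, stop just before its lead byte. */
    return (len - pos >= need) ? len : pos;
}
```

A caller would write the first utf8_complete_prefix(buf, len) bytes and carry the rest over to the next chunk.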