xmlregexp.c


Log

Author Commit Date CI Message
Nick Wellnhofer 5d36664f 2024-07-16T00:35:53 memory: Deprecate xmlGcMemSetup
Nick Wellnhofer 2dcd561d 2024-07-15T14:54:37 regexp: Don't print to stderr
Nick Wellnhofer 6be79014 2024-07-15T14:18:26 Remove unused code
Nick Wellnhofer 598ee0d2 2024-06-26T01:18:55 error: Remove underscores from xmlRaiseError
Rosen Penev 217e9b7a 2024-06-08T12:27:45 clang-tidy: don't return in void functions Found with readability-redundant-control-flow Signed-off-by: Rosen Penev <rosenp@gmail.com>
Nick Wellnhofer fa01278d 2024-06-16T00:11:41 regexp: Hide experimental legacy code This was never made public.
Nick Wellnhofer 10d60d15 2024-06-16T00:04:46 regexp: Stop using LIBXML_AUTOMATA_ENABLED This macro always equals LIBXML_REGEXP_ENABLED.
Nick Wellnhofer 0651ad66 2024-05-05T20:20:22 valid: Report malloc failure after xmlRegExecPushString
Nick Wellnhofer 05d9bacd 2023-12-18T21:39:51 regexp: Improve error handling Handle malloc failure from xmlRaiseError. Use xmlRaiseMemoryError. Remove argument from memory error handler. Remove TODO macro.
Nick Wellnhofer 1a354d5b 2023-12-10T17:09:45 regexp: Report malloc failures Fix places where malloc failures aren't reported.
Nick Wellnhofer 3e7673bc 2023-09-23T17:31:55 malloc-fail: Report malloc failure in xmlFARegExec
Nick Wellnhofer b7d56ef7 2023-09-22T17:03:56 malloc-fail: Report malloc failure in xmlRegEpxFromParse Also check whether malloc failures are reported when fuzzing.
Nick Wellnhofer f98fa863 2023-09-22T15:25:40 regexp: Fix status codes and handle invalid UTF-8 Fixes #561.
Nick Wellnhofer 4e1c13eb 2023-09-18T14:45:10 debug: Remove debugging code This is barely useful these days and only clutters the code base.
Nick Wellnhofer a800b7e0 2023-05-04T12:47:00 regexp: Fix null deref in xmlFAFinishReduceEpsilonTransitions Short-lived regression found by OSS-Fuzz.
Nick Wellnhofer c613ab14 2023-05-02T00:32:50 regexp: Fix mistake in previous commit The `ret = 0` line should have been deleted. Fixes #531.
Nick Wellnhofer a06eaa61 2023-03-09T06:58:24 regexp: Fix determinism checks Swap arguments in initial call to xmlFARecurseDeterminism. Fix the check whether we revisit the initial state in xmlFARecurseDeterminism. If there are transitions with equal atoms and targets but different counters, treat the regex as deterministic but mark the transitions as non-deterministic internally. Don't overwrite zero return value of xmlFAComputesDeterminism with non-zero value from xmlFARecurseDeterminism. Most of these errors lead to non-deterministic regexes not being detected which typically isn't an issue. The improved code may break users who relied on buggy behavior or cause other bugs to become visible. Fixes #469.
Nick Wellnhofer e301865e 2023-03-09T05:34:38 regexp: Fix checks for eliminated transitions 'to' can be set to -1 or -2 when eliminating transitions, so check for all negative values.
Nick Wellnhofer 90759c59 2023-03-09T16:34:11 regexp: Simplify xmlFAReduceEpsilonTransitions
Nick Wellnhofer 9f7b1142 2023-03-09T05:25:09 regexp: Fix cycle check in xmlFAReduceEpsilonTransitions The visited flag must only be reset after the first call to xmlFAReduceEpsilonTransitions has finished. Visiting states multiple times could lead to unnecessary processing of duplicate transitions. Similar to 68eadabd.
Nick Wellnhofer 85057e51 2023-02-21T15:24:19 regexp: Add sanity check in xmlRegCalloc2 These arguments should be non-zero, but add a sanity check to avoid division by zero. Fixes #450.
Nick Wellnhofer 1743c4c3 2023-02-17T15:53:07 malloc-fail: Fix OOB read after xmlRegGetCounter Found with libFuzzer, see #344.
Nick Wellnhofer 40bc1c69 2023-02-17T15:40:32 malloc-fail: Fix memory leak in xmlFAParseCharProp Found with libFuzzer, see #344.
Nick Wellnhofer e64653c0 2023-02-17T15:20:33 malloc-fail: Fix leak of xmlRegAtom Found with libFuzzer, see #344.
Nick Wellnhofer ed615967 2023-02-17T15:23:42 malloc-fail: Fix memory leak in xmlRegexpCompile Found with libFuzzer, see #344.
Nick Wellnhofer e60c9f4c 2023-02-15T01:00:03 malloc-fail: Fix memory leak after xmlRegNewState Invoke xmlRegNewState from xmlRegStatePush to simplify error handling. Found with libFuzzer, see #344.
Nick Wellnhofer bd33331b 2023-02-17T15:19:37 regexp: Simplify xmlRegAtomPush
Nick Wellnhofer 0f568c0b 2022-08-26T01:22:33 Consolidate private header files Private functions were previously declared - in header files in the root directory - in public headers guarded with IN_LIBXML - in libxml.h - redundantly in source files that used them. Consolidate all private header files in include/private.
Nick Wellnhofer 14517012 2022-04-23T19:19:33 Fix parsing of subtracted regex character classes Fixes #370.
Nick Wellnhofer ebb17970 2022-03-04T02:31:59 Remove unneeded #includes
Damjan Jovanovic 37ebf8a8 2021-05-31T07:45:18 Document support for the non-standard escape sequences. Support non-BMP code points in surrogate pairs of '\uXXXX\uXXXX'.
Damjan Jovanovic b66c1961 2021-05-30T11:11:33 Use strtoul() instead of sscanf, and correct data types that break GCC.
Damjan Jovanovic ec8ff95c 2021-05-29T16:36:44 Add support for some non-standard escapes in regular expressions. This adds support for some non-standard escape sequences observed in Microsoft's MSXML DLLs and used by Windows apps, and thus needed by Wine. Some are also used in other XML implementations, eg. Java's. This isn't intended to be final. We probably wish to toggle these non-standard escape sequences on and off somehow, as needed by the caller. Further discussion: https://gitlab.gnome.org/GNOME/libxml2/-/issues/260
Nick Wellnhofer 776d15d3 2022-03-02T00:29:17 Don't check for standard C89 headers Don't check for - ctype.h - errno.h - float.h - limits.h - math.h - signal.h - stdarg.h - stdlib.h - string.h - time.h Stop including non-standard headers - malloc.h - strings.h
Nick Wellnhofer ea6e8f99 2021-12-20T00:34:58 Fix certain combinations of regex range quantifiers Fix regex transitions that have both min/max and a counter. In this case, we want to save the regex state before incrementing the counter. Fixes #301 and the issue reported here: https://mail.gnome.org/archives/xml/2016-April/msg00017.html
Nick Wellnhofer 382fb056 2021-12-20T00:31:41 Fix range quantifier on subregex Make sure to add counted exit transitions before other counter transitions. Otherwise, we won't backtrack correctly. Fixes #65.
Nick Wellnhofer 346c3a93 2022-02-20T18:46:42 Remove elfgcchack.h The same optimization can be enabled with -fno-semantic-interposition since GCC 5. clang has always used this option by default.
Arne Becker ec6e3efb 2021-07-06T21:56:04 Patch to forbid epsilon-reduction of final states When building the internal representation of a regexp, it is possible that a lot of empty transitions are created. Therefore there is a step to reduce them in the function xmlFAEliminateSimpleEpsilonTransitions. There is an error there for this case: * State 1 has a transition with an atom (in this case "a") to state 2. * State 2 is final and has an epsilon transition to state 1. After reduction it looked like: * State 1 has a transition with an atom (in this case "a") to itself and is final. In other words, the empty string is accepted when it shouldn't be. The attached patch skips the reduction step for final states. An alternative would be to insert or increment counters when reducing a final state, but this seemed error prone and unnecessary, since there aren't that many final states. Fixes #282
Nick Wellnhofer 7d6837ba 2020-10-25T20:21:43 Fix caret in regexp character group Apply Per Hedeland's patch from https://bugzilla.gnome.org/show_bug.cgi?id=779751 Fixes #188.
Nick Wellnhofer 68eadabd 2020-07-11T21:32:10 Fix exponential runtime in xmlFARecurseDeterminism In order to prevent visiting a state twice, states must be marked as visited for the whole duration of graph traversal because states might be reached by different paths. Otherwise state graphs like the following can lead to exponential runtime:
    ->O-->O-->O-->O-->O->
       \ / \ / \ / \ /
        O   O   O   O
    Reset the "visited" flag only after the graph was traversed. xmlFAComputesDeterminism still has massive performance problems when handling fuzzed input. By design, it has quadratic time complexity in the number of reachable states. Some issues might also stem from redundant epsilon transitions. With this fix, fuzzing regexes with a maximum length of 100 becomes feasible at least. Found with libFuzzer.
Nick Wellnhofer fc842f6e 2020-07-06T15:22:12 Limit regexp nesting depth Enforce a maximum nesting depth of 50 for regular expressions. Avoids stack overflows with deeply nested regexes. Found by OSS-Fuzz.
Nick Wellnhofer f8329fdc 2020-07-02T11:51:31 Report error for invalid regexp quantifiers
Nick Wellnhofer 1e7851b5 2020-06-25T12:17:50 Fix integer overflow in xmlFAParseQuantExact Found by OSS-Fuzz.
Nick Wellnhofer 20c60886 2020-03-08T17:19:42 Fix typos Resolves #133.
Nick Wellnhofer 52649b63 2020-01-02T14:45:28 Check for overflow when allocating two-dimensional arrays Found by lgtm.com
Nick Wellnhofer 9bd7abfb 2020-01-02T14:14:48 Remove useless comparisons Found by lgtm.com
Jared Yanovich 2a350ee9 2019-09-30T17:04:54 Large batch of typo fixes Closes #109.
Nick Wellnhofer 99a864a1 2019-09-25T15:27:45 Fix Regextests - One of the bug316338 test cases is expected to succeed. - Memory leak in testRegexp.c. - Refcount handling in xmlExpHashGetEntry.
Nick Wellnhofer c2b0a184 2019-09-25T13:57:42 Fix empty branch in regex Fixes bug 649244: https://bugzilla.gnome.org/show_bug.cgi?id=649244 Closes #57.
Nick Wellnhofer e8c9cd5c 2019-09-16T15:36:02 Fix Schema determinism check of ##other namespaces Non-compound (##local) and compound string atoms are always disjoint regardless of whether the compound atom is negated (##other). Closes #40.
zhouzhongyuan 0b793591 2019-08-26T15:24:12 Fix memory leak in xmlRegEpxFromParse Merge request !39
Nick Wellnhofer 09797c13 2019-03-05T15:14:34 Fix null deref in xmlregexp error path Thanks to Shaobo He for the report.
J. Peter Mugaas d2c329a9 2017-10-21T13:49:31 Fix -Wimplicit-fallthrough warnings Add "falls through" comments to quench implicit-fallthrough warnings which are enabled by -Wextra under GCC 7.
David Kilzer fb56f80e 2017-07-04T18:38:03 Heap-buffer-overflow read of size 1 in xmlFAParsePosCharGroup Credit to OSS-Fuzz. Add a check to xmlFAParseCharRange() for the end of the buffer to prevent reading past the end of it. This fixes Bug 784017.
Nick Wellnhofer 8a0c6698 2017-07-04T17:13:06 Fix NULL pointer deref in xmlFAParseCharClassEsc Found with libFuzzer.
Nick Wellnhofer 34e44567 2017-05-31T16:48:27 Fix undefined behavior in xmlRegExecPushStringInternal It's stupid, but the behavior of memcpy(NULL, NULL, 0) is undefined.
Pranjal Jumde cbb27165 2016-03-07T06:34:26 Bug 757711: heap-buffer-overflow in xmlFAParsePosCharGroup <https://bugzilla.gnome.org/show_bug.cgi?id=757711> * xmlregexp.c: (xmlFAParseCharRange): Only advance to the next character if there is no error. Advancing to the next character in case of an error while parsing regexp leads to an out of bounds access.
Daniel Veillard 34b35004 2016-05-09T09:28:38 Fix an error with regexp on nullable counted char transition This is the first of the two issues raised by Pete Cordell in https://mail.gnome.org/archives/xml/2016-April/msg00030.html
Jan Pokorný bb654feb 2016-04-13T16:56:07 Fix typos: dictio{ nn -> n }ar{y,ies} Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
Gaurav 41b0d1c4 2014-05-09T16:52:32 Avoid Double Null Check Cleanup For https://bugzilla.gnome.org/show_bug.cgi?id=729851
Gaurav 2671b013 2013-09-11T14:59:06 Fix potential NULL pointer dereferences in regexp code https://bugzilla.gnome.org/show_bug.cgi?id=707749 Fix 3 cases where we might dereference NULL
Michael Wood fb27e2cd 2012-09-28T08:59:33 Fix spelling of "length".
Daniel Veillard f8e3db04 2012-09-11T13:26:36 Big space and tab cleanup Remove all space before tabs and space and tabs at end of lines.
Daniel Veillard 466fcdaa 2012-08-27T12:03:40 Avoid a potential infinite recursion Which can happen when eliminating epsilon transitions, as reported by Pavel Madr <pmadr@opentext.com>
Daniel Veillard 40851d0c 2012-08-17T20:34:05 Fix a segfault on XSD validation on pattern error As reported by Sven <sven@e7o.de>: The following pattern will cause a segmentation fault in my Apache (using PHP5 to validate a XML against a XSD): <xs:pattern value="(.*)|"/> Fix a cascade of error handling failures which led to the crash in that scenario.
Patrick R. Gansterer 204f1f14 2012-05-10T20:24:00 undef ERROR if already defined
Daniel Veillard 9543aee9 2010-03-15T11:13:39 Fix broken escape behaviour in regexp ranges
Daniel Veillard 9332b48f 2009-09-23T18:28:43 Fix a Relaxng bug raised by libvirt test suite * xmlregexp.c: other fixes in 2.7.4 raised this internal error when comparing ranges, this affects among others detection of the determinism * test/relaxng/libvirt* result/relaxng/libvirt*: add a test case based on libvirt schemas and tests
Daniel Veillard 29341682 2009-09-10T18:23:39 Release of libxml2-2.7.4 * configure.in: new version * libxml.spec.in: cleanup * xmlregexp.c: fix a comment * doc/apibuild.py: update * doc/*: regenerate everything
Daniel Veillard 594e5dfb 2009-09-07T14:58:47 Chasing dead assignments reported by clang-scan * SAX2.c dict.c error.c hash.c nanohttp.c parser.c python/libxml.c relaxng.c runtest.c tree.c valid.c xinclude.c xmlregexp.c xmlsave.c xmlschemas.c xpath.c xpointer.c: mostly removing unneeded assignments, but this led to a few real bugs and some part not yet understood (relaxng/interleave)
Daniel Veillard 13cee4e3 2009-09-05T14:52:55 Fix a bunch of scan 'dead increments' and cleanup * HTMLparser.c c14n.c debugXML.c entities.c nanohttp.c parser.c testC14N.c uri.c xmlcatalog.c xmllint.c xmlregexp.c xpath.c: fix unused variables, or unneeded increments as well as a couple of space issues * runtest.c: check for NULL before calling unlink()
Daniel Veillard 1ba2aca3 2009-08-31T16:47:39 492317 Fix Relax-NG validation problems * relaxng.c xmlregexp.c: a subtle problem when checking for compileable content model, if using the same elements in cases of choices. Handled by adding a special flag to the regexp compilation to detect transitions with different atoms using same strings. * test/relaxng/492317* result/relaxng/492317*: add the test to the regression suite
Daniel Veillard d80d0728 2009-08-22T18:56:01 559410 - Regexp bug on (...)? constructs * xmlregexp.c: fix a regexp bug on some (...)? constructs * test/schemas/nvdcve* result/schemas/nvdcve*: add the tests to the regression suite
Daniel Veillard 11e28e4d 2009-08-12T12:21:42 570702 fix a bug in regexp determinism checking * xmlregexp.c: xmlFAComputesDeterminism was buggy: it coalesced and removed transitions with the same source, destination and atoms without looking at counters
Daniel Veillard bf9c1dad 2008-08-26T07:46:42 add the testchar to 'make check' Volker Grabsch pointed out a typo * Makefile.am: add the testchar to 'make check' * xmlschemas.c: Volker Grabsch pointed out a typo * xmlregexp.c: production [19] from XML Schemas regexps was a mistake, removed in version REC-xmlschema-2-20041028, Volker Grabsch provided a patch to remove it * test/schemas/regexp-char-ref_0.xml test/schemas/regexp-char-ref_0.xsd test/schemas/regexp-char-ref_1.xsd result/schemas/regexp-char-ref_0_0 result/schemas/regexp-char-ref_1_0: Volker Grabsch also provided regression tests for this Daniel svn path=/trunk/; revision=3776
Daniel Veillard ad55998f 2008-05-12T13:15:35 avoid a regexp crash, should fix #523738 Daniel * xmlregexp.c: avoid a regexp crash, should fix #523738 Daniel svn path=/trunk/; revision=3744
Daniel Veillard 10bda629 2008-03-13T07:27:24 found a nasty bug in regexp automata build, reported by Ashwin and Bjorn * xmlregexp.c: found a nasty bug in regexp automata build, reported by Ashwin and Bjorn Reese Daniel svn path=/trunk/; revision=3705
Daniel Veillard 041b687e 2008-02-08T10:37:18 apply patch from Andrew Tosh to fix behaviour when '.' is used in a * xmlregexp.c: apply patch from Andrew Tosh to fix behaviour when '.' is used in a posCharGroup * test/schemas/poschargrp0_0.* result/schemas/poschargrp0_0_0*: added the test to the regression suite Daniel svn path=/trunk/; revision=3687
Daniel Veillard 00fde4e4 2007-11-19T17:38:33 remove a cut-and-paste copy error Daniel * xmlregexp.c: remove a cut-and-paste copy error Daniel svn path=/trunk/; revision=3665
Daniel Veillard c821e03c 2007-08-28T17:33:45 another nasty regexp case fixed. added to regression suite Daniel * xmlregexp.c: another nasty regexp case fixed. * test/regexp/ranges2 result/regexp/ranges2: added to regression suite Daniel svn path=/trunk/; revision=3658
William M. Brack ec72008b 2007-08-24T02:57:38 Enhanced to include port number (if not == 80) on the "Header:" URL (bug * nanohttp.c: Enhanced to include port number (if not == 80) on the "Header:" URL (bug #469681). * xmlregexp.c: Fixed a typo causing a warning message. svn path=/trunk/; revision=3657
Daniel Veillard 76d59b6d 2007-08-22T16:29:21 try to fix for the nth time the automata generation in case of complex * xmlregexp.c: try to fix for the nth time the automata generation in case of complex ranges. I suppose that time it is actually okay Daniel svn path=/trunk/; revision=3650
Daniel Veillard cb4284e2 2007-04-25T13:55:20 applied patch from Richard Jones to for the silent flag on valgrind when * xstc/Makefile.am doc/examples/Makefile.am Makefile.am: applied patch from Richard Jones to for the silent flag on valgrind when doing "make valgrind" * xmlregexp.c: raise a regexp error when '\' is misused to escape a standard character. Daniel svn path=/trunk/; revision=3606
William M. Brack 56578371 2007-04-11T14:33:46 small enhancement for quantifier range with min occurs of 0; fixes bug * xmlregexp.c: small enhancement for quantifier range with min occurs of 0; fixes bug 425542. svn path=/trunk/; revision=3597
William M. Brack a9cbf283 2007-03-21T13:16:33 fixed problem with 0x2d in Char Range (bug #420596) added regression test * xmlregexp.c: fixed problem with 0x2d in Char Range (bug #420596) * test/regexp/bug420596, result/regexp/bug420596: added regression test for this svn path=/trunk/; revision=3594
Daniel Veillard fcd18ff8 2006-11-02T10:28:04 another small change on the algorithm for the elimination of epsilon * xmlregexp.c: another small change on the algorithm for the elimination of epsilon transitions, should help on #362989 too Daniel
Daniel Veillard 0e05f4c2 2006-11-01T15:33:04 applied documentation patches from Markus Keim fixed one bug and added a * tree.c: applied documentation patches from Markus Keim * xmlregexp.c: fixed one bug and added a couple of optimisations while working on bug #362989 Daniel
Daniel Veillard 777737ea 2006-10-17T21:23:17 applied fix from Christopher Boumenot for bug #362714 on regexps missing * xmlregexp.c: applied fix from Christopher Boumenot for bug #362714 on regexps missing ']' Daniel
Daniel Veillard 54eb0243 2006-03-21T23:17:57 applied patch from Youri Golovanov fixing bug #316338 and adding a couple * xmlregexp.c: applied patch from Youri Golovanov fixing bug #316338 and adding a couple of optimizations in the regexp compilation engine. * test/regexp/bug316338 result/regexp/bug316338: added regression tests based on the examples provided in the bug report. Daniel
Daniel Veillard 11ce4004 2006-03-10T00:36:23 end of first pass on coverity reports. Daniel * runtest.c schematron.c testAutomata.c tree.c valid.c xinclude.c xmlcatalog.c xmlreader.c xmlregexp.c xpath.c: end of first pass on coverity reports. Daniel
Daniel Veillard fc011b7f 2006-02-12T19:14:15 bug fixes for #327167 as well as some cleanups and more thorough tests on * xmlregexp.c: bug fixes for #327167 as well as some cleanups and more thorough tests on atoms comparisons. Daniel
Daniel Veillard d0271473 2006-01-02T10:22:02 compilation and doc build fixes from Michael Day Daniel * xmlreader.c include/libxml/xmlreader.h xmlschemas.c: compilation and doc build fixes from Michael Day Daniel
Daniel Veillard 0b1ff14b 2005-12-28T21:13:33 bug in xmlRegExecPushString2() pointed out by Sreeni Nair. Daniel * xmlregexp.c: bug in xmlRegExecPushString2() pointed out by Sreeni Nair. Daniel
Daniel Veillard 9a00fd29 2005-11-09T08:56:26 applied patch from Geert Jansen to implement the save function to a * xmlsave.c xmlIO.c include/libxml/xmlIO.h include/libxml/xmlsave.h: applied patch from Geert Jansen to implement the save function to a xmlBuffer, and a bit of cleanup. Daniel
Daniel Veillard fc6eca0d 2005-11-01T15:24:02 fix bug #319897, problem with counted atoms when the transition itself is * xmlregexp.c: fix bug #319897, problem with counted atoms when the transition itself is counted too * result/regexp/hard test/regexp/hard: augmented the regression tests with the problem exposed. Daniel
Daniel Veillard 7802ba56 2005-10-27T11:56:20 avoid function parameters names 'list' as this seems to give troubles with * valid.c xmlregexp.c include/libxml/valid.h include/libxml/xmlregexp.h: avoid function parameters names 'list' as this seems to give troubles with VC6 and stl as reported by Samuel Diaz Garcia. Daniel
Daniel Veillard aa622012 2005-10-20T15:55:25 committing some fixes and debugging done yesterday in the London airport. * xmlregexp.c: committing some fixes and debugging done yesterday in the London airport. Daniel
Daniel Veillard 567a45b5 2005-10-18T19:11:55 removed the error message removed 2 instability warnings from function * runtest.c: removed the error message * relaxng.c xmlschemas.c: removed 2 instability warnings from function documentation * include/libxml/schemasInternals.h: changed warning about API stability * xmlregexp.c: trying to improve runtime execution of non-deterministic regexps and automata. Not fully finished but should be way better. Daniel
Rob Richards 54a8f67c 2005-10-07T02:33:00 remove warnings under Windows. * schematron.c xmlregexp.c: remove warnings under Windows.
Daniel Veillard 5de0938f 2005-09-26T17:18:17 seems a test to avoid duplicate transition is really needed at all times. * xmlregexp.c: seems a test to avoid duplicate transition is really needed at all times. Luka Por gave an example hitting this. Changed back the internal API. Daniel
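
Two of the entries above (52649b63 and 85057e51) concern overflow checks when allocating two-dimensional arrays, including a guard against division by zero. As a rough illustration only, and not the code in xmlregexp.c, such a check might look like the C sketch below. The name checked_calloc2 and its signature are hypothetical, and plain malloc stands in for whatever allocator the library uses.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of libxml2: allocate a zeroed
 * dim1 x dim2 array of elemSize-byte elements. Returns NULL if dim2 or
 * elemSize is zero, if the total size would overflow size_t, or if the
 * underlying allocation fails.
 */
static void *
checked_calloc2(size_t dim1, size_t dim2, size_t elemSize) {
    size_t totalSize;
    void *ret;

    /* Reject zero divisors first so the division below is safe. */
    if (dim2 == 0 || elemSize == 0)
        return NULL;

    /* dim1 * dim2 * elemSize must fit in size_t. */
    if (dim1 > SIZE_MAX / dim2 / elemSize)
        return NULL;

    totalSize = dim1 * dim2 * elemSize;
    ret = malloc(totalSize);
    if (ret != NULL)
        memset(ret, 0, totalSize);
    return ret;
}
```

A caller would treat a NULL return as an allocation failure and report it, in the spirit of the later malloc-fail entries in this log.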