lib/regcomp.c


Log

Author Commit Date Message
Paul Eggert 1261513a 2018-08-10T14:28:55 autoupdate
Paul Eggert 66b99e52 2018-08-01T13:26:38 autoupdate
Paul Eggert 5508825f 2018-06-29T15:34:57 regex: glibc does not use intprops.h Maybe we can talk glibc into using intprops.h someday, but now doesn’t seem to be a good time. * lib/regcomp.c (TYPE_SIGNED): Remove; regex_internal.h now defines. * lib/regex_internal.h [_LIBC]: Do not include intprops.h. (TYPE_SIGNED, INT_ADD_WRAPV): New macros.
Paul Eggert 281b825e 2018-01-01T00:57:25 maint: Run 'make update-copyright'
Paul Eggert f583f328 2017-12-19T15:53:47 regex: use re_malloc etc. consistently Problem and original patch reported by Arnold Robbins in: https://sourceware.org/ml/libc-alpha/2017-12/msg00241.html * lib/regcomp.c (re_comp): * lib/regexec.c (push_fail_stack, build_trtable, match_ctx_clean): Use re_malloc/re_realloc/re_free instead of malloc/realloc/free.
Paul Eggert 6245cd45 2017-11-22T11:23:01 regex: merge from glibc * lib/regcomp.c (init_word_char): Add comments.
Paul Eggert 6dc8556f 2017-11-20T15:56:22 regex: merge from glibc * lib/regcomp.c (__regcomp, __regfree) [_LIBC]: Now hidden. * lib/regex_internal.h (internal_function): Remove. All uses removed.
Paul Eggert ca35d468 2017-09-13T00:48:18 all: prefer https: URLs
Paul Eggert f7795760 2017-07-26T09:12:29 regex: work with GCC7's -Werror=implicit-fallthrough= * lib/regex_internal.h (FALLTHROUGH): New macro. * lib/regcomp.c (peek_token_bracket, parse_expression): * lib/regexec.c (check_node_accept): Use it.
Paul Eggert a3fd683d 2017-01-01T02:59:23 version-etc: new year * build-aux/gendocs.sh (version): * doc/gendocs_template: * doc/gendocs_template_min: * doc/gnulib.texi: * lib/version-etc.c (COPYRIGHT_YEAR): Update copyright dates by hand in templates and the like. * all files: Run 'make update-copyright'.
Paul Eggert 334d97f3 2016-06-08T01:46:35 regex: port to Sun C Reported by Daiki Ueno. * lib/regcomp.c (regcomp, regerror): Use _Restrict_, not __restrict, in prototype. This fixes a problem I introduced in the 2016-02-19 merge from glibc.
Paul Eggert 96609bb2 2016-05-30T12:18:19 Use GCC_LINT, not lint FreeBSD and Cygwin #define _Noreturn to empty if 'lint' is defined. Problem reported by Ken Brown in: http://bugs.gnu.org/23640 * doc/posix-headers/stdnoreturn.texi (stdnoreturn.h): Document problem with lint and _Noreturn. * lib/diffseq.h (IF_LINT, IF_LINT2): * lib/fts.c (sccsid): * lib/getndelim2.c (IF_LINT): * lib/gl_anylinked_list2.h (gl_linked_iterator) (gl_linked_iterator_from_to): * lib/gl_anytree_list2.h (gl_tree_iterator) (gl_tree_iterator_from_to): * lib/gl_anytree_oset.h (gl_tree_iterator): * lib/gl_array_list.c (gl_array_iterator) (gl_array_iterator_from_to): * lib/gl_array_oset.c (gl_array_iterator): * lib/gl_carray_list.c (gl_carray_iterator) (gl_carray_iterator_from_to): * lib/idcache.c: * lib/inet_ntop.c (IF_LINT): * lib/regcomp.c (build_charclass_op, create_tree): * lib/regex_internal.c (re_acquire_state) (re_acquire_state_context): * lib/trigl.c (rcsid): * lib/trim.c (IF_LINT): * lib/vasnprintf.c (IF_LINT): * lib/verify.h (assume): Treat GCC_LINT like lint.
Paul Eggert f97745b0 2016-02-19T09:27:41 regex: make it closer to libc Make Idx a signed type, rather than possibly unsigned. The unsignedness was not really buying us anything, since the code overflows for other reasons before getting to PTRDIFF_MAX. Making it signed allows us to use -1 and -2 with abandon, like libc does, thus lessening the number of differences between gnulib and libc. Also, it should help avoid gratuitous warnings like the one reported by Nelson H. F. Beebe in: http://bugs.gnu.org/22702 * lib/regex.h (__re_idx_t): Remove. All uses changed to regoff_t. * lib/regex_internal.h (SSIZE_MAX): Define if <limits.h> doesn't. (IDX_MAX) [_REGEX_LARGE_OFFSETS]: Now SSIZE_MAX. (REG_MISSING, REG_ERROR, REG_VALID_INDEX, REG_VALID_NONZERO_INDEX): Remove. Revert all uses to their libc versions.
Paul Eggert df5ed01e 2016-02-19T08:41:58 regex: merge patches from libc 2015-10-21 Joseph Myers <joseph@codesourcery.com> 2015-10-20 Joseph Myers <joseph@codesourcery.com> Convert miscellaneous function definitions to prototype style. * lib/regcomp.c (re_compile_pattern, re_set_syntax) (re_compile_fastmap, regcomp, regerror, regfree, re_comp): * lib/regexec.c (regexec, re_match, re_search, re_match_2, re_search_2) (re_search_2_stub, re_search_stub, re_set_registers, re_exec) (re_search_internal): Convert to prototype-style function definition. Use internal_function for internal functions.
Paul Eggert 2b34f389 2016-01-24T00:55:44 regex: treat [x] as x if x is a unibyte encoding error Problem reported by Aharon Robbins in: http://lists.gnu.org/archive/html/bug-gnulib/2016-01/msg00091.html * lib/regcomp.c (parse_byte) [!_LIBC && RE_ENABLE_I18N]: New function. (build_range_exp) [!_LIBC && RE_ENABLE_I18N]: Use it.
Paul Eggert 336fa860 2016-01-18T10:34:18 regex: pacify static checkers Problem and draft fix reported by Aharon Robbins in: http://lists.gnu.org/archive/html/bug-gnulib/2016-01/msg00082.html * lib/regcomp.c (build_charclass_op, create_tree) [lint]: Clear memory to pacify static checkers.
Paul Eggert 7c6e85cf 2016-01-18T10:32:26 regex: fix [ diagnostic Problem and fix reported by Aharon Robbins in: http://lists.gnu.org/archive/html/bug-gnulib/2016-01/msg00082.html * lib/regcomp.c (REG_EBRACK_IDX): Fix misleading diagnostic about [. * lib/regcomp.c (build_range_exp, build_charclass_op)
Paul Eggert 9e849a70 2016-01-18T10:31:07 regex: fix memory leaks Problem and draft fix reported by Aharon Robbins in: http://lists.gnu.org/archive/html/bug-gnulib/2016-01/msg00082.html * lib/regcomp.c (build_range_exp, build_charclass_op): * lib/regex_internal.c (re_dfa_add_node): Fix memory leak on failure.
Paul Eggert 71090a2a 2016-01-01T00:56:19 version-etc: new year * build-aux/gendocs.sh (version): * doc/gendocs_template: * doc/gendocs_template_min: * doc/gnulib.texi: * lib/version-etc.c (COPYRIGHT_YEAR): Update copyright dates by hand in templates and the like. * all files: Run 'make update-copyright'.
Paul Eggert 5513b409 2015-09-19T13:53:34 Diagnose ERE '()|\1' Problem reported by Hanno Böck in: http://bugs.gnu.org/21513 * lib/regcomp.c (parse_reg_exp): While parsing alternatives, keep track of the set of previously-completed subexpressions available before the first alternative, and restore this set just before parsing each subsequent alternative. This lets us diagnose the invalid back-reference in the ERE '()|\1'.
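The back-reference check described in this commit can be observed through the public POSIX interface. The following is a minimal sketch, assuming a regex implementation that includes this fix; `ere_compile_status` is a hypothetical helper written for illustration and is not part of regcomp.c.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Compile PATTERN as a POSIX extended regular expression and return
   the regcomp status: 0 on success, a REG_* error code otherwise.
   Hypothetical helper, for illustration only.  */
int
ere_compile_status (const char *pattern)
{
  regex_t re;
  int status;

  assert (pattern != NULL);
  status = regcomp (&re, pattern, REG_EXTENDED);
  if (status == 0)
    regfree (&re);   /* Free the compiled pattern only if regcomp succeeded.  */
  return status;
}
```

With the fix in place, `ere_compile_status ("()|\\1")` should return a nonzero code, because group 1 is completed only inside the first alternative and is therefore not available to the second one. An ordinary alternation such as `"(a)|b"` still compiles.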
Paul Eggert 2f8140bc 2015-09-19T09:21:47 regex: merge patches from libc 2015-09-08 Joseph Myers <joseph@codesourcery.com> Move bits/libc-lock.h and bits/libc-lockP.h out of bits/ (bug 14912). * lib/regex_internal.h: Include <libc-lock.h> instead of <bits/libc-lock.h>. 2015-06-09 Joseph Myers <joseph@codesourcery.com> Fix regcomp wcscoll, wcscmp namespace (bug 18497). * lib/regcomp.c (build_range_exp): Call __wcscoll instead of wcscoll. * lib/regexec.c (check_node_accept_bytes): Likewise. 2015-06-05 Joseph Myers <joseph@codesourcery.com> Fix regex wcrtomb namespace (bug 18496). * lib/regex_internal.c (build_wcs_upper_buffer): Call __wcrtomb instead of wcrtomb. 2015-06-05 Joseph Myers <joseph@codesourcery.com> Fix regex wctype namespace (bug 18495). * lib/regcomp.c (re_compile_fastmap_iter): Call __towlower instead of towlower. * lib/regex_internal.c (build_wcs_upper_buffer): Call __iswlower instead of iswlower. Call __towupper instead of towupper. * lib/regex_internal.h (IS_WIDE_WORD_CHAR): Call __iswalnum instead of iswalnum. 2015-01-07 Chris Metcalf <cmetcalf@ezchip.com> * lib/regcomp.c (parse_bracket_exp): Initialize type to COLL_SYM in a couple of places to avoid uninitialized variable warnings on tilegx gcc 4.8.2. 2014-11-24 Siddhesh Poyarekar <siddhesh@redhat.com> * lib/regex_internal.h: Remove NOT_IN_libc. 2014-11-17 Andreas Schwab <schwab@suse.de> * lib/regex_internal.h: Don't include <locale/elem-hash.h>. 2014-09-11 Roland McGrath <roland@hack.frob.com> Move findidx nested functions to top-level. * lib/regcomp.c [_LIBC]: #include <locale/weight.h>. (build_equiv_class) [_LIBC]: Don't #include it inside the function. Pass new arguments to findidx. * lib/regexec.c [RE_ENABLE_I18N] [_LIBC]: #include <locale/weight.h>. [RE_ENABLE_I18N] (check_node_accept_bytes) [_LIBC]: Don't #include it inside the function. Pass new arguments to findidx. * lib/regex_internal.h: [!NOT_IN_libc] [_LIBC]: #include <locale/weight.h>. (re_string_elem_size_at): Don't #include it inside the function. Pass new arguments to findidx. 2014-08-01 Siddhesh Poyarekar <siddhesh@redhat.com> Check if DEBUG is defined in regex_internal.c * lib/regex_internal.c: Check if DEBUG is defined and is set.
Paul Eggert b9bfe784 2015-01-01T01:38:23 version-etc: new year * doc/gnulib.texi: * lib/version-etc.c (COPYRIGHT_YEAR): Update copyright date. * all files: Run 'make update-copyright'.
Jim Meyering 1051177e 2014-07-12T16:33:49 regex: don't deref NULL upon heap allocation failure * lib/regcomp.c (parse_dup_op): Handle duplicate_tree failure in one more place. To trigger the segfault, configure grep -with-included-regex, build it, and run these commands: ( ulimit -v 300000; echo a|src/grep -E a+++++++++++++++++++++ ) I discovered this while replying to a private report from Jens Schleusener about excessive memory consumption by grep when using a regular expression like the one above.
Paul Eggert c4093fa1 2014-07-11T12:19:34 regex: fix memory leak in compiler Fix by Andreas Schwab in: https://sourceware.org/ml/libc-alpha/2014-06/msg00503.html * lib/regcomp.c (parse_reg_exp): Deallocate partially constructed tree before returning error.
Paul Eggert 316c9c50 2014-06-19T08:51:30 regex: fix memory leak in compiler Fix by Andreas Schwab in: https://sourceware.org/ml/libc-alpha/2014-06/msg00462.html * lib/regcomp.c (parse_expression): Deallocate partially constructed tree before returning error.
Eric Blake 1276a2c5 2014-01-01T00:04:40 maint: update copyright I ran 'make update-copyright'. Signed-off-by: Eric Blake <eblake@redhat.com>
Paul Eggert 96a263f7 2013-05-29T18:48:09 c-ctype, regex, verify: port to gcc -std=c90 -pedantic Avoid constructions that are rejected by gcc -std=c90 -pedantic. This fixes a porting bug I recently reintroduced in regex, and some other instances that I discovered while testing the fix. * lib/c-ctype.h [__STRICT_ANSI__]: Avoid ({ ... }). * lib/regcomp.c (utf8_sb_map) [__STRICT_ANSI__]: Avoid [0 ... N] = E. * lib/regex_internal.h [!_LIBC && GNULIB_LOCK]: Do not use a macro with an empty argument if this is a pedantic pre-C99 GCC. * lib/verify.h: Do not use _Static_assert if this is a pedantic pre-C11 GCC.
Paul Eggert 9ceceed2 2013-05-19T14:26:05 regex: fix dfa race in multithreaded uses Problem reported by Ludovic Courtès in <http://lists.gnu.org/archive/html/bug-gnulib/2013-05/msg00058.html>. * lib/regex_internal.h (lock_define, lock_init, lock_fini): New macros. All uses of __libc_lock_define, __libc_lock_init changed to use the first two of these. (__libc_lock_lock, __libc_lock_unlock): New macros, for non-glibc platforms. (struct re_dfa_t): Define the lock unconditionally. * lib/regexec.c (regexec, re_search_stub): Remove some now-incorrect '#ifdef _LIBC's. * modules/regex (Depends-on): Add pthread, if we use the included regex. * lib/regcomp.c: Do actions that are not needed for glibc, but may be needed elsewhere. (regfree, re_compile_internal): Destroy the lock. (re_compile_internal): Check for lock-initialization failure.
Gary V. Vaughan 951e33a4 2013-03-08T19:50:10 regex: rename remaining __attribute calls to __attribute__. Commit 930b85b changed definition of __attribute, but left some uses unchanged, preventing compilation of regex module on most non-gcc environments: * lib/regcomp.c (re_set_fastmap, seek_collating_symbol_entry) (lookup_collation_sequence_value, build_range_exp) (build_collating_symbol): Set attributes with newly renamed __attribute__ decorator. * lib/regex_internal.c (re_string_peek_byte_case) (re_node_set_compare, re_node_set_contains): Likewise. * lib/regexec.c (acquire_init_state_context): Likewise. Signed-off-by: Gary V. Vaughan <gary@gnu.org>
Paul Eggert 930b85b8 2013-02-25T22:56:12 regex: merge patches from libc 2013-02-26 Siddhesh Poyarekar <siddhesh@redhat.com> * lib/regex_internal.h (__attribute__): Rename from __attribute. All uses changed. (bitset_not, bitset_merge, bitset_mask, re_string_char_size_at) (re_string_wchar_at, re_string_elem_size_at): Mark function as possibly unused. 2013-02-12 Andreas Schwab <schwab@suse.de> [BZ #11561] * lib/regcomp.c (parse_bracket_exp) [_LIBC]: When looking up collating elements compare against the byte sequence of it, not its name.
Paul Eggert d5ddbd8b 2013-01-05T12:06:52 regex: conform to strict C * lib/regcomp.c (parse_bracket_exp): Add cast to conform to strict C. From Aharon Robbins.
Paul Eggert 964bbc2d 2013-01-01T16:27:46 regex: omit needless signed-pointer casts * lib/regcomp.c (build_charclass, build_charclass_op): Use char *, not unsigned char *, for class name and extra. The char values are always nonnegative so there's no need to insist on unsigned char * here, and using char * removes the need for casts. Reported by Aharon Robbins in <http://sourceware.org/ml/libc-alpha/2012-12/msg00456.html>.
Eric Blake 9fc81090 2013-01-01T00:50:58 maint: update all copyright year number ranges Run "make update-copyright". Compare to commit 1602f0a from last year. Signed-off-by: Eric Blake <eblake@redhat.com>
Paul Eggert 6410c7a6 2012-12-29T23:31:08 regex: implement rational ranges Reported by Aharon Robbins in <http://sourceware.org/ml/libc-alpha/2012-12/msg00456.html>. * lib/regcomp.c (build_range_exp) [!_LIBC]: * lib/regexec.c (check_node_accept_bytes) [!_LIBC]: Implement rational ranges.
Paul Eggert 585a8dcf 2012-12-29T22:52:17 regex: port to C89 Reported by Aharon Robbins in <http://sourceware.org/ml/libc-alpha/2012-12/msg00456.html>. * lib/regcomp.c (init_word_char): Declaration before statement.
Paul Eggert f3155e8a 2012-12-29T21:10:29 regex: merge glibc changes Also, copy the license wording from glibc. This simplifies merging changes. gnulib-tool will change the wording to GPL as appropriate, when importing it to other packages. The only glibc change made since the last merge, which needs merging, is: 2012-05-24 Andreas Schwab <schwab@linux-m68k.org> * lib/regex_internal.h (gettext): Remove use of INTUSE.
Paul Eggert d4903bb0 2012-06-26T15:16:07 regex: use locale-independent comparison for codeset name See Bruno Haible's comment in <http://bugs.gnu.org/10305#120>. * lib/regcomp.c (init_dfa): Use just ASCII case comparison for codeset name. * lib/regex_internal.h: Do not include <strings.h>, since we no longer use strcasecmp. * modules/regex (Depends-on): Remove strcase.
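The locale-independent codeset test mentioned above can be sketched as follows. This is an illustration of the idea only, not the code in `init_dfa`; `is_utf8_codeset` is a hypothetical name, and accepting both "UTF-8" and "UTF8" follows the 2010-01-04 entry further down this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if CODESET names UTF-8, accepting "UTF-8" and "UTF8"
   in any ASCII letter case.  Only explicit ASCII comparisons are
   used, so the result does not depend on the current locale, unlike
   strcasecmp.  Hypothetical helper, for illustration only.  */
bool
is_utf8_codeset (const char *codeset)
{
  assert (codeset != NULL);
  if ((codeset[0] == 'U' || codeset[0] == 'u')
      && (codeset[1] == 'T' || codeset[1] == 't')
      && (codeset[2] == 'F' || codeset[2] == 'f'))
    {
      /* Skip an optional hyphen after "UTF", then require exactly "8".  */
      const char *rest = codeset + 3 + (codeset[3] == '-');
      return strcmp (rest, "8") == 0;
    }
  return false;
}
```

The short-circuit `&&` chain stops at the first mismatch, so the function never reads past the terminating NUL of a short name such as "UT". In a real caller the argument would typically come from `nl_langinfo (CODESET)`.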
Paul Eggert 3a4836d1 2012-06-17T09:55:15 regex: avoid warning when pointers are not long * lib/regcomp.c (parse_dup_op, mark_opt_subexp): Cast between void * and uintptr_t, not long, for portability to hosts where pointers and long have different sizes. Issue noted by Daniel P. Berrange in <http://lists.gnu.org/archive/html/bug-gnulib/2012-06/msg00122.html> and fix suggested by Bruno Haible in <http://lists.gnu.org/archive/html/bug-gnulib/2012-06/msg00128.html>.
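The portability point here is that `long` can be narrower than a pointer (for example on 64-bit Windows), whereas `uintptr_t` is defined to hold a converted `void *`. A minimal sketch of passing a small index through a `void *` argument this way follows; `index_to_cookie` and `cookie_to_index` are hypothetical names, not functions from regcomp.c, and the sketch assumes the platform provides `uintptr_t`.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Pack a small index into a generic pointer argument.  Converting
   through uintptr_t keeps the intermediate integer as wide as a
   pointer.  Hypothetical helper, for illustration only.  */
void *
index_to_cookie (unsigned int idx)
{
  return (void *) (uintptr_t) idx;
}

/* Recover the index that index_to_cookie stored in COOKIE.  */
unsigned int
cookie_to_index (void *cookie)
{
  uintptr_t value = (uintptr_t) cookie;

  assert (value <= UINT_MAX);
  return (unsigned int) value;
}
```

Strictly speaking, converting an arbitrary integer to a pointer and back is implementation-defined in C, so this relies on the usual behavior of mainstream platforms.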
Paul Eggert 252b5245 2012-05-26T23:48:00 regex: don't assume uint64_t or uint32_t * lib/regcomp.c (init_word_char): Don't assume that the types uint64_t and uint32_t exist. The C standard doesn't guarantee them, and on some 32-bit compilers there is no uint64_t. Problem reported by Gianluigi Tiesi in <http://lists.gnu.org/archive/html/bug-gnulib/2012-03/msg00154.html>.
Paul Eggert 705a87c9 2012-04-04T00:56:15 regex: remove unnecessary type punning Problem reported by Vladimir Serbinenko in <http://lists.gnu.org/archive/html/bug-gnulib/2012-04/msg00006.html>. * lib/regex.h (struct re_pattern_buffer): Change the type of __REPB_PREFIX(buffer) from unsigned char * to struct re_dfa_t *. Fix comment to match code. * lib/regcomp.c (re_compile_fastmap, re_compile_fastmap_iter, regfree) (re_compile_internal, free_workarea_compile, analyze, lower_subexp) (parse, parse_reg_exp, parse_branch, parse_expression, parse_sub_exp): * lib/regexec.c (regexec, re_search_stub, re_search_internal) (set_regs): Omit no-longer-necessary casts.
Paul Eggert bc326c6d 2012-03-30T15:24:06 regex: pacify GCC when compiling GRUB * lib/regcomp.c (init_dfa): Make a pointer 'const', to avoid a diagnostic. Reported by Vladimir Serbinenko in <http://lists.gnu.org/archive/html/bug-gnulib/2012-03/msg00163.html>.
Paul Eggert 04ff3c18 2012-03-16T14:17:55 regex: diagnose too-large repeat counts in EREs Previously, the code did not diagnose the too-large repeat count in EREs like 'b{1000000000}'; instead, it silently treated the ERE as if it were 'b\{1000000000}', which is unexpected. * lib/regcomp.c (parse_dup_op): Fail with REG_ESIZE if a repeat count is too large. REG_ESIZE is used nowhere else, and the diagnostic is a reasonable one for this problem. Another option would be to create a new REG_OVERFLOW error for repeat counts that are too large. (fetch_number): Return RE_DUP_MAX + 1, not REG_ERROR, if the repeat count is too large, so that the caller can distinguish the two cases. * lib/regex.h (_REG_ESIZE): Document that this is now a generic "Too large" return code, and that repeat counts are one example of this.
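The diagnostic described in this commit can be checked from a caller with `regcomp` and `regerror`. The sketch below assumes a regex implementation that includes this change; `compile_and_describe` is a hypothetical helper, and the exact error code and message text depend on the implementation.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Compile PATTERN as an ERE.  On failure, store the regerror
   diagnostic in MSG (of size MSGSIZE) and return the nonzero regcomp
   status.  On success, store an empty string and return 0.
   Hypothetical helper, for illustration only.  */
int
compile_and_describe (const char *pattern, char *msg, size_t msgsize)
{
  regex_t re;
  int status;

  assert (pattern != NULL && msg != NULL && msgsize > 0);
  status = regcomp (&re, pattern, REG_EXTENDED);
  if (status == 0)
    {
      regfree (&re);
      msg[0] = '\0';
    }
  else
    regerror (status, &re, msg, msgsize);
  return status;
}
```

After this change, a pattern such as `"b{1000000000}"` is expected to fail to compile (with REG_ESIZE, according to the commit message) instead of being treated as a literal brace sequence, while a small count such as `"b{3}"` compiles normally.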
Paul Eggert 341111f6 2012-02-09T21:39:05 maint: replace FSF snail-mail addresses with URLs * config/argz.mk, lib/accept4.c, lib/alignof.h, lib/alloca.in.h: * lib/alphasort.c, lib/arcfour.c, lib/arcfour.h, lib/arctwo.c: * lib/arctwo.h, lib/argz.c, lib/arpa_inet.in.h, lib/asnprintf.c: * lib/asprintf.c, lib/assert.in.h, lib/base32.c, lib/base32.h: * lib/base64.c, lib/base64.h, lib/c-ctype.c, lib/c-ctype.h: * lib/c-strcase.h, lib/c-strcasecmp.c, lib/c-strncasecmp.c: * lib/check-version.c, lib/check-version.h, lib/config.charset: * lib/ctype.in.h, lib/des.c, lib/des.h, lib/dup3.c, lib/errno.in.h: * lib/float+.h, lib/fnmatch.c, lib/fnmatch.in.h, lib/fnmatch_loop.c: * lib/fseeko.c, lib/gai_strerror.c, lib/gc-gnulib.c: * lib/gc-libgcrypt.c, lib/gc-pbkdf2-sha1.c, lib/gc.h: * lib/getaddrinfo.c, lib/getdelim.c, lib/getfilecon.c, lib/getline.c: * lib/getlogin_r.c, lib/getpass.c, lib/getpass.h, lib/gettext.h: * lib/gettimeofday.c, lib/glob.in.h, lib/glthread/cond.c: * lib/glthread/cond.h, lib/glthread/lock.c, lib/glthread/lock.h: * lib/glthread/thread.c, lib/glthread/thread.h: * lib/glthread/threadlib.c, lib/glthread/yield.h, lib/hmac-md5.c: * lib/hmac-sha1.c, lib/hmac.h, lib/iconv.c, lib/iconv.in.h: * lib/iconv_close.c, lib/iconv_open.c, lib/inet_ntop.c, lib/isfinite.c: * lib/isinf.c, lib/iswblank.c, lib/langinfo.in.h, lib/link.c: * lib/localcharset.c, lib/localcharset.h, lib/lseek.c, lib/malloc.c: * lib/malloca.c, lib/malloca.h, lib/md2.c, lib/md2.h, lib/md4.c: * lib/md4.h, lib/md5.c, lib/md5.h, lib/memmem.c, lib/mempcpy.c: * lib/memset.c, lib/memxor.c, lib/memxor.h, lib/minmax.h, lib/mktime.c: * lib/msvc-inval.c, lib/msvc-inval.h, lib/msvc-nothrow.c: * lib/msvc-nothrow.h, lib/netdb.in.h, lib/netinet_in.in.h, lib/nproc.c: * lib/nproc.h, lib/obstack_printf.c, lib/pathmax.h, lib/pipe.c: * lib/pipe2.c, lib/poll.c, lib/poll.in.h, lib/printf-args.c: * lib/printf-args.h, lib/printf-parse.c, lib/printf-parse.h: * lib/pselect.c, lib/pthread.in.h, lib/pty-private.h, lib/pty.in.h: * 
lib/read-file.c, lib/read-file.h, lib/ref-add.sin, lib/ref-del.sin: * lib/regcomp.c, lib/regex.c, lib/regex.h, lib/regex_internal.c: * lib/regex_internal.h, lib/regexec.c, lib/rijndael-alg-fst.c: * lib/rijndael-alg-fst.h, lib/rijndael-api-fst.c: * lib/rijndael-api-fst.h, lib/rint.c, lib/rintf.c, lib/rintl.c: * lib/round.c, lib/roundf.c, lib/roundl.c, lib/scandir.c, lib/select.c: * lib/sha1.c, lib/sha1.h, lib/size_max.h, lib/snprintf.c: * lib/stdalign.in.h, lib/stdarg.in.h, lib/stdbool.in.h: * lib/stddef.in.h, lib/stdint.in.h, lib/stdio.in.h, lib/str-kmp.h: * lib/str-two-way.h, lib/strcasecmp.c, lib/strcasestr.c, lib/strdup.c: * lib/striconv.c, lib/striconv.h, lib/string.in.h, lib/strings.in.h: * lib/strncasecmp.c, lib/strndup.c, lib/strnlen.c, lib/strpbrk.c: * lib/strptime.c, lib/strsep.c, lib/strstr.c, lib/strverscmp.c: * lib/sys_file.in.h, lib/sys_ioctl.in.h, lib/sys_select.in.h: * lib/sys_socket.in.h, lib/sys_stat.in.h, lib/sys_time.in.h: * lib/sys_times.in.h, lib/sys_types.in.h, lib/sys_uio.in.h: * lib/sys_utsname.in.h, lib/sys_wait.in.h, lib/tcgetsid.c: * lib/termios.in.h, lib/time.in.h, lib/time_r.c, lib/timegm.c: * lib/times.c, lib/unictype/3level.h, lib/unictype/3levelbit.h: * lib/unistd.in.h, lib/vasnprintf.c, lib/vasnprintf.h, lib/vasprintf.c: * lib/vsnprintf.c, lib/waitpid.c, lib/wchar.in.h, lib/wctype.in.h: * lib/xsize.h, tests/test-closein.c, tests/test-des.c: * tests/test-fclose.c, tests/test-fgetc.c, tests/test-filevercmp.c: * tests/test-fputc.c, tests/test-fread.c, tests/test-fwrite.c: * tests/test-gc-arcfour.c, tests/test-gc-arctwo.c, tests/test-gc-des.c: * tests/test-gc-hmac-md5.c, tests/test-gc-hmac-sha1.c: * tests/test-gc-md2.c, tests/test-gc-md4.c, tests/test-gc-md5.c: * tests/test-gc-pbkdf2-sha1.c, tests/test-gc-rijndael.c: * tests/test-gc-sha1.c, tests/test-gc.c, tests/test-getdelim.c: * tests/test-getline.c, tests/test-getndelim2.c, tests/test-md2.c: * tests/test-md4.c, tests/test-parse-datetime.c, tests/test-perror.c: * 
tests/test-perror2.c, tests/test-pipe.c, tests/test-pipe2.c: * tests/test-poll.c, tests/test-quotearg-simple.c: * tests/test-quotearg.c, tests/test-quotearg.h: * tests/test-round-ieee.c, tests/test-round1.c: * tests/test-roundf-ieee.c, tests/test-roundf1.c: * tests/test-roundl-ieee.c, tests/test-roundl.c: * tests/test-safe-alloc.c, tests/test-sigpipe.c: * tests/test-spawn-pipe-child.c, tests/test-spawn-pipe-main.c: * tests/test-strerror.c, tests/test-strerror_r.c: * tests/test-strsignal.c, tests/test-strverscmp.c: * tests/test-xmemdup0.c: Replace FSF snail mail addresses with URLs, as per GNU coding standards. See glibc bug <http://sourceware.org/bugzilla/show_bug.cgi?id=13673>.
Paul Eggert 5a460138 2012-02-07T22:47:01 regex: merge glibc changes * lib/regcomp.c (init_dfa): Tighten overflow checks to test for IDX_MAX too, since IDX_MAX can be much less than SIZE_MAX. (init_word_char): Work even if bitset words are not exactly 32 or 64 bits wide. Don't assume there are no padding bits. * lib/regex.c [_LIBC]: Do not include <config.h>. [!_LIBC]: Add pragmas to ignore -Wsuggest-attributes=pure and -Wtype-limits. * lib/regex.h (__USE_GNU): Renamed from __USE_GNU_REGEX, to avoid needless disagreement with glibc. All uses changed. Define it to 1 only if _GNU_SOURCE, to match glibc. (_REG_RM_NAME): Remove; no longer needed, since the names in question are now all protected by __USE_GNU. (_REG_RE_NAME): Remove; replaced by glibc's __REPB_PREFIX. (REG_TRANSLATE_TYPE): Remove; replaced by glibc's __RE_TRANSLATE_TYPE. * lib/regex_internal.h (MIN): New macro. 2012-01-03 Ulrich Drepper <drepper@gmail.com> * lib/regcomp.c (init_word_char): Optimize regex a bit. 2011-12-30 Jakub Jelinek <jakub@redhat.com> * lib/regex_internal.c (re_string_fetch_byte_case): Fix up regcomp/regexec. The problem is that parse_bracket_symbol is miscompiled, and it turns out it is because of an incorrect attribute on re_string_fetch_byte_case. Unlike re_string_peek_byte_case, this one is really not pure, it modifies memory (increments pstr->cur_idx), and with the pure attribute GCC assumed it doesn't and it cached the presumed value of regexp->cur_idx in a variable across the for (;; ++i) { if (i >= BRACKET_NAME_BUF_SIZE) return REG_EBRACK; if (token->type == OP_OPEN_CHAR_CLASS) ch = re_string_fetch_byte_case (regexp); else ch = re_string_fetch_byte (regexp); if (re_string_eoi(regexp)) return REG_EBRACK; if (ch == delim && re_string_peek_byte (regexp, 0) == ']') break; elem->opr.name[i] = ch; } 2011-11-29 Andreas Schwab <schwab@redhat.com> * lib/regcomp.c (build_equiv_class): Fix access after end of search string in regex matcher. 
2011-11-12 Ulrich Drepper <drepper@redhat.com> * lib/regex_internal.c, lib/regex_internal.h: Fix warnings in regex. 2011-10-12 Ulrich Drepper <drepper@redhat.com> * lib/regcomp.c (parse_branch): One more regex memory leak fixed. 2011-10-11 Ulrich Drepper <drepper@redhat.com> * lib/regcomp.c (parse_branch, parse_sub_exp): More regex memory leak fixes and tests. (parse_sub_exp, parse_bracket_exp): Fix memory leak for some invalid regular expressions. 2011-05-28 Ulrich Drepper <drepper@gmail.com> * lib/regex_internal.c, lib/regexec.c: Fix unnecessary overallocation due to incomplete character. When incomplete characters are found at the end of a string the code ran amok and allocated lots of memory. Stricter limits are now in place. 2011-05-20 Reuben Thomas <rrt@sc3d.org> * lib/regex.h: Update documentation. 2011-05-16 Aharon Robbins <arnold@skeeve.com> * lib/regex.h: Update RE_SYNTAX*_AWK constants. 2010-05-05 Andreas Schwab <schwab@redhat.com> * lib/regexec.c (find_collation_sequence_value): Fix lookup of collation sequence value during regexp matching. 2010-01-22 Ulrich Drepper <drepper@redhat.com> * lib/regex_internal.c (re_dfa_add_node): Extend overflow detection. 2008-01-16 Ulrich Drepper <drepper@redhat.com> * lib/regex.h: Cleanup namespace. 2007-11-26 Ulrich Drepper <drepper@redhat.com> * lib/regex.h (REG_ENOSYS): Define REG_ENOSYS also for __USE_XOPEN2K. 2007-08-26 Ulrich Drepper <drepper@redhat.com> * lib/regex_internal.h: Prevent some declarations and definitions to be seen when used in tests. 2005-05-06 Ulrich Drepper <drepper@redhat.com> * lib/regex_internal.h: Include bits/libc-lock.h or define dummy __libc_lock_* macros if not _LIBC. (struct re_dfa_t): Add lock.
Paul Eggert a4d796fb 2012-02-05T13:42:03 maint: spelling fixes
Paul Eggert 51e801f2 2012-01-05T23:53:49 In commentary, do not use ` to quote.
Jim Meyering 1602f0af 2012-01-01T10:04:58 maint: update all copyright year number ranges Run "make update-copyright".
Jim Meyering d60f3b0c 2011-01-01T20:17:23 maint: update almost all copyright ranges to include 2011 Run the new "make update-copyright" rule.
Jim Meyering 602e3e6b 2010-03-19T21:26:36 regcomp.c: make non-_LIBC implementation of build_range_exp consistent The _LIBC implementation of build_range_exp correctly honors the RE_NO_EMPTY_RANGES flag when checking for reversed range endpoints. However, the non-_LIBC implementation would ignore that syntax-bit flag and return REG_ERANGE unconditionally. This change makes it honor that flag. * lib/regcomp.c (build_range_exp) [!_LIBC]: Add a parameter: "syntax". Make two pointer parameters "const". Use "syntax" bits in order to honor RE_NO_EMPTY_RANGES. (parse_bracket_exp): Update caller.
Jim Meyering 9d0ad652 2010-02-03T18:01:36 regcomp.c: avoid the sole warning from gcc's -Wtype-limits * lib/regcomp.c (TYPE_SIGNED): Define. (parse_dup_op): Use it to avoid the sole warning from -Wtype-limits.
Jim Meyering 107fb0e6 2010-02-03T17:15:03 regcomp.c: avoid a new -Wshadow warning * lib/regcomp.c (create_initial_state): Do not shadow local "err".
Jim Meyering d8352858 2010-01-19T09:23:51 regcomp.c: spelling and merge-artifact from glibc * lib/regcomp.c: Merge remainder of glibc's 2da42bc06566bc89785e580fa1ac89b4c9f2a63c.
Jim Meyering 6bd173f7 2010-01-19T09:22:30 regcomp.c: sync white-space changes from glibc * lib/regcomp.c: Merge to accommodate white space changes from glibc's 2da42bc06566bc89785e580fa1ac89b4c9f2a63c.
Jim Meyering 419dde7c 2010-01-19T09:18:19 regcomp.c: do not ignore internal return values * lib/regcomp.c: Do not ignore internal return values. This is from glibc's 2da42bc06566bc89785e580fa1ac89b4c9f2a63c, but without its white-space changes and spelling fixes.
Ulrich Drepper d55324f1 2010-01-04T11:18:51 regcomp, regexec, fnmatch: avoid array bounds read error * lib/regcomp.c (build_equiv_class): From glibc: Use only the low 24 bits of a findidx return value as an index into the weights array. Patch by Ulrich Drepper: http://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commit;h=b7d1c5fa30 * lib/regexec.c (check_node_accept_bytes): Likewise. * lib/fnmatch_loop.c (FCT): Likewise.
Ulrich Drepper d7725994 2010-01-04T10:59:51 regcomp: skip collseq lookup when there are no rules * lib/regcomp.c (lookup_collation_sequence_value): From glibc: http://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commitdiff;h=a532a41df58
Ulrich Drepper 7f206b67 2010-01-04T10:51:34 regcomp: recognize ill-formed { } expressions * lib/regcomp.c (parse_dup_op): From glibc: http://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commitdiff;h=a87cd2894cb
Jim Meyering 608bb2b7 2010-01-04T10:47:58 regcomp: fix typo in comment * lib/regcomp.c (duplicate_node_closure): Sync from glibc. s/satisfy/satisfies/.
Jim Meyering f9b39ec6 2010-01-04T09:09:22 regcomp: sync from glibc: remove dead store * lib/regcomp.c (duplicate_node_closure): Remove useless search_duplicated_node call and dead store.
Jim Meyering 0cfc3b87 2010-01-04T09:07:52 regcomp: sync from glibc; always use nl_langinfo * lib/regcomp.c (init_dfa) [!LIBC]: Always use nl_langinfo (CODESET), now that gnulib provides it. Recognize UTF8 as well as UTF-8. * lib/regex_internal.h: Always include <langinfo.h>, now. * modules/regex (Depends-on): Add nl_langinfo.
Jim Meyering b2e2010c 2010-01-01T10:31:12 update nearly all FSF copyright year lists to include 2010 Use the same procedure as for 2009, outlined in http://thread.gmane.org/gmane.comp.lib.gnulib.bugs/20081
Paolo Bonzini 2bc9cabc 2009-11-25T11:41:09 regex: Fix fastmap for multibyte character ranges. * lib/regcomp.c (re_compute_fastmap_iter): Add all multibyte lead characters when a multibyte character range is included.
Paolo Bonzini 493f15ba 2009-01-09T09:10:36 regex: fix glibc bug 9697 2009-01-09 Paolo Bonzini <bonzini@gnu.org> * lib/regcomp.c (re_compile_fastmap_iter): Rewrite COMPLEX_BRACKET handling.
Paolo Bonzini 03da0525 2009-01-09T09:00:58 regex: replace mbrtowc with __mbrtowc. 2009-01-09 Paolo Bonzini <bonzini@gnu.org> * lib/regcomp.c (re_compile_fastmap_iter): Use __mbrtowc. * lib/regex_internal.c (build_wcs_buffer, build_wcs_upper_buffer, re_string_skip_chars, re_string_reconstruct): Likewise. * lib/regex_internal.h [!_LIBC] (__mbrtowc): New #define.
Eric Blake 39fc05fb 2008-05-15T14:37:29 Fix violation of <stdbool.h> replacement in regex. * lib/regcomp.c (re_compile_internal): Avoid implicit cast to bool. Reported by Heinrich Mislik <Heinrich.Mislik@univie.ac.at>. Signed-off-by: Eric Blake <ebb9@byu.net>
Paolo Bonzini 1f191226 2008-05-15T08:50:06 optimize double anchors such as ^$ 2008-05-15 Paolo Bonzini <bonzini@gnu.org> * lib/regcomp.c (optimize_utf8): Add a note on why we test opr.ctx_type. (calc_first): Initialize constraint field. (duplicate_node_closure): Use it instead of special casing ANCHORS. Fix grammar. (duplicate_node): Merge constraint field for all node types. (calc_eclosure_iter): Look at constraint field for all node types. * lib/regex_internal.c (create_cd_newstate): Don't look at opr.ctx_type.
Jim Meyering 55a55895 2007-12-01T15:34:41 Fix a 4-year-old used-uninitialized bug in regcomp.c. * lib/regcomp.c (optimize_utf8): Fix a typo, s/idx/ctx_type/, that would inhibit utf8-optimization of a regexp containing line- or buffer-anchors, e.g., `^', `$'.
Paul Eggert dea6f708 2007-02-15T00:16:55 Fix regex code so it doesn't rely on strcasecmp. * lib/regex_internal.h: Include <langinfo.h> only if _LIBC is defined. Otherwise, include gnulib's langinfo.h. * lib/regcomp.c (init_dfa): Don't use strcasecmp, as it can have undesirable behavior in non-C locales. Instead, rely on locale_charset. * m4/regex.m4 (gl_PREREQ_REGEX): Don't require AM_LANGINFO_CODESET. * modules/regex (FILES): Remove m4/codeset.m4. (Depends-on): Add localcharset. Remove strcase.
Paolo Bonzini d622258b 2007-02-05T15:38:59 2007-02-05 Paolo Bonzini <bonzini@gnu.org> Merge upstream fix for glibc bugzilla #3957: 2007-02-05 Jakub Jelinek <jakub@redhat.com> * lib/regcomp.c (parse_bracket_exp): Set '\n' bit rather than '\0' bit for RE_HAT_LISTS_NOT_NEWLINE. (build_charclass_op): Remove bogus comment.
Paul Eggert f7328f9d 2007-02-02T22:15:43 Avoid mempcpy in the regex code, as the string.h mempcpy stuff is causing more trouble than it's curing. * lib/regex_internal.h (__mempcpy): Remove. * lib/regcomp.c (regerror): Rewrite to avoid the need for mempcpy (and make the code a tad smaller to boot). * m4/regex.m4 (gl_PREREQ_REGEX): Don't check for mempcpy.
Paul Eggert ad431555 2007-01-29T00:37:14 * lib/regex.h (_Restrict_): Renamed from __restrict, to avoid a circularity problem with HP-UX ia64 reported by Bob Proulx in <http://lists.gnu.org/archive/html/bug-gnulib/2007-01/msg00394.html>. All uses changed. (_Restrict_arr_): Renamed from __restrict_arr, for similar reasons. All uses changed. * lib/regcomp.c, lib/regexec.c: Change all uses from __restrict to _Restrict_. * lib/regexec.c (regexec): Declare pmatch with _Restrict_arr_, so that the parameter matches the prototype.
Jim Meyering b0cf73ef 2006-11-28T08:35:51 * lib/regcomp.c (parse_branch): Rename local, exp->expr, to avoid warning from "gcc -Wshadow" about shadowing the builtin.
Paul Eggert 4a2097ae 2006-04-13T22:14:12 * regcomp.c (init_dfa): Don't use wchar_t or wctype_t if RE_ENABLE_I18N is not defined. Problem reported by Mark D. Baushke via Derek R. Price. * regex.h (RE_DUP_MAX): Update comment to match current implementation.
Paul Eggert 78ae8220 2006-04-11T05:13:09 Fix space-tab problem. From Jim Meyering.
Paul Eggert 8335a4d6 2006-04-10T06:43:33 Merge regex changes from libc, removing some of our POSIX-conformance changes that were rejected and redoing them in a less-intrusive way. * lib/regcomp.c (re_compile_internal, init_dfa): Length arg is now size_t, not Idx. All uses changed. (peek_token): Forward decl now says internal_function. (__re_error_msgid, __re_error_msgid_idx): Now static rather than extern with attribute_hidden. (re_compile_pattern) [!defined _LIBC]: Use K&R-style defn. For some reason libc prefers K&R style defns for external functions. (regerror) [!defined _LIBC]: Likewise. (re_set_syntax, re_compile_fastmap, regcomp, regfree, re_comp): (seek_collating_symbol_entry, lookup_collation_sequence_value): (build_range_exp, build_collating_symbol): Use K&R-style defn. (re_compile_fastmap): Use '\0' to memset, not 0. (utf8_sb_map): Make the calculations more obvious. (init_dfa, parse_bracket_exp, build_charclass_op): Call calloc and cast result, as glibc does. (init_word_char, fetch_token, peek_token, peek_token_bracket): (build_range_exp, build_collating_symbol): Now internal functions. * lib/regex.c [!defined _LIBC]: Allow compiling with C++ compilers. * lib/regex.h (__USE_GNU_REGEX): New macro. Don't depend on _REGEX_SOURCE any more; depend on _GNU_SOURCE instead. Don't depend on VMS; depend on __VMS instead, for POSIX namespace cleanness. (regoff_t): Define to ssize_t, not long int. Remove the REG_ macros named below. Instead, make the old names (e.g., RE_BACKSLASH_ESCAPE_IN_LISTS) visible only if __USE_GNU_REGEX. 
(REG_BACKSLASH_ESCAPE_IN_LISTS): (REG_BK_PLUS_QM, REG_CHAR_CLASSES, REG_CONTEXT_INDEP_ANCHORS): (REG_CONTEXT_INDEP_OPS, REG_CONTEXT_INVALID_OPS): (REG_DOT_NEWLINE, REG_DOT_NOT_NULL, REG_HAT_LISTS_NOT_NEWLINE): (REG_INTERVALS, REG_LIMITED_OPS, REG_NEWLINE_ALT): (REG_NO_BK_BRACES, REG_NO_BK_PARENS, REG_NO_BK_REFS): (REG_NO_BK_VBAR, REG_NO_EMPTY_RANGES): (REG_UNMATCHED_RIGHT_PAREN_ORD, REG_NO_POSIX_BACKTRACKING): (REG_NO_GNU_OPS, REG_DEBUG, REG_INVALID_INTERVAL_ORD): (REG_IGNORE_CASE, REG_CARET_ANCHORS_HERE): (REG_CONTEXT_INVALID_DUP, REG_NO_SUB, REG_SYNTAX_EMACS): (REG_SYNTAX_AWK, REG_SYNTAX_GNU_AWK, REG_SYNTAX_POSIX_AWK): (REG_SYNTAX_GREP, REG_SYNTAX_EGREP, REG_SYNTAX_POSIX_EGREP): (REG_SYNTAX_ED, REG_SYNTAX_SED, _REG_SYNTAX_POSIX_COMMON): (REG_SYNTAX_POSIX_BASIC, REG_SYNTAX_POSIX_MINIMAL_BASIC): (REG_SYNTAX_POSIX_EXTENDED, REG_SYNTAX_POSIX_MINIMAL_EXTENDED): (REG_DUP_MAX, REG_UNALLOCATED, REG_REALLOCATE, REG_FIXED): (REG_NREGS): Remove. All uses replaced by the old RE_* names. (RE_BACKSLASH_ESCAPE_IN_LISTS): (RE_BK_PLUS_QM, RE_CHAR_CLASSES, RE_CONTEXT_INDEP_ANCHORS): (RE_CONTEXT_INDEP_OPS, RE_CONTEXT_INVALID_OPS): (RE_DOT_NEWLINE, RE_DOT_NOT_NULL, RE_HAT_LISTS_NOT_NEWLINE): (RE_INTERVALS, RE_LIMITED_OPS, RE_NEWLINE_ALT): (RE_NO_BK_BRACES, RE_NO_BK_PARENS, RE_NO_BK_REFS): (RE_NO_BK_VBAR, RE_NO_EMPTY_RANGES): (RE_UNMATCHED_RIGHT_PAREN_ORD, RE_NO_POSIX_BACKTRACKING): (RE_NO_GNU_OPS, RE_DEBUG, RE_INVALID_INTERVAL_ORD): (RE_IGNORE_CASE, RE_CARET_ANCHORS_HERE): (RE_CONTEXT_INVALID_DUP, RE_NO_SUB): Don't bother having these macros be independent of each others' values, since they no longer exist in the POSIX name space. Rename the following member names back to their old names, unless !__USE_GNU_REGEX. All uses changed back. (buffer): Renamed from re_buffer. (allocated): Renamed from re_allocated. (used): Renamed from re_used. (syntax): Renamed from re_syntax. (fastmap): Renamed from re_fastmap. (translate): Renamed from re_translate. 
(can_be_null): Renamed from re_can_be_null. (regs_allocated): Renamed from re_regs_allocated. (fastmap_accurate): Renamed from re_fastmap_accurate. (no_sub): Renamed from re_no_sub. (not_bol): Renamed from re_not_bol. (not_eol): Renamed from re_not_eol. (newline_anchor): Renamed from re_newline_anchor. (num_regs): Renamed from rm_num_regs. (start): Renamed from rm_start. (end): Renamed from rm_end. (free_state): Move up a bit. * lib/regex_internal.h (inline) [__GNUC__ < 3 && defined _LIBC]: #define to be empty. (ASCII_CHARS): New macro, replacing all uses of 0x80 and/or SBC_MAX / 2 when that is what is intended. (SBC_MAX): Define to UCHAR_MAX + 1, not 256. (__re_error_msgid, __re_error_msgid_idx): Remove decls; not needed. (MAX): New macro. (re_xmalloc, re_calloc, re_xrealloc, re_x2realloc): Remove. All uses changed back to re_malloc, etc. It's now the caller's responsibility to check for overflow; all callers changed. (re_alloc_oversized, re_x2alloc_oversized, re_xnmalloc, re_xnrealloc): (re_x2nrealloc): Remove. (free_state): Remove decl. * lib/regexc.c (regexec, re_match, re_search, re_match_2, re_search_2): (re_set_registers, re_exec): Use K&R-style defn. 2006-01-31 Roland McGrath <roland@redhat.com> * lib/regcomp.c (calc_eclosure_iter): Remove dead variables. Reported by Mike Frysinger <vapier@gentoo.org>. 2006-01-15 Andreas Jaeger <aj@suse.de> [BZ #1950] * lib/regex_internal.c (re_string_reconstruct): Adjust for build_wcs_upper_buffer change. (build_wcs_upper_buffer): Change return type. 2005-12-10 Ulrich Drepper <drepper@redhat.com> * lib/regex_internal.h: Include <stdint.h> if available. 2005-12-06 Paolo Bonzini <bonzini@gnu.org> * lib/regex_internal.h (SIZE_MAX): Provide a default definition. 2005-10-14 Ulrich Drepper <drepper@redhat.com> * lib/regcomp.c: Adjust for changed secondary hash function. 2005-09-30 Ulrich Drepper <drepper@redhat.com> * lib/regex.h: Pretty printing. Clean up namespace a bit. 
2005-09-30 Jakub Jelinek <jakub@redhat.com> * lib/regexec.c (update_cur_sifted_state, check_arrival, check_arrival_add_next_nodes): Avoid using uninitialized variable. 2005-09-06 Paul Eggert <eggert@cs.ucla.edu> Ulrich Drepper <drepper@redhat.com> [BZ #1302] * lib/regex_internal.h (bitset_t): Renamed from bitset. All uses changed. (bitset_word_t): Renamed from bitset_word. All uses changed. 2005-09-22 Ulrich Drepper <drepper@redhat.com> [BZ #281] * lib/regex.h: Define RE_TRANSLATE_TYPE as unsigned char *. * lib/regcomp.c: Remove unnecessary uses of unsigned RE_TRANSLATE_TYPE. * lib/regex_internal.h: Likewise. * lib/regex_internal.c: Likewise. * lib/regexec.c: Likewise. Based on a patch by Stepan Kasal <kasal@ucw.cz>. 2005-09-07 Ulrich Drepper <drepper@redhat.com> * lib/regexec.c (find_recover_state): Remove unnecessary initialization. (transit_state_bkref): Make DFA a const pointer. (get_subexp): Likewise. (check_arrival): Likewise. (update_cur_sifted_state): Likewise. (re_search_internal): Likewise. (prune_impossible_nodes): Likewise. (acquire_init_state_context): Likewise. (proceed_next_node): Likewise. (set_regs): Likewise. (free_fail_stack_return): Likewise. (check_arrival_expand_ecl): Mark DFA parameter as const. (check_arrival_expand_ecl_sub): Likewise. (check_subexp_limits): Likewise. (sub_epsilon_src_nodes): Likewise. (add_epsilon_src_nodes): Likewise. (merge_state_array): Likewise. (update_regs): Likewise. (build_trtable): Likewise. (sift_states_backward): Mark MCTX parameter as const. (build_sifted_states): Likewise. (update_cur_sifted_state): Likewise. (sift_states_mkref): Likewise. (check_arrival_expand_ecl): Mark eclosure as const. (check_dst_limits_calc_pos_1): Likewise. * lib/regex_internal.h (re_match_context_t): Make dfa a const pointer. 2005-09-06 Ulrich Drepper <drepper@redhat.com> * lib/regexec.c (merge_state_with_log): Define dfa as const pointer. (transit_state_sb): Likewise. (transit_state_mb): Likewise. (sift_states_iter_mb): Likewise. 
(check_arrival_add_next_nodes): Likewise. (check_node_accept_bytes): Change first parameter to pointer-to-const. [_LIBC] (re_search_2_stub): Use mempcpy. * lib/regex_internal.c (re_string_reconstruct): Avoid calling mbrtowc for very simple UTF-8 case. * lib/regex_internal.c (re_acquire_state): Make DFA pointer arg a pointer-to-const. (re_acquire_state_context): Likewise. * lib/regex_internal.h: Adjust prototypes. * lib/regex.c: Prevent using C++ compilers. * lib/regex_internal.c (re_acquire_state): Minor code rearrangement. (re_acquire_state_context): Likewise.
Derek R. Price 1e3866b6 2005-09-16T00:23:36 * regcomp.c, regexec.c, regex_internal.c: Back out previous changes, consolidating in... * regex_internal.h: ...this file.
Derek R. Price 594190cb 2005-09-15T19:14:23 * regex_internal.h: Blank `pure' for GNUC < 3. * regex_internal.c: Ditto, using this... (__GNUC_PREREQ): ...new macro. * regcomp.c, regexec.c: Blank `always_inline' for GNUC < 3.1 using... (__GNUC_PREREQ): ...this new macro.
Paul Eggert c4f640f1 2005-09-06T07:36:48 Change bitset word type from unsigned int to unsigned long int, as this has better performance on typical 64-bit hosts. Port bitset code to hosts with unusual word sizes. * lib/regcomp.c (build_equiv_class, build_charclass): (build_range_exp, build_collating_symbol): Prefer bitset to re_bitset_ptr_t in prototypes, when the actual argument is a bitset. This is merely a style issue, but it makes it clearer that an entire array is expected. (re_compile_fastmap_iter, init_dfa, init_word_char, optimize_subexps): * lib/regcomp.c (lower_subexp, parse_bracket_exp): (build_charclass_op): Port to the case where bitset_word is not the same as unsigned int. * lib/regex_internal.h (bitset_set, bitset_clear, bitset_contain): (bitset_not, bitset_merge, bitset_set_all, bitset_mask): Likewise. * lib/regexec.c (check_dst_limits_calc_pos_1): (check_subexp_matching_top): (build_trtable, group_nodes_into_DFAstates): Likewise. * lib/regcomp.c (re_compile_fastmap_iter, utf8_sb_map): (optimize_utf8): Don't assume that SBC_MAX is a multiple of BITSET_WORD_BITS. * lib/regex_internal.h (bitset_set_all, bitset_not): Likewise. * lib/regexec.c (group_nodes_into_DFAstates): Likewise. * lib/regcomp.c (utf8_sb_map): Don't assume UINT_MAX == 0xffffffff. * lib/regcomp.c (optimize_subexps, lower_subexp): Work even if bitset_word has holes in its bitwise representation. * lib/regex_internal.h (BITSET_WORD_BITS): Likewise. * lib/regexec.c (check_dst_limits_calc_pos_1): (check_subexp_matching_top): Likewise. * lib/regex_internal.c (re_string_reconstruct): Don't assume UCHAR_MAX == 255. * lib/regex_internal.h (bitset_set_all): Likewise. * lib/regex_internal.h (BITSET_WORD_BITS): Renamed from UINT_BITS. All uses changed. (BITSET_WORDS): Renamed from BITSET_UINTS. All uses changed. (bitset_word): New type, replacing 'unsigned int' for bitset uses. All uses changed. (BITSET_WORD_MAX): New macro. 
(bitset_set, bitset_clear, bitset_contain, bitset_empty): (bitset_set_all, bitset_copy): Now inline functions, not macros. (bitset_empty, bitset_copy): Prefer sizeof (bitset) to multiplying it out ourselves. (bitset_not_merge): Remove; unused. (bitset_contain): Return bool, not unsigned int with one bit on. All callers changed. * lib/regexec.c (build_trtable): Don't assume bitset has no stricter alignment than re_node_set; do this by defining a new internal type struct dests_alloc and using it to allocate memory. * config/srclist.txt: Add glibc bug 1302.
Paul Eggert 812cbebe 2005-09-02T22:54:59 Check for arithmetic overflow when calculating sizes, to prevent some buffer-overflow issues. These patches are conservative, in the sense that when I couldn't determine whether an overflow was possible, I inserted a run-time check. * regex_internal.h (re_xmalloc, re_xrealloc, re_x2realloc): New macros. (SIZE_MAX) [!defined SIZE_MAX]: New macro. (re_alloc_oversized, re_x2alloc_oversized, re_xnmalloc): (re_xnrealloc, re_x2nrealloc): New inline functions. * lib/regcomp.c (init_dfa, analyze, build_range_exp, parse_bracket_exp): (build_equiv_class, build_charclass): Check for arithmetic overflow in size expression calculations. * lib/regex_internal.c (re_string_realloc_buffers): (build_wcs_upper_buffer, re_node_set_add_intersect): (re_node_set_init_union, re_node_set_insert, re_node_set_insert_last): (re_dfa_add_node, register_state): Likewise. * lib/regexec.c (re_search_stub, re_copy_regs, re_search_internal): (prune_impossible_nodes, push_fail_stack, set_regs, check_arrival): (build_trtable, extend_buffers, match_ctx_init, match_ctx_add_entry): (match_ctx_add_subtop, match_ctx_add_sublast): Likewise.
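Commit 812cbebe guards size calculations against arithmetic overflow before allocating. The following is only a minimal sketch of that general technique, not the gnulib code: the names `alloc_oversized` and `checked_nmalloc` are hypothetical and are not the definitions of `re_alloc_oversized` or `re_xnmalloc` named in the entry.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: true if N * SIZE cannot be represented in
   size_t.  Dividing instead of multiplying keeps the test itself
   free of overflow.  */
static bool
alloc_oversized (size_t n, size_t size)
{
  return size != 0 && n > SIZE_MAX / size;
}

/* Hypothetical helper: allocate room for N objects of SIZE bytes
   each.  Returns NULL if the byte count would overflow or if malloc
   fails, so the caller can report an out-of-memory error.  */
static void *
checked_nmalloc (size_t n, size_t size)
{
  if (alloc_oversized (n, size))
    return NULL;
  return malloc (n * size);
}
```

A caller would treat a NULL result the same way as any other allocation failure, for example by returning REG_ESPACE.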
Paul Eggert 1e5cfc92 2005-09-01T19:41:07 Use bool where appropriate. * lib/regcomp.c (re_set_fastmap): ICASE arg is bool, not int. All callers changed. (calc_eclosure_iter): Likewise, for ROOT arg. (parse_bracket_element): Likewise, for ACCEPT_HYPHEN arg. (build_charclass_op): Likewise, for NON_MATCH arg. * lib/regex_internal.c (re_string_allocate, re_string_construct): (re_string_construct_common): Likewise, for ICASE arg. * lib/regexec.c (re_search_2_stub, re_search_stub): Likewise, for RET_LEN arg. (check_matching): Likewise, for FL_LONGEST_MATCH arg. (set_regs): Likewise, for FL_BACKTRACK arg. * lib/regcomp.c (re_compile_fastmap_iter, optimize_utf8): (duplicate_node_closure, calc_inveclosure, calc_eclosure): (calc_eclosure_iter, parse_bracket_exp): Use bool for internal variables that are booleans. * lib/regexec.c (re_search_internal, check_matching): (proceed_next_node): (set_regs, build_sifted_states, sift_states_bkref): (check_arrival_add_next_nodes, check_arrival_expand_ecl_sub): (expand_bkref_cache, build_trtable, group_nodes_into_DFAstates): (find_collation_sequence_value): Likewise. * lib/regex_internal.c (re_node_set_insert, re_node_set_insert_last): (re_node_set_compare): Return bool, not int. All callers changed. * lib/regexec.c (check_halt_node_context, check_dst_limits): (build_trtable, check_node_accept): Likewise. * lib/regex_internal.h: Include stdbool.h. Fix bugs uncovered when converting to bool. * lib/regcomp.c (calc_eclosure_iter): Check for storage allocation failure instead of charging ahead blindly. * lib/regex_internal.c (register_state): Likewise. * lib/regexec.c (re_search_2_stub): Use simpler method than boolean for freeing internal storage. (group_nodes_into_DFA_states): Use unsigned int, not int, for bitset pieces used as boolean, to avoid undefined behavior on hosts that do int overflow checking. * config/srclist.txt: Add glibc bug 1285.
Paul Eggert ea626b10 2005-08-31T23:36:42 * lib/regcomp.c (search_duplicated_node): Make first pointer arg a pointer-to-const. * lib/regex_internal.c (create_ci_newstate, create_cd_newstate): (register_state): Likewise. * lib/regexec.c (search_cur_bkref_entry, check_dst_limits): (check_dst_limits_calc_pos_1, check_dst_limits_calc_pos): (group_nodes_into_DFAstates): Likewise. * config/srclist.txt: Add glibc bug 1282.
Paul Eggert 28492cce 2005-08-31T22:51:09 On 64-bit hosts (where size_t is 64 bits and int is 32 bits), the old glibc regex code mishandles strings longer than 2**31 bytes. This patch fixes this when the regex code is used in gnulib (i.e., outside glibc). * lib/regex.h (_REGEX_LARGE_OFFSETS): New feature-test macro, governing whether the rest of this patch is active. By default, the macro is disabled and the patch has no effect. (regoff_t) [defined _REGEX_LARGE_OFFSETS]: Define to off_t, not int. (__re_idx_t, __re_size_t, __re_long_size_t): New types. (struct re_pattern_buffer, re_search, re_search_2, re_match): (re_match_2, re_set_registers): Use the new types. * lib/regex_internal.h (Idx, re_hashval_t): New types. (REG_MISSING, REG_ERROR, REG_VALID_INDEX, REG_VALID_NONZERO_INDEX): New macros. (re_node_set, re_charset_t, re_token_t, re_string_realloc_buffers): (re_string_context_at, bin_tree_t, re_dfastate_t): (struct re_state_table_entry, state_array_t, re_sub_match_last_t): (re_sub_match_top_t, re_match_context_t, re_sift_context_t): (struct re_fail_stack_ent_t, struct re_fail_stack_t, struct re_dfa_t): (re_string_char_size_at, re_string_wchar_at): (re_string_elem_size_at): Use the new types and macros to port to 64-bit hosts. Use unsigned types for internal values, so that the code mostly works even for arrays larger than SSIZE_MAX. * lib/regcomp.c (re_compile_internal, init_dfa, duplicate_node): (search_duplicated_node, calc_eclosure_iter, fetch_number): (parse_reg_exp, parse_branch, parse_expression, parse_sub_exp): (build_equiv_class, build_charclass, re_compile_fastmap_iter): (free_dfa_content, create_initial_state, optimize_utf8, analyze): (optimize_subexps, calc_first, link_nfa_nodes, duplicate_node_closure): (calc_inveclosure, parse_dup_op, build_range_exp): (build_collating_symbol, parse_bracket_exp, build_charclass_op): (fetch_number, create_token_tree, mark_opt_subexp): Likewise. 
* lib/regex_internal.c (re_string_construct_common, create_ci_newstate): (create_cd_newstate, re_string_allocate, re_string_construct): (re_string_realloc_buffers, build_wcs_upper_buffer): (re_string_skip_chars, build_upper_buffer, re_string_translate_buffer): (re_string_reconstruct, re_string_peek_byte_case): (re_string_fetch_byte_case, re_string_context_at): (re_node_set_alloc, re_node_set_init_1, re_node_set_init_2): (re_node_set_init_copy, re_node_set_add_intersect): (re_node_set_init_union, re_node_set_merge, re_node_set_insert): (re_node_set_insert_last, re_node_set_compare, re_node_set_contains): (re_node_set_remove_at, re_dfa_add_node, calc_state_hash): (re_acquire_state, re_acquire_state_context, register_state): Likewise. * lib/regex.c (match_ctx_init, match_ctx_add_entry, search_cur_bkref_entry): (match_ctx_add_subtop, match_ctx_add_sublast, sift_ctx_init): (re_search_internal, re_search_2_stub, re_search_stub) (re_copy_regs, check_matching, check_halt_state_context, update_regs): (push_fail_stack, sift_states_iter_mb, build_sifted_states): (update_cur_sifted_state, check_dst_limits): (check_dst_limits_calc_pos_1, check_dst_limits_calc_pos): (check_subexp_limits, sift_states_bkref, merge_state_array): (check_subexp_matching_top, get_subexp, get_subexp_sub): (find_subexp_node, check_arrival, check_arrival_add_next_nodes): (check_arrival_expand_ecl, check_arrival_expand_ecl_sub): (expand_bkref_cache, check_node_accept_bytes): (group_nodes_into_DFAstates, check_node_accept, regexec, re_match): (re_search, re_match_2, re_search_2, prune_impossible_nodes): (acquire_init_state_context, check_halt_node_context): (proceed_next_node, pop_fail_stack, set_regs, free_fail_stack_return): (sift_states_backward, clean_state_log_if_needed): (sub_epsilon_src_nodes, add_epsilone_src_nodes, merge_state_with_log): (find_recover_state, transit_state_sb, transit_state_mb): (transit_state_bkref, build_trtable, match_ctx_clean): Likewise. 
* lib/regcomp.c (parse_dup_op): Add an extra test if Idx is unsigned, to work around an assumption that REG_MISSING is negative. * m4/regex.m4 (gl_REGEX): Require AC_SYS_LARGEFILE. Define _REGEX_LARGE_OFFSETS. Test for regoff_t/off_t bug in 64-bit and large-file glibc and in 32-bit large-file Solaris. * config/srclist.txt: Add glibc bug 1281.
Paul Eggert 0f03f7bb 2005-08-31T20:27:56 * lib/regcomp.c (re_comp) [defined _REGEX_RE_COMP || defined _LIBC]: (seek_collating_symbol_entry) [defined _LIBC]: (lookup_collation_sequence_value) [defined _LIBC]: (build_range_exp, build_collating_symbol) [defined _LIBC]: Use prototypes rather than old-style function definitions. * lib/regexec.c (re_exec) [defined _REGEX_RE_COMP || defined _LIBC]: (transit_state_sb) [0]: (find_collation_sequence_value) [defined _LIBC]: Likewise. * config/srclist.txt: Add glibc bug 1280.
Paul Eggert ec469199 2005-08-31T19:38:13 * lib/regcomp.c (re_compile_fastmap_iter, init_dfa, init_word_char): (optimize_subexps, lower_subexp): Don't assume 1<<31 has defined behavior on hosts with 32-bit int, since the signed shift might overflow. Use 1u<<31 instead. * lib/regex_internal.h (bitset_set, bitset_clear, bitset_contain): Likewise. * lib/regexec.c (check_dst_limits_calc_pos_1): (check_subexp_matching_top): Likewise. * lib/regcomp.c (optimize_subexps, lower_subexp): Use CHAR_BIT rather than 8, for clarity. * lib/regexec.c (check_dst_limits_calc_pos_1): (check_subexp_matching_top): Likewise. * lib/regcomp.c (init_dfa): Make table_size unsigned, so that we don't have to worry about portability issues when shifting it left. Remove no-longer-needed test for table_size > 0. * lib/regcomp.c (parse_sub_exp): Do not shift more bits than there are in a word, as the resulting behavior is undefined. * lib/regexec.c (check_dst_limits_calc_pos_1): Likewise; in one case, a <= should have been an <, and in another case the whole test was missing. * lib/regex_internal.h (BYTE_BITS): Remove. All uses changed to the standard name CHAR_BIT. * lib/regexec.c (match_ctx_add_entry): Don't assume that ~0 == -1; this is not true on one's complement and signed-magnitude hosts.
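Commit ec469199 replaces `1<<31` with `1u<<31` because shifting a signed 1 into the sign bit of a 32-bit int is undefined behavior. The sketch below illustrates the unsigned-shift idiom on a small bitset. The names `word_t`, `WORD_BITS`, `set_bit` and `has_bit` are hypothetical, not the regex_internal.h definitions, and the sketch assumes `unsigned int` has no padding bits.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical word type for the illustration.  */
typedef unsigned int word_t;

/* Bits per word, assuming word_t has no padding bits.  */
#define WORD_BITS (sizeof (word_t) * CHAR_BIT)

/* Set bit I.  The cast makes the shifted operand unsigned, so the
   shift is well defined even when it reaches the top bit.  */
static void
set_bit (word_t *set, unsigned int i)
{
  set[i / WORD_BITS] |= (word_t) 1 << (i % WORD_BITS);
}

/* Return true if bit I is set.  */
static bool
has_bit (const word_t *set, unsigned int i)
{
  return (set[i / WORD_BITS] >> (i % WORD_BITS)) & 1;
}
```

Using CHAR_BIT rather than a literal 8 follows the same entry's preference for the standard name.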
Paul Eggert af36f47f 2005-08-31T18:08:34 * lib/regex_internal.h (re_sub_match_top_t): Remove unused member next_last_offset. (struct re_dfa_t): Remove unused member states_alloc. * lib/regcomp.c (init_dfa): Don't initialize unused members. * config/srclist.txt: Add glibc bug 1273.
Paul Eggert cad71bd9 2005-08-25T20:39:57 Make regex safe for g++. This fixes one real bug (an "err" that should have been "*err"). * config/srclist.txt: Add glibc bug 1241. * lib/regex_internal.h (re_calloc): New macro, consistent with re_malloc etc. All callers of calloc changed to use re_calloc. * lib/regex_internal.c (build_wcs_upper_buffer): Return reg_errcode_t, not int. All callers changed. * lib/regcomp.c (re_compile_fastmap_iter): Don't use alloca (mb_cur_max); just use an array of size MB_LEN_MAX. * lib/regexec.c (push_fail_stack): Use re_realloc, not realloc. (find_recover_state): Change "err" to "*err"; this fixes what appears to be a real bug. (check_arrival_expand_ecl_sub): Be consistent about reg_errcode_t versus int.
Paul Eggert 083768e3 2005-08-25T05:08:59 * config/srclist.txt: Add glibc bug 1240. * lib/regcomp.c (regerror): 2nd arg is 'restrict', as per POSIX. * lib/regex.h (regerror): Likewise.
Paul Eggert e6d7b6da 2005-08-24T23:29:39 * config/srclist.txt: Add glibc bug 1237. * lib/regcomp.c, lib/regex_internal.c, lib/regex_internal.h: * lib/regexec.c: All uses of recently-renamed identifiers changed to use the new, POSIX-compliant names. The code will build and run just fine without these changes, but it's better to eat our own dog food and use the standard-conforming names. * m4/regex.m4 (gl_REGEX): Use POSIX-compliant spellings when testing for GNU regex features.
Paul Eggert 55a8ca2f 2005-08-21T00:29:47 * config/srclist.txt: Add glibc bug 1224. * lib/regcomp.c: (init_word_char, create_initial_state, duplicate_node_closure): (fetch_token, peek_token_bracket, build_range_exp): (build_collating_symbol): Remove forward decls; no longer needed now that we use prototypes.
Paul Eggert 620e0d08 2005-08-20T22:26:51 * config/srclist.txt: Add glibc bug 1223. * lib/regcomp.c (create_initial_state): Remove duplicate decl.
Paul Eggert 087e9e5b 2005-08-20T07:42:15 * config/srclist.txt: Add glibc bugs 1220, 1221, 1222. * lib/regcomp.c: (re_compile_pattern, re_set_syntax, re_compile_fastmap): (re_compile_fastmap_iter, regcomp, regerror, regfree): (re_compile_internal, init_dfa, init_word_char, free_workarea_compile): (create_initial_state, optimize_utf8, analyze, postorder, preorder): (optimize_subexps, lower_subexps, lower_subexp, calc_first, calc_next): (link_nfa_nodes, duplicate_node_closure, search_duplicated_node): (duplicate_node, calc_inveclosure, calc_eclosure, calc_eclosure_iter): (fetch_token, peek_token, peek_token_bracket, parse, parse_reg_exp): (parse_branch, parse_expression, parse_sub_exp, parse_dup_op): (build_range_exp, build_collating_symbol, parse_bracket_exp): (parse_bracket_element, parse_bracket_symbol, build_equiv_class): (build_charclass, build_charclass_op, fetch_number, create_tree): (create_token_tree, mark_opt_subexp, duplicate_tree): Use prototypes rather than old-style definitions. * lib/regex_internal.c: (re_string_allocate, re_string_construct, re_string_realloc_buffers): (re_string_construct_common, build_wcs_buffer, build_wcs_upper_buffer): (re_string_skip_chars, build_upper_buffer, re_string_translate_buffer): (re_string_reconstruct, re_string_peek_byte_case): (re_string_fetch_byte_case, re_string_destruct, re_string_context_at): (re_node_set_alloc, re_node_set_init_1, re_node_set_init_2): (re_node_set_init_copy, re_node_set_add_intersect): (re_node_set_init_union, re_node_set_merge, re_node_set_insert): (re_node_set_insert_last, re_node_set_compare, re_node_set_contains): (re_node_set_remove_at, re_dfa_add_node, calc_state_hash): (re_acquire_state, re_acquire_state_context, register_state): (create_ci_newstate, create_cd_newstate, free_state): Likewise. 
* lib/regexec.c (regexec, re_match, re_search, re_match_2, re_search_2): (re_search_2_stub, re_search_stub, re_copy_regs, re_set_registers): (re_search_internal, prune_impossible_nodes): (acquire_init_state_context, check_matching, static): (check_halt_node_context, check_halt_state_context, proceed_next_node): (push_fail_stack, pop_fail_stack, set_regs, free_fail_stack_return): (update_regs, sift_states_backward, build_sifted_states): (clean_state_log_if_needed, merge_state_array): (update_cur_sifted_state, add_epsilon_src_nodes): (sub_epsilon_src_nodes, check_dst_limits, check_dst_limits_calc_pos_1): (check_dst_limits_calc_pos, check_subexp_limits, sift_states_bkref): (sift_states_iter_mb, transit_state, merge_state_with_log, static): (find_recover_state, check_subexp_matching_top, transit_state_mb): (transit_state_bkref, get_subexp, get_subexp_sub, find_subexp_node): (check_arrival, check_arrival_add_next_nodes): (check_arrival_expand_ecl, check_arrival_expand_ecl_sub): (expand_bkref_cache, build_trtable, group_nodes_into_DFAstates): (check_node_accept_bytes, check_node_accept, extend_buffers): (match_ctx_init, match_ctx_clean, match_ctx_free, match_ctx_add_entry): (search_cur_bkref_entry, match_ctx_add_subtop, match_ctx_add_sublast): (sift_ctx_init): Likewise. * lib/regex_internal.h: (re_string_allocate, re_string_construct, re_string_reconstruct): (re_string_realloc_buffers, build_wcs_buffer, build_wcs_upper_buffer): (build_upper_buffer, re_string_translate_buffer, re_string_destruct): (re_string_elem_size_at, re_string_char_size_at, re_string_wchar_at): (re_string_context_at, re_string_peek_byte_case): (re_string_fetch_byte_case): Declare even if RE_NO_INTERNAL_PROTOTYPES is defined, since we now use prototypes always. * lib/regex.h (_RE_ARGS): Remove. No longer needed, since we assume C89 or better. All uses removed.
Paul Eggert b27102aa 2005-08-20T00:02:22 (duplicate_node): Return new index, not an error code, and let the caller return REG_ESPACE if out of space. This removes an uninitialized-variable warning with GCC 4.0.1, and also avoids taking the address of a local variable. All callers changed.
Paul Eggert e138416d 2005-07-08T17:57:01 * config/srclist.txt: Comment out regcomp.c, since we have a porting fix now. * lib/regcomp.c (init_dfa, build_range_exp): Store __btowc value in wint_t, not wchar_t. Remove now-unnecessary cast.
Paul Eggert 151e40bb 2005-07-07T08:08:39 * modules/regex (Files): Add lib/regex_internal.c, lib/regex_internal.h, lib/regexec.c, lib/regcomp.c, m4/codeset.m4. (Depends-on): Add extensions. (Makefile.am): Remove lib_SOURCES; now done by m4 code. * config/srclist.txt: Add regcomp.c, regex.c, regex.h, regex_internal.c, regexec.c. Add regex_internal.h too, but as a comment, since the libc version is currently broken in gnulib mode. * lib/regex.c, lib/regex.h: Sync from libc. * lib/regcomp.c, lib/regexec_internal.c, lib/regex_internal.h, lib/regexec.c: New files, synced from libc, except that regex_internal.h currently has a small porting fix. * m4/regex.m4: Adjust to new libc regex implementation. (gl_INCLUDED_REGEX): Add AC_LIBSOURCES for all the .c and .h parts of (the new) regex. Quote the m4 stuff better. Check for RE_ICASE bug of old gnulib. Check for REG_STARTEND of recent libc. Rename local variables from jm_* to gl_*. Quote operand of "test -f". Say "recent enough" version of libc, not "version 2". (gl_PREREQ_REGEX): Remove AC_FUNC_ALLOCA, since alloca is a prerequisite module. Remove AC_HEADER_STDC; no longer needed. Check for locale.h, isblank, mbrtowc, wcrtomb, wcscoll. Remove check for btowc, isascii. Require AM_LANGINFO_CODESET.