|
034c875d
|
2011-02-25T09:10:57
|
|
strstr: revert patches that introduced bug and pessimization
Jim's one-liner solved the bug by pessimizing speed, making the
algorithm shift less per iteration and thus perform more repeated
comparisons. The real reason for the bug is that my supposed
"optimizations" actually resulted in cases on certain periodic needles
where critical_factorization returned a factorization that was equal
to, rather than less than the period of the needle. This makes the
CMP_FUNC choose the wrong branch, since a periodic needle must be
handled differently than one where the left half of the needle does
not overlap the right half.
Thankfully, the flawed "optimization" was only present in gnulib, and
was never ported to glibc or cygwin (the only two known
implementations that use the two-way algorithm), so no additional m4
check is needed to detect the bug in the wild.
* lib/str-two-way.h: Add another reference.
(two_way_short_needle, two_way_long_needle): Revert changes from
2011-02-24; they pessimize search speed.
(critical_factorization): Partially revert changes from
2010-06-22; they violate the requirement that the left half of the
needle be smaller than the period of the needle.
Signed-off-by: Eric Blake <eblake@redhat.com>
|
|
98d62d70
|
2011-02-24T10:57:22
|
|
strstr: fix a bug whereby strstr would mistakenly return NULL
* lib/str-two-way.h (two_way_short_needle): Correct off-by-one error
in period calculation.
(two_way_long_needle): Likewise.
Reported by Ralf Wildenhues, with the short needle and haystack.
* tests/test-strstr.c: Add Ralf's test case to trigger the bug.
Add a more involved test to trigger the bug in two_way_long_needle.
|
|
d60f3b0c
|
2011-01-01T20:17:23
|
|
maint: update almost all copyright ranges to include 2011
Run the new "make update-copyright" rule.
|
|
c823199d
|
2010-10-05T16:39:32
|
|
memmem, strstr, strcasestr: fix bug with long periodic needle
* lib/str-two-way.h (two_way_long_needle): Avoid bug with long
periodic needle having false positive.
* m4/memmem.m4 (gl_FUNC_MEMMEM_SIMPLE): Detect bug in glibc 2.12
and cygwin 1.7.7.
(gl_FUNC_MEMMEM): Be more pessimistic when cross-compiling.
* m4/strcasestr.m4 (gl_FUNC_STRCASESTR_SIMPLE)
(gl_FUNC_STRCASESTR): Likewise.
* m4/strstr.m4 (gl_FUNC_STRSTR_SIMPLE, gl_FUNC_STRSTR): Likewise.
* tests/test-memmem.c (main): Expose the bug.
* tests/test-strcasestr.c (main): Likewise.
* tests/test-strstr.c (main): Likewise.
* tests/test-c-strcasestr.c (main): Likewise.
* doc/glibc-functions/memmem.texi (memmem): Document the bug.
* doc/posix-functions/strstr.texi (strstr): Likewise.
* doc/glibc-functions/strcasestr.texi (strcasestr): Likewise.
Reported via http://sourceware.org/bugzilla/show_bug.cgi?id=12092
Signed-off-by: Eric Blake <eblake@redhat.com>
|
|
fffd5fac
|
2010-06-21T16:16:44
|
|
memmem: slight optimization
For any needle, the factorization 0/n has a local period of 1, so it
is a critical factorization iff the entire needle consists only of a
single repeated byte, in which case 1/n-1 would also be critical.
Starting with a comparison of x[0] and x[1] in the maximal suffix
check will either find the 0/n case or move on to something else, so
we can optimize and start with the x[1] vs. x[2] case to begin with.
To avoid out-of-bounds references, we must then special case needles
of length two or less. However, for these cases, we can determine a
critical factorization without any probes of the needle (we already
require a non-empty needle; a 1-byte needle can factor as either 0/1
or 1/0 but the rest of our code assumes a non-empty suffix; and of the
two 2-byte needle patterns, "aa" can factor as either 0/2 or 1/1 but
with best performance for 1/1, and "ab" must be factored as 1/1).
* lib/str-two-way.h (critical_factorization): Update comments.
Reduce work during factorization phase.
Reported by Carlos Bueno <carlos@bueno.org>.
Signed-off-by: Eric Blake <eblake@redhat.com>
|
|
b2e2010c
|
2010-01-01T10:31:12
|
|
update nearly all FSF copyright year lists to include 2010
Use the same procedure as for 2009, outlined in
http://thread.gmane.org/gmane.comp.lib.gnulib.bugs/20081
|
|
3030c5b5
|
2009-12-28T10:50:36
|
|
update nearly all FSF copyright year lists to include 2009
The files named by the following are exempted:
grep -v '^#' config/srclist.txt|grep -v '^$' \
| while read src dst; do
test -f "$dst" && { echo "$dst"; continue; }
test -d "$dst" || continue
echo "$dst"/$(basename "$src")
done > exempt
git ls-files tests/unictype >> exempt
In the remaining files, convert to all-interval notation if
- there is already at least one year interval like 2000-2003
- the file is maintained by me
- the file is in lib/uni*/, where that style already prevails
Otherwise, use update-copyright's default.
|
|
441aa304
|
2009-12-10T20:28:30
|
|
Use spaces for indentation, not tabs.
|
|
a3721792
|
2008-05-15T06:16:11
|
|
Glibc finally accepted the memmem speedup code, bugzilla #5514.
* doc/glibc-functions/memmem.texi (memmem): Mention last broken
glibc version.
* doc/glibc-functions/strcasestr.texi (strcasestr): Likewise.
* doc/posix-functions/strstr.texi (strstr): Likewise.
* lib/str-two-way.h (MAX): Sychronize with glibc.
Signed-off-by: Eric Blake <ebb9@byu.net>
|
|
9c063a2a
|
2008-01-10T22:22:51
|
|
Convert strcasestr module to use Two-Way algorithm.
* modules/strcasestr-simple: New module, based on the old
strcasestr, but with Two-Way rather than KMP.
* modules/strcasestr (Depends-on): Change to strcasestr-simple.
* lib/string.in.h (rpl_strcasestr): Declare.
* m4/strcasestr.m4 (gl_FUNC_STRCASESTR): Check for linear
performance.
* lib/strcasestr.c (strcasestr): Simplify, and avoid malloc.
* modules/string (Makefile.am): Support strcasestr.
* m4/string_h.m4 (gl_HEADER_STRING_H_DEFAULTS): Likewise.
* modules/strcasestr-tests (Depends-on): Check for alarm.
* tests/test-strcasestr.c: Augment test.
* lib/str-two-way.h: Clean up stray macro.
* NEWS: Document new module.
* MODULES.html.sh (string handling): Likewise.
* doc/functions/strcasestr.texi: New file.
* doc/gnulib.texi (Function Substitutes): New node. Move memmem
here, since it is not a POSIX function.
Signed-off-by: Eric Blake <ebb9@byu.net>
|
|
c358da1e
|
2008-01-10T12:06:35
|
|
Share two-way algorithm.
* lib/str-two-way.h: New file, merged from...
* lib/memmem.c: ...here...
* lib/strstr.c: ...and here.
* modules/memmem (Files): Use it.
* modules/strstr (Files): Likewise.
Signed-off-by: Eric Blake <ebb9@byu.net>
|