Commit 4299106ceab30ee5f3333e6b0c1adf5568699e5f

Norihiro Tanaka 2018-09-18T10:07:23

dfa: optimize alternation in NFA Even when similar states exist in alternation, the DFA treats them as separate items, which may complicate the transition in NFA and cause slowdown. This change assembles the states into one. For example, ab|ac is changed into a(b|c). This change speeds-up matching for many branched patterns. For example, grep speeds up more than 30× in: seq 10000 | sed 's/$/ abcdefghijklmnopqrstuvwxyz/; s/$/./' >in time -p env LC_ALL=C grep -vf in in * lib/dfa.c (prune): New function. (merge_nfa_state): New function. It merges similar NFA states. (dfaoptimize): New function. It seeks merged and removed nodes. (dfaanalyze): Call new function. (dfautf8noss): Change name from dfaoptimize because of addition of new function. (dfacomp): Update caller.