    Commit

  • Hash : b19a1077
    Author : Paul Eggert
    Date : 2022-05-13T23:23:35

    dfa: fix bug with ‘.’ and UTF-8 Hangul Syllables
    
    This fixes a bug introduced in 2019-12-18T05:41:27Z!eggert@cs.ucla.edu,
    an earlier patch that fixed dfa.c to not match invalid UTF-8.
    Unfortunately that patch had a couple of typos that affected how
    dfa.c matches the regular expression ‘.’ (dot).  One typo
    caused dfa.c to incorrectly reject the valid UTF-8 sequences
    (ED)(90-9F)(80-BF) corresponding to U+D400 through U+D7FF, which
    are some Hangul Syllables and Hangul Jamo Extended-B.  The other
    typo caused dfa.c to incorrectly reject the valid sequences
    (F4)(88-8F)(80-BF)(80-BF) which correspond to U+108000 through
    U+10FFFF (part of the Supplementary Private Use Area-B block).
    * lib/dfa.c (utf8_classes): Fix typos.
    * tests/test-dfa-match.sh: Test the fix.
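
The byte ranges quoted in the message can be checked against the code point ranges by hand-decoding the first and last sequence of each range. The following is a minimal sketch of that arithmetic, not code from lib/dfa.c; the helper names decode3 and decode4 are hypothetical, and both helpers assume the caller passes a well-formed lead byte and continuation bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of dfa.c: decode a 3-byte UTF-8
   sequence of the form 1110xxxx 10xxxxxx 10xxxxxx.  It checks only
   the bit patterns; it does not reject overlong forms or surrogates.  */
uint32_t
decode3 (unsigned char b0, unsigned char b1, unsigned char b2)
{
  assert ((b0 & 0xF0) == 0xE0);
  assert ((b1 & 0xC0) == 0x80 && (b2 & 0xC0) == 0x80);
  return ((uint32_t) (b0 & 0x0F) << 12)
         | ((uint32_t) (b1 & 0x3F) << 6)
         | (uint32_t) (b2 & 0x3F);
}

/* Hypothetical helper, not part of dfa.c: decode a 4-byte UTF-8
   sequence of the form 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.  */
uint32_t
decode4 (unsigned char b0, unsigned char b1,
         unsigned char b2, unsigned char b3)
{
  assert ((b0 & 0xF8) == 0xF0);
  assert ((b1 & 0xC0) == 0x80 && (b2 & 0xC0) == 0x80
          && (b3 & 0xC0) == 0x80);
  return ((uint32_t) (b0 & 0x07) << 18)
         | ((uint32_t) (b1 & 0x3F) << 12)
         | ((uint32_t) (b2 & 0x3F) << 6)
         | (uint32_t) (b3 & 0x3F);
}
```

With these helpers, decode3 (0xED, 0x90, 0x80) and decode3 (0xED, 0x9F, 0xBF) give the endpoints of the first range, and decode4 (0xF4, 0x88, 0x80, 0x80) and decode4 (0xF4, 0x8F, 0xBF, 0xBF) give the endpoints of the second.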