lib/uniname/uninames.h


Log

Author Commit Date Message
Daiki Ueno 19f18ecf 2017-11-27T11:34:25 libunistring: update to Unicode 9.0.0 * lib/gen-uni-tables.c (fill_properties): Recognize Sentence_Terminal and Prepended_Concatenation_Mark. (is_property_default_ignorable_code_point): Exclude U+08E2. (fill_arabicshaping): Allow missing whitespace when parsing; recognize "AFRICAN FEH", "AFRICAN QAF", and "AFRICAN MOON". (output_blocks): Increase the element size of the level1 table to accommodate more blocks. (get_lbp): Recognize ZWJ, E_Base, and E_Modifier characters; Update each class according to the standard. (get_wbp): Recognize ZWJ, E_Base, E_Modifier, Glue_After_Zwj, and E_Base_GAZ characters. (output_gbp_table): Recognize ZWJ, E_Base, E_Modifier, Glue_After_Zwj, and E_Base_GAZ characters. * lib/unictype.in.h (UC_JOINING_GROUP_AFRICAN_FEH) (UC_JOINING_GROUP_AFRICAN_QAF, UC_JOINING_GROUP_AFRICAN_MOON): New enum value. * lib/unilbrk/lbrktables.h (LBP_ZWJ, LBP_EB, LBP_EM): New enum value. * lib/unilbrk/lbrktables.c (unilbrk_table): Extend the table with LBP_ZWJ, LBP_EB, and LBP_EM. * lib/uniwbrk.in.h (WBP_ZWJ, WBP_EB, WBP_EM, WBP_GAZ, WBP_EBG): New enum value. * lib/uniwbrk/u-wordbreaks.h: Implement WB3c, WB15, and WB16. * lib/uniwbrk/wbrktable.h (uniwbrk_prop_index): New variable declaration. * lib/uniwbrk/wbrktable.c (uniwbrk_prop_index): New variable. (uniwbrk_table): Implement WB14. * tests/uniwbrk/test-uc-wordbreaks.c (wordbreakproperty_to_string): Check WBP_ZWJ, WBP_EB, WBP_EM, WBP_GAZ, and WBP_EBG. * modules/unigbrk/u{32,16,8}-grapheme-breaks: No longer depend on uc-is-grapheme-break. * modules/unigbrk/uc-grapheme-breaks: New module. * modules/unigbrk/uc-grapheme-breaks-tests: New module. * lib/unigbrk.in.h (GBP_ZWJ, GBP_EB, GBP_EM, GBP_GAZ, GBP_EBG): New enum value. (uc_grapheme_breaks): New function, replacing uc_is_grapheme_break. * lib/unigbrk/u-grapheme-breaks.h: New file. * lib/unigbrk/u{32,16,8}-grapheme-breaks.c: Rewrite using u-grapheme-breaks.h instead of uc_is_grapheme_break. 
* lib/unigbrk/uc-grapheme-breaks.c: New file. * lib/unigbrk/uc-is-grapheme-break.c: Partially update to TR29 rev 29. * tests/unigbrk/test-uc-gbrk-prop.c (graphemebreakproperty_to_string): Check GBP_ZWJ, GBP_EB, GBP_EM, GBP_GAZ, and GBP_EBG. * tests/unigbrk/test-uc-grapheme-breaks.c: New test. * tests/unigbrk/test-uc-is-grapheme-break.c (graphemebreakproperty_to_string): Check GBP_ZWJ, GBP_EB, GBP_EM, GBP_GAZ, and GBP_EBG. (main): Skip unsupported rules involving 3 or more characters, namely GB10, GB12, and GB13. * lib/uniwidth/width.c (nonspacing_table_data): Update. * all generated files under lib/uni* and tests/uni*: Regenerate. * all the dependent modules: Bump version.
Daiki Ueno 1b23c219 2015-06-18T18:03:53 uniname/uniname: update to Unicode 8.0.0 * lib/uniname/uninames.h: Regenerate. * tests/uniname/NameAliases.txt: Update from Unicode 8.0.0. * tests/uniname/UnicodeDataNames.txt: Update from Unicode 8.0.0.
Daiki Ueno 784023c9 2015-02-16T15:44:14 uniname/uniname: support character alias * lib/uniname/gen-uninames.lisp (main): New argument ALIASFILE. Generate one-way mapping from aliases to codepoints in the generated tables. Special case variation selectors to reduce table size. * lib/uniname/uniname.c (unicode_character_name): Special case variation selectors. (unicode_name_character): Special case variation selectors and their aliases. * lib/uniname/uninames.h: Regenerate. * tests/uniname/NameAliases.txt: New file, taken from UCD 7.0.0. * modules/uniname/uniname-tests (Files): Add tests/uniname/NameAliases.txt. * tests/uniname/test-uninames.c: Mark as static. (ALIASLEN): Define. (struct unicode_alias): New struct. (unicode_aliases): New variable. (fill_aliases): New function. (test_alias_lookup): New test function. (main): Run the 'test_alias_lookup' test if the second argument is given. * tests/uniname/test-uninames.sh: Supply NameAliases.txt as the second argument.
Daiki Ueno 257752a1 2015-01-06T18:53:40 uniname/uniname: update to Unicode 7.0.0 To accommodate new characters added since Unicode 5.1.0, this changes the internal representation of codepoint ranges. Previously, we grouped codepoint ranges by a manually assigned 4-bit tag, which only allowed 16 groups. This removes the limitation by switching to binary search on a table. For the detailed rationale and the benchmark results, see: https://lists.gnu.org/archive/html/bug-libunistring/2014-06/msg00001.html * lib/uniname/gen-uninames.lisp (unicode-char): Rename CODE member to INDEX, as it no longer represents a codepoint. (range): New struct. (main): Switch to intervals list from a bit-pattern based classification. * lib/uniname/uninames.h: Regenerate. * tests/uniname/UnicodeDataNames.txt: Update to Unicode 7.0.0. * modules/uniname/base (configure.ac): Bump minimum version to 0.9.5. * modules/uniname/uniname (configure.ac): Bump minimum version to 0.9.5.
Bruno Haible 1a4fd05a 2009-02-08T16:11:07 Regenerated for Unicode 5.1.0.
Bruno Haible 91562b11 2007-07-15T00:14:51 Emit a "do not edit" line to the generated file.
Bruno Haible e5df82f1 2007-07-08T12:00:12 Regenerated for Unicode 5.0.
Bruno Haible 919363c0 2007-07-07T12:49:35 New modules uniname/base and uniname/uniname.