Hash :
a93e0da1
Author :
Date :
2024-01-30T17:08:49
Update to Unicode 15.1.0.

* lib/gen-uni-tables.c (PROP_SENTENCE_TERMINAL): Renamed from PROP_STERM.
(PROP_IDS_UNARY_OPERATOR, PROP_ID_COMPAT_MATH_CONTINUE, PROP_ID_COMPAT_MATH_START): New enum items.
(UC_INDIC_CONJUNCT_BREAK_*): New enum items.
(unicode_indic_conjunct_break): New variable.
(fill_properties): Rename local variable propvalue to propcode. Handle the properties IDS_Unary_Operator, ID_Compat_Math_Continue, ID_Compat_Math_Start. Parse the InCB values from file DerivedCoreProperties.txt.
(indic_conjunct_break_as_c_identifier, output_indic_conjunct_break_test): New functions.
(indic_conjunct_break_table): New variable.
(output_indic_conjunct_break): New function.
(fill_width): Accept spaces at the end of field0 and at the start and end of field1.
(LBP_QU1, LBP_QU2, LBP_QU3): New enum items, for Unicode TR #14 rules (LB15a) and (LB15b).
(LBP_AP, LBP_AK, LBP_AS, LBP_VI, LBP_VF): New enum items, for Brahmic scripts.
(get_lbp): Update such that unilbrk/lbrkprop.txt comes out as expected.
(debug_output_lbp): Print either LBP_QU1 or LBP_QU2 or LBP_QU3 as LBP_QU. Handle LBP_AP, LBP_AK, LBP_AS, LBP_VI, LBP_VF.
(fill_org_lbp): Accept spaces at the end of field0 and at the start and end of field1. Recognize LBP_AP, LBP_AK, LBP_AS, LBP_VI, LBP_VF.
(debug_output_org_lbp): Handle LBP_AP, LBP_AK, LBP_AS, LBP_VI, LBP_VF.
(lbp_value_to_string): Handle LBP_QU1, LBP_QU2, LBP_QU3 instead of LBP_QU. Handle LBP_AP, LBP_AK, LBP_AS, LBP_VI, LBP_VF.
(output_lbrk_rules_as_tables): Treat LBP_QU as macro that maps to three table rows/columns. Replace rule (LB15) with rules (LB15b) and (LB15a).
(get_wbp): Update such that uniwbrk/wbrkprop.txt comes out as expected.
(main): Invoke output_indic_conjunct_break_test and output_indic_conjunct_break.
* All generated files under lib/uni* and tests/uni*: Regenerate.
* tests/uniname/NameAliases.txt: Update.
* tests/uniname/UnicodeData.txt: Update.
* tests/uninorm/NormalizationTest.txt: Update.
* tests/unigbrk/GraphemeBreakTest.txt: Update.
* tests/uniwbrk/WordBreakTest.txt: Update.
* lib/unilbrk/lbrktables.h (LBP_QU1, LBP_QU2, LBP_QU3): New enum items, for Unicode TR #14 rules (LB15a) and (LB15b).
(LBP_QU): Remove enum item.
(LBP_AP, LBP_AK, LBP_AS, LBP_VI, LBP_VF): New enum items, for Brahmic scripts.
(unilbrk_table): Update array bounds.
* lib/unilbrk/u8-possible-linebreaks.c (u8_possible_linebreaks_loop): Conditionally replace LBP_QU2 with LBP_QU1, for rule (LB15a). Conditionally replace LBP_QU3 with LBP_QU1, for rule (LB15b).
* lib/unilbrk/u16-possible-linebreaks.c (u16_possible_linebreaks_loop): Likewise.
* lib/unilbrk/u32-possible-linebreaks.c (u32_possible_linebreaks_loop): Likewise.
* lib/unictype.in.h (UC_INDIC_CONJUNCT_BREAK_*): New enum values.
(uc_indic_conjunct_break_name, uc_indic_conjunct_break_byname, uc_indic_conjunct_break): New declarations.
* lib/unictype/incb_byname.c: New file.
* lib/unictype/incb_byname.gperf: New file.
* lib/unictype/incb_name.c: New file.
* lib/unictype/incb_name.h: New file.
* lib/unictype/incb_of.c: New file.
* lib/unictype/incb_of.h: New generated file.
* modules/unictype/incb-all: New file.
* modules/unictype/incb-byname: New file.
* modules/unictype/incb-name: New file.
* modules/unictype/incb-of: New file.
* tests/unictype/test-incb_byname.c: New file.
* tests/unictype/test-incb_name.c: New file.
* tests/unictype/test-incb_of.c: New file.
* tests/unictype/test-incb_of.h: New generated file.
* modules/unictype/incb-byname-tests: New file.
* modules/unictype/incb-name-tests: New file.
* modules/unictype/incb-of-tests: New file.
* lib/unigbrk.in.h (uc_is_grapheme_break, u*_grapheme_next, u*_grapheme_prev): Add comments.
* lib/unigbrk/u-grapheme-breaks.h (FUNC): Add local variables incb_consonant_extended, incb_consonant_extended_linker, incb_consonant_extended_linker_extended. Implement rule (GB9c).
* modules/unigbrk/u8-grapheme-breaks (Depends-on): Add unictype/incb-of.
* modules/unigbrk/u16-grapheme-breaks (Depends-on): Likewise.
* modules/unigbrk/u32-grapheme-breaks (Depends-on): Likewise.
* modules/unigbrk/uc-grapheme-breaks (Depends-on): Likewise.
* tests/unigbrk/test-uc-is-grapheme-break.c (main): Add local variables incb_consonant_extended, incb_consonant_extended_linker, incb_consonant_extended_linker_extended. Skip test cases that match rule (GB9c).
* modules/unigbrk/uc-is-grapheme-break-tests (Depends-on): Add unictype/incb-of.
* All the affected modules: Bump required libunistring version.
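
For readers unfamiliar with rule (GB9c), the following is a minimal, self-contained sketch of the idea: no grapheme cluster break before a Consonant that is preceded by Consonant {Extend|Linker}* Linker {Extend|Linker}*. It is not the code from lib/unigbrk/u-grapheme-breaks.h. The names incb, toy_incb and mark_gb9c are hypothetical, the classifier covers only a handful of Devanagari code points rather than the generated InCB table, and the two flags are a simplification, not the three local variables named above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the Indic_Conjunct_Break values.
   These are NOT the UC_INDIC_CONJUNCT_BREAK_* constants of unictype.h.  */
enum incb { INCB_NONE, INCB_CONSONANT, INCB_LINKER, INCB_EXTEND };

/* Toy classifier (assumption): a small illustrative subset of Devanagari.
   A real implementation would look the value up in the generated table.  */
static enum incb
toy_incb (uint32_t uc)
{
  if (uc >= 0x0915 && uc <= 0x0939)   /* DEVANAGARI LETTER KA .. HA */
    return INCB_CONSONANT;
  if (uc == 0x094D)                   /* DEVANAGARI SIGN VIRAMA */
    return INCB_LINKER;
  if (uc == 0x093C || uc == 0x200D)   /* NUKTA, ZERO WIDTH JOINER */
    return INCB_EXTEND;
  return INCB_NONE;
}

/* Sets no_break[i] to true where rule (GB9c) forbids a break before s[i],
   and to false elsewhere.  Returns the number of positions marked.
   Only (GB9c) is modelled; all other grapheme break rules are ignored.  */
static size_t
mark_gb9c (const uint32_t *s, size_t n, bool *no_break)
{
  /* True if the last characters were: Consonant {Extend|Linker}*  */
  bool after_consonant = false;
  /* True if, in addition, at least one Linker was among them.  */
  bool seen_linker = false;
  size_t count = 0;

  assert (n == 0 || (s != NULL && no_break != NULL));

  for (size_t i = 0; i < n; i++)
    {
      enum incb v = toy_incb (s[i]);
      no_break[i] = false;

      if (v == INCB_CONSONANT)
        {
          if (after_consonant && seen_linker)
            {
              no_break[i] = true;
              count++;
            }
          /* This consonant starts a new candidate sequence.  */
          after_consonant = true;
          seen_linker = false;
        }
      else if (after_consonant && v == INCB_LINKER)
        seen_linker = true;
      else if (after_consonant && v == INCB_EXTEND)
        {
          /* Extend keeps the sequence alive without changing the state.  */
        }
      else
        {
          /* Any other character ends the sequence.  */
          after_consonant = false;
          seen_linker = false;
        }
    }
  return count;
}
```

Calling mark_gb9c on the three code points U+0915 U+094D U+0937 (KA, VIRAMA, SSA) marks only the position before U+0937. The positions before the Linker and before any Extend character are left unmarked here because, in the full algorithm, other rules of Unicode TR #29 such as (GB9) cover them.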