Branch
Hash: c63a0c67
Author :
Date: 2025-09-24T23:28:18
Update to Unicode 17.0.0.

* lib/gen-uni-tables.c (is_property_alphabetic): Add three YANGQIN SIGNs.
(UC_JOINING_GROUP_THIN_NOON): New enum item.
(fill_arabicshaping, joining_group_as_c_identifier): Handle
UC_JOINING_GROUP_THIN_NOON.
(LBP_*): Split LBP_SA into LBP_SA1 and LBP_SA2.
(LBP_HH, LBP_SA): New enum items.
(get_lbp): Use them. Update such that unilbrk/lbrkprop.txt comes out as
expected.
(debug_output_lbp): Handle LBP_HH. Print either LBP_SA1, LBP_SA2 as LBP_SA.
(fill_org_lbp, debug_output_org_lbp): Handle LBP_HH.
(lbp_value_to_string): Handle LBP_HH. Handle LBP_SA1, LBP_SA2 instead of
LBP_SA.
(output_lbrk_rules_as_tables): Update for LBP_HH change. Update rules
LBP12a, LB21 as specified by
https://www.unicode.org/reports/tr14/tr14-55.html.
(get_wbp): Update such that uniwbrk/wbrkprop.txt comes out as expected.
* lib/unictype.in.h (UC_JOINING_GROUP_THIN_NOON): New enum item.
* lib/unictype/joininggroup_byname.gperf: Handle it.
* lib/unictype/joininggroup_name.h: Likewise.
* lib/unilbrk/lbrktables.h (LBP_*): Split LBP_SA into LBP_SA1 and LBP_SA2.
(LBP_HH): New enum item.
(unilbrk_table): Update bounds.
* lib/unilbrk/u8-possible-linebreaks.c (u8_possible_linebreaks_loop): Use
LBP_HL_HY instead of LBP_HL_BA. Use LBP_SA1 instead of LBP_SA. Treat
LBP_SA2 like LBP_CM. Update rules LB20a and LB21a, as specified by
https://www.unicode.org/reports/tr14/tr14-55.html.
* lib/unilbrk/u16-possible-linebreaks.c (u16_possible_linebreaks_loop):
Likewise.
* lib/unilbrk/u32-possible-linebreaks.c (u32_possible_linebreaks_loop):
Likewise.
* tests/unigbrk/test-u8-grapheme-breaks.c (main): Use U+2B50 instead of
U+2605, because U+2605 no longer is an Extended_Pictographic character.
* tests/unigbrk/test-u16-grapheme-breaks.c (main): Likewise.
* tests/unigbrk/test-u32-grapheme-breaks.c (main): Likewise.
* tests/unigbrk/test-u8-grapheme-next.c (main): Likewise.
* tests/unigbrk/test-u16-grapheme-next.c (main): Likewise.
* tests/unigbrk/test-u32-grapheme-next.c (main): Likewise.
* tests/unigbrk/test-u8-grapheme-prev.c (main): Likewise.
* tests/unigbrk/test-u16-grapheme-prev.c (main): Likewise.
* tests/unigbrk/test-u32-grapheme-prev.c (main): Likewise.
* tests/uniwidth/test-uc_width2.sh: Update expected test result.
* All generated files under lib/uni* and tests/uni*: Regenerate.
* tests/uniname/NameAliases.txt: Update.
* tests/uniname/UnicodeData.txt: Update.
* tests/uninorm/NormalizationTest.txt: Update.
* tests/unigbrk/GraphemeBreakTest.txt: Update.
* tests/uniwbrk/WordBreakTest.txt: Update.
* tests/unilbrk/LineBreakTest.txt: Update.
* All the affected modules: Bump required libunistring version.

/* DO NOT EDIT! GENERATED AUTOMATICALLY! */
/* Decomposition of Unicode characters. */
/* Generated automatically by gen-uni-tables.c for Unicode 17.0.0. */
/* Copyright (C) 2000-2025 Free Software Foundation, Inc.
This file is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as
published by the Free Software Foundation; either version 2.1 of the
License, or (at your option) any later version.
This file is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>. */
extern const unsigned char gl_uninorm_decomp_chars_table[];
#define decomp_header_0 10
#define decomp_header_1 191
#define decomp_header_2 5
#define decomp_header_3 31
#define decomp_header_4 31
typedef struct
  {
    int level1[191];
    int level2[30 << 5];
    unsigned short level3[293 << 5];
  }
decomp_index_table_t;
extern const decomp_index_table_t gl_uninorm_decomp_index_table;
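
The header above only declares the table; it does not show how the table is consulted. The sketch below illustrates a generic three-level index lookup on a deliberately tiny table. It assumes that the five `decomp_header_*` values play the roles of shift, level-1 bound, shift, mask, and mask, that negative entries in `level1` and `level2` mean "no data", and that `(unsigned short) -1` in `level3` means "no decomposition". Those roles are assumptions based on the table's shape, and every `toy_*` / `TOY_*` name is hypothetical, not part of the library.

```c
#include <stdio.h>

/* Hypothetical toy parameters, standing in for decomp_header_0..4.
   The real header uses 10, 191, 5, 31, 31; smaller values are used
   here so the whole table fits on a few lines. */
#define TOY_HEADER_0 4   /* shift that selects the level1 entry     */
#define TOY_HEADER_1 2   /* number of level1 entries                */
#define TOY_HEADER_2 2   /* shift that selects the level2 entry     */
#define TOY_HEADER_3 3   /* mask applied after the level2 shift     */
#define TOY_HEADER_4 3   /* mask that selects the level3 entry      */

#define TOY_NONE ((unsigned short) -1)  /* assumed "not found" marker */

typedef struct
  {
    int level1[2];              /* offsets into level2, or -1 */
    int level2[1 << 2];         /* offsets into level3, or -1 */
    unsigned short level3[1 << 2];
  }
toy_index_table_t;

/* Only code points 0x19 and 0x1B carry a value in this toy table. */
static const toy_index_table_t toy_table =
  {
    { -1, 0 },                  /* 0x00..0x0F: nothing; 0x10..0x1F: block 0 */
    { -1, -1, 0, -1 },          /* within 0x10..0x1F only 0x18..0x1B has data */
    { TOY_NONE, 5, TOY_NONE, 9 }/* 0x19 -> 5, 0x1B -> 9 */
  };

/* Walk the three levels; return TOY_NONE as soon as a level has no data. */
static unsigned short
toy_lookup (unsigned int uc)
{
  unsigned int index1 = uc >> TOY_HEADER_0;
  if (index1 < TOY_HEADER_1)
    {
      int lookup1 = toy_table.level1[index1];
      if (lookup1 >= 0)
        {
          unsigned int index2 = (uc >> TOY_HEADER_2) & TOY_HEADER_3;
          int lookup2 = toy_table.level2[lookup1 + index2];
          if (lookup2 >= 0)
            {
              unsigned int index3 = uc & TOY_HEADER_4;
              return toy_table.level3[lookup2 + index3];
            }
        }
    }
  return TOY_NONE;
}
```

Calling `toy_lookup (0x19)` walks all three levels and yields 5, while `toy_lookup (0x05)` stops at the first level. The point of the layout is that long runs of code points without data cost one negative entry in `level1` or `level2` rather than a full block in `level3`.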