Update Unicode support to 12.1. * scipts/build_*_map.py: Implement helper pythonic scripts used to generate some Unicode search maps and data for helper Unicode functions used in MD4C. This should simplify updating to future Unicode versions. * md_get_unicode_fold_info: Use data generated by the scripts. * md_is_unicode_whitespace__: Ditto. * md_is_unicode_punct__: Ditto.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 2100 2101 2102 2103 2104 2105 2106 2107 2108 2109 2110 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 2188 2189 2190 2191 2192 2193 2194 2195 2196 2197 2198 2199 2200 2201 2202 2203 2204 2205 2206 2207 2208 2209 2210 2211 2212 2213 2214 2215 2216 2217 2218 2219 2220 2221 2222 2223 2224 2225 2226 2227 2228 2229 2230 2231 2232 2233 2234 2235 2236 2237 2238 2239 2240 2241 2242 2243 2244 2245 2246 2247 2248 2249 2250 2251 2252 2253 2254 2255 2256 2257 2258 2259 2260 2261 2262 2263 2264 2265 2266 2267 2268 2269 2270 2271 2272 2273 2274 2275 2276 2277 2278 2279 2280 2281 2282 2283 2284 2285 2286 2287 2288 2289 2290 2291 2292 2293 2294 2295 2296 2297 2298 2299 2300 2301 2302 2303 2304 2305 2306 2307 2308 2309 2310 2311 2312 2313 2314 2315 2316 2317 2318 2319 2320 2321 2322 2323 2324 2325 2326 2327 2328 2329 2330 2331 2332 2333 2334 2335 2336 2337 2338 2339 2340 2341 2342 2343 2344 2345 2346 2347 2348 2349 2350 2351 2352 2353 2354 2355 2356 2357 2358 2359 2360 2361 2362 2363 2364 2365 2366 2367 2368 2369 2370 2371 2372 2373 2374 2375 2376 2377 2378 2379 2380 2381 2382 2383 2384 2385 2386 2387 2388 2389 2390 2391 2392 2393 2394 2395 2396 2397 2398 2399 2400 2401 2402 2403 2404 2405 2406 2407 2408 2409 2410 2411 2412 2413 2414 2415 2416 2417 2418 2419 2420 2421 2422 2423 2424 2425 2426 2427 2428 2429 2430 2431 2432 2433 2434 2435 2436 2437 2438 2439 2440 2441 2442 2443 2444 2445 2446 2447 2448 2449 2450 2451 2452 2453 2454 2455 2456 2457 2458 2459 2460 2461 2462 2463 2464 2465 2466 2467 2468 2469 2470 2471 2472 2473 2474 2475 2476 2477 2478 2479 2480 2481 2482 2483 2484 2485 2486 2487 2488 2489 2490 2491 2492 2493 2494 2495 2496 2497 2498 2499 2500 2501 2502 2503 2504 2505 2506 2507 2508 2509 2510 2511 2512 2513 2514 2515 2516 2517 2518 2519 2520 2521 2522 2523 2524 2525 2526 2527 2528 2529 2530 2531 2532 2533 2534 2535 2536 2537 2538 2539 2540 2541 2542 2543 2544 2545 2546 2547 2548 2549 2550 2551 2552 2553 2554 2555 2556 2557 2558 2559 2560 2561 2562 2563 2564 2565 2566 2567 2568 2569 2570 2571 2572 2573 2574 2575 2576 2577 2578 2579 2580 2581 2582 2583 2584 2585 2586 2587 2588 2589 2590 2591 2592 2593 2594 2595 2596 2597 2598 2599 2600 2601 2602 2603 2604 2605 2606 2607 2608 2609 2610 2611 2612 2613 2614 2615 2616 2617 2618 2619 2620 2621 2622 2623 2624 2625 2626 2627 2628 2629 2630 2631 2632 2633 2634 2635 2636 2637 2638 2639 2640 2641 2642 2643 2644 2645 2646 2647 2648 2649 2650 2651 2652 2653 2654 2655 2656 2657 2658 2659 2660 2661 2662 2663 2664 2665 2666 2667 2668 2669 2670 2671 2672 2673 2674 2675 2676 2677 2678 2679 2680 2681 2682 2683 2684 2685 2686 2687 2688 2689 2690 2691 2692 2693 2694 2695 2696 2697 2698 2699 2700 2701 2702 2703 2704 2705 2706 2707 2708 2709 2710 2711 2712 2713 2714 2715 2716 2717 2718 2719 2720 2721 2722 2723 2724 2725 2726 2727 2728 2729 2730 2731 2732 2733 2734 2735 2736 2737 2738 2739 2740 2741 2742 2743 2744 2745 2746 2747 2748 2749 2750 2751 2752 2753 2754 2755 2756 2757 2758 2759 2760 2761 2762 2763 2764 2765 2766 2767 2768 2769 2770 2771 2772 2773 2774 2775 2776 2777 2778 2779 2780 2781 2782 2783 2784 2785 2786 2787 2788 2789 2790 2791 2792 2793 2794 2795 2796 2797 2798 2799 2800 2801 2802 2803 2804 2805 2806 2807 2808 2809 2810 2811 2812 2813 2814 2815 2816 2817 2818 2819 2820 2821 2822 2823 2824 2825 2826 2827 2828 2829 2830 2831 2832 2833 2834 2835 2836 2837 2838 2839 2840 2841 2842 2843 2844 2845 2846 2847 2848 2849 2850 2851 2852 2853 2854 2855 2856 2857 2858 2859 2860 2861 2862 2863 2864 2865 2866 2867 2868 2869 2870 2871 2872 2873 2874 2875 2876 2877 2878 2879 2880 2881 2882 2883 2884 2885 2886 2887 2888 2889 2890 2891 2892 2893 2894 2895 2896 2897 2898 2899 2900 2901 2902 2903 2904 2905 2906 2907 2908 2909 2910 2911 2912 2913 2914 2915 2916 2917 2918 2919 2920 2921 2922 2923 2924 2925 2926 2927 2928 2929 2930 2931 2932 2933 2934 2935 2936 2937 2938 2939 2940 2941 2942 2943 2944 2945 2946 2947 2948 2949 2950 2951 2952 2953 2954 2955 2956 2957 2958 2959 2960 2961 2962 2963 2964 2965 2966 2967 2968 2969 2970 2971 2972 2973 2974 2975 2976 2977 2978 2979 2980 2981 2982 2983 2984 2985 2986 2987 2988 2989 2990 2991 2992 2993 2994 2995 2996 2997 2998 2999 3000 3001 3002 3003 3004 3005 3006 3007 3008 3009 3010 3011 3012 3013 3014 3015 3016 3017 3018 3019 3020 3021 3022 3023 3024 3025 3026 3027 3028 3029 3030 3031 3032 3033 3034 3035 3036 3037 3038 3039 3040 3041 3042 3043 3044 3045 3046 3047 3048 3049 3050 3051 3052 3053 3054 3055 3056 3057 3058 3059 3060 3061 3062 3063 3064 3065 3066 3067 3068 3069 3070 3071 3072 3073 3074 3075 3076 3077 3078 3079 3080 3081 3082 3083 3084 3085 3086 3087 3088 3089 3090 3091 3092 3093 3094 3095 3096 3097 3098 3099 3100 3101 3102 3103 3104 3105 3106 3107 3108 3109 3110 3111 3112 3113 3114 3115 3116 3117 3118 3119 3120 3121 3122 3123 3124 3125 3126 3127 3128 3129 3130 3131 3132 3133 3134 3135 3136 3137 3138 3139 3140 3141 3142 3143 3144 3145 3146 3147 3148 3149 3150 3151 3152 3153 3154 3155 3156 3157 3158 3159 3160 3161 3162 3163 3164 3165 3166 3167 3168 3169 3170 3171 3172 3173 3174 3175 3176 3177 3178 3179 3180 3181 3182 3183 3184 3185 3186 3187 3188 3189 3190 3191 3192 3193 3194 3195 3196 3197 3198 3199 3200 3201 3202 3203 3204 3205 3206 3207 3208 3209 3210 3211 3212 3213 3214 3215 3216 3217 3218 3219 3220 3221 3222 3223 3224 3225 3226 3227 3228 3229 3230 3231 3232 3233 3234 3235 3236 3237 3238 3239 3240 3241 3242 3243 3244 3245 3246 3247 3248 3249 3250 3251 3252 3253 3254 3255 3256 3257 3258 3259 3260 3261 3262 3263 3264 3265 3266 3267 3268 3269 3270 3271 3272 3273 3274 3275 3276 3277 3278 3279 3280 3281 3282 3283 3284 3285 3286 3287 3288 3289 3290 3291 3292 3293 3294 3295 3296 3297 3298 3299 3300 3301 3302 3303 3304 3305 3306 3307 3308 3309 3310 3311 3312 3313 3314 3315 3316 3317 3318 3319 3320 3321 3322 3323 3324 3325 3326 3327 3328 3329 3330 3331 3332 3333 3334 3335 3336 3337 3338 3339 3340 3341 3342 3343 3344 3345 3346 3347 3348 3349 3350 3351 3352 3353 3354 3355 3356 3357 3358 3359 3360 3361 3362 3363 3364 3365 3366 3367 3368 3369 3370 3371 3372 3373 3374 3375 3376 3377 3378 3379 3380 3381 3382 3383 3384 3385 3386 3387 3388 3389 3390 3391 3392 3393 3394 3395 3396 3397 3398 3399 3400 3401 3402 3403 3404 3405 3406 3407 3408 3409 3410 3411 3412 3413 3414 3415 3416 3417 3418 3419 3420 3421 3422 3423 3424 3425 3426 3427 3428 3429 3430 3431 3432 3433 3434 3435 3436 3437 3438 3439 3440 3441 3442 3443 3444 3445 3446 3447 3448 3449 3450 3451 3452 3453 3454 3455 3456 3457 3458 3459 3460 3461 3462 3463 3464 3465 3466 3467 3468 3469 3470 3471 3472 3473 3474 3475 3476 3477 3478 3479 3480 3481 3482 3483 3484 3485 3486 3487 3488 3489 3490 3491 3492 3493 3494 3495 3496 3497 3498 3499 3500 3501 3502 3503 3504 3505 3506 3507 3508 3509 3510 3511 3512 3513 3514 3515 3516 3517 3518 3519 3520 3521 3522 3523 3524 3525 3526 3527 3528 3529 3530 3531 3532 3533 3534 3535 3536 3537 3538 3539 3540 3541 3542 3543 3544 3545 3546 3547 3548 3549 3550 3551 3552 3553 3554 3555 3556 3557 3558 3559 3560 3561 3562 3563 3564 3565 3566 3567 3568 3569 3570 3571 3572 3573 3574 3575 3576 3577 3578 3579 3580 3581 3582 3583 3584 3585 3586 3587 3588 3589 3590 3591 3592 3593 3594 3595 3596 3597 3598 3599 3600 3601 3602 3603 3604 3605 3606 3607 3608 3609 3610 3611 3612 3613 3614 3615 3616 3617 3618 3619 3620 3621 3622 3623 3624 3625 3626 3627 3628 3629 3630 3631 3632 3633 3634 3635 3636 3637 3638 3639 3640 3641 3642 3643 3644 3645 3646 3647 3648 3649 3650 3651 3652 3653 3654 3655 3656 3657 3658 3659 3660 3661 3662 3663 3664 3665 3666 3667 3668 3669 3670 3671 3672 3673 3674 3675 3676 3677 3678 3679 3680 3681 3682 3683 3684 3685 3686 3687 3688 3689 3690 3691 3692 3693 3694 3695 3696 3697 3698 3699 3700 3701 3702 3703 3704 3705 3706 3707 3708 3709 3710 3711 3712 3713 3714 3715 3716 3717 3718 3719 3720 3721 3722 3723 3724 3725 3726 3727 3728 3729 3730 3731 3732 3733 3734 3735 3736 3737 3738 3739 3740 3741 3742 3743 3744 3745 3746 3747 3748 3749 3750 3751 3752 3753 3754 3755 3756 3757 3758 3759 3760 3761 3762 3763 3764 3765 3766 3767 3768 3769 3770 3771 3772 3773 3774 3775 3776 3777 3778 3779 3780 3781 3782 3783 3784 3785 3786 3787 3788 3789 3790 3791 3792 3793 3794 3795 3796 3797 3798 3799 3800 3801 3802 3803 3804 3805 3806 3807 3808 3809 3810 3811 3812 3813 3814 3815 3816 3817 3818 3819 3820 3821 3822 3823 3824 3825 3826 3827 3828 3829 3830 3831 3832 3833 3834 3835 3836 3837 3838 3839 3840 3841 3842 3843 3844 3845 3846 3847 3848 3849 3850 3851 3852 3853 3854 3855 3856 3857 3858 3859 3860 3861 3862 3863 3864 3865 3866 3867 3868 3869 3870 3871 3872 3873 3874 3875 3876 3877 3878 3879 3880 3881 3882 3883 3884 3885 3886 3887 3888 3889 3890 3891 3892 3893 3894 3895 3896 3897 3898 3899 3900 3901 3902 3903 3904 3905 3906 3907 3908 3909 3910 3911 3912 3913 3914 3915 3916 3917 3918 3919 3920 3921 3922 3923 3924 3925 3926 3927 3928 3929 3930 3931 3932 3933 3934 3935 3936 3937 3938 3939 3940 3941 3942 3943 3944 3945 3946 3947 3948 3949 3950 3951 3952 3953 3954 3955 3956 3957 3958 3959 3960 3961 3962 3963 3964 3965 3966 3967 3968 3969 3970 3971 3972 3973 3974 3975 3976 3977 3978 3979 3980 3981 3982 3983 3984 3985 3986 3987 3988 3989 3990 3991 3992 3993 3994 3995 3996 3997 3998 3999 4000 4001 4002 4003 4004 4005 4006 4007 4008 4009 4010 4011 4012 4013 4014 4015 4016 4017 4018 4019 4020 4021 4022 4023 4024 4025 4026 4027 4028 4029 4030 4031 4032 4033 4034 4035 4036 4037 4038 4039 4040 4041 4042 4043 4044 4045 4046 4047 4048 4049 4050 4051 4052 4053 4054 4055 4056 4057 4058 4059 4060 4061 4062 4063 4064 4065 4066 4067 4068 4069 4070 4071 4072 4073 4074 4075 4076 4077 4078 4079 4080 4081 4082 4083 4084 4085 4086 4087 4088 4089 4090 4091 4092 4093 4094 4095 4096 4097 4098 4099 4100 4101 4102 4103 4104 4105 4106 4107 4108 4109 4110 4111 4112 4113 4114 4115 4116 4117 4118 4119 4120 4121 4122 4123 4124 4125 4126 4127 4128 4129 4130 4131 4132 4133 4134 4135 4136 4137 4138 4139 4140 4141 4142 4143 4144 4145 4146 4147 4148 4149 4150 4151 4152 4153 4154 4155 4156 4157 4158 4159 4160 4161 4162 4163 4164 4165 4166 4167 4168 4169 4170 4171 4172 4173 4174 4175 4176 4177 4178 4179 4180 4181 4182 4183 4184 4185 4186 4187 4188 4189 4190 4191 4192 4193 4194 4195 4196 4197 4198 4199 4200 4201 4202 4203 4204 4205 4206 4207 4208 4209 4210 4211 4212 4213 4214 4215 4216 4217 4218 4219 4220 4221 4222 4223 4224 4225 4226 4227 4228 4229 4230 4231 4232 4233 4234 4235 4236 4237 4238 4239 4240 4241 4242 4243 4244 4245 4246 4247 4248 4249 4250 4251 4252 4253 4254 4255 4256 4257 4258 4259 4260 4261 4262 4263 4264 4265 4266 4267 4268 4269 4270 4271 4272 4273 4274 4275 4276 4277 4278 4279 4280 4281 4282 4283 4284 4285 4286 4287 4288 4289 4290 4291 4292 4293 4294 4295 4296 4297 4298 4299 4300 4301 4302 4303 4304 4305 4306 4307 4308 4309 4310 4311 4312 4313 4314 4315 4316 4317 4318 4319 4320 4321 4322 4323 4324 4325 4326 4327 4328 4329 4330 4331 4332 4333 4334 4335 4336 4337 4338 4339 4340 4341 4342 4343 4344 4345 4346 4347 4348 4349 4350 4351 4352 4353 4354 4355 4356 4357 4358 4359 4360 4361 4362 4363 4364 4365 4366 4367 4368 4369 4370 4371 4372 4373 4374 4375 4376 4377 4378 4379 4380 4381 4382 4383 4384 4385 4386 4387 4388 4389 4390 4391 4392 4393 4394 4395 4396 4397 4398 4399 4400 4401 4402 4403 4404 4405 4406 4407 4408 4409 4410 4411 4412 4413 4414 4415 4416 4417 4418 4419 4420 4421 4422 4423 4424 4425 4426 4427 4428 4429 4430 4431 4432 4433 4434 4435 4436 4437 4438 4439 4440 4441 4442 4443 4444 4445 4446 4447 4448 4449 4450 4451 4452 4453 4454 4455 4456 4457 4458 4459 4460 4461 4462 4463 4464 4465 4466 4467 4468 4469 4470 4471 4472 4473 4474 4475 4476 4477 4478 4479 4480 4481 4482 4483 4484 4485 4486 4487 4488 4489 4490 4491 4492 4493 4494 4495 4496 4497 4498 4499 4500 4501 4502 4503 4504 4505 4506 4507 4508 4509 4510 4511 4512 4513 4514 4515 4516 4517 4518 4519 4520 4521 4522 4523 4524 4525 4526 4527 4528 4529 4530 4531 4532 4533 4534 4535 4536 4537 4538 4539 4540 4541 4542 4543 4544 4545 4546 4547 4548 4549 4550 4551 4552 4553 4554 4555 4556 4557 4558 4559 4560 4561 4562 4563 4564 4565 4566 4567 4568 4569 4570 4571 4572 4573 4574 4575 4576 4577 4578 4579 4580 4581 4582 4583 4584 4585 4586 4587 4588 4589 4590 4591 4592 4593 4594 4595 4596 4597 4598 4599 4600 4601 4602 4603 4604 4605 4606 4607 4608 4609 4610 4611 4612 4613 4614 4615 4616 4617 4618 4619 4620 4621 4622 4623 4624 4625 4626 4627 4628 4629 4630 4631 4632 4633 4634 4635 4636 4637 4638 4639 4640 4641 4642 4643 4644 4645 4646 4647 4648 4649 4650 4651 4652 4653 4654 4655 4656 4657 4658 4659 4660 4661 4662 4663 4664 4665 4666 4667 4668 4669 4670 4671 4672 4673 4674 4675 4676 4677 4678 4679 4680 4681 4682 4683 4684 4685 4686 4687 4688 4689 4690 4691 4692 4693 4694 4695 4696 4697 4698 4699 4700 4701 4702 4703 4704 4705 4706 4707 4708 4709 4710 4711 4712 4713 4714 4715 4716 4717 4718 4719 4720 4721 4722 4723 4724 4725 4726 4727 4728 4729 4730 4731 4732 4733 4734 4735 4736 4737 4738 4739 4740 4741 4742 4743 4744 4745 4746 4747 4748 4749 4750 4751 4752 4753 4754 4755 4756 4757 4758 4759 4760 4761 4762 4763 4764 4765 4766 4767 4768 4769 4770 4771 4772 4773 4774 4775 4776 4777 4778 4779 4780 4781 4782 4783 4784 4785 4786 4787 4788 4789 4790 4791 4792 4793 4794 4795 4796 4797 4798 4799 4800 4801 4802 4803 4804 4805 4806 4807 4808 4809 4810 4811 4812 4813 4814 4815 4816 4817 4818 4819 4820 4821 4822 4823 4824 4825 4826 4827 4828 4829 4830 4831 4832 4833 4834 4835 4836 4837 4838 4839 4840 4841 4842 4843 4844 4845 4846 4847 4848 4849 4850 4851 4852 4853 4854 4855 4856 4857 4858 4859 4860 4861 4862 4863 4864 4865 4866 4867 4868 4869 4870 4871 4872 4873 4874 4875 4876 4877 4878 4879 4880 4881 4882 4883 4884 4885 4886 4887 4888 4889 4890 4891 4892 4893 4894 4895 4896 4897 4898 4899 4900 4901 4902 4903 4904 4905 4906 4907 4908 4909 4910 4911 4912 4913 4914 4915 4916 4917 4918 4919 4920 4921 4922 4923 4924 4925 4926 4927 4928 4929 4930 4931 4932 4933 4934 4935 4936 4937 4938 4939 4940 4941 4942 4943 4944 4945 4946 4947 4948 4949 4950 4951 4952 4953 4954 4955 4956 4957 4958 4959 4960 4961 4962 4963 4964 4965 4966 4967 4968 4969 4970 4971 4972 4973 4974 4975 4976 4977 4978 4979 4980 4981 4982 4983 4984 4985 4986 4987 4988 4989 4990 4991 4992 4993 4994 4995 4996 4997 4998 4999 5000 5001 5002 5003 5004 5005 5006 5007 5008 5009 5010 5011 5012 5013 5014 5015 5016 5017 5018 5019 5020 5021 5022 5023 5024 5025 5026 5027 5028 5029 5030 5031 5032 5033 5034 5035 5036 5037 5038 5039 5040 5041 5042 5043 5044 5045 5046 5047 5048 5049 5050 5051 5052 5053 5054 5055 5056 5057 5058 5059 5060 5061 5062 5063 5064 5065 5066 5067 5068 5069 5070 5071 5072 5073 5074 5075 5076 5077 5078 5079 5080 5081 5082 5083 5084 5085 5086 5087 5088 5089 5090 5091 5092 5093 5094 5095 5096 5097 5098 5099 5100 5101 5102 5103 5104 5105 5106 5107 5108 5109 5110 5111 5112 5113 5114 5115 5116 5117 5118 5119 5120 5121 5122 5123 5124 5125 5126 5127 5128 5129 5130 5131 5132 5133 5134 5135 5136 5137 5138 5139 5140 5141 5142 5143 5144 5145 5146 5147 5148 5149 5150 5151 5152 5153 5154 5155 5156 5157 5158 5159 5160 5161 5162 5163 5164 5165 5166 5167 5168 5169 5170 5171 5172 5173 5174 5175 5176 5177 5178 5179 5180 5181 5182 5183 5184 5185 5186 5187 5188 5189 5190 5191 5192 5193 5194 5195 5196 5197 5198 5199 5200 5201 5202 5203 5204 5205 5206 5207 5208 5209 5210 5211 5212 5213 5214 5215 5216 5217 5218 5219 5220 5221 5222 5223 5224 5225 5226 5227 5228 5229 5230 5231 5232 5233 5234 5235 5236 5237 5238 5239 5240 5241 5242 5243 5244 5245 5246 5247 5248 5249 5250 5251 5252 5253 5254 5255 5256 5257 5258 5259 5260 5261 5262 5263 5264 5265 5266 5267 5268 5269 5270 5271 5272 5273 5274 5275 5276 5277 5278 5279 5280 5281 5282 5283 5284 5285 5286 5287 5288 5289 5290 5291 5292 5293 5294 5295 5296 5297 5298 5299 5300 5301 5302 5303 5304 5305 5306 5307 5308 5309 5310 5311 5312 5313 5314 5315 5316 5317 5318 5319 5320 5321 5322 5323 5324 5325 5326 5327 5328 5329 5330 5331 5332 5333 5334 5335 5336 5337 5338 5339 5340 5341 5342 5343 5344 5345 5346 5347 5348 5349 5350 5351 5352 5353 5354 5355 5356 5357 5358 5359 5360 5361 5362 5363 5364 5365 5366 5367 5368 5369 5370 5371 5372 5373 5374 5375 5376 5377 5378 5379 5380 5381 5382 5383 5384 5385 5386 5387 5388 5389 5390 5391 5392 5393 5394 5395 5396 5397 5398 5399 5400 5401 5402 5403 5404 5405 5406 5407 5408 5409 5410 5411 5412 5413 5414 5415 5416 5417 5418 5419 5420 5421 5422 5423 5424 5425 5426 5427 5428 5429 5430 5431 5432 5433 5434 5435 5436 5437 5438 5439 5440 5441 5442 5443 5444 5445 5446 5447 5448 5449 5450 5451 5452 5453 5454 5455 5456 5457 5458 5459 5460 5461 5462 5463 5464 5465 5466 5467 5468 5469 5470 5471 5472 5473 5474 5475 5476 5477 5478 5479 5480 5481 5482 5483 5484 5485 5486 5487 5488 5489 5490 5491 5492 5493 5494 5495 5496 5497 5498 5499 5500 5501 5502 5503 5504 5505 5506 5507 5508 5509 5510 5511 5512 5513 5514 5515 5516 5517 5518 5519 5520 5521 5522 5523 5524 5525 5526 5527 5528 5529 5530 5531 5532 5533 5534 5535 5536 5537 5538 5539 5540 5541 5542 5543 5544 5545 5546 5547 5548 5549 5550 5551 5552 5553 5554 5555 5556 5557 5558 5559 5560 5561 5562 5563 5564 5565 5566 5567 5568 5569 5570 5571 5572 5573 5574 5575 5576 5577 5578 5579 5580 5581 5582 5583 5584 5585 5586 5587 5588 5589 5590 5591 5592 5593 5594 5595 5596 5597 5598 5599 5600 5601 5602 5603 5604 5605 5606 5607 5608 5609 5610 5611 5612 5613 5614 5615 5616 5617 5618 5619 5620 5621 5622 5623 5624 5625 5626 5627 5628 5629 5630 5631 5632 5633 5634 5635 5636 5637 5638 5639 5640 5641 5642 5643 5644 5645 5646 5647 5648 5649 5650 5651 5652 5653 5654 5655 5656 5657 5658 5659 5660 5661 5662 5663 5664 5665 5666 5667 5668 5669 5670 5671 5672 5673 5674 5675 5676 5677 5678 5679 5680 5681 5682 5683 5684 5685 5686 5687 5688 5689 5690 5691 5692 5693 5694 5695 5696 5697 5698 5699 5700 5701 5702 5703 5704 5705 5706 5707 5708 5709 5710 5711 5712 5713 5714 5715 5716 5717 5718 5719 5720 5721 5722 5723 5724 5725 5726 5727 5728 5729 5730 5731 5732 5733 5734 5735 5736 5737 5738 5739 5740 5741 5742 5743 5744 5745 5746 5747 5748 5749 5750 5751 5752 5753 5754 5755 5756 5757 5758 5759 5760 5761 5762 5763 5764 5765 5766 5767 5768 5769 5770 5771 5772 5773 5774 5775 5776 5777 5778 5779 5780 5781 5782 5783 5784 5785 5786 5787 5788 5789 5790 5791 5792 5793 5794 5795 5796 5797 5798 5799 5800 5801 5802 5803 5804 5805 5806 5807 5808 5809 5810 5811 5812 5813 5814 5815 5816 5817 5818 5819 5820 5821 5822 5823 5824 5825 5826 5827 5828 5829 5830 5831 5832 5833 5834 5835 5836 5837 5838 5839 5840 5841 5842 5843 5844 5845 5846 5847 5848 5849 5850 5851 5852 5853 5854 5855 5856 5857 5858 5859 5860 5861 5862 5863 5864 5865 5866 5867 5868 5869 5870 5871 5872 5873 5874 5875 5876 5877 5878 5879 5880 5881 5882 5883 5884 5885 5886 5887 5888 5889 5890 5891 5892 5893 5894 5895 5896 5897 5898 5899 5900 5901 5902 5903 5904 5905 5906 5907 5908 5909 5910 5911 5912 5913 5914 5915 5916 5917 5918 5919 5920 5921 5922 5923 5924 5925 5926 5927 5928 5929 5930 5931 5932 5933 5934 5935 5936 5937 5938 5939 5940 5941 5942 5943 5944 5945 5946 5947 5948 5949 5950 5951 5952 5953 5954 5955 5956 5957 5958 5959 5960 5961 5962 5963 5964 5965 5966 5967 5968 5969 5970 5971 5972 5973 5974 5975 5976 5977 5978 5979 5980 5981 5982 5983 5984 5985 5986 5987 5988 5989 5990 5991 5992 5993 5994 5995 5996 5997 5998 5999 6000 6001 6002 6003 6004 6005 6006 6007 6008 6009 6010 6011 6012 6013 6014 6015 6016 6017 6018 6019 6020 6021 6022 6023 6024 6025 6026 6027 6028 6029 6030 6031 6032 6033 6034 6035 6036 6037 6038 6039 6040 6041 6042 6043 6044 6045 6046 6047 6048 6049 6050 6051 6052 6053 6054 6055 6056 6057 6058 6059 6060 6061 6062 6063 6064 6065 6066 6067 6068 6069 6070 6071 6072 6073 6074 6075 6076 6077 6078 6079 6080 6081 6082 6083 6084 6085 6086 6087 6088 6089 6090 6091 6092 6093 6094 6095 6096 6097 6098 6099 6100 6101 6102 6103 6104 6105 6106 6107 6108 6109 6110 6111 6112 6113 6114 6115 6116 6117 6118 6119 6120 6121 6122 6123 6124 6125 6126 6127 6128 6129 6130 6131 6132 6133 6134 6135 6136 6137 6138 6139 6140 6141 6142 6143 6144 6145 6146 6147 6148 6149 6150 6151 6152 6153 6154 6155 6156 6157 6158 6159 6160 6161 6162 6163 6164 6165 6166 6167 6168 6169 6170 6171 6172 6173 6174 6175 6176 6177 6178 6179 6180 6181 6182 6183 6184 6185 6186 6187 6188 6189 6190 6191 6192 6193 6194 6195 6196 6197 6198 6199 6200 6201 6202 6203 6204 6205 6206 6207 6208 6209 6210 6211 6212 6213 6214 6215 6216 6217 6218 6219 6220 6221 6222 6223 6224 6225 6226 6227 6228 6229 6230 6231 6232 6233 6234 6235 6236 6237 6238 6239 6240 6241 6242 6243 6244 6245 6246 6247 6248 6249 6250 6251 6252 6253 6254 6255 6256 6257 6258 6259 6260 6261 6262 6263 6264 6265 6266 6267 6268 6269 6270 6271 6272 6273 6274 6275 6276 6277 6278 6279 6280 6281 6282 6283 6284 6285 6286 6287 6288 6289 6290 6291 6292 6293 6294 6295 6296 6297 6298 6299 6300 6301 6302 6303 6304 6305 6306 6307 6308 6309 6310 6311 6312 6313 6314 6315 6316 6317 6318 6319 6320 6321 6322 6323 6324 6325 6326 6327 6328 6329 6330 6331 6332 6333 6334 6335 6336 6337 6338 6339 6340 6341 6342 6343 6344 6345 6346 6347 6348 6349 6350 6351 6352 6353 6354 6355 6356 6357 6358 6359 6360 6361 6362 6363 6364 6365 6366 6367 6368 6369 6370 6371 6372 6373 6374 6375 6376 6377 6378 6379 6380 6381 6382 6383 6384 6385 6386 6387 6388 6389 6390 6391 6392 6393 6394 6395 6396 6397 6398 6399 6400 6401 6402 6403 6404 6405 6406 6407 6408 6409 6410 6411 6412 6413 6414 6415 6416 6417 6418 6419 6420 6421 6422 6423 6424 6425 6426 6427 6428 6429 6430 6431 6432 6433 6434 6435 6436 6437 6438 6439 6440 6441 6442 6443 6444 6445 6446 6447 6448 6449 6450 6451 6452 6453 6454 6455 6456 6457 6458 6459 6460 6461 6462 6463 6464 6465 6466 6467 6468 6469 6470 6471 6472 6473 6474 6475 6476 6477 6478 6479 6480 6481 6482 6483 6484 6485 6486 6487 6488 6489 6490 6491 6492 6493 6494 6495 6496 6497 6498 6499 6500 6501 6502 6503 6504 6505 6506 6507 6508 6509 6510 6511 6512 6513 6514 6515 6516 6517 6518 6519 6520 6521 6522 6523 6524 6525 6526 6527 6528 6529 6530 6531 6532 6533 6534 6535 6536 6537 6538 6539 6540 6541 6542 6543 6544 6545 6546 6547 6548 6549 6550 6551 6552 6553 6554 6555 6556 6557 6558 6559 6560 6561 6562
diff --git a/CHANGELOG.md b/CHANGELOG.md
index bf9f1f2..a1095eb 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -4,6 +4,9 @@
## Next Version (Work in Progress)
+Changes:
+ * Update Unicode-specific code to use Unicode 12.1.
+
Fixes:
* [#77](https://github.com/mity/md4c/issues/77):
Fix maximal counts of digits for numerical character references, as requested
diff --git a/md4c/md4c.c b/md4c/md4c.c
index d86f8f5..fc8bf76 100644
--- a/md4c/md4c.c
+++ b/md4c/md4c.c
@@ -453,209 +453,212 @@ md_text_with_null_replacement(MD_CTX* ctx, MD_TEXTTYPE type, const CHAR* str, SZ
typedef struct MD_UNICODE_FOLD_INFO_tag MD_UNICODE_FOLD_INFO;
struct MD_UNICODE_FOLD_INFO_tag {
- int codepoints[3];
+ unsigned codepoints[3];
int n_codepoints;
};
#if defined MD4C_USE_UTF16 || defined MD4C_USE_UTF8
+ /* Binary search over sorted "map" of codepoints. Consecutive sequences
+ * of codepoints may be encoded in the map by just using the
+ * (MIN_CODEPOINT | 0x40000000) and (MAX_CODEPOINT | 0x80000000).
+ *
+ * Returns index of the found record in the map (in the case of ranges,
+ * the minimal value is used); or -1 on failure. */
static int
- md_is_unicode_whitespace__(int codepoint)
+ md_unicode_bsearch__(unsigned codepoint, const unsigned* map, size_t map_size)
{
- /* The ASCII ones are the most frequently used ones, so lets check them first. */
- if(codepoint <= 0x7f)
- return ISWHITESPACE_(codepoint);
-
- /* Check for Unicode codepoints in Zs class above 127. */
- if(codepoint == 0x00a0 || codepoint == 0x1680)
- return TRUE;
- if(0x2000 <= codepoint && codepoint <= 0x200a)
- return TRUE;
- if(codepoint == 0x202f || codepoint == 0x205f || codepoint == 0x3000)
- return TRUE;
+ int beg, end;
+ int pivot_beg, pivot_end;
+
+ beg = 0;
+ end = (int) map_size-1;
+ while(beg <= end) {
+ /* Pivot may be a range, not just a single value. */
+ pivot_beg = pivot_end = (beg + end) / 2;
+ if(map[pivot_end] & 0x40000000)
+ pivot_end++;
+ if(map[pivot_beg] & 0x80000000)
+ pivot_beg--;
+
+ if(codepoint < (map[pivot_beg] & 0x00ffffff))
+ end = pivot_beg - 1;
+ else if(codepoint > (map[pivot_end] & 0x00ffffff))
+ beg = pivot_end + 1;
+ else
+ return pivot_beg;
+ }
- return FALSE;
+ return -1;
}
static int
- md_unicode_cmp__(const void* p_codepoint_a, const void* p_codepoint_b)
+ md_is_unicode_whitespace__(unsigned codepoint)
{
- return (*(const int*)p_codepoint_a - *(const int*)p_codepoint_b);
+#define R(cp_min, cp_max) ((cp_min) | 0x40000000), ((cp_max) | 0x80000000)
+#define S(cp) (cp)
+ /* Unicode "Zs" category.
+ * (generated by scripts/build_whitespace_map.py) */
+ static const unsigned WHITESPACE_MAP[] = {
+ S(0x0020), S(0x00a0), S(0x1680), R(0x2000,0x200a), S(0x202f), S(0x205f), S(0x3000)
+ };
+#undef R
+#undef S
+
+ /* The ASCII ones are the most frequently used ones, also CommonMark
+ * specification requests few more in this range. */
+ if(codepoint <= 0x7f)
+ return ISWHITESPACE_(codepoint);
+
+ return (md_unicode_bsearch__(codepoint, WHITESPACE_MAP, SIZEOF_ARRAY(WHITESPACE_MAP)) >= 0);
}
static int
- md_is_unicode_punct__(int codepoint)
+ md_is_unicode_punct__(unsigned codepoint)
{
- /* non-ASCII (above 127) Unicode punctuation codepoints (classes
- * Pc, Pd, Pe, Pf, Pi, Po, Ps).
- *
- * Warning: Keep the array sorted.
- */
- static const int punct_list[] = {
- 0x00a1, 0x00a7, 0x00ab, 0x00b6, 0x00b7, 0x00bb, 0x00bf, 0x037e, 0x0387, 0x055a, 0x055b, 0x055c, 0x055d, 0x055e, 0x055f, 0x0589,
- 0x058a, 0x05be, 0x05c0, 0x05c3, 0x05c6, 0x05f3, 0x05f4, 0x0609, 0x060a, 0x060c, 0x060d, 0x061b, 0x061e, 0x061f, 0x066a, 0x066b,
- 0x066c, 0x066d, 0x06d4, 0x0700, 0x0701, 0x0702, 0x0703, 0x0704, 0x0705, 0x0706, 0x0707, 0x0708, 0x0709, 0x070a, 0x070b, 0x070c,
- 0x070d, 0x07f7, 0x07f8, 0x07f9, 0x0830, 0x0831, 0x0832, 0x0833, 0x0834, 0x0835, 0x0836, 0x0837, 0x0838, 0x0839, 0x083a, 0x083b,
- 0x083c, 0x083d, 0x083e, 0x085e, 0x0964, 0x0965, 0x0970, 0x0af0, 0x0df4, 0x0e4f, 0x0e5a, 0x0e5b, 0x0f04, 0x0f05, 0x0f06, 0x0f07,
- 0x0f08, 0x0f09, 0x0f0a, 0x0f0b, 0x0f0c, 0x0f0d, 0x0f0e, 0x0f0f, 0x0f10, 0x0f11, 0x0f12, 0x0f14, 0x0f3a, 0x0f3b, 0x0f3c, 0x0f3d,
- 0x0f85, 0x0fd0, 0x0fd1, 0x0fd2, 0x0fd3, 0x0fd4, 0x0fd9, 0x0fda, 0x104a, 0x104b, 0x104c, 0x104d, 0x104e, 0x104f, 0x10fb, 0x1360,
- 0x1361, 0x1362, 0x1363, 0x1364, 0x1365, 0x1366, 0x1367, 0x1368, 0x1400, 0x166d, 0x166e, 0x169b, 0x169c, 0x16eb, 0x16ec, 0x16ed,
- 0x1735, 0x1736, 0x17d4, 0x17d5, 0x17d6, 0x17d8, 0x17d9, 0x17da, 0x1800, 0x1801, 0x1802, 0x1803, 0x1804, 0x1805, 0x1806, 0x1807,
- 0x1808, 0x1809, 0x180a, 0x1944, 0x1945, 0x1a1e, 0x1a1f, 0x1aa0, 0x1aa1, 0x1aa2, 0x1aa3, 0x1aa4, 0x1aa5, 0x1aa6, 0x1aa8, 0x1aa9,
- 0x1aaa, 0x1aab, 0x1aac, 0x1aad, 0x1b5a, 0x1b5b, 0x1b5c, 0x1b5d, 0x1b5e, 0x1b5f, 0x1b60, 0x1bfc, 0x1bfd, 0x1bfe, 0x1bff, 0x1c3b,
- 0x1c3c, 0x1c3d, 0x1c3e, 0x1c3f, 0x1c7e, 0x1c7f, 0x1cc0, 0x1cc1, 0x1cc2, 0x1cc3, 0x1cc4, 0x1cc5, 0x1cc6, 0x1cc7, 0x1cd3, 0x2010,
- 0x2011, 0x2012, 0x2013, 0x2014, 0x2015, 0x2016, 0x2017, 0x2018, 0x2019, 0x201a, 0x201b, 0x201c, 0x201d, 0x201e, 0x201f, 0x2020,
- 0x2021, 0x2022, 0x2023, 0x2024, 0x2025, 0x2026, 0x2027, 0x2030, 0x2031, 0x2032, 0x2033, 0x2034, 0x2035, 0x2036, 0x2037, 0x2038,
- 0x2039, 0x203a, 0x203b, 0x203c, 0x203d, 0x203e, 0x203f, 0x2040, 0x2041, 0x2042, 0x2043, 0x2045, 0x2046, 0x2047, 0x2048, 0x2049,
- 0x204a, 0x204b, 0x204c, 0x204d, 0x204e, 0x204f, 0x2050, 0x2051, 0x2053, 0x2054, 0x2055, 0x2056, 0x2057, 0x2058, 0x2059, 0x205a,
- 0x205b, 0x205c, 0x205d, 0x205e, 0x207d, 0x207e, 0x208d, 0x208e, 0x2308, 0x2309, 0x230a, 0x230b, 0x2329, 0x232a, 0x2768, 0x2769,
- 0x276a, 0x276b, 0x276c, 0x276d, 0x276e, 0x276f, 0x2770, 0x2771, 0x2772, 0x2773, 0x2774, 0x2775, 0x27c5, 0x27c6, 0x27e6, 0x27e7,
- 0x27e8, 0x27e9, 0x27ea, 0x27eb, 0x27ec, 0x27ed, 0x27ee, 0x27ef, 0x2983, 0x2984, 0x2985, 0x2986, 0x2987, 0x2988, 0x2989, 0x298a,
- 0x298b, 0x298c, 0x298d, 0x298e, 0x298f, 0x2990, 0x2991, 0x2992, 0x2993, 0x2994, 0x2995, 0x2996, 0x2997, 0x2998, 0x29d8, 0x29d9,
- 0x29da, 0x29db, 0x29fc, 0x29fd, 0x2cf9, 0x2cfa, 0x2cfb, 0x2cfc, 0x2cfe, 0x2cff, 0x2d70, 0x2e00, 0x2e01, 0x2e02, 0x2e03, 0x2e04,
- 0x2e05, 0x2e06, 0x2e07, 0x2e08, 0x2e09, 0x2e0a, 0x2e0b, 0x2e0c, 0x2e0d, 0x2e0e, 0x2e0f, 0x2e10, 0x2e11, 0x2e12, 0x2e13, 0x2e14,
- 0x2e15, 0x2e16, 0x2e17, 0x2e18, 0x2e19, 0x2e1a, 0x2e1b, 0x2e1c, 0x2e1d, 0x2e1e, 0x2e1f, 0x2e20, 0x2e21, 0x2e22, 0x2e23, 0x2e24,
- 0x2e25, 0x2e26, 0x2e27, 0x2e28, 0x2e29, 0x2e2a, 0x2e2b, 0x2e2c, 0x2e2d, 0x2e2e, 0x2e30, 0x2e31, 0x2e32, 0x2e33, 0x2e34, 0x2e35,
- 0x2e36, 0x2e37, 0x2e38, 0x2e39, 0x2e3a, 0x2e3b, 0x2e3c, 0x2e3d, 0x2e3e, 0x2e3f, 0x2e40, 0x2e41, 0x2e42, 0x2e43, 0x2e44, 0x3001,
- 0x3002, 0x3003, 0x3008, 0x3009, 0x300a, 0x300b, 0x300c, 0x300d, 0x300e, 0x300f, 0x3010, 0x3011, 0x3014, 0x3015, 0x3016, 0x3017,
- 0x3018, 0x3019, 0x301a, 0x301b, 0x301c, 0x301d, 0x301e, 0x301f, 0x3030, 0x303d, 0x30a0, 0x30fb, 0xa4fe, 0xa4ff, 0xa60d, 0xa60e,
- 0xa60f, 0xa673, 0xa67e, 0xa6f2, 0xa6f3, 0xa6f4, 0xa6f5, 0xa6f6, 0xa6f7, 0xa874, 0xa875, 0xa876, 0xa877, 0xa8ce, 0xa8cf, 0xa8f8,
- 0xa8f9, 0xa8fa, 0xa8fc, 0xa92e, 0xa92f, 0xa95f, 0xa9c1, 0xa9c2, 0xa9c3, 0xa9c4, 0xa9c5, 0xa9c6, 0xa9c7, 0xa9c8, 0xa9c9, 0xa9ca,
- 0xa9cb, 0xa9cc, 0xa9cd, 0xa9de, 0xa9df, 0xaa5c, 0xaa5d, 0xaa5e, 0xaa5f, 0xaade, 0xaadf, 0xaaf0, 0xaaf1, 0xabeb, 0xfd3e, 0xfd3f,
- 0xfe10, 0xfe11, 0xfe12, 0xfe13, 0xfe14, 0xfe15, 0xfe16, 0xfe17, 0xfe18, 0xfe19, 0xfe30, 0xfe31, 0xfe32, 0xfe33, 0xfe34, 0xfe35,
- 0xfe36, 0xfe37, 0xfe38, 0xfe39, 0xfe3a, 0xfe3b, 0xfe3c, 0xfe3d, 0xfe3e, 0xfe3f, 0xfe40, 0xfe41, 0xfe42, 0xfe43, 0xfe44, 0xfe45,
- 0xfe46, 0xfe47, 0xfe48, 0xfe49, 0xfe4a, 0xfe4b, 0xfe4c, 0xfe4d, 0xfe4e, 0xfe4f, 0xfe50, 0xfe51, 0xfe52, 0xfe54, 0xfe55, 0xfe56,
- 0xfe57, 0xfe58, 0xfe59, 0xfe5a, 0xfe5b, 0xfe5c, 0xfe5d, 0xfe5e, 0xfe5f, 0xfe60, 0xfe61, 0xfe63, 0xfe68, 0xfe6a, 0xfe6b, 0xff01,
- 0xff02, 0xff03, 0xff05, 0xff06, 0xff07, 0xff08, 0xff09, 0xff0a, 0xff0c, 0xff0d, 0xff0e, 0xff0f, 0xff1a, 0xff1b, 0xff1f, 0xff20,
- 0xff3b, 0xff3c, 0xff3d, 0xff3f, 0xff5b, 0xff5d, 0xff5f, 0xff60, 0xff61, 0xff62, 0xff63, 0xff64, 0xff65, 0x10100, 0x10101, 0x10102,
- 0x1039f, 0x103d0, 0x1056f, 0x10857, 0x1091f, 0x1093f, 0x10a50, 0x10a51, 0x10a52, 0x10a53, 0x10a54, 0x10a55, 0x10a56, 0x10a57, 0x10a58, 0x10a7f,
- 0x10af0, 0x10af1, 0x10af2, 0x10af3, 0x10af4, 0x10af5, 0x10af6, 0x10b39, 0x10b3a, 0x10b3b, 0x10b3c, 0x10b3d, 0x10b3e, 0x10b3f, 0x10b99, 0x10b9a,
- 0x10b9b, 0x10b9c, 0x11047, 0x11048, 0x11049, 0x1104a, 0x1104b, 0x1104c, 0x1104d, 0x110bb, 0x110bc, 0x110be, 0x110bf, 0x110c0, 0x110c1, 0x11140,
- 0x11141, 0x11142, 0x11143, 0x11174, 0x11175, 0x111c5, 0x111c6, 0x111c7, 0x111c8, 0x111c9, 0x111cd, 0x111db, 0x111dd, 0x111de, 0x111df, 0x11238,
- 0x11239, 0x1123a, 0x1123b, 0x1123c, 0x1123d, 0x112a9, 0x1144b, 0x1144c, 0x1144d, 0x1144e, 0x1144f, 0x1145b, 0x1145d, 0x114c6, 0x115c1, 0x115c2,
- 0x115c3, 0x115c4, 0x115c5, 0x115c6, 0x115c7, 0x115c8, 0x115c9, 0x115ca, 0x115cb, 0x115cc, 0x115cd, 0x115ce, 0x115cf, 0x115d0, 0x115d1, 0x115d2,
- 0x115d3, 0x115d4, 0x115d5, 0x115d6, 0x115d7, 0x11641, 0x11642, 0x11643, 0x11660, 0x11661, 0x11662, 0x11663, 0x11664, 0x11665, 0x11666, 0x11667,
- 0x11668, 0x11669, 0x1166a, 0x1166b, 0x1166c, 0x1173c, 0x1173d, 0x1173e, 0x11c41, 0x11c42, 0x11c43, 0x11c44, 0x11c45, 0x11c70, 0x11c71, 0x12470,
- 0x12471, 0x12472, 0x12473, 0x12474, 0x16a6e, 0x16a6f, 0x16af5, 0x16b37, 0x16b38, 0x16b39, 0x16b3a, 0x16b3b, 0x16b44, 0x1bc9f, 0x1da87, 0x1da88,
- 0x1da89, 0x1da8a, 0x1da8b, 0x1e95e, 0x1e95f
+#define R(cp_min, cp_max) ((cp_min) | 0x40000000), ((cp_max) | 0x80000000)
+#define S(cp) (cp)
+ /* Unicode "Pc", "Pd", "Pe", "Pf", "Pi", "Po", "Ps" categories.
+ * (generated by scripts/build_punct_map.py) */
+ static const unsigned PUNCT_MAP[] = {
+ R(0x0021,0x0023), R(0x0025,0x002a), R(0x002c,0x002f), R(0x003a,0x003b), R(0x003f,0x0040),
+ R(0x005b,0x005d), S(0x005f), S(0x007b), S(0x007d), S(0x00a1), S(0x00a7), S(0x00ab), R(0x00b6,0x00b7),
+ S(0x00bb), S(0x00bf), S(0x037e), S(0x0387), R(0x055a,0x055f), R(0x0589,0x058a), S(0x05be), S(0x05c0),
+ S(0x05c3), S(0x05c6), R(0x05f3,0x05f4), R(0x0609,0x060a), R(0x060c,0x060d), S(0x061b), R(0x061e,0x061f),
+ R(0x066a,0x066d), S(0x06d4), R(0x0700,0x070d), R(0x07f7,0x07f9), R(0x0830,0x083e), S(0x085e),
+ R(0x0964,0x0965), S(0x0970), S(0x09fd), S(0x0a76), S(0x0af0), S(0x0c77), S(0x0c84), S(0x0df4), S(0x0e4f),
+ R(0x0e5a,0x0e5b), R(0x0f04,0x0f12), S(0x0f14), R(0x0f3a,0x0f3d), S(0x0f85), R(0x0fd0,0x0fd4),
+ R(0x0fd9,0x0fda), R(0x104a,0x104f), S(0x10fb), R(0x1360,0x1368), S(0x1400), S(0x166e), R(0x169b,0x169c),
+ R(0x16eb,0x16ed), R(0x1735,0x1736), R(0x17d4,0x17d6), R(0x17d8,0x17da), R(0x1800,0x180a),
+ R(0x1944,0x1945), R(0x1a1e,0x1a1f), R(0x1aa0,0x1aa6), R(0x1aa8,0x1aad), R(0x1b5a,0x1b60),
+ R(0x1bfc,0x1bff), R(0x1c3b,0x1c3f), R(0x1c7e,0x1c7f), R(0x1cc0,0x1cc7), S(0x1cd3), R(0x2010,0x2027),
+ R(0x2030,0x2043), R(0x2045,0x2051), R(0x2053,0x205e), R(0x207d,0x207e), R(0x208d,0x208e),
+ R(0x2308,0x230b), R(0x2329,0x232a), R(0x2768,0x2775), R(0x27c5,0x27c6), R(0x27e6,0x27ef),
+ R(0x2983,0x2998), R(0x29d8,0x29db), R(0x29fc,0x29fd), R(0x2cf9,0x2cfc), R(0x2cfe,0x2cff), S(0x2d70),
+ R(0x2e00,0x2e2e), R(0x2e30,0x2e4f), R(0x3001,0x3003), R(0x3008,0x3011), R(0x3014,0x301f), S(0x3030),
+ S(0x303d), S(0x30a0), S(0x30fb), R(0xa4fe,0xa4ff), R(0xa60d,0xa60f), S(0xa673), S(0xa67e),
+ R(0xa6f2,0xa6f7), R(0xa874,0xa877), R(0xa8ce,0xa8cf), R(0xa8f8,0xa8fa), S(0xa8fc), R(0xa92e,0xa92f),
+ S(0xa95f), R(0xa9c1,0xa9cd), R(0xa9de,0xa9df), R(0xaa5c,0xaa5f), R(0xaade,0xaadf), R(0xaaf0,0xaaf1),
+ S(0xabeb), R(0xfd3e,0xfd3f), R(0xfe10,0xfe19), R(0xfe30,0xfe52), R(0xfe54,0xfe61), S(0xfe63), S(0xfe68),
+ R(0xfe6a,0xfe6b), R(0xff01,0xff03), R(0xff05,0xff0a), R(0xff0c,0xff0f), R(0xff1a,0xff1b),
+ R(0xff1f,0xff20), R(0xff3b,0xff3d), S(0xff3f), S(0xff5b), S(0xff5d), R(0xff5f,0xff65), R(0x10100,0x10102),
+ S(0x1039f), S(0x103d0), S(0x1056f), S(0x10857), S(0x1091f), S(0x1093f), R(0x10a50,0x10a58), S(0x10a7f),
+ R(0x10af0,0x10af6), R(0x10b39,0x10b3f), R(0x10b99,0x10b9c), R(0x10f55,0x10f59), R(0x11047,0x1104d),
+ R(0x110bb,0x110bc), R(0x110be,0x110c1), R(0x11140,0x11143), R(0x11174,0x11175), R(0x111c5,0x111c8),
+ S(0x111cd), S(0x111db), R(0x111dd,0x111df), R(0x11238,0x1123d), S(0x112a9), R(0x1144b,0x1144f),
+ S(0x1145b), S(0x1145d), S(0x114c6), R(0x115c1,0x115d7), R(0x11641,0x11643), R(0x11660,0x1166c),
+ R(0x1173c,0x1173e), S(0x1183b), S(0x119e2), R(0x11a3f,0x11a46), R(0x11a9a,0x11a9c), R(0x11a9e,0x11aa2),
+ R(0x11c41,0x11c45), R(0x11c70,0x11c71), R(0x11ef7,0x11ef8), S(0x11fff), R(0x12470,0x12474),
+ R(0x16a6e,0x16a6f), S(0x16af5), R(0x16b37,0x16b3b), S(0x16b44), R(0x16e97,0x16e9a), S(0x16fe2),
+ S(0x1bc9f), R(0x1da87,0x1da8b), R(0x1e95e,0x1e95f)
};
+#undef R
+#undef S
- /* The ASCII ones are the most frequently used ones, so lets check them first. */
+ /* The ASCII ones are the most frequently used ones, also CommonMark
+ * specification requests few more in this range. */
if(codepoint <= 0x7f)
return ISPUNCT_(codepoint);
- return (bsearch(&codepoint, punct_list, SIZEOF_ARRAY(punct_list), sizeof(int), md_unicode_cmp__) != NULL);
+ return (md_unicode_bsearch__(codepoint, PUNCT_MAP, SIZEOF_ARRAY(PUNCT_MAP)) >= 0);
}
static void
- md_get_unicode_fold_info(int codepoint, MD_UNICODE_FOLD_INFO* info)
+ md_get_unicode_fold_info(unsigned codepoint, MD_UNICODE_FOLD_INFO* info)
{
- /* This maps single codepoint within a range to a single codepoint
- * within an offseted range. */
- static const struct {
- int min_codepoint;
- int max_codepoint;
- int offset;
- } range_map[] = {
- { 0x00c0, 0x00d6, 32 }, { 0x00d8, 0x00de, 32 }, { 0x0388, 0x038a, 37 }, { 0x0391, 0x03a1, 32 }, { 0x03a3, 0x03ab, 32 }, { 0x0400, 0x040f, 80 },
- { 0x0410, 0x042f, 32 }, { 0x0531, 0x0556, 48 }, { 0x1f08, 0x1f0f, -8 }, { 0x1f18, 0x1f1d, -8 }, { 0x1f28, 0x1f2f, -8 }, { 0x1f38, 0x1f3f, -8 },
- { 0x1f48, 0x1f4d, -8 }, { 0x1f68, 0x1f6f, -8 }, { 0x1fc8, 0x1fcb, -86 }, { 0x2160, 0x216f, 16 }, { 0x24b6, 0x24cf, 26 }, { 0xff21, 0xff3a, 32 },
- { 0x10400, 0x10425, 40 }
+#define R(cp_min, cp_max) ((cp_min) | 0x40000000), ((cp_max) | 0x80000000)
+#define S(cp) (cp)
+ /* Unicode "Pc", "Pd", "Pe", "Pf", "Pi", "Po", "Ps" categories.
+ * (generated by scripts/build_punct_map.py) */
+ static const unsigned FOLD_MAP_1[] = {
+ R(0x0041,0x005a), S(0x00b5), R(0x00c0,0x00d6), R(0x00d8,0x00de), R(0x0100,0x012e), R(0x0132,0x0136),
+ R(0x0139,0x0147), R(0x014a,0x0176), S(0x0178), R(0x0179,0x017d), S(0x017f), S(0x0181), S(0x0182),
+ S(0x0186), S(0x0187), S(0x0189), S(0x018b), S(0x018e), S(0x018f), S(0x0190), S(0x0191), S(0x0193),
+ S(0x0194), S(0x0196), S(0x0197), S(0x0198), S(0x019c), S(0x019d), S(0x019f), R(0x01a0,0x01a4), S(0x01a6),
+ S(0x01a7), S(0x01a9), S(0x01ac), S(0x01ae), S(0x01af), S(0x01b1), S(0x01b3), S(0x01b7), S(0x01b8),
+ S(0x01bc), S(0x01c4), S(0x01c5), S(0x01c7), S(0x01c8), S(0x01ca), R(0x01cb,0x01db), R(0x01de,0x01ee),
+ S(0x01f1), S(0x01f2), S(0x01f6), S(0x01f7), R(0x01f8,0x021e), S(0x0220), R(0x0222,0x0232), S(0x023a),
+ S(0x023b), S(0x023d), S(0x023e), S(0x0241), S(0x0243), S(0x0244), S(0x0245), R(0x0246,0x024e), S(0x0345),
+ S(0x0370), S(0x0376), S(0x037f), S(0x0386), R(0x0388,0x038a), S(0x038c), S(0x038e), R(0x0391,0x03a1),
+ R(0x03a3,0x03ab), S(0x03c2), S(0x03cf), S(0x03d0), S(0x03d1), S(0x03d5), S(0x03d6), R(0x03d8,0x03ee),
+ S(0x03f0), S(0x03f1), S(0x03f4), S(0x03f5), S(0x03f7), S(0x03f9), S(0x03fa), R(0x03fd,0x03ff),
+ R(0x0400,0x040f), R(0x0410,0x042f), R(0x0460,0x0480), R(0x048a,0x04be), S(0x04c0), R(0x04c1,0x04cd),
+ R(0x04d0,0x052e), R(0x0531,0x0556), R(0x10a0,0x10c5), S(0x10c7), S(0x10cd), R(0x13f8,0x13fd), S(0x1c80),
+ S(0x1c81), S(0x1c82), S(0x1c83), S(0x1c85), S(0x1c86), S(0x1c87), S(0x1c88), R(0x1c90,0x1cba),
+ R(0x1cbd,0x1cbf), R(0x1e00,0x1e94), S(0x1e9b), R(0x1ea0,0x1efe), R(0x1f08,0x1f0f), R(0x1f18,0x1f1d),
+ R(0x1f28,0x1f2f), R(0x1f38,0x1f3f), R(0x1f48,0x1f4d), S(0x1f59), S(0x1f5b), S(0x1f5d), S(0x1f5f),
+ R(0x1f68,0x1f6f), S(0x1fb8), S(0x1fba), S(0x1fbe), R(0x1fc8,0x1fcb), S(0x1fd8), S(0x1fda), S(0x1fe8),
+ S(0x1fea), S(0x1fec), S(0x1ff8), S(0x1ffa), S(0x2126), S(0x212a), S(0x212b), S(0x2132), R(0x2160,0x216f),
+ S(0x2183), R(0x24b6,0x24cf), R(0x2c00,0x2c2e), S(0x2c60), S(0x2c62), S(0x2c63), S(0x2c64),
+ R(0x2c67,0x2c6b), S(0x2c6d), S(0x2c6e), S(0x2c6f), S(0x2c70), S(0x2c72), S(0x2c75), S(0x2c7e),
+ R(0x2c80,0x2ce2), S(0x2ceb), S(0x2cf2), R(0xa640,0xa66c), R(0xa680,0xa69a), R(0xa722,0xa72e),
+ R(0xa732,0xa76e), S(0xa779), S(0xa77d), R(0xa77e,0xa786), S(0xa78b), S(0xa78d), S(0xa790),
+ R(0xa796,0xa7a8), S(0xa7aa), S(0xa7ab), S(0xa7ac), S(0xa7ad), S(0xa7ae), S(0xa7b0), S(0xa7b1), S(0xa7b2),
+ S(0xa7b3), R(0xa7b4,0xa7be), S(0xa7c2), S(0xa7c4), S(0xa7c5), S(0xa7c6), R(0xab70,0xabbf),
+ R(0xff21,0xff3a), R(0x10400,0x10427), R(0x104b0,0x104d3), R(0x10c80,0x10cb2), R(0x118a0,0x118bf),
+ R(0x16e40,0x16e5f), R(0x1e900,0x1e921)
};
-
- /* This maps single codepoint to another single codepoint. */
- static const struct {
- int src_codepoint;
- int dest_codepoint;
- } single_map[] = {
- { 0x00b5, 0x03bc }, { 0x0100, 0x0101 }, { 0x0102, 0x0103 }, { 0x0104, 0x0105 }, { 0x0106, 0x0107 }, { 0x0108, 0x0109 }, { 0x010a, 0x010b }, { 0x010c, 0x010d },
- { 0x010e, 0x010f }, { 0x0110, 0x0111 }, { 0x0112, 0x0113 }, { 0x0114, 0x0115 }, { 0x0116, 0x0117 }, { 0x0118, 0x0119 }, { 0x011a, 0x011b }, { 0x011c, 0x011d },
- { 0x011e, 0x011f }, { 0x0120, 0x0121 }, { 0x0122, 0x0123 }, { 0x0124, 0x0125 }, { 0x0126, 0x0127 }, { 0x0128, 0x0129 }, { 0x012a, 0x012b }, { 0x012c, 0x012d },
- { 0x012e, 0x012f }, { 0x0132, 0x0133 }, { 0x0134, 0x0135 }, { 0x0136, 0x0137 }, { 0x0139, 0x013a }, { 0x013b, 0x013c }, { 0x013d, 0x013e }, { 0x013f, 0x0140 },
- { 0x0141, 0x0142 }, { 0x0143, 0x0144 }, { 0x0145, 0x0146 }, { 0x0147, 0x0148 }, { 0x014a, 0x014b }, { 0x014c, 0x014d }, { 0x014e, 0x014f }, { 0x0150, 0x0151 },
- { 0x0152, 0x0153 }, { 0x0154, 0x0155 }, { 0x0156, 0x0157 }, { 0x0158, 0x0159 }, { 0x015a, 0x015b }, { 0x015c, 0x015d }, { 0x015e, 0x015f }, { 0x0160, 0x0161 },
- { 0x0162, 0x0163 }, { 0x0164, 0x0165 }, { 0x0166, 0x0167 }, { 0x0168, 0x0169 }, { 0x016a, 0x016b }, { 0x016c, 0x016d }, { 0x016e, 0x016f }, { 0x0170, 0x0171 },
- { 0x0172, 0x0173 }, { 0x0174, 0x0175 }, { 0x0176, 0x0177 }, { 0x0178, 0x00ff }, { 0x0179, 0x017a }, { 0x017b, 0x017c }, { 0x017d, 0x017e }, { 0x017f, 0x0073 },
- { 0x0181, 0x0253 }, { 0x0182, 0x0183 }, { 0x0184, 0x0185 }, { 0x0186, 0x0254 }, { 0x0187, 0x0188 }, { 0x0189, 0x0256 }, { 0x018a, 0x0257 }, { 0x018b, 0x018c },
- { 0x018e, 0x01dd }, { 0x018f, 0x0259 }, { 0x0190, 0x025b }, { 0x0191, 0x0192 }, { 0x0193, 0x0260 }, { 0x0194, 0x0263 }, { 0x0196, 0x0269 }, { 0x0197, 0x0268 },
- { 0x0198, 0x0199 }, { 0x019c, 0x026f }, { 0x019d, 0x0272 }, { 0x019f, 0x0275 }, { 0x01a0, 0x01a1 }, { 0x01a2, 0x01a3 }, { 0x01a4, 0x01a5 }, { 0x01a6, 0x0280 },
- { 0x01a7, 0x01a8 }, { 0x01a9, 0x0283 }, { 0x01ac, 0x01ad }, { 0x01ae, 0x0288 }, { 0x01af, 0x01b0 }, { 0x01b1, 0x028a }, { 0x01b2, 0x028b }, { 0x01b3, 0x01b4 },
- { 0x01b5, 0x01b6 }, { 0x01b7, 0x0292 }, { 0x01b8, 0x01b9 }, { 0x01bc, 0x01bd }, { 0x01c4, 0x01c6 }, { 0x01c5, 0x01c6 }, { 0x01c7, 0x01c9 }, { 0x01c8, 0x01c9 },
- { 0x01ca, 0x01cc }, { 0x01cb, 0x01cc }, { 0x01cd, 0x01ce }, { 0x01cf, 0x01d0 }, { 0x01d1, 0x01d2 }, { 0x01d3, 0x01d4 }, { 0x01d5, 0x01d6 }, { 0x01d7, 0x01d8 },
- { 0x01d9, 0x01da }, { 0x01db, 0x01dc }, { 0x01de, 0x01df }, { 0x01e0, 0x01e1 }, { 0x01e2, 0x01e3 }, { 0x01e4, 0x01e5 }, { 0x01e6, 0x01e7 }, { 0x01e8, 0x01e9 },
- { 0x01ea, 0x01eb }, { 0x01ec, 0x01ed }, { 0x01ee, 0x01ef }, { 0x01f1, 0x01f3 }, { 0x01f2, 0x01f3 }, { 0x01f4, 0x01f5 }, { 0x01f6, 0x0195 }, { 0x01f7, 0x01bf },
- { 0x01f8, 0x01f9 }, { 0x01fa, 0x01fb }, { 0x01fc, 0x01fd }, { 0x01fe, 0x01ff }, { 0x0200, 0x0201 }, { 0x0202, 0x0203 }, { 0x0204, 0x0205 }, { 0x0206, 0x0207 },
- { 0x0208, 0x0209 }, { 0x020a, 0x020b }, { 0x020c, 0x020d }, { 0x020e, 0x020f }, { 0x0210, 0x0211 }, { 0x0212, 0x0213 }, { 0x0214, 0x0215 }, { 0x0216, 0x0217 },
- { 0x0218, 0x0219 }, { 0x021a, 0x021b }, { 0x021c, 0x021d }, { 0x021e, 0x021f }, { 0x0220, 0x019e }, { 0x0222, 0x0223 }, { 0x0224, 0x0225 }, { 0x0226, 0x0227 },
- { 0x0228, 0x0229 }, { 0x022a, 0x022b }, { 0x022c, 0x022d }, { 0x022e, 0x022f }, { 0x0230, 0x0231 }, { 0x0232, 0x0233 }, { 0x0345, 0x03b9 }, { 0x0386, 0x03ac },
- { 0x038c, 0x03cc }, { 0x038e, 0x03cd }, { 0x038f, 0x03ce }, { 0x03c2, 0x03c3 }, { 0x03d0, 0x03b2 }, { 0x03d1, 0x03b8 }, { 0x03d5, 0x03c6 }, { 0x03d6, 0x03c0 },
- { 0x03d8, 0x03d9 }, { 0x03da, 0x03db }, { 0x03dc, 0x03dd }, { 0x03de, 0x03df }, { 0x03e0, 0x03e1 }, { 0x03e2, 0x03e3 }, { 0x03e4, 0x03e5 }, { 0x03e6, 0x03e7 },
- { 0x03e8, 0x03e9 }, { 0x03ea, 0x03eb }, { 0x03ec, 0x03ed }, { 0x03ee, 0x03ef }, { 0x03f0, 0x03ba }, { 0x03f1, 0x03c1 }, { 0x03f2, 0x03c3 }, { 0x03f4, 0x03b8 },
- { 0x03f5, 0x03b5 }, { 0x0460, 0x0461 }, { 0x0462, 0x0463 }, { 0x0464, 0x0465 }, { 0x0466, 0x0467 }, { 0x0468, 0x0469 }, { 0x046a, 0x046b }, { 0x046c, 0x046d },
- { 0x046e, 0x046f }, { 0x0470, 0x0471 }, { 0x0472, 0x0473 }, { 0x0474, 0x0475 }, { 0x0476, 0x0477 }, { 0x0478, 0x0479 }, { 0x047a, 0x047b }, { 0x047c, 0x047d },
- { 0x047e, 0x047f }, { 0x0480, 0x0481 }, { 0x048a, 0x048b }, { 0x048c, 0x048d }, { 0x048e, 0x048f }, { 0x0490, 0x0491 }, { 0x0492, 0x0493 }, { 0x0494, 0x0495 },
- { 0x0496, 0x0497 }, { 0x0498, 0x0499 }, { 0x049a, 0x049b }, { 0x049c, 0x049d }, { 0x049e, 0x049f }, { 0x04a0, 0x04a1 }, { 0x04a2, 0x04a3 }, { 0x04a4, 0x04a5 },
- { 0x04a6, 0x04a7 }, { 0x04a8, 0x04a9 }, { 0x04aa, 0x04ab }, { 0x04ac, 0x04ad }, { 0x04ae, 0x04af }, { 0x04b0, 0x04b1 }, { 0x04b2, 0x04b3 }, { 0x04b4, 0x04b5 },
- { 0x04b6, 0x04b7 }, { 0x04b8, 0x04b9 }, { 0x04ba, 0x04bb }, { 0x04bc, 0x04bd }, { 0x04be, 0x04bf }, { 0x04c1, 0x04c2 }, { 0x04c3, 0x04c4 }, { 0x04c5, 0x04c6 },
- { 0x04c7, 0x04c8 }, { 0x04c9, 0x04ca }, { 0x04cb, 0x04cc }, { 0x04cd, 0x04ce }, { 0x04d0, 0x04d1 }, { 0x04d2, 0x04d3 }, { 0x04d4, 0x04d5 }, { 0x04d6, 0x04d7 },
- { 0x04d8, 0x04d9 }, { 0x04da, 0x04db }, { 0x04dc, 0x04dd }, { 0x04de, 0x04df }, { 0x04e0, 0x04e1 }, { 0x04e2, 0x04e3 }, { 0x04e4, 0x04e5 }, { 0x04e6, 0x04e7 },
- { 0x04e8, 0x04e9 }, { 0x04ea, 0x04eb }, { 0x04ec, 0x04ed }, { 0x04ee, 0x04ef }, { 0x04f0, 0x04f1 }, { 0x04f2, 0x04f3 }, { 0x04f4, 0x04f5 }, { 0x04f8, 0x04f9 },
- { 0x0500, 0x0501 }, { 0x0502, 0x0503 }, { 0x0504, 0x0505 }, { 0x0506, 0x0507 }, { 0x0508, 0x0509 }, { 0x050a, 0x050b }, { 0x050c, 0x050d }, { 0x050e, 0x050f },
- { 0x1e00, 0x1e01 }, { 0x1e02, 0x1e03 }, { 0x1e04, 0x1e05 }, { 0x1e06, 0x1e07 }, { 0x1e08, 0x1e09 }, { 0x1e0a, 0x1e0b }, { 0x1e0c, 0x1e0d }, { 0x1e0e, 0x1e0f },
- { 0x1e10, 0x1e11 }, { 0x1e12, 0x1e13 }, { 0x1e14, 0x1e15 }, { 0x1e16, 0x1e17 }, { 0x1e18, 0x1e19 }, { 0x1e1a, 0x1e1b }, { 0x1e1c, 0x1e1d }, { 0x1e1e, 0x1e1f },
- { 0x1e20, 0x1e21 }, { 0x1e22, 0x1e23 }, { 0x1e24, 0x1e25 }, { 0x1e26, 0x1e27 }, { 0x1e28, 0x1e29 }, { 0x1e2a, 0x1e2b }, { 0x1e2c, 0x1e2d }, { 0x1e2e, 0x1e2f },
- { 0x1e30, 0x1e31 }, { 0x1e32, 0x1e33 }, { 0x1e34, 0x1e35 }, { 0x1e36, 0x1e37 }, { 0x1e38, 0x1e39 }, { 0x1e3a, 0x1e3b }, { 0x1e3c, 0x1e3d }, { 0x1e3e, 0x1e3f },
- { 0x1e40, 0x1e41 }, { 0x1e42, 0x1e43 }, { 0x1e44, 0x1e45 }, { 0x1e46, 0x1e47 }, { 0x1e48, 0x1e49 }, { 0x1e4a, 0x1e4b }, { 0x1e4c, 0x1e4d }, { 0x1e4e, 0x1e4f },
- { 0x1e50, 0x1e51 }, { 0x1e52, 0x1e53 }, { 0x1e54, 0x1e55 }, { 0x1e56, 0x1e57 }, { 0x1e58, 0x1e59 }, { 0x1e5a, 0x1e5b }, { 0x1e5c, 0x1e5d }, { 0x1e5e, 0x1e5f },
- { 0x1e60, 0x1e61 }, { 0x1e62, 0x1e63 }, { 0x1e64, 0x1e65 }, { 0x1e66, 0x1e67 }, { 0x1e68, 0x1e69 }, { 0x1e6a, 0x1e6b }, { 0x1e6c, 0x1e6d }, { 0x1e6e, 0x1e6f },
- { 0x1e70, 0x1e71 }, { 0x1e72, 0x1e73 }, { 0x1e74, 0x1e75 }, { 0x1e76, 0x1e77 }, { 0x1e78, 0x1e79 }, { 0x1e7a, 0x1e7b }, { 0x1e7c, 0x1e7d }, { 0x1e7e, 0x1e7f },
- { 0x1e80, 0x1e81 }, { 0x1e82, 0x1e83 }, { 0x1e84, 0x1e85 }, { 0x1e86, 0x1e87 }, { 0x1e88, 0x1e89 }, { 0x1e8a, 0x1e8b }, { 0x1e8c, 0x1e8d }, { 0x1e8e, 0x1e8f },
- { 0x1e90, 0x1e91 }, { 0x1e92, 0x1e93 }, { 0x1e94, 0x1e95 }, { 0x1e9b, 0x1e61 }, { 0x1ea0, 0x1ea1 }, { 0x1ea2, 0x1ea3 }, { 0x1ea4, 0x1ea5 }, { 0x1ea6, 0x1ea7 },
- { 0x1ea8, 0x1ea9 }, { 0x1eaa, 0x1eab }, { 0x1eac, 0x1ead }, { 0x1eae, 0x1eaf }, { 0x1eb0, 0x1eb1 }, { 0x1eb2, 0x1eb3 }, { 0x1eb4, 0x1eb5 }, { 0x1eb6, 0x1eb7 },
- { 0x1eb8, 0x1eb9 }, { 0x1eba, 0x1ebb }, { 0x1ebc, 0x1ebd }, { 0x1ebe, 0x1ebf }, { 0x1ec0, 0x1ec1 }, { 0x1ec2, 0x1ec3 }, { 0x1ec4, 0x1ec5 }, { 0x1ec6, 0x1ec7 },
- { 0x1ec8, 0x1ec9 }, { 0x1eca, 0x1ecb }, { 0x1ecc, 0x1ecd }, { 0x1ece, 0x1ecf }, { 0x1ed0, 0x1ed1 }, { 0x1ed2, 0x1ed3 }, { 0x1ed4, 0x1ed5 }, { 0x1ed6, 0x1ed7 },
- { 0x1ed8, 0x1ed9 }, { 0x1eda, 0x1edb }, { 0x1edc, 0x1edd }, { 0x1ede, 0x1edf }, { 0x1ee0, 0x1ee1 }, { 0x1ee2, 0x1ee3 }, { 0x1ee4, 0x1ee5 }, { 0x1ee6, 0x1ee7 },
- { 0x1ee8, 0x1ee9 }, { 0x1eea, 0x1eeb }, { 0x1eec, 0x1eed }, { 0x1eee, 0x1eef }, { 0x1ef0, 0x1ef1 }, { 0x1ef2, 0x1ef3 }, { 0x1ef4, 0x1ef5 }, { 0x1ef6, 0x1ef7 },
- { 0x1ef8, 0x1ef9 }, { 0x1f59, 0x1f51 }, { 0x1f5b, 0x1f53 }, { 0x1f5d, 0x1f55 }, { 0x1f5f, 0x1f57 }, { 0x1fb8, 0x1fb0 }, { 0x1fb9, 0x1fb1 }, { 0x1fba, 0x1f70 },
- { 0x1fbb, 0x1f71 }, { 0x1fbe, 0x03b9 }, { 0x1fd8, 0x1fd0 }, { 0x1fd9, 0x1fd1 }, { 0x1fda, 0x1f76 }, { 0x1fdb, 0x1f77 }, { 0x1fe8, 0x1fe0 }, { 0x1fe9, 0x1fe1 },
- { 0x1fea, 0x1f7a }, { 0x1feb, 0x1f7b }, { 0x1fec, 0x1fe5 }, { 0x1ff8, 0x1f78 }, { 0x1ff9, 0x1f79 }, { 0x1ffa, 0x1f7c }, { 0x1ffb, 0x1f7d }, { 0x2126, 0x03c9 },
- { 0x212a, 0x006b }, { 0x212b, 0x00e5 },
+ static const unsigned FOLD_MAP_1_DATA[] = {
+ 0x0061, 0x007a, 0x03bc, 0x00e0, 0x00f6, 0x00f8, 0x00fe, 0x0101, 0x012f, 0x0133, 0x0137, 0x013a, 0x0148,
+ 0x014b, 0x0177, 0x00ff, 0x017a, 0x017e, 0x0073, 0x0253, 0x0183, 0x0254, 0x0188, 0x0256, 0x018c, 0x01dd,
+ 0x0259, 0x025b, 0x0192, 0x0260, 0x0263, 0x0269, 0x0268, 0x0199, 0x026f, 0x0272, 0x0275, 0x01a1, 0x01a5,
+ 0x0280, 0x01a8, 0x0283, 0x01ad, 0x0288, 0x01b0, 0x028a, 0x01b4, 0x0292, 0x01b9, 0x01bd, 0x01c6, 0x01c6,
+ 0x01c9, 0x01c9, 0x01cc, 0x01cc, 0x01dc, 0x01df, 0x01ef, 0x01f3, 0x01f3, 0x0195, 0x01bf, 0x01f9, 0x021f,
+ 0x019e, 0x0223, 0x0233, 0x2c65, 0x023c, 0x019a, 0x2c66, 0x0242, 0x0180, 0x0289, 0x028c, 0x0247, 0x024f,
+ 0x03b9, 0x0371, 0x0377, 0x03f3, 0x03ac, 0x03ad, 0x03af, 0x03cc, 0x03cd, 0x03b1, 0x03c1, 0x03c3, 0x03cb,
+ 0x03c3, 0x03d7, 0x03b2, 0x03b8, 0x03c6, 0x03c0, 0x03d9, 0x03ef, 0x03ba, 0x03c1, 0x03b8, 0x03b5, 0x03f8,
+ 0x03f2, 0x03fb, 0x037b, 0x037d, 0x0450, 0x045f, 0x0430, 0x044f, 0x0461, 0x0481, 0x048b, 0x04bf, 0x04cf,
+ 0x04c2, 0x04ce, 0x04d1, 0x052f, 0x0561, 0x0586, 0x2d00, 0x2d25, 0x2d27, 0x2d2d, 0x13f0, 0x13f5, 0x0432,
+ 0x0434, 0x043e, 0x0441, 0x0442, 0x044a, 0x0463, 0xa64b, 0x10d0, 0x10fa, 0x10fd, 0x10ff, 0x1e01, 0x1e95,
+ 0x1e61, 0x1ea1, 0x1eff, 0x1f00, 0x1f07, 0x1f10, 0x1f15, 0x1f20, 0x1f27, 0x1f30, 0x1f37, 0x1f40, 0x1f45,
+ 0x1f51, 0x1f53, 0x1f55, 0x1f57, 0x1f60, 0x1f67, 0x1fb0, 0x1f70, 0x03b9, 0x1f72, 0x1f75, 0x1fd0, 0x1f76,
+ 0x1fe0, 0x1f7a, 0x1fe5, 0x1f78, 0x1f7c, 0x03c9, 0x006b, 0x00e5, 0x214e, 0x2170, 0x217f, 0x2184, 0x24d0,
+ 0x24e9, 0x2c30, 0x2c5e, 0x2c61, 0x026b, 0x1d7d, 0x027d, 0x2c68, 0x2c6c, 0x0251, 0x0271, 0x0250, 0x0252,
+ 0x2c73, 0x2c76, 0x023f, 0x2c81, 0x2ce3, 0x2cec, 0x2cf3, 0xa641, 0xa66d, 0xa681, 0xa69b, 0xa723, 0xa72f,
+ 0xa733, 0xa76f, 0xa77a, 0x1d79, 0xa77f, 0xa787, 0xa78c, 0x0265, 0xa791, 0xa797, 0xa7a9, 0x0266, 0x025c,
+ 0x0261, 0x026c, 0x026a, 0x029e, 0x0287, 0x029d, 0xab53, 0xa7b5, 0xa7bf, 0xa7c3, 0xa794, 0x0282, 0x1d8e,
+ 0x13a0, 0x13ef, 0xff41, 0xff5a, 0x10428, 0x1044f, 0x104d8, 0x104fb, 0x10cc0, 0x10cf2, 0x118c0, 0x118df,
+ 0x16e60, 0x16e7f, 0x1e922, 0x1e943
};
-
- /* This maps single codepoint to two codepoints. */
- static const struct {
- int src_codepoint;
- int dest_codepoint0;
- int dest_codepoint1;
- } double_map[] = {
- { 0x00df, 0x0073, 0x0073 }, { 0x0130, 0x0069, 0x0307 }, { 0x0149, 0x02bc, 0x006e }, { 0x01f0, 0x006a, 0x030c }, { 0x0587, 0x0565, 0x0582 }, { 0x1e96, 0x0068, 0x0331 },
- { 0x1e97, 0x0074, 0x0308 }, { 0x1e98, 0x0077, 0x030a }, { 0x1e99, 0x0079, 0x030a }, { 0x1e9a, 0x0061, 0x02be }, { 0x1f50, 0x03c5, 0x0313 }, { 0x1f80, 0x1f00, 0x03b9 },
- { 0x1f81, 0x1f01, 0x03b9 }, { 0x1f82, 0x1f02, 0x03b9 }, { 0x1f83, 0x1f03, 0x03b9 }, { 0x1f84, 0x1f04, 0x03b9 }, { 0x1f85, 0x1f05, 0x03b9 }, { 0x1f86, 0x1f06, 0x03b9 },
- { 0x1f87, 0x1f07, 0x03b9 }, { 0x1f88, 0x1f00, 0x03b9 }, { 0x1f89, 0x1f01, 0x03b9 }, { 0x1f8a, 0x1f02, 0x03b9 }, { 0x1f8b, 0x1f03, 0x03b9 }, { 0x1f8c, 0x1f04, 0x03b9 },
- { 0x1f8d, 0x1f05, 0x03b9 }, { 0x1f8e, 0x1f06, 0x03b9 }, { 0x1f8f, 0x1f07, 0x03b9 }, { 0x1f90, 0x1f20, 0x03b9 }, { 0x1f91, 0x1f21, 0x03b9 }, { 0x1f92, 0x1f22, 0x03b9 },
- { 0x1f93, 0x1f23, 0x03b9 }, { 0x1f94, 0x1f24, 0x03b9 }, { 0x1f95, 0x1f25, 0x03b9 }, { 0x1f96, 0x1f26, 0x03b9 }, { 0x1f97, 0x1f27, 0x03b9 }, { 0x1f98, 0x1f20, 0x03b9 },
- { 0x1f99, 0x1f21, 0x03b9 }, { 0x1f9a, 0x1f22, 0x03b9 }, { 0x1f9b, 0x1f23, 0x03b9 }, { 0x1f9c, 0x1f24, 0x03b9 }, { 0x1f9d, 0x1f25, 0x03b9 }, { 0x1f9e, 0x1f26, 0x03b9 },
- { 0x1f9f, 0x1f27, 0x03b9 }, { 0x1fa0, 0x1f60, 0x03b9 }, { 0x1fa1, 0x1f61, 0x03b9 }, { 0x1fa2, 0x1f62, 0x03b9 }, { 0x1fa3, 0x1f63, 0x03b9 }, { 0x1fa4, 0x1f64, 0x03b9 },
- { 0x1fa5, 0x1f65, 0x03b9 }, { 0x1fa6, 0x1f66, 0x03b9 }, { 0x1fa7, 0x1f67, 0x03b9 }, { 0x1fa8, 0x1f60, 0x03b9 }, { 0x1fa9, 0x1f61, 0x03b9 }, { 0x1faa, 0x1f62, 0x03b9 },
- { 0x1fab, 0x1f63, 0x03b9 }, { 0x1fac, 0x1f64, 0x03b9 }, { 0x1fad, 0x1f65, 0x03b9 }, { 0x1fae, 0x1f66, 0x03b9 }, { 0x1faf, 0x1f67, 0x03b9 }, { 0x1fb2, 0x1f70, 0x03b9 },
- { 0x1fb3, 0x03b1, 0x03b9 }, { 0x1fb4, 0x03ac, 0x03b9 }, { 0x1fb6, 0x03b1, 0x0342 }, { 0x1fbc, 0x03b1, 0x03b9 }, { 0x1fc2, 0x1f74, 0x03b9 }, { 0x1fc3, 0x03b7, 0x03b9 },
- { 0x1fc4, 0x03ae, 0x03b9 }, { 0x1fc6, 0x03b7, 0x0342 }, { 0x1fcc, 0x03b7, 0x03b9 }, { 0x1fd6, 0x03b9, 0x0342 }, { 0x1fe4, 0x03c1, 0x0313 }, { 0x1fe6, 0x03c5, 0x0342 },
- { 0x1ff2, 0x1f7c, 0x03b9 }, { 0x1ff3, 0x03c9, 0x03b9 }, { 0x1ff4, 0x03ce, 0x03b9 }, { 0x1ff6, 0x03c9, 0x0342 }, { 0x1ffc, 0x03c9, 0x03b9 }, { 0xfb00, 0x0066, 0x0066 },
- { 0xfb01, 0x0066, 0x0069 }, { 0xfb02, 0x0066, 0x006c }, { 0xfb05, 0x0073, 0x0074 }, { 0xfb06, 0x0073, 0x0074 }, { 0xfb13, 0x0574, 0x0576 }, { 0xfb14, 0x0574, 0x0565 },
- { 0xfb15, 0x0574, 0x056b }, { 0xfb16, 0x057e, 0x0576 }, { 0xfb17, 0x0574, 0x056d }
+ static const unsigned FOLD_MAP_2[] = {
+ S(0x00df), S(0x0130), S(0x0149), S(0x01f0), S(0x0587), S(0x1e96), S(0x1e97), S(0x1e98), S(0x1e99),
+ S(0x1e9a), S(0x1e9e), S(0x1f50), R(0x1f80,0x1f87), R(0x1f88,0x1f8f), R(0x1f90,0x1f97), R(0x1f98,0x1f9f),
+ R(0x1fa0,0x1fa7), R(0x1fa8,0x1faf), S(0x1fb2), S(0x1fb3), S(0x1fb4), S(0x1fb6), S(0x1fbc), S(0x1fc2),
+ S(0x1fc3), S(0x1fc4), S(0x1fc6), S(0x1fcc), S(0x1fd6), S(0x1fe4), S(0x1fe6), S(0x1ff2), S(0x1ff3),
+ S(0x1ff4), S(0x1ff6), S(0x1ffc), S(0xfb00), S(0xfb01), S(0xfb02), S(0xfb05), S(0xfb06), S(0xfb13),
+ S(0xfb14), S(0xfb15), S(0xfb16), S(0xfb17)
};
-
- /* This maps single codepoint to three codepoints. */
+ static const unsigned FOLD_MAP_2_DATA[] = {
+ 0x0073,0x0073, 0x0069,0x0307, 0x02bc,0x006e, 0x006a,0x030c, 0x0565,0x0582, 0x0068,0x0331, 0x0074,0x0308,
+ 0x0077,0x030a, 0x0079,0x030a, 0x0061,0x02be, 0x0073,0x0073, 0x03c5,0x0313, 0x1f00,0x03b9, 0x1f07,0x03b9,
+ 0x1f00,0x03b9, 0x1f07,0x03b9, 0x1f20,0x03b9, 0x1f27,0x03b9, 0x1f20,0x03b9, 0x1f27,0x03b9, 0x1f60,0x03b9,
+ 0x1f67,0x03b9, 0x1f60,0x03b9, 0x1f67,0x03b9, 0x1f70,0x03b9, 0x03b1,0x03b9, 0x03ac,0x03b9, 0x03b1,0x0342,
+ 0x03b1,0x03b9, 0x1f74,0x03b9, 0x03b7,0x03b9, 0x03ae,0x03b9, 0x03b7,0x0342, 0x03b7,0x03b9, 0x03b9,0x0342,
+ 0x03c1,0x0313, 0x03c5,0x0342, 0x1f7c,0x03b9, 0x03c9,0x03b9, 0x03ce,0x03b9, 0x03c9,0x0342, 0x03c9,0x03b9,
+ 0x0066,0x0066, 0x0066,0x0069, 0x0066,0x006c, 0x0073,0x0074, 0x0073,0x0074, 0x0574,0x0576, 0x0574,0x0565,
+ 0x0574,0x056b, 0x057e,0x0576, 0x0574,0x056d
+ };
+ static const unsigned FOLD_MAP_3[] = {
+ S(0x0390), S(0x03b0), S(0x1f52), S(0x1f54), S(0x1f56), S(0x1fb7), S(0x1fc7), S(0x1fd2), S(0x1fd3),
+ S(0x1fd7), S(0x1fe2), S(0x1fe3), S(0x1fe7), S(0x1ff7), S(0xfb03), S(0xfb04)
+ };
+ static const unsigned FOLD_MAP_3_DATA[] = {
+ 0x03b9,0x0308,0x0301, 0x03c5,0x0308,0x0301, 0x03c5,0x0313,0x0300, 0x03c5,0x0313,0x0301,
+ 0x03c5,0x0313,0x0342, 0x03b1,0x0342,0x03b9, 0x03b7,0x0342,0x03b9, 0x03b9,0x0308,0x0300,
+ 0x03b9,0x0308,0x0301, 0x03b9,0x0308,0x0342, 0x03c5,0x0308,0x0300, 0x03c5,0x0308,0x0301,
+ 0x03c5,0x0308,0x0342, 0x03c9,0x0342,0x03b9, 0x0066,0x0066,0x0069, 0x0066,0x0066,0x006c
+ };
+#undef R
+#undef S
static const struct {
- int src_codepoint;
- int dest_codepoint0;
- int dest_codepoint1;
- int dest_codepoint2;
- } triple_map[] = {
- { 0x0390, 0x03b9, 0x0308, 0x0301 }, { 0x03b0, 0x03c5, 0x0308, 0x0301 }, { 0x1f52, 0x03c5, 0x0313, 0x0300 }, { 0x1f54, 0x03c5, 0x0313, 0x0301 },
- { 0x1f56, 0x03c5, 0x0313, 0x0342 }, { 0x1fb7, 0x03b1, 0x0342, 0x03b9 }, { 0x1fc7, 0x03b7, 0x0342, 0x03b9 }, { 0x1fd2, 0x03b9, 0x0308, 0x0300 },
- { 0x1fd3, 0x03b9, 0x0308, 0x0301 }, { 0x1fd7, 0x03b9, 0x0308, 0x0342 }, { 0x1fe2, 0x03c5, 0x0308, 0x0300 }, { 0x1fe3, 0x03c5, 0x0308, 0x0301 },
- { 0x1fe7, 0x03c5, 0x0308, 0x0342 }, { 0x1ff7, 0x03c9, 0x0342, 0x03b9 }, { 0xfb03, 0x0066, 0x0066, 0x0069 }, { 0xfb04, 0x0066, 0x0066, 0x006c }
+ const unsigned* map;
+ const unsigned* data;
+ size_t map_size;
+ int n_codepoints;
+ } FOLD_MAP_LIST[] = {
+ { FOLD_MAP_1, FOLD_MAP_1_DATA, SIZEOF_ARRAY(FOLD_MAP_1), 1 },
+ { FOLD_MAP_2, FOLD_MAP_2_DATA, SIZEOF_ARRAY(FOLD_MAP_2), 2 },
+ { FOLD_MAP_3, FOLD_MAP_3_DATA, SIZEOF_ARRAY(FOLD_MAP_3), 3 }
};
int i;
@@ -669,41 +672,37 @@ struct MD_UNICODE_FOLD_INFO_tag {
return;
}
- for(i = 0; i < SIZEOF_ARRAY(range_map); i++) {
- if(range_map[i].min_codepoint <= codepoint && codepoint <= range_map[i].max_codepoint) {
- info->codepoints[0] = codepoint + range_map[i].offset;
- info->n_codepoints = 1;
- return;
- }
- }
-
- for(i = 0; i < SIZEOF_ARRAY(single_map); i++) {
- if(codepoint == single_map[i].src_codepoint) {
- info->codepoints[0] = single_map[i].dest_codepoint;
- info->n_codepoints = 1;
- return;
- }
- }
-
- for(i = 0; i < SIZEOF_ARRAY(double_map); i++) {
- if(codepoint == double_map[i].src_codepoint) {
- info->codepoints[0] = double_map[i].dest_codepoint0;
- info->codepoints[1] = double_map[i].dest_codepoint1;
- info->n_codepoints = 2;
- return;
- }
- }
+ /* Try to locate the codepoint in any of the maps. */
+ for(i = 0; i < (int) SIZEOF_ARRAY(FOLD_MAP_LIST); i++) {
+ int index;
+
+ index = md_unicode_bsearch__(codepoint, FOLD_MAP_LIST[i].map, FOLD_MAP_LIST[i].map_size);
+ if(index >= 0) {
+ /* Found the mapping. */
+ int n_codepoints = FOLD_MAP_LIST[i].n_codepoints;
+ const unsigned* map = FOLD_MAP_LIST[i].map;
+ const unsigned* codepoints = FOLD_MAP_LIST[i].data + (index * n_codepoints);
+
+ memcpy(info->codepoints, codepoints, sizeof(unsigned) * n_codepoints);
+ info->n_codepoints = n_codepoints;
+
+ if(FOLD_MAP_LIST[i].map[index] != codepoint) {
+ /* The found mapping maps whole range of codepoints,
+ * i.e. we have to offset info->codepoints[0] accordingly. */
+ if((map[index] & 0x00ffffff)+1 == codepoints[0]) {
+ /* Alternating type of the range. */
+ info->codepoints[0] = codepoint + ((codepoint & 0x1) == (map[index] & 0x1) ? 1 : 0);
+ } else {
+ /* Range to range kind of mapping. */
+ info->codepoints[0] += (codepoint - (map[index] & 0x00ffffff));
+ }
+ }
- for(i = 0; i < SIZEOF_ARRAY(triple_map); i++) {
- if(codepoint == triple_map[i].src_codepoint) {
- info->codepoints[0] = triple_map[i].dest_codepoint0;
- info->codepoints[1] = triple_map[i].dest_codepoint1;
- info->codepoints[2] = triple_map[i].dest_codepoint2;
- info->n_codepoints = 3;
return;
}
}
+ /* No mapping found. Map the codepoint to itself. */
info->codepoints[0] = codepoint;
info->n_codepoints = 1;
}
@@ -715,7 +714,7 @@ struct MD_UNICODE_FOLD_INFO_tag {
#define IS_UTF16_SURROGATE_LO(word) (((WORD)(word) & 0xfc00) == 0xdc00)
#define UTF16_DECODE_SURROGATE(hi, lo) (0x10000 + ((((unsigned)(hi) & 0x3ff) << 10) | (((unsigned)(lo) & 0x3ff) << 0)))
- static int
+ static unsigned
md_decode_utf16le__(const CHAR* str, SZ str_size, SZ* p_size)
{
if(IS_UTF16_SURROGATE_HI(str[0])) {
@@ -731,7 +730,7 @@ struct MD_UNICODE_FOLD_INFO_tag {
return str[0];
}
- static int
+ static unsigned
md_decode_utf16le_before__(MD_CTX* ctx, OFF off)
{
if(off > 2 && IS_UTF16_SURROGATE_HI(CH(off-2)) && IS_UTF16_SURROGATE_LO(CH(off-1)))
@@ -760,7 +759,7 @@ struct MD_UNICODE_FOLD_INFO_tag {
#define IS_UTF8_LEAD4(byte) (((unsigned char)(byte) & 0xf8) == 0xf0)
#define IS_UTF8_TAIL(byte) (((unsigned char)(byte) & 0xc0) == 0x80)
- static int
+ static unsigned
md_decode_utf8__(const CHAR* str, SZ str_size, SZ* p_size)
{
if(!IS_UTF8_LEAD1(str[0])) {
@@ -796,10 +795,10 @@ struct MD_UNICODE_FOLD_INFO_tag {
if(p_size != NULL)
*p_size = 1;
- return str[0];
+ return (unsigned) str[0];
}
- static int
+ static unsigned
md_decode_utf8_before__(MD_CTX* ctx, OFF off)
{
if(!IS_UTF8_LEAD1(CH(off-1))) {
@@ -819,7 +818,7 @@ struct MD_UNICODE_FOLD_INFO_tag {
(((unsigned int)CH(off-1) & 0x3f) << 0);
}
- return CH(off-1);
+ return (unsigned) CH(off-1);
}
#define ISUNICODEWHITESPACE_(codepoint) md_is_unicode_whitespace__(codepoint)
@@ -829,7 +828,7 @@ struct MD_UNICODE_FOLD_INFO_tag {
#define ISUNICODEPUNCT(off) md_is_unicode_punct__(md_decode_utf8__(STR(off), ctx->size - (off), NULL))
#define ISUNICODEPUNCTBEFORE(off) md_is_unicode_punct__(md_decode_utf8_before__(ctx, off))
- static inline int
+ static inline unsigned
md_decode_unicode(const CHAR* str, OFF off, SZ str_size, SZ* p_char_size)
{
return md_decode_utf8__(str+off, str_size-off, p_char_size);
@@ -843,7 +842,7 @@ struct MD_UNICODE_FOLD_INFO_tag {
#define ISUNICODEPUNCTBEFORE(off) ISPUNCT((off)-1)
static inline void
- md_get_unicode_fold_info(int codepoint, MD_UNICODE_FOLD_INFO* info)
+ md_get_unicode_fold_info(unsigned codepoint, MD_UNICODE_FOLD_INFO* info)
{
info->codepoints[0] = codepoint;
if(ISUPPER_(codepoint))
@@ -851,11 +850,11 @@ struct MD_UNICODE_FOLD_INFO_tag {
info->n_codepoints = 1;
}
- static inline int
+ static inline unsigned
md_decode_unicode(const CHAR* str, OFF off, SZ str_size, SZ* p_size)
{
*p_size = 1;
- return str[off];
+ return (unsigned) str[off];
}
#endif
@@ -929,7 +928,7 @@ static OFF
md_skip_unicode_whitespace(const CHAR* label, OFF off, SZ size)
{
SZ char_size;
- int codepoint;
+ unsigned codepoint;
while(off < size) {
codepoint = md_decode_unicode(label, off, size, &char_size);
@@ -1493,7 +1492,7 @@ md_link_label_hash(const CHAR* label, SZ size)
{
unsigned hash = MD_FNV1A_BASE;
OFF off;
- int codepoint;
+ unsigned codepoint;
int is_whitespace = FALSE;
off = md_skip_unicode_whitespace(label, 0, size);
@@ -1505,24 +1504,17 @@ md_link_label_hash(const CHAR* label, SZ size)
if(is_whitespace) {
codepoint = ' ';
- hash = md_fnv1a(hash, &codepoint, 1 * sizeof(int));
-
+ hash = md_fnv1a(hash, &codepoint, sizeof(unsigned));
off = md_skip_unicode_whitespace(label, off, size);
} else {
MD_UNICODE_FOLD_INFO fold_info;
md_get_unicode_fold_info(codepoint, &fold_info);
- hash = md_fnv1a(hash, fold_info.codepoints, fold_info.n_codepoints * sizeof(int));
-
+ hash = md_fnv1a(hash, fold_info.codepoints, fold_info.n_codepoints * sizeof(unsigned));
off += char_size;
}
}
- if(!is_whitespace) {
- codepoint = ' ';
- hash = md_fnv1a(hash, &codepoint, 1 * sizeof(int));
- }
-
return hash;
}
@@ -1536,7 +1528,7 @@ md_link_label_cmp(const CHAR* a_label, SZ a_size, const CHAR* b_label, SZ b_size
a_off = md_skip_unicode_whitespace(a_label, 0, a_size);
b_off = md_skip_unicode_whitespace(b_label, 0, b_size);
while(a_off < a_size || b_off < b_size) {
- int a_codepoint, b_codepoint;
+ unsigned a_codepoint, b_codepoint;
SZ a_char_size, b_char_size;
int a_is_whitespace, b_is_whitespace;
@@ -1545,7 +1537,7 @@ md_link_label_cmp(const CHAR* a_label, SZ a_size, const CHAR* b_label, SZ b_size
a_is_whitespace = ISUNICODEWHITESPACE_(a_codepoint) || ISNEWLINE_(a_label[a_off]);
} else {
/* Treat end of label as a whitespace. */
- a_codepoint = -1;
+ a_codepoint = 0;
a_is_whitespace = TRUE;
}
@@ -1554,7 +1546,7 @@ md_link_label_cmp(const CHAR* a_label, SZ a_size, const CHAR* b_label, SZ b_size
b_is_whitespace = ISUNICODEWHITESPACE_(b_codepoint) || ISNEWLINE_(b_label[b_off]);
} else {
/* Treat end of label as a whitespace. */
- b_codepoint = -1;
+ b_codepoint = 0;
b_is_whitespace = TRUE;
}
@@ -1573,7 +1565,7 @@ md_link_label_cmp(const CHAR* a_label, SZ a_size, const CHAR* b_label, SZ b_size
if(a_fold_info.n_codepoints != b_fold_info.n_codepoints)
return (a_fold_info.n_codepoints - b_fold_info.n_codepoints);
- cmp = memcmp(a_fold_info.codepoints, b_fold_info.codepoints, a_fold_info.n_codepoints * sizeof(int));
+ cmp = memcmp(a_fold_info.codepoints, b_fold_info.codepoints, a_fold_info.n_codepoints * sizeof(unsigned));
if(cmp != 0)
return cmp;
@@ -1853,7 +1845,7 @@ md_is_link_label(MD_CTX* ctx, const MD_LINE* lines, int n_lines, OFF beg,
return FALSE;
}
} else {
- int codepoint;
+ unsigned codepoint;
SZ char_size;
codepoint = md_decode_unicode(ctx->text, off, ctx->size, &char_size);
diff --git a/scripts/build_folding_map.py b/scripts/build_folding_map.py
new file mode 100644
index 0000000..b27b480
--- /dev/null
+++ b/scripts/build_folding_map.py
@@ -0,0 +1,118 @@
+#!/usr/bin/env python3
+
+import os
+import sys
+import textwrap
+
+
+self_path = os.path.dirname(os.path.realpath(__file__));
+f = open(self_path + "/unicode/CaseFolding.txt", "r")
+
+status_list = [ "C", "F" ]
+
+folding_list = [ dict(), dict(), dict() ]
+
+# Filter the foldings for "full" folding.
+for line in f:
+ comment_off = line.find("#")
+ if comment_off >= 0:
+ line = line[:comment_off]
+ line = line.strip()
+ if not line:
+ continue
+
+ raw_codepoint, status, raw_mapping, ignored_tail = line.split(";", 3)
+ if not status.strip() in status_list:
+ continue
+ codepoint = int(raw_codepoint.strip(), 16)
+ mapping = [int(it, 16) for it in raw_mapping.strip().split(" ")]
+ mapping_len = len(mapping)
+
+ if mapping_len in range(1, 4):
+ folding_list[mapping_len-1][codepoint] = mapping
+ else:
+ assert(False)
+f.close()
+
+
+# If we assume that range (index0 ... index-1) makes a range, check that index
+# is compatible with it too.
+#
+# We are capable to handle ranges which:
+#
+# (1) either form consecutive sequence of codepoints and which map that range
+# to other consecutive range of codepoints;
+#
+# (2) or consecutive range of codepoints with step 2 where each codepoint
+# CP is mapped to the next codepoint CP+1
+# (e.g. 0x1234 -> 0x1235; 0x1236 -> 0x1238; ...).
+#
+# (If the mappings have multiple codepoints, only the 1st mapped codepoint is
+# considered and all the other ones have to be the same for the whole range.)
+def is_range_compatible(folding, codepoint_list, index0, index):
+ N = index - index0
+ codepoint0 = codepoint_list[index0]
+ codepoint1 = codepoint_list[index0+1]
+ codepointN = codepoint_list[index]
+ mapping0 = folding[codepoint0]
+ mapping1 = folding[codepoint1]
+ mappingN = folding[codepointN]
+
+ # Check the range type (1):
+ if codepoint1 - codepoint0 == 1 and codepointN - codepoint0 == N \
+ and mapping1[0] - mapping0[0] == 1 and mapping1[1:] == mapping0[1:] \
+ and mappingN[0] - mapping0[0] == N and mappingN[1:] == mapping0[1:]:
+ return True
+
+ # Check the range type (2):
+ if codepoint1 - codepoint0 == 2 and codepointN - codepoint0 == 2 * N \
+ and mapping0[0] - codepoint0 == 1 \
+ and mapping1[0] - codepoint1 == 1 and mapping1[1:] == mapping0[1:] \
+ and mappingN[0] - codepointN == 1 and mappingN[1:] == mapping0[1:]:
+ return True
+
+ return False
+
+
+def mapping_str(list, mapping):
+ return ",".join("0x{:04x}".format(x) for x in mapping)
+
+for mapping_len in range(1, 4):
+ folding = folding_list[mapping_len-1]
+ codepoint_list = list(folding)
+
+ index0 = 0
+ count = len(folding)
+
+ records = list()
+ data_records = list()
+
+ while index0 < count:
+ index1 = index0 + 1
+ while index1 < count and is_range_compatible(folding, codepoint_list, index0, index1):
+ index1 += 1
+
+ if index1 - index0 > 2:
+ # Range of codepoints
+ records.append("R(0x{:04x},0x{:04x})".format(codepoint_list[index0], codepoint_list[index1-1]))
+ data_records.append(mapping_str(data_records, folding[codepoint_list[index0]]))
+ data_records.append(mapping_str(data_records, folding[codepoint_list[index1-1]]))
+ else:
+ # Single codepoint
+ records.append("S(0x{:04x})".format(codepoint_list[index0]))
+ data_records.append(mapping_str(data_records, folding[codepoint_list[index0]]))
+
+ index0 = index1
+
+ sys.stdout.write("static const unsigned FOLD_MAP_{}[] = {{\n".format(mapping_len))
+ sys.stdout.write("\n".join(textwrap.wrap(", ".join(records), 110,
+ initial_indent = " ", subsequent_indent=" ")))
+ sys.stdout.write("\n};\n")
+
+ sys.stdout.write("static const unsigned FOLD_MAP_{}_DATA[] = {{\n".format(mapping_len))
+ sys.stdout.write("\n".join(textwrap.wrap(", ".join(data_records), 110,
+ initial_indent = " ", subsequent_indent=" ")))
+ sys.stdout.write("\n};\n")
+
+
+
diff --git a/scripts/build_punct_map.py b/scripts/build_punct_map.py
new file mode 100644
index 0000000..13102f2
--- /dev/null
+++ b/scripts/build_punct_map.py
@@ -0,0 +1,66 @@
+#!/usr/bin/env python3
+
+import os
+import sys
+import textwrap
+
+
+self_path = os.path.dirname(os.path.realpath(__file__));
+f = open(self_path + "/unicode/DerivedGeneralCategory.txt", "r")
+
+codepoint_list = []
+category_list = [ "Pc", "Pd", "Pe", "Pf", "Pi", "Po", "Ps" ]
+
+# Filter codepoints falling in the right category:
+for line in f:
+ comment_off = line.find("#")
+ if comment_off >= 0:
+ line = line[:comment_off]
+ line = line.strip()
+ if not line:
+ continue
+
+ char_range, category = line.split(";")
+ char_range = char_range.strip()
+ category = category.strip()
+
+ if not category in category_list:
+ continue
+
+ delim_off = char_range.find("..")
+ if delim_off >= 0:
+ codepoint0 = int(char_range[:delim_off], 16)
+ codepoint1 = int(char_range[delim_off+2:], 16)
+ for codepoint in range(codepoint0, codepoint1 + 1):
+ codepoint_list.append(codepoint)
+ else:
+ codepoint = int(char_range, 16)
+ codepoint_list.append(codepoint)
+f.close()
+
+
+codepoint_list.sort()
+
+
+index0 = 0
+count = len(codepoint_list)
+
+records = list()
+while index0 < count:
+ index1 = index0 + 1
+ while index1 < count and codepoint_list[index1] == codepoint_list[index1-1] + 1:
+ index1 += 1
+
+ if index1 - index0 > 1:
+ # Range of codepoints
+ records.append("R(0x{:04x},0x{:04x})".format(codepoint_list[index0], codepoint_list[index1-1]))
+ else:
+ # Single codepoint
+ records.append("S(0x{:04x})".format(codepoint_list[index0]))
+
+ index0 = index1
+
+sys.stdout.write("static const unsigned PUNCT_MAP[] = {\n")
+sys.stdout.write("\n".join(textwrap.wrap(", ".join(records), 110,
+ initial_indent = " ", subsequent_indent=" ")))
+sys.stdout.write("\n};\n\n")
diff --git a/scripts/build_whitespace_map.py b/scripts/build_whitespace_map.py
new file mode 100644
index 0000000..932b571
--- /dev/null
+++ b/scripts/build_whitespace_map.py
@@ -0,0 +1,66 @@
+#!/usr/bin/env python3
+
+import os
+import sys
+import textwrap
+
+
+self_path = os.path.dirname(os.path.realpath(__file__));
+f = open(self_path + "/unicode/DerivedGeneralCategory.txt", "r")
+
+codepoint_list = []
+category_list = [ "Zs" ]
+
+# Filter codepoints falling in the right category:
+for line in f:
+ comment_off = line.find("#")
+ if comment_off >= 0:
+ line = line[:comment_off]
+ line = line.strip()
+ if not line:
+ continue
+
+ char_range, category = line.split(";")
+ char_range = char_range.strip()
+ category = category.strip()
+
+ if not category in category_list:
+ continue
+
+ delim_off = char_range.find("..")
+ if delim_off >= 0:
+ codepoint0 = int(char_range[:delim_off], 16)
+ codepoint1 = int(char_range[delim_off+2:], 16)
+ for codepoint in range(codepoint0, codepoint1 + 1):
+ codepoint_list.append(codepoint)
+ else:
+ codepoint = int(char_range, 16)
+ codepoint_list.append(codepoint)
+f.close()
+
+
+codepoint_list.sort()
+
+
+index0 = 0
+count = len(codepoint_list)
+
+records = list()
+while index0 < count:
+ index1 = index0 + 1
+ while index1 < count and codepoint_list[index1] == codepoint_list[index1-1] + 1:
+ index1 += 1
+
+ if index1 - index0 > 1:
+ # Range of codepoints
+ records.append("R(0x{:04x},0x{:04x})".format(codepoint_list[index0], codepoint_list[index1-1]))
+ else:
+ # Single codepoint
+ records.append("S(0x{:04x})".format(codepoint_list[index0]))
+
+ index0 = index1
+
+sys.stdout.write("static const unsigned WHITESPACE_MAP[] = {\n")
+sys.stdout.write("\n".join(textwrap.wrap(", ".join(records), 110,
+ initial_indent = " ", subsequent_indent=" ")))
+sys.stdout.write("\n};\n\n")
diff --git a/scripts/unicode/CaseFolding.txt b/scripts/unicode/CaseFolding.txt
new file mode 100644
index 0000000..7eeb915
--- /dev/null
+++ b/scripts/unicode/CaseFolding.txt
@@ -0,0 +1,1581 @@
+# CaseFolding-12.1.0.txt
+# Date: 2019-03-10, 10:53:00 GMT
+# © 2019 Unicode®, Inc.
+# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
+# For terms of use, see http://www.unicode.org/terms_of_use.html
+#
+# Unicode Character Database
+# For documentation, see http://www.unicode.org/reports/tr44/
+#
+# Case Folding Properties
+#
+# This file is a supplement to the UnicodeData file.
+# It provides a case folding mapping generated from the Unicode Character Database.
+# If all characters are mapped according to the full mapping below, then
+# case differences (according to UnicodeData.txt and SpecialCasing.txt)
+# are eliminated.
+#
+# The data supports both implementations that require simple case foldings
+# (where string lengths don't change), and implementations that allow full case folding
+# (where string lengths may grow). Note that where they can be supported, the
+# full case foldings are superior: for example, they allow "MASSE" and "Maße" to match.
+#
+# All code points not listed in this file map to themselves.
+#
+# NOTE: case folding does not preserve normalization formats!
+#
+# For information on case folding, including how to have case folding
+# preserve normalization formats, see Section 3.13 Default Case Algorithms in
+# The Unicode Standard.
+#
+# ================================================================================
+# Format
+# ================================================================================
+# The entries in this file are in the following machine-readable format:
+#
+# <code>; <status>; <mapping>; # <name>
+#
+# The status field is:
+# C: common case folding, common mappings shared by both simple and full mappings.
+# F: full case folding, mappings that cause strings to grow in length. Multiple characters are separated by spaces.
+# S: simple case folding, mappings to single characters where different from F.
+# T: special case for uppercase I and dotted uppercase I
+# - For non-Turkic languages, this mapping is normally not used.
+# - For Turkic languages (tr, az), this mapping can be used instead of the normal mapping for these characters.
+# Note that the Turkic mappings do not maintain canonical equivalence without additional processing.
+# See the discussions of case mapping in the Unicode Standard for more information.
+#
+# Usage:
+# A. To do a simple case folding, use the mappings with status C + S.
+# B. To do a full case folding, use the mappings with status C + F.
+#
+# The mappings with status T can be used or omitted depending on the desired case-folding
+# behavior. (The default option is to exclude them.)
+#
+# =================================================================
+
+# Property: Case_Folding
+
+# All code points not explicitly listed for Case_Folding
+# have the value C for the status field, and the code point itself for the mapping field.
+
+# =================================================================
+0041; C; 0061; # LATIN CAPITAL LETTER A
+0042; C; 0062; # LATIN CAPITAL LETTER B
+0043; C; 0063; # LATIN CAPITAL LETTER C
+0044; C; 0064; # LATIN CAPITAL LETTER D
+0045; C; 0065; # LATIN CAPITAL LETTER E
+0046; C; 0066; # LATIN CAPITAL LETTER F
+0047; C; 0067; # LATIN CAPITAL LETTER G
+0048; C; 0068; # LATIN CAPITAL LETTER H
+0049; C; 0069; # LATIN CAPITAL LETTER I
+0049; T; 0131; # LATIN CAPITAL LETTER I
+004A; C; 006A; # LATIN CAPITAL LETTER J
+004B; C; 006B; # LATIN CAPITAL LETTER K
+004C; C; 006C; # LATIN CAPITAL LETTER L
+004D; C; 006D; # LATIN CAPITAL LETTER M
+004E; C; 006E; # LATIN CAPITAL LETTER N
+004F; C; 006F; # LATIN CAPITAL LETTER O
+0050; C; 0070; # LATIN CAPITAL LETTER P
+0051; C; 0071; # LATIN CAPITAL LETTER Q
+0052; C; 0072; # LATIN CAPITAL LETTER R
+0053; C; 0073; # LATIN CAPITAL LETTER S
+0054; C; 0074; # LATIN CAPITAL LETTER T
+0055; C; 0075; # LATIN CAPITAL LETTER U
+0056; C; 0076; # LATIN CAPITAL LETTER V
+0057; C; 0077; # LATIN CAPITAL LETTER W
+0058; C; 0078; # LATIN CAPITAL LETTER X
+0059; C; 0079; # LATIN CAPITAL LETTER Y
+005A; C; 007A; # LATIN CAPITAL LETTER Z
+00B5; C; 03BC; # MICRO SIGN
+00C0; C; 00E0; # LATIN CAPITAL LETTER A WITH GRAVE
+00C1; C; 00E1; # LATIN CAPITAL LETTER A WITH ACUTE
+00C2; C; 00E2; # LATIN CAPITAL LETTER A WITH CIRCUMFLEX
+00C3; C; 00E3; # LATIN CAPITAL LETTER A WITH TILDE
+00C4; C; 00E4; # LATIN CAPITAL LETTER A WITH DIAERESIS
+00C5; C; 00E5; # LATIN CAPITAL LETTER A WITH RING ABOVE
+00C6; C; 00E6; # LATIN CAPITAL LETTER AE
+00C7; C; 00E7; # LATIN CAPITAL LETTER C WITH CEDILLA
+00C8; C; 00E8; # LATIN CAPITAL LETTER E WITH GRAVE
+00C9; C; 00E9; # LATIN CAPITAL LETTER E WITH ACUTE
+00CA; C; 00EA; # LATIN CAPITAL LETTER E WITH CIRCUMFLEX
+00CB; C; 00EB; # LATIN CAPITAL LETTER E WITH DIAERESIS
+00CC; C; 00EC; # LATIN CAPITAL LETTER I WITH GRAVE
+00CD; C; 00ED; # LATIN CAPITAL LETTER I WITH ACUTE
+00CE; C; 00EE; # LATIN CAPITAL LETTER I WITH CIRCUMFLEX
+00CF; C; 00EF; # LATIN CAPITAL LETTER I WITH DIAERESIS
+00D0; C; 00F0; # LATIN CAPITAL LETTER ETH
+00D1; C; 00F1; # LATIN CAPITAL LETTER N WITH TILDE
+00D2; C; 00F2; # LATIN CAPITAL LETTER O WITH GRAVE
+00D3; C; 00F3; # LATIN CAPITAL LETTER O WITH ACUTE
+00D4; C; 00F4; # LATIN CAPITAL LETTER O WITH CIRCUMFLEX
+00D5; C; 00F5; # LATIN CAPITAL LETTER O WITH TILDE
+00D6; C; 00F6; # LATIN CAPITAL LETTER O WITH DIAERESIS
+00D8; C; 00F8; # LATIN CAPITAL LETTER O WITH STROKE
+00D9; C; 00F9; # LATIN CAPITAL LETTER U WITH GRAVE
+00DA; C; 00FA; # LATIN CAPITAL LETTER U WITH ACUTE
+00DB; C; 00FB; # LATIN CAPITAL LETTER U WITH CIRCUMFLEX
+00DC; C; 00FC; # LATIN CAPITAL LETTER U WITH DIAERESIS
+00DD; C; 00FD; # LATIN CAPITAL LETTER Y WITH ACUTE
+00DE; C; 00FE; # LATIN CAPITAL LETTER THORN
+00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S
+0100; C; 0101; # LATIN CAPITAL LETTER A WITH MACRON
+0102; C; 0103; # LATIN CAPITAL LETTER A WITH BREVE
+0104; C; 0105; # LATIN CAPITAL LETTER A WITH OGONEK
+0106; C; 0107; # LATIN CAPITAL LETTER C WITH ACUTE
+0108; C; 0109; # LATIN CAPITAL LETTER C WITH CIRCUMFLEX
+010A; C; 010B; # LATIN CAPITAL LETTER C WITH DOT ABOVE
+010C; C; 010D; # LATIN CAPITAL LETTER C WITH CARON
+010E; C; 010F; # LATIN CAPITAL LETTER D WITH CARON
+0110; C; 0111; # LATIN CAPITAL LETTER D WITH STROKE
+0112; C; 0113; # LATIN CAPITAL LETTER E WITH MACRON
+0114; C; 0115; # LATIN CAPITAL LETTER E WITH BREVE
+0116; C; 0117; # LATIN CAPITAL LETTER E WITH DOT ABOVE
+0118; C; 0119; # LATIN CAPITAL LETTER E WITH OGONEK
+011A; C; 011B; # LATIN CAPITAL LETTER E WITH CARON
+011C; C; 011D; # LATIN CAPITAL LETTER G WITH CIRCUMFLEX
+011E; C; 011F; # LATIN CAPITAL LETTER G WITH BREVE
+0120; C; 0121; # LATIN CAPITAL LETTER G WITH DOT ABOVE
+0122; C; 0123; # LATIN CAPITAL LETTER G WITH CEDILLA
+0124; C; 0125; # LATIN CAPITAL LETTER H WITH CIRCUMFLEX
+0126; C; 0127; # LATIN CAPITAL LETTER H WITH STROKE
+0128; C; 0129; # LATIN CAPITAL LETTER I WITH TILDE
+012A; C; 012B; # LATIN CAPITAL LETTER I WITH MACRON
+012C; C; 012D; # LATIN CAPITAL LETTER I WITH BREVE
+012E; C; 012F; # LATIN CAPITAL LETTER I WITH OGONEK
+0130; F; 0069 0307; # LATIN CAPITAL LETTER I WITH DOT ABOVE
+0130; T; 0069; # LATIN CAPITAL LETTER I WITH DOT ABOVE
+0132; C; 0133; # LATIN CAPITAL LIGATURE IJ
+0134; C; 0135; # LATIN CAPITAL LETTER J WITH CIRCUMFLEX
+0136; C; 0137; # LATIN CAPITAL LETTER K WITH CEDILLA
+0139; C; 013A; # LATIN CAPITAL LETTER L WITH ACUTE
+013B; C; 013C; # LATIN CAPITAL LETTER L WITH CEDILLA
+013D; C; 013E; # LATIN CAPITAL LETTER L WITH CARON
+013F; C; 0140; # LATIN CAPITAL LETTER L WITH MIDDLE DOT
+0141; C; 0142; # LATIN CAPITAL LETTER L WITH STROKE
+0143; C; 0144; # LATIN CAPITAL LETTER N WITH ACUTE
+0145; C; 0146; # LATIN CAPITAL LETTER N WITH CEDILLA
+0147; C; 0148; # LATIN CAPITAL LETTER N WITH CARON
+0149; F; 02BC 006E; # LATIN SMALL LETTER N PRECEDED BY APOSTROPHE
+014A; C; 014B; # LATIN CAPITAL LETTER ENG
+014C; C; 014D; # LATIN CAPITAL LETTER O WITH MACRON
+014E; C; 014F; # LATIN CAPITAL LETTER O WITH BREVE
+0150; C; 0151; # LATIN CAPITAL LETTER O WITH DOUBLE ACUTE
+0152; C; 0153; # LATIN CAPITAL LIGATURE OE
+0154; C; 0155; # LATIN CAPITAL LETTER R WITH ACUTE
+0156; C; 0157; # LATIN CAPITAL LETTER R WITH CEDILLA
+0158; C; 0159; # LATIN CAPITAL LETTER R WITH CARON
+015A; C; 015B; # LATIN CAPITAL LETTER S WITH ACUTE
+015C; C; 015D; # LATIN CAPITAL LETTER S WITH CIRCUMFLEX
+015E; C; 015F; # LATIN CAPITAL LETTER S WITH CEDILLA
+0160; C; 0161; # LATIN CAPITAL LETTER S WITH CARON
+0162; C; 0163; # LATIN CAPITAL LETTER T WITH CEDILLA
+0164; C; 0165; # LATIN CAPITAL LETTER T WITH CARON
+0166; C; 0167; # LATIN CAPITAL LETTER T WITH STROKE
+0168; C; 0169; # LATIN CAPITAL LETTER U WITH TILDE
+016A; C; 016B; # LATIN CAPITAL LETTER U WITH MACRON
+016C; C; 016D; # LATIN CAPITAL LETTER U WITH BREVE
+016E; C; 016F; # LATIN CAPITAL LETTER U WITH RING ABOVE
+0170; C; 0171; # LATIN CAPITAL LETTER U WITH DOUBLE ACUTE
+0172; C; 0173; # LATIN CAPITAL LETTER U WITH OGONEK
+0174; C; 0175; # LATIN CAPITAL LETTER W WITH CIRCUMFLEX
+0176; C; 0177; # LATIN CAPITAL LETTER Y WITH CIRCUMFLEX
+0178; C; 00FF; # LATIN CAPITAL LETTER Y WITH DIAERESIS
+0179; C; 017A; # LATIN CAPITAL LETTER Z WITH ACUTE
+017B; C; 017C; # LATIN CAPITAL LETTER Z WITH DOT ABOVE
+017D; C; 017E; # LATIN CAPITAL LETTER Z WITH CARON
+017F; C; 0073; # LATIN SMALL LETTER LONG S
+0181; C; 0253; # LATIN CAPITAL LETTER B WITH HOOK
+0182; C; 0183; # LATIN CAPITAL LETTER B WITH TOPBAR
+0184; C; 0185; # LATIN CAPITAL LETTER TONE SIX
+0186; C; 0254; # LATIN CAPITAL LETTER OPEN O
+0187; C; 0188; # LATIN CAPITAL LETTER C WITH HOOK
+0189; C; 0256; # LATIN CAPITAL LETTER AFRICAN D
+018A; C; 0257; # LATIN CAPITAL LETTER D WITH HOOK
+018B; C; 018C; # LATIN CAPITAL LETTER D WITH TOPBAR
+018E; C; 01DD; # LATIN CAPITAL LETTER REVERSED E
+018F; C; 0259; # LATIN CAPITAL LETTER SCHWA
+0190; C; 025B; # LATIN CAPITAL LETTER OPEN E
+0191; C; 0192; # LATIN CAPITAL LETTER F WITH HOOK
+0193; C; 0260; # LATIN CAPITAL LETTER G WITH HOOK
+0194; C; 0263; # LATIN CAPITAL LETTER GAMMA
+0196; C; 0269; # LATIN CAPITAL LETTER IOTA
+0197; C; 0268; # LATIN CAPITAL LETTER I WITH STROKE
+0198; C; 0199; # LATIN CAPITAL LETTER K WITH HOOK
+019C; C; 026F; # LATIN CAPITAL LETTER TURNED M
+019D; C; 0272; # LATIN CAPITAL LETTER N WITH LEFT HOOK
+019F; C; 0275; # LATIN CAPITAL LETTER O WITH MIDDLE TILDE
+01A0; C; 01A1; # LATIN CAPITAL LETTER O WITH HORN
+01A2; C; 01A3; # LATIN CAPITAL LETTER OI
+01A4; C; 01A5; # LATIN CAPITAL LETTER P WITH HOOK
+01A6; C; 0280; # LATIN LETTER YR
+01A7; C; 01A8; # LATIN CAPITAL LETTER TONE TWO
+01A9; C; 0283; # LATIN CAPITAL LETTER ESH
+01AC; C; 01AD; # LATIN CAPITAL LETTER T WITH HOOK
+01AE; C; 0288; # LATIN CAPITAL LETTER T WITH RETROFLEX HOOK
+01AF; C; 01B0; # LATIN CAPITAL LETTER U WITH HORN
+01B1; C; 028A; # LATIN CAPITAL LETTER UPSILON
+01B2; C; 028B; # LATIN CAPITAL LETTER V WITH HOOK
+01B3; C; 01B4; # LATIN CAPITAL LETTER Y WITH HOOK
+01B5; C; 01B6; # LATIN CAPITAL LETTER Z WITH STROKE
+01B7; C; 0292; # LATIN CAPITAL LETTER EZH
+01B8; C; 01B9; # LATIN CAPITAL LETTER EZH REVERSED
+01BC; C; 01BD; # LATIN CAPITAL LETTER TONE FIVE
+01C4; C; 01C6; # LATIN CAPITAL LETTER DZ WITH CARON
+01C5; C; 01C6; # LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON
+01C7; C; 01C9; # LATIN CAPITAL LETTER LJ
+01C8; C; 01C9; # LATIN CAPITAL LETTER L WITH SMALL LETTER J
+01CA; C; 01CC; # LATIN CAPITAL LETTER NJ
+01CB; C; 01CC; # LATIN CAPITAL LETTER N WITH SMALL LETTER J
+01CD; C; 01CE; # LATIN CAPITAL LETTER A WITH CARON
+01CF; C; 01D0; # LATIN CAPITAL LETTER I WITH CARON
+01D1; C; 01D2; # LATIN CAPITAL LETTER O WITH CARON
+01D3; C; 01D4; # LATIN CAPITAL LETTER U WITH CARON
+01D5; C; 01D6; # LATIN CAPITAL LETTER U WITH DIAERESIS AND MACRON
+01D7; C; 01D8; # LATIN CAPITAL LETTER U WITH DIAERESIS AND ACUTE
+01D9; C; 01DA; # LATIN CAPITAL LETTER U WITH DIAERESIS AND CARON
+01DB; C; 01DC; # LATIN CAPITAL LETTER U WITH DIAERESIS AND GRAVE
+01DE; C; 01DF; # LATIN CAPITAL LETTER A WITH DIAERESIS AND MACRON
+01E0; C; 01E1; # LATIN CAPITAL LETTER A WITH DOT ABOVE AND MACRON
+01E2; C; 01E3; # LATIN CAPITAL LETTER AE WITH MACRON
+01E4; C; 01E5; # LATIN CAPITAL LETTER G WITH STROKE
+01E6; C; 01E7; # LATIN CAPITAL LETTER G WITH CARON
+01E8; C; 01E9; # LATIN CAPITAL LETTER K WITH CARON
+01EA; C; 01EB; # LATIN CAPITAL LETTER O WITH OGONEK
+01EC; C; 01ED; # LATIN CAPITAL LETTER O WITH OGONEK AND MACRON
+01EE; C; 01EF; # LATIN CAPITAL LETTER EZH WITH CARON
+01F0; F; 006A 030C; # LATIN SMALL LETTER J WITH CARON
+01F1; C; 01F3; # LATIN CAPITAL LETTER DZ
+01F2; C; 01F3; # LATIN CAPITAL LETTER D WITH SMALL LETTER Z
+01F4; C; 01F5; # LATIN CAPITAL LETTER G WITH ACUTE
+01F6; C; 0195; # LATIN CAPITAL LETTER HWAIR
+01F7; C; 01BF; # LATIN CAPITAL LETTER WYNN
+01F8; C; 01F9; # LATIN CAPITAL LETTER N WITH GRAVE
+01FA; C; 01FB; # LATIN CAPITAL LETTER A WITH RING ABOVE AND ACUTE
+01FC; C; 01FD; # LATIN CAPITAL LETTER AE WITH ACUTE
+01FE; C; 01FF; # LATIN CAPITAL LETTER O WITH STROKE AND ACUTE
+0200; C; 0201; # LATIN CAPITAL LETTER A WITH DOUBLE GRAVE
+0202; C; 0203; # LATIN CAPITAL LETTER A WITH INVERTED BREVE
+0204; C; 0205; # LATIN CAPITAL LETTER E WITH DOUBLE GRAVE
+0206; C; 0207; # LATIN CAPITAL LETTER E WITH INVERTED BREVE
+0208; C; 0209; # LATIN CAPITAL LETTER I WITH DOUBLE GRAVE
+020A; C; 020B; # LATIN CAPITAL LETTER I WITH INVERTED BREVE
+020C; C; 020D; # LATIN CAPITAL LETTER O WITH DOUBLE GRAVE
+020E; C; 020F; # LATIN CAPITAL LETTER O WITH INVERTED BREVE
+0210; C; 0211; # LATIN CAPITAL LETTER R WITH DOUBLE GRAVE
+0212; C; 0213; # LATIN CAPITAL LETTER R WITH INVERTED BREVE
+0214; C; 0215; # LATIN CAPITAL LETTER U WITH DOUBLE GRAVE
+0216; C; 0217; # LATIN CAPITAL LETTER U WITH INVERTED BREVE
+0218; C; 0219; # LATIN CAPITAL LETTER S WITH COMMA BELOW
+021A; C; 021B; # LATIN CAPITAL LETTER T WITH COMMA BELOW
+021C; C; 021D; # LATIN CAPITAL LETTER YOGH
+021E; C; 021F; # LATIN CAPITAL LETTER H WITH CARON
+0220; C; 019E; # LATIN CAPITAL LETTER N WITH LONG RIGHT LEG
+0222; C; 0223; # LATIN CAPITAL LETTER OU
+0224; C; 0225; # LATIN CAPITAL LETTER Z WITH HOOK
+0226; C; 0227; # LATIN CAPITAL LETTER A WITH DOT ABOVE
+0228; C; 0229; # LATIN CAPITAL LETTER E WITH CEDILLA
+022A; C; 022B; # LATIN CAPITAL LETTER O WITH DIAERESIS AND MACRON
+022C; C; 022D; # LATIN CAPITAL LETTER O WITH TILDE AND MACRON
+022E; C; 022F; # LATIN CAPITAL LETTER O WITH DOT ABOVE
+0230; C; 0231; # LATIN CAPITAL LETTER O WITH DOT ABOVE AND MACRON
+0232; C; 0233; # LATIN CAPITAL LETTER Y WITH MACRON
+023A; C; 2C65; # LATIN CAPITAL LETTER A WITH STROKE
+023B; C; 023C; # LATIN CAPITAL LETTER C WITH STROKE
+023D; C; 019A; # LATIN CAPITAL LETTER L WITH BAR
+023E; C; 2C66; # LATIN CAPITAL LETTER T WITH DIAGONAL STROKE
+0241; C; 0242; # LATIN CAPITAL LETTER GLOTTAL STOP
+0243; C; 0180; # LATIN CAPITAL LETTER B WITH STROKE
+0244; C; 0289; # LATIN CAPITAL LETTER U BAR
+0245; C; 028C; # LATIN CAPITAL LETTER TURNED V
+0246; C; 0247; # LATIN CAPITAL LETTER E WITH STROKE
+0248; C; 0249; # LATIN CAPITAL LETTER J WITH STROKE
+024A; C; 024B; # LATIN CAPITAL LETTER SMALL Q WITH HOOK TAIL
+024C; C; 024D; # LATIN CAPITAL LETTER R WITH STROKE
+024E; C; 024F; # LATIN CAPITAL LETTER Y WITH STROKE
+0345; C; 03B9; # COMBINING GREEK YPOGEGRAMMENI
+0370; C; 0371; # GREEK CAPITAL LETTER HETA
+0372; C; 0373; # GREEK CAPITAL LETTER ARCHAIC SAMPI
+0376; C; 0377; # GREEK CAPITAL LETTER PAMPHYLIAN DIGAMMA
+037F; C; 03F3; # GREEK CAPITAL LETTER YOT
+0386; C; 03AC; # GREEK CAPITAL LETTER ALPHA WITH TONOS
+0388; C; 03AD; # GREEK CAPITAL LETTER EPSILON WITH TONOS
+0389; C; 03AE; # GREEK CAPITAL LETTER ETA WITH TONOS
+038A; C; 03AF; # GREEK CAPITAL LETTER IOTA WITH TONOS
+038C; C; 03CC; # GREEK CAPITAL LETTER OMICRON WITH TONOS
+038E; C; 03CD; # GREEK CAPITAL LETTER UPSILON WITH TONOS
+038F; C; 03CE; # GREEK CAPITAL LETTER OMEGA WITH TONOS
+0390; F; 03B9 0308 0301; # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
+0391; C; 03B1; # GREEK CAPITAL LETTER ALPHA
+0392; C; 03B2; # GREEK CAPITAL LETTER BETA
+0393; C; 03B3; # GREEK CAPITAL LETTER GAMMA
+0394; C; 03B4; # GREEK CAPITAL LETTER DELTA
+0395; C; 03B5; # GREEK CAPITAL LETTER EPSILON
+0396; C; 03B6; # GREEK CAPITAL LETTER ZETA
+0397; C; 03B7; # GREEK CAPITAL LETTER ETA
+0398; C; 03B8; # GREEK CAPITAL LETTER THETA
+0399; C; 03B9; # GREEK CAPITAL LETTER IOTA
+039A; C; 03BA; # GREEK CAPITAL LETTER KAPPA
+039B; C; 03BB; # GREEK CAPITAL LETTER LAMDA
+039C; C; 03BC; # GREEK CAPITAL LETTER MU
+039D; C; 03BD; # GREEK CAPITAL LETTER NU
+039E; C; 03BE; # GREEK CAPITAL LETTER XI
+039F; C; 03BF; # GREEK CAPITAL LETTER OMICRON
+03A0; C; 03C0; # GREEK CAPITAL LETTER PI
+03A1; C; 03C1; # GREEK CAPITAL LETTER RHO
+03A3; C; 03C3; # GREEK CAPITAL LETTER SIGMA
+03A4; C; 03C4; # GREEK CAPITAL LETTER TAU
+03A5; C; 03C5; # GREEK CAPITAL LETTER UPSILON
+03A6; C; 03C6; # GREEK CAPITAL LETTER PHI
+03A7; C; 03C7; # GREEK CAPITAL LETTER CHI
+03A8; C; 03C8; # GREEK CAPITAL LETTER PSI
+03A9; C; 03C9; # GREEK CAPITAL LETTER OMEGA
+03AA; C; 03CA; # GREEK CAPITAL LETTER IOTA WITH DIALYTIKA
+03AB; C; 03CB; # GREEK CAPITAL LETTER UPSILON WITH DIALYTIKA
+03B0; F; 03C5 0308 0301; # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS
+03C2; C; 03C3; # GREEK SMALL LETTER FINAL SIGMA
+03CF; C; 03D7; # GREEK CAPITAL KAI SYMBOL
+03D0; C; 03B2; # GREEK BETA SYMBOL
+03D1; C; 03B8; # GREEK THETA SYMBOL
+03D5; C; 03C6; # GREEK PHI SYMBOL
+03D6; C; 03C0; # GREEK PI SYMBOL
+03D8; C; 03D9; # GREEK LETTER ARCHAIC KOPPA
+03DA; C; 03DB; # GREEK LETTER STIGMA
+03DC; C; 03DD; # GREEK LETTER DIGAMMA
+03DE; C; 03DF; # GREEK LETTER KOPPA
+03E0; C; 03E1; # GREEK LETTER SAMPI
+03E2; C; 03E3; # COPTIC CAPITAL LETTER SHEI
+03E4; C; 03E5; # COPTIC CAPITAL LETTER FEI
+03E6; C; 03E7; # COPTIC CAPITAL LETTER KHEI
+03E8; C; 03E9; # COPTIC CAPITAL LETTER HORI
+03EA; C; 03EB; # COPTIC CAPITAL LETTER GANGIA
+03EC; C; 03ED; # COPTIC CAPITAL LETTER SHIMA
+03EE; C; 03EF; # COPTIC CAPITAL LETTER DEI
+03F0; C; 03BA; # GREEK KAPPA SYMBOL
+03F1; C; 03C1; # GREEK RHO SYMBOL
+03F4; C; 03B8; # GREEK CAPITAL THETA SYMBOL
+03F5; C; 03B5; # GREEK LUNATE EPSILON SYMBOL
+03F7; C; 03F8; # GREEK CAPITAL LETTER SHO
+03F9; C; 03F2; # GREEK CAPITAL LUNATE SIGMA SYMBOL
+03FA; C; 03FB; # GREEK CAPITAL LETTER SAN
+03FD; C; 037B; # GREEK CAPITAL REVERSED LUNATE SIGMA SYMBOL
+03FE; C; 037C; # GREEK CAPITAL DOTTED LUNATE SIGMA SYMBOL
+03FF; C; 037D; # GREEK CAPITAL REVERSED DOTTED LUNATE SIGMA SYMBOL
+0400; C; 0450; # CYRILLIC CAPITAL LETTER IE WITH GRAVE
+0401; C; 0451; # CYRILLIC CAPITAL LETTER IO
+0402; C; 0452; # CYRILLIC CAPITAL LETTER DJE
+0403; C; 0453; # CYRILLIC CAPITAL LETTER GJE
+0404; C; 0454; # CYRILLIC CAPITAL LETTER UKRAINIAN IE
+0405; C; 0455; # CYRILLIC CAPITAL LETTER DZE
+0406; C; 0456; # CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+0407; C; 0457; # CYRILLIC CAPITAL LETTER YI
+0408; C; 0458; # CYRILLIC CAPITAL LETTER JE
+0409; C; 0459; # CYRILLIC CAPITAL LETTER LJE
+040A; C; 045A; # CYRILLIC CAPITAL LETTER NJE
+040B; C; 045B; # CYRILLIC CAPITAL LETTER TSHE
+040C; C; 045C; # CYRILLIC CAPITAL LETTER KJE
+040D; C; 045D; # CYRILLIC CAPITAL LETTER I WITH GRAVE
+040E; C; 045E; # CYRILLIC CAPITAL LETTER SHORT U
+040F; C; 045F; # CYRILLIC CAPITAL LETTER DZHE
+0410; C; 0430; # CYRILLIC CAPITAL LETTER A
+0411; C; 0431; # CYRILLIC CAPITAL LETTER BE
+0412; C; 0432; # CYRILLIC CAPITAL LETTER VE
+0413; C; 0433; # CYRILLIC CAPITAL LETTER GHE
+0414; C; 0434; # CYRILLIC CAPITAL LETTER DE
+0415; C; 0435; # CYRILLIC CAPITAL LETTER IE
+0416; C; 0436; # CYRILLIC CAPITAL LETTER ZHE
+0417; C; 0437; # CYRILLIC CAPITAL LETTER ZE
+0418; C; 0438; # CYRILLIC CAPITAL LETTER I
+0419; C; 0439; # CYRILLIC CAPITAL LETTER SHORT I
+041A; C; 043A; # CYRILLIC CAPITAL LETTER KA
+041B; C; 043B; # CYRILLIC CAPITAL LETTER EL
+041C; C; 043C; # CYRILLIC CAPITAL LETTER EM
+041D; C; 043D; # CYRILLIC CAPITAL LETTER EN
+041E; C; 043E; # CYRILLIC CAPITAL LETTER O
+041F; C; 043F; # CYRILLIC CAPITAL LETTER PE
+0420; C; 0440; # CYRILLIC CAPITAL LETTER ER
+0421; C; 0441; # CYRILLIC CAPITAL LETTER ES
+0422; C; 0442; # CYRILLIC CAPITAL LETTER TE
+0423; C; 0443; # CYRILLIC CAPITAL LETTER U
+0424; C; 0444; # CYRILLIC CAPITAL LETTER EF
+0425; C; 0445; # CYRILLIC CAPITAL LETTER HA
+0426; C; 0446; # CYRILLIC CAPITAL LETTER TSE
+0427; C; 0447; # CYRILLIC CAPITAL LETTER CHE
+0428; C; 0448; # CYRILLIC CAPITAL LETTER SHA
+0429; C; 0449; # CYRILLIC CAPITAL LETTER SHCHA
+042A; C; 044A; # CYRILLIC CAPITAL LETTER HARD SIGN
+042B; C; 044B; # CYRILLIC CAPITAL LETTER YERU
+042C; C; 044C; # CYRILLIC CAPITAL LETTER SOFT SIGN
+042D; C; 044D; # CYRILLIC CAPITAL LETTER E
+042E; C; 044E; # CYRILLIC CAPITAL LETTER YU
+042F; C; 044F; # CYRILLIC CAPITAL LETTER YA
+0460; C; 0461; # CYRILLIC CAPITAL LETTER OMEGA
+0462; C; 0463; # CYRILLIC CAPITAL LETTER YAT
+0464; C; 0465; # CYRILLIC CAPITAL LETTER IOTIFIED E
+0466; C; 0467; # CYRILLIC CAPITAL LETTER LITTLE YUS
+0468; C; 0469; # CYRILLIC CAPITAL LETTER IOTIFIED LITTLE YUS
+046A; C; 046B; # CYRILLIC CAPITAL LETTER BIG YUS
+046C; C; 046D; # CYRILLIC CAPITAL LETTER IOTIFIED BIG YUS
+046E; C; 046F; # CYRILLIC CAPITAL LETTER KSI
+0470; C; 0471; # CYRILLIC CAPITAL LETTER PSI
+0472; C; 0473; # CYRILLIC CAPITAL LETTER FITA
+0474; C; 0475; # CYRILLIC CAPITAL LETTER IZHITSA
+0476; C; 0477; # CYRILLIC CAPITAL LETTER IZHITSA WITH DOUBLE GRAVE ACCENT
+0478; C; 0479; # CYRILLIC CAPITAL LETTER UK
+047A; C; 047B; # CYRILLIC CAPITAL LETTER ROUND OMEGA
+047C; C; 047D; # CYRILLIC CAPITAL LETTER OMEGA WITH TITLO
+047E; C; 047F; # CYRILLIC CAPITAL LETTER OT
+0480; C; 0481; # CYRILLIC CAPITAL LETTER KOPPA
+048A; C; 048B; # CYRILLIC CAPITAL LETTER SHORT I WITH TAIL
+048C; C; 048D; # CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+048E; C; 048F; # CYRILLIC CAPITAL LETTER ER WITH TICK
+0490; C; 0491; # CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+0492; C; 0493; # CYRILLIC CAPITAL LETTER GHE WITH STROKE
+0494; C; 0495; # CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+0496; C; 0497; # CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+0498; C; 0499; # CYRILLIC CAPITAL LETTER ZE WITH DESCENDER
+049A; C; 049B; # CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+049C; C; 049D; # CYRILLIC CAPITAL LETTER KA WITH VERTICAL STROKE
+049E; C; 049F; # CYRILLIC CAPITAL LETTER KA WITH STROKE
+04A0; C; 04A1; # CYRILLIC CAPITAL LETTER BASHKIR KA
+04A2; C; 04A3; # CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+04A4; C; 04A5; # CYRILLIC CAPITAL LIGATURE EN GHE
+04A6; C; 04A7; # CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+04A8; C; 04A9; # CYRILLIC CAPITAL LETTER ABKHASIAN HA
+04AA; C; 04AB; # CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+04AC; C; 04AD; # CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+04AE; C; 04AF; # CYRILLIC CAPITAL LETTER STRAIGHT U
+04B0; C; 04B1; # CYRILLIC CAPITAL LETTER STRAIGHT U WITH STROKE
+04B2; C; 04B3; # CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+04B4; C; 04B5; # CYRILLIC CAPITAL LIGATURE TE TSE
+04B6; C; 04B7; # CYRILLIC CAPITAL LETTER CHE WITH DESCENDER
+04B8; C; 04B9; # CYRILLIC CAPITAL LETTER CHE WITH VERTICAL STROKE
+04BA; C; 04BB; # CYRILLIC CAPITAL LETTER SHHA
+04BC; C; 04BD; # CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+04BE; C; 04BF; # CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+04C0; C; 04CF; # CYRILLIC LETTER PALOCHKA
+04C1; C; 04C2; # CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+04C3; C; 04C4; # CYRILLIC CAPITAL LETTER KA WITH HOOK
+04C5; C; 04C6; # CYRILLIC CAPITAL LETTER EL WITH TAIL
+04C7; C; 04C8; # CYRILLIC CAPITAL LETTER EN WITH HOOK
+04C9; C; 04CA; # CYRILLIC CAPITAL LETTER EN WITH TAIL
+04CB; C; 04CC; # CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+04CD; C; 04CE; # CYRILLIC CAPITAL LETTER EM WITH TAIL
+04D0; C; 04D1; # CYRILLIC CAPITAL LETTER A WITH BREVE
+04D2; C; 04D3; # CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+04D4; C; 04D5; # CYRILLIC CAPITAL LIGATURE A IE
+04D6; C; 04D7; # CYRILLIC CAPITAL LETTER IE WITH BREVE
+04D8; C; 04D9; # CYRILLIC CAPITAL LETTER SCHWA
+04DA; C; 04DB; # CYRILLIC CAPITAL LETTER SCHWA WITH DIAERESIS
+04DC; C; 04DD; # CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+04DE; C; 04DF; # CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+04E0; C; 04E1; # CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+04E2; C; 04E3; # CYRILLIC CAPITAL LETTER I WITH MACRON
+04E4; C; 04E5; # CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+04E6; C; 04E7; # CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+04E8; C; 04E9; # CYRILLIC CAPITAL LETTER BARRED O
+04EA; C; 04EB; # CYRILLIC CAPITAL LETTER BARRED O WITH DIAERESIS
+04EC; C; 04ED; # CYRILLIC CAPITAL LETTER E WITH DIAERESIS
+04EE; C; 04EF; # CYRILLIC CAPITAL LETTER U WITH MACRON
+04F0; C; 04F1; # CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+04F2; C; 04F3; # CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+04F4; C; 04F5; # CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+04F6; C; 04F7; # CYRILLIC CAPITAL LETTER GHE WITH DESCENDER
+04F8; C; 04F9; # CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+04FA; C; 04FB; # CYRILLIC CAPITAL LETTER GHE WITH STROKE AND HOOK
+04FC; C; 04FD; # CYRILLIC CAPITAL LETTER HA WITH HOOK
+04FE; C; 04FF; # CYRILLIC CAPITAL LETTER HA WITH STROKE
+0500; C; 0501; # CYRILLIC CAPITAL LETTER KOMI DE
+0502; C; 0503; # CYRILLIC CAPITAL LETTER KOMI DJE
+0504; C; 0505; # CYRILLIC CAPITAL LETTER KOMI ZJE
+0506; C; 0507; # CYRILLIC CAPITAL LETTER KOMI DZJE
+0508; C; 0509; # CYRILLIC CAPITAL LETTER KOMI LJE
+050A; C; 050B; # CYRILLIC CAPITAL LETTER KOMI NJE
+050C; C; 050D; # CYRILLIC CAPITAL LETTER KOMI SJE
+050E; C; 050F; # CYRILLIC CAPITAL LETTER KOMI TJE
+0510; C; 0511; # CYRILLIC CAPITAL LETTER REVERSED ZE
+0512; C; 0513; # CYRILLIC CAPITAL LETTER EL WITH HOOK
+0514; C; 0515; # CYRILLIC CAPITAL LETTER LHA
+0516; C; 0517; # CYRILLIC CAPITAL LETTER RHA
+0518; C; 0519; # CYRILLIC CAPITAL LETTER YAE
+051A; C; 051B; # CYRILLIC CAPITAL LETTER QA
+051C; C; 051D; # CYRILLIC CAPITAL LETTER WE
+051E; C; 051F; # CYRILLIC CAPITAL LETTER ALEUT KA
+0520; C; 0521; # CYRILLIC CAPITAL LETTER EL WITH MIDDLE HOOK
+0522; C; 0523; # CYRILLIC CAPITAL LETTER EN WITH MIDDLE HOOK
+0524; C; 0525; # CYRILLIC CAPITAL LETTER PE WITH DESCENDER
+0526; C; 0527; # CYRILLIC CAPITAL LETTER SHHA WITH DESCENDER
+0528; C; 0529; # CYRILLIC CAPITAL LETTER EN WITH LEFT HOOK
+052A; C; 052B; # CYRILLIC CAPITAL LETTER DZZHE
+052C; C; 052D; # CYRILLIC CAPITAL LETTER DCHE
+052E; C; 052F; # CYRILLIC CAPITAL LETTER EL WITH DESCENDER
+0531; C; 0561; # ARMENIAN CAPITAL LETTER AYB
+0532; C; 0562; # ARMENIAN CAPITAL LETTER BEN
+0533; C; 0563; # ARMENIAN CAPITAL LETTER GIM
+0534; C; 0564; # ARMENIAN CAPITAL LETTER DA
+0535; C; 0565; # ARMENIAN CAPITAL LETTER ECH
+0536; C; 0566; # ARMENIAN CAPITAL LETTER ZA
+0537; C; 0567; # ARMENIAN CAPITAL LETTER EH
+0538; C; 0568; # ARMENIAN CAPITAL LETTER ET
+0539; C; 0569; # ARMENIAN CAPITAL LETTER TO
+053A; C; 056A; # ARMENIAN CAPITAL LETTER ZHE
+053B; C; 056B; # ARMENIAN CAPITAL LETTER INI
+053C; C; 056C; # ARMENIAN CAPITAL LETTER LIWN
+053D; C; 056D; # ARMENIAN CAPITAL LETTER XEH
+053E; C; 056E; # ARMENIAN CAPITAL LETTER CA
+053F; C; 056F; # ARMENIAN CAPITAL LETTER KEN
+0540; C; 0570; # ARMENIAN CAPITAL LETTER HO
+0541; C; 0571; # ARMENIAN CAPITAL LETTER JA
+0542; C; 0572; # ARMENIAN CAPITAL LETTER GHAD
+0543; C; 0573; # ARMENIAN CAPITAL LETTER CHEH
+0544; C; 0574; # ARMENIAN CAPITAL LETTER MEN
+0545; C; 0575; # ARMENIAN CAPITAL LETTER YI
+0546; C; 0576; # ARMENIAN CAPITAL LETTER NOW
+0547; C; 0577; # ARMENIAN CAPITAL LETTER SHA
+0548; C; 0578; # ARMENIAN CAPITAL LETTER VO
+0549; C; 0579; # ARMENIAN CAPITAL LETTER CHA
+054A; C; 057A; # ARMENIAN CAPITAL LETTER PEH
+054B; C; 057B; # ARMENIAN CAPITAL LETTER JHEH
+054C; C; 057C; # ARMENIAN CAPITAL LETTER RA
+054D; C; 057D; # ARMENIAN CAPITAL LETTER SEH
+054E; C; 057E; # ARMENIAN CAPITAL LETTER VEW
+054F; C; 057F; # ARMENIAN CAPITAL LETTER TIWN
+0550; C; 0580; # ARMENIAN CAPITAL LETTER REH
+0551; C; 0581; # ARMENIAN CAPITAL LETTER CO
+0552; C; 0582; # ARMENIAN CAPITAL LETTER YIWN
+0553; C; 0583; # ARMENIAN CAPITAL LETTER PIWR
+0554; C; 0584; # ARMENIAN CAPITAL LETTER KEH
+0555; C; 0585; # ARMENIAN CAPITAL LETTER OH
+0556; C; 0586; # ARMENIAN CAPITAL LETTER FEH
+0587; F; 0565 0582; # ARMENIAN SMALL LIGATURE ECH YIWN
+10A0; C; 2D00; # GEORGIAN CAPITAL LETTER AN
+10A1; C; 2D01; # GEORGIAN CAPITAL LETTER BAN
+10A2; C; 2D02; # GEORGIAN CAPITAL LETTER GAN
+10A3; C; 2D03; # GEORGIAN CAPITAL LETTER DON
+10A4; C; 2D04; # GEORGIAN CAPITAL LETTER EN
+10A5; C; 2D05; # GEORGIAN CAPITAL LETTER VIN
+10A6; C; 2D06; # GEORGIAN CAPITAL LETTER ZEN
+10A7; C; 2D07; # GEORGIAN CAPITAL LETTER TAN
+10A8; C; 2D08; # GEORGIAN CAPITAL LETTER IN
+10A9; C; 2D09; # GEORGIAN CAPITAL LETTER KAN
+10AA; C; 2D0A; # GEORGIAN CAPITAL LETTER LAS
+10AB; C; 2D0B; # GEORGIAN CAPITAL LETTER MAN
+10AC; C; 2D0C; # GEORGIAN CAPITAL LETTER NAR
+10AD; C; 2D0D; # GEORGIAN CAPITAL LETTER ON
+10AE; C; 2D0E; # GEORGIAN CAPITAL LETTER PAR
+10AF; C; 2D0F; # GEORGIAN CAPITAL LETTER ZHAR
+10B0; C; 2D10; # GEORGIAN CAPITAL LETTER RAE
+10B1; C; 2D11; # GEORGIAN CAPITAL LETTER SAN
+10B2; C; 2D12; # GEORGIAN CAPITAL LETTER TAR
+10B3; C; 2D13; # GEORGIAN CAPITAL LETTER UN
+10B4; C; 2D14; # GEORGIAN CAPITAL LETTER PHAR
+10B5; C; 2D15; # GEORGIAN CAPITAL LETTER KHAR
+10B6; C; 2D16; # GEORGIAN CAPITAL LETTER GHAN
+10B7; C; 2D17; # GEORGIAN CAPITAL LETTER QAR
+10B8; C; 2D18; # GEORGIAN CAPITAL LETTER SHIN
+10B9; C; 2D19; # GEORGIAN CAPITAL LETTER CHIN
+10BA; C; 2D1A; # GEORGIAN CAPITAL LETTER CAN
+10BB; C; 2D1B; # GEORGIAN CAPITAL LETTER JIL
+10BC; C; 2D1C; # GEORGIAN CAPITAL LETTER CIL
+10BD; C; 2D1D; # GEORGIAN CAPITAL LETTER CHAR
+10BE; C; 2D1E; # GEORGIAN CAPITAL LETTER XAN
+10BF; C; 2D1F; # GEORGIAN CAPITAL LETTER JHAN
+10C0; C; 2D20; # GEORGIAN CAPITAL LETTER HAE
+10C1; C; 2D21; # GEORGIAN CAPITAL LETTER HE
+10C2; C; 2D22; # GEORGIAN CAPITAL LETTER HIE
+10C3; C; 2D23; # GEORGIAN CAPITAL LETTER WE
+10C4; C; 2D24; # GEORGIAN CAPITAL LETTER HAR
+10C5; C; 2D25; # GEORGIAN CAPITAL LETTER HOE
+10C7; C; 2D27; # GEORGIAN CAPITAL LETTER YN
+10CD; C; 2D2D; # GEORGIAN CAPITAL LETTER AEN
+13F8; C; 13F0; # CHEROKEE SMALL LETTER YE
+13F9; C; 13F1; # CHEROKEE SMALL LETTER YI
+13FA; C; 13F2; # CHEROKEE SMALL LETTER YO
+13FB; C; 13F3; # CHEROKEE SMALL LETTER YU
+13FC; C; 13F4; # CHEROKEE SMALL LETTER YV
+13FD; C; 13F5; # CHEROKEE SMALL LETTER MV
+1C80; C; 0432; # CYRILLIC SMALL LETTER ROUNDED VE
+1C81; C; 0434; # CYRILLIC SMALL LETTER LONG-LEGGED DE
+1C82; C; 043E; # CYRILLIC SMALL LETTER NARROW O
+1C83; C; 0441; # CYRILLIC SMALL LETTER WIDE ES
+1C84; C; 0442; # CYRILLIC SMALL LETTER TALL TE
+1C85; C; 0442; # CYRILLIC SMALL LETTER THREE-LEGGED TE
+1C86; C; 044A; # CYRILLIC SMALL LETTER TALL HARD SIGN
+1C87; C; 0463; # CYRILLIC SMALL LETTER TALL YAT
+1C88; C; A64B; # CYRILLIC SMALL LETTER UNBLENDED UK
+1C90; C; 10D0; # GEORGIAN MTAVRULI CAPITAL LETTER AN
+1C91; C; 10D1; # GEORGIAN MTAVRULI CAPITAL LETTER BAN
+1C92; C; 10D2; # GEORGIAN MTAVRULI CAPITAL LETTER GAN
+1C93; C; 10D3; # GEORGIAN MTAVRULI CAPITAL LETTER DON
+1C94; C; 10D4; # GEORGIAN MTAVRULI CAPITAL LETTER EN
+1C95; C; 10D5; # GEORGIAN MTAVRULI CAPITAL LETTER VIN
+1C96; C; 10D6; # GEORGIAN MTAVRULI CAPITAL LETTER ZEN
+1C97; C; 10D7; # GEORGIAN MTAVRULI CAPITAL LETTER TAN
+1C98; C; 10D8; # GEORGIAN MTAVRULI CAPITAL LETTER IN
+1C99; C; 10D9; # GEORGIAN MTAVRULI CAPITAL LETTER KAN
+1C9A; C; 10DA; # GEORGIAN MTAVRULI CAPITAL LETTER LAS
+1C9B; C; 10DB; # GEORGIAN MTAVRULI CAPITAL LETTER MAN
+1C9C; C; 10DC; # GEORGIAN MTAVRULI CAPITAL LETTER NAR
+1C9D; C; 10DD; # GEORGIAN MTAVRULI CAPITAL LETTER ON
+1C9E; C; 10DE; # GEORGIAN MTAVRULI CAPITAL LETTER PAR
+1C9F; C; 10DF; # GEORGIAN MTAVRULI CAPITAL LETTER ZHAR
+1CA0; C; 10E0; # GEORGIAN MTAVRULI CAPITAL LETTER RAE
+1CA1; C; 10E1; # GEORGIAN MTAVRULI CAPITAL LETTER SAN
+1CA2; C; 10E2; # GEORGIAN MTAVRULI CAPITAL LETTER TAR
+1CA3; C; 10E3; # GEORGIAN MTAVRULI CAPITAL LETTER UN
+1CA4; C; 10E4; # GEORGIAN MTAVRULI CAPITAL LETTER PHAR
+1CA5; C; 10E5; # GEORGIAN MTAVRULI CAPITAL LETTER KHAR
+1CA6; C; 10E6; # GEORGIAN MTAVRULI CAPITAL LETTER GHAN
+1CA7; C; 10E7; # GEORGIAN MTAVRULI CAPITAL LETTER QAR
+1CA8; C; 10E8; # GEORGIAN MTAVRULI CAPITAL LETTER SHIN
+1CA9; C; 10E9; # GEORGIAN MTAVRULI CAPITAL LETTER CHIN
+1CAA; C; 10EA; # GEORGIAN MTAVRULI CAPITAL LETTER CAN
+1CAB; C; 10EB; # GEORGIAN MTAVRULI CAPITAL LETTER JIL
+1CAC; C; 10EC; # GEORGIAN MTAVRULI CAPITAL LETTER CIL
+1CAD; C; 10ED; # GEORGIAN MTAVRULI CAPITAL LETTER CHAR
+1CAE; C; 10EE; # GEORGIAN MTAVRULI CAPITAL LETTER XAN
+1CAF; C; 10EF; # GEORGIAN MTAVRULI CAPITAL LETTER JHAN
+1CB0; C; 10F0; # GEORGIAN MTAVRULI CAPITAL LETTER HAE
+1CB1; C; 10F1; # GEORGIAN MTAVRULI CAPITAL LETTER HE
+1CB2; C; 10F2; # GEORGIAN MTAVRULI CAPITAL LETTER HIE
+1CB3; C; 10F3; # GEORGIAN MTAVRULI CAPITAL LETTER WE
+1CB4; C; 10F4; # GEORGIAN MTAVRULI CAPITAL LETTER HAR
+1CB5; C; 10F5; # GEORGIAN MTAVRULI CAPITAL LETTER HOE
+1CB6; C; 10F6; # GEORGIAN MTAVRULI CAPITAL LETTER FI
+1CB7; C; 10F7; # GEORGIAN MTAVRULI CAPITAL LETTER YN
+1CB8; C; 10F8; # GEORGIAN MTAVRULI CAPITAL LETTER ELIFI
+1CB9; C; 10F9; # GEORGIAN MTAVRULI CAPITAL LETTER TURNED GAN
+1CBA; C; 10FA; # GEORGIAN MTAVRULI CAPITAL LETTER AIN
+1CBD; C; 10FD; # GEORGIAN MTAVRULI CAPITAL LETTER AEN
+1CBE; C; 10FE; # GEORGIAN MTAVRULI CAPITAL LETTER HARD SIGN
+1CBF; C; 10FF; # GEORGIAN MTAVRULI CAPITAL LETTER LABIAL SIGN
+1E00; C; 1E01; # LATIN CAPITAL LETTER A WITH RING BELOW
+1E02; C; 1E03; # LATIN CAPITAL LETTER B WITH DOT ABOVE
+1E04; C; 1E05; # LATIN CAPITAL LETTER B WITH DOT BELOW
+1E06; C; 1E07; # LATIN CAPITAL LETTER B WITH LINE BELOW
+1E08; C; 1E09; # LATIN CAPITAL LETTER C WITH CEDILLA AND ACUTE
+1E0A; C; 1E0B; # LATIN CAPITAL LETTER D WITH DOT ABOVE
+1E0C; C; 1E0D; # LATIN CAPITAL LETTER D WITH DOT BELOW
+1E0E; C; 1E0F; # LATIN CAPITAL LETTER D WITH LINE BELOW
+1E10; C; 1E11; # LATIN CAPITAL LETTER D WITH CEDILLA
+1E12; C; 1E13; # LATIN CAPITAL LETTER D WITH CIRCUMFLEX BELOW
+1E14; C; 1E15; # LATIN CAPITAL LETTER E WITH MACRON AND GRAVE
+1E16; C; 1E17; # LATIN CAPITAL LETTER E WITH MACRON AND ACUTE
+1E18; C; 1E19; # LATIN CAPITAL LETTER E WITH CIRCUMFLEX BELOW
+1E1A; C; 1E1B; # LATIN CAPITAL LETTER E WITH TILDE BELOW
+1E1C; C; 1E1D; # LATIN CAPITAL LETTER E WITH CEDILLA AND BREVE
+1E1E; C; 1E1F; # LATIN CAPITAL LETTER F WITH DOT ABOVE
+1E20; C; 1E21; # LATIN CAPITAL LETTER G WITH MACRON
+1E22; C; 1E23; # LATIN CAPITAL LETTER H WITH DOT ABOVE
+1E24; C; 1E25; # LATIN CAPITAL LETTER H WITH DOT BELOW
+1E26; C; 1E27; # LATIN CAPITAL LETTER H WITH DIAERESIS
+1E28; C; 1E29; # LATIN CAPITAL LETTER H WITH CEDILLA
+1E2A; C; 1E2B; # LATIN CAPITAL LETTER H WITH BREVE BELOW
+1E2C; C; 1E2D; # LATIN CAPITAL LETTER I WITH TILDE BELOW
+1E2E; C; 1E2F; # LATIN CAPITAL LETTER I WITH DIAERESIS AND ACUTE
+1E30; C; 1E31; # LATIN CAPITAL LETTER K WITH ACUTE
+1E32; C; 1E33; # LATIN CAPITAL LETTER K WITH DOT BELOW
+1E34; C; 1E35; # LATIN CAPITAL LETTER K WITH LINE BELOW
+1E36; C; 1E37; # LATIN CAPITAL LETTER L WITH DOT BELOW
+1E38; C; 1E39; # LATIN CAPITAL LETTER L WITH DOT BELOW AND MACRON
+1E3A; C; 1E3B; # LATIN CAPITAL LETTER L WITH LINE BELOW
+1E3C; C; 1E3D; # LATIN CAPITAL LETTER L WITH CIRCUMFLEX BELOW
+1E3E; C; 1E3F; # LATIN CAPITAL LETTER M WITH ACUTE
+1E40; C; 1E41; # LATIN CAPITAL LETTER M WITH DOT ABOVE
+1E42; C; 1E43; # LATIN CAPITAL LETTER M WITH DOT BELOW
+1E44; C; 1E45; # LATIN CAPITAL LETTER N WITH DOT ABOVE
+1E46; C; 1E47; # LATIN CAPITAL LETTER N WITH DOT BELOW
+1E48; C; 1E49; # LATIN CAPITAL LETTER N WITH LINE BELOW
+1E4A; C; 1E4B; # LATIN CAPITAL LETTER N WITH CIRCUMFLEX BELOW
+1E4C; C; 1E4D; # LATIN CAPITAL LETTER O WITH TILDE AND ACUTE
+1E4E; C; 1E4F; # LATIN CAPITAL LETTER O WITH TILDE AND DIAERESIS
+1E50; C; 1E51; # LATIN CAPITAL LETTER O WITH MACRON AND GRAVE
+1E52; C; 1E53; # LATIN CAPITAL LETTER O WITH MACRON AND ACUTE
+1E54; C; 1E55; # LATIN CAPITAL LETTER P WITH ACUTE
+1E56; C; 1E57; # LATIN CAPITAL LETTER P WITH DOT ABOVE
+1E58; C; 1E59; # LATIN CAPITAL LETTER R WITH DOT ABOVE
+1E5A; C; 1E5B; # LATIN CAPITAL LETTER R WITH DOT BELOW
+1E5C; C; 1E5D; # LATIN CAPITAL LETTER R WITH DOT BELOW AND MACRON
+1E5E; C; 1E5F; # LATIN CAPITAL LETTER R WITH LINE BELOW
+1E60; C; 1E61; # LATIN CAPITAL LETTER S WITH DOT ABOVE
+1E62; C; 1E63; # LATIN CAPITAL LETTER S WITH DOT BELOW
+1E64; C; 1E65; # LATIN CAPITAL LETTER S WITH ACUTE AND DOT ABOVE
+1E66; C; 1E67; # LATIN CAPITAL LETTER S WITH CARON AND DOT ABOVE
+1E68; C; 1E69; # LATIN CAPITAL LETTER S WITH DOT BELOW AND DOT ABOVE
+1E6A; C; 1E6B; # LATIN CAPITAL LETTER T WITH DOT ABOVE
+1E6C; C; 1E6D; # LATIN CAPITAL LETTER T WITH DOT BELOW
+1E6E; C; 1E6F; # LATIN CAPITAL LETTER T WITH LINE BELOW
+1E70; C; 1E71; # LATIN CAPITAL LETTER T WITH CIRCUMFLEX BELOW
+1E72; C; 1E73; # LATIN CAPITAL LETTER U WITH DIAERESIS BELOW
+1E74; C; 1E75; # LATIN CAPITAL LETTER U WITH TILDE BELOW
+1E76; C; 1E77; # LATIN CAPITAL LETTER U WITH CIRCUMFLEX BELOW
+1E78; C; 1E79; # LATIN CAPITAL LETTER U WITH TILDE AND ACUTE
+1E7A; C; 1E7B; # LATIN CAPITAL LETTER U WITH MACRON AND DIAERESIS
+1E7C; C; 1E7D; # LATIN CAPITAL LETTER V WITH TILDE
+1E7E; C; 1E7F; # LATIN CAPITAL LETTER V WITH DOT BELOW
+1E80; C; 1E81; # LATIN CAPITAL LETTER W WITH GRAVE
+1E82; C; 1E83; # LATIN CAPITAL LETTER W WITH ACUTE
+1E84; C; 1E85; # LATIN CAPITAL LETTER W WITH DIAERESIS
+1E86; C; 1E87; # LATIN CAPITAL LETTER W WITH DOT ABOVE
+1E88; C; 1E89; # LATIN CAPITAL LETTER W WITH DOT BELOW
+1E8A; C; 1E8B; # LATIN CAPITAL LETTER X WITH DOT ABOVE
+1E8C; C; 1E8D; # LATIN CAPITAL LETTER X WITH DIAERESIS
+1E8E; C; 1E8F; # LATIN CAPITAL LETTER Y WITH DOT ABOVE
+1E90; C; 1E91; # LATIN CAPITAL LETTER Z WITH CIRCUMFLEX
+1E92; C; 1E93; # LATIN CAPITAL LETTER Z WITH DOT BELOW
+1E94; C; 1E95; # LATIN CAPITAL LETTER Z WITH LINE BELOW
+1E96; F; 0068 0331; # LATIN SMALL LETTER H WITH LINE BELOW
+1E97; F; 0074 0308; # LATIN SMALL LETTER T WITH DIAERESIS
+1E98; F; 0077 030A; # LATIN SMALL LETTER W WITH RING ABOVE
+1E99; F; 0079 030A; # LATIN SMALL LETTER Y WITH RING ABOVE
+1E9A; F; 0061 02BE; # LATIN SMALL LETTER A WITH RIGHT HALF RING
+1E9B; C; 1E61; # LATIN SMALL LETTER LONG S WITH DOT ABOVE
+1E9E; F; 0073 0073; # LATIN CAPITAL LETTER SHARP S
+1E9E; S; 00DF; # LATIN CAPITAL LETTER SHARP S
+1EA0; C; 1EA1; # LATIN CAPITAL LETTER A WITH DOT BELOW
+1EA2; C; 1EA3; # LATIN CAPITAL LETTER A WITH HOOK ABOVE
+1EA4; C; 1EA5; # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND ACUTE
+1EA6; C; 1EA7; # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND GRAVE
+1EA8; C; 1EA9; # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE
+1EAA; C; 1EAB; # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND TILDE
+1EAC; C; 1EAD; # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND DOT BELOW
+1EAE; C; 1EAF; # LATIN CAPITAL LETTER A WITH BREVE AND ACUTE
+1EB0; C; 1EB1; # LATIN CAPITAL LETTER A WITH BREVE AND GRAVE
+1EB2; C; 1EB3; # LATIN CAPITAL LETTER A WITH BREVE AND HOOK ABOVE
+1EB4; C; 1EB5; # LATIN CAPITAL LETTER A WITH BREVE AND TILDE
+1EB6; C; 1EB7; # LATIN CAPITAL LETTER A WITH BREVE AND DOT BELOW
+1EB8; C; 1EB9; # LATIN CAPITAL LETTER E WITH DOT BELOW
+1EBA; C; 1EBB; # LATIN CAPITAL LETTER E WITH HOOK ABOVE
+1EBC; C; 1EBD; # LATIN CAPITAL LETTER E WITH TILDE
+1EBE; C; 1EBF; # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND ACUTE
+1EC0; C; 1EC1; # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND GRAVE
+1EC2; C; 1EC3; # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE
+1EC4; C; 1EC5; # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND TILDE
+1EC6; C; 1EC7; # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND DOT BELOW
+1EC8; C; 1EC9; # LATIN CAPITAL LETTER I WITH HOOK ABOVE
+1ECA; C; 1ECB; # LATIN CAPITAL LETTER I WITH DOT BELOW
+1ECC; C; 1ECD; # LATIN CAPITAL LETTER O WITH DOT BELOW
+1ECE; C; 1ECF; # LATIN CAPITAL LETTER O WITH HOOK ABOVE
+1ED0; C; 1ED1; # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND ACUTE
+1ED2; C; 1ED3; # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND GRAVE
+1ED4; C; 1ED5; # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE
+1ED6; C; 1ED7; # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND TILDE
+1ED8; C; 1ED9; # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND DOT BELOW
+1EDA; C; 1EDB; # LATIN CAPITAL LETTER O WITH HORN AND ACUTE
+1EDC; C; 1EDD; # LATIN CAPITAL LETTER O WITH HORN AND GRAVE
+1EDE; C; 1EDF; # LATIN CAPITAL LETTER O WITH HORN AND HOOK ABOVE
+1EE0; C; 1EE1; # LATIN CAPITAL LETTER O WITH HORN AND TILDE
+1EE2; C; 1EE3; # LATIN CAPITAL LETTER O WITH HORN AND DOT BELOW
+1EE4; C; 1EE5; # LATIN CAPITAL LETTER U WITH DOT BELOW
+1EE6; C; 1EE7; # LATIN CAPITAL LETTER U WITH HOOK ABOVE
+1EE8; C; 1EE9; # LATIN CAPITAL LETTER U WITH HORN AND ACUTE
+1EEA; C; 1EEB; # LATIN CAPITAL LETTER U WITH HORN AND GRAVE
+1EEC; C; 1EED; # LATIN CAPITAL LETTER U WITH HORN AND HOOK ABOVE
+1EEE; C; 1EEF; # LATIN CAPITAL LETTER U WITH HORN AND TILDE
+1EF0; C; 1EF1; # LATIN CAPITAL LETTER U WITH HORN AND DOT BELOW
+1EF2; C; 1EF3; # LATIN CAPITAL LETTER Y WITH GRAVE
+1EF4; C; 1EF5; # LATIN CAPITAL LETTER Y WITH DOT BELOW
+1EF6; C; 1EF7; # LATIN CAPITAL LETTER Y WITH HOOK ABOVE
+1EF8; C; 1EF9; # LATIN CAPITAL LETTER Y WITH TILDE
+1EFA; C; 1EFB; # LATIN CAPITAL LETTER MIDDLE-WELSH LL
+1EFC; C; 1EFD; # LATIN CAPITAL LETTER MIDDLE-WELSH V
+1EFE; C; 1EFF; # LATIN CAPITAL LETTER Y WITH LOOP
+1F08; C; 1F00; # GREEK CAPITAL LETTER ALPHA WITH PSILI
+1F09; C; 1F01; # GREEK CAPITAL LETTER ALPHA WITH DASIA
+1F0A; C; 1F02; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND VARIA
+1F0B; C; 1F03; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND VARIA
+1F0C; C; 1F04; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND OXIA
+1F0D; C; 1F05; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND OXIA
+1F0E; C; 1F06; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND PERISPOMENI
+1F0F; C; 1F07; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND PERISPOMENI
+1F18; C; 1F10; # GREEK CAPITAL LETTER EPSILON WITH PSILI
+1F19; C; 1F11; # GREEK CAPITAL LETTER EPSILON WITH DASIA
+1F1A; C; 1F12; # GREEK CAPITAL LETTER EPSILON WITH PSILI AND VARIA
+1F1B; C; 1F13; # GREEK CAPITAL LETTER EPSILON WITH DASIA AND VARIA
+1F1C; C; 1F14; # GREEK CAPITAL LETTER EPSILON WITH PSILI AND OXIA
+1F1D; C; 1F15; # GREEK CAPITAL LETTER EPSILON WITH DASIA AND OXIA
+1F28; C; 1F20; # GREEK CAPITAL LETTER ETA WITH PSILI
+1F29; C; 1F21; # GREEK CAPITAL LETTER ETA WITH DASIA
+1F2A; C; 1F22; # GREEK CAPITAL LETTER ETA WITH PSILI AND VARIA
+1F2B; C; 1F23; # GREEK CAPITAL LETTER ETA WITH DASIA AND VARIA
+1F2C; C; 1F24; # GREEK CAPITAL LETTER ETA WITH PSILI AND OXIA
+1F2D; C; 1F25; # GREEK CAPITAL LETTER ETA WITH DASIA AND OXIA
+1F2E; C; 1F26; # GREEK CAPITAL LETTER ETA WITH PSILI AND PERISPOMENI
+1F2F; C; 1F27; # GREEK CAPITAL LETTER ETA WITH DASIA AND PERISPOMENI
+1F38; C; 1F30; # GREEK CAPITAL LETTER IOTA WITH PSILI
+1F39; C; 1F31; # GREEK CAPITAL LETTER IOTA WITH DASIA
+1F3A; C; 1F32; # GREEK CAPITAL LETTER IOTA WITH PSILI AND VARIA
+1F3B; C; 1F33; # GREEK CAPITAL LETTER IOTA WITH DASIA AND VARIA
+1F3C; C; 1F34; # GREEK CAPITAL LETTER IOTA WITH PSILI AND OXIA
+1F3D; C; 1F35; # GREEK CAPITAL LETTER IOTA WITH DASIA AND OXIA
+1F3E; C; 1F36; # GREEK CAPITAL LETTER IOTA WITH PSILI AND PERISPOMENI
+1F3F; C; 1F37; # GREEK CAPITAL LETTER IOTA WITH DASIA AND PERISPOMENI
+1F48; C; 1F40; # GREEK CAPITAL LETTER OMICRON WITH PSILI
+1F49; C; 1F41; # GREEK CAPITAL LETTER OMICRON WITH DASIA
+1F4A; C; 1F42; # GREEK CAPITAL LETTER OMICRON WITH PSILI AND VARIA
+1F4B; C; 1F43; # GREEK CAPITAL LETTER OMICRON WITH DASIA AND VARIA
+1F4C; C; 1F44; # GREEK CAPITAL LETTER OMICRON WITH PSILI AND OXIA
+1F4D; C; 1F45; # GREEK CAPITAL LETTER OMICRON WITH DASIA AND OXIA
+1F50; F; 03C5 0313; # GREEK SMALL LETTER UPSILON WITH PSILI
+1F52; F; 03C5 0313 0300; # GREEK SMALL LETTER UPSILON WITH PSILI AND VARIA
+1F54; F; 03C5 0313 0301; # GREEK SMALL LETTER UPSILON WITH PSILI AND OXIA
+1F56; F; 03C5 0313 0342; # GREEK SMALL LETTER UPSILON WITH PSILI AND PERISPOMENI
+1F59; C; 1F51; # GREEK CAPITAL LETTER UPSILON WITH DASIA
+1F5B; C; 1F53; # GREEK CAPITAL LETTER UPSILON WITH DASIA AND VARIA
+1F5D; C; 1F55; # GREEK CAPITAL LETTER UPSILON WITH DASIA AND OXIA
+1F5F; C; 1F57; # GREEK CAPITAL LETTER UPSILON WITH DASIA AND PERISPOMENI
+1F68; C; 1F60; # GREEK CAPITAL LETTER OMEGA WITH PSILI
+1F69; C; 1F61; # GREEK CAPITAL LETTER OMEGA WITH DASIA
+1F6A; C; 1F62; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND VARIA
+1F6B; C; 1F63; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND VARIA
+1F6C; C; 1F64; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND OXIA
+1F6D; C; 1F65; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND OXIA
+1F6E; C; 1F66; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND PERISPOMENI
+1F6F; C; 1F67; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND PERISPOMENI
+1F80; F; 1F00 03B9; # GREEK SMALL LETTER ALPHA WITH PSILI AND YPOGEGRAMMENI
+1F81; F; 1F01 03B9; # GREEK SMALL LETTER ALPHA WITH DASIA AND YPOGEGRAMMENI
+1F82; F; 1F02 03B9; # GREEK SMALL LETTER ALPHA WITH PSILI AND VARIA AND YPOGEGRAMMENI
+1F83; F; 1F03 03B9; # GREEK SMALL LETTER ALPHA WITH DASIA AND VARIA AND YPOGEGRAMMENI
+1F84; F; 1F04 03B9; # GREEK SMALL LETTER ALPHA WITH PSILI AND OXIA AND YPOGEGRAMMENI
+1F85; F; 1F05 03B9; # GREEK SMALL LETTER ALPHA WITH DASIA AND OXIA AND YPOGEGRAMMENI
+1F86; F; 1F06 03B9; # GREEK SMALL LETTER ALPHA WITH PSILI AND PERISPOMENI AND YPOGEGRAMMENI
+1F87; F; 1F07 03B9; # GREEK SMALL LETTER ALPHA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
+1F88; F; 1F00 03B9; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND PROSGEGRAMMENI
+1F88; S; 1F80; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND PROSGEGRAMMENI
+1F89; F; 1F01 03B9; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND PROSGEGRAMMENI
+1F89; S; 1F81; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND PROSGEGRAMMENI
+1F8A; F; 1F02 03B9; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND VARIA AND PROSGEGRAMMENI
+1F8A; S; 1F82; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND VARIA AND PROSGEGRAMMENI
+1F8B; F; 1F03 03B9; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND VARIA AND PROSGEGRAMMENI
+1F8B; S; 1F83; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND VARIA AND PROSGEGRAMMENI
+1F8C; F; 1F04 03B9; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND OXIA AND PROSGEGRAMMENI
+1F8C; S; 1F84; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND OXIA AND PROSGEGRAMMENI
+1F8D; F; 1F05 03B9; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND OXIA AND PROSGEGRAMMENI
+1F8D; S; 1F85; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND OXIA AND PROSGEGRAMMENI
+1F8E; F; 1F06 03B9; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI
+1F8E; S; 1F86; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI
+1F8F; F; 1F07 03B9; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
+1F8F; S; 1F87; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
+1F90; F; 1F20 03B9; # GREEK SMALL LETTER ETA WITH PSILI AND YPOGEGRAMMENI
+1F91; F; 1F21 03B9; # GREEK SMALL LETTER ETA WITH DASIA AND YPOGEGRAMMENI
+1F92; F; 1F22 03B9; # GREEK SMALL LETTER ETA WITH PSILI AND VARIA AND YPOGEGRAMMENI
+1F93; F; 1F23 03B9; # GREEK SMALL LETTER ETA WITH DASIA AND VARIA AND YPOGEGRAMMENI
+1F94; F; 1F24 03B9; # GREEK SMALL LETTER ETA WITH PSILI AND OXIA AND YPOGEGRAMMENI
+1F95; F; 1F25 03B9; # GREEK SMALL LETTER ETA WITH DASIA AND OXIA AND YPOGEGRAMMENI
+1F96; F; 1F26 03B9; # GREEK SMALL LETTER ETA WITH PSILI AND PERISPOMENI AND YPOGEGRAMMENI
+1F97; F; 1F27 03B9; # GREEK SMALL LETTER ETA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
+1F98; F; 1F20 03B9; # GREEK CAPITAL LETTER ETA WITH PSILI AND PROSGEGRAMMENI
+1F98; S; 1F90; # GREEK CAPITAL LETTER ETA WITH PSILI AND PROSGEGRAMMENI
+1F99; F; 1F21 03B9; # GREEK CAPITAL LETTER ETA WITH DASIA AND PROSGEGRAMMENI
+1F99; S; 1F91; # GREEK CAPITAL LETTER ETA WITH DASIA AND PROSGEGRAMMENI
+1F9A; F; 1F22 03B9; # GREEK CAPITAL LETTER ETA WITH PSILI AND VARIA AND PROSGEGRAMMENI
+1F9A; S; 1F92; # GREEK CAPITAL LETTER ETA WITH PSILI AND VARIA AND PROSGEGRAMMENI
+1F9B; F; 1F23 03B9; # GREEK CAPITAL LETTER ETA WITH DASIA AND VARIA AND PROSGEGRAMMENI
+1F9B; S; 1F93; # GREEK CAPITAL LETTER ETA WITH DASIA AND VARIA AND PROSGEGRAMMENI
+1F9C; F; 1F24 03B9; # GREEK CAPITAL LETTER ETA WITH PSILI AND OXIA AND PROSGEGRAMMENI
+1F9C; S; 1F94; # GREEK CAPITAL LETTER ETA WITH PSILI AND OXIA AND PROSGEGRAMMENI
+1F9D; F; 1F25 03B9; # GREEK CAPITAL LETTER ETA WITH DASIA AND OXIA AND PROSGEGRAMMENI
+1F9D; S; 1F95; # GREEK CAPITAL LETTER ETA WITH DASIA AND OXIA AND PROSGEGRAMMENI
+1F9E; F; 1F26 03B9; # GREEK CAPITAL LETTER ETA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI
+1F9E; S; 1F96; # GREEK CAPITAL LETTER ETA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI
+1F9F; F; 1F27 03B9; # GREEK CAPITAL LETTER ETA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
+1F9F; S; 1F97; # GREEK CAPITAL LETTER ETA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
+1FA0; F; 1F60 03B9; # GREEK SMALL LETTER OMEGA WITH PSILI AND YPOGEGRAMMENI
+1FA1; F; 1F61 03B9; # GREEK SMALL LETTER OMEGA WITH DASIA AND YPOGEGRAMMENI
+1FA2; F; 1F62 03B9; # GREEK SMALL LETTER OMEGA WITH PSILI AND VARIA AND YPOGEGRAMMENI
+1FA3; F; 1F63 03B9; # GREEK SMALL LETTER OMEGA WITH DASIA AND VARIA AND YPOGEGRAMMENI
+1FA4; F; 1F64 03B9; # GREEK SMALL LETTER OMEGA WITH PSILI AND OXIA AND YPOGEGRAMMENI
+1FA5; F; 1F65 03B9; # GREEK SMALL LETTER OMEGA WITH DASIA AND OXIA AND YPOGEGRAMMENI
+1FA6; F; 1F66 03B9; # GREEK SMALL LETTER OMEGA WITH PSILI AND PERISPOMENI AND YPOGEGRAMMENI
+1FA7; F; 1F67 03B9; # GREEK SMALL LETTER OMEGA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
+1FA8; F; 1F60 03B9; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND PROSGEGRAMMENI
+1FA8; S; 1FA0; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND PROSGEGRAMMENI
+1FA9; F; 1F61 03B9; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND PROSGEGRAMMENI
+1FA9; S; 1FA1; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND PROSGEGRAMMENI
+1FAA; F; 1F62 03B9; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND VARIA AND PROSGEGRAMMENI
+1FAA; S; 1FA2; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND VARIA AND PROSGEGRAMMENI
+1FAB; F; 1F63 03B9; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND VARIA AND PROSGEGRAMMENI
+1FAB; S; 1FA3; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND VARIA AND PROSGEGRAMMENI
+1FAC; F; 1F64 03B9; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND OXIA AND PROSGEGRAMMENI
+1FAC; S; 1FA4; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND OXIA AND PROSGEGRAMMENI
+1FAD; F; 1F65 03B9; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND OXIA AND PROSGEGRAMMENI
+1FAD; S; 1FA5; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND OXIA AND PROSGEGRAMMENI
+1FAE; F; 1F66 03B9; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI
+1FAE; S; 1FA6; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI
+1FAF; F; 1F67 03B9; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
+1FAF; S; 1FA7; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
+1FB2; F; 1F70 03B9; # GREEK SMALL LETTER ALPHA WITH VARIA AND YPOGEGRAMMENI
+1FB3; F; 03B1 03B9; # GREEK SMALL LETTER ALPHA WITH YPOGEGRAMMENI
+1FB4; F; 03AC 03B9; # GREEK SMALL LETTER ALPHA WITH OXIA AND YPOGEGRAMMENI
+1FB6; F; 03B1 0342; # GREEK SMALL LETTER ALPHA WITH PERISPOMENI
+1FB7; F; 03B1 0342 03B9; # GREEK SMALL LETTER ALPHA WITH PERISPOMENI AND YPOGEGRAMMENI
+1FB8; C; 1FB0; # GREEK CAPITAL LETTER ALPHA WITH VRACHY
+1FB9; C; 1FB1; # GREEK CAPITAL LETTER ALPHA WITH MACRON
+1FBA; C; 1F70; # GREEK CAPITAL LETTER ALPHA WITH VARIA
+1FBB; C; 1F71; # GREEK CAPITAL LETTER ALPHA WITH OXIA
+1FBC; F; 03B1 03B9; # GREEK CAPITAL LETTER ALPHA WITH PROSGEGRAMMENI
+1FBC; S; 1FB3; # GREEK CAPITAL LETTER ALPHA WITH PROSGEGRAMMENI
+1FBE; C; 03B9; # GREEK PROSGEGRAMMENI
+1FC2; F; 1F74 03B9; # GREEK SMALL LETTER ETA WITH VARIA AND YPOGEGRAMMENI
+1FC3; F; 03B7 03B9; # GREEK SMALL LETTER ETA WITH YPOGEGRAMMENI
+1FC4; F; 03AE 03B9; # GREEK SMALL LETTER ETA WITH OXIA AND YPOGEGRAMMENI
+1FC6; F; 03B7 0342; # GREEK SMALL LETTER ETA WITH PERISPOMENI
+1FC7; F; 03B7 0342 03B9; # GREEK SMALL LETTER ETA WITH PERISPOMENI AND YPOGEGRAMMENI
+1FC8; C; 1F72; # GREEK CAPITAL LETTER EPSILON WITH VARIA
+1FC9; C; 1F73; # GREEK CAPITAL LETTER EPSILON WITH OXIA
+1FCA; C; 1F74; # GREEK CAPITAL LETTER ETA WITH VARIA
+1FCB; C; 1F75; # GREEK CAPITAL LETTER ETA WITH OXIA
+1FCC; F; 03B7 03B9; # GREEK CAPITAL LETTER ETA WITH PROSGEGRAMMENI
+1FCC; S; 1FC3; # GREEK CAPITAL LETTER ETA WITH PROSGEGRAMMENI
+1FD2; F; 03B9 0308 0300; # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND VARIA
+1FD3; F; 03B9 0308 0301; # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA
+1FD6; F; 03B9 0342; # GREEK SMALL LETTER IOTA WITH PERISPOMENI
+1FD7; F; 03B9 0308 0342; # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND PERISPOMENI
+1FD8; C; 1FD0; # GREEK CAPITAL LETTER IOTA WITH VRACHY
+1FD9; C; 1FD1; # GREEK CAPITAL LETTER IOTA WITH MACRON
+1FDA; C; 1F76; # GREEK CAPITAL LETTER IOTA WITH VARIA
+1FDB; C; 1F77; # GREEK CAPITAL LETTER IOTA WITH OXIA
+1FE2; F; 03C5 0308 0300; # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND VARIA
+1FE3; F; 03C5 0308 0301; # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND OXIA
+1FE4; F; 03C1 0313; # GREEK SMALL LETTER RHO WITH PSILI
+1FE6; F; 03C5 0342; # GREEK SMALL LETTER UPSILON WITH PERISPOMENI
+1FE7; F; 03C5 0308 0342; # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND PERISPOMENI
+1FE8; C; 1FE0; # GREEK CAPITAL LETTER UPSILON WITH VRACHY
+1FE9; C; 1FE1; # GREEK CAPITAL LETTER UPSILON WITH MACRON
+1FEA; C; 1F7A; # GREEK CAPITAL LETTER UPSILON WITH VARIA
+1FEB; C; 1F7B; # GREEK CAPITAL LETTER UPSILON WITH OXIA
+1FEC; C; 1FE5; # GREEK CAPITAL LETTER RHO WITH DASIA
+1FF2; F; 1F7C 03B9; # GREEK SMALL LETTER OMEGA WITH VARIA AND YPOGEGRAMMENI
+1FF3; F; 03C9 03B9; # GREEK SMALL LETTER OMEGA WITH YPOGEGRAMMENI
+1FF4; F; 03CE 03B9; # GREEK SMALL LETTER OMEGA WITH OXIA AND YPOGEGRAMMENI
+1FF6; F; 03C9 0342; # GREEK SMALL LETTER OMEGA WITH PERISPOMENI
+1FF7; F; 03C9 0342 03B9; # GREEK SMALL LETTER OMEGA WITH PERISPOMENI AND YPOGEGRAMMENI
+1FF8; C; 1F78; # GREEK CAPITAL LETTER OMICRON WITH VARIA
+1FF9; C; 1F79; # GREEK CAPITAL LETTER OMICRON WITH OXIA
+1FFA; C; 1F7C; # GREEK CAPITAL LETTER OMEGA WITH VARIA
+1FFB; C; 1F7D; # GREEK CAPITAL LETTER OMEGA WITH OXIA
+1FFC; F; 03C9 03B9; # GREEK CAPITAL LETTER OMEGA WITH PROSGEGRAMMENI
+1FFC; S; 1FF3; # GREEK CAPITAL LETTER OMEGA WITH PROSGEGRAMMENI
+2126; C; 03C9; # OHM SIGN
+212A; C; 006B; # KELVIN SIGN
+212B; C; 00E5; # ANGSTROM SIGN
+2132; C; 214E; # TURNED CAPITAL F
+2160; C; 2170; # ROMAN NUMERAL ONE
+2161; C; 2171; # ROMAN NUMERAL TWO
+2162; C; 2172; # ROMAN NUMERAL THREE
+2163; C; 2173; # ROMAN NUMERAL FOUR
+2164; C; 2174; # ROMAN NUMERAL FIVE
+2165; C; 2175; # ROMAN NUMERAL SIX
+2166; C; 2176; # ROMAN NUMERAL SEVEN
+2167; C; 2177; # ROMAN NUMERAL EIGHT
+2168; C; 2178; # ROMAN NUMERAL NINE
+2169; C; 2179; # ROMAN NUMERAL TEN
+216A; C; 217A; # ROMAN NUMERAL ELEVEN
+216B; C; 217B; # ROMAN NUMERAL TWELVE
+216C; C; 217C; # ROMAN NUMERAL FIFTY
+216D; C; 217D; # ROMAN NUMERAL ONE HUNDRED
+216E; C; 217E; # ROMAN NUMERAL FIVE HUNDRED
+216F; C; 217F; # ROMAN NUMERAL ONE THOUSAND
+2183; C; 2184; # ROMAN NUMERAL REVERSED ONE HUNDRED
+24B6; C; 24D0; # CIRCLED LATIN CAPITAL LETTER A
+24B7; C; 24D1; # CIRCLED LATIN CAPITAL LETTER B
+24B8; C; 24D2; # CIRCLED LATIN CAPITAL LETTER C
+24B9; C; 24D3; # CIRCLED LATIN CAPITAL LETTER D
+24BA; C; 24D4; # CIRCLED LATIN CAPITAL LETTER E
+24BB; C; 24D5; # CIRCLED LATIN CAPITAL LETTER F
+24BC; C; 24D6; # CIRCLED LATIN CAPITAL LETTER G
+24BD; C; 24D7; # CIRCLED LATIN CAPITAL LETTER H
+24BE; C; 24D8; # CIRCLED LATIN CAPITAL LETTER I
+24BF; C; 24D9; # CIRCLED LATIN CAPITAL LETTER J
+24C0; C; 24DA; # CIRCLED LATIN CAPITAL LETTER K
+24C1; C; 24DB; # CIRCLED LATIN CAPITAL LETTER L
+24C2; C; 24DC; # CIRCLED LATIN CAPITAL LETTER M
+24C3; C; 24DD; # CIRCLED LATIN CAPITAL LETTER N
+24C4; C; 24DE; # CIRCLED LATIN CAPITAL LETTER O
+24C5; C; 24DF; # CIRCLED LATIN CAPITAL LETTER P
+24C6; C; 24E0; # CIRCLED LATIN CAPITAL LETTER Q
+24C7; C; 24E1; # CIRCLED LATIN CAPITAL LETTER R
+24C8; C; 24E2; # CIRCLED LATIN CAPITAL LETTER S
+24C9; C; 24E3; # CIRCLED LATIN CAPITAL LETTER T
+24CA; C; 24E4; # CIRCLED LATIN CAPITAL LETTER U
+24CB; C; 24E5; # CIRCLED LATIN CAPITAL LETTER V
+24CC; C; 24E6; # CIRCLED LATIN CAPITAL LETTER W
+24CD; C; 24E7; # CIRCLED LATIN CAPITAL LETTER X
+24CE; C; 24E8; # CIRCLED LATIN CAPITAL LETTER Y
+24CF; C; 24E9; # CIRCLED LATIN CAPITAL LETTER Z
+2C00; C; 2C30; # GLAGOLITIC CAPITAL LETTER AZU
+2C01; C; 2C31; # GLAGOLITIC CAPITAL LETTER BUKY
+2C02; C; 2C32; # GLAGOLITIC CAPITAL LETTER VEDE
+2C03; C; 2C33; # GLAGOLITIC CAPITAL LETTER GLAGOLI
+2C04; C; 2C34; # GLAGOLITIC CAPITAL LETTER DOBRO
+2C05; C; 2C35; # GLAGOLITIC CAPITAL LETTER YESTU
+2C06; C; 2C36; # GLAGOLITIC CAPITAL LETTER ZHIVETE
+2C07; C; 2C37; # GLAGOLITIC CAPITAL LETTER DZELO
+2C08; C; 2C38; # GLAGOLITIC CAPITAL LETTER ZEMLJA
+2C09; C; 2C39; # GLAGOLITIC CAPITAL LETTER IZHE
+2C0A; C; 2C3A; # GLAGOLITIC CAPITAL LETTER INITIAL IZHE
+2C0B; C; 2C3B; # GLAGOLITIC CAPITAL LETTER I
+2C0C; C; 2C3C; # GLAGOLITIC CAPITAL LETTER DJERVI
+2C0D; C; 2C3D; # GLAGOLITIC CAPITAL LETTER KAKO
+2C0E; C; 2C3E; # GLAGOLITIC CAPITAL LETTER LJUDIJE
+2C0F; C; 2C3F; # GLAGOLITIC CAPITAL LETTER MYSLITE
+2C10; C; 2C40; # GLAGOLITIC CAPITAL LETTER NASHI
+2C11; C; 2C41; # GLAGOLITIC CAPITAL LETTER ONU
+2C12; C; 2C42; # GLAGOLITIC CAPITAL LETTER POKOJI
+2C13; C; 2C43; # GLAGOLITIC CAPITAL LETTER RITSI
+2C14; C; 2C44; # GLAGOLITIC CAPITAL LETTER SLOVO
+2C15; C; 2C45; # GLAGOLITIC CAPITAL LETTER TVRIDO
+2C16; C; 2C46; # GLAGOLITIC CAPITAL LETTER UKU
+2C17; C; 2C47; # GLAGOLITIC CAPITAL LETTER FRITU
+2C18; C; 2C48; # GLAGOLITIC CAPITAL LETTER HERU
+2C19; C; 2C49; # GLAGOLITIC CAPITAL LETTER OTU
+2C1A; C; 2C4A; # GLAGOLITIC CAPITAL LETTER PE
+2C1B; C; 2C4B; # GLAGOLITIC CAPITAL LETTER SHTA
+2C1C; C; 2C4C; # GLAGOLITIC CAPITAL LETTER TSI
+2C1D; C; 2C4D; # GLAGOLITIC CAPITAL LETTER CHRIVI
+2C1E; C; 2C4E; # GLAGOLITIC CAPITAL LETTER SHA
+2C1F; C; 2C4F; # GLAGOLITIC CAPITAL LETTER YERU
+2C20; C; 2C50; # GLAGOLITIC CAPITAL LETTER YERI
+2C21; C; 2C51; # GLAGOLITIC CAPITAL LETTER YATI
+2C22; C; 2C52; # GLAGOLITIC CAPITAL LETTER SPIDERY HA
+2C23; C; 2C53; # GLAGOLITIC CAPITAL LETTER YU
+2C24; C; 2C54; # GLAGOLITIC CAPITAL LETTER SMALL YUS
+2C25; C; 2C55; # GLAGOLITIC CAPITAL LETTER SMALL YUS WITH TAIL
+2C26; C; 2C56; # GLAGOLITIC CAPITAL LETTER YO
+2C27; C; 2C57; # GLAGOLITIC CAPITAL LETTER IOTATED SMALL YUS
+2C28; C; 2C58; # GLAGOLITIC CAPITAL LETTER BIG YUS
+2C29; C; 2C59; # GLAGOLITIC CAPITAL LETTER IOTATED BIG YUS
+2C2A; C; 2C5A; # GLAGOLITIC CAPITAL LETTER FITA
+2C2B; C; 2C5B; # GLAGOLITIC CAPITAL LETTER IZHITSA
+2C2C; C; 2C5C; # GLAGOLITIC CAPITAL LETTER SHTAPIC
+2C2D; C; 2C5D; # GLAGOLITIC CAPITAL LETTER TROKUTASTI A
+2C2E; C; 2C5E; # GLAGOLITIC CAPITAL LETTER LATINATE MYSLITE
+2C60; C; 2C61; # LATIN CAPITAL LETTER L WITH DOUBLE BAR
+2C62; C; 026B; # LATIN CAPITAL LETTER L WITH MIDDLE TILDE
+2C63; C; 1D7D; # LATIN CAPITAL LETTER P WITH STROKE
+2C64; C; 027D; # LATIN CAPITAL LETTER R WITH TAIL
+2C67; C; 2C68; # LATIN CAPITAL LETTER H WITH DESCENDER
+2C69; C; 2C6A; # LATIN CAPITAL LETTER K WITH DESCENDER
+2C6B; C; 2C6C; # LATIN CAPITAL LETTER Z WITH DESCENDER
+2C6D; C; 0251; # LATIN CAPITAL LETTER ALPHA
+2C6E; C; 0271; # LATIN CAPITAL LETTER M WITH HOOK
+2C6F; C; 0250; # LATIN CAPITAL LETTER TURNED A
+2C70; C; 0252; # LATIN CAPITAL LETTER TURNED ALPHA
+2C72; C; 2C73; # LATIN CAPITAL LETTER W WITH HOOK
+2C75; C; 2C76; # LATIN CAPITAL LETTER HALF H
+2C7E; C; 023F; # LATIN CAPITAL LETTER S WITH SWASH TAIL
+2C7F; C; 0240; # LATIN CAPITAL LETTER Z WITH SWASH TAIL
+2C80; C; 2C81; # COPTIC CAPITAL LETTER ALFA
+2C82; C; 2C83; # COPTIC CAPITAL LETTER VIDA
+2C84; C; 2C85; # COPTIC CAPITAL LETTER GAMMA
+2C86; C; 2C87; # COPTIC CAPITAL LETTER DALDA
+2C88; C; 2C89; # COPTIC CAPITAL LETTER EIE
+2C8A; C; 2C8B; # COPTIC CAPITAL LETTER SOU
+2C8C; C; 2C8D; # COPTIC CAPITAL LETTER ZATA
+2C8E; C; 2C8F; # COPTIC CAPITAL LETTER HATE
+2C90; C; 2C91; # COPTIC CAPITAL LETTER THETHE
+2C92; C; 2C93; # COPTIC CAPITAL LETTER IAUDA
+2C94; C; 2C95; # COPTIC CAPITAL LETTER KAPA
+2C96; C; 2C97; # COPTIC CAPITAL LETTER LAULA
+2C98; C; 2C99; # COPTIC CAPITAL LETTER MI
+2C9A; C; 2C9B; # COPTIC CAPITAL LETTER NI
+2C9C; C; 2C9D; # COPTIC CAPITAL LETTER KSI
+2C9E; C; 2C9F; # COPTIC CAPITAL LETTER O
+2CA0; C; 2CA1; # COPTIC CAPITAL LETTER PI
+2CA2; C; 2CA3; # COPTIC CAPITAL LETTER RO
+2CA4; C; 2CA5; # COPTIC CAPITAL LETTER SIMA
+2CA6; C; 2CA7; # COPTIC CAPITAL LETTER TAU
+2CA8; C; 2CA9; # COPTIC CAPITAL LETTER UA
+2CAA; C; 2CAB; # COPTIC CAPITAL LETTER FI
+2CAC; C; 2CAD; # COPTIC CAPITAL LETTER KHI
+2CAE; C; 2CAF; # COPTIC CAPITAL LETTER PSI
+2CB0; C; 2CB1; # COPTIC CAPITAL LETTER OOU
+2CB2; C; 2CB3; # COPTIC CAPITAL LETTER DIALECT-P ALEF
+2CB4; C; 2CB5; # COPTIC CAPITAL LETTER OLD COPTIC AIN
+2CB6; C; 2CB7; # COPTIC CAPITAL LETTER CRYPTOGRAMMIC EIE
+2CB8; C; 2CB9; # COPTIC CAPITAL LETTER DIALECT-P KAPA
+2CBA; C; 2CBB; # COPTIC CAPITAL LETTER DIALECT-P NI
+2CBC; C; 2CBD; # COPTIC CAPITAL LETTER CRYPTOGRAMMIC NI
+2CBE; C; 2CBF; # COPTIC CAPITAL LETTER OLD COPTIC OOU
+2CC0; C; 2CC1; # COPTIC CAPITAL LETTER SAMPI
+2CC2; C; 2CC3; # COPTIC CAPITAL LETTER CROSSED SHEI
+2CC4; C; 2CC5; # COPTIC CAPITAL LETTER OLD COPTIC SHEI
+2CC6; C; 2CC7; # COPTIC CAPITAL LETTER OLD COPTIC ESH
+2CC8; C; 2CC9; # COPTIC CAPITAL LETTER AKHMIMIC KHEI
+2CCA; C; 2CCB; # COPTIC CAPITAL LETTER DIALECT-P HORI
+2CCC; C; 2CCD; # COPTIC CAPITAL LETTER OLD COPTIC HORI
+2CCE; C; 2CCF; # COPTIC CAPITAL LETTER OLD COPTIC HA
+2CD0; C; 2CD1; # COPTIC CAPITAL LETTER L-SHAPED HA
+2CD2; C; 2CD3; # COPTIC CAPITAL LETTER OLD COPTIC HEI
+2CD4; C; 2CD5; # COPTIC CAPITAL LETTER OLD COPTIC HAT
+2CD6; C; 2CD7; # COPTIC CAPITAL LETTER OLD COPTIC GANGIA
+2CD8; C; 2CD9; # COPTIC CAPITAL LETTER OLD COPTIC DJA
+2CDA; C; 2CDB; # COPTIC CAPITAL LETTER OLD COPTIC SHIMA
+2CDC; C; 2CDD; # COPTIC CAPITAL LETTER OLD NUBIAN SHIMA
+2CDE; C; 2CDF; # COPTIC CAPITAL LETTER OLD NUBIAN NGI
+2CE0; C; 2CE1; # COPTIC CAPITAL LETTER OLD NUBIAN NYI
+2CE2; C; 2CE3; # COPTIC CAPITAL LETTER OLD NUBIAN WAU
+2CEB; C; 2CEC; # COPTIC CAPITAL LETTER CRYPTOGRAMMIC SHEI
+2CED; C; 2CEE; # COPTIC CAPITAL LETTER CRYPTOGRAMMIC GANGIA
+2CF2; C; 2CF3; # COPTIC CAPITAL LETTER BOHAIRIC KHEI
+A640; C; A641; # CYRILLIC CAPITAL LETTER ZEMLYA
+A642; C; A643; # CYRILLIC CAPITAL LETTER DZELO
+A644; C; A645; # CYRILLIC CAPITAL LETTER REVERSED DZE
+A646; C; A647; # CYRILLIC CAPITAL LETTER IOTA
+A648; C; A649; # CYRILLIC CAPITAL LETTER DJERV
+A64A; C; A64B; # CYRILLIC CAPITAL LETTER MONOGRAPH UK
+A64C; C; A64D; # CYRILLIC CAPITAL LETTER BROAD OMEGA
+A64E; C; A64F; # CYRILLIC CAPITAL LETTER NEUTRAL YER
+A650; C; A651; # CYRILLIC CAPITAL LETTER YERU WITH BACK YER
+A652; C; A653; # CYRILLIC CAPITAL LETTER IOTIFIED YAT
+A654; C; A655; # CYRILLIC CAPITAL LETTER REVERSED YU
+A656; C; A657; # CYRILLIC CAPITAL LETTER IOTIFIED A
+A658; C; A659; # CYRILLIC CAPITAL LETTER CLOSED LITTLE YUS
+A65A; C; A65B; # CYRILLIC CAPITAL LETTER BLENDED YUS
+A65C; C; A65D; # CYRILLIC CAPITAL LETTER IOTIFIED CLOSED LITTLE YUS
+A65E; C; A65F; # CYRILLIC CAPITAL LETTER YN
+A660; C; A661; # CYRILLIC CAPITAL LETTER REVERSED TSE
+A662; C; A663; # CYRILLIC CAPITAL LETTER SOFT DE
+A664; C; A665; # CYRILLIC CAPITAL LETTER SOFT EL
+A666; C; A667; # CYRILLIC CAPITAL LETTER SOFT EM
+A668; C; A669; # CYRILLIC CAPITAL LETTER MONOCULAR O
+A66A; C; A66B; # CYRILLIC CAPITAL LETTER BINOCULAR O
+A66C; C; A66D; # CYRILLIC CAPITAL LETTER DOUBLE MONOCULAR O
+A680; C; A681; # CYRILLIC CAPITAL LETTER DWE
+A682; C; A683; # CYRILLIC CAPITAL LETTER DZWE
+A684; C; A685; # CYRILLIC CAPITAL LETTER ZHWE
+A686; C; A687; # CYRILLIC CAPITAL LETTER CCHE
+A688; C; A689; # CYRILLIC CAPITAL LETTER DZZE
+A68A; C; A68B; # CYRILLIC CAPITAL LETTER TE WITH MIDDLE HOOK
+A68C; C; A68D; # CYRILLIC CAPITAL LETTER TWE
+A68E; C; A68F; # CYRILLIC CAPITAL LETTER TSWE
+A690; C; A691; # CYRILLIC CAPITAL LETTER TSSE
+A692; C; A693; # CYRILLIC CAPITAL LETTER TCHE
+A694; C; A695; # CYRILLIC CAPITAL LETTER HWE
+A696; C; A697; # CYRILLIC CAPITAL LETTER SHWE
+A698; C; A699; # CYRILLIC CAPITAL LETTER DOUBLE O
+A69A; C; A69B; # CYRILLIC CAPITAL LETTER CROSSED O
+A722; C; A723; # LATIN CAPITAL LETTER EGYPTOLOGICAL ALEF
+A724; C; A725; # LATIN CAPITAL LETTER EGYPTOLOGICAL AIN
+A726; C; A727; # LATIN CAPITAL LETTER HENG
+A728; C; A729; # LATIN CAPITAL LETTER TZ
+A72A; C; A72B; # LATIN CAPITAL LETTER TRESILLO
+A72C; C; A72D; # LATIN CAPITAL LETTER CUATRILLO
+A72E; C; A72F; # LATIN CAPITAL LETTER CUATRILLO WITH COMMA
+A732; C; A733; # LATIN CAPITAL LETTER AA
+A734; C; A735; # LATIN CAPITAL LETTER AO
+A736; C; A737; # LATIN CAPITAL LETTER AU
+A738; C; A739; # LATIN CAPITAL LETTER AV
+A73A; C; A73B; # LATIN CAPITAL LETTER AV WITH HORIZONTAL BAR
+A73C; C; A73D; # LATIN CAPITAL LETTER AY
+A73E; C; A73F; # LATIN CAPITAL LETTER REVERSED C WITH DOT
+A740; C; A741; # LATIN CAPITAL LETTER K WITH STROKE
+A742; C; A743; # LATIN CAPITAL LETTER K WITH DIAGONAL STROKE
+A744; C; A745; # LATIN CAPITAL LETTER K WITH STROKE AND DIAGONAL STROKE
+A746; C; A747; # LATIN CAPITAL LETTER BROKEN L
+A748; C; A749; # LATIN CAPITAL LETTER L WITH HIGH STROKE
+A74A; C; A74B; # LATIN CAPITAL LETTER O WITH LONG STROKE OVERLAY
+A74C; C; A74D; # LATIN CAPITAL LETTER O WITH LOOP
+A74E; C; A74F; # LATIN CAPITAL LETTER OO
+A750; C; A751; # LATIN CAPITAL LETTER P WITH STROKE THROUGH DESCENDER
+A752; C; A753; # LATIN CAPITAL LETTER P WITH FLOURISH
+A754; C; A755; # LATIN CAPITAL LETTER P WITH SQUIRREL TAIL
+A756; C; A757; # LATIN CAPITAL LETTER Q WITH STROKE THROUGH DESCENDER
+A758; C; A759; # LATIN CAPITAL LETTER Q WITH DIAGONAL STROKE
+A75A; C; A75B; # LATIN CAPITAL LETTER R ROTUNDA
+A75C; C; A75D; # LATIN CAPITAL LETTER RUM ROTUNDA
+A75E; C; A75F; # LATIN CAPITAL LETTER V WITH DIAGONAL STROKE
+A760; C; A761; # LATIN CAPITAL LETTER VY
+A762; C; A763; # LATIN CAPITAL LETTER VISIGOTHIC Z
+A764; C; A765; # LATIN CAPITAL LETTER THORN WITH STROKE
+A766; C; A767; # LATIN CAPITAL LETTER THORN WITH STROKE THROUGH DESCENDER
+A768; C; A769; # LATIN CAPITAL LETTER VEND
+A76A; C; A76B; # LATIN CAPITAL LETTER ET
+A76C; C; A76D; # LATIN CAPITAL LETTER IS
+A76E; C; A76F; # LATIN CAPITAL LETTER CON
+A779; C; A77A; # LATIN CAPITAL LETTER INSULAR D
+A77B; C; A77C; # LATIN CAPITAL LETTER INSULAR F
+A77D; C; 1D79; # LATIN CAPITAL LETTER INSULAR G
+A77E; C; A77F; # LATIN CAPITAL LETTER TURNED INSULAR G
+A780; C; A781; # LATIN CAPITAL LETTER TURNED L
+A782; C; A783; # LATIN CAPITAL LETTER INSULAR R
+A784; C; A785; # LATIN CAPITAL LETTER INSULAR S
+A786; C; A787; # LATIN CAPITAL LETTER INSULAR T
+A78B; C; A78C; # LATIN CAPITAL LETTER SALTILLO
+A78D; C; 0265; # LATIN CAPITAL LETTER TURNED H
+A790; C; A791; # LATIN CAPITAL LETTER N WITH DESCENDER
+A792; C; A793; # LATIN CAPITAL LETTER C WITH BAR
+A796; C; A797; # LATIN CAPITAL LETTER B WITH FLOURISH
+A798; C; A799; # LATIN CAPITAL LETTER F WITH STROKE
+A79A; C; A79B; # LATIN CAPITAL LETTER VOLAPUK AE
+A79C; C; A79D; # LATIN CAPITAL LETTER VOLAPUK OE
+A79E; C; A79F; # LATIN CAPITAL LETTER VOLAPUK UE
+A7A0; C; A7A1; # LATIN CAPITAL LETTER G WITH OBLIQUE STROKE
+A7A2; C; A7A3; # LATIN CAPITAL LETTER K WITH OBLIQUE STROKE
+A7A4; C; A7A5; # LATIN CAPITAL LETTER N WITH OBLIQUE STROKE
+A7A6; C; A7A7; # LATIN CAPITAL LETTER R WITH OBLIQUE STROKE
+A7A8; C; A7A9; # LATIN CAPITAL LETTER S WITH OBLIQUE STROKE
+A7AA; C; 0266; # LATIN CAPITAL LETTER H WITH HOOK
+A7AB; C; 025C; # LATIN CAPITAL LETTER REVERSED OPEN E
+A7AC; C; 0261; # LATIN CAPITAL LETTER SCRIPT G
+A7AD; C; 026C; # LATIN CAPITAL LETTER L WITH BELT
+A7AE; C; 026A; # LATIN CAPITAL LETTER SMALL CAPITAL I
+A7B0; C; 029E; # LATIN CAPITAL LETTER TURNED K
+A7B1; C; 0287; # LATIN CAPITAL LETTER TURNED T
+A7B2; C; 029D; # LATIN CAPITAL LETTER J WITH CROSSED-TAIL
+A7B3; C; AB53; # LATIN CAPITAL LETTER CHI
+A7B4; C; A7B5; # LATIN CAPITAL LETTER BETA
+A7B6; C; A7B7; # LATIN CAPITAL LETTER OMEGA
+A7B8; C; A7B9; # LATIN CAPITAL LETTER U WITH STROKE
+A7BA; C; A7BB; # LATIN CAPITAL LETTER GLOTTAL A
+A7BC; C; A7BD; # LATIN CAPITAL LETTER GLOTTAL I
+A7BE; C; A7BF; # LATIN CAPITAL LETTER GLOTTAL U
+A7C2; C; A7C3; # LATIN CAPITAL LETTER ANGLICANA W
+A7C4; C; A794; # LATIN CAPITAL LETTER C WITH PALATAL HOOK
+A7C5; C; 0282; # LATIN CAPITAL LETTER S WITH HOOK
+A7C6; C; 1D8E; # LATIN CAPITAL LETTER Z WITH PALATAL HOOK
+AB70; C; 13A0; # CHEROKEE SMALL LETTER A
+AB71; C; 13A1; # CHEROKEE SMALL LETTER E
+AB72; C; 13A2; # CHEROKEE SMALL LETTER I
+AB73; C; 13A3; # CHEROKEE SMALL LETTER O
+AB74; C; 13A4; # CHEROKEE SMALL LETTER U
+AB75; C; 13A5; # CHEROKEE SMALL LETTER V
+AB76; C; 13A6; # CHEROKEE SMALL LETTER GA
+AB77; C; 13A7; # CHEROKEE SMALL LETTER KA
+AB78; C; 13A8; # CHEROKEE SMALL LETTER GE
+AB79; C; 13A9; # CHEROKEE SMALL LETTER GI
+AB7A; C; 13AA; # CHEROKEE SMALL LETTER GO
+AB7B; C; 13AB; # CHEROKEE SMALL LETTER GU
+AB7C; C; 13AC; # CHEROKEE SMALL LETTER GV
+AB7D; C; 13AD; # CHEROKEE SMALL LETTER HA
+AB7E; C; 13AE; # CHEROKEE SMALL LETTER HE
+AB7F; C; 13AF; # CHEROKEE SMALL LETTER HI
+AB80; C; 13B0; # CHEROKEE SMALL LETTER HO
+AB81; C; 13B1; # CHEROKEE SMALL LETTER HU
+AB82; C; 13B2; # CHEROKEE SMALL LETTER HV
+AB83; C; 13B3; # CHEROKEE SMALL LETTER LA
+AB84; C; 13B4; # CHEROKEE SMALL LETTER LE
+AB85; C; 13B5; # CHEROKEE SMALL LETTER LI
+AB86; C; 13B6; # CHEROKEE SMALL LETTER LO
+AB87; C; 13B7; # CHEROKEE SMALL LETTER LU
+AB88; C; 13B8; # CHEROKEE SMALL LETTER LV
+AB89; C; 13B9; # CHEROKEE SMALL LETTER MA
+AB8A; C; 13BA; # CHEROKEE SMALL LETTER ME
+AB8B; C; 13BB; # CHEROKEE SMALL LETTER MI
+AB8C; C; 13BC; # CHEROKEE SMALL LETTER MO
+AB8D; C; 13BD; # CHEROKEE SMALL LETTER MU
+AB8E; C; 13BE; # CHEROKEE SMALL LETTER NA
+AB8F; C; 13BF; # CHEROKEE SMALL LETTER HNA
+AB90; C; 13C0; # CHEROKEE SMALL LETTER NAH
+AB91; C; 13C1; # CHEROKEE SMALL LETTER NE
+AB92; C; 13C2; # CHEROKEE SMALL LETTER NI
+AB93; C; 13C3; # CHEROKEE SMALL LETTER NO
+AB94; C; 13C4; # CHEROKEE SMALL LETTER NU
+AB95; C; 13C5; # CHEROKEE SMALL LETTER NV
+AB96; C; 13C6; # CHEROKEE SMALL LETTER QUA
+AB97; C; 13C7; # CHEROKEE SMALL LETTER QUE
+AB98; C; 13C8; # CHEROKEE SMALL LETTER QUI
+AB99; C; 13C9; # CHEROKEE SMALL LETTER QUO
+AB9A; C; 13CA; # CHEROKEE SMALL LETTER QUU
+AB9B; C; 13CB; # CHEROKEE SMALL LETTER QUV
+AB9C; C; 13CC; # CHEROKEE SMALL LETTER SA
+AB9D; C; 13CD; # CHEROKEE SMALL LETTER S
+AB9E; C; 13CE; # CHEROKEE SMALL LETTER SE
+AB9F; C; 13CF; # CHEROKEE SMALL LETTER SI
+ABA0; C; 13D0; # CHEROKEE SMALL LETTER SO
+ABA1; C; 13D1; # CHEROKEE SMALL LETTER SU
+ABA2; C; 13D2; # CHEROKEE SMALL LETTER SV
+ABA3; C; 13D3; # CHEROKEE SMALL LETTER DA
+ABA4; C; 13D4; # CHEROKEE SMALL LETTER TA
+ABA5; C; 13D5; # CHEROKEE SMALL LETTER DE
+ABA6; C; 13D6; # CHEROKEE SMALL LETTER TE
+ABA7; C; 13D7; # CHEROKEE SMALL LETTER DI
+ABA8; C; 13D8; # CHEROKEE SMALL LETTER TI
+ABA9; C; 13D9; # CHEROKEE SMALL LETTER DO
+ABAA; C; 13DA; # CHEROKEE SMALL LETTER DU
+ABAB; C; 13DB; # CHEROKEE SMALL LETTER DV
+ABAC; C; 13DC; # CHEROKEE SMALL LETTER DLA
+ABAD; C; 13DD; # CHEROKEE SMALL LETTER TLA
+ABAE; C; 13DE; # CHEROKEE SMALL LETTER TLE
+ABAF; C; 13DF; # CHEROKEE SMALL LETTER TLI
+ABB0; C; 13E0; # CHEROKEE SMALL LETTER TLO
+ABB1; C; 13E1; # CHEROKEE SMALL LETTER TLU
+ABB2; C; 13E2; # CHEROKEE SMALL LETTER TLV
+ABB3; C; 13E3; # CHEROKEE SMALL LETTER TSA
+ABB4; C; 13E4; # CHEROKEE SMALL LETTER TSE
+ABB5; C; 13E5; # CHEROKEE SMALL LETTER TSI
+ABB6; C; 13E6; # CHEROKEE SMALL LETTER TSO
+ABB7; C; 13E7; # CHEROKEE SMALL LETTER TSU
+ABB8; C; 13E8; # CHEROKEE SMALL LETTER TSV
+ABB9; C; 13E9; # CHEROKEE SMALL LETTER WA
+ABBA; C; 13EA; # CHEROKEE SMALL LETTER WE
+ABBB; C; 13EB; # CHEROKEE SMALL LETTER WI
+ABBC; C; 13EC; # CHEROKEE SMALL LETTER WO
+ABBD; C; 13ED; # CHEROKEE SMALL LETTER WU
+ABBE; C; 13EE; # CHEROKEE SMALL LETTER WV
+ABBF; C; 13EF; # CHEROKEE SMALL LETTER YA
+FB00; F; 0066 0066; # LATIN SMALL LIGATURE FF
+FB01; F; 0066 0069; # LATIN SMALL LIGATURE FI
+FB02; F; 0066 006C; # LATIN SMALL LIGATURE FL
+FB03; F; 0066 0066 0069; # LATIN SMALL LIGATURE FFI
+FB04; F; 0066 0066 006C; # LATIN SMALL LIGATURE FFL
+FB05; F; 0073 0074; # LATIN SMALL LIGATURE LONG S T
+FB06; F; 0073 0074; # LATIN SMALL LIGATURE ST
+FB13; F; 0574 0576; # ARMENIAN SMALL LIGATURE MEN NOW
+FB14; F; 0574 0565; # ARMENIAN SMALL LIGATURE MEN ECH
+FB15; F; 0574 056B; # ARMENIAN SMALL LIGATURE MEN INI
+FB16; F; 057E 0576; # ARMENIAN SMALL LIGATURE VEW NOW
+FB17; F; 0574 056D; # ARMENIAN SMALL LIGATURE MEN XEH
+FF21; C; FF41; # FULLWIDTH LATIN CAPITAL LETTER A
+FF22; C; FF42; # FULLWIDTH LATIN CAPITAL LETTER B
+FF23; C; FF43; # FULLWIDTH LATIN CAPITAL LETTER C
+FF24; C; FF44; # FULLWIDTH LATIN CAPITAL LETTER D
+FF25; C; FF45; # FULLWIDTH LATIN CAPITAL LETTER E
+FF26; C; FF46; # FULLWIDTH LATIN CAPITAL LETTER F
+FF27; C; FF47; # FULLWIDTH LATIN CAPITAL LETTER G
+FF28; C; FF48; # FULLWIDTH LATIN CAPITAL LETTER H
+FF29; C; FF49; # FULLWIDTH LATIN CAPITAL LETTER I
+FF2A; C; FF4A; # FULLWIDTH LATIN CAPITAL LETTER J
+FF2B; C; FF4B; # FULLWIDTH LATIN CAPITAL LETTER K
+FF2C; C; FF4C; # FULLWIDTH LATIN CAPITAL LETTER L
+FF2D; C; FF4D; # FULLWIDTH LATIN CAPITAL LETTER M
+FF2E; C; FF4E; # FULLWIDTH LATIN CAPITAL LETTER N
+FF2F; C; FF4F; # FULLWIDTH LATIN CAPITAL LETTER O
+FF30; C; FF50; # FULLWIDTH LATIN CAPITAL LETTER P
+FF31; C; FF51; # FULLWIDTH LATIN CAPITAL LETTER Q
+FF32; C; FF52; # FULLWIDTH LATIN CAPITAL LETTER R
+FF33; C; FF53; # FULLWIDTH LATIN CAPITAL LETTER S
+FF34; C; FF54; # FULLWIDTH LATIN CAPITAL LETTER T
+FF35; C; FF55; # FULLWIDTH LATIN CAPITAL LETTER U
+FF36; C; FF56; # FULLWIDTH LATIN CAPITAL LETTER V
+FF37; C; FF57; # FULLWIDTH LATIN CAPITAL LETTER W
+FF38; C; FF58; # FULLWIDTH LATIN CAPITAL LETTER X
+FF39; C; FF59; # FULLWIDTH LATIN CAPITAL LETTER Y
+FF3A; C; FF5A; # FULLWIDTH LATIN CAPITAL LETTER Z
+10400; C; 10428; # DESERET CAPITAL LETTER LONG I
+10401; C; 10429; # DESERET CAPITAL LETTER LONG E
+10402; C; 1042A; # DESERET CAPITAL LETTER LONG A
+10403; C; 1042B; # DESERET CAPITAL LETTER LONG AH
+10404; C; 1042C; # DESERET CAPITAL LETTER LONG O
+10405; C; 1042D; # DESERET CAPITAL LETTER LONG OO
+10406; C; 1042E; # DESERET CAPITAL LETTER SHORT I
+10407; C; 1042F; # DESERET CAPITAL LETTER SHORT E
+10408; C; 10430; # DESERET CAPITAL LETTER SHORT A
+10409; C; 10431; # DESERET CAPITAL LETTER SHORT AH
+1040A; C; 10432; # DESERET CAPITAL LETTER SHORT O
+1040B; C; 10433; # DESERET CAPITAL LETTER SHORT OO
+1040C; C; 10434; # DESERET CAPITAL LETTER AY
+1040D; C; 10435; # DESERET CAPITAL LETTER OW
+1040E; C; 10436; # DESERET CAPITAL LETTER WU
+1040F; C; 10437; # DESERET CAPITAL LETTER YEE
+10410; C; 10438; # DESERET CAPITAL LETTER H
+10411; C; 10439; # DESERET CAPITAL LETTER PEE
+10412; C; 1043A; # DESERET CAPITAL LETTER BEE
+10413; C; 1043B; # DESERET CAPITAL LETTER TEE
+10414; C; 1043C; # DESERET CAPITAL LETTER DEE
+10415; C; 1043D; # DESERET CAPITAL LETTER CHEE
+10416; C; 1043E; # DESERET CAPITAL LETTER JEE
+10417; C; 1043F; # DESERET CAPITAL LETTER KAY
+10418; C; 10440; # DESERET CAPITAL LETTER GAY
+10419; C; 10441; # DESERET CAPITAL LETTER EF
+1041A; C; 10442; # DESERET CAPITAL LETTER VEE
+1041B; C; 10443; # DESERET CAPITAL LETTER ETH
+1041C; C; 10444; # DESERET CAPITAL LETTER THEE
+1041D; C; 10445; # DESERET CAPITAL LETTER ES
+1041E; C; 10446; # DESERET CAPITAL LETTER ZEE
+1041F; C; 10447; # DESERET CAPITAL LETTER ESH
+10420; C; 10448; # DESERET CAPITAL LETTER ZHEE
+10421; C; 10449; # DESERET CAPITAL LETTER ER
+10422; C; 1044A; # DESERET CAPITAL LETTER EL
+10423; C; 1044B; # DESERET CAPITAL LETTER EM
+10424; C; 1044C; # DESERET CAPITAL LETTER EN
+10425; C; 1044D; # DESERET CAPITAL LETTER ENG
+10426; C; 1044E; # DESERET CAPITAL LETTER OI
+10427; C; 1044F; # DESERET CAPITAL LETTER EW
+104B0; C; 104D8; # OSAGE CAPITAL LETTER A
+104B1; C; 104D9; # OSAGE CAPITAL LETTER AI
+104B2; C; 104DA; # OSAGE CAPITAL LETTER AIN
+104B3; C; 104DB; # OSAGE CAPITAL LETTER AH
+104B4; C; 104DC; # OSAGE CAPITAL LETTER BRA
+104B5; C; 104DD; # OSAGE CAPITAL LETTER CHA
+104B6; C; 104DE; # OSAGE CAPITAL LETTER EHCHA
+104B7; C; 104DF; # OSAGE CAPITAL LETTER E
+104B8; C; 104E0; # OSAGE CAPITAL LETTER EIN
+104B9; C; 104E1; # OSAGE CAPITAL LETTER HA
+104BA; C; 104E2; # OSAGE CAPITAL LETTER HYA
+104BB; C; 104E3; # OSAGE CAPITAL LETTER I
+104BC; C; 104E4; # OSAGE CAPITAL LETTER KA
+104BD; C; 104E5; # OSAGE CAPITAL LETTER EHKA
+104BE; C; 104E6; # OSAGE CAPITAL LETTER KYA
+104BF; C; 104E7; # OSAGE CAPITAL LETTER LA
+104C0; C; 104E8; # OSAGE CAPITAL LETTER MA
+104C1; C; 104E9; # OSAGE CAPITAL LETTER NA
+104C2; C; 104EA; # OSAGE CAPITAL LETTER O
+104C3; C; 104EB; # OSAGE CAPITAL LETTER OIN
+104C4; C; 104EC; # OSAGE CAPITAL LETTER PA
+104C5; C; 104ED; # OSAGE CAPITAL LETTER EHPA
+104C6; C; 104EE; # OSAGE CAPITAL LETTER SA
+104C7; C; 104EF; # OSAGE CAPITAL LETTER SHA
+104C8; C; 104F0; # OSAGE CAPITAL LETTER TA
+104C9; C; 104F1; # OSAGE CAPITAL LETTER EHTA
+104CA; C; 104F2; # OSAGE CAPITAL LETTER TSA
+104CB; C; 104F3; # OSAGE CAPITAL LETTER EHTSA
+104CC; C; 104F4; # OSAGE CAPITAL LETTER TSHA
+104CD; C; 104F5; # OSAGE CAPITAL LETTER DHA
+104CE; C; 104F6; # OSAGE CAPITAL LETTER U
+104CF; C; 104F7; # OSAGE CAPITAL LETTER WA
+104D0; C; 104F8; # OSAGE CAPITAL LETTER KHA
+104D1; C; 104F9; # OSAGE CAPITAL LETTER GHA
+104D2; C; 104FA; # OSAGE CAPITAL LETTER ZA
+104D3; C; 104FB; # OSAGE CAPITAL LETTER ZHA
+10C80; C; 10CC0; # OLD HUNGARIAN CAPITAL LETTER A
+10C81; C; 10CC1; # OLD HUNGARIAN CAPITAL LETTER AA
+10C82; C; 10CC2; # OLD HUNGARIAN CAPITAL LETTER EB
+10C83; C; 10CC3; # OLD HUNGARIAN CAPITAL LETTER AMB
+10C84; C; 10CC4; # OLD HUNGARIAN CAPITAL LETTER EC
+10C85; C; 10CC5; # OLD HUNGARIAN CAPITAL LETTER ENC
+10C86; C; 10CC6; # OLD HUNGARIAN CAPITAL LETTER ECS
+10C87; C; 10CC7; # OLD HUNGARIAN CAPITAL LETTER ED
+10C88; C; 10CC8; # OLD HUNGARIAN CAPITAL LETTER AND
+10C89; C; 10CC9; # OLD HUNGARIAN CAPITAL LETTER E
+10C8A; C; 10CCA; # OLD HUNGARIAN CAPITAL LETTER CLOSE E
+10C8B; C; 10CCB; # OLD HUNGARIAN CAPITAL LETTER EE
+10C8C; C; 10CCC; # OLD HUNGARIAN CAPITAL LETTER EF
+10C8D; C; 10CCD; # OLD HUNGARIAN CAPITAL LETTER EG
+10C8E; C; 10CCE; # OLD HUNGARIAN CAPITAL LETTER EGY
+10C8F; C; 10CCF; # OLD HUNGARIAN CAPITAL LETTER EH
+10C90; C; 10CD0; # OLD HUNGARIAN CAPITAL LETTER I
+10C91; C; 10CD1; # OLD HUNGARIAN CAPITAL LETTER II
+10C92; C; 10CD2; # OLD HUNGARIAN CAPITAL LETTER EJ
+10C93; C; 10CD3; # OLD HUNGARIAN CAPITAL LETTER EK
+10C94; C; 10CD4; # OLD HUNGARIAN CAPITAL LETTER AK
+10C95; C; 10CD5; # OLD HUNGARIAN CAPITAL LETTER UNK
+10C96; C; 10CD6; # OLD HUNGARIAN CAPITAL LETTER EL
+10C97; C; 10CD7; # OLD HUNGARIAN CAPITAL LETTER ELY
+10C98; C; 10CD8; # OLD HUNGARIAN CAPITAL LETTER EM
+10C99; C; 10CD9; # OLD HUNGARIAN CAPITAL LETTER EN
+10C9A; C; 10CDA; # OLD HUNGARIAN CAPITAL LETTER ENY
+10C9B; C; 10CDB; # OLD HUNGARIAN CAPITAL LETTER O
+10C9C; C; 10CDC; # OLD HUNGARIAN CAPITAL LETTER OO
+10C9D; C; 10CDD; # OLD HUNGARIAN CAPITAL LETTER NIKOLSBURG OE
+10C9E; C; 10CDE; # OLD HUNGARIAN CAPITAL LETTER RUDIMENTA OE
+10C9F; C; 10CDF; # OLD HUNGARIAN CAPITAL LETTER OEE
+10CA0; C; 10CE0; # OLD HUNGARIAN CAPITAL LETTER EP
+10CA1; C; 10CE1; # OLD HUNGARIAN CAPITAL LETTER EMP
+10CA2; C; 10CE2; # OLD HUNGARIAN CAPITAL LETTER ER
+10CA3; C; 10CE3; # OLD HUNGARIAN CAPITAL LETTER SHORT ER
+10CA4; C; 10CE4; # OLD HUNGARIAN CAPITAL LETTER ES
+10CA5; C; 10CE5; # OLD HUNGARIAN CAPITAL LETTER ESZ
+10CA6; C; 10CE6; # OLD HUNGARIAN CAPITAL LETTER ET
+10CA7; C; 10CE7; # OLD HUNGARIAN CAPITAL LETTER ENT
+10CA8; C; 10CE8; # OLD HUNGARIAN CAPITAL LETTER ETY
+10CA9; C; 10CE9; # OLD HUNGARIAN CAPITAL LETTER ECH
+10CAA; C; 10CEA; # OLD HUNGARIAN CAPITAL LETTER U
+10CAB; C; 10CEB; # OLD HUNGARIAN CAPITAL LETTER UU
+10CAC; C; 10CEC; # OLD HUNGARIAN CAPITAL LETTER NIKOLSBURG UE
+10CAD; C; 10CED; # OLD HUNGARIAN CAPITAL LETTER RUDIMENTA UE
+10CAE; C; 10CEE; # OLD HUNGARIAN CAPITAL LETTER EV
+10CAF; C; 10CEF; # OLD HUNGARIAN CAPITAL LETTER EZ
+10CB0; C; 10CF0; # OLD HUNGARIAN CAPITAL LETTER EZS
+10CB1; C; 10CF1; # OLD HUNGARIAN CAPITAL LETTER ENT-SHAPED SIGN
+10CB2; C; 10CF2; # OLD HUNGARIAN CAPITAL LETTER US
+118A0; C; 118C0; # WARANG CITI CAPITAL LETTER NGAA
+118A1; C; 118C1; # WARANG CITI CAPITAL LETTER A
+118A2; C; 118C2; # WARANG CITI CAPITAL LETTER WI
+118A3; C; 118C3; # WARANG CITI CAPITAL LETTER YU
+118A4; C; 118C4; # WARANG CITI CAPITAL LETTER YA
+118A5; C; 118C5; # WARANG CITI CAPITAL LETTER YO
+118A6; C; 118C6; # WARANG CITI CAPITAL LETTER II
+118A7; C; 118C7; # WARANG CITI CAPITAL LETTER UU
+118A8; C; 118C8; # WARANG CITI CAPITAL LETTER E
+118A9; C; 118C9; # WARANG CITI CAPITAL LETTER O
+118AA; C; 118CA; # WARANG CITI CAPITAL LETTER ANG
+118AB; C; 118CB; # WARANG CITI CAPITAL LETTER GA
+118AC; C; 118CC; # WARANG CITI CAPITAL LETTER KO
+118AD; C; 118CD; # WARANG CITI CAPITAL LETTER ENY
+118AE; C; 118CE; # WARANG CITI CAPITAL LETTER YUJ
+118AF; C; 118CF; # WARANG CITI CAPITAL LETTER UC
+118B0; C; 118D0; # WARANG CITI CAPITAL LETTER ENN
+118B1; C; 118D1; # WARANG CITI CAPITAL LETTER ODD
+118B2; C; 118D2; # WARANG CITI CAPITAL LETTER TTE
+118B3; C; 118D3; # WARANG CITI CAPITAL LETTER NUNG
+118B4; C; 118D4; # WARANG CITI CAPITAL LETTER DA
+118B5; C; 118D5; # WARANG CITI CAPITAL LETTER AT
+118B6; C; 118D6; # WARANG CITI CAPITAL LETTER AM
+118B7; C; 118D7; # WARANG CITI CAPITAL LETTER BU
+118B8; C; 118D8; # WARANG CITI CAPITAL LETTER PU
+118B9; C; 118D9; # WARANG CITI CAPITAL LETTER HIYO
+118BA; C; 118DA; # WARANG CITI CAPITAL LETTER HOLO
+118BB; C; 118DB; # WARANG CITI CAPITAL LETTER HORR
+118BC; C; 118DC; # WARANG CITI CAPITAL LETTER HAR
+118BD; C; 118DD; # WARANG CITI CAPITAL LETTER SSUU
+118BE; C; 118DE; # WARANG CITI CAPITAL LETTER SII
+118BF; C; 118DF; # WARANG CITI CAPITAL LETTER VIYO
+16E40; C; 16E60; # MEDEFAIDRIN CAPITAL LETTER M
+16E41; C; 16E61; # MEDEFAIDRIN CAPITAL LETTER S
+16E42; C; 16E62; # MEDEFAIDRIN CAPITAL LETTER V
+16E43; C; 16E63; # MEDEFAIDRIN CAPITAL LETTER W
+16E44; C; 16E64; # MEDEFAIDRIN CAPITAL LETTER ATIU
+16E45; C; 16E65; # MEDEFAIDRIN CAPITAL LETTER Z
+16E46; C; 16E66; # MEDEFAIDRIN CAPITAL LETTER KP
+16E47; C; 16E67; # MEDEFAIDRIN CAPITAL LETTER P
+16E48; C; 16E68; # MEDEFAIDRIN CAPITAL LETTER T
+16E49; C; 16E69; # MEDEFAIDRIN CAPITAL LETTER G
+16E4A; C; 16E6A; # MEDEFAIDRIN CAPITAL LETTER F
+16E4B; C; 16E6B; # MEDEFAIDRIN CAPITAL LETTER I
+16E4C; C; 16E6C; # MEDEFAIDRIN CAPITAL LETTER K
+16E4D; C; 16E6D; # MEDEFAIDRIN CAPITAL LETTER A
+16E4E; C; 16E6E; # MEDEFAIDRIN CAPITAL LETTER J
+16E4F; C; 16E6F; # MEDEFAIDRIN CAPITAL LETTER E
+16E50; C; 16E70; # MEDEFAIDRIN CAPITAL LETTER B
+16E51; C; 16E71; # MEDEFAIDRIN CAPITAL LETTER C
+16E52; C; 16E72; # MEDEFAIDRIN CAPITAL LETTER U
+16E53; C; 16E73; # MEDEFAIDRIN CAPITAL LETTER YU
+16E54; C; 16E74; # MEDEFAIDRIN CAPITAL LETTER L
+16E55; C; 16E75; # MEDEFAIDRIN CAPITAL LETTER Q
+16E56; C; 16E76; # MEDEFAIDRIN CAPITAL LETTER HP
+16E57; C; 16E77; # MEDEFAIDRIN CAPITAL LETTER NY
+16E58; C; 16E78; # MEDEFAIDRIN CAPITAL LETTER X
+16E59; C; 16E79; # MEDEFAIDRIN CAPITAL LETTER D
+16E5A; C; 16E7A; # MEDEFAIDRIN CAPITAL LETTER OE
+16E5B; C; 16E7B; # MEDEFAIDRIN CAPITAL LETTER N
+16E5C; C; 16E7C; # MEDEFAIDRIN CAPITAL LETTER R
+16E5D; C; 16E7D; # MEDEFAIDRIN CAPITAL LETTER O
+16E5E; C; 16E7E; # MEDEFAIDRIN CAPITAL LETTER AI
+16E5F; C; 16E7F; # MEDEFAIDRIN CAPITAL LETTER Y
+1E900; C; 1E922; # ADLAM CAPITAL LETTER ALIF
+1E901; C; 1E923; # ADLAM CAPITAL LETTER DAALI
+1E902; C; 1E924; # ADLAM CAPITAL LETTER LAAM
+1E903; C; 1E925; # ADLAM CAPITAL LETTER MIIM
+1E904; C; 1E926; # ADLAM CAPITAL LETTER BA
+1E905; C; 1E927; # ADLAM CAPITAL LETTER SINNYIIYHE
+1E906; C; 1E928; # ADLAM CAPITAL LETTER PE
+1E907; C; 1E929; # ADLAM CAPITAL LETTER BHE
+1E908; C; 1E92A; # ADLAM CAPITAL LETTER RA
+1E909; C; 1E92B; # ADLAM CAPITAL LETTER E
+1E90A; C; 1E92C; # ADLAM CAPITAL LETTER FA
+1E90B; C; 1E92D; # ADLAM CAPITAL LETTER I
+1E90C; C; 1E92E; # ADLAM CAPITAL LETTER O
+1E90D; C; 1E92F; # ADLAM CAPITAL LETTER DHA
+1E90E; C; 1E930; # ADLAM CAPITAL LETTER YHE
+1E90F; C; 1E931; # ADLAM CAPITAL LETTER WAW
+1E910; C; 1E932; # ADLAM CAPITAL LETTER NUN
+1E911; C; 1E933; # ADLAM CAPITAL LETTER KAF
+1E912; C; 1E934; # ADLAM CAPITAL LETTER YA
+1E913; C; 1E935; # ADLAM CAPITAL LETTER U
+1E914; C; 1E936; # ADLAM CAPITAL LETTER JIIM
+1E915; C; 1E937; # ADLAM CAPITAL LETTER CHI
+1E916; C; 1E938; # ADLAM CAPITAL LETTER HA
+1E917; C; 1E939; # ADLAM CAPITAL LETTER QAAF
+1E918; C; 1E93A; # ADLAM CAPITAL LETTER GA
+1E919; C; 1E93B; # ADLAM CAPITAL LETTER NYA
+1E91A; C; 1E93C; # ADLAM CAPITAL LETTER TU
+1E91B; C; 1E93D; # ADLAM CAPITAL LETTER NHA
+1E91C; C; 1E93E; # ADLAM CAPITAL LETTER VA
+1E91D; C; 1E93F; # ADLAM CAPITAL LETTER KHA
+1E91E; C; 1E940; # ADLAM CAPITAL LETTER GBE
+1E91F; C; 1E941; # ADLAM CAPITAL LETTER ZAL
+1E920; C; 1E942; # ADLAM CAPITAL LETTER KPO
+1E921; C; 1E943; # ADLAM CAPITAL LETTER SHA
+#
+# EOF
diff --git a/scripts/unicode/DerivedGeneralCategory.txt b/scripts/unicode/DerivedGeneralCategory.txt
new file mode 100644
index 0000000..21a66ee
--- /dev/null
+++ b/scripts/unicode/DerivedGeneralCategory.txt
@@ -0,0 +1,4045 @@
+# DerivedGeneralCategory-12.1.0.txt
+# Date: 2019-03-10, 10:53:08 GMT
+# © 2019 Unicode®, Inc.
+# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
+# For terms of use, see http://www.unicode.org/terms_of_use.html
+#
+# Unicode Character Database
+# For documentation, see http://www.unicode.org/reports/tr44/
+
+# ================================================
+
+# Property: General_Category
+
+# ================================================
+
+# General_Category=Unassigned
+
+0378..0379 ; Cn # [2] <reserved-0378>..<reserved-0379>
+0380..0383 ; Cn # [4] <reserved-0380>..<reserved-0383>
+038B ; Cn # <reserved-038B>
+038D ; Cn # <reserved-038D>
+03A2 ; Cn # <reserved-03A2>
+0530 ; Cn # <reserved-0530>
+0557..0558 ; Cn # [2] <reserved-0557>..<reserved-0558>
+058B..058C ; Cn # [2] <reserved-058B>..<reserved-058C>
+0590 ; Cn # <reserved-0590>
+05C8..05CF ; Cn # [8] <reserved-05C8>..<reserved-05CF>
+05EB..05EE ; Cn # [4] <reserved-05EB>..<reserved-05EE>
+05F5..05FF ; Cn # [11] <reserved-05F5>..<reserved-05FF>
+061D ; Cn # <reserved-061D>
+070E ; Cn # <reserved-070E>
+074B..074C ; Cn # [2] <reserved-074B>..<reserved-074C>
+07B2..07BF ; Cn # [14] <reserved-07B2>..<reserved-07BF>
+07FB..07FC ; Cn # [2] <reserved-07FB>..<reserved-07FC>
+082E..082F ; Cn # [2] <reserved-082E>..<reserved-082F>
+083F ; Cn # <reserved-083F>
+085C..085D ; Cn # [2] <reserved-085C>..<reserved-085D>
+085F ; Cn # <reserved-085F>
+086B..089F ; Cn # [53] <reserved-086B>..<reserved-089F>
+08B5 ; Cn # <reserved-08B5>
+08BE..08D2 ; Cn # [21] <reserved-08BE>..<reserved-08D2>
+0984 ; Cn # <reserved-0984>
+098D..098E ; Cn # [2] <reserved-098D>..<reserved-098E>
+0991..0992 ; Cn # [2] <reserved-0991>..<reserved-0992>
+09A9 ; Cn # <reserved-09A9>
+09B1 ; Cn # <reserved-09B1>
+09B3..09B5 ; Cn # [3] <reserved-09B3>..<reserved-09B5>
+09BA..09BB ; Cn # [2] <reserved-09BA>..<reserved-09BB>
+09C5..09C6 ; Cn # [2] <reserved-09C5>..<reserved-09C6>
+09C9..09CA ; Cn # [2] <reserved-09C9>..<reserved-09CA>
+09CF..09D6 ; Cn # [8] <reserved-09CF>..<reserved-09D6>
+09D8..09DB ; Cn # [4] <reserved-09D8>..<reserved-09DB>
+09DE ; Cn # <reserved-09DE>
+09E4..09E5 ; Cn # [2] <reserved-09E4>..<reserved-09E5>
+09FF..0A00 ; Cn # [2] <reserved-09FF>..<reserved-0A00>
+0A04 ; Cn # <reserved-0A04>
+0A0B..0A0E ; Cn # [4] <reserved-0A0B>..<reserved-0A0E>
+0A11..0A12 ; Cn # [2] <reserved-0A11>..<reserved-0A12>
+0A29 ; Cn # <reserved-0A29>
+0A31 ; Cn # <reserved-0A31>
+0A34 ; Cn # <reserved-0A34>
+0A37 ; Cn # <reserved-0A37>
+0A3A..0A3B ; Cn # [2] <reserved-0A3A>..<reserved-0A3B>
+0A3D ; Cn # <reserved-0A3D>
+0A43..0A46 ; Cn # [4] <reserved-0A43>..<reserved-0A46>
+0A49..0A4A ; Cn # [2] <reserved-0A49>..<reserved-0A4A>
+0A4E..0A50 ; Cn # [3] <reserved-0A4E>..<reserved-0A50>
+0A52..0A58 ; Cn # [7] <reserved-0A52>..<reserved-0A58>
+0A5D ; Cn # <reserved-0A5D>
+0A5F..0A65 ; Cn # [7] <reserved-0A5F>..<reserved-0A65>
+0A77..0A80 ; Cn # [10] <reserved-0A77>..<reserved-0A80>
+0A84 ; Cn # <reserved-0A84>
+0A8E ; Cn # <reserved-0A8E>
+0A92 ; Cn # <reserved-0A92>
+0AA9 ; Cn # <reserved-0AA9>
+0AB1 ; Cn # <reserved-0AB1>
+0AB4 ; Cn # <reserved-0AB4>
+0ABA..0ABB ; Cn # [2] <reserved-0ABA>..<reserved-0ABB>
+0AC6 ; Cn # <reserved-0AC6>
+0ACA ; Cn # <reserved-0ACA>
+0ACE..0ACF ; Cn # [2] <reserved-0ACE>..<reserved-0ACF>
+0AD1..0ADF ; Cn # [15] <reserved-0AD1>..<reserved-0ADF>
+0AE4..0AE5 ; Cn # [2] <reserved-0AE4>..<reserved-0AE5>
+0AF2..0AF8 ; Cn # [7] <reserved-0AF2>..<reserved-0AF8>
+0B00 ; Cn # <reserved-0B00>
+0B04 ; Cn # <reserved-0B04>
+0B0D..0B0E ; Cn # [2] <reserved-0B0D>..<reserved-0B0E>
+0B11..0B12 ; Cn # [2] <reserved-0B11>..<reserved-0B12>
+0B29 ; Cn # <reserved-0B29>
+0B31 ; Cn # <reserved-0B31>
+0B34 ; Cn # <reserved-0B34>
+0B3A..0B3B ; Cn # [2] <reserved-0B3A>..<reserved-0B3B>
+0B45..0B46 ; Cn # [2] <reserved-0B45>..<reserved-0B46>
+0B49..0B4A ; Cn # [2] <reserved-0B49>..<reserved-0B4A>
+0B4E..0B55 ; Cn # [8] <reserved-0B4E>..<reserved-0B55>
+0B58..0B5B ; Cn # [4] <reserved-0B58>..<reserved-0B5B>
+0B5E ; Cn # <reserved-0B5E>
+0B64..0B65 ; Cn # [2] <reserved-0B64>..<reserved-0B65>
+0B78..0B81 ; Cn # [10] <reserved-0B78>..<reserved-0B81>
+0B84 ; Cn # <reserved-0B84>
+0B8B..0B8D ; Cn # [3] <reserved-0B8B>..<reserved-0B8D>
+0B91 ; Cn # <reserved-0B91>
+0B96..0B98 ; Cn # [3] <reserved-0B96>..<reserved-0B98>
+0B9B ; Cn # <reserved-0B9B>
+0B9D ; Cn # <reserved-0B9D>
+0BA0..0BA2 ; Cn # [3] <reserved-0BA0>..<reserved-0BA2>
+0BA5..0BA7 ; Cn # [3] <reserved-0BA5>..<reserved-0BA7>
+0BAB..0BAD ; Cn # [3] <reserved-0BAB>..<reserved-0BAD>
+0BBA..0BBD ; Cn # [4] <reserved-0BBA>..<reserved-0BBD>
+0BC3..0BC5 ; Cn # [3] <reserved-0BC3>..<reserved-0BC5>
+0BC9 ; Cn # <reserved-0BC9>
+0BCE..0BCF ; Cn # [2] <reserved-0BCE>..<reserved-0BCF>
+0BD1..0BD6 ; Cn # [6] <reserved-0BD1>..<reserved-0BD6>
+0BD8..0BE5 ; Cn # [14] <reserved-0BD8>..<reserved-0BE5>
+0BFB..0BFF ; Cn # [5] <reserved-0BFB>..<reserved-0BFF>
+0C0D ; Cn # <reserved-0C0D>
+0C11 ; Cn # <reserved-0C11>
+0C29 ; Cn # <reserved-0C29>
+0C3A..0C3C ; Cn # [3] <reserved-0C3A>..<reserved-0C3C>
+0C45 ; Cn # <reserved-0C45>
+0C49 ; Cn # <reserved-0C49>
+0C4E..0C54 ; Cn # [7] <reserved-0C4E>..<reserved-0C54>
+0C57 ; Cn # <reserved-0C57>
+0C5B..0C5F ; Cn # [5] <reserved-0C5B>..<reserved-0C5F>
+0C64..0C65 ; Cn # [2] <reserved-0C64>..<reserved-0C65>
+0C70..0C76 ; Cn # [7] <reserved-0C70>..<reserved-0C76>
+0C8D ; Cn # <reserved-0C8D>
+0C91 ; Cn # <reserved-0C91>
+0CA9 ; Cn # <reserved-0CA9>
+0CB4 ; Cn # <reserved-0CB4>
+0CBA..0CBB ; Cn # [2] <reserved-0CBA>..<reserved-0CBB>
+0CC5 ; Cn # <reserved-0CC5>
+0CC9 ; Cn # <reserved-0CC9>
+0CCE..0CD4 ; Cn # [7] <reserved-0CCE>..<reserved-0CD4>
+0CD7..0CDD ; Cn # [7] <reserved-0CD7>..<reserved-0CDD>
+0CDF ; Cn # <reserved-0CDF>
+0CE4..0CE5 ; Cn # [2] <reserved-0CE4>..<reserved-0CE5>
+0CF0 ; Cn # <reserved-0CF0>
+0CF3..0CFF ; Cn # [13] <reserved-0CF3>..<reserved-0CFF>
+0D04 ; Cn # <reserved-0D04>
+0D0D ; Cn # <reserved-0D0D>
+0D11 ; Cn # <reserved-0D11>
+0D45 ; Cn # <reserved-0D45>
+0D49 ; Cn # <reserved-0D49>
+0D50..0D53 ; Cn # [4] <reserved-0D50>..<reserved-0D53>
+0D64..0D65 ; Cn # [2] <reserved-0D64>..<reserved-0D65>
+0D80..0D81 ; Cn # [2] <reserved-0D80>..<reserved-0D81>
+0D84 ; Cn # <reserved-0D84>
+0D97..0D99 ; Cn # [3] <reserved-0D97>..<reserved-0D99>
+0DB2 ; Cn # <reserved-0DB2>
+0DBC ; Cn # <reserved-0DBC>
+0DBE..0DBF ; Cn # [2] <reserved-0DBE>..<reserved-0DBF>
+0DC7..0DC9 ; Cn # [3] <reserved-0DC7>..<reserved-0DC9>
+0DCB..0DCE ; Cn # [4] <reserved-0DCB>..<reserved-0DCE>
+0DD5 ; Cn # <reserved-0DD5>
+0DD7 ; Cn # <reserved-0DD7>
+0DE0..0DE5 ; Cn # [6] <reserved-0DE0>..<reserved-0DE5>
+0DF0..0DF1 ; Cn # [2] <reserved-0DF0>..<reserved-0DF1>
+0DF5..0E00 ; Cn # [12] <reserved-0DF5>..<reserved-0E00>
+0E3B..0E3E ; Cn # [4] <reserved-0E3B>..<reserved-0E3E>
+0E5C..0E80 ; Cn # [37] <reserved-0E5C>..<reserved-0E80>
+0E83 ; Cn # <reserved-0E83>
+0E85 ; Cn # <reserved-0E85>
+0E8B ; Cn # <reserved-0E8B>
+0EA4 ; Cn # <reserved-0EA4>
+0EA6 ; Cn # <reserved-0EA6>
+0EBE..0EBF ; Cn # [2] <reserved-0EBE>..<reserved-0EBF>
+0EC5 ; Cn # <reserved-0EC5>
+0EC7 ; Cn # <reserved-0EC7>
+0ECE..0ECF ; Cn # [2] <reserved-0ECE>..<reserved-0ECF>
+0EDA..0EDB ; Cn # [2] <reserved-0EDA>..<reserved-0EDB>
+0EE0..0EFF ; Cn # [32] <reserved-0EE0>..<reserved-0EFF>
+0F48 ; Cn # <reserved-0F48>
+0F6D..0F70 ; Cn # [4] <reserved-0F6D>..<reserved-0F70>
+0F98 ; Cn # <reserved-0F98>
+0FBD ; Cn # <reserved-0FBD>
+0FCD ; Cn # <reserved-0FCD>
+0FDB..0FFF ; Cn # [37] <reserved-0FDB>..<reserved-0FFF>
+10C6 ; Cn # <reserved-10C6>
+10C8..10CC ; Cn # [5] <reserved-10C8>..<reserved-10CC>
+10CE..10CF ; Cn # [2] <reserved-10CE>..<reserved-10CF>
+1249 ; Cn # <reserved-1249>
+124E..124F ; Cn # [2] <reserved-124E>..<reserved-124F>
+1257 ; Cn # <reserved-1257>
+1259 ; Cn # <reserved-1259>
+125E..125F ; Cn # [2] <reserved-125E>..<reserved-125F>
+1289 ; Cn # <reserved-1289>
+128E..128F ; Cn # [2] <reserved-128E>..<reserved-128F>
+12B1 ; Cn # <reserved-12B1>
+12B6..12B7 ; Cn # [2] <reserved-12B6>..<reserved-12B7>
+12BF ; Cn # <reserved-12BF>
+12C1 ; Cn # <reserved-12C1>
+12C6..12C7 ; Cn # [2] <reserved-12C6>..<reserved-12C7>
+12D7 ; Cn # <reserved-12D7>
+1311 ; Cn # <reserved-1311>
+1316..1317 ; Cn # [2] <reserved-1316>..<reserved-1317>
+135B..135C ; Cn # [2] <reserved-135B>..<reserved-135C>
+137D..137F ; Cn # [3] <reserved-137D>..<reserved-137F>
+139A..139F ; Cn # [6] <reserved-139A>..<reserved-139F>
+13F6..13F7 ; Cn # [2] <reserved-13F6>..<reserved-13F7>
+13FE..13FF ; Cn # [2] <reserved-13FE>..<reserved-13FF>
+169D..169F ; Cn # [3] <reserved-169D>..<reserved-169F>
+16F9..16FF ; Cn # [7] <reserved-16F9>..<reserved-16FF>
+170D ; Cn # <reserved-170D>
+1715..171F ; Cn # [11] <reserved-1715>..<reserved-171F>
+1737..173F ; Cn # [9] <reserved-1737>..<reserved-173F>
+1754..175F ; Cn # [12] <reserved-1754>..<reserved-175F>
+176D ; Cn # <reserved-176D>
+1771 ; Cn # <reserved-1771>
+1774..177F ; Cn # [12] <reserved-1774>..<reserved-177F>
+17DE..17DF ; Cn # [2] <reserved-17DE>..<reserved-17DF>
+17EA..17EF ; Cn # [6] <reserved-17EA>..<reserved-17EF>
+17FA..17FF ; Cn # [6] <reserved-17FA>..<reserved-17FF>
+180F ; Cn # <reserved-180F>
+181A..181F ; Cn # [6] <reserved-181A>..<reserved-181F>
+1879..187F ; Cn # [7] <reserved-1879>..<reserved-187F>
+18AB..18AF ; Cn # [5] <reserved-18AB>..<reserved-18AF>
+18F6..18FF ; Cn # [10] <reserved-18F6>..<reserved-18FF>
+191F ; Cn # <reserved-191F>
+192C..192F ; Cn # [4] <reserved-192C>..<reserved-192F>
+193C..193F ; Cn # [4] <reserved-193C>..<reserved-193F>
+1941..1943 ; Cn # [3] <reserved-1941>..<reserved-1943>
+196E..196F ; Cn # [2] <reserved-196E>..<reserved-196F>
+1975..197F ; Cn # [11] <reserved-1975>..<reserved-197F>
+19AC..19AF ; Cn # [4] <reserved-19AC>..<reserved-19AF>
+19CA..19CF ; Cn # [6] <reserved-19CA>..<reserved-19CF>
+19DB..19DD ; Cn # [3] <reserved-19DB>..<reserved-19DD>
+1A1C..1A1D ; Cn # [2] <reserved-1A1C>..<reserved-1A1D>
+1A5F ; Cn # <reserved-1A5F>
+1A7D..1A7E ; Cn # [2] <reserved-1A7D>..<reserved-1A7E>
+1A8A..1A8F ; Cn # [6] <reserved-1A8A>..<reserved-1A8F>
+1A9A..1A9F ; Cn # [6] <reserved-1A9A>..<reserved-1A9F>
+1AAE..1AAF ; Cn # [2] <reserved-1AAE>..<reserved-1AAF>
+1ABF..1AFF ; Cn # [65] <reserved-1ABF>..<reserved-1AFF>
+1B4C..1B4F ; Cn # [4] <reserved-1B4C>..<reserved-1B4F>
+1B7D..1B7F ; Cn # [3] <reserved-1B7D>..<reserved-1B7F>
+1BF4..1BFB ; Cn # [8] <reserved-1BF4>..<reserved-1BFB>
+1C38..1C3A ; Cn # [3] <reserved-1C38>..<reserved-1C3A>
+1C4A..1C4C ; Cn # [3] <reserved-1C4A>..<reserved-1C4C>
+1C89..1C8F ; Cn # [7] <reserved-1C89>..<reserved-1C8F>
+1CBB..1CBC ; Cn # [2] <reserved-1CBB>..<reserved-1CBC>
+1CC8..1CCF ; Cn # [8] <reserved-1CC8>..<reserved-1CCF>
+1CFB..1CFF ; Cn # [5] <reserved-1CFB>..<reserved-1CFF>
+1DFA ; Cn # <reserved-1DFA>
+1F16..1F17 ; Cn # [2] <reserved-1F16>..<reserved-1F17>
+1F1E..1F1F ; Cn # [2] <reserved-1F1E>..<reserved-1F1F>
+1F46..1F47 ; Cn # [2] <reserved-1F46>..<reserved-1F47>
+1F4E..1F4F ; Cn # [2] <reserved-1F4E>..<reserved-1F4F>
+1F58 ; Cn # <reserved-1F58>
+1F5A ; Cn # <reserved-1F5A>
+1F5C ; Cn # <reserved-1F5C>
+1F5E ; Cn # <reserved-1F5E>
+1F7E..1F7F ; Cn # [2] <reserved-1F7E>..<reserved-1F7F>
+1FB5 ; Cn # <reserved-1FB5>
+1FC5 ; Cn # <reserved-1FC5>
+1FD4..1FD5 ; Cn # [2] <reserved-1FD4>..<reserved-1FD5>
+1FDC ; Cn # <reserved-1FDC>
+1FF0..1FF1 ; Cn # [2] <reserved-1FF0>..<reserved-1FF1>
+1FF5 ; Cn # <reserved-1FF5>
+1FFF ; Cn # <reserved-1FFF>
+2065 ; Cn # <reserved-2065>
+2072..2073 ; Cn # [2] <reserved-2072>..<reserved-2073>
+208F ; Cn # <reserved-208F>
+209D..209F ; Cn # [3] <reserved-209D>..<reserved-209F>
+20C0..20CF ; Cn # [16] <reserved-20C0>..<reserved-20CF>
+20F1..20FF ; Cn # [15] <reserved-20F1>..<reserved-20FF>
+218C..218F ; Cn # [4] <reserved-218C>..<reserved-218F>
+2427..243F ; Cn # [25] <reserved-2427>..<reserved-243F>
+244B..245F ; Cn # [21] <reserved-244B>..<reserved-245F>
+2B74..2B75 ; Cn # [2] <reserved-2B74>..<reserved-2B75>
+2B96..2B97 ; Cn # [2] <reserved-2B96>..<reserved-2B97>
+2C2F ; Cn # <reserved-2C2F>
+2C5F ; Cn # <reserved-2C5F>
+2CF4..2CF8 ; Cn # [5] <reserved-2CF4>..<reserved-2CF8>
+2D26 ; Cn # <reserved-2D26>
+2D28..2D2C ; Cn # [5] <reserved-2D28>..<reserved-2D2C>
+2D2E..2D2F ; Cn # [2] <reserved-2D2E>..<reserved-2D2F>
+2D68..2D6E ; Cn # [7] <reserved-2D68>..<reserved-2D6E>
+2D71..2D7E ; Cn # [14] <reserved-2D71>..<reserved-2D7E>
+2D97..2D9F ; Cn # [9] <reserved-2D97>..<reserved-2D9F>
+2DA7 ; Cn # <reserved-2DA7>
+2DAF ; Cn # <reserved-2DAF>
+2DB7 ; Cn # <reserved-2DB7>
+2DBF ; Cn # <reserved-2DBF>
+2DC7 ; Cn # <reserved-2DC7>
+2DCF ; Cn # <reserved-2DCF>
+2DD7 ; Cn # <reserved-2DD7>
+2DDF ; Cn # <reserved-2DDF>
+2E50..2E7F ; Cn # [48] <reserved-2E50>..<reserved-2E7F>
+2E9A ; Cn # <reserved-2E9A>
+2EF4..2EFF ; Cn # [12] <reserved-2EF4>..<reserved-2EFF>
+2FD6..2FEF ; Cn # [26] <reserved-2FD6>..<reserved-2FEF>
+2FFC..2FFF ; Cn # [4] <reserved-2FFC>..<reserved-2FFF>
+3040 ; Cn # <reserved-3040>
+3097..3098 ; Cn # [2] <reserved-3097>..<reserved-3098>
+3100..3104 ; Cn # [5] <reserved-3100>..<reserved-3104>
+3130 ; Cn # <reserved-3130>
+318F ; Cn # <reserved-318F>
+31BB..31BF ; Cn # [5] <reserved-31BB>..<reserved-31BF>
+31E4..31EF ; Cn # [12] <reserved-31E4>..<reserved-31EF>
+321F ; Cn # <reserved-321F>
+4DB6..4DBF ; Cn # [10] <reserved-4DB6>..<reserved-4DBF>
+9FF0..9FFF ; Cn # [16] <reserved-9FF0>..<reserved-9FFF>
+A48D..A48F ; Cn # [3] <reserved-A48D>..<reserved-A48F>
+A4C7..A4CF ; Cn # [9] <reserved-A4C7>..<reserved-A4CF>
+A62C..A63F ; Cn # [20] <reserved-A62C>..<reserved-A63F>
+A6F8..A6FF ; Cn # [8] <reserved-A6F8>..<reserved-A6FF>
+A7C0..A7C1 ; Cn # [2] <reserved-A7C0>..<reserved-A7C1>
+A7C7..A7F6 ; Cn # [48] <reserved-A7C7>..<reserved-A7F6>
+A82C..A82F ; Cn # [4] <reserved-A82C>..<reserved-A82F>
+A83A..A83F ; Cn # [6] <reserved-A83A>..<reserved-A83F>
+A878..A87F ; Cn # [8] <reserved-A878>..<reserved-A87F>
+A8C6..A8CD ; Cn # [8] <reserved-A8C6>..<reserved-A8CD>
+A8DA..A8DF ; Cn # [6] <reserved-A8DA>..<reserved-A8DF>
+A954..A95E ; Cn # [11] <reserved-A954>..<reserved-A95E>
+A97D..A97F ; Cn # [3] <reserved-A97D>..<reserved-A97F>
+A9CE ; Cn # <reserved-A9CE>
+A9DA..A9DD ; Cn # [4] <reserved-A9DA>..<reserved-A9DD>
+A9FF ; Cn # <reserved-A9FF>
+AA37..AA3F ; Cn # [9] <reserved-AA37>..<reserved-AA3F>
+AA4E..AA4F ; Cn # [2] <reserved-AA4E>..<reserved-AA4F>
+AA5A..AA5B ; Cn # [2] <reserved-AA5A>..<reserved-AA5B>
+AAC3..AADA ; Cn # [24] <reserved-AAC3>..<reserved-AADA>
+AAF7..AB00 ; Cn # [10] <reserved-AAF7>..<reserved-AB00>
+AB07..AB08 ; Cn # [2] <reserved-AB07>..<reserved-AB08>
+AB0F..AB10 ; Cn # [2] <reserved-AB0F>..<reserved-AB10>
+AB17..AB1F ; Cn # [9] <reserved-AB17>..<reserved-AB1F>
+AB27 ; Cn # <reserved-AB27>
+AB2F ; Cn # <reserved-AB2F>
+AB68..AB6F ; Cn # [8] <reserved-AB68>..<reserved-AB6F>
+ABEE..ABEF ; Cn # [2] <reserved-ABEE>..<reserved-ABEF>
+ABFA..ABFF ; Cn # [6] <reserved-ABFA>..<reserved-ABFF>
+D7A4..D7AF ; Cn # [12] <reserved-D7A4>..<reserved-D7AF>
+D7C7..D7CA ; Cn # [4] <reserved-D7C7>..<reserved-D7CA>
+D7FC..D7FF ; Cn # [4] <reserved-D7FC>..<reserved-D7FF>
+FA6E..FA6F ; Cn # [2] <reserved-FA6E>..<reserved-FA6F>
+FADA..FAFF ; Cn # [38] <reserved-FADA>..<reserved-FAFF>
+FB07..FB12 ; Cn # [12] <reserved-FB07>..<reserved-FB12>
+FB18..FB1C ; Cn # [5] <reserved-FB18>..<reserved-FB1C>
+FB37 ; Cn # <reserved-FB37>
+FB3D ; Cn # <reserved-FB3D>
+FB3F ; Cn # <reserved-FB3F>
+FB42 ; Cn # <reserved-FB42>
+FB45 ; Cn # <reserved-FB45>
+FBC2..FBD2 ; Cn # [17] <reserved-FBC2>..<reserved-FBD2>
+FD40..FD4F ; Cn # [16] <reserved-FD40>..<reserved-FD4F>
+FD90..FD91 ; Cn # [2] <reserved-FD90>..<reserved-FD91>
+FDC8..FDEF ; Cn # [40] <reserved-FDC8>..<noncharacter-FDEF>
+FDFE..FDFF ; Cn # [2] <reserved-FDFE>..<reserved-FDFF>
+FE1A..FE1F ; Cn # [6] <reserved-FE1A>..<reserved-FE1F>
+FE53 ; Cn # <reserved-FE53>
+FE67 ; Cn # <reserved-FE67>
+FE6C..FE6F ; Cn # [4] <reserved-FE6C>..<reserved-FE6F>
+FE75 ; Cn # <reserved-FE75>
+FEFD..FEFE ; Cn # [2] <reserved-FEFD>..<reserved-FEFE>
+FF00 ; Cn # <reserved-FF00>
+FFBF..FFC1 ; Cn # [3] <reserved-FFBF>..<reserved-FFC1>
+FFC8..FFC9 ; Cn # [2] <reserved-FFC8>..<reserved-FFC9>
+FFD0..FFD1 ; Cn # [2] <reserved-FFD0>..<reserved-FFD1>
+FFD8..FFD9 ; Cn # [2] <reserved-FFD8>..<reserved-FFD9>
+FFDD..FFDF ; Cn # [3] <reserved-FFDD>..<reserved-FFDF>
+FFE7 ; Cn # <reserved-FFE7>
+FFEF..FFF8 ; Cn # [10] <reserved-FFEF>..<reserved-FFF8>
+FFFE..FFFF ; Cn # [2] <noncharacter-FFFE>..<noncharacter-FFFF>
+1000C ; Cn # <reserved-1000C>
+10027 ; Cn # <reserved-10027>
+1003B ; Cn # <reserved-1003B>
+1003E ; Cn # <reserved-1003E>
+1004E..1004F ; Cn # [2] <reserved-1004E>..<reserved-1004F>
+1005E..1007F ; Cn # [34] <reserved-1005E>..<reserved-1007F>
+100FB..100FF ; Cn # [5] <reserved-100FB>..<reserved-100FF>
+10103..10106 ; Cn # [4] <reserved-10103>..<reserved-10106>
+10134..10136 ; Cn # [3] <reserved-10134>..<reserved-10136>
+1018F ; Cn # <reserved-1018F>
+1019C..1019F ; Cn # [4] <reserved-1019C>..<reserved-1019F>
+101A1..101CF ; Cn # [47] <reserved-101A1>..<reserved-101CF>
+101FE..1027F ; Cn # [130] <reserved-101FE>..<reserved-1027F>
+1029D..1029F ; Cn # [3] <reserved-1029D>..<reserved-1029F>
+102D1..102DF ; Cn # [15] <reserved-102D1>..<reserved-102DF>
+102FC..102FF ; Cn # [4] <reserved-102FC>..<reserved-102FF>
+10324..1032C ; Cn # [9] <reserved-10324>..<reserved-1032C>
+1034B..1034F ; Cn # [5] <reserved-1034B>..<reserved-1034F>
+1037B..1037F ; Cn # [5] <reserved-1037B>..<reserved-1037F>
+1039E ; Cn # <reserved-1039E>
+103C4..103C7 ; Cn # [4] <reserved-103C4>..<reserved-103C7>
+103D6..103FF ; Cn # [42] <reserved-103D6>..<reserved-103FF>
+1049E..1049F ; Cn # [2] <reserved-1049E>..<reserved-1049F>
+104AA..104AF ; Cn # [6] <reserved-104AA>..<reserved-104AF>
+104D4..104D7 ; Cn # [4] <reserved-104D4>..<reserved-104D7>
+104FC..104FF ; Cn # [4] <reserved-104FC>..<reserved-104FF>
+10528..1052F ; Cn # [8] <reserved-10528>..<reserved-1052F>
+10564..1056E ; Cn # [11] <reserved-10564>..<reserved-1056E>
+10570..105FF ; Cn # [144] <reserved-10570>..<reserved-105FF>
+10737..1073F ; Cn # [9] <reserved-10737>..<reserved-1073F>
+10756..1075F ; Cn # [10] <reserved-10756>..<reserved-1075F>
+10768..107FF ; Cn # [152] <reserved-10768>..<reserved-107FF>
+10806..10807 ; Cn # [2] <reserved-10806>..<reserved-10807>
+10809 ; Cn # <reserved-10809>
+10836 ; Cn # <reserved-10836>
+10839..1083B ; Cn # [3] <reserved-10839>..<reserved-1083B>
+1083D..1083E ; Cn # [2] <reserved-1083D>..<reserved-1083E>
+10856 ; Cn # <reserved-10856>
+1089F..108A6 ; Cn # [8] <reserved-1089F>..<reserved-108A6>
+108B0..108DF ; Cn # [48] <reserved-108B0>..<reserved-108DF>
+108F3 ; Cn # <reserved-108F3>
+108F6..108FA ; Cn # [5] <reserved-108F6>..<reserved-108FA>
+1091C..1091E ; Cn # [3] <reserved-1091C>..<reserved-1091E>
+1093A..1093E ; Cn # [5] <reserved-1093A>..<reserved-1093E>
+10940..1097F ; Cn # [64] <reserved-10940>..<reserved-1097F>
+109B8..109BB ; Cn # [4] <reserved-109B8>..<reserved-109BB>
+109D0..109D1 ; Cn # [2] <reserved-109D0>..<reserved-109D1>
+10A04 ; Cn # <reserved-10A04>
+10A07..10A0B ; Cn # [5] <reserved-10A07>..<reserved-10A0B>
+10A14 ; Cn # <reserved-10A14>
+10A18 ; Cn # <reserved-10A18>
+10A36..10A37 ; Cn # [2] <reserved-10A36>..<reserved-10A37>
+10A3B..10A3E ; Cn # [4] <reserved-10A3B>..<reserved-10A3E>
+10A49..10A4F ; Cn # [7] <reserved-10A49>..<reserved-10A4F>
+10A59..10A5F ; Cn # [7] <reserved-10A59>..<reserved-10A5F>
+10AA0..10ABF ; Cn # [32] <reserved-10AA0>..<reserved-10ABF>
+10AE7..10AEA ; Cn # [4] <reserved-10AE7>..<reserved-10AEA>
+10AF7..10AFF ; Cn # [9] <reserved-10AF7>..<reserved-10AFF>
+10B36..10B38 ; Cn # [3] <reserved-10B36>..<reserved-10B38>
+10B56..10B57 ; Cn # [2] <reserved-10B56>..<reserved-10B57>
+10B73..10B77 ; Cn # [5] <reserved-10B73>..<reserved-10B77>
+10B92..10B98 ; Cn # [7] <reserved-10B92>..<reserved-10B98>
+10B9D..10BA8 ; Cn # [12] <reserved-10B9D>..<reserved-10BA8>
+10BB0..10BFF ; Cn # [80] <reserved-10BB0>..<reserved-10BFF>
+10C49..10C7F ; Cn # [55] <reserved-10C49>..<reserved-10C7F>
+10CB3..10CBF ; Cn # [13] <reserved-10CB3>..<reserved-10CBF>
+10CF3..10CF9 ; Cn # [7] <reserved-10CF3>..<reserved-10CF9>
+10D28..10D2F ; Cn # [8] <reserved-10D28>..<reserved-10D2F>
+10D3A..10E5F ; Cn # [294] <reserved-10D3A>..<reserved-10E5F>
+10E7F..10EFF ; Cn # [129] <reserved-10E7F>..<reserved-10EFF>
+10F28..10F2F ; Cn # [8] <reserved-10F28>..<reserved-10F2F>
+10F5A..10FDF ; Cn # [134] <reserved-10F5A>..<reserved-10FDF>
+10FF7..10FFF ; Cn # [9] <reserved-10FF7>..<reserved-10FFF>
+1104E..11051 ; Cn # [4] <reserved-1104E>..<reserved-11051>
+11070..1107E ; Cn # [15] <reserved-11070>..<reserved-1107E>
+110C2..110CC ; Cn # [11] <reserved-110C2>..<reserved-110CC>
+110CE..110CF ; Cn # [2] <reserved-110CE>..<reserved-110CF>
+110E9..110EF ; Cn # [7] <reserved-110E9>..<reserved-110EF>
+110FA..110FF ; Cn # [6] <reserved-110FA>..<reserved-110FF>
+11135 ; Cn # <reserved-11135>
+11147..1114F ; Cn # [9] <reserved-11147>..<reserved-1114F>
+11177..1117F ; Cn # [9] <reserved-11177>..<reserved-1117F>
+111CE..111CF ; Cn # [2] <reserved-111CE>..<reserved-111CF>
+111E0 ; Cn # <reserved-111E0>
+111F5..111FF ; Cn # [11] <reserved-111F5>..<reserved-111FF>
+11212 ; Cn # <reserved-11212>
+1123F..1127F ; Cn # [65] <reserved-1123F>..<reserved-1127F>
+11287 ; Cn # <reserved-11287>
+11289 ; Cn # <reserved-11289>
+1128E ; Cn # <reserved-1128E>
+1129E ; Cn # <reserved-1129E>
+112AA..112AF ; Cn # [6] <reserved-112AA>..<reserved-112AF>
+112EB..112EF ; Cn # [5] <reserved-112EB>..<reserved-112EF>
+112FA..112FF ; Cn # [6] <reserved-112FA>..<reserved-112FF>
+11304 ; Cn # <reserved-11304>
+1130D..1130E ; Cn # [2] <reserved-1130D>..<reserved-1130E>
+11311..11312 ; Cn # [2] <reserved-11311>..<reserved-11312>
+11329 ; Cn # <reserved-11329>
+11331 ; Cn # <reserved-11331>
+11334 ; Cn # <reserved-11334>
+1133A ; Cn # <reserved-1133A>
+11345..11346 ; Cn # [2] <reserved-11345>..<reserved-11346>
+11349..1134A ; Cn # [2] <reserved-11349>..<reserved-1134A>
+1134E..1134F ; Cn # [2] <reserved-1134E>..<reserved-1134F>
+11351..11356 ; Cn # [6] <reserved-11351>..<reserved-11356>
+11358..1135C ; Cn # [5] <reserved-11358>..<reserved-1135C>
+11364..11365 ; Cn # [2] <reserved-11364>..<reserved-11365>
+1136D..1136F ; Cn # [3] <reserved-1136D>..<reserved-1136F>
+11375..113FF ; Cn # [139] <reserved-11375>..<reserved-113FF>
+1145A ; Cn # <reserved-1145A>
+1145C ; Cn # <reserved-1145C>
+11460..1147F ; Cn # [32] <reserved-11460>..<reserved-1147F>
+114C8..114CF ; Cn # [8] <reserved-114C8>..<reserved-114CF>
+114DA..1157F ; Cn # [166] <reserved-114DA>..<reserved-1157F>
+115B6..115B7 ; Cn # [2] <reserved-115B6>..<reserved-115B7>
+115DE..115FF ; Cn # [34] <reserved-115DE>..<reserved-115FF>
+11645..1164F ; Cn # [11] <reserved-11645>..<reserved-1164F>
+1165A..1165F ; Cn # [6] <reserved-1165A>..<reserved-1165F>
+1166D..1167F ; Cn # [19] <reserved-1166D>..<reserved-1167F>
+116B9..116BF ; Cn # [7] <reserved-116B9>..<reserved-116BF>
+116CA..116FF ; Cn # [54] <reserved-116CA>..<reserved-116FF>
+1171B..1171C ; Cn # [2] <reserved-1171B>..<reserved-1171C>
+1172C..1172F ; Cn # [4] <reserved-1172C>..<reserved-1172F>
+11740..117FF ; Cn # [192] <reserved-11740>..<reserved-117FF>
+1183C..1189F ; Cn # [100] <reserved-1183C>..<reserved-1189F>
+118F3..118FE ; Cn # [12] <reserved-118F3>..<reserved-118FE>
+11900..1199F ; Cn # [160] <reserved-11900>..<reserved-1199F>
+119A8..119A9 ; Cn # [2] <reserved-119A8>..<reserved-119A9>
+119D8..119D9 ; Cn # [2] <reserved-119D8>..<reserved-119D9>
+119E5..119FF ; Cn # [27] <reserved-119E5>..<reserved-119FF>
+11A48..11A4F ; Cn # [8] <reserved-11A48>..<reserved-11A4F>
+11AA3..11ABF ; Cn # [29] <reserved-11AA3>..<reserved-11ABF>
+11AF9..11BFF ; Cn # [263] <reserved-11AF9>..<reserved-11BFF>
+11C09 ; Cn # <reserved-11C09>
+11C37 ; Cn # <reserved-11C37>
+11C46..11C4F ; Cn # [10] <reserved-11C46>..<reserved-11C4F>
+11C6D..11C6F ; Cn # [3] <reserved-11C6D>..<reserved-11C6F>
+11C90..11C91 ; Cn # [2] <reserved-11C90>..<reserved-11C91>
+11CA8 ; Cn # <reserved-11CA8>
+11CB7..11CFF ; Cn # [73] <reserved-11CB7>..<reserved-11CFF>
+11D07 ; Cn # <reserved-11D07>
+11D0A ; Cn # <reserved-11D0A>
+11D37..11D39 ; Cn # [3] <reserved-11D37>..<reserved-11D39>
+11D3B ; Cn # <reserved-11D3B>
+11D3E ; Cn # <reserved-11D3E>
+11D48..11D4F ; Cn # [8] <reserved-11D48>..<reserved-11D4F>
+11D5A..11D5F ; Cn # [6] <reserved-11D5A>..<reserved-11D5F>
+11D66 ; Cn # <reserved-11D66>
+11D69 ; Cn # <reserved-11D69>
+11D8F ; Cn # <reserved-11D8F>
+11D92 ; Cn # <reserved-11D92>
+11D99..11D9F ; Cn # [7] <reserved-11D99>..<reserved-11D9F>
+11DAA..11EDF ; Cn # [310] <reserved-11DAA>..<reserved-11EDF>
+11EF9..11FBF ; Cn # [199] <reserved-11EF9>..<reserved-11FBF>
+11FF2..11FFE ; Cn # [13] <reserved-11FF2>..<reserved-11FFE>
+1239A..123FF ; Cn # [102] <reserved-1239A>..<reserved-123FF>
+1246F ; Cn # <reserved-1246F>
+12475..1247F ; Cn # [11] <reserved-12475>..<reserved-1247F>
+12544..12FFF ; Cn # [2748] <reserved-12544>..<reserved-12FFF>
+1342F ; Cn # <reserved-1342F>
+13439..143FF ; Cn # [4039] <reserved-13439>..<reserved-143FF>
+14647..167FF ; Cn # [8633] <reserved-14647>..<reserved-167FF>
+16A39..16A3F ; Cn # [7] <reserved-16A39>..<reserved-16A3F>
+16A5F ; Cn # <reserved-16A5F>
+16A6A..16A6D ; Cn # [4] <reserved-16A6A>..<reserved-16A6D>
+16A70..16ACF ; Cn # [96] <reserved-16A70>..<reserved-16ACF>
+16AEE..16AEF ; Cn # [2] <reserved-16AEE>..<reserved-16AEF>
+16AF6..16AFF ; Cn # [10] <reserved-16AF6>..<reserved-16AFF>
+16B46..16B4F ; Cn # [10] <reserved-16B46>..<reserved-16B4F>
+16B5A ; Cn # <reserved-16B5A>
+16B62 ; Cn # <reserved-16B62>
+16B78..16B7C ; Cn # [5] <reserved-16B78>..<reserved-16B7C>
+16B90..16E3F ; Cn # [688] <reserved-16B90>..<reserved-16E3F>
+16E9B..16EFF ; Cn # [101] <reserved-16E9B>..<reserved-16EFF>
+16F4B..16F4E ; Cn # [4] <reserved-16F4B>..<reserved-16F4E>
+16F88..16F8E ; Cn # [7] <reserved-16F88>..<reserved-16F8E>
+16FA0..16FDF ; Cn # [64] <reserved-16FA0>..<reserved-16FDF>
+16FE4..16FFF ; Cn # [28] <reserved-16FE4>..<reserved-16FFF>
+187F8..187FF ; Cn # [8] <reserved-187F8>..<reserved-187FF>
+18AF3..1AFFF ; Cn # [9485] <reserved-18AF3>..<reserved-1AFFF>
+1B11F..1B14F ; Cn # [49] <reserved-1B11F>..<reserved-1B14F>
+1B153..1B163 ; Cn # [17] <reserved-1B153>..<reserved-1B163>
+1B168..1B16F ; Cn # [8] <reserved-1B168>..<reserved-1B16F>
+1B2FC..1BBFF ; Cn # [2308] <reserved-1B2FC>..<reserved-1BBFF>
+1BC6B..1BC6F ; Cn # [5] <reserved-1BC6B>..<reserved-1BC6F>
+1BC7D..1BC7F ; Cn # [3] <reserved-1BC7D>..<reserved-1BC7F>
+1BC89..1BC8F ; Cn # [7] <reserved-1BC89>..<reserved-1BC8F>
+1BC9A..1BC9B ; Cn # [2] <reserved-1BC9A>..<reserved-1BC9B>
+1BCA4..1CFFF ; Cn # [4956] <reserved-1BCA4>..<reserved-1CFFF>
+1D0F6..1D0FF ; Cn # [10] <reserved-1D0F6>..<reserved-1D0FF>
+1D127..1D128 ; Cn # [2] <reserved-1D127>..<reserved-1D128>
+1D1E9..1D1FF ; Cn # [23] <reserved-1D1E9>..<reserved-1D1FF>
+1D246..1D2DF ; Cn # [154] <reserved-1D246>..<reserved-1D2DF>
+1D2F4..1D2FF ; Cn # [12] <reserved-1D2F4>..<reserved-1D2FF>
+1D357..1D35F ; Cn # [9] <reserved-1D357>..<reserved-1D35F>
+1D379..1D3FF ; Cn # [135] <reserved-1D379>..<reserved-1D3FF>
+1D455 ; Cn # <reserved-1D455>
+1D49D ; Cn # <reserved-1D49D>
+1D4A0..1D4A1 ; Cn # [2] <reserved-1D4A0>..<reserved-1D4A1>
+1D4A3..1D4A4 ; Cn # [2] <reserved-1D4A3>..<reserved-1D4A4>
+1D4A7..1D4A8 ; Cn # [2] <reserved-1D4A7>..<reserved-1D4A8>
+1D4AD ; Cn # <reserved-1D4AD>
+1D4BA ; Cn # <reserved-1D4BA>
+1D4BC ; Cn # <reserved-1D4BC>
+1D4C4 ; Cn # <reserved-1D4C4>
+1D506 ; Cn # <reserved-1D506>
+1D50B..1D50C ; Cn # [2] <reserved-1D50B>..<reserved-1D50C>
+1D515 ; Cn # <reserved-1D515>
+1D51D ; Cn # <reserved-1D51D>
+1D53A ; Cn # <reserved-1D53A>
+1D53F ; Cn # <reserved-1D53F>
+1D545 ; Cn # <reserved-1D545>
+1D547..1D549 ; Cn # [3] <reserved-1D547>..<reserved-1D549>
+1D551 ; Cn # <reserved-1D551>
+1D6A6..1D6A7 ; Cn # [2] <reserved-1D6A6>..<reserved-1D6A7>
+1D7CC..1D7CD ; Cn # [2] <reserved-1D7CC>..<reserved-1D7CD>
+1DA8C..1DA9A ; Cn # [15] <reserved-1DA8C>..<reserved-1DA9A>
+1DAA0 ; Cn # <reserved-1DAA0>
+1DAB0..1DFFF ; Cn # [1360] <reserved-1DAB0>..<reserved-1DFFF>
+1E007 ; Cn # <reserved-1E007>
+1E019..1E01A ; Cn # [2] <reserved-1E019>..<reserved-1E01A>
+1E022 ; Cn # <reserved-1E022>
+1E025 ; Cn # <reserved-1E025>
+1E02B..1E0FF ; Cn # [213] <reserved-1E02B>..<reserved-1E0FF>
+1E12D..1E12F ; Cn # [3] <reserved-1E12D>..<reserved-1E12F>
+1E13E..1E13F ; Cn # [2] <reserved-1E13E>..<reserved-1E13F>
+1E14A..1E14D ; Cn # [4] <reserved-1E14A>..<reserved-1E14D>
+1E150..1E2BF ; Cn # [368] <reserved-1E150>..<reserved-1E2BF>
+1E2FA..1E2FE ; Cn # [5] <reserved-1E2FA>..<reserved-1E2FE>
+1E300..1E7FF ; Cn # [1280] <reserved-1E300>..<reserved-1E7FF>
+1E8C5..1E8C6 ; Cn # [2] <reserved-1E8C5>..<reserved-1E8C6>
+1E8D7..1E8FF ; Cn # [41] <reserved-1E8D7>..<reserved-1E8FF>
+1E94C..1E94F ; Cn # [4] <reserved-1E94C>..<reserved-1E94F>
+1E95A..1E95D ; Cn # [4] <reserved-1E95A>..<reserved-1E95D>
+1E960..1EC70 ; Cn # [785] <reserved-1E960>..<reserved-1EC70>
+1ECB5..1ED00 ; Cn # [76] <reserved-1ECB5>..<reserved-1ED00>
+1ED3E..1EDFF ; Cn # [194] <reserved-1ED3E>..<reserved-1EDFF>
+1EE04 ; Cn # <reserved-1EE04>
+1EE20 ; Cn # <reserved-1EE20>
+1EE23 ; Cn # <reserved-1EE23>
+1EE25..1EE26 ; Cn # [2] <reserved-1EE25>..<reserved-1EE26>
+1EE28 ; Cn # <reserved-1EE28>
+1EE33 ; Cn # <reserved-1EE33>
+1EE38 ; Cn # <reserved-1EE38>
+1EE3A ; Cn # <reserved-1EE3A>
+1EE3C..1EE41 ; Cn # [6] <reserved-1EE3C>..<reserved-1EE41>
+1EE43..1EE46 ; Cn # [4] <reserved-1EE43>..<reserved-1EE46>
+1EE48 ; Cn # <reserved-1EE48>
+1EE4A ; Cn # <reserved-1EE4A>
+1EE4C ; Cn # <reserved-1EE4C>
+1EE50 ; Cn # <reserved-1EE50>
+1EE53 ; Cn # <reserved-1EE53>
+1EE55..1EE56 ; Cn # [2] <reserved-1EE55>..<reserved-1EE56>
+1EE58 ; Cn # <reserved-1EE58>
+1EE5A ; Cn # <reserved-1EE5A>
+1EE5C ; Cn # <reserved-1EE5C>
+1EE5E ; Cn # <reserved-1EE5E>
+1EE60 ; Cn # <reserved-1EE60>
+1EE63 ; Cn # <reserved-1EE63>
+1EE65..1EE66 ; Cn # [2] <reserved-1EE65>..<reserved-1EE66>
+1EE6B ; Cn # <reserved-1EE6B>
+1EE73 ; Cn # <reserved-1EE73>
+1EE78 ; Cn # <reserved-1EE78>
+1EE7D ; Cn # <reserved-1EE7D>
+1EE7F ; Cn # <reserved-1EE7F>
+1EE8A ; Cn # <reserved-1EE8A>
+1EE9C..1EEA0 ; Cn # [5] <reserved-1EE9C>..<reserved-1EEA0>
+1EEA4 ; Cn # <reserved-1EEA4>
+1EEAA ; Cn # <reserved-1EEAA>
+1EEBC..1EEEF ; Cn # [52] <reserved-1EEBC>..<reserved-1EEEF>
+1EEF2..1EFFF ; Cn # [270] <reserved-1EEF2>..<reserved-1EFFF>
+1F02C..1F02F ; Cn # [4] <reserved-1F02C>..<reserved-1F02F>
+1F094..1F09F ; Cn # [12] <reserved-1F094>..<reserved-1F09F>
+1F0AF..1F0B0 ; Cn # [2] <reserved-1F0AF>..<reserved-1F0B0>
+1F0C0 ; Cn # <reserved-1F0C0>
+1F0D0 ; Cn # <reserved-1F0D0>
+1F0F6..1F0FF ; Cn # [10] <reserved-1F0F6>..<reserved-1F0FF>
+1F10D..1F10F ; Cn # [3] <reserved-1F10D>..<reserved-1F10F>
+1F16D..1F16F ; Cn # [3] <reserved-1F16D>..<reserved-1F16F>
+1F1AD..1F1E5 ; Cn # [57] <reserved-1F1AD>..<reserved-1F1E5>
+1F203..1F20F ; Cn # [13] <reserved-1F203>..<reserved-1F20F>
+1F23C..1F23F ; Cn # [4] <reserved-1F23C>..<reserved-1F23F>
+1F249..1F24F ; Cn # [7] <reserved-1F249>..<reserved-1F24F>
+1F252..1F25F ; Cn # [14] <reserved-1F252>..<reserved-1F25F>
+1F266..1F2FF ; Cn # [154] <reserved-1F266>..<reserved-1F2FF>
+1F6D6..1F6DF ; Cn # [10] <reserved-1F6D6>..<reserved-1F6DF>
+1F6ED..1F6EF ; Cn # [3] <reserved-1F6ED>..<reserved-1F6EF>
+1F6FB..1F6FF ; Cn # [5] <reserved-1F6FB>..<reserved-1F6FF>
+1F774..1F77F ; Cn # [12] <reserved-1F774>..<reserved-1F77F>
+1F7D9..1F7DF ; Cn # [7] <reserved-1F7D9>..<reserved-1F7DF>
+1F7EC..1F7FF ; Cn # [20] <reserved-1F7EC>..<reserved-1F7FF>
+1F80C..1F80F ; Cn # [4] <reserved-1F80C>..<reserved-1F80F>
+1F848..1F84F ; Cn # [8] <reserved-1F848>..<reserved-1F84F>
+1F85A..1F85F ; Cn # [6] <reserved-1F85A>..<reserved-1F85F>
+1F888..1F88F ; Cn # [8] <reserved-1F888>..<reserved-1F88F>
+1F8AE..1F8FF ; Cn # [82] <reserved-1F8AE>..<reserved-1F8FF>
+1F90C ; Cn # <reserved-1F90C>
+1F972 ; Cn # <reserved-1F972>
+1F977..1F979 ; Cn # [3] <reserved-1F977>..<reserved-1F979>
+1F9A3..1F9A4 ; Cn # [2] <reserved-1F9A3>..<reserved-1F9A4>
+1F9AB..1F9AD ; Cn # [3] <reserved-1F9AB>..<reserved-1F9AD>
+1F9CB..1F9CC ; Cn # [2] <reserved-1F9CB>..<reserved-1F9CC>
+1FA54..1FA5F ; Cn # [12] <reserved-1FA54>..<reserved-1FA5F>
+1FA6E..1FA6F ; Cn # [2] <reserved-1FA6E>..<reserved-1FA6F>
+1FA74..1FA77 ; Cn # [4] <reserved-1FA74>..<reserved-1FA77>
+1FA7B..1FA7F ; Cn # [5] <reserved-1FA7B>..<reserved-1FA7F>
+1FA83..1FA8F ; Cn # [13] <reserved-1FA83>..<reserved-1FA8F>
+1FA96..1FFFF ; Cn # [1386] <reserved-1FA96>..<noncharacter-1FFFF>
+2A6D7..2A6FF ; Cn # [41] <reserved-2A6D7>..<reserved-2A6FF>
+2B735..2B73F ; Cn # [11] <reserved-2B735>..<reserved-2B73F>
+2B81E..2B81F ; Cn # [2] <reserved-2B81E>..<reserved-2B81F>
+2CEA2..2CEAF ; Cn # [14] <reserved-2CEA2>..<reserved-2CEAF>
+2EBE1..2F7FF ; Cn # [3103] <reserved-2EBE1>..<reserved-2F7FF>
+2FA1E..E0000 ; Cn # [722403] <reserved-2FA1E>..<reserved-E0000>
+E0002..E001F ; Cn # [30] <reserved-E0002>..<reserved-E001F>
+E0080..E00FF ; Cn # [128] <reserved-E0080>..<reserved-E00FF>
+E01F0..EFFFF ; Cn # [65040] <reserved-E01F0>..<noncharacter-EFFFF>
+FFFFE..FFFFF ; Cn # [2] <noncharacter-FFFFE>..<noncharacter-FFFFF>
+10FFFE..10FFFF; Cn # [2] <noncharacter-10FFFE>..<noncharacter-10FFFF>
+
+# Total code points: 836602
+
+# ================================================
+
+# General_Category=Uppercase_Letter
+
+0041..005A ; Lu # [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
+00C0..00D6 ; Lu # [23] LATIN CAPITAL LETTER A WITH GRAVE..LATIN CAPITAL LETTER O WITH DIAERESIS
+00D8..00DE ; Lu # [7] LATIN CAPITAL LETTER O WITH STROKE..LATIN CAPITAL LETTER THORN
+0100 ; Lu # LATIN CAPITAL LETTER A WITH MACRON
+0102 ; Lu # LATIN CAPITAL LETTER A WITH BREVE
+0104 ; Lu # LATIN CAPITAL LETTER A WITH OGONEK
+0106 ; Lu # LATIN CAPITAL LETTER C WITH ACUTE
+0108 ; Lu # LATIN CAPITAL LETTER C WITH CIRCUMFLEX
+010A ; Lu # LATIN CAPITAL LETTER C WITH DOT ABOVE
+010C ; Lu # LATIN CAPITAL LETTER C WITH CARON
+010E ; Lu # LATIN CAPITAL LETTER D WITH CARON
+0110 ; Lu # LATIN CAPITAL LETTER D WITH STROKE
+0112 ; Lu # LATIN CAPITAL LETTER E WITH MACRON
+0114 ; Lu # LATIN CAPITAL LETTER E WITH BREVE
+0116 ; Lu # LATIN CAPITAL LETTER E WITH DOT ABOVE
+0118 ; Lu # LATIN CAPITAL LETTER E WITH OGONEK
+011A ; Lu # LATIN CAPITAL LETTER E WITH CARON
+011C ; Lu # LATIN CAPITAL LETTER G WITH CIRCUMFLEX
+011E ; Lu # LATIN CAPITAL LETTER G WITH BREVE
+0120 ; Lu # LATIN CAPITAL LETTER G WITH DOT ABOVE
+0122 ; Lu # LATIN CAPITAL LETTER G WITH CEDILLA
+0124 ; Lu # LATIN CAPITAL LETTER H WITH CIRCUMFLEX
+0126 ; Lu # LATIN CAPITAL LETTER H WITH STROKE
+0128 ; Lu # LATIN CAPITAL LETTER I WITH TILDE
+012A ; Lu # LATIN CAPITAL LETTER I WITH MACRON
+012C ; Lu # LATIN CAPITAL LETTER I WITH BREVE
+012E ; Lu # LATIN CAPITAL LETTER I WITH OGONEK
+0130 ; Lu # LATIN CAPITAL LETTER I WITH DOT ABOVE
+0132 ; Lu # LATIN CAPITAL LIGATURE IJ
+0134 ; Lu # LATIN CAPITAL LETTER J WITH CIRCUMFLEX
+0136 ; Lu # LATIN CAPITAL LETTER K WITH CEDILLA
+0139 ; Lu # LATIN CAPITAL LETTER L WITH ACUTE
+013B ; Lu # LATIN CAPITAL LETTER L WITH CEDILLA
+013D ; Lu # LATIN CAPITAL LETTER L WITH CARON
+013F ; Lu # LATIN CAPITAL LETTER L WITH MIDDLE DOT
+0141 ; Lu # LATIN CAPITAL LETTER L WITH STROKE
+0143 ; Lu # LATIN CAPITAL LETTER N WITH ACUTE
+0145 ; Lu # LATIN CAPITAL LETTER N WITH CEDILLA
+0147 ; Lu # LATIN CAPITAL LETTER N WITH CARON
+014A ; Lu # LATIN CAPITAL LETTER ENG
+014C ; Lu # LATIN CAPITAL LETTER O WITH MACRON
+014E ; Lu # LATIN CAPITAL LETTER O WITH BREVE
+0150 ; Lu # LATIN CAPITAL LETTER O WITH DOUBLE ACUTE
+0152 ; Lu # LATIN CAPITAL LIGATURE OE
+0154 ; Lu # LATIN CAPITAL LETTER R WITH ACUTE
+0156 ; Lu # LATIN CAPITAL LETTER R WITH CEDILLA
+0158 ; Lu # LATIN CAPITAL LETTER R WITH CARON
+015A ; Lu # LATIN CAPITAL LETTER S WITH ACUTE
+015C ; Lu # LATIN CAPITAL LETTER S WITH CIRCUMFLEX
+015E ; Lu # LATIN CAPITAL LETTER S WITH CEDILLA
+0160 ; Lu # LATIN CAPITAL LETTER S WITH CARON
+0162 ; Lu # LATIN CAPITAL LETTER T WITH CEDILLA
+0164 ; Lu # LATIN CAPITAL LETTER T WITH CARON
+0166 ; Lu # LATIN CAPITAL LETTER T WITH STROKE
+0168 ; Lu # LATIN CAPITAL LETTER U WITH TILDE
+016A ; Lu # LATIN CAPITAL LETTER U WITH MACRON
+016C ; Lu # LATIN CAPITAL LETTER U WITH BREVE
+016E ; Lu # LATIN CAPITAL LETTER U WITH RING ABOVE
+0170 ; Lu # LATIN CAPITAL LETTER U WITH DOUBLE ACUTE
+0172 ; Lu # LATIN CAPITAL LETTER U WITH OGONEK
+0174 ; Lu # LATIN CAPITAL LETTER W WITH CIRCUMFLEX
+0176 ; Lu # LATIN CAPITAL LETTER Y WITH CIRCUMFLEX
+0178..0179 ; Lu # [2] LATIN CAPITAL LETTER Y WITH DIAERESIS..LATIN CAPITAL LETTER Z WITH ACUTE
+017B ; Lu # LATIN CAPITAL LETTER Z WITH DOT ABOVE
+017D ; Lu # LATIN CAPITAL LETTER Z WITH CARON
+0181..0182 ; Lu # [2] LATIN CAPITAL LETTER B WITH HOOK..LATIN CAPITAL LETTER B WITH TOPBAR
+0184 ; Lu # LATIN CAPITAL LETTER TONE SIX
+0186..0187 ; Lu # [2] LATIN CAPITAL LETTER OPEN O..LATIN CAPITAL LETTER C WITH HOOK
+0189..018B ; Lu # [3] LATIN CAPITAL LETTER AFRICAN D..LATIN CAPITAL LETTER D WITH TOPBAR
+018E..0191 ; Lu # [4] LATIN CAPITAL LETTER REVERSED E..LATIN CAPITAL LETTER F WITH HOOK
+0193..0194 ; Lu # [2] LATIN CAPITAL LETTER G WITH HOOK..LATIN CAPITAL LETTER GAMMA
+0196..0198 ; Lu # [3] LATIN CAPITAL LETTER IOTA..LATIN CAPITAL LETTER K WITH HOOK
+019C..019D ; Lu # [2] LATIN CAPITAL LETTER TURNED M..LATIN CAPITAL LETTER N WITH LEFT HOOK
+019F..01A0 ; Lu # [2] LATIN CAPITAL LETTER O WITH MIDDLE TILDE..LATIN CAPITAL LETTER O WITH HORN
+01A2 ; Lu # LATIN CAPITAL LETTER OI
+01A4 ; Lu # LATIN CAPITAL LETTER P WITH HOOK
+01A6..01A7 ; Lu # [2] LATIN LETTER YR..LATIN CAPITAL LETTER TONE TWO
+01A9 ; Lu # LATIN CAPITAL LETTER ESH
+01AC ; Lu # LATIN CAPITAL LETTER T WITH HOOK
+01AE..01AF ; Lu # [2] LATIN CAPITAL LETTER T WITH RETROFLEX HOOK..LATIN CAPITAL LETTER U WITH HORN
+01B1..01B3 ; Lu # [3] LATIN CAPITAL LETTER UPSILON..LATIN CAPITAL LETTER Y WITH HOOK
+01B5 ; Lu # LATIN CAPITAL LETTER Z WITH STROKE
+01B7..01B8 ; Lu # [2] LATIN CAPITAL LETTER EZH..LATIN CAPITAL LETTER EZH REVERSED
+01BC ; Lu # LATIN CAPITAL LETTER TONE FIVE
+01C4 ; Lu # LATIN CAPITAL LETTER DZ WITH CARON
+01C7 ; Lu # LATIN CAPITAL LETTER LJ
+01CA ; Lu # LATIN CAPITAL LETTER NJ
+01CD ; Lu # LATIN CAPITAL LETTER A WITH CARON
+01CF ; Lu # LATIN CAPITAL LETTER I WITH CARON
+01D1 ; Lu # LATIN CAPITAL LETTER O WITH CARON
+01D3 ; Lu # LATIN CAPITAL LETTER U WITH CARON
+01D5 ; Lu # LATIN CAPITAL LETTER U WITH DIAERESIS AND MACRON
+01D7 ; Lu # LATIN CAPITAL LETTER U WITH DIAERESIS AND ACUTE
+01D9 ; Lu # LATIN CAPITAL LETTER U WITH DIAERESIS AND CARON
+01DB ; Lu # LATIN CAPITAL LETTER U WITH DIAERESIS AND GRAVE
+01DE ; Lu # LATIN CAPITAL LETTER A WITH DIAERESIS AND MACRON
+01E0 ; Lu # LATIN CAPITAL LETTER A WITH DOT ABOVE AND MACRON
+01E2 ; Lu # LATIN CAPITAL LETTER AE WITH MACRON
+01E4 ; Lu # LATIN CAPITAL LETTER G WITH STROKE
+01E6 ; Lu # LATIN CAPITAL LETTER G WITH CARON
+01E8 ; Lu # LATIN CAPITAL LETTER K WITH CARON
+01EA ; Lu # LATIN CAPITAL LETTER O WITH OGONEK
+01EC ; Lu # LATIN CAPITAL LETTER O WITH OGONEK AND MACRON
+01EE ; Lu # LATIN CAPITAL LETTER EZH WITH CARON
+01F1 ; Lu # LATIN CAPITAL LETTER DZ
+01F4 ; Lu # LATIN CAPITAL LETTER G WITH ACUTE
+01F6..01F8 ; Lu # [3] LATIN CAPITAL LETTER HWAIR..LATIN CAPITAL LETTER N WITH GRAVE
+01FA ; Lu # LATIN CAPITAL LETTER A WITH RING ABOVE AND ACUTE
+01FC ; Lu # LATIN CAPITAL LETTER AE WITH ACUTE
+01FE ; Lu # LATIN CAPITAL LETTER O WITH STROKE AND ACUTE
+0200 ; Lu # LATIN CAPITAL LETTER A WITH DOUBLE GRAVE
+0202 ; Lu # LATIN CAPITAL LETTER A WITH INVERTED BREVE
+0204 ; Lu # LATIN CAPITAL LETTER E WITH DOUBLE GRAVE
+0206 ; Lu # LATIN CAPITAL LETTER E WITH INVERTED BREVE
+0208 ; Lu # LATIN CAPITAL LETTER I WITH DOUBLE GRAVE
+020A ; Lu # LATIN CAPITAL LETTER I WITH INVERTED BREVE
+020C ; Lu # LATIN CAPITAL LETTER O WITH DOUBLE GRAVE
+020E ; Lu # LATIN CAPITAL LETTER O WITH INVERTED BREVE
+0210 ; Lu # LATIN CAPITAL LETTER R WITH DOUBLE GRAVE
+0212 ; Lu # LATIN CAPITAL LETTER R WITH INVERTED BREVE
+0214 ; Lu # LATIN CAPITAL LETTER U WITH DOUBLE GRAVE
+0216 ; Lu # LATIN CAPITAL LETTER U WITH INVERTED BREVE
+0218 ; Lu # LATIN CAPITAL LETTER S WITH COMMA BELOW
+021A ; Lu # LATIN CAPITAL LETTER T WITH COMMA BELOW
+021C ; Lu # LATIN CAPITAL LETTER YOGH
+021E ; Lu # LATIN CAPITAL LETTER H WITH CARON
+0220 ; Lu # LATIN CAPITAL LETTER N WITH LONG RIGHT LEG
+0222 ; Lu # LATIN CAPITAL LETTER OU
+0224 ; Lu # LATIN CAPITAL LETTER Z WITH HOOK
+0226 ; Lu # LATIN CAPITAL LETTER A WITH DOT ABOVE
+0228 ; Lu # LATIN CAPITAL LETTER E WITH CEDILLA
+022A ; Lu # LATIN CAPITAL LETTER O WITH DIAERESIS AND MACRON
+022C ; Lu # LATIN CAPITAL LETTER O WITH TILDE AND MACRON
+022E ; Lu # LATIN CAPITAL LETTER O WITH DOT ABOVE
+0230 ; Lu # LATIN CAPITAL LETTER O WITH DOT ABOVE AND MACRON
+0232 ; Lu # LATIN CAPITAL LETTER Y WITH MACRON
+023A..023B ; Lu # [2] LATIN CAPITAL LETTER A WITH STROKE..LATIN CAPITAL LETTER C WITH STROKE
+023D..023E ; Lu # [2] LATIN CAPITAL LETTER L WITH BAR..LATIN CAPITAL LETTER T WITH DIAGONAL STROKE
+0241 ; Lu # LATIN CAPITAL LETTER GLOTTAL STOP
+0243..0246 ; Lu # [4] LATIN CAPITAL LETTER B WITH STROKE..LATIN CAPITAL LETTER E WITH STROKE
+0248 ; Lu # LATIN CAPITAL LETTER J WITH STROKE
+024A ; Lu # LATIN CAPITAL LETTER SMALL Q WITH HOOK TAIL
+024C ; Lu # LATIN CAPITAL LETTER R WITH STROKE
+024E ; Lu # LATIN CAPITAL LETTER Y WITH STROKE
+0370 ; Lu # GREEK CAPITAL LETTER HETA
+0372 ; Lu # GREEK CAPITAL LETTER ARCHAIC SAMPI
+0376 ; Lu # GREEK CAPITAL LETTER PAMPHYLIAN DIGAMMA
+037F ; Lu # GREEK CAPITAL LETTER YOT
+0386 ; Lu # GREEK CAPITAL LETTER ALPHA WITH TONOS
+0388..038A ; Lu # [3] GREEK CAPITAL LETTER EPSILON WITH TONOS..GREEK CAPITAL LETTER IOTA WITH TONOS
+038C ; Lu # GREEK CAPITAL LETTER OMICRON WITH TONOS
+038E..038F ; Lu # [2] GREEK CAPITAL LETTER UPSILON WITH TONOS..GREEK CAPITAL LETTER OMEGA WITH TONOS
+0391..03A1 ; Lu # [17] GREEK CAPITAL LETTER ALPHA..GREEK CAPITAL LETTER RHO
+03A3..03AB ; Lu # [9] GREEK CAPITAL LETTER SIGMA..GREEK CAPITAL LETTER UPSILON WITH DIALYTIKA
+03CF ; Lu # GREEK CAPITAL KAI SYMBOL
+03D2..03D4 ; Lu # [3] GREEK UPSILON WITH HOOK SYMBOL..GREEK UPSILON WITH DIAERESIS AND HOOK SYMBOL
+03D8 ; Lu # GREEK LETTER ARCHAIC KOPPA
+03DA ; Lu # GREEK LETTER STIGMA
+03DC ; Lu # GREEK LETTER DIGAMMA
+03DE ; Lu # GREEK LETTER KOPPA
+03E0 ; Lu # GREEK LETTER SAMPI
+03E2 ; Lu # COPTIC CAPITAL LETTER SHEI
+03E4 ; Lu # COPTIC CAPITAL LETTER FEI
+03E6 ; Lu # COPTIC CAPITAL LETTER KHEI
+03E8 ; Lu # COPTIC CAPITAL LETTER HORI
+03EA ; Lu # COPTIC CAPITAL LETTER GANGIA
+03EC ; Lu # COPTIC CAPITAL LETTER SHIMA
+03EE ; Lu # COPTIC CAPITAL LETTER DEI
+03F4 ; Lu # GREEK CAPITAL THETA SYMBOL
+03F7 ; Lu # GREEK CAPITAL LETTER SHO
+03F9..03FA ; Lu # [2] GREEK CAPITAL LUNATE SIGMA SYMBOL..GREEK CAPITAL LETTER SAN
+03FD..042F ; Lu # [51] GREEK CAPITAL REVERSED LUNATE SIGMA SYMBOL..CYRILLIC CAPITAL LETTER YA
+0460 ; Lu # CYRILLIC CAPITAL LETTER OMEGA
+0462 ; Lu # CYRILLIC CAPITAL LETTER YAT
+0464 ; Lu # CYRILLIC CAPITAL LETTER IOTIFIED E
+0466 ; Lu # CYRILLIC CAPITAL LETTER LITTLE YUS
+0468 ; Lu # CYRILLIC CAPITAL LETTER IOTIFIED LITTLE YUS
+046A ; Lu # CYRILLIC CAPITAL LETTER BIG YUS
+046C ; Lu # CYRILLIC CAPITAL LETTER IOTIFIED BIG YUS
+046E ; Lu # CYRILLIC CAPITAL LETTER KSI
+0470 ; Lu # CYRILLIC CAPITAL LETTER PSI
+0472 ; Lu # CYRILLIC CAPITAL LETTER FITA
+0474 ; Lu # CYRILLIC CAPITAL LETTER IZHITSA
+0476 ; Lu # CYRILLIC CAPITAL LETTER IZHITSA WITH DOUBLE GRAVE ACCENT
+0478 ; Lu # CYRILLIC CAPITAL LETTER UK
+047A ; Lu # CYRILLIC CAPITAL LETTER ROUND OMEGA
+047C ; Lu # CYRILLIC CAPITAL LETTER OMEGA WITH TITLO
+047E ; Lu # CYRILLIC CAPITAL LETTER OT
+0480 ; Lu # CYRILLIC CAPITAL LETTER KOPPA
+048A ; Lu # CYRILLIC CAPITAL LETTER SHORT I WITH TAIL
+048C ; Lu # CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+048E ; Lu # CYRILLIC CAPITAL LETTER ER WITH TICK
+0490 ; Lu # CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+0492 ; Lu # CYRILLIC CAPITAL LETTER GHE WITH STROKE
+0494 ; Lu # CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+0496 ; Lu # CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+0498 ; Lu # CYRILLIC CAPITAL LETTER ZE WITH DESCENDER
+049A ; Lu # CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+049C ; Lu # CYRILLIC CAPITAL LETTER KA WITH VERTICAL STROKE
+049E ; Lu # CYRILLIC CAPITAL LETTER KA WITH STROKE
+04A0 ; Lu # CYRILLIC CAPITAL LETTER BASHKIR KA
+04A2 ; Lu # CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+04A4 ; Lu # CYRILLIC CAPITAL LIGATURE EN GHE
+04A6 ; Lu # CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+04A8 ; Lu # CYRILLIC CAPITAL LETTER ABKHASIAN HA
+04AA ; Lu # CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+04AC ; Lu # CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+04AE ; Lu # CYRILLIC CAPITAL LETTER STRAIGHT U
+04B0 ; Lu # CYRILLIC CAPITAL LETTER STRAIGHT U WITH STROKE
+04B2 ; Lu # CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+04B4 ; Lu # CYRILLIC CAPITAL LIGATURE TE TSE
+04B6 ; Lu # CYRILLIC CAPITAL LETTER CHE WITH DESCENDER
+04B8 ; Lu # CYRILLIC CAPITAL LETTER CHE WITH VERTICAL STROKE
+04BA ; Lu # CYRILLIC CAPITAL LETTER SHHA
+04BC ; Lu # CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+04BE ; Lu # CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+04C0..04C1 ; Lu # [2] CYRILLIC LETTER PALOCHKA..CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+04C3 ; Lu # CYRILLIC CAPITAL LETTER KA WITH HOOK
+04C5 ; Lu # CYRILLIC CAPITAL LETTER EL WITH TAIL
+04C7 ; Lu # CYRILLIC CAPITAL LETTER EN WITH HOOK
+04C9 ; Lu # CYRILLIC CAPITAL LETTER EN WITH TAIL
+04CB ; Lu # CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+04CD ; Lu # CYRILLIC CAPITAL LETTER EM WITH TAIL
+04D0 ; Lu # CYRILLIC CAPITAL LETTER A WITH BREVE
+04D2 ; Lu # CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+04D4 ; Lu # CYRILLIC CAPITAL LIGATURE A IE
+04D6 ; Lu # CYRILLIC CAPITAL LETTER IE WITH BREVE
+04D8 ; Lu # CYRILLIC CAPITAL LETTER SCHWA
+04DA ; Lu # CYRILLIC CAPITAL LETTER SCHWA WITH DIAERESIS
+04DC ; Lu # CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+04DE ; Lu # CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+04E0 ; Lu # CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+04E2 ; Lu # CYRILLIC CAPITAL LETTER I WITH MACRON
+04E4 ; Lu # CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+04E6 ; Lu # CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+04E8 ; Lu # CYRILLIC CAPITAL LETTER BARRED O
+04EA ; Lu # CYRILLIC CAPITAL LETTER BARRED O WITH DIAERESIS
+04EC ; Lu # CYRILLIC CAPITAL LETTER E WITH DIAERESIS
+04EE ; Lu # CYRILLIC CAPITAL LETTER U WITH MACRON
+04F0 ; Lu # CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+04F2 ; Lu # CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+04F4 ; Lu # CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+04F6 ; Lu # CYRILLIC CAPITAL LETTER GHE WITH DESCENDER
+04F8 ; Lu # CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+04FA ; Lu # CYRILLIC CAPITAL LETTER GHE WITH STROKE AND HOOK
+04FC ; Lu # CYRILLIC CAPITAL LETTER HA WITH HOOK
+04FE ; Lu # CYRILLIC CAPITAL LETTER HA WITH STROKE
+0500 ; Lu # CYRILLIC CAPITAL LETTER KOMI DE
+0502 ; Lu # CYRILLIC CAPITAL LETTER KOMI DJE
+0504 ; Lu # CYRILLIC CAPITAL LETTER KOMI ZJE
+0506 ; Lu # CYRILLIC CAPITAL LETTER KOMI DZJE
+0508 ; Lu # CYRILLIC CAPITAL LETTER KOMI LJE
+050A ; Lu # CYRILLIC CAPITAL LETTER KOMI NJE
+050C ; Lu # CYRILLIC CAPITAL LETTER KOMI SJE
+050E ; Lu # CYRILLIC CAPITAL LETTER KOMI TJE
+0510 ; Lu # CYRILLIC CAPITAL LETTER REVERSED ZE
+0512 ; Lu # CYRILLIC CAPITAL LETTER EL WITH HOOK
+0514 ; Lu # CYRILLIC CAPITAL LETTER LHA
+0516 ; Lu # CYRILLIC CAPITAL LETTER RHA
+0518 ; Lu # CYRILLIC CAPITAL LETTER YAE
+051A ; Lu # CYRILLIC CAPITAL LETTER QA
+051C ; Lu # CYRILLIC CAPITAL LETTER WE
+051E ; Lu # CYRILLIC CAPITAL LETTER ALEUT KA
+0520 ; Lu # CYRILLIC CAPITAL LETTER EL WITH MIDDLE HOOK
+0522 ; Lu # CYRILLIC CAPITAL LETTER EN WITH MIDDLE HOOK
+0524 ; Lu # CYRILLIC CAPITAL LETTER PE WITH DESCENDER
+0526 ; Lu # CYRILLIC CAPITAL LETTER SHHA WITH DESCENDER
+0528 ; Lu # CYRILLIC CAPITAL LETTER EN WITH LEFT HOOK
+052A ; Lu # CYRILLIC CAPITAL LETTER DZZHE
+052C ; Lu # CYRILLIC CAPITAL LETTER DCHE
+052E ; Lu # CYRILLIC CAPITAL LETTER EL WITH DESCENDER
+0531..0556 ; Lu # [38] ARMENIAN CAPITAL LETTER AYB..ARMENIAN CAPITAL LETTER FEH
+10A0..10C5 ; Lu # [38] GEORGIAN CAPITAL LETTER AN..GEORGIAN CAPITAL LETTER HOE
+10C7 ; Lu # GEORGIAN CAPITAL LETTER YN
+10CD ; Lu # GEORGIAN CAPITAL LETTER AEN
+13A0..13F5 ; Lu # [86] CHEROKEE LETTER A..CHEROKEE LETTER MV
+1C90..1CBA ; Lu # [43] GEORGIAN MTAVRULI CAPITAL LETTER AN..GEORGIAN MTAVRULI CAPITAL LETTER AIN
+1CBD..1CBF ; Lu # [3] GEORGIAN MTAVRULI CAPITAL LETTER AEN..GEORGIAN MTAVRULI CAPITAL LETTER LABIAL SIGN
+1E00 ; Lu # LATIN CAPITAL LETTER A WITH RING BELOW
+1E02 ; Lu # LATIN CAPITAL LETTER B WITH DOT ABOVE
+1E04 ; Lu # LATIN CAPITAL LETTER B WITH DOT BELOW
+1E06 ; Lu # LATIN CAPITAL LETTER B WITH LINE BELOW
+1E08 ; Lu # LATIN CAPITAL LETTER C WITH CEDILLA AND ACUTE
+1E0A ; Lu # LATIN CAPITAL LETTER D WITH DOT ABOVE
+1E0C ; Lu # LATIN CAPITAL LETTER D WITH DOT BELOW
+1E0E ; Lu # LATIN CAPITAL LETTER D WITH LINE BELOW
+1E10 ; Lu # LATIN CAPITAL LETTER D WITH CEDILLA
+1E12 ; Lu # LATIN CAPITAL LETTER D WITH CIRCUMFLEX BELOW
+1E14 ; Lu # LATIN CAPITAL LETTER E WITH MACRON AND GRAVE
+1E16 ; Lu # LATIN CAPITAL LETTER E WITH MACRON AND ACUTE
+1E18 ; Lu # LATIN CAPITAL LETTER E WITH CIRCUMFLEX BELOW
+1E1A ; Lu # LATIN CAPITAL LETTER E WITH TILDE BELOW
+1E1C ; Lu # LATIN CAPITAL LETTER E WITH CEDILLA AND BREVE
+1E1E ; Lu # LATIN CAPITAL LETTER F WITH DOT ABOVE
+1E20 ; Lu # LATIN CAPITAL LETTER G WITH MACRON
+1E22 ; Lu # LATIN CAPITAL LETTER H WITH DOT ABOVE
+1E24 ; Lu # LATIN CAPITAL LETTER H WITH DOT BELOW
+1E26 ; Lu # LATIN CAPITAL LETTER H WITH DIAERESIS
+1E28 ; Lu # LATIN CAPITAL LETTER H WITH CEDILLA
+1E2A ; Lu # LATIN CAPITAL LETTER H WITH BREVE BELOW
+1E2C ; Lu # LATIN CAPITAL LETTER I WITH TILDE BELOW
+1E2E ; Lu # LATIN CAPITAL LETTER I WITH DIAERESIS AND ACUTE
+1E30 ; Lu # LATIN CAPITAL LETTER K WITH ACUTE
+1E32 ; Lu # LATIN CAPITAL LETTER K WITH DOT BELOW
+1E34 ; Lu # LATIN CAPITAL LETTER K WITH LINE BELOW
+1E36 ; Lu # LATIN CAPITAL LETTER L WITH DOT BELOW
+1E38 ; Lu # LATIN CAPITAL LETTER L WITH DOT BELOW AND MACRON
+1E3A ; Lu # LATIN CAPITAL LETTER L WITH LINE BELOW
+1E3C ; Lu # LATIN CAPITAL LETTER L WITH CIRCUMFLEX BELOW
+1E3E ; Lu # LATIN CAPITAL LETTER M WITH ACUTE
+1E40 ; Lu # LATIN CAPITAL LETTER M WITH DOT ABOVE
+1E42 ; Lu # LATIN CAPITAL LETTER M WITH DOT BELOW
+1E44 ; Lu # LATIN CAPITAL LETTER N WITH DOT ABOVE
+1E46 ; Lu # LATIN CAPITAL LETTER N WITH DOT BELOW
+1E48 ; Lu # LATIN CAPITAL LETTER N WITH LINE BELOW
+1E4A ; Lu # LATIN CAPITAL LETTER N WITH CIRCUMFLEX BELOW
+1E4C ; Lu # LATIN CAPITAL LETTER O WITH TILDE AND ACUTE
+1E4E ; Lu # LATIN CAPITAL LETTER O WITH TILDE AND DIAERESIS
+1E50 ; Lu # LATIN CAPITAL LETTER O WITH MACRON AND GRAVE
+1E52 ; Lu # LATIN CAPITAL LETTER O WITH MACRON AND ACUTE
+1E54 ; Lu # LATIN CAPITAL LETTER P WITH ACUTE
+1E56 ; Lu # LATIN CAPITAL LETTER P WITH DOT ABOVE
+1E58 ; Lu # LATIN CAPITAL LETTER R WITH DOT ABOVE
+1E5A ; Lu # LATIN CAPITAL LETTER R WITH DOT BELOW
+1E5C ; Lu # LATIN CAPITAL LETTER R WITH DOT BELOW AND MACRON
+1E5E ; Lu # LATIN CAPITAL LETTER R WITH LINE BELOW
+1E60 ; Lu # LATIN CAPITAL LETTER S WITH DOT ABOVE
+1E62 ; Lu # LATIN CAPITAL LETTER S WITH DOT BELOW
+1E64 ; Lu # LATIN CAPITAL LETTER S WITH ACUTE AND DOT ABOVE
+1E66 ; Lu # LATIN CAPITAL LETTER S WITH CARON AND DOT ABOVE
+1E68 ; Lu # LATIN CAPITAL LETTER S WITH DOT BELOW AND DOT ABOVE
+1E6A ; Lu # LATIN CAPITAL LETTER T WITH DOT ABOVE
+1E6C ; Lu # LATIN CAPITAL LETTER T WITH DOT BELOW
+1E6E ; Lu # LATIN CAPITAL LETTER T WITH LINE BELOW
+1E70 ; Lu # LATIN CAPITAL LETTER T WITH CIRCUMFLEX BELOW
+1E72 ; Lu # LATIN CAPITAL LETTER U WITH DIAERESIS BELOW
+1E74 ; Lu # LATIN CAPITAL LETTER U WITH TILDE BELOW
+1E76 ; Lu # LATIN CAPITAL LETTER U WITH CIRCUMFLEX BELOW
+1E78 ; Lu # LATIN CAPITAL LETTER U WITH TILDE AND ACUTE
+1E7A ; Lu # LATIN CAPITAL LETTER U WITH MACRON AND DIAERESIS
+1E7C ; Lu # LATIN CAPITAL LETTER V WITH TILDE
+1E7E ; Lu # LATIN CAPITAL LETTER V WITH DOT BELOW
+1E80 ; Lu # LATIN CAPITAL LETTER W WITH GRAVE
+1E82 ; Lu # LATIN CAPITAL LETTER W WITH ACUTE
+1E84 ; Lu # LATIN CAPITAL LETTER W WITH DIAERESIS
+1E86 ; Lu # LATIN CAPITAL LETTER W WITH DOT ABOVE
+1E88 ; Lu # LATIN CAPITAL LETTER W WITH DOT BELOW
+1E8A ; Lu # LATIN CAPITAL LETTER X WITH DOT ABOVE
+1E8C ; Lu # LATIN CAPITAL LETTER X WITH DIAERESIS
+1E8E ; Lu # LATIN CAPITAL LETTER Y WITH DOT ABOVE
+1E90 ; Lu # LATIN CAPITAL LETTER Z WITH CIRCUMFLEX
+1E92 ; Lu # LATIN CAPITAL LETTER Z WITH DOT BELOW
+1E94 ; Lu # LATIN CAPITAL LETTER Z WITH LINE BELOW
+1E9E ; Lu # LATIN CAPITAL LETTER SHARP S
+1EA0 ; Lu # LATIN CAPITAL LETTER A WITH DOT BELOW
+1EA2 ; Lu # LATIN CAPITAL LETTER A WITH HOOK ABOVE
+1EA4 ; Lu # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND ACUTE
+1EA6 ; Lu # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND GRAVE
+1EA8 ; Lu # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE
+1EAA ; Lu # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND TILDE
+1EAC ; Lu # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND DOT BELOW
+1EAE ; Lu # LATIN CAPITAL LETTER A WITH BREVE AND ACUTE
+1EB0 ; Lu # LATIN CAPITAL LETTER A WITH BREVE AND GRAVE
+1EB2 ; Lu # LATIN CAPITAL LETTER A WITH BREVE AND HOOK ABOVE
+1EB4 ; Lu # LATIN CAPITAL LETTER A WITH BREVE AND TILDE
+1EB6 ; Lu # LATIN CAPITAL LETTER A WITH BREVE AND DOT BELOW
+1EB8 ; Lu # LATIN CAPITAL LETTER E WITH DOT BELOW
+1EBA ; Lu # LATIN CAPITAL LETTER E WITH HOOK ABOVE
+1EBC ; Lu # LATIN CAPITAL LETTER E WITH TILDE
+1EBE ; Lu # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND ACUTE
+1EC0 ; Lu # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND GRAVE
+1EC2 ; Lu # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE
+1EC4 ; Lu # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND TILDE
+1EC6 ; Lu # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND DOT BELOW
+1EC8 ; Lu # LATIN CAPITAL LETTER I WITH HOOK ABOVE
+1ECA ; Lu # LATIN CAPITAL LETTER I WITH DOT BELOW
+1ECC ; Lu # LATIN CAPITAL LETTER O WITH DOT BELOW
+1ECE ; Lu # LATIN CAPITAL LETTER O WITH HOOK ABOVE
+1ED0 ; Lu # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND ACUTE
+1ED2 ; Lu # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND GRAVE
+1ED4 ; Lu # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE
+1ED6 ; Lu # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND TILDE
+1ED8 ; Lu # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND DOT BELOW
+1EDA ; Lu # LATIN CAPITAL LETTER O WITH HORN AND ACUTE
+1EDC ; Lu # LATIN CAPITAL LETTER O WITH HORN AND GRAVE
+1EDE ; Lu # LATIN CAPITAL LETTER O WITH HORN AND HOOK ABOVE
+1EE0 ; Lu # LATIN CAPITAL LETTER O WITH HORN AND TILDE
+1EE2 ; Lu # LATIN CAPITAL LETTER O WITH HORN AND DOT BELOW
+1EE4 ; Lu # LATIN CAPITAL LETTER U WITH DOT BELOW
+1EE6 ; Lu # LATIN CAPITAL LETTER U WITH HOOK ABOVE
+1EE8 ; Lu # LATIN CAPITAL LETTER U WITH HORN AND ACUTE
+1EEA ; Lu # LATIN CAPITAL LETTER U WITH HORN AND GRAVE
+1EEC ; Lu # LATIN CAPITAL LETTER U WITH HORN AND HOOK ABOVE
+1EEE ; Lu # LATIN CAPITAL LETTER U WITH HORN AND TILDE
+1EF0 ; Lu # LATIN CAPITAL LETTER U WITH HORN AND DOT BELOW
+1EF2 ; Lu # LATIN CAPITAL LETTER Y WITH GRAVE
+1EF4 ; Lu # LATIN CAPITAL LETTER Y WITH DOT BELOW
+1EF6 ; Lu # LATIN CAPITAL LETTER Y WITH HOOK ABOVE
+1EF8 ; Lu # LATIN CAPITAL LETTER Y WITH TILDE
+1EFA ; Lu # LATIN CAPITAL LETTER MIDDLE-WELSH LL
+1EFC ; Lu # LATIN CAPITAL LETTER MIDDLE-WELSH V
+1EFE ; Lu # LATIN CAPITAL LETTER Y WITH LOOP
+1F08..1F0F ; Lu # [8] GREEK CAPITAL LETTER ALPHA WITH PSILI..GREEK CAPITAL LETTER ALPHA WITH DASIA AND PERISPOMENI
+1F18..1F1D ; Lu # [6] GREEK CAPITAL LETTER EPSILON WITH PSILI..GREEK CAPITAL LETTER EPSILON WITH DASIA AND OXIA
+1F28..1F2F ; Lu # [8] GREEK CAPITAL LETTER ETA WITH PSILI..GREEK CAPITAL LETTER ETA WITH DASIA AND PERISPOMENI
+1F38..1F3F ; Lu # [8] GREEK CAPITAL LETTER IOTA WITH PSILI..GREEK CAPITAL LETTER IOTA WITH DASIA AND PERISPOMENI
+1F48..1F4D ; Lu # [6] GREEK CAPITAL LETTER OMICRON WITH PSILI..GREEK CAPITAL LETTER OMICRON WITH DASIA AND OXIA
+1F59 ; Lu # GREEK CAPITAL LETTER UPSILON WITH DASIA
+1F5B ; Lu # GREEK CAPITAL LETTER UPSILON WITH DASIA AND VARIA
+1F5D ; Lu # GREEK CAPITAL LETTER UPSILON WITH DASIA AND OXIA
+1F5F ; Lu # GREEK CAPITAL LETTER UPSILON WITH DASIA AND PERISPOMENI
+1F68..1F6F ; Lu # [8] GREEK CAPITAL LETTER OMEGA WITH PSILI..GREEK CAPITAL LETTER OMEGA WITH DASIA AND PERISPOMENI
+1FB8..1FBB ; Lu # [4] GREEK CAPITAL LETTER ALPHA WITH VRACHY..GREEK CAPITAL LETTER ALPHA WITH OXIA
+1FC8..1FCB ; Lu # [4] GREEK CAPITAL LETTER EPSILON WITH VARIA..GREEK CAPITAL LETTER ETA WITH OXIA
+1FD8..1FDB ; Lu # [4] GREEK CAPITAL LETTER IOTA WITH VRACHY..GREEK CAPITAL LETTER IOTA WITH OXIA
+1FE8..1FEC ; Lu # [5] GREEK CAPITAL LETTER UPSILON WITH VRACHY..GREEK CAPITAL LETTER RHO WITH DASIA
+1FF8..1FFB ; Lu # [4] GREEK CAPITAL LETTER OMICRON WITH VARIA..GREEK CAPITAL LETTER OMEGA WITH OXIA
+2102 ; Lu # DOUBLE-STRUCK CAPITAL C
+2107 ; Lu # EULER CONSTANT
+210B..210D ; Lu # [3] SCRIPT CAPITAL H..DOUBLE-STRUCK CAPITAL H
+2110..2112 ; Lu # [3] SCRIPT CAPITAL I..SCRIPT CAPITAL L
+2115 ; Lu # DOUBLE-STRUCK CAPITAL N
+2119..211D ; Lu # [5] DOUBLE-STRUCK CAPITAL P..DOUBLE-STRUCK CAPITAL R
+2124 ; Lu # DOUBLE-STRUCK CAPITAL Z
+2126 ; Lu # OHM SIGN
+2128 ; Lu # BLACK-LETTER CAPITAL Z
+212A..212D ; Lu # [4] KELVIN SIGN..BLACK-LETTER CAPITAL C
+2130..2133 ; Lu # [4] SCRIPT CAPITAL E..SCRIPT CAPITAL M
+213E..213F ; Lu # [2] DOUBLE-STRUCK CAPITAL GAMMA..DOUBLE-STRUCK CAPITAL PI
+2145 ; Lu # DOUBLE-STRUCK ITALIC CAPITAL D
+2183 ; Lu # ROMAN NUMERAL REVERSED ONE HUNDRED
+2C00..2C2E ; Lu # [47] GLAGOLITIC CAPITAL LETTER AZU..GLAGOLITIC CAPITAL LETTER LATINATE MYSLITE
+2C60 ; Lu # LATIN CAPITAL LETTER L WITH DOUBLE BAR
+2C62..2C64 ; Lu # [3] LATIN CAPITAL LETTER L WITH MIDDLE TILDE..LATIN CAPITAL LETTER R WITH TAIL
+2C67 ; Lu # LATIN CAPITAL LETTER H WITH DESCENDER
+2C69 ; Lu # LATIN CAPITAL LETTER K WITH DESCENDER
+2C6B ; Lu # LATIN CAPITAL LETTER Z WITH DESCENDER
+2C6D..2C70 ; Lu # [4] LATIN CAPITAL LETTER ALPHA..LATIN CAPITAL LETTER TURNED ALPHA
+2C72 ; Lu # LATIN CAPITAL LETTER W WITH HOOK
+2C75 ; Lu # LATIN CAPITAL LETTER HALF H
+2C7E..2C80 ; Lu # [3] LATIN CAPITAL LETTER S WITH SWASH TAIL..COPTIC CAPITAL LETTER ALFA
+2C82 ; Lu # COPTIC CAPITAL LETTER VIDA
+2C84 ; Lu # COPTIC CAPITAL LETTER GAMMA
+2C86 ; Lu # COPTIC CAPITAL LETTER DALDA
+2C88 ; Lu # COPTIC CAPITAL LETTER EIE
+2C8A ; Lu # COPTIC CAPITAL LETTER SOU
+2C8C ; Lu # COPTIC CAPITAL LETTER ZATA
+2C8E ; Lu # COPTIC CAPITAL LETTER HATE
+2C90 ; Lu # COPTIC CAPITAL LETTER THETHE
+2C92 ; Lu # COPTIC CAPITAL LETTER IAUDA
+2C94 ; Lu # COPTIC CAPITAL LETTER KAPA
+2C96 ; Lu # COPTIC CAPITAL LETTER LAULA
+2C98 ; Lu # COPTIC CAPITAL LETTER MI
+2C9A ; Lu # COPTIC CAPITAL LETTER NI
+2C9C ; Lu # COPTIC CAPITAL LETTER KSI
+2C9E ; Lu # COPTIC CAPITAL LETTER O
+2CA0 ; Lu # COPTIC CAPITAL LETTER PI
+2CA2 ; Lu # COPTIC CAPITAL LETTER RO
+2CA4 ; Lu # COPTIC CAPITAL LETTER SIMA
+2CA6 ; Lu # COPTIC CAPITAL LETTER TAU
+2CA8 ; Lu # COPTIC CAPITAL LETTER UA
+2CAA ; Lu # COPTIC CAPITAL LETTER FI
+2CAC ; Lu # COPTIC CAPITAL LETTER KHI
+2CAE ; Lu # COPTIC CAPITAL LETTER PSI
+2CB0 ; Lu # COPTIC CAPITAL LETTER OOU
+2CB2 ; Lu # COPTIC CAPITAL LETTER DIALECT-P ALEF
+2CB4 ; Lu # COPTIC CAPITAL LETTER OLD COPTIC AIN
+2CB6 ; Lu # COPTIC CAPITAL LETTER CRYPTOGRAMMIC EIE
+2CB8 ; Lu # COPTIC CAPITAL LETTER DIALECT-P KAPA
+2CBA ; Lu # COPTIC CAPITAL LETTER DIALECT-P NI
+2CBC ; Lu # COPTIC CAPITAL LETTER CRYPTOGRAMMIC NI
+2CBE ; Lu # COPTIC CAPITAL LETTER OLD COPTIC OOU
+2CC0 ; Lu # COPTIC CAPITAL LETTER SAMPI
+2CC2 ; Lu # COPTIC CAPITAL LETTER CROSSED SHEI
+2CC4 ; Lu # COPTIC CAPITAL LETTER OLD COPTIC SHEI
+2CC6 ; Lu # COPTIC CAPITAL LETTER OLD COPTIC ESH
+2CC8 ; Lu # COPTIC CAPITAL LETTER AKHMIMIC KHEI
+2CCA ; Lu # COPTIC CAPITAL LETTER DIALECT-P HORI
+2CCC ; Lu # COPTIC CAPITAL LETTER OLD COPTIC HORI
+2CCE ; Lu # COPTIC CAPITAL LETTER OLD COPTIC HA
+2CD0 ; Lu # COPTIC CAPITAL LETTER L-SHAPED HA
+2CD2 ; Lu # COPTIC CAPITAL LETTER OLD COPTIC HEI
+2CD4 ; Lu # COPTIC CAPITAL LETTER OLD COPTIC HAT
+2CD6 ; Lu # COPTIC CAPITAL LETTER OLD COPTIC GANGIA
+2CD8 ; Lu # COPTIC CAPITAL LETTER OLD COPTIC DJA
+2CDA ; Lu # COPTIC CAPITAL LETTER OLD COPTIC SHIMA
+2CDC ; Lu # COPTIC CAPITAL LETTER OLD NUBIAN SHIMA
+2CDE ; Lu # COPTIC CAPITAL LETTER OLD NUBIAN NGI
+2CE0 ; Lu # COPTIC CAPITAL LETTER OLD NUBIAN NYI
+2CE2 ; Lu # COPTIC CAPITAL LETTER OLD NUBIAN WAU
+2CEB ; Lu # COPTIC CAPITAL LETTER CRYPTOGRAMMIC SHEI
+2CED ; Lu # COPTIC CAPITAL LETTER CRYPTOGRAMMIC GANGIA
+2CF2 ; Lu # COPTIC CAPITAL LETTER BOHAIRIC KHEI
+A640 ; Lu # CYRILLIC CAPITAL LETTER ZEMLYA
+A642 ; Lu # CYRILLIC CAPITAL LETTER DZELO
+A644 ; Lu # CYRILLIC CAPITAL LETTER REVERSED DZE
+A646 ; Lu # CYRILLIC CAPITAL LETTER IOTA
+A648 ; Lu # CYRILLIC CAPITAL LETTER DJERV
+A64A ; Lu # CYRILLIC CAPITAL LETTER MONOGRAPH UK
+A64C ; Lu # CYRILLIC CAPITAL LETTER BROAD OMEGA
+A64E ; Lu # CYRILLIC CAPITAL LETTER NEUTRAL YER
+A650 ; Lu # CYRILLIC CAPITAL LETTER YERU WITH BACK YER
+A652 ; Lu # CYRILLIC CAPITAL LETTER IOTIFIED YAT
+A654 ; Lu # CYRILLIC CAPITAL LETTER REVERSED YU
+A656 ; Lu # CYRILLIC CAPITAL LETTER IOTIFIED A
+A658 ; Lu # CYRILLIC CAPITAL LETTER CLOSED LITTLE YUS
+A65A ; Lu # CYRILLIC CAPITAL LETTER BLENDED YUS
+A65C ; Lu # CYRILLIC CAPITAL LETTER IOTIFIED CLOSED LITTLE YUS
+A65E ; Lu # CYRILLIC CAPITAL LETTER YN
+A660 ; Lu # CYRILLIC CAPITAL LETTER REVERSED TSE
+A662 ; Lu # CYRILLIC CAPITAL LETTER SOFT DE
+A664 ; Lu # CYRILLIC CAPITAL LETTER SOFT EL
+A666 ; Lu # CYRILLIC CAPITAL LETTER SOFT EM
+A668 ; Lu # CYRILLIC CAPITAL LETTER MONOCULAR O
+A66A ; Lu # CYRILLIC CAPITAL LETTER BINOCULAR O
+A66C ; Lu # CYRILLIC CAPITAL LETTER DOUBLE MONOCULAR O
+A680 ; Lu # CYRILLIC CAPITAL LETTER DWE
+A682 ; Lu # CYRILLIC CAPITAL LETTER DZWE
+A684 ; Lu # CYRILLIC CAPITAL LETTER ZHWE
+A686 ; Lu # CYRILLIC CAPITAL LETTER CCHE
+A688 ; Lu # CYRILLIC CAPITAL LETTER DZZE
+A68A ; Lu # CYRILLIC CAPITAL LETTER TE WITH MIDDLE HOOK
+A68C ; Lu # CYRILLIC CAPITAL LETTER TWE
+A68E ; Lu # CYRILLIC CAPITAL LETTER TSWE
+A690 ; Lu # CYRILLIC CAPITAL LETTER TSSE
+A692 ; Lu # CYRILLIC CAPITAL LETTER TCHE
+A694 ; Lu # CYRILLIC CAPITAL LETTER HWE
+A696 ; Lu # CYRILLIC CAPITAL LETTER SHWE
+A698 ; Lu # CYRILLIC CAPITAL LETTER DOUBLE O
+A69A ; Lu # CYRILLIC CAPITAL LETTER CROSSED O
+A722 ; Lu # LATIN CAPITAL LETTER EGYPTOLOGICAL ALEF
+A724 ; Lu # LATIN CAPITAL LETTER EGYPTOLOGICAL AIN
+A726 ; Lu # LATIN CAPITAL LETTER HENG
+A728 ; Lu # LATIN CAPITAL LETTER TZ
+A72A ; Lu # LATIN CAPITAL LETTER TRESILLO
+A72C ; Lu # LATIN CAPITAL LETTER CUATRILLO
+A72E ; Lu # LATIN CAPITAL LETTER CUATRILLO WITH COMMA
+A732 ; Lu # LATIN CAPITAL LETTER AA
+A734 ; Lu # LATIN CAPITAL LETTER AO
+A736 ; Lu # LATIN CAPITAL LETTER AU
+A738 ; Lu # LATIN CAPITAL LETTER AV
+A73A ; Lu # LATIN CAPITAL LETTER AV WITH HORIZONTAL BAR
+A73C ; Lu # LATIN CAPITAL LETTER AY
+A73E ; Lu # LATIN CAPITAL LETTER REVERSED C WITH DOT
+A740 ; Lu # LATIN CAPITAL LETTER K WITH STROKE
+A742 ; Lu # LATIN CAPITAL LETTER K WITH DIAGONAL STROKE
+A744 ; Lu # LATIN CAPITAL LETTER K WITH STROKE AND DIAGONAL STROKE
+A746 ; Lu # LATIN CAPITAL LETTER BROKEN L
+A748 ; Lu # LATIN CAPITAL LETTER L WITH HIGH STROKE
+A74A ; Lu # LATIN CAPITAL LETTER O WITH LONG STROKE OVERLAY
+A74C ; Lu # LATIN CAPITAL LETTER O WITH LOOP
+A74E ; Lu # LATIN CAPITAL LETTER OO
+A750 ; Lu # LATIN CAPITAL LETTER P WITH STROKE THROUGH DESCENDER
+A752 ; Lu # LATIN CAPITAL LETTER P WITH FLOURISH
+A754 ; Lu # LATIN CAPITAL LETTER P WITH SQUIRREL TAIL
+A756 ; Lu # LATIN CAPITAL LETTER Q WITH STROKE THROUGH DESCENDER
+A758 ; Lu # LATIN CAPITAL LETTER Q WITH DIAGONAL STROKE
+A75A ; Lu # LATIN CAPITAL LETTER R ROTUNDA
+A75C ; Lu # LATIN CAPITAL LETTER RUM ROTUNDA
+A75E ; Lu # LATIN CAPITAL LETTER V WITH DIAGONAL STROKE
+A760 ; Lu # LATIN CAPITAL LETTER VY
+A762 ; Lu # LATIN CAPITAL LETTER VISIGOTHIC Z
+A764 ; Lu # LATIN CAPITAL LETTER THORN WITH STROKE
+A766 ; Lu # LATIN CAPITAL LETTER THORN WITH STROKE THROUGH DESCENDER
+A768 ; Lu # LATIN CAPITAL LETTER VEND
+A76A ; Lu # LATIN CAPITAL LETTER ET
+A76C ; Lu # LATIN CAPITAL LETTER IS
+A76E ; Lu # LATIN CAPITAL LETTER CON
+A779 ; Lu # LATIN CAPITAL LETTER INSULAR D
+A77B ; Lu # LATIN CAPITAL LETTER INSULAR F
+A77D..A77E ; Lu # [2] LATIN CAPITAL LETTER INSULAR G..LATIN CAPITAL LETTER TURNED INSULAR G
+A780 ; Lu # LATIN CAPITAL LETTER TURNED L
+A782 ; Lu # LATIN CAPITAL LETTER INSULAR R
+A784 ; Lu # LATIN CAPITAL LETTER INSULAR S
+A786 ; Lu # LATIN CAPITAL LETTER INSULAR T
+A78B ; Lu # LATIN CAPITAL LETTER SALTILLO
+A78D ; Lu # LATIN CAPITAL LETTER TURNED H
+A790 ; Lu # LATIN CAPITAL LETTER N WITH DESCENDER
+A792 ; Lu # LATIN CAPITAL LETTER C WITH BAR
+A796 ; Lu # LATIN CAPITAL LETTER B WITH FLOURISH
+A798 ; Lu # LATIN CAPITAL LETTER F WITH STROKE
+A79A ; Lu # LATIN CAPITAL LETTER VOLAPUK AE
+A79C ; Lu # LATIN CAPITAL LETTER VOLAPUK OE
+A79E ; Lu # LATIN CAPITAL LETTER VOLAPUK UE
+A7A0 ; Lu # LATIN CAPITAL LETTER G WITH OBLIQUE STROKE
+A7A2 ; Lu # LATIN CAPITAL LETTER K WITH OBLIQUE STROKE
+A7A4 ; Lu # LATIN CAPITAL LETTER N WITH OBLIQUE STROKE
+A7A6 ; Lu # LATIN CAPITAL LETTER R WITH OBLIQUE STROKE
+A7A8 ; Lu # LATIN CAPITAL LETTER S WITH OBLIQUE STROKE
+A7AA..A7AE ; Lu # [5] LATIN CAPITAL LETTER H WITH HOOK..LATIN CAPITAL LETTER SMALL CAPITAL I
+A7B0..A7B4 ; Lu # [5] LATIN CAPITAL LETTER TURNED K..LATIN CAPITAL LETTER BETA
+A7B6 ; Lu # LATIN CAPITAL LETTER OMEGA
+A7B8 ; Lu # LATIN CAPITAL LETTER U WITH STROKE
+A7BA ; Lu # LATIN CAPITAL LETTER GLOTTAL A
+A7BC ; Lu # LATIN CAPITAL LETTER GLOTTAL I
+A7BE ; Lu # LATIN CAPITAL LETTER GLOTTAL U
+A7C2 ; Lu # LATIN CAPITAL LETTER ANGLICANA W
+A7C4..A7C6 ; Lu # [3] LATIN CAPITAL LETTER C WITH PALATAL HOOK..LATIN CAPITAL LETTER Z WITH PALATAL HOOK
+FF21..FF3A ; Lu # [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LATIN CAPITAL LETTER Z
+10400..10427 ; Lu # [40] DESERET CAPITAL LETTER LONG I..DESERET CAPITAL LETTER EW
+104B0..104D3 ; Lu # [36] OSAGE CAPITAL LETTER A..OSAGE CAPITAL LETTER ZHA
+10C80..10CB2 ; Lu # [51] OLD HUNGARIAN CAPITAL LETTER A..OLD HUNGARIAN CAPITAL LETTER US
+118A0..118BF ; Lu # [32] WARANG CITI CAPITAL LETTER NGAA..WARANG CITI CAPITAL LETTER VIYO
+16E40..16E5F ; Lu # [32] MEDEFAIDRIN CAPITAL LETTER M..MEDEFAIDRIN CAPITAL LETTER Y
+1D400..1D419 ; Lu # [26] MATHEMATICAL BOLD CAPITAL A..MATHEMATICAL BOLD CAPITAL Z
+1D434..1D44D ; Lu # [26] MATHEMATICAL ITALIC CAPITAL A..MATHEMATICAL ITALIC CAPITAL Z
+1D468..1D481 ; Lu # [26] MATHEMATICAL BOLD ITALIC CAPITAL A..MATHEMATICAL BOLD ITALIC CAPITAL Z
+1D49C ; Lu # MATHEMATICAL SCRIPT CAPITAL A
+1D49E..1D49F ; Lu # [2] MATHEMATICAL SCRIPT CAPITAL C..MATHEMATICAL SCRIPT CAPITAL D
+1D4A2 ; Lu # MATHEMATICAL SCRIPT CAPITAL G
+1D4A5..1D4A6 ; Lu # [2] MATHEMATICAL SCRIPT CAPITAL J..MATHEMATICAL SCRIPT CAPITAL K
+1D4A9..1D4AC ; Lu # [4] MATHEMATICAL SCRIPT CAPITAL N..MATHEMATICAL SCRIPT CAPITAL Q
+1D4AE..1D4B5 ; Lu # [8] MATHEMATICAL SCRIPT CAPITAL S..MATHEMATICAL SCRIPT CAPITAL Z
+1D4D0..1D4E9 ; Lu # [26] MATHEMATICAL BOLD SCRIPT CAPITAL A..MATHEMATICAL BOLD SCRIPT CAPITAL Z
+1D504..1D505 ; Lu # [2] MATHEMATICAL FRAKTUR CAPITAL A..MATHEMATICAL FRAKTUR CAPITAL B
+1D507..1D50A ; Lu # [4] MATHEMATICAL FRAKTUR CAPITAL D..MATHEMATICAL FRAKTUR CAPITAL G
+1D50D..1D514 ; Lu # [8] MATHEMATICAL FRAKTUR CAPITAL J..MATHEMATICAL FRAKTUR CAPITAL Q
+1D516..1D51C ; Lu # [7] MATHEMATICAL FRAKTUR CAPITAL S..MATHEMATICAL FRAKTUR CAPITAL Y
+1D538..1D539 ; Lu # [2] MATHEMATICAL DOUBLE-STRUCK CAPITAL A..MATHEMATICAL DOUBLE-STRUCK CAPITAL B
+1D53B..1D53E ; Lu # [4] MATHEMATICAL DOUBLE-STRUCK CAPITAL D..MATHEMATICAL DOUBLE-STRUCK CAPITAL G
+1D540..1D544 ; Lu # [5] MATHEMATICAL DOUBLE-STRUCK CAPITAL I..MATHEMATICAL DOUBLE-STRUCK CAPITAL M
+1D546 ; Lu # MATHEMATICAL DOUBLE-STRUCK CAPITAL O
+1D54A..1D550 ; Lu # [7] MATHEMATICAL DOUBLE-STRUCK CAPITAL S..MATHEMATICAL DOUBLE-STRUCK CAPITAL Y
+1D56C..1D585 ; Lu # [26] MATHEMATICAL BOLD FRAKTUR CAPITAL A..MATHEMATICAL BOLD FRAKTUR CAPITAL Z
+1D5A0..1D5B9 ; Lu # [26] MATHEMATICAL SANS-SERIF CAPITAL A..MATHEMATICAL SANS-SERIF CAPITAL Z
+1D5D4..1D5ED ; Lu # [26] MATHEMATICAL SANS-SERIF BOLD CAPITAL A..MATHEMATICAL SANS-SERIF BOLD CAPITAL Z
+1D608..1D621 ; Lu # [26] MATHEMATICAL SANS-SERIF ITALIC CAPITAL A..MATHEMATICAL SANS-SERIF ITALIC CAPITAL Z
+1D63C..1D655 ; Lu # [26] MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL A..MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL Z
+1D670..1D689 ; Lu # [26] MATHEMATICAL MONOSPACE CAPITAL A..MATHEMATICAL MONOSPACE CAPITAL Z
+1D6A8..1D6C0 ; Lu # [25] MATHEMATICAL BOLD CAPITAL ALPHA..MATHEMATICAL BOLD CAPITAL OMEGA
+1D6E2..1D6FA ; Lu # [25] MATHEMATICAL ITALIC CAPITAL ALPHA..MATHEMATICAL ITALIC CAPITAL OMEGA
+1D71C..1D734 ; Lu # [25] MATHEMATICAL BOLD ITALIC CAPITAL ALPHA..MATHEMATICAL BOLD ITALIC CAPITAL OMEGA
+1D756..1D76E ; Lu # [25] MATHEMATICAL SANS-SERIF BOLD CAPITAL ALPHA..MATHEMATICAL SANS-SERIF BOLD CAPITAL OMEGA
+1D790..1D7A8 ; Lu # [25] MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL ALPHA..MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL OMEGA
+1D7CA ; Lu # MATHEMATICAL BOLD CAPITAL DIGAMMA
+1E900..1E921 ; Lu # [34] ADLAM CAPITAL LETTER ALIF..ADLAM CAPITAL LETTER SHA
+
+# Total code points: 1788
+
+# ================================================
+
+# General_Category=Lowercase_Letter
+
+0061..007A ; Ll # [26] LATIN SMALL LETTER A..LATIN SMALL LETTER Z
+00B5 ; Ll # MICRO SIGN
+00DF..00F6 ; Ll # [24] LATIN SMALL LETTER SHARP S..LATIN SMALL LETTER O WITH DIAERESIS
+00F8..00FF ; Ll # [8] LATIN SMALL LETTER O WITH STROKE..LATIN SMALL LETTER Y WITH DIAERESIS
+0101 ; Ll # LATIN SMALL LETTER A WITH MACRON
+0103 ; Ll # LATIN SMALL LETTER A WITH BREVE
+0105 ; Ll # LATIN SMALL LETTER A WITH OGONEK
+0107 ; Ll # LATIN SMALL LETTER C WITH ACUTE
+0109 ; Ll # LATIN SMALL LETTER C WITH CIRCUMFLEX
+010B ; Ll # LATIN SMALL LETTER C WITH DOT ABOVE
+010D ; Ll # LATIN SMALL LETTER C WITH CARON
+010F ; Ll # LATIN SMALL LETTER D WITH CARON
+0111 ; Ll # LATIN SMALL LETTER D WITH STROKE
+0113 ; Ll # LATIN SMALL LETTER E WITH MACRON
+0115 ; Ll # LATIN SMALL LETTER E WITH BREVE
+0117 ; Ll # LATIN SMALL LETTER E WITH DOT ABOVE
+0119 ; Ll # LATIN SMALL LETTER E WITH OGONEK
+011B ; Ll # LATIN SMALL LETTER E WITH CARON
+011D ; Ll # LATIN SMALL LETTER G WITH CIRCUMFLEX
+011F ; Ll # LATIN SMALL LETTER G WITH BREVE
+0121 ; Ll # LATIN SMALL LETTER G WITH DOT ABOVE
+0123 ; Ll # LATIN SMALL LETTER G WITH CEDILLA
+0125 ; Ll # LATIN SMALL LETTER H WITH CIRCUMFLEX
+0127 ; Ll # LATIN SMALL LETTER H WITH STROKE
+0129 ; Ll # LATIN SMALL LETTER I WITH TILDE
+012B ; Ll # LATIN SMALL LETTER I WITH MACRON
+012D ; Ll # LATIN SMALL LETTER I WITH BREVE
+012F ; Ll # LATIN SMALL LETTER I WITH OGONEK
+0131 ; Ll # LATIN SMALL LETTER DOTLESS I
+0133 ; Ll # LATIN SMALL LIGATURE IJ
+0135 ; Ll # LATIN SMALL LETTER J WITH CIRCUMFLEX
+0137..0138 ; Ll # [2] LATIN SMALL LETTER K WITH CEDILLA..LATIN SMALL LETTER KRA
+013A ; Ll # LATIN SMALL LETTER L WITH ACUTE
+013C ; Ll # LATIN SMALL LETTER L WITH CEDILLA
+013E ; Ll # LATIN SMALL LETTER L WITH CARON
+0140 ; Ll # LATIN SMALL LETTER L WITH MIDDLE DOT
+0142 ; Ll # LATIN SMALL LETTER L WITH STROKE
+0144 ; Ll # LATIN SMALL LETTER N WITH ACUTE
+0146 ; Ll # LATIN SMALL LETTER N WITH CEDILLA
+0148..0149 ; Ll # [2] LATIN SMALL LETTER N WITH CARON..LATIN SMALL LETTER N PRECEDED BY APOSTROPHE
+014B ; Ll # LATIN SMALL LETTER ENG
+014D ; Ll # LATIN SMALL LETTER O WITH MACRON
+014F ; Ll # LATIN SMALL LETTER O WITH BREVE
+0151 ; Ll # LATIN SMALL LETTER O WITH DOUBLE ACUTE
+0153 ; Ll # LATIN SMALL LIGATURE OE
+0155 ; Ll # LATIN SMALL LETTER R WITH ACUTE
+0157 ; Ll # LATIN SMALL LETTER R WITH CEDILLA
+0159 ; Ll # LATIN SMALL LETTER R WITH CARON
+015B ; Ll # LATIN SMALL LETTER S WITH ACUTE
+015D ; Ll # LATIN SMALL LETTER S WITH CIRCUMFLEX
+015F ; Ll # LATIN SMALL LETTER S WITH CEDILLA
+0161 ; Ll # LATIN SMALL LETTER S WITH CARON
+0163 ; Ll # LATIN SMALL LETTER T WITH CEDILLA
+0165 ; Ll # LATIN SMALL LETTER T WITH CARON
+0167 ; Ll # LATIN SMALL LETTER T WITH STROKE
+0169 ; Ll # LATIN SMALL LETTER U WITH TILDE
+016B ; Ll # LATIN SMALL LETTER U WITH MACRON
+016D ; Ll # LATIN SMALL LETTER U WITH BREVE
+016F ; Ll # LATIN SMALL LETTER U WITH RING ABOVE
+0171 ; Ll # LATIN SMALL LETTER U WITH DOUBLE ACUTE
+0173 ; Ll # LATIN SMALL LETTER U WITH OGONEK
+0175 ; Ll # LATIN SMALL LETTER W WITH CIRCUMFLEX
+0177 ; Ll # LATIN SMALL LETTER Y WITH CIRCUMFLEX
+017A ; Ll # LATIN SMALL LETTER Z WITH ACUTE
+017C ; Ll # LATIN SMALL LETTER Z WITH DOT ABOVE
+017E..0180 ; Ll # [3] LATIN SMALL LETTER Z WITH CARON..LATIN SMALL LETTER B WITH STROKE
+0183 ; Ll # LATIN SMALL LETTER B WITH TOPBAR
+0185 ; Ll # LATIN SMALL LETTER TONE SIX
+0188 ; Ll # LATIN SMALL LETTER C WITH HOOK
+018C..018D ; Ll # [2] LATIN SMALL LETTER D WITH TOPBAR..LATIN SMALL LETTER TURNED DELTA
+0192 ; Ll # LATIN SMALL LETTER F WITH HOOK
+0195 ; Ll # LATIN SMALL LETTER HV
+0199..019B ; Ll # [3] LATIN SMALL LETTER K WITH HOOK..LATIN SMALL LETTER LAMBDA WITH STROKE
+019E ; Ll # LATIN SMALL LETTER N WITH LONG RIGHT LEG
+01A1 ; Ll # LATIN SMALL LETTER O WITH HORN
+01A3 ; Ll # LATIN SMALL LETTER OI
+01A5 ; Ll # LATIN SMALL LETTER P WITH HOOK
+01A8 ; Ll # LATIN SMALL LETTER TONE TWO
+01AA..01AB ; Ll # [2] LATIN LETTER REVERSED ESH LOOP..LATIN SMALL LETTER T WITH PALATAL HOOK
+01AD ; Ll # LATIN SMALL LETTER T WITH HOOK
+01B0 ; Ll # LATIN SMALL LETTER U WITH HORN
+01B4 ; Ll # LATIN SMALL LETTER Y WITH HOOK
+01B6 ; Ll # LATIN SMALL LETTER Z WITH STROKE
+01B9..01BA ; Ll # [2] LATIN SMALL LETTER EZH REVERSED..LATIN SMALL LETTER EZH WITH TAIL
+01BD..01BF ; Ll # [3] LATIN SMALL LETTER TONE FIVE..LATIN LETTER WYNN
+01C6 ; Ll # LATIN SMALL LETTER DZ WITH CARON
+01C9 ; Ll # LATIN SMALL LETTER LJ
+01CC ; Ll # LATIN SMALL LETTER NJ
+01CE ; Ll # LATIN SMALL LETTER A WITH CARON
+01D0 ; Ll # LATIN SMALL LETTER I WITH CARON
+01D2 ; Ll # LATIN SMALL LETTER O WITH CARON
+01D4 ; Ll # LATIN SMALL LETTER U WITH CARON
+01D6 ; Ll # LATIN SMALL LETTER U WITH DIAERESIS AND MACRON
+01D8 ; Ll # LATIN SMALL LETTER U WITH DIAERESIS AND ACUTE
+01DA ; Ll # LATIN SMALL LETTER U WITH DIAERESIS AND CARON
+01DC..01DD ; Ll # [2] LATIN SMALL LETTER U WITH DIAERESIS AND GRAVE..LATIN SMALL LETTER TURNED E
+01DF ; Ll # LATIN SMALL LETTER A WITH DIAERESIS AND MACRON
+01E1 ; Ll # LATIN SMALL LETTER A WITH DOT ABOVE AND MACRON
+01E3 ; Ll # LATIN SMALL LETTER AE WITH MACRON
+01E5 ; Ll # LATIN SMALL LETTER G WITH STROKE
+01E7 ; Ll # LATIN SMALL LETTER G WITH CARON
+01E9 ; Ll # LATIN SMALL LETTER K WITH CARON
+01EB ; Ll # LATIN SMALL LETTER O WITH OGONEK
+01ED ; Ll # LATIN SMALL LETTER O WITH OGONEK AND MACRON
+01EF..01F0 ; Ll # [2] LATIN SMALL LETTER EZH WITH CARON..LATIN SMALL LETTER J WITH CARON
+01F3 ; Ll # LATIN SMALL LETTER DZ
+01F5 ; Ll # LATIN SMALL LETTER G WITH ACUTE
+01F9 ; Ll # LATIN SMALL LETTER N WITH GRAVE
+01FB ; Ll # LATIN SMALL LETTER A WITH RING ABOVE AND ACUTE
+01FD ; Ll # LATIN SMALL LETTER AE WITH ACUTE
+01FF ; Ll # LATIN SMALL LETTER O WITH STROKE AND ACUTE
+0201 ; Ll # LATIN SMALL LETTER A WITH DOUBLE GRAVE
+0203 ; Ll # LATIN SMALL LETTER A WITH INVERTED BREVE
+0205 ; Ll # LATIN SMALL LETTER E WITH DOUBLE GRAVE
+0207 ; Ll # LATIN SMALL LETTER E WITH INVERTED BREVE
+0209 ; Ll # LATIN SMALL LETTER I WITH DOUBLE GRAVE
+020B ; Ll # LATIN SMALL LETTER I WITH INVERTED BREVE
+020D ; Ll # LATIN SMALL LETTER O WITH DOUBLE GRAVE
+020F ; Ll # LATIN SMALL LETTER O WITH INVERTED BREVE
+0211 ; Ll # LATIN SMALL LETTER R WITH DOUBLE GRAVE
+0213 ; Ll # LATIN SMALL LETTER R WITH INVERTED BREVE
+0215 ; Ll # LATIN SMALL LETTER U WITH DOUBLE GRAVE
+0217 ; Ll # LATIN SMALL LETTER U WITH INVERTED BREVE
+0219 ; Ll # LATIN SMALL LETTER S WITH COMMA BELOW
+021B ; Ll # LATIN SMALL LETTER T WITH COMMA BELOW
+021D ; Ll # LATIN SMALL LETTER YOGH
+021F ; Ll # LATIN SMALL LETTER H WITH CARON
+0221 ; Ll # LATIN SMALL LETTER D WITH CURL
+0223 ; Ll # LATIN SMALL LETTER OU
+0225 ; Ll # LATIN SMALL LETTER Z WITH HOOK
+0227 ; Ll # LATIN SMALL LETTER A WITH DOT ABOVE
+0229 ; Ll # LATIN SMALL LETTER E WITH CEDILLA
+022B ; Ll # LATIN SMALL LETTER O WITH DIAERESIS AND MACRON
+022D ; Ll # LATIN SMALL LETTER O WITH TILDE AND MACRON
+022F ; Ll # LATIN SMALL LETTER O WITH DOT ABOVE
+0231 ; Ll # LATIN SMALL LETTER O WITH DOT ABOVE AND MACRON
+0233..0239 ; Ll # [7] LATIN SMALL LETTER Y WITH MACRON..LATIN SMALL LETTER QP DIGRAPH
+023C ; Ll # LATIN SMALL LETTER C WITH STROKE
+023F..0240 ; Ll # [2] LATIN SMALL LETTER S WITH SWASH TAIL..LATIN SMALL LETTER Z WITH SWASH TAIL
+0242 ; Ll # LATIN SMALL LETTER GLOTTAL STOP
+0247 ; Ll # LATIN SMALL LETTER E WITH STROKE
+0249 ; Ll # LATIN SMALL LETTER J WITH STROKE
+024B ; Ll # LATIN SMALL LETTER Q WITH HOOK TAIL
+024D ; Ll # LATIN SMALL LETTER R WITH STROKE
+024F..0293 ; Ll # [69] LATIN SMALL LETTER Y WITH STROKE..LATIN SMALL LETTER EZH WITH CURL
+0295..02AF ; Ll # [27] LATIN LETTER PHARYNGEAL VOICED FRICATIVE..LATIN SMALL LETTER TURNED H WITH FISHHOOK AND TAIL
+0371 ; Ll # GREEK SMALL LETTER HETA
+0373 ; Ll # GREEK SMALL LETTER ARCHAIC SAMPI
+0377 ; Ll # GREEK SMALL LETTER PAMPHYLIAN DIGAMMA
+037B..037D ; Ll # [3] GREEK SMALL REVERSED LUNATE SIGMA SYMBOL..GREEK SMALL REVERSED DOTTED LUNATE SIGMA SYMBOL
+0390 ; Ll # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
+03AC..03CE ; Ll # [35] GREEK SMALL LETTER ALPHA WITH TONOS..GREEK SMALL LETTER OMEGA WITH TONOS
+03D0..03D1 ; Ll # [2] GREEK BETA SYMBOL..GREEK THETA SYMBOL
+03D5..03D7 ; Ll # [3] GREEK PHI SYMBOL..GREEK KAI SYMBOL
+03D9 ; Ll # GREEK SMALL LETTER ARCHAIC KOPPA
+03DB ; Ll # GREEK SMALL LETTER STIGMA
+03DD ; Ll # GREEK SMALL LETTER DIGAMMA
+03DF ; Ll # GREEK SMALL LETTER KOPPA
+03E1 ; Ll # GREEK SMALL LETTER SAMPI
+03E3 ; Ll # COPTIC SMALL LETTER SHEI
+03E5 ; Ll # COPTIC SMALL LETTER FEI
+03E7 ; Ll # COPTIC SMALL LETTER KHEI
+03E9 ; Ll # COPTIC SMALL LETTER HORI
+03EB ; Ll # COPTIC SMALL LETTER GANGIA
+03ED ; Ll # COPTIC SMALL LETTER SHIMA
+03EF..03F3 ; Ll # [5] COPTIC SMALL LETTER DEI..GREEK LETTER YOT
+03F5 ; Ll # GREEK LUNATE EPSILON SYMBOL
+03F8 ; Ll # GREEK SMALL LETTER SHO
+03FB..03FC ; Ll # [2] GREEK SMALL LETTER SAN..GREEK RHO WITH STROKE SYMBOL
+0430..045F ; Ll # [48] CYRILLIC SMALL LETTER A..CYRILLIC SMALL LETTER DZHE
+0461 ; Ll # CYRILLIC SMALL LETTER OMEGA
+0463 ; Ll # CYRILLIC SMALL LETTER YAT
+0465 ; Ll # CYRILLIC SMALL LETTER IOTIFIED E
+0467 ; Ll # CYRILLIC SMALL LETTER LITTLE YUS
+0469 ; Ll # CYRILLIC SMALL LETTER IOTIFIED LITTLE YUS
+046B ; Ll # CYRILLIC SMALL LETTER BIG YUS
+046D ; Ll # CYRILLIC SMALL LETTER IOTIFIED BIG YUS
+046F ; Ll # CYRILLIC SMALL LETTER KSI
+0471 ; Ll # CYRILLIC SMALL LETTER PSI
+0473 ; Ll # CYRILLIC SMALL LETTER FITA
+0475 ; Ll # CYRILLIC SMALL LETTER IZHITSA
+0477 ; Ll # CYRILLIC SMALL LETTER IZHITSA WITH DOUBLE GRAVE ACCENT
+0479 ; Ll # CYRILLIC SMALL LETTER UK
+047B ; Ll # CYRILLIC SMALL LETTER ROUND OMEGA
+047D ; Ll # CYRILLIC SMALL LETTER OMEGA WITH TITLO
+047F ; Ll # CYRILLIC SMALL LETTER OT
+0481 ; Ll # CYRILLIC SMALL LETTER KOPPA
+048B ; Ll # CYRILLIC SMALL LETTER SHORT I WITH TAIL
+048D ; Ll # CYRILLIC SMALL LETTER SEMISOFT SIGN
+048F ; Ll # CYRILLIC SMALL LETTER ER WITH TICK
+0491 ; Ll # CYRILLIC SMALL LETTER GHE WITH UPTURN
+0493 ; Ll # CYRILLIC SMALL LETTER GHE WITH STROKE
+0495 ; Ll # CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+0497 ; Ll # CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+0499 ; Ll # CYRILLIC SMALL LETTER ZE WITH DESCENDER
+049B ; Ll # CYRILLIC SMALL LETTER KA WITH DESCENDER
+049D ; Ll # CYRILLIC SMALL LETTER KA WITH VERTICAL STROKE
+049F ; Ll # CYRILLIC SMALL LETTER KA WITH STROKE
+04A1 ; Ll # CYRILLIC SMALL LETTER BASHKIR KA
+04A3 ; Ll # CYRILLIC SMALL LETTER EN WITH DESCENDER
+04A5 ; Ll # CYRILLIC SMALL LIGATURE EN GHE
+04A7 ; Ll # CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+04A9 ; Ll # CYRILLIC SMALL LETTER ABKHASIAN HA
+04AB ; Ll # CYRILLIC SMALL LETTER ES WITH DESCENDER
+04AD ; Ll # CYRILLIC SMALL LETTER TE WITH DESCENDER
+04AF ; Ll # CYRILLIC SMALL LETTER STRAIGHT U
+04B1 ; Ll # CYRILLIC SMALL LETTER STRAIGHT U WITH STROKE
+04B3 ; Ll # CYRILLIC SMALL LETTER HA WITH DESCENDER
+04B5 ; Ll # CYRILLIC SMALL LIGATURE TE TSE
+04B7 ; Ll # CYRILLIC SMALL LETTER CHE WITH DESCENDER
+04B9 ; Ll # CYRILLIC SMALL LETTER CHE WITH VERTICAL STROKE
+04BB ; Ll # CYRILLIC SMALL LETTER SHHA
+04BD ; Ll # CYRILLIC SMALL LETTER ABKHASIAN CHE
+04BF ; Ll # CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+04C2 ; Ll # CYRILLIC SMALL LETTER ZHE WITH BREVE
+04C4 ; Ll # CYRILLIC SMALL LETTER KA WITH HOOK
+04C6 ; Ll # CYRILLIC SMALL LETTER EL WITH TAIL
+04C8 ; Ll # CYRILLIC SMALL LETTER EN WITH HOOK
+04CA ; Ll # CYRILLIC SMALL LETTER EN WITH TAIL
+04CC ; Ll # CYRILLIC SMALL LETTER KHAKASSIAN CHE
+04CE..04CF ; Ll # [2] CYRILLIC SMALL LETTER EM WITH TAIL..CYRILLIC SMALL LETTER PALOCHKA
+04D1 ; Ll # CYRILLIC SMALL LETTER A WITH BREVE
+04D3 ; Ll # CYRILLIC SMALL LETTER A WITH DIAERESIS
+04D5 ; Ll # CYRILLIC SMALL LIGATURE A IE
+04D7 ; Ll # CYRILLIC SMALL LETTER IE WITH BREVE
+04D9 ; Ll # CYRILLIC SMALL LETTER SCHWA
+04DB ; Ll # CYRILLIC SMALL LETTER SCHWA WITH DIAERESIS
+04DD ; Ll # CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+04DF ; Ll # CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+04E1 ; Ll # CYRILLIC SMALL LETTER ABKHASIAN DZE
+04E3 ; Ll # CYRILLIC SMALL LETTER I WITH MACRON
+04E5 ; Ll # CYRILLIC SMALL LETTER I WITH DIAERESIS
+04E7 ; Ll # CYRILLIC SMALL LETTER O WITH DIAERESIS
+04E9 ; Ll # CYRILLIC SMALL LETTER BARRED O
+04EB ; Ll # CYRILLIC SMALL LETTER BARRED O WITH DIAERESIS
+04ED ; Ll # CYRILLIC SMALL LETTER E WITH DIAERESIS
+04EF ; Ll # CYRILLIC SMALL LETTER U WITH MACRON
+04F1 ; Ll # CYRILLIC SMALL LETTER U WITH DIAERESIS
+04F3 ; Ll # CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+04F5 ; Ll # CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+04F7 ; Ll # CYRILLIC SMALL LETTER GHE WITH DESCENDER
+04F9 ; Ll # CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+04FB ; Ll # CYRILLIC SMALL LETTER GHE WITH STROKE AND HOOK
+04FD ; Ll # CYRILLIC SMALL LETTER HA WITH HOOK
+04FF ; Ll # CYRILLIC SMALL LETTER HA WITH STROKE
+0501 ; Ll # CYRILLIC SMALL LETTER KOMI DE
+0503 ; Ll # CYRILLIC SMALL LETTER KOMI DJE
+0505 ; Ll # CYRILLIC SMALL LETTER KOMI ZJE
+0507 ; Ll # CYRILLIC SMALL LETTER KOMI DZJE
+0509 ; Ll # CYRILLIC SMALL LETTER KOMI LJE
+050B ; Ll # CYRILLIC SMALL LETTER KOMI NJE
+050D ; Ll # CYRILLIC SMALL LETTER KOMI SJE
+050F ; Ll # CYRILLIC SMALL LETTER KOMI TJE
+0511 ; Ll # CYRILLIC SMALL LETTER REVERSED ZE
+0513 ; Ll # CYRILLIC SMALL LETTER EL WITH HOOK
+0515 ; Ll # CYRILLIC SMALL LETTER LHA
+0517 ; Ll # CYRILLIC SMALL LETTER RHA
+0519 ; Ll # CYRILLIC SMALL LETTER YAE
+051B ; Ll # CYRILLIC SMALL LETTER QA
+051D ; Ll # CYRILLIC SMALL LETTER WE
+051F ; Ll # CYRILLIC SMALL LETTER ALEUT KA
+0521 ; Ll # CYRILLIC SMALL LETTER EL WITH MIDDLE HOOK
+0523 ; Ll # CYRILLIC SMALL LETTER EN WITH MIDDLE HOOK
+0525 ; Ll # CYRILLIC SMALL LETTER PE WITH DESCENDER
+0527 ; Ll # CYRILLIC SMALL LETTER SHHA WITH DESCENDER
+0529 ; Ll # CYRILLIC SMALL LETTER EN WITH LEFT HOOK
+052B ; Ll # CYRILLIC SMALL LETTER DZZHE
+052D ; Ll # CYRILLIC SMALL LETTER DCHE
+052F ; Ll # CYRILLIC SMALL LETTER EL WITH DESCENDER
+0560..0588 ; Ll # [41] ARMENIAN SMALL LETTER TURNED AYB..ARMENIAN SMALL LETTER YI WITH STROKE
+10D0..10FA ; Ll # [43] GEORGIAN LETTER AN..GEORGIAN LETTER AIN
+10FD..10FF ; Ll # [3] GEORGIAN LETTER AEN..GEORGIAN LETTER LABIAL SIGN
+13F8..13FD ; Ll # [6] CHEROKEE SMALL LETTER YE..CHEROKEE SMALL LETTER MV
+1C80..1C88 ; Ll # [9] CYRILLIC SMALL LETTER ROUNDED VE..CYRILLIC SMALL LETTER UNBLENDED UK
+1D00..1D2B ; Ll # [44] LATIN LETTER SMALL CAPITAL A..CYRILLIC LETTER SMALL CAPITAL EL
+1D6B..1D77 ; Ll # [13] LATIN SMALL LETTER UE..LATIN SMALL LETTER TURNED G
+1D79..1D9A ; Ll # [34] LATIN SMALL LETTER INSULAR G..LATIN SMALL LETTER EZH WITH RETROFLEX HOOK
+1E01 ; Ll # LATIN SMALL LETTER A WITH RING BELOW
+1E03 ; Ll # LATIN SMALL LETTER B WITH DOT ABOVE
+1E05 ; Ll # LATIN SMALL LETTER B WITH DOT BELOW
+1E07 ; Ll # LATIN SMALL LETTER B WITH LINE BELOW
+1E09 ; Ll # LATIN SMALL LETTER C WITH CEDILLA AND ACUTE
+1E0B ; Ll # LATIN SMALL LETTER D WITH DOT ABOVE
+1E0D ; Ll # LATIN SMALL LETTER D WITH DOT BELOW
+1E0F ; Ll # LATIN SMALL LETTER D WITH LINE BELOW
+1E11 ; Ll # LATIN SMALL LETTER D WITH CEDILLA
+1E13 ; Ll # LATIN SMALL LETTER D WITH CIRCUMFLEX BELOW
+1E15 ; Ll # LATIN SMALL LETTER E WITH MACRON AND GRAVE
+1E17 ; Ll # LATIN SMALL LETTER E WITH MACRON AND ACUTE
+1E19 ; Ll # LATIN SMALL LETTER E WITH CIRCUMFLEX BELOW
+1E1B ; Ll # LATIN SMALL LETTER E WITH TILDE BELOW
+1E1D ; Ll # LATIN SMALL LETTER E WITH CEDILLA AND BREVE
+1E1F ; Ll # LATIN SMALL LETTER F WITH DOT ABOVE
+1E21 ; Ll # LATIN SMALL LETTER G WITH MACRON
+1E23 ; Ll # LATIN SMALL LETTER H WITH DOT ABOVE
+1E25 ; Ll # LATIN SMALL LETTER H WITH DOT BELOW
+1E27 ; Ll # LATIN SMALL LETTER H WITH DIAERESIS
+1E29 ; Ll # LATIN SMALL LETTER H WITH CEDILLA
+1E2B ; Ll # LATIN SMALL LETTER H WITH BREVE BELOW
+1E2D ; Ll # LATIN SMALL LETTER I WITH TILDE BELOW
+1E2F ; Ll # LATIN SMALL LETTER I WITH DIAERESIS AND ACUTE
+1E31 ; Ll # LATIN SMALL LETTER K WITH ACUTE
+1E33 ; Ll # LATIN SMALL LETTER K WITH DOT BELOW
+1E35 ; Ll # LATIN SMALL LETTER K WITH LINE BELOW
+1E37 ; Ll # LATIN SMALL LETTER L WITH DOT BELOW
+1E39 ; Ll # LATIN SMALL LETTER L WITH DOT BELOW AND MACRON
+1E3B ; Ll # LATIN SMALL LETTER L WITH LINE BELOW
+1E3D ; Ll # LATIN SMALL LETTER L WITH CIRCUMFLEX BELOW
+1E3F ; Ll # LATIN SMALL LETTER M WITH ACUTE
+1E41 ; Ll # LATIN SMALL LETTER M WITH DOT ABOVE
+1E43 ; Ll # LATIN SMALL LETTER M WITH DOT BELOW
+1E45 ; Ll # LATIN SMALL LETTER N WITH DOT ABOVE
+1E47 ; Ll # LATIN SMALL LETTER N WITH DOT BELOW
+1E49 ; Ll # LATIN SMALL LETTER N WITH LINE BELOW
+1E4B ; Ll # LATIN SMALL LETTER N WITH CIRCUMFLEX BELOW
+1E4D ; Ll # LATIN SMALL LETTER O WITH TILDE AND ACUTE
+1E4F ; Ll # LATIN SMALL LETTER O WITH TILDE AND DIAERESIS
+1E51 ; Ll # LATIN SMALL LETTER O WITH MACRON AND GRAVE
+1E53 ; Ll # LATIN SMALL LETTER O WITH MACRON AND ACUTE
+1E55 ; Ll # LATIN SMALL LETTER P WITH ACUTE
+1E57 ; Ll # LATIN SMALL LETTER P WITH DOT ABOVE
+1E59 ; Ll # LATIN SMALL LETTER R WITH DOT ABOVE
+1E5B ; Ll # LATIN SMALL LETTER R WITH DOT BELOW
+1E5D ; Ll # LATIN SMALL LETTER R WITH DOT BELOW AND MACRON
+1E5F ; Ll # LATIN SMALL LETTER R WITH LINE BELOW
+1E61 ; Ll # LATIN SMALL LETTER S WITH DOT ABOVE
+1E63 ; Ll # LATIN SMALL LETTER S WITH DOT BELOW
+1E65 ; Ll # LATIN SMALL LETTER S WITH ACUTE AND DOT ABOVE
+1E67 ; Ll # LATIN SMALL LETTER S WITH CARON AND DOT ABOVE
+1E69 ; Ll # LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE
+1E6B ; Ll # LATIN SMALL LETTER T WITH DOT ABOVE
+1E6D ; Ll # LATIN SMALL LETTER T WITH DOT BELOW
+1E6F ; Ll # LATIN SMALL LETTER T WITH LINE BELOW
+1E71 ; Ll # LATIN SMALL LETTER T WITH CIRCUMFLEX BELOW
+1E73 ; Ll # LATIN SMALL LETTER U WITH DIAERESIS BELOW
+1E75 ; Ll # LATIN SMALL LETTER U WITH TILDE BELOW
+1E77 ; Ll # LATIN SMALL LETTER U WITH CIRCUMFLEX BELOW
+1E79 ; Ll # LATIN SMALL LETTER U WITH TILDE AND ACUTE
+1E7B ; Ll # LATIN SMALL LETTER U WITH MACRON AND DIAERESIS
+1E7D ; Ll # LATIN SMALL LETTER V WITH TILDE
+1E7F ; Ll # LATIN SMALL LETTER V WITH DOT BELOW
+1E81 ; Ll # LATIN SMALL LETTER W WITH GRAVE
+1E83 ; Ll # LATIN SMALL LETTER W WITH ACUTE
+1E85 ; Ll # LATIN SMALL LETTER W WITH DIAERESIS
+1E87 ; Ll # LATIN SMALL LETTER W WITH DOT ABOVE
+1E89 ; Ll # LATIN SMALL LETTER W WITH DOT BELOW
+1E8B ; Ll # LATIN SMALL LETTER X WITH DOT ABOVE
+1E8D ; Ll # LATIN SMALL LETTER X WITH DIAERESIS
+1E8F ; Ll # LATIN SMALL LETTER Y WITH DOT ABOVE
+1E91 ; Ll # LATIN SMALL LETTER Z WITH CIRCUMFLEX
+1E93 ; Ll # LATIN SMALL LETTER Z WITH DOT BELOW
+1E95..1E9D ; Ll # [9] LATIN SMALL LETTER Z WITH LINE BELOW..LATIN SMALL LETTER LONG S WITH HIGH STROKE
+1E9F ; Ll # LATIN SMALL LETTER DELTA
+1EA1 ; Ll # LATIN SMALL LETTER A WITH DOT BELOW
+1EA3 ; Ll # LATIN SMALL LETTER A WITH HOOK ABOVE
+1EA5 ; Ll # LATIN SMALL LETTER A WITH CIRCUMFLEX AND ACUTE
+1EA7 ; Ll # LATIN SMALL LETTER A WITH CIRCUMFLEX AND GRAVE
+1EA9 ; Ll # LATIN SMALL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE
+1EAB ; Ll # LATIN SMALL LETTER A WITH CIRCUMFLEX AND TILDE
+1EAD ; Ll # LATIN SMALL LETTER A WITH CIRCUMFLEX AND DOT BELOW
+1EAF ; Ll # LATIN SMALL LETTER A WITH BREVE AND ACUTE
+1EB1 ; Ll # LATIN SMALL LETTER A WITH BREVE AND GRAVE
+1EB3 ; Ll # LATIN SMALL LETTER A WITH BREVE AND HOOK ABOVE
+1EB5 ; Ll # LATIN SMALL LETTER A WITH BREVE AND TILDE
+1EB7 ; Ll # LATIN SMALL LETTER A WITH BREVE AND DOT BELOW
+1EB9 ; Ll # LATIN SMALL LETTER E WITH DOT BELOW
+1EBB ; Ll # LATIN SMALL LETTER E WITH HOOK ABOVE
+1EBD ; Ll # LATIN SMALL LETTER E WITH TILDE
+1EBF ; Ll # LATIN SMALL LETTER E WITH CIRCUMFLEX AND ACUTE
+1EC1 ; Ll # LATIN SMALL LETTER E WITH CIRCUMFLEX AND GRAVE
+1EC3 ; Ll # LATIN SMALL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE
+1EC5 ; Ll # LATIN SMALL LETTER E WITH CIRCUMFLEX AND TILDE
+1EC7 ; Ll # LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW
+1EC9 ; Ll # LATIN SMALL LETTER I WITH HOOK ABOVE
+1ECB ; Ll # LATIN SMALL LETTER I WITH DOT BELOW
+1ECD ; Ll # LATIN SMALL LETTER O WITH DOT BELOW
+1ECF ; Ll # LATIN SMALL LETTER O WITH HOOK ABOVE
+1ED1 ; Ll # LATIN SMALL LETTER O WITH CIRCUMFLEX AND ACUTE
+1ED3 ; Ll # LATIN SMALL LETTER O WITH CIRCUMFLEX AND GRAVE
+1ED5 ; Ll # LATIN SMALL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE
+1ED7 ; Ll # LATIN SMALL LETTER O WITH CIRCUMFLEX AND TILDE
+1ED9 ; Ll # LATIN SMALL LETTER O WITH CIRCUMFLEX AND DOT BELOW
+1EDB ; Ll # LATIN SMALL LETTER O WITH HORN AND ACUTE
+1EDD ; Ll # LATIN SMALL LETTER O WITH HORN AND GRAVE
+1EDF ; Ll # LATIN SMALL LETTER O WITH HORN AND HOOK ABOVE
+1EE1 ; Ll # LATIN SMALL LETTER O WITH HORN AND TILDE
+1EE3 ; Ll # LATIN SMALL LETTER O WITH HORN AND DOT BELOW
+1EE5 ; Ll # LATIN SMALL LETTER U WITH DOT BELOW
+1EE7 ; Ll # LATIN SMALL LETTER U WITH HOOK ABOVE
+1EE9 ; Ll # LATIN SMALL LETTER U WITH HORN AND ACUTE
+1EEB ; Ll # LATIN SMALL LETTER U WITH HORN AND GRAVE
+1EED ; Ll # LATIN SMALL LETTER U WITH HORN AND HOOK ABOVE
+1EEF ; Ll # LATIN SMALL LETTER U WITH HORN AND TILDE
+1EF1 ; Ll # LATIN SMALL LETTER U WITH HORN AND DOT BELOW
+1EF3 ; Ll # LATIN SMALL LETTER Y WITH GRAVE
+1EF5 ; Ll # LATIN SMALL LETTER Y WITH DOT BELOW
+1EF7 ; Ll # LATIN SMALL LETTER Y WITH HOOK ABOVE
+1EF9 ; Ll # LATIN SMALL LETTER Y WITH TILDE
+1EFB ; Ll # LATIN SMALL LETTER MIDDLE-WELSH LL
+1EFD ; Ll # LATIN SMALL LETTER MIDDLE-WELSH V
+1EFF..1F07 ; Ll # [9] LATIN SMALL LETTER Y WITH LOOP..GREEK SMALL LETTER ALPHA WITH DASIA AND PERISPOMENI
+1F10..1F15 ; Ll # [6] GREEK SMALL LETTER EPSILON WITH PSILI..GREEK SMALL LETTER EPSILON WITH DASIA AND OXIA
+1F20..1F27 ; Ll # [8] GREEK SMALL LETTER ETA WITH PSILI..GREEK SMALL LETTER ETA WITH DASIA AND PERISPOMENI
+1F30..1F37 ; Ll # [8] GREEK SMALL LETTER IOTA WITH PSILI..GREEK SMALL LETTER IOTA WITH DASIA AND PERISPOMENI
+1F40..1F45 ; Ll # [6] GREEK SMALL LETTER OMICRON WITH PSILI..GREEK SMALL LETTER OMICRON WITH DASIA AND OXIA
+1F50..1F57 ; Ll # [8] GREEK SMALL LETTER UPSILON WITH PSILI..GREEK SMALL LETTER UPSILON WITH DASIA AND PERISPOMENI
+1F60..1F67 ; Ll # [8] GREEK SMALL LETTER OMEGA WITH PSILI..GREEK SMALL LETTER OMEGA WITH DASIA AND PERISPOMENI
+1F70..1F7D ; Ll # [14] GREEK SMALL LETTER ALPHA WITH VARIA..GREEK SMALL LETTER OMEGA WITH OXIA
+1F80..1F87 ; Ll # [8] GREEK SMALL LETTER ALPHA WITH PSILI AND YPOGEGRAMMENI..GREEK SMALL LETTER ALPHA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
+1F90..1F97 ; Ll # [8] GREEK SMALL LETTER ETA WITH PSILI AND YPOGEGRAMMENI..GREEK SMALL LETTER ETA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
+1FA0..1FA7 ; Ll # [8] GREEK SMALL LETTER OMEGA WITH PSILI AND YPOGEGRAMMENI..GREEK SMALL LETTER OMEGA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
+1FB0..1FB4 ; Ll # [5] GREEK SMALL LETTER ALPHA WITH VRACHY..GREEK SMALL LETTER ALPHA WITH OXIA AND YPOGEGRAMMENI
+1FB6..1FB7 ; Ll # [2] GREEK SMALL LETTER ALPHA WITH PERISPOMENI..GREEK SMALL LETTER ALPHA WITH PERISPOMENI AND YPOGEGRAMMENI
+1FBE ; Ll # GREEK PROSGEGRAMMENI
+1FC2..1FC4 ; Ll # [3] GREEK SMALL LETTER ETA WITH VARIA AND YPOGEGRAMMENI..GREEK SMALL LETTER ETA WITH OXIA AND YPOGEGRAMMENI
+1FC6..1FC7 ; Ll # [2] GREEK SMALL LETTER ETA WITH PERISPOMENI..GREEK SMALL LETTER ETA WITH PERISPOMENI AND YPOGEGRAMMENI
+1FD0..1FD3 ; Ll # [4] GREEK SMALL LETTER IOTA WITH VRACHY..GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA
+1FD6..1FD7 ; Ll # [2] GREEK SMALL LETTER IOTA WITH PERISPOMENI..GREEK SMALL LETTER IOTA WITH DIALYTIKA AND PERISPOMENI
+1FE0..1FE7 ; Ll # [8] GREEK SMALL LETTER UPSILON WITH VRACHY..GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND PERISPOMENI
+1FF2..1FF4 ; Ll # [3] GREEK SMALL LETTER OMEGA WITH VARIA AND YPOGEGRAMMENI..GREEK SMALL LETTER OMEGA WITH OXIA AND YPOGEGRAMMENI
+1FF6..1FF7 ; Ll # [2] GREEK SMALL LETTER OMEGA WITH PERISPOMENI..GREEK SMALL LETTER OMEGA WITH PERISPOMENI AND YPOGEGRAMMENI
+210A ; Ll # SCRIPT SMALL G
+210E..210F ; Ll # [2] PLANCK CONSTANT..PLANCK CONSTANT OVER TWO PI
+2113 ; Ll # SCRIPT SMALL L
+212F ; Ll # SCRIPT SMALL E
+2134 ; Ll # SCRIPT SMALL O
+2139 ; Ll # INFORMATION SOURCE
+213C..213D ; Ll # [2] DOUBLE-STRUCK SMALL PI..DOUBLE-STRUCK SMALL GAMMA
+2146..2149 ; Ll # [4] DOUBLE-STRUCK ITALIC SMALL D..DOUBLE-STRUCK ITALIC SMALL J
+214E ; Ll # TURNED SMALL F
+2184 ; Ll # LATIN SMALL LETTER REVERSED C
+2C30..2C5E ; Ll # [47] GLAGOLITIC SMALL LETTER AZU..GLAGOLITIC SMALL LETTER LATINATE MYSLITE
+2C61 ; Ll # LATIN SMALL LETTER L WITH DOUBLE BAR
+2C65..2C66 ; Ll # [2] LATIN SMALL LETTER A WITH STROKE..LATIN SMALL LETTER T WITH DIAGONAL STROKE
+2C68 ; Ll # LATIN SMALL LETTER H WITH DESCENDER
+2C6A ; Ll # LATIN SMALL LETTER K WITH DESCENDER
+2C6C ; Ll # LATIN SMALL LETTER Z WITH DESCENDER
+2C71 ; Ll # LATIN SMALL LETTER V WITH RIGHT HOOK
+2C73..2C74 ; Ll # [2] LATIN SMALL LETTER W WITH HOOK..LATIN SMALL LETTER V WITH CURL
+2C76..2C7B ; Ll # [6] LATIN SMALL LETTER HALF H..LATIN LETTER SMALL CAPITAL TURNED E
+2C81 ; Ll # COPTIC SMALL LETTER ALFA
+2C83 ; Ll # COPTIC SMALL LETTER VIDA
+2C85 ; Ll # COPTIC SMALL LETTER GAMMA
+2C87 ; Ll # COPTIC SMALL LETTER DALDA
+2C89 ; Ll # COPTIC SMALL LETTER EIE
+2C8B ; Ll # COPTIC SMALL LETTER SOU
+2C8D ; Ll # COPTIC SMALL LETTER ZATA
+2C8F ; Ll # COPTIC SMALL LETTER HATE
+2C91 ; Ll # COPTIC SMALL LETTER THETHE
+2C93 ; Ll # COPTIC SMALL LETTER IAUDA
+2C95 ; Ll # COPTIC SMALL LETTER KAPA
+2C97 ; Ll # COPTIC SMALL LETTER LAULA
+2C99 ; Ll # COPTIC SMALL LETTER MI
+2C9B ; Ll # COPTIC SMALL LETTER NI
+2C9D ; Ll # COPTIC SMALL LETTER KSI
+2C9F ; Ll # COPTIC SMALL LETTER O
+2CA1 ; Ll # COPTIC SMALL LETTER PI
+2CA3 ; Ll # COPTIC SMALL LETTER RO
+2CA5 ; Ll # COPTIC SMALL LETTER SIMA
+2CA7 ; Ll # COPTIC SMALL LETTER TAU
+2CA9 ; Ll # COPTIC SMALL LETTER UA
+2CAB ; Ll # COPTIC SMALL LETTER FI
+2CAD ; Ll # COPTIC SMALL LETTER KHI
+2CAF ; Ll # COPTIC SMALL LETTER PSI
+2CB1 ; Ll # COPTIC SMALL LETTER OOU
+2CB3 ; Ll # COPTIC SMALL LETTER DIALECT-P ALEF
+2CB5 ; Ll # COPTIC SMALL LETTER OLD COPTIC AIN
+2CB7 ; Ll # COPTIC SMALL LETTER CRYPTOGRAMMIC EIE
+2CB9 ; Ll # COPTIC SMALL LETTER DIALECT-P KAPA
+2CBB ; Ll # COPTIC SMALL LETTER DIALECT-P NI
+2CBD ; Ll # COPTIC SMALL LETTER CRYPTOGRAMMIC NI
+2CBF ; Ll # COPTIC SMALL LETTER OLD COPTIC OOU
+2CC1 ; Ll # COPTIC SMALL LETTER SAMPI
+2CC3 ; Ll # COPTIC SMALL LETTER CROSSED SHEI
+2CC5 ; Ll # COPTIC SMALL LETTER OLD COPTIC SHEI
+2CC7 ; Ll # COPTIC SMALL LETTER OLD COPTIC ESH
+2CC9 ; Ll # COPTIC SMALL LETTER AKHMIMIC KHEI
+2CCB ; Ll # COPTIC SMALL LETTER DIALECT-P HORI
+2CCD ; Ll # COPTIC SMALL LETTER OLD COPTIC HORI
+2CCF ; Ll # COPTIC SMALL LETTER OLD COPTIC HA
+2CD1 ; Ll # COPTIC SMALL LETTER L-SHAPED HA
+2CD3 ; Ll # COPTIC SMALL LETTER OLD COPTIC HEI
+2CD5 ; Ll # COPTIC SMALL LETTER OLD COPTIC HAT
+2CD7 ; Ll # COPTIC SMALL LETTER OLD COPTIC GANGIA
+2CD9 ; Ll # COPTIC SMALL LETTER OLD COPTIC DJA
+2CDB ; Ll # COPTIC SMALL LETTER OLD COPTIC SHIMA
+2CDD ; Ll # COPTIC SMALL LETTER OLD NUBIAN SHIMA
+2CDF ; Ll # COPTIC SMALL LETTER OLD NUBIAN NGI
+2CE1 ; Ll # COPTIC SMALL LETTER OLD NUBIAN NYI
+2CE3..2CE4 ; Ll # [2] COPTIC SMALL LETTER OLD NUBIAN WAU..COPTIC SYMBOL KAI
+2CEC ; Ll # COPTIC SMALL LETTER CRYPTOGRAMMIC SHEI
+2CEE ; Ll # COPTIC SMALL LETTER CRYPTOGRAMMIC GANGIA
+2CF3 ; Ll # COPTIC SMALL LETTER BOHAIRIC KHEI
+2D00..2D25 ; Ll # [38] GEORGIAN SMALL LETTER AN..GEORGIAN SMALL LETTER HOE
+2D27 ; Ll # GEORGIAN SMALL LETTER YN
+2D2D ; Ll # GEORGIAN SMALL LETTER AEN
+A641 ; Ll # CYRILLIC SMALL LETTER ZEMLYA
+A643 ; Ll # CYRILLIC SMALL LETTER DZELO
+A645 ; Ll # CYRILLIC SMALL LETTER REVERSED DZE
+A647 ; Ll # CYRILLIC SMALL LETTER IOTA
+A649 ; Ll # CYRILLIC SMALL LETTER DJERV
+A64B ; Ll # CYRILLIC SMALL LETTER MONOGRAPH UK
+A64D ; Ll # CYRILLIC SMALL LETTER BROAD OMEGA
+A64F ; Ll # CYRILLIC SMALL LETTER NEUTRAL YER
+A651 ; Ll # CYRILLIC SMALL LETTER YERU WITH BACK YER
+A653 ; Ll # CYRILLIC SMALL LETTER IOTIFIED YAT
+A655 ; Ll # CYRILLIC SMALL LETTER REVERSED YU
+A657 ; Ll # CYRILLIC SMALL LETTER IOTIFIED A
+A659 ; Ll # CYRILLIC SMALL LETTER CLOSED LITTLE YUS
+A65B ; Ll # CYRILLIC SMALL LETTER BLENDED YUS
+A65D ; Ll # CYRILLIC SMALL LETTER IOTIFIED CLOSED LITTLE YUS
+A65F ; Ll # CYRILLIC SMALL LETTER YN
+A661 ; Ll # CYRILLIC SMALL LETTER REVERSED TSE
+A663 ; Ll # CYRILLIC SMALL LETTER SOFT DE
+A665 ; Ll # CYRILLIC SMALL LETTER SOFT EL
+A667 ; Ll # CYRILLIC SMALL LETTER SOFT EM
+A669 ; Ll # CYRILLIC SMALL LETTER MONOCULAR O
+A66B ; Ll # CYRILLIC SMALL LETTER BINOCULAR O
+A66D ; Ll # CYRILLIC SMALL LETTER DOUBLE MONOCULAR O
+A681 ; Ll # CYRILLIC SMALL LETTER DWE
+A683 ; Ll # CYRILLIC SMALL LETTER DZWE
+A685 ; Ll # CYRILLIC SMALL LETTER ZHWE
+A687 ; Ll # CYRILLIC SMALL LETTER CCHE
+A689 ; Ll # CYRILLIC SMALL LETTER DZZE
+A68B ; Ll # CYRILLIC SMALL LETTER TE WITH MIDDLE HOOK
+A68D ; Ll # CYRILLIC SMALL LETTER TWE
+A68F ; Ll # CYRILLIC SMALL LETTER TSWE
+A691 ; Ll # CYRILLIC SMALL LETTER TSSE
+A693 ; Ll # CYRILLIC SMALL LETTER TCHE
+A695 ; Ll # CYRILLIC SMALL LETTER HWE
+A697 ; Ll # CYRILLIC SMALL LETTER SHWE
+A699 ; Ll # CYRILLIC SMALL LETTER DOUBLE O
+A69B ; Ll # CYRILLIC SMALL LETTER CROSSED O
+A723 ; Ll # LATIN SMALL LETTER EGYPTOLOGICAL ALEF
+A725 ; Ll # LATIN SMALL LETTER EGYPTOLOGICAL AIN
+A727 ; Ll # LATIN SMALL LETTER HENG
+A729 ; Ll # LATIN SMALL LETTER TZ
+A72B ; Ll # LATIN SMALL LETTER TRESILLO
+A72D ; Ll # LATIN SMALL LETTER CUATRILLO
+A72F..A731 ; Ll # [3] LATIN SMALL LETTER CUATRILLO WITH COMMA..LATIN LETTER SMALL CAPITAL S
+A733 ; Ll # LATIN SMALL LETTER AA
+A735 ; Ll # LATIN SMALL LETTER AO
+A737 ; Ll # LATIN SMALL LETTER AU
+A739 ; Ll # LATIN SMALL LETTER AV
+A73B ; Ll # LATIN SMALL LETTER AV WITH HORIZONTAL BAR
+A73D ; Ll # LATIN SMALL LETTER AY
+A73F ; Ll # LATIN SMALL LETTER REVERSED C WITH DOT
+A741 ; Ll # LATIN SMALL LETTER K WITH STROKE
+A743 ; Ll # LATIN SMALL LETTER K WITH DIAGONAL STROKE
+A745 ; Ll # LATIN SMALL LETTER K WITH STROKE AND DIAGONAL STROKE
+A747 ; Ll # LATIN SMALL LETTER BROKEN L
+A749 ; Ll # LATIN SMALL LETTER L WITH HIGH STROKE
+A74B ; Ll # LATIN SMALL LETTER O WITH LONG STROKE OVERLAY
+A74D ; Ll # LATIN SMALL LETTER O WITH LOOP
+A74F ; Ll # LATIN SMALL LETTER OO
+A751 ; Ll # LATIN SMALL LETTER P WITH STROKE THROUGH DESCENDER
+A753 ; Ll # LATIN SMALL LETTER P WITH FLOURISH
+A755 ; Ll # LATIN SMALL LETTER P WITH SQUIRREL TAIL
+A757 ; Ll # LATIN SMALL LETTER Q WITH STROKE THROUGH DESCENDER
+A759 ; Ll # LATIN SMALL LETTER Q WITH DIAGONAL STROKE
+A75B ; Ll # LATIN SMALL LETTER R ROTUNDA
+A75D ; Ll # LATIN SMALL LETTER RUM ROTUNDA
+A75F ; Ll # LATIN SMALL LETTER V WITH DIAGONAL STROKE
+A761 ; Ll # LATIN SMALL LETTER VY
+A763 ; Ll # LATIN SMALL LETTER VISIGOTHIC Z
+A765 ; Ll # LATIN SMALL LETTER THORN WITH STROKE
+A767 ; Ll # LATIN SMALL LETTER THORN WITH STROKE THROUGH DESCENDER
+A769 ; Ll # LATIN SMALL LETTER VEND
+A76B ; Ll # LATIN SMALL LETTER ET
+A76D ; Ll # LATIN SMALL LETTER IS
+A76F ; Ll # LATIN SMALL LETTER CON
+A771..A778 ; Ll # [8] LATIN SMALL LETTER DUM..LATIN SMALL LETTER UM
+A77A ; Ll # LATIN SMALL LETTER INSULAR D
+A77C ; Ll # LATIN SMALL LETTER INSULAR F
+A77F ; Ll # LATIN SMALL LETTER TURNED INSULAR G
+A781 ; Ll # LATIN SMALL LETTER TURNED L
+A783 ; Ll # LATIN SMALL LETTER INSULAR R
+A785 ; Ll # LATIN SMALL LETTER INSULAR S
+A787 ; Ll # LATIN SMALL LETTER INSULAR T
+A78C ; Ll # LATIN SMALL LETTER SALTILLO
+A78E ; Ll # LATIN SMALL LETTER L WITH RETROFLEX HOOK AND BELT
+A791 ; Ll # LATIN SMALL LETTER N WITH DESCENDER
+A793..A795 ; Ll # [3] LATIN SMALL LETTER C WITH BAR..LATIN SMALL LETTER H WITH PALATAL HOOK
+A797 ; Ll # LATIN SMALL LETTER B WITH FLOURISH
+A799 ; Ll # LATIN SMALL LETTER F WITH STROKE
+A79B ; Ll # LATIN SMALL LETTER VOLAPUK AE
+A79D ; Ll # LATIN SMALL LETTER VOLAPUK OE
+A79F ; Ll # LATIN SMALL LETTER VOLAPUK UE
+A7A1 ; Ll # LATIN SMALL LETTER G WITH OBLIQUE STROKE
+A7A3 ; Ll # LATIN SMALL LETTER K WITH OBLIQUE STROKE
+A7A5 ; Ll # LATIN SMALL LETTER N WITH OBLIQUE STROKE
+A7A7 ; Ll # LATIN SMALL LETTER R WITH OBLIQUE STROKE
+A7A9 ; Ll # LATIN SMALL LETTER S WITH OBLIQUE STROKE
+A7AF ; Ll # LATIN LETTER SMALL CAPITAL Q
+A7B5 ; Ll # LATIN SMALL LETTER BETA
+A7B7 ; Ll # LATIN SMALL LETTER OMEGA
+A7B9 ; Ll # LATIN SMALL LETTER U WITH STROKE
+A7BB ; Ll # LATIN SMALL LETTER GLOTTAL A
+A7BD ; Ll # LATIN SMALL LETTER GLOTTAL I
+A7BF ; Ll # LATIN SMALL LETTER GLOTTAL U
+A7C3 ; Ll # LATIN SMALL LETTER ANGLICANA W
+A7FA ; Ll # LATIN LETTER SMALL CAPITAL TURNED M
+AB30..AB5A ; Ll # [43] LATIN SMALL LETTER BARRED ALPHA..LATIN SMALL LETTER Y WITH SHORT RIGHT LEG
+AB60..AB67 ; Ll # [8] LATIN SMALL LETTER SAKHA YAT..LATIN SMALL LETTER TS DIGRAPH WITH RETROFLEX HOOK
+AB70..ABBF ; Ll # [80] CHEROKEE SMALL LETTER A..CHEROKEE SMALL LETTER YA
+FB00..FB06 ; Ll # [7] LATIN SMALL LIGATURE FF..LATIN SMALL LIGATURE ST
+FB13..FB17 ; Ll # [5] ARMENIAN SMALL LIGATURE MEN NOW..ARMENIAN SMALL LIGATURE MEN XEH
+FF41..FF5A ; Ll # [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN SMALL LETTER Z
+10428..1044F ; Ll # [40] DESERET SMALL LETTER LONG I..DESERET SMALL LETTER EW
+104D8..104FB ; Ll # [36] OSAGE SMALL LETTER A..OSAGE SMALL LETTER ZHA
+10CC0..10CF2 ; Ll # [51] OLD HUNGARIAN SMALL LETTER A..OLD HUNGARIAN SMALL LETTER US
+118C0..118DF ; Ll # [32] WARANG CITI SMALL LETTER NGAA..WARANG CITI SMALL LETTER VIYO
+16E60..16E7F ; Ll # [32] MEDEFAIDRIN SMALL LETTER M..MEDEFAIDRIN SMALL LETTER Y
+1D41A..1D433 ; Ll # [26] MATHEMATICAL BOLD SMALL A..MATHEMATICAL BOLD SMALL Z
+1D44E..1D454 ; Ll # [7] MATHEMATICAL ITALIC SMALL A..MATHEMATICAL ITALIC SMALL G
+1D456..1D467 ; Ll # [18] MATHEMATICAL ITALIC SMALL I..MATHEMATICAL ITALIC SMALL Z
+1D482..1D49B ; Ll # [26] MATHEMATICAL BOLD ITALIC SMALL A..MATHEMATICAL BOLD ITALIC SMALL Z
+1D4B6..1D4B9 ; Ll # [4] MATHEMATICAL SCRIPT SMALL A..MATHEMATICAL SCRIPT SMALL D
+1D4BB ; Ll # MATHEMATICAL SCRIPT SMALL F
+1D4BD..1D4C3 ; Ll # [7] MATHEMATICAL SCRIPT SMALL H..MATHEMATICAL SCRIPT SMALL N
+1D4C5..1D4CF ; Ll # [11] MATHEMATICAL SCRIPT SMALL P..MATHEMATICAL SCRIPT SMALL Z
+1D4EA..1D503 ; Ll # [26] MATHEMATICAL BOLD SCRIPT SMALL A..MATHEMATICAL BOLD SCRIPT SMALL Z
+1D51E..1D537 ; Ll # [26] MATHEMATICAL FRAKTUR SMALL A..MATHEMATICAL FRAKTUR SMALL Z
+1D552..1D56B ; Ll # [26] MATHEMATICAL DOUBLE-STRUCK SMALL A..MATHEMATICAL DOUBLE-STRUCK SMALL Z
+1D586..1D59F ; Ll # [26] MATHEMATICAL BOLD FRAKTUR SMALL A..MATHEMATICAL BOLD FRAKTUR SMALL Z
+1D5BA..1D5D3 ; Ll # [26] MATHEMATICAL SANS-SERIF SMALL A..MATHEMATICAL SANS-SERIF SMALL Z
+1D5EE..1D607 ; Ll # [26] MATHEMATICAL SANS-SERIF BOLD SMALL A..MATHEMATICAL SANS-SERIF BOLD SMALL Z
+1D622..1D63B ; Ll # [26] MATHEMATICAL SANS-SERIF ITALIC SMALL A..MATHEMATICAL SANS-SERIF ITALIC SMALL Z
+1D656..1D66F ; Ll # [26] MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL A..MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL Z
+1D68A..1D6A5 ; Ll # [28] MATHEMATICAL MONOSPACE SMALL A..MATHEMATICAL ITALIC SMALL DOTLESS J
+1D6C2..1D6DA ; Ll # [25] MATHEMATICAL BOLD SMALL ALPHA..MATHEMATICAL BOLD SMALL OMEGA
+1D6DC..1D6E1 ; Ll # [6] MATHEMATICAL BOLD EPSILON SYMBOL..MATHEMATICAL BOLD PI SYMBOL
+1D6FC..1D714 ; Ll # [25] MATHEMATICAL ITALIC SMALL ALPHA..MATHEMATICAL ITALIC SMALL OMEGA
+1D716..1D71B ; Ll # [6] MATHEMATICAL ITALIC EPSILON SYMBOL..MATHEMATICAL ITALIC PI SYMBOL
+1D736..1D74E ; Ll # [25] MATHEMATICAL BOLD ITALIC SMALL ALPHA..MATHEMATICAL BOLD ITALIC SMALL OMEGA
+1D750..1D755 ; Ll # [6] MATHEMATICAL BOLD ITALIC EPSILON SYMBOL..MATHEMATICAL BOLD ITALIC PI SYMBOL
+1D770..1D788 ; Ll # [25] MATHEMATICAL SANS-SERIF BOLD SMALL ALPHA..MATHEMATICAL SANS-SERIF BOLD SMALL OMEGA
+1D78A..1D78F ; Ll # [6] MATHEMATICAL SANS-SERIF BOLD EPSILON SYMBOL..MATHEMATICAL SANS-SERIF BOLD PI SYMBOL
+1D7AA..1D7C2 ; Ll # [25] MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL ALPHA..MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL OMEGA
+1D7C4..1D7C9 ; Ll # [6] MATHEMATICAL SANS-SERIF BOLD ITALIC EPSILON SYMBOL..MATHEMATICAL SANS-SERIF BOLD ITALIC PI SYMBOL
+1D7CB ; Ll # MATHEMATICAL BOLD SMALL DIGAMMA
+1E922..1E943 ; Ll # [34] ADLAM SMALL LETTER ALIF..ADLAM SMALL LETTER SHA
+
+# Total code points: 2151
+
+# ================================================
+
+# General_Category=Titlecase_Letter
+
+01C5 ; Lt # LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON
+01C8 ; Lt # LATIN CAPITAL LETTER L WITH SMALL LETTER J
+01CB ; Lt # LATIN CAPITAL LETTER N WITH SMALL LETTER J
+01F2 ; Lt # LATIN CAPITAL LETTER D WITH SMALL LETTER Z
+1F88..1F8F ; Lt # [8] GREEK CAPITAL LETTER ALPHA WITH PSILI AND PROSGEGRAMMENI..GREEK CAPITAL LETTER ALPHA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
+1F98..1F9F ; Lt # [8] GREEK CAPITAL LETTER ETA WITH PSILI AND PROSGEGRAMMENI..GREEK CAPITAL LETTER ETA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
+1FA8..1FAF ; Lt # [8] GREEK CAPITAL LETTER OMEGA WITH PSILI AND PROSGEGRAMMENI..GREEK CAPITAL LETTER OMEGA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
+1FBC ; Lt # GREEK CAPITAL LETTER ALPHA WITH PROSGEGRAMMENI
+1FCC ; Lt # GREEK CAPITAL LETTER ETA WITH PROSGEGRAMMENI
+1FFC ; Lt # GREEK CAPITAL LETTER OMEGA WITH PROSGEGRAMMENI
+
+# Total code points: 31
+
+# ================================================
+
+# General_Category=Modifier_Letter
+
+02B0..02C1 ; Lm # [18] MODIFIER LETTER SMALL H..MODIFIER LETTER REVERSED GLOTTAL STOP
+02C6..02D1 ; Lm # [12] MODIFIER LETTER CIRCUMFLEX ACCENT..MODIFIER LETTER HALF TRIANGULAR COLON
+02E0..02E4 ; Lm # [5] MODIFIER LETTER SMALL GAMMA..MODIFIER LETTER SMALL REVERSED GLOTTAL STOP
+02EC ; Lm # MODIFIER LETTER VOICING
+02EE ; Lm # MODIFIER LETTER DOUBLE APOSTROPHE
+0374 ; Lm # GREEK NUMERAL SIGN
+037A ; Lm # GREEK YPOGEGRAMMENI
+0559 ; Lm # ARMENIAN MODIFIER LETTER LEFT HALF RING
+0640 ; Lm # ARABIC TATWEEL
+06E5..06E6 ; Lm # [2] ARABIC SMALL WAW..ARABIC SMALL YEH
+07F4..07F5 ; Lm # [2] NKO HIGH TONE APOSTROPHE..NKO LOW TONE APOSTROPHE
+07FA ; Lm # NKO LAJANYALAN
+081A ; Lm # SAMARITAN MODIFIER LETTER EPENTHETIC YUT
+0824 ; Lm # SAMARITAN MODIFIER LETTER SHORT A
+0828 ; Lm # SAMARITAN MODIFIER LETTER I
+0971 ; Lm # DEVANAGARI SIGN HIGH SPACING DOT
+0E46 ; Lm # THAI CHARACTER MAIYAMOK
+0EC6 ; Lm # LAO KO LA
+10FC ; Lm # MODIFIER LETTER GEORGIAN NAR
+17D7 ; Lm # KHMER SIGN LEK TOO
+1843 ; Lm # MONGOLIAN LETTER TODO LONG VOWEL SIGN
+1AA7 ; Lm # TAI THAM SIGN MAI YAMOK
+1C78..1C7D ; Lm # [6] OL CHIKI MU TTUDDAG..OL CHIKI AHAD
+1D2C..1D6A ; Lm # [63] MODIFIER LETTER CAPITAL A..GREEK SUBSCRIPT SMALL LETTER CHI
+1D78 ; Lm # MODIFIER LETTER CYRILLIC EN
+1D9B..1DBF ; Lm # [37] MODIFIER LETTER SMALL TURNED ALPHA..MODIFIER LETTER SMALL THETA
+2071 ; Lm # SUPERSCRIPT LATIN SMALL LETTER I
+207F ; Lm # SUPERSCRIPT LATIN SMALL LETTER N
+2090..209C ; Lm # [13] LATIN SUBSCRIPT SMALL LETTER A..LATIN SUBSCRIPT SMALL LETTER T
+2C7C..2C7D ; Lm # [2] LATIN SUBSCRIPT SMALL LETTER J..MODIFIER LETTER CAPITAL V
+2D6F ; Lm # TIFINAGH MODIFIER LETTER LABIALIZATION MARK
+2E2F ; Lm # VERTICAL TILDE
+3005 ; Lm # IDEOGRAPHIC ITERATION MARK
+3031..3035 ; Lm # [5] VERTICAL KANA REPEAT MARK..VERTICAL KANA REPEAT MARK LOWER HALF
+303B ; Lm # VERTICAL IDEOGRAPHIC ITERATION MARK
+309D..309E ; Lm # [2] HIRAGANA ITERATION MARK..HIRAGANA VOICED ITERATION MARK
+30FC..30FE ; Lm # [3] KATAKANA-HIRAGANA PROLONGED SOUND MARK..KATAKANA VOICED ITERATION MARK
+A015 ; Lm # YI SYLLABLE WU
+A4F8..A4FD ; Lm # [6] LISU LETTER TONE MYA TI..LISU LETTER TONE MYA JEU
+A60C ; Lm # VAI SYLLABLE LENGTHENER
+A67F ; Lm # CYRILLIC PAYEROK
+A69C..A69D ; Lm # [2] MODIFIER LETTER CYRILLIC HARD SIGN..MODIFIER LETTER CYRILLIC SOFT SIGN
+A717..A71F ; Lm # [9] MODIFIER LETTER DOT VERTICAL BAR..MODIFIER LETTER LOW INVERTED EXCLAMATION MARK
+A770 ; Lm # MODIFIER LETTER US
+A788 ; Lm # MODIFIER LETTER LOW CIRCUMFLEX ACCENT
+A7F8..A7F9 ; Lm # [2] MODIFIER LETTER CAPITAL H WITH STROKE..MODIFIER LETTER SMALL LIGATURE OE
+A9CF ; Lm # JAVANESE PANGRANGKEP
+A9E6 ; Lm # MYANMAR MODIFIER LETTER SHAN REDUPLICATION
+AA70 ; Lm # MYANMAR MODIFIER LETTER KHAMTI REDUPLICATION
+AADD ; Lm # TAI VIET SYMBOL SAM
+AAF3..AAF4 ; Lm # [2] MEETEI MAYEK SYLLABLE REPETITION MARK..MEETEI MAYEK WORD REPETITION MARK
+AB5C..AB5F ; Lm # [4] MODIFIER LETTER SMALL HENG..MODIFIER LETTER SMALL U WITH LEFT HOOK
+FF70 ; Lm # HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK
+FF9E..FF9F ; Lm # [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK
+16B40..16B43 ; Lm # [4] PAHAWH HMONG SIGN VOS SEEV..PAHAWH HMONG SIGN IB YAM
+16F93..16F9F ; Lm # [13] MIAO LETTER TONE-2..MIAO LETTER REFORMED TONE-8
+16FE0..16FE1 ; Lm # [2] TANGUT ITERATION MARK..NUSHU ITERATION MARK
+16FE3 ; Lm # OLD CHINESE ITERATION MARK
+1E137..1E13D ; Lm # [7] NYIAKENG PUACHUE HMONG SIGN FOR PERSON..NYIAKENG PUACHUE HMONG SYLLABLE LENGTHENER
+1E94B ; Lm # ADLAM NASALIZATION MARK
+
+# Total code points: 259
+
+# ================================================
+
+# General_Category=Other_Letter
+
+00AA ; Lo # FEMININE ORDINAL INDICATOR
+00BA ; Lo # MASCULINE ORDINAL INDICATOR
+01BB ; Lo # LATIN LETTER TWO WITH STROKE
+01C0..01C3 ; Lo # [4] LATIN LETTER DENTAL CLICK..LATIN LETTER RETROFLEX CLICK
+0294 ; Lo # LATIN LETTER GLOTTAL STOP
+05D0..05EA ; Lo # [27] HEBREW LETTER ALEF..HEBREW LETTER TAV
+05EF..05F2 ; Lo # [4] HEBREW YOD TRIANGLE..HEBREW LIGATURE YIDDISH DOUBLE YOD
+0620..063F ; Lo # [32] ARABIC LETTER KASHMIRI YEH..ARABIC LETTER FARSI YEH WITH THREE DOTS ABOVE
+0641..064A ; Lo # [10] ARABIC LETTER FEH..ARABIC LETTER YEH
+066E..066F ; Lo # [2] ARABIC LETTER DOTLESS BEH..ARABIC LETTER DOTLESS QAF
+0671..06D3 ; Lo # [99] ARABIC LETTER ALEF WASLA..ARABIC LETTER YEH BARREE WITH HAMZA ABOVE
+06D5 ; Lo # ARABIC LETTER AE
+06EE..06EF ; Lo # [2] ARABIC LETTER DAL WITH INVERTED V..ARABIC LETTER REH WITH INVERTED V
+06FA..06FC ; Lo # [3] ARABIC LETTER SHEEN WITH DOT BELOW..ARABIC LETTER GHAIN WITH DOT BELOW
+06FF ; Lo # ARABIC LETTER HEH WITH INVERTED V
+0710 ; Lo # SYRIAC LETTER ALAPH
+0712..072F ; Lo # [30] SYRIAC LETTER BETH..SYRIAC LETTER PERSIAN DHALATH
+074D..07A5 ; Lo # [89] SYRIAC LETTER SOGDIAN ZHAIN..THAANA LETTER WAAVU
+07B1 ; Lo # THAANA LETTER NAA
+07CA..07EA ; Lo # [33] NKO LETTER A..NKO LETTER JONA RA
+0800..0815 ; Lo # [22] SAMARITAN LETTER ALAF..SAMARITAN LETTER TAAF
+0840..0858 ; Lo # [25] MANDAIC LETTER HALQA..MANDAIC LETTER AIN
+0860..086A ; Lo # [11] SYRIAC LETTER MALAYALAM NGA..SYRIAC LETTER MALAYALAM SSA
+08A0..08B4 ; Lo # [21] ARABIC LETTER BEH WITH SMALL V BELOW..ARABIC LETTER KAF WITH DOT BELOW
+08B6..08BD ; Lo # [8] ARABIC LETTER BEH WITH SMALL MEEM ABOVE..ARABIC LETTER AFRICAN NOON
+0904..0939 ; Lo # [54] DEVANAGARI LETTER SHORT A..DEVANAGARI LETTER HA
+093D ; Lo # DEVANAGARI SIGN AVAGRAHA
+0950 ; Lo # DEVANAGARI OM
+0958..0961 ; Lo # [10] DEVANAGARI LETTER QA..DEVANAGARI LETTER VOCALIC LL
+0972..0980 ; Lo # [15] DEVANAGARI LETTER CANDRA A..BENGALI ANJI
+0985..098C ; Lo # [8] BENGALI LETTER A..BENGALI LETTER VOCALIC L
+098F..0990 ; Lo # [2] BENGALI LETTER E..BENGALI LETTER AI
+0993..09A8 ; Lo # [22] BENGALI LETTER O..BENGALI LETTER NA
+09AA..09B0 ; Lo # [7] BENGALI LETTER PA..BENGALI LETTER RA
+09B2 ; Lo # BENGALI LETTER LA
+09B6..09B9 ; Lo # [4] BENGALI LETTER SHA..BENGALI LETTER HA
+09BD ; Lo # BENGALI SIGN AVAGRAHA
+09CE ; Lo # BENGALI LETTER KHANDA TA
+09DC..09DD ; Lo # [2] BENGALI LETTER RRA..BENGALI LETTER RHA
+09DF..09E1 ; Lo # [3] BENGALI LETTER YYA..BENGALI LETTER VOCALIC LL
+09F0..09F1 ; Lo # [2] BENGALI LETTER RA WITH MIDDLE DIAGONAL..BENGALI LETTER RA WITH LOWER DIAGONAL
+09FC ; Lo # BENGALI LETTER VEDIC ANUSVARA
+0A05..0A0A ; Lo # [6] GURMUKHI LETTER A..GURMUKHI LETTER UU
+0A0F..0A10 ; Lo # [2] GURMUKHI LETTER EE..GURMUKHI LETTER AI
+0A13..0A28 ; Lo # [22] GURMUKHI LETTER OO..GURMUKHI LETTER NA
+0A2A..0A30 ; Lo # [7] GURMUKHI LETTER PA..GURMUKHI LETTER RA
+0A32..0A33 ; Lo # [2] GURMUKHI LETTER LA..GURMUKHI LETTER LLA
+0A35..0A36 ; Lo # [2] GURMUKHI LETTER VA..GURMUKHI LETTER SHA
+0A38..0A39 ; Lo # [2] GURMUKHI LETTER SA..GURMUKHI LETTER HA
+0A59..0A5C ; Lo # [4] GURMUKHI LETTER KHHA..GURMUKHI LETTER RRA
+0A5E ; Lo # GURMUKHI LETTER FA
+0A72..0A74 ; Lo # [3] GURMUKHI IRI..GURMUKHI EK ONKAR
+0A85..0A8D ; Lo # [9] GUJARATI LETTER A..GUJARATI VOWEL CANDRA E
+0A8F..0A91 ; Lo # [3] GUJARATI LETTER E..GUJARATI VOWEL CANDRA O
+0A93..0AA8 ; Lo # [22] GUJARATI LETTER O..GUJARATI LETTER NA
+0AAA..0AB0 ; Lo # [7] GUJARATI LETTER PA..GUJARATI LETTER RA
+0AB2..0AB3 ; Lo # [2] GUJARATI LETTER LA..GUJARATI LETTER LLA
+0AB5..0AB9 ; Lo # [5] GUJARATI LETTER VA..GUJARATI LETTER HA
+0ABD ; Lo # GUJARATI SIGN AVAGRAHA
+0AD0 ; Lo # GUJARATI OM
+0AE0..0AE1 ; Lo # [2] GUJARATI LETTER VOCALIC RR..GUJARATI LETTER VOCALIC LL
+0AF9 ; Lo # GUJARATI LETTER ZHA
+0B05..0B0C ; Lo # [8] ORIYA LETTER A..ORIYA LETTER VOCALIC L
+0B0F..0B10 ; Lo # [2] ORIYA LETTER E..ORIYA LETTER AI
+0B13..0B28 ; Lo # [22] ORIYA LETTER O..ORIYA LETTER NA
+0B2A..0B30 ; Lo # [7] ORIYA LETTER PA..ORIYA LETTER RA
+0B32..0B33 ; Lo # [2] ORIYA LETTER LA..ORIYA LETTER LLA
+0B35..0B39 ; Lo # [5] ORIYA LETTER VA..ORIYA LETTER HA
+0B3D ; Lo # ORIYA SIGN AVAGRAHA
+0B5C..0B5D ; Lo # [2] ORIYA LETTER RRA..ORIYA LETTER RHA
+0B5F..0B61 ; Lo # [3] ORIYA LETTER YYA..ORIYA LETTER VOCALIC LL
+0B71 ; Lo # ORIYA LETTER WA
+0B83 ; Lo # TAMIL SIGN VISARGA
+0B85..0B8A ; Lo # [6] TAMIL LETTER A..TAMIL LETTER UU
+0B8E..0B90 ; Lo # [3] TAMIL LETTER E..TAMIL LETTER AI
+0B92..0B95 ; Lo # [4] TAMIL LETTER O..TAMIL LETTER KA
+0B99..0B9A ; Lo # [2] TAMIL LETTER NGA..TAMIL LETTER CA
+0B9C ; Lo # TAMIL LETTER JA
+0B9E..0B9F ; Lo # [2] TAMIL LETTER NYA..TAMIL LETTER TTA
+0BA3..0BA4 ; Lo # [2] TAMIL LETTER NNA..TAMIL LETTER TA
+0BA8..0BAA ; Lo # [3] TAMIL LETTER NA..TAMIL LETTER PA
+0BAE..0BB9 ; Lo # [12] TAMIL LETTER MA..TAMIL LETTER HA
+0BD0 ; Lo # TAMIL OM
+0C05..0C0C ; Lo # [8] TELUGU LETTER A..TELUGU LETTER VOCALIC L
+0C0E..0C10 ; Lo # [3] TELUGU LETTER E..TELUGU LETTER AI
+0C12..0C28 ; Lo # [23] TELUGU LETTER O..TELUGU LETTER NA
+0C2A..0C39 ; Lo # [16] TELUGU LETTER PA..TELUGU LETTER HA
+0C3D ; Lo # TELUGU SIGN AVAGRAHA
+0C58..0C5A ; Lo # [3] TELUGU LETTER TSA..TELUGU LETTER RRRA
+0C60..0C61 ; Lo # [2] TELUGU LETTER VOCALIC RR..TELUGU LETTER VOCALIC LL
+0C80 ; Lo # KANNADA SIGN SPACING CANDRABINDU
+0C85..0C8C ; Lo # [8] KANNADA LETTER A..KANNADA LETTER VOCALIC L
+0C8E..0C90 ; Lo # [3] KANNADA LETTER E..KANNADA LETTER AI
+0C92..0CA8 ; Lo # [23] KANNADA LETTER O..KANNADA LETTER NA
+0CAA..0CB3 ; Lo # [10] KANNADA LETTER PA..KANNADA LETTER LLA
+0CB5..0CB9 ; Lo # [5] KANNADA LETTER VA..KANNADA LETTER HA
+0CBD ; Lo # KANNADA SIGN AVAGRAHA
+0CDE ; Lo # KANNADA LETTER FA
+0CE0..0CE1 ; Lo # [2] KANNADA LETTER VOCALIC RR..KANNADA LETTER VOCALIC LL
+0CF1..0CF2 ; Lo # [2] KANNADA SIGN JIHVAMULIYA..KANNADA SIGN UPADHMANIYA
+0D05..0D0C ; Lo # [8] MALAYALAM LETTER A..MALAYALAM LETTER VOCALIC L
+0D0E..0D10 ; Lo # [3] MALAYALAM LETTER E..MALAYALAM LETTER AI
+0D12..0D3A ; Lo # [41] MALAYALAM LETTER O..MALAYALAM LETTER TTTA
+0D3D ; Lo # MALAYALAM SIGN AVAGRAHA
+0D4E ; Lo # MALAYALAM LETTER DOT REPH
+0D54..0D56 ; Lo # [3] MALAYALAM LETTER CHILLU M..MALAYALAM LETTER CHILLU LLL
+0D5F..0D61 ; Lo # [3] MALAYALAM LETTER ARCHAIC II..MALAYALAM LETTER VOCALIC LL
+0D7A..0D7F ; Lo # [6] MALAYALAM LETTER CHILLU NN..MALAYALAM LETTER CHILLU K
+0D85..0D96 ; Lo # [18] SINHALA LETTER AYANNA..SINHALA LETTER AUYANNA
+0D9A..0DB1 ; Lo # [24] SINHALA LETTER ALPAPRAANA KAYANNA..SINHALA LETTER DANTAJA NAYANNA
+0DB3..0DBB ; Lo # [9] SINHALA LETTER SANYAKA DAYANNA..SINHALA LETTER RAYANNA
+0DBD ; Lo # SINHALA LETTER DANTAJA LAYANNA
+0DC0..0DC6 ; Lo # [7] SINHALA LETTER VAYANNA..SINHALA LETTER FAYANNA
+0E01..0E30 ; Lo # [48] THAI CHARACTER KO KAI..THAI CHARACTER SARA A
+0E32..0E33 ; Lo # [2] THAI CHARACTER SARA AA..THAI CHARACTER SARA AM
+0E40..0E45 ; Lo # [6] THAI CHARACTER SARA E..THAI CHARACTER LAKKHANGYAO
+0E81..0E82 ; Lo # [2] LAO LETTER KO..LAO LETTER KHO SUNG
+0E84 ; Lo # LAO LETTER KHO TAM
+0E86..0E8A ; Lo # [5] LAO LETTER PALI GHA..LAO LETTER SO TAM
+0E8C..0EA3 ; Lo # [24] LAO LETTER PALI JHA..LAO LETTER LO LING
+0EA5 ; Lo # LAO LETTER LO LOOT
+0EA7..0EB0 ; Lo # [10] LAO LETTER WO..LAO VOWEL SIGN A
+0EB2..0EB3 ; Lo # [2] LAO VOWEL SIGN AA..LAO VOWEL SIGN AM
+0EBD ; Lo # LAO SEMIVOWEL SIGN NYO
+0EC0..0EC4 ; Lo # [5] LAO VOWEL SIGN E..LAO VOWEL SIGN AI
+0EDC..0EDF ; Lo # [4] LAO HO NO..LAO LETTER KHMU NYO
+0F00 ; Lo # TIBETAN SYLLABLE OM
+0F40..0F47 ; Lo # [8] TIBETAN LETTER KA..TIBETAN LETTER JA
+0F49..0F6C ; Lo # [36] TIBETAN LETTER NYA..TIBETAN LETTER RRA
+0F88..0F8C ; Lo # [5] TIBETAN SIGN LCE TSA CAN..TIBETAN SIGN INVERTED MCHU CAN
+1000..102A ; Lo # [43] MYANMAR LETTER KA..MYANMAR LETTER AU
+103F ; Lo # MYANMAR LETTER GREAT SA
+1050..1055 ; Lo # [6] MYANMAR LETTER SHA..MYANMAR LETTER VOCALIC LL
+105A..105D ; Lo # [4] MYANMAR LETTER MON NGA..MYANMAR LETTER MON BBE
+1061 ; Lo # MYANMAR LETTER SGAW KAREN SHA
+1065..1066 ; Lo # [2] MYANMAR LETTER WESTERN PWO KAREN THA..MYANMAR LETTER WESTERN PWO KAREN PWA
+106E..1070 ; Lo # [3] MYANMAR LETTER EASTERN PWO KAREN NNA..MYANMAR LETTER EASTERN PWO KAREN GHWA
+1075..1081 ; Lo # [13] MYANMAR LETTER SHAN KA..MYANMAR LETTER SHAN HA
+108E ; Lo # MYANMAR LETTER RUMAI PALAUNG FA
+1100..1248 ; Lo # [329] HANGUL CHOSEONG KIYEOK..ETHIOPIC SYLLABLE QWA
+124A..124D ; Lo # [4] ETHIOPIC SYLLABLE QWI..ETHIOPIC SYLLABLE QWE
+1250..1256 ; Lo # [7] ETHIOPIC SYLLABLE QHA..ETHIOPIC SYLLABLE QHO
+1258 ; Lo # ETHIOPIC SYLLABLE QHWA
+125A..125D ; Lo # [4] ETHIOPIC SYLLABLE QHWI..ETHIOPIC SYLLABLE QHWE
+1260..1288 ; Lo # [41] ETHIOPIC SYLLABLE BA..ETHIOPIC SYLLABLE XWA
+128A..128D ; Lo # [4] ETHIOPIC SYLLABLE XWI..ETHIOPIC SYLLABLE XWE
+1290..12B0 ; Lo # [33] ETHIOPIC SYLLABLE NA..ETHIOPIC SYLLABLE KWA
+12B2..12B5 ; Lo # [4] ETHIOPIC SYLLABLE KWI..ETHIOPIC SYLLABLE KWE
+12B8..12BE ; Lo # [7] ETHIOPIC SYLLABLE KXA..ETHIOPIC SYLLABLE KXO
+12C0 ; Lo # ETHIOPIC SYLLABLE KXWA
+12C2..12C5 ; Lo # [4] ETHIOPIC SYLLABLE KXWI..ETHIOPIC SYLLABLE KXWE
+12C8..12D6 ; Lo # [15] ETHIOPIC SYLLABLE WA..ETHIOPIC SYLLABLE PHARYNGEAL O
+12D8..1310 ; Lo # [57] ETHIOPIC SYLLABLE ZA..ETHIOPIC SYLLABLE GWA
+1312..1315 ; Lo # [4] ETHIOPIC SYLLABLE GWI..ETHIOPIC SYLLABLE GWE
+1318..135A ; Lo # [67] ETHIOPIC SYLLABLE GGA..ETHIOPIC SYLLABLE FYA
+1380..138F ; Lo # [16] ETHIOPIC SYLLABLE SEBATBEIT MWA..ETHIOPIC SYLLABLE PWE
+1401..166C ; Lo # [620] CANADIAN SYLLABICS E..CANADIAN SYLLABICS CARRIER TTSA
+166F..167F ; Lo # [17] CANADIAN SYLLABICS QAI..CANADIAN SYLLABICS BLACKFOOT W
+1681..169A ; Lo # [26] OGHAM LETTER BEITH..OGHAM LETTER PEITH
+16A0..16EA ; Lo # [75] RUNIC LETTER FEHU FEOH FE F..RUNIC LETTER X
+16F1..16F8 ; Lo # [8] RUNIC LETTER K..RUNIC LETTER FRANKS CASKET AESC
+1700..170C ; Lo # [13] TAGALOG LETTER A..TAGALOG LETTER YA
+170E..1711 ; Lo # [4] TAGALOG LETTER LA..TAGALOG LETTER HA
+1720..1731 ; Lo # [18] HANUNOO LETTER A..HANUNOO LETTER HA
+1740..1751 ; Lo # [18] BUHID LETTER A..BUHID LETTER HA
+1760..176C ; Lo # [13] TAGBANWA LETTER A..TAGBANWA LETTER YA
+176E..1770 ; Lo # [3] TAGBANWA LETTER LA..TAGBANWA LETTER SA
+1780..17B3 ; Lo # [52] KHMER LETTER KA..KHMER INDEPENDENT VOWEL QAU
+17DC ; Lo # KHMER SIGN AVAKRAHASANYA
+1820..1842 ; Lo # [35] MONGOLIAN LETTER A..MONGOLIAN LETTER CHI
+1844..1878 ; Lo # [53] MONGOLIAN LETTER TODO E..MONGOLIAN LETTER CHA WITH TWO DOTS
+1880..1884 ; Lo # [5] MONGOLIAN LETTER ALI GALI ANUSVARA ONE..MONGOLIAN LETTER ALI GALI INVERTED UBADAMA
+1887..18A8 ; Lo # [34] MONGOLIAN LETTER ALI GALI A..MONGOLIAN LETTER MANCHU ALI GALI BHA
+18AA ; Lo # MONGOLIAN LETTER MANCHU ALI GALI LHA
+18B0..18F5 ; Lo # [70] CANADIAN SYLLABICS OY..CANADIAN SYLLABICS CARRIER DENTAL S
+1900..191E ; Lo # [31] LIMBU VOWEL-CARRIER LETTER..LIMBU LETTER TRA
+1950..196D ; Lo # [30] TAI LE LETTER KA..TAI LE LETTER AI
+1970..1974 ; Lo # [5] TAI LE LETTER TONE-2..TAI LE LETTER TONE-6
+1980..19AB ; Lo # [44] NEW TAI LUE LETTER HIGH QA..NEW TAI LUE LETTER LOW SUA
+19B0..19C9 ; Lo # [26] NEW TAI LUE VOWEL SIGN VOWEL SHORTENER..NEW TAI LUE TONE MARK-2
+1A00..1A16 ; Lo # [23] BUGINESE LETTER KA..BUGINESE LETTER HA
+1A20..1A54 ; Lo # [53] TAI THAM LETTER HIGH KA..TAI THAM LETTER GREAT SA
+1B05..1B33 ; Lo # [47] BALINESE LETTER AKARA..BALINESE LETTER HA
+1B45..1B4B ; Lo # [7] BALINESE LETTER KAF SASAK..BALINESE LETTER ASYURA SASAK
+1B83..1BA0 ; Lo # [30] SUNDANESE LETTER A..SUNDANESE LETTER HA
+1BAE..1BAF ; Lo # [2] SUNDANESE LETTER KHA..SUNDANESE LETTER SYA
+1BBA..1BE5 ; Lo # [44] SUNDANESE AVAGRAHA..BATAK LETTER U
+1C00..1C23 ; Lo # [36] LEPCHA LETTER KA..LEPCHA LETTER A
+1C4D..1C4F ; Lo # [3] LEPCHA LETTER TTA..LEPCHA LETTER DDA
+1C5A..1C77 ; Lo # [30] OL CHIKI LETTER LA..OL CHIKI LETTER OH
+1CE9..1CEC ; Lo # [4] VEDIC SIGN ANUSVARA ANTARGOMUKHA..VEDIC SIGN ANUSVARA VAMAGOMUKHA WITH TAIL
+1CEE..1CF3 ; Lo # [6] VEDIC SIGN HEXIFORM LONG ANUSVARA..VEDIC SIGN ROTATED ARDHAVISARGA
+1CF5..1CF6 ; Lo # [2] VEDIC SIGN JIHVAMULIYA..VEDIC SIGN UPADHMANIYA
+1CFA ; Lo # VEDIC SIGN DOUBLE ANUSVARA ANTARGOMUKHA
+2135..2138 ; Lo # [4] ALEF SYMBOL..DALET SYMBOL
+2D30..2D67 ; Lo # [56] TIFINAGH LETTER YA..TIFINAGH LETTER YO
+2D80..2D96 ; Lo # [23] ETHIOPIC SYLLABLE LOA..ETHIOPIC SYLLABLE GGWE
+2DA0..2DA6 ; Lo # [7] ETHIOPIC SYLLABLE SSA..ETHIOPIC SYLLABLE SSO
+2DA8..2DAE ; Lo # [7] ETHIOPIC SYLLABLE CCA..ETHIOPIC SYLLABLE CCO
+2DB0..2DB6 ; Lo # [7] ETHIOPIC SYLLABLE ZZA..ETHIOPIC SYLLABLE ZZO
+2DB8..2DBE ; Lo # [7] ETHIOPIC SYLLABLE CCHA..ETHIOPIC SYLLABLE CCHO
+2DC0..2DC6 ; Lo # [7] ETHIOPIC SYLLABLE QYA..ETHIOPIC SYLLABLE QYO
+2DC8..2DCE ; Lo # [7] ETHIOPIC SYLLABLE KYA..ETHIOPIC SYLLABLE KYO
+2DD0..2DD6 ; Lo # [7] ETHIOPIC SYLLABLE XYA..ETHIOPIC SYLLABLE XYO
+2DD8..2DDE ; Lo # [7] ETHIOPIC SYLLABLE GYA..ETHIOPIC SYLLABLE GYO
+3006 ; Lo # IDEOGRAPHIC CLOSING MARK
+303C ; Lo # MASU MARK
+3041..3096 ; Lo # [86] HIRAGANA LETTER SMALL A..HIRAGANA LETTER SMALL KE
+309F ; Lo # HIRAGANA DIGRAPH YORI
+30A1..30FA ; Lo # [90] KATAKANA LETTER SMALL A..KATAKANA LETTER VO
+30FF ; Lo # KATAKANA DIGRAPH KOTO
+3105..312F ; Lo # [43] BOPOMOFO LETTER B..BOPOMOFO LETTER NN
+3131..318E ; Lo # [94] HANGUL LETTER KIYEOK..HANGUL LETTER ARAEAE
+31A0..31BA ; Lo # [27] BOPOMOFO LETTER BU..BOPOMOFO LETTER ZY
+31F0..31FF ; Lo # [16] KATAKANA LETTER SMALL KU..KATAKANA LETTER SMALL RO
+3400..4DB5 ; Lo # [6582] CJK UNIFIED IDEOGRAPH-3400..CJK UNIFIED IDEOGRAPH-4DB5
+4E00..9FEF ; Lo # [20976] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FEF
+A000..A014 ; Lo # [21] YI SYLLABLE IT..YI SYLLABLE E
+A016..A48C ; Lo # [1143] YI SYLLABLE BIT..YI SYLLABLE YYR
+A4D0..A4F7 ; Lo # [40] LISU LETTER BA..LISU LETTER OE
+A500..A60B ; Lo # [268] VAI SYLLABLE EE..VAI SYLLABLE NG
+A610..A61F ; Lo # [16] VAI SYLLABLE NDOLE FA..VAI SYMBOL JONG
+A62A..A62B ; Lo # [2] VAI SYLLABLE NDOLE MA..VAI SYLLABLE NDOLE DO
+A66E ; Lo # CYRILLIC LETTER MULTIOCULAR O
+A6A0..A6E5 ; Lo # [70] BAMUM LETTER A..BAMUM LETTER KI
+A78F ; Lo # LATIN LETTER SINOLOGICAL DOT
+A7F7 ; Lo # LATIN EPIGRAPHIC LETTER SIDEWAYS I
+A7FB..A801 ; Lo # [7] LATIN EPIGRAPHIC LETTER REVERSED F..SYLOTI NAGRI LETTER I
+A803..A805 ; Lo # [3] SYLOTI NAGRI LETTER U..SYLOTI NAGRI LETTER O
+A807..A80A ; Lo # [4] SYLOTI NAGRI LETTER KO..SYLOTI NAGRI LETTER GHO
+A80C..A822 ; Lo # [23] SYLOTI NAGRI LETTER CO..SYLOTI NAGRI LETTER HO
+A840..A873 ; Lo # [52] PHAGS-PA LETTER KA..PHAGS-PA LETTER CANDRABINDU
+A882..A8B3 ; Lo # [50] SAURASHTRA LETTER A..SAURASHTRA LETTER LLA
+A8F2..A8F7 ; Lo # [6] DEVANAGARI SIGN SPACING CANDRABINDU..DEVANAGARI SIGN CANDRABINDU AVAGRAHA
+A8FB ; Lo # DEVANAGARI HEADSTROKE
+A8FD..A8FE ; Lo # [2] DEVANAGARI JAIN OM..DEVANAGARI LETTER AY
+A90A..A925 ; Lo # [28] KAYAH LI LETTER KA..KAYAH LI LETTER OO
+A930..A946 ; Lo # [23] REJANG LETTER KA..REJANG LETTER A
+A960..A97C ; Lo # [29] HANGUL CHOSEONG TIKEUT-MIEUM..HANGUL CHOSEONG SSANGYEORINHIEUH
+A984..A9B2 ; Lo # [47] JAVANESE LETTER A..JAVANESE LETTER HA
+A9E0..A9E4 ; Lo # [5] MYANMAR LETTER SHAN GHA..MYANMAR LETTER SHAN BHA
+A9E7..A9EF ; Lo # [9] MYANMAR LETTER TAI LAING NYA..MYANMAR LETTER TAI LAING NNA
+A9FA..A9FE ; Lo # [5] MYANMAR LETTER TAI LAING LLA..MYANMAR LETTER TAI LAING BHA
+AA00..AA28 ; Lo # [41] CHAM LETTER A..CHAM LETTER HA
+AA40..AA42 ; Lo # [3] CHAM LETTER FINAL K..CHAM LETTER FINAL NG
+AA44..AA4B ; Lo # [8] CHAM LETTER FINAL CH..CHAM LETTER FINAL SS
+AA60..AA6F ; Lo # [16] MYANMAR LETTER KHAMTI GA..MYANMAR LETTER KHAMTI FA
+AA71..AA76 ; Lo # [6] MYANMAR LETTER KHAMTI XA..MYANMAR LOGOGRAM KHAMTI HM
+AA7A ; Lo # MYANMAR LETTER AITON RA
+AA7E..AAAF ; Lo # [50] MYANMAR LETTER SHWE PALAUNG CHA..TAI VIET LETTER HIGH O
+AAB1 ; Lo # TAI VIET VOWEL AA
+AAB5..AAB6 ; Lo # [2] TAI VIET VOWEL E..TAI VIET VOWEL O
+AAB9..AABD ; Lo # [5] TAI VIET VOWEL UEA..TAI VIET VOWEL AN
+AAC0 ; Lo # TAI VIET TONE MAI NUENG
+AAC2 ; Lo # TAI VIET TONE MAI SONG
+AADB..AADC ; Lo # [2] TAI VIET SYMBOL KON..TAI VIET SYMBOL NUENG
+AAE0..AAEA ; Lo # [11] MEETEI MAYEK LETTER E..MEETEI MAYEK LETTER SSA
+AAF2 ; Lo # MEETEI MAYEK ANJI
+AB01..AB06 ; Lo # [6] ETHIOPIC SYLLABLE TTHU..ETHIOPIC SYLLABLE TTHO
+AB09..AB0E ; Lo # [6] ETHIOPIC SYLLABLE DDHU..ETHIOPIC SYLLABLE DDHO
+AB11..AB16 ; Lo # [6] ETHIOPIC SYLLABLE DZU..ETHIOPIC SYLLABLE DZO
+AB20..AB26 ; Lo # [7] ETHIOPIC SYLLABLE CCHHA..ETHIOPIC SYLLABLE CCHHO
+AB28..AB2E ; Lo # [7] ETHIOPIC SYLLABLE BBA..ETHIOPIC SYLLABLE BBO
+ABC0..ABE2 ; Lo # [35] MEETEI MAYEK LETTER KOK..MEETEI MAYEK LETTER I LONSUM
+AC00..D7A3 ; Lo # [11172] HANGUL SYLLABLE GA..HANGUL SYLLABLE HIH
+D7B0..D7C6 ; Lo # [23] HANGUL JUNGSEONG O-YEO..HANGUL JUNGSEONG ARAEA-E
+D7CB..D7FB ; Lo # [49] HANGUL JONGSEONG NIEUN-RIEUL..HANGUL JONGSEONG PHIEUPH-THIEUTH
+F900..FA6D ; Lo # [366] CJK COMPATIBILITY IDEOGRAPH-F900..CJK COMPATIBILITY IDEOGRAPH-FA6D
+FA70..FAD9 ; Lo # [106] CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COMPATIBILITY IDEOGRAPH-FAD9
+FB1D ; Lo # HEBREW LETTER YOD WITH HIRIQ
+FB1F..FB28 ; Lo # [10] HEBREW LIGATURE YIDDISH YOD YOD PATAH..HEBREW LETTER WIDE TAV
+FB2A..FB36 ; Lo # [13] HEBREW LETTER SHIN WITH SHIN DOT..HEBREW LETTER ZAYIN WITH DAGESH
+FB38..FB3C ; Lo # [5] HEBREW LETTER TET WITH DAGESH..HEBREW LETTER LAMED WITH DAGESH
+FB3E ; Lo # HEBREW LETTER MEM WITH DAGESH
+FB40..FB41 ; Lo # [2] HEBREW LETTER NUN WITH DAGESH..HEBREW LETTER SAMEKH WITH DAGESH
+FB43..FB44 ; Lo # [2] HEBREW LETTER FINAL PE WITH DAGESH..HEBREW LETTER PE WITH DAGESH
+FB46..FBB1 ; Lo # [108] HEBREW LETTER TSADI WITH DAGESH..ARABIC LETTER YEH BARREE WITH HAMZA ABOVE FINAL FORM
+FBD3..FD3D ; Lo # [363] ARABIC LETTER NG ISOLATED FORM..ARABIC LIGATURE ALEF WITH FATHATAN ISOLATED FORM
+FD50..FD8F ; Lo # [64] ARABIC LIGATURE TEH WITH JEEM WITH MEEM INITIAL FORM..ARABIC LIGATURE MEEM WITH KHAH WITH MEEM INITIAL FORM
+FD92..FDC7 ; Lo # [54] ARABIC LIGATURE MEEM WITH JEEM WITH KHAH INITIAL FORM..ARABIC LIGATURE NOON WITH JEEM WITH YEH FINAL FORM
+FDF0..FDFB ; Lo # [12] ARABIC LIGATURE SALLA USED AS KORANIC STOP SIGN ISOLATED FORM..ARABIC LIGATURE JALLAJALALOUHOU
+FE70..FE74 ; Lo # [5] ARABIC FATHATAN ISOLATED FORM..ARABIC KASRATAN ISOLATED FORM
+FE76..FEFC ; Lo # [135] ARABIC FATHA ISOLATED FORM..ARABIC LIGATURE LAM WITH ALEF FINAL FORM
+FF66..FF6F ; Lo # [10] HALFWIDTH KATAKANA LETTER WO..HALFWIDTH KATAKANA LETTER SMALL TU
+FF71..FF9D ; Lo # [45] HALFWIDTH KATAKANA LETTER A..HALFWIDTH KATAKANA LETTER N
+FFA0..FFBE ; Lo # [31] HALFWIDTH HANGUL FILLER..HALFWIDTH HANGUL LETTER HIEUH
+FFC2..FFC7 ; Lo # [6] HALFWIDTH HANGUL LETTER A..HALFWIDTH HANGUL LETTER E
+FFCA..FFCF ; Lo # [6] HALFWIDTH HANGUL LETTER YEO..HALFWIDTH HANGUL LETTER OE
+FFD2..FFD7 ; Lo # [6] HALFWIDTH HANGUL LETTER YO..HALFWIDTH HANGUL LETTER YU
+FFDA..FFDC ; Lo # [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER I
+10000..1000B ; Lo # [12] LINEAR B SYLLABLE B008 A..LINEAR B SYLLABLE B046 JE
+1000D..10026 ; Lo # [26] LINEAR B SYLLABLE B036 JO..LINEAR B SYLLABLE B032 QO
+10028..1003A ; Lo # [19] LINEAR B SYLLABLE B060 RA..LINEAR B SYLLABLE B042 WO
+1003C..1003D ; Lo # [2] LINEAR B SYLLABLE B017 ZA..LINEAR B SYLLABLE B074 ZE
+1003F..1004D ; Lo # [15] LINEAR B SYLLABLE B020 ZO..LINEAR B SYLLABLE B091 TWO
+10050..1005D ; Lo # [14] LINEAR B SYMBOL B018..LINEAR B SYMBOL B089
+10080..100FA ; Lo # [123] LINEAR B IDEOGRAM B100 MAN..LINEAR B IDEOGRAM VESSEL B305
+10280..1029C ; Lo # [29] LYCIAN LETTER A..LYCIAN LETTER X
+102A0..102D0 ; Lo # [49] CARIAN LETTER A..CARIAN LETTER UUU3
+10300..1031F ; Lo # [32] OLD ITALIC LETTER A..OLD ITALIC LETTER ESS
+1032D..10340 ; Lo # [20] OLD ITALIC LETTER YE..GOTHIC LETTER PAIRTHRA
+10342..10349 ; Lo # [8] GOTHIC LETTER RAIDA..GOTHIC LETTER OTHAL
+10350..10375 ; Lo # [38] OLD PERMIC LETTER AN..OLD PERMIC LETTER IA
+10380..1039D ; Lo # [30] UGARITIC LETTER ALPA..UGARITIC LETTER SSU
+103A0..103C3 ; Lo # [36] OLD PERSIAN SIGN A..OLD PERSIAN SIGN HA
+103C8..103CF ; Lo # [8] OLD PERSIAN SIGN AURAMAZDAA..OLD PERSIAN SIGN BUUMISH
+10450..1049D ; Lo # [78] SHAVIAN LETTER PEEP..OSMANYA LETTER OO
+10500..10527 ; Lo # [40] ELBASAN LETTER A..ELBASAN LETTER KHE
+10530..10563 ; Lo # [52] CAUCASIAN ALBANIAN LETTER ALT..CAUCASIAN ALBANIAN LETTER KIW
+10600..10736 ; Lo # [311] LINEAR A SIGN AB001..LINEAR A SIGN A664
+10740..10755 ; Lo # [22] LINEAR A SIGN A701 A..LINEAR A SIGN A732 JE
+10760..10767 ; Lo # [8] LINEAR A SIGN A800..LINEAR A SIGN A807
+10800..10805 ; Lo # [6] CYPRIOT SYLLABLE A..CYPRIOT SYLLABLE JA
+10808 ; Lo # CYPRIOT SYLLABLE JO
+1080A..10835 ; Lo # [44] CYPRIOT SYLLABLE KA..CYPRIOT SYLLABLE WO
+10837..10838 ; Lo # [2] CYPRIOT SYLLABLE XA..CYPRIOT SYLLABLE XE
+1083C ; Lo # CYPRIOT SYLLABLE ZA
+1083F..10855 ; Lo # [23] CYPRIOT SYLLABLE ZO..IMPERIAL ARAMAIC LETTER TAW
+10860..10876 ; Lo # [23] PALMYRENE LETTER ALEPH..PALMYRENE LETTER TAW
+10880..1089E ; Lo # [31] NABATAEAN LETTER FINAL ALEPH..NABATAEAN LETTER TAW
+108E0..108F2 ; Lo # [19] HATRAN LETTER ALEPH..HATRAN LETTER QOPH
+108F4..108F5 ; Lo # [2] HATRAN LETTER SHIN..HATRAN LETTER TAW
+10900..10915 ; Lo # [22] PHOENICIAN LETTER ALF..PHOENICIAN LETTER TAU
+10920..10939 ; Lo # [26] LYDIAN LETTER A..LYDIAN LETTER C
+10980..109B7 ; Lo # [56] MEROITIC HIEROGLYPHIC LETTER A..MEROITIC CURSIVE LETTER DA
+109BE..109BF ; Lo # [2] MEROITIC CURSIVE LOGOGRAM RMT..MEROITIC CURSIVE LOGOGRAM IMN
+10A00 ; Lo # KHAROSHTHI LETTER A
+10A10..10A13 ; Lo # [4] KHAROSHTHI LETTER KA..KHAROSHTHI LETTER GHA
+10A15..10A17 ; Lo # [3] KHAROSHTHI LETTER CA..KHAROSHTHI LETTER JA
+10A19..10A35 ; Lo # [29] KHAROSHTHI LETTER NYA..KHAROSHTHI LETTER VHA
+10A60..10A7C ; Lo # [29] OLD SOUTH ARABIAN LETTER HE..OLD SOUTH ARABIAN LETTER THETH
+10A80..10A9C ; Lo # [29] OLD NORTH ARABIAN LETTER HEH..OLD NORTH ARABIAN LETTER ZAH
+10AC0..10AC7 ; Lo # [8] MANICHAEAN LETTER ALEPH..MANICHAEAN LETTER WAW
+10AC9..10AE4 ; Lo # [28] MANICHAEAN LETTER ZAYIN..MANICHAEAN LETTER TAW
+10B00..10B35 ; Lo # [54] AVESTAN LETTER A..AVESTAN LETTER HE
+10B40..10B55 ; Lo # [22] INSCRIPTIONAL PARTHIAN LETTER ALEPH..INSCRIPTIONAL PARTHIAN LETTER TAW
+10B60..10B72 ; Lo # [19] INSCRIPTIONAL PAHLAVI LETTER ALEPH..INSCRIPTIONAL PAHLAVI LETTER TAW
+10B80..10B91 ; Lo # [18] PSALTER PAHLAVI LETTER ALEPH..PSALTER PAHLAVI LETTER TAW
+10C00..10C48 ; Lo # [73] OLD TURKIC LETTER ORKHON A..OLD TURKIC LETTER ORKHON BASH
+10D00..10D23 ; Lo # [36] HANIFI ROHINGYA LETTER A..HANIFI ROHINGYA MARK NA KHONNA
+10F00..10F1C ; Lo # [29] OLD SOGDIAN LETTER ALEPH..OLD SOGDIAN LETTER FINAL TAW WITH VERTICAL TAIL
+10F27 ; Lo # OLD SOGDIAN LIGATURE AYIN-DALETH
+10F30..10F45 ; Lo # [22] SOGDIAN LETTER ALEPH..SOGDIAN INDEPENDENT SHIN
+10FE0..10FF6 ; Lo # [23] ELYMAIC LETTER ALEPH..ELYMAIC LIGATURE ZAYIN-YODH
+11003..11037 ; Lo # [53] BRAHMI SIGN JIHVAMULIYA..BRAHMI LETTER OLD TAMIL NNNA
+11083..110AF ; Lo # [45] KAITHI LETTER A..KAITHI LETTER HA
+110D0..110E8 ; Lo # [25] SORA SOMPENG LETTER SAH..SORA SOMPENG LETTER MAE
+11103..11126 ; Lo # [36] CHAKMA LETTER AA..CHAKMA LETTER HAA
+11144 ; Lo # CHAKMA LETTER LHAA
+11150..11172 ; Lo # [35] MAHAJANI LETTER A..MAHAJANI LETTER RRA
+11176 ; Lo # MAHAJANI LIGATURE SHRI
+11183..111B2 ; Lo # [48] SHARADA LETTER A..SHARADA LETTER HA
+111C1..111C4 ; Lo # [4] SHARADA SIGN AVAGRAHA..SHARADA OM
+111DA ; Lo # SHARADA EKAM
+111DC ; Lo # SHARADA HEADSTROKE
+11200..11211 ; Lo # [18] KHOJKI LETTER A..KHOJKI LETTER JJA
+11213..1122B ; Lo # [25] KHOJKI LETTER NYA..KHOJKI LETTER LLA
+11280..11286 ; Lo # [7] MULTANI LETTER A..MULTANI LETTER GA
+11288 ; Lo # MULTANI LETTER GHA
+1128A..1128D ; Lo # [4] MULTANI LETTER CA..MULTANI LETTER JJA
+1128F..1129D ; Lo # [15] MULTANI LETTER NYA..MULTANI LETTER BA
+1129F..112A8 ; Lo # [10] MULTANI LETTER BHA..MULTANI LETTER RHA
+112B0..112DE ; Lo # [47] KHUDAWADI LETTER A..KHUDAWADI LETTER HA
+11305..1130C ; Lo # [8] GRANTHA LETTER A..GRANTHA LETTER VOCALIC L
+1130F..11310 ; Lo # [2] GRANTHA LETTER EE..GRANTHA LETTER AI
+11313..11328 ; Lo # [22] GRANTHA LETTER OO..GRANTHA LETTER NA
+1132A..11330 ; Lo # [7] GRANTHA LETTER PA..GRANTHA LETTER RA
+11332..11333 ; Lo # [2] GRANTHA LETTER LA..GRANTHA LETTER LLA
+11335..11339 ; Lo # [5] GRANTHA LETTER VA..GRANTHA LETTER HA
+1133D ; Lo # GRANTHA SIGN AVAGRAHA
+11350 ; Lo # GRANTHA OM
+1135D..11361 ; Lo # [5] GRANTHA SIGN PLUTA..GRANTHA LETTER VOCALIC LL
+11400..11434 ; Lo # [53] NEWA LETTER A..NEWA LETTER HA
+11447..1144A ; Lo # [4] NEWA SIGN AVAGRAHA..NEWA SIDDHI
+1145F ; Lo # NEWA LETTER VEDIC ANUSVARA
+11480..114AF ; Lo # [48] TIRHUTA ANJI..TIRHUTA LETTER HA
+114C4..114C5 ; Lo # [2] TIRHUTA SIGN AVAGRAHA..TIRHUTA GVANG
+114C7 ; Lo # TIRHUTA OM
+11580..115AE ; Lo # [47] SIDDHAM LETTER A..SIDDHAM LETTER HA
+115D8..115DB ; Lo # [4] SIDDHAM LETTER THREE-CIRCLE ALTERNATE I..SIDDHAM LETTER ALTERNATE U
+11600..1162F ; Lo # [48] MODI LETTER A..MODI LETTER LLA
+11644 ; Lo # MODI SIGN HUVA
+11680..116AA ; Lo # [43] TAKRI LETTER A..TAKRI LETTER RRA
+116B8 ; Lo # TAKRI LETTER ARCHAIC KHA
+11700..1171A ; Lo # [27] AHOM LETTER KA..AHOM LETTER ALTERNATE BA
+11800..1182B ; Lo # [44] DOGRA LETTER A..DOGRA LETTER RRA
+118FF ; Lo # WARANG CITI OM
+119A0..119A7 ; Lo # [8] NANDINAGARI LETTER A..NANDINAGARI LETTER VOCALIC RR
+119AA..119D0 ; Lo # [39] NANDINAGARI LETTER E..NANDINAGARI LETTER RRA
+119E1 ; Lo # NANDINAGARI SIGN AVAGRAHA
+119E3 ; Lo # NANDINAGARI HEADSTROKE
+11A00 ; Lo # ZANABAZAR SQUARE LETTER A
+11A0B..11A32 ; Lo # [40] ZANABAZAR SQUARE LETTER KA..ZANABAZAR SQUARE LETTER KSSA
+11A3A ; Lo # ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA
+11A50 ; Lo # SOYOMBO LETTER A
+11A5C..11A89 ; Lo # [46] SOYOMBO LETTER KA..SOYOMBO CLUSTER-INITIAL LETTER SA
+11A9D ; Lo # SOYOMBO MARK PLUTA
+11AC0..11AF8 ; Lo # [57] PAU CIN HAU LETTER PA..PAU CIN HAU GLOTTAL STOP FINAL
+11C00..11C08 ; Lo # [9] BHAIKSUKI LETTER A..BHAIKSUKI LETTER VOCALIC L
+11C0A..11C2E ; Lo # [37] BHAIKSUKI LETTER E..BHAIKSUKI LETTER HA
+11C40 ; Lo # BHAIKSUKI SIGN AVAGRAHA
+11C72..11C8F ; Lo # [30] MARCHEN LETTER KA..MARCHEN LETTER A
+11D00..11D06 ; Lo # [7] MASARAM GONDI LETTER A..MASARAM GONDI LETTER E
+11D08..11D09 ; Lo # [2] MASARAM GONDI LETTER AI..MASARAM GONDI LETTER O
+11D0B..11D30 ; Lo # [38] MASARAM GONDI LETTER AU..MASARAM GONDI LETTER TRA
+11D46 ; Lo # MASARAM GONDI REPHA
+11D60..11D65 ; Lo # [6] GUNJALA GONDI LETTER A..GUNJALA GONDI LETTER UU
+11D67..11D68 ; Lo # [2] GUNJALA GONDI LETTER EE..GUNJALA GONDI LETTER AI
+11D6A..11D89 ; Lo # [32] GUNJALA GONDI LETTER OO..GUNJALA GONDI LETTER SA
+11D98 ; Lo # GUNJALA GONDI OM
+11EE0..11EF2 ; Lo # [19] MAKASAR LETTER KA..MAKASAR ANGKA
+12000..12399 ; Lo # [922] CUNEIFORM SIGN A..CUNEIFORM SIGN U U
+12480..12543 ; Lo # [196] CUNEIFORM SIGN AB TIMES NUN TENU..CUNEIFORM SIGN ZU5 TIMES THREE DISH TENU
+13000..1342E ; Lo # [1071] EGYPTIAN HIEROGLYPH A001..EGYPTIAN HIEROGLYPH AA032
+14400..14646 ; Lo # [583] ANATOLIAN HIEROGLYPH A001..ANATOLIAN HIEROGLYPH A530
+16800..16A38 ; Lo # [569] BAMUM LETTER PHASE-A NGKUE MFON..BAMUM LETTER PHASE-F VUEQ
+16A40..16A5E ; Lo # [31] MRO LETTER TA..MRO LETTER TEK
+16AD0..16AED ; Lo # [30] BASSA VAH LETTER ENNI..BASSA VAH LETTER I
+16B00..16B2F ; Lo # [48] PAHAWH HMONG VOWEL KEEB..PAHAWH HMONG CONSONANT CAU
+16B63..16B77 ; Lo # [21] PAHAWH HMONG SIGN VOS LUB..PAHAWH HMONG SIGN CIM NRES TOS
+16B7D..16B8F ; Lo # [19] PAHAWH HMONG CLAN SIGN TSHEEJ..PAHAWH HMONG CLAN SIGN VWJ
+16F00..16F4A ; Lo # [75] MIAO LETTER PA..MIAO LETTER RTE
+16F50 ; Lo # MIAO LETTER NASALIZATION
+17000..187F7 ; Lo # [6136] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187F7
+18800..18AF2 ; Lo # [755] TANGUT COMPONENT-001..TANGUT COMPONENT-755
+1B000..1B11E ; Lo # [287] KATAKANA LETTER ARCHAIC E..HENTAIGANA LETTER N-MU-MO-2
+1B150..1B152 ; Lo # [3] HIRAGANA LETTER SMALL WI..HIRAGANA LETTER SMALL WO
+1B164..1B167 ; Lo # [4] KATAKANA LETTER SMALL WI..KATAKANA LETTER SMALL N
+1B170..1B2FB ; Lo # [396] NUSHU CHARACTER-1B170..NUSHU CHARACTER-1B2FB
+1BC00..1BC6A ; Lo # [107] DUPLOYAN LETTER H..DUPLOYAN LETTER VOCALIC M
+1BC70..1BC7C ; Lo # [13] DUPLOYAN AFFIX LEFT HORIZONTAL SECANT..DUPLOYAN AFFIX ATTACHED TANGENT HOOK
+1BC80..1BC88 ; Lo # [9] DUPLOYAN AFFIX HIGH ACUTE..DUPLOYAN AFFIX HIGH VERTICAL
+1BC90..1BC99 ; Lo # [10] DUPLOYAN AFFIX LOW ACUTE..DUPLOYAN AFFIX LOW ARROW
+1E100..1E12C ; Lo # [45] NYIAKENG PUACHUE HMONG LETTER MA..NYIAKENG PUACHUE HMONG LETTER W
+1E14E ; Lo # NYIAKENG PUACHUE HMONG LOGOGRAM NYAJ
+1E2C0..1E2EB ; Lo # [44] WANCHO LETTER AA..WANCHO LETTER YIH
+1E800..1E8C4 ; Lo # [197] MENDE KIKAKUI SYLLABLE M001 KI..MENDE KIKAKUI SYLLABLE M060 NYON
+1EE00..1EE03 ; Lo # [4] ARABIC MATHEMATICAL ALEF..ARABIC MATHEMATICAL DAL
+1EE05..1EE1F ; Lo # [27] ARABIC MATHEMATICAL WAW..ARABIC MATHEMATICAL DOTLESS QAF
+1EE21..1EE22 ; Lo # [2] ARABIC MATHEMATICAL INITIAL BEH..ARABIC MATHEMATICAL INITIAL JEEM
+1EE24 ; Lo # ARABIC MATHEMATICAL INITIAL HEH
+1EE27 ; Lo # ARABIC MATHEMATICAL INITIAL HAH
+1EE29..1EE32 ; Lo # [10] ARABIC MATHEMATICAL INITIAL YEH..ARABIC MATHEMATICAL INITIAL QAF
+1EE34..1EE37 ; Lo # [4] ARABIC MATHEMATICAL INITIAL SHEEN..ARABIC MATHEMATICAL INITIAL KHAH
+1EE39 ; Lo # ARABIC MATHEMATICAL INITIAL DAD
+1EE3B ; Lo # ARABIC MATHEMATICAL INITIAL GHAIN
+1EE42 ; Lo # ARABIC MATHEMATICAL TAILED JEEM
+1EE47 ; Lo # ARABIC MATHEMATICAL TAILED HAH
+1EE49 ; Lo # ARABIC MATHEMATICAL TAILED YEH
+1EE4B ; Lo # ARABIC MATHEMATICAL TAILED LAM
+1EE4D..1EE4F ; Lo # [3] ARABIC MATHEMATICAL TAILED NOON..ARABIC MATHEMATICAL TAILED AIN
+1EE51..1EE52 ; Lo # [2] ARABIC MATHEMATICAL TAILED SAD..ARABIC MATHEMATICAL TAILED QAF
+1EE54 ; Lo # ARABIC MATHEMATICAL TAILED SHEEN
+1EE57 ; Lo # ARABIC MATHEMATICAL TAILED KHAH
+1EE59 ; Lo # ARABIC MATHEMATICAL TAILED DAD
+1EE5B ; Lo # ARABIC MATHEMATICAL TAILED GHAIN
+1EE5D ; Lo # ARABIC MATHEMATICAL TAILED DOTLESS NOON
+1EE5F ; Lo # ARABIC MATHEMATICAL TAILED DOTLESS QAF
+1EE61..1EE62 ; Lo # [2] ARABIC MATHEMATICAL STRETCHED BEH..ARABIC MATHEMATICAL STRETCHED JEEM
+1EE64 ; Lo # ARABIC MATHEMATICAL STRETCHED HEH
+1EE67..1EE6A ; Lo # [4] ARABIC MATHEMATICAL STRETCHED HAH..ARABIC MATHEMATICAL STRETCHED KAF
+1EE6C..1EE72 ; Lo # [7] ARABIC MATHEMATICAL STRETCHED MEEM..ARABIC MATHEMATICAL STRETCHED QAF
+1EE74..1EE77 ; Lo # [4] ARABIC MATHEMATICAL STRETCHED SHEEN..ARABIC MATHEMATICAL STRETCHED KHAH
+1EE79..1EE7C ; Lo # [4] ARABIC MATHEMATICAL STRETCHED DAD..ARABIC MATHEMATICAL STRETCHED DOTLESS BEH
+1EE7E ; Lo # ARABIC MATHEMATICAL STRETCHED DOTLESS FEH
+1EE80..1EE89 ; Lo # [10] ARABIC MATHEMATICAL LOOPED ALEF..ARABIC MATHEMATICAL LOOPED YEH
+1EE8B..1EE9B ; Lo # [17] ARABIC MATHEMATICAL LOOPED LAM..ARABIC MATHEMATICAL LOOPED GHAIN
+1EEA1..1EEA3 ; Lo # [3] ARABIC MATHEMATICAL DOUBLE-STRUCK BEH..ARABIC MATHEMATICAL DOUBLE-STRUCK DAL
+1EEA5..1EEA9 ; Lo # [5] ARABIC MATHEMATICAL DOUBLE-STRUCK WAW..ARABIC MATHEMATICAL DOUBLE-STRUCK YEH
+1EEAB..1EEBB ; Lo # [17] ARABIC MATHEMATICAL DOUBLE-STRUCK LAM..ARABIC MATHEMATICAL DOUBLE-STRUCK GHAIN
+20000..2A6D6 ; Lo # [42711] CJK UNIFIED IDEOGRAPH-20000..CJK UNIFIED IDEOGRAPH-2A6D6
+2A700..2B734 ; Lo # [4149] CJK UNIFIED IDEOGRAPH-2A700..CJK UNIFIED IDEOGRAPH-2B734
+2B740..2B81D ; Lo # [222] CJK UNIFIED IDEOGRAPH-2B740..CJK UNIFIED IDEOGRAPH-2B81D
+2B820..2CEA1 ; Lo # [5762] CJK UNIFIED IDEOGRAPH-2B820..CJK UNIFIED IDEOGRAPH-2CEA1
+2CEB0..2EBE0 ; Lo # [7473] CJK UNIFIED IDEOGRAPH-2CEB0..CJK UNIFIED IDEOGRAPH-2EBE0
+2F800..2FA1D ; Lo # [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D
+
+# Total code points: 121414
+
+# ================================================
+
+# General_Category=Nonspacing_Mark
+
+0300..036F ; Mn # [112] COMBINING GRAVE ACCENT..COMBINING LATIN SMALL LETTER X
+0483..0487 ; Mn # [5] COMBINING CYRILLIC TITLO..COMBINING CYRILLIC POKRYTIE
+0591..05BD ; Mn # [45] HEBREW ACCENT ETNAHTA..HEBREW POINT METEG
+05BF ; Mn # HEBREW POINT RAFE
+05C1..05C2 ; Mn # [2] HEBREW POINT SHIN DOT..HEBREW POINT SIN DOT
+05C4..05C5 ; Mn # [2] HEBREW MARK UPPER DOT..HEBREW MARK LOWER DOT
+05C7 ; Mn # HEBREW POINT QAMATS QATAN
+0610..061A ; Mn # [11] ARABIC SIGN SALLALLAHOU ALAYHE WASSALLAM..ARABIC SMALL KASRA
+064B..065F ; Mn # [21] ARABIC FATHATAN..ARABIC WAVY HAMZA BELOW
+0670 ; Mn # ARABIC LETTER SUPERSCRIPT ALEF
+06D6..06DC ; Mn # [7] ARABIC SMALL HIGH LIGATURE SAD WITH LAM WITH ALEF MAKSURA..ARABIC SMALL HIGH SEEN
+06DF..06E4 ; Mn # [6] ARABIC SMALL HIGH ROUNDED ZERO..ARABIC SMALL HIGH MADDA
+06E7..06E8 ; Mn # [2] ARABIC SMALL HIGH YEH..ARABIC SMALL HIGH NOON
+06EA..06ED ; Mn # [4] ARABIC EMPTY CENTRE LOW STOP..ARABIC SMALL LOW MEEM
+0711 ; Mn # SYRIAC LETTER SUPERSCRIPT ALAPH
+0730..074A ; Mn # [27] SYRIAC PTHAHA ABOVE..SYRIAC BARREKH
+07A6..07B0 ; Mn # [11] THAANA ABAFILI..THAANA SUKUN
+07EB..07F3 ; Mn # [9] NKO COMBINING SHORT HIGH TONE..NKO COMBINING DOUBLE DOT ABOVE
+07FD ; Mn # NKO DANTAYALAN
+0816..0819 ; Mn # [4] SAMARITAN MARK IN..SAMARITAN MARK DAGESH
+081B..0823 ; Mn # [9] SAMARITAN MARK EPENTHETIC YUT..SAMARITAN VOWEL SIGN A
+0825..0827 ; Mn # [3] SAMARITAN VOWEL SIGN SHORT A..SAMARITAN VOWEL SIGN U
+0829..082D ; Mn # [5] SAMARITAN VOWEL SIGN LONG I..SAMARITAN MARK NEQUDAA
+0859..085B ; Mn # [3] MANDAIC AFFRICATION MARK..MANDAIC GEMINATION MARK
+08D3..08E1 ; Mn # [15] ARABIC SMALL LOW WAW..ARABIC SMALL HIGH SIGN SAFHA
+08E3..0902 ; Mn # [32] ARABIC TURNED DAMMA BELOW..DEVANAGARI SIGN ANUSVARA
+093A ; Mn # DEVANAGARI VOWEL SIGN OE
+093C ; Mn # DEVANAGARI SIGN NUKTA
+0941..0948 ; Mn # [8] DEVANAGARI VOWEL SIGN U..DEVANAGARI VOWEL SIGN AI
+094D ; Mn # DEVANAGARI SIGN VIRAMA
+0951..0957 ; Mn # [7] DEVANAGARI STRESS SIGN UDATTA..DEVANAGARI VOWEL SIGN UUE
+0962..0963 ; Mn # [2] DEVANAGARI VOWEL SIGN VOCALIC L..DEVANAGARI VOWEL SIGN VOCALIC LL
+0981 ; Mn # BENGALI SIGN CANDRABINDU
+09BC ; Mn # BENGALI SIGN NUKTA
+09C1..09C4 ; Mn # [4] BENGALI VOWEL SIGN U..BENGALI VOWEL SIGN VOCALIC RR
+09CD ; Mn # BENGALI SIGN VIRAMA
+09E2..09E3 ; Mn # [2] BENGALI VOWEL SIGN VOCALIC L..BENGALI VOWEL SIGN VOCALIC LL
+09FE ; Mn # BENGALI SANDHI MARK
+0A01..0A02 ; Mn # [2] GURMUKHI SIGN ADAK BINDI..GURMUKHI SIGN BINDI
+0A3C ; Mn # GURMUKHI SIGN NUKTA
+0A41..0A42 ; Mn # [2] GURMUKHI VOWEL SIGN U..GURMUKHI VOWEL SIGN UU
+0A47..0A48 ; Mn # [2] GURMUKHI VOWEL SIGN EE..GURMUKHI VOWEL SIGN AI
+0A4B..0A4D ; Mn # [3] GURMUKHI VOWEL SIGN OO..GURMUKHI SIGN VIRAMA
+0A51 ; Mn # GURMUKHI SIGN UDAAT
+0A70..0A71 ; Mn # [2] GURMUKHI TIPPI..GURMUKHI ADDAK
+0A75 ; Mn # GURMUKHI SIGN YAKASH
+0A81..0A82 ; Mn # [2] GUJARATI SIGN CANDRABINDU..GUJARATI SIGN ANUSVARA
+0ABC ; Mn # GUJARATI SIGN NUKTA
+0AC1..0AC5 ; Mn # [5] GUJARATI VOWEL SIGN U..GUJARATI VOWEL SIGN CANDRA E
+0AC7..0AC8 ; Mn # [2] GUJARATI VOWEL SIGN E..GUJARATI VOWEL SIGN AI
+0ACD ; Mn # GUJARATI SIGN VIRAMA
+0AE2..0AE3 ; Mn # [2] GUJARATI VOWEL SIGN VOCALIC L..GUJARATI VOWEL SIGN VOCALIC LL
+0AFA..0AFF ; Mn # [6] GUJARATI SIGN SUKUN..GUJARATI SIGN TWO-CIRCLE NUKTA ABOVE
+0B01 ; Mn # ORIYA SIGN CANDRABINDU
+0B3C ; Mn # ORIYA SIGN NUKTA
+0B3F ; Mn # ORIYA VOWEL SIGN I
+0B41..0B44 ; Mn # [4] ORIYA VOWEL SIGN U..ORIYA VOWEL SIGN VOCALIC RR
+0B4D ; Mn # ORIYA SIGN VIRAMA
+0B56 ; Mn # ORIYA AI LENGTH MARK
+0B62..0B63 ; Mn # [2] ORIYA VOWEL SIGN VOCALIC L..ORIYA VOWEL SIGN VOCALIC LL
+0B82 ; Mn # TAMIL SIGN ANUSVARA
+0BC0 ; Mn # TAMIL VOWEL SIGN II
+0BCD ; Mn # TAMIL SIGN VIRAMA
+0C00 ; Mn # TELUGU SIGN COMBINING CANDRABINDU ABOVE
+0C04 ; Mn # TELUGU SIGN COMBINING ANUSVARA ABOVE
+0C3E..0C40 ; Mn # [3] TELUGU VOWEL SIGN AA..TELUGU VOWEL SIGN II
+0C46..0C48 ; Mn # [3] TELUGU VOWEL SIGN E..TELUGU VOWEL SIGN AI
+0C4A..0C4D ; Mn # [4] TELUGU VOWEL SIGN O..TELUGU SIGN VIRAMA
+0C55..0C56 ; Mn # [2] TELUGU LENGTH MARK..TELUGU AI LENGTH MARK
+0C62..0C63 ; Mn # [2] TELUGU VOWEL SIGN VOCALIC L..TELUGU VOWEL SIGN VOCALIC LL
+0C81 ; Mn # KANNADA SIGN CANDRABINDU
+0CBC ; Mn # KANNADA SIGN NUKTA
+0CBF ; Mn # KANNADA VOWEL SIGN I
+0CC6 ; Mn # KANNADA VOWEL SIGN E
+0CCC..0CCD ; Mn # [2] KANNADA VOWEL SIGN AU..KANNADA SIGN VIRAMA
+0CE2..0CE3 ; Mn # [2] KANNADA VOWEL SIGN VOCALIC L..KANNADA VOWEL SIGN VOCALIC LL
+0D00..0D01 ; Mn # [2] MALAYALAM SIGN COMBINING ANUSVARA ABOVE..MALAYALAM SIGN CANDRABINDU
+0D3B..0D3C ; Mn # [2] MALAYALAM SIGN VERTICAL BAR VIRAMA..MALAYALAM SIGN CIRCULAR VIRAMA
+0D41..0D44 ; Mn # [4] MALAYALAM VOWEL SIGN U..MALAYALAM VOWEL SIGN VOCALIC RR
+0D4D ; Mn # MALAYALAM SIGN VIRAMA
+0D62..0D63 ; Mn # [2] MALAYALAM VOWEL SIGN VOCALIC L..MALAYALAM VOWEL SIGN VOCALIC LL
+0DCA ; Mn # SINHALA SIGN AL-LAKUNA
+0DD2..0DD4 ; Mn # [3] SINHALA VOWEL SIGN KETTI IS-PILLA..SINHALA VOWEL SIGN KETTI PAA-PILLA
+0DD6 ; Mn # SINHALA VOWEL SIGN DIGA PAA-PILLA
+0E31 ; Mn # THAI CHARACTER MAI HAN-AKAT
+0E34..0E3A ; Mn # [7] THAI CHARACTER SARA I..THAI CHARACTER PHINTHU
+0E47..0E4E ; Mn # [8] THAI CHARACTER MAITAIKHU..THAI CHARACTER YAMAKKAN
+0EB1 ; Mn # LAO VOWEL SIGN MAI KAN
+0EB4..0EBC ; Mn # [9] LAO VOWEL SIGN I..LAO SEMIVOWEL SIGN LO
+0EC8..0ECD ; Mn # [6] LAO TONE MAI EK..LAO NIGGAHITA
+0F18..0F19 ; Mn # [2] TIBETAN ASTROLOGICAL SIGN -KHYUD PA..TIBETAN ASTROLOGICAL SIGN SDONG TSHUGS
+0F35 ; Mn # TIBETAN MARK NGAS BZUNG NYI ZLA
+0F37 ; Mn # TIBETAN MARK NGAS BZUNG SGOR RTAGS
+0F39 ; Mn # TIBETAN MARK TSA -PHRU
+0F71..0F7E ; Mn # [14] TIBETAN VOWEL SIGN AA..TIBETAN SIGN RJES SU NGA RO
+0F80..0F84 ; Mn # [5] TIBETAN VOWEL SIGN REVERSED I..TIBETAN MARK HALANTA
+0F86..0F87 ; Mn # [2] TIBETAN SIGN LCI RTAGS..TIBETAN SIGN YANG RTAGS
+0F8D..0F97 ; Mn # [11] TIBETAN SUBJOINED SIGN LCE TSA CAN..TIBETAN SUBJOINED LETTER JA
+0F99..0FBC ; Mn # [36] TIBETAN SUBJOINED LETTER NYA..TIBETAN SUBJOINED LETTER FIXED-FORM RA
+0FC6 ; Mn # TIBETAN SYMBOL PADMA GDAN
+102D..1030 ; Mn # [4] MYANMAR VOWEL SIGN I..MYANMAR VOWEL SIGN UU
+1032..1037 ; Mn # [6] MYANMAR VOWEL SIGN AI..MYANMAR SIGN DOT BELOW
+1039..103A ; Mn # [2] MYANMAR SIGN VIRAMA..MYANMAR SIGN ASAT
+103D..103E ; Mn # [2] MYANMAR CONSONANT SIGN MEDIAL WA..MYANMAR CONSONANT SIGN MEDIAL HA
+1058..1059 ; Mn # [2] MYANMAR VOWEL SIGN VOCALIC L..MYANMAR VOWEL SIGN VOCALIC LL
+105E..1060 ; Mn # [3] MYANMAR CONSONANT SIGN MON MEDIAL NA..MYANMAR CONSONANT SIGN MON MEDIAL LA
+1071..1074 ; Mn # [4] MYANMAR VOWEL SIGN GEBA KAREN I..MYANMAR VOWEL SIGN KAYAH EE
+1082 ; Mn # MYANMAR CONSONANT SIGN SHAN MEDIAL WA
+1085..1086 ; Mn # [2] MYANMAR VOWEL SIGN SHAN E ABOVE..MYANMAR VOWEL SIGN SHAN FINAL Y
+108D ; Mn # MYANMAR SIGN SHAN COUNCIL EMPHATIC TONE
+109D ; Mn # MYANMAR VOWEL SIGN AITON AI
+135D..135F ; Mn # [3] ETHIOPIC COMBINING GEMINATION AND VOWEL LENGTH MARK..ETHIOPIC COMBINING GEMINATION MARK
+1712..1714 ; Mn # [3] TAGALOG VOWEL SIGN I..TAGALOG SIGN VIRAMA
+1732..1734 ; Mn # [3] HANUNOO VOWEL SIGN I..HANUNOO SIGN PAMUDPOD
+1752..1753 ; Mn # [2] BUHID VOWEL SIGN I..BUHID VOWEL SIGN U
+1772..1773 ; Mn # [2] TAGBANWA VOWEL SIGN I..TAGBANWA VOWEL SIGN U
+17B4..17B5 ; Mn # [2] KHMER VOWEL INHERENT AQ..KHMER VOWEL INHERENT AA
+17B7..17BD ; Mn # [7] KHMER VOWEL SIGN I..KHMER VOWEL SIGN UA
+17C6 ; Mn # KHMER SIGN NIKAHIT
+17C9..17D3 ; Mn # [11] KHMER SIGN MUUSIKATOAN..KHMER SIGN BATHAMASAT
+17DD ; Mn # KHMER SIGN ATTHACAN
+180B..180D ; Mn # [3] MONGOLIAN FREE VARIATION SELECTOR ONE..MONGOLIAN FREE VARIATION SELECTOR THREE
+1885..1886 ; Mn # [2] MONGOLIAN LETTER ALI GALI BALUDA..MONGOLIAN LETTER ALI GALI THREE BALUDA
+18A9 ; Mn # MONGOLIAN LETTER ALI GALI DAGALGA
+1920..1922 ; Mn # [3] LIMBU VOWEL SIGN A..LIMBU VOWEL SIGN U
+1927..1928 ; Mn # [2] LIMBU VOWEL SIGN E..LIMBU VOWEL SIGN O
+1932 ; Mn # LIMBU SMALL LETTER ANUSVARA
+1939..193B ; Mn # [3] LIMBU SIGN MUKPHRENG..LIMBU SIGN SA-I
+1A17..1A18 ; Mn # [2] BUGINESE VOWEL SIGN I..BUGINESE VOWEL SIGN U
+1A1B ; Mn # BUGINESE VOWEL SIGN AE
+1A56 ; Mn # TAI THAM CONSONANT SIGN MEDIAL LA
+1A58..1A5E ; Mn # [7] TAI THAM SIGN MAI KANG LAI..TAI THAM CONSONANT SIGN SA
+1A60 ; Mn # TAI THAM SIGN SAKOT
+1A62 ; Mn # TAI THAM VOWEL SIGN MAI SAT
+1A65..1A6C ; Mn # [8] TAI THAM VOWEL SIGN I..TAI THAM VOWEL SIGN OA BELOW
+1A73..1A7C ; Mn # [10] TAI THAM VOWEL SIGN OA ABOVE..TAI THAM SIGN KHUEN-LUE KARAN
+1A7F ; Mn # TAI THAM COMBINING CRYPTOGRAMMIC DOT
+1AB0..1ABD ; Mn # [14] COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBINING PARENTHESES BELOW
+1B00..1B03 ; Mn # [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG
+1B34 ; Mn # BALINESE SIGN REREKAN
+1B36..1B3A ; Mn # [5] BALINESE VOWEL SIGN ULU..BALINESE VOWEL SIGN RA REPA
+1B3C ; Mn # BALINESE VOWEL SIGN LA LENGA
+1B42 ; Mn # BALINESE VOWEL SIGN PEPET
+1B6B..1B73 ; Mn # [9] BALINESE MUSICAL SYMBOL COMBINING TEGEH..BALINESE MUSICAL SYMBOL COMBINING GONG
+1B80..1B81 ; Mn # [2] SUNDANESE SIGN PANYECEK..SUNDANESE SIGN PANGLAYAR
+1BA2..1BA5 ; Mn # [4] SUNDANESE CONSONANT SIGN PANYAKRA..SUNDANESE VOWEL SIGN PANYUKU
+1BA8..1BA9 ; Mn # [2] SUNDANESE VOWEL SIGN PAMEPET..SUNDANESE VOWEL SIGN PANEULEUNG
+1BAB..1BAD ; Mn # [3] SUNDANESE SIGN VIRAMA..SUNDANESE CONSONANT SIGN PASANGAN WA
+1BE6 ; Mn # BATAK SIGN TOMPI
+1BE8..1BE9 ; Mn # [2] BATAK VOWEL SIGN PAKPAK E..BATAK VOWEL SIGN EE
+1BED ; Mn # BATAK VOWEL SIGN KARO O
+1BEF..1BF1 ; Mn # [3] BATAK VOWEL SIGN U FOR SIMALUNGUN SA..BATAK CONSONANT SIGN H
+1C2C..1C33 ; Mn # [8] LEPCHA VOWEL SIGN E..LEPCHA CONSONANT SIGN T
+1C36..1C37 ; Mn # [2] LEPCHA SIGN RAN..LEPCHA SIGN NUKTA
+1CD0..1CD2 ; Mn # [3] VEDIC TONE KARSHANA..VEDIC TONE PRENKHA
+1CD4..1CE0 ; Mn # [13] VEDIC SIGN YAJURVEDIC MIDLINE SVARITA..VEDIC TONE RIGVEDIC KASHMIRI INDEPENDENT SVARITA
+1CE2..1CE8 ; Mn # [7] VEDIC SIGN VISARGA SVARITA..VEDIC SIGN VISARGA ANUDATTA WITH TAIL
+1CED ; Mn # VEDIC SIGN TIRYAK
+1CF4 ; Mn # VEDIC TONE CANDRA ABOVE
+1CF8..1CF9 ; Mn # [2] VEDIC TONE RING ABOVE..VEDIC TONE DOUBLE RING ABOVE
+1DC0..1DF9 ; Mn # [58] COMBINING DOTTED GRAVE ACCENT..COMBINING WIDE INVERTED BRIDGE BELOW
+1DFB..1DFF ; Mn # [5] COMBINING DELETION MARK..COMBINING RIGHT ARROWHEAD AND DOWN ARROWHEAD BELOW
+20D0..20DC ; Mn # [13] COMBINING LEFT HARPOON ABOVE..COMBINING FOUR DOTS ABOVE
+20E1 ; Mn # COMBINING LEFT RIGHT ARROW ABOVE
+20E5..20F0 ; Mn # [12] COMBINING REVERSE SOLIDUS OVERLAY..COMBINING ASTERISK ABOVE
+2CEF..2CF1 ; Mn # [3] COPTIC COMBINING NI ABOVE..COPTIC COMBINING SPIRITUS LENIS
+2D7F ; Mn # TIFINAGH CONSONANT JOINER
+2DE0..2DFF ; Mn # [32] COMBINING CYRILLIC LETTER BE..COMBINING CYRILLIC LETTER IOTIFIED BIG YUS
+302A..302D ; Mn # [4] IDEOGRAPHIC LEVEL TONE MARK..IDEOGRAPHIC ENTERING TONE MARK
+3099..309A ; Mn # [2] COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK..COMBINING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
+A66F ; Mn # COMBINING CYRILLIC VZMET
+A674..A67D ; Mn # [10] COMBINING CYRILLIC LETTER UKRAINIAN IE..COMBINING CYRILLIC PAYEROK
+A69E..A69F ; Mn # [2] COMBINING CYRILLIC LETTER EF..COMBINING CYRILLIC LETTER IOTIFIED E
+A6F0..A6F1 ; Mn # [2] BAMUM COMBINING MARK KOQNDON..BAMUM COMBINING MARK TUKWENTIS
+A802 ; Mn # SYLOTI NAGRI SIGN DVISVARA
+A806 ; Mn # SYLOTI NAGRI SIGN HASANTA
+A80B ; Mn # SYLOTI NAGRI SIGN ANUSVARA
+A825..A826 ; Mn # [2] SYLOTI NAGRI VOWEL SIGN U..SYLOTI NAGRI VOWEL SIGN E
+A8C4..A8C5 ; Mn # [2] SAURASHTRA SIGN VIRAMA..SAURASHTRA SIGN CANDRABINDU
+A8E0..A8F1 ; Mn # [18] COMBINING DEVANAGARI DIGIT ZERO..COMBINING DEVANAGARI SIGN AVAGRAHA
+A8FF ; Mn # DEVANAGARI VOWEL SIGN AY
+A926..A92D ; Mn # [8] KAYAH LI VOWEL UE..KAYAH LI TONE CALYA PLOPHU
+A947..A951 ; Mn # [11] REJANG VOWEL SIGN I..REJANG CONSONANT SIGN R
+A980..A982 ; Mn # [3] JAVANESE SIGN PANYANGGA..JAVANESE SIGN LAYAR
+A9B3 ; Mn # JAVANESE SIGN CECAK TELU
+A9B6..A9B9 ; Mn # [4] JAVANESE VOWEL SIGN WULU..JAVANESE VOWEL SIGN SUKU MENDUT
+A9BC..A9BD ; Mn # [2] JAVANESE VOWEL SIGN PEPET..JAVANESE CONSONANT SIGN KERET
+A9E5 ; Mn # MYANMAR SIGN SHAN SAW
+AA29..AA2E ; Mn # [6] CHAM VOWEL SIGN AA..CHAM VOWEL SIGN OE
+AA31..AA32 ; Mn # [2] CHAM VOWEL SIGN AU..CHAM VOWEL SIGN UE
+AA35..AA36 ; Mn # [2] CHAM CONSONANT SIGN LA..CHAM CONSONANT SIGN WA
+AA43 ; Mn # CHAM CONSONANT SIGN FINAL NG
+AA4C ; Mn # CHAM CONSONANT SIGN FINAL M
+AA7C ; Mn # MYANMAR SIGN TAI LAING TONE-2
+AAB0 ; Mn # TAI VIET MAI KANG
+AAB2..AAB4 ; Mn # [3] TAI VIET VOWEL I..TAI VIET VOWEL U
+AAB7..AAB8 ; Mn # [2] TAI VIET MAI KHIT..TAI VIET VOWEL IA
+AABE..AABF ; Mn # [2] TAI VIET VOWEL AM..TAI VIET TONE MAI EK
+AAC1 ; Mn # TAI VIET TONE MAI THO
+AAEC..AAED ; Mn # [2] MEETEI MAYEK VOWEL SIGN UU..MEETEI MAYEK VOWEL SIGN AAI
+AAF6 ; Mn # MEETEI MAYEK VIRAMA
+ABE5 ; Mn # MEETEI MAYEK VOWEL SIGN ANAP
+ABE8 ; Mn # MEETEI MAYEK VOWEL SIGN UNAP
+ABED ; Mn # MEETEI MAYEK APUN IYEK
+FB1E ; Mn # HEBREW POINT JUDEO-SPANISH VARIKA
+FE00..FE0F ; Mn # [16] VARIATION SELECTOR-1..VARIATION SELECTOR-16
+FE20..FE2F ; Mn # [16] COMBINING LIGATURE LEFT HALF..COMBINING CYRILLIC TITLO RIGHT HALF
+101FD ; Mn # PHAISTOS DISC SIGN COMBINING OBLIQUE STROKE
+102E0 ; Mn # COPTIC EPACT THOUSANDS MARK
+10376..1037A ; Mn # [5] COMBINING OLD PERMIC LETTER AN..COMBINING OLD PERMIC LETTER SII
+10A01..10A03 ; Mn # [3] KHAROSHTHI VOWEL SIGN I..KHAROSHTHI VOWEL SIGN VOCALIC R
+10A05..10A06 ; Mn # [2] KHAROSHTHI VOWEL SIGN E..KHAROSHTHI VOWEL SIGN O
+10A0C..10A0F ; Mn # [4] KHAROSHTHI VOWEL LENGTH MARK..KHAROSHTHI SIGN VISARGA
+10A38..10A3A ; Mn # [3] KHAROSHTHI SIGN BAR ABOVE..KHAROSHTHI SIGN DOT BELOW
+10A3F ; Mn # KHAROSHTHI VIRAMA
+10AE5..10AE6 ; Mn # [2] MANICHAEAN ABBREVIATION MARK ABOVE..MANICHAEAN ABBREVIATION MARK BELOW
+10D24..10D27 ; Mn # [4] HANIFI ROHINGYA SIGN HARBAHAY..HANIFI ROHINGYA SIGN TASSI
+10F46..10F50 ; Mn # [11] SOGDIAN COMBINING DOT BELOW..SOGDIAN COMBINING STROKE BELOW
+11001 ; Mn # BRAHMI SIGN ANUSVARA
+11038..11046 ; Mn # [15] BRAHMI VOWEL SIGN AA..BRAHMI VIRAMA
+1107F..11081 ; Mn # [3] BRAHMI NUMBER JOINER..KAITHI SIGN ANUSVARA
+110B3..110B6 ; Mn # [4] KAITHI VOWEL SIGN U..KAITHI VOWEL SIGN AI
+110B9..110BA ; Mn # [2] KAITHI SIGN VIRAMA..KAITHI SIGN NUKTA
+11100..11102 ; Mn # [3] CHAKMA SIGN CANDRABINDU..CHAKMA SIGN VISARGA
+11127..1112B ; Mn # [5] CHAKMA VOWEL SIGN A..CHAKMA VOWEL SIGN UU
+1112D..11134 ; Mn # [8] CHAKMA VOWEL SIGN AI..CHAKMA MAAYYAA
+11173 ; Mn # MAHAJANI SIGN NUKTA
+11180..11181 ; Mn # [2] SHARADA SIGN CANDRABINDU..SHARADA SIGN ANUSVARA
+111B6..111BE ; Mn # [9] SHARADA VOWEL SIGN U..SHARADA VOWEL SIGN O
+111C9..111CC ; Mn # [4] SHARADA SANDHI MARK..SHARADA EXTRA SHORT VOWEL MARK
+1122F..11231 ; Mn # [3] KHOJKI VOWEL SIGN U..KHOJKI VOWEL SIGN AI
+11234 ; Mn # KHOJKI SIGN ANUSVARA
+11236..11237 ; Mn # [2] KHOJKI SIGN NUKTA..KHOJKI SIGN SHADDA
+1123E ; Mn # KHOJKI SIGN SUKUN
+112DF ; Mn # KHUDAWADI SIGN ANUSVARA
+112E3..112EA ; Mn # [8] KHUDAWADI VOWEL SIGN U..KHUDAWADI SIGN VIRAMA
+11300..11301 ; Mn # [2] GRANTHA SIGN COMBINING ANUSVARA ABOVE..GRANTHA SIGN CANDRABINDU
+1133B..1133C ; Mn # [2] COMBINING BINDU BELOW..GRANTHA SIGN NUKTA
+11340 ; Mn # GRANTHA VOWEL SIGN II
+11366..1136C ; Mn # [7] COMBINING GRANTHA DIGIT ZERO..COMBINING GRANTHA DIGIT SIX
+11370..11374 ; Mn # [5] COMBINING GRANTHA LETTER A..COMBINING GRANTHA LETTER PA
+11438..1143F ; Mn # [8] NEWA VOWEL SIGN U..NEWA VOWEL SIGN AI
+11442..11444 ; Mn # [3] NEWA SIGN VIRAMA..NEWA SIGN ANUSVARA
+11446 ; Mn # NEWA SIGN NUKTA
+1145E ; Mn # NEWA SANDHI MARK
+114B3..114B8 ; Mn # [6] TIRHUTA VOWEL SIGN U..TIRHUTA VOWEL SIGN VOCALIC LL
+114BA ; Mn # TIRHUTA VOWEL SIGN SHORT E
+114BF..114C0 ; Mn # [2] TIRHUTA SIGN CANDRABINDU..TIRHUTA SIGN ANUSVARA
+114C2..114C3 ; Mn # [2] TIRHUTA SIGN VIRAMA..TIRHUTA SIGN NUKTA
+115B2..115B5 ; Mn # [4] SIDDHAM VOWEL SIGN U..SIDDHAM VOWEL SIGN VOCALIC RR
+115BC..115BD ; Mn # [2] SIDDHAM SIGN CANDRABINDU..SIDDHAM SIGN ANUSVARA
+115BF..115C0 ; Mn # [2] SIDDHAM SIGN VIRAMA..SIDDHAM SIGN NUKTA
+115DC..115DD ; Mn # [2] SIDDHAM VOWEL SIGN ALTERNATE U..SIDDHAM VOWEL SIGN ALTERNATE UU
+11633..1163A ; Mn # [8] MODI VOWEL SIGN U..MODI VOWEL SIGN AI
+1163D ; Mn # MODI SIGN ANUSVARA
+1163F..11640 ; Mn # [2] MODI SIGN VIRAMA..MODI SIGN ARDHACANDRA
+116AB ; Mn # TAKRI SIGN ANUSVARA
+116AD ; Mn # TAKRI VOWEL SIGN AA
+116B0..116B5 ; Mn # [6] TAKRI VOWEL SIGN U..TAKRI VOWEL SIGN AU
+116B7 ; Mn # TAKRI SIGN NUKTA
+1171D..1171F ; Mn # [3] AHOM CONSONANT SIGN MEDIAL LA..AHOM CONSONANT SIGN MEDIAL LIGATING RA
+11722..11725 ; Mn # [4] AHOM VOWEL SIGN I..AHOM VOWEL SIGN UU
+11727..1172B ; Mn # [5] AHOM VOWEL SIGN AW..AHOM SIGN KILLER
+1182F..11837 ; Mn # [9] DOGRA VOWEL SIGN U..DOGRA SIGN ANUSVARA
+11839..1183A ; Mn # [2] DOGRA SIGN VIRAMA..DOGRA SIGN NUKTA
+119D4..119D7 ; Mn # [4] NANDINAGARI VOWEL SIGN U..NANDINAGARI VOWEL SIGN VOCALIC RR
+119DA..119DB ; Mn # [2] NANDINAGARI VOWEL SIGN E..NANDINAGARI VOWEL SIGN AI
+119E0 ; Mn # NANDINAGARI SIGN VIRAMA
+11A01..11A0A ; Mn # [10] ZANABAZAR SQUARE VOWEL SIGN I..ZANABAZAR SQUARE VOWEL LENGTH MARK
+11A33..11A38 ; Mn # [6] ZANABAZAR SQUARE FINAL CONSONANT MARK..ZANABAZAR SQUARE SIGN ANUSVARA
+11A3B..11A3E ; Mn # [4] ZANABAZAR SQUARE CLUSTER-FINAL LETTER YA..ZANABAZAR SQUARE CLUSTER-FINAL LETTER VA
+11A47 ; Mn # ZANABAZAR SQUARE SUBJOINER
+11A51..11A56 ; Mn # [6] SOYOMBO VOWEL SIGN I..SOYOMBO VOWEL SIGN OE
+11A59..11A5B ; Mn # [3] SOYOMBO VOWEL SIGN VOCALIC R..SOYOMBO VOWEL LENGTH MARK
+11A8A..11A96 ; Mn # [13] SOYOMBO FINAL CONSONANT SIGN G..SOYOMBO SIGN ANUSVARA
+11A98..11A99 ; Mn # [2] SOYOMBO GEMINATION MARK..SOYOMBO SUBJOINER
+11C30..11C36 ; Mn # [7] BHAIKSUKI VOWEL SIGN I..BHAIKSUKI VOWEL SIGN VOCALIC L
+11C38..11C3D ; Mn # [6] BHAIKSUKI VOWEL SIGN E..BHAIKSUKI SIGN ANUSVARA
+11C3F ; Mn # BHAIKSUKI SIGN VIRAMA
+11C92..11CA7 ; Mn # [22] MARCHEN SUBJOINED LETTER KA..MARCHEN SUBJOINED LETTER ZA
+11CAA..11CB0 ; Mn # [7] MARCHEN SUBJOINED LETTER RA..MARCHEN VOWEL SIGN AA
+11CB2..11CB3 ; Mn # [2] MARCHEN VOWEL SIGN U..MARCHEN VOWEL SIGN E
+11CB5..11CB6 ; Mn # [2] MARCHEN SIGN ANUSVARA..MARCHEN SIGN CANDRABINDU
+11D31..11D36 ; Mn # [6] MASARAM GONDI VOWEL SIGN AA..MASARAM GONDI VOWEL SIGN VOCALIC R
+11D3A ; Mn # MASARAM GONDI VOWEL SIGN E
+11D3C..11D3D ; Mn # [2] MASARAM GONDI VOWEL SIGN AI..MASARAM GONDI VOWEL SIGN O
+11D3F..11D45 ; Mn # [7] MASARAM GONDI VOWEL SIGN AU..MASARAM GONDI VIRAMA
+11D47 ; Mn # MASARAM GONDI RA-KARA
+11D90..11D91 ; Mn # [2] GUNJALA GONDI VOWEL SIGN EE..GUNJALA GONDI VOWEL SIGN AI
+11D95 ; Mn # GUNJALA GONDI SIGN ANUSVARA
+11D97 ; Mn # GUNJALA GONDI VIRAMA
+11EF3..11EF4 ; Mn # [2] MAKASAR VOWEL SIGN I..MAKASAR VOWEL SIGN U
+16AF0..16AF4 ; Mn # [5] BASSA VAH COMBINING HIGH TONE..BASSA VAH COMBINING HIGH-LOW TONE
+16B30..16B36 ; Mn # [7] PAHAWH HMONG MARK CIM TUB..PAHAWH HMONG MARK CIM TAUM
+16F4F ; Mn # MIAO SIGN CONSONANT MODIFIER BAR
+16F8F..16F92 ; Mn # [4] MIAO TONE RIGHT..MIAO TONE BELOW
+1BC9D..1BC9E ; Mn # [2] DUPLOYAN THICK LETTER SELECTOR..DUPLOYAN DOUBLE MARK
+1D167..1D169 ; Mn # [3] MUSICAL SYMBOL COMBINING TREMOLO-1..MUSICAL SYMBOL COMBINING TREMOLO-3
+1D17B..1D182 ; Mn # [8] MUSICAL SYMBOL COMBINING ACCENT..MUSICAL SYMBOL COMBINING LOURE
+1D185..1D18B ; Mn # [7] MUSICAL SYMBOL COMBINING DOIT..MUSICAL SYMBOL COMBINING TRIPLE TONGUE
+1D1AA..1D1AD ; Mn # [4] MUSICAL SYMBOL COMBINING DOWN BOW..MUSICAL SYMBOL COMBINING SNAP PIZZICATO
+1D242..1D244 ; Mn # [3] COMBINING GREEK MUSICAL TRISEME..COMBINING GREEK MUSICAL PENTASEME
+1DA00..1DA36 ; Mn # [55] SIGNWRITING HEAD RIM..SIGNWRITING AIR SUCKING IN
+1DA3B..1DA6C ; Mn # [50] SIGNWRITING MOUTH CLOSED NEUTRAL..SIGNWRITING EXCITEMENT
+1DA75 ; Mn # SIGNWRITING UPPER BODY TILTING FROM HIP JOINTS
+1DA84 ; Mn # SIGNWRITING LOCATION HEAD NECK
+1DA9B..1DA9F ; Mn # [5] SIGNWRITING FILL MODIFIER-2..SIGNWRITING FILL MODIFIER-6
+1DAA1..1DAAF ; Mn # [15] SIGNWRITING ROTATION MODIFIER-2..SIGNWRITING ROTATION MODIFIER-16
+1E000..1E006 ; Mn # [7] COMBINING GLAGOLITIC LETTER AZU..COMBINING GLAGOLITIC LETTER ZHIVETE
+1E008..1E018 ; Mn # [17] COMBINING GLAGOLITIC LETTER ZEMLJA..COMBINING GLAGOLITIC LETTER HERU
+1E01B..1E021 ; Mn # [7] COMBINING GLAGOLITIC LETTER SHTA..COMBINING GLAGOLITIC LETTER YATI
+1E023..1E024 ; Mn # [2] COMBINING GLAGOLITIC LETTER YU..COMBINING GLAGOLITIC LETTER SMALL YUS
+1E026..1E02A ; Mn # [5] COMBINING GLAGOLITIC LETTER YO..COMBINING GLAGOLITIC LETTER FITA
+1E130..1E136 ; Mn # [7] NYIAKENG PUACHUE HMONG TONE-B..NYIAKENG PUACHUE HMONG TONE-D
+1E2EC..1E2EF ; Mn # [4] WANCHO TONE TUP..WANCHO TONE KOINI
+1E8D0..1E8D6 ; Mn # [7] MENDE KIKAKUI COMBINING NUMBER TEENS..MENDE KIKAKUI COMBINING NUMBER MILLIONS
+1E944..1E94A ; Mn # [7] ADLAM ALIF LENGTHENER..ADLAM NUKTA
+E0100..E01EF ; Mn # [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
+
+# Total code points: 1826
+
+# ================================================
+
+# General_Category=Enclosing_Mark
+
+0488..0489 ; Me # [2] COMBINING CYRILLIC HUNDRED THOUSANDS SIGN..COMBINING CYRILLIC MILLIONS SIGN
+1ABE ; Me # COMBINING PARENTHESES OVERLAY
+20DD..20E0 ; Me # [4] COMBINING ENCLOSING CIRCLE..COMBINING ENCLOSING CIRCLE BACKSLASH
+20E2..20E4 ; Me # [3] COMBINING ENCLOSING SCREEN..COMBINING ENCLOSING UPWARD POINTING TRIANGLE
+A670..A672 ; Me # [3] COMBINING CYRILLIC TEN MILLIONS SIGN..COMBINING CYRILLIC THOUSAND MILLIONS SIGN
+
+# Total code points: 13
+
+# ================================================
+
+# General_Category=Spacing_Mark
+
+0903 ; Mc # DEVANAGARI SIGN VISARGA
+093B ; Mc # DEVANAGARI VOWEL SIGN OOE
+093E..0940 ; Mc # [3] DEVANAGARI VOWEL SIGN AA..DEVANAGARI VOWEL SIGN II
+0949..094C ; Mc # [4] DEVANAGARI VOWEL SIGN CANDRA O..DEVANAGARI VOWEL SIGN AU
+094E..094F ; Mc # [2] DEVANAGARI VOWEL SIGN PRISHTHAMATRA E..DEVANAGARI VOWEL SIGN AW
+0982..0983 ; Mc # [2] BENGALI SIGN ANUSVARA..BENGALI SIGN VISARGA
+09BE..09C0 ; Mc # [3] BENGALI VOWEL SIGN AA..BENGALI VOWEL SIGN II
+09C7..09C8 ; Mc # [2] BENGALI VOWEL SIGN E..BENGALI VOWEL SIGN AI
+09CB..09CC ; Mc # [2] BENGALI VOWEL SIGN O..BENGALI VOWEL SIGN AU
+09D7 ; Mc # BENGALI AU LENGTH MARK
+0A03 ; Mc # GURMUKHI SIGN VISARGA
+0A3E..0A40 ; Mc # [3] GURMUKHI VOWEL SIGN AA..GURMUKHI VOWEL SIGN II
+0A83 ; Mc # GUJARATI SIGN VISARGA
+0ABE..0AC0 ; Mc # [3] GUJARATI VOWEL SIGN AA..GUJARATI VOWEL SIGN II
+0AC9 ; Mc # GUJARATI VOWEL SIGN CANDRA O
+0ACB..0ACC ; Mc # [2] GUJARATI VOWEL SIGN O..GUJARATI VOWEL SIGN AU
+0B02..0B03 ; Mc # [2] ORIYA SIGN ANUSVARA..ORIYA SIGN VISARGA
+0B3E ; Mc # ORIYA VOWEL SIGN AA
+0B40 ; Mc # ORIYA VOWEL SIGN II
+0B47..0B48 ; Mc # [2] ORIYA VOWEL SIGN E..ORIYA VOWEL SIGN AI
+0B4B..0B4C ; Mc # [2] ORIYA VOWEL SIGN O..ORIYA VOWEL SIGN AU
+0B57 ; Mc # ORIYA AU LENGTH MARK
+0BBE..0BBF ; Mc # [2] TAMIL VOWEL SIGN AA..TAMIL VOWEL SIGN I
+0BC1..0BC2 ; Mc # [2] TAMIL VOWEL SIGN U..TAMIL VOWEL SIGN UU
+0BC6..0BC8 ; Mc # [3] TAMIL VOWEL SIGN E..TAMIL VOWEL SIGN AI
+0BCA..0BCC ; Mc # [3] TAMIL VOWEL SIGN O..TAMIL VOWEL SIGN AU
+0BD7 ; Mc # TAMIL AU LENGTH MARK
+0C01..0C03 ; Mc # [3] TELUGU SIGN CANDRABINDU..TELUGU SIGN VISARGA
+0C41..0C44 ; Mc # [4] TELUGU VOWEL SIGN U..TELUGU VOWEL SIGN VOCALIC RR
+0C82..0C83 ; Mc # [2] KANNADA SIGN ANUSVARA..KANNADA SIGN VISARGA
+0CBE ; Mc # KANNADA VOWEL SIGN AA
+0CC0..0CC4 ; Mc # [5] KANNADA VOWEL SIGN II..KANNADA VOWEL SIGN VOCALIC RR
+0CC7..0CC8 ; Mc # [2] KANNADA VOWEL SIGN EE..KANNADA VOWEL SIGN AI
+0CCA..0CCB ; Mc # [2] KANNADA VOWEL SIGN O..KANNADA VOWEL SIGN OO
+0CD5..0CD6 ; Mc # [2] KANNADA LENGTH MARK..KANNADA AI LENGTH MARK
+0D02..0D03 ; Mc # [2] MALAYALAM SIGN ANUSVARA..MALAYALAM SIGN VISARGA
+0D3E..0D40 ; Mc # [3] MALAYALAM VOWEL SIGN AA..MALAYALAM VOWEL SIGN II
+0D46..0D48 ; Mc # [3] MALAYALAM VOWEL SIGN E..MALAYALAM VOWEL SIGN AI
+0D4A..0D4C ; Mc # [3] MALAYALAM VOWEL SIGN O..MALAYALAM VOWEL SIGN AU
+0D57 ; Mc # MALAYALAM AU LENGTH MARK
+0D82..0D83 ; Mc # [2] SINHALA SIGN ANUSVARAYA..SINHALA SIGN VISARGAYA
+0DCF..0DD1 ; Mc # [3] SINHALA VOWEL SIGN AELA-PILLA..SINHALA VOWEL SIGN DIGA AEDA-PILLA
+0DD8..0DDF ; Mc # [8] SINHALA VOWEL SIGN GAETTA-PILLA..SINHALA VOWEL SIGN GAYANUKITTA
+0DF2..0DF3 ; Mc # [2] SINHALA VOWEL SIGN DIGA GAETTA-PILLA..SINHALA VOWEL SIGN DIGA GAYANUKITTA
+0F3E..0F3F ; Mc # [2] TIBETAN SIGN YAR TSHES..TIBETAN SIGN MAR TSHES
+0F7F ; Mc # TIBETAN SIGN RNAM BCAD
+102B..102C ; Mc # [2] MYANMAR VOWEL SIGN TALL AA..MYANMAR VOWEL SIGN AA
+1031 ; Mc # MYANMAR VOWEL SIGN E
+1038 ; Mc # MYANMAR SIGN VISARGA
+103B..103C ; Mc # [2] MYANMAR CONSONANT SIGN MEDIAL YA..MYANMAR CONSONANT SIGN MEDIAL RA
+1056..1057 ; Mc # [2] MYANMAR VOWEL SIGN VOCALIC R..MYANMAR VOWEL SIGN VOCALIC RR
+1062..1064 ; Mc # [3] MYANMAR VOWEL SIGN SGAW KAREN EU..MYANMAR TONE MARK SGAW KAREN KE PHO
+1067..106D ; Mc # [7] MYANMAR VOWEL SIGN WESTERN PWO KAREN EU..MYANMAR SIGN WESTERN PWO KAREN TONE-5
+1083..1084 ; Mc # [2] MYANMAR VOWEL SIGN SHAN AA..MYANMAR VOWEL SIGN SHAN E
+1087..108C ; Mc # [6] MYANMAR SIGN SHAN TONE-2..MYANMAR SIGN SHAN COUNCIL TONE-3
+108F ; Mc # MYANMAR SIGN RUMAI PALAUNG TONE-5
+109A..109C ; Mc # [3] MYANMAR SIGN KHAMTI TONE-1..MYANMAR VOWEL SIGN AITON A
+17B6 ; Mc # KHMER VOWEL SIGN AA
+17BE..17C5 ; Mc # [8] KHMER VOWEL SIGN OE..KHMER VOWEL SIGN AU
+17C7..17C8 ; Mc # [2] KHMER SIGN REAHMUK..KHMER SIGN YUUKALEAPINTU
+1923..1926 ; Mc # [4] LIMBU VOWEL SIGN EE..LIMBU VOWEL SIGN AU
+1929..192B ; Mc # [3] LIMBU SUBJOINED LETTER YA..LIMBU SUBJOINED LETTER WA
+1930..1931 ; Mc # [2] LIMBU SMALL LETTER KA..LIMBU SMALL LETTER NGA
+1933..1938 ; Mc # [6] LIMBU SMALL LETTER TA..LIMBU SMALL LETTER LA
+1A19..1A1A ; Mc # [2] BUGINESE VOWEL SIGN E..BUGINESE VOWEL SIGN O
+1A55 ; Mc # TAI THAM CONSONANT SIGN MEDIAL RA
+1A57 ; Mc # TAI THAM CONSONANT SIGN LA TANG LAI
+1A61 ; Mc # TAI THAM VOWEL SIGN A
+1A63..1A64 ; Mc # [2] TAI THAM VOWEL SIGN AA..TAI THAM VOWEL SIGN TALL AA
+1A6D..1A72 ; Mc # [6] TAI THAM VOWEL SIGN OY..TAI THAM VOWEL SIGN THAM AI
+1B04 ; Mc # BALINESE SIGN BISAH
+1B35 ; Mc # BALINESE VOWEL SIGN TEDUNG
+1B3B ; Mc # BALINESE VOWEL SIGN RA REPA TEDUNG
+1B3D..1B41 ; Mc # [5] BALINESE VOWEL SIGN LA LENGA TEDUNG..BALINESE VOWEL SIGN TALING REPA TEDUNG
+1B43..1B44 ; Mc # [2] BALINESE VOWEL SIGN PEPET TEDUNG..BALINESE ADEG ADEG
+1B82 ; Mc # SUNDANESE SIGN PANGWISAD
+1BA1 ; Mc # SUNDANESE CONSONANT SIGN PAMINGKAL
+1BA6..1BA7 ; Mc # [2] SUNDANESE VOWEL SIGN PANAELAENG..SUNDANESE VOWEL SIGN PANOLONG
+1BAA ; Mc # SUNDANESE SIGN PAMAAEH
+1BE7 ; Mc # BATAK VOWEL SIGN E
+1BEA..1BEC ; Mc # [3] BATAK VOWEL SIGN I..BATAK VOWEL SIGN O
+1BEE ; Mc # BATAK VOWEL SIGN U
+1BF2..1BF3 ; Mc # [2] BATAK PANGOLAT..BATAK PANONGONAN
+1C24..1C2B ; Mc # [8] LEPCHA SUBJOINED LETTER YA..LEPCHA VOWEL SIGN UU
+1C34..1C35 ; Mc # [2] LEPCHA CONSONANT SIGN NYIN-DO..LEPCHA CONSONANT SIGN KANG
+1CE1 ; Mc # VEDIC TONE ATHARVAVEDIC INDEPENDENT SVARITA
+1CF7 ; Mc # VEDIC SIGN ATIKRAMA
+302E..302F ; Mc # [2] HANGUL SINGLE DOT TONE MARK..HANGUL DOUBLE DOT TONE MARK
+A823..A824 ; Mc # [2] SYLOTI NAGRI VOWEL SIGN A..SYLOTI NAGRI VOWEL SIGN I
+A827 ; Mc # SYLOTI NAGRI VOWEL SIGN OO
+A880..A881 ; Mc # [2] SAURASHTRA SIGN ANUSVARA..SAURASHTRA SIGN VISARGA
+A8B4..A8C3 ; Mc # [16] SAURASHTRA CONSONANT SIGN HAARU..SAURASHTRA VOWEL SIGN AU
+A952..A953 ; Mc # [2] REJANG CONSONANT SIGN H..REJANG VIRAMA
+A983 ; Mc # JAVANESE SIGN WIGNYAN
+A9B4..A9B5 ; Mc # [2] JAVANESE VOWEL SIGN TARUNG..JAVANESE VOWEL SIGN TOLONG
+A9BA..A9BB ; Mc # [2] JAVANESE VOWEL SIGN TALING..JAVANESE VOWEL SIGN DIRGA MURE
+A9BE..A9C0 ; Mc # [3] JAVANESE CONSONANT SIGN PENGKAL..JAVANESE PANGKON
+AA2F..AA30 ; Mc # [2] CHAM VOWEL SIGN O..CHAM VOWEL SIGN AI
+AA33..AA34 ; Mc # [2] CHAM CONSONANT SIGN YA..CHAM CONSONANT SIGN RA
+AA4D ; Mc # CHAM CONSONANT SIGN FINAL H
+AA7B ; Mc # MYANMAR SIGN PAO KAREN TONE
+AA7D ; Mc # MYANMAR SIGN TAI LAING TONE-5
+AAEB ; Mc # MEETEI MAYEK VOWEL SIGN II
+AAEE..AAEF ; Mc # [2] MEETEI MAYEK VOWEL SIGN AU..MEETEI MAYEK VOWEL SIGN AAU
+AAF5 ; Mc # MEETEI MAYEK VOWEL SIGN VISARGA
+ABE3..ABE4 ; Mc # [2] MEETEI MAYEK VOWEL SIGN ONAP..MEETEI MAYEK VOWEL SIGN INAP
+ABE6..ABE7 ; Mc # [2] MEETEI MAYEK VOWEL SIGN YENAP..MEETEI MAYEK VOWEL SIGN SOUNAP
+ABE9..ABEA ; Mc # [2] MEETEI MAYEK VOWEL SIGN CHEINAP..MEETEI MAYEK VOWEL SIGN NUNG
+ABEC ; Mc # MEETEI MAYEK LUM IYEK
+11000 ; Mc # BRAHMI SIGN CANDRABINDU
+11002 ; Mc # BRAHMI SIGN VISARGA
+11082 ; Mc # KAITHI SIGN VISARGA
+110B0..110B2 ; Mc # [3] KAITHI VOWEL SIGN AA..KAITHI VOWEL SIGN II
+110B7..110B8 ; Mc # [2] KAITHI VOWEL SIGN O..KAITHI VOWEL SIGN AU
+1112C ; Mc # CHAKMA VOWEL SIGN E
+11145..11146 ; Mc # [2] CHAKMA VOWEL SIGN AA..CHAKMA VOWEL SIGN EI
+11182 ; Mc # SHARADA SIGN VISARGA
+111B3..111B5 ; Mc # [3] SHARADA VOWEL SIGN AA..SHARADA VOWEL SIGN II
+111BF..111C0 ; Mc # [2] SHARADA VOWEL SIGN AU..SHARADA SIGN VIRAMA
+1122C..1122E ; Mc # [3] KHOJKI VOWEL SIGN AA..KHOJKI VOWEL SIGN II
+11232..11233 ; Mc # [2] KHOJKI VOWEL SIGN O..KHOJKI VOWEL SIGN AU
+11235 ; Mc # KHOJKI SIGN VIRAMA
+112E0..112E2 ; Mc # [3] KHUDAWADI VOWEL SIGN AA..KHUDAWADI VOWEL SIGN II
+11302..11303 ; Mc # [2] GRANTHA SIGN ANUSVARA..GRANTHA SIGN VISARGA
+1133E..1133F ; Mc # [2] GRANTHA VOWEL SIGN AA..GRANTHA VOWEL SIGN I
+11341..11344 ; Mc # [4] GRANTHA VOWEL SIGN U..GRANTHA VOWEL SIGN VOCALIC RR
+11347..11348 ; Mc # [2] GRANTHA VOWEL SIGN EE..GRANTHA VOWEL SIGN AI
+1134B..1134D ; Mc # [3] GRANTHA VOWEL SIGN OO..GRANTHA SIGN VIRAMA
+11357 ; Mc # GRANTHA AU LENGTH MARK
+11362..11363 ; Mc # [2] GRANTHA VOWEL SIGN VOCALIC L..GRANTHA VOWEL SIGN VOCALIC LL
+11435..11437 ; Mc # [3] NEWA VOWEL SIGN AA..NEWA VOWEL SIGN II
+11440..11441 ; Mc # [2] NEWA VOWEL SIGN O..NEWA VOWEL SIGN AU
+11445 ; Mc # NEWA SIGN VISARGA
+114B0..114B2 ; Mc # [3] TIRHUTA VOWEL SIGN AA..TIRHUTA VOWEL SIGN II
+114B9 ; Mc # TIRHUTA VOWEL SIGN E
+114BB..114BE ; Mc # [4] TIRHUTA VOWEL SIGN AI..TIRHUTA VOWEL SIGN AU
+114C1 ; Mc # TIRHUTA SIGN VISARGA
+115AF..115B1 ; Mc # [3] SIDDHAM VOWEL SIGN AA..SIDDHAM VOWEL SIGN II
+115B8..115BB ; Mc # [4] SIDDHAM VOWEL SIGN E..SIDDHAM VOWEL SIGN AU
+115BE ; Mc # SIDDHAM SIGN VISARGA
+11630..11632 ; Mc # [3] MODI VOWEL SIGN AA..MODI VOWEL SIGN II
+1163B..1163C ; Mc # [2] MODI VOWEL SIGN O..MODI VOWEL SIGN AU
+1163E ; Mc # MODI SIGN VISARGA
+116AC ; Mc # TAKRI SIGN VISARGA
+116AE..116AF ; Mc # [2] TAKRI VOWEL SIGN I..TAKRI VOWEL SIGN II
+116B6 ; Mc # TAKRI SIGN VIRAMA
+11720..11721 ; Mc # [2] AHOM VOWEL SIGN A..AHOM VOWEL SIGN AA
+11726 ; Mc # AHOM VOWEL SIGN E
+1182C..1182E ; Mc # [3] DOGRA VOWEL SIGN AA..DOGRA VOWEL SIGN II
+11838 ; Mc # DOGRA SIGN VISARGA
+119D1..119D3 ; Mc # [3] NANDINAGARI VOWEL SIGN AA..NANDINAGARI VOWEL SIGN II
+119DC..119DF ; Mc # [4] NANDINAGARI VOWEL SIGN O..NANDINAGARI SIGN VISARGA
+119E4 ; Mc # NANDINAGARI VOWEL SIGN PRISHTHAMATRA E
+11A39 ; Mc # ZANABAZAR SQUARE SIGN VISARGA
+11A57..11A58 ; Mc # [2] SOYOMBO VOWEL SIGN AI..SOYOMBO VOWEL SIGN AU
+11A97 ; Mc # SOYOMBO SIGN VISARGA
+11C2F ; Mc # BHAIKSUKI VOWEL SIGN AA
+11C3E ; Mc # BHAIKSUKI SIGN VISARGA
+11CA9 ; Mc # MARCHEN SUBJOINED LETTER YA
+11CB1 ; Mc # MARCHEN VOWEL SIGN I
+11CB4 ; Mc # MARCHEN VOWEL SIGN O
+11D8A..11D8E ; Mc # [5] GUNJALA GONDI VOWEL SIGN AA..GUNJALA GONDI VOWEL SIGN UU
+11D93..11D94 ; Mc # [2] GUNJALA GONDI VOWEL SIGN OO..GUNJALA GONDI VOWEL SIGN AU
+11D96 ; Mc # GUNJALA GONDI SIGN VISARGA
+11EF5..11EF6 ; Mc # [2] MAKASAR VOWEL SIGN E..MAKASAR VOWEL SIGN O
+16F51..16F87 ; Mc # [55] MIAO SIGN ASPIRATION..MIAO VOWEL SIGN UI
+1D165..1D166 ; Mc # [2] MUSICAL SYMBOL COMBINING STEM..MUSICAL SYMBOL COMBINING SPRECHGESANG STEM
+1D16D..1D172 ; Mc # [6] MUSICAL SYMBOL COMBINING AUGMENTATION DOT..MUSICAL SYMBOL COMBINING FLAG-5
+
+# Total code points: 429
+
+# ================================================
+
+# General_Category=Decimal_Number
+
+0030..0039 ; Nd # [10] DIGIT ZERO..DIGIT NINE
+0660..0669 ; Nd # [10] ARABIC-INDIC DIGIT ZERO..ARABIC-INDIC DIGIT NINE
+06F0..06F9 ; Nd # [10] EXTENDED ARABIC-INDIC DIGIT ZERO..EXTENDED ARABIC-INDIC DIGIT NINE
+07C0..07C9 ; Nd # [10] NKO DIGIT ZERO..NKO DIGIT NINE
+0966..096F ; Nd # [10] DEVANAGARI DIGIT ZERO..DEVANAGARI DIGIT NINE
+09E6..09EF ; Nd # [10] BENGALI DIGIT ZERO..BENGALI DIGIT NINE
+0A66..0A6F ; Nd # [10] GURMUKHI DIGIT ZERO..GURMUKHI DIGIT NINE
+0AE6..0AEF ; Nd # [10] GUJARATI DIGIT ZERO..GUJARATI DIGIT NINE
+0B66..0B6F ; Nd # [10] ORIYA DIGIT ZERO..ORIYA DIGIT NINE
+0BE6..0BEF ; Nd # [10] TAMIL DIGIT ZERO..TAMIL DIGIT NINE
+0C66..0C6F ; Nd # [10] TELUGU DIGIT ZERO..TELUGU DIGIT NINE
+0CE6..0CEF ; Nd # [10] KANNADA DIGIT ZERO..KANNADA DIGIT NINE
+0D66..0D6F ; Nd # [10] MALAYALAM DIGIT ZERO..MALAYALAM DIGIT NINE
+0DE6..0DEF ; Nd # [10] SINHALA LITH DIGIT ZERO..SINHALA LITH DIGIT NINE
+0E50..0E59 ; Nd # [10] THAI DIGIT ZERO..THAI DIGIT NINE
+0ED0..0ED9 ; Nd # [10] LAO DIGIT ZERO..LAO DIGIT NINE
+0F20..0F29 ; Nd # [10] TIBETAN DIGIT ZERO..TIBETAN DIGIT NINE
+1040..1049 ; Nd # [10] MYANMAR DIGIT ZERO..MYANMAR DIGIT NINE
+1090..1099 ; Nd # [10] MYANMAR SHAN DIGIT ZERO..MYANMAR SHAN DIGIT NINE
+17E0..17E9 ; Nd # [10] KHMER DIGIT ZERO..KHMER DIGIT NINE
+1810..1819 ; Nd # [10] MONGOLIAN DIGIT ZERO..MONGOLIAN DIGIT NINE
+1946..194F ; Nd # [10] LIMBU DIGIT ZERO..LIMBU DIGIT NINE
+19D0..19D9 ; Nd # [10] NEW TAI LUE DIGIT ZERO..NEW TAI LUE DIGIT NINE
+1A80..1A89 ; Nd # [10] TAI THAM HORA DIGIT ZERO..TAI THAM HORA DIGIT NINE
+1A90..1A99 ; Nd # [10] TAI THAM THAM DIGIT ZERO..TAI THAM THAM DIGIT NINE
+1B50..1B59 ; Nd # [10] BALINESE DIGIT ZERO..BALINESE DIGIT NINE
+1BB0..1BB9 ; Nd # [10] SUNDANESE DIGIT ZERO..SUNDANESE DIGIT NINE
+1C40..1C49 ; Nd # [10] LEPCHA DIGIT ZERO..LEPCHA DIGIT NINE
+1C50..1C59 ; Nd # [10] OL CHIKI DIGIT ZERO..OL CHIKI DIGIT NINE
+A620..A629 ; Nd # [10] VAI DIGIT ZERO..VAI DIGIT NINE
+A8D0..A8D9 ; Nd # [10] SAURASHTRA DIGIT ZERO..SAURASHTRA DIGIT NINE
+A900..A909 ; Nd # [10] KAYAH LI DIGIT ZERO..KAYAH LI DIGIT NINE
+A9D0..A9D9 ; Nd # [10] JAVANESE DIGIT ZERO..JAVANESE DIGIT NINE
+A9F0..A9F9 ; Nd # [10] MYANMAR TAI LAING DIGIT ZERO..MYANMAR TAI LAING DIGIT NINE
+AA50..AA59 ; Nd # [10] CHAM DIGIT ZERO..CHAM DIGIT NINE
+ABF0..ABF9 ; Nd # [10] MEETEI MAYEK DIGIT ZERO..MEETEI MAYEK DIGIT NINE
+FF10..FF19 ; Nd # [10] FULLWIDTH DIGIT ZERO..FULLWIDTH DIGIT NINE
+104A0..104A9 ; Nd # [10] OSMANYA DIGIT ZERO..OSMANYA DIGIT NINE
+10D30..10D39 ; Nd # [10] HANIFI ROHINGYA DIGIT ZERO..HANIFI ROHINGYA DIGIT NINE
+11066..1106F ; Nd # [10] BRAHMI DIGIT ZERO..BRAHMI DIGIT NINE
+110F0..110F9 ; Nd # [10] SORA SOMPENG DIGIT ZERO..SORA SOMPENG DIGIT NINE
+11136..1113F ; Nd # [10] CHAKMA DIGIT ZERO..CHAKMA DIGIT NINE
+111D0..111D9 ; Nd # [10] SHARADA DIGIT ZERO..SHARADA DIGIT NINE
+112F0..112F9 ; Nd # [10] KHUDAWADI DIGIT ZERO..KHUDAWADI DIGIT NINE
+11450..11459 ; Nd # [10] NEWA DIGIT ZERO..NEWA DIGIT NINE
+114D0..114D9 ; Nd # [10] TIRHUTA DIGIT ZERO..TIRHUTA DIGIT NINE
+11650..11659 ; Nd # [10] MODI DIGIT ZERO..MODI DIGIT NINE
+116C0..116C9 ; Nd # [10] TAKRI DIGIT ZERO..TAKRI DIGIT NINE
+11730..11739 ; Nd # [10] AHOM DIGIT ZERO..AHOM DIGIT NINE
+118E0..118E9 ; Nd # [10] WARANG CITI DIGIT ZERO..WARANG CITI DIGIT NINE
+11C50..11C59 ; Nd # [10] BHAIKSUKI DIGIT ZERO..BHAIKSUKI DIGIT NINE
+11D50..11D59 ; Nd # [10] MASARAM GONDI DIGIT ZERO..MASARAM GONDI DIGIT NINE
+11DA0..11DA9 ; Nd # [10] GUNJALA GONDI DIGIT ZERO..GUNJALA GONDI DIGIT NINE
+16A60..16A69 ; Nd # [10] MRO DIGIT ZERO..MRO DIGIT NINE
+16B50..16B59 ; Nd # [10] PAHAWH HMONG DIGIT ZERO..PAHAWH HMONG DIGIT NINE
+1D7CE..1D7FF ; Nd # [50] MATHEMATICAL BOLD DIGIT ZERO..MATHEMATICAL MONOSPACE DIGIT NINE
+1E140..1E149 ; Nd # [10] NYIAKENG PUACHUE HMONG DIGIT ZERO..NYIAKENG PUACHUE HMONG DIGIT NINE
+1E2F0..1E2F9 ; Nd # [10] WANCHO DIGIT ZERO..WANCHO DIGIT NINE
+1E950..1E959 ; Nd # [10] ADLAM DIGIT ZERO..ADLAM DIGIT NINE
+
+# Total code points: 630
+
+# ================================================
+
+# General_Category=Letter_Number
+
+16EE..16F0 ; Nl # [3] RUNIC ARLAUG SYMBOL..RUNIC BELGTHOR SYMBOL
+2160..2182 ; Nl # [35] ROMAN NUMERAL ONE..ROMAN NUMERAL TEN THOUSAND
+2185..2188 ; Nl # [4] ROMAN NUMERAL SIX LATE FORM..ROMAN NUMERAL ONE HUNDRED THOUSAND
+3007 ; Nl # IDEOGRAPHIC NUMBER ZERO
+3021..3029 ; Nl # [9] HANGZHOU NUMERAL ONE..HANGZHOU NUMERAL NINE
+3038..303A ; Nl # [3] HANGZHOU NUMERAL TEN..HANGZHOU NUMERAL THIRTY
+A6E6..A6EF ; Nl # [10] BAMUM LETTER MO..BAMUM LETTER KOGHOM
+10140..10174 ; Nl # [53] GREEK ACROPHONIC ATTIC ONE QUARTER..GREEK ACROPHONIC STRATIAN FIFTY MNAS
+10341 ; Nl # GOTHIC LETTER NINETY
+1034A ; Nl # GOTHIC LETTER NINE HUNDRED
+103D1..103D5 ; Nl # [5] OLD PERSIAN NUMBER ONE..OLD PERSIAN NUMBER HUNDRED
+12400..1246E ; Nl # [111] CUNEIFORM NUMERIC SIGN TWO ASH..CUNEIFORM NUMERIC SIGN NINE U VARIANT FORM
+
+# Total code points: 236
+
+# ================================================
+
+# General_Category=Other_Number
+
+00B2..00B3 ; No # [2] SUPERSCRIPT TWO..SUPERSCRIPT THREE
+00B9 ; No # SUPERSCRIPT ONE
+00BC..00BE ; No # [3] VULGAR FRACTION ONE QUARTER..VULGAR FRACTION THREE QUARTERS
+09F4..09F9 ; No # [6] BENGALI CURRENCY NUMERATOR ONE..BENGALI CURRENCY DENOMINATOR SIXTEEN
+0B72..0B77 ; No # [6] ORIYA FRACTION ONE QUARTER..ORIYA FRACTION THREE SIXTEENTHS
+0BF0..0BF2 ; No # [3] TAMIL NUMBER TEN..TAMIL NUMBER ONE THOUSAND
+0C78..0C7E ; No # [7] TELUGU FRACTION DIGIT ZERO FOR ODD POWERS OF FOUR..TELUGU FRACTION DIGIT THREE FOR EVEN POWERS OF FOUR
+0D58..0D5E ; No # [7] MALAYALAM FRACTION ONE ONE-HUNDRED-AND-SIXTIETH..MALAYALAM FRACTION ONE FIFTH
+0D70..0D78 ; No # [9] MALAYALAM NUMBER TEN..MALAYALAM FRACTION THREE SIXTEENTHS
+0F2A..0F33 ; No # [10] TIBETAN DIGIT HALF ONE..TIBETAN DIGIT HALF ZERO
+1369..137C ; No # [20] ETHIOPIC DIGIT ONE..ETHIOPIC NUMBER TEN THOUSAND
+17F0..17F9 ; No # [10] KHMER SYMBOL LEK ATTAK SON..KHMER SYMBOL LEK ATTAK PRAM-BUON
+19DA ; No # NEW TAI LUE THAM DIGIT ONE
+2070 ; No # SUPERSCRIPT ZERO
+2074..2079 ; No # [6] SUPERSCRIPT FOUR..SUPERSCRIPT NINE
+2080..2089 ; No # [10] SUBSCRIPT ZERO..SUBSCRIPT NINE
+2150..215F ; No # [16] VULGAR FRACTION ONE SEVENTH..FRACTION NUMERATOR ONE
+2189 ; No # VULGAR FRACTION ZERO THIRDS
+2460..249B ; No # [60] CIRCLED DIGIT ONE..NUMBER TWENTY FULL STOP
+24EA..24FF ; No # [22] CIRCLED DIGIT ZERO..NEGATIVE CIRCLED DIGIT ZERO
+2776..2793 ; No # [30] DINGBAT NEGATIVE CIRCLED DIGIT ONE..DINGBAT NEGATIVE CIRCLED SANS-SERIF NUMBER TEN
+2CFD ; No # COPTIC FRACTION ONE HALF
+3192..3195 ; No # [4] IDEOGRAPHIC ANNOTATION ONE MARK..IDEOGRAPHIC ANNOTATION FOUR MARK
+3220..3229 ; No # [10] PARENTHESIZED IDEOGRAPH ONE..PARENTHESIZED IDEOGRAPH TEN
+3248..324F ; No # [8] CIRCLED NUMBER TEN ON BLACK SQUARE..CIRCLED NUMBER EIGHTY ON BLACK SQUARE
+3251..325F ; No # [15] CIRCLED NUMBER TWENTY ONE..CIRCLED NUMBER THIRTY FIVE
+3280..3289 ; No # [10] CIRCLED IDEOGRAPH ONE..CIRCLED IDEOGRAPH TEN
+32B1..32BF ; No # [15] CIRCLED NUMBER THIRTY SIX..CIRCLED NUMBER FIFTY
+A830..A835 ; No # [6] NORTH INDIC FRACTION ONE QUARTER..NORTH INDIC FRACTION THREE SIXTEENTHS
+10107..10133 ; No # [45] AEGEAN NUMBER ONE..AEGEAN NUMBER NINETY THOUSAND
+10175..10178 ; No # [4] GREEK ONE HALF SIGN..GREEK THREE QUARTERS SIGN
+1018A..1018B ; No # [2] GREEK ZERO SIGN..GREEK ONE QUARTER SIGN
+102E1..102FB ; No # [27] COPTIC EPACT DIGIT ONE..COPTIC EPACT NUMBER NINE HUNDRED
+10320..10323 ; No # [4] OLD ITALIC NUMERAL ONE..OLD ITALIC NUMERAL FIFTY
+10858..1085F ; No # [8] IMPERIAL ARAMAIC NUMBER ONE..IMPERIAL ARAMAIC NUMBER TEN THOUSAND
+10879..1087F ; No # [7] PALMYRENE NUMBER ONE..PALMYRENE NUMBER TWENTY
+108A7..108AF ; No # [9] NABATAEAN NUMBER ONE..NABATAEAN NUMBER ONE HUNDRED
+108FB..108FF ; No # [5] HATRAN NUMBER ONE..HATRAN NUMBER ONE HUNDRED
+10916..1091B ; No # [6] PHOENICIAN NUMBER ONE..PHOENICIAN NUMBER THREE
+109BC..109BD ; No # [2] MEROITIC CURSIVE FRACTION ELEVEN TWELFTHS..MEROITIC CURSIVE FRACTION ONE HALF
+109C0..109CF ; No # [16] MEROITIC CURSIVE NUMBER ONE..MEROITIC CURSIVE NUMBER SEVENTY
+109D2..109FF ; No # [46] MEROITIC CURSIVE NUMBER ONE HUNDRED..MEROITIC CURSIVE FRACTION TEN TWELFTHS
+10A40..10A48 ; No # [9] KHAROSHTHI DIGIT ONE..KHAROSHTHI FRACTION ONE HALF
+10A7D..10A7E ; No # [2] OLD SOUTH ARABIAN NUMBER ONE..OLD SOUTH ARABIAN NUMBER FIFTY
+10A9D..10A9F ; No # [3] OLD NORTH ARABIAN NUMBER ONE..OLD NORTH ARABIAN NUMBER TWENTY
+10AEB..10AEF ; No # [5] MANICHAEAN NUMBER ONE..MANICHAEAN NUMBER ONE HUNDRED
+10B58..10B5F ; No # [8] INSCRIPTIONAL PARTHIAN NUMBER ONE..INSCRIPTIONAL PARTHIAN NUMBER ONE THOUSAND
+10B78..10B7F ; No # [8] INSCRIPTIONAL PAHLAVI NUMBER ONE..INSCRIPTIONAL PAHLAVI NUMBER ONE THOUSAND
+10BA9..10BAF ; No # [7] PSALTER PAHLAVI NUMBER ONE..PSALTER PAHLAVI NUMBER ONE HUNDRED
+10CFA..10CFF ; No # [6] OLD HUNGARIAN NUMBER ONE..OLD HUNGARIAN NUMBER ONE THOUSAND
+10E60..10E7E ; No # [31] RUMI DIGIT ONE..RUMI FRACTION TWO THIRDS
+10F1D..10F26 ; No # [10] OLD SOGDIAN NUMBER ONE..OLD SOGDIAN FRACTION ONE HALF
+10F51..10F54 ; No # [4] SOGDIAN NUMBER ONE..SOGDIAN NUMBER ONE HUNDRED
+11052..11065 ; No # [20] BRAHMI NUMBER ONE..BRAHMI NUMBER ONE THOUSAND
+111E1..111F4 ; No # [20] SINHALA ARCHAIC DIGIT ONE..SINHALA ARCHAIC NUMBER ONE THOUSAND
+1173A..1173B ; No # [2] AHOM NUMBER TEN..AHOM NUMBER TWENTY
+118EA..118F2 ; No # [9] WARANG CITI NUMBER TEN..WARANG CITI NUMBER NINETY
+11C5A..11C6C ; No # [19] BHAIKSUKI NUMBER ONE..BHAIKSUKI HUNDREDS UNIT MARK
+11FC0..11FD4 ; No # [21] TAMIL FRACTION ONE THREE-HUNDRED-AND-TWENTIETH..TAMIL FRACTION DOWNSCALING FACTOR KIIZH
+16B5B..16B61 ; No # [7] PAHAWH HMONG NUMBER TENS..PAHAWH HMONG NUMBER TRILLIONS
+16E80..16E96 ; No # [23] MEDEFAIDRIN DIGIT ZERO..MEDEFAIDRIN DIGIT THREE ALTERNATE FORM
+1D2E0..1D2F3 ; No # [20] MAYAN NUMERAL ZERO..MAYAN NUMERAL NINETEEN
+1D360..1D378 ; No # [25] COUNTING ROD UNIT DIGIT ONE..TALLY MARK FIVE
+1E8C7..1E8CF ; No # [9] MENDE KIKAKUI DIGIT ONE..MENDE KIKAKUI DIGIT NINE
+1EC71..1ECAB ; No # [59] INDIC SIYAQ NUMBER ONE..INDIC SIYAQ NUMBER PREFIXED NINE
+1ECAD..1ECAF ; No # [3] INDIC SIYAQ FRACTION ONE QUARTER..INDIC SIYAQ FRACTION THREE QUARTERS
+1ECB1..1ECB4 ; No # [4] INDIC SIYAQ NUMBER ALTERNATE ONE..INDIC SIYAQ ALTERNATE LAKH MARK
+1ED01..1ED2D ; No # [45] OTTOMAN SIYAQ NUMBER ONE..OTTOMAN SIYAQ NUMBER NINETY THOUSAND
+1ED2F..1ED3D ; No # [15] OTTOMAN SIYAQ ALTERNATE NUMBER TWO..OTTOMAN SIYAQ FRACTION ONE SIXTH
+1F100..1F10C ; No # [13] DIGIT ZERO FULL STOP..DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT ZERO
+
+# Total code points: 888
+
+# ================================================
+
+# General_Category=Space_Separator
+
+0020 ; Zs # SPACE
+00A0 ; Zs # NO-BREAK SPACE
+1680 ; Zs # OGHAM SPACE MARK
+2000..200A ; Zs # [11] EN QUAD..HAIR SPACE
+202F ; Zs # NARROW NO-BREAK SPACE
+205F ; Zs # MEDIUM MATHEMATICAL SPACE
+3000 ; Zs # IDEOGRAPHIC SPACE
+
+# Total code points: 17
+
+# ================================================
+
+# General_Category=Line_Separator
+
+2028 ; Zl # LINE SEPARATOR
+
+# Total code points: 1
+
+# ================================================
+
+# General_Category=Paragraph_Separator
+
+2029 ; Zp # PARAGRAPH SEPARATOR
+
+# Total code points: 1
+
+# ================================================
+
+# General_Category=Control
+
+0000..001F ; Cc # [32] <control-0000>..<control-001F>
+007F..009F ; Cc # [33] <control-007F>..<control-009F>
+
+# Total code points: 65
+
+# ================================================
+
+# General_Category=Format
+
+00AD ; Cf # SOFT HYPHEN
+0600..0605 ; Cf # [6] ARABIC NUMBER SIGN..ARABIC NUMBER MARK ABOVE
+061C ; Cf # ARABIC LETTER MARK
+06DD ; Cf # ARABIC END OF AYAH
+070F ; Cf # SYRIAC ABBREVIATION MARK
+08E2 ; Cf # ARABIC DISPUTED END OF AYAH
+180E ; Cf # MONGOLIAN VOWEL SEPARATOR
+200B..200F ; Cf # [5] ZERO WIDTH SPACE..RIGHT-TO-LEFT MARK
+202A..202E ; Cf # [5] LEFT-TO-RIGHT EMBEDDING..RIGHT-TO-LEFT OVERRIDE
+2060..2064 ; Cf # [5] WORD JOINER..INVISIBLE PLUS
+2066..206F ; Cf # [10] LEFT-TO-RIGHT ISOLATE..NOMINAL DIGIT SHAPES
+FEFF ; Cf # ZERO WIDTH NO-BREAK SPACE
+FFF9..FFFB ; Cf # [3] INTERLINEAR ANNOTATION ANCHOR..INTERLINEAR ANNOTATION TERMINATOR
+110BD ; Cf # KAITHI NUMBER SIGN
+110CD ; Cf # KAITHI NUMBER SIGN ABOVE
+13430..13438 ; Cf # [9] EGYPTIAN HIEROGLYPH VERTICAL JOINER..EGYPTIAN HIEROGLYPH END SEGMENT
+1BCA0..1BCA3 ; Cf # [4] SHORTHAND FORMAT LETTER OVERLAP..SHORTHAND FORMAT UP STEP
+1D173..1D17A ; Cf # [8] MUSICAL SYMBOL BEGIN BEAM..MUSICAL SYMBOL END PHRASE
+E0001 ; Cf # LANGUAGE TAG
+E0020..E007F ; Cf # [96] TAG SPACE..CANCEL TAG
+
+# Total code points: 161
+
+# ================================================
+
+# General_Category=Private_Use
+
+E000..F8FF ; Co # [6400] <private-use-E000>..<private-use-F8FF>
+F0000..FFFFD ; Co # [65534] <private-use-F0000>..<private-use-FFFFD>
+100000..10FFFD; Co # [65534] <private-use-100000>..<private-use-10FFFD>
+
+# Total code points: 137468
+
+# ================================================
+
+# General_Category=Surrogate
+
+D800..DFFF ; Cs # [2048] <surrogate-D800>..<surrogate-DFFF>
+
+# Total code points: 2048
+
+# ================================================
+
+# General_Category=Dash_Punctuation
+
+002D ; Pd # HYPHEN-MINUS
+058A ; Pd # ARMENIAN HYPHEN
+05BE ; Pd # HEBREW PUNCTUATION MAQAF
+1400 ; Pd # CANADIAN SYLLABICS HYPHEN
+1806 ; Pd # MONGOLIAN TODO SOFT HYPHEN
+2010..2015 ; Pd # [6] HYPHEN..HORIZONTAL BAR
+2E17 ; Pd # DOUBLE OBLIQUE HYPHEN
+2E1A ; Pd # HYPHEN WITH DIAERESIS
+2E3A..2E3B ; Pd # [2] TWO-EM DASH..THREE-EM DASH
+2E40 ; Pd # DOUBLE HYPHEN
+301C ; Pd # WAVE DASH
+3030 ; Pd # WAVY DASH
+30A0 ; Pd # KATAKANA-HIRAGANA DOUBLE HYPHEN
+FE31..FE32 ; Pd # [2] PRESENTATION FORM FOR VERTICAL EM DASH..PRESENTATION FORM FOR VERTICAL EN DASH
+FE58 ; Pd # SMALL EM DASH
+FE63 ; Pd # SMALL HYPHEN-MINUS
+FF0D ; Pd # FULLWIDTH HYPHEN-MINUS
+
+# Total code points: 24
+
+# ================================================
+
+# General_Category=Open_Punctuation
+
+0028 ; Ps # LEFT PARENTHESIS
+005B ; Ps # LEFT SQUARE BRACKET
+007B ; Ps # LEFT CURLY BRACKET
+0F3A ; Ps # TIBETAN MARK GUG RTAGS GYON
+0F3C ; Ps # TIBETAN MARK ANG KHANG GYON
+169B ; Ps # OGHAM FEATHER MARK
+201A ; Ps # SINGLE LOW-9 QUOTATION MARK
+201E ; Ps # DOUBLE LOW-9 QUOTATION MARK
+2045 ; Ps # LEFT SQUARE BRACKET WITH QUILL
+207D ; Ps # SUPERSCRIPT LEFT PARENTHESIS
+208D ; Ps # SUBSCRIPT LEFT PARENTHESIS
+2308 ; Ps # LEFT CEILING
+230A ; Ps # LEFT FLOOR
+2329 ; Ps # LEFT-POINTING ANGLE BRACKET
+2768 ; Ps # MEDIUM LEFT PARENTHESIS ORNAMENT
+276A ; Ps # MEDIUM FLATTENED LEFT PARENTHESIS ORNAMENT
+276C ; Ps # MEDIUM LEFT-POINTING ANGLE BRACKET ORNAMENT
+276E ; Ps # HEAVY LEFT-POINTING ANGLE QUOTATION MARK ORNAMENT
+2770 ; Ps # HEAVY LEFT-POINTING ANGLE BRACKET ORNAMENT
+2772 ; Ps # LIGHT LEFT TORTOISE SHELL BRACKET ORNAMENT
+2774 ; Ps # MEDIUM LEFT CURLY BRACKET ORNAMENT
+27C5 ; Ps # LEFT S-SHAPED BAG DELIMITER
+27E6 ; Ps # MATHEMATICAL LEFT WHITE SQUARE BRACKET
+27E8 ; Ps # MATHEMATICAL LEFT ANGLE BRACKET
+27EA ; Ps # MATHEMATICAL LEFT DOUBLE ANGLE BRACKET
+27EC ; Ps # MATHEMATICAL LEFT WHITE TORTOISE SHELL BRACKET
+27EE ; Ps # MATHEMATICAL LEFT FLATTENED PARENTHESIS
+2983 ; Ps # LEFT WHITE CURLY BRACKET
+2985 ; Ps # LEFT WHITE PARENTHESIS
+2987 ; Ps # Z NOTATION LEFT IMAGE BRACKET
+2989 ; Ps # Z NOTATION LEFT BINDING BRACKET
+298B ; Ps # LEFT SQUARE BRACKET WITH UNDERBAR
+298D ; Ps # LEFT SQUARE BRACKET WITH TICK IN TOP CORNER
+298F ; Ps # LEFT SQUARE BRACKET WITH TICK IN BOTTOM CORNER
+2991 ; Ps # LEFT ANGLE BRACKET WITH DOT
+2993 ; Ps # LEFT ARC LESS-THAN BRACKET
+2995 ; Ps # DOUBLE LEFT ARC GREATER-THAN BRACKET
+2997 ; Ps # LEFT BLACK TORTOISE SHELL BRACKET
+29D8 ; Ps # LEFT WIGGLY FENCE
+29DA ; Ps # LEFT DOUBLE WIGGLY FENCE
+29FC ; Ps # LEFT-POINTING CURVED ANGLE BRACKET
+2E22 ; Ps # TOP LEFT HALF BRACKET
+2E24 ; Ps # BOTTOM LEFT HALF BRACKET
+2E26 ; Ps # LEFT SIDEWAYS U BRACKET
+2E28 ; Ps # LEFT DOUBLE PARENTHESIS
+2E42 ; Ps # DOUBLE LOW-REVERSED-9 QUOTATION MARK
+3008 ; Ps # LEFT ANGLE BRACKET
+300A ; Ps # LEFT DOUBLE ANGLE BRACKET
+300C ; Ps # LEFT CORNER BRACKET
+300E ; Ps # LEFT WHITE CORNER BRACKET
+3010 ; Ps # LEFT BLACK LENTICULAR BRACKET
+3014 ; Ps # LEFT TORTOISE SHELL BRACKET
+3016 ; Ps # LEFT WHITE LENTICULAR BRACKET
+3018 ; Ps # LEFT WHITE TORTOISE SHELL BRACKET
+301A ; Ps # LEFT WHITE SQUARE BRACKET
+301D ; Ps # REVERSED DOUBLE PRIME QUOTATION MARK
+FD3F ; Ps # ORNATE RIGHT PARENTHESIS
+FE17 ; Ps # PRESENTATION FORM FOR VERTICAL LEFT WHITE LENTICULAR BRACKET
+FE35 ; Ps # PRESENTATION FORM FOR VERTICAL LEFT PARENTHESIS
+FE37 ; Ps # PRESENTATION FORM FOR VERTICAL LEFT CURLY BRACKET
+FE39 ; Ps # PRESENTATION FORM FOR VERTICAL LEFT TORTOISE SHELL BRACKET
+FE3B ; Ps # PRESENTATION FORM FOR VERTICAL LEFT BLACK LENTICULAR BRACKET
+FE3D ; Ps # PRESENTATION FORM FOR VERTICAL LEFT DOUBLE ANGLE BRACKET
+FE3F ; Ps # PRESENTATION FORM FOR VERTICAL LEFT ANGLE BRACKET
+FE41 ; Ps # PRESENTATION FORM FOR VERTICAL LEFT CORNER BRACKET
+FE43 ; Ps # PRESENTATION FORM FOR VERTICAL LEFT WHITE CORNER BRACKET
+FE47 ; Ps # PRESENTATION FORM FOR VERTICAL LEFT SQUARE BRACKET
+FE59 ; Ps # SMALL LEFT PARENTHESIS
+FE5B ; Ps # SMALL LEFT CURLY BRACKET
+FE5D ; Ps # SMALL LEFT TORTOISE SHELL BRACKET
+FF08 ; Ps # FULLWIDTH LEFT PARENTHESIS
+FF3B ; Ps # FULLWIDTH LEFT SQUARE BRACKET
+FF5B ; Ps # FULLWIDTH LEFT CURLY BRACKET
+FF5F ; Ps # FULLWIDTH LEFT WHITE PARENTHESIS
+FF62 ; Ps # HALFWIDTH LEFT CORNER BRACKET
+
+# Total code points: 75
+
+# ================================================
+
+# General_Category=Close_Punctuation
+
+0029 ; Pe # RIGHT PARENTHESIS
+005D ; Pe # RIGHT SQUARE BRACKET
+007D ; Pe # RIGHT CURLY BRACKET
+0F3B ; Pe # TIBETAN MARK GUG RTAGS GYAS
+0F3D ; Pe # TIBETAN MARK ANG KHANG GYAS
+169C ; Pe # OGHAM REVERSED FEATHER MARK
+2046 ; Pe # RIGHT SQUARE BRACKET WITH QUILL
+207E ; Pe # SUPERSCRIPT RIGHT PARENTHESIS
+208E ; Pe # SUBSCRIPT RIGHT PARENTHESIS
+2309 ; Pe # RIGHT CEILING
+230B ; Pe # RIGHT FLOOR
+232A ; Pe # RIGHT-POINTING ANGLE BRACKET
+2769 ; Pe # MEDIUM RIGHT PARENTHESIS ORNAMENT
+276B ; Pe # MEDIUM FLATTENED RIGHT PARENTHESIS ORNAMENT
+276D ; Pe # MEDIUM RIGHT-POINTING ANGLE BRACKET ORNAMENT
+276F ; Pe # HEAVY RIGHT-POINTING ANGLE QUOTATION MARK ORNAMENT
+2771 ; Pe # HEAVY RIGHT-POINTING ANGLE BRACKET ORNAMENT
+2773 ; Pe # LIGHT RIGHT TORTOISE SHELL BRACKET ORNAMENT
+2775 ; Pe # MEDIUM RIGHT CURLY BRACKET ORNAMENT
+27C6 ; Pe # RIGHT S-SHAPED BAG DELIMITER
+27E7 ; Pe # MATHEMATICAL RIGHT WHITE SQUARE BRACKET
+27E9 ; Pe # MATHEMATICAL RIGHT ANGLE BRACKET
+27EB ; Pe # MATHEMATICAL RIGHT DOUBLE ANGLE BRACKET
+27ED ; Pe # MATHEMATICAL RIGHT WHITE TORTOISE SHELL BRACKET
+27EF ; Pe # MATHEMATICAL RIGHT FLATTENED PARENTHESIS
+2984 ; Pe # RIGHT WHITE CURLY BRACKET
+2986 ; Pe # RIGHT WHITE PARENTHESIS
+2988 ; Pe # Z NOTATION RIGHT IMAGE BRACKET
+298A ; Pe # Z NOTATION RIGHT BINDING BRACKET
+298C ; Pe # RIGHT SQUARE BRACKET WITH UNDERBAR
+298E ; Pe # RIGHT SQUARE BRACKET WITH TICK IN BOTTOM CORNER
+2990 ; Pe # RIGHT SQUARE BRACKET WITH TICK IN TOP CORNER
+2992 ; Pe # RIGHT ANGLE BRACKET WITH DOT
+2994 ; Pe # RIGHT ARC GREATER-THAN BRACKET
+2996 ; Pe # DOUBLE RIGHT ARC LESS-THAN BRACKET
+2998 ; Pe # RIGHT BLACK TORTOISE SHELL BRACKET
+29D9 ; Pe # RIGHT WIGGLY FENCE
+29DB ; Pe # RIGHT DOUBLE WIGGLY FENCE
+29FD ; Pe # RIGHT-POINTING CURVED ANGLE BRACKET
+2E23 ; Pe # TOP RIGHT HALF BRACKET
+2E25 ; Pe # BOTTOM RIGHT HALF BRACKET
+2E27 ; Pe # RIGHT SIDEWAYS U BRACKET
+2E29 ; Pe # RIGHT DOUBLE PARENTHESIS
+3009 ; Pe # RIGHT ANGLE BRACKET
+300B ; Pe # RIGHT DOUBLE ANGLE BRACKET
+300D ; Pe # RIGHT CORNER BRACKET
+300F ; Pe # RIGHT WHITE CORNER BRACKET
+3011 ; Pe # RIGHT BLACK LENTICULAR BRACKET
+3015 ; Pe # RIGHT TORTOISE SHELL BRACKET
+3017 ; Pe # RIGHT WHITE LENTICULAR BRACKET
+3019 ; Pe # RIGHT WHITE TORTOISE SHELL BRACKET
+301B ; Pe # RIGHT WHITE SQUARE BRACKET
+301E..301F ; Pe # [2] DOUBLE PRIME QUOTATION MARK..LOW DOUBLE PRIME QUOTATION MARK
+FD3E ; Pe # ORNATE LEFT PARENTHESIS
+FE18 ; Pe # PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET
+FE36 ; Pe # PRESENTATION FORM FOR VERTICAL RIGHT PARENTHESIS
+FE38 ; Pe # PRESENTATION FORM FOR VERTICAL RIGHT CURLY BRACKET
+FE3A ; Pe # PRESENTATION FORM FOR VERTICAL RIGHT TORTOISE SHELL BRACKET
+FE3C ; Pe # PRESENTATION FORM FOR VERTICAL RIGHT BLACK LENTICULAR BRACKET
+FE3E ; Pe # PRESENTATION FORM FOR VERTICAL RIGHT DOUBLE ANGLE BRACKET
+FE40 ; Pe # PRESENTATION FORM FOR VERTICAL RIGHT ANGLE BRACKET
+FE42 ; Pe # PRESENTATION FORM FOR VERTICAL RIGHT CORNER BRACKET
+FE44 ; Pe # PRESENTATION FORM FOR VERTICAL RIGHT WHITE CORNER BRACKET
+FE48 ; Pe # PRESENTATION FORM FOR VERTICAL RIGHT SQUARE BRACKET
+FE5A ; Pe # SMALL RIGHT PARENTHESIS
+FE5C ; Pe # SMALL RIGHT CURLY BRACKET
+FE5E ; Pe # SMALL RIGHT TORTOISE SHELL BRACKET
+FF09 ; Pe # FULLWIDTH RIGHT PARENTHESIS
+FF3D ; Pe # FULLWIDTH RIGHT SQUARE BRACKET
+FF5D ; Pe # FULLWIDTH RIGHT CURLY BRACKET
+FF60 ; Pe # FULLWIDTH RIGHT WHITE PARENTHESIS
+FF63 ; Pe # HALFWIDTH RIGHT CORNER BRACKET
+
+# Total code points: 73
+
+# ================================================
+
+# General_Category=Connector_Punctuation
+
+005F ; Pc # LOW LINE
+203F..2040 ; Pc # [2] UNDERTIE..CHARACTER TIE
+2054 ; Pc # INVERTED UNDERTIE
+FE33..FE34 ; Pc # [2] PRESENTATION FORM FOR VERTICAL LOW LINE..PRESENTATION FORM FOR VERTICAL WAVY LOW LINE
+FE4D..FE4F ; Pc # [3] DASHED LOW LINE..WAVY LOW LINE
+FF3F ; Pc # FULLWIDTH LOW LINE
+
+# Total code points: 10
+
+# ================================================
+
+# General_Category=Other_Punctuation
+
+0021..0023 ; Po # [3] EXCLAMATION MARK..NUMBER SIGN
+0025..0027 ; Po # [3] PERCENT SIGN..APOSTROPHE
+002A ; Po # ASTERISK
+002C ; Po # COMMA
+002E..002F ; Po # [2] FULL STOP..SOLIDUS
+003A..003B ; Po # [2] COLON..SEMICOLON
+003F..0040 ; Po # [2] QUESTION MARK..COMMERCIAL AT
+005C ; Po # REVERSE SOLIDUS
+00A1 ; Po # INVERTED EXCLAMATION MARK
+00A7 ; Po # SECTION SIGN
+00B6..00B7 ; Po # [2] PILCROW SIGN..MIDDLE DOT
+00BF ; Po # INVERTED QUESTION MARK
+037E ; Po # GREEK QUESTION MARK
+0387 ; Po # GREEK ANO TELEIA
+055A..055F ; Po # [6] ARMENIAN APOSTROPHE..ARMENIAN ABBREVIATION MARK
+0589 ; Po # ARMENIAN FULL STOP
+05C0 ; Po # HEBREW PUNCTUATION PASEQ
+05C3 ; Po # HEBREW PUNCTUATION SOF PASUQ
+05C6 ; Po # HEBREW PUNCTUATION NUN HAFUKHA
+05F3..05F4 ; Po # [2] HEBREW PUNCTUATION GERESH..HEBREW PUNCTUATION GERSHAYIM
+0609..060A ; Po # [2] ARABIC-INDIC PER MILLE SIGN..ARABIC-INDIC PER TEN THOUSAND SIGN
+060C..060D ; Po # [2] ARABIC COMMA..ARABIC DATE SEPARATOR
+061B ; Po # ARABIC SEMICOLON
+061E..061F ; Po # [2] ARABIC TRIPLE DOT PUNCTUATION MARK..ARABIC QUESTION MARK
+066A..066D ; Po # [4] ARABIC PERCENT SIGN..ARABIC FIVE POINTED STAR
+06D4 ; Po # ARABIC FULL STOP
+0700..070D ; Po # [14] SYRIAC END OF PARAGRAPH..SYRIAC HARKLEAN ASTERISCUS
+07F7..07F9 ; Po # [3] NKO SYMBOL GBAKURUNEN..NKO EXCLAMATION MARK
+0830..083E ; Po # [15] SAMARITAN PUNCTUATION NEQUDAA..SAMARITAN PUNCTUATION ANNAAU
+085E ; Po # MANDAIC PUNCTUATION
+0964..0965 ; Po # [2] DEVANAGARI DANDA..DEVANAGARI DOUBLE DANDA
+0970 ; Po # DEVANAGARI ABBREVIATION SIGN
+09FD ; Po # BENGALI ABBREVIATION SIGN
+0A76 ; Po # GURMUKHI ABBREVIATION SIGN
+0AF0 ; Po # GUJARATI ABBREVIATION SIGN
+0C77 ; Po # TELUGU SIGN SIDDHAM
+0C84 ; Po # KANNADA SIGN SIDDHAM
+0DF4 ; Po # SINHALA PUNCTUATION KUNDDALIYA
+0E4F ; Po # THAI CHARACTER FONGMAN
+0E5A..0E5B ; Po # [2] THAI CHARACTER ANGKHANKHU..THAI CHARACTER KHOMUT
+0F04..0F12 ; Po # [15] TIBETAN MARK INITIAL YIG MGO MDUN MA..TIBETAN MARK RGYA GRAM SHAD
+0F14 ; Po # TIBETAN MARK GTER TSHEG
+0F85 ; Po # TIBETAN MARK PALUTA
+0FD0..0FD4 ; Po # [5] TIBETAN MARK BSKA- SHOG GI MGO RGYAN..TIBETAN MARK CLOSING BRDA RNYING YIG MGO SGAB MA
+0FD9..0FDA ; Po # [2] TIBETAN MARK LEADING MCHAN RTAGS..TIBETAN MARK TRAILING MCHAN RTAGS
+104A..104F ; Po # [6] MYANMAR SIGN LITTLE SECTION..MYANMAR SYMBOL GENITIVE
+10FB ; Po # GEORGIAN PARAGRAPH SEPARATOR
+1360..1368 ; Po # [9] ETHIOPIC SECTION MARK..ETHIOPIC PARAGRAPH SEPARATOR
+166E ; Po # CANADIAN SYLLABICS FULL STOP
+16EB..16ED ; Po # [3] RUNIC SINGLE PUNCTUATION..RUNIC CROSS PUNCTUATION
+1735..1736 ; Po # [2] PHILIPPINE SINGLE PUNCTUATION..PHILIPPINE DOUBLE PUNCTUATION
+17D4..17D6 ; Po # [3] KHMER SIGN KHAN..KHMER SIGN CAMNUC PII KUUH
+17D8..17DA ; Po # [3] KHMER SIGN BEYYAL..KHMER SIGN KOOMUUT
+1800..1805 ; Po # [6] MONGOLIAN BIRGA..MONGOLIAN FOUR DOTS
+1807..180A ; Po # [4] MONGOLIAN SIBE SYLLABLE BOUNDARY MARKER..MONGOLIAN NIRUGU
+1944..1945 ; Po # [2] LIMBU EXCLAMATION MARK..LIMBU QUESTION MARK
+1A1E..1A1F ; Po # [2] BUGINESE PALLAWA..BUGINESE END OF SECTION
+1AA0..1AA6 ; Po # [7] TAI THAM SIGN WIANG..TAI THAM SIGN REVERSED ROTATED RANA
+1AA8..1AAD ; Po # [6] TAI THAM SIGN KAAN..TAI THAM SIGN CAANG
+1B5A..1B60 ; Po # [7] BALINESE PANTI..BALINESE PAMENENG
+1BFC..1BFF ; Po # [4] BATAK SYMBOL BINDU NA METEK..BATAK SYMBOL BINDU PANGOLAT
+1C3B..1C3F ; Po # [5] LEPCHA PUNCTUATION TA-ROL..LEPCHA PUNCTUATION TSHOOK
+1C7E..1C7F ; Po # [2] OL CHIKI PUNCTUATION MUCAAD..OL CHIKI PUNCTUATION DOUBLE MUCAAD
+1CC0..1CC7 ; Po # [8] SUNDANESE PUNCTUATION BINDU SURYA..SUNDANESE PUNCTUATION BINDU BA SATANGA
+1CD3 ; Po # VEDIC SIGN NIHSHVASA
+2016..2017 ; Po # [2] DOUBLE VERTICAL LINE..DOUBLE LOW LINE
+2020..2027 ; Po # [8] DAGGER..HYPHENATION POINT
+2030..2038 ; Po # [9] PER MILLE SIGN..CARET
+203B..203E ; Po # [4] REFERENCE MARK..OVERLINE
+2041..2043 ; Po # [3] CARET INSERTION POINT..HYPHEN BULLET
+2047..2051 ; Po # [11] DOUBLE QUESTION MARK..TWO ASTERISKS ALIGNED VERTICALLY
+2053 ; Po # SWUNG DASH
+2055..205E ; Po # [10] FLOWER PUNCTUATION MARK..VERTICAL FOUR DOTS
+2CF9..2CFC ; Po # [4] COPTIC OLD NUBIAN FULL STOP..COPTIC OLD NUBIAN VERSE DIVIDER
+2CFE..2CFF ; Po # [2] COPTIC FULL STOP..COPTIC MORPHOLOGICAL DIVIDER
+2D70 ; Po # TIFINAGH SEPARATOR MARK
+2E00..2E01 ; Po # [2] RIGHT ANGLE SUBSTITUTION MARKER..RIGHT ANGLE DOTTED SUBSTITUTION MARKER
+2E06..2E08 ; Po # [3] RAISED INTERPOLATION MARKER..DOTTED TRANSPOSITION MARKER
+2E0B ; Po # RAISED SQUARE
+2E0E..2E16 ; Po # [9] EDITORIAL CORONIS..DOTTED RIGHT-POINTING ANGLE
+2E18..2E19 ; Po # [2] INVERTED INTERROBANG..PALM BRANCH
+2E1B ; Po # TILDE WITH RING ABOVE
+2E1E..2E1F ; Po # [2] TILDE WITH DOT ABOVE..TILDE WITH DOT BELOW
+2E2A..2E2E ; Po # [5] TWO DOTS OVER ONE DOT PUNCTUATION..REVERSED QUESTION MARK
+2E30..2E39 ; Po # [10] RING POINT..TOP HALF SECTION SIGN
+2E3C..2E3F ; Po # [4] STENOGRAPHIC FULL STOP..CAPITULUM
+2E41 ; Po # REVERSED COMMA
+2E43..2E4F ; Po # [13] DASH WITH LEFT UPTURN..CORNISH VERSE DIVIDER
+3001..3003 ; Po # [3] IDEOGRAPHIC COMMA..DITTO MARK
+303D ; Po # PART ALTERNATION MARK
+30FB ; Po # KATAKANA MIDDLE DOT
+A4FE..A4FF ; Po # [2] LISU PUNCTUATION COMMA..LISU PUNCTUATION FULL STOP
+A60D..A60F ; Po # [3] VAI COMMA..VAI QUESTION MARK
+A673 ; Po # SLAVONIC ASTERISK
+A67E ; Po # CYRILLIC KAVYKA
+A6F2..A6F7 ; Po # [6] BAMUM NJAEMLI..BAMUM QUESTION MARK
+A874..A877 ; Po # [4] PHAGS-PA SINGLE HEAD MARK..PHAGS-PA MARK DOUBLE SHAD
+A8CE..A8CF ; Po # [2] SAURASHTRA DANDA..SAURASHTRA DOUBLE DANDA
+A8F8..A8FA ; Po # [3] DEVANAGARI SIGN PUSHPIKA..DEVANAGARI CARET
+A8FC ; Po # DEVANAGARI SIGN SIDDHAM
+A92E..A92F ; Po # [2] KAYAH LI SIGN CWI..KAYAH LI SIGN SHYA
+A95F ; Po # REJANG SECTION MARK
+A9C1..A9CD ; Po # [13] JAVANESE LEFT RERENGGAN..JAVANESE TURNED PADA PISELEH
+A9DE..A9DF ; Po # [2] JAVANESE PADA TIRTA TUMETES..JAVANESE PADA ISEN-ISEN
+AA5C..AA5F ; Po # [4] CHAM PUNCTUATION SPIRAL..CHAM PUNCTUATION TRIPLE DANDA
+AADE..AADF ; Po # [2] TAI VIET SYMBOL HO HOI..TAI VIET SYMBOL KOI KOI
+AAF0..AAF1 ; Po # [2] MEETEI MAYEK CHEIKHAN..MEETEI MAYEK AHANG KHUDAM
+ABEB ; Po # MEETEI MAYEK CHEIKHEI
+FE10..FE16 ; Po # [7] PRESENTATION FORM FOR VERTICAL COMMA..PRESENTATION FORM FOR VERTICAL QUESTION MARK
+FE19 ; Po # PRESENTATION FORM FOR VERTICAL HORIZONTAL ELLIPSIS
+FE30 ; Po # PRESENTATION FORM FOR VERTICAL TWO DOT LEADER
+FE45..FE46 ; Po # [2] SESAME DOT..WHITE SESAME DOT
+FE49..FE4C ; Po # [4] DASHED OVERLINE..DOUBLE WAVY OVERLINE
+FE50..FE52 ; Po # [3] SMALL COMMA..SMALL FULL STOP
+FE54..FE57 ; Po # [4] SMALL SEMICOLON..SMALL EXCLAMATION MARK
+FE5F..FE61 ; Po # [3] SMALL NUMBER SIGN..SMALL ASTERISK
+FE68 ; Po # SMALL REVERSE SOLIDUS
+FE6A..FE6B ; Po # [2] SMALL PERCENT SIGN..SMALL COMMERCIAL AT
+FF01..FF03 ; Po # [3] FULLWIDTH EXCLAMATION MARK..FULLWIDTH NUMBER SIGN
+FF05..FF07 ; Po # [3] FULLWIDTH PERCENT SIGN..FULLWIDTH APOSTROPHE
+FF0A ; Po # FULLWIDTH ASTERISK
+FF0C ; Po # FULLWIDTH COMMA
+FF0E..FF0F ; Po # [2] FULLWIDTH FULL STOP..FULLWIDTH SOLIDUS
+FF1A..FF1B ; Po # [2] FULLWIDTH COLON..FULLWIDTH SEMICOLON
+FF1F..FF20 ; Po # [2] FULLWIDTH QUESTION MARK..FULLWIDTH COMMERCIAL AT
+FF3C ; Po # FULLWIDTH REVERSE SOLIDUS
+FF61 ; Po # HALFWIDTH IDEOGRAPHIC FULL STOP
+FF64..FF65 ; Po # [2] HALFWIDTH IDEOGRAPHIC COMMA..HALFWIDTH KATAKANA MIDDLE DOT
+10100..10102 ; Po # [3] AEGEAN WORD SEPARATOR LINE..AEGEAN CHECK MARK
+1039F ; Po # UGARITIC WORD DIVIDER
+103D0 ; Po # OLD PERSIAN WORD DIVIDER
+1056F ; Po # CAUCASIAN ALBANIAN CITATION MARK
+10857 ; Po # IMPERIAL ARAMAIC SECTION SIGN
+1091F ; Po # PHOENICIAN WORD SEPARATOR
+1093F ; Po # LYDIAN TRIANGULAR MARK
+10A50..10A58 ; Po # [9] KHAROSHTHI PUNCTUATION DOT..KHAROSHTHI PUNCTUATION LINES
+10A7F ; Po # OLD SOUTH ARABIAN NUMERIC INDICATOR
+10AF0..10AF6 ; Po # [7] MANICHAEAN PUNCTUATION STAR..MANICHAEAN PUNCTUATION LINE FILLER
+10B39..10B3F ; Po # [7] AVESTAN ABBREVIATION MARK..LARGE ONE RING OVER TWO RINGS PUNCTUATION
+10B99..10B9C ; Po # [4] PSALTER PAHLAVI SECTION MARK..PSALTER PAHLAVI FOUR DOTS WITH DOT
+10F55..10F59 ; Po # [5] SOGDIAN PUNCTUATION TWO VERTICAL BARS..SOGDIAN PUNCTUATION HALF CIRCLE WITH DOT
+11047..1104D ; Po # [7] BRAHMI DANDA..BRAHMI PUNCTUATION LOTUS
+110BB..110BC ; Po # [2] KAITHI ABBREVIATION SIGN..KAITHI ENUMERATION SIGN
+110BE..110C1 ; Po # [4] KAITHI SECTION MARK..KAITHI DOUBLE DANDA
+11140..11143 ; Po # [4] CHAKMA SECTION MARK..CHAKMA QUESTION MARK
+11174..11175 ; Po # [2] MAHAJANI ABBREVIATION SIGN..MAHAJANI SECTION MARK
+111C5..111C8 ; Po # [4] SHARADA DANDA..SHARADA SEPARATOR
+111CD ; Po # SHARADA SUTRA MARK
+111DB ; Po # SHARADA SIGN SIDDHAM
+111DD..111DF ; Po # [3] SHARADA CONTINUATION SIGN..SHARADA SECTION MARK-2
+11238..1123D ; Po # [6] KHOJKI DANDA..KHOJKI ABBREVIATION SIGN
+112A9 ; Po # MULTANI SECTION MARK
+1144B..1144F ; Po # [5] NEWA DANDA..NEWA ABBREVIATION SIGN
+1145B ; Po # NEWA PLACEHOLDER MARK
+1145D ; Po # NEWA INSERTION SIGN
+114C6 ; Po # TIRHUTA ABBREVIATION SIGN
+115C1..115D7 ; Po # [23] SIDDHAM SIGN SIDDHAM..SIDDHAM SECTION MARK WITH CIRCLES AND FOUR ENCLOSURES
+11641..11643 ; Po # [3] MODI DANDA..MODI ABBREVIATION SIGN
+11660..1166C ; Po # [13] MONGOLIAN BIRGA WITH ORNAMENT..MONGOLIAN TURNED SWIRL BIRGA WITH DOUBLE ORNAMENT
+1173C..1173E ; Po # [3] AHOM SIGN SMALL SECTION..AHOM SIGN RULAI
+1183B ; Po # DOGRA ABBREVIATION SIGN
+119E2 ; Po # NANDINAGARI SIGN SIDDHAM
+11A3F..11A46 ; Po # [8] ZANABAZAR SQUARE INITIAL HEAD MARK..ZANABAZAR SQUARE CLOSING DOUBLE-LINED HEAD MARK
+11A9A..11A9C ; Po # [3] SOYOMBO MARK TSHEG..SOYOMBO MARK DOUBLE SHAD
+11A9E..11AA2 ; Po # [5] SOYOMBO HEAD MARK WITH MOON AND SUN AND TRIPLE FLAME..SOYOMBO TERMINAL MARK-2
+11C41..11C45 ; Po # [5] BHAIKSUKI DANDA..BHAIKSUKI GAP FILLER-2
+11C70..11C71 ; Po # [2] MARCHEN HEAD MARK..MARCHEN MARK SHAD
+11EF7..11EF8 ; Po # [2] MAKASAR PASSIMBANG..MAKASAR END OF SECTION
+11FFF ; Po # TAMIL PUNCTUATION END OF TEXT
+12470..12474 ; Po # [5] CUNEIFORM PUNCTUATION SIGN OLD ASSYRIAN WORD DIVIDER..CUNEIFORM PUNCTUATION SIGN DIAGONAL QUADCOLON
+16A6E..16A6F ; Po # [2] MRO DANDA..MRO DOUBLE DANDA
+16AF5 ; Po # BASSA VAH FULL STOP
+16B37..16B3B ; Po # [5] PAHAWH HMONG SIGN VOS THOM..PAHAWH HMONG SIGN VOS FEEM
+16B44 ; Po # PAHAWH HMONG SIGN XAUS
+16E97..16E9A ; Po # [4] MEDEFAIDRIN COMMA..MEDEFAIDRIN EXCLAMATION OH
+16FE2 ; Po # OLD CHINESE HOOK MARK
+1BC9F ; Po # DUPLOYAN PUNCTUATION CHINOOK FULL STOP
+1DA87..1DA8B ; Po # [5] SIGNWRITING COMMA..SIGNWRITING PARENTHESIS
+1E95E..1E95F ; Po # [2] ADLAM INITIAL EXCLAMATION MARK..ADLAM INITIAL QUESTION MARK
+
+# Total code points: 588
+
+# ================================================
+
+# General_Category=Math_Symbol
+
+002B ; Sm # PLUS SIGN
+003C..003E ; Sm # [3] LESS-THAN SIGN..GREATER-THAN SIGN
+007C ; Sm # VERTICAL LINE
+007E ; Sm # TILDE
+00AC ; Sm # NOT SIGN
+00B1 ; Sm # PLUS-MINUS SIGN
+00D7 ; Sm # MULTIPLICATION SIGN
+00F7 ; Sm # DIVISION SIGN
+03F6 ; Sm # GREEK REVERSED LUNATE EPSILON SYMBOL
+0606..0608 ; Sm # [3] ARABIC-INDIC CUBE ROOT..ARABIC RAY
+2044 ; Sm # FRACTION SLASH
+2052 ; Sm # COMMERCIAL MINUS SIGN
+207A..207C ; Sm # [3] SUPERSCRIPT PLUS SIGN..SUPERSCRIPT EQUALS SIGN
+208A..208C ; Sm # [3] SUBSCRIPT PLUS SIGN..SUBSCRIPT EQUALS SIGN
+2118 ; Sm # SCRIPT CAPITAL P
+2140..2144 ; Sm # [5] DOUBLE-STRUCK N-ARY SUMMATION..TURNED SANS-SERIF CAPITAL Y
+214B ; Sm # TURNED AMPERSAND
+2190..2194 ; Sm # [5] LEFTWARDS ARROW..LEFT RIGHT ARROW
+219A..219B ; Sm # [2] LEFTWARDS ARROW WITH STROKE..RIGHTWARDS ARROW WITH STROKE
+21A0 ; Sm # RIGHTWARDS TWO HEADED ARROW
+21A3 ; Sm # RIGHTWARDS ARROW WITH TAIL
+21A6 ; Sm # RIGHTWARDS ARROW FROM BAR
+21AE ; Sm # LEFT RIGHT ARROW WITH STROKE
+21CE..21CF ; Sm # [2] LEFT RIGHT DOUBLE ARROW WITH STROKE..RIGHTWARDS DOUBLE ARROW WITH STROKE
+21D2 ; Sm # RIGHTWARDS DOUBLE ARROW
+21D4 ; Sm # LEFT RIGHT DOUBLE ARROW
+21F4..22FF ; Sm # [268] RIGHT ARROW WITH SMALL CIRCLE..Z NOTATION BAG MEMBERSHIP
+2320..2321 ; Sm # [2] TOP HALF INTEGRAL..BOTTOM HALF INTEGRAL
+237C ; Sm # RIGHT ANGLE WITH DOWNWARDS ZIGZAG ARROW
+239B..23B3 ; Sm # [25] LEFT PARENTHESIS UPPER HOOK..SUMMATION BOTTOM
+23DC..23E1 ; Sm # [6] TOP PARENTHESIS..BOTTOM TORTOISE SHELL BRACKET
+25B7 ; Sm # WHITE RIGHT-POINTING TRIANGLE
+25C1 ; Sm # WHITE LEFT-POINTING TRIANGLE
+25F8..25FF ; Sm # [8] UPPER LEFT TRIANGLE..LOWER RIGHT TRIANGLE
+266F ; Sm # MUSIC SHARP SIGN
+27C0..27C4 ; Sm # [5] THREE DIMENSIONAL ANGLE..OPEN SUPERSET
+27C7..27E5 ; Sm # [31] OR WITH DOT INSIDE..WHITE SQUARE WITH RIGHTWARDS TICK
+27F0..27FF ; Sm # [16] UPWARDS QUADRUPLE ARROW..LONG RIGHTWARDS SQUIGGLE ARROW
+2900..2982 ; Sm # [131] RIGHTWARDS TWO-HEADED ARROW WITH VERTICAL STROKE..Z NOTATION TYPE COLON
+2999..29D7 ; Sm # [63] DOTTED FENCE..BLACK HOURGLASS
+29DC..29FB ; Sm # [32] INCOMPLETE INFINITY..TRIPLE PLUS
+29FE..2AFF ; Sm # [258] TINY..N-ARY WHITE VERTICAL BAR
+2B30..2B44 ; Sm # [21] LEFT ARROW WITH SMALL CIRCLE..RIGHTWARDS ARROW THROUGH SUPERSET
+2B47..2B4C ; Sm # [6] REVERSE TILDE OPERATOR ABOVE RIGHTWARDS ARROW..RIGHTWARDS ARROW ABOVE REVERSE TILDE OPERATOR
+FB29 ; Sm # HEBREW LETTER ALTERNATIVE PLUS SIGN
+FE62 ; Sm # SMALL PLUS SIGN
+FE64..FE66 ; Sm # [3] SMALL LESS-THAN SIGN..SMALL EQUALS SIGN
+FF0B ; Sm # FULLWIDTH PLUS SIGN
+FF1C..FF1E ; Sm # [3] FULLWIDTH LESS-THAN SIGN..FULLWIDTH GREATER-THAN SIGN
+FF5C ; Sm # FULLWIDTH VERTICAL LINE
+FF5E ; Sm # FULLWIDTH TILDE
+FFE2 ; Sm # FULLWIDTH NOT SIGN
+FFE9..FFEC ; Sm # [4] HALFWIDTH LEFTWARDS ARROW..HALFWIDTH DOWNWARDS ARROW
+1D6C1 ; Sm # MATHEMATICAL BOLD NABLA
+1D6DB ; Sm # MATHEMATICAL BOLD PARTIAL DIFFERENTIAL
+1D6FB ; Sm # MATHEMATICAL ITALIC NABLA
+1D715 ; Sm # MATHEMATICAL ITALIC PARTIAL DIFFERENTIAL
+1D735 ; Sm # MATHEMATICAL BOLD ITALIC NABLA
+1D74F ; Sm # MATHEMATICAL BOLD ITALIC PARTIAL DIFFERENTIAL
+1D76F ; Sm # MATHEMATICAL SANS-SERIF BOLD NABLA
+1D789 ; Sm # MATHEMATICAL SANS-SERIF BOLD PARTIAL DIFFERENTIAL
+1D7A9 ; Sm # MATHEMATICAL SANS-SERIF BOLD ITALIC NABLA
+1D7C3 ; Sm # MATHEMATICAL SANS-SERIF BOLD ITALIC PARTIAL DIFFERENTIAL
+1EEF0..1EEF1 ; Sm # [2] ARABIC MATHEMATICAL OPERATOR MEEM WITH HAH WITH TATWEEL..ARABIC MATHEMATICAL OPERATOR HAH WITH DAL
+
+# Total code points: 948
+
+# ================================================
+
+# General_Category=Currency_Symbol
+
+0024 ; Sc # DOLLAR SIGN
+00A2..00A5 ; Sc # [4] CENT SIGN..YEN SIGN
+058F ; Sc # ARMENIAN DRAM SIGN
+060B ; Sc # AFGHANI SIGN
+07FE..07FF ; Sc # [2] NKO DOROME SIGN..NKO TAMAN SIGN
+09F2..09F3 ; Sc # [2] BENGALI RUPEE MARK..BENGALI RUPEE SIGN
+09FB ; Sc # BENGALI GANDA MARK
+0AF1 ; Sc # GUJARATI RUPEE SIGN
+0BF9 ; Sc # TAMIL RUPEE SIGN
+0E3F ; Sc # THAI CURRENCY SYMBOL BAHT
+17DB ; Sc # KHMER CURRENCY SYMBOL RIEL
+20A0..20BF ; Sc # [32] EURO-CURRENCY SIGN..BITCOIN SIGN
+A838 ; Sc # NORTH INDIC RUPEE MARK
+FDFC ; Sc # RIAL SIGN
+FE69 ; Sc # SMALL DOLLAR SIGN
+FF04 ; Sc # FULLWIDTH DOLLAR SIGN
+FFE0..FFE1 ; Sc # [2] FULLWIDTH CENT SIGN..FULLWIDTH POUND SIGN
+FFE5..FFE6 ; Sc # [2] FULLWIDTH YEN SIGN..FULLWIDTH WON SIGN
+11FDD..11FE0 ; Sc # [4] TAMIL SIGN KAACU..TAMIL SIGN VARAAKAN
+1E2FF ; Sc # WANCHO NGUN SIGN
+1ECB0 ; Sc # INDIC SIYAQ RUPEE MARK
+
+# Total code points: 62
+
+# ================================================
+
+# General_Category=Modifier_Symbol
+
+005E ; Sk # CIRCUMFLEX ACCENT
+0060 ; Sk # GRAVE ACCENT
+00A8 ; Sk # DIAERESIS
+00AF ; Sk # MACRON
+00B4 ; Sk # ACUTE ACCENT
+00B8 ; Sk # CEDILLA
+02C2..02C5 ; Sk # [4] MODIFIER LETTER LEFT ARROWHEAD..MODIFIER LETTER DOWN ARROWHEAD
+02D2..02DF ; Sk # [14] MODIFIER LETTER CENTRED RIGHT HALF RING..MODIFIER LETTER CROSS ACCENT
+02E5..02EB ; Sk # [7] MODIFIER LETTER EXTRA-HIGH TONE BAR..MODIFIER LETTER YANG DEPARTING TONE MARK
+02ED ; Sk # MODIFIER LETTER UNASPIRATED
+02EF..02FF ; Sk # [17] MODIFIER LETTER LOW DOWN ARROWHEAD..MODIFIER LETTER LOW LEFT ARROW
+0375 ; Sk # GREEK LOWER NUMERAL SIGN
+0384..0385 ; Sk # [2] GREEK TONOS..GREEK DIALYTIKA TONOS
+1FBD ; Sk # GREEK KORONIS
+1FBF..1FC1 ; Sk # [3] GREEK PSILI..GREEK DIALYTIKA AND PERISPOMENI
+1FCD..1FCF ; Sk # [3] GREEK PSILI AND VARIA..GREEK PSILI AND PERISPOMENI
+1FDD..1FDF ; Sk # [3] GREEK DASIA AND VARIA..GREEK DASIA AND PERISPOMENI
+1FED..1FEF ; Sk # [3] GREEK DIALYTIKA AND VARIA..GREEK VARIA
+1FFD..1FFE ; Sk # [2] GREEK OXIA..GREEK DASIA
+309B..309C ; Sk # [2] KATAKANA-HIRAGANA VOICED SOUND MARK..KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
+A700..A716 ; Sk # [23] MODIFIER LETTER CHINESE TONE YIN PING..MODIFIER LETTER EXTRA-LOW LEFT-STEM TONE BAR
+A720..A721 ; Sk # [2] MODIFIER LETTER STRESS AND HIGH TONE..MODIFIER LETTER STRESS AND LOW TONE
+A789..A78A ; Sk # [2] MODIFIER LETTER COLON..MODIFIER LETTER SHORT EQUALS SIGN
+AB5B ; Sk # MODIFIER BREVE WITH INVERTED BREVE
+FBB2..FBC1 ; Sk # [16] ARABIC SYMBOL DOT ABOVE..ARABIC SYMBOL SMALL TAH BELOW
+FF3E ; Sk # FULLWIDTH CIRCUMFLEX ACCENT
+FF40 ; Sk # FULLWIDTH GRAVE ACCENT
+FFE3 ; Sk # FULLWIDTH MACRON
+1F3FB..1F3FF ; Sk # [5] EMOJI MODIFIER FITZPATRICK TYPE-1-2..EMOJI MODIFIER FITZPATRICK TYPE-6
+
+# Total code points: 121
+
+# ================================================
+
+# General_Category=Other_Symbol
+
+00A6 ; So # BROKEN BAR
+00A9 ; So # COPYRIGHT SIGN
+00AE ; So # REGISTERED SIGN
+00B0 ; So # DEGREE SIGN
+0482 ; So # CYRILLIC THOUSANDS SIGN
+058D..058E ; So # [2] RIGHT-FACING ARMENIAN ETERNITY SIGN..LEFT-FACING ARMENIAN ETERNITY SIGN
+060E..060F ; So # [2] ARABIC POETIC VERSE SIGN..ARABIC SIGN MISRA
+06DE ; So # ARABIC START OF RUB EL HIZB
+06E9 ; So # ARABIC PLACE OF SAJDAH
+06FD..06FE ; So # [2] ARABIC SIGN SINDHI AMPERSAND..ARABIC SIGN SINDHI POSTPOSITION MEN
+07F6 ; So # NKO SYMBOL OO DENNEN
+09FA ; So # BENGALI ISSHAR
+0B70 ; So # ORIYA ISSHAR
+0BF3..0BF8 ; So # [6] TAMIL DAY SIGN..TAMIL AS ABOVE SIGN
+0BFA ; So # TAMIL NUMBER SIGN
+0C7F ; So # TELUGU SIGN TUUMU
+0D4F ; So # MALAYALAM SIGN PARA
+0D79 ; So # MALAYALAM DATE MARK
+0F01..0F03 ; So # [3] TIBETAN MARK GTER YIG MGO TRUNCATED A..TIBETAN MARK GTER YIG MGO -UM GTER TSHEG MA
+0F13 ; So # TIBETAN MARK CARET -DZUD RTAGS ME LONG CAN
+0F15..0F17 ; So # [3] TIBETAN LOGOTYPE SIGN CHAD RTAGS..TIBETAN ASTROLOGICAL SIGN SGRA GCAN -CHAR RTAGS
+0F1A..0F1F ; So # [6] TIBETAN SIGN RDEL DKAR GCIG..TIBETAN SIGN RDEL DKAR RDEL NAG
+0F34 ; So # TIBETAN MARK BSDUS RTAGS
+0F36 ; So # TIBETAN MARK CARET -DZUD RTAGS BZHI MIG CAN
+0F38 ; So # TIBETAN MARK CHE MGO
+0FBE..0FC5 ; So # [8] TIBETAN KU RU KHA..TIBETAN SYMBOL RDO RJE
+0FC7..0FCC ; So # [6] TIBETAN SYMBOL RDO RJE RGYA GRAM..TIBETAN SYMBOL NOR BU BZHI -KHYIL
+0FCE..0FCF ; So # [2] TIBETAN SIGN RDEL NAG RDEL DKAR..TIBETAN SIGN RDEL NAG GSUM
+0FD5..0FD8 ; So # [4] RIGHT-FACING SVASTI SIGN..LEFT-FACING SVASTI SIGN WITH DOTS
+109E..109F ; So # [2] MYANMAR SYMBOL SHAN ONE..MYANMAR SYMBOL SHAN EXCLAMATION
+1390..1399 ; So # [10] ETHIOPIC TONAL MARK YIZET..ETHIOPIC TONAL MARK KURT
+166D ; So # CANADIAN SYLLABICS CHI SIGN
+1940 ; So # LIMBU SIGN LOO
+19DE..19FF ; So # [34] NEW TAI LUE SIGN LAE..KHMER SYMBOL DAP-PRAM ROC
+1B61..1B6A ; So # [10] BALINESE MUSICAL SYMBOL DONG..BALINESE MUSICAL SYMBOL DANG GEDE
+1B74..1B7C ; So # [9] BALINESE MUSICAL SYMBOL RIGHT-HAND OPEN DUG..BALINESE MUSICAL SYMBOL LEFT-HAND OPEN PING
+2100..2101 ; So # [2] ACCOUNT OF..ADDRESSED TO THE SUBJECT
+2103..2106 ; So # [4] DEGREE CELSIUS..CADA UNA
+2108..2109 ; So # [2] SCRUPLE..DEGREE FAHRENHEIT
+2114 ; So # L B BAR SYMBOL
+2116..2117 ; So # [2] NUMERO SIGN..SOUND RECORDING COPYRIGHT
+211E..2123 ; So # [6] PRESCRIPTION TAKE..VERSICLE
+2125 ; So # OUNCE SIGN
+2127 ; So # INVERTED OHM SIGN
+2129 ; So # TURNED GREEK SMALL LETTER IOTA
+212E ; So # ESTIMATED SYMBOL
+213A..213B ; So # [2] ROTATED CAPITAL Q..FACSIMILE SIGN
+214A ; So # PROPERTY LINE
+214C..214D ; So # [2] PER SIGN..AKTIESELSKAB
+214F ; So # SYMBOL FOR SAMARITAN SOURCE
+218A..218B ; So # [2] TURNED DIGIT TWO..TURNED DIGIT THREE
+2195..2199 ; So # [5] UP DOWN ARROW..SOUTH WEST ARROW
+219C..219F ; So # [4] LEFTWARDS WAVE ARROW..UPWARDS TWO HEADED ARROW
+21A1..21A2 ; So # [2] DOWNWARDS TWO HEADED ARROW..LEFTWARDS ARROW WITH TAIL
+21A4..21A5 ; So # [2] LEFTWARDS ARROW FROM BAR..UPWARDS ARROW FROM BAR
+21A7..21AD ; So # [7] DOWNWARDS ARROW FROM BAR..LEFT RIGHT WAVE ARROW
+21AF..21CD ; So # [31] DOWNWARDS ZIGZAG ARROW..LEFTWARDS DOUBLE ARROW WITH STROKE
+21D0..21D1 ; So # [2] LEFTWARDS DOUBLE ARROW..UPWARDS DOUBLE ARROW
+21D3 ; So # DOWNWARDS DOUBLE ARROW
+21D5..21F3 ; So # [31] UP DOWN DOUBLE ARROW..UP DOWN WHITE ARROW
+2300..2307 ; So # [8] DIAMETER SIGN..WAVY LINE
+230C..231F ; So # [20] BOTTOM RIGHT CROP..BOTTOM RIGHT CORNER
+2322..2328 ; So # [7] FROWN..KEYBOARD
+232B..237B ; So # [81] ERASE TO THE LEFT..NOT CHECK MARK
+237D..239A ; So # [30] SHOULDERED OPEN BOX..CLEAR SCREEN SYMBOL
+23B4..23DB ; So # [40] TOP SQUARE BRACKET..FUSE
+23E2..2426 ; So # [69] WHITE TRAPEZIUM..SYMBOL FOR SUBSTITUTE FORM TWO
+2440..244A ; So # [11] OCR HOOK..OCR DOUBLE BACKSLASH
+249C..24E9 ; So # [78] PARENTHESIZED LATIN SMALL LETTER A..CIRCLED LATIN SMALL LETTER Z
+2500..25B6 ; So # [183] BOX DRAWINGS LIGHT HORIZONTAL..BLACK RIGHT-POINTING TRIANGLE
+25B8..25C0 ; So # [9] BLACK RIGHT-POINTING SMALL TRIANGLE..BLACK LEFT-POINTING TRIANGLE
+25C2..25F7 ; So # [54] BLACK LEFT-POINTING SMALL TRIANGLE..WHITE CIRCLE WITH UPPER RIGHT QUADRANT
+2600..266E ; So # [111] BLACK SUN WITH RAYS..MUSIC NATURAL SIGN
+2670..2767 ; So # [248] WEST SYRIAC CROSS..ROTATED FLORAL HEART BULLET
+2794..27BF ; So # [44] HEAVY WIDE-HEADED RIGHTWARDS ARROW..DOUBLE CURLY LOOP
+2800..28FF ; So # [256] BRAILLE PATTERN BLANK..BRAILLE PATTERN DOTS-12345678
+2B00..2B2F ; So # [48] NORTH EAST WHITE ARROW..WHITE VERTICAL ELLIPSE
+2B45..2B46 ; So # [2] LEFTWARDS QUADRUPLE ARROW..RIGHTWARDS QUADRUPLE ARROW
+2B4D..2B73 ; So # [39] DOWNWARDS TRIANGLE-HEADED ZIGZAG ARROW..DOWNWARDS TRIANGLE-HEADED ARROW TO BAR
+2B76..2B95 ; So # [32] NORTH WEST TRIANGLE-HEADED ARROW TO BAR..RIGHTWARDS BLACK ARROW
+2B98..2BFF ; So # [104] THREE-D TOP-LIGHTED LEFTWARDS EQUILATERAL ARROWHEAD..HELLSCHREIBER PAUSE SYMBOL
+2CE5..2CEA ; So # [6] COPTIC SYMBOL MI RO..COPTIC SYMBOL SHIMA SIMA
+2E80..2E99 ; So # [26] CJK RADICAL REPEAT..CJK RADICAL RAP
+2E9B..2EF3 ; So # [89] CJK RADICAL CHOKE..CJK RADICAL C-SIMPLIFIED TURTLE
+2F00..2FD5 ; So # [214] KANGXI RADICAL ONE..KANGXI RADICAL FLUTE
+2FF0..2FFB ; So # [12] IDEOGRAPHIC DESCRIPTION CHARACTER LEFT TO RIGHT..IDEOGRAPHIC DESCRIPTION CHARACTER OVERLAID
+3004 ; So # JAPANESE INDUSTRIAL STANDARD SYMBOL
+3012..3013 ; So # [2] POSTAL MARK..GETA MARK
+3020 ; So # POSTAL MARK FACE
+3036..3037 ; So # [2] CIRCLED POSTAL MARK..IDEOGRAPHIC TELEGRAPH LINE FEED SEPARATOR SYMBOL
+303E..303F ; So # [2] IDEOGRAPHIC VARIATION INDICATOR..IDEOGRAPHIC HALF FILL SPACE
+3190..3191 ; So # [2] IDEOGRAPHIC ANNOTATION LINKING MARK..IDEOGRAPHIC ANNOTATION REVERSE MARK
+3196..319F ; So # [10] IDEOGRAPHIC ANNOTATION TOP MARK..IDEOGRAPHIC ANNOTATION MAN MARK
+31C0..31E3 ; So # [36] CJK STROKE T..CJK STROKE Q
+3200..321E ; So # [31] PARENTHESIZED HANGUL KIYEOK..PARENTHESIZED KOREAN CHARACTER O HU
+322A..3247 ; So # [30] PARENTHESIZED IDEOGRAPH MOON..CIRCLED IDEOGRAPH KOTO
+3250 ; So # PARTNERSHIP SIGN
+3260..327F ; So # [32] CIRCLED HANGUL KIYEOK..KOREAN STANDARD SYMBOL
+328A..32B0 ; So # [39] CIRCLED IDEOGRAPH MOON..CIRCLED IDEOGRAPH NIGHT
+32C0..33FF ; So # [320] IDEOGRAPHIC TELEGRAPH SYMBOL FOR JANUARY..SQUARE GAL
+4DC0..4DFF ; So # [64] HEXAGRAM FOR THE CREATIVE HEAVEN..HEXAGRAM FOR BEFORE COMPLETION
+A490..A4C6 ; So # [55] YI RADICAL QOT..YI RADICAL KE
+A828..A82B ; So # [4] SYLOTI NAGRI POETRY MARK-1..SYLOTI NAGRI POETRY MARK-4
+A836..A837 ; So # [2] NORTH INDIC QUARTER MARK..NORTH INDIC PLACEHOLDER MARK
+A839 ; So # NORTH INDIC QUANTITY MARK
+AA77..AA79 ; So # [3] MYANMAR SYMBOL AITON EXCLAMATION..MYANMAR SYMBOL AITON TWO
+FDFD ; So # ARABIC LIGATURE BISMILLAH AR-RAHMAN AR-RAHEEM
+FFE4 ; So # FULLWIDTH BROKEN BAR
+FFE8 ; So # HALFWIDTH FORMS LIGHT VERTICAL
+FFED..FFEE ; So # [2] HALFWIDTH BLACK SQUARE..HALFWIDTH WHITE CIRCLE
+FFFC..FFFD ; So # [2] OBJECT REPLACEMENT CHARACTER..REPLACEMENT CHARACTER
+10137..1013F ; So # [9] AEGEAN WEIGHT BASE UNIT..AEGEAN MEASURE THIRD SUBUNIT
+10179..10189 ; So # [17] GREEK YEAR SIGN..GREEK TRYBLION BASE SIGN
+1018C..1018E ; So # [3] GREEK SINUSOID SIGN..NOMISMA SIGN
+10190..1019B ; So # [12] ROMAN SEXTANS SIGN..ROMAN CENTURIAL SIGN
+101A0 ; So # GREEK SYMBOL TAU RHO
+101D0..101FC ; So # [45] PHAISTOS DISC SIGN PEDESTRIAN..PHAISTOS DISC SIGN WAVY BAND
+10877..10878 ; So # [2] PALMYRENE LEFT-POINTING FLEURON..PALMYRENE RIGHT-POINTING FLEURON
+10AC8 ; So # MANICHAEAN SIGN UD
+1173F ; So # AHOM SYMBOL VI
+11FD5..11FDC ; So # [8] TAMIL SIGN NEL..TAMIL SIGN MUKKURUNI
+11FE1..11FF1 ; So # [17] TAMIL SIGN PAARAM..TAMIL SIGN VAKAIYARAA
+16B3C..16B3F ; So # [4] PAHAWH HMONG SIGN XYEEM NTXIV..PAHAWH HMONG SIGN XYEEM FAIB
+16B45 ; So # PAHAWH HMONG SIGN CIM TSOV ROG
+1BC9C ; So # DUPLOYAN SIGN O WITH CROSS
+1D000..1D0F5 ; So # [246] BYZANTINE MUSICAL SYMBOL PSILI..BYZANTINE MUSICAL SYMBOL GORGON NEO KATO
+1D100..1D126 ; So # [39] MUSICAL SYMBOL SINGLE BARLINE..MUSICAL SYMBOL DRUM CLEF-2
+1D129..1D164 ; So # [60] MUSICAL SYMBOL MULTIPLE MEASURE REST..MUSICAL SYMBOL ONE HUNDRED TWENTY-EIGHTH NOTE
+1D16A..1D16C ; So # [3] MUSICAL SYMBOL FINGERED TREMOLO-1..MUSICAL SYMBOL FINGERED TREMOLO-3
+1D183..1D184 ; So # [2] MUSICAL SYMBOL ARPEGGIATO UP..MUSICAL SYMBOL ARPEGGIATO DOWN
+1D18C..1D1A9 ; So # [30] MUSICAL SYMBOL RINFORZANDO..MUSICAL SYMBOL DEGREE SLASH
+1D1AE..1D1E8 ; So # [59] MUSICAL SYMBOL PEDAL MARK..MUSICAL SYMBOL KIEVAN FLAT SIGN
+1D200..1D241 ; So # [66] GREEK VOCAL NOTATION SYMBOL-1..GREEK INSTRUMENTAL NOTATION SYMBOL-54
+1D245 ; So # GREEK MUSICAL LEIMMA
+1D300..1D356 ; So # [87] MONOGRAM FOR EARTH..TETRAGRAM FOR FOSTERING
+1D800..1D9FF ; So # [512] SIGNWRITING HAND-FIST INDEX..SIGNWRITING HEAD
+1DA37..1DA3A ; So # [4] SIGNWRITING AIR BLOW SMALL ROTATIONS..SIGNWRITING BREATH EXHALE
+1DA6D..1DA74 ; So # [8] SIGNWRITING SHOULDER HIP SPINE..SIGNWRITING TORSO-FLOORPLANE TWISTING
+1DA76..1DA83 ; So # [14] SIGNWRITING LIMB COMBINATION..SIGNWRITING LOCATION DEPTH
+1DA85..1DA86 ; So # [2] SIGNWRITING LOCATION TORSO..SIGNWRITING LOCATION LIMBS DIGITS
+1E14F ; So # NYIAKENG PUACHUE HMONG CIRCLED CA
+1ECAC ; So # INDIC SIYAQ PLACEHOLDER
+1ED2E ; So # OTTOMAN SIYAQ MARRATAN
+1F000..1F02B ; So # [44] MAHJONG TILE EAST WIND..MAHJONG TILE BACK
+1F030..1F093 ; So # [100] DOMINO TILE HORIZONTAL BACK..DOMINO TILE VERTICAL-06-06
+1F0A0..1F0AE ; So # [15] PLAYING CARD BACK..PLAYING CARD KING OF SPADES
+1F0B1..1F0BF ; So # [15] PLAYING CARD ACE OF HEARTS..PLAYING CARD RED JOKER
+1F0C1..1F0CF ; So # [15] PLAYING CARD ACE OF DIAMONDS..PLAYING CARD BLACK JOKER
+1F0D1..1F0F5 ; So # [37] PLAYING CARD ACE OF CLUBS..PLAYING CARD TRUMP-21
+1F110..1F16C ; So # [93] PARENTHESIZED LATIN CAPITAL LETTER A..RAISED MR SIGN
+1F170..1F1AC ; So # [61] NEGATIVE SQUARED LATIN CAPITAL LETTER A..SQUARED VOD
+1F1E6..1F202 ; So # [29] REGIONAL INDICATOR SYMBOL LETTER A..SQUARED KATAKANA SA
+1F210..1F23B ; So # [44] SQUARED CJK UNIFIED IDEOGRAPH-624B..SQUARED CJK UNIFIED IDEOGRAPH-914D
+1F240..1F248 ; So # [9] TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-672C..TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-6557
+1F250..1F251 ; So # [2] CIRCLED IDEOGRAPH ADVANTAGE..CIRCLED IDEOGRAPH ACCEPT
+1F260..1F265 ; So # [6] ROUNDED SYMBOL FOR FU..ROUNDED SYMBOL FOR CAI
+1F300..1F3FA ; So # [251] CYCLONE..AMPHORA
+1F400..1F6D5 ; So # [726] RAT..HINDU TEMPLE
+1F6E0..1F6EC ; So # [13] HAMMER AND WRENCH..AIRPLANE ARRIVING
+1F6F0..1F6FA ; So # [11] SATELLITE..AUTO RICKSHAW
+1F700..1F773 ; So # [116] ALCHEMICAL SYMBOL FOR QUINTESSENCE..ALCHEMICAL SYMBOL FOR HALF OUNCE
+1F780..1F7D8 ; So # [89] BLACK LEFT-POINTING ISOSCELES RIGHT TRIANGLE..NEGATIVE CIRCLED SQUARE
+1F7E0..1F7EB ; So # [12] LARGE ORANGE CIRCLE..LARGE BROWN SQUARE
+1F800..1F80B ; So # [12] LEFTWARDS ARROW WITH SMALL TRIANGLE ARROWHEAD..DOWNWARDS ARROW WITH LARGE TRIANGLE ARROWHEAD
+1F810..1F847 ; So # [56] LEFTWARDS ARROW WITH SMALL EQUILATERAL ARROWHEAD..DOWNWARDS HEAVY ARROW
+1F850..1F859 ; So # [10] LEFTWARDS SANS-SERIF ARROW..UP DOWN SANS-SERIF ARROW
+1F860..1F887 ; So # [40] WIDE-HEADED LEFTWARDS LIGHT BARB ARROW..WIDE-HEADED SOUTH WEST VERY HEAVY BARB ARROW
+1F890..1F8AD ; So # [30] LEFTWARDS TRIANGLE ARROWHEAD..WHITE ARROW SHAFT WIDTH TWO THIRDS
+1F900..1F90B ; So # [12] CIRCLED CROSS FORMEE WITH FOUR DOTS..DOWNWARD FACING NOTCHED HOOK WITH DOT
+1F90D..1F971 ; So # [101] WHITE HEART..YAWNING FACE
+1F973..1F976 ; So # [4] FACE WITH PARTY HORN AND PARTY HAT..FREEZING FACE
+1F97A..1F9A2 ; So # [41] FACE WITH PLEADING EYES..SWAN
+1F9A5..1F9AA ; So # [6] SLOTH..OYSTER
+1F9AE..1F9CA ; So # [29] GUIDE DOG..ICE CUBE
+1F9CD..1FA53 ; So # [135] STANDING PERSON..BLACK CHESS KNIGHT-BISHOP
+1FA60..1FA6D ; So # [14] XIANGQI RED GENERAL..XIANGQI BLACK SOLDIER
+1FA70..1FA73 ; So # [4] BALLET SHOES..SHORTS
+1FA78..1FA7A ; So # [3] DROP OF BLOOD..STETHOSCOPE
+1FA80..1FA82 ; So # [3] YO-YO..PARACHUTE
+1FA90..1FA95 ; So # [6] RINGED PLANET..BANJO
+
+# Total code points: 6161
+
+# ================================================
+
+# General_Category=Initial_Punctuation
+
+00AB ; Pi # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+2018 ; Pi # LEFT SINGLE QUOTATION MARK
+201B..201C ; Pi # [2] SINGLE HIGH-REVERSED-9 QUOTATION MARK..LEFT DOUBLE QUOTATION MARK
+201F ; Pi # DOUBLE HIGH-REVERSED-9 QUOTATION MARK
+2039 ; Pi # SINGLE LEFT-POINTING ANGLE QUOTATION MARK
+2E02 ; Pi # LEFT SUBSTITUTION BRACKET
+2E04 ; Pi # LEFT DOTTED SUBSTITUTION BRACKET
+2E09 ; Pi # LEFT TRANSPOSITION BRACKET
+2E0C ; Pi # LEFT RAISED OMISSION BRACKET
+2E1C ; Pi # LEFT LOW PARAPHRASE BRACKET
+2E20 ; Pi # LEFT VERTICAL BAR WITH QUILL
+
+# Total code points: 12
+
+# ================================================
+
+# General_Category=Final_Punctuation
+
+00BB ; Pf # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+2019 ; Pf # RIGHT SINGLE QUOTATION MARK
+201D ; Pf # RIGHT DOUBLE QUOTATION MARK
+203A ; Pf # SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
+2E03 ; Pf # RIGHT SUBSTITUTION BRACKET
+2E05 ; Pf # RIGHT DOTTED SUBSTITUTION BRACKET
+2E0A ; Pf # RIGHT TRANSPOSITION BRACKET
+2E0D ; Pf # RIGHT RAISED OMISSION BRACKET
+2E1D ; Pf # RIGHT LOW PARAPHRASE BRACKET
+2E21 ; Pf # RIGHT VERTICAL BAR WITH QUILL
+
+# Total code points: 10
+
+# EOF
diff --git a/test/coverage.txt b/test/coverage.txt
index 4e1dfee..2c75ec2 100644
--- a/test/coverage.txt
+++ b/test/coverage.txt
@@ -181,7 +181,6 @@ __x_ _x___
````````````````````````````````
-
## Code coverage
### `md_is_unicode_whitespace__()`