kc3-lang/libxkbcommon/test/keysym-case-mapping.h

    Commit

  • Author : Pierre Le Marre
    Date : 2024-12-09 09:21:01
    Hash : b9b4ab47
    Message : keysyms: Add sharp S upper case mapping exception

    The case mapping `ssharp` ß ↔ `U1E9E` ẞ was added in 13b30f4f0dccc08dfea426d73570b913596ed602 but was broken:
    - For the lower case mapping it returned the keysym `0x10000df`, which is an invalid Unicode keysym.
    - For the upper case mapping it returned the upper Unicode code point rather than the corresponding keysym.

    It did accidentally enable the detection of alphabetic key type for the pair (ß, ẞ) though. However this detection was accidentally removed in 5c7c79970a2800b6248e829464676e1f09c5f43d (v1.7) with an attempt to fix the wrong keysym case mapping.

    Finally both the *lower* case mapping and the key type detection were fixed for good when we implemented the complete Unicode simple case mappings and corresponding tests in e83d08ddbc9851944c662c18e86d4eb0eff23e68.

    However, the *upper* case mapping `ssharp` → `U1E9E` remained disabled. Indeed, ẞ is a relatively recent addition to Unicode (2008) and had no official recommendation, until recently. So while the lower mapping ẞ→ß exists in Unicode, its converse upper mapping does not. Yet since 2017 the Council for German Orthography (Rat für deutsche Rechtschreibung) recommends[^1] ẞ as the capitalization of ß.

    Due to its stability policies, the Unicode Character Database (UCD) that we use to generate our keysym case mappings (via ICU) cannot update the simple case mapping of ß. Discussions are currently ongoing in the Unicode mailing list[^2] and CLDR[^3] about how to deal with the new recommended case mapping. However, the discussions are oriented on text-processing and compatibility mappings, while libxkbcommon is on a rather lower level.

    It seems that the slow adoption of ẞ is partly due to the difficulty to type it. Since ẞ is used only for ALL CAPS casing, the expectation is to type it using CapsLock. While our detection of alphabetic key types works well[^4] for the pair (ß,ẞ), the *internal capitalization* currently does not work and is fixed by this commit.

    Added the ß → ẞ upper mapping:
    - Added an exception in the generation script
    - Fixed tests
    - Added documentation of the exceptions in `xkbcommon.h`
    - Added/updated log entries

    [^1]: https://www.rechtschreibrat.com/regeln-und-woerterverzeichnis/
    [^2]: https://corp.unicode.org/pipermail/unicode/2024-November/011162.html
    [^3]: https://unicode-org.atlassian.net/browse/CLDR-17624
    [^4]: Except libxkbcommon 1.7, see the second paragraph.
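
    The two bugs listed at the top of the message both come down to confusing code points with keysyms. As a rough illustration only, assuming the usual X11 convention (printable Latin-1 code points are their own keysyms, other code points are offset by `0x01000000`), the conversion could be sketched as below. All `sketch_`/`SKETCH_` names are hypothetical and not libxkbcommon API, and the sketch ignores the legacy keysym table and control characters.

```c
#include <stdbool.h>
#include <stdint.h>

/* Offset added to a code point to form a Unicode keysym (X11 convention). */
#define SKETCH_UNICODE_OFFSET 0x01000000u
/* Hypothetical "no keysym" value for this sketch. */
#define SKETCH_NO_SYMBOL      0u

/* A Unicode keysym encodes a code point in U+0100..U+10FFFF. */
static inline bool
sketch_is_unicode_keysym(uint32_t ks)
{
    return ks >= 0x01000100u && ks <= 0x0110ffffu;
}

/* Simplified code point → keysym conversion (assumption: no legacy table). */
static inline uint32_t
sketch_cp_to_keysym(uint32_t cp)
{
    /* Printable ASCII and Latin-1 code points are their own keysyms */
    if ((cp >= 0x20 && cp <= 0x7e) || (cp >= 0xa0 && cp <= 0xff))
        return cp;
    /* Control characters, surrogates and out-of-range values: unmapped here */
    if (cp < 0x100 || cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff))
        return SKETCH_NO_SYMBOL;
    /* Everything else becomes a Unicode keysym */
    return cp + SKETCH_UNICODE_OFFSET;
}
```

    Under these assumptions ß (U+00DF) maps to the keysym `0x00df` and ẞ (U+1E9E) to `0x01001e9e`, while `0x10000df` falls below the Unicode keysym range, which matches the "invalid Unicode keysym" described above.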

  • test/keysym-case-mapping.h
    // WARNING: This file is automatically generated by: scripts/update-unicode.py
    #ifndef KEYSYM_CASE_EXCEPTIONS_TEST_H
    #define KEYSYM_CASE_EXCEPTIONS_TEST_H
    
    #include <stdint.h>
    #include <unicode/uchar.h>
    
    /* Unicode code points used in case mapping exceptions */
    #define LATIN_SMALL_LETTER_SHARP_S   0x00df // ß
    #define LATIN_CAPITAL_LETTER_SHARP_S 0x1e9e // ẞ
    
    static inline uint32_t
    to_simple_lower(uint32_t cp)
    {
        return (uint32_t)u_tolower((UChar32) cp);
    }
    
    static inline uint32_t
    to_simple_upper(uint32_t cp)
    {
        switch (cp) {
        /* Some exceptions */
        case LATIN_SMALL_LETTER_SHARP_S:
            return LATIN_CAPITAL_LETTER_SHARP_S;
        /* Default to the Unicode simple mapping */
        default:
            return (uint32_t)u_toupper((UChar32) cp);
        }
    }
    
    #endif
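
The header above keeps the Unicode simple mapping as the default and overrides only ß. A minimal sketch of the resulting round trip follows, written so that it compiles without ICU: `sketch_toupper` and `sketch_tolower` are hypothetical ASCII-only stand-ins for `u_toupper` and `u_tolower` (plus the ẞ → ß lower mapping that Unicode does define), not the real functions.

```c
#include <stdint.h>

#define LATIN_SMALL_LETTER_SHARP_S   0x00df /* ß */
#define LATIN_CAPITAL_LETTER_SHARP_S 0x1e9e /* ẞ */

/* Stand-in for u_toupper(): ASCII only, so ß is left unchanged,
 * as with the Unicode simple upper case mapping. */
static inline uint32_t
sketch_toupper(uint32_t cp)
{
    return (cp >= 'a' && cp <= 'z') ? cp - 0x20 : cp;
}

/* Stand-in for u_tolower(): ASCII plus the ẞ → ß simple mapping. */
static inline uint32_t
sketch_tolower(uint32_t cp)
{
    if (cp == LATIN_CAPITAL_LETTER_SHARP_S)
        return LATIN_SMALL_LETTER_SHARP_S;
    return (cp >= 'A' && cp <= 'Z') ? cp + 0x20 : cp;
}

/* Same shape as to_simple_upper(): exception first, default otherwise. */
static inline uint32_t
sketch_simple_upper(uint32_t cp)
{
    switch (cp) {
    case LATIN_SMALL_LETTER_SHARP_S:
        return LATIN_CAPITAL_LETTER_SHARP_S;
    default:
        return sketch_toupper(cp);
    }
}

/* Returns 1 when upper-casing then lower-casing restores the input. */
static inline int
sketch_round_trips(uint32_t lower_cp)
{
    return sketch_tolower(sketch_simple_upper(lower_cp)) == lower_cp;
}
```

With the exception in place, ß upper-cases to ẞ and lower-cases back to ß; without it, the default mapping would leave ß unchanged.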