    Commit

  • Hash : bc3e464b
    Author : Pierre Le Marre
    Date : 2025-04-09T12:35:05

    keysyms: Fix Unicode handling
    
    - `xkb_utf32_to_keysym`: Allow [Unicode noncharacters]. There is no
      requirement to drop them and this would be the only function of our
      API doing so.
    
      From the Unicode Standard 16.0, section 23.7 “Noncharacters”:
    
      > Applications are free to use any of these noncharacter code points
      > internally. They have no standard interpretation when exchanged
      > outside the context of internal use. However, they are not illegal
      > in interchange, nor does their presence cause Unicode text to be
      > ill-formed.
    
      > If a noncharacter is received in open interchange, an application is
      > not required to interpret it in any way. It is good practice,
      > however, to recognize it as a noncharacter and to take appropriate
      > action, such as replacing it with `U+FFFD` REPLACEMENT CHARACTER,
      > to indicate the problem in the text.
    
      The key part is:
    
      > an application is not required to interpret it in any way
    
      Since we handle the reverse conversion with `xkb_keysym_to_utf32` just
      fine, I do not see a good motivation to keep this asymmetry. This is
      the only function with a special case for these code points.
    - `xkb_keysym_from_name`:
      - Unicode format `UNNNN`: allow control characters C0 and C1 and use
        `xkb_utf32_to_keysym` for the conversion when `NNNN < 0x100`, for
        backward compatibility.
      - Numeric hexadecimal format `0xNNNN`: *unchanged*. Contrary to the
        Unicode format, it does not normalize any keysym values in order to
        enable roundtrip with `xkb_keysym_get_name`.
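
The noncharacter rule described in the list above can be illustrated with a small self-contained sketch. It does not call libxkbcommon: `is_noncharacter`, `unicode_keysym` and `self_check` are hypothetical helpers written only for illustration. The sketch assumes just two things: the Unicode definition of the 66 noncharacters, and the Unicode keysym encoding (`0x01000000 + code point`) for code points from U+0100 upward. Control characters and the Latin-1 range are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offset of the Unicode keysym range. */
#define UNICODE_KEYSYM_OFFSET 0x01000000u

/* Hypothetical helper, not part of libxkbcommon: true for the 66 Unicode
 * noncharacters, i.e. U+FDD0..U+FDEF and the last two code points of each
 * of the 17 planes (U+FFFE, U+FFFF, U+1FFFE, ..., U+10FFFF). */
static bool is_noncharacter(uint32_t cp)
{
    if (cp > 0x10FFFFu)
        return false;
    return (cp >= 0xFDD0u && cp <= 0xFDEFu) || (cp & 0xFFFEu) == 0xFFFEu;
}

/* Hypothetical helper: Unicode keysym of a code point >= U+0100.
 * Noncharacters get no special case here, mirroring the behaviour
 * the commit message describes. */
static uint32_t unicode_keysym(uint32_t cp)
{
    return UNICODE_KEYSYM_OFFSET + cp;
}

/* Small self-check exercising both helpers. */
void self_check(void)
{
    assert(is_noncharacter(0xFFFEu));
    assert(!is_noncharacter(0xFFFDu));
    assert(unicode_keysym(0xFFFEu) == 0x0100FFFEu);
}
```

Under these assumptions, a noncharacter such as U+FFFE maps to the ordinary Unicode keysym `0x0100FFFE` like any other code point, which is the symmetry with `xkb_keysym_to_utf32` that the commit argues for.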
    
    Also added tests to ensure various properties and consistency.
    
    Note about *surrogates*: they are valid *code points* but invalid
    Unicode *scalar values*, i.e. they cannot be encoded in any Unicode
    encoding form (UTF-8, UTF-16, UTF-32). So their corresponding Unicode
    keysyms are valid, but:
    - cannot be used as input of `xkb_keysym_to_utf32` or `xkb_keysym_to_utf8`;
    - cannot be produced as output of `xkb_utf32_to_keysym`.
    Otherwise they are valid e.g. in the Unicode keysym notation.
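
The distinction between code points and scalar values in the note above can be sketched as follows. This is an illustration only, not libxkbcommon's implementation: `is_surrogate`, `is_scalar_value` and `unicode_keysym_to_scalar` are hypothetical names, and the decoder is simplified to the Unicode keysym range starting at U+0100.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define UNICODE_KEYSYM_OFFSET 0x01000000u
#define UNICODE_MAX           0x10FFFFu

/* Surrogates are the code points U+D800..U+DFFF. */
static bool is_surrogate(uint32_t cp)
{
    return cp >= 0xD800u && cp <= 0xDFFFu;
}

/* A Unicode scalar value is any code point that is not a surrogate. */
static bool is_scalar_value(uint32_t cp)
{
    return cp <= UNICODE_MAX && !is_surrogate(cp);
}

/* Hypothetical decoder: extract the scalar value from a Unicode keysym.
 * Returns false when the keysym is outside the (simplified) Unicode keysym
 * range or when it designates a surrogate, which has no UTF-32 encoding. */
static bool unicode_keysym_to_scalar(uint32_t keysym, uint32_t *out)
{
    if (keysym < UNICODE_KEYSYM_OFFSET + 0x100u ||
        keysym > UNICODE_KEYSYM_OFFSET + UNICODE_MAX)
        return false;
    uint32_t cp = keysym - UNICODE_KEYSYM_OFFSET;
    if (!is_scalar_value(cp))
        return false;
    *out = cp;
    return true;
}

/* Small self-check: a surrogate keysym is rejected, a noncharacter is not. */
void self_check(void)
{
    uint32_t cp = 0;
    assert(!unicode_keysym_to_scalar(0x0100D800u, &cp));
    assert(unicode_keysym_to_scalar(0x0100FFFEu, &cp) && cp == 0xFFFEu);
}
```

In this sketch the keysym value `0x0100D800` is a well-formed number in the Unicode keysym range, yet decoding it fails because U+D800 is not a scalar value. Noncharacters such as U+FFFE are scalar values, so they decode normally.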
    
    [Unicode noncharacters]: https://en.wikipedia.org/wiki/Universal_Character_Set_characters#Noncharacters
    

  • Properties

  • Git HTTP https://git.kmx.io/kc3-lang/libxkbcommon.git
    Git SSH git@git.kmx.io:kc3-lang/libxkbcommon.git
    Public access ? public
    Description

    keymap handling library for toolkits and window systems
