src


Log

Author Commit Date CI Message
Ran Benita e6aec067 2025-04-29T17:14:01 build: drop support for byacc It doesn't support `%define parse.error detailed` which we want to use. If needed, we can probably bring back support using some macro hackery. Signed-off-by: Ran Benita <ran@unusedvar.com>
Pierre Le Marre c2d3694b 2025-05-06T07:01:01 xkbcomp: Do not discard extra bits in vmod masks Since we accept numeric values for the vmod mask in the keymap, we may have extra bits set that encode *no* real/virtual modifier. Keep them unchanged for consistency. E.g. the following keymap: xkb_keymap { xkb_keycodes { <a> = 38; }; xkb_symbols { virtual_modifiers X = 0xf0000000; key <a> { [ SetMods(mods = 0x00001100) ] }; }; }; would compile to: xkb_keymap { xkb_keycodes { <a> = 38; }; xkb_symbols { virtual_modifiers X = 0xf0000000; // Internal state key <a> { [ SetMods(mods = 0xf0001000) ] }; // Serialization key <a> { [ SetMods(mods = 0x00001100) ] }; }; };
Pierre Le Marre dddffd51 2025-05-05T13:22:57 state: Fix virtual modifiers with non-real mod mapping Currently there are 2 issues with the handling of virtual modifiers in the keyboard state: 1. We assume that the input modifiers masks encode the indexes of all the modifiers of the keymap, but this is true only for the *real* modifiers (at least in xkbcommon and X11). Indeed, since the virtual modifiers *indexes* are implementation-specific, the input modifier masks merely *encode* the modifiers via their *mapping*. Consider the following keymap: ```c xkb_keymap { xkb_compat { virtual_modifiers M1 = 0x100; }; xkb_types { virtual_modifiers M2 = 0x200; }; }; ``` Now to illustrate, consider the following 2 implementation variants of libxkbcommon (assuming indexes 0-7 are the usual real modifiers): 1. Process `xkb_compat` then `xkb_types`. M1 and M2 have the respective indexes 8 and 9 and map to themselves (with the current assumption about mask denotation). 2. Process `xkb_types` then `xkb_compat`. M1 and M2 have the respective indexes 9 and 8 and map to each other. With the current `xkb_state_update_mask`, implementation 2 will swap M1 and M2 (compared to impl. 1) at each update! Indeed, we can see that `xkb_state_serialize_mods` doesn’t roundtrip via `xkb_state_update_mask`. 2. We assume that modifier masks use only bits denoting modifiers in the keymap, but when parsing the keymap we accept explicit virtual modifiers mapping of arbitrary values. E.g. if `M1` is the only virtual modifier and it is defined by: ```c virtual_modifiers M1 = 0x80000000; // 1 << (32 - 1) ``` then the 32th bit of a modifier mask input does *not* denote the 32th virtual modifier of the keymap, but merely the encoding of the mapping of `M1`. So when calling `xkb_state_update_mask`, we may discard some bits of the modifiers masks and end up with an incorrect state. These 2 issues may break interoperability with other implementations of XKB (e.g. kbvm) and make pure virtual modifiers handling fragile. We introduce the notion of *canonical state modifier mask*: the mask with the smallest population count that denotes all bits used to encode the modifiers in the keyboard state. It is equal to the bitwise OR of real modifiers mask and all the virtual modifiers mappings. This commit fixes the 2 issues by making *weaker* assumptions about the input modifier masks: 1. Modifiers may map to arbitrary values, not only real modifiers. 2. Input modifier masks merely encode modifiers via their *mapping*: - *real* modifiers map to themselves; - *virtual* modifiers map to the bitwise OR of their *explicit* mapping (via `virtual_modifiers`) and their *implicit* mapping (via keys’ real and virtual modmaps). - modifiers indexes are implementation-specific. Since the implementation before this commit also resolved virtual modifiers to their mappings, we continue doing so, but using only the bits that are *not* set in the canonical state modifier mask, so that we enable roundtrip of `xkb_state_serialize_mods` via `xkb_state_update_mask`. 3. Input modifier masks do not denote modifiers indexes (apart from real modifiers), so it is safe to discard only the bits that are not set in the canonical state modifier mask.
Pierre Le Marre d5b779e1 2025-05-06T21:07:28 keymap: Fix empty compat interpretation map serialization X11’s `xkbcomp` requires at least one compat interpretation entry.
Pierre Le Marre 87f9ac76 2025-05-06T21:02:23 keymap: Fix empty compat interpretation statement serialization Statements such as `interpret VoidSymbol {};` can cannot be parsed by X11’s `xkbcomp`. Fixed by using a dummy action.
Pierre Le Marre 230b6a6a 2025-05-06T14:35:26 Fix key type map entry with unbound vmod not ignored Currently we only ignore key type map entries with non-zero mods and with a zero modifier mask. However, the XKB protocol states ([source]): > Map entries which specify unbound virtual modifiers are not considered. So we currently handle `map[Unbound]` key type map entries (all modifiers unbound) but not `map[Bound+Unbound]` entries (mix of bound and unbound modifiers). Fixed by properly checking unbound modifiers on each key type map entry. This also fixes a test that was accidentally passing. [source]: https://www.x.org/releases/X11R7.7/doc/kbproto/xkbproto.html#:~:text=Map%20entries%20which%20specify%20unbound%20virtual%20modifiers,not%20considered
Pierre Le Marre f8148744 2025-05-06T11:26:21 Define the mapping of real modifiers explicitly When querying for a modifier mapping, we should treat all modifiers equally. So simply store real modifier mapping as we do for the virtual ones. Also fixed useless boolean conversions.
Pierre Le Marre e1aca42e 2025-05-05T12:06:18 state: Minor refactor - Move variable declaration close to their use. - Make them constant whenever possible.
Pierre Le Marre 8bc60ee3 2025-05-05T13:20:45 modifiers: Minor optimization It has low impact, but it also adds better semantics.
Pierre Le Marre cd512b8f 2025-05-02T19:21:09 x11: Fix capitalization transformation
Pierre Le Marre 411e9a6f 2025-04-28T06:56:19 ExprKeySym: add comment about error recovery
Pierre Le Marre 76683d92 2025-04-29T11:37:46 symbols: Fix clang-tidy false positive
Pierre Le Marre 95c5c859 2025-03-25T05:50:02 xkbcomp: Quote erroneous field in logging
Pierre Le Marre d66a65c2 2025-03-20T17:34:07 xkbcomp: Consistent keycodes logging
Pierre Le Marre 9b0b8c68 2025-04-15T19:53:28 xkbcomp: Stricter handling of default map include Before this commit, including a *default* map, i.e. without an explicit section name (e.g. `include "au"` vs `include "au(basic)"`) would match the first section of the first matching file in the XKB include paths, even if this section is not an *explicit* default map (i.e. tagged with `default`) but an *implicit* default map (i.e. the first map of the file, i.e. a weak match). It makes user configuration risky: say a user wants to create a custom version `au(custom)` of the `au` layout: - `./config/xkb/symbols/au`: custom layout in section “custom”. - `/usr/share/X11/xkb/symbols/au`: system layout, with *default* section “basic”. In this setup *any* layout that imports the default map from `au` would in fact import the *implicit* default map `au(custom)` instead of the *explicit* default map `au(basic)`. This incorrect behavior may thus break setups with multiple layouts. This is especially true for symbols files such as: `pc`, `us` or `latin`. Fixed by trying harder to found the exact default map, defaulting to the old behavior (weak match) only if no *explicit* default map (exact match) has been found in the XKB include paths.
Pierre Le Marre 44bcdb73 2025-04-13T10:24:13 symbols: Avoid keysyms allocation by stealing darray
Pierre Le Marre 9ede705b 2025-04-13T09:50:18 state: Capitalization transformation in xkb_state_key_get_syms Previously `xkb_state_key_get_syms()` did not perform capitalization tranformation, while `xkb_state_key_get_one_sym()` does perform it. This is unfortunate if we want to promote the use of multiple keysyms per levels. The API make it difficult to change now though: we return a pointer to an immutable array rather than filling a buffer. While we could use an internal buffer in `xkb_state`, this option would limit the API to *sequential* calls of `xkb_state_key_get_syms()` or require some buffer handling (e.g. rotation). Instead we now store the capitalization directly in `xkb_level`. We modified `xkb_level` like so (see below for discussion about the size): ```diff struct xkb_level { - unsigned int num_syms; + uint16_t num_syms; - unsigned int num_actions; + uint16_t num_actions; + union { + /** num_syms == 1: Upper keysym */ + xkb_keysym_t upper; + /** num_syms > 1: Indicate if `syms` contains the upper case + * keysyms after the lower ones. */ + bool has_upper; + }; union { xkb_keysym_t sym; /* num_syms == 1 */ xkb_keysym_t *syms; /* num_syms > 1 */ } s; union { union xkb_action action; /* num_actions == 1 */ union xkb_action *actions; /* num_actions > 1 */ } a; }; ``` - When `level.num_syms` <= 1, we store the upper keysym in `level.upper`. - Else if there no cased syms, we set `level.has_upper` to false. - Else if there are some cased syms, we set `level.has_upper`` to `true` and we double the original size of `level.s.syms`, but *without* modifying `level.num_syms`. We then append the transformed keysyms right after the original ones, so that we can access them by a simple pointer operation: `level.s.syms + level.num_syms`. The memory footprint is *unchanged*, thanks to the reduced fields for actions and keysyms counts.
Pierre Le Marre 9e93e5e5 2025-04-13T10:25:02 symbols: Restrict the number of actions and keysyms per level In preparation to support capitalization in `xkb_state_key_get_syms()`, this commit reduces the number of supported actions and keysyms per level, going from UINT_MAX to UINT16_MAX. This is most likely still more than enough and could be even reduced further, but deemed unnecessary at the moment: alignment of `struct xkb_level` is driven by the fields `a` and `s`. - Switched the item count type from `unsigned int` to `uint16_t`. - Introduced `xkb_{action,keysym}_count_t` type for the respective item count for exact typing. - Added relevant bounds checks.
Pierre Le Marre 53d80b87 2025-03-20T15:29:17 xkbcomp: Safer keycode max_key_code Since we now always keep the keycodes array at the minimal dimensions, `max_key_code` is redundant and error prone. Let’s use `darray_size` directly.
Pierre Le Marre 256be1ea 2025-03-25T08:13:21 xkbcomp: Fix merge mode for defaults actions - Keep defaults local: do not share accross includes. - Do not allocate default actions.
Pierre Le Marre b1865376 2025-03-25T07:46:11 xkbcomp: Fix merge mode for defaults in symbols
Pierre Le Marre a629aa84 2025-03-25T05:49:04 xkbcomp: Fix merge mode for defaults in compat
Pierre Le Marre af5296cf 2025-03-19T13:11:35 xkbcomp: Fix virtual mods merge modes
Pierre Le Marre 06c024e0 2025-03-19T13:11:35 xkbcomp: Fix merge modes Fix various issues with merge mode handling: - Invalid initialization - Invalid merge mode inherited from keymap - Do not leak local merge mode
Pierre Le Marre a1e595e7 2025-04-11T11:13:25 rules: Fix merging KcCGST values in layout order When using layout index ranges (e.g. special indexes “any” or “later”), the rules still match following the order in the rules file, so layout indexes may match without following their natural order. So the resulting KcCGST value should not be merged with the output until reaching the end of the rule set. Because the rule set may also involve options, it may match multiple times for the *same* layout index. So these multiple matches should not be merged together either, until reaching the end of the rule set. When reaching the end of the rule set, for each KcCGST component the pending values are then merged: for each layout, for each KcCGST value in the corresponding sequence, merge with the output. --- Example: ! model = symbols * = pc ! layout[any] option = symbols C 1 = +c1:%i C 2 = +c2:%i B 3 = skip B 4 = +b:%i The result of RMLVO {layout: "A,B,C", options: "4,3,2,1"} is: symbols = pc+b:2+c1:3+c2:3 - `skip` was dropped because it has no explicit merge mode; - although every rule was matched in order, the resulting order of the symbols follows the order of the layouts, so `+b` appears before `+c1` and `+c2`. - the relative order of the options for layout C follows the order within the rule set, not the order of RMLVO. Before this commit, the result would have been: symbols = pc+c1:3+c2:3+b:2
Pierre Le Marre 66f71890 2025-03-31T08:01:29 symbols: Enable writing keysyms list as UTF-8 strings Each Unicode code point of the string will be translated to their respective keysym, if possible. An empty string denotes `NoSymbol`. When such conversion is not possible, this will raise a syntax error. This introduces the following syntax: ```c // Empty string = `NoSymbol` key <1> {[""]}; // NoSymbol // Single code point = single keysym key <2> {["é"]}; // eacute // String = translate each code point to their respective keysym key <3> {["sßξك🎺"]}; // {s, ssharp, Greek_xi, Arabic_kaf, U1F3BA} // Mix string and keysyms key <4> {[{"ξ", Greek_kappa, "β"}]}; // { Greek_xi, Greek_kappa, Greek_beta} ``` It can also be used wherever a keysym is required, e.g. in `interpret` and `modifier_map` statements. In these cases a single keysym is expected, so the string should contain *exactly one* Unicode code point.
Pierre Le Marre ead3ce77 2025-03-28T21:44:27 scanner: Enable LRM and RLM marks for BiDi text Enable displaying bidirectional text in XKB files using: - U+200E LEFT-TO-RIGHT MARK - U+200F RIGHT-TO-LEFT MARK We now parse these marks as white space. As such, they are dropped; note that a later serialization may not display correctly without the marks, although it will parse. References: - https://www.w3.org/International/articles/inline-bidi-markup/uba-basics - https://www.w3.org/International/questions/qa-bidi-unicode-controls - https://www.unicode.org/reports/tr31/#Whitespace - https://www.unicode.org/reports/tr55/
Pierre Le Marre bc3e464b 2025-04-09T12:35:05 keysyms: Fix Unicode handling - `xkb_utf32_to_keysym`: Allow [Unicode noncharacters]. There is no requirement to drop them and this would be the only function of our API doing so. From the Unicode Standard 16.0, section 23.7 “Noncharacters”: > Applications are free to use any of these noncharacter code points > internally. They have no standard interpretation when exchanged > outside the context of internal use. However, they are not illegal > in interchange, nor does their presence cause Unicode text to be > ill-formed. > If a noncharacter is received in open interchange, an application is > not required to interpret it in any way. It is good practice, > however, to recognize it as a noncharacter and to take appropriate > action, such as replacing it with `U+FFFD` REPLACEMENT CHARACTER, > to indicate the problem in the text. The key part is: > an application is not required to interpret it in any way Since we handle the reverse conversion with `xkb_keysym_to_utf32` just fine, I do not see a good motivation to keep this asymmetry. This is the only function with a special case for these code points. - `xkb_keysym_from_name`: - Unicode format `UNNNN`: allow control characters C0 and C1 and use `xkb_utf32_to_keysym` for the conversion when `NNNN < 0x100`, for backward compatibility. - Numeric hexadecimal format `0xNNNN`: *unchanged*. Contrary to the Unicode format, it does not normalize any keysym values in order to enable roundtrip with `xkb_keysym_get_name`. Also added tests to ensure various properties and consistency. Note about *surrogates*: they are valid valid *code points* but invalid Unicode *scalar values*, i.e. they cannot be encoded in any Unicode encoding form (UTF-8, UTF-16, UTF-32). So their corresponding Unicode keysyms are valid, but: - cannot be used as input of `xkb_keysym_to_utf32` nor `xkb_keysym_to_utf8` - cannot result as output of `xkb_utf32_to_keysym`. Otherwise they are valid e.g. in the Unicode keysym notation. [Unicode noncharacters]: https://en.wikipedia.org/wiki/Universal_Character_Set_characters#Noncharacters
Pierre Le Marre 08d9a031 2025-04-08T06:31:33 Unicode: Make surrogate handling more explicit
Pierre Le Marre 5e557040 2025-04-09T11:17:00 xkbcomp: Fix Unicode escape sequence While the previous code correctly rejected malformed sequences such as `\u{` (incomplete) or `\u{123x}`, it should try to consume as much input as possible until reaching the corresponding closing `}` within the string. Else we can get leftovers and the error message does not reference the whole malformed sequence. Also added further tests with surrogates and noncharacters.
Pierre Le Marre ca798d21 2025-04-08T16:21:46 keysyms: Pad Unicode keysyms only up to 4 digits Previously there was a distinction between keysyms with code points in BMP and the others: the former used a 4-padding while the latter used a 8-padding: e.g `U0001` vs `U00010000`. This is unnecessary and makes the reading harder. Let’s use the same padding for all: `U0001` and `U10000`. Parsing remains unchanged and would parse both paddings. Also added a test to check no explicit name can clash with Unicode notation.
Pierre Le Marre 47c2c820 2025-04-08T18:09:41 Add internal API to get all explicit names of a keysym
Pierre Le Marre 102f4ba1 2025-04-06T19:38:53 Fix integer conversion warnings
Pierre Le Marre 5a32b779 2025-04-06T06:16:41 logging: Handle NULL map name Display “(unnamed map)” instead of “(null)”.
Pierre Le Marre 36442baa 2025-04-03T15:01:46 xkbcomp: Support multiple actions in interpret Before this commit we supported multiple actions per level, but not in *interpret* statements. Let’s fix this asymmetry, so we can equivalently assign all actions sets either implicitly or explicitly.
Pierre Le Marre 06394afc 2025-04-03T08:49:12 xkbcomp: Minor parser refactor for keysyms and actions
Pierre Le Marre 39c1bb36 2025-03-29T17:47:31 xkbcomp: Fix static_assert syntax
Pierre Le Marre f348c6e9 2025-04-05T12:48:50 logging: Quote invalid escape sequence
Pierre Le Marre 6d4cc135 2025-04-05T13:39:30 xkbcomp: Escape ASCII control characters
Pierre Le Marre 3d79f459 2025-03-29T11:46:34 xkbcomp: Add Unicode code point escape sequence \u{NNNN} Unicode code point escape sequences `\u{NNNN}` are replaced with the UTF-8 encoding of their corresponding code point `U+NNNN`, if legal. Supported Unicode code points are in the range `1‥0x10ffff`. Note that we will reject the `U+0000` NULL code point, as we reject it in the octal escape sequence `\0`. This is intended mainly for the upcoming feature to write keysyms as UTF-8 encoded strings. It can be used for various reasons: - avoid encoding issues; - avoid issue with font rendering (e.g. Asian scripts); - make white space or zero-width characters more readable.
Pierre Le Marre 23bbec96 2025-03-29T12:33:53 xkbcomp: Add escape sequence \" `\"` seems like a very natural extension. However it is not supported by Xorg xkbcomp, so do not emit it when serializing.
Pierre Le Marre 7d91a753 2025-03-29T12:24:39 xkbcomp: Enable xkbcomp-style octal escape sequences Xorg xkbcomp only parses octal sequences with `\0`, while xkbcommon does not force the `0` prefix of the numeric part. However, we only parsed up to to 3 digits, which does not allow to parse e.g. `\0377` while `\377` parses fine. Fixed by parsing up to 4 octal digits, while checking the result fits into a byte.
Pierre Le Marre 3d026436 2025-04-05T14:32:34 keymap serialization: Fix unchecked allocation failures The previous commit enabled clang-tidy to detect some missing checks.
Pierre Le Marre aa8b572e 2025-03-29T12:04:26 keymap serialization: Ensure escaping relevant chars Previously we would write characters without any escaping in some cases (e.g.: names of indicators, types and groups). E.g. the string "new\nline" would be serialized as: "new line" which would raise a syntax error if parsed. Fixed by escaping any string that was not escaped after parsing (e.g. the section names are safe already).
Pierre Le Marre d2f7b9cd 2025-04-04T17:29:35 rules: Do not use strto* parsers
Pierre Le Marre d5a91fa9 2025-04-04T16:38:16 xkbcomp: Use custom parsers instead of strtol* The use of `strtol*` functions was already restricted due to its slowness and its capacity to parse other stuff than digits (e.g. signs and spaces). There is also another *big* limitation: it requires a NULL-terminated string. This is incompatible with our functions that work on buffers, because we cannot guarantee this. This may lead to a memory violation if the last token is a number. We now roll out our own parsers, which are more efficients and compatible with buffers.
Pierre Le Marre 8594adc4 2025-03-31T13:52:36 doc: Mention that `alternate` merge mode is not supported
Pierre Le Marre 36bb4fe3 2025-04-02T19:10:02 xkbcomp: Minor renaming Use the same case for `KeySym` in the parser.
Pierre Le Marre 44480f7c 2025-04-01T08:28:02 xkbcomp: Enable lists of keysyms and actions {} and {a} Motivations: - Follow the principle of least astonishment; - Ensure consistency; - Enhance the use of custom defaults; - Facilitate the tests. There is some ambiguity because we use `{}` to denote both an empty list of keysyms and an empty list of actions. But as soon as we get a keysym or an action, we know whether it is a `MultiKeySymList` or a `MultiActionList`. So we just count the `{}` at the *beginning* using `NoSymbolOrActionList`, then replace it by the relevant count of `NoSymbol` or `NoAction()` once the ambiguity is solved. If not, this is a list of empties of *some* type: we drop those empties and delegate the type resolution using `ExprEmptyList()`.
Pierre Le Marre e09cbe66 2025-04-02T10:46:06 symbols: Fix handling of empty keys Before this commit, the following symbols: ```c xkb_symbols { virtual_modifiers M1, M2; key <A> {}; key <B> { [] }; key.vmods = M1; key <C> {}; key <D> { vmods = M2 }; }; ``` would be equivalent to: ```c xkb_symbols { virtual_modifiers M1,M2; key <B> { [ NoSymbol ] }; }; ``` `<B>` entry could be skipped but is harmless. However, `<C>` and `<D>` are missing, which would lead to the mapping resolution of `M1` and `M2` failing. After this commit, it is equivalent to: ```c virtual_modifiers M1,M2; key <C> { vmods = M1 }; key <D> { vmods = M2 }; ``` Empty keys are skipped entirely, but any explicit field: - is taken into account: previously they would be skipped if there were no group; - forces the key to be printed at serialization.
Pierre Le Marre 2e0245f8 2025-04-02T10:45:44 xkbcomp: Enable more empty lists - Empty `interpret` - Empty key `type` - Empty `indicator` Motivations: - Follow the principle of least astonishment; - Ensure consistency; - Enhance the use of custom defaults; - Facilitate the tests.
Pierre Le Marre 6881fb32 2025-04-01T08:28:02 xkbcomp: Drop trailing NoSymbol and NoAction() This brings us closer to what `xkbcomp` outputs. One should use the explicit `VoidSymbol` instead of `NoSymbol`, in order to avoid dropping empty levels. This may affect keys that rely on an *implicit* key type. Example: - Input: ```c key <> { [a, A, NoSymbol] }; ``` - Compilation with xkbcommon \< 1.9.0: ```c key <> { type= "FOUR_LEVEL_SEMIALPHABETIC", [a, A, NoSymbol, NoSymbol] }; ``` - Compilation with xkbcommon ≥ 1.9.0: ```c key <> { type= "ALPHABETIC", [a, A] }; ```
Pierre Le Marre 7dbd2576 2025-04-01T19:20:10 keymap: Use constants for Lock and Control indexes These indexes are fixed, so there is no need to lookup their name.
Pierre Le Marre 55e99f0a 2025-04-01T09:03:25 keymap: refactor ClearLevelInfo
Pierre Le Marre 8ba5c453 2025-03-30T10:07:10 xkbcomp: Use section reference as default section name Before this commit the following keymap: ```c xkb_keymap { xkb_keycode {}; }; ``` would result in (boilerplate removed): ```c xkb_keymap { xkb_keycode "(unnamed)" {}; }; ``` This is both useless and wasting allocation: section names are optional, so we should just remove this default name altogether and keep it undefined, as in the original keymap. The situation is a bit different if there is an include, as for keymaps created from RMLVO names. Before this commit, the following keymap: ```c xkb_keymap { xkb_keycode { include "evdev+aliases(qwerty)" }; }; ``` would result in (boilerplate removed): ```c xkb_keymap { xkb_keycode "(unnamed)" { … }; }; ``` With this commit we now follow the Xorg xkbcomp style by using the section reference (the include string) as the *default* section name. So the previous example would now result in: ```c xkb_keymap { xkb_keycode "evdev_aliases(qwerty)" { … }; }; ``` which is useful to give a hint of the original include. Note that if the original section had a name, it would preserve it: ```c xkb_keymap { xkb_keycode "test" { include "evdev+aliases(qwerty)" }; }; ``` would compile to: ```c xkb_keymap { xkb_keycode "test" { … }; }; ```
Pierre Le Marre 3150bca8 2025-03-30T09:54:02 xkbcomp: Make all components optional We already accept *empty* components, such as: `xkb_compat {};`. Let’s accept missing components as well, so that we can reduce the boilerplate in our tests. Note that we will still explicitly serialize empty components for compatibility with previous xkbcommon versions and Xorg xkbcomp.
Pierre Le Marre 23598fa1 2025-03-25T22:52:06 Enable merge mode “replace” in include statements Previously only the merge modes “override” and “augment” were available in include statements, using the prefix ‘+’ and ‘|’ respectively. While on one hand `replace` include statement can be used in keymap files, on the other hand *rules* files have no way to express the *replace* mode. This commit enables the merge mode “replace” using the prefix `^`. This prefix was chosen due to its similarity with the `XOR` bit operator, which convey *mutual exclusion*. Other candidates: - `!` conveys some kind of higher precedence, akin to CSS `!important`. But it conflicts with the section header `!`, which is a token in the current parser. It would require special handling, not worth it. It also convey the meaning of negation, which is confusing. - `&` has the advantage of not corresponding to a token in the rules parser. `^` seems however to stand out more and it is less likely to trigger erroneous comparison with `|` and `&` bit operators.
Pierre Le Marre 6fc6e64b 2025-03-26T10:35:22 rules: Added extended wild cards <none>, <some> and <any> Added the following wild cards to the rules file syntax, in addition to the current `*` legacy wild card: - `<none>`: Match *empty* value. - `<some>`: Match *non-empty* value. - `<any>`: Match *any* (optionally empty) value. Its behavior does not depend on the context, contrary to the legacy wild card `*`. This will enable writing much simpler rules, see [!764] for an example of tricky rules in the `xkeyboard-config` project, that would benefit from the new wild cards. [!764]: https://gitlab.freedesktop.org/xkeyboard-config/xkeyboard-config/-/merge_requests/764 The verbose wild cards are preferred to single characters: - More intuitive: self-explanatory. - Does not steal syntax from other token. - Extensible syntax, should we need it. A previous proposal used the characters (`!`, `+`, `?`) for their similarity with the corresponding syntax of regular expressions (negative assertion & quantifiers), in line with `*`. But `!` is not that intuitive after all and conflict with its role as section header. Furthermore, `+` is also used as a merge mode. Finally, nothing beats whole short words for readability.
Pierre Le Marre 500b260b 2025-03-28T09:38:58 xkbcomp: Fix parser failure on floating-point numbers Before this commit we used `strtold`, which depends on the locale. But the XKB syntax is fixed and uses a period as decimal separator. So ensure the syntax is correct without relying on `strtold` and truncate the result, as the parser does not use floating-point numbers.
Pierre Le Marre d7e112fe 2025-03-29T19:44:13 registry: Added support for libxml2 2.14+ `libxml2-2.14+` now disallows parsing trailing `NULL` bytes, so don’t. This is backward-compatible with previous versions of the library.
Pierre Le Marre cc95f217 2025-03-25T11:15:45 xkbcomp: Fix whichGroupState serialization This indicator field was previously looked up in the wrong table, resulting the erroneous serialization `(null)`.
Pierre Le Marre 9a5547ce 2025-03-28T11:01:18 symbols: Fix leak in HandleSymbolsDef
Pierre Le Marre 8e92f25e 2025-03-13T21:26:59 rules: Added xkb_components_names_from_rules() This is mainly for debugging purposes and to enable displaying KcCGST values from RMLVO resolution in `xkbcli compile-keymap --kccgst`.
Pierre Le Marre f3a4eeaa 2025-03-26T16:04:39 symbols: Improve keysym parsing
Pierre Le Marre e5401b07 2025-03-26T16:02:58 symbols: Improve Modmap parsing Parse, dont’t validate: ensure *at parsing* that `modifier_map` definitions use a list of keys and keysyms. This enables to remove the redundant `ExprResolveKeySym` and have keysym parsing exclusively in handled in `parser.y`.
Pierre Le Marre 70d11abd 2025-03-26T07:38:05 messages: Add file encoding and invalid syntax entries Added: - `XKB_ERROR_INVALID_FILE_ENCODING` - `XKB_ERROR_INVALID_RULES_SYNTAX` - `XKB_ERROR_INVALID_COMPOSE_SYNTAX` Changed: - `XKB_ERROR_INVALID_SYNTAX` renamed to `XKB_ERROR_INVALID_XKB_SYNTAX`.
Pierre Le Marre fe0d3742 2025-03-18T14:41:30 registry: Fix typo in variable declaration
Pierre Le Marre e8561909 2025-03-18T14:34:10 xkbcomp: Fix keycodes bounds - Refactor to check conflicts first for the key names and then for the keycodes. This seems more useful for the user and enable further memory optimizations. - Do not allocate until we are sure to add the keycode. The bounds are only updated afterwards, so the call to `FindKeyByName` should be more efficient. - Fixed keycodes bounds not shrunk correctly when an existing keycode is overridden. - Do not prepare keyname strings for logging if we are not going to use them.
Pierre Le Marre 4e90cb9c 2025-03-17T07:02:07 xkbcomp: Improve logging of virtual modifiers When logging about virtual modifier *explicit* mappings, we should always use only real modifiers or hexadecimal numbers to print the mask. Consider: ``` virtual_modifiers M1, M2=0x200, M2=0x400; ``` Before this commit we would get the following warning: ``` WARNING: Virtual modifier M2 defined multiple times; Using M2, ignoring M1 ``` while we would prefer the less confusing: ``` WARNING: Virtual modifier M2 defined multiple times; Using 0x400, ignoring 0x200 ```
Pierre Le Marre 02b32244 2025-03-10T13:19:37 registry: Use safer contextual libxml2 functions Avoid using functions depreacted in 2.13 and 2.14 libxml releases. Follow-up of: 5f1b06b7497dc0e69ecfef8a934bce01f4f7fe8e.
Ran Benita 9953d9f0 2025-03-10T21:58:05 xkbcomp/ast-build: fix possible UB in expr AST node allocations (#659) The expression AST constructors all return `ExprDef *`. `ExprDef` is a union of all expr types. As a memory optimization, instead of allocating `sizeof(ExprDef)`, we only allocate the size of the actual type (e.g. `sizeof(ExprBinary)`) which is sometimes smaller than `sizeof(ExprDef)`. This is probably undefined behavior, and gcc (with optimization turned on) complains about it, for example: src/xkbcomp/ast-build.c:69:23: warning: array subscript ‘ExprDef[0]’ is partly outside array bounds of ‘unsigned char[24]’ [-Warray-bounds=] Since it doesn't save that much memory, drop this optimization. Fix #292. Signed-off-by: Ran Benita <ran@unusedvar.com>
Julian Orth cb3565b1 2025-03-07T17:59:33 keymap: Fix segfault due to invalid group wrapping The modular arithmetic is incorrect for negative values, e.g. for num_groups = 1. It triggers a segfault for the following settings: - layouts count (per key or total) N: `N > 0`, and - layout index n: `n = - k * N` (`k > 0`) % returns the *remainder* of the division, not the modulus (see C11 standard 6.5.5 “Multiplicative operators”: a % b = a - (a/b)*b. While both operators return the same result for positive operands, they do not for e.g. a negative dividend: remainder may be negative (in the open interval ]-num_groups, num_groups[) while the modulus is always positive. So if the remainder is negative, we must add `num_groups` to get the modulus. Fixes: 67b03cea ("state: correctly wrap state->locked_group and ->group") Signed-off-by: Julian Orth <ju.orth@gmail.com> Co-authored-by: Julian Orth <ju.orth@gmail.com> Co-authored-by: Pierre Le Marre <dev@wismill.eu>
Pierre Le Marre e1892266 2025-02-13T16:57:46 clang-tidy: Miscellaneous fixes
Pierre Le Marre 5ef19bcf 2025-02-13T17:58:21 xkbcomp: Fix incorrect memory comparison Spotted by clang-tidy: ``` ../src/keymap-priv.c:146:16: warning: comparing object representation of type 'union xkb_action' which does not have a unique object representation; consider comparing the members of the object manually [bugprone-suspicious-memory-comparison] ```` Indeed, from “EXP42-C. Do not compare padding data”[^1]: > unnamed members of objects of structure and union type do not participate > in initialization. Unnamed members of structure objects have indeterminate > representation even after initialization. Fixed by using a dedicated comparison function for `union xkb_action`. [^1]: https://wiki.sei.cmu.edu/confluence/display/c/EXP42-C.+Do+not+compare+padding+data
Pierre Le Marre 558447d8 2025-02-11T17:34:27 xkbcomp: Explicit vars initialization The `Resolve*` functions do not always initialize the parameters that they can modify, so it is safer to always initialize them at the call site.
Pierre Le Marre 97698fca 2025-02-11T17:34:23 xkbcomp: Use explicit int sizes for Expr resolution
Pierre Le Marre 2d94da3d 2025-02-11T17:34:15 xkbcomp: Fix the int type of ExprInteger Avoid implicit conversion from `int64_t`.
Pierre Le Marre 350931ad 2025-02-12T14:20:58 xkbcomp: Fix compat group index
Pierre Le Marre f2dd0302 2025-02-12T14:15:26 xkbcomp: Fix LED index int type
Pierre Le Marre 2d111bbe 2025-02-12T13:54:51 xkbcomp: Fix possible overflow in numbers parser
Pierre Le Marre a4038a47 2025-02-12T09:50:36 Miscellaneous int types fixes
Pierre Le Marre e3108182 2025-02-12T09:49:25 Fix int casts related to layout index - XkbWrapGroupIntoRange - State
Pierre Le Marre 14a816e5 2025-02-11T18:41:08 xkbcomp: Fix int cast
Pierre Le Marre 6c9806ae 2025-02-12T07:46:07 xkbcomp: Fix ExprResolveMaskLookup error message
Pierre Le Marre 3a0b77f0 2025-02-12T16:41:09 xkbcomp: Fix parser headers
Pierre Le Marre 16e3e3cb 2025-02-12T16:12:42 utils: Fix missing headers
Pierre Le Marre ce9bcbe0 2025-02-07T16:31:37 scripts: Rename keysyms-related files Previous names were too generic. Fixed by using explicit names and add the `.py` file extension.
Ran Benita bb5e464e 2025-02-07T14:30:51 xkbcomp/expr: remove comment on ExprResolveIntegerLookup What it says mostly no longer holds, I think it's more confusing than helpful now. Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita aa3e4c71 2025-02-07T14:10:16 xkbcomp/expr: remove unused ExprResolveKeyCode This function was added in commit 4e22851141d89436c0659e68da57e91bbf971461. But that commit also changed the grammar: -KeyNameDecl : KeyName EQUALS Expr SEMI +KeyNameDecl : KeyName EQUALS KeyCode SEMI i.e. while before you could write <AE01> = 9+1; now this is a syntax error, an integer literal is expected. I'm not sure if it was intended to remove this ability. In any case, this rendered `ExprResolveKeyCode` useless since there's no longer an expression to evaluate, and after some refactoring it went unused. Even if we choose to restore Expr here, I don't see a reason for the specialized function over `ExprResolveInteger` except the type (which we should probably widen from int to int64_t...). So remove it. Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita a4d782c7 2025-02-06T16:00:19 xkbcomp/ast: remove ExprCommon It's now empty and no longer serves a purpose. Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita 9d7eb849 2025-02-06T15:25:03 xkbcomp/ast: combine expr_value_type into stmt_type This field is a funky attempt at type inference, or perhaps some optimization? Anyway, after careful examination I conclude it serves no purpose except specifying the type of a literal (string/integer/float/boolean/keyname) when `STMT_EXPR_VALUE` (i.e. literal). Remove it and replace `STMT_EXPR_VALUE` with specific statement types for each literal type. Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita d9fc01b3 2025-02-06T15:12:53 xkbcomp/ast: combine expr_op_type into stmt_type It's better to have a single AST type enum. Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita 635c48f8 2025-02-06T14:47:15 xkbcomp: remove unused EXPR_TYPE_ACTION Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita a5c2c746 2025-02-06T12:27:25 Fix a few direct modifications to generated files Accidentally modified generated files in a couple of recent changes. Sync the templates. Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita f4e95280 2025-02-02T22:29:05 xkbcomp/scanner: avoid unneeded strdup of IDENT tokens The allocation is immediately discarded, either turned into a keysym or an atom. So use an sval slice into the input string instead strdup'ing. memusage ./release/bench-compile-keymap --iter=1000 --layout us,de --variant ,neo Before: Memory usage summary: heap total: 534063576, heap peak: 581022, stack peak: 18848 total calls total memory failed calls malloc| 11240525 291897104 0 realloc| 1447657 192307328 0 (nomove:37629, dec:0, free:0) calloc| 430573 49859144 0 free| 13993903 534063576 After: Memory usage summary: heap total: 506839909, heap peak: 581022, stack peak: 18960 total calls total memory failed calls malloc| 8016419 264673437 0 realloc| 1447657 192307328 0 (nomove:37278, dec:0, free:0) calloc| 430573 49859144 0 free| 10769797 506839909 Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita 113ac304 2025-01-25T03:18:01 meson: link tests and benches against shared library, not static library This makes the tests, and especially benches, more realistic, since xkbcommon is almost always used as a shared library. Also significantly reduces the build time with LTO enabled (for me, from 90s to 30s). Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita a380ba52 2025-01-25T07:00:43 Move XKB_EXPORT to headers The Windows dllexport annotation wants to be on the declarations, not the definitions. Signed-off-by: Ran Benita <ran@unusedvar.com>
Pierre Le Marre c051d0ae 2025-02-05T14:09:17 Replace include guards by `#pragma once` (again) Follow-up of df2322d70cb0922f67c41db639a5c70733b4ce66.
Ran Benita df2322d7 2025-02-05T14:41:21 Replace include guards by `#pragma once` We currently have a mix of include headers, pragma once and some missing. pragma once is not standard but is widely supported, and we already use it with no issues, so I'd say it's not a problem. Let's convert all headers to pragma once to avoid the annoying include guards. The public headers are *not* converted. Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita d75702ba 2025-01-25T06:14:55 utils: remove ifdefs for some ancient or obscure compilers If someone needs this, they can let us know... Signed-off-by: Ran Benita <ran@unusedvar.com>