test/buffercomp.c


Log

Author Commit Date CI Message
Pierre Le Marre ac9cd053 2025-06-11T19:00:47 test: Check extended layout indexes
Pierre Le Marre 7f39be25 2025-06-10T15:46:45 test: Use explicit keymap output format for test_compile_output()
Pierre Le Marre 0f89ad97 2025-06-09T19:26:13 dump: Always use numeric group indexes The upcoming raise of the maximum groups count will require to use numeric group indexes instead of the syntax `GroupN` if groups > 8. Let’s not bother with handling two cases (group count ≤ 8 or > 8) and always serialize group indexes as numeric values.
Pierre Le Marre f3386743 2025-06-09T16:44:54 test: Use explicit keymap format in test_compile_output()
Pierre Le Marre 2acf5eca 2025-06-09T16:26:56 test: Use explicit keymap format in test_compile_buffer()
Pierre Le Marre 79e95509 2025-06-09T11:07:36 test: Use explicit keymap format in test_compile_rules()
Pierre Le Marre 39b4b670 2025-06-06T18:40:29 Support including keymap components using %-expansion and absolute path Enable to use the same `include` features than *rules* files in *keymap components*: - *`%`-expansion*: `%H` home directory, `%S` sytem root and `%E` extra. - absolute file paths. This is useful if one wants to overwrite the system file with a user config (i.e. same name, but in `~/.config/xkb`), but still include the system file: ``` // File: ~/.config/xkb/symbols/de xkb_symbols "basic" { include "%S/de(basic)" key <AB01> { [z, Z] }; key <AD06> { [y, Y] }; } ```` Without the commit, using a mere `include "de(basic)"` would result in an include loop. Refactored by using the same code for rules and keymap components.
Pierre Le Marre 9b4fd82b 2025-05-13T11:46:46 test: Skip checked arithmetic if not available
Pierre Le Marre fb9fec18 2025-05-10T10:18:38 xkbcomp: Checked arithmetic Use a polyfill for C23 checked arithmetic. This is a bit paranoid, as we expect the user to use only 32 bit integers, so the signed 64 bit integer we use to store the result should be more than enough. Use jtckdint v1.0: - repository: https://github.com/jart/jtckdint - commit: 339450d13d8636f05dcb71ba36efddb226db481e - removed all C++-specific code
Pierre Le Marre 22d27277 2025-05-10T10:12:31 actions: Reject arguments if they are not expected `NoAction`, `VoidAction` and `TerminateServer` do not accept arguments.
Pierre Le Marre d239a3f0 2025-05-11T11:42:20 actions: Improve unsupported legacy X11 actions handling - Display a warning - Document drawbacks of degrading to `NoAction()`
Pierre Le Marre b4c89600 2025-05-09T15:15:10 actions: Add VoidAction(), mirroring NoSymbol/VoidSymbol. Added `VoidAction()` action to match the keysym pair `NoSymbol` / `VoidSymbol`. It enables overriding a previous action and breaks latches. This is a libxkbcommon extension. When serializing it will be converted to `LockControls(controls=none,affect=neither)` for backward compatibility. We cannot serialize it to `NoAction()`, as it would be dropped in e.g. the context of multiple actions.
Pierre Le Marre c2d3694b 2025-05-06T07:01:01 xkbcomp: Do not discard extra bits in vmod masks Since we accept numeric values for the vmod mask in the keymap, we may have extra bits set that encode *no* real/virtual modifier. Keep them unchanged for consistency. E.g. the following keymap: xkb_keymap { xkb_keycodes { <a> = 38; }; xkb_symbols { virtual_modifiers X = 0xf0000000; key <a> { [ SetMods(mods = 0x00001100) ] }; }; }; would compile to: xkb_keymap { xkb_keycodes { <a> = 38; }; xkb_symbols { virtual_modifiers X = 0xf0000000; // Internal state key <a> { [ SetMods(mods = 0xf0001000) ] }; // Serialization key <a> { [ SetMods(mods = 0x00001100) ] }; }; };
Pierre Le Marre 9b0b8c68 2025-04-15T19:53:28 xkbcomp: Stricter handling of default map include Before this commit, including a *default* map, i.e. without an explicit section name (e.g. `include "au"` vs `include "au(basic)"`) would match the first section of the first matching file in the XKB include paths, even if this section is not an *explicit* default map (i.e. tagged with `default`) but an *implicit* default map (i.e. the first map of the file, i.e. a weak match). It makes user configuration risky: say a user wants to create a custom version `au(custom)` of the `au` layout: - `./config/xkb/symbols/au`: custom layout in section “custom”. - `/usr/share/X11/xkb/symbols/au`: system layout, with *default* section “basic”. In this setup *any* layout that imports the default map from `au` would in fact import the *implicit* default map `au(custom)` instead of the *explicit* default map `au(basic)`. This incorrect behavior may thus break setups with multiple layouts. This is especially true for symbols files such as: `pc`, `us` or `latin`. Fixed by trying harder to found the exact default map, defaulting to the old behavior (weak match) only if no *explicit* default map (exact match) has been found in the XKB include paths.
Pierre Le Marre 66f71890 2025-03-31T08:01:29 symbols: Enable writing keysyms list as UTF-8 strings Each Unicode code point of the string will be translated to their respective keysym, if possible. An empty string denotes `NoSymbol`. When such conversion is not possible, this will raise a syntax error. This introduces the following syntax: ```c // Empty string = `NoSymbol` key <1> {[""]}; // NoSymbol // Single code point = single keysym key <2> {["é"]}; // eacute // String = translate each code point to their respective keysym key <3> {["sßξك🎺"]}; // {s, ssharp, Greek_xi, Arabic_kaf, U1F3BA} // Mix string and keysyms key <4> {[{"ξ", Greek_kappa, "β"}]}; // { Greek_xi, Greek_kappa, Greek_beta} ``` It can also be used wherever a keysym is required, e.g. in `interpret` and `modifier_map` statements. In these cases a single keysym is expected, so the string should contain *exactly one* Unicode code point.
Pierre Le Marre ead3ce77 2025-03-28T21:44:27 scanner: Enable LRM and RLM marks for BiDi text Enable displaying bidirectional text in XKB files using: - U+200E LEFT-TO-RIGHT MARK - U+200F RIGHT-TO-LEFT MARK We now parse these marks as white space. As such, they are dropped; note that a later serialization may not display correctly without the marks, although it will parse. References: - https://www.w3.org/International/articles/inline-bidi-markup/uba-basics - https://www.w3.org/International/questions/qa-bidi-unicode-controls - https://www.unicode.org/reports/tr31/#Whitespace - https://www.unicode.org/reports/tr55/
Pierre Le Marre bc3e464b 2025-04-09T12:35:05 keysyms: Fix Unicode handling - `xkb_utf32_to_keysym`: Allow [Unicode noncharacters]. There is no requirement to drop them and this would be the only function of our API doing so. From the Unicode Standard 16.0, section 23.7 “Noncharacters”: > Applications are free to use any of these noncharacter code points > internally. They have no standard interpretation when exchanged > outside the context of internal use. However, they are not illegal > in interchange, nor does their presence cause Unicode text to be > ill-formed. > If a noncharacter is received in open interchange, an application is > not required to interpret it in any way. It is good practice, > however, to recognize it as a noncharacter and to take appropriate > action, such as replacing it with `U+FFFD` REPLACEMENT CHARACTER, > to indicate the problem in the text. The key part is: > an application is not required to interpret it in any way Since we handle the reverse conversion with `xkb_keysym_to_utf32` just fine, I do not see a good motivation to keep this asymmetry. This is the only function with a special case for these code points. - `xkb_keysym_from_name`: - Unicode format `UNNNN`: allow control characters C0 and C1 and use `xkb_utf32_to_keysym` for the conversion when `NNNN < 0x100`, for backward compatibility. - Numeric hexadecimal format `0xNNNN`: *unchanged*. Contrary to the Unicode format, it does not normalize any keysym values in order to enable roundtrip with `xkb_keysym_get_name`. Also added tests to ensure various properties and consistency. Note about *surrogates*: they are valid valid *code points* but invalid Unicode *scalar values*, i.e. they cannot be encoded in any Unicode encoding form (UTF-8, UTF-16, UTF-32). So their corresponding Unicode keysyms are valid, but: - cannot be used as input of `xkb_keysym_to_utf32` nor `xkb_keysym_to_utf8` - cannot result as output of `xkb_utf32_to_keysym`. Otherwise they are valid e.g. in the Unicode keysym notation. [Unicode noncharacters]: https://en.wikipedia.org/wiki/Universal_Character_Set_characters#Noncharacters
Pierre Le Marre 5e557040 2025-04-09T11:17:00 xkbcomp: Fix Unicode escape sequence While the previous code correctly rejected malformed sequences such as `\u{` (incomplete) or `\u{123x}`, it should try to consume as much input as possible until reaching the corresponding closing `}` within the string. Else we can get leftovers and the error message does not reference the whole malformed sequence. Also added further tests with surrogates and noncharacters.
Pierre Le Marre 36442baa 2025-04-03T15:01:46 xkbcomp: Support multiple actions in interpret Before this commit we supported multiple actions per level, but not in *interpret* statements. Let’s fix this asymmetry, so we can equivalently assign all actions sets either implicitly or explicitly.
Pierre Le Marre 3d79f459 2025-03-29T11:46:34 xkbcomp: Add Unicode code point escape sequence \u{NNNN} Unicode code point escape sequences `\u{NNNN}` are replaced with the UTF-8 encoding of their corresponding code point `U+NNNN`, if legal. Supported Unicode code points are in the range `1‥0x10ffff`. Note that we will reject the `U+0000` NULL code point, as we reject it in the octal escape sequence `\0`. This is intended mainly for the upcoming feature to write keysyms as UTF-8 encoded strings. It can be used for various reasons: - avoid encoding issues; - avoid issue with font rendering (e.g. Asian scripts); - make white space or zero-width characters more readable.
Pierre Le Marre 7d91a753 2025-03-29T12:24:39 xkbcomp: Enable xkbcomp-style octal escape sequences Xorg xkbcomp only parses octal sequences with `\0`, while xkbcommon does not force the `0` prefix of the numeric part. However, we only parsed up to to 3 digits, which does not allow to parse e.g. `\0377` while `\377` parses fine. Fixed by parsing up to 4 octal digits, while checking the result fits into a byte.
Pierre Le Marre aa8b572e 2025-03-29T12:04:26 keymap serialization: Ensure escaping relevant chars Previously we would write characters without any escaping in some cases (e.g.: names of indicators, types and groups). E.g. the string "new\nline" would be serialized as: "new line" which would raise a syntax error if parsed. Fixed by escaping any string that was not escaped after parsing (e.g. the section names are safe already).
Pierre Le Marre d5a91fa9 2025-04-04T16:38:16 xkbcomp: Use custom parsers instead of strtol* The use of `strtol*` functions was already restricted due to its slowness and its capacity to parse other stuff than digits (e.g. signs and spaces). There is also another *big* limitation: it requires a NULL-terminated string. This is incompatible with our functions that work on buffers, because we cannot guarantee this. This may lead to a memory violation if the last token is a number. We now roll out our own parsers, which are more efficients and compatible with buffers.
Pierre Le Marre 44480f7c 2025-04-01T08:28:02 xkbcomp: Enable lists of keysyms and actions {} and {a} Motivations: - Follow the principle of least astonishment; - Ensure consistency; - Enhance the use of custom defaults; - Facilitate the tests. There is some ambiguity because we use `{}` to denote both an empty list of keysyms and an empty list of actions. But as soon as we get a keysym or an action, we know whether it is a `MultiKeySymList` or a `MultiActionList`. So we just count the `{}` at the *beginning* using `NoSymbolOrActionList`, then replace it by the relevant count of `NoSymbol` or `NoAction()` once the ambiguity is solved. If not, this is a list of empties of *some* type: we drop those empties and delegate the type resolution using `ExprEmptyList()`.
Pierre Le Marre e09cbe66 2025-04-02T10:46:06 symbols: Fix handling of empty keys Before this commit, the following symbols: ```c xkb_symbols { virtual_modifiers M1, M2; key <A> {}; key <B> { [] }; key.vmods = M1; key <C> {}; key <D> { vmods = M2 }; }; ``` would be equivalent to: ```c xkb_symbols { virtual_modifiers M1,M2; key <B> { [ NoSymbol ] }; }; ``` `<B>` entry could be skipped but is harmless. However, `<C>` and `<D>` are missing, which would lead to the mapping resolution of `M1` and `M2` failing. After this commit, it is equivalent to: ```c virtual_modifiers M1,M2; key <C> { vmods = M1 }; key <D> { vmods = M2 }; ``` Empty keys are skipped entirely, but any explicit field: - is taken into account: previously they would be skipped if there were no group; - forces the key to be printed at serialization.
Pierre Le Marre 2e0245f8 2025-04-02T10:45:44 xkbcomp: Enable more empty lists - Empty `interpret` - Empty key `type` - Empty `indicator` Motivations: - Follow the principle of least astonishment; - Ensure consistency; - Enhance the use of custom defaults; - Facilitate the tests.
Pierre Le Marre 6881fb32 2025-04-01T08:28:02 xkbcomp: Drop trailing NoSymbol and NoAction() This brings us closer to what `xkbcomp` outputs. One should use the explicit `VoidSymbol` instead of `NoSymbol`, in order to avoid dropping empty levels. This may affect keys that rely on an *implicit* key type. Example: - Input: ```c key <> { [a, A, NoSymbol] }; ``` - Compilation with xkbcommon \< 1.9.0: ```c key <> { type= "FOUR_LEVEL_SEMIALPHABETIC", [a, A, NoSymbol, NoSymbol] }; ``` - Compilation with xkbcommon ≥ 1.9.0: ```c key <> { type= "ALPHABETIC", [a, A] }; ```
Pierre Le Marre fbacdd98 2025-03-31T07:58:04 test: Refactor test_multi_keysyms_actions - Use less macros - Add golden tests to check the compilation *result*
Pierre Le Marre b254cc2e 2025-03-30T12:27:15 test: Remove empty components boilerplate
Pierre Le Marre 3150bca8 2025-03-30T09:54:02 xkbcomp: Make all components optional We already accept *empty* components, such as: `xkb_compat {};`. Let’s accept missing components as well, so that we can reduce the boilerplate in our tests. Note that we will still explicitly serialize empty components for compatibility with previous xkbcommon versions and Xorg xkbcomp.
Pierre Le Marre 500b260b 2025-03-28T09:38:58 xkbcomp: Fix parser failure on floating-point numbers Before this commit we used `strtold`, which depends on the locale. But the XKB syntax is fixed and uses a period as decimal separator. So ensure the syntax is correct without relying on `strtold` and truncate the result, as the parser does not use floating-point numbers.
Pierre Le Marre e5401b07 2025-03-26T16:02:58 symbols: Improve Modmap parsing Parse, dont’t validate: ensure *at parsing* that `modifier_map` definitions use a list of keys and keysyms. This enables to remove the redundant `ExprResolveKeySym` and have keysym parsing exclusively in handled in `parser.y`.
Pierre Le Marre e8561909 2025-03-18T14:34:10 xkbcomp: Fix keycodes bounds - Refactor to check conflicts first for the key names and then for the keycodes. This seems more useful for the user and enable further memory optimizations. - Do not allocate until we are sure to add the keycode. The bounds are only updated afterwards, so the call to `FindKeyByName` should be more efficient. - Fixed keycodes bounds not shrunk correctly when an existing keycode is overridden. - Do not prepare keyname strings for logging if we are not going to use them.
Pierre Le Marre befa0cdd 2025-02-12T15:38:58 test: Check integers syntax
Pierre Le Marre 4cef822a 2025-02-12T07:44:34 test: Check masks syntax
Ran Benita e120807b 2025-01-29T15:35:22 Update license notices to SDPX short identifiers + update LICENSE Fix #628. Signed-off-by: Ran Benita <ran@unusedvar.com>
Pierre Le Marre 502e9e5b 2025-01-29T12:19:10 xkbcomp: Add stricter bounds for keycodes and levels Our current implementation uses continuous arrays indexed by keycodes and levels. This is simple and good enough for realistic keymaps. However, they are allowed to have big values that will lead to either memory exhaustion or a waste of memory (sparse arrays). Added the much stricter upper bounds `0xfff` for keycodes[^1] and 2048 for levels[^2], which should still be plenty enough and provides stronger memory security. [^1]: Current max keycode is 0x2ff in Linux. [^2]: Should be big enough to satisfy automatically generated keymaps.
Pierre Le Marre 88a3d3c2 2025-01-23T16:03:51 tests: Refactor buffercomp Move tests into proper functions and log tests names.
Pierre Le Marre b1e1aae6 2025-01-23T15:20:44 xkbcomp: Fix memory leak when extra content after keymap It triggers with e.g.: ``` xkb_keymap { xkb_keycodes { }; }; }; // erroneous ```
Pierre Le Marre 709027ec 2025-01-23T09:12:15 symbols: Fix inconsistent error handling Currently the following keymap triggers a critical error (invalid `vmods`) only for the second key statement, while it should handle both equally. ``` xkb_keymap { xkb_keycodes { <> = 9; }; xkb_types { }; xkb_compat { }; xkb_symbols { key <> { vmods = [], repeats = false }; key <> { repeats = false, vmods = [] }; }; }; ``` Fixed by parsing the whole symbols body and failing if any error was found.
Pierre Le Marre ec2915fe 2025-01-22T17:18:21 symbols: Fix a possible null pointer deference Introduce a new Expression type, `EXPR_EMPTY_LIST`, to avoid the ambiguity between action and keysym empty lists. Two alternatives were rejected to keep the semantics clear: - Using `EXPR_KEYSYM_LIST`: because we would end up accepting an empty keysym list while processing actions. - Using NULL: convey no info and is hazardous.
Pierre Le Marre 7036e46c 2025-01-13T15:20:47 symbols: Add tests for key merge modes (keysyms/actions) This commit adds tests for merging various key configurations: - With/without keysyms/actions - Single/multiple keysyms/actions per level We test all the merge modes for including a map (global) as well as directly on the keys (local): - default (global: include, local: implicit) - augment - override - replace The tests data are generated with: - A Python script `scripts/update-merge-modes-tests.py` for keycodes and symbols data. Use `--debug` for extra comments to help debugging. The script can optionally generate C headers for alternative key sequence tests, that were used before implementing golden tests. The latter tests are not used anymore (duplicate with golden tests) but their generator is kept for now, as they can still be useful for debugging or writing similar tests. - The `merge-modes` test generates its own keymap files for golden tests, using: `build/test-merge-modes update`. It can also replace them with the obtained output rather than the expected one using `build/test-merge-modes update-obtained`, which is very useful for debugging.
Pierre Le Marre 71d64df3 2024-10-08T18:45:18 symbols: Add tests for multiple actions per level
Pierre Le Marre 31c6d866 2024-10-08T18:39:00 symbols: Min. 2 keysyms in level list Do not allow `{ a }` when a single `a` suffices.
Pierre Le Marre 929a485f 2024-10-08T12:52:53 symbols: Fix too liberal parsing of keysyms lists Currently we are too liberal when parsing symbols lists: e.g. `[{a,{b}}]` is parsed as `[{a,b}]` but it should be rejected.
Pierre Le Marre e325e65e 2024-02-20T08:13:37 Add test_unit to all tests Currently it only ensure we do not buffer `stdout`.
Pierre Le Marre 00e3058e 2023-11-06T21:53:51 Prevent recursive includes of keymap components - Add check for recursive includes of keymap components. It relies on limiting the include depth. The threshold is currently to 15, which seems reasonable with plenty of margin for keymaps in the wild. - Add corresponding new log message `recursive-include`. - Add tests for recursive includes.
Pierre Le Marre 82e9293e 2023-10-30T15:28:10 xkbcomp: early detection of invalid encoding
Pierre Le Marre f937c308 2023-10-29T07:31:34 xkbcomp: skip heading UTF-8 encoded BOM (U+FEFF) Leading BOM is legal and is used as a signature — an indication that an otherwise unmarked text file is in UTF-8. See: https://www.unicode.org/faq/utf_bom.html#bom5 for further details.
Peter Hutterer b06aedb8 2023-05-02T14:15:55 scanner: allow for a zero terminated string as keymap As the documentation for xkb_keymap_new_from_buffer() states, the "input string does not have to be zero-terminated". The actual implementation however failed with "unrecognized token/syntax error" when it encountered a null byte. Fix this by allowing a null byte at the last position of the buffer. Anything else is likely a client error anyway. Fixes #307
Ran Benita 40aab05e 2019-12-27T13:03:20 build: include config.h manually Previously we included it with an `-include` compiler directive. But that's not portable. And it's better to be explicit anyway. Every .c file should have `include "config.h"` first thing. Signed-off-by: Ran Benita <ran@unusedvar.com>
David Herrmann 36f55c49 2013-03-11T12:53:39 keymap: add xkb_keymap_new_from_buffer() The current API doesn't allow the caller to create keymaps from mmap()'ed files. The problem is, xkb_keymap_new_from_string() requires a terminating 0 byte. However, there is no way to guarantee that when using mmap() so a user currently has to copy the whole file just to get the terminating zero byte (assuming they cannot use xkb_keymap_new_from_file()). This adds a new entry xkb_keymap_new_from_buffer() which takes a memory location and the buffer size in bytes. Internally, we depend on yy_scan_{string,byte}() helpers. According to flex documentation these already copy the input string because they are wrappers around yy_scan_buffer(). yy_scan_buffer() on the other hand has some insane requirements. The buffer must be writeable and the last two bytes must be ASCII-NUL. But the buffer may contain other 0 bytes just fine. Because we don't want these constraints in our public API, xkb_keymap_new_from_buffer() needs to create a copy of the input memory. But it then calls yy_scan_buffer() directly. Hence, we have the same number of buffer-copies as with *_from_string() but without the terminating 0 requirement. The explicit yy_scan_buffer() call is preferred over yy_scan_byte() so the buffer-copy operation is not hidden somewhere in flex. Maybe some day we no longer depend on flex and can have a zero-copy API. A user could mmap() a file and it would get parsed right from this buffer. But until then, we shouldn't expose this limitation in the API but instead provide an API that some day can work with zero-copy. Signed-off-by: David Herrmann <dh.herrmann@gmail.com> [ran: rebased on top of my branch] Conflicts: Makefile.am src/xkbcomp/xkbcomp.c