kmx git

Commit	Date	Message
39b4b670	2025-06-06T18:40:29	Support including keymap components using %-expansion and absolute path Enable to use the same `include` features than rules files in keymap components: - `%`-expansion: `%H` home directory, `%S` sytem root and `%E` extra. - absolute file paths. This is useful if one wants to overwrite the system file with a user config (i.e. same name, but in `~/.config/xkb`), but still include the system file: ``` // File: ~/.config/xkb/symbols/de xkb_symbols "basic" { include "%S/de(basic)" key <AB01> { [z, Z] }; key <AD06> { [y, Y] }; } ```` Without the commit, using a mere `include "de(basic)"` would result in an include loop. Refactored by using the same code for rules and keymap components.
1d361b8f	2025-05-12T10:01:10	scanner: Ensure proper type for string length
5e557040	2025-04-09T11:17:00	xkbcomp: Fix Unicode escape sequence While the previous code correctly rejected malformed sequences such as `\u{` (incomplete) or `\u{123x}`, it should try to consume as much input as possible until reaching the corresponding closing `}` within the string. Else we can get leftovers and the error message does not reference the whole malformed sequence. Also added further tests with surrogates and noncharacters.
3d79f459	2025-03-29T11:46:34	xkbcomp: Add Unicode code point escape sequence \u{NNNN} Unicode code point escape sequences `\u{NNNN}` are replaced with the UTF-8 encoding of their corresponding code point `U+NNNN`, if legal. Supported Unicode code points are in the range `1‥0x10ffff`. Note that we will reject the `U+0000` NULL code point, as we reject it in the octal escape sequence `\0`. This is intended mainly for the upcoming feature to write keysyms as UTF-8 encoded strings. It can be used for various reasons: - avoid encoding issues; - avoid issue with font rendering (e.g. Asian scripts); - make white space or zero-width characters more readable.
7d91a753	2025-03-29T12:24:39	xkbcomp: Enable xkbcomp-style octal escape sequences Xorg xkbcomp only parses octal sequences with `\0`, while xkbcommon does not force the `0` prefix of the numeric part. However, we only parsed up to to 3 digits, which does not allow to parse e.g. `\0377` while `\377` parses fine. Fixed by parsing up to 4 octal digits, while checking the result fits into a byte.
d5a91fa9	2025-04-04T16:38:16	xkbcomp: Use custom parsers instead of strtol* The use of `strtol` functions was already restricted due to its slowness and its capacity to parse other stuff than digits (e.g. signs and spaces). There is also another big* limitation: it requires a NULL-terminated string. This is incompatible with our functions that work on buffers, because we cannot guarantee this. This may lead to a memory violation if the last token is a number. We now roll out our own parsers, which are more efficients and compatible with buffers.
70d11abd	2025-03-26T07:38:05	messages: Add file encoding and invalid syntax entries Added: - `XKB_ERROR_INVALID_FILE_ENCODING` - `XKB_ERROR_INVALID_RULES_SYNTAX` - `XKB_ERROR_INVALID_COMPOSE_SYNTAX` Changed: - `XKB_ERROR_INVALID_SYNTAX` renamed to `XKB_ERROR_INVALID_XKB_SYNTAX`.
e1892266	2025-02-13T16:57:46	clang-tidy: Miscellaneous fixes
f4e95280	2025-02-02T22:29:05	xkbcomp/scanner: avoid unneeded strdup of IDENT tokens The allocation is immediately discarded, either turned into a keysym or an atom. So use an sval slice into the input string instead strdup'ing. memusage ./release/bench-compile-keymap --iter=1000 --layout us,de --variant ,neo Before: Memory usage summary: heap total: 534063576, heap peak: 581022, stack peak: 18848 total calls total memory failed calls malloc\| 11240525 291897104 0 realloc\| 1447657 192307328 0 (nomove:37629, dec:0, free:0) calloc\| 430573 49859144 0 free\| 13993903 534063576 After: Memory usage summary: heap total: 506839909, heap peak: 581022, stack peak: 18960 total calls total memory failed calls malloc\| 8016419 264673437 0 realloc\| 1447657 192307328 0 (nomove:37278, dec:0, free:0) calloc\| 430573 49859144 0 free\| 10769797 506839909 Signed-off-by: Ran Benita <ran@unusedvar.com>
df2322d7	2025-02-05T14:41:21	Replace include guards by `#pragma once` We currently have a mix of include headers, pragma once and some missing. pragma once is not standard but is widely supported, and we already use it with no issues, so I'd say it's not a problem. Let's convert all headers to pragma once to avoid the annoying include guards. The public headers are not converted. Signed-off-by: Ran Benita <ran@unusedvar.com>
e120807b	2025-01-29T15:35:22	Update license notices to SDPX short identifiers + update LICENSE Fix #628. Signed-off-by: Ran Benita <ran@unusedvar.com>
6e97f57e	2025-01-29T19:21:43	scanner: speed up token position -> location using a cache Signed-off-by: Ran Benita <ran@unusedvar.com>
26807a90	2025-01-28T20:24:05	scanner: compute token line/column lazily on errors The scanner functions are hot, and the line/column location tracking is quite expensive. We only use it for errors, which don't need to be fast, because we bail if there are too many; and for warnings, which are usually not shown by default. So only keep the token start pos, and compute the line/column lazily from that. This will also allow some further improvements ahead. bench/rulescomp before: compiled 1000 keymaps in 1.669028s after: compiled 1000 keymaps in 1.550411s bench/compose: before: compiled 1000 compose tables in 2.145217s after: compiled 1000 compose tables in 2.016044s Signed-off-by: Ran Benita <ran@unusedvar.com>
53b3f446	2025-01-22T17:43:53	clang-tidy: Fix headers includes
4ea9d431	2023-11-16T17:12:03	rules: Add support for :all qualifier Some layout options require to be applied to every group to maintain consistency (e.g. a group switcher). Currently this must be done manually for all layout indexes. This is error prone and prevents the increase of the maximum group count. This commit introduces the `:all` qualifier for KcCGST values. When a rule with this qualifier is matched, it will expands the qualified value (and its optional merge mode) for every layout, e.g. `+group(toggle):all` (respectively `\|group(toggle)`) would expand to `+group(toggle):1+group(toggle):2` (respectively `\|group(toggle):1\|group(toggle):2`) if there are 2 layouts, etc. If there is no merge mode, it defaults to override `+`, e.g. `x:all` expands to `x:1+x:2+x:3` for 3 layouts. Note that only the qualified value is expanded, e.g. `x+y:all` expands to `x+y:1+y:2` for 2 layouts. `:all` can be used in combination with special layout indexes. Since this can lead to an unexpected behaviour, a warning will be raised.
ba896935	2024-09-24T21:28:12	logging: Make scanner_warn use a message ID
c8bd57dd	2024-09-24T21:20:41	logging: Make scanner_err use a message ID
a2da57ab	2023-10-30T14:50:00	Compose: early detection of invalid encoding Also move “unrecognized token” error message before skiping the line, in order to fix token position.
0038c866	2023-09-26T17:05:14	Prevent overflow of octal escape sequences The octal parser accepts the range `\1..\777`. The result is cast to `char` which will silently overflow. This commit prevents overlow and will treat `\400..\777` as invalid escape sequences.
ef81d04e	2023-09-18T18:17:34	Structured log messages with a message registry Currently there is little structure in the log messages, making difficult to use them for the following use cases: - A user looking for help about a log message: the user probably uses a search engine, thus the results will depend on the proper indexing of our documentation and the various forums. It relies only on the wording of the message, which may change with time. - A user wants to filter the logs resulting of the use of one of the components of xkbcommon. A typical example would be testing xkeyboard-config against libxkbcommon. It requires the use of a pattern (simple words detection or regex). The issue is that the pattern may become silently out-of-sync with xkbcommon. A common practice (e.g. in compilers) is to assign unique error codes to reference theses messages, along with an error index for documentation. Thus this commit implements the following features: - Create a message registry (message-registry.yaml) that defines the log messages produced by xkbcommon. This is a simple YAML file that provides, for each message: - A unique numeric code as a short identifier. It is used in the output message and thus can be easily be filtered to spot errors or searched in the internet. It must not change: if the semantics of message changes, it is better to introduce a new message for clarity. - A unique text identifier, meant for two uses: 1. Generate constants dealing with log information in our code base. 2. Generate human-friendly names for the documentation. - A type: currently warning or error. Used to prefix the constants (see hereinabove) and for basic classification in documentation. - A short description, used as concise and mandatory documentation. - An optionnal detailed description. - Optional examples, intended to help the user to fix issues themself. - Version of xkbcommon it was added. For old entries this often unknown, so they will default to 1.0.0. - Version of xkbcommon it was removed (optional) No entry should ever be deleted from this index, even if the message is not used anymore: it ensures we have unique identifiers along the history of xkbcommon, and that users can refer to the documentation even for older versions. - Add the script update-message-registry.py to generate the following files: - messages.h: message code enumeration for the messages currently used in the code base. Currently a private API. - message.registry.md: the error index documentation page. - Modify the logging functions to use structured messages. This is a work in progress.
0b3d9092	2022-03-14T16:44:13	scanner: prefix functions with `scanner_` to avoid symbol conflicts Particularly `eof()` in mingw-w64. Fixes: https://github.com/xkbcommon/libxkbcommon/pull/285 Reported-by: Marko Lindqvist Signed-off-by: Ran Benita <ran@unusedvar.com>
c3ac58a9	2019-12-27T14:06:47	scanner-utils: avoid possible implicit truncating of line/column This increases the size of the struct a bit but it's not very important. Fixes these MSVC warnings: src\scanner-utils.h(112): warning C4267: '+=': conversion from 'size_t' to 'unsigned int', possible loss of data src\scanner-utils.h(147): warning C4267: '+=': conversion from 'size_t' to 'unsigned int', possible loss of data Signed-off-by: Ran Benita <ran@unusedvar.com>
f774f819	2014-10-18T13:23:53	Replace some strncmp's with memcmp Signed-off-by: Ran Benita <ran234@gmail.com>
a3116f97	2014-10-13T18:51:12	compose/parser: fix segfault when including The keysym cache for the new scanner was not initialized. To avoid such errors also in the future, require passing the priv argument in scanner_init(), instead of initializing it separately. Signed-off-by: Ran Benita <ran234@gmail.com>
8a0acf2c	2014-10-07T23:42:08	scanner-utils: optimize one-line comments Compose files have a lot of those. Signed-off-by: Ran Benita <ran234@gmail.com>
29a1a780	2014-09-12T18:40:18	scanner-utils: add priv member For when a user of the scanner wants to pass something along with it. Signed-off-by: Ran Benita <ran234@gmail.com>
94a8e01c	2014-02-03T14:55:37	scanner-utils: add helper for appending an entire string Signed-off-by: Ran Benita <ran234@gmail.com>
8eb024d5	2013-10-27T20:17:29	scanner-utils: add helper for hex string escape Like the already existing oct. Signed-off-by: Ran Benita <ran234@gmail.com>
4ed68120	2014-10-01T19:14:36	scanner-utils: optimize str()/lit() Replace the dog-slow unneeded strncasecmp() with an inlineable memcmp(). Before: compiled 2500 keymaps in 8.348715629s After: compiled 2500 keymaps in 7.872640338s Signed-off-by: Ran Benita <ran234@gmail.com>
e55a0cea	2013-10-27T20:10:15	Move src/xkbcomp/scanner-utils.h to src/ As we'll use it for things unrelated to xkbcomp. Signed-off-by: Ran Benita <ran234@gmail.com>

39b4b670

2025-06-06T18:40:29

Support including keymap components using %-expansion and absolute path Enable to use the same `include` features than *rules* files in *keymap components*: - *`%`-expansion*: `%H` home directory, `%S` sytem root and `%E` extra. - absolute file paths. This is useful if one wants to overwrite the system file with a user config (i.e. same name, but in `~/.config/xkb`), but still include the system file: ``` // File: ~/.config/xkb/symbols/de xkb_symbols "basic" { include "%S/de(basic)" key <AB01> { [z, Z] }; key <AD06> { [y, Y] }; } ```` Without the commit, using a mere `include "de(basic)"` would result in an include loop. Refactored by using the same code for rules and keymap components.

1d361b8f

2025-05-12T10:01:10

scanner: Ensure proper type for string length

5e557040

2025-04-09T11:17:00

xkbcomp: Fix Unicode escape sequence While the previous code correctly rejected malformed sequences such as `\u{` (incomplete) or `\u{123x}`, it should try to consume as much input as possible until reaching the corresponding closing `}` within the string. Else we can get leftovers and the error message does not reference the whole malformed sequence. Also added further tests with surrogates and noncharacters.

3d79f459

2025-03-29T11:46:34

xkbcomp: Add Unicode code point escape sequence \u{NNNN} Unicode code point escape sequences `\u{NNNN}` are replaced with the UTF-8 encoding of their corresponding code point `U+NNNN`, if legal. Supported Unicode code points are in the range `1‥0x10ffff`. Note that we will reject the `U+0000` NULL code point, as we reject it in the octal escape sequence `\0`. This is intended mainly for the upcoming feature to write keysyms as UTF-8 encoded strings. It can be used for various reasons: - avoid encoding issues; - avoid issue with font rendering (e.g. Asian scripts); - make white space or zero-width characters more readable.

7d91a753

2025-03-29T12:24:39

xkbcomp: Enable xkbcomp-style octal escape sequences Xorg xkbcomp only parses octal sequences with `\0`, while xkbcommon does not force the `0` prefix of the numeric part. However, we only parsed up to to 3 digits, which does not allow to parse e.g. `\0377` while `\377` parses fine. Fixed by parsing up to 4 octal digits, while checking the result fits into a byte.

d5a91fa9

2025-04-04T16:38:16

xkbcomp: Use custom parsers instead of strtol* The use of `strtol*` functions was already restricted due to its slowness and its capacity to parse other stuff than digits (e.g. signs and spaces). There is also another *big* limitation: it requires a NULL-terminated string. This is incompatible with our functions that work on buffers, because we cannot guarantee this. This may lead to a memory violation if the last token is a number. We now roll out our own parsers, which are more efficients and compatible with buffers.

70d11abd

2025-03-26T07:38:05

messages: Add file encoding and invalid syntax entries Added: - `XKB_ERROR_INVALID_FILE_ENCODING` - `XKB_ERROR_INVALID_RULES_SYNTAX` - `XKB_ERROR_INVALID_COMPOSE_SYNTAX` Changed: - `XKB_ERROR_INVALID_SYNTAX` renamed to `XKB_ERROR_INVALID_XKB_SYNTAX`.

e1892266

2025-02-13T16:57:46

clang-tidy: Miscellaneous fixes

f4e95280

2025-02-02T22:29:05

xkbcomp/scanner: avoid unneeded strdup of IDENT tokens The allocation is immediately discarded, either turned into a keysym or an atom. So use an sval slice into the input string instead strdup'ing. memusage ./release/bench-compile-keymap --iter=1000 --layout us,de --variant ,neo Before: Memory usage summary: heap total: 534063576, heap peak: 581022, stack peak: 18848 total calls total memory failed calls malloc| 11240525 291897104 0 realloc| 1447657 192307328 0 (nomove:37629, dec:0, free:0) calloc| 430573 49859144 0 free| 13993903 534063576 After: Memory usage summary: heap total: 506839909, heap peak: 581022, stack peak: 18960 total calls total memory failed calls malloc| 8016419 264673437 0 realloc| 1447657 192307328 0 (nomove:37278, dec:0, free:0) calloc| 430573 49859144 0 free| 10769797 506839909 Signed-off-by: Ran Benita <ran@unusedvar.com>

df2322d7

2025-02-05T14:41:21

Replace include guards by `#pragma once` We currently have a mix of include headers, pragma once and some missing. pragma once is not standard but is widely supported, and we already use it with no issues, so I'd say it's not a problem. Let's convert all headers to pragma once to avoid the annoying include guards. The public headers are *not* converted. Signed-off-by: Ran Benita <ran@unusedvar.com>

e120807b

2025-01-29T15:35:22

Update license notices to SDPX short identifiers + update LICENSE Fix #628. Signed-off-by: Ran Benita <ran@unusedvar.com>

6e97f57e

2025-01-29T19:21:43

scanner: speed up token position -> location using a cache Signed-off-by: Ran Benita <ran@unusedvar.com>

26807a90

2025-01-28T20:24:05

scanner: compute token line/column lazily on errors The scanner functions are hot, and the line/column location tracking is quite expensive. We only use it for errors, which don't need to be fast, because we bail if there are too many; and for warnings, which are usually not shown by default. So only keep the token start pos, and compute the line/column lazily from that. This will also allow some further improvements ahead. bench/rulescomp before: compiled 1000 keymaps in 1.669028s after: compiled 1000 keymaps in 1.550411s bench/compose: before: compiled 1000 compose tables in 2.145217s after: compiled 1000 compose tables in 2.016044s Signed-off-by: Ran Benita <ran@unusedvar.com>

53b3f446

2025-01-22T17:43:53

clang-tidy: Fix headers includes

4ea9d431

2023-11-16T17:12:03

rules: Add support for :all qualifier Some layout options require to be applied to every group to maintain consistency (e.g. a group switcher). Currently this must be done manually for all layout indexes. This is error prone and prevents the increase of the maximum group count. This commit introduces the `:all` qualifier for KcCGST values. When a rule with this qualifier is matched, it will expands the qualified value (and its optional merge mode) for every layout, e.g. `+group(toggle):all` (respectively `|group(toggle)`) would expand to `+group(toggle):1+group(toggle):2` (respectively `|group(toggle):1|group(toggle):2`) if there are 2 layouts, etc. If there is no merge mode, it defaults to *override* `+`, e.g. `x:all` expands to `x:1+x:2+x:3` for 3 layouts. Note that only the qualified *value* is expanded, e.g. `x+y:all` expands to `x+y:1+y:2` for 2 layouts. `:all` can be used in combination with special layout indexes. Since this can lead to an unexpected behaviour, a warning will be raised.

ba896935

2024-09-24T21:28:12

logging: Make scanner_warn use a message ID

c8bd57dd

2024-09-24T21:20:41

logging: Make scanner_err use a message ID

a2da57ab

2023-10-30T14:50:00

Compose: early detection of invalid encoding Also move “unrecognized token” error message before skiping the line, in order to fix token position.

0038c866

2023-09-26T17:05:14

Prevent overflow of octal escape sequences The octal parser accepts the range `\1..\777`. The result is cast to `char` which will silently overflow. This commit prevents overlow and will treat `\400..\777` as invalid escape sequences.

ef81d04e

2023-09-18T18:17:34

Structured log messages with a message registry Currently there is little structure in the log messages, making difficult to use them for the following use cases: - A user looking for help about a log message: the user probably uses a search engine, thus the results will depend on the proper indexing of our documentation and the various forums. It relies only on the wording of the message, which may change with time. - A user wants to filter the logs resulting of the use of one of the components of xkbcommon. A typical example would be testing xkeyboard-config against libxkbcommon. It requires the use of a pattern (simple words detection or regex). The issue is that the pattern may become silently out-of-sync with xkbcommon. A common practice (e.g. in compilers) is to assign unique error codes to reference theses messages, along with an error index for documentation. Thus this commit implements the following features: - Create a message registry (message-registry.yaml) that defines the log messages produced by xkbcommon. This is a simple YAML file that provides, for each message: - A unique numeric code as a short identifier. It is used in the output message and thus can be easily be filtered to spot errors or searched in the internet. It must not change: if the semantics of message changes, it is better to introduce a new message for clarity. - A unique text identifier, meant for two uses: 1. Generate constants dealing with log information in our code base. 2. Generate human-friendly names for the documentation. - A type: currently warning or error. Used to prefix the constants (see hereinabove) and for basic classification in documentation. - A short description, used as concise and mandatory documentation. - An optionnal detailed description. - Optional examples, intended to help the user to fix issues themself. - Version of xkbcommon it was added. For old entries this often unknown, so they will default to 1.0.0. - Version of xkbcommon it was removed (optional) No entry should ever be deleted from this index, even if the message is not used anymore: it ensures we have unique identifiers along the history of xkbcommon, and that users can refer to the documentation even for older versions. - Add the script update-message-registry.py to generate the following files: - messages.h: message code enumeration for the messages currently used in the code base. Currently a private API. - message.registry.md: the error index documentation page. - Modify the logging functions to use structured messages. This is a work in progress.

0b3d9092

2022-03-14T16:44:13

scanner: prefix functions with `scanner_` to avoid symbol conflicts Particularly `eof()` in mingw-w64. Fixes: https://github.com/xkbcommon/libxkbcommon/pull/285 Reported-by: Marko Lindqvist Signed-off-by: Ran Benita <ran@unusedvar.com>

c3ac58a9

2019-12-27T14:06:47

scanner-utils: avoid possible implicit truncating of line/column This increases the size of the struct a bit but it's not very important. Fixes these MSVC warnings: src\scanner-utils.h(112): warning C4267: '+=': conversion from 'size_t' to 'unsigned int', possible loss of data src\scanner-utils.h(147): warning C4267: '+=': conversion from 'size_t' to 'unsigned int', possible loss of data Signed-off-by: Ran Benita <ran@unusedvar.com>

f774f819

2014-10-18T13:23:53

Replace some strncmp's with memcmp Signed-off-by: Ran Benita <ran234@gmail.com>

a3116f97

2014-10-13T18:51:12

compose/parser: fix segfault when including The keysym cache for the new scanner was not initialized. To avoid such errors also in the future, require passing the priv argument in scanner_init(), instead of initializing it separately. Signed-off-by: Ran Benita <ran234@gmail.com>

8a0acf2c

2014-10-07T23:42:08

scanner-utils: optimize one-line comments Compose files have a lot of those. Signed-off-by: Ran Benita <ran234@gmail.com>

29a1a780

2014-09-12T18:40:18

scanner-utils: add priv member For when a user of the scanner wants to pass something along with it. Signed-off-by: Ran Benita <ran234@gmail.com>

94a8e01c

2014-02-03T14:55:37

scanner-utils: add helper for appending an entire string Signed-off-by: Ran Benita <ran234@gmail.com>

8eb024d5

2013-10-27T20:17:29

scanner-utils: add helper for hex string escape Like the already existing oct. Signed-off-by: Ran Benita <ran234@gmail.com>

4ed68120

2014-10-01T19:14:36

scanner-utils: optimize str()/lit() Replace the dog-slow unneeded strncasecmp() with an inlineable memcmp(). Before: compiled 2500 keymaps in 8.348715629s After: compiled 2500 keymaps in 7.872640338s Signed-off-by: Ran Benita <ran234@gmail.com>

e55a0cea

2013-10-27T20:10:15

Move src/xkbcomp/scanner-utils.h to src/ As we'll use it for things unrelated to xkbcomp. Signed-off-by: Ran Benita <ran234@gmail.com>

kc3-lang/libxkbcommon/src/scanner-utils.h

src/scanner-utils.h

Log