| Commit | Date | Message | Author | CI |
|---|---|---|---|---|
| 3d79f459 | 2025-03-29 11:46:34 | xkbcomp: Add Unicode code point escape sequence \u{NNNN} Unicode code point escape sequences `\u{NNNN}` are replaced with the UTF-8 encoding of their corresponding code point `U+NNNN`, if legal. Supported Unicode code points are in the range `1‥0x10ffff`. Note that we will reject the `U+0000` NULL code point, as we reject it in the octal escape sequence `\0`. This is intended mainly for the upcoming feature to write keysyms as UTF-8 encoded strings. It can be used for various reasons: - avoid encoding issues; - avoid issue with font rendering (e.g. Asian scripts); - make white space or zero-width characters more readable. | ||
| 23bbec96 | 2025-03-29 12:33:53 | xkbcomp: Add escape sequence \" `\"` seems like a very natural extension. However it is not supported by Xorg xkbcomp, so do not emit it when serializing. | ||
| aa8b572e | 2025-03-29 12:04:26 | keymap serialization: Ensure escaping relevant chars Previously we would write characters without any escaping in some cases (e.g.: names of indicators, types and groups). E.g. the string "new\nline" would be serialized as: "new line" which would raise a syntax error if parsed. Fixed by escaping any string that was not escaped after parsing (e.g. the section names are safe already). | ||
| d5a91fa9 | 2025-04-04 16:38:16 | xkbcomp: Use custom parsers instead of strtol* The use of `strtol*` functions was already restricted due to its slowness and its capacity to parse other stuff than digits (e.g. signs and spaces). There is also another *big* limitation: it requires a NULL-terminated string. This is incompatible with our functions that work on buffers, because we cannot guarantee this. This may lead to a memory violation if the last token is a number. We now roll out our own parsers, which are more efficients and compatible with buffers. | ||
| 500b260b | 2025-03-28 09:38:58 | xkbcomp: Fix parser failure on floating-point numbers Before this commit we used `strtold`, which depends on the locale. But the XKB syntax is fixed and uses a period as decimal separator. So ensure the syntax is correct without relying on `strtold` and truncate the result, as the parser does not use floating-point numbers. | ||
| 70d11abd | 2025-03-26 07:38:05 | messages: Add file encoding and invalid syntax entries Added: - `XKB_ERROR_INVALID_FILE_ENCODING` - `XKB_ERROR_INVALID_RULES_SYNTAX` - `XKB_ERROR_INVALID_COMPOSE_SYNTAX` Changed: - `XKB_ERROR_INVALID_SYNTAX` renamed to `XKB_ERROR_INVALID_XKB_SYNTAX`. | ||
| e1892266 | 2025-02-13 16:57:46 | clang-tidy: Miscellaneous fixes | ||
| 2d111bbe | 2025-02-12 13:54:51 | xkbcomp: Fix possible overflow in numbers parser | ||
| 558447d8 | 2025-02-11 17:34:27 | xkbcomp: Explicit vars initialization The `Resolve*` functions do not always initialize the parameters that they can modify, so it is safer to always initialize them at the call site. | ||
| f4e95280 | 2025-02-02 22:29:05 | xkbcomp/scanner: avoid unneeded strdup of IDENT tokens The allocation is immediately discarded, either turned into a keysym or an atom. So use an sval slice into the input string instead strdup'ing. memusage ./release/bench-compile-keymap --iter=1000 --layout us,de --variant ,neo Before: Memory usage summary: heap total: 534063576, heap peak: 581022, stack peak: 18848 total calls total memory failed calls malloc| 11240525 291897104 0 realloc| 1447657 192307328 0 (nomove:37629, dec:0, free:0) calloc| 430573 49859144 0 free| 13993903 534063576 After: Memory usage summary: heap total: 506839909, heap peak: 581022, stack peak: 18960 total calls total memory failed calls malloc| 8016419 264673437 0 realloc| 1447657 192307328 0 (nomove:37278, dec:0, free:0) calloc| 430573 49859144 0 free| 10769797 506839909 Signed-off-by: Ran Benita <ran@unusedvar.com> | ||
| 7e84c845 | 2025-02-01 17:04:33 | xkbcomp/scanner: avoid extra copies for keynames, keywords, identifiers The tokens don't have escapes so no need to use the `buf` for them. Signed-off-by: Ran Benita <ran@unusedvar.com> | ||
| e120807b | 2025-01-29 15:35:22 | Update license notices to SDPX short identifiers + update LICENSE Fix #628. Signed-off-by: Ran Benita <ran@unusedvar.com> | ||
| 26807a90 | 2025-01-28 20:24:05 | scanner: compute token line/column lazily on errors The scanner functions are hot, and the line/column location tracking is quite expensive. We only use it for errors, which don't need to be fast, because we bail if there are too many; and for warnings, which are usually not shown by default. So only keep the token start pos, and compute the line/column lazily from that. This will also allow some further improvements ahead. bench/rulescomp before: compiled 1000 keymaps in 1.669028s after: compiled 1000 keymaps in 1.550411s bench/compose: before: compiled 1000 compose tables in 2.145217s after: compiled 1000 compose tables in 2.016044s Signed-off-by: Ran Benita <ran@unusedvar.com> | ||
| ba896935 | 2024-09-24 21:28:12 | logging: Make scanner_warn use a message ID | ||
| c8bd57dd | 2024-09-24 21:20:41 | logging: Make scanner_err use a message ID | ||
| 82e9293e | 2023-10-30 15:28:10 | xkbcomp: early detection of invalid encoding | ||
| f937c308 | 2023-10-29 07:31:34 | xkbcomp: skip heading UTF-8 encoded BOM (U+FEFF) Leading BOM is legal and is used as a signature — an indication that an otherwise unmarked text file is in UTF-8. See: https://www.unicode.org/faq/utf_bom.html#bom5 for further details. | ||
| 9d15c6a7 | 2023-09-26 17:05:14 | Show invalid escape sequences It is easier to debug when the message actually displays the offending escape sequence. | ||
| ca7aa69c | 2023-09-26 17:05:05 | Disallow producing NULL character with escape sequences NULL usually terminates the strings; allowing to produce it via escape sequences may lead to undefined behaviour. - Make NULL escape sequences (e.g. `\0` and `\x0`) invalid. - Add corresponding test. - Introduce the new message: XKB_WARNING_INVALID_ESCAPE_SEQUENCE. | ||
| c0065c95 | 2023-09-21 20:06:27 | Messages: merge macros with and without message code Previously we had two types of macros for logging: with and without message code. They were intended to be merged afterwards. The idea is to use a special code – `XKB_LOG_MESSAGE_NO_ID = 0` – that should *not* be displayed. But we would like to avoid checking this special code at run time. This is achieved using macro tricks; they are detailed in the code (see: `PREPEND_MESSAGE_ID`). Now it is also easier to spot the remaining undocumented log entries: just search `XKB_LOG_MESSAGE_NO_ID`. | ||
| ef81d04e | 2023-09-18 18:17:34 | Structured log messages with a message registry Currently there is little structure in the log messages, making difficult to use them for the following use cases: - A user looking for help about a log message: the user probably uses a search engine, thus the results will depend on the proper indexing of our documentation and the various forums. It relies only on the wording of the message, which may change with time. - A user wants to filter the logs resulting of the use of one of the components of xkbcommon. A typical example would be testing xkeyboard-config against libxkbcommon. It requires the use of a pattern (simple words detection or regex). The issue is that the pattern may become silently out-of-sync with xkbcommon. A common practice (e.g. in compilers) is to assign unique error codes to reference theses messages, along with an error index for documentation. Thus this commit implements the following features: - Create a message registry (message-registry.yaml) that defines the log messages produced by xkbcommon. This is a simple YAML file that provides, for each message: - A unique numeric code as a short identifier. It is used in the output message and thus can be easily be filtered to spot errors or searched in the internet. It must not change: if the semantics of message changes, it is better to introduce a new message for clarity. - A unique text identifier, meant for two uses: 1. Generate constants dealing with log information in our code base. 2. Generate human-friendly names for the documentation. - A type: currently warning or error. Used to prefix the constants (see hereinabove) and for basic classification in documentation. - A short description, used as concise and mandatory documentation. - An optionnal detailed description. - Optional examples, intended to help the user to fix issues themself. - Version of xkbcommon it was added. For old entries this often unknown, so they will default to 1.0.0. - Version of xkbcommon it was removed (optional) No entry should ever be deleted from this index, even if the message is not used anymore: it ensures we have unique identifiers along the history of xkbcommon, and that users can refer to the documentation even for older versions. - Add the script update-message-registry.py to generate the following files: - messages.h: message code enumeration for the messages currently used in the code base. Currently a private API. - message.registry.md: the error index documentation page. - Modify the logging functions to use structured messages. This is a work in progress. | ||
| 0b3d9092 | 2022-03-14 16:44:13 | scanner: prefix functions with `scanner_` to avoid symbol conflicts Particularly `eof()` in mingw-w64. Fixes: https://github.com/xkbcommon/libxkbcommon/pull/285 Reported-by: Marko Lindqvist Signed-off-by: Ran Benita <ran@unusedvar.com> | ||
| 521bb498 | 2019-12-27 22:08:57 | xkbcomp: remove cast which triggers warning on gcc Will need some other way to take care of the warning on MSVC. Signed-off-by: Ran Benita <ran@unusedvar.com> | ||
| fbd0e643 | 2019-12-27 21:51:34 | xkbcomp: make a couple of casts explicit to mark them as checked This acknowledges some "possible loss of data cast" warnings from MSVC. Signed-off-by: Ran Benita <ran@unusedvar.com> | ||
| 40aab05e | 2019-12-27 13:03:20 | build: include config.h manually Previously we included it with an `-include` compiler directive. But that's not portable. And it's better to be explicit anyway. Every .c file should have `include "config.h"` first thing. Signed-off-by: Ran Benita <ran@unusedvar.com> | ||
| 2cca0289 | 2015-11-19 00:44:27 | src/utils: change map_file to not take const string argument map_file() uses PROT_READ, so const seems fitting; however unmap_file calls munmap/free, which do not take const, so an UNCONSTIFY is needed. To avoid the UNCONSTIFY hack, which is likely undefined behavior or some such, just remove the const. Signed-off-by: Ran Benita <ran234@gmail.com> | ||
| a3116f97 | 2014-10-13 18:51:12 | compose/parser: fix segfault when including The keysym cache for the new scanner was not initialized. To avoid such errors also in the future, require passing the priv argument in scanner_init(), instead of initializing it separately. Signed-off-by: Ran Benita <ran234@gmail.com> | ||
| 8a0acf2c | 2014-10-07 23:42:08 | scanner-utils: optimize one-line comments Compose files have a lot of those. Signed-off-by: Ran Benita <ran234@gmail.com> | ||
| d0c6fce2 | 2014-09-20 15:06:13 | parser: use "atom" instead of "sval" in yylval "sval" is already used for "struct sval". Signed-off-by: Ran Benita <ran234@gmail.com> | ||
| cb4bae71 | 2014-06-30 14:52:30 | parser: don't shadow "str" It's a name of a function in scanner-utils.h and also of some parameters. https://bugs.freedesktop.org/show_bug.cgi?id=79898 Reported-by: Bryce Harrington <b.harrington@samsung.com> Signed-off-by: Ran Benita <ran234@gmail.com> | ||
| 28d5f770 | 2014-02-10 20:33:34 | scanner: sort out scanner logging functions First, make the rules and xkb scanners/parsers use the same logging functions instead of rolling their own. Second, use the gcc ##__VA_ARGS extension instead of dealing with C99 stupidity. I hope all relevant compilers support it. Signed-off-by: Ran Benita <ran234@gmail.com> | ||
| 68b03097 | 2014-02-08 17:22:14 | scanner: make line and column unsigned Signed-off-by: Ran Benita <ran234@gmail.com> | ||
| b82a0a86 | 2014-02-07 18:09:30 | scanner: avoid strlen in keyword lookup, we know the len Signed-off-by: Ran Benita <ran234@gmail.com> | ||
| 917c7515 | 2014-01-12 14:37:39 | context: remove mostly useless log wrappers Just use xkb_log directly. Signed-off-by: Ran Benita <ran234@gmail.com> | ||
| ba7530fa | 2013-11-27 13:43:57 | scanner: restore lost DIVIDE token I don't know how this could have happened. Luckily this token is completely useless. Signed-off-by: Ran Benita <ran234@gmail.com> | ||
| dcdd4e10 | 2013-10-14 18:59:53 | Replace ctype.h functions with ascii ones ctype.h is locale-dependent, so using it in our scanners is not optimal. Let's be deterministic with our own simple functions. Signed-off-by: Ran Benita <ran234@gmail.com> | ||
| c35c388b | 2013-10-08 18:35:05 | scanner: remove unnecessary cast 'tok' is already an int now. Signed-off-by: Ran Benita <ran234@gmail.com> | ||
| 409f27d7 | 2013-09-29 00:41:17 | parser: don't use %locations byacc doesn't support this feature. We print the line/col of the last scanned token instead. This is slightly less in case of *parser* errors (not syntax errors), but I couldn't make it point to another line, and this are pretty cryptic anyways. So it's good enough. Also might be a bit faster, but haven't checked. Signed-off-by: Ran Benita <ran234@gmail.com> | ||
| e4c00e90 | 2013-09-29 00:19:32 | parser: don't use enum yytokentype byacc doesn't support this, it just puts out #define's for the tokens. Signed-off-by: Ran Benita <ran234@gmail.com> | ||
| 7caa1af2 | 2013-08-13 14:45:33 | scanner: don't fail over unknown escape sequence This is too strict, and causes symbols/cz to fail parsing. Instead, just emit a warning (not shown by default): xkbcommon: WARNING: cz:75:19: unknown escape sequence in string literal https://bugs.freedesktop.org/show_bug.cgi?id=68056 Reported-By: Gatis Paeglis <gatis.paeglis@digia.com> Signed-off-by: Ran Benita <ran234@gmail.com> | ||
| aa9c9194 | 2013-08-02 14:41:19 | scanner: fix compiler warning src/xkbcomp/scanner.c:158:17: warning: comparison of constant -1 with expression of type 'enum yytokentype' is always true [-Wtautological-constant-out-of-range-compare] if (tok != -1) return tok; ~~~ ^ ~~ Signed-off-by: Ran Benita <ran234@gmail.com> | ||
| e91d2653 | 2013-08-01 23:09:46 | scanner: allow empty key name literals Some keymaps actually have this, like the quartz.xkb which is tested. We need to support these. https://bugs.freedesktop.org/show_bug.cgi?id=67654 Reported-By: Gatis Paeglis <gatis.paeglis@digia.com> Signed-off-by: Ran Benita <ran234@gmail.com> | ||
| 9e801ff7 | 2013-07-21 17:01:20 | ctx: adapt to the len-aware atom functions xkb_atom_intern now takes a len parameter. Turns out though that almost all of our xkb_atom_intern calls are called on string literals, the length of which we know statically. So we add a macro to micro-optimize this case. Signed-off-by: Ran Benita <ran234@gmail.com> | ||
| a392d268 | 2012-08-12 11:40:02 | Replace flex scanner with a hand-written one The scanner is very similar in structure to the one in xkbcomp/rules.c. It avoids copying and has nicer error reporting. It uses gperf to generate a hashtable for the keywords, which gives a nice speed boost (compared to the naive strcasecmp method at least). But since there's hardly a reason to regenerate it every time and require people to install gperf, the output (keywords.c) is added here as well. Here are some stats from test/rulescomp: Before: compiled 1000 keymaps in 4.052939625s ==22063== total heap usage: 101,101 allocs, 101,101 frees, 11,840,834 bytes allocated After: compiled 1000 keymaps in 3.519665434s ==26505== total heap usage: 99,945 allocs, 99,945 frees, 7,033,608 bytes allocated Signed-off-by: Ran Benita <ran234@gmail.com> |
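
Commits d5a91fa9 and 2d111bbe above describe replacing `strtol*` with parsers that work on length-delimited buffers and guard against overflow. The following is a minimal sketch of that idea, not the code from those commits: the function name `parse_dec_u64` and its signature are hypothetical, chosen only for illustration. It accepts digits only (no sign, no whitespace), never reads past `len` bytes, and rejects values that do not fit in `uint64_t`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a libxkbcommon function): parse a run of
 * decimal digits from the first `len` bytes of `s`. The buffer does not
 * need to be NUL-terminated. On success, store the value in `*out` and
 * the number of bytes read in `*consumed`, then return true. Return
 * false if there is no leading digit or the value overflows uint64_t.
 */
static bool
parse_dec_u64(const char *s, size_t len, uint64_t *out, size_t *consumed)
{
    assert(s != NULL && out != NULL && consumed != NULL);

    uint64_t val = 0;
    size_t i = 0;
    /* The bound check `i < len` comes first, so s[len] is never read. */
    for (; i < len && s[i] >= '0' && s[i] <= '9'; i++) {
        const uint64_t digit = (uint64_t) (s[i] - '0');
        /* Check before computing: val * 10 + digit must fit. */
        if (val > (UINT64_MAX - digit) / 10)
            return false;
        val = val * 10 + digit;
    }
    if (i == 0)
        return false; /* no digits at all, e.g. a sign or a space */

    *out = val;
    *consumed = i;
    return true;
}
```

Returning the consumed length lets a scanner advance its position itself, so the last token of a buffer can be a number without requiring a terminator after it.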