kmx git

Commit	Date	Message
82e9293e	2023-10-30T15:28:10	xkbcomp: early detection of invalid encoding
f937c308	2023-10-29T07:31:34	xkbcomp: skip heading UTF-8 encoded BOM (U+FEFF) Leading BOM is legal and is used as a signature — an indication that an otherwise unmarked text file is in UTF-8. See: https://www.unicode.org/faq/utf_bom.html#bom5 for further details.
9d15c6a7	2023-09-26T17:05:14	Show invalid escape sequences It is easier to debug when the message actually displays the offending escape sequence.
ca7aa69c	2023-09-26T17:05:05	Disallow producing NULL character with escape sequences NULL usually terminates the strings; allowing to produce it via escape sequences may lead to undefined behaviour. - Make NULL escape sequences (e.g. `\0` and `\x0`) invalid. - Add corresponding test. - Introduce the new message: XKB_WARNING_INVALID_ESCAPE_SEQUENCE.
c0065c95	2023-09-21T20:06:27	Messages: merge macros with and without message code Previously we had two types of macros for logging: with and without message code. They were intended to be merged afterwards. The idea is to use a special code – `XKB_LOG_MESSAGE_NO_ID = 0` – that should not be displayed. But we would like to avoid checking this special code at run time. This is achieved using macro tricks; they are detailed in the code (see: `PREPEND_MESSAGE_ID`). Now it is also easier to spot the remaining undocumented log entries: just search `XKB_LOG_MESSAGE_NO_ID`.
ef81d04e	2023-09-18T18:17:34	Structured log messages with a message registry Currently there is little structure in the log messages, making it difficult to use them for the following use cases: - A user looking for help about a log message: the user probably uses a search engine, thus the results will depend on the proper indexing of our documentation and the various forums. It relies only on the wording of the message, which may change with time. - A user wants to filter the logs resulting from the use of one of the components of xkbcommon. A typical example would be testing xkeyboard-config against libxkbcommon. It requires the use of a pattern (simple words detection or regex). The issue is that the pattern may become silently out-of-sync with xkbcommon. A common practice (e.g. in compilers) is to assign unique error codes to reference these messages, along with an error index for documentation. Thus this commit implements the following features: - Create a message registry (message-registry.yaml) that defines the log messages produced by xkbcommon. This is a simple YAML file that provides, for each message: - A unique numeric code as a short identifier. It is used in the output message and thus can easily be filtered to spot errors or searched on the internet. It must not change: if the semantics of a message change, it is better to introduce a new message for clarity. - A unique text identifier, meant for two uses: 1. Generate constants dealing with log information in our code base. 2. Generate human-friendly names for the documentation. - A type: currently warning or error. Used to prefix the constants (see above) and for basic classification in documentation. - A short description, used as concise and mandatory documentation. - An optional detailed description. - Optional examples, intended to help the user to fix issues themselves. - Version of xkbcommon in which it was added. For old entries this is often unknown, so they will default to 1.0.0. - Version of xkbcommon in which it was removed (optional) No entry should ever be deleted from this index, even if the message is not used anymore: it ensures we have unique identifiers along the history of xkbcommon, and that users can refer to the documentation even for older versions. - Add the script update-message-registry.py to generate the following files: - messages.h: message code enumeration for the messages currently used in the code base. Currently a private API. - message.registry.md: the error index documentation page. - Modify the logging functions to use structured messages. This is a work in progress.
0b3d9092	2022-03-14T16:44:13	scanner: prefix functions with `scanner_` to avoid symbol conflicts Particularly `eof()` in mingw-w64. Fixes: https://github.com/xkbcommon/libxkbcommon/pull/285 Reported-by: Marko Lindqvist Signed-off-by: Ran Benita <ran@unusedvar.com>
521bb498	2019-12-27T22:08:57	xkbcomp: remove cast which triggers warning on gcc Will need some other way to take care of the warning on MSVC. Signed-off-by: Ran Benita <ran@unusedvar.com>
fbd0e643	2019-12-27T21:51:34	xkbcomp: make a couple of casts explicit to mark them as checked This acknowledges some "possible loss of data cast" warnings from MSVC. Signed-off-by: Ran Benita <ran@unusedvar.com>
40aab05e	2019-12-27T13:03:20	build: include config.h manually Previously we included it with an `-include` compiler directive. But that's not portable. And it's better to be explicit anyway. Every .c file should have `#include "config.h"` first thing. Signed-off-by: Ran Benita <ran@unusedvar.com>
2cca0289	2015-11-19T00:44:27	src/utils: change map_file to not take const string argument map_file() uses PROT_READ, so const seems fitting; however unmap_file calls munmap/free, which do not take const, so an UNCONSTIFY is needed. To avoid the UNCONSTIFY hack, which is likely undefined behavior or some such, just remove the const. Signed-off-by: Ran Benita <ran234@gmail.com>
a3116f97	2014-10-13T18:51:12	compose/parser: fix segfault when including The keysym cache for the new scanner was not initialized. To avoid such errors also in the future, require passing the priv argument in scanner_init(), instead of initializing it separately. Signed-off-by: Ran Benita <ran234@gmail.com>
8a0acf2c	2014-10-07T23:42:08	scanner-utils: optimize one-line comments Compose files have a lot of those. Signed-off-by: Ran Benita <ran234@gmail.com>
d0c6fce2	2014-09-20T15:06:13	parser: use "atom" instead of "sval" in yylval "sval" is already used for "struct sval". Signed-off-by: Ran Benita <ran234@gmail.com>
cb4bae71	2014-06-30T14:52:30	parser: don't shadow "str" It's a name of a function in scanner-utils.h and also of some parameters. https://bugs.freedesktop.org/show_bug.cgi?id=79898 Reported-by: Bryce Harrington <b.harrington@samsung.com> Signed-off-by: Ran Benita <ran234@gmail.com>
28d5f770	2014-02-10T20:33:34	scanner: sort out scanner logging functions First, make the rules and xkb scanners/parsers use the same logging functions instead of rolling their own. Second, use the gcc ##__VA_ARGS__ extension instead of dealing with C99 stupidity. I hope all relevant compilers support it. Signed-off-by: Ran Benita <ran234@gmail.com>
68b03097	2014-02-08T17:22:14	scanner: make line and column unsigned Signed-off-by: Ran Benita <ran234@gmail.com>
b82a0a86	2014-02-07T18:09:30	scanner: avoid strlen in keyword lookup, we know the len Signed-off-by: Ran Benita <ran234@gmail.com>
917c7515	2014-01-12T14:37:39	context: remove mostly useless log wrappers Just use xkb_log directly. Signed-off-by: Ran Benita <ran234@gmail.com>
ba7530fa	2013-11-27T13:43:57	scanner: restore lost DIVIDE token I don't know how this could have happened. Luckily this token is completely useless. Signed-off-by: Ran Benita <ran234@gmail.com>
dcdd4e10	2013-10-14T18:59:53	Replace ctype.h functions with ascii ones ctype.h is locale-dependent, so using it in our scanners is not optimal. Let's be deterministic with our own simple functions. Signed-off-by: Ran Benita <ran234@gmail.com>
c35c388b	2013-10-08T18:35:05	scanner: remove unnecessary cast 'tok' is already an int now. Signed-off-by: Ran Benita <ran234@gmail.com>
409f27d7	2013-09-29T00:41:17	parser: don't use %locations byacc doesn't support this feature. We print the line/col of the last scanned token instead. This is slightly less in case of parser errors (not syntax errors), but I couldn't make it point to another line, and these are pretty cryptic anyways. So it's good enough. Also might be a bit faster, but haven't checked. Signed-off-by: Ran Benita <ran234@gmail.com>
e4c00e90	2013-09-29T00:19:32	parser: don't use enum yytokentype byacc doesn't support this, it just puts out #define's for the tokens. Signed-off-by: Ran Benita <ran234@gmail.com>
7caa1af2	2013-08-13T14:45:33	scanner: don't fail over unknown escape sequence This is too strict, and causes symbols/cz to fail parsing. Instead, just emit a warning (not shown by default): xkbcommon: WARNING: cz:75:19: unknown escape sequence in string literal https://bugs.freedesktop.org/show_bug.cgi?id=68056 Reported-By: Gatis Paeglis <gatis.paeglis@digia.com> Signed-off-by: Ran Benita <ran234@gmail.com>
aa9c9194	2013-08-02T14:41:19	scanner: fix compiler warning src/xkbcomp/scanner.c:158:17: warning: comparison of constant -1 with expression of type 'enum yytokentype' is always true [-Wtautological-constant-out-of-range-compare] if (tok != -1) return tok; ~~~ ^ ~~ Signed-off-by: Ran Benita <ran234@gmail.com>
e91d2653	2013-08-01T23:09:46	scanner: allow empty key name literals Some keymaps actually have this, like the quartz.xkb which is tested. We need to support these. https://bugs.freedesktop.org/show_bug.cgi?id=67654 Reported-By: Gatis Paeglis <gatis.paeglis@digia.com> Signed-off-by: Ran Benita <ran234@gmail.com>
9e801ff7	2013-07-21T17:01:20	ctx: adapt to the len-aware atom functions xkb_atom_intern now takes a len parameter. Turns out though that almost all of our xkb_atom_intern calls are called on string literals, the length of which we know statically. So we add a macro to micro-optimize this case. Signed-off-by: Ran Benita <ran234@gmail.com>
a392d268	2012-08-12T11:40:02	Replace flex scanner with a hand-written one The scanner is very similar in structure to the one in xkbcomp/rules.c. It avoids copying and has nicer error reporting. It uses gperf to generate a hashtable for the keywords, which gives a nice speed boost (compared to the naive strcasecmp method at least). But since there's hardly a reason to regenerate it every time and require people to install gperf, the output (keywords.c) is added here as well. Here are some stats from test/rulescomp: Before: compiled 1000 keymaps in 4.052939625s ==22063== total heap usage: 101,101 allocs, 101,101 frees, 11,840,834 bytes allocated After: compiled 1000 keymaps in 3.519665434s ==26505== total heap usage: 99,945 allocs, 99,945 frees, 7,033,608 bytes allocated Signed-off-by: Ran Benita <ran234@gmail.com>
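
Two of the 2023 entries above (f937c308 and ca7aa69c) describe scanner behaviour that is easy to illustrate: skipping a leading UTF-8 BOM, and rejecting escape sequences that would produce a NUL character. The following is only a minimal sketch of those two ideas, not the code from scanner.c. The function names `skip_utf8_bom` and `parse_octal_escape` are hypothetical, and restricting the escape to at most three octal digits is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from libxkbcommon): return how many bytes to
 * skip at the start of `buf`. A leading U+FEFF encoded in UTF-8 is the
 * three bytes EF BB BF. */
static size_t
skip_utf8_bom(const char *buf, size_t len)
{
    static const char bom[3] = { '\xef', '\xbb', '\xbf' };

    assert(buf != NULL || len == 0);
    if (len >= sizeof(bom) && memcmp(buf, bom, sizeof(bom)) == 0)
        return sizeof(bom);
    return 0;
}

/* Hypothetical helper (not from libxkbcommon): parse the digits of an
 * octal escape starting at s[*pos] (the backslash is already consumed).
 * At most three digits are read (an assumption of this sketch).
 * On return *pos points past the digits. The escape is rejected when it
 * has no digits, does not fit in one byte, or evaluates to 0: a NUL byte
 * would silently terminate the C string being built. */
static bool
parse_octal_escape(const char *s, size_t len, size_t *pos, unsigned char *out)
{
    unsigned value = 0;
    int digits = 0;
    size_t i = *pos;

    while (i < len && digits < 3 && s[i] >= '0' && s[i] <= '7') {
        value = value * 8 + (unsigned) (s[i] - '0');
        i++;
        digits++;
    }
    *pos = i;

    if (digits == 0 || value == 0 || value > 0xff)
        return false;

    *out = (unsigned char) value;
    return true;
}
```

A caller would typically advance its read position by the result of `skip_utf8_bom` once, before scanning the first token, and would report a warning rather than abort when `parse_octal_escape` returns false, in the spirit of the earlier entry 7caa1af2.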

kc3-lang/libxkbcommon/src/xkbcomp/scanner.c
