parser.y

Log

Commit	Date	Message
e6aec067	2025-04-29 17:14:01	build: drop support for byacc It doesn't support `%define parse.error detailed` which we want to use. If needed, we can probably bring back support using some macro hackery. Signed-off-by: Ran Benita <ran@unusedvar.com>
66f71890	2025-03-31 08:01:29	symbols: Enable writing keysyms list as UTF-8 strings Each Unicode code point of the string will be translated to their respective keysym, if possible. An empty string denotes `NoSymbol`. When such conversion is not possible, this will raise a syntax error. This introduces the following syntax: ```c // Empty string = `NoSymbol` key <1> {[""]}; // NoSymbol // Single code point = single keysym key <2> {["é"]}; // eacute // String = translate each code point to their respective keysym key <3> {["sßξك🎺"]}; // {s, ssharp, Greek_xi, Arabic_kaf, U1F3BA} // Mix string and keysyms key <4> {[{"ξ", Greek_kappa, "β"}]}; // { Greek_xi, Greek_kappa, Greek_beta} ``` It can also be used wherever a keysym is required, e.g. in `interpret` and `modifier_map` statements. In these cases a single keysym is expected, so the string should contain exactly one Unicode code point.
bc3e464b	2025-04-09 12:35:05	keysyms: Fix Unicode handling - `xkb_utf32_to_keysym`: Allow [Unicode noncharacters]. There is no requirement to drop them and this would be the only function of our API doing so. From the Unicode Standard 16.0, section 23.7 “Noncharacters”: > Applications are free to use any of these noncharacter code points > internally. They have no standard interpretation when exchanged > outside the context of internal use. However, they are not illegal > in interchange, nor does their presence cause Unicode text to be > ill-formed. > If a noncharacter is received in open interchange, an application is > not required to interpret it in any way. It is good practice, > however, to recognize it as a noncharacter and to take appropriate > action, such as replacing it with `U+FFFD` REPLACEMENT CHARACTER, > to indicate the problem in the text. The key part is: > an application is not required to interpret it in any way Since we handle the reverse conversion with `xkb_keysym_to_utf32` just fine, I do not see a good motivation to keep this asymmetry. This is the only function with a special case for these code points. - `xkb_keysym_from_name`: - Unicode format `UNNNN`: allow control characters C0 and C1 and use `xkb_utf32_to_keysym` for the conversion when `NNNN < 0x100`, for backward compatibility. - Numeric hexadecimal format `0xNNNN`: unchanged. Contrary to the Unicode format, it does not normalize any keysym values in order to enable roundtrip with `xkb_keysym_get_name`. Also added tests to ensure various properties and consistency. Note about surrogates: they are valid code points but invalid Unicode scalar values, i.e. they cannot be encoded in any Unicode encoding form (UTF-8, UTF-16, UTF-32). So their corresponding Unicode keysyms are valid, but: - cannot be used as input of `xkb_keysym_to_utf32` nor `xkb_keysym_to_utf8` - cannot result as output of `xkb_utf32_to_keysym`. Otherwise they are valid e.g. in the Unicode keysym notation. 
[Unicode noncharacters]: https://en.wikipedia.org/wiki/Universal_Character_Set_characters#Noncharacters
5a32b779	2025-04-06 06:16:41	logging: Handle NULL map name Display “(unnamed map)” instead of “(null)”.
36442baa	2025-04-03 15:01:46	xkbcomp: Support multiple actions in interpret Before this commit we supported multiple actions per level, but not in interpret statements. Let’s fix this asymmetry, so we can equivalently assign all action sets either implicitly or explicitly.
06394afc	2025-04-03 08:49:12	xkbcomp: Minor parser refactor for keysyms and actions
39c1bb36	2025-03-29 17:47:31	xkbcomp: Fix static_assert syntax
8594adc4	2025-03-31 13:52:36	doc: Mention that `alternate` merge mode is not supported
36bb4fe3	2025-04-02 19:10:02	xkbcomp: Minor renaming Use the same case for `KeySym` in the parser.
44480f7c	2025-04-01 08:28:02	xkbcomp: Enable lists of keysyms and actions {} and {a} Motivations: - Follow the principle of least astonishment; - Ensure consistency; - Enhance the use of custom defaults; - Facilitate the tests. There is some ambiguity because we use `{}` to denote both an empty list of keysyms and an empty list of actions. But as soon as we get a keysym or an action, we know whether it is a `MultiKeySymList` or a `MultiActionList`. So we just count the `{}` at the beginning using `NoSymbolOrActionList`, then replace it by the relevant count of `NoSymbol` or `NoAction()` once the ambiguity is solved. If not, this is a list of empties of some type: we drop those empties and delegate the type resolution using `ExprEmptyList()`.
2e0245f8	2025-04-02 10:45:44	xkbcomp: Enable more empty lists - Empty `interpret` - Empty key `type` - Empty `indicator` Motivations: - Follow the principle of least astonishment; - Ensure consistency; - Enhance the use of custom defaults; - Facilitate the tests.
8ba5c453	2025-03-30 10:07:10	xkbcomp: Use section reference as default section name Before this commit the following keymap: ```c xkb_keymap { xkb_keycode {}; }; ``` would result in (boilerplate removed): ```c xkb_keymap { xkb_keycode "(unnamed)" {}; }; ``` This is both useless and wasting allocation: section names are optional, so we should just remove this default name altogether and keep it undefined, as in the original keymap. The situation is a bit different if there is an include, as for keymaps created from RMLVO names. Before this commit, the following keymap: ```c xkb_keymap { xkb_keycode { include "evdev+aliases(qwerty)" }; }; ``` would result in (boilerplate removed): ```c xkb_keymap { xkb_keycode "(unnamed)" { … }; }; ``` With this commit we now follow the Xorg xkbcomp style by using the section reference (the include string) as the default section name. So the previous example would now result in: ```c xkb_keymap { xkb_keycode "evdev_aliases(qwerty)" { … }; }; ``` which is useful to give a hint of the original include. Note that if the original section had a name, it would preserve it: ```c xkb_keymap { xkb_keycode "test" { include "evdev+aliases(qwerty)" }; }; ``` would compile to: ```c xkb_keymap { xkb_keycode "test" { … }; }; ```
3150bca8	2025-03-30 09:54:02	xkbcomp: Make all components optional We already accept empty components, such as: `xkb_compat {};`. Let’s accept missing components as well, so that we can reduce the boilerplate in our tests. Note that we will still explicitly serialize empty components for compatibility with previous xkbcommon versions and Xorg xkbcomp.
f3a4eeaa	2025-03-26 16:04:39	symbols: Improve keysym parsing
e5401b07	2025-03-26 16:02:58	symbols: Improve Modmap parsing Parse, don’t validate: ensure at parsing that `modifier_map` definitions use a list of keys and keysyms. This makes it possible to remove the redundant `ExprResolveKeySym` and have keysym parsing handled exclusively in `parser.y`.
70d11abd	2025-03-26 07:38:05	messages: Add file encoding and invalid syntax entries Added: - `XKB_ERROR_INVALID_FILE_ENCODING` - `XKB_ERROR_INVALID_RULES_SYNTAX` - `XKB_ERROR_INVALID_COMPOSE_SYNTAX` Changed: - `XKB_ERROR_INVALID_SYNTAX` renamed to `XKB_ERROR_INVALID_XKB_SYNTAX`.
3a0b77f0	2025-02-12 16:41:09	xkbcomp: Fix parser headers
9d7eb849	2025-02-06 15:25:03	xkbcomp/ast: combine expr_value_type into stmt_type This field is a funky attempt at type inference, or perhaps some optimization? Anyway, after careful examination I conclude it serves no purpose except specifying the type of a literal (string/integer/float/boolean/keyname) when `STMT_EXPR_VALUE` (i.e. literal). Remove it and replace `STMT_EXPR_VALUE` with specific statement types for each literal type. Signed-off-by: Ran Benita <ran@unusedvar.com>
d9fc01b3	2025-02-06 15:12:53	xkbcomp/ast: combine expr_op_type into stmt_type It's better to have a single AST type enum. Signed-off-by: Ran Benita <ran@unusedvar.com>
f4e95280	2025-02-02 22:29:05	xkbcomp/scanner: avoid unneeded strdup of IDENT tokens The allocation is immediately discarded, either turned into a keysym or an atom. So use an sval slice into the input string instead of strdup'ing. memusage ./release/bench-compile-keymap --iter=1000 --layout us,de --variant ,neo Before: Memory usage summary: heap total: 534063576, heap peak: 581022, stack peak: 18848 total calls total memory failed calls malloc\| 11240525 291897104 0 realloc\| 1447657 192307328 0 (nomove:37629, dec:0, free:0) calloc\| 430573 49859144 0 free\| 13993903 534063576 After: Memory usage summary: heap total: 506839909, heap peak: 581022, stack peak: 18960 total calls total memory failed calls malloc\| 8016419 264673437 0 realloc\| 1447657 192307328 0 (nomove:37278, dec:0, free:0) calloc\| 430573 49859144 0 free\| 10769797 506839909 Signed-off-by: Ran Benita <ran@unusedvar.com>
e120807b	2025-01-29 15:35:22	Update license notices to SPDX short identifiers + update LICENSE Fix #628. Signed-off-by: Ran Benita <ran@unusedvar.com>
1f436703	2025-01-24 23:04:43	xkbcomp: rework KeysymList AST representation This is similar to the previous commit, for keysym lists. Signed-off-by: Ran Benita <ran@unusedvar.com>
39689867	2025-01-24 22:43:45	xkbcomp: rework ActionList AST representation The AST is heavily based on intrusive lists for representing lists, but actions are an exception, instead using darray. I don't see any reason for this; it ends up allocating more, and we don't especially need a flat array for this. Change it to use the familiar linked-list style. Signed-off-by: Ran Benita <ran@unusedvar.com>
4ac22263	2025-01-16 23:22:40	keysyms: Check clashes between keysym names and keywords Due to how our parser is implemented, keysym names that are also valid keywords require special handling. Added a check for these clashes in the keysym generator. The only current clash, `section`, is already handled. Note that it means that e.g. `section`, `Section` and `sEcTiOn` all parse to the same keysym. This side effect is fine here, because currently there is no other keysym that clashes with any of the possible case variations of `section`. But in order to be extra cautious, we now test those clashes too. Hopefully we will never have a clash again, but while it is unlikely that we modify the keywords, the keysyms are not a frozen set.
b1e1aae6	2025-01-23 15:20:44	xkbcomp: Fix memory leak when extra content after keymap It triggers with e.g.: ``` xkb_keymap { xkb_keycodes { }; }; }; // erroneous ```
ec2915fe	2025-01-22 17:18:21	symbols: Fix a possible null pointer dereference Introduce a new Expression type, `EXPR_EMPTY_LIST`, to avoid the ambiguity between action and keysym empty lists. Two alternatives were rejected to keep the semantics clear: - Using `EXPR_KEYSYM_LIST`: because we would end up accepting an empty keysym list while processing actions. - Using NULL: conveys no info and is hazardous.
26069b76	2025-01-21 10:48:28	xkbcomp/parser: silence a set but unused warning ``` libxkbcommon.so.0.7.0.p/parser.c:1632:9: warning: variable '_xkbcommon_nerrs' set but not used [-Wunused-but-set-variable] 1632 \| int yynerrs = 0; ``` Signed-off-by: Ran Benita <ran@unusedvar.com>
bf03b4b5	2024-12-19 16:23:05	symbols: Parse empty key The following syntax does not parse in xkbcommon, but it does in xkbcomp: ``` xkb_symbols "x" { key <AD01> { }; }; ``` While the usefulness of such a statement is debatable, the fact that it does parse in xkbcomp and that tools may generate such a keymap entry makes it relevant to handle.
fdf2c525	2024-10-08 19:43:30	actions: Add support for multiple actions per level This makes 1 keysym == 1 action hold also for multiple keysyms per level. The motivations for this new feature are: - Make multiple keysyms per level more intuitive. - Explore how to fix the issue with shortcuts in multi-layout settings (see the xkeyboard-config issue[^1]). The idea is to use e.g.: ```c key <LCTL> { symbols[1] = [ {Control_L, ISO_First_Group } ], actions[1] = [ {SetMods(modifiers=Control), SetGroup(group=-4) } ] }; ``` in order to switch temporarily to a reference layout in order to get the same shortcuts on every layout. When no action is specified, `interpret` statements are used to find an action corresponding to each keysym, as expected. For an interpretation matching Any keysym, we may get the same interpretation for multiple keysyms. This may result in unwanted duplicate actions. So set this interpretation only if no previous keysym was matched with this interpret at this level, else set the default interpretation. For now, at most one action of each of the following categories is allowed per level: - modifier actions: `SetMods`, `LatchMods`, `LockMods`; - group actions: `SetGroup`, `LatchGroup`, `LockGroup`. Some examples: - `SetMods` + `SetGroup`: ok - `SetMods` + `SetMods`: error - `SetMods` + `LockMods`: error - `SetMods` + `LockGroup`: ok [^1]: https://gitlab.freedesktop.org/xkeyboard-config/xkeyboard-config/-/issues/416
31c6d866	2024-10-08 18:39:00	symbols: Min. 2 keysyms in level list Do not allow `{ a }` when a single `a` suffices.
929a485f	2024-10-08 12:52:53	symbols: Fix too liberal parsing of keysyms lists Currently we are too liberal when parsing symbols lists: e.g. `[{a,{b}}]` is parsed as `[{a,b}]` but it should be rejected.
ba896935	2024-09-24 21:28:12	logging: Make scanner_warn use a message ID
c8bd57dd	2024-09-24 21:20:41	logging: Make scanner_err use a message ID
44df6eee	2024-09-23 07:27:48	Add new warnings for deprecated keysyms Add 2 new warnings: - Deprecated keysym name (typo, historical alias, etc.); - Deprecated keysym (all names and forms). Guard deprecated keysym tests with verbosity level ≥2, so they are run only when actually needed.
efdb05d1	2024-01-27 23:00:28	parser: Do not allow the empty symbol declaration An empty element is allowed in SymbolsBody definition, so the following keymap is grammatically correct. ``` xkb_keymap { ... xkb_symbols "sym" { key <SPC> {, [Space] }; }; }; ``` However, the current parser crashes with the keymap due to null pointer access. This change fixes it by changing the parser not to allow it.
a83d745b	2023-09-21 20:06:27	Messages: add new messages to registry This commit is another step to identify and document the maximum number of logging messages. Bulk changes: - Rename `conflicting-key-type` to `conflicting-key-type-merging-groups`. Giving more context in the name allows us to introduce `conflicting-key-type-definitions` later. - Add conflicting-key-type-definitions - Add conflicting-key-type-map-entry - Add undeclared-modifiers-in-key-type Also improve the log messages. - Add conflicting-key-type-preserve-entries - Use XKB_ERROR_UNSUPPORTED_MODIFIER_MASK - Add illegal-key-type-preserve-result - Add conflicting-key-type-level-names - Add duplicate-entry - Add unsupported-symbols-field - Add missing-symbols-group-name-index - Use XKB_ERROR_WRONG_FIELD_TYPE - Add conflicting-key-name - Use XKB_WARNING_UNDEFINED_KEYCODE - Add illegal-keycode-alias - Add unsupported-geometry-section - Add missing-default-section - Add XKB_LOG_MESSAGE_NO_ID - Rename log_vrb_with_code to log_vrb - Use ERROR_WRONG_FIELD_TYPE & ERROR_INVALID_SYNTAX - Add unknown-identifier - Add invalid-expression-type - Add invalid-operation + fixes - Add unknown-operator - Rename ERROR_UNKNOWN_IDENTIFIER to ERROR_INVALID_IDENTIFIER - Add undeclared-virtual-modifier - Add expected-array-entry - Add invalid-include-statement - Add included-file-not-found - Add allocation-error - Add invalid-included-file - Process symbols.c - Add invalid-value - Add invalid-real-modifier - Add unknown-field - Add wrong-scope - Add invalid-modmap-entry - Add wrong-statement-type - Add conflicting-key-symbols-entry - Add invalid-set-default-statement
eafd3ace	2023-09-18 18:17:39	Add a new warning for numeric keysyms Usually it is better to use the corresponding human-friendly keysym names. If there is none, then the keysym is most probably not supported in the ecosystem. The only use case I see is similar to the PUA in Unicode (see: https://en.wikipedia.org/wiki/Private_Use_Areas). I am not aware of examples of this kind of use.
ef81d04e	2023-09-18 18:17:34	Structured log messages with a message registry Currently there is little structure in the log messages, making it difficult to use them for the following use cases: - A user looking for help about a log message: the user probably uses a search engine, thus the results will depend on the proper indexing of our documentation and the various forums. It relies only on the wording of the message, which may change with time. - A user wants to filter the logs resulting from the use of one of the components of xkbcommon. A typical example would be testing xkeyboard-config against libxkbcommon. It requires the use of a pattern (simple words detection or regex). The issue is that the pattern may become silently out-of-sync with xkbcommon. A common practice (e.g. in compilers) is to assign unique error codes to reference these messages, along with an error index for documentation. Thus this commit implements the following features: - Create a message registry (message-registry.yaml) that defines the log messages produced by xkbcommon. This is a simple YAML file that provides, for each message: - A unique numeric code as a short identifier. It is used in the output message and thus can easily be filtered to spot errors or searched for on the internet. It must not change: if the semantics of a message change, it is better to introduce a new message for clarity. - A unique text identifier, meant for two uses: 1. Generate constants dealing with log information in our code base. 2. Generate human-friendly names for the documentation. - A type: currently warning or error. Used to prefix the constants (see above) and for basic classification in documentation. - A short description, used as concise and mandatory documentation. - An optional detailed description. - Optional examples, intended to help users fix issues themselves. - Version of xkbcommon in which it was added. For old entries this is often unknown, so they will default to 1.0.0. 
- Version of xkbcommon in which it was removed (optional) No entry should ever be deleted from this index, even if the message is not used anymore: it ensures we have unique identifiers along the history of xkbcommon, and that users can refer to the documentation even for older versions. - Add the script update-message-registry.py to generate the following files: - messages.h: message code enumeration for the messages currently used in the code base. Currently a private API. - message.registry.md: the error index documentation page. - Modify the logging functions to use structured messages. This is a work in progress.
e7f02d32	2023-08-05 15:29:36	parser: change deprecated `%pure-parser` to `%define api.pure` (#370) This is now supported by byacc since version 2.0 20230516
0da68bc6	2023-07-04 09:23:24	Simplify parsing of numeric keysyms in parser.y In `parser.y`, a numeric keysym is parsed by formatting it in its hexadecimal form then parsed as a keysym name. This is convoluted. Fixed by checking directly the upper bound.
afdc9cee	2020-10-19 10:49:37	xkbcomp: where a keysym cannot be resolved, set it to NoSymbol Where resolve_keysym fails we warn but use the otherwise uninitialized variable as our keysym. That later ends up in the keymap as random garbage hex value. Simplest test case, set this in the 'us' keymap: key <TLDE> { [ xyz ] }; And without this patch we get random garbage: ./build/xkbcli-compile-keymap --layout us \| grep TLDE: key <TLDE> { [ 0x018a5cf0 ] }; With this patch, we now get NoSymbol: ./build/xkbcli-compile-keymap --layout us \| grep TLDE: key <TLDE> { [ NoSymbol ] };
69713ce3	2020-09-11 05:06:23	parser: fix another format string for int64_t (#191)
823708b7	2019-12-27 14:51:31	parser: fix format string for int64_t Signed-off-by: Ran Benita <ran@unusedvar.com>
6ca1a0c9	2019-12-27 14:17:55	parser: use int64_t for all numbers Don't use int which can have different size on different machines. Also avoid some warnings from MSVC: xkbcomp/parser.y(760): warning C4244: '=': conversion from 'int64_t' to 'int', possible loss of data xkbcomp/parser.y(761): warning C4244: '=': conversion from 'int64_t' to 'int', possible loss of data xkbcomp/parser.y(767): warning C4244: '=': conversion from 'int64_t' to 'int', possible loss of data Signed-off-by: Ran Benita <ran@unusedvar.com>
40aab05e	2019-12-27 13:03:20	build: include config.h manually Previously we included it with an `-include` compiler directive. But that's not portable. And it's better to be explicit anyway. Every .c file should have `include "config.h"` first thing. Signed-off-by: Ran Benita <ran@unusedvar.com>
a237f4f6	2019-12-14 13:44:33	parser: fix the remaining pointer chasing Fix the TODO added in 7c42945. Signed-off-by: Ran Benita <ran@unusedvar.com>
7c42945e	2019-11-13 22:41:38	parser: fix quadratic pointer chasing In the AST, lists (e.g. the list of statements in a file) are kept in singly-linked lists -- each AST node has a `next` pointer available for this purpose. Previously, a node was added to the list by starting from the head, chasing to the last, and appending. So creating a list of length N would take ~N^2/2 pointer dereferences. Now, we always (temporarily) keep the last as well, so appending is O(1) instead of O(N). Given a keymap xkb_keymap { xkb_keycodes { minimum = 8; minimum = 8; minimum = 8; minimum = 8; minimum = 8; [... repeated N times ...] }; xkb_types {}; xkb_compat {}; xkb_symbols {}; }; The compilation times are N \| Before \| After --------\|----------\|------- 10,000 \| 0.407s \| 0.006s 20,000 \| 1.851s \| 0.015s 30,000 \| 5.737s \| 0.021s 40,000 \| 12.759s \| 0.023s 50,000 \| 21.489s \| 0.035s 60,000 \| 40.473s \| 0.041s 70,000 \| 53.336s \| 0.039s 80,000 \| 72.485s \| 0.044s 90,000 \| 94.703s \| 0.048s 100,000 \| 118.390s \| 0.057s Another option is to ditch the linked lists and use arrays instead. I got it to work, but it's more involved and allocation heavy so turns out to be worse without further optimizations. Signed-off-by: Ran Benita <ran@unusedvar.com>
f9b95c06	2019-11-13 23:37:47	parser: remove an unneeded check Signed-off-by: Ran Benita <ran@unusedvar.com>
322cd856	2019-11-12 20:34:31	parser: fix merge mode only applied to first vmod in a virtual_modifiers statement Given augment virtual_modifiers NumLock,Alt,LevelThree Previously it was expanded (directly in the parser) to augment virtual_modifiers NumLock; virtual_modifiers Alt; virtual_modifiers LevelThree; Now it expands to augment virtual_modifiers NumLock; augment virtual_modifiers Alt; augment virtual_modifiers LevelThree; Signed-off-by: Ran Benita <ran@unusedvar.com>
400cc849	2019-11-12 20:04:13	ast: use a separate expr struct for action list Currently it's under UnaryExpr, which just doesn't make sense. Signed-off-by: Ran Benita <ran@unusedvar.com>
2af474e8	2019-11-02 13:31:44	parser: get rid of "stealing" atoms This requires (well, at least implemented by) casting away `const` which is undefined behavior, and clang started to warn about it. The micro optimization didn't save too many allocations, anyway. Signed-off-by: Ran Benita <ran@unusedvar.com>
a8ea7a1d	2017-06-26 16:45:16	parser: Don't set more maps when we don't have any If the scanner indicates that we might have something which looks like a map, but the parser in fact fails to create that map, we will try to access the map regardless. Stop doing that. testcase: 'xkb_keymap {' -> '#kb_keymap' Signed-off-by: Daniel Stone <daniels@collabora.com>
917636b1	2018-03-11 17:07:06	xkbcomp: fix crash when parsing an xkb_geometry section xkb_geometry sections are ignored; previously this was done by returning NULL for the section's XkbFile, however some sections of the code do not expect this. Instead, create an XkbFile for it; it will never be processed and is discarded later. Caught with the afl fuzzer. Signed-off-by: Ran Benita <ran234@gmail.com>
e3cacae7	2018-03-10 23:32:12	xkbcomp: fix crashes in the parser when geometry tokens appear In the XKB format, floats and various keywords can only be used in the xkb_geometry section. xkbcommon removed support for xkb_geometry, but still parses it for backward compatibility. As part of ignoring it, the float AST node and various keywords were removed, and instead NULL was returned by their parsing actions. However, the rest of the code does not handle NULLs, and so when they appear crashes usually ensue. To fix this, restore the float AST node and the ignored keywords. None of the evaluating code expects them, so nice errors are displayed. Caught with the afl fuzzer. Signed-off-by: Ran Benita <ran234@gmail.com>
993f4837	2017-07-31 18:16:37	build: fix out-of-tree build The change in d44ba48 removed -I$(top_builddir)/src/xkbcomp, but this is needed in order to find the generated parser.h file which is put in the build dir. I also added -I$(top_builddir)/src in order to match the meson behavior. Fixes https://github.com/xkbcommon/libxkbcommon/issues/50 Signed-off-by: Ran Benita <ran234@gmail.com>
2e5530ad	2014-10-16 18:51:51	parser: bring back warning about includes of files with no default Using the same format as xkbcomp. Signed-off-by: Ran Benita <ran234@gmail.com>
d0c6fce2	2014-09-20 15:06:13	parser: use "atom" instead of "sval" in yylval "sval" is already used for "struct sval". Signed-off-by: Ran Benita <ran234@gmail.com>
37cf20c9	2014-07-26 22:49:30	parser: silence bison "unused value" warnings Previous commit triggered these for some reason: /home/ran/src/libxkbcommon/src/xkbcomp/parser.y:555.25-33: warning: unused value: $1 [-Wother] CoordList : CoordList COMMA Coord ^^^^^^^^^ Signed-off-by: Ran Benita <ran234@gmail.com>
7ec00933	2014-07-26 22:34:05	parser: don't leak AST nodes for discarded symbols If the parser has symbols on the stack, and then enters an error, it discards the symbols and fails. But their actions which allocate AST nodes had already run. So we must free these to avoid leaks. We use %destructor declarations, see http://www.gnu.org/software/bison/manual/html_node/Destructor-Decl.html Note: byacc only supports %destructor when compiled with --enable-btyacc. Also, it doesn't support using the parse-param in the destructor. So we might revert this commit before the next release, or forget about byacc. https://github.com/xkbcommon/libxkbcommon/issues/8 Signed-off-by: Ran Benita <ran234@gmail.com>
cb4bae71	2014-06-30 14:52:30	parser: don't shadow "str" It's a name of a function in scanner-utils.h and also of some parameters. https://bugs.freedesktop.org/show_bug.cgi?id=79898 Reported-by: Bryce Harrington <b.harrington@samsung.com> Signed-off-by: Ran Benita <ran234@gmail.com>
2e561c3f	2014-04-30 08:57:16	parser: show the keysym in "unrecognized keysym" messages Signed-off-by: Ran Benita <ran234@gmail.com>
1b2bb204	2014-02-13 23:57:22	ast: cast to ParseCommon explicitly instead of using ->common Some tools were getting mighty confused with what we were doing. Signed-off-by: Ran Benita <ran234@gmail.com>
28d5f770	2014-02-10 20:33:34	scanner: sort out scanner logging functions First, make the rules and xkb scanners/parsers use the same logging functions instead of rolling their own. Second, use the gcc ##__VA_ARGS__ extension instead of dealing with C99 stupidity. I hope all relevant compilers support it. Signed-off-by: Ran Benita <ran234@gmail.com>
16aab829	2014-02-09 23:21:19	ast: remove unneeded 'ctx' param to XkbFileCreate Signed-off-by: Ran Benita <ran234@gmail.com>
5547a82f	2014-02-07 21:12:53	parser: fix unrecognized keysym handling Integer may be negative, so also need to test >= 0. Also, $$ was left uninitialized if the keysym wasn't recognized. Signed-off-by: Ran Benita <ran234@gmail.com>
101720a2	2014-01-12 13:18:39	parser: shut up some 'may be used uninitialized' warnings Signed-off-by: Ran Benita <ran234@gmail.com>
b63fa3b1	2013-12-01 13:32:51	expr: make Expr creation naming and file location consistent Signed-off-by: Ran Benita <ran234@gmail.com>
972395b8	2013-12-01 12:08:47	expr: split expression types and allocate them separately Currently, we have one ExprDef type, which contains a tagged union with the value of all expression types. Turns out, this union is quite wasteful memory-wise. Instead, create separate types for all expressions (e.g. ExprBinary, ExprInteger) which embed the common fields (ExprCommon), and malloc them per their size; ExprDef then becomes a union of all these types, but is just used as a generic pointer. [Instead of making ExprDef a union, another option is to use ExprCommon as the generic pointer type and then do up-castings, like we do with ParseCommon. But this makes the code much uglier.] The diff is mostly straightforward mechanical adaptations. It could have been much smaller with the help of C11 anonymous structs (which were previously a gnu extension). This would have saved all of the 'op' -> 'expr->op', etc changes. But if we can be a bit more portable for a little effort, we should. Before (./test/rulescomp, x86 32 bit, -O2): ==12974== total heap usage: 145,217 allocs, 145,217 frees, 10,476,238 bytes allocated After: ==11145== total heap usage: 145,217 allocs, 145,217 frees, 8,270,358 bytes allocated Signed-off-by: Ran Benita <ran234@gmail.com>
068016e4	2013-12-01 10:45:52	parser, symbols: drop unnecessary casts It's cast into ExprDef and then cast back for no reason. Signed-off-by: Ran Benita <ran234@gmail.com>
fd98d64b	2013-11-30 23:29:58	parser: remove 'uval' yylval type We don't care about DoodadType. Signed-off-by: Ran Benita <ran234@gmail.com>
c24b6420	2013-11-30 23:24:18	expr: add constructor for boolean expressions Also add a 'bool set' to the ExprDef union, instead of using 'ival' as a bool. Signed-off-by: Ran Benita <ran234@gmail.com>
c5d85938	2013-11-30 23:12:45	expr: add constructors for more expression types This makes the parser a bit more declarative. But really it might make error handling easier. Signed-off-by: Ran Benita <ran234@gmail.com>
dbd8b1ef	2013-11-30 22:25:39	expr: add 'ident' value to ExprDef union This distinguishes between an identifier expression and a string expression in the union. Signed-off-by: Ran Benita <ran234@gmail.com>
9dc5b8cb	2013-11-27 13:49:13	Resolve keysyms early in parser Instead of having the parser passing strings to the AST, and symbols/compat etc. resolving them themselves. This simplifies the code a bit, and makes it possible to print where exactly in the file the bad keysym originates from. The previous lazy approach had an advantage of not needlessly resolving keysyms from unrelated maps. However, I think reporting these errors in any map is better, and the parser is also a bit smarter than old xkbcomp and doesn't parse many useless maps. So there's no discernible speed/memory difference with this change. Signed-off-by: Ran Benita <ran234@gmail.com>
8e14bff0	2013-09-29 01:41:52	parser: add some notes about byacc working We now also work with byacc (version tested: 20130925) which some people prefer, perhaps due to its license (public domain) or performance (haven't compared). When using byacc, currently the following warning comes up: src/xkbcomp/parser.c:954:14: warning: declaration shadows a variable in the global scope [-Wshadow] YYSTYPE yylval; ^ src/xkbcomp/parser.c:37:20: note: expanded from macro 'yylval' #define yylval _xkbcommon_lval ^ ./src/xkbcomp/parser.h:96:16: note: previous declaration is here extern YYSTYPE _xkbcommon_lval; This is due to a bug in byacc - it shouldn't output that extern line in %pure-parser mode. So the warning stays. Signed-off-by: Ran Benita <ran234@gmail.com>
8dcb30e5	2013-09-29 01:29:47	parser: add a workaround for byacc Unlike bison, byacc outputs its own parser code after our own parser.y code, which includes the #undef. So this fix is needed for the 'scanner' -> 'param->scanner' translation to work in the parser.c code generated by byacc. Signed-off-by: Ran Benita <ran234@gmail.com>
409f27d7	2013-09-29 00:41:17	parser: don't use %locations byacc doesn't support this feature. We print the line/col of the last scanned token instead. This is slightly less accurate in the case of parser errors (not syntax errors), but I couldn't make it point to another line, and these are pretty cryptic anyway. So it's good enough. Also might be a bit faster, but haven't checked. Signed-off-by: Ran Benita <ran234@gmail.com>
13da6da0	2013-09-29 00:24:50	parser: drop %name-prefix, use -p yacc argument instead Even though the %name-prefix is more sensible, byacc doesn't support it, but both bison and byacc support the -p argument. Signed-off-by: Ran Benita <ran234@gmail.com>
cfd7e7c1	2013-09-29 00:22:20	parser: use %pure-parser instead of %define api.pure Both bison and byacc support this syntax. Bison manpage says something about this giving more or less options, but we don't care. Signed-off-by: Ran Benita <ran234@gmail.com>
9e801ff7	2013-07-21 17:01:20	ctx: adapt to the len-aware atom functions xkb_atom_intern now takes a len parameter. Turns out though that almost all of our xkb_atom_intern calls are called on string literals, the length of which we know statically. So we add a macro to micro-optimize this case. Signed-off-by: Ran Benita <ran234@gmail.com>
a392d268	2012-08-12 11:40:02	Replace flex scanner with a hand-written one The scanner is very similar in structure to the one in xkbcomp/rules.c. It avoids copying and has nicer error reporting. It uses gperf to generate a hashtable for the keywords, which gives a nice speed boost (compared to the naive strcasecmp method at least). But since there's hardly a reason to regenerate it every time and require people to install gperf, the output (keywords.c) is added here as well. Here are some stats from test/rulescomp: Before: compiled 1000 keymaps in 4.052939625s ==22063== total heap usage: 101,101 allocs, 101,101 frees, 11,840,834 bytes allocated After: compiled 1000 keymaps in 3.519665434s ==26505== total heap usage: 99,945 allocs, 99,945 frees, 7,033,608 bytes allocated Signed-off-by: Ran Benita <ran234@gmail.com>
b36d5b23	2013-02-25 17:00:53	parser: also skip 'section' ELEMENT It's for geometry only. Signed-off-by: Ran Benita <ran234@gmail.com>
8cee7490	2013-02-17 22:18:57	Change 'indicator' to 'led' everywhere possible The code currently uses the two names interchangeably. Settle on 'led', because it is shorter, more recognizable, and what we use in our API (though of course the parser still uses 'indicator'). In camel case we make it 'Led'. We change 'xkb_indicator_map' to just 'xkb_led' and the variables of this type are 'led'. This mimics 'xkb_key' and 'key'. IndicatorNameInfo and LEDInfo are changed to 'LedNameInfo' and 'LedInfo', and the variables are 'ledi' (like 'keyi' etc.). This is instead of 'ii' and 'im'. This might make a few places a bit confusing, but less than before I think. It's also shorter. Signed-off-by: Ran Benita <ran234@gmail.com>
bb620df7	2012-12-06 15:04:15	Parser: Initialise geometry elements for VarDecl We were using uninitialised memory whilst parsing geometry, leaving random contents as the return for shape/overlay/etc sections. Somehow this actually worked everywhere but under Java. https://bugs.freedesktop.org/show_bug.cgi?id=57913 Signed-off-by: Daniel Stone <daniel@fooishbar.org>
1c880887	2012-09-30 11:55:11	Don't scan and parse useless maps One physical xkb file may (and usually does) contain multiple maps. For example, the us symbols file contains a map for every variant. Currently, when we need a map from a file (specific or default), we parse the entire file into a list of XkbFile's, find the map we want and discard the others. This happens for every include statement. This is a lot of unnecessary work; this commit is a first step at making it better. What we do now is make yyparse return one map at a time; if we find what we want, we can stop looking and avoid processing the rest of the file. This moves some logic from include.c to parser.y (i.e. finding the correct map, named or default). It also necessarily removes the CheckDefaultMap check, which warned about a file which contains multiple default maps. We can live without it. Some stats with test/rulescomp (under valgrind and the benchmark): Before: ==2280== total heap usage: 288,665 allocs, 288,665 frees, 13,121,349 bytes allocated compiled 1000 keymaps in 10.849487353s After: ==1070== total heap usage: 100,197 allocs, 100,197 frees, 9,329,900 bytes allocated compiled 1000 keymaps in 5.258960549s Pretty good. Note: we still do some unnecessary work, by parsing and discarding the maps before the one we want. However dealing with this is more complicated (maybe using bison's push-parser and sniffing the token stream). Probably not worth it. Signed-off-by: Ran Benita <ran234@gmail.com>
22684cd1	2012-09-30 10:50:38	parser: remove XkbCompMapList rule This rule allows you to put several xkb_keymaps in one file. This doesn't make any sense: only the default/first can ever be used, yet the others are fully parsed as well. Different keymaps should just be put in different files. Signed-off-by: Ran Benita <ran234@gmail.com>
3b5ada23	2012-09-30 10:33:59	parser: remove XkbConfig rule This rule allows you to write file maps as: xkb_keycodes <BLA> = 5; [...] instead of the usual format which is: xkb_keycodes { <BLA> = 5; [...] }; This is not documented. It is also not used in xkeyboard-config, and I have never run into it otherwise. It also only allows one map per file. It might be used in some obscure place, but probably nothing we should care about; the simplified grammar is more useful for us now. Signed-off-by: Ran Benita <ran234@gmail.com>
41a7fed3	2012-09-27 19:21:26	Fix type of keycode in parser and ast For some reason keycodes were listed under mapFlags in the yylval union. Fix it and some sanity checks. Signed-off-by: Ran Benita <ran234@gmail.com>
3b389b15	2012-09-27 18:49:13	Don't limit key names to 4 characters Currently you can't give a key in xkb_keycodes a name of more than XKB_KEY_NAME_LENGTH (= 4) chars. This is a pretty annoying and arbitrary limitation; it leads to names such as <RTSH>, <COMP>, <PRSC>, <KPAD> etc. which may be hard to decipher, and makes it impossible to give more standard names (e.g. from linux/input.h) to keycodes. The purpose of this, as far as I can tell, was to save memory and to allow encoding a key name directly to a 32 bit value (unsigned long it was). We remove this limitation by just storing the names as atoms; this lifts the limit, allows for easy comparison like the unsigned long thing, and doesn't use more memory than previous solution. It also relieves us from doing all of the annoying conversions to/from long. This has a large diffstat only because KeyNameText, which is used a lot, now needs to take the context in order to resolve the atom. Signed-off-by: Ran Benita <ran234@gmail.com>
005dee2b	2012-09-20 23:28:27	Add _xkbcommon_ prefix to parser and lexer symbols Signed-off-by: Daniel Stone <daniel@fooishbar.org>
fa1ea9a5	2012-09-11 14:09:20	kbproto unentanglement: XkbGeomPtsPerMM Signed-off-by: Daniel Stone <daniel@fooishbar.org>
b6e04571	2012-09-10 20:16:05	kbproto unentanglement: XkbLC_* Signed-off-by: Daniel Stone <daniel@fooishbar.org>
f5dffd2b	2012-08-21 11:21:19	kbproto untanglement: XkbKeyNameLength Define it ourselves as XKB_KEY_NAME_LENGTH and use that, instead of the one from XKB.h. Signed-off-by: Daniel Stone <daniel@fooishbar.org>
cdc228ea	2012-08-13 11:00:43	Organize xkbcomp/ header files Various non-functional changes: - Re-add keycodes.h and move some stuff there. - Add parser-priv.h for internal bison/flex stuff. - Don't include headers from other headers, such that file dependencies are immediate in each file. - Rename xkbcomp.h -> ast.h, parseutils.{c,h} -> ast-build.{c,h} - Rename path.{c,h} -> include.{c,h} - Rename keytypes.c -> types.c - Make the naming of XkbFile-related functions more consistent. - Move xkb_map_{new,ref,unref} to map.c. - Remove most extern keyword from function declarations, it's just noise (XKB_EXPORT is what's important here). - Append XKBCOMP_ to include guards. - Shuffle some code around to make all of this work. Splitting this would be a headache.. Signed-off-by: Ran Benita <ran234@gmail.com>
b2c4331a	2012-07-28 22:15:59	Handle key names consistently We treat the key names as fixed length, non NUL terminated strings of length XkbKeyNameLength, and use the appropriate *Text functions to print them. We also use strncpy everywhere instead of memcpy to copy the names, because it does some NUL padding and we might as well. Signed-off-by: Ran Benita <ran234@gmail.com>
01c81fa6	2012-07-25 21:37:20	parser: untabify Run vim's :%retab and some resulting indention fixes. Signed-off-by: Ran Benita <ran234@gmail.com>
724f62c8	2012-07-25 17:29:08	Convert defines to enums in xkbcomp.h For statement / expression types. Signed-off-by: Ran Benita <ran234@gmail.com>
13eb9c35	2012-07-23 17:41:55	scanner: don't strdup key names The key name is always XkbKeyNameLength (= 4) bytes, so we can maintain it directly in YYSTYPE union and copy when needed, instead of treating it like a full blown string and then copy. This means the scanner checks the length itself. rulescomp under valgrind, before: ==1038== total heap usage: 168,403 allocs, 168,403 frees, 9,732,648 bytes allocated after: ==9377== total heap usage: 155,643 allocs, 155,643 frees, 9,672,788 bytes allocated Signed-off-by: Ran Benita <ran234@gmail.com>
112cccb1	2012-07-23 16:03:34	Some atom related optimizations We often get a strdup'd string, just to pass it over the atom_intern and then immediately free it. But atom_intern then strdup's it again (if it's not interned already); so instead we can have the interning "steal" the memory instead of allocing a new one and freeing the old one. This is done by a new xkb_atom_steal function. It also turns out, that every time we strdup an atom, we don't actually modify it afterwards. Since we are guaranteed that the atom table will live as long as the context, we can just use xkb_atom_text instead. This removes some more dynamic allocations. For this change we had to remove the ability to append two strings, e.g. "foo" + "bar" -> "foobar" which is only possible with string literals. This is unused and quite useless for our purposes. xkb_atom_strdup is left unused, as it may still be useful. Running rulescomp in valgrind, Before: ==7907== total heap usage: 173,698 allocs, 173,698 frees, 9,775,973 bytes allocated After: ==6348== total heap usage: 168,403 allocs, 168,403 frees, 9,732,648 bytes allocated Signed-off-by: Ran Benita <ran234@gmail.com>
f48ee2d2	2012-07-21 15:44:48	parse: use new log functions Signed-off-by: Ran Benita <ran234@gmail.com>

kc3-lang/libxkbcommon/src/xkbcomp/parser.y
