src/xkbcomp/parser.y


Log

Author Commit Date CI Message
Pierre Le Marre 1d361b8f 2025-05-12T10:01:10 scanner: Ensure proper type for string length
Ran Benita a3f1a9d3 2025-02-04T20:45:38 xkbcomp/parser: enable Bison detailed syntax error It's not much, but instead of xkbcommon: ERROR: [XKB-769] (unknown file):5:25: syntax error we get xkbcommon: ERROR: [XKB-769] (unknown file):5:25: syntax error, unexpected +, expecting INTEGER which is at least a little helpful. Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita e6aec067 2025-04-29T17:14:01 build: drop support for byacc It doesn't support `%define parse.error detailed` which we want to use. If needed, we can probably bring back support using some macro hackery. Signed-off-by: Ran Benita <ran@unusedvar.com>
Pierre Le Marre 66f71890 2025-03-31T08:01:29 symbols: Enable writing keysyms list as UTF-8 strings Each Unicode code point of the string will be translated to their respective keysym, if possible. An empty string denotes `NoSymbol`. When such conversion is not possible, this will raise a syntax error. This introduces the following syntax: ```c // Empty string = `NoSymbol` key <1> {[""]}; // NoSymbol // Single code point = single keysym key <2> {["é"]}; // eacute // String = translate each code point to their respective keysym key <3> {["sßξك🎺"]}; // {s, ssharp, Greek_xi, Arabic_kaf, U1F3BA} // Mix string and keysyms key <4> {[{"ξ", Greek_kappa, "β"}]}; // { Greek_xi, Greek_kappa, Greek_beta} ``` It can also be used wherever a keysym is required, e.g. in `interpret` and `modifier_map` statements. In these cases a single keysym is expected, so the string should contain *exactly one* Unicode code point.
Pierre Le Marre bc3e464b 2025-04-09T12:35:05 keysyms: Fix Unicode handling - `xkb_utf32_to_keysym`: Allow [Unicode noncharacters]. There is no requirement to drop them and this would be the only function of our API doing so. From the Unicode Standard 16.0, section 23.7 “Noncharacters”: > Applications are free to use any of these noncharacter code points > internally. They have no standard interpretation when exchanged > outside the context of internal use. However, they are not illegal > in interchange, nor does their presence cause Unicode text to be > ill-formed. > If a noncharacter is received in open interchange, an application is > not required to interpret it in any way. It is good practice, > however, to recognize it as a noncharacter and to take appropriate > action, such as replacing it with `U+FFFD` REPLACEMENT CHARACTER, > to indicate the problem in the text. The key part is: > an application is not required to interpret it in any way Since we handle the reverse conversion with `xkb_keysym_to_utf32` just fine, I do not see a good motivation to keep this asymmetry. This is the only function with a special case for these code points. - `xkb_keysym_from_name`: - Unicode format `UNNNN`: allow control characters C0 and C1 and use `xkb_utf32_to_keysym` for the conversion when `NNNN < 0x100`, for backward compatibility. - Numeric hexadecimal format `0xNNNN`: *unchanged*. Contrary to the Unicode format, it does not normalize any keysym values in order to enable roundtrip with `xkb_keysym_get_name`. Also added tests to ensure various properties and consistency. Note about *surrogates*: they are valid valid *code points* but invalid Unicode *scalar values*, i.e. they cannot be encoded in any Unicode encoding form (UTF-8, UTF-16, UTF-32). So their corresponding Unicode keysyms are valid, but: - cannot be used as input of `xkb_keysym_to_utf32` nor `xkb_keysym_to_utf8` - cannot result as output of `xkb_utf32_to_keysym`. Otherwise they are valid e.g. in the Unicode keysym notation. [Unicode noncharacters]: https://en.wikipedia.org/wiki/Universal_Character_Set_characters#Noncharacters
Pierre Le Marre 5a32b779 2025-04-06T06:16:41 logging: Handle NULL map name Display “(unnamed map)” instead of “(null)”.
Pierre Le Marre 36442baa 2025-04-03T15:01:46 xkbcomp: Support multiple actions in interpret Before this commit we supported multiple actions per level, but not in *interpret* statements. Let’s fix this asymmetry, so we can equivalently assign all actions sets either implicitly or explicitly.
Pierre Le Marre 06394afc 2025-04-03T08:49:12 xkbcomp: Minor parser refactor for keysyms and actions
Pierre Le Marre 39c1bb36 2025-03-29T17:47:31 xkbcomp: Fix static_assert syntax
Pierre Le Marre 8594adc4 2025-03-31T13:52:36 doc: Mention that `alternate` merge mode is not supported
Pierre Le Marre 36bb4fe3 2025-04-02T19:10:02 xkbcomp: Minor renaming Use the same case for `KeySym` in the parser.
Pierre Le Marre 44480f7c 2025-04-01T08:28:02 xkbcomp: Enable lists of keysyms and actions {} and {a} Motivations: - Follow the principle of least astonishment; - Ensure consistency; - Enhance the use of custom defaults; - Facilitate the tests. There is some ambiguity because we use `{}` to denote both an empty list of keysyms and an empty list of actions. But as soon as we get a keysym or an action, we know whether it is a `MultiKeySymList` or a `MultiActionList`. So we just count the `{}` at the *beginning* using `NoSymbolOrActionList`, then replace it by the relevant count of `NoSymbol` or `NoAction()` once the ambiguity is solved. If not, this is a list of empties of *some* type: we drop those empties and delegate the type resolution using `ExprEmptyList()`.
Pierre Le Marre 2e0245f8 2025-04-02T10:45:44 xkbcomp: Enable more empty lists - Empty `interpret` - Empty key `type` - Empty `indicator` Motivations: - Follow the principle of least astonishment; - Ensure consistency; - Enhance the use of custom defaults; - Facilitate the tests.
Pierre Le Marre 8ba5c453 2025-03-30T10:07:10 xkbcomp: Use section reference as default section name Before this commit the following keymap: ```c xkb_keymap { xkb_keycode {}; }; ``` would result in (boilerplate removed): ```c xkb_keymap { xkb_keycode "(unnamed)" {}; }; ``` This is both useless and wasting allocation: section names are optional, so we should just remove this default name altogether and keep it undefined, as in the original keymap. The situation is a bit different if there is an include, as for keymaps created from RMLVO names. Before this commit, the following keymap: ```c xkb_keymap { xkb_keycode { include "evdev+aliases(qwerty)" }; }; ``` would result in (boilerplate removed): ```c xkb_keymap { xkb_keycode "(unnamed)" { … }; }; ``` With this commit we now follow the Xorg xkbcomp style by using the section reference (the include string) as the *default* section name. So the previous example would now result in: ```c xkb_keymap { xkb_keycode "evdev_aliases(qwerty)" { … }; }; ``` which is useful to give a hint of the original include. Note that if the original section had a name, it would preserve it: ```c xkb_keymap { xkb_keycode "test" { include "evdev+aliases(qwerty)" }; }; ``` would compile to: ```c xkb_keymap { xkb_keycode "test" { … }; }; ```
Pierre Le Marre 3150bca8 2025-03-30T09:54:02 xkbcomp: Make all components optional We already accept *empty* components, such as: `xkb_compat {};`. Let’s accept missing components as well, so that we can reduce the boilerplate in our tests. Note that we will still explicitly serialize empty components for compatibility with previous xkbcommon versions and Xorg xkbcomp.
Pierre Le Marre f3a4eeaa 2025-03-26T16:04:39 symbols: Improve keysym parsing
Pierre Le Marre e5401b07 2025-03-26T16:02:58 symbols: Improve Modmap parsing Parse, dont’t validate: ensure *at parsing* that `modifier_map` definitions use a list of keys and keysyms. This enables to remove the redundant `ExprResolveKeySym` and have keysym parsing exclusively in handled in `parser.y`.
Pierre Le Marre 70d11abd 2025-03-26T07:38:05 messages: Add file encoding and invalid syntax entries Added: - `XKB_ERROR_INVALID_FILE_ENCODING` - `XKB_ERROR_INVALID_RULES_SYNTAX` - `XKB_ERROR_INVALID_COMPOSE_SYNTAX` Changed: - `XKB_ERROR_INVALID_SYNTAX` renamed to `XKB_ERROR_INVALID_XKB_SYNTAX`.
Pierre Le Marre 3a0b77f0 2025-02-12T16:41:09 xkbcomp: Fix parser headers
Ran Benita 9d7eb849 2025-02-06T15:25:03 xkbcomp/ast: combine expr_value_type into stmt_type This field is a funky attempt at type inference, or perhaps some optimization? Anyway, after careful examination I conclude it serves no purpose except specifying the type of a literal (string/integer/float/boolean/keyname) when `STMT_EXPR_VALUE` (i.e. literal). Remove it and replace `STMT_EXPR_VALUE` with specific statement types for each literal type. Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita d9fc01b3 2025-02-06T15:12:53 xkbcomp/ast: combine expr_op_type into stmt_type It's better to have a single AST type enum. Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita f4e95280 2025-02-02T22:29:05 xkbcomp/scanner: avoid unneeded strdup of IDENT tokens The allocation is immediately discarded, either turned into a keysym or an atom. So use an sval slice into the input string instead strdup'ing. memusage ./release/bench-compile-keymap --iter=1000 --layout us,de --variant ,neo Before: Memory usage summary: heap total: 534063576, heap peak: 581022, stack peak: 18848 total calls total memory failed calls malloc| 11240525 291897104 0 realloc| 1447657 192307328 0 (nomove:37629, dec:0, free:0) calloc| 430573 49859144 0 free| 13993903 534063576 After: Memory usage summary: heap total: 506839909, heap peak: 581022, stack peak: 18960 total calls total memory failed calls malloc| 8016419 264673437 0 realloc| 1447657 192307328 0 (nomove:37278, dec:0, free:0) calloc| 430573 49859144 0 free| 10769797 506839909 Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita e120807b 2025-01-29T15:35:22 Update license notices to SDPX short identifiers + update LICENSE Fix #628. Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita 1f436703 2025-01-24T23:04:43 xkbcomp: rework KeysymList AST representation This is similar to the previous commit, for keysym lists. Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita 39689867 2025-01-24T22:43:45 xkbcomp: rework ActionList AST representation The AST is heavily based on intrusive lists for representing lists, but actions are an exception, instead using darray. I don't see any reason for this; it ends up allocating more, and we don't especially need a flat array for this. Change it to use the familiar linked-list style. Signed-off-by: Ran Benita <ran@unusedvar.com>
Pierre Le Marre 4ac22263 2025-01-16T23:22:40 keysyms: Check clashes between keysyms names and keywords Due to how our parser is implemented, keysyms names that are also valid keywords require special handling. Added a check for these clashes in the keysym generator. The only current clash, `section`, is already handled. Note that it means that e.g. `section`, `Section` and `sEcTiOn` all parse to the same keysym. This side effect is fine here, because *currently* there is no other keysym that clashes with any possible of the case variation of `section`. But in order to be extra cautious, we now test thoses clashes too. Hopefully we will never have a clash again, but while it is unlikely that we modify the keywords, the keysyms are not a frozen set.
Pierre Le Marre b1e1aae6 2025-01-23T15:20:44 xkbcomp: Fix memory leak when extra content after keymap It triggers with e.g.: ``` xkb_keymap { xkb_keycodes { }; }; }; // erroneous ```
Pierre Le Marre ec2915fe 2025-01-22T17:18:21 symbols: Fix a possible null pointer deference Introduce a new Expression type, `EXPR_EMPTY_LIST`, to avoid the ambiguity between action and keysym empty lists. Two alternatives were rejected to keep the semantics clear: - Using `EXPR_KEYSYM_LIST`: because we would end up accepting an empty keysym list while processing actions. - Using NULL: convey no info and is hazardous.
Ran Benita 26069b76 2025-01-21T10:48:28 xkbcomp/parser: silence a set but unused warning ``` libxkbcommon.so.0.7.0.p/parser.c:1632:9: warning: variable '_xkbcommon_nerrs' set but not used [-Wunused-but-set-variable] 1632 | int yynerrs = 0; ``` Signed-off-by: Ran Benita <ran@unusedvar.com>
Pierre Le Marre bf03b4b5 2024-12-19T16:23:05 symbols: Parse empty key The following syntax does not parse in xkbcommon, but it does in xkbcomp: ``` xkb_symbols "x" { key <AD01> { }; }; ``` While the usefulness of such statement is debatable, the fact that it does parse in xkbcomp and that tools may generate such keymap entry make it relevant to handle.
Pierre Le Marre fdf2c525 2024-10-08T19:43:30 actions: Add support for multiple actions per level This makes 1 keysym == 1 action holds also for multiple keysyms per level. The motivation of this new feature are: - Make multiple keysyms per level more intuitive. - Explore how to fix the issue with shortcuts in multi-layout settings (see the xkeyboard-config issue[^1]). The idea is to use e.g.: ```c key <LCTL> { symbols[1] = [ {Control_L, ISO_First_Group } ], actions[1] = [ {SetMods(modifiers=Control), SetGroup(group=-4) } ] }; ``` in order to switch temporarily to a reference layout in order to get the same shortcuts on every layout. When no action is specified, `interpret` statements are used to find an action corresponding for *each* keysym, as expected. For an interpretation matching Any keysym, we may get the same interpretation for multiple keysyms. This may result in unwanted duplicate actions. So set this interpretation only if no previous keysym was matched with this interpret at this level, else set the default interpretation. For now, at most one action of each following categories is allowed per level: - modifier actions: `SetMods`, `LatchMods`, `LockMods`; - group actions: `SetGroup`, `LatchGroup`, `LockGroup`. Some examples: - `SetMods` + `SetGroup`: ok - `SetMods` + `SetMods`: error - `SetMods` + `LockMods`: error - `SetMods` + `LockGroup`: ok [^1]: https://gitlab.freedesktop.org/xkeyboard-config/xkeyboard-config/-/issues/416
Pierre Le Marre 31c6d866 2024-10-08T18:39:00 symbols: Min. 2 keysyms in level list Do not allow `{ a }` when a single `a` suffices.
Pierre Le Marre 929a485f 2024-10-08T12:52:53 symbols: Fix too liberal parsing of keysyms lists Currently we are too liberal when parsing symbols lists: e.g. `[{a,{b}}]` is parsed as `[{a,b}]` but it should be rejected.
Pierre Le Marre ba896935 2024-09-24T21:28:12 logging: Make scanner_warn use a message ID
Pierre Le Marre c8bd57dd 2024-09-24T21:20:41 logging: Make scanner_err use a message ID
Pierre Le Marre 44df6eee 2024-09-23T07:27:48 Add new warnings for deprecated keysyms Add 2 new warnings: - Deprecated keysym name (typo, historical alias, etc.); - Deprecated keysym (all names and forms). Guard deprecated keysym tests with verbosity level ≥2, so they are run only when actually needed.
Yuichiro Hanada efdb05d1 2024-01-27T23:00:28 parser: Do now allow the empty symbol declaration An empty element is allowed in SymbolsBody definition, so the following keymap is gramatically correct. ``` xkb_keymap { ... xkb_symbols "sym" { key <SPC> {, [Space] }; }; }; ``` However, the current parser crashes with the keymap due to null pointer access. This change fixes it by changing the parser not to allow it.
Pierre Le Marre a83d745b 2023-09-21T20:06:27 Messages: add new messages to registry This commit is another step to identify and document the maximum number of logging messages. Bulk changes: - Rename `conflicting-key-type` to `conflicting-key-type-merging-groups`. Giving more context in the name allow us to introduce `conflicting-key-type-definitions` later. - Add conflicting-key-type-definitions - Add conflicting-key-type-map-entry - Add undeclared-modifiers-in-key-type Also improve the log messages. - Add conflicting-key-type-preserve-entries - Use XKB_ERROR_UNSUPPORTED_MODIFIER_MASK - Add illegal-key-type-preserve-result - Add conflicting-key-type-level-names - Add duplicate-entry - Add unsupported-symbols-field - Add missing-symbols-group-name-index - Use XKB_ERROR_WRONG_FIELD_TYPE - Add conflicting-key-name - Use XKB_WARNING_UNDEFINED_KEYCODE - Add illegal-keycode-alias - Add unsupported-geometry-section - Add missing-default-section - Add XKB_LOG_MESSAGE_NO_ID - Rename log_vrb_with_code to log_vrb - Use ERROR_WRONG_FIELD_TYPE & ERROR_INVALID_SYNTAX - Add unknown-identifier - Add invalid-expression-type - Add invalid-operation + fixes - Add unknown-operator - Rename ERROR_UNKNOWN_IDENTIFIER to ERROR_INVALID_IDENTIFIER - Add undeclared-virtual-modifier - Add expected-array-entry - Add invalid-include-statement - Add included-file-not-found - Add allocation-error - Add invalid-included-file - Process symbols.c - Add invalid-value - Add invalid-real-modifier - Add unknown-field - Add wrong-scope - Add invalid-modmap-entry - Add wrong-statement-type - Add conflicting-key-symbols-entry - Add invalid-set-default-statement
Pierre Le Marre eafd3ace 2023-09-18T18:17:39 Add a new warning for numeric keysyms Usually it is better to use the corresponding human-friendly keysym names. If there is none, then the keysym is most probably not supported in the ecosystem. The only use case I see is similar to the PUA in Unicode (see: https://en.wikipedia.org/wiki/Private_Use_Areas). I am not aware of examples of this kind of use.
Pierre Le Marre ef81d04e 2023-09-18T18:17:34 Structured log messages with a message registry Currently there is little structure in the log messages, making difficult to use them for the following use cases: - A user looking for help about a log message: the user probably uses a search engine, thus the results will depend on the proper indexing of our documentation and the various forums. It relies only on the wording of the message, which may change with time. - A user wants to filter the logs resulting of the use of one of the components of xkbcommon. A typical example would be testing xkeyboard-config against libxkbcommon. It requires the use of a pattern (simple words detection or regex). The issue is that the pattern may become silently out-of-sync with xkbcommon. A common practice (e.g. in compilers) is to assign unique error codes to reference theses messages, along with an error index for documentation. Thus this commit implements the following features: - Create a message registry (message-registry.yaml) that defines the log messages produced by xkbcommon. This is a simple YAML file that provides, for each message: - A unique numeric code as a short identifier. It is used in the output message and thus can be easily be filtered to spot errors or searched in the internet. It must not change: if the semantics of message changes, it is better to introduce a new message for clarity. - A unique text identifier, meant for two uses: 1. Generate constants dealing with log information in our code base. 2. Generate human-friendly names for the documentation. - A type: currently warning or error. Used to prefix the constants (see hereinabove) and for basic classification in documentation. - A short description, used as concise and mandatory documentation. - An optionnal detailed description. - Optional examples, intended to help the user to fix issues themself. - Version of xkbcommon it was added. For old entries this often unknown, so they will default to 1.0.0. - Version of xkbcommon it was removed (optional) No entry should ever be deleted from this index, even if the message is not used anymore: it ensures we have unique identifiers along the history of xkbcommon, and that users can refer to the documentation even for older versions. - Add the script update-message-registry.py to generate the following files: - messages.h: message code enumeration for the messages currently used in the code base. Currently a private API. - message.registry.md: the error index documentation page. - Modify the logging functions to use structured messages. This is a work in progress.
M Kelly e7f02d32 2023-08-05T15:29:36 parser: change deprecated `%pure-parser` to `%define api.pure` (#370) This is now supported by byacc since version 2.0 20230516
Pierre Le Marre 0da68bc6 2023-07-04T09:23:24 Simplify parsing of numeric keysyms in parser.y In `parser.y`, a numeric keysym is parsed by formatting it in its hexadecimal form then parsed as a keysym name. This is convoluted. Fixed by checking directly the upper bound.
Peter Hutterer afdc9cee 2020-10-19T10:49:37 xkbcomp: where a keysym cannot be resolved, set it to NoSymbol Where resolve_keysym fails we warn but use the otherwise uninitialized variable as our keysym. That later ends up in the keymap as random garbage hex value. Simplest test case, set this in the 'us' keymap: key <TLDE> { [ xyz ] }; And without this patch we get random garbage: ./build/xkbcli-compile-keymap --layout us | grep TLDE: key <TLDE> { [ 0x018a5cf0 ] }; With this patch, we now get NoSymbol: ./build/xkbcli-compile-keymap --layout us | grep TLDE: key <TLDE> { [ NoSymbol ] };
hhb 69713ce3 2020-09-11T05:06:23 parser: fix another format string for int64_t (#191)
Ran Benita 823708b7 2019-12-27T14:51:31 parser: fix format string for int64_t Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita 6ca1a0c9 2019-12-27T14:17:55 parser: use int64_t for all numbers Don't use int which can have different size on different machines. Also avoid some warnings from MSVC: xkbcomp/parser.y(760): warning C4244: '=': conversion from 'int64_t' to 'int', possible loss of data xkbcomp/parser.y(761): warning C4244: '=': conversion from 'int64_t' to 'int', possible loss of data xkbcomp/parser.y(767): warning C4244: '=': conversion from 'int64_t' to 'int', possible loss of data Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita 40aab05e 2019-12-27T13:03:20 build: include config.h manually Previously we included it with an `-include` compiler directive. But that's not portable. And it's better to be explicit anyway. Every .c file should have `include "config.h"` first thing. Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita a237f4f6 2019-12-14T13:44:33 parser: fix the remaining pointer chasing Fix the TODO added in 7c42945. Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita 7c42945e 2019-11-13T22:41:38 parser: fix quadratic pointer chasing In the AST, lists (e.g. the list of statements in a file) are kept in singly-linked lists -- each AST node has a `next` pointer available for this purpose. Previously, a node was added to the list by starting from the head, chasing to the last, and appending. So creating a list of length N would take ~N^2/2 pointer dereferences. Now, we always (temporarily) keep the last as well, so appending is O(1) instead of O(N). Given a keymap xkb_keymap { xkb_keycodes { minimum = 8; minimum = 8; minimum = 8; minimum = 8; minimum = 8; [... repeated N times ...] }; xkb_types {}; xkb_compat {}; xkb_symbols {}; }; The compilation times are N | Before | After --------|----------|------- 10,000 | 0.407s | 0.006s 20,000 | 1.851s | 0.015s 30,000 | 5.737s | 0.021s 40,000 | 12.759s | 0.023s 50,000 | 21.489s | 0.035s 60,000 | 40.473s | 0.041s 70,000 | 53.336s | 0.039s 80,000 | 72.485s | 0.044s 90,000 | 94.703s | 0.048s 100,000 | 118.390s | 0.057s Another option is to ditch the linked lists and use arrays instead. I got it to work, but its more involved and allocation heavy so turns out to be worse without further optimizations. Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita f9b95c06 2019-11-13T23:37:47 parser: remove an unneeded check Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita 322cd856 2019-11-12T20:34:31 parser: fix merge mode only applied to first vmod in a virtual_modifiers statement Given augment virtual_modifiers NumLock,Alt,LevelThree Previously it was expanded (directly in the parser) to augment virtual_modifiers NumLock; virtual_modifiers Alt; virtual_modifiers LevelThree; Now it expands to augment virtual_modifiers NumLock; augment virtual_modifiers Alt; augment virtual_modifiers LevelThree; Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita 400cc849 2019-11-12T20:04:13 ast: use a separate expr struct for action list Currently it's under UnaryExpr, which just doesn't make sense. Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita 2af474e8 2019-11-02T13:31:44 parser: get rid of "stealing" atoms This requires (well, at least implemented by) casting away `const` which is undefined behavior, and clang started to warn about it. The micro optimization didn't save too many allocations, anyway. Signed-off-by: Ran Benita <ran@unusedvar.com>
Daniel Stone a8ea7a1d 2017-06-26T16:45:16 parser: Don't set more maps when we don't have any If the scanner indicates that we might have something which looks like a map, but the parser in fact fails to create that map, we will try to access the map regardless. Stop doing that. testcase: 'xkb_keymap {' -> '#kb_keymap' Signed-off-by: Daniel Stone <daniels@collabora.com>
Ran Benita 917636b1 2018-03-11T17:07:06 xkbcomp: fix crash when parsing an xkb_geometry section xkb_geometry sections are ignored; previously the had done so by returning NULL for the section's XkbFile, however some sections of the code do not expect this. Instead, create an XkbFile for it, it will never be processes and discarded later. Caught with the afl fuzzer. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita e3cacae7 2018-03-10T23:32:12 xkbcomp: fix crashes in the parser when geometry tokens appear In the XKB format, floats and various keywords can only be used in the xkb_geometry section. xkbcommon removed support xkb_geometry, but still parses it for backward compatibility. As part of ignoring it, the float AST node and various keywords were removed, and instead NULL was returned by their parsing actions. However, the rest of the code does not handle NULLs, and so when they appear crashes usually ensue. To fix this, restore the float AST node and the ignored keywords. None of the evaluating code expects them, so nice error are displayed. Caught with the afl fuzzer. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 993f4837 2017-07-31T18:16:37 build: fix out-of-tree build The change in d44ba48 removed -I$(top_builddir)/src/xkbcomp, but this is needed in order to find the generated parser.h file which is put in the build dir. I also added -I$(top_builddir)/src in order to match the meson behavior. Fixes https://github.com/xkbcommon/libxkbcommon/issues/50 Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 2e5530ad 2014-10-16T18:51:51 parser: bring back warning about includes of files with no default Using the same format as xkbcomp. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita d0c6fce2 2014-09-20T15:06:13 parser: use "atom" instead of "sval" in yylval "sval" is already used for "struct sval". Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 37cf20c9 2014-07-26T22:49:30 parser: silence bison "unused value" warnings Previous commit triggered these for some reason: /home/ran/src/libxkbcommon/src/xkbcomp/parser.y:555.25-33: warning: unused value: $1 [-Wother] CoordList : CoordList COMMA Coord ^^^^^^^^^ Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 7ec00933 2014-07-26T22:34:05 parser: don't leak AST nodes for discarded symbols If the parser has symbols on the stack, and then enters an error, it discards the symbols and fails. But their actions which allocate AST nodes had already ran. So we must free these to avoid leaks. We use %destructor declarations, see http://www.gnu.org/software/bison/manual/html_node/Destructor-Decl.html Note: byacc only supports %destructor when compiled with --enable-btyacc. Also, it doesn't support using the parse-param in the destructor. So we might revert this commit before the next release, or forget about byacc. https://github.com/xkbcommon/libxkbcommon/issues/8 Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita cb4bae71 2014-06-30T14:52:30 parser: don't shadow "str" It's a name of a function in scanner-utils.h and also of some parameters. https://bugs.freedesktop.org/show_bug.cgi?id=79898 Reported-by: Bryce Harrington <b.harrington@samsung.com> Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 2e561c3f 2014-04-30T08:57:16 parser: show the keysym in "unrecognized keysym" messages Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 1b2bb204 2014-02-13T23:57:22 ast: cast to ParseCommon explictly instead of using ->common Some tools were getting mighty confused with what we were doing. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 28d5f770 2014-02-10T20:33:34 scanner: sort out scanner logging functions First, make the rules and xkb scanners/parsers use the same logging functions instead of rolling their own. Second, use the gcc ##__VA_ARGS extension instead of dealing with C99 stupidity. I hope all relevant compilers support it. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 16aab829 2014-02-09T23:21:19 ast: remove unneeded 'ctx' param to XkbFileCreate Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 5547a82f 2014-02-07T21:12:53 parser: fix unrecognized keysym handling Integer may be negative, so also need to test >= 0. Also, $$ was left uninitialized if the keysym wasn't recognized. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 101720a2 2014-01-12T13:18:39 parser: shutup some 'may be used uninitialized' warnings Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita b63fa3b1 2013-12-01T13:32:51 expr: make Expr creation naming and file location consistent Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 972395b8 2013-12-01T12:08:47 expr: split expression types and allocate them separately Currently, we have one ExprDef type, which contains a tagged union with the value of all expression types. Turns out, this union is quite wasteful memory-wise. Instead, create separate types for all expressions (e.g ExprBinary, ExprInteger) which embed the common fields (ExprCommon), and malloc them per their size; ExprDef then becomes a union of all these types, but is just used as a generic pointer. [Instead of making ExprDef a union, another option is to use ExprCommon as the generic pointer type and then do up-castings, like we do with ParseCommon. But this makes the code much uglier.] The diff is mostly straightforward mechanical adaptations. It could have been much smaller with the help of C11 anonymous structs (which were previously a gnu extension). This will have saved all of the 'op' -> 'expr->op', etc changes. But if we can be a bit more portable for a little effort, we should. Before (./test/rulescomp, x86 32 bit, -O2): ==12974== total heap usage: 145,217 allocs, 145,217 frees, 10,476,238 bytes allocated After: ==11145== total heap usage: 145,217 allocs, 145,217 frees, 8,270,358 bytes allocated Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 068016e4 2013-12-01T10:45:52 parser, symbols: drop unnecessary casts It's casted into ExprDef and then uncasted for no reason. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita fd98d64b 2013-11-30T23:29:58 parser: remove 'uval' yylval type We don't care about DoodadType. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita c24b6420 2013-11-30T23:24:18 expr: add constructor for boolean expressions Also add a 'bool set' to the ExprDef union, instead of using 'ival' as a bool. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita c5d85938 2013-11-30T23:12:45 expr: add constructors for more expression types This makes the parser a bit more declarative. But really it might make error handling easier. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita dbd8b1ef 2013-11-30T22:25:39 expr: add 'ident' value to ExprDef union This distinguishes between an identifier expression and a string expression in the union. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 9dc5b8cb 2013-11-27T13:49:13 Resolve keysyms early in parser Instead of having the parser passing strings to the AST, and symbols/compat etc. resolving them themselves. This simplifies the code a bit, and makes it possible to print where exactly in the file the bad keysym originates from. The previous lazy approach had an advantage of not needlessly resolving keysyms from unrelated maps. However, I think reporting these errors in *any* map is better, and the parser is also a bit smarter then old xkbcomp and doesn't parse many useless maps. So there's no discernible speed/memory difference with this change. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 8e14bff0 2013-09-29T01:41:52 parser: add some notes about byacc working We now also work with byacc (version tested: 20130925) which some people prefer, perhaps due to its license (public domain) or performance (haven't compared). When using byacc, currently the following warning comes up: src/xkbcomp/parser.c:954:14: warning: declaration shadows a variable in the global scope [-Wshadow] YYSTYPE yylval; ^ src/xkbcomp/parser.c:37:20: note: expanded from macro 'yylval' #define yylval _xkbcommon_lval ^ ./src/xkbcomp/parser.h:96:16: note: previous declaration is here extern YYSTYPE _xkbcommon_lval; This is due to a bug in byacc - it shouldn't output that extern line in %pure-parser mode. So the warning stays. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 8dcb30e5 2013-09-29T01:29:47 parser: add a workaround for byacc Unlike bison, byacc outputs its own parser code *after* our own parser.y code, which includes the #undef. So this fix is needed for the 'scanner' -> 'param->scanner' translation to work in the parser.c code generated by byacc. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 409f27d7 2013-09-29T00:41:17 parser: don't use %locations byacc doesn't support this feature. We print the line/col of the last scanned token instead. This is slightly less in case of *parser* errors (not syntax errors), but I couldn't make it point to another line, and this are pretty cryptic anyways. So it's good enough. Also might be a bit faster, but haven't checked. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 13da6da0 2013-09-29T00:24:50 parser: drop %name-prefix, use -p yacc argument instead Even though the %name-prefix is more sensible, byacc doesn't support it, but both bison and byacc support the -p argument. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita cfd7e7c1 2013-09-29T00:22:20 parser: use %pure-parser instead of %define api.pure Both bison and byacc support this syntax. Bison manpage says something about this giving more or less options, but we don't care. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 9e801ff7 2013-07-21T17:01:20 ctx: adapt to the len-aware atom functions xkb_atom_intern now takes a len parameter. Turns out though that almost all of our xkb_atom_intern calls are called on string literals, the length of which we know statically. So we add a macro to micro-optimize this case. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita a392d268 2012-08-12T11:40:02 Replace flex scanner with a hand-written one The scanner is very similar in structure to the one in xkbcomp/rules.c. It avoids copying and has nicer error reporting. It uses gperf to generate a hashtable for the keywords, which gives a nice speed boost (compared to the naive strcasecmp method at least). But since there's hardly a reason to regenerate it every time and require people to install gperf, the output (keywords.c) is added here as well. Here are some stats from test/rulescomp: Before: compiled 1000 keymaps in 4.052939625s ==22063== total heap usage: 101,101 allocs, 101,101 frees, 11,840,834 bytes allocated After: compiled 1000 keymaps in 3.519665434s ==26505== total heap usage: 99,945 allocs, 99,945 frees, 7,033,608 bytes allocated Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita b36d5b23 2013-02-25T17:00:53 parser: also skip 'section' ELEMENT It's for geometry only. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 8cee7490 2013-02-17T22:18:57 Change 'indicator' to 'led' everywhere possible The code currently uses the two names interchangeably. Settle on 'led', because it is shorter, more recognizable, and what we use in our API (though of course the parser still uses 'indicator'). In camel case we make it 'Led'. We change 'xkb_indicator_map' to just 'xkb_led' and the variables of this type are 'led'. This mimics 'xkb_key' and 'key'. IndicatorNameInfo and LEDInfo are changed to 'LedNameInfo' and 'LedInfo', and the variables are 'ledi' (like 'keyi' etc.). This is instead of 'ii' and 'im'. This might make a few places a bit confusing, but less than before I think. It's also shorter. Signed-off-by: Ran Benita <ran234@gmail.com>
Daniel Stone bb620df7 2012-12-06T15:04:15 Parser: Initialise geometry elements for VarDecl We were using uninitialised memory whilst parsing geometry, leaving random contents as the return for shape/overlay/etc sections. Somehow this actually worked everywhere but under Java. https://bugs.freedesktop.org/show_bug.cgi?id=57913 Signed-off-by: Daniel Stone <daniel@fooishbar.org>
Ran Benita 1c880887 2012-09-30T11:55:11 Don't scan and parse useless maps One physical xkb file may (and usually does) contain multiple maps. For example, the us symbols file contains a map for every variant. Currently, when we need a map from a file (specific or default), we parse the entire file into a list of XkbFile's, find the map we want and discard the others. This happens for every include statement. This is a lot of unnecessary work; this commit is a first step at making it better. What we do now is make yyparse return one map at a time; if we find what we want, we can stop looking and avoid processing the rest of the file. This moves some logic from include.c to parser.y (i.e. finding the correct map, named or default). It also necessarily removes the CheckDefaultMap check, which warned about a file which contains multiple default maps. We can live without it. Some stats with test/rulecomp (under valgrind and the benchmark): Before: ==2280== total heap usage: 288,665 allocs, 288,665 frees, 13,121,349 bytes allocated compiled 1000 keymaps in 10.849487353s After: ==1070== total heap usage: 100,197 allocs, 100,197 frees, 9,329,900 bytes allocated compiled 1000 keymaps in 5.258960549s Pretty good. Note: we still do some unnecessary work, by parsing and discarding the maps before the one we want. However dealing with this is more complicated (maybe using bison's push-parser and sniffing the token stream). Probably not worth it. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 3b5ada23 2012-09-30T10:33:59 parser: remove XkbConfig rule This rule allows you to write file maps as: xkb_keycodes <BLA> = 5; [...] instead of the usual format which is: xkb_keycodes { <BLA> = 5; [...] }; This is not documented, It is also not used in xkeyboard-config, and I have never run into it otherwise. It also only allows one map per file. It *might* be used in some obscure place, but probably nothing we should care about; the simplified grammar is more useful for us now. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 22684cd1 2012-09-30T10:50:38 parser: remove XkbCompMapList rule This rule allows you to put several xkb_keymaps in one file. This doesn't make any sense: only the default/first can ever be used, yet the others are fully parsed as well. Different keymaps should just be put in different files. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 3b389b15 2012-09-27T18:49:13 Don't limit key names to 4 characters Currently you can't give a key in xkb_keycodes a name of more than XKB_KEY_NAME_LENGTH (= 4) chars. This is a pretty annoying and arbitrary limitation; it leads to names such as <RTSH>, <COMP>, <PRSC>, <KPAD> etc. which may be hard to decipher, and makes it impossible to give more standard names (e.g. from linux/input.h) to keycodes. The purpose of this, as far as I can tell, was to save memory and to allow encoding a key name directly to a 32 bit value (unsigned long it was). We remove this limitation by just storing the names as atoms; this lifts the limit, allows for easy comparison like the unsigned long thing, and doesn't use more memory than previous solution. It also relieves us from doing all of the annoying conversions to/from long. This has a large diffstat only because KeyNameText, which is used a lot, now needs to take the context in order to resolve the atom. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 41a7fed3 2012-09-27T19:21:26 Fix type of keycode in parser and ast For some reason keycodes were listed under mapFlags in the yylval union. Fix it and some sanity checks. Signed-off-by: Ran Benita <ran234@gmail.com>
Daniel Stone 005dee2b 2012-09-20T23:28:27 Add _xkbcommon_ prefix to parser and lexer symbols Signed-off-by: Daniel Stone <daniel@fooishbar.org>
Daniel Stone fa1ea9a5 2012-09-11T14:09:20 kbproto unentanglement: XkbGeomPtsPerMM Signed-off-by: Daniel Stone <daniel@fooishbar.org>
Daniel Stone b6e04571 2012-09-10T20:16:05 kbproto unentanglement: XkbLC_* Signed-off-by: Daniel Stone <daniel@fooishbar.org>
Daniel Stone f5dffd2b 2012-08-21T11:21:19 kbproto untanglement: XkbKeyNameLength Define it ourselves as XKB_KEY_NAME_LENGTH and use that, instead of the one from XKB.h. Signed-off-by: Daniel Stone <daniel@fooishbar.org>
Ran Benita cdc228ea 2012-08-13T11:00:43 Organize xkbcomp/ header files Various non-functional changes: - Re-add keycodes.h and move some stuff there. - Add parser-priv.h for internal bison/flex stuff. - Don't include headers from other headers, such that file dependencies are immediate in each file. - Rename xkbcomp.h -> ast.h, parseutils.{c,h} -> ast-build.{c,h} - Rename path.{c,h} -> include.{c,h} - Rename keytypes.c -> types.c - Make the naming of XkbFile-related functions more consistent. - Move xkb_map_{new,ref,unref} to map.c. - Remove most extern keyword from function declarations, it's just noise (XKB_EXPORT is what's important here). - Append XKBCOMP_ to include guards. - Shuffle some code around to make all of this work. Splitting this would be a headache.. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita b2c4331a 2012-07-28T22:15:59 Handle key names consistently We treat the key names as fixed length, non NUL terminated strings of length XkbKeyNameLength, and use the appropriate *Text functions to print them. We also use strncpy everywhere instead of memcpy to copy the names, because it does some NUL padding and we might as well. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 01c81fa6 2012-07-25T21:37:20 parser: untabify Run vim's :%retab and some resulting indention fixes. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 724f62c8 2012-07-25T17:29:08 Convert defines to enums in xkbcomp.h For statement / expression types. Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 13eb9c35 2012-07-23T17:41:55 scanner: don't strdup key names The key name is always XkbKeyNameLength (= 4) bytes, so we can maintain it directly in YYSTYPE union and copy when needed, instead of treating it like a full blown string and then copy. This means the scanner checks the length itself. rulescomp under valgrind, before: ==1038== total heap usage: 168,403 allocs, 168,403 frees, 9,732,648 bytes allocated after: ==9377== total heap usage: 155,643 allocs, 155,643 frees, 9,672,788 bytes allocated Signed-off-by: Ran Benita <ran234@gmail.com>