scripts


Log

Author Commit Date CI Message
Pierre Le Marre 636b8b97 2025-03-19T14:11:52 test: Add merge mode tests for all the sections The merge modes tests C file is now only generated locally, because it is too large. The generator Python script requires Jinja2, so the test is optional and depends on Jinja2 availability. The tests aim to be exhaustive with various combinations of a base and an update: - plain base + plain update, for every mode - plain base + include (for every mode) update (every mode) - single include (base +| update)
Pierre Le Marre 08d9a031 2025-04-08T06:31:33 Unicode: Make surrogate handling more explicit
Pierre Le Marre 47c2c820 2025-04-08T18:09:41 Add internal API to get all explicit names of a keysym
Pierre Le Marre b2744402 2025-02-15T16:44:44 doc: Fix cool URIs
Pierre Le Marre e1892266 2025-02-13T16:57:46 clang-tidy: Miscellaneous fixes
Pierre Le Marre ce9bcbe0 2025-02-07T16:31:37 scripts: Rename keysyms-related files Previous names were too generic. Fixed by using explicit names and add the `.py` file extension.
Ran Benita a380ba52 2025-01-25T07:00:43 Move XKB_EXPORT to headers The Windows dllexport annotation wants to be on the declarations, not the definitions. Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita df2322d7 2025-02-05T14:41:21 Replace include guards by `#pragma once` We currently have a mix of include headers, pragma once and some missing. pragma once is not standard but is widely supported, and we already use it with no issues, so I'd say it's not a problem. Let's convert all headers to pragma once to avoid the annoying include guards. The public headers are *not* converted. Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita 43f6036d 2025-02-01T16:34:00 xkbcomp/keywords: don't require C string for keyword lookup Needed for next commit, but good regardless. No noticeable effect on performance. Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita e120807b 2025-01-29T15:35:22 Update license notices to SPDX short identifiers + update LICENSE Fix #628. Signed-off-by: Ran Benita <ran@unusedvar.com>
Pierre Le Marre c85c9bdc 2025-01-27T17:15:06 symbols: Allow levels with different keysyms and actions counts Contrary to groups, there is no reason for levels to restrict the same count of keysyms and actions.
Pierre Le Marre 4ac22263 2025-01-16T23:22:40 keysyms: Check clashes between keysyms names and keywords Due to how our parser is implemented, keysyms names that are also valid keywords require special handling. Added a check for these clashes in the keysym generator. The only current clash, `section`, is already handled. Note that it means that e.g. `section`, `Section` and `sEcTiOn` all parse to the same keysym. This side effect is fine here, because *currently* there is no other keysym that clashes with any of the possible case variations of `section`. But in order to be extra cautious, we now test those clashes too. Hopefully we will never have a clash again, but while it is unlikely that we modify the keywords, the keysyms are not a frozen set.
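The clash check described in 4ac22263 can be pictured with a minimal sketch. It assumes that a clash means "the keysym name equals a keyword when case is ignored"; the keyword and keysym lists below are small illustrative samples, and `find_keyword_clashes` is a hypothetical helper, not the generator's actual function.

```python
# Illustrative samples only: the real lists come from the libxkbcommon sources.
KEYWORDS = {"section", "include", "override"}
KEYSYM_NAMES = ["section", "a", "A", "Return"]


def find_keyword_clashes(keysym_names, keywords):
    """Return the keysym names that match a keyword, ignoring case."""
    folded_keywords = {keyword.casefold() for keyword in keywords}
    return sorted(
        name for name in keysym_names if name.casefold() in folded_keywords
    )


clashes = find_keyword_clashes(KEYSYM_NAMES, KEYWORDS)
print(clashes)
```

A generator could fail, or require an explicit allow-list entry, whenever the returned list contains a name that is not already handled.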
Pierre Le Marre 7036e46c 2025-01-13T15:20:47 symbols: Add tests for key merge modes (keysyms/actions) This commit adds tests for merging various key configurations: - With/without keysyms/actions - Single/multiple keysyms/actions per level We test all the merge modes for including a map (global) as well as directly on the keys (local): - default (global: include, local: implicit) - augment - override - replace The tests data are generated with: - A Python script `scripts/update-merge-modes-tests.py` for keycodes and symbols data. Use `--debug` for extra comments to help debugging. The script can optionally generate C headers for alternative key sequence tests, that were used before implementing golden tests. The latter tests are not used anymore (duplicate with golden tests) but their generator is kept for now, as they can still be useful for debugging or writing similar tests. - The `merge-modes` test generates its own keymap files for golden tests, using: `build/test-merge-modes update`. It can also replace them with the obtained output rather than the expected one using `build/test-merge-modes update-obtained`, which is very useful for debugging.
Pierre Le Marre d706e649 2025-01-10T14:18:21 scripts: Fix formatting with Ruff 0.9 See: https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md#090
Pierre Le Marre b9b4ab47 2024-12-09T09:21:01 keysyms: Add sharp S upper case mapping exception The case mapping `ssharp` ß ↔ `U1E9E` ẞ was added in 13b30f4f0dccc08dfea426d73570b913596ed602 but was broken: - For the lower case mapping it returned the keysym `0x10000df`, which is an invalid Unicode keysym. - For the upper case mapping it returned the upper Unicode code point rather than the corresponding keysym. It did accidentally enable the detection of alphabetic key type for the pair (ß, ẞ) though. However this detection was accidentally removed in 5c7c79970a2800b6248e829464676e1f09c5f43d (v1.7) with an attempt to fix the wrong keysym case mapping. Finally both the *lower* case mapping and the key type detection were fixed for good when we implemented the complete Unicode simple case mappings and corresponding tests in e83d08ddbc9851944c662c18e86d4eb0eff23e68. However, the *upper* case mapping `ssharp` → `U1E9E` remained disabled. Indeed, ẞ is a relatively recent addition to Unicode (2008) and had no official recommendation, until recently. So while the lower mapping ẞ→ß exists in Unicode, its converse upper mapping does not. Yet since 2017 the Council for German Orthography (Rat für deutsche Rechtschreibung) recommends[^1] ẞ as the capitalization of ß. Due to its stability policies, the Unicode Character Database (UCD) that we use to generate our keysym case mappings (via ICU) cannot update the simple case mapping of ß. Discussions are currently ongoing in the Unicode mailing list[^2] and CLDR[^3] about how to deal with the new recommended case mapping. However, the discussions are oriented on text-processing and compatibility mappings, while libxkbcommon is on a rather lower level. It seems that the slow adoption of ẞ is partly due to the difficulty to type it. Since ẞ is used only for ALL CAPS casing, the expectation is to type it using CapsLock. 
While our detection of alphabetic key types works well[^4] for the pair (ß,ẞ), the *internal capitalization* currently does not work and is fixed by this commit. Added the ß → ẞ upper mapping: - Added an exception in the generation script - Fixed tests - Added documentation of the exceptions in `xkbcommon.h` - Added/updated log entries [^1]: https://www.rechtschreibrat.com/regeln-und-woerterverzeichnis/ [^2]: https://corp.unicode.org/pipermail/unicode/2024-November/011162.html [^3]: https://unicode-org.atlassian.net/browse/CLDR-17624 [^4]: Except libxkbcommon 1.7, see the second paragraph.
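As a rough illustration of the exception added in b9b4ab47, the sketch below layers a one-entry exception table over a single-code-point upper-case mapping. It uses Python's `str.upper` as a stand-in for the ICU-derived data that the real generator uses, and the names `UPPER_EXCEPTIONS` and `simple_upper` are hypothetical.

```python
# Only the ß -> ẞ entry is taken from the commit message.
UPPER_EXCEPTIONS = {0x00DF: 0x1E9E}


def simple_upper(code_point: int) -> int:
    """Map one code point to one upper-case code point (illustrative only)."""
    if code_point in UPPER_EXCEPTIONS:
        return UPPER_EXCEPTIONS[code_point]
    upper = chr(code_point).upper()
    # Python applies full case mappings, e.g. "ß".upper() == "SS". A simple
    # mapping must stay one-to-one, so keep the input when the result is
    # longer than one character.
    return ord(upper) if len(upper) == 1 else code_point


print(hex(simple_upper(0x00DF)))
```

Without the exception entry, the function would leave `U+00DF` unchanged, which matches the behavior the commit message describes for the data derived from the Unicode Character Database.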
Pierre Le Marre a7b84be9 2024-11-03T23:39:02 Fix types annotations of Python scripts
Pierre Le Marre bfddd9a8 2024-10-30T14:00:52 Add typing annotations to Python scripts
Pierre Le Marre b5f07797 2024-09-23T07:28:02 scripts: Improve messages registry update
Pierre Le Marre 378badab 2024-09-19T17:30:55 Add function xkb_keysym_is_deprecated This function allows checking whether a keysym is deprecated, based on the keysym and optionally its name. The generation of the table of deprecated keysyms relies on the rules described in `xkbcommon-keysyms.h`. The `ks_table.h` is now generated deterministically by explicitly setting the random seed to a constant. This will avoid noisy diffs in the future.
Pierre Le Marre e4269202 2024-09-20T10:41:00 keysyms: Fix off-by-one XKB_KEYSYM_NAME_MAX_SIZE The constant did not account for the terminating `NUL` byte and this was sadly not caught by the tests. Fixed the invalid value, the corresponding script and the tests.
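The off-by-one fixed in e4269202 comes down to the difference between the longest name length and the buffer size needed to hold it as a C string. A minimal sketch, assuming ASCII names and using a hypothetical helper `keysym_name_max_size` with a small sample list:

```python
def keysym_name_max_size(names):
    """Size of the smallest C buffer that can hold any of the names."""
    longest = max(len(name.encode("ascii")) for name in names)
    # One extra byte for the terminating NUL of a C string.
    return longest + 1


sample_names = ["a", "Return", "XF86ScreenSaver"]
print(keysym_name_max_size(sample_names))
```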
Pierre Le Marre e83d08dd 2024-02-23T17:10:15 keysyms: Fast and complete case mappings (Unicode 15.1) The current code to handle keysym case mappings is quite complex and slow. It is also incomplete, as it does not cover recent versions of the Unicode database. Finally, it does not handle title case correctly. It would be easier if we were to use only a lookup table, but a trivial implementation would lead to a huge array: the cased characters range from `U+0041` to `U+1F189`, i.e. a span of 127 304 elements. Thus we need some tricks to compress the lookup table. We based our work on the post: https://github.com/apankrat/notes/blob/3c551cb028595fd34046c5761fd12d1692576003/fast-case-conversion/README.md The compression algorithm is roughly: 1. Compute the delta between the characters and their mappings. 2. Split the delta array into chunks of a given size. 3. Rearrange the order of the chunks in order to optimize consecutive chunks overlap. 4. Create a data table with the reordered chunks and an index table that maps the original chunk index to its offset in the data table. The compression algorithm is then applied a second time to the previous index table. The complete algorithm optimizes the two chunk sizes in order to get the lowest total data size. The mappings were generated using CPython 3.12.4, PyICU 2.13, PyYaml 6.0.1 and ICU 75.1. Also: - Added explicit list of named keysyms and their case mappings. - Added benchmark for case mappings. - Rework ICU tests. Note: 13b30f4f0dccc08dfea426d73570b913596ed602 introduced a fix for sharp S `U+00DF`. 
With the new implementation, the *conversion* functions `xkb_keysym_to_{lower,upper}` leave it *unchanged*, while the *predicate* functions `xkb_keysym_is_{lower,upper_or_title}` produce the expected results: ```c xkb_keysym_to_upper(XKB_KEY_ssharp) == XKB_KEY_ssharp; xkb_keysym_to_lower(XKB_KEY_ssharp) == XKB_KEY_ssharp; xkb_keysym_to_lower(XKB_KEY_Ssharp) == XKB_KEY_ssharp; xkb_keysym_is_lower (XKB_KEY_ssharp) == true; xkb_keysym_is_upper_or_title(XKB_KEY_Ssharp) == true; ```
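The compression scheme outlined in e83d08dd can be sketched in a simplified form. The version below covers steps 1, 2 and 4 only: it stores each distinct chunk once but does not reorder chunks to exploit overlaps (step 3) and does not apply the second pass to the index table. It is restricted to code points below `U+0100` and uses Python's `str.lower` in place of the ICU data; all names are hypothetical.

```python
def compress(deltas, chunk_size):
    """Split deltas into chunks and store each distinct chunk only once."""
    data, index, offsets = [], [], {}
    for start in range(0, len(deltas), chunk_size):
        chunk = tuple(deltas[start:start + chunk_size])
        chunk += (0,) * (chunk_size - len(chunk))  # pad the last chunk
        if chunk not in offsets:
            offsets[chunk] = len(data)
            data.extend(chunk)
        # index maps the original chunk number to its offset in data
        index.append(offsets[chunk])
    return data, index


def lookup(data, index, chunk_size, code_point):
    """Fetch the delta of a code point through the index table."""
    return data[index[code_point // chunk_size] + code_point % chunk_size]


# Step 1: delta between each code point and its lower-case mapping.
LIMIT = 0x100
deltas = []
for cp in range(LIMIT):
    lower = chr(cp).lower()
    deltas.append(ord(lower) - cp if len(lower) == 1 else 0)

CHUNK_SIZE = 16
data, index = compress(deltas, CHUNK_SIZE)


def to_lower(code_point):
    """Lower-case a code point using the compressed tables."""
    if code_point >= LIMIT:
        return code_point
    return code_point + lookup(data, index, CHUNK_SIZE, code_point)


print(len(deltas), len(data), len(index))
```

Because most chunks in this range are all zeros, the data table ends up much smaller than the flat delta array; the overlap optimization and the second-level index described in the commit would shrink the tables further.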
Pierre Le Marre b5d3fa9e 2024-07-12T10:36:33 ci: Add ruff-format check - Add CI step - Fix errors
Pierre Le Marre ddbefb67 2024-03-07T14:20:42 keysyms: Make locale explicit in scripts/update-keysyms
Pierre Le Marre 53d9881e 2024-03-05T10:28:11 keysyms: Fix inconsistent case-insensitive name lookup `xkb_keysym_from_name` has inconsistent behavior when used with the flag `XKB_KEYSYM_CASE_INSENSITIVE`: ```c xkb_keysym_from_name("a", XKB_KEYSYM_CASE_INSENSITIVE) == XKB_KEY_a; xkb_keysym_from_name("A", XKB_KEYSYM_CASE_INSENSITIVE) == XKB_KEY_a; xkb_keysym_from_name("dead_a", XKB_KEYSYM_CASE_INSENSITIVE) == XKB_KEY_dead_A; xkb_keysym_from_name("dead_A", XKB_KEYSYM_CASE_INSENSITIVE) == XKB_KEY_dead_A; xkb_keysym_from_name("dead_o", XKB_KEYSYM_CASE_INSENSITIVE) == XKB_KEY_dead_o; xkb_keysym_from_name("dead_O", XKB_KEYSYM_CASE_INSENSITIVE) == XKB_KEY_dead_o; xkb_keysym_from_name("KANA_tsu", XKB_KEYSYM_CASE_INSENSITIVE) == XKB_KEY_kana_tsu; xkb_keysym_from_name("KANA_TSU", XKB_KEYSYM_CASE_INSENSITIVE) == XKB_KEY_kana_tsu; xkb_keysym_from_name("KANA_ya", XKB_KEYSYM_CASE_INSENSITIVE) == XKB_KEY_kana_YA; xkb_keysym_from_name("KANA_YA", XKB_KEYSYM_CASE_INSENSITIVE) == XKB_KEY_kana_YA; xkb_keysym_from_name("XF86Screensaver", XKB_KEYSYM_CASE_INSENSITIVE) == XKB_KEY_XF86ScreenSaver; xkb_keysym_from_name("XF86ScreenSaver", XKB_KEYSYM_CASE_INSENSITIVE) == XKB_KEY_XF86ScreenSaver; ``` So currently, if two keysym names differ only by case, then the lower-case *keysym* is returned, not the keysym corresponding to the lower-case keysym *name*. Indeed, `xkb_keysym_from_name` uses `xkb_keysym_is_lower` to test if a keysym is a lower-case keysym. Let’s look at the example for keysyms `a` and `A`: we get the keysym `a` not because its name is lower case, but because `xkb_keysym_is_lower(XKB_KEY_a)` returns true and `xkb_keysym_is_lower(XKB_KEY_A)` returns false. So the results are correct according to the doc: - Katakana is not a bicameral script, so e.g. `kana_ya` is *not* the lower case of `kana_YA`. - As for the `dead_*` keysyms, they are not cased either because they do not correspond to characters. - `XF86ScreenSaver` and `XF86Screensaver` are two different keysyms. 
But this is also very counter-intuitive: `xkb_keysym_is_lower` is not the right function to use in this case, because one would expect to check only the name, not the corresponding character case: ```c xkb_keysym_from_name("KANA_YA", XKB_KEYSYM_CASE_INSENSITIVE) == XKB_KEY_kana_ya; xkb_keysym_from_name("XF86ScreenSaver", XKB_KEYSYM_CASE_INSENSITIVE) == XKB_KEY_XF86Screensaver; ``` Fixed by making the order of the keysyms names consistent in `src/ks_tables.h`: 1. Sort by the casefolded name: e.g. `kana_ya` < `kana_YO`. 2. If same casefolded name, then sort by cased name, i.e. for ASCII: upper before lower: e.g. `kana_YA` < `kana_ya`. Thus we now have e.g. `kana_YA` < `kana_ya` < `kana_YO` < `kana_yo`. The lookup logic has also been simplified. Added exhaustive test for ambiguous case-insensitive names.
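The two-level ordering from 53d9881e maps directly onto a tuple sort key. The sketch below reproduces the ordering given in the commit message; the lookup helper is an assumption about how such a table could be queried (it picks the last entry of a group, i.e. the one whose name sorts as most lower-case) and is not the library's actual lookup code.

```python
def sort_keysym_names(names):
    """Sort by casefolded name first, then by the cased name itself."""
    return sorted(names, key=lambda name: (name.casefold(), name))


def lookup_case_insensitive(ordered_names, query):
    """Hypothetical lookup: last entry among names equal up to case."""
    folded = query.casefold()
    matches = [name for name in ordered_names if name.casefold() == folded]
    return matches[-1] if matches else None


ordered = sort_keysym_names(
    ["kana_yo", "kana_YA", "XF86Screensaver", "kana_YO", "kana_ya",
     "XF86ScreenSaver"]
)
print(ordered)
print(lookup_case_insensitive(ordered, "KANA_YA"))
```

With this ordering, a query such as `KANA_YA` resolves to the `kana_ya` name and `XF86ScreenSaver` to `XF86Screensaver`, in line with the expected results quoted in the commit message.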
Pierre Le Marre 806c5dc0 2024-01-10T17:17:48 scripts: Fix update-headers command The file name misses an extension.
Pierre Le Marre 0074baf4 2023-12-19T07:28:52 keysyms: Add XKB_KEYSYM_NAME_MAX_SIZE for internal use Currently there is no indication of the maximum length of keysym names. This is statically known, so add the following new *internal* API: `XKB_KEYSYM_NAME_MAX_SIZE`.
Pierre Le Marre 82f138c6 2023-12-14T09:13:35 keysyms: Add min and max assigned keysyms internal API Currently there is no direct way to know the minimum and maximum keysym values that are assigned, i.e. that have an explicit name or are Unicode keysyms. Introduce the following new internal API: - XKB_KEYSYM_MIN_ASSIGNED - XKB_KEYSYM_MAX_ASSIGNED - XKB_KEYSYM_MIN_EXPLICIT - XKB_KEYSYM_MAX_EXPLICIT - XKB_KEYSYM_COUNT_EXPLICIT Also add a bunch of tests to ensure consistent keysyms bounds.
Pierre Le Marre 238d1324 2023-09-29T11:33:28 Keysyms: Fix missing hpYdiaeresis The handling of keysym name guards (e.g. `#ifndef XK_Ydiaeresis`) was incomplete and led to a missing keysym. Make `scripts/makeheader` more robust to C macros handling.
Pierre Le Marre 1a4a89a7 2023-09-28T09:50:43 Python: make ruff & black happy
Pierre Le Marre 9c2f0fdb 2023-09-28T07:18:51 scripts/makeheader: Minor improvements Use `pathlib` for proper path handling.
Wismill b900faf7 2023-09-20T07:45:15 Keysyms: improve generator (#364) Motivation: normalization of keysyms header files in `xorgproto`. See: https://gitlab.freedesktop.org/xorg/proto/xorgproto/-/merge_requests/80 Improve `scripts/makeheader`: - Simplify `evdev` and `XK_` substitution and improve alignment. Also, perform some additional `XK_` substitutions in comments. - Format with `black`.
Pierre Le Marre 417d0747 2023-09-18T18:17:39 Add xkb-check-messages tool This tool checks whether message codes are supported. This is useful e.g. for CI, where one may want to grep for some XKB error codes and ensure that these are still supported.
Pierre Le Marre ef81d04e 2023-09-18T18:17:34 Structured log messages with a message registry Currently there is little structure in the log messages, making it difficult to use them for the following use cases: - A user looking for help about a log message: the user probably uses a search engine, thus the results will depend on the proper indexing of our documentation and the various forums. It relies only on the wording of the message, which may change with time. - A user wants to filter the logs resulting from the use of one of the components of xkbcommon. A typical example would be testing xkeyboard-config against libxkbcommon. It requires the use of a pattern (simple words detection or regex). The issue is that the pattern may become silently out-of-sync with xkbcommon. A common practice (e.g. in compilers) is to assign unique error codes to reference these messages, along with an error index for documentation. Thus this commit implements the following features: - Create a message registry (message-registry.yaml) that defines the log messages produced by xkbcommon. This is a simple YAML file that provides, for each message: - A unique numeric code as a short identifier. It is used in the output message and thus can easily be filtered to spot errors or searched for on the internet. It must not change: if the semantics of a message change, it is better to introduce a new message for clarity. - A unique text identifier, meant for two uses: 1. Generate constants dealing with log information in our code base. 2. Generate human-friendly names for the documentation. - A type: currently warning or error. Used to prefix the constants (see above) and for basic classification in documentation. - A short description, used as concise and mandatory documentation. - An optional detailed description. - Optional examples, intended to help users fix issues themselves. - Version of xkbcommon in which it was added. For old entries this is often unknown, so they will default to 1.0.0. 
- Version of xkbcommon it was removed (optional) No entry should ever be deleted from this index, even if the message is not used anymore: it ensures we have unique identifiers along the history of xkbcommon, and that users can refer to the documentation even for older versions. - Add the script update-message-registry.py to generate the following files: - messages.h: message code enumeration for the messages currently used in the code base. Currently a private API. - message.registry.md: the error index documentation page. - Modify the logging functions to use structured messages. This is a work in progress.
Pierre Le Marre eec38903 2023-06-23T11:23:18 Fix typo in ensure-stable-doc-urls.py
Wismill 64aaa7cd 2023-05-14T15:11:15 Add support for stable doc URLs (#342) Doc URLs may change with time because they depend on Doxygen machinery. This is unfortunate because it is good practice to keep valid URLs (see: https://www.w3.org/Provider/Style/URI.html). I could not find a built-in solution in Doxygen, so the solution proposed here is to maintain a registry of all URLs and manage legacy URLs as redirections to their canonical page. This commit adds a registry of URLs that has three functions: - Check no previous URL is now invalid. - Add aliases for moved pages. - Generate redirection pages for aliases. The redirection works with a simple <meta http-equiv="refresh"> HTML tag. See: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/meta#http-equiv This commit also initializes the URLs registry with current pages and some redirections needed after recent documentation refactoring. Finally, the CI is updated to catch any change that invalidates previous URLs.
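A redirection page of the kind described in 64aaa7cd can be generated with a few lines of Python. This is a minimal sketch that only assumes the `<meta http-equiv="refresh">` mechanism named in the commit; the function name, the page layout and the example file name are hypothetical and do not reflect the actual output of `ensure-stable-doc-urls.py`.

```python
from html import escape


def redirect_page(canonical_url: str) -> str:
    """Build a tiny HTML page that redirects to the canonical URL."""
    target = escape(canonical_url, quote=True)
    return (
        "<!DOCTYPE html>\n"
        "<html><head>\n"
        f'<meta http-equiv="refresh" content="0; url={target}">\n'
        "</head><body>\n"
        f'<p>This page has moved to <a href="{target}">{target}</a>.</p>\n'
        "</body></html>\n"
    )


# Hypothetical alias: an old page name redirecting to its new location.
page = redirect_page("new-page.html")
print(page)
```

The visible link in the body gives readers a fallback if their browser ignores the refresh directive.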
Pierre Le Marre fc664cf1 2023-05-13T05:30:11 Improve documentation - Add introduction to XKB - Embrace Doxygen features - More cross links
Wismill 0e9c2ec9 2023-04-30T21:30:36 Improve the doc of the XKB keymap text format, V1 (#321) - Add table of contents - Add terminology section - (WIP) Add Introduction to the format - Improve the keycode section - Improve the interpret section - Add guide to create and use modifiers - (WIP) Add actions documentation - Add cross-references - Add keysyms header to documentation
Peter Hutterer 8e9f943d 2021-05-14T08:36:59 scripts/update-keysyms: fix path to the include files after de1b6943d Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net>
Adrian Perez de Castro 5cd76a8d 2021-04-26T17:38:48 Windows: Pass list of symbols to export to MSVC Arrange for passing .def files with the lists of symbols to export from DLLs when building on Windows with MSVC. Without this no symbols were being exported at all. The .def files are generated from the .map files at build time using scripts/map-to-def, which avoids needing to maintain two different sets of files.
Ran Benita 5d297c50 2021-04-08T10:13:27 scripts: update license note in perfect_hash.py Ref: https://github.com/ilanschnell/perfect-hash/issues/5 Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita 45b1ca22 2021-04-01T22:46:56 keysym: speed up the perfect hash function Make it use a bit operation instead of an expensive modulo. perf diff: Baseline Delta Abs Shared Object Symbol ........ ......... ................. ................................... 28.15% -6.57% bench-compose [.] xkb_keysym_from_name Signed-off-by: Ran Benita <ran@unusedvar.com>
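The commit message of 45b1ca22 does not spell out the bit operation. One common way to replace a modulo, shown here purely as an assumption, is to size the table as a power of two so that reducing a hash value becomes a mask. `TABLE_SIZE` is a hypothetical value.

```python
TABLE_SIZE = 1 << 12  # hypothetical size; the trick requires a power of two
MASK = TABLE_SIZE - 1


def slot_with_modulo(hash_value: int) -> int:
    """Reduce a hash value to a table slot with a modulo."""
    return hash_value % TABLE_SIZE


def slot_with_mask(hash_value: int) -> int:
    """Same reduction with a bitwise AND, valid for power-of-two sizes."""
    return hash_value & MASK


print(slot_with_modulo(123456789), slot_with_mask(123456789))
```

For non-negative hash values both functions return the same slot, while the mask avoids an integer division in compiled code.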
Ran Benita 68e69b7d 2021-03-28T20:22:54 keysym: use a perfect hash function for case sensitive xkb_keysym_from_name In 7d84809fdccbb5898d0838849ec7c321410182d5 I added a fast path for the case-sensitive case, but it is still slowing down Compose parsing. Instead of the binary search, use a perfect hash function, computed with a simple python module I found (vendored). It is faster -- perf diff is: Baseline Delta Abs Shared Object Symbol ........ ......... ................. ................................... 22.35% -14.04% libc-2.33.so [.] __strcmp_avx2 16.75% +10.28% bench-compose [.] xkb_keysym_from_name 20.72% +2.40% bench-compose [.] parse.constprop.0 2.29% -1.97% bench-compose [.] strcmp@plt 2.56% +1.81% bench-compose [.] resolve_name 2.37% +0.92% libc-2.33.so [.] __GI_____strtoull_l_internal 26.19% -0.63% bench-compose [.] lex 1.45% +0.56% libc-2.33.so [.] __memchr_avx2 1.13% -0.31% libc-2.33.so [.] __strcpy_avx2 Also reduces the binary size: Before: text data bss dec hex filename 341111 5064 8 346183 54847 build/libxkbcommon.so.0.0.0 After: text data bss dec hex filename 330215 5064 8 335287 51db7 build/libxkbcommon.so.0.0.0 Note however that it's still larger than before 7d84809fdccbb5898d08388: text data bss dec hex filename 320617 5168 8 325793 4f8a1 build/libxkbcommon.so.0.0.0 Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita 7d84809f 2021-03-28T15:51:01 keysym: fast path for case sensitive xkb_keysym_from_name xkb_keysym_from_name() is called a lot in Compose file parsing. The lower case handling slows things down a lot (particularly given we can't use the optimized strcasecmp() due to locale issues). So add separate handling for the case-sensitive case, which is used by Compose. To do this we need to add another version of the ks_tables table. This adds ~20kb to the shared library binary. We can probably do something better here but I think it's fine. Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita 2dd391b6 2021-02-27T21:38:02 scripts: remove meson-junit-report.py Not used since ed5a0b4fede69b8e6dc4db53d97ea4ae0a73956d. Signed-off-by: Ran Benita <ran@unusedvar.com>
Peter Hutterer 3852106a 2021-02-17T09:06:57 scripts: update makeheader script for the _EVDEVK keysym defines As of xorgproto commit 5dbb5b76597f [1], the 0x10081XXX keycode range is defined for direct evdev kernel keycode mapping. For example, KEY_MACRO1 (0x290) is mapped to 0x10081290. The format of the #define lines for these keys is stable to allow for parsing: #define XF86XK_FooBar _EVDEVK(0x123) /* optional comment */ Update our script so we detect these new lines. Our keysym generation is a two-step process: makeheader and then makekeys. Replacing the key with its full value in the makeheader script means we don't have to update makekeys to handle the _EVDEVK macro and our header file is fully resolved. [1] https://gitlab.freedesktop.org/xorg/proto/xorgproto/-/merge_requests/23 Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net>
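The `_EVDEVK` resolution described in 3852106a can be sketched with a regular expression over the stable `#define` format quoted in the commit. The base value `0x10081000` follows from the commit's example (`KEY_MACRO1`, `0x290`, maps to `0x10081290`); the regex, the function name and the sample line are illustrative and are not taken from `scripts/makeheader`.

```python
import re

# Base of the keysym range reserved for evdev kernel keycodes.
EVDEV_BASE = 0x10081000

EVDEVK_RE = re.compile(
    r"^#define\s+(?P<name>XF86XK_\w+)\s+_EVDEVK\((?P<code>0x[0-9A-Fa-f]+)\)"
)


def resolve_evdevk(line: str):
    """Return (name, full keysym value) for an _EVDEVK define, else None."""
    match = EVDEVK_RE.match(line)
    if match is None:
        return None
    return match.group("name"), EVDEV_BASE + int(match.group("code"), 16)


# Hypothetical input line following the format from the commit message.
resolved = resolve_evdevk("#define XF86XK_FooBar _EVDEVK(0x290) /* comment */")
print(resolved[0], hex(resolved[1]))
```

Resolving the value at this stage means a later step only ever sees plain hexadecimal constants, which is the motivation the commit gives for doing the substitution in the first script.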
Ran Benita 670566f0 2019-12-27T15:03:10 Only add GCC diagnostic pragmas when compiler is GCC compatible Avoid "unknown pragma" warnings on other compilers. Signed-off-by: Ran Benita <ran@unusedvar.com>
Ran Benita 90497b84 2019-10-31T21:21:35 scripts/makeheader: slight simplification Signed-off-by: Ran Benita <ran@unusedvar.com>
Sebastian Wick f0c0cb80 2019-10-31T17:04:49 scripts/makeheader: allow overriding the prefix path of the X11 headers with X11_HEADERS_PREFIX Signed-off-by: Sebastian Wick <sebastian@sebastianwick.net>
Adrian Perez de Castro c408adc2 2019-08-06T18:59:10 CI: Publish test results from Meson
Ran Benita 41bea9ab 2017-08-01T22:19:48 build: make doxygen run from the source tree I couldn't find any other way to make this work! Signed-off-by: Ran Benita <ran234@gmail.com>
Ran Benita 0a19267f 2017-07-29T14:37:23 build: move custom targets to scripts/ and remove from makefile These scripts generate source code that is committed to git and hence do not really belong in the build system. A maintainer runs them as needed. Signed-off-by: Ran Benita <ran234@gmail.com>