research


Log

Author Commit Date CI Message
Evgenii Kliuchnikov 2903f45e 2025-06-24T07:47:47 ignore slices that cross sample boundary PiperOrigin-RevId: 775230773
Andreas Deininger 93d0ac53 2025-05-27T09:47:01 Fix typos (#1242) Co-authored-by: Eugene Kliuchnikov <eustas.ru@gmail.com>
Evgenii Kliuchnikov 281b0aa5 2025-01-08T10:12:41 Fix most of build_test pipeline
Evgenii Kliuchnikov 57610b71 2025-01-07T13:59:06 translate includes in brotli/research PiperOrigin-RevId: 713033033
Evgenii Kliuchnikov 95b81fcc 2025-01-06T23:51:35 Partially pick https://github.com/google/brotli/pull/1232 PiperOrigin-RevId: 712791222
Evgenii Kliuchnikov 9b83be23 2023-10-26T02:02:51 fix wording PiperOrigin-RevId: 576788685
Evgenii Kliuchnikov 70e7b1ae 2023-07-06T11:56:38 simplify building of fuzzer PiperOrigin-RevId: 545950923
Eugene Kliuchnikov ce92c956 2023-01-03T20:44:14 brotlidump: fix dictionary file discovery (#997)
Evgenii Kliuchnikov a8f5813b 2022-11-17T13:03:09 Update Documentation: - add note that brotli is a "stream" format, not an archive-like - regenerate .1 with Pandoc Build: - drop legacy "BROTLI_BUILD_PORTABLE" option - drop "BROTLI_SANITIZED" definition Code: - c: comb includes - c/enc: extract encoder state into separate header - c/enc: drop designated q10 codepath - c/enc: dealing better with flushing of empty stream - fix MSVC compilation API: - py: use library version instead of one in version.h - c: add pluggable API to report consumed input / produced output - c/java: support "lean" prepared dictionaries (without copy of source)
Eugene Kliuchnikov 8376f72e 2021-11-10T10:34:39 Prepare for copybara (#939) Co-authored-by: Eugene Kliuchnikov <eustas@chromium.org>
Eugene Kliuchnikov 62662f87 2021-09-08T09:18:45 Strip "./" in includes (#925) Co-authored-by: Eugene Kliuchnikov <eustas@chromium.org>
Eugene Kliuchnikov 0e42caf3 2021-08-31T14:07:17 Migrate to github actions (#920) Not all combinations are migrated to the initial configuration; corresponding TODOs added. Drive-by: additional combinations uncovered minor portability problems -> fixed Drive-by: remove no-longer used "script" files. Co-authored-by: Eugene Kliuchnikov <eustas@chromium.org>
Eugene Kliuchnikov 68f1b90a 2021-08-18T19:15:07 Update (#918) Prepare to use copybara workflow.
Eugene Kliuchnikov f8c67177 2021-06-23T09:40:57 Update (#908) * re-enable Js build/test * improve decoder performance * rewrite dictionary data in Java/Js to a shorter uncompressed form * improve dictionary generation tool
Tim Gates 685d7bae 2020-09-27T19:00:29 docs: Fix small typo: rougly -> roughly (#849)
Eugene Kliuchnikov 223d80cf 2020-08-26T12:32:27 Update (#826) * IMPORTANT: decoder: fix potential overflow when input chunk is >2GiB * simplify max Huffman table size calculation * eliminate symbol duplicates (static arrays in .h files) * minor combing in research/ code
Eugene Kliuchnikov 7f740f13 2020-05-15T11:06:21 Update (#807) - fix formatting - fix type conversion - fix no-op arithmetic with null-pointer - improve performance of hash_longest_match64 - go: detect read after close - java decoder: support compound dictionary - remove executable flag on non-scripts
Eugene Kliuchnikov 4b2b2d4f 2019-04-12T13:57:42 Update (#749) Update: * Bazel: fix MSVC configuration * C: common: extended documentation and helpers around distance codes * C: common: enable BROTLI_DCHECK in "debug" builds * C: common: fix implicit trailing zero in `kPrefixSuffix` * C: dec: fix possible bit reader discharge for "large-window" mode * C: dec: simplify distance decoding via lookup table * C: dec: reuse decoder state members memory via union with lookup table * C: dec: add decoder state diagram * C: enc: clarify access to static dictionary * C: enc: improve static dictionary hash * C: enc: add "stream offset" parameter for parallel encoding * C: enc: reorganize hasher; now Q2-Q3 require exactly 256KiB to avoid global TCMalloc lock * C: enc: fix rare access to uninitialized data in ring-buffer * C: enc: reorganize logging / checks in `write_bits.h` * Java: dec: add "large-window" support * Java: dec: improve speed * Java: dec: debug and 32-bit mode are now activated via system properties * Java: dec: demystify some state variables (use better names) * Dictionary generator: add single input mode * Java: dec: modernize tests * Bazel: js: pick working commit for closure rules
Eugene Kliuchnikov 8544ae85 2018-06-09T11:17:13 Update (#680) * fix MSVC warnings * cleanups
Eugene Kliuchnikov 1e7ea1d8 2018-06-04T17:53:16 Inverse bazel project/workspace tree (#677) * Inverse bazel workspace tree. Now each subproject directly depends on root (c) project. This helps to mitigate Bazel bug bazelbuild/bazel#2391; short summary: Bazel does not work if referenced subproject `WORKSPACE` uses any repositories that embedding project does not. Bright side: building C project is much faster; no need to download closure, go and JDK...
Eugene Kliuchnikov 631fe194 2018-03-20T17:37:41 Update (#651) * fix `bazel` build (ignore switch case fall-through) * add `NPOSTFIX` / `NDIRECT` encoder parameters * fix source file lists (add `params.h`) * fix bug in `durchschlag` * print clarifying messages when CLI argument parsing fails
Eugene Kliuchnikov 533843e3 2018-03-02T15:49:58 Update (#643) Update * make the zopflification aware of `NDIRECT`, `NPOSTFIX` (better compression in `font` mode) * add small and simple decoder tool * fix typo * Java: wrapper: make decoder channel more async-friendly Ramp up version to 1.0.3 / 1.0.3
Eugene Kliuchnikov 35e69fc7 2018-02-26T09:04:36 New feature: "Large Window Brotli" (#640) * New feature: "Large Window Brotli" By setting special encoder/decoder flag it is now possible to extend LZ-window up to 30 bits; though produced stream will not be RFC7932 compliant. Added new dictionary generator - "DSH". It combines speed of "Sieve" and quality of "DM". Plus utilities to prepare train corpora (remove unique strings). Improved compression ratio: now two sub-blocks could be stitched: the last copy command could be extended to span the next sub-block. Fixed compression ineffectiveness caused by floating numbers rounding and wrong cost heuristic. Other C changes: - combined / moved `context.h` to `common` - moved transforms to `common` - unified some aspects of code formatting - added an abstraction for encoder (static) dictionary - moved default allocator/deallocator functions to `common` brotli CLI: - window size is auto-adjusted if not specified explicitly Java: - added "eager" decoding both to JNI wrapper and pure decoder - huge speed-up of `DictionaryData` initialization * Add dictionaryless compressed dictionary * Fix `sources.lst` * Fix `sources.lst` and add a note that `libtool` is also required. * Update setup.py * Fix `EagerStreamTest` * Fix BUILD file * Add missing `libdivsufsort` dependency * Fix "unused parameter" warning.
Daniel Chýlek b5033d0e 2018-02-08T12:48:24 Fix brotlidump.py crashing when complex prefix code has exactly 1 non-zero code length (#635) According to the format specification regarding complex prefix codes: > If there are at least two non-zero code lengths, any trailing zero > code lengths are omitted, i.e., the last code length in the > sequence must be non-zero. In this case, the sum of (32 >> code > length) over all the non-zero code lengths must equal to 32. > If the lengths have been read for the entire code length alphabet > and there was only one non-zero code length, then the prefix code > has one symbol whose code has zero length. The script does not handle a case where there is just 1 non-zero code length where the sum rule doesn't apply, which causes a StopIteration exception when it attempts to read past the list boundaries. An example of such file is tests/testdata/mapsdatazrh.compressed. I made sure this change doesn't break anything by processing all *.compressed files from the testdata folder with no thrown exceptions.
Eugene Kliuchnikov 0ad94eed 2017-11-28T15:37:28 Update (#620) * add autotools build * separate semantic and ABI version * extract sources.lst (used by CMake and Automake) * share pkgconfig templates (used by CMake and Automake) * decoder: always set `total_out` * encoder: fix `BROTLI_ENSURE_CAPACITY` macro (no-op after preprocessor) * decoder/encoder: refine `free_func` contract
Eugene Kliuchnikov 39ef4bbd 2017-10-13T11:25:03 Add new (fast) dictionary generator engine. (#616) Add CLI for dictionary generation. Add BUILD file for research folder
Tomáš Popela a0c7dafe 2017-10-10T11:24:13 Fix permissions of various files in project (#613) Move from 755 to 644.
Eugene Kliuchnikov a629289e 2017-08-28T11:31:29 Update (#590) * add transpiled JS decoder * make PY wrapper accept memview * fix dictionary generator * speedup compression of RLEish data
Eugene Kliuchnikov 52441069 2017-07-21T10:07:24 Update (#574) * Update * decoder: better behavior after failure * encoder: replace "len_x_code" with delta * research: add experimental dictionary generator * python: test combing
Eugene Kliuchnikov 27d94590 2016-12-22T13:03:28 Research (#491) * add advanced mode for optimal references generator * fix #489 Thanks to Ivan Nikulin for working on it.
Eugene Kliuchnikov fd96151b 2016-12-20T18:00:51 Move brotlidump.py to research/ (#487)
Eugene Kliuchnikov dd8fa3e8 2016-09-22T11:32:23 Update research * don't use `assert` when side-effect is desired * use `gflags` to pick options from args Other changes: * teach stub `Makefile` to do partial rebuild * remove obsolete `tools/version.h`
Ivan Nikulin 92940229 2016-09-19T19:12:30 Replace sais.hxx by submodule hillbig/esaxx.
Ivan Nikulin 42919320 2016-09-15T17:19:26 Update research tools description.
Ivan Nikulin 0e52c59a 2016-09-15T16:59:52 Update variable naming.
Ivan Nikulin 9589396e 2016-09-15T11:34:19 Add description of research tools.
Ivan Nikulin 58cecf17 2016-09-15T10:44:19 Add distance encoding research tools.