src/hb-ot-cmap-table.hh


Log

Author Commit Date CI Message
Behdad Esfahbod 49c52fa9 2023-06-26T17:55:29 [cmap] Don't zero a few allocations unnecessarily
Behdad Esfahbod 393f0f9f 2023-06-25T18:14:56 [map] Rename resize() to alloc() Better matches the functionality, and hb_vector_t.
Behdad Esfahbod 67b16247 2023-06-07T16:15:48 [set] Simplify a few set iterations as range loop
Behdad Esfahbod 3f2a1b64 2023-06-04T10:13:53 Fix build
Behdad Esfahbod 82741304 2023-06-04T09:41:41 [subset] start_embed never returns nullptr Remove checks.
Behdad Esfahbod ca27925d 2023-06-03T16:18:15 Use hb_codepoint_pair_t in more places
Garret Rieger a652281e 2023-05-26T19:47:50 [subset] Fix fuzzer timeout. Fixes https://oss-fuzz.com/testcase-detail/5979721620652032. Timeout was caused by degenerate map insert behaviour due to poor integer hash function. Presize the map to avoid it. Also fixes collect_mapping() for cmap format 13.
Behdad Esfahbod 9ee7c2ea 2023-04-25T16:13:54 [cmap] Minor remove magic number
Behdad Esfahbod 580b0dc1 2023-04-25T16:11:01 [cmap] Comment
Behdad Esfahbod 784fe9ac 2023-01-29T09:26:52 [cmap] Simplify caching
Matthias Clasen a451aa54 2023-01-29T11:25:28 Add back a null check This was accidentally dropped in the previous commit.
Matthias Clasen 318aa107 2023-01-29T09:17:17 [ot-font] Use the cmap cache more Use the cmap cache for get_nominal_glyph and get_variation_glyph as well. The first of these is used a lot in pango.
Behdad Esfahbod f8a744d9 2023-01-28T13:37:43 [ot-font] Add a cmap cache Speeds up Roboto shaping by 7%, for 1kb per face.
Behdad Esfahbod a34a204b 2023-01-11T12:27:19 [subset-plan] Simplify unicodes allocation
Behdad Esfahbod bd4b040e 2023-01-11T11:23:48 [shape-plan] Simplify glyphs_requested allocation
Behdad Esfahbod 023f595d 2022-12-03T11:18:05 [cmap] Speed up DefaultUVS::copy even more Another 14% on SourceHanSerifVF/10 benchmark.
Behdad Esfahbod 4ca61051 2022-12-03T11:15:06 [cmap] Remove double-min
Behdad Esfahbod cd29147e 2022-12-03T10:41:42 [cmap] Minor cast
Behdad Esfahbod 4cdb5cc6 2022-12-03T10:40:24 [cmap] Minor change iterator
Behdad Esfahbod a2d33779 2022-12-03T09:49:00 Fix arm bot build
Behdad Esfahbod dabbf13d 2022-12-03T09:46:11 [cmap] Speed up DefaultUVS::copy
Garret Rieger 2658370f 2022-11-30T00:19:10 [subset] make the cmap cache in accelerator const.
Garret Rieger 7a004a7a 2022-11-29T00:47:55 [subset] Cache per subtable cmap unicode mappings.
Behdad Esfahbod c503cf00 2022-11-28T15:53:35 [cmap] Store offset, not pointer, in cmap cache
Khaled Hosny a15ad778 2022-06-19T19:55:09 [arabic-fallback] Generate PUA table from data Uses packtab for more compact arrays.
Behdad Esfahbod 8c27c51c 2022-06-19T10:47:38 [arabic-pua] Rename symbols
Behdad Esfahbod 76989629 2022-06-19T10:41:45 [arabic-fallback] Disable PUA shaping under HB_NO_OT_SHAPER_ARABIC_FALLBACK
Behdad Esfahbod 55350377 2022-06-19T10:13:31 [cmap/ft] Only map 0xF000 range if font_page is NONE
Khaled Hosny c3f590bb 2022-06-16T11:04:13 [arabic] Support legacy PUA shaping Support legacy pre-OpenType Windows 3.1-era fonts, by remapping PUA code points in cmap table and letting our fallback shaper build the GSUB table. Uniscribe applies also mset-like substitution, but our fallback mark positioning gives better results, so this is not implemented.
Behdad Esfahbod 5dc12d7d 2022-06-03T01:37:02 [cmap] Rewrite set_for() slightly
Behdad Esfahbod 9552955e 2022-06-03T01:33:01 Add an unlikely
Behdad Esfahbod a7a68861 2022-06-02T18:59:15 [cmap] Convert another map use to unique_ptr
Behdad Esfahbod f82ee17a 2022-05-18T12:17:43 [map] Pre-size map in constructor if we can
Garret Rieger 8f9f0c49 2022-05-10T17:47:08 [subset] Enforce cmap12 group ordering constraints in collect_mapping. Fixes fuzzer issue: https://oss-fuzz.com/testcase-detail/6365271012540416
Behdad Esfahbod f10ddb8d 2022-05-05T11:21:24 [cmap] Use -1 as Unicode sentinel, not U+FFFF in Format12 serialize
Behdad Esfahbod 8a19968c 2022-05-05T11:17:23 [cmap] Use iterator bool operator
Behdad Esfahbod 052812b6 2022-05-04T15:38:30 Merge pull request #3561 from googlefonts/cmap_opt [subset] Further cmap subsetting speed optimizations
Garret Rieger f0c04114 2022-05-03T22:02:59 [subset] Embed unicode to gid list vector in subset plan.
Behdad Esfahbod 3fff2e91 2022-05-02T16:31:59 [perf/benchmark-font] Cosmetic
Behdad Esfahbod 307d2d8b 2022-05-02T16:30:22 [cmap] Sprinkle some 'unlikely's
Garret Rieger 088133d9 2022-05-02T21:29:16 [subset] cache cp to new gid list in subset plan. This avoids having to recompute the ordered list multiple times during cmap generation.
Garret Rieger 6922a256 2022-04-29T23:30:32 [subset] Change serialize_rangeoffset_glyid back to using iterator.
Garret Rieger c66fd50c 2022-04-29T23:18:53 [subset] in cmap4 serialization save cp to gid iter to memory. Iterator accesses are slow and it's iterated multiple times.
Garret Rieger 17b98563 2022-04-29T22:49:02 [subset] In cmap4 serialization reduce unnecessary calls into the iterator. Gives ~20% speedup for large subsets.
Garret Rieger 5e241094 2022-04-29T22:44:43 [subset] In unicodes cache cleanup if set insert fails.
Garret Rieger a424a92c 2022-04-29T22:14:03 [subset] s/void */intptr_t.
Garret Rieger aad67f56 2022-04-29T22:01:06 [subset] cache results of collect_unicodes.
Garret Rieger b4236b7d 2022-04-29T19:21:13 [subset] Optimize Cmap4 collect_unicodes. Use set add_range() instead of individual add() calls.
Behdad Esfahbod f41945e3 2022-03-21T18:24:30 [cmap] In collect_unicodes() of format 12/13, limit to max Unicode Fixes fuzzer timeout: https://oss-fuzz.com/testcase-detail/5062368881672192
Behdad Esfahbod ac1bb3e3 2022-01-20T11:47:17 [machinery] Move accelerators to constructor/destructor
Behdad Esfahbod e062376e 2022-01-19T17:09:34 [machinery] Make accelerator lazy-loader call Xinit/Xfini Instead of init/fini. To isolate those functions. To be turned into constructor/destructors, ideally one per commit (after some SFINAE foo.)
Behdad Esfahbod 8a69e006 2022-01-13T16:17:34 [meta] Use std::addressof() instead of hb_addressof()
Garret Rieger 1d9ef3a7 2021-12-01T10:30:27 [subset] Actually fix end_cp uninitialized warning.
Garret Rieger d8635dfe 2021-12-01T10:14:10 [subset] Fix warning about uninitialized use of end_cp.
Garret Rieger 95329081 2021-11-26T16:18:42 [subset] further optimize cmap4 packing.
Garret Rieger d9660fd5 2021-11-25T18:15:35 [subset] Make cmap4 packing more optimal. The current CMAP4 implementation uses whatever the current codepoint ranges are and then encodes them as individual glyph ids or as a delta if possible. However, it's often possible to save bytes by splitting up existing ranges and encoding parts of them using deltas where the cost of splitting the range is less than encoding each glyph individually.
Behdad Esfahbod c852b868 2021-09-19T16:30:12 Rename HBGlyphID to HBGlyphID16
Garret Rieger 2bd911b8 2021-08-26T14:32:17 [subset] handle cmap4 overflows. If a cmap4 subtable overflows during serialization drop it and the corresponding EncodingRecord. Don't drop the corresponding cmap12 table if it would have otherwise been removed.
Garret Rieger b9a176e2 2021-08-29T10:33:12 [subset] speedup cmap4 subsetting for large codepoint counts. (#3178) glyphIdArray generation implementation was O(n^2). Refactored to use a hashmap to reduce complexity. After the change subset time for a 22k codepoint subset went from 7s to 0.7s.
Garret Rieger 2c024dc3 2021-08-04T11:38:38 [subset] prune redundant cmap12 subtables. If the post subset cmap12 table is equivalent to another cmap subtable don't include the 12 table in the final subset. Matches change https://github.com/fonttools/fonttools/pull/2146 from fontTools.
Behdad Esfahbod f0a1892f 2021-07-28T17:36:22 [serialize] Remove unnecessary pointer indirection
Garret Rieger 9aa0ecef 2021-07-14T17:27:14 [subset] de-duplicate the logic that finds unicodes corresponding to requested glyphs. Move the logic into subset planning and then re-use the results in cmap and OS2 subsetting. Removes dependency on cmap from os2.
Behdad Esfahbod 092094f7 2021-04-01T15:47:21 Use as_array() and range loops in a few places
Behdad Esfahbod 4dba749d 2021-03-31T16:09:39 Add SortedArray{16,32}Of<>
Behdad Esfahbod ad28f973 2021-03-31T12:49:14 Rename offset types to be explicit about their size Add Offset16To<>, Offset24To<>, and Offset32To<> for most use-cases.
Garret Rieger b14475d2 2021-03-18T10:51:26 [subset] further changes to serializer error handling. - Rename enum type and enum members. - in_errors() now returns true for any error having been set. hb-subset now looks for offset overflow only errors to divert to repacker. - Added INT_OVERFLOW and ARRAY_OVERFLOW enum values.
Garret Rieger 73ed59f7 2021-03-17T15:53:10 [subset] store errors in the serializer as a flag set. Make check_assign/check_equal specify the type of error to set.
Behdad Esfahbod 6d941944 2021-02-19T17:08:10 Use auto in range-for-loop more
Garret Rieger 18ab8029 2020-07-31T14:40:49 [ENOMEM] check vector status in cmap subsetting.
Ebrahim Byagowi 5a7cc7fd 2020-07-29T08:33:32 minor spacing tweak
Ebrahim Byagowi d0e2addd 2020-07-18T22:14:52 minor
Qunxin Liu 8e5bc535 2020-07-15T18:54:52 [subset] call collect_mapping only when --gids option is used. collect_mapping is time consuming as it iterates all codepoints in all cmap subtables, only trigger it when necessary
Qunxin Liu 10d6605b 2020-05-15T10:52:49 [subset] don't use << operator in collect_mapping
Qunxin Liu b2a965df 2020-04-22T15:58:41 [subset] Add support for "--gids" option cmap subsetting now retains entries associated with any glyph ids explicitly requested
Qunxin Liu e53c44e3 2020-04-24T14:06:13 [subset] temporarily revert previous cmap commit Required in https://github.com/harfbuzz/harfbuzz/issues/2356
Ebrahim Byagowi 08428a15 2020-04-24T23:45:17 minor, spacing
Ebrahim Byagowi 2dda6dd7 2020-04-20T14:12:45 minor, tweak spacing turn 8 spaces to tab, add space before Null/Crap
Ebrahim Byagowi a224f417 2020-03-13T08:33:34 Turn more of simple dagger chains to foreach Less noise, as was agreed before and applied 385741d also
Ebrahim Byagowi 07acd1a0 2020-03-08T23:39:24 [subset] Rename src_base args to base to match sanitize methods So it will become easier to follow that serialize methods signatures should match with their sanitize methods counterparts.
ariza 188a0a47 2020-03-07T11:02:36 removed default base; replaced w/ bias if required
Michiharu Ariza 5ab50eeb 2020-02-29T01:32:29 collect_unicodes() with clamp, calling add_range() Use add_range() instead of an inner loop, and clamp its input by the number of glyphs the face has. Although cmap formats 12 and 13 use 32-bit hb_codepoint_t values, which is what triggers the timeout here, the face's maxp has a 16-bit gid limitation, at least for now. Clamping to that limit fixes the timeout without requiring many changes here if 32-bit gids are supported someday. Fixes #2204
Ebrahim Byagowi e9021386 2020-02-28T21:24:27 Revert "collect_unicodes() to check gid < num_glyphs with cmap 12" It didn't actually fix the case and caused the bots to fail. This reverts commit 15b43a410400c74a32d40f4b89dbea02fa7cd6e1.
Michiharu Ariza 15b43a41 2020-02-28T08:45:39 collect_unicodes() to check gid < num_glyphs with cmap 12 fixes #2204
Garret Rieger 50129b03 2020-02-25T17:39:59 Add a reverse () call to hb_array_t.
Garret Rieger 38c6598c 2020-02-25T17:20:05 Switch to C style comments.
Garret Rieger 52b6e0ba 2020-02-10T12:26:40 When serializing cmap14 order the offsets from smallest to largest. Current versions of OTS fail fonts with cmap 14 subtables whose last offset does not point to a block at the end of the table.
ckitagawa 03f778cf 2020-02-05T09:26:45 [cmap] remove dead code
Ebrahim Byagowi a7f694d4 2020-02-05T16:31:21 Merge branch 'subset_cblc' into master
ckitagawa-work 774725b4 2020-02-05T07:43:10 [subset] Avoid incorrectly dropping cmap for NotoColorEmoji.ttf NotoColorEmoji.ttf uses two cmap subtables: (Format 14 | Platform ID 0 | Platform Encoding ID 5) and (Format 12 | Platform ID 3 | Platform Encoding ID 10). This combination results in the cmap table being dropped during subsetting despite being valid/required.
ckitagawa e128f802 2020-01-21T13:35:43 [subset] Add CBLC support
Qunxin Liu b6a8f5e6 2020-01-28T09:30:51 [subset] CMAP table subsetting fix Not all codepoints smaller than 0xFFFF go to cmap4 table. Only subset codepoints existing in each table. This will also make harfbuzz consistent with fontTools' behavior
Qunxin Liu c370da45 2020-01-22T11:36:15 [subset] Cmap table: remove encodingRecord entry for empty cmap4 subtable
Qunxin Liu 1db2c1d0 2020-01-07T11:10:40 fix for cmap4 and OS_2 subsetting: maximum character code allowed is 0xFFFF
Behdad Esfahbod 6a60ca11 2019-12-10T12:32:37 [algs] Fold last other bsearch() in Now truly have only one bsearch implementation.
Ebrahim Byagowi 486754a8 2019-09-23T23:48:08 [serialize] Extract iterable copy, copy_all
Khaled Hosny dd288840 2019-10-29T01:45:49 [cmap] Check GID before adding ranges in format 4 & 12 Fixes https://github.com/harfbuzz/harfbuzz/issues/2031
Behdad Esfahbod 03028a5f 2019-10-28T13:46:56 Revert "Don't include codepoint 0 in the results of collect_unicodes." This reverts commit 14ad96ffbf77c33d8d33d2686d17c2375381989e. This was wrong. My bad! https://github.com/harfbuzz/harfbuzz/issues/2031
Garret Rieger 14ad96ff 2019-10-28T12:56:04 Don't include codepoint 0 in the results of collect_unicodes. It is always assumed to be the notdef glyph.
Ebrahim Byagowi 0558413f 2019-10-01T13:49:55 Minor, tweak spaces
Ebrahim Byagowi 035ec3d1 2019-09-23T20:51:43 [cmap] remove has_format14, minor format fixes #1986