src/hb-ot-cmap-table.hh


Log

Author Commit Date CI Message
Garret Rieger 8f9f0c49 2022-05-10T17:47:08 [subset] Enforce cmap12 group ordering constraints in collect_mapping. Fixes fuzzer issue: https://oss-fuzz.com/testcase-detail/6365271012540416
Behdad Esfahbod f10ddb8d 2022-05-05T11:21:24 [cmap] Use -1 as Unicode sentinel, not U+FFFF in Format12 serialize
Behdad Esfahbod 8a19968c 2022-05-05T11:17:23 [cmap] Use iterator bool operator
Behdad Esfahbod 052812b6 2022-05-04T15:38:30 Merge pull request #3561 from googlefonts/cmap_opt [subset] Further cmap subsetting speed optimizations
Garret Rieger f0c04114 2022-05-03T22:02:59 [subset] Embed unicode to gid list vector in subset plan.
Behdad Esfahbod 3fff2e91 2022-05-02T16:31:59 [perf/benchmark-font] Cosmetic
Behdad Esfahbod 307d2d8b 2022-05-02T16:30:22 [cmap] Sprinkle some 'unlikely's
Garret Rieger 088133d9 2022-05-02T21:29:16 [subset] cache cp to new gid list in subset plan. This avoids having to recompute the ordered list multiple times during cmap generation.
Garret Rieger 6922a256 2022-04-29T23:30:32 [subset] Change serialize_rangeoffset_glyid back to using iterator.
Garret Rieger c66fd50c 2022-04-29T23:18:53 [subset] in cmap4 serialization save cp to gid iter to memory. Iterator accesses are slow and it's iterated multiple times.
Garret Rieger 17b98563 2022-04-29T22:49:02 [subset] In cmap4 serialization reduce unnecessary calls into the iterator. Gives ~20% speedup for large subsets.
Garret Rieger 5e241094 2022-04-29T22:44:43 [subset] In unicodes cache cleanup if set insert fails.
Garret Rieger a424a92c 2022-04-29T22:14:03 [subset] s/void */intptr_t.
Garret Rieger aad67f56 2022-04-29T22:01:06 [subset] cache results of collect_unicodes.
Garret Rieger b4236b7d 2022-04-29T19:21:13 [subset] Optimize Cmap4 collect_unicodes. Use set add_range() instead of individual add() calls.
Behdad Esfahbod f41945e3 2022-03-21T18:24:30 [cmap] In collect_unicodes() of format 12/13, limit to max Unicode Fixes fuzzer timeout: https://oss-fuzz.com/testcase-detail/5062368881672192
Behdad Esfahbod ac1bb3e3 2022-01-20T11:47:17 [machinery] Move accelerators to constructor/destructor
Behdad Esfahbod e062376e 2022-01-19T17:09:34 [machinery] Make accelerator lazy-loader call Xinit/Xfini Instead of init/fini. To isolate those functions. To be turned into constructor/destructors, ideally one per commit (after some SFINAE foo.)
Behdad Esfahbod 8a69e006 2022-01-13T16:17:34 [meta] Use std::addressof() instead of hb_addressof()
Garret Rieger 1d9ef3a7 2021-12-01T10:30:27 [subset] Actually fix end_cp uninitialized warning.
Garret Rieger d8635dfe 2021-12-01T10:14:10 [subset] Fix warning about uninitialized use of end_cp.
Garret Rieger 95329081 2021-11-26T16:18:42 [subset] further optimize cmap4 packing.
Garret Rieger d9660fd5 2021-11-25T18:15:35 [subset] Make cmap4 packing more optimal. The current CMAP4 implementation uses whatever the current codepoint ranges are and then encodes them as individual glyph ids or as a delta if possible. However, it's often possible to save bytes by splitting up existing ranges and encoding parts of them using deltas where the cost of splitting the range is less than encoding each glyph individually.
Behdad Esfahbod c852b868 2021-09-19T16:30:12 Rename HBGlyphID to HBGlyphID16
Garret Rieger 2bd911b8 2021-08-26T14:32:17 [subset] handle cmap4 overflows. If a cmap4 subtable overflows during serialization drop it and the corresponding EncodingRecord. Don't drop the corresponding cmap12 table if it would have otherwise been removed.
Garret Rieger b9a176e2 2021-08-29T10:33:12 [subset] speedup cmap4 subsetting for large codepoint counts. (#3178) glyphIdArray generation implementation was O(n^2). Refactored to use a hashmap to reduce complexity. After the change subset time for a 22k codepoint subset went from 7s to 0.7s.
Garret Rieger 2c024dc3 2021-08-04T11:38:38 [subset] prune redundant cmap12 subtables. If the post subset cmap12 table is equivalent to another cmap subtable don't include the 12 table in the final subset. Matches change https://github.com/fonttools/fonttools/pull/2146 from fontTools.
Behdad Esfahbod f0a1892f 2021-07-28T17:36:22 [serialize] Remove unnecessary pointer indirection
Garret Rieger 9aa0ecef 2021-07-14T17:27:14 [subset] de-duplicate the logic that finds unicodes corresponding to requested glyphs. Move the logic into subset planning and then re-use the results in cmap and OS2 subsetting. Removes dependency on cmap from os2.
Behdad Esfahbod 092094f7 2021-04-01T15:47:21 Use as_array() and range loops in a few places
Behdad Esfahbod 4dba749d 2021-03-31T16:09:39 Add SortedArray{16,32}Of<>
Behdad Esfahbod ad28f973 2021-03-31T12:49:14 Rename offset types to be explicit about their size Add Offset16To<>, Offset24To<>, and Offset32To<> for most use-cases.
Garret Rieger b14475d2 2021-03-18T10:51:26 [subset] further changes to serializer error handling. - Rename enum type and enum members. - in_errors() now returns true for any error having been set. hb-subset now looks for offset overflow only errors to divert to repacker. - Added INT_OVERFLOW and ARRAY_OVERFLOW enum values.
Garret Rieger 73ed59f7 2021-03-17T15:53:10 [subset] store errors in the serializer as a flag set. Make check_assign/check_equal specify the type of error to set.
Behdad Esfahbod 6d941944 2021-02-19T17:08:10 Use auto in range-for-loop more
Garret Rieger 18ab8029 2020-07-31T14:40:49 [ENOMEM] check vector status in cmap subsetting.
Ebrahim Byagowi 5a7cc7fd 2020-07-29T08:33:32 minor spacing tweak
Ebrahim Byagowi d0e2addd 2020-07-18T22:14:52 minor
Qunxin Liu 8e5bc535 2020-07-15T18:54:52 [subset] call collect_mapping only when --gids option is used. collect_mapping is time consuming as it iterates all codepoints in all cmap subtables, only trigger it when necessary
Qunxin Liu 10d6605b 2020-05-15T10:52:49 [subset] don't use << operator in collect_mapping
Qunxin Liu b2a965df 2020-04-22T15:58:41 [subset] Add support for "--gids" option cmap subsetting now retains entries associated with any glyph ids explicitly requested
Qunxin Liu e53c44e3 2020-04-24T14:06:13 [subset] temporarily revert previous cmap commit Required in https://github.com/harfbuzz/harfbuzz/issues/2356
Ebrahim Byagowi 08428a15 2020-04-24T23:45:17 minor, spacing
Ebrahim Byagowi 2dda6dd7 2020-04-20T14:12:45 minor, tweak spacing turn 8 spaces to tab, add space before Null/Crap
Ebrahim Byagowi a224f417 2020-03-13T08:33:34 Turn more of simple dagger chains to foreach Less noise, as was agreed before and applied 385741d also
Ebrahim Byagowi 07acd1a0 2020-03-08T23:39:24 [subset] Rename src_base args to base to match sanitize methods So it will become easier to follow that serialize methods signatures should match with their sanitize methods counterparts.
ariza 188a0a47 2020-03-07T11:02:36 removed default base; replaced w/ bias if required
Michiharu Ariza 5ab50eeb 2020-02-29T01:32:29 collect_unicodes() with clamp, calling add_range() Use add_range() instead of an inner loop, and clamp its input by the number of glyphs the face has. Although cmap12 and cmap13 use 32-bit hb_codepoint_t values, which is what caused the timeout here, the face's maxp has a 16-bit gid limit, at least for now. Clamping to that limit fixes the timeout without requiring many changes here, and leaves room to support 32-bit gids someday. Fixes #2204
Ebrahim Byagowi e9021386 2020-02-28T21:24:27 Revert "collect_unicodes() to check gid < num_glyphs with cmap 12" Didn't actually fix the case, causing bots to fail. This reverts commit 15b43a410400c74a32d40f4b89dbea02fa7cd6e1.
Michiharu Ariza 15b43a41 2020-02-28T08:45:39 collect_unicodes() to check gid < num_glyphs with cmap 12 fixes #2204
Garret Rieger 50129b03 2020-02-25T17:39:59 Add a reverse () call to hb_array_t.
Garret Rieger 38c6598c 2020-02-25T17:20:05 Switch to C style comments.
Garret Rieger 52b6e0ba 2020-02-10T12:26:40 When serializing cmap14 order the offsets from smallest to largest. Current versions of OTS fail fonts with cmap14 subtables whose last offset does not point to a block at the end of the table.
ckitagawa 03f778cf 2020-02-05T09:26:45 [cmap] remove dead code
Ebrahim Byagowi a7f694d4 2020-02-05T16:31:21 Merge branch 'subset_cblc' into master
ckitagawa-work 774725b4 2020-02-05T07:43:10 [subset] Avoid incorrectly dropping cmap for NotoColorEmoji.ttf NotoColorEmoji.ttf uses two cmap subtables Format 14 | Platform ID 0 | Platform Encoding ID 5 Format 12 | Platform ID 3 | Platform Encoding ID 10 This combination results in the cmap table being dropped during subsetting despite being valid/required.
ckitagawa e128f802 2020-01-21T13:35:43 [subset] Add CBLC support
Qunxin Liu b6a8f5e6 2020-01-28T09:30:51 [subset] CMAP table subsetting fix Not all codepoints smaller than 0xFFFF go to cmap4 table. Only subset codepoints existing in each table. This will also make harfbuzz consistent with fontTools' behavior
Qunxin Liu c370da45 2020-01-22T11:36:15 [subset] Cmap table: remove encodingRecord entry for empty cmap4 subtable
Qunxin Liu 1db2c1d0 2020-01-07T11:10:40 fix for cmap4 and OS_2 subsetting: maximum character code allowed is 0xFFFF
Behdad Esfahbod 6a60ca11 2019-12-10T12:32:37 [algs] Fold last other bsearch() in Now truly have only one bsearch implementation.
Ebrahim Byagowi 486754a8 2019-09-23T23:48:08 [serialize] Extract iterable copy, copy_all
Khaled Hosny dd288840 2019-10-29T01:45:49 [cmap] Check GID before adding ranges in format 4 & 12 Fixes https://github.com/harfbuzz/harfbuzz/issues/2031
Behdad Esfahbod 03028a5f 2019-10-28T13:46:56 Revert "Don't include codepoint 0 in the results of collect_unicodes." This reverts commit 14ad96ffbf77c33d8d33d2686d17c2375381989e. This was wrong. My bad! https://github.com/harfbuzz/harfbuzz/issues/2031
Garret Rieger 14ad96ff 2019-10-28T12:56:04 Don't include codepoint 0 in the results of collect_unicodes. It is always assumed to be the notdef glyph.
Ebrahim Byagowi 0558413f 2019-10-01T13:49:55 Minor, tweak spaces
Ebrahim Byagowi 035ec3d1 2019-09-23T20:51:43 [cmap] remove has_format14, minor format fixes #1986
Ebrahim Byagowi 385741d5 2019-09-21T15:26:14 [cmap] Turn hb_apply into foreach where possible
Ebrahim Byagowi 1023c2cc 2019-09-21T14:33:43 [cmap] minor
Ebrahim Byagowi ead46eef 2019-09-21T14:25:11 minor, use internal API instead public hb_set_has
Ebrahim Byagowi d8af4e77 2019-09-21T14:19:14 [cmap] minor, turn 8 spaces to tab
Qunxin Liu 43156662 2019-08-29T11:17:20 [subset] updates according to review comments
Qunxin Liu 2583afa0 2019-08-16T13:54:24 [subset] subsetting cmap14
Qunxin Liu 078ddbd0 2019-08-07T13:17:26 [subset] glyph closure for CMAP14
Ebrahim Byagowi d512087e 2019-09-14T10:36:29 Rename GlyphID to HBGlyphID Avoid collision with macOS's ATSUnicodeTypes.h GlyphID
Ebrahim Byagowi a0b4ac4d 2019-08-24T17:57:14 Turn 8 spaces to tab across the project According to the current code style of the project
Behdad Esfahbod d304d60e 2019-08-21T12:30:22 [ot-font] Prefer symbol cmap subtable if found Fixes https://github.com/harfbuzz/harfbuzz/issues/1918 Hopefully doesn't break anyone...
Qunxin Liu 37572882 2019-06-25T13:17:30 [subset] cmap table to use _subset2 and new iterator frameworks
Ebrahim Byagowi 7a9d643c 2019-07-11T01:35:06 Fix uninitialized memory read in cmap subset (#1826)
Behdad Esfahbod 3caa32d7 2019-06-19T19:50:54 [config] Add HB_NO_CMAP_LEGACY_SUBTABLES Part of https://vimeo.com/331852453/06eec89c65
Michiharu Ariza 82d4bfb8 2019-06-14T10:49:42 enable cff subset tests add Unicode UCS-4 cmap fix Unicode bits in OS/2 add Unicode cmap sub-table in SourceHanSans-Regular_subset.otf regenerate cff subset test expected results
Qunxin Liu 993d81b9 2019-05-14T13:55:11 [subset] Add one ttf file with fvar/STAT tables to integration test Ignore gvar/MVAR/HVAR table add support for --nameIDs=* option
Garret Rieger a5fb44a8 2019-05-13T14:57:40 [subset] Fix shadowed 'groups' param in cmap.
Behdad Esfahbod 750d5af4 2019-05-08T12:01:55 Make compiler happy with -Og
Behdad Esfahbod 41248cce 2019-05-07T20:54:31 Remove MIN/MAX in favor of hb_min/hb_max
Behdad Esfahbod 699de689 2019-04-15T16:00:20 Delete default assignment operator Offset<>
Behdad Esfahbod 95df00ae 2019-04-12T17:50:03 Hide a few static methods Looks like static methods that do not get inlined end up exported. We have a lot more. Need to protect all at some point. Wish there was an easier way, like the visibility flag we pass that automatically hides all inline methods. Was exposed by check-symbols.sh when compiling on OS X 10.14 with: $ make CPPFLAGS=-Oz CXXFLAGS=-flto=thin LDFLAGS=-lc++
Behdad Esfahbod 64d0f089 2019-04-01T16:50:28 [cmap] Minor
Behdad Esfahbod b986c6a3 2019-03-29T20:17:46 [C++11] Remove IntType::set() in favor of operator=
Behdad Esfahbod 090fe56d 2019-01-25T15:34:03 Merge branch 'master' into iter
Behdad Esfahbod 447323b8 2019-01-22T12:45:40 Better fix for -Wcast-align errors
Behdad Esfahbod 8d05bf7d 2019-01-22T12:34:05 Fix cast-align error If compiler doesn't inline StructAtOffset, this was an error since we only disable cast-align at call-site. So, move the cast out. ../src/hb-machinery.hh: In instantiation of 'const Type& StructAtOffset(const void*, unsigned int) [with Type = unsigned int]': ../src/hb-font.cc:146:85: required from here ../src/hb-machinery.hh:63:12: error: cast from 'const char*' to 'const unsigned int*' increases required alignment of target type [-Werror=cast-align] { return * reinterpret_cast<const Type*> ((const char *) P + offset); } ../src/hb-machinery.hh: In instantiation of 'Type& StructAtOffset(void*, unsigned int) [with Type = unsigned int]': ../src/hb-font.cc:147:79: required from here ../src/hb-machinery.hh:66:12: error: cast from 'char*' to 'unsigned int*' increases required alignment of target type [-Werror=cast-align] { return * reinterpret_cast<Type*> ((char *) P + offset); }
Behdad Esfahbod ef006549 2019-01-22T12:08:57 Convert tag enum class consts to static constexpr Part of https://github.com/harfbuzz/harfbuzz/issues/1553
Behdad Esfahbod 8237809f 2019-01-07T22:00:45 [serialize] Make SortedArrayOf::serialize() take sorted-iterator
Behdad Esfahbod b900f780 2019-01-18T10:08:23 [pragma] More cast-align whitelist
Behdad Esfahbod 474a1205 2018-12-21T18:46:51 [array/vector] Rename len to length
Behdad Esfahbod f1e95e40 2018-12-18T16:49:08 [arrays] Remove hb_supplier_t<>
Behdad Esfahbod cf39c242 2018-12-17T22:36:23 [arrays] Rename Supplier to hb_supplier_t
Ebrahim Byagowi e4120085 2018-12-17T21:31:01 Remove redundant void from C++ sources (#1486)
Ebrahim Byagowi b2ebaa9a 2018-12-16T22:38:10 Remove redundant 'inline' from methods (#1483)