src/hb-ot-shape-normalize.cc

Log

Author Commit Date Message
Behdad Esfahbod 5f46ff80 2025-08-07T22:04:20 [shape] Rename a buffer var For clarity.
Behdad Esfahbod c7ef6a2e 2024-09-25T19:42:40 Remove the hack re variation-selectors Instead of abusing an unused Gen_Cat value, use existing facilities to remember variation selectors. Addresses https://github.com/RazrFalcon/rustybuzz/pull/134#issuecomment-2374894164
Behdad Esfahbod 5e5cd10e 2024-09-22T12:36:51 Don't make variation-selectors default-ignorable if not-found set https://github.com/harfbuzz/harfbuzz/pull/4863#issuecomment-2366908261
Behdad Esfahbod b94a39d7 2024-09-22T08:23:34 Follow up to variation-selector-not-found glyph Addresses https://github.com/harfbuzz/harfbuzz/pull/4529#discussion_r1769638033 I'm not sure if this is an improvement. By leaving the var-selector as default-ignorable, ligatures can form around it, and the resulting cluster won't make it clear *which* base+var-selector could not be resolved... That doesn't quite help font fallback the way we want. Putting up for review.
Behdad Esfahbod 287046f7 2023-12-15T10:24:03 [buffer] Hook up not-found-variation-selector-glyph Fixes https://github.com/harfbuzz/harfbuzz/issues/4398
Behdad Esfahbod a003890e 2024-09-21T11:30:56 [buffer] Add hb_buffer_[sg]et_not_found_variation_selector_glyph() Unused.
Behdad Esfahbod 0c2f5ecd 2024-06-06T14:05:56 [normalizer] Add c.override_decompose_and_compose
Behdad Esfahbod 8a9bc523 2024-06-06T13:56:33 [normalizer] Move a couple functions around
Behdad Esfahbod 267ecd20 2023-05-01T14:05:17 [normalize] Micro-optimize
Behdad Esfahbod d21bfb08 2022-11-24T13:14:05 [normalize] Remove an unlikely Keep unlikely for truly unlikely scenarios.
Behdad Esfahbod 7ec4a556 2022-06-19T11:01:45 [normalize] Cosmetic I didn't know this syntax is allowed in old C++.
Behdad Esfahbod 10a8cc28 2022-06-10T07:31:06 [normalizer] Remove a TODO that's not going to happen
Behdad Esfahbod cc7ebb0f 2022-06-04T05:42:58 Remove remaining mentions of complex shapers in the code https://github.com/harfbuzz/harfbuzz/pull/3628#issuecomment-1146248037
Behdad Esfahbod 5bfb0b72 2022-06-03T02:56:41 Rename s/shape-complex/shaper/g
Behdad Esfahbod 13fbed29 2022-06-03T02:45:04 s/HB_OT_SHAPE_COMPLEX_MAX_COMBINING_MARKS/HB_OT_SHAPE_MAX_COMBINING_MARKS/g
Behdad Esfahbod bea5369c 2022-01-04T10:52:05 [buffer] Rename swap_buffers() to sync()
Behdad Esfahbod 06ee4021 2021-12-21T14:14:09 Use invisible-glyph for spaces if font has no ASCII space Fixes https://github.com/harfbuzz/harfbuzz/issues/3340 Should add tests ideally.
Behdad Esfahbod da500568 2021-10-26T08:02:29 [API] Add hb_buffer_[sg]et_not_found_glyph() and --not-found-glyph Instead of using gid=0 when a character is not found in the font, client can now set a custom value. This is useful for shaper-driven font fallback and to differentiate that from .notdef glyph. Fixes https://github.com/harfbuzz/harfbuzz/issues/1360
Khaled Hosny 195c05df 2021-09-04T03:41:19 Revert "[ot-shape-normalize] Move buffer out of hb_ot_shape_normalize_context_t" This reverts commit 8cdbea5580731c2bf66e56bf619c1fbb2978692e. For some reason this is causing several tests to crash locally for me (on macOS), see: https://github.com/harfbuzz/harfbuzz/commit/8cdbea5580731c2bf66e56bf619c1fbb2978692e#commitcomment-55898088
Behdad Esfahbod 8cdbea55 2021-08-23T23:44:55 [ot-shape-normalize] Move buffer out of hb_ot_shape_normalize_context_t
Behdad Esfahbod 8450f43a 2021-03-15T15:18:06 [buffer] HB_NODISCARD next_glyph()
Behdad Esfahbod 34a1204f 2021-03-15T14:39:06 [buffer] HB_NODISCARD output_glyph() Also, generalize and use replace_glyphs() in morx where output_glyph() was used in a loop.
Behdad Esfahbod b05e5d9a 2021-03-15T14:08:08 [buffer] HB_NODISCARD next_glyphs()
Behdad Esfahbod 607979d1 2021-03-15T13:23:48 [buffer] HB_NODISCARD replace_glyphs()
Behdad Esfahbod bcd10bf2 2021-02-17T13:58:56 [normalize] Add buffer success check before ->next_glyph() Speculative fix for: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=27843
Simon Cozens fd8f4ffc 2020-09-18T16:37:22 Trace reorder, not normalize
David Corbett 645f4e7c 2019-05-03T18:28:18 Unhide CGJ before ccc=0 characters If a CGJ precedes a starter, then it cannot have blocked any reordering, so it can safely be skipped.
Evgeniy Reizner 34ed8e72 2019-12-13T07:25:34 Prefer _hb_glyph_info_is_unicode_mark where possible.
Ebrahim Byagowi a0b4ac4d 2019-08-24T17:57:14 Turn 8 spaces to tab across the project According to the current code style of the project
Behdad Esfahbod 7aad5365 2019-06-26T13:21:03 [config] Add HB_NO_OT_SHAPE / HB_NO_OT Part of https://github.com/harfbuzz/harfbuzz/issues/1652
Behdad Esfahbod 7f5941e1 2018-10-27T00:06:48 Remove stale comment Ugliness was fixed in 30eab97a0072fbc22d353082249e0e6e546cd86b But yeah, my smell detector was working. Ugliness was buggy.
Behdad Esfahbod 30eab97a 2018-10-26T21:54:07 Fix invalid memory read Buffer might be relocated inside replace_glyphs(). Fixes https://bugs.chromium.org/p/chromium/issues/detail?id=895117
Behdad Esfahbod 39bd07ae 2018-10-26T21:01:11 Fix bunch of unused parameter warnings Show up with gcc -O0. There's a few more but those are functions that need to be filled in. Maybe this is a lost battle...
Behdad Esfahbod e883f527 2018-10-09T14:50:57 Call get_nominal_glyphs() for runs of simple clusters at a time Even without FT or OT font funcs implementing get_nominal_glyphs(), there's measurable speedup.
Behdad Esfahbod 8008bca8 2018-10-09T14:38:23 Whitespace
Behdad Esfahbod 30c114ff 2018-10-09T14:37:08 Avoid sort and recompose stages if all clusters simple Even has measurable speedup...
Behdad Esfahbod 24382deb 2018-10-09T14:33:24 Rewrite main normalizer loop to isolate runs of simple clusters
Behdad Esfahbod b5371f18 2018-10-09T14:12:59 Inline decompose_cluster Towards separating the common case into its own loop.
Behdad Esfahbod 1d1734e9 2018-10-02T13:04:05 Shuffle code around
Behdad Esfahbod 7f335390 2018-09-24T09:56:18 Revert change that would decompose text if GPOS mark feature is available https://github.com/harfbuzz/harfbuzz/issues/653#issuecomment-423905920
Behdad Esfahbod a6f4b2f7 2018-09-24T09:54:37 Fix normalization https://github.com/harfbuzz/harfbuzz/commit/62d1e0852a5549a1b510ad46a4b89f12730bb708#commitcomment-30613091
Behdad Esfahbod 62d1e085 2018-09-23T21:32:18 Prefer decomposed form if font has GPOS mark feature Fixes https://github.com/harfbuzz/harfbuzz/issues/653
Behdad Esfahbod c77ae408 2018-08-25T22:36:36 Rename hb-*private.hh to hb-*.hh Sorry for the noise, downstream custom builders. Please adjust.
Behdad Esfahbod 7185b273 2018-05-31T20:03:00 Rename in_error to !successful Towards possibly using Null pool for some nil objects.
Ebrahim Byagowi eaf64945 2018-04-23T18:39:40 Resolve clang's conditional-uninitialize warnings
Behdad Esfahbod 8c0d1916 2018-01-05T12:46:12 Improve CGJ skipping logic Previously we made CGJ unskippable. Now, if CGJ did NOT prevent any reordering, allow skipping over it. To make this work we had to make changes to the Arabic mark reordering algorithm implementation to renumber moved MCM marks. See comments. Fixes https://github.com/harfbuzz/harfbuzz/issues/554
Behdad Esfahbod ab8d70ec 2017-10-04T14:47:10 [arabic] Implement Unicode Arabic Mark Ordering Algorithm UTR#53 Fixes https://github.com/behdad/harfbuzz/issues/509
Behdad Esfahbod b6fe0ab6 2017-10-04T13:37:08 Add info_cc() convenience macro
Behdad Esfahbod 7f9e7f86 2017-10-04T13:20:33 Adjust normalizer for out-of-order marks We are going to implement Unicode Arabic Mark Ordering Algorithm: http://www.unicode.org/reports/tr53/tr53-1.pdf which will reorder marks out of their sorted ccc order. Adjust normalizer to stop combining as soon as dangerous ordering is detected.
Behdad Esfahbod 1dd630a7 2017-02-01T11:57:21 Minor Fixes https://github.com/behdad/harfbuzz/issues/411
Behdad Esfahbod 8b5bc141 2016-02-24T19:05:23 Add get_nominal_glyph() and get_variation_glyph() instead of get_glyph() New API: - hb_font_get_nominal_glyph_func_t - hb_font_get_variation_glyph_func_t - hb_font_funcs_set_nominal_glyph_func() - hb_font_funcs_set_variation_glyph_func() - hb_font_get_nominal_glyph() - hb_font_get_variation_glyph() Deprecated API: - hb_font_get_glyph_func_t - hb_font_funcs_set_glyph_func() Clients that implement their own font-funcs are encouraged to replace their get_glyph() implementation with a get_nominal_glyph() and get_variation_glyph() pair. The variation version can assume that variation_selector argument is not zero.
Behdad Esfahbod ea512f71 2015-11-26T19:22:22 Use C-style casts instead of compare to 0, to convert hb_bool_t to bool
Behdad Esfahbod 766963ad 2015-11-24T15:38:43 Merge pull request #114 from ThePhD/vc++-fixes Fix all VC++ warnings and errors
Behdad Esfahbod abadc171 2015-11-18T17:52:08 Try to better handle OOM situations Fixes assert fail in https://github.com/behdad/harfbuzz/issues/161 with libharfbuzz-fuzzing.
Behdad Esfahbod 6986208b 2015-11-04T18:46:22 Optimize runs without Default_Ignorable's Now that we have a buffer-wide scratch flags facility, use it to optimize away a few passes.
Behdad Esfahbod 9cbc39ae 2015-11-04T18:00:53 Minor
Behdad Esfahbod 52e6c4e1 2015-11-04T17:45:06 If font doesn't support U+2011, fall back to U+2010 Test passes now.
Behdad Esfahbod 75483aaf 2015-11-04T17:43:36 Untangle if/else waterfall
Behdad Esfahbod 49ef6309 2015-11-04T17:27:07 Adjust the width of various spaces if font does not cover them See discussion here: https://github.com/behdad/harfbuzz/commit/81ef4f407d9c7bd98cf62cef951dc538b13442eb There's no way to disable this fallback, but I don't think it would be needed. Let's hope for the best! Fixes https://github.com/behdad/harfbuzz/issues/153
Behdad Esfahbod 7793aad9 2015-11-04T14:48:46 Normalize various spaces to space if font doesn't support This resurrects the space fallback feature, after I disabled the compatibility decomposition. Now I can release HarfBuzz again without breaking Pango! It also remembers which space character it was, such that later on we can approximate the width of this particular space character. That part is not implemented yet. We normalize all GC=Zs chars except for U+1680 OGHAM SPACE MARK, which is better left alone.
Behdad Esfahbod 5c8174ed 2015-10-21T18:51:40 Update comments for removal of compat decompositions
Behdad Esfahbod f6799700 2015-10-21T17:20:55 Disable compatibility decomposition usage during normalization Fixes https://github.com/behdad/harfbuzz/issues/152
Behdad Esfahbod 980e25ca 2015-10-02T08:21:12 Fix hb-ot-shape-normalize with empty buffer Part of https://github.com/behdad/harfbuzz/issues/136
Behdad Esfahbod e995d33c 2015-09-01T16:13:32 [OT] Merge clusters when reordering marks for normalization Fixes https://bugzilla.gnome.org/show_bug.cgi?id=541608 and cluster test.
Behdad Esfahbod 85846b3d 2015-09-01T15:07:52 Use insertion-sort instead of bubble-sort Needed for upcoming merge-clusters fix.
ThePhD 5c99cf93 2015-08-14T01:02:00 Merge branch 'master' into vc++-fixes
jfkthame c7dfe316 2015-08-07T17:55:03 Don't rely on .cluster in _hb_ot_shape_normalize() Fixes https://github.com/behdad/harfbuzz/pull/124
ThePhD 8e545d59 2015-06-22T22:29:04 Fix all VC++ warnings and errors in the current commit's builds.
Behdad Esfahbod 1eff4350 2015-01-27T12:26:04 Minor optimization
Behdad Esfahbod 8f3eebf7 2014-08-02T17:18:46 Make sure gsubgpos buffer vars are available during fallback_position Add buffer var allocation asserts to a few key places.
Behdad Esfahbod 5209c505 2014-07-17T12:23:44 Revert "Show U+FFFD REPLACEMENT CHARACTER for invalid Unicode codepoints" We now handle U+FFFD replacement in hb_buffer_add_utf*(). Any other manipulation can happen in user callbacks. No need for this. https://github.com/behdad/harfbuzz/commit/efe74214bbb68eaa3d7621e73869b5d58210107e#commitcomment-7039404 This reverts commit efe74214bbb68eaa3d7621e73869b5d58210107e. Conflicts: src/hb-ot-shape-normalize.cc
Behdad Esfahbod 7627100f 2014-07-11T14:54:42 Mark unsigned integer literals with the u suffix Simplifies hb_in_range() calls as the type can be inferred. The rest is obsessiveness, I admit.
Behdad Esfahbod efe74214 2014-07-11T11:59:48 Show U+FFFD REPLACEMENT CHARACTER for invalid Unicode codepoints Only if the font doesn't support it. I.e., this lets the user use non-Unicode codepoints as private values and return a meaningful glyph for them. But if it's invalid and font callback doesn't like it, and if font has U+FFFD, show that instead. Font functions that do not want this automatic replacement to happen should return true from get_glyph() if unicode > 0x10FFFF. Replaces https://github.com/behdad/harfbuzz/pull/27
Behdad Esfahbod 08cf5d75 2014-01-22T07:53:55 [ot] Don't try to compose if normalization is off
Behdad Esfahbod 8fc1f7fe 2014-01-02T17:04:04 [ot/hangul] Don't decompose Hangul even when combining marks present As discussed on https://github.com/behdad/harfbuzz/pull/10#issuecomment-31442030
Behdad Esfahbod 64426ec7 2014-01-02T14:33:10 [ot] Simplify composing Not tested. Ouch.
Behdad Esfahbod 3d6ca0d3 2013-12-31T16:04:35 [ot] Simplify normalization_preference again No shaper has more than one behavior re this, so no need for a callback.
Behdad Esfahbod ac8cd511 2013-10-18T19:33:09 Refactor
Behdad Esfahbod 79d1007a 2013-06-13T19:01:07 If variation selector is not consumed by cmap, pass it on to GSUB This changes the semantics of the get_glyph() callback and expects that callbacks return false if the requested variant is not available, and then we will call them back with variation_selector=0 and will retain the glyph for the selector in the glyph stream. Apparently most Mongolian fonts implement the Mongolian Variation Selectors using GSUB, not cmap. https://bugs.freedesktop.org/show_bug.cgi?id=65258 Note that this doesn't fix the Mongolian shaping yet, because the way that's implemented is that the, say, 'init' feature ligates the letter and the variation-selector. However, since currently the variation selector doesn't have the 'init' mask on, it will not be matched...
Behdad Esfahbod c7a84917 2013-06-06T20:17:32 Skip over multiple variation selectors in a row
Behdad Esfahbod 269de14d 2013-04-04T23:06:54 Don't compose Hangul jamo See thread "an issue regarding discrepancy between Korean and Unicode standards" on the mailing list for the rationale. In short: Uniscribe doesn't, so fonts are designed to work without it.
Behdad Esfahbod a88a62f7 2013-03-21T21:02:16 Minor
Behdad Esfahbod 6e74c642 2013-02-11T06:50:17 Improve normalization heuristic Before, for most scripts, we were not trying to recompose two characters if the second one had ccc=0. That fails for Myanmar where U+1026 decomposes to U+1025,U+102E, both of which have ccc=0. However, we do want to try to recompose those. We now check whether the second is a mark, using general category instead. At the same time, remove optimization that was conflicting with this. [Let the Ngapi hackfest begin!]
Behdad Esfahbod eba312c8 2012-11-16T12:39:23 Plumbing to get shape plan and font into complex decompose function So we can handle Sinhala split matras smartly... Coming soon.
Behdad Esfahbod 0736915b 2012-11-13T12:35:35 [Indic] Decompose Sinhala split matras the way old HarfBuzz / Pango did Had to do some refactoring to make this happen... Under Uniscribe bug compatibility mode, we still split them Uniscribe-style, but Jonathan and I convinced ourselves that there is no harm doing this the Unicode way. This change makes that happen, and unbreaks free Sinhala fonts.
Behdad Esfahbod 028a1706 2012-09-06T14:25:48 Refactor common macro
Behdad Esfahbod b85800f9 2012-08-31T18:12:01 [Indic] Implement dotted-circle insertion for broken clusters No panic, we really insert dotted circle when it's absolutely broken. Fixes most of the dotted-circle cases against Uniscribe. (for Devanagari fixes 80% of them, for Khmer 70%; the rest look like Uniscribe being really bogus...) I had to make a decision. Apparently Uniscribe adds one dotted circle to each broken character. I tried that, but that goes wrong easily with split matras. So I made it add only one dotted circle to an entire broken syllable tail. As in: "if there was a dotted circle here, this would have formed a correct cluster." That works better for split stuff, and I like it more.
Behdad Esfahbod f4cb4762 2012-08-10T03:51:44 [OT] Slightly adjust normalizer The change is very subtle. If we have a single-char cluster that decomposes to three or more characters, then try recomposition, in case the farther mark may compose with the base.
Behdad Esfahbod 07d68280 2012-08-10T03:28:50 Minor
Behdad Esfahbod b00321ea 2012-08-09T22:33:32 [OT] Avoid calling get_glyph() twice Essentially move the glyph mapping to normalization process. The effect on Devanagari is small (but observable). Should be more observable in simple text, like ASCII.
Behdad Esfahbod 8d1eef3f 2012-08-09T21:31:52 Minor
Behdad Esfahbod 0f8881d6 2012-08-07T16:57:02 More refactoring
Behdad Esfahbod 428dfcab 2012-08-07T16:51:48 Minor refactoring
Behdad Esfahbod 8fbfda92 2012-08-01T19:03:46 Inline font getters
Behdad Esfahbod 208f70f0 2012-08-01T17:13:10 Inline Unicode callbacks internally
Behdad Esfahbod 84186a64 2012-08-01T13:32:39 Add commentary on the compatibility decomposition in the normalizer
Behdad Esfahbod 378d279b 2012-07-31T21:36:16 Implement Unicode compatibility decompositions Based on patch from Philip Withnall. https://bugs.freedesktop.org/show_bug.cgi?id=41095
Behdad Esfahbod bc8357ea 2012-06-08T21:01:20 Merge clusters during normalization
Behdad Esfahbod 0594a244 2012-06-05T20:35:40 Cleanup TRUE/FALSE vs true/false
Behdad Esfahbod 9f377ed3 2012-05-13T16:13:44 Fix more unused-var warnings