|
5c558877
|
2013-10-16T11:14:15
|
|
[indic] Allow up to two syllable modifiers
Bug 70509 - Candrabindu+Visarga doesn't work in Devanagari
https://bugs.freedesktop.org/show_bug.cgi?id=70509
We categorize both bindus and visarga as syllable-modifiers.
OT spec doesn't actually say what characters go in the syllable
modifier category, and allows one. We just allow up to two now.
Test case: U+0930,U+0941,U+0901,U+0903
Uniscribe currently doesn't support that and produces a
dotted circle.
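
For illustration only: the change amounts to widening the syllable-modifier slot from "at most one" to "at most two". A rough sketch, assuming a toy one-letter category encoding and a heavily simplified syllable pattern (neither is taken from the actual Indic machine):

```python
import re

# Toy category table covering only the characters in the test case.
# The letters are made up for this sketch: C = consonant, M = matra,
# S = syllable modifier (bindus and visarga).
CATEGORY = {
    0x0930: "C",  # DEVANAGARI LETTER RA
    0x0941: "M",  # DEVANAGARI VOWEL SIGN U
    0x0901: "S",  # DEVANAGARI SIGN CANDRABINDU
    0x0903: "S",  # DEVANAGARI SIGN VISARGA
}

# Simplified syllable shapes: consonant, optional matras, then the
# syllable-modifier slot with the old and the new repetition limits.
ONE_MODIFIER = re.compile(r"CM*S?")
TWO_MODIFIERS = re.compile(r"CM*S{0,2}")


def parse_test_case(text):
    """Turn 'U+0930,U+0941' into a list of code points."""
    return [int(item[2:], 16) for item in text.split(",")]


def is_single_syllable(codepoints, pattern):
    """True if the whole sequence matches the pattern as one syllable."""
    cats = "".join(CATEGORY[cp] for cp in codepoints)
    return pattern.fullmatch(cats) is not None


seq = parse_test_case("U+0930,U+0941,U+0901,U+0903")
print(is_single_syllable(seq, ONE_MODIFIER))   # False
print(is_single_syllable(seq, TWO_MODIFIERS))  # True
```

With the one-modifier pattern the trailing Visarga is left over, which is the situation that would end in a dotted circle.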
|
|
65a929b1
|
2013-10-15T18:08:05
|
|
[indic] If Malayalam dot-reph formed a ligature, don't move it
Rachana-0.6 implements dot-reph by ligation, so we shouldn't move it.
Uniscribe doesn't either. Test case:
U+0D4E,U+0D1A,U+0D4D,U+0D1A,U+0D4D
|
|
c46f4069
|
2013-10-15T16:24:21
|
|
[tests] Remove Myanmar micro-font and test
|
|
30145272
|
2013-10-15T13:47:27
|
|
[indic] Don't apply presentation features across syllables
More like Uniscribe... We still allow user-defined features to
work across syllables, but not pres,blws,abvs,psts,etc.
This "regressed" Sinhala numbers by 11. These are cases where
there's Consonant followed by Ra,Halant,ZWJ at the end of text.
The Ra,Halant,ZWJ ends up forming reph, which is wrong...
But before we were also ligating that reph with the previous
consonant. That's even more wrong. That's also what Uniscribe
does.
Current numbers:
BENGALI: 353732 out of 354188 tests passed. 456 failed (0.128745%)
DEVANAGARI: 707307 out of 707394 tests passed. 87 failed (0.0122987%)
GUJARATI: 366349 out of 366457 tests passed. 108 failed (0.0294714%)
GURMUKHI: 60732 out of 60747 tests passed. 15 failed (0.0246926%)
KANNADA: 951030 out of 951913 tests passed. 883 failed (0.0927606%)
KHMER: 299070 out of 299124 tests passed. 54 failed (0.0180527%)
MALAYALAM: 1048140 out of 1048334 tests passed. 194 failed (0.0185056%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271655 out of 271847 tests passed. 192 failed (0.070628%)
TAMIL: 1091753 out of 1091754 tests passed. 1 failed (9.15957e-05%)
TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%)
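
For reference, the percentage on each line is failed / total * 100. A small sketch that rebuilds a report line; the `%g`-style formatting is a guess based on the output above, and the function names are made up here:

```python
def failure_percent(total, failed):
    """Share of failed tests, as a percentage of the total."""
    return 100.0 * failed / total


def summary_line(script, total, failed):
    """Rebuild one report line; the '%g'-style formatting is a guess."""
    return "%s: %d out of %d tests passed. %d failed (%g%%)" % (
        script, total - failed, total, failed, failure_percent(total, failed))


print(summary_line("BENGALI", 354188, 456))
print(summary_line("TAMIL", 1091754, 1))
```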
|
|
3c7b3641
|
2013-10-15T11:21:01
|
|
[indic] Handle Avagraha
It can come either at the end(ish!) of the syllable, or independently.
When independent, it accepts a few bits and pieces.
|
|
2c85a3df
|
2013-10-14T19:41:52
|
|
Fix issue with automake
|
|
841e20d0
|
2013-10-14T18:47:51
|
|
Add test suite for shaping results
The new test suite runs tests included under
hb/test/shaping/tests/*.tests, which themselves reference
font files stored by sha1sum under hb/test/shaping/fonts/sha1sum.
The fonts are produced using a subsetter to only include glyphs
needed to run the test.
Four initial tests are added for (Chain)Context matching,
of which three currently fail.
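
A minimal sketch of the content-addressed font naming described above, assuming the file name is the hex SHA-1 of the font bytes plus a `.ttf` suffix (the suffix and the helper name are assumptions, not taken from the test runner):

```python
import hashlib
import posixpath

# Directory named in the commit message; the ".ttf" suffix is an assumption.
FONT_DIR = "hb/test/shaping/fonts/sha1sum"


def font_path_for(font_bytes, suffix=".ttf"):
    """Return the content-addressed path a font blob would be stored under."""
    digest = hashlib.sha1(font_bytes).hexdigest()
    return posixpath.join(FONT_DIR, digest + suffix)


print(font_path_for(b""))
```

Naming by hash means a test keeps pointing at exactly the subsetted font it was recorded with, and identical subsets are stored once.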
|
|
e2dab692
|
2013-10-14T16:44:44
|
|
Minor
|
|
20cbc1f8
|
2013-09-06T15:29:22
|
|
Annotate hb-set a bit; add HB_SET_VALUE_INVALID
|
|
4dc798de
|
2013-08-26T20:39:00
|
|
Add hb-deprecated.h, and rename a couple enum values
Add deprecated alias for old name.
|
|
54e6f6c5
|
2013-08-09T14:34:54
|
|
Clean up list of Unicode scripts
Rename HB_SCRIPT_CANADIAN_ABORIGINAL to HB_SCRIPT_CANADIAN_SYLLABICS
and add a macro for the old name.
|
|
7235f33f
|
2013-06-10T14:39:51
|
|
Fix misc warnings reported by cppcheck
https://bugs.freedesktop.org/show_bug.cgi?id=65544
|
|
a4446b10
|
2013-06-03T18:39:14
|
|
Fix build for C89 compilers
|
|
2966d360
|
2013-05-28T17:34:37
|
|
Fix test build
|
|
d9afa111
|
2013-05-28T15:27:40
|
|
Build hb-icu into libharfbuzz-icu.so
|
|
7d395c2a
|
2013-05-28T15:25:06
|
|
Minor
|
|
dfbd115e
|
2013-05-14T15:30:17
|
|
[test] Add test for hb_set_get_min() bug
Failing now.
Bug 64476 - Typo in hb_set_t.get_min()
|
|
0a2b2a50
|
2013-03-21T16:26:39
|
|
Remove gthread leftovers
We don't use gthread anymore, remove leftovers.
|
|
cc50bf5b
|
2013-03-19T06:59:40
|
|
Remove Hangul filler characters from Default_Ignorable chars
See discussion on mailing list.
|
|
a8cf7b43
|
2013-03-19T05:53:26
|
|
[Indic] Further adjust ZWJ handling in Indic-like shapers
After the Ngapi hackfest work, we were assuming that fonts
won't use presentation features to choose specific forms
(eg. conjuncts). As such, we were using auto-joiner behavior
for such features. It proved to be troublesome as many fonts
used presentation forms ('pres') for example to form conjuncts,
which need to be disabled when a ZWJ is inserted.
Two examples:
U+0D2F,U+200D,U+0D4D,U+0D2F with kartika.ttf
U+0995,U+09CD,U+200D,U+09B7 with vrinda.ttf
What we do now is to never do magic to ZWJ during GSUB's main input
match for Indic-style shapers. Note that backtrack/lookahead are still
matched liberally, as is GPOS. This seems to be an acceptable
compromise.
As to the bug that initially started this work, that one needs to
be fixed differently:
Bug 58714 - Kannada u+0cb0 u+200d u+0ccd u+0c95 u+0cbe does not
provide same results as Windows8
https://bugs.freedesktop.org/show_bug.cgi?id=58714
New numbers:
BENGALI: 353689 out of 354188 tests passed. 499 failed (0.140886%)
DEVANAGARI: 707305 out of 707394 tests passed. 89 failed (0.0125814%)
GUJARATI: 366349 out of 366457 tests passed. 108 failed (0.0294714%)
GURMUKHI: 60706 out of 60747 tests passed. 41 failed (0.067493%)
KANNADA: 951030 out of 951913 tests passed. 883 failed (0.0927606%)
KHMER: 299070 out of 299124 tests passed. 54 failed (0.0180527%)
LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
MALAYALAM: 1048102 out of 1048334 tests passed. 232 failed (0.0221304%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271666 out of 271847 tests passed. 181 failed (0.0665816%)
TAMIL: 1091753 out of 1091754 tests passed. 1 failed (9.15957e-05%)
TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%)
TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)
|
|
ea11abfc
|
2013-03-06T20:21:11
|
|
[build] Port to newer automake recommended syntax
|
|
c39def9b
|
2013-03-06T20:20:45
|
|
Move valgrind suppressions to the correct directory
|
|
6d69a2ce
|
2013-02-26T19:35:50
|
|
[tests] Add Malayalam tests from cibu
|
|
9e5ac7b8
|
2013-02-25T17:54:10
|
|
Fix blob test to match c3ba49b6fa1865e8318926eaa6c0f2063d1053bb
|
|
e0486fc1
|
2013-02-19T00:58:10
|
|
[tests] Add Myanmar torture tests from Martin Hosken
|
|
a3df9a7b
|
2013-02-19T00:50:46
|
|
Minor
Moving files around
|
|
b1f44075
|
2013-02-17T12:12:37
|
|
[SEA] Fix order of pre-base reordering Ra and left matras
The code was confused because it was expecting left matra to have
POS_PRE_M, like we do in the Myanmar shaper, but that is not what
we were doing in this shaper. Rewrite to rely on category only.
Test case: U+AA06,U+AA34,U+AA2F
|
|
05ac8781
|
2013-02-15T09:26:41
|
|
[tests] Add Syriac Alaph shaping test cases
|
|
126f39cd
|
2013-02-13T08:29:21
|
|
Add more dot-reph tests
|
|
f22b7e77
|
2013-02-13T07:32:46
|
|
[Indic] Track base position when reordering things
Ouch, how did things ever work without this?! The added test that has a
dot-reph as well as a pre-base reordering Ra perfectly demonstrates the
bug (tested with Nirmala font from Win8 for example). Testing suggests
that Win8 shaper has the *exact* same bug / behavior that we used to
have. Odd.
|
|
cc5f24cd
|
2013-02-12T18:17:12
|
|
[tests] Add tests for Devanagari Eyelash Ra
Currently broken with Sanskrit 2003 font.
|
|
64bb2ae8
|
2013-02-12T16:29:25
|
|
Didn't mean to push this out
Ouch!
|
|
f9b66053
|
2013-02-12T16:13:56
|
|
[Myanmar] Use master Indic table for syllable data
|
|
f60793e8
|
2013-02-12T15:45:59
|
|
[tests] Add Cham sample
|
|
3a83d33e
|
2013-02-12T12:14:10
|
|
Add South-East Asian shaper
Handles Tai Tham, Cham, and New Tai Lue for now.
|
|
fb960212
|
2013-02-12T10:33:58
|
|
Minor test reshufflings
|
|
5676d5d5
|
2013-02-12T10:31:14
|
|
[Indic] Make sure New Tai Lue works!
|
|
bed687f8
|
2013-02-11T14:24:03
|
|
Shuffle test data around
|
|
5898fa94
|
2013-02-06T15:29:07
|
|
Don't use $(ENV)
As reported by Peter Breitenlohner:
I think this is a very bad idea because ENV is used to specify a startup
file to be read by some/all shells.
|
|
ecd454b3
|
2013-01-08T18:09:46
|
|
[Indic] In old-spec shaping, don't move viramas around if seq ends with one
For example: u0c9a u0ccd u0c9a u0ccd with Lohit. See:
https://bugs.freedesktop.org/show_bug.cgi?id=59118
|
|
1172dc73
|
2013-01-07T16:46:37
|
|
Rename hb_buffer_clear() to hb_buffer_clear_contents()
The previous name was clashing with harfbuzz.old. There are systems
that need to link both...
Clash-free now again.
|
|
e81aff9e
|
2013-01-02T23:22:54
|
|
[tests] Finish test-set.c
All passing now.
|
|
8165f276
|
2013-01-02T22:50:36
|
|
[tests] Start adding tests for hb-set.h
Fails now. Fixing.
|
|
b9d28f69
|
2013-01-02T22:49:58
|
|
[tests] Add set object to test-object.c
|
|
8b217f5a
|
2012-12-21T15:48:32
|
|
[Indic] Reorder Malayalam dot-reph to after base
Test sequence is simple: U+0D4E,U+0D15. The dot-reph should be
reordered to after the Ka.
https://bugzilla.redhat.com/show_bug.cgi?id=799565
|
|
624933f6
|
2012-11-30T11:46:35
|
|
Add Persian test cases from Mehran Mehr
|
|
43b65315
|
2012-11-16T13:12:35
|
|
[Indic] Another try to unbreak Sinhala split matras
Just read the comments...
|
|
f3584d3a
|
2012-11-14T15:55:17
|
|
Add test cases for Thai PUA shaping
|
|
6b19fa48
|
2012-11-14T11:38:50
|
|
Adjust diff rule for the new hb-shape output format
|
|
82c4d988
|
2012-11-14T10:56:02
|
|
Add Sinhala test case for split matra U+0DDA
|
|
d04b1285
|
2012-11-14T10:53:10
|
|
Fix test
|
|
0c7df222
|
2012-11-13T14:42:35
|
|
Add buffer flags
New API:
hb_buffer_flags_t
HB_BUFFER_FLAGS_DEFAULT
HB_BUFFER_FLAG_BOT
HB_BUFFER_FLAG_EOT
HB_BUFFER_FLAG_PRESERVE_DEFAULT_IGNORABLES
hb_buffer_set_flags()
hb_buffer_get_flags()
We use the BOT flag to decide whether to insert dottedcircle if the
first char in the buffer is a combining mark.
The PRESERVE_DEFAULT_IGNORABLES flag prevents removal of characters like
ZWNJ/ZWJ/...
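
A rough Python model of how the two flags described above might be consulted. This is not the C API: the class, the numeric values, and both helper functions are stand-ins invented for this sketch, and the default-ignorable set is cut down to ZWNJ/ZWJ.

```python
from enum import IntFlag
import unicodedata


class BufferFlags(IntFlag):
    """Python stand-in for hb_buffer_flags_t; numeric values are assumed."""
    DEFAULT = 0
    BOT = 1                          # buffer starts at beginning of text
    EOT = 2                          # buffer ends at end of text
    PRESERVE_DEFAULT_IGNORABLES = 4  # keep ZWNJ/ZWJ/... in the output


DOTTED_CIRCLE = 0x25CC
DEFAULT_IGNORABLES = {0x200C, 0x200D}  # ZWNJ, ZWJ only, for brevity


def maybe_insert_dotted_circle(codepoints, flags):
    """Prepend U+25CC when BOT is set and the text starts with a mark."""
    starts_with_mark = bool(codepoints) and \
        unicodedata.category(chr(codepoints[0])).startswith("M")
    if (flags & BufferFlags.BOT) and starts_with_mark:
        return [DOTTED_CIRCLE] + list(codepoints)
    return list(codepoints)


def strip_default_ignorables(codepoints, flags):
    """Drop ZWNJ/ZWJ unless PRESERVE_DEFAULT_IGNORABLES is set."""
    if flags & BufferFlags.PRESERVE_DEFAULT_IGNORABLES:
        return list(codepoints)
    return [cp for cp in codepoints if cp not in DEFAULT_IGNORABLES]
```

For example, `maybe_insert_dotted_circle([0x0301, 0x0061], BufferFlags.BOT)` prepends U+25CC, while the same call with `BufferFlags.DEFAULT` leaves the input alone.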
|
|
c8d4f8b0
|
2012-11-13T14:10:19
|
|
Minor
|
|
82ecaff7
|
2012-11-13T13:57:52
|
|
Add hb_buffer_clear()
Which is like _reset(), but does NOT clear unicode-funcs.
|
|
de796a6f
|
2012-11-12T17:27:51
|
|
Add "new" Myanmar OT Script tag
Windows 8 added support for Myanmar shaping using the "mym2" script tag,
even though Windows never supported the old "mymr" tag.
|
|
27f52dc3
|
2012-11-12T16:54:03
|
|
Add Myanmar tests from UTN#11
|
|
e6b86c85
|
2012-11-05T15:18:49
|
|
Add test for non-joining Mongolian letters
For U+1880..U+1886 Uniscribe thinks they are non-joining.
For U+1887 Uniscribe thinks it's joining, but looks wrong to me.
|
|
f5e55754
|
2012-11-02T13:53:18
|
|
Add Tifinagh test data
|
|
c21498af
|
2012-11-02T10:21:26
|
|
Add Mongolian and 'Phags-pa joining test cases
|
|
431bef2e
|
2012-11-01T16:26:01
|
|
Minor build fix
|
|
911ed096
|
2012-10-29T19:42:19
|
|
Ignore gid0 in test results
|
|
10b88d89
|
2012-10-29T18:18:24
|
|
Add Ethiopic test case
This sequence: U+120B,U+135F,U+120B with the Nyala font from Win7
exposes a GPOS bug in Uniscribe, in that the positioned mark is wrongly
moved as a result of a following kern.
This is the one "failure" in the Ethiopic test suite :-).
ETHIOPIC: 118900 out of 118901 tests passed. 1 failed (0.000841036%)
|
|
166b5cf7
|
2012-09-07T14:55:07
|
|
[Indic] Find syllables before any features are applied
With FreeSerif, it seems that the 'ccmp' feature does ligature
substitutions. That was then causing syllable match failures. We now
find syllables before any features have been applied.
Test sequence: U+0D9A,U+0DCA,U+200D,U+0DBB,U+0DCF
|
|
efb8d3eb
|
2012-09-05T15:50:47
|
|
Fixup test failure reporting
After we implemented dotted-circle, we were still ignoring any tests
that had dottedcircle in it for any of the shapers. That meant that if
we wrongly outputted dottedcircle, the test was being ignored. Ouch!
Fixing that shows regressions across the board. Most are Uniscribe
bugs: NOT inserting dotted-circle when it should. Some are our
machine bugs. This is in fact a nice way to catch Indic-machine
deficiencies and when I fix the regressions, our clusters should be
much closer to Uniscribe. For now, we regressed from:
BENGALI: 353997 out of 354285 tests passed. 288 failed (0.0812905%)
DEVANAGARI: 707339 out of 707394 tests passed. 55 failed (0.00777502%)
GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
GURMUKHI: 60769 out of 60809 tests passed. 40 failed (0.0657797%)
KANNADA: 951086 out of 951913 tests passed. 827 failed (0.0868777%)
KHMER: 299106 out of 299124 tests passed. 18 failed (0.00601757%)
LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
MALAYALAM: 1048104 out of 1048416 tests passed. 312 failed (0.0297592%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271747 out of 271847 tests passed. 100 failed (0.0367854%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
TELUGU: 970558 out of 970573 tests passed. 15 failed (0.00154548%)
TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)
To:
BENGALI: 353990 out of 354285 tests passed. 295 failed (0.0832663%)
DEVANAGARI: 707315 out of 707394 tests passed. 79 failed (0.0111678%)
GUJARATI: 366447 out of 366506 tests passed. 59 failed (0.016098%)
GURMUKHI: 60707 out of 60809 tests passed. 102 failed (0.167738%)
KANNADA: 951042 out of 951913 tests passed. 871 failed (0.0915%)
KHMER: 298962 out of 299124 tests passed. 162 failed (0.0541581%)
LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
MALAYALAM: 1048074 out of 1048416 tests passed. 342 failed (0.0326206%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271666 out of 271847 tests passed. 181 failed (0.0665816%)
TAMIL: 1091835 out of 1091837 tests passed. 2 failed (0.000183178%)
TELUGU: 970553 out of 970573 tests passed. 20 failed (0.00206064%)
TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)
Investigating.
|
|
a4e75e41
|
2012-08-27T15:54:15
|
|
Minor
|
|
206ab605
|
2012-08-10T09:06:30
|
|
[test] Move around
|
|
7a484c60
|
2012-08-10T09:05:29
|
|
[test] Add Urdu ligature sequences from CRULP
|
|
378d279b
|
2012-07-31T21:36:16
|
|
Implement Unicode compatibility decompositions
Based on patch from Philip Withnall.
https://bugs.freedesktop.org/show_bug.cgi?id=41095
|
|
70b3dc32
|
2012-07-30T12:40:18
|
|
Add Hebrew test
|
|
a973b5ce
|
2012-07-30T01:46:34
|
|
[GSUB] Further adjustments to mark-attachment vs ligation interaction
The d1d69ec52e75a78575b620a1c456d528b6078170 change broke Kannada badly,
since it was ligating consonants, pushing matra out, and then ligating
with the matra. Adjust for that. See comments.
|
|
97a201be
|
2012-07-29T20:31:36
|
|
Add Arabic tests for mark ligature component attachments
|
|
5d874d56
|
2012-07-28T21:05:25
|
|
[GPOS] Fix mark-to-mark positioning when one of the marks is a ligature
This commit: a3313e54008167e415b72c780ca7b9cda958d07e broke MarkMarkPos
when one of the marks itself is a ligature. That regressed 26 Tibetan
tests (up from zero!). Fix that. Tibetan back to zero.
|
|
6411e74c
|
2012-07-24T13:48:49
|
|
[Indic] Reposition Gurmukhi top matras to after post
The font is forming a post-base consonant in some samples, and Uniscribe
positions top matra on the post-base. Do the same.
Gurmukhi failures down from 59 to 41 (0.0674242%).
|
|
c3f769ba
|
2012-07-24T13:26:32
|
|
[Indic] Ignore Uniscribe output containing two zero-width space glyphs
Uniscribe is buggy and sometimes /eats/ a mark next to a non-joiner.
Most of Malayalam failures were actually hitting this bug.
Ignore test output with two zero-width space glyphs. This is a hack
until we build up the test suite infrastructure better.
Bengali went down by 9, Devanagari by 2, Kannada by 130, Malayalam down
from 1197 to 307, Sinhala down by 16, Telugu down by 26. New stats:
BENGALI: 353996 out of 354285 tests passed. 289 failed (0.0815727%)
DEVANAGARI: 693573 out of 693628 tests passed. 55 failed (0.00792932%)
GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%)
KANNADA: 951086 out of 951913 tests passed. 827 failed (0.0868777%)
KHMER: 299094 out of 299124 tests passed. 30 failed (0.0100293%)
MALAYALAM: 1048109 out of 1048416 tests passed. 307 failed (0.0292823%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271715 out of 271847 tests passed. 132 failed (0.0485567%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
TELUGU: 970550 out of 970573 tests passed. 23 failed (0.00236973%)
|
|
65c43acc
|
2012-07-24T03:36:47
|
|
[Indic] Better position left-matra in Malayalam
Just put it before base, which is what's expected.
Malayalam failures down from 1559 to 1197 (0.114172%).
BENGALI: 353988 out of 354285 tests passed. 297 failed (0.0838308%)
DEVANAGARI: 693571 out of 693628 tests passed. 57 failed (0.00821766%)
GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%)
KANNADA: 950956 out of 951913 tests passed. 957 failed (0.100534%)
KHMER: 299094 out of 299124 tests passed. 30 failed (0.0100293%)
MALAYALAM: 1047219 out of 1048416 tests passed. 1197 failed (0.114172%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271699 out of 271847 tests passed. 148 failed (0.0544424%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
TELUGU: 970524 out of 970573 tests passed. 49 failed (0.00504856%)
|
|
88f413b5
|
2012-07-24T03:04:36
|
|
[Indic] Implement Reph+Ya-Phalaa interaction
The sequence Ra,H,Ya in Bengali is ambiguous and Unicode encoded that to
get Ya-Phalaa, one would place ZWJ before Halant. Ie. a ZWJ,H sequence
requests subjoining, while a H,ZWJ requests Half form. Implement that.
Bengali failures go down from 377 to 297 (0.0838308%).
Gujarati is down by 4 to 17 (0.0046384%).
Kannada is down by 226 to 957 (0.100534%).
Current status:
BENGALI: 353988 out of 354285 tests passed. 297 failed (0.0838308%)
DEVANAGARI: 693571 out of 693628 tests passed. 57 failed (0.00821766%)
GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%)
KANNADA: 950956 out of 951913 tests passed. 957 failed (0.100534%)
KHMER: 299094 out of 299124 tests passed. 30 failed (0.0100293%)
MALAYALAM: 1046857 out of 1048416 tests passed. 1559 failed (0.148701%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271699 out of 271847 tests passed. 148 failed (0.0544424%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
TELUGU: 970524 out of 970573 tests passed. 49 failed (0.00504856%)
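
The ZWJ,H versus H,ZWJ distinction described at the top of this message can be sketched as follows. The function name and the returned labels are made up for illustration, and only the joiner directly adjacent to the halant is inspected:

```python
ZWJ = 0x200D
HALANT = 0x09CD  # BENGALI SIGN VIRAMA


def joiner_request(codepoints, i):
    """Classify what the halant at index i asks for, based on adjacent ZWJ."""
    if codepoints[i] != HALANT:
        raise ValueError("index does not point at a halant")
    if i > 0 and codepoints[i - 1] == ZWJ:
        return "subjoined"   # ZWJ,H requests subjoining (Ya-Phalaa)
    if i + 1 < len(codepoints) and codepoints[i + 1] == ZWJ:
        return "half-form"   # H,ZWJ requests a half form
    return "default"


RA, YA = 0x09B0, 0x09AF
print(joiner_request([RA, ZWJ, HALANT, YA], 2))  # subjoined
print(joiner_request([RA, HALANT, ZWJ, YA], 1))  # half-form
```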
|
|
330b329c
|
2012-07-24T02:25:26
|
|
[Indic] Unmark U+17D1 KHMER SIGN VIRIAM to NOT be a Virama
Fixes another 1 Khmer failure. Down to 30 (0.0100293%) now.
|
|
d90b8e84
|
2012-07-24T02:10:20
|
|
[Indic] Reposition Khmer prebase-reordering Ra around split matras
In Khmer coeng model, a V,Ra can go *after* matras. If it goes after a
split matra, it should be reordered to *before* the left part of such matra.
Khmer failures down from 136 to 39 (0.0130381%).
|
|
75737991
|
2012-07-24T01:32:07
|
|
[Indic] Position Khmer U+17CE
Fixes another 6 Khmer failures. Now at 136 (0.0454661%).
|
|
2278eefc
|
2012-07-24T00:26:43
|
|
[Indic] In Sinhala, form forced Reph even if no other consonant found
Fixes another 10 Sinhala failures. Down to 148 (0.0544424%).
|
|
71fd5e80
|
2012-07-24T00:21:16
|
|
[Indic] Further adjust base algorithm for Sinhala
Apparently if there is C,V,ZWJ,C, the first C will be base, but if
it's C,ZWJ,V,C, the second one will be.
Note that Uniscribe implements this differently, by breaking syllable in
the case of C,ZWJ,V,C and putting the first consonant in one syllable
and the rest in the next syllable.
Sinhala failures down from 208 to 158 (0.0581209%). No changes to
Khmer.
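
One rule that reproduces both cases above: keep taking later consonants as base, but stop at the first consonant that has a ZWJ right before it. This is only a sketch over made-up category letters ('C' consonant, 'V' virama, 'J' ZWJ) and may not match the shaper's code in detail.

```python
def find_base(categories):
    """Return the index of the base consonant in a list of category letters.

    'C' = consonant, 'V' = virama (al-lakuna), 'J' = ZWJ.
    """
    base = None
    for i, cat in enumerate(categories):
        if cat != "C":
            continue
        if i > 0 and categories[i - 1] == "J":
            break  # ZWJ right before a consonant: stop, keep the earlier base
        base = i
    return base


print(find_base(["C", "V", "J", "C"]))  # 0: first consonant is base
print(find_base(["C", "J", "V", "C"]))  # 3: second consonant is base
```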
|
|
73d71cc5
|
2012-07-24T00:09:12
|
|
[Indic] End Vowel-based syllable at ZWJ
One Devanagari test regressed, plus 10 Malayalam (at 1545 now).
Fixed 120 Sinhala failures. Now at 208 (0.0765136%).
|
|
34c21503
|
2012-07-23T23:51:29
|
|
[Indic] Improve Sinhala base algorithm and reph positioning
Sinhala does not have half forms. And most (all?) consonants can be
base, except when preceded by ZWJ, which would request a subjoined form.
Hence switch the base algorithm to categorize with Khmer, start search
at start, and stop at a ZWJ.
Also, mark all pos=base consonants after base to be subjoined. Mark
base itself to have pos=base.
Finally, adjust Sinhala's reph position to after-main.
Brings down Sinhala failures from 455 to 328 (0.120656%).
|
|
771a8f50
|
2012-07-23T20:07:50
|
|
[Indic] exclude ligatures when matching on Indic category
If, say, a H,ZWJ,C ligature was formed, we don't want the code to detect
that as a Halant. So, ignore ligatures when matching category in
final_reordering.
Sinhala failures down from 514 to 455 (0.167374%).
|
|
42848453
|
2012-07-23T13:52:07
|
|
[Thai] Reorder U+0E3A THAI VOWEL SIGN PHINTHU
Uniscribe reorders U+0E3A to be after U+0E38 and U+0E39. We do that by
modifying the ccc for U+0E3A.
Fixes the two remaining Thai failures (see previous commit).
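
To illustrate the ccc trick: in the Unicode data U+0E3A has combining class 9 while U+0E38 and U+0E39 have 103, so plain canonical ordering puts U+0E3A first. A sketch with a hypothetical override value (any class above 103 works here; the value actually used in the shaper may differ):

```python
import unicodedata

# Hypothetical override: any value above the ccc of U+0E38/U+0E39 (103)
# would do for this sketch; the value HarfBuzz uses may differ.
MODIFIED_CCC = {0x0E3A: 104}


def ccc(cp):
    """Combining class, with the override applied."""
    return MODIFIED_CCC.get(cp, unicodedata.combining(chr(cp)))


def reorder_marks(codepoints):
    """Stable-sort each run of combining marks by (modified) ccc."""
    out, run = [], []
    for cp in codepoints:
        if ccc(cp) == 0:
            out.extend(sorted(run, key=ccc))
            run = []
            out.append(cp)
        else:
            run.append(cp)
    out.extend(sorted(run, key=ccc))
    return out


# KO KAI + PHINTHU + SARA U comes out as KO KAI + SARA U + PHINTHU.
print(["U+%04X" % cp for cp in reorder_marks([0x0E01, 0x0E3A, 0x0E38])])
```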
|
|
4a7f4f3e
|
2012-07-23T13:15:33
|
|
[Thai] Adjust SARA AM reordering to match Uniscribe
Adjust the list of marks before SARA AM that get the reordering
treatment. Also adjust cluster formation to match Uniscribe.
With Wikipedia test data, now I see:
- For Thai, with the Angsana New font from Win7, I see 54 failures out
of over 4M tests (0.00129107%). Of the 54, two are legitimate
reordering issues (fix coming soon), and the other 52 are simply
Uniscribe using a zero-width space char instead of an unknown
character for missing glyphs. No idea why. The missing-glyph
sequences include one that is a Thai character followed by an Arabic
Sokun. Someone confused it with Nikhahit I assume!
- For Lao, with the Dokchampa font from Win7, 33 tests fail out of
54k (0.0615167%). All seem to be insignificant mark positioning
with two marks on a base. Have to investigate.
|
|
60554f14
|
2012-07-22T23:23:56
|
|
[Indic] Merge in Malayalam tests
From:
http://silpa.org.in/pub/tests/hb/ml/ml-harfbuzz-testdata.txt
|
|
5c708177
|
2012-07-22T23:20:27
|
|
[Indic] Add extensive Sinhala tests
Generated by:
http://git.savannah.gnu.org/cgit/sinhala.git/plain/utils/gen-unicode-sinhala.py
|
|
2efe4707
|
2012-07-22T23:17:59
|
|
[Indic] Add Sinhala tests
Merge tests from:
http://git.savannah.gnu.org/cgit/sinhala.git/plain/patches/icu-sinhala-rendering.txt
|
|
3d4c111b
|
2012-07-20T19:34:39
|
|
Add a test case
|
|
bdd08043
|
2012-07-20T16:03:09
|
|
[Indic] Reposition Oriya Candrabindu
Oriya failures down from 0.65% to 0.20%.
|
|
87cd6326
|
2012-07-19T21:17:48
|
|
[Indic] Recategorize some Kannada right matras
Kannada failures down from 3.5% to 2.93%.
|
|
c87bcddb
|
2012-07-19T20:03:25
|
|
[Indic] Add failing test for Kannada
|
|
deeb540a
|
2012-07-19T11:30:48
|
|
[test] Ignore tests with DOTTED CIRCLE in the output
|
|
422ecd2d
|
2012-07-18T23:25:58
|
|
[Indic] Accept a forced Rakar sequence at the end of syllable
In Sinhala, Rakar is formed by Al-Lakuna,ZWJ,Ra. If you put that at the
end of a Consonant,Matra syllable, you get a dotted-circle from
Uniscribe. Apparently adding a ZWJ before the Al-Lakuna "fixes" that.
And people have been encoding that sequence... So, allow a forced
"ZWJ,Virama,ZWJ,Ra" sequence at the end of syllables.
Fixes some 100 or more of Sinhala failures. Now at 622 only (0.23%).
|
|
10cdc94e
|
2012-07-18T17:42:34
|
|
[Indic] In final reordering, find base, even if it disappeared
POS_BASE can disappear if base ligated backward. Define base as last
with position not after base.
Fixes a few hundred of Sinhala failures with Iskoola Pota.
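
A small sketch of the fallback definition of base given above, using simplified position classes whose names and values are illustrative rather than the shaper's own:

```python
from enum import IntEnum


class Pos(IntEnum):
    """Simplified position classes; names and values are illustrative."""
    PRE = 0
    BASE = 1
    POST = 2


def find_base_index(positions):
    """Index of the last glyph whose position is not after base."""
    base = 0
    for i, pos in enumerate(positions):
        if pos <= Pos.BASE:
            base = i
    return base


# Base still present: it is found directly.
print(find_base_index([Pos.PRE, Pos.BASE, Pos.POST]))            # 1
# Base ligated away: fall back to the last pre-base glyph.
print(find_base_index([Pos.PRE, Pos.PRE, Pos.POST, Pos.POST]))   # 1
```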
|
|
3285e107
|
2012-07-18T17:22:14
|
|
[Indic] Implement Sinhala "Al Lakuna" Reph behavior
In Sinhala, Reph is formed only explicitly, by the presence of a ZWJ.
|
|
552d19b7
|
2012-07-18T16:00:49
|
|
[Indic] Treat Register Shifters like Nukta
Really this time.
Fixes another 18 Khmer tests.
|
|
69f26bf3
|
2012-07-18T15:45:43
|
|
[Indic] Fix Matra reordering when base is at end of syllable
For example: U+915,U+200c,U+93f
Fixes last Tamil failure!
|
|
391cc033
|
2012-07-18T15:10:05
|
|
[Indic] Allow halant group in Vowel and placeholder syllables
Fixes 2 out of 560 Devanagari failures. AND:
Fixes 1 out of 2 Tamil failures.
|