Branch
Hash :
025f72b2
Author :
Date :
2025-08-04T03:05:16
Make example variable names consistent
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316
<?xml version="1.0"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
"http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [
<!ENTITY % local.common.attrib "xmlns:xi CDATA #FIXED 'http://www.w3.org/2003/XInclude'">
<!ENTITY version SYSTEM "version.xml">
]>
<chapter id="getting-started">
<title>Getting started with HarfBuzz</title>
<section id="an-overview-of-the-harfbuzz-shaping-api">
<title>An overview of the HarfBuzz shaping API</title>
<para>
The core of the HarfBuzz shaping API is the function
<function>hb_shape()</function>. This function takes a font, a
buffer containing a string of Unicode codepoints and
(optionally) a list of font features as its input. It replaces
the codepoints in the buffer with the corresponding glyphs from
the font, correctly ordered and positioned, and with any of the
optional font features applied.
</para>
<para>
In addition to holding the pre-shaping input (the Unicode
codepoints that comprise the input string) and the post-shaping
output (the glyphs and positions), a HarfBuzz buffer has several
properties that affect shaping. The most important are the
text-flow direction (e.g., left-to-right, right-to-left,
top-to-bottom, or bottom-to-top), the script tag, and the
language tag.
</para>
<para>
For input string buffers, flags are available to denote when the
buffer represents the beginning or end of a paragraph, to
indicate whether or not to visibly render Unicode <literal>Default
Ignorable</literal> codepoints, and to modify the cluster-merging
behavior for the buffer. For shaped output buffers, the
individual X and Y offsets and <literal>advances</literal>
(the logical dimensions) of each glyph are
accessible. HarfBuzz also flags glyphs as
<literal>UNSAFE_TO_BREAK</literal> if breaking the string at
that glyph (e.g., in a line-breaking or hyphenation process)
would require re-shaping the text.
</para>
<para>
HarfBuzz also provides methods to compare the contents of
buffers, join buffers, normalize buffer contents, and handle
invalid codepoints, as well as to determine the state of a
buffer (e.g., input codepoints or output glyphs). Buffer
lifecycles are managed and all buffers are reference-counted.
</para>
<para>
Although the default <function>hb_shape()</function> function is
sufficient for most use cases, a variant is also provided that
lets you specify which of HarfBuzz's shapers to use on a buffer.
</para>
<para>
HarfBuzz can read TrueType fonts, TrueType collections, OpenType
fonts, and OpenType collections. Functions are provided to query
font objects about metrics, Unicode coverage, available tables and
features, and variation selectors. Individual glyphs can also be
queried for metrics, variations, and glyph names. OpenType
variable fonts are supported, and HarfBuzz allows you to set
variation-axis coordinates on font objects.
</para>
<para>
HarfBuzz provides glue code to integrate with various other
libraries, including FreeType, GObject, and CoreText. Support
for integrating with Uniscribe and DirectWrite is experimental
at present.
</para>
</section>
<section id="terminology">
<title>Terminology</title>
<para>
</para>
<variablelist>
<?dbfo list-presentation="blocks"?>
<varlistentry>
<term>script</term>
<listitem>
<para>
In text shaping, a <emphasis>script</emphasis> is a
writing system: a set of symbols, rules, and conventions
that is used to represent a language or multiple
languages.
</para>
<para>
In general computing lingo, the word "script" can also
be used to mean an executable program (usually one
written in a human-readable programming language). For
the sake of clarity, HarfBuzz documents will always use
more specific terminology when referring to this
meaning, such as "Python script" or "shell script." In
all other instances, "script" refers to a writing system.
</para>
<para>
For developers using HarfBuzz, it is important to note
the distinction between a script and a language. Most
scripts are used to write a variety of different
languages, and many languages may be written in more
than one script.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>shaper</term>
<listitem>
<para>
In HarfBuzz, a <emphasis>shaper</emphasis> is a
handler for a specific script-shaping model. HarfBuzz
implements separate shapers for Indic, Arabic, Thai and
Lao, Khmer, Myanmar, Tibetan, Hangul, Hebrew, the
Universal Shaping Engine (USE), and a default shaper for
scripts with no script-specific shaping model.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>cluster</term>
<listitem>
<para>
In text shaping, a <emphasis>cluster</emphasis> is a
sequence of codepoints that must be treated as an
indivisible unit. Clusters can include code-point
sequences that form a ligature or base-and-mark
sequences. Tracking and preserving clusters is important
when shaping operations might separate or reorder
code points.
</para>
<para>
HarfBuzz provides three cluster
<emphasis>levels</emphasis> that implement different
approaches to the problem of preserving clusters during
shaping operations.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>grapheme</term>
<listitem>
<para>
In linguistics, a <emphasis>grapheme</emphasis> is one
of the indivisible units that make up a writing system or
script. Often, graphemes are individual symbols (letters,
numbers, punctuation marks, logograms, etc.) but,
depending on the writing system, a particular grapheme
might correspond to a sequence of several Unicode code
points.
</para>
<para>
In practice, HarfBuzz and other text-shaping engines
are not generally concerned with graphemes. However, it
is important for developers using HarfBuzz to recognize
that there is a difference between graphemes and shaping
clusters (see above). The two concepts may overlap
frequently, but there is no guarantee that they will be
identical.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>syllable</term>
<listitem>
<para>
In linguistics, a <emphasis>syllable</emphasis> is an
a sequence of sounds that makes up a building block of a
particular language. Every language has its own set of
rules describing what constitutes a valid syllable.
</para>
<para>
For text-shaping purposes, the various definitions of
"syllable" are important because script-specific shaping
operations may be applied at the syllable level. For
example, a reordering rule might specify that a vowel
mark be reordered to the beginning of the syllable.
</para>
<para>
Syllables will consist of one or more Unicode code
points. The definition of a syllable for a particular
writing system might correspond to how HarfBuzz
identifies clusters (see above) for the same writing
system. However, it is important for developers using
HarfBuzz to recognize that there is a difference between
syllables and shaping clusters. The two concepts may
overlap frequently, but there is no guarantee that they
will be identical.
</para>
</listitem>
</varlistentry>
</variablelist>
</section>
<section id="a-simple-shaping-example">
<title>A simple shaping example</title>
<para>
Below is the simplest HarfBuzz shaping example possible.
</para>
<orderedlist numeration="arabic">
<listitem>
<para>
Create a buffer and put your text in it.
</para>
</listitem>
</orderedlist>
<programlisting language="C">
#include <hb.h>
hb_buffer_t *buf;
buf = hb_buffer_create();
hb_buffer_add_utf8(buf, text, -1, 0, -1);
</programlisting>
<orderedlist numeration="arabic">
<listitem override="2">
<para>
Set the script, language and direction of the buffer.
</para>
</listitem>
</orderedlist>
<programlisting language="C">
// If you know the direction, script, and language
hb_buffer_set_direction(buf, HB_DIRECTION_LTR);
hb_buffer_set_script(buf, HB_SCRIPT_LATIN);
hb_buffer_set_language(buf, hb_language_from_string("en", -1));
// If you don't know the direction, script, and language
hb_buffer_guess_segment_properties(buf);
</programlisting>
<orderedlist numeration="arabic">
<listitem override="3">
<para>
Create a face and a font from a font file.
</para>
</listitem>
</orderedlist>
<programlisting language="C">
hb_blob_t *blob = hb_blob_create_from_file(filename); /* or hb_blob_create_from_file_or_fail() */
hb_face_t *face = hb_face_create(blob, 0);
hb_font_t *font = hb_font_create(face);
</programlisting>
<orderedlist numeration="arabic">
<listitem override="4">
<para>
Shape!
</para>
</listitem>
</orderedlist>
<programlisting>
hb_shape(font, buf, NULL, 0);
</programlisting>
<orderedlist numeration="arabic">
<listitem override="5">
<para>
Get the glyph and position information.
</para>
</listitem>
</orderedlist>
<programlisting language="C">
unsigned int glyph_count;
hb_glyph_info_t *glyph_info = hb_buffer_get_glyph_infos(buf, &glyph_count);
hb_glyph_position_t *glyph_pos = hb_buffer_get_glyph_positions(buf, &glyph_count);
</programlisting>
<orderedlist numeration="arabic">
<listitem override="6">
<para>
Iterate over each glyph.
</para>
</listitem>
</orderedlist>
<programlisting language="C">
hb_position_t cursor_x = 0;
hb_position_t cursor_y = 0;
for (unsigned int i = 0; i < glyph_count; i++) {
hb_codepoint_t glyphid = glyph_info[i].codepoint;
hb_position_t x_offset = glyph_pos[i].x_offset;
hb_position_t y_offset = glyph_pos[i].y_offset;
hb_position_t x_advance = glyph_pos[i].x_advance;
hb_position_t y_advance = glyph_pos[i].y_advance;
/* draw_glyph(glyphid, cursor_x + x_offset, cursor_y + y_offset); */
cursor_x += x_advance;
cursor_y += y_advance;
}
</programlisting>
<orderedlist numeration="arabic">
<listitem override="7">
<para>
Tidy up.
</para>
</listitem>
</orderedlist>
<programlisting language="C">
hb_buffer_destroy(buf);
hb_font_destroy(font);
hb_face_destroy(face);
hb_blob_destroy(blob);
</programlisting>
<para>
This example shows enough to get us started using HarfBuzz. In
the sections that follow, we will use the remainder of
HarfBuzz's API to refine and extend the example and improve its
text-shaping capabilities.
</para>
</section>
</chapter>