Edit

kc3-lang/libxml2/doc/encoding.html

Branch :

  • Show log

    Commit

  • Author : Daniel Veillard
    Date : 2005-04-11 23:28:16
    Hash : d49370e9
    Message : Updated Steve Ball and Mac OS X binaries URL, Daniel

  • doc/encoding.html
  • <?xml version="1.0" encoding="ISO-8859-1"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" /><link rel="SHORTCUT ICON" href="/favicon.ico" /><style type="text/css">
    TD {font-family: Verdana,Arial,Helvetica}
    BODY {font-family: Verdana,Arial,Helvetica; margin-top: 2em; margin-left: 0em; margin-right: 0em}
    H1 {font-family: Verdana,Arial,Helvetica}
    H2 {font-family: Verdana,Arial,Helvetica}
    H3 {font-family: Verdana,Arial,Helvetica}
    A:link, A:visited, A:active { text-decoration: underline }
    </style><title>Encodings support</title></head><body bgcolor="#8b7765" text="#000000" link="#a06060" vlink="#000000"><table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr><td width="120"><a href="http://swpat.ffii.org/"><img src="epatents.png" alt="Action against software patents" /></a></td><td width="180"><a href="http://www.gnome.org/"><img src="gnome2.png" alt="Gnome2 Logo" /></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo" /></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo" /></a><div align="left"><a href="http://xmlsoft.org/"><img src="Libxml2-Logo-180x168.gif" alt="Made with Libxml2 Logo" /></a></div></td><td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center"><h1>The XML C parser and toolkit of Gnome</h1><h2>Encodings support</h2></td></tr></table></td></tr></table></td></tr></table><table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr><td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3"><tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr><tr><td bgcolor="#fffacd"><form action="search.php" enctype="application/x-www-form-urlencoded" method="get"><input name="query" type="text" size="20" value="" /><input name="submit" type="submit" value="Search ..." /></form><ul><li><a href="index.html">Home</a></li><li><a href="http://xmlsoft.org/wiki">Wiki</a></li><li><a href="html/index.html">Reference Manual</a></li><li><a href="intro.html">Introduction</a></li><li><a href="FAQ.html">FAQ</a></li><li><a href="docs.html" style="font-weight:bold">Developer Menu</a></li><li><a href="bugs.html">Reporting bugs and getting help</a></li><li><a href="help.html">How to help</a></li><li><a href="downloads.html">Downloads</a></li><li><a href="news.html">Releases</a></li><li><a href="XMLinfo.html">XML</a></li><li><a href="XSLT.html">XSLT</a></li><li><a href="xmldtd.html">Validation &amp; DTDs</a></li><li><a href="encoding.html">Encodings support</a></li><li><a href="catalog.html">Catalog support</a></li><li><a href="namespaces.html">Namespaces</a></li><li><a href="contribs.html">Contributions</a></li><li><a href="examples/index.html" style="font-weight:bold">Code Examples</a></li><li><a href="html/index.html" style="font-weight:bold">API Menu</a></li><li><a href="guidelines.html">XML Guidelines</a></li><li><a href="ChangeLog.html">Recent Changes</a></li></ul></td></tr></table><table width="100%" border="0" cellspacing="1" cellpadding="3"><tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr><tr><td bgcolor="#fffacd"><ul><li><a href="http://mail.gnome.org/archives/xml/">Mail archive</a></li><li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li><li><a href="http://phd.cs.unibo.it/gdome2/">DOM gdome2</a></li><li><a href="http://www.aleksey.com/xmlsec/">XML-DSig xmlsec</a></li><li><a href="ftp://xmlsoft.org/">FTP</a></li><li><a href="http://www.zlatkovic.com/projects/libxml/">Windows binaries</a></li><li><a href="http://www.blastwave.org/packages.php/libxml2">Solaris binaries</a></li><li><a href="http://www.explain.com.au/oss/libxml2xslt.html">MacOsX binaries</a></li><li><a href="http://libxmlplusplus.sourceforge.net/">C++ bindings</a></li><li><a href="http://www.zend.com/php5/articles/php5-xmlphp.php#Heading4">PHP bindings</a></li><li><a href="http://sourceforge.net/projects/libxml2-pas/">Pascal bindings</a></li><li><a href="http://rubyforge.org/projects/xml-tools/">Ruby bindings</a></li><li><a href="http://tclxml.sourceforge.net/">Tcl bindings</a></li><li><a href="http://bugzilla.gnome.org/buglist.cgi?product=libxml2">Bug Tracker</a></li></ul></td></tr></table></td></tr></table></td><td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd"><p>If you are not really familiar with Internationalization (usual shortcut
    is I18N) , Unicode, characters and glyphs, I suggest you read a <a href="http://www.tbray.org/ongoing/When/200x/2003/04/06/Unicode">presentation</a>
    by Tim Bray on Unicode and why you should care about it.</p><p>If you don't understand why <b>it does not make sense to have a string
    without knowing what encoding it uses</b>, then as Joel Spolsky said
    <a href="http://www.joelonsoftware.com/articles/Unicode.html">please do
    not write another line of code until you finish reading that article.</a>.
    It is a prerequisite to understand this page, and avoid a lot of problems
    with libxml2, XML or text processing in general.</p><p>Table of Content:</p><ol><li><a href="encoding.html#What">What does internationalization support
        mean ?</a></li>
      <li><a href="encoding.html#internal">The internal encoding, how and
      why</a></li>
      <li><a href="encoding.html#implemente">How is it implemented ?</a></li>
      <li><a href="encoding.html#Default">Default supported encodings</a></li>
      <li><a href="encoding.html#extend">How to extend the existing
      support</a></li>
    </ol><h3><a name="What" id="What">What does internationalization support mean ?</a></h3><p>XML was designed from the start to allow the support of any character set
    by using Unicode. Any conformant XML parser has to support the UTF-8 and
    UTF-16 default encodings which can both express the full unicode ranges. UTF8
    is a variable length encoding whose greatest points are to reuse the same
    encoding for ASCII and to save space for Western encodings, but it is a bit
    more complex to handle in practice. UTF-16 use 2 bytes per character (and
    sometimes combines two pairs), it makes implementation easier, but looks a
    bit overkill for Western languages encoding. Moreover the XML specification
    allows the document to be encoded in other encodings at the condition that
    they are clearly labeled as such. For example the following is a wellformed
    XML document encoded in ISO-8859-1 and using accentuated letters that we
    French like for both markup and content:</p><pre>&lt;?xml version="1.0" encoding="ISO-8859-1"?&gt;
    &lt;tr