<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>The XML C parser and toolkit of Gnome</title>
<meta name="GENERATOR" content="amaya 5.1">
<meta http-equiv="Content-Type" content="text/html">
</head>
<body bgcolor="#ffffff">
<h1 align="center">The XML C parser and toolkit of Gnome</h1>
<h1>Note: this is the flat content of the <a href="index.html">web
site</a></h1>
<h1 style="text-align: center">libxml, a.k.a. gnome-xml</h1>
<p></p>
<p
style="text-align: right; font-style: italic; font-size: 10pt">"Programming
with libxml2 is like the thrilling embrace of an exotic stranger." <a
href="http://diveintomark.org/archives/2004/02/18/libxml2">Mark
Pilgrim</a></p>
<p>Libxml2 is the XML C parser and toolkit developed for the Gnome project
(but usable outside of the Gnome platform). It is free software available
under the <a href="http://www.opensource.org/licenses/mit-license.html">MIT
License</a>. XML itself is a metalanguage for designing markup languages, i.e.
text languages where semantics and structure are added to the content using
extra "markup" information enclosed between angle brackets. HTML is the most
well-known markup language. Though the library is written in C, <a
href="python.html">a variety of language bindings</a> make it available in
other environments.</p>
<p>Libxml2 is known to be very portable: the library should build and work
without serious trouble on a variety of systems (Linux, Unix, Windows,
CygWin, MacOS, MacOS X, RISC OS, OS/2, VMS, QNX, MVS, ...).</p>
<p>Libxml2 implements a number of existing standards related to markup
languages:</p>
<ul>
<li>the XML standard: <a
href="http://www.w3.org/TR/REC-xml">http://www.w3.org/TR/REC-xml</a></li>
<li>Namespaces in XML: <a
href="http://www.w3.org/TR/REC-xml-names/">http://www.w3.org/TR/REC-xml-names/</a></li>
<li>XML Base: <a
href="http://www.w3.org/TR/xmlbase/">http://www.w3.org/TR/xmlbase/</a></li>
<li><a href="http://www.cis.ohio-state.edu/rfc/rfc2396.txt">RFC 2396</a> :
Uniform Resource Identifiers <a
href="http://www.ietf.org/rfc/rfc2396.txt">http://www.ietf.org/rfc/rfc2396.txt</a></li>
<li>XML Path Language (XPath) 1.0: <a
href="http://www.w3.org/TR/xpath">http://www.w3.org/TR/xpath</a></li>
<li>HTML4 parser: <a
href="http://www.w3.org/TR/html401/">http://www.w3.org/TR/html401/</a></li>
<li>XML Pointer Language (XPointer) Version 1.0: <a
href="http://www.w3.org/TR/xptr">http://www.w3.org/TR/xptr</a></li>
<li>XML Inclusions (XInclude) Version 1.0: <a
href="http://www.w3.org/TR/xinclude/">http://www.w3.org/TR/xinclude/</a></li>
<li>ISO-8859-x encodings, as well as <a
href="http://www.cis.ohio-state.edu/rfc/rfc2044.txt">rfc2044</a> [UTF-8]
and <a href="http://www.cis.ohio-state.edu/rfc/rfc2781.txt">rfc2781</a>
[UTF-16] Unicode encodings, and more if using iconv support</li>
<li>part of SGML Open Technical Resolution TR9401:1997</li>
<li>XML Catalogs Working Draft 06 August 2001: <a
href="http://www.oasis-open.org/committees/entity/spec-2001-08-06.html">http://www.oasis-open.org/committees/entity/spec-2001-08-06.html</a></li>
<li>Canonical XML Version 1.0: <a
href="http://www.w3.org/TR/xml-c14n">http://www.w3.org/TR/xml-c14n</a>
and the Exclusive XML Canonicalization CR draft <a
href="http://www.w3.org/TR/xml-exc-c14n">http://www.w3.org/TR/xml-exc-c14n</a></li>
<li>Relax NG, ISO/IEC 19757-2:2003, <a
href="http://www.oasis-open.org/committees/relax-ng/spec-20011203.html">http://www.oasis-open.org/committees/relax-ng/spec-20011203.html</a></li>
<li>W3C XML Schemas Part 2: Datatypes <a
href="http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/">REC 02 May
2001</a></li>
<li>W3C <a href="http://www.w3.org/TR/xml-id/">xml:id</a> Working Draft 7
April 2004</li>
</ul>
<p>In most cases libxml2 tries to implement the specifications in a
relatively strict, compliant way. As of release 2.4.16, libxml2 passed all
1800+ tests from the <a
href="http://www.oasis-open.org/committees/xml-conformance/">OASIS XML Test
Suite</a>.</p>
<p>To some extent libxml2 provides support for the following additional
specifications but doesn't claim to implement them completely:</p>
<ul>
<li>Document Object Model (DOM) <a
href="http://www.w3.org/TR/DOM-Level-2-Core/">http://www.w3.org/TR/DOM-Level-2-Core/</a>:
libxml2 follows the document model but doesn't implement the API itself;
gdome2 does this on top of libxml2</li>
<li><a href="http://www.cis.ohio-state.edu/rfc/rfc959.txt">RFC 959</a> :
libxml2 implements a basic FTP client</li>
<li><a href="http://www.cis.ohio-state.edu/rfc/rfc1945.txt">RFC 1945</a> :
HTTP/1.0, again a basic HTTP client</li>
<li>SAX: a SAX2-like interface and a minimal SAX1 implementation compatible
with early expat versions</li>
</ul>
<p>A partial implementation of <a
href="http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/">XML Schemas Part
1: Structure</a> is being worked on but it would be far too early to make any
conformance statement about it at the moment.</p>
<p>Separate documents:</p>
<ul>
<li><a href="http://xmlsoft.org/XSLT/">the libxslt page</a> providing an
implementation of XSLT 1.0 and common extensions like EXSLT for
libxml2</li>
<li><a href="http://www.cs.unibo.it/~casarini/gdome2/">the gdome2 page</a>
: a standard DOM2 implementation for libxml2</li>
<li><a href="http://www.aleksey.com/xmlsec/">the XMLSec page</a>: an
implementation of <a href="http://www.w3.org/TR/xmldsig-core/">W3C XML
Digital Signature</a> for libxml2</li>
<li>also check the related links section below for more related and active
projects.</li>
</ul>
<p>Logo designed by <a href="mailto:liyanage@access.ch">Marc Liyanage</a>.</p>
<h2><a name="Introducti">Introduction</a></h2>
<p>This document describes libxml, the <a
href="http://www.w3.org/XML/">XML</a> C parser and toolkit developed for the
<a href="http://www.gnome.org/">Gnome</a> project. <a
href="http://www.w3.org/XML/">XML is a standard</a> for building tag-based
structured documents/data.</p>
<p>Here are some key points about libxml:</p>
<ul>
<li>Libxml2 exports Push (progressive) and Pull (blocking) type parser
interfaces for both XML and HTML.</li>
<li>Libxml2 can do DTD validation at parse time, using a parsed document
instance, or with an arbitrary DTD.</li>
<li>Libxml2 includes complete <a
href="http://www.w3.org/TR/xpath">XPath</a>, <a
href="http://www.w3.org/TR/xptr">XPointer</a> and <a
href="http://www.w3.org/TR/xinclude">XInclude</a> implementations.</li>
<li>It is written in plain C, making as few assumptions as possible, and
sticking closely to ANSI C/POSIX for easy embedding. It works on
Linux/Unix/Windows and has been ported to a number of other platforms.</li>
<li>Basic support for HTTP and FTP client allowing applications to fetch
remote resources.</li>
<li>The design is modular; most of the extensions can be compiled out.</li>
<li>The internal document representation is as close as possible to the <a
href="http://www.w3.org/DOM/">DOM</a> interfaces.</li>
<li>Libxml2 also has a <a
href="http://www.megginson.com/SAX/index.html">SAX-like interface</a>;
the interface is designed to be compatible with <a
href="http://www.jclark.com/xml/expat.html">Expat</a>.</li>
<li>This library is released under the <a
href="http://www.opensource.org/licenses/mit-license.html">MIT
License</a>. See the Copyright file in the distribution for the precise
wording.</li>
</ul>
<p>Warning: unless you are forced to because your application links with a
Gnome-1.X library requiring it, <strong><span
style="background-color: #FF0000">Do Not Use libxml1</span></strong>, use
libxml2.</p>
<h2><a name="FAQ">FAQ</a></h2>
<p>Table of Contents:</p>
<ul>
<li><a href="FAQ.html#License">License(s)</a></li>
<li><a href="FAQ.html#Installati">Installation</a></li>
<li><a href="FAQ.html#Compilatio">Compilation</a></li>
<li><a href="FAQ.html#Developer">Developer corner</a></li>
</ul>
<h3><a name="License">License</a>(s)</h3>
<ol>
<li><em>Licensing Terms for libxml</em>
<p>libxml2 is released under the <a
href="http://www.opensource.org/licenses/mit-license.html">MIT
License</a>; see the file Copyright in the distribution for the precise
wording</p>
</li>
<li><em>Can I embed libxml2 in a proprietary application ?</em>
<p>Yes. The MIT License allows you to keep your changes to libxml
proprietary, but it would be gracious to send back bug fixes and
improvements as patches for possible incorporation in the main
development tree.</p>
</li>
</ol>
<h3><a name="Installati">Installation</a></h3>
<ol>
<li><strong><span style="background-color: #FF0000">Do Not Use
libxml1</span></strong>, use libxml2</li>
<li><em>Where can I get libxml</em> ?
<p>The original distribution comes from <a
href="ftp://rpmfind.net/pub/libxml/">rpmfind.net</a> or <a
href="ftp://ftp.gnome.org/pub/GNOME/sources/libxml2/2.6/">gnome.org</a></p>
<p>Most Linux and BSD distributions include libxml; installing the
distribution's package is probably the safest way for end-users to use
libxml.</p>
<p>David Doolin provides precompiled Windows versions at <a
href="http://www.ce.berkeley.edu/~doolin/code/libxmlwin32/">http://www.ce.berkeley.edu/~doolin/code/libxmlwin32/</a></p>
</li>
<li><em>I see libxml and libxml2 releases, which one should I install ?</em>
<ul>
<li>If you are not constrained by backward compatibility issues with
existing applications, install libxml2 only</li>
<li>If you are not doing development, you can safely install both.
Usually the packages <a
href="http://rpmfind.net/linux/RPM/libxml.html">libxml</a> and <a
href="http://rpmfind.net/linux/RPM/libxml2.html">libxml2</a> are
compatible (this is not the case for development packages).</li>
<li>If you are a developer and your system provides separate packaging
for shared libraries and the development components, it is possible
to install libxml and libxml2 together with <a
href="http://rpmfind.net/linux/RPM/libxml-devel.html">libxml-devel</a>
and <a
href="http://rpmfind.net/linux/RPM/libxml2-devel.html">libxml2-devel</a>,
provided libxml2 is &gt;= 2.3.0</li>
<li>If you are developing a new application, please develop against
libxml2(-devel)</li>
</ul>
</li>
<li><em>I can't install the libxml package, it conflicts with libxml0</em>
<p>You probably have an old libxml0 package used to provide the shared
library for libxml.so.0, you can probably safely remove it. The libxml
packages provided on <a
href="ftp://rpmfind.net/pub/libxml/">rpmfind.net</a> provide
libxml.so.0</p>
</li>
<li><em>I can't install the libxml(2) RPM package due to failed
dependencies</em>
<p>The most generic solution is to re-fetch the latest src.rpm, and
rebuild it locally with</p>
<p><code>rpm --rebuild libxml(2)-xxx.src.rpm</code>.</p>
<p>If everything goes well it will generate two binary rpm packages (one
providing the shared libs and xmllint, and the other one, the -devel
package, providing includes, static libraries and scripts needed to build
applications with libxml(2)) that you can install locally.</p>
</li>
</ol>
<h3><a name="Compilatio">Compilation</a></h3>
<ol>
<li><em>What is the process to compile libxml2 ?</em>
<p>Like most UNIX libraries, libxml2 follows the "standard" procedure:</p>
<p><code>gunzip -c xxx.tar.gz | tar xvf -</code></p>
<p><code>cd libxml-xxxx</code></p>
<p><code>./configure --help</code></p>
<p>to see the options, then the compilation/installation proper</p>
<p><code>./configure [possible options]</code></p>
<p><code>make</code></p>
<p><code>make install</code></p>
<p>At that point you may have to rerun ldconfig or a similar utility to
update your list of installed shared libs.</p>
</li>
<li><em>What other libraries are needed to compile/install libxml2 ?</em>
<p>Libxml2 does not require any other library; the standard ANSI C API
should be sufficient (please report any violation of this rule you may
find).</p>
<p>However, if they are found at configuration time, libxml2 will detect
and use the following libs:</p>
<ul>
<li><a href="http://www.info-zip.org/pub/infozip/zlib/">libz</a> : a
highly portable and widely available compression library.</li>
<li>iconv: a powerful character encoding conversion library. It is
included by default in recent glibc libraries, so it doesn't need to
be installed specifically on Linux. It now seems to be <a
href="http://www.opennc.org/onlinepubs/7908799/xsh/iconv.html">part
of the official UNIX</a> specification. Here is one <a
href="http://www.gnu.org/software/libiconv/">implementation of the
library</a> whose source can be found <a
href="ftp://ftp.ilog.fr/pub/Users/haible/gnu/">here</a>.</li>
</ul>
</li>
<li><em>Make check fails on some platforms</em>
<p>Sometimes the regression tests' results don't completely match the
value produced by the parser, and the makefile uses diff to print the
delta. On some platforms the diff return value breaks the build process;
if the diff is small this is probably not a serious problem.</p>
<p>Sometimes (especially on Solaris) make check fails due to limitations
in make. Try using GNU-make instead.</p>
</li>
<li><em>I use the CVS version and there is no configure script</em>
<p>The configure script and the Makefiles are generated. Use the
autogen.sh script to regenerate the configure script and Makefiles,
like:</p>
<p><code>./autogen.sh --prefix=/usr --disable-shared</code></p>
</li>
<li><em>I have troubles when running make tests with gcc-3.0</em>
<p>It seems the initial release of gcc-3.0 has a problem with the
optimizer which miscompiles the URI module. Please use another
compiler.</p>
</li>
</ol>
<h3><a name="Developer">Developer</a> corner</h3>
<ol>
<li><em>Troubles compiling or linking programs using libxml2</em>
<p>Usually the problem comes from the fact that the compiler doesn't get
the right compilation or linking flags. There is a small shell script,
<code>xml2-config</code>, installed as part of the usual libxml2
install process, which provides those flags. Use</p>
<p><code>xml2-config --cflags</code></p>
<p>to get the compilation flags and</p>
<p><code>xml2-config --libs</code></p>
<p>to get the linker flags. Usually this is done directly from the
Makefile as:</p>
<p><code>CFLAGS=`xml2-config --cflags`</code></p>
<p><code>LIBS=`xml2-config --libs`</code></p>
</li>
<li><em>I want to install my own copy of libxml2 in my home directory and link
my programs against it, but it doesn't work</em>
<p>There are many different ways to accomplish this. Here is one way to
do this under Linux. Suppose your home directory is
<code>/home/user</code>. Then:</p>
<ul><li>Create a subdirectory, let's call it <code>myxml</code></li>
<li>unpack the libxml2 distribution into that subdirectory</li>
<li>chdir into the unpacked distribution (<code>/home/user/myxml/libxml2
</code>)</li>
<li>configure the library using the "<code>--prefix</code>" switch,
specifying an installation subdirectory in <code>/home/user/myxml</code>,
e.g.
<p><code>./configure --prefix /home/user/myxml/xmlinst</code> {other
configuration options}</p></li>
<li>now run <code>make</code> followed by <code>make install</code></li>
<li>At this point, the installation subdirectory contains the complete
"private" include files, library files and binary program files (e.g.
xmllint), located in
<p> <code>/home/user/myxml/xmlinst/lib, /home/user/myxml/xmlinst/include
</code> and <code> /home/user/myxml/xmlinst/bin</code></p>
respectively.</li>
<li>In order to use this "private" library, you should first add it
to the beginning of your default PATH (so that your own private
program files such as xmllint will be used instead of the normal
system ones). To do this, the Bash command would be
<p><code>export PATH=/home/user/myxml/xmlinst/bin:$PATH</code></p></li>
<li>Now suppose you have a program <code>test1.c</code> that you would
like to compile with your "private" library. Simply compile it
using the command <p><code>gcc -o test1 test1.c `xml2-config --cflags
--libs`</code></p> Note that, because your PATH has been set with <code>
/home/user/myxml/xmlinst/bin</code> at the beginning, the
xml2-config program which you just installed will be used instead of
the system default one, and this will <em>automatically</em> get the
correct libraries linked with your program.</li></ul>
</li>
<li><em>xmlDocDump() generates output on one line.</em>
<p>Libxml2 will not <strong>invent</strong> spaces in the content of a
document since <strong>all spaces in the content of a document are
significant</strong>. If you build a tree from the API and want
indentation:</p>
<ol>
<li>the correct way is to generate those yourself too.</li>
<li>the dangerous way is to ask libxml2 to add those blanks to your
content <strong>modifying the content of your document in the
process</strong>. The result may not be what you expect. There is
<strong>NO</strong> way to guarantee that such a modification won't
affect other parts of the content of your document. See <a
href="http://xmlsoft.org/html/libxml-parser.html#xmlKeepBlanksDefault">xmlKeepBlanksDefault
()</a> and <a
href="http://xmlsoft.org/html/libxml-tree.html#xmlSaveFormatFile">xmlSaveFormatFile
()</a></li>
</ol>
</li>
<li>Extra nodes in the document:
<p><em>For an XML file as below:</em></p>
<pre>&lt;?xml version="1.0"?&gt;
&lt;PLAN xmlns="http://www.argus.ca/autotest/1.0/"&gt;
&lt;NODE CommFlag="0"/&gt;
&lt;NODE CommFlag="1"/&gt;
&lt;/PLAN&gt;</pre>
<p><em>after parsing it with the function
pxmlDoc=xmlParseFile(...);</em></p>
<p><em>I want to get the content of the first node (node with the
CommFlag="0")</em></p>
<p><em>so I did the following:</em></p>
<pre>xmlNodePtr pnode;
pnode=pxmlDoc->children->children;</pre>
<p><em>but it does not work. If I change it to</em></p>
<pre>pnode=pxmlDoc->children->children->next;</pre>
<p><em>then it works. Can someone explain it to me?</em></p>
<p></p>
<p>In XML all characters in the content of the document are significant
<strong>including blanks and formatting line breaks</strong>.</p>
<p>The extra nodes you are wondering about are just that: text nodes holding
the formatting spaces, which are part of the document but which people tend
to forget. There is a function <a
href="http://xmlsoft.org/html/libxml-parser.html">xmlKeepBlanksDefault
()</a> to remove those at parse time, but that's a heuristic, and its
use should be limited to cases where you are certain there is no
mixed-content in the document.</p>
</li>
<li><em>I get compilation errors of existing code like when accessing
<strong>root</strong> or <strong>child fields</strong> of nodes.</em>
<p>You are compiling code developed for libxml version 1 and using a
libxml2 development environment. Either switch back to libxml v1 devel or
even better fix the code to compile with libxml2 (or both) by <a
href="upgrade.html">following the instructions</a>.</p>
</li>
<li><em>I get compilation errors about non existing
<strong>xmlRootNode</strong> or <strong>xmlChildrenNode</strong>
fields.</em>
<p>The source code you are using has been <a
href="upgrade.html">upgraded</a> to be able to compile with both libxml
and libxml2, but you need to install a more recent version:
libxml(-devel) >= 1.8.8 or libxml2(-devel) >= 2.1.0</p>
</li>
<li><em>XPath implementation looks seriously broken</em>
<p>The XPath implementation prior to 2.3.0 was really incomplete. Upgrade to
a recent version; there are no known bugs in the current version.</p>
</li>
<li><em>The example provided in the web page does not compile.</em>
<p>It's hard to keep the documentation in sync with the code
&lt;grin/&gt; ...</p>
<p>Check points 1/ and 2/ raised above, and please send
patches.</p>
</li>
<li><em>Where can I get more examples and information than provided on the
web page?</em>
<p>Ideally a libxml2 book would be nice. I have no such plan ... But you
can:</p>
<ul>
<li>check more deeply the <a href="html/libxml-lib.html">existing
generated doc</a></li>
<li>have a look at <a href="examples/index.html">the set of
examples</a>.</li>
<li>look for examples of libxml2 function usage in the Gnome code.
For example the following will query the full Gnome CVS base for the
use of the <strong>xmlAddChild()</strong> function:
<p><a
href="http://cvs.gnome.org/lxr/search?string=xmlAddChild">http://cvs.gnome.org/lxr/search?string=xmlAddChild</a></p>
<p>This may be slow, a large hardware donation to the gnome project
could cure this :-)</p>
</li>
<li><a
href="http://cvs.gnome.org/bonsai/rview.cgi?cvsroot=/cvs/gnome&amp;dir=gnome-xml">Browse
the libxml2 source</a>. I try to write code as clean and documented
as possible, so looking at it may be helpful. In particular the code
of xmllint.c and of the various testXXX.c test programs should
provide good examples of how to do things with the library.</li>
</ul>
</li>
<li>What about C++ ?
<p>libxml2 is written in pure C in order to allow easy reuse on a number
of platforms, including embedded systems. I don't intend to convert to
C++.</p>
<p>There is however a C++ wrapper which may fulfill your needs:</p>
<ul>
<li>by Ari Johnson &lt;ari@btigate.com&gt;:
<p>Website: <a
href="http://libxmlplusplus.sourceforge.net/">http://libxmlplusplus.sourceforge.net/</a></p>
<p>Download: <a
href="http://sourceforge.net/project/showfiles.php?group_id=12999">http://sourceforge.net/project/showfiles.php?group_id=12999</a></p>
</li>
<!-- Website is currently unavailable as of 2003-08-02
<li>by Peter Jones <pjones@pmade.org>
<p>Website: <a
href="http://pmade.org/pjones/software/xmlwrapp/">http://pmade.org/pjones/software/xmlwrapp/</a></p>
</li>
-->
</ul>
</li>
<li>How to validate a document a posteriori ?
<p>It is possible to validate documents which had not been validated at
initial parsing time or documents which have been built from scratch
using the API. Use the <a
href="http://xmlsoft.org/html/libxml-valid.html#xmlValidateDtd">xmlValidateDtd()</a>
function. It is also possible to simply add a DTD to an existing
document:</p>
<pre>xmlDocPtr doc; /* your existing document */
xmlDtdPtr dtd = xmlParseDTD(NULL, filename_of_dtd); /* parse the DTD */

/* the DTD name has to match the name of the document's root element */
dtd->name = xmlStrdup((xmlChar*)"root_name");
doc->intSubset = dtd; /* register the DTD as the internal subset */
/* link the DTD node into the tree, ahead of any existing children */
if (doc->children == NULL) xmlAddChild((xmlNodePtr)doc, (xmlNodePtr)dtd);
else xmlAddPrevSibling(doc->children, (xmlNodePtr)dtd);
</pre>
</li>
<li>So what is this funky "xmlChar" used all the time?
<p>It is a null-terminated sequence of UTF-8 characters. And only UTF-8!
You need to convert strings encoded in different ways to UTF-8 before
passing them to the API. This can be accomplished with the iconv library
for instance.</p>
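<p>As an illustration, here is a minimal C sketch of such a conversion with
iconv. It assumes the input is ISO-8859-1 and that the encoding names
"ISO-8859-1" and "UTF-8" are known to the local iconv implementation; the
helper name <code>latin1_to_utf8</code> is made up for the example.</p>

```c
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert a NUL-terminated ISO-8859-1 string into a
 * newly allocated NUL-terminated UTF-8 string.
 * Returns NULL on failure; the caller releases the result with free(). */
static char *latin1_to_utf8(const char *in)
{
    iconv_t cd;
    size_t in_left = strlen(in);
    /* An ISO-8859-1 character needs at most 2 bytes in UTF-8. */
    size_t out_size = in_left * 2 + 1;
    size_t out_left = out_size - 1;
    char *out = malloc(out_size);
    char *in_ptr = (char *) in;   /* iconv() does not modify the input */
    char *out_ptr = out;

    if (out == NULL)
        return NULL;

    cd = iconv_open("UTF-8", "ISO-8859-1");   /* to, from */
    if (cd == (iconv_t) -1) {
        free(out);
        return NULL;
    }
    if (iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left) == (size_t) -1) {
        iconv_close(cd);
        free(out);
        return NULL;
    }
    iconv_close(cd);
    *out_ptr = '\0';
    return out;
}
```

<p>The result can then be handed to the library by casting it, for example
<code>xmlNewText((const xmlChar *) utf8)</code>.</p>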
</li>
<li>etc ...</li>
</ol>
<p></p>
<h2><a name="Documentat">Developer Menu</a></h2>
<p>There are several on-line resources related to using libxml:</p>
<ol>
<li>Use the <a href="search.php">search engine</a> to look up
information.</li>
<li>Check the <a href="FAQ.html">FAQ.</a></li>
<li>Check the <a href="http://xmlsoft.org/html/libxml-lib.html">extensive
documentation</a> automatically extracted from code comments.</li>
<li>Look at the documentation about <a href="encoding.html">libxml
internationalization support</a>.</li>
<li>This page provides a global overview and <a href="example.html">some
examples</a> on how to use libxml.</li>
<li><a href="examples/index.html">Code examples</a></li>
<li>John Fleck's libxml2 tutorial: <a href="tutorial/index.html">html</a>
or <a href="tutorial/xmltutorial.pdf">pdf</a>.</li>
<li>If you need to parse large files, check the <a
href="xmlreader.html">xmlReader</a> API tutorial.</li>
<li><a href="mailto:james@daa.com.au">James Henstridge</a> wrote <a
href="http://www.daa.com.au/~james/gnome/xml-sax/xml-sax.html">some nice
documentation</a> explaining how to use the libxml SAX interface.</li>
<li>George Lebl wrote <a
href="http://www-106.ibm.com/developerworks/library/l-gnome3/">an article
for IBM developerWorks</a> about using libxml.</li>
<li>Check <a href="http://cvs.gnome.org/lxr/source/gnome-xml/TODO">the TODO
file</a>.</li>
<li>Read the <a href="upgrade.html">1.x to 2.x upgrade path</a>
description. If you are starting a new project using libxml you should
really use the 2.x version.</li>
<li>And don't forget to look at the <a
href="http://mail.gnome.org/archives/xml/">mailing-list archive</a>.</li>
</ol>
<h2><a name="Reporting">Reporting bugs and getting help</a></h2>
<p>Well, bugs or missing features are always possible, and I will make a
point of fixing them in a timely fashion. The best way to report a bug is to
use the <a href="http://bugzilla.gnome.org/buglist.cgi?product=libxml2">Gnome
bug tracking database</a> (make sure to use the "libxml2" module name). I
look at reports there regularly and it's good to have a reminder when a bug
is still open. Be sure to specify that the bug is for the package libxml2.</p>
<p>For small problems you can try to get help on IRC: the #xml channel on
irc.gnome.org (port 6667) usually has a few people subscribed who may help
(but there is no guarantee, and if a real issue is raised it should go on the
mailing-list for archival).</p>
<p>There is also a mailing-list <a
href="mailto:xml@gnome.org">xml@gnome.org</a> for libxml, with an <a
href="http://mail.gnome.org/archives/xml/">on-line archive</a> (<a
href="http://xmlsoft.org/messages">old</a>). To subscribe to this list,
please visit the <a
href="http://mail.gnome.org/mailman/listinfo/xml">associated Web</a> page and
follow the instructions. <strong>Do not send code, I won't debug it</strong>
(but patches are really appreciated!).</p>
<p>Please note that with the current amount of viruses and spam, sending mail
to the list without being subscribed won't work. There are *far too many
bounces* (on the order of a thousand a day!) and I cannot approve them manually
anymore. If your mail to the list bounced waiting for administrator approval,
it is LOST! Repost it and fix the problem triggering the error.</p>
<p>Check the following <strong><span style="color: #FF0000">before
posting</span></strong>:</p>
<ul>
<li>Read the <a href="FAQ.html">FAQ</a> and <a href="search.php">use the
search engine</a> to get information related to your problem.</li>
<li>Make sure you are <a href="ftp://xmlsoft.org/">using a recent
version</a>, and that the problem still shows up in a recent version.</li>
<li>Check the <a href="http://mail.gnome.org/archives/xml/">list
archives</a> to see if the problem was reported already; in that case
there is probably a fix available. Similarly, check the <a
href="http://bugzilla.gnome.org/buglist.cgi?product=libxml2">registered
open bugs</a>.</li>
<li>Make sure you can reproduce the bug with xmllint or one of the test
programs found in source in the distribution.</li>
<li>Please send the command showing the error as well as the input (as an
attachment)</li>
</ul>
<p>Then send the bug with associated information to reproduce it to the <a
href="mailto:xml@gnome.org">xml@gnome.org</a> list; if it's really libxml
related I will approve it. Please do not send mail to me directly: it makes
things really hard to track, and in some cases I am not the best person to
answer a given question. Ask on the list.</p>
<p>To <span style="color: #E50000">be really clear about support</span>:</p>
<ul>
<li>Support or help <span style="color: #E50000">requests MUST be sent to
the list or on bugzilla</span> in case of problems, so that the Question
and Answers can be shared publicly. Failing to do so carries the implicit
message "I want free support but I don't want to share the benefits with
others" and is not welcome. I will automatically Carbon-Copy the
xml@gnome.org mailing list for any technical reply made about libxml2 or
libxslt.</li>
<li>There is <span style="color: #E50000">no guarantee of support</span>. If
your question remains unanswered after a week, repost it, making sure you
gave all the detail needed and the information requested.</li>
<li>Failing to provide information as requested, or to double-check first
for prior feedback, also carries the implicit message "the time of the
library maintainers is less valuable than my time" and might not be
welcome.</li>
</ul>
<p>Of course, bugs reported with a suggested patch for fixing them will
probably be processed faster than those without.</p>
<p>If you're looking for help, a quick look at <a
href="http://mail.gnome.org/archives/xml/">the list archive</a> may actually
provide the answer. I usually send source samples when answering libxml2
usage questions. The <a
href="http://xmlsoft.org/html/book1.html">auto-generated documentation</a> is
not as polished as I would like (I need to learn more about DocBook), but
it's a good starting point.</p>
<h2><a name="help">How to help</a></h2>
<p>You can help the project in various ways. The best thing to do first is to
subscribe to the mailing-list as explained before, and check the <a
href="http://mail.gnome.org/archives/xml/">archives</a> and the <a
href="http://bugzilla.gnome.org/buglist.cgi?product=libxml2">Gnome bug
database</a>:</p>
<ol>
<li>Provide patches when you find problems.</li>
<li>Provide the diffs when you port libxml2 to a new platform. They may not
be integrated in all cases but help pinpoint portability problems.</li>
<li>Provide documentation fixes (either as patches to the code comments or
as HTML diffs).</li>
<li>Provide new documentation pieces (translations, examples, etc
...).</li>
<li>Check the TODO file and try to close one of the items.</li>
<li>Take one of the points raised in the archive or the bug database and
provide a fix. <a href="mailto:daniel@veillard.com">Get in touch with
me</a> beforehand to avoid synchronization problems and to check that the
suggested fix will fit in nicely :-)</li>
</ol>
<h2><a name="Downloads">Downloads</a></h2>
<p>The latest versions of libxml2 can be found on the <a
href="ftp://xmlsoft.org/">xmlsoft.org</a> server (<a
href="http://xmlsoft.org/sources/">HTTP</a>, <a
href="ftp://xmlsoft.org/">FTP</a> and rsync are available). There are also
mirrors (<a href="ftp://ftp.planetmirror.com/pub/xmlsoft/">Australia</a> (<a
href="http://xmlsoft.planetmirror.com/">Web</a>), <a
href="ftp://fr.rpmfind.net/pub/libxml/">France</a>), and the sources are on the <a
href="ftp://ftp.gnome.org/pub/GNOME/MIRRORS.html">Gnome FTP server</a> as a <a
href="ftp://ftp.gnome.org/pub/GNOME/sources/libxml2/2.6/">source archive</a>.
Antonin Sprinzl also provides <a href="ftp://gd.tuwien.ac.at/pub/libxml/">a
mirror in Austria</a>. (NOTE that you need both the <a
href="http://rpmfind.net/linux/RPM/libxml2.html">libxml(2)</a> and <a
href="http://rpmfind.net/linux/RPM/libxml2-devel.html">libxml(2)-devel</a>
packages installed to compile applications using libxml.)</p>
<p>You can find all the history of libxml(2) and libxslt releases in the <a
href="http://xmlsoft.org/sources/old/">old</a> directory. The precompiled
Windows binaries made by Igor Zlatkovic are available in the <a
href="http://xmlsoft.org/sources/win32/">win32</a> directory.</p>
<p>Binary ports:</p>
<ul>
<li>Red Hat RPMs for i386 are available directly on <a
href="ftp://xmlsoft.org/">xmlsoft.org</a>, the source RPM will compile on
any architecture supported by Red Hat.</li>
<li><a href="mailto:igor@zlatkovic.com">Igor Zlatkovic</a> is now the
maintainer of the Windows port, <a
href="http://www.zlatkovic.com/projects/libxml/index.html">he provides
binaries</a>.</li>
<li>Blastwave provides
<a href="http://www.blastwave.org/packages.php/libxml2">Solaris binaries</a>.</li>
<li><a href="mailto:Steve.Ball@explain.com.au">Steve Ball</a> provides <a
href="http://www.explain.com.au/oss/libxml2xslt.html">Mac OS X
binaries</a>.</li>
<li>The HP-UX porting center provides <a
href="http://hpux.connect.org.uk/hppd/hpux/Gnome/">HP-UX binaries</a></li>
</ul>
<p>If you know other supported binary ports, please <a
href="http://veillard.com/">contact me</a>.</p>
<p><a name="Snapshot">Snapshot:</a></p>
<ul>
<li>Code from the W3C cvs base libxml2 module, updated hourly <a
href="ftp://xmlsoft.org/libxml2-cvs-snapshot.tar.gz">libxml2-cvs-snapshot.tar.gz</a>.</li>
<li>Docs, content of the web site, the list archive included <a
href="ftp://xmlsoft.org/libxml-docs.tar.gz">libxml-docs.tar.gz</a>.</li>
</ul>
<p><a name="Contribs">Contributions:</a></p>
<p>I do accept external contributions, especially if compiling on another
platform; get in touch with the list to upload the package. Wrappers for
various languages have been provided and can be found in the <a
href="python.html">bindings section</a>.</p>
<p>Libxml2 is also available from CVS:</p>
<ul>
<li><p>The <a href="http://cvs.gnome.org/viewcvs/libxml2/">Gnome CVS
base</a>. Check the <a
href="http://developer.gnome.org/tools/cvs.html">Gnome CVS Tools</a>
page; the CVS module is <b>libxml2</b>.</p>
</li>
<li>The <strong>libxslt</strong> module is also present there.</li>
</ul>
<h2><a name="News">Releases</a></h2>
<p>Items not finished and still being worked on; get in touch with the list
if you want to help with those:</p>
<ul>
<li>More testing on RelaxNG</li>
<li>Finishing up <a href="http://www.w3.org/TR/xmlschema-1/">XML
Schemas</a></li>
</ul>
<p>The <a href="ChangeLog.html">change log</a> describes the recent commits
to the <a href="http://cvs.gnome.org/viewcvs/libxml2/">CVS</a> code base.</p>
<p>Here is the list of public releases:</p>
<h3>2.6.20: Jul 10 2005</h3>
<ul>
<li> build fixes: Windows build (Rob Richards), Mingw compilation (Igor
Zlatkovic), Windows Makefile (Igor), gcc warnings (Kasimier and
andriy@google.com), use gcc weak references to pthread to avoid the
pthread dependency on Linux, compilation problem (Steve Nairn),
compiling of subset (Morten Welinder), IPv6/ss_family compilation
(William Brack), compilation when disabling parts of the library,
standalone test distribution.
</li>
<li> bug fixes: bug in lang(), memory cleanup on errors (William Brack),
HTTP query strings (Aron Stansvik), memory leak in DTD (William),
integer overflow in XPath (William), nanoftp buffer size, pattern
"." path fixup (Kasimier), leak in tree reported by Malcolm Rowe,
replaceNode patch (Brent Hendricks), CDATA with NULL content
(Mark Vakoc), xml:base fixup on XInclude (William), pattern
fixes (William), attribute bug in exclusive c14n (Aleksey Sanin),
xml:space and xml:lang with SAX2 (Rob Richards), namespace
trouble in complex parsing (Malcolm Rowe), XSD type QNames fixes
(Kasimier), XPath streaming fixups (William), RelaxNG bug (Rob Richards),
Schemas for Schemas fixes (Kasimier), removal of ID (Rob Richards),
a small RelaxNG leak, HTML parsing in push mode bug (James Bursa),
failure to detect UTF-8 parsing bugs in CDATA sections, areBlanks()
heuristic failure, duplicate attributes in DTD bug (William).
</li>
<li> improvements: a lot of work on Schemas by Kasimier Buchcik both on
conformance and streaming, Schemas validation messages (Kasimier
Buchcik, Matthew Burgess), namespace removal at the python level
(Brent Hendricks), Update to new Schemas regression tests from
W3C/Nist (Kasimier), xmlSchemaValidateFile() (Kasimier), implementation
of xmlTextReaderReadInnerXml and xmlTextReaderReadOuterXml (James Wert),
standalone test framework and programs, new DOM import APIs
xmlDOMWrapReconcileNamespaces() xmlDOMWrapAdoptNode() and
xmlDOMWrapRemoveNode(), extension of xmllint capabilities for
SAX and Schemas regression tests, xmlStopParser() available in
pull mode too, enhancement to xmllint --shell namespaces support,
Windows port of the standalone testing tools (Kasimier and William),
xmlSchemaValidateStream() xmlSchemaSAXPlug() and xmlSchemaSAXUnplug()
SAX Schemas APIs, Schemas xmlReader support.
</li>
</ul>
<h3>2.6.19: Apr 02 2005</h3>
<ul>
<li> build fixes: drop .la from RPMs, --with-minimum build fix (William
Brack), use XML_SOCKLEN_T instead of SOCKLEN_T because it breaks with
AIX 5.3 compiler, fixed elfgcchack.h generation and PLT reduction
code on Linux/ELF/gcc4</li>
<li> bug fixes: schemas type decimal fixups (William Brack), xmllint return
code (Gerry Murphy), small schemas fixes (Matthew Burgess and
GUY Fabrice), workaround "DAV:" namespace brokenness in c14n (Aleksey
Sanin), segfault in Schemas (Kasimier Buchcik), Schemas attribute
validation (Kasimier), Prop related functions and xmlNewNodeEatName
(Rob Richards), HTML serialization of name attribute on a elements,
Python error handlers leaks and improvement (Brent Hendricks),
uninitialized variable in encoding code, Relax-NG validation bug,
potential crash if ignorableWhitespace is NULL, xmlSAXParseDoc and
xmlParseDoc signatures, switched back to assuming UTF-8 in case
no encoding is given at serialization time</li>
<li> improvements: a lot of work on Schemas by Kasimier Buchcik on facets
checking and also mixed handling.</li>
</ul>
<h3>2.6.18: Mar 13 2005</h3>
<ul>
<li> build fixes: warnings (Peter Breitenlohner), testapi.c generation,
Bakefile support (Francesco Montorsi), Windows compilation (Joel Reed),
some gcc4 fixes, HP-UX portability fixes (Rick Jones).</li>
<li> bug fixes: xmlSchemaElementDump namespace (Kasimier Buchcik), push and
xmlreader stopping on non-fatal errors, thread support for dictionaries
reference counting (Gary Coady), internal subset and push problem,
URL saved in xmlCopyDoc, various schemas bug fixes (Kasimier), Python
paths fixup (Stephane Bidoul), xmlGetNodePath and namespaces,
xmlSetNsProp fix (Mike Hommey), warning should not count as error
(William Brack), xmlCreatePushParser empty chunk, XInclude parser
flags (William), cleanup FTP and HTTP code to reuse the uri parsing
and IPv6 (William), xmlTextWriterStartAttributeNS fix (Rob Richards),
XMLLINT_INDENT being empty (William), xmlWriter bugs (Rob Richards),
multithreading on Windows (Rich Salz), xmlSearchNsByHref fix (Kasimier),
Python binding leak (Brent Hendricks), aliasing bug exposed by gcc4
on s390, xmlTextReaderNext bug (Rob Richards), Schemas decimal type
fixes (William Brack), xmlByteConsumed static buffer (Ben Maurer).</li>
<li> improvements: speedup parsing comments and DTDs, dictionary support for
hash tables, Schemas Identity constraints (Kasimier), streaming XPath
subset, xmlTextReaderReadString added (Bjorn Reese), Schemas canonical
values handling (Kasimier), add xmlTextReaderByteConsumed (Aron
Stansvik).</li>
<li> Documentation: Wiki support (Joel Reed)</li>
</ul>
<h3>2.6.17: Jan 16 2005</h3>
<ul>
<li>build fixes: Windows, warnings removal (William Brack),
maintainer-clean dependency (William), build in a different directory
(William), fixing --with-minimum configure build (William), BeOS
build (Marcin Konicki), Python-2.4 detection (William), compilation
on AIX (Dan McNichol)</li>
<li>bug fixes: xmlTextReaderHasAttributes (Rob Richards), xmlCtxtReadFile()
to use the catalog(s), loop on output (William Brack), XPath memory leak,
ID deallocation problem (Steve Shepard), debugDumpNode crash (William),
warning not using error callback (William), xmlStopParser bug (William),
UTF-16 with BOM on DTDs (William), namespace bug on empty elements
in push mode (Rob Richards), line and col computations fixups (Aleksey
Sanin), xmlURIEscape fix (William), xmlXPathErr on bad range (William),
patterns with too many steps, bug in RNG choice optimization, line
number sometimes missing.
</li>
<li>improvements: XSD Schemas (Kasimier Buchcik), python generator (William),
xmlUTF8Strpos speedup (William), unicode Python strings (William),
XSD error reports (Kasimier Buchcik), Python __str__ calls serialize().
</li>
<li>new APIs: added xmlDictExists(), GetLineNumber and GetColumnNumber
for the xmlReader (Aleksey Sanin), Dynamic Shared Libraries APIs
(mostly Joel Reed), error extraction API from regexps, new XMLSave
option for format (Phil Shafer)</li>
<li>documentation: site improvement (John Fleck), FAQ entries (William).</li>
</ul>
<h3>2.6.16: Nov 10 2004</h3>
<ul>
<li>general hardening and bug fixing crossing all the API based on new
automated regression testing</li>
<li>build fix: IPv6 build and test on AIX (Dodji Seketeli)</li>
<li>bug fixes: problem with XML::LibXML reported by Petr Pajas, encoding
conversion functions return values, UTF-8 bug affecting XPath reported by
Markus Bertheau, catalog problem with NULL entries (William Brack)</li>
<li>documentation: fix to xmllint man page, some API function descriptions
were updated.</li>
<li>improvements: DTD validation APIs provided at the Python level (Brent
Hendricks) </li>
</ul>
<h3>2.6.15: Oct 27 2004</h3>
<ul>
<li>security fixes on the nanoftp and nanohttp modules</li>
<li>build fixes: xmllint detection bug in configure, building outside the
source tree (Thomas Fitzsimmons)</li>
<li>bug fixes: HTML parser on broken ASCII chars in names (William), Python
paths (Malcolm Tredinnick), xmlHasNsProp and default namespace (William),
saving to python file objects (Malcolm Tredinnick), DTD lookup fix
(Malcolm), save back &lt;group&gt; in catalogs (William), tree build
fixes (DV and Rob Richards), Schemas memory bug, structured error handler
on Python 64bits, thread local memory deallocation, memory leak reported
by Volker Roth, xmlValidateDtd in the presence of an internal subset,
entities and _private problem (William), xmlBuildRelativeURI error
(William).</li>
<li>improvements: better XInclude error reports (William), tree debugging
module and tests, convenience functions at the Reader API (Graham
Bennett), add support for PI in the HTML parser.</li>
</ul>
<h3>2.6.14: Sep 29 2004</h3>
<ul>
<li>build fixes: configure paths for xmllint and xsltproc, compilation
without HTML parser, compilation warning cleanups (William Brack &amp;
Malcolm Tredinnick), VMS makefile update (Craig Berry).</li>
<li>bug fixes: xmlGetUTF8Char (William Brack), QName properties (Kasimier
Buchcik), XInclude testing, Notation serialization, UTF8ToISO8859x
transcoding (Mark Itzcovitz), lots of XML Schemas cleanup and fixes
(Kasimier), ChangeLog cleanup (Stepan Kasal), memory fixes (Mark Vakoc),
handling of failed realloc(), out of bound array addressing in Schemas
date handling, Python space/tabs cleanups (Malcolm Tredinnick), NMTOKENS
E20 validation fix (Malcolm).</li>
<li>improvements: added W3C XML Schemas testsuite (Kasimier Buchcik), add
xmlSchemaValidateOneElement (Kasimier), Python exception hierarchy
(Malcolm Tredinnick), Python libxml2 driver improvement (Malcolm
Tredinnick), Schemas support for xsi:schemaLocation,
xsi:noNamespaceSchemaLocation, xsi:type (Kasimier Buchcik)</li>
</ul>
<h3>2.6.13: Aug 31 2004</h3>
<ul>
<li>build fixes: Windows and zlib (Igor Zlatkovic), -O flag with gcc,
Solaris compiler warning, fixing RPM BuildRequires.</li>
<li>fixes: DTD loading on Windows (Igor), Schemas error reports APIs
(Kasimier Buchcik), Schemas validation crash, xmlCheckUTF8 (William Brack
and Julius Mittenzwei), Schemas facet check (Kasimier), default namespace
problem (William), Schemas hexbinary empty values, encoding error could
generate a serialization loop.</li>
<li>Improvements: Schemas validity improvements (Kasimier), added --path
and --load-trace options to xmllint</li>
<li>documentation: tutorial update (John Fleck)</li>
</ul>
<h3>2.6.12: Aug 22 2004</h3>
<ul>
<li>build fixes: fix --with-minimum, elfgcchack.h fixes (Peter
Breitenlohner), perl path lookup (William), diff on Solaris (Albert
Chin), some 64bits cleanups.</li>
<li>Python: avoid a warning with 2.3 (William Brack), tab and space mixes
(William), wrapper generator fixes (William), Cygwin support (Gerrit P.
Haase), node wrapper fix (Marc-Antoine Parent), XML Schemas support
(Torkel Lyng)</li>
<li>Schemas: a lot of bug fixes and improvements from Kasimier Buchcik</li>
<li>fixes: RVT fixes (William), XPath context resets bug (William), memory
debug (Steve Hay), catalog white space handling (Peter Breitenlohner),
xmlReader state after attribute reading (William), structured error
handler (William), XInclude generated xml:base fixup (William), Windows
memory reallocation problem (Steve Hay), Out of Memory conditions
handling (William and Olivier Andrieu), htmlNewDoc() charset bug,
htmlReadMemory init (William), a posteriori validation DTD base
(William), notations serialization missing, xmlGetNodePath (Dodji),
xmlCheckUTF8 (Diego Tartara), missing line numbers on entity
(William)</li>
<li>improvements: DocBook catalog build script (William), xmlcatalog tool
(Albert Chin), xmllint --c14n option, no_proxy environment (Mike Hommey),
xmlParseInNodeContext() addition, extend xmllint --shell, allow XInclude
to not generate start/end nodes, extend xmllint --version to include CVS
tag (William)</li>
<li>documentation: web pages fixes, validity API docs fixes (William),
schemas API fix (Eric Haszlakiewicz), xmllint man page (John Fleck)</li>
</ul>
<h3>2.6.11: July 5 2004</h3>
<ul>
<li>Schemas: a lot of changes and improvements by Kasimier Buchcik for
attributes, namespaces and simple types.</li>
<li>build fixes: --with-minimum (William Brack), some gcc cleanup
(William), --with-thread-alloc (William)</li>
<li>portability: Windows binary package change (Igor Zlatkovic), Catalog
path on Windows</li>
<li>documentation: update to the tutorial (John Fleck), xmllint return code
(John Fleck), man pages (Ville Skytta).</li>
<li>bug fixes: C14N bug serializing namespaces (Aleksey Sanin), testSAX
properly initialize the library (William), empty node set in XPath
(William), xmlSchemas errors (William), invalid charref problem pointed
by Morus Walter, XInclude xml:base generation (William), Relax-NG bug
with div processing (William), XPointer and xml:base problem (William),
Reader and entities, xmllint return code for schemas (William), reader
streaming problem (Steve Ball), DTD serialization problem (William),
libxml.m4 fixes (Mike Hommey), do not provide destructors as methods on
Python classes, xmlReader buffer bug, Python bindings memory interfaces
improvement (with St