kc3-lang/libxml2/doc/example.html

    Commit

  • Author : Daniel Veillard
    Date : 2001-10-30 12:51:17
    Hash : 52dcab39
    Message : preparing 2.4.7 switched to the latest xmllint manual page from John * configure.in: preparing 2.4.7 * Makefile.am doc/Makefile.am: switched to the latest xmllint manual page from John * doc/*: updated the doc and rebuilt the generated pages Daniel

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
    <html>
    <head>
    <meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type">
    <style type="text/css"><!--
    TD {font-size: 10pt; font-family: Verdana,Arial,Helvetica}
    BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; margin-left: 0pt; margin-right: 0pt}
    H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
    H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
    H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
    A:link, A:visited, A:active { text-decoration: underline }
    --></style>
    <title>A real example</title>
    </head>
    <body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000">
    <table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr>
    <td width="180">
    <a href="http://www.gnome.org/"><img src="smallfootonly.gif" alt="Gnome Logo"></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo"></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo"></a>
    </td>
    <td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center">
    <h1>The XML C library for Gnome</h1>
    <h2>A real example</h2>
    </td></tr></table></td></tr></table></td>
    </tr></table>
    <table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr>
    <td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td>
    <table width="100%" border="0" cellspacing="1" cellpadding="3">
    <tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
    <tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
    <li><a href="index.html">Home</a></li>
    <li><a href="intro.html">Introduction</a></li>
    <li><a href="FAQ.html">FAQ</a></li>
    <li><a href="docs.html">Documentation</a></li>
    <li><a href="bugs.html">Reporting bugs and getting help</a></li>
    <li><a href="help.html">How to help</a></li>
    <li><a href="downloads.html">Downloads</a></li>
    <li><a href="news.html">News</a></li>
    <li><a href="XML.html">XML</a></li>
    <li><a href="XSLT.html">XSLT</a></li>
    <li><a href="architecture.html">libxml architecture</a></li>
    <li><a href="tree.html">The tree output</a></li>
    <li><a href="interface.html">The SAX interface</a></li>
    <li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
    <li><a href="xmlmem.html">Memory Management</a></li>
    <li><a href="encoding.html">Encodings support</a></li>
    <li><a href="xmlio.html">I/O Interfaces</a></li>
    <li><a href="catalog.html">Catalog support</a></li>
    <li><a href="library.html">The parser interfaces</a></li>
    <li><a href="entities.html">Entities or no entities</a></li>
    <li><a href="namespaces.html">Namespaces</a></li>
    <li><a href="upgrade.html">Upgrading 1.x code</a></li>
    <li><a href="threads.html">Thread safety</a></li>
    <li><a href="DOM.html">DOM Principles</a></li>
    <li><a href="example.html">A real example</a></li>
    <li><a href="contribs.html">Contributions</a></li>
    <li>
    <a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
    </li>
    </ul></td></tr>
    </table>
    <table width="100%" border="0" cellspacing="1" cellpadding="3">
    <tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr>
    <tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
    <li><a href="http://mail.gnome.org/archives/xml/">Mail archive</a></li>
    <li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li>
    <li><a href="http://www.cs.unibo.it/~casarini/gdome2/">DOM gdome2</a></li>
    <li><a href="ftp://xmlsoft.org/">FTP</a></li>
    <li><a href="http://www.fh-frankfurt.de/~igor/projects/libxml/">Windows binaries</a></li>
    <li><a href="http://pages.eidosnet.co.uk/~garypen/libxml/">Solaris binaries</a></li>
    <li><a href="http://bugzilla.gnome.org/buglist.cgi?product=libxml">Bug Tracker</a></li>
    </ul></td></tr>
    </table>
    </td></tr></table></td>
    <td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd">
    <p>Here is a real-size example, where the actual content of the application
    data is not kept in the DOM tree but in internal structures. It is based on
    a proposal to keep a database of jobs related to Gnome, with an XML-based
    storage structure. Here is an <a href="gjobs.xml">XML encoded jobs
    base</a>:</p>
    <pre>&lt;?xml version=&quot;1.0&quot;?&gt;
    &lt;gjob:Helping xmlns:gjob=&quot;http://www.gnome.org/some-location&quot;&gt;
      &lt;gjob:Jobs&gt;
    
        &lt;gjob:Job&gt;
          &lt;gjob:Project ID=&quot;3&quot;/&gt;
          &lt;gjob:Application&gt;GBackup&lt;/gjob:Application&gt;
          &lt;gjob:Category&gt;Development&lt;/gjob:Category&gt;
    
          &lt;gjob:Update&gt;
            &lt;gjob:Status&gt;Open&lt;/gjob:Status&gt;
            &lt;gjob:Modified&gt;Mon, 07 Jun 1999 20:27:45 -0400 MET DST&lt;/gjob:Modified&gt;
            &lt;gjob:Salary&gt;USD 0.00&lt;/gjob:Salary&gt;
          &lt;/gjob:Update&gt;
    
          &lt;gjob:Developers&gt;
            &lt;gjob:Developer&gt;
            &lt;/gjob:Developer&gt;
          &lt;/gjob:Developers&gt;
    
          &lt;gjob:Contact&gt;
            &lt;gjob:Person&gt;Nathan Clemons&lt;/gjob:Person&gt;
            &lt;gjob:Email&gt;nathan@windsofstorm.net&lt;/gjob:Email&gt;
            &lt;gjob:Company&gt;
            &lt;/gjob:Company&gt;
            &lt;gjob:Organisation&gt;
            &lt;/gjob:Organisation&gt;
            &lt;gjob:Webpage&gt;
            &lt;/gjob:Webpage&gt;
            &lt;gjob:Snailmail&gt;
            &lt;/gjob:Snailmail&gt;
            &lt;gjob:Phone&gt;
            &lt;/gjob:Phone&gt;
          &lt;/gjob:Contact&gt;
    
          &lt;gjob:Requirements&gt;
          The program should be released as free software, under the GPL.
          &lt;/gjob:Requirements&gt;
    
          &lt;gjob:Skills&gt;
          &lt;/gjob:Skills&gt;
    
          &lt;gjob:Details&gt;
          A GNOME based system that will allow a superuser to configure 
          compressed and uncompressed files and/or file systems to be backed 
          up with a supported media in the system.  This should be able to 
          perform via find commands generating a list of files that are passed 
          to tar, dd, cpio, cp, gzip, etc., to be directed to the tape machine 
          or via operations performed on the filesystem itself. Email 
          notification and GUI status display very important.
          &lt;/gjob:Details&gt;
    
        &lt;/gjob:Job&gt;
    
      &lt;/gjob:Jobs&gt;
    &lt;/gjob:Helping&gt;</pre>
    <p>While loading the XML file into an internal DOM tree is a matter of
    calling only a couple of functions, browsing the tree to gather the data and
    generate the internal structures is harder and more error-prone.</p>
    <p>The suggested principle is to be tolerant with respect to the input
    structure. For example, the ordering of the attributes is not significant;
    the XML specification is clear about it. It's also usually a good idea not to
    depend on the order of the children of a given node, unless doing so really
    makes things harder. Here is some code to parse the information for a person:</p>
    <pre>#include &lt;stdio.h&gt;
    #include &lt;stdlib.h&gt;
    #include &lt;string.h&gt;
    #include &lt;libxml/parser.h&gt;   /* also pulls in libxml/tree.h */
    
    /* simple tracing macro, assumed here so that the snippet compiles */
    #define DEBUG(x) printf(x)
    
    /*
     * A person record; the strings come from libxml and are
     * released with xmlFree()
     */
    typedef struct person {
        xmlChar *name;
        xmlChar *email;
        xmlChar *company;
        xmlChar *organisation;
        xmlChar *smail;
        xmlChar *webPage;
        xmlChar *phone;
    } person, *personPtr;
    
    /*
     * And the code needed to parse it
     */
    personPtr parsePerson(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
        personPtr ret = NULL;
    
        DEBUG(&quot;parsePerson\n&quot;);
        /*
         * allocate the struct
         */
        ret = (personPtr) malloc(sizeof(person));
        if (ret == NULL) {
            fprintf(stderr,&quot;out of memory\n&quot;);
            return(NULL);
        }
        memset(ret, 0, sizeof(person));
    
        /* We don't care what the top level element name is */
        cur = cur-&gt;xmlChildrenNode;
        while (cur != NULL) {
            /* node names are xmlChar strings, so compare them with xmlStrcmp */
            if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Person&quot;)) &amp;&amp; (cur-&gt;ns == ns))
                ret-&gt;name = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
            if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Email&quot;)) &amp;&amp; (cur-&gt;ns == ns))
                ret-&gt;email = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
            cur = cur-&gt;next;
        }
    
        return(ret);
    }</pre>
    <p>Here are a couple of things to notice:</p>
    <ul>
    <li>Usually a recursive parsing style is the most convenient one: XML data
        is by nature subject to repetitive constructs and usually exhibits highly
        structured patterns.</li>
    <li>The function takes two arguments of type <em>xmlDocPtr</em> and
        <em>xmlNsPtr</em>, i.e. the pointer to the global XML document and the
        namespace reserved for the application. Document-wide information is
        needed, for example, to decode entities, and it's good coding practice to
        define a namespace for your application's set of data and test that the
        elements and attributes you're analyzing actually belong to your
        application space. This is done by a simple equality test
        (cur-&gt;ns == ns).</li>
    <li>To retrieve text and attribute values, you can use the function
        <em>xmlNodeListGetString</em> to gather all the text and entity reference
        nodes generated by the DOM output and produce a single text string.</li>
    </ul>
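    <p>The points above can be tried out without the library at hand. The
    following is only a minimal sketch: the <em>node</em> and <em>contact</em>
    types and the <em>gather_text</em> and <em>parse_contact</em> functions are
    hypothetical stand-ins, not libxml structures or calls. It shows children
    being matched by name in any order, a namespace checked by pointer equality,
    and several text nodes joined into one string.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a DOM node: only the fields the walk needs. */
typedef struct node {
    const char  *name;      /* element name, or NULL for a text node */
    const void  *ns;        /* namespace identity, compared by pointer */
    const char  *text;      /* content of a text node, else NULL */
    struct node *children;  /* first child */
    struct node *next;      /* next sibling */
} node;

/* Concatenate the text children of `parent` into `out` (size-1 bytes at most). */
void gather_text(const node *parent, char *out, size_t size) {
    assert(parent != NULL && out != NULL && size > 0);
    out[0] = '\0';
    for (const node *c = parent->children; c != NULL; c = c->next) {
        if (c->text != NULL)
            strncat(out, c->text, size - strlen(out) - 1);
    }
}

typedef struct contact {
    char name[64];
    char email[64];
} contact;

/* Fill `out` from the children of `cur`, in whatever order they appear.
 * Text nodes, foreign-namespace elements and unknown names are skipped. */
void parse_contact(const node *cur, const void *ns, contact *out) {
    assert(cur != NULL && out != NULL);
    memset(out, 0, sizeof(*out));
    for (const node *c = cur->children; c != NULL; c = c->next) {
        if (c->name == NULL || c->ns != ns)
            continue;
        if (strcmp(c->name, "Person") == 0)
            gather_text(c, out->name, sizeof(out->name));
        else if (strcmp(c->name, "Email") == 0)
            gather_text(c, out->email, sizeof(out->email));
    }
}
```

    <p>Because each child is dispatched on its own name, a document that lists
    the email before the person, or that carries elements from another
    namespace, yields the same record.</p>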
    <p>Here is another piece of code used to parse another level of the
    structure:</p>
    <pre>#include &lt;stdio.h&gt;
    #include &lt;stdlib.h&gt;
    #include &lt;string.h&gt;
    #include &lt;libxml/tree.h&gt;
    /*
     * a Description for a Job
     */
    typedef struct job {
        xmlChar *projectID;
        xmlChar *application;
        xmlChar *category;
        personPtr contact;
        int nbDevelopers;
        personPtr developers[100]; /* using dynamic alloc is left as an exercise */
    } job, *jobPtr;
    
    /*
     * And the code needed to parse it
     */
    jobPtr parseJob(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
        jobPtr ret = NULL;
    
        DEBUG(&quot;parseJob\n&quot;);
        /*
         * allocate the struct
         */
        ret = (jobPtr) malloc(sizeof(job));
        if (ret == NULL) {
            fprintf(stderr,&quot;out of memory\n&quot;);
            return(NULL);
        }
        memset(ret, 0, sizeof(job));
    
        /* We don't care what the top level element name is */
        cur = cur-&gt;xmlChildrenNode;
        while (cur != NULL) {
            /* the project number is carried by the ID attribute */
            if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Project&quot;)) &amp;&amp; (cur-&gt;ns == ns)) {
                ret-&gt;projectID = xmlGetProp(cur, (const xmlChar *) &quot;ID&quot;);
                if (ret-&gt;projectID == NULL) {
                    fprintf(stderr, &quot;Project has no ID\n&quot;);
                }
            }
            if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Application&quot;)) &amp;&amp; (cur-&gt;ns == ns))
                ret-&gt;application = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
            if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Category&quot;)) &amp;&amp; (cur-&gt;ns == ns))
                ret-&gt;category = xmlNodeListGetString(doc, cur-&gt;xmlChildrenNode, 1);
            /* the contact is parsed one level down, by parsePerson */
            if ((!xmlStrcmp(cur-&gt;name, (const xmlChar *) &quot;Contact&quot;)) &amp;&amp; (cur-&gt;ns == ns))
                ret-&gt;contact = parsePerson(doc, ns, cur);
            cur = cur-&gt;next;
        }
    
        return(ret);
    }</pre>
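    <p>The fixed array of 100 developers in the structure above is flagged as
    an exercise. One possible way to lift that limit is sketched here, assuming
    a small growable list of pointers; <em>ptr_list</em>,
    <em>ptr_list_append</em> and <em>ptr_list_free</em> are hypothetical names
    introduced for this illustration and are not part of the example code.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable list of pointers, doubling its capacity on demand. */
typedef struct ptr_list {
    void  **items;
    size_t  count;
    size_t  capacity;
} ptr_list;

/* Append `item`; returns 0 on success, -1 if memory is exhausted. */
int ptr_list_append(ptr_list *list, void *item) {
    assert(list != NULL);
    if (list->count == list->capacity) {
        size_t new_cap = (list->capacity == 0) ? 4 : list->capacity * 2;
        void **grown = realloc(list->items, new_cap * sizeof(*grown));
        if (grown == NULL)
            return -1;              /* the old array is still valid */
        list->items = grown;
        list->capacity = new_cap;
    }
    list->items[list->count++] = item;
    return 0;
}

/* Release the array itself; the stored items stay owned by the caller. */
void ptr_list_free(ptr_list *list) {
    assert(list != NULL);
    free(list->items);
    list->items = NULL;
    list->count = list->capacity = 0;
}
```

    <p>With such a list, each parsed developer record could be appended as it
    is met, and the count read back from the list instead of a separate
    counter.</p>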
    <p>Once you are used to it, writing this kind of code is quite simple, but
    boring. Ultimately, it could be possible to write stub generators that take
    either C data structure definitions, a set of XML examples or an XML DTD and
    produce the code needed to import and export the content between C data and
    XML storage. This is left as an exercise to the reader :-)</p>
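    <p>To give an idea of what generated import code might look like, here is a
    sketch assuming a table that maps element names to structure fields. The
    <em>person_rec</em> and <em>field_map</em> types, the
    <em>person_fields</em> table and the <em>set_field</em> function are all
    hypothetical; a real generator could emit something quite different.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct person_rec {
    char *name;
    char *email;
    char *phone;
} person_rec;

/* One row per element: the kind of table a generator could emit. */
typedef struct field_map {
    const char *element;   /* XML element name */
    size_t      offset;    /* where the char * field lives in the struct */
} field_map;

const field_map person_fields[] = {
    { "Person", offsetof(person_rec, name)  },
    { "Email",  offsetof(person_rec, email) },
    { "Phone",  offsetof(person_rec, phone) },
};

/* Store a copy of `text` in the field mapped to `element`.
 * Returns 1 if stored, 0 if the element is unknown, -1 on allocation failure. */
int set_field(void *record, const field_map *map, size_t n,
              const char *element, const char *text) {
    assert(record != NULL && map != NULL && element != NULL && text != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(map[i].element, element) != 0)
            continue;
        char **slot = (char **)((char *)record + map[i].offset);
        char *copy = malloc(strlen(text) + 1);
        if (copy == NULL)
            return -1;
        strcpy(copy, text);
        free(*slot);                /* replace any earlier value */
        *slot = copy;
        return 1;
    }
    return 0;
}
```

    <p>The hand-written chain of comparisons then shrinks to one call per child
    element, and supporting a new field only means adding a row to the
    table.</p>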
    <p>Feel free to use <a href="example/gjobread.c">the code for the full C
    parsing example</a> as a template; it is also available with a Makefile in
    the Gnome CVS base under gnome-xml/example.</p>
    <p><a href="mailto:daniel@veillard.com">Daniel Veillard</a></p>
    </td></tr></table></td></tr></table></td></tr></table></td>
    </tr></table></td></tr></table>
    </body>
    </html>