include/libxml/HTMLparser.h


Log

Author Commit Date CI Message
Daniel Veillard f8e3db04 2012-09-11T13:26:36 Big space and tab cleanup Remove all space before tabs and space and tabs at end of lines.
Daniel Veillard c62efc84 2011-05-16T16:03:50 Add options to ignore the internal encoding For both XML and HTML, the document can provide an encoding either in XMLDecl in XML, or as a meta element in HTML head. This adds options to ignore those encodings if the encoding is known in advance, for example if the content had been converted before being passed to the parser. * parser.c include/libxml/parser.h: add XML_PARSE_IGNORE_ENC option for XML parsing * include/libxml/HTMLparser.h HTMLparser.c: adds the HTML_PARSE_IGNORE_ENC for HTML parsing * HTMLtree.c: fix the handling of saving when an unknown encoding is defined in meta document header * xmllint.c: add a --noenc option to activate the new parser options
Daniel Veillard f1121c48 2010-07-26T14:02:42 Add an HTML parser option to avoid a default doctype - include/libxml/HTMLparser.h: defines the new HTML parser option HTML_PARSE_NODEFDTD - HTMLparser.c: if option is set don't add a default DTD - xmllint.c: add the corresponding --nodefdtd option in xmllint
Daniel Veillard e20fb5a7 2010-01-29T20:47:08 Fix xmlParseInNodeContext for HTML content xmlParseInNodeContext notices that the enclosing document is an HTML document, so it invokes the HTML parser for that fragment, and the HTML parser, finding a "<p>hello world!</p>" document, automatically augments it with defaulted <html> and <body>. This defaulting should be turned off in the HTML parser for this to work, but there is no such HTML parser option. There is an htmlOmittedDefaultValue global variable that you could use, but really we should not rely on global variables for processing options anymore; best is to add an HTML_PARSE_NOIMPLIED. * include/libxml/HTMLparser.h: add the HTML_PARSE_NOIMPLIED parser flag * HTMLparser.c: do not add implied elements if HTML_PARSE_NOIMPLIED is set * parser.c: add HTML_PARSE_NOIMPLIED to options for xmlParseInNodeContext on HTML documents
Daniel Veillard 34c647cf 2006-09-21T06:53:59 * HTMLparser.c include/libxml/HTMLparser.h: exports htmlNewParserCtxt() as Michael Day pointed out this is needed to use htmlCtxtRead*() Daniel
Daniel Veillard 8874b94c 2005-08-25T13:19:21 * HTMLparser.c parser.c SAX2.c debugXML.c tree.c valid.c xmlreader.c xmllint.c include/libxml/HTMLparser.h include/libxml/parser.h: added a parser XML_PARSE_COMPACT option to allocate small text nodes (less than 8 bytes on 32 bits, less than 16 bytes on 64 bits) directly within the node, various changes to cope with this. * result/XPath/tests/* result/XPath/xptr/* result/xmlid/*: this slightly changes the output Daniel
Daniel Veillard ea4b0bae 2005-08-23T16:06:08 * HTMLparser.c include/libxml/HTMLparser.h: added a recovery mode for the HTML parser based on the suggestions of bug #169834 by Paul Loberg Daniel
Daniel Veillard a2351322 2004-06-27T12:08:10 * elfgcchack.h doc/elfgcchack.xsl libxml.h: hack based on Arjan van de Ven's suggestion to reduce ELF footprint and generated code. Based on aliasing of library functions to generate direct calls instead of indirect ones * doc/libxml2-api.xml doc/Makefile.am doc/apibuild.py: added automatic generation of elfgcchack.h based on the API description, extended the API description to show the conditional configuration flags required for symbols. * nanohttp.c parser.c xmlsave.c include/libxml/*.h: lots of cleanup * doc/*: regenerated the docs. Daniel
Daniel Veillard be586972 2003-11-18T20:56:51 * include/libxml/*.h include/libxml/*.h.in: modified the file header to add more information, painful... * genChRanges.py genUnicode.py: updated to generate said changes in headers * doc/apibuild.py: extract headers, add them to libxml2-api.xml * *.html *.xsl *.xml: updated the stylesheets to flag deprecated APIs modules. Updated the stylesheets, some cleanups, regenerated * doc/html/*.html: regenerated added back book1 and libxml-lib.html Daniel
Daniel Veillard 73b013fc 2003-09-30T12:36:01 * HTMLparser.c Makefile.am configure.in legacy.c parser.c parserInternals.c testHTML.c xmllint.c include/libxml/HTMLparser.h include/libxml/parser.h include/libxml/parserInternals.h include/libxml/xmlversion.h.in: added a new configure option --with-push, some cleanups, chased code size anomalies. Now a library configured --with-minimum is around 150KB, sounds good enough. Daniel
Daniel Veillard 9475a352 2003-09-26T12:47:50 * HTMLparser.c testHTML.c xmllint.c include/libxml/HTMLparser.h: added the same htmlRead APIs as their XML counterparts * include/libxml/parser.h: new parser options, not yet implemented, added an options field to the context. * tree.c: patch from Shaun McCance to fix bug #123238 when ]]> is found within a cdata section. * result/noent/cdata2 result/cdata2 result/cdata2.rdr result/cdata2.sax test/cdata2: add one more cdata test Daniel
Igor Zlatkovic 76874e45 2003-08-25T09:05:12 Exportability taint of the headers
Daniel Veillard 2fdbd32d 2003-08-18T12:15:38 * dict.c include/libxml/dict.h Makefile.am include/libxml/Makefile.am: new dictionary module to keep a single instance of the names used by the parser * DOCBparser.c HTMLparser.c parser.c parserInternals.c valid.c: switched all parsers to use the dictionary internally * include/libxml/HTMLparser.h include/libxml/parser.h include/libxml/parserInternals.h include/libxml/valid.h: some of the interfaces changed as a result to receive or return "const xmlChar *" instead of "xmlChar *"; this is either insignificant from a user point of view or, where the return value changed, those functions are really parser internal methods that no user code should really change * doc/libxml2-api.xml doc/html/*: the API interface changed and the docs were regenerated Daniel
William M. Brack 7a82165d 2003-08-15T07:27:40 * encoding.c, threads.c, include/libxml/HTMLparser.h, doc/libxml2-api.xml: Minor changes to comments, etc. for improving documentation generation * doc/Makefile.am: further adjustment to auto-generation of win32/libxml2.def.src
Daniel Veillard 02ea1414 2003-04-09T12:08:47 * HTMLparser.c include/libxml/HTMLparser.h: exported htmlCreateMemoryParserCtxt() it was static Daniel
Daniel Veillard 930dfb63 2003-02-05T10:17:38 * HTMLparser.c include/libxml/HTMLparser.h: applied HTML improvements from Nick Kew, allowing to do more checking to HTML elements and attributes. Daniel
Daniel Veillard 1b31e4a0 2002-05-27T14:44:50 * HTMLparser.c win32/libxml2.def.src win32/dsp/libxml2.def.src include/libxml/HTMLparser.h: fixing #79334 making htmlParseDocument a public entry point. * doc/*: rebuilt the API and docs Daniel
Daniel Veillard 61f26174 2002-03-12T18:46:39 * include/libxml/*.h: Heiko W. Rupp fixed a lot of comments to generate better API descriptions etc... Daniel
Daniel Veillard 963d2ae4 2002-01-20T22:08:18 * SAX.c: cleanup patch from Anthony Jones * doc/Makefile.am: fix the headers to avoid in make scan * parserInternals.c xpath.c include/libxml/*.h: cleanup of the includes, * vs Ptr and general cleanup * parsedecl.py: first version of a script to extract the module interfaces, the goal will be to provide .decl or XML specification of the interfaces to build wrappers. Daniel
Daniel Veillard cbaf3995 2001-12-31T16:16:02 * *.c include/libxml/*.h doc/html/*: applied 42 documentation patches from Charlie Bozeman. Regenerated the HTML docs. Daniel
Daniel Veillard bb371297 2001-08-16T23:26:59 * HTMLparser.c HTMLtree.c include/libxml/HTMLparser.h: trying to fix some troubles w.r.t. function returning const xxxPtr. Daniel
Daniel Veillard 22090731 2001-07-16T00:06:07 * include/libxml/parserInternals.h include/libxml/HTMLparser.h xmlIO.c tree.c parserInternals.c entities.c encoding.c HTMLparser.c: cleanup of global variables, marking some const or private. Daniel
Daniel Veillard c5d64345 2001-06-24T12:13:24 Summer's cleanup, a really big one: * AUTHORS: added William and Bjorn * include/libxml/*.h *.c README doc/*.html etc.: changed old email to daniel@veillard.com hopefully I won't have to do this again * doc/Makefile.am doc/html/*.html: cleanup makefile, checked that docs can be rebuilt cleanly now * include/libxml/xml*version.h*: removed include/libxml/xmlversion.h from CVS, it's generated, added include/libxml/xmlwin32version.h also generated but which should change far less frequently. * catalog.c nanoftp.c: made sure to include libxml.h not libxml/xmlversion.h directly * include/libxml/*.h: include xmlwin32version.h instead of xmlversion.h when compiling on WIN32 and MSC Daniel
Daniel Veillard 02bb170a 2001-06-13T21:11:59 - HTMLparser.[ch] HTMLtree.c: stored the inline/block property of elements and use it to avoid outputting formatting spaces at the wrong place. Implemented the format parameter for HTML save. - result/HTML/doc2.htm result/HTML/doc3.htm result/HTML/fp40.htm result/HTML/script.html result/HTML/test2.html result/HTML/test3.html result/HTML/wired.html: of course this impacts the result of a number of HTML tests Daniel
Daniel Veillard 56a4cb8c 2001-03-24T17:00:36 Huge cleanup, I switched to compile with -Wall -g -O -ansi -pedantic -W -Wunused -Wimplicit -Wreturn-type -Wswitch -Wcomment -Wtrigraphs -Wformat -Wchar-subscripts -Wuninitialized -Wparentheses -Wshadow -Wpointer-arith -Wcast-align -Wwrite-strings -Waggregate-return -Wstrict-prototypes -Wmissing-prototypes -Wnested-externs -Winline - HTMLparser.[ch] HTMLtree.c SAX.c debugXML.c encoding.[ch] encoding.h entities.c error.c list.[ch] nanoftp.c nanohttp.c parser.[ch] parserInternals.[ch] testHTML.c testSAX.c testURI.c testXPath.c tree.[ch] uri.c valid.[ch] xinclude.c xmlIO.[ch] xmllint.c xmlmemory.c xpath.c xpathInternals.h xpointer.[ch] example/gjobread.c: Cleanup, staticfied a number of non-exported functions, detected and cleaned up a dozen problems found this way, avoided a lot of public function name/typedef/system name clashes - doc/xml.html: updated - configure.in: switched private flags to the really pedantic ones. Daniel
Owen Taylor 3473f88a 2001-02-23T17:55:21 Revert directory structure changes
Tomasz Kłoczko 64636e7f 2001-02-23T01:37:32 Thu Feb 23 02:03:56 CET 2001 Tomasz Kłoczko <kloczek@pld.org.pl> * *.c *.h libxml files: moved to libxml directory - this allows simplifying automake/autoconf. Hacks at the am/ac level to make and remove the libxml symlink are no longer necessary (configure.in and the main Makefile.am were modified for this as well). Now automake abilities are used in the best way (like in many other projects with libraries). * include/win32config.h: moved to libxml directory (the include directory is no longer necessary). * Makefile.am, examples/Makefile.am, libxml/Makefile.am: added empty DEFS and left only -I$(top_builddir) in INCLUDES - this minimizes the number of parameters passed to the libtool script (compilation is also slightly quieter now). * configure.in: simplified libz detection - preparing separate variables to keep the libz name and the path to the libz header files isn't really necessary (if someone has libz installed in a non-standard prefix, the path to the header files and library can be passed as: $ CFLAGS="-I</libz.h/path>" LDFLAGS="-L</libz/path>" ./configure). * autogen.sh: check now for libxml/entities.h. After the above, building libxml passes correctly, and so does "make install DESTDIR=</install/prefix>" from the tarball generated by "make dist". It seems the ac/am reorganization is finished. These changes do not touch anything else at the *.{c,h} file level.
Daniel Veillard f41fbbf6 2001-02-13T17:05:35 testing and bug fixing related to XSLT: - xpath.c result/XPath/tests/chaptersprefol: bugfixes on order and on predicate - HTMLparser.[ch] HTMLtree.c result/HTML/doc3.htm.err result/HTML/doc3.htm.sax result/HTML/wired.html: sometimes one really wants to have tags closed on output even if we accept unclosed ones on input Daniel
Daniel Veillard a6d8eb62 2000-12-27T10:46:47 Finally had a bit of time to resynch both trees: - HTMLparser.[ch]: added a way to avoid adding automatically omitted tags. htmlHandleOmittedElem() allows to change the default handling. - tree.[ch] xmllint.c: added xmlDocDumpFormatMemory() and xmlDocDumpFormatMemoryEnc(), uses memory functions for output of xmllint too when using --memory flag, added a memory test suite at the Makefile level. - xpathInternals.h xpath.[ch] xpointer.c: fixed problems with namespace use when encountering QNames in XPath evaluation, added xmlns() scheme in XPointer. - nanoftp.c : incorporated a fix - parser.c xmlIO.c: fixed problems raised with encoding when using the memory I/O - parserInternals.c: closed bug 25934 reported by torsten.landschoff@innominate.de - TODO: updated Daniel
Daniel Veillard 47e12f23 2000-10-15T14:24:25 HTML attributes handling: - SAX.c: HTML attributes need normalization too (Bjorn Reese) - HTMLparser.[ch]: added htmlIsScriptAttribute() Daniel
Daniel Veillard e010c17d 2000-08-28T10:04:51 Mostly HTML generation and parsing enhancements: - HTMLparser.[ch] testHTML.c: applied the second set of patches from Wayne Davison <wayned@blorf.net>, adding htmlEncodeEntities() - HTMLparser.c: fixed an ignorable white space detection bug occurring when parsing with SAX only - result/HTML/*.sax: updated since the output is now HTML encoded... Daniel.
Daniel Veillard 47f3f31f 2000-08-27T22:40:15 - HTMLparser.[ch]: applied some of Wayne Davison <wayned@blorf.net> patches Daniel
Daniel Veillard 32bc74ef 2000-07-14T14:49:25 - doc/encoding.html doc/xml.html: added I18N doc - encoding.[ch] HTMLtree.[ch] parser.c HTMLparser.c: I18N encoding improvements, both parser and filters, added ASCII & HTML, fixed the ISO-Latin-1 one - xmllint.c testHTML.c: added/made visible --encode - debugXML.c : cleanup - most .c files: applied patches due to warnings on Windows and when using Sun Pro cc compiler - xpath.c : cleanup memleaks - nanoftp.c : added a TESTING preprocessor flag for standalone compile so that people can report bugs more easily - nanohttp.c : ditched socklen_t which was a portability mess and replaced it with unsigned int. - tree.[ch]: added xmlHasProp() - TODO: updated - test/ : added more tests for entities, NS, encoding, HTML, wap - configure.in: preparing for 2.2.0 release Daniel
Daniel Veillard 361d845d 2000-04-03T19:48:13 Work done on the plane, ready to release libxml2-2.0.0, Daniel
Daniel Veillard 71b656e0 2000-01-05T14:46:17 - added xmlRemoveID() and xmlRemoveRef() - added check and handling when possibly removing an ID - fixed some entities problems - added xmlParseTryOrFinish() - changed the way structs are declared to allow gtk-doc to expose those - closed #4960 - fixes to libs detection from Albert Chin-A-Young - preparing 1.8.3 release Daniel
Daniel Veillard 5e5c6235 1999-12-29T12:49:06 - Push mode for the HTML parser (new calls) - Improved the memory debugger to provide content information - cleanups, last known mem leak killed Daniel
Daniel Veillard 5cb5ab8d 1999-12-21T15:35:29 - release 1.8.2 - HTML handling improvement - new tree handling functions - default namespace on attribute bug fixed - libxml use for C++ fixed (for good this time !) Daniel
Daniel Veillard f600e253 1999-12-18T15:32:46 - Fixed bug #4344 - Fixed C++ problems in headers - Released 1.8.1 Daniel
Daniel Veillard dd6b3676 1999-09-23T22:19:22 Fixed CHAR, errno, alpha RPM compile, updated doc, Daniel
Daniel Veillard b96e6438 1999-08-29T21:02:19 Release 1.6, lot of fixes, more validation, code cleanup, added namespace on attributes, Daniel.
Daniel Veillard 82150d8a 1999-07-07T07:32:15 HTML parsing, output is now correct, added HTMLtests target and testcases, Daniel
Daniel Veillard 5233ffc8 1999-07-06T22:25:25 Restore binary compat, more HTML stuff, allow stdin input, Daniel.
Daniel Veillard be70ff71 1999-07-05T16:50:46 Closing reported bugs: 617 1591 1592, adding an HTML parser, Daniel