include/libxml/encoding.h


Log

Author Commit Date CI Message
Nick Wellnhofer 1167c334 2024-06-28T21:51:21 encoding: Don't include iconv.h from libxml/encoding.h
Nick Wellnhofer 95d36333 2024-06-28T21:19:44 encoding: Rework conversion error codes This should match the old code more closely. Remove XML_ERR_PARTIAL. It's unlikely that anyone is using these codes already.
Nick Wellnhofer 282ec1d5 2024-06-28T19:06:57 encoding: Rework xmlCharEncodingHandler layout Reuse some of the old members. The "input" and "output" function pointers are actually of type xmlCharEncConvFunc, accepting an additional argument. For default handlers, this argument is unused, so this should work with most ABIs. For iconv handlers, these function pointers used to be NULL but now point to a function which requires the extra argument. "iconv_in" and "iconv_out" are made void pointers. "uconv_in" and "uconv_out" are renamed and made void pointers. This is unlikely to cause issues. We now expect that the built-in conversion functions correctly report XML_ENC_ERR_SPACE. For UTF8ToHtml and the ISO-8859-X code, this will be done in the following commits.
Nick Wellnhofer 501e5d19 2024-06-28T04:10:03 encoding: Stop using XML_ENC_ERR_PARTIAL
Nick Wellnhofer c59c2449 2024-06-27T23:32:58 encoding: Support custom implementations
Nick Wellnhofer 1e3da9f4 2024-06-27T21:37:18 encoding: Start with callbacks
Nick Wellnhofer 6d8427dc 2024-06-27T20:39:52 encoding: Rework encoding lookup Add missing xmlCharEncoding enum values. Simplify and speed up encoding lookup by using a table mapping names to xmlCharEncoding enums and binary search. Rearrange the default handler table to match the enum layout. For some encodings we now only lookup the provided or most canonical name instead of trying several names, expecting that iconv or ICU handle aliases: - IBM037 (EBCDIC) - UCS-2 - UCS-4 - Shift_JIS
Nick Wellnhofer 3b4a84e4 2024-06-10T23:20:43 encoding: Deprecate xmlCharEncodingHandler members
Nick Wellnhofer e75e878e 2024-05-20T13:58:22 doc: Update and fix documentation
Nick Wellnhofer 0821efc8 2024-01-02T18:33:57 encoding: Check whether encoding handlers support input/output The "HTML" encoding handler doesn't support input which could lead to a wrong error report.
Nick Wellnhofer bd5ad030 2023-12-10T14:56:21 encoding: Report malloc failures Introduce new API functions that return a separate error code if a memory allocation fails. - xmlOpenCharEncodingHandler - xmlLookupCharEncodingHandler Fix a few places where malloc failures weren't reported.
Nick Wellnhofer 7909ff08 2023-09-20T17:38:26 include: Remove unnecessary includes - Don't include tree.h from encoding.h - Don't include parser.h from xmlIO.h
Nick Wellnhofer 3ff6abbf 2023-02-22T17:11:20 encoding: Rework error codes Use an enum instead of magic numbers. Fix a few error codes. Simplify handling of "space" and "partial" errors. See #506.
Nick Wellnhofer 98840d40 2023-03-21T19:07:12 parser: Rework EBCDIC code page detection To detect EBCDIC code pages, we used to switch the encoding twice and had to be very careful not to decode data after the XML declaration before the second switch. This relied on a hard-coded expected size of the XML declaration and was complicated and unreliable. Now we convert the first 200 bytes to EBCDIC-US and parse the encoding declaration manually.
Nick Wellnhofer ce9baf94 2022-12-08T02:48:27 Remove XMLCALL and XMLCDECL macros from public headers
Nick Wellnhofer 40483d0c 2022-03-06T13:55:48 Deprecate module init and cleanup functions These functions shouldn't be part of the public API. Most init functions are only thread-safe when called from xmlInitParser. Global variables should only be cleaned up by calling xmlCleanupParser.
Nick Wellnhofer b66ce0bb 2022-03-01T12:39:02 Don't include ICU headers in public headers There's no need to make these implementation details public.
Joel Hockey 0b19f236 2017-10-25T18:11:12 Fixed ICU to set flush correctly and provide pivot buffer. By always setting flush=TRUE when doing multiple reads, ICU will not correctly handle truncated utf8 chars across read boundaries. The fix is to set flush=TRUE only on final read, and to provide a pivot buffer which is maintained by libxml between calls to ucnv_convertEx.
Daniel Veillard f8e3db04 2012-09-11T13:26:36 Big space and tab cleanup Remove all space before tabs and space and tabs at end of lines.
Giuseppe Iuculano 48f7dcb7 2010-11-04T17:42:42 480323 add code to plug in ICU converters by default This is not configured in by default but after some serious massaging incorporate that patch from Chromium/Chrome.
William M. Brack 21e4ef20 2005-01-02T09:53:13 Re-examined the problems of configuring a "minimal" library. Synchronized the header files with the library code in order to assure that all the various conditionals (LIBXML_xxxx_ENABLED) were the same in both. Modified the API database content to more accurately reflect the conditionals. Enhanced the generation of that database. Although there was no substantial change to any of the library code's logic, a large number of files were modified to achieve the above, and the configuration script was enhanced to do some automatic enabling of features (e.g. --with-xinclude forces --with-xpath). Additionally, all the format errors discovered by apibuild.py were corrected. * configure.in: enhanced cross-checking of options * doc/apibuild.py, doc/elfgcchack.xsl, doc/libxml2-refs.xml, doc/libxml2-api.xml, gentest.py: changed the usage of the <cond> element in module descriptions * elfgcchack.h, testapi.c: regenerated with proper conditionals * HTMLparser.c, SAX.c, globals.c, tree.c, xmlschemas.c, xpath.c, testSAX.c: cleaned up conditionals * include/libxml/[SAX.h, SAX2.h, debugXML.h, encoding.h, entities.h, hash.h, parser.h, parserInternals.h, schemasInternals.h, tree.h, valid.h, xlink.h, xmlIO.h, xmlautomata.h, xmlreader.h, xpath.h]: synchronized the conditionals with the corresponding module code * doc/examples/tree2.c, doc/examples/xpath1.c, doc/examples/xpath2.c: added additional conditions required for compilation * doc/*.html, doc/html/*.html: rebuilt the docs
Daniel Veillard 3671190b 2004-02-11T13:25:26 added xmlByteConsumed() interface updated the benchmark rebuilt the docs * parserInternals.c xmlIO.c encoding.c include/libxml/parser.h include/libxml/xmlIO.h: added xmlByteConsumed() interface * doc/*: updated the benchmark rebuilt the docs * python/tests/Makefile.am python/tests/indexes.py: added a specific regression test for xmlByteConsumed() * include/libxml/encoding.h rngparser.c tree.c: small cleanups Daniel
William M. Brack a2e844a3 2004-01-06T11:52:13 moved string and UTF8 routines out of parser.c and encoding.c into a new * encoding.c, parser.c, xmlstring.c, Makefile.am, include/libxml/Makefile.am, include/libxml/catalog.c, include/libxml/chvalid.h, include/libxml/encoding.h, include/libxml/parser.h, include/libxml/relaxng.h, include/libxml/tree.h, include/libxml/xmlwriter.h, include/libxml/xmlstring.h: moved string and UTF8 routines out of parser.c and encoding.c into a new module xmlstring.c with include file include/libxml/xmlstring.h mostly using patches from Reid Spencer. Since xmlChar now defined in xmlstring.h, several include files needed to have a #include added for safety. * doc/apibuild.py: added some additional sorting for various references displayed in the APIxxx.html files. Rebuilt the docs, and also added new file for xmlstring module. * configure.in: small addition to help my testing; no effect on normal usage. * doc/search.php: added $_GET[query] so that persistent globals can be disabled (for recent versions of PHP)
William M. Brack f9415e49 2003-11-28T09:39:10 Enhanced the handling of UTF-16, UTF-16LE and UTF-16BE encodings. Now * encoding.c, include/libxml/encoding.h: Enhanced the handling of UTF-16, UTF-16LE and UTF-16BE encodings. Now UTF-16 output is handled internally by default, with proper BOM and UTF-16LE encoding. Native UTF-16LE and UTF-16BE encoding will not generate a BOM on output, and will be automatically recognized on input. * test/utf16lebom.xml, test/utf16bebom.xml, result/utf16?ebom*: added regression tests for above.
Daniel Veillard be586972 2003-11-18T20:56:51 modified the file header to add more informations, painful... updated to * include/libxml/*.h include/libxml/*.h.in: modified the file header to add more informations, painful... * genChRanges.py genUnicode.py: updated to generate said changes in headers * doc/apibuild.py: extract headers, add them to libxml2-api.xml * *.html *.xsl *.xml: updated the stylesheets to flag geprecated APIs modules. Updated the stylesheets, some cleanups, regenerated * doc/html/*.html: regenerated added back book1 and libxml-lib.html Daniel
William M. Brack 60f394e9 2003-11-16T06:25:42 Finally - found the problem with the page generation (XMLPUBFUN not * doc/html/*.html: Finally - found the problem with the page generation (XMLPUBFUN not recognized by gtkdoc). Re-created the pages using a temporary version of include/libxml/*.h. * testOOMlib.c,include/libxml/encoding.h, include/libxml/schemasInternals.h,include/libxml/valid.h, include/libxml/xlink.h,include/libxml/xmlwin32version.h, include/libxml/xmlwin32version.h.in, include/libxml/xpathInternals.h: minor edit of comments to help automatic documentation generation * doc/docdescr.doc: small elaboration * doc/examples/test1.c,doc/examples/Makefile.am: re-commit (messed up on last try) * xmlreader.c: minor change to clear warning.
Igor Zlatkovic 76874e45 2003-08-25T09:05:12 Exportability taint of the headers
William M. Brack 4a557d97 2003-07-29T04:28:04 fixed problem with comments reported by Nick Kew added routines * HTMLparser.c: fixed problem with comments reported by Nick Kew * encoding.c: added routines xmlUTF8Size and xmlUTF8Charcmp for some future cleanup of UTF8 handling
Igor Zlatkovic 7ae91bcd 2002-11-08T17:18:52 retired xmlwin32version.h
Daniel Veillard f000f073 2002-10-22T14:28:17 made xmlGetUTF8Char public Daniel * include/libxml/encoding.h encoding.c: made xmlGetUTF8Char public Daniel
Daniel Veillard 6f46f6c5 2002-08-01T12:22:24 Opening the interface xmlNewCharEncodingHandler as requested in #89415 * encoding.c include/libxml/encoding.h: Opening the interface xmlNewCharEncodingHandler as requested in #89415 * python/generator.py python/setup.py.in: applied cleanup patches from Marc-Andre Lemburg * tree.c: fixing bug #89332 on a specific case of loosing the XML-1.0 namespace on xml:xxx attributes Daniel
Igor Zlatkovic a6f2d906 2002-04-16T17:57:17 *** empty log message ***
Daniel Veillard 61f26174 2002-03-12T18:46:39 Heiko W. Rupp fixed a lot of comments to generate better API descriptions * include/libxml/*.h: Heiko W. Rupp fixed a lot of comments to generate better API descriptions etc... Daniel
Daniel Veillard 6c4ffafd 2002-02-11T08:54:05 trying to fix the include mess Daniel * include/libxml/encoding.h include/libxml/entities.h include/libxml/globals.h include/libxml/parser.h include/libxml/threads.h include/libxml/tree.h include/libxml/xmlmemory.h: trying to fix the include mess Daniel
Daniel Veillard 963d2ae4 2002-01-20T22:08:18 cleanup patch from Anthony Jones fix the headers to avoid in make scan * SAX.c: cleanup patch from Anthony Jones * doc/Makefile.am: fix the headers to avoid in make scan * parserInternals.c xpath.c include/libxml/*.h: cleanup of the includes, * vs Ptr and general cleanup * parsedecl.py: first version of a script to extract the module interfaces, the goal will be to provide .decl or XML specification of the interfaces to build wrappers. Daniel
Daniel Veillard cbaf3995 2001-12-31T16:16:02 applied 42 documentation patches from Charlie Bozeman. Regenerated the * *.c include/libxml/*.h doc/html/*: applied 42 documentation patches from Charlie Bozeman. Regenerated the HTML docs. Daniel
Daniel Veillard 60087f30 2001-10-10T09:45:09 preparing 2.4.6 release updated and rebuilt the docs fixed a number of * configure.in: preparing 2.4.6 release * doc/xml.html doc/html/*: updated and rebuilt the docs * include/libxml/*.h *.c: fixed a number of teh/the widht/width typos Daniel
Daniel Veillard c5d64345 2001-06-24T12:13:24 Summer's cleanup, a really big one: * AUTHORS: added William and Bjorn * include/libxml/*.h *.c README doc/*.html etc.: changed old email to daniel@veillard.com hopefully I won't have to do this again * doc/Makefile.am doc/html/*.html: cleanup makefile, checked that docs can be rebuilt cleanly now * include/libxml/xml*version.h*: removed include/libxml/xmlversion.h from CVs it's generated, added include/libxml/xmlwin32version.h also generated but which should change far less frequently. * catalog.c nanoftp.c: made sure to include libxml.h not libxml/xmlversion.h directly * include/libxml/*.h: include xmlwin32version.h instead of xmlversion.h when compiling on WIN32 and MSC Daniel
Daniel Veillard 97ac1319 2001-05-30T19:14:17 - xpath.c encoding.[ch]: William M. Brack provided a set of UTF8 string oriented functions and started cleaning the related areas in xpath.c which needed fixing in this respect Daniel
Daniel Veillard f69bb4b5 2001-05-19T13:24:56 - HTMLparser.c: Closed bug #54891 - result/HTML/cf_128.html* test/HTML/cf_128.html: added the test to the suite forgot to commit this one yesterday - encoding.h hash.c nanoftp.h parser.h tree.h uri.h xlink.h xpointer.c: applied a documentation patch from LotR and filled in a few missing descriptions Daniel
Daniel Veillard e043ee17 2001-04-16T14:08:07 - xpath.c: fixed xmlXPathNodeCollectAndTest() to do proper prefix lookup. - parserInternals.c: fixed the bug reported by Morus Walter due to an off by one typo in xmlStringCurrentChar() Daniel
Daniel Veillard 56a4cb8c 2001-03-24T17:00:36 Huge cleanup, I switched to compile with -Wall -g -O -ansi -pedantic -W -Wunused -Wimplicit -Wreturn-type -Wswitch -Wcomment -Wtrigraphs -Wformat -Wchar-subscripts -Wuninitialized -Wparentheses -Wshadow -Wpointer-arith -Wcast-align -Wwrite-strings -Waggregate-return -Wstrict-prototypes -Wmissing-prototypes -Wnested-externs -Winline - HTMLparser.[ch] HTMLtree.c SAX.c debugXML.c encoding.[ch] encoding.h entities.c error.c list.[ch] nanoftp.c nanohttp.c parser.[ch] parserInternals.[ch] testHTML.c testSAX.c testURI.c testXPath.c tree.[ch] uri.c valid.[ch] xinclude.c xmlIO.[ch] xmllint.c xmlmemory.c xpath.c xpathInternals.h xpointer.[ch] example/gjobread.c: Cleanup, staticfied a number of non-exported functions, detected and cleaned up a dozen of problem found this way, avoided a lot of public function name/typedef/system names clashes - doc/xml.html: updated - configure.in: switched private flags to the really pedantic ones. Daniel
Owen Taylor 3473f88a 2001-02-23T17:55:21 Revert directory structure changes
Tomasz Kłoczko 64636e7f 2001-02-23T01:37:32 moved to libxml directory - this allow simplify automake/autoconf. Now Thu Feb 23 02:03:56 CET 2001 Tomasz Kłoczko <kloczek@pld.org.pl> * *.c *.h libxml files: moved to libxml directory - this allow simplify automake/autoconf. Now isn't neccessary hack on am/ac level for make and remove libxml symlink (modified for this also configure.in and main Makefile.am). Now automake abilities are used in best way (like in many other projects with libraries). * include/win32config.h: moved to libxml directory (now include directory isn't neccessary). * Makefile.am, examples/Makefile.am, libxml/Makefile.am: added empty DEFS and in INCLUDES rest only -I$(top_builddir) - this allow minimize parameters count passed to libtool script (now compilation is also slyghtly more quiet). * configure.in: simplifies libzdetestion - prepare separated variables for keep libz name and path to libz header files isn't realy neccessary (if someone have libz installed in non standard prefix path to header files ald library can be passed as: $ CFALGS="-I</libz.h/path>" LDFLAGS="-L</libz/path>" ./configure * autogen.sh: check now for libxml/entities.h. After above building libxml pass correctly and also pass "make install DESTDIR=</install/prefix>" from tar ball generated by "make dist". Seems ac/am reorganization is finished. This changes not touches any other things on *.{c,h} files level.
Daniel Veillard f0cc7ccc 2000-08-26T21:40:43 libxml now grok Docbook-3.1.5 and Docbook-4.1.1 DTDs, this popped out a couple of bugs and 3 speed issues, there is only on minor speed issue left. Assorted collection of user reported bugs and fixes: - doc/encoding.html: added encoding aliases doc - doc/xml.html: updates - encoding.[ch]: added EncodingAliases functions - entities.[ch] valid.[ch] debugXML.c: removed two serious bottleneck affecting large DTDs like Docbook - parser.[ch] xmllint.c: added a pedantic option, will be useful - SAX.c: redefinition of entities is reported in pedantic mode - testHTML.c: uninitialized warning from gcc - uri.c: fixed a couple of bugs - TODO: added issue raised by Michael Daniel
Daniel Veillard 32bc74ef 2000-07-14T14:49:25 - doc/encoding.html doc/xml.html: added I18N doc - encoding.[ch] HTMLtree.[ch] parser.c HTMLparser.c: I18N encoding improvements, both parser and filters, added ASCII & HTML, fixed the ISO-Latin-1 one - xmllint.c testHTML.c: added/made visible --encode - debugXML.c : cleanup - most .c files: applied patches due to warning on Windows and when using Sun Pro cc compiler - xpath.c : cleanup memleaks - nanoftp.c : added a TESTING preprocessor flag for standalong compile so that people can report bugs more easilly - nanohttp.c : ditched socklen_t which was a portability mess and replaced it with unsigned int. - tree.[ch]: added xmlHasProp() - TODO: updated - test/ : added more test for entities, NS, encoding, HTML, wap - configure.in: preparing for 2.2.0 release Daniel
Daniel Veillard be803967 2000-06-28T23:40:59 - Large resync between W3C and Gnome tree - configure.in: 2.1.0 prerelease - example/Makefile.am example/gjobread.c tree.h: work on libxml1 libxml2 convergence. - nanoftp, nanohttp.c: fixed stalled connections probs - HTMLtree.c SAX.c : support for attribute without values in HTML for andersca - valid.c: Fixed most validation + namespace problems - HTMLparser.c: start document callback for andersca - debugXML.c xpath.c: lots of XPath fixups from Picdar Technology - parser.h, SAX.c: serious speed improvement for large CDATA blocks - encoding.[ch] xmlIO.[ch]: Improved seriously saving to different encoding - config.h.in parser.c xmllint.c: added xmlCheckVersion() and the LIBXML_TEST_VERSION macro Daniel
Daniel Veillard 496a1cf5 2000-05-03T14:20:55 revamped the encoding support, added iconv support, so now libxml if * encoding.[ch], xmlIO.[ch], parser.c, configure.in : revamped the encoding support, added iconv support, so now libxml if compiled with iconv automatically support japanese encodings among others. Work based on initial patch from Yuan-Chen Cheng I may have broken binary compat in the encoding handler registration scheme, but that was so utterly broken I don't expect anybody to have used this feature until now. * parserInternals.h: fixup on the CHAR range macro * xml-error.h, parser.c: catch URL/URI errors using the uri.c code. * tree.[ch]: added xmlBufferGrow(), was needed for iconv * uri.c: added xmlParseURI() I can't believe I forgot to implement this one in 2.0 !!! * SAX.c: moved doc->encoding update in the endDocument() call. * TODO: updated. Iconv rules :-) Daniel
Daniel Veillard 361d845d 2000-04-03T19:48:13 Work done on the plane, ready to release libxml2-2.0.0, Daniel
Daniel Veillard cf46199c 2000-03-14T18:30:20 This is the 2.0.0-beta, lots and lots and lots of changes Have a look at http://xmlsoft.org/upgrade.html Daniel
Daniel Veillard 71b656e0 2000-01-05T14:46:17 - added xmlRemoveID() and xmlRemoveRef() - added check and handling when possibly removing an ID - fixed some entities problems - added xmlParseTryOrFinish() - changed the way struct aredeclared to allow gtk-doc to expose those - closed #4960 - fixes to libs detection from Albert Chin-A-Young - preparing 1.8.3 release Daniel
Daniel Veillard a819dace 1999-11-24T18:04:22 Added cleanup routines, cleanup with -pedantic on linux, closed #3788, Daniel
Daniel Veillard b96e6438 1999-08-29T21:02:19 Release 1.6, lot of fixes, more validation, code cleanup, added namespace on attributes, Daniel.
Daniel Veillard 14fff064 1999-06-22T21:49:07 Big changes, seems that 1.2.0 wasn't commited, here is 1.3.0, Daniel
Daniel Veillard 011b63cb 1999-06-02T17:44:04 Release of libxml-1.1, Daniel.
Daniel Veillard 27d88744 1999-05-29T11:51:49 CORBA defines fixes, char encoding atodetection, Daniel
Daniel Veillard 39a1f9a3 1999-01-17T19:11:59 Speed, conformance testing, more parsing, general improvements, Daniel.
Daniel Veillard 891e404a 1998-10-19T00:43:02 Added the UTF-8, UTF-16 and ISO Lat 1 conversion routines, not yet used, Daniel.