encoding.c


Log

Author Commit Date CI Message
Nick Wellnhofer ad338ca7 2022-09-01T01:18:30 Remove explicit integer casts Remove explicit integer casts as final operation - in assignments - when passing arguments - when returning values Remove casts - to the same type - from certain range-bound values The main motivation is that these explicit casts don't change the result of operations and only render UBSan's implicit-conversion checks useless. Removing these casts allows UBSan to detect cases where truncation or sign-changes occur unexpectedly. Document some explicit casts as truncating and add a few missing ones.
Nick Wellnhofer 0f568c0b 2022-08-26T01:22:33 Consolidate private header files Private functions were previously declared - in header files in the root directory - in public headers guarded with IN_LIBXML - in libxml.h - redundantly in source files that used them. Consolidate all private header files in include/private.
David Kilzer c14cac8b 2022-05-25T18:13:07 xmlBufAvail() should return length without including a byte for NUL terminator * buf.c: (xmlBufAvail): - Return the number of bytes available in the buffer, but do not include a byte for the NUL terminator so that it is reserved. * encoding.c: (xmlCharEncFirstLineInput): (xmlCharEncInput): (xmlCharEncOutput): * xmlIO.c: (xmlOutputBufferWriteEscape): - Remove code that subtracts 1 from the return value of xmlBufAvail(). It was implemented inconsistently anyway.
David Kilzer 21561e83 2016-05-20T15:21:43 Mark more static data as `const` Similar to 8f5710379, mark more static data structures with `const` keyword. Also fix placement of `const` in encoding.c. Original patch by Sarah Wilkin.
Nick Wellnhofer 40483d0c 2022-03-06T13:55:48 Deprecate module init and cleanup functions These functions shouldn't be part of the public API. Most init functions are only thread-safe when called from xmlInitParser. Global variables should only be cleaned up by calling xmlCleanupParser.
Nick Wellnhofer f2072a8b 2022-03-05T18:23:34 Fix memory leak in xmlFindCharEncodingHandler Fix memory leak in an unlikely error condition. Thanks to Wentao Liang for the report. Fixes #342.
Nick Wellnhofer 21ddad52 2022-03-04T01:07:40 Remove ICONV_CONST test We can simply cast the offending pointer to (void *).
Nick Wellnhofer 776d15d3 2022-03-02T00:29:17 Don't check for standard C89 headers Don't check for - ctype.h - errno.h - float.h - limits.h - math.h - signal.h - stdarg.h - stdlib.h - string.h - time.h Stop including non-standard headers - malloc.h - strings.h
Nick Wellnhofer b66ce0bb 2022-03-01T12:39:02 Don't include ICU headers in public headers There's no need to make these implementation details public.
Nick Wellnhofer c41bc10d 2022-02-22T19:57:12 Fix unused variable warnings with disabled features
Nick Wellnhofer 346c3a93 2022-02-20T18:46:42 Remove elfgcchack.h The same optimization can be enabled with -fno-semantic-interposition since GCC 5. clang has always used this option by default.
Nick Wellnhofer 7abc6e6a 2022-01-25T02:27:53 Fix integer conversion warning in xmlIconvWrapper Use size_t for return value of iconv(3) to avoid an UBSan integer conversion warning.
Mohammad Razavi eb4c1bf8 2021-11-03T09:48:13 Fix random dropping of characters on dumping ASCII encoded XML Fix a bug in xmlCharEncOutput return value which will cause xmlNodeDumpOutput to drop characters randomly. xmlCharEncOutput returns zero if the length of the input buffer is zero but ignores the fact that it may have already encoded the input buffer and the input's length is zero due to the fact that xmlEncOutputChunk returned -2 errors and underlying code tries to fix the error by encoding the input. xmlCharEncOutput is collecting the number of bytes written to the output buffer but is returning zero instead of the total number of bytes in this situation. This commit will fix this issue by returning the total number of bytes instead. So the xmlNodeDumpOutput will also continue writing and will not stop due to the fact that it mistakenly thinks the output buffer is not changed in that iteration. Fixes #314
David Kilzer 03bb9293 2021-07-07T18:23:18 Fix parse failure when 4-byte character in UTF-16 BE is split across a chunk This makes the logic in UTF16BEToUTF8() match UTF16LEToUTF8(). * encoding.c: (UTF16LEToUTF8): - Fix comment to describe what the code does. (UTF16BEToUTF8): - Fix undefined behavior which was applied to UTF16LEToUTF8() in 2f9382033e. - Add bounds check to while() loop which was applied to UTF16LEToUTF8() in be803967db. - Do not return -2 when (in >= inend) to fix the bug. This was applied to UTF16LEToUTF8() in 496a1cf592. - Inline (<< 8) statements to match UTF16LEToUTF8(). Add the following tests and results: test/text-4-byte-UTF-16-BE-offset.xml test/text-4-byte-UTF-16-BE.xml test/text-4-byte-UTF-16-LE-offset.xml test/text-4-byte-UTF-16-LE.xml
David King b92b16f6 2021-05-19T10:15:54 Remove unused variable in xmlCharEncOutFunc Fixes a compiler warning: encoding.c: In function 'xmlCharEncOutFunc__internal_alias': encoding.c:2632:9: warning: unused variable 'output' [-Wunused-variable] 2632 | int output = 0; https://gitlab.gnome.org/GNOME/libxml2/-/issues/254
Nick Wellnhofer dcb80b92 2021-02-20T20:30:43 Fix slow parsing of HTML with encoding errors Under certain circumstances, the HTML parser would try to guess and switch input encodings multiple times, leading to slow processing of documents with encoding errors. The repeated scanning of the input buffer when guessing encodings could even lead to quadratic behavior. The code htmlCurrentChar probably assumed that if there's an encoding handler, it is guaranteed to produce valid UTF-8. This holds true in general, but if the detected encoding was "UTF-8", the UTF8ToUTF8 encoding handler simply invoked memcpy without checking for invalid UTF-8. This still must be fixed, preferably by not using this handler at all. Also leave a note that switching encodings twice seems impossible to implement correctly. Add a check when handling UTF-8 encoding errors in htmlCurrentChar to avoid this situation, even if encoders produce invalid UTF-8. Found by OSS-Fuzz.
Xiaoming Ni 649d02ea 2020-12-07T20:19:53 encoding: fix memleak in xmlRegisterCharEncodingHandler() The return type of xmlRegisterCharEncodingHandler() is void. The invoker cannot determine whether xmlRegisterCharEncodingHandler() is executed successfully. when nbCharEncodingHandler >= MAX_ENCODING_HANDLERS, the "handler" is not added to the array "handlers". As a result, the memory of "handler" cannot be managed and released: memory leakage. so add "xmlfree(handler)" to fix memory leakage on the failure branch of xmlRegisterCharEncodingHandler(). Reported-by: wuqing <wuqing30@huawei.com> Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Frederik Seiffert b516ed18 2020-11-12T12:53:43 Fix building with ICU 68. ICU 68 no longer defines the TRUE macro. Closes #204.
Nick Wellnhofer 1e41e4fa 2020-06-30T02:43:57 Fix return values and documentation in encoding.c Make xmlEncInputChunk and xmlEncOutputChunk return 0 on success and never a positive value. Make xmlCharEncFirstLineInt, xmlCharEncFirstLineInt and xmlCharEncOutFunc return the number of bytes written.
Nick Wellnhofer 2f938203 2020-06-15T15:45:47 Fix undefined behavior in UTF16LEToUTF8 Don't perform arithmetic on null pointer. Found with libFuzzer and UBSan.
Nick Wellnhofer a697ed1e 2020-06-15T14:49:22 Fix return value of xmlCharEncOutput Commit 407b393d introduced a regression caused by xmlCharEncOutput returning 0 in case of success instead of the number of bytes written. Always use its return value for nbchars in xmlOutputBufferWrite. Fixes #166.
Nick Wellnhofer 20c60886 2020-03-08T17:19:42 Fix typos Resolves #133.
Jared Yanovich 2a350ee9 2019-09-30T17:04:54 Large batch of typo fixes Closes #109.
Andrey Bienkowski d2293cdb 2018-01-30T15:04:11 Remove a misleading line from xmlCharEncOutput Closes: https://bugzilla.gnome.org/show_bug.cgi?id=793028 It seems this line was accidentally copied over from xmlCharEncOutFunc. In xmlCharEncOutput output is a pointer so incrementing it by ret can point it where it wasn't supposed to be pointing. Luckily the current implementation doesn't dereference the pointer after advancing it. Signed-off-by: Daniel Veillard <veillard@redhat.com>
Nick Wellnhofer 772c0648 2017-11-09T17:56:31 Fix unused parameter warning without ICU
Joel Hockey 0b19f236 2017-10-25T18:11:12 Fixed ICU to set flush correctly and provide pivot buffer. By always setting flush=TRUE when doing multiple reads, ICU will not correctly handle truncated utf8 chars across read boundaries. The fix is to set flush=TRUE only on final read, and to provide a pivot buffer which is maintained by libxml between calls to ucnv_convertEx.
Nick Wellnhofer e5107772 2017-06-19T15:32:56 Fix pathological performance when outputting charrefs If a character can't be represented in the output encoding, it is converted to a character reference. This used to replace the character in the input stream by calling xmlBufAddHead or xmlBufferAddHead. These functions shifted the entire input array around, leading to quadratic performance when converting a run of non-representable characters. This is most pronounced when dumping to memory. Output the charref directly instead. Found with libFuzzer.
Nick Wellnhofer c9ccbd6a 2017-06-19T14:57:43 Deduplicate code in encoding.c Introduce static functions xmlEncInputChunk and xmlEncOutputChunk that handle the internal/iconv/ICU branching.
David Kilzer 4472c3a5 2016-05-13T15:13:17 Fix some format string warnings with possible format string vulnerability For https://bugzilla.gnome.org/show_bug.cgi?id=761029 Decorate every method in libxml2 with the appropriate LIBXML_ATTR_FORMAT(fmt,args) macro and add some cleanups following the reports.
Gaurav 080a22c5 2013-11-29T23:10:50 Avoid a possibility of dangling encoding handler For https://bugzilla.gnome.org/show_bug.cgi?id=711149 In function: int xmlCharEncCloseFunc(xmlCharEncodingHandler *handler) If the freed handler is one of the entries in the handlers[i] list, then that handlers[i] entry is left dangling. This may lead to crashes at places where handlers is read.
Denis Pauk e28c8a1a 2013-08-03T14:22:54 #705267 - add additional defines checks for support "./configure --with-minimum" https://bugzilla.gnome.org/show_bug.cgi?id=705267
Daniel Veillard bf058dce 2013-02-13T18:19:42 Fix the flushing out of raw buffers on encoding conversions https://bugzilla.gnome.org/show_bug.cgi?id=692915 the new set of converting functions tried to limit the encoding conversion of the raw buffer to the consumption one to work in a more progressive fashion. Unfortunately this was bad for performances and led to errors on progressive parsing when a very large chunk was close to the end of the document. Fix the new internal function and switch back to the old way of converting. Fix another bug in the process.
Petr Sumbera 6f49c73b 2012-12-12T15:41:30 Try IBM-037 when looking for EBCDIC handlers http://en.wikipedia.org/wiki/EBCDIC_037 as it is another variant of EBCDIC
Daniel Veillard f8e3db04 2012-09-11T13:26:36 Big space and tab cleanup Remove all space before tabs and space and tabs at end of lines.
Daniel Veillard 28cc42d0 2012-08-10T10:00:18 Regenerating docs and API files Various cleanups * configure.in: force regeneration of APIs in my environment * buf.c buf.h enc.h encoding.c include/libxml/tree.h include/libxml/xmlerror.h save.h tree.c: various comment cleanups pointed by apibuild * doc/apibuild.py: added the 3 new internal headers in the excludes * doc/libxml2-api.xml doc/libxml2-refs.xml: regenerated the API * doc/symbols.xml: listing new entry points for 2.9.0 * doc/devhelp/*: regenerated
Daniel Veillard 18d0db25 2012-07-13T19:51:15 Adding new encoding function to deal with the new structures * encoding.c: adds xmlCharEncFirstLineInput, xmlCharEncInput and xmlCharEncOutput * enc.h: the functions are not made public but added to this new header
Timothy Elliott 689408bd 2012-05-08T22:03:22 Prevent an infinite loop when dumping a node with encoding problems When a node is dumped with a new encoding, we may encounter characters that are not supported in the new encoding. libxml2 handles this by replacing the character with character references, but in some encodings this can result in an infinite loop when the character references themselves contain unsupported characters. This fixes the infinite loop by undoing a character reference substitution when it cannot be inserted, and returning an encoder error. This bug was noticed when looking into an infinite loop bug report for the Ruby Nokogiri project. The original bug report, "nokogiri process hangs on call to inner_html" is here: https://github.com/tenderlove/nokogiri/issues/400
Daniel Veillard 69f04562 2011-08-19T11:05:04 Fix an off by one error in encoding this off by one error doesn't seem to reproduce on linux but the error is real.
Giuseppe Iuculano 48f7dcb7 2010-11-04T17:42:42 480323 add code to plug in ICU converters by default This is not configured in by default but after some serious massaging incorporate that patch from Chromium/Chrome.
Daniel Veillard ad4f0a2d 2010-11-03T20:40:46 630140 better fix for iso995x encoding error Changing semantic of xmlCharEncInFunc() wasn't the proper way to do this, better change UTF8ToISO8859x() appropriately
Daniel Veillard 1cc912ec 2010-11-03T19:26:35 Various cleanups on encoding handling Done while chasing previous bug
Daniel Veillard 083caf5e 2010-11-03T19:24:05 630140 fix iso995x encoding error https://bugzilla.gnome.org/show_bug.cgi?id=630140 Fix the bug, which happen when using the embedded converters and not iconv
Daniel Veillard d44b9364 2009-09-07T12:15:08 A few more safety cleanup raised by scan * SAX2.c encoding.c parser.c xmlschemas.c: a few more safety checks * relaxng.c: remove an unused initialization
Daniel Veillard 76d36458 2009-09-07T11:19:33 Fixing assorted potential problems raised by scan * encoding.c parser.c relaxng.c runsuite.c tree.c xmlreader.c xmlschemas.c: nothing really serious but better safe than sorry
Daniel Veillard 7e385bd4 2009-08-26T11:38:49 566012 autodetected encoding and encoding conflict * encoding.c parser.c parserInternals.c: when we autodetect an encoding but it's actually not completely compatible with the one declared great care must be taken to not convert more than just the first line. Led to some refactoring, more private functions and a bit of cleanup.
Martin Kögler c78988ac 2009-08-24T16:47:48 566012 Incomplete EBCDIC parsing support * encoding.c: the iconv converter is sometimes only found as "EBCDIC-US"
Daniel Veillard e83e93e7 2008-08-30T12:52:26 make a new kind of buffer where shrinking and adding in head can avoid * include/libxml/tree.h tree.c: make a new kind of buffer where shrinking and adding in head can avoid reallocation or full buffer memmoves * encoding.c xmlIO.c: use the new kind of buffers for output buffers Daniel svn path=/trunk/; revision=3787
Daniel Veillard f124539f 2008-04-03T09:46:34 buffer may not be large enough to convert to UCS4, patch from Christian * encoding.c: buffer may not be large enough to convert to UCS4, patch from Christian Fruth , fixes #504015 Daniel svn path=/trunk/; revision=3727
Daniel Veillard 57c9db07 2008-03-06T14:37:10 problem with encoding detection for UTF-16 reported by Ashwin and found by * encoding.c: problem with encoding detection for UTF-16 reported by Ashwin and found by Bill * test/valid/dtds/utf16b.ent test/valid/dtds/utf16l.ent test/valid/UTF16Entity.xml result/valid/UTF16Entity.xml*: added the example to the regression tests Daniel svn path=/trunk/; revision=3700
Daniel Veillard 8e1a46d5 2008-02-15T07:47:26 patch from Roumen Petrov to detect if iconv() needs a const for the second * config.h.in configure.in encoding.c: patch from Roumen Petrov to detect if iconv() needs a const for the second parameter Daniel svn path=/trunk/; revision=3693
William M. Brack 38d452ac 2007-05-22T16:00:06 Fixed typo in xmlCharEncFirstLine pointed out by Mark Rowe (bug #440159) * encoding.c: Fixed typo in xmlCharEncFirstLine pointed out by Mark Rowe (bug #440159) * include/libxml/xmlversion.h.in: Added check for definition of _POSIX_C_SOURCE to avoid warnings on Apple OS/X (patch from Wendy Doyle and Mark Rowe, bug #346675) * schematron.c, testapi.c, tree.c, xmlIO.c, xmlsave.c: minor changes to fix compilation warnings - no change to logic. svn path=/trunk/; revision=3618
Daniel Veillard 28aac0b0 2006-10-16T08:31:18 remove a warning check with uppercase for AIX iconv() should fix #352644 * HTMLparser.c: remove a warning * encoding.c: check with uppercase for AIX iconv() should fix #352644 * doc/examples/Makefile.am: partially handle one bug report Daniel
Daniel Veillard df750627 2006-05-02T12:24:06 fixing bug #340398 xmlCharEncOutFunc writing to input buffer Daniel * encoding.c: fixing bug #340398 xmlCharEncOutFunc writing to input buffer Daniel
Daniel Veillard aac7c68e 2006-03-10T13:40:16 fix a few warning raised by gcc-4.1 and latests changes Daniel * c14n.c encoding.c xmlschemas.c xpath.c xpointer.c: fix a few warning raised by gcc-4.1 and latests changes Daniel
Daniel Veillard 2728f845 2006-03-09T16:49:24 more cleanups based on coverity reports. Daniel * SAX2.c catalog.c encoding.c entities.c example/gjobread.c python/libxml.c: more cleanups based on coverity reports. Daniel
Daniel Veillard 2e7598cb 2005-09-02T12:28:34 avoid passing a char[] as snprintf first argument. implemented * encoding.c parserInternals.c: avoid passing a char[] as snprintf first argument. * threads.c include/libxml/threads.h: implemented xmlIsThreadsEnabled() based on Andrew W. Nosenko idea. * doc/* elfgcchack.h: regenerated the API Daniel
Daniel Veillard 2644ab27 2005-08-24T14:22:55 applied the patch suggested #309565 which can avoid looping in error * encoding.c: applied the patch suggested #309565 which can avoid looping in error conditions. Daniel
Daniel Veillard 1fc3ed02 2005-08-24T12:46:09 finally converted the encoding module to the common error reporting * encoding.c error.c include/libxml/xmlerror.h: finally converted the encoding module to the common error reporting mechanism * doc/* doc/html/libxml-xmlerror.html: rebuilt Daniel
Daniel Veillard 24505b0f 2005-07-28T23:49:35 a lot of small cleanups based on Linus' sparse check output. Daniel * HTMLparser.c SAX2.c encoding.c globals.c parser.c relaxng.c runsuite.c runtest.c schematron.c testHTML.c testReader.c testRegexp.c testSAX.c testThreads.c valid.c xinclude.c xmlIO.c xmllint.c xmlmodule.c xmlschemas.c xpath.c xpointer.c: a lot of small cleanups based on Linus' sparse check output. Daniel
Daniel Veillard 5d4644ef 2005-04-01T13:11:58 revamped the elfgcchack.h format to cope with gcc4 change of aliasing * doc/apibuild.py doc/elfgcchack.xsl: revamped the elfgcchack.h format to cope with gcc4 change of aliasing allowed scopes, had to add extra informations to doc/libxml2-api.xml to separate the header from the c module source. * *.c: updated all c library files to add a #define bottom_xxx and reimport elfgcchack.h thereafter, and a bit of cleanups. * doc//* testapi.c: regenerated when rebuilding the API Daniel
Daniel Veillard 394902e0 2005-03-31T08:43:44 fix uninitialized variable in not frequently used code bug #172182 Daniel * encoding.c: fix uninitialized variable in not frequently used code bug #172182 Daniel
Daniel Veillard cffc1c7a 2005-03-12T18:54:55 removed a static buffer in xmlByteConsumed(), as pointed by Ben Maurer, * encoding.c: removed a static buffer in xmlByteConsumed(), as pointed by Ben Maurer, fixes #170086 * xmlschemas.c: remove a potentially uninitialized pointer warning Daniel
Daniel Veillard 56de87ee 2005-02-16T00:22:29 fix the comment to describe the real return values lot of work on the * encoding.c: fix the comment to describe the real return values * pattern.c xpath.c include/libxml/pattern.h: lot of work on the patterns, pluggin in the XPath default evaluation, but disabled right now because it's not yet good enough for XSLT. pattern.h streaming API are likely to be changed to handle relative and absolute paths in the same expression. Daniel
Daniel Veillard aba37dff 2004-11-11T20:42:04 forgot a $(srcdir) stupid error wrong name #157976 Daniel * Makefile.am: forgot a $(srcdir) * encoding.c: stupid error wrong name #157976 Daniel
Daniel Veillard 01ca83cd 2004-11-06T13:26:59 fixed a regression in iconv support. Daniel * encoding.c: fixed a regression in iconv support. Daniel
Daniel Veillard ce682bc2 2004-11-05T17:22:25 autogenerate a minimal NULL value sequence for unknown pointer types This * gentest.py testapi.c: autogenerate a minimal NULL value sequence for unknown pointer types * HTMLparser.c SAX2.c chvalid.c encoding.c entities.c parser.c parserInternals.c relaxng.c valid.c xmlIO.c xmlreader.c xmlsave.c xmlschemas.c xmlschemastypes.c xmlstring.c xpath.c xpointer.c: This uncovered an impressive amount of entry points not checking for NULL pointers when they ought to, closing all the open gaps. Daniel
Daniel Veillard 05f9735b 2004-10-31T15:35:32 Fixed bug #153937, making sure the conversion functions return the number * encoding.c doc/examples/testWriter.c: Fixed bug #153937, making sure the conversion functions return the number of byte written. Had to fix one of the examples. Daniel
William M. Brack 13dfa87e 2004-09-18T04:52:08 added the routine xmlNanoHTTPContentLength to the external API * nanohttp.c, include/libxml/nanohttp.h: added the routine xmlNanoHTTPContentLength to the external API (bug151968). * parser.c: fixed unnecessary internal error message (bug152060); also changed call to strncmp over to xmlStrncmp. * encoding.c: fixed compilation warning (bug152307). * tree.c: fixed segfault in xmlCopyPropList (bug152368); fixed a couple of compilation warnings. * HTMLtree.c, debugXML.c, xmlmemory.c: fixed a few compilation warnings; no change to logic.
William M. Brack f54924bd 2004-09-09T14:35:17 applied fixes for the UTF8ToISO8859x transcoding routine suggested by Mark * encoding.c: applied fixes for the UTF8ToISO8859x transcoding routine suggested by Mark Itzcovitz
William M. Brack a3215c7a 2004-07-31T16:24:01 many further little changes for OOM problems. Now seems to be getting * SAX2.c, encoding.c, error.c, parser.c, tree.c, uri.c, xmlIO.c, xmlreader.c, include/libxml/tree.h: many further little changes for OOM problems. Now seems to be getting closer to "ok". * testOOM.c: added code to intercept more errors, found more problems with library. Changed method of flagging / counting errors intercepted.
Daniel Veillard b5da42af 2004-02-21T14:57:44 small patch to try to fix a warning with Sun One compiler Daniel * encoding.c: small patch to try to fix a warning with Sun One compiler Daniel
Daniel Veillard 3288882e 2004-02-21T14:21:50 small patch removing a warning with MS compiler. Daniel * encoding.c: small patch removing a warning with MS compiler. Daniel
Daniel Veillard 3671190b 2004-02-11T13:25:26 added xmlByteConsumed() interface updated the benchmark rebuilt the docs * parserInternals.c xmlIO.c encoding.c include/libxml/parser.h include/libxml/xmlIO.h: added xmlByteConsumed() interface * doc/*: updated the benchmark rebuilt the docs * python/tests/Makefile.am python/tests/indexes.py: added a specific regression test for xmlByteConsumed() * include/libxml/encoding.h rngparser.c tree.c: small cleanups Daniel
William M. Brack 030a7a17 2004-02-10T12:48:57 applied patch supplied by Christophe Dubach to fix problem with * encoding.c: applied patch supplied by Christophe Dubach to fix problem with --with-minimum configuration (bug 133773) * nanoftp.c: fixed potential buffer overflow problem, similar to fix just applied to nanohttp.c.
Daniel Veillard 182d32a5 2004-02-09T12:42:55 applied a small patch from Alfred Mickautsch to avoid an out of bound * encoding.c: applied a small patch from Alfred Mickautsch to avoid an out of bound error in isolat1ToUTF8() Daniel
William M. Brack a2e844a3 2004-01-06T11:52:13 moved string and UTF8 routines out of parser.c and encoding.c into a new * encoding.c, parser.c, xmlstring.c, Makefile.am, include/libxml/Makefile.am, include/libxml/catalog.c, include/libxml/chvalid.h, include/libxml/encoding.h, include/libxml/parser.h, include/libxml/relaxng.h, include/libxml/tree.h, include/libxml/xmlwriter.h, include/libxml/xmlstring.h: moved string and UTF8 routines out of parser.c and encoding.c into a new module xmlstring.c with include file include/libxml/xmlstring.h mostly using patches from Reid Spencer. Since xmlChar now defined in xmlstring.h, several include files needed to have a #include added for safety. * doc/apibuild.py: added some additional sorting for various references displayed in the APIxxx.html files. Rebuilt the docs, and also added new file for xmlstring module. * configure.in: small addition to help my testing; no effect on normal usage. * doc/search.php: added $_GET[query] so that persistent globals can be disabled (for recent versions of PHP)
William M. Brack f9415e49 2003-11-28T09:39:10 Enhanced the handling of UTF-16, UTF-16LE and UTF-16BE encodings. Now * encoding.c, include/libxml/encoding.h: Enhanced the handling of UTF-16, UTF-16LE and UTF-16BE encodings. Now UTF-16 output is handled internally by default, with proper BOM and UTF-16LE encoding. Native UTF-16LE and UTF-16BE encoding will not generate a BOM on output, and will be automatically recognized on input. * test/utf16lebom.xml, test/utf16bebom.xml, result/utf16?ebom*: added regression tests for above.
Daniel Veillard d0c9c32f 2003-10-10T00:49:42 cleanup fix a funny typo converted the Schemas code to the new error * Makefile.am: cleanup * encoding.c: fix a funny typo * error.c xmlschemas.c xmlschemastypes.c include/libxml/xmlerror.h: converted the Schemas code to the new error handling. PITA, still need to check output from regression tests. Daniel
Daniel Veillard a9cce9cd 2003-09-29T13:20:24 Okay this is scary but it is just adding a configure option to disable * HTMLtree.c SAX2.c c14n.c catalog.c configure.in debugXML.c encoding.c entities.c nanoftp.c nanohttp.c parser.c relaxng.c testAutomata.c testC14N.c testHTML.c testRegexp.c testRelax.c testSchemas.c testXPath.c threads.c tree.c valid.c xmlIO.c xmlcatalog.c xmllint.c xmlmemory.c xmlreader.c xmlschemas.c example/gjobread.c include/libxml/HTMLtree.h include/libxml/c14n.h include/libxml/catalog.h include/libxml/debugXML.h include/libxml/entities.h include/libxml/nanohttp.h include/libxml/relaxng.h include/libxml/tree.h include/libxml/valid.h include/libxml/xmlIO.h include/libxml/xmlschemas.h include/libxml/xmlversion.h.in include/libxml/xpathInternals.h python/libxml.c: Okay this is scary but it is just adding a configure option to disable output, this touches most of the files. Daniel
William M. Brack 7b9154b0 2003-09-27T19:23:50 further (final?) minor changes for compilation warnings. No change to * encoding.c, parser.c, relaxng.c: further (final?) minor changes for compilation warnings. No change to logic.
William M. Brack 7a82165d 2003-08-15T07:27:40 Minor changes to comments, etc. for improving documentation generation * encoding.c, threads.c, include/libxml/HTMLparser.h, doc/libxml2-api.xml: Minor changes to comments, etc. for improving documentation generation * doc/Makefile.am: further adjustment to auto-generation of win32/libxml2.def.src
Daniel Veillard ab1ae3a7 2003-08-14T12:19:54 applied UTF-16 encoding handling patch provided by Mark Itzcovitz more * encoding.c: applied UTF-16 encoding handling patch provided by Mark Itzcovitz * encoding.c parser.c: more cleanup and fixes for UTF-16 when not having iconv support. Daniel
William M. Brack 16db7b6e 2003-08-07T13:12:49 further small changes for warnings when configured with --with-iconv=no * encoding.c: further small changes for warnings when configured with --with-iconv=no
Daniel Veillard 01fc1a9d 2003-07-30T15:12:01 applying patch from Peter Jacobi to added ISO-8859-x encoding support when * encoding.c: applying patch from Peter Jacobi to added ISO-8859-x encoding support when iconv is not available * configure.in include/libxml/xmlversion.h.in include/libxml/xmlwin32version.h.in: added the glue needed at the configure level and made it the default for Windows Daniel
Daniel Veillard 9ff7de14 2003-07-29T13:30:42 fix the previous commit Daniel * encoding.c: fix the previous commit Daniel
William M. Brack 4a557d97 2003-07-29T04:28:04 fixed problem with comments reported by Nick Kew added routines * HTMLparser.c: fixed problem with comments reported by Nick Kew * encoding.c: added routines xmlUTF8Size and xmlUTF8Charcmp for some future cleanup of UTF8 handling
Daniel Veillard 8caa9c2c 2003-06-02T13:35:24 small fix fixed an error message Daniel * encoding.c: small fix * xmlIO.c: fixed an error message Daniel
Daniel Veillard 3c908dca 2003-04-19T00:07:51 added xmlMallocAtomic() to be used when allocating blocks which do not * DOCBparser.c HTMLparser.c c14n.c catalog.c encoding.c globals.c nanohttp.c parser.c parserInternals.c relaxng.c tree.c uri.c xmlmemory.c xmlreader.c xmlregexp.c xpath.c xpointer.c include/libxml/globals.h include/libxml/xmlmemory.h: added xmlMallocAtomic() to be used when allocating blocks which do not contains pointers, add xmlGcMemSetup() and xmlGcMemGet() to allow registering the full set of functions needed by a garbage collecting allocator like libgc, ref #109944 Daniel
Igor Zlatkovic 73267db5 2003-03-08T13:29:24 applied Gennady's patch against buffer overrun
Daniel Veillard 809faa52 2003-02-10T15:43:53 fixing bug #104646 about iconv based encoding conversion when the input * encoding.c xmlIO.c: fixing bug #104646 about iconv based encoding conversion when the input buffer stops in the middle of a multibyte char Daniel
Daniel Veillard 81601f98 2003-01-14T13:42:37 fixing bug #103100 with a dummy UTF8ToUTF8 copy Daniel * encoding.c: fixing bug #103100 with a dummy UTF8ToUTF8 copy Daniel
Daniel Veillard 01c13b5b 2002-12-10T15:19:08 code cleanup, especially the function comments. fixed a small bug when * DOCBparser.c HTMLparser.c c14n.c debugXML.c encoding.c hash.c nanoftp.c nanohttp.c parser.c parserInternals.c testC14N.c testDocbook.c threads.c tree.c valid.c xmlIO.c xmllint.c xmlmemory.c xmlreader.c xmlregexp.c xmlschemas.c xmlschemastypes.c xpath.c: code cleanup, especially the function comments. * tree.c: fixed a small bug when freeing nodes which are XInclude ones. Daniel
Daniel Veillard d076a20e 2002-11-20T13:28:31 fixed #99082 for xi:include encoding="..." support on text includes. added * xinclude.c parserInternals.c encoding.c: fixed #99082 for xi:include encoding="..." support on text includes. * result/XInclude/tstencoding.xml test/XInclude/docs/tstencoding.xml test/XInclude/ents/isolatin.txt : added a specific regression test * python/generator.py python/libxml2class.txt: fixed the generator the new set of comments generated for doc/libxml2-api.xml were breaking the python generation. Daniel
Daniel Veillard f000f073 2002-10-22T14:28:17 made xmlGetUTF8Char public Daniel * include/libxml/encoding.h encoding.c: made xmlGetUTF8Char public Daniel
Daniel Veillard 6f46f6c5 2002-08-01T12:22:24 Opening the interface xmlNewCharEncodingHandler as requested in #89415 * encoding.c include/libxml/encoding.h: Opening the interface xmlNewCharEncodingHandler as requested in #89415 * python/generator.py python/setup.py.in: applied cleanup patches from Marc-Andre Lemburg * tree.c: fixing bug #89332 on a specific case of loosing the XML-1.0 namespace on xml:xxx attributes Daniel
Aleksey Sanin 49cc9756 2002-06-14T17:07:10 replaced sprintf() with snprintf() to prevent possible buffer overflow * DOCBparser.c HTMLparser.c debugXML.c encoding.c nanoftp.c nanohttp.c parser.c tree.c uri.c xmlIO.c xmllint.c xpath.c: replaced sprintf() with snprintf() to prevent possible buffer overflow (the bug was pointed out by Anju Premachandran)
Daniel Veillard e72c7563 2002-05-31T09:47:30 another performance patch from Peter Jacobi, that time on parsing * parser.c: another performance patch from Peter Jacobi, that time on parsing attribute values. Daniel
Daniel Veillard db552915 2002-03-21T13:27:59 fixed a bug in the ISO-Latin 1 to UTF8 encoder raised by Morus Walter * encoding.c: fixed a bug in the ISO-Latin 1 to UTF8 encoder raised by Morus Walter Daniel
Daniel Veillard 34ce8bec 2002-03-18T19:37:11 preparing 2.4.18 updated and rebuilt the web site implement the new * configure.in: preparing 2.4.18 * doc/*: updated and rebuilt the web site * *.c libxml.h: implement the new IN_LIBXML scheme discussed with the Windows and Cygwin maintainers. * parser.c: humm, changed the way the SAX parser work when xmlSubstituteEntitiesDefault(1) is set, it will then do the entity registration and loading by itself in case the user provided SAX getEntity() returns NULL. * testSAX.c: added --noent to test the behaviour. Daniel
Daniel Veillard 73c6e53a 2002-01-08T13:15:33 Paul Keogh pointed out a possibility of segfault on repeated * encoding.c: Paul Keogh pointed out a possibility of segfault on repeated xmlAddEncodingAlias() / xmlCleanupEncodingAlias(). Closes bug # 68238 Daniel