xmlsave.c


Log

Author Commit Date CI Message
Nick Wellnhofer 7bd8d1d9 2025-05-28T15:53:38 doc: Prefix autolinks with '#' Use `#func` instead of `func()` to ignore parameters and make all autolinks work.
Nick Wellnhofer adfbeb7e 2025-05-14T04:58:21 doc: Stop using *Ptr typedefs in documentation
Nick Wellnhofer a40f36e7 2025-05-14T04:04:28 include: Stop using *Ptr typedefs in public headers
Nick Wellnhofer fcb7a777 2025-05-13T22:38:15 io: Make xmlOutputBufferCreate* not free encoder on error Revert a530ff12 which was an inadvertent API change.
Nick Wellnhofer 39ae5d12 2025-05-12T21:04:41 save: Add NULL check in xmlBufDumpEntityContent Short-lived regression.
Nick Wellnhofer f602c0c1 2025-05-12T00:04:22 html: Rework serialization of meta encoding attributes Don't allocate memory.
Nick Wellnhofer 5f8ebc88 2025-05-10T00:56:18 save: Avoid xmlOutputBufferWriteQuotedString xmlOutputBufferWriteQuotedString should be reserved for things like system IDs.
Nick Wellnhofer 777e2adf 2025-05-09T23:53:03 io: Consolidate escaping code Use generated table approach of xmlSerializeText for xmlEscapeText. Move most code to xmlIO.c.
Nick Wellnhofer dad11630 2025-05-09T22:05:38 entities: Always replace invalid chars when escaping The previous refactor painstakingly recreated the different behavior of separate functions that were merged. It makes Optimize IS_CHAR check for non-ASCII chars.
Nick Wellnhofer c8cea39d 2025-05-09T21:31:07 save: Fix serialization of attribute defaults containing < Long-standing bug that produced invalid XML.
Nick Wellnhofer 971038e5 2025-05-09T20:26:33 html: Call lower-level escaping functions Removes the need to pass a document around.
Nick Wellnhofer 442c1903 2025-05-09T18:52:36 doc: Fix some damage from automated conversions Add some newlines, fix returns.
Nick Wellnhofer 46f05ea4 2025-05-09T00:21:47 html: Rework meta charset handling Don't use encoding from meta tags when serializing. Only use the value in `doc->encoding`, matching the XML serializer. This is the actual encoding used when parsing. Stop modifying the input document by setting meta tags before serializing. Meta tags are now injected during serialization. Add full support for <meta charset=""> which is also used when adding meta tags. Align with HTML5 and implement the "algorithm for extracting a character encoding from a meta element". Only modify the encoding substring in Content-Type meta tags. Only switch encoding once when parsing. Fix htmlSaveFileFormat with a NULL encoding not to declare a misleading UTF-8 charset. Fixes #909.
Nick Wellnhofer 9bbffec5 2025-05-06T17:42:46 doc: Move brief to top, params to bottom of doc comments
Nick Wellnhofer b1685459 2025-05-06T12:50:52 doc: Misc fixes to xmlsave docs
Nick Wellnhofer cb1635a6 2025-05-02T19:05:25 doc: Use @since command
Nick Wellnhofer e6d6fa6f 2025-05-02T17:23:30 doc: Fix xmlsave format hint Don't recommend deprecated symbols.
Nick Wellnhofer f7c41287 2025-05-02T15:57:17 doc: Remove more comment block headers
Nick Wellnhofer e525564f 2025-05-01T19:20:06 doc: Remove empty lines at start of block These lines were left over after automatic conversion.
Nick Wellnhofer e549622b 2025-04-28T15:11:24 doc: Convert documentation to Doxygen Automated conversion based on a few regexes.
Nick Wellnhofer 69879da8 2025-04-28T14:04:30 doc: Remove email addresses from documentation Also remove authorship information from generated files, hash.c and globals.c which were rewritten.
Nick Wellnhofer 61890e39 2025-04-27T21:50:15 doc: Prepare for conversion to Doxygen Fix many params in internal functions (not really necessary but Doxygen warns about that in XML mode). Fix formatting in a few corner cases that automatic conversion can't handle. Rearrange some DOC_DISABLE blocks.
Nick Wellnhofer 8a791fdd 2025-04-21T17:31:29 save: Fix xmlDocDump with encoding Short-lived regression.
Nick Wellnhofer 78b123c6 2025-04-20T23:42:44 save: Fix XML escape table Regressed with 2adcde39.
Nick Wellnhofer 5df94fc7 2025-04-20T21:52:03 save: Remove unused struct members
Nick Wellnhofer 8c6c6165 2025-04-20T21:42:51 save: Rework encoding setup Always set up encoding in xmlDocContentDumpOutput. Refactor and simplify some code.
Nick Wellnhofer 936e3d52 2025-04-20T19:25:04 save: Fix xmlSave with NULL encoding Regressed with cc45f618.
Nick Wellnhofer b3492259 2025-03-14T00:01:11 include: Change some return types from int to enum This also affects some new functions from 2.13.
Nick Wellnhofer 9c16a153 2025-02-13T18:41:33 Revert "include: Make most IS_* macros private" This reverts commit 84a6c82ff83d04963d6e1c5cd18ded68ea02d99f.
Nick Wellnhofer ebbc31cc 2025-02-13T12:09:58 malloc-fail: Check for malloc failure in xhtmlNodeDumpOutput
Nick Wellnhofer 84a6c82f 2024-12-19T20:59:10 include: Make most IS_* macros private Macros like IS_DIGIT or IS_LETTER severely pollute the C namespace.
Nick Wellnhofer 0dd910e8 2024-12-18T23:37:35 save: Fix handling of catastrophic errors Don't overwrite catastrophic errors in xmlSaveErr. Overwrite non-catastrophic errors in xmlOutputBufferClose.
Nick Wellnhofer 0160076f 2024-12-17T17:54:20 save: Forward error from closing IO in xmlSaveFinish
Nick Wellnhofer 5e787401 2024-09-10T17:12:25 save: Make xmlEscapeTab signed Fixes issues in platforms where char is unsigned. Fixes #797.
Nick Wellnhofer a530ff12 2024-07-29T14:18:57 io: Always consume encoding handler when creating output buffers Also free encoding handler in error case. Remove xmlAllocOutputBufferInternal which was identical to xmlAllocOutputBuffer.
Nick Wellnhofer bc14d70f 2024-07-25T00:26:48 xmlsave: Improve "unsupported encoding" error message Incomplete support of XML_SAVE_* error codes was removed. Error handling still needs work. xmlOutputBufferCreateFilename should return an error code.
Nick Wellnhofer 5862e9dd 2024-07-18T01:59:25 Add NULL checks Short-lived regression.
Nick Wellnhofer a221cd78 2024-07-07T03:01:51 buf: Rework xmlBuf code Always use what the old implementation called the "IO" allocation scheme, allowing to move the content pointer past the initial allocation. This is inexpensive and allows efficient shrinking. Optimize xmlBufGrow, reusing shrunken memory as much as possible. Simplify xmlBufAdd. Make xmlBufBackToBuffer return an error on overflow. Make "size" exclude the terminating NULL byte. Always provide an initial size. Reintroduce static buffers. Remove xmlBufResize and several other functions.
Nick Wellnhofer 2adcde39 2024-07-12T16:25:05 save: Optimize xmlSerializeText Use lookup tables.
Nick Wellnhofer 1b067082 2024-07-12T15:19:26 save: Always serialize CR as decimal "&#13;" We used to serialize CR as "&#xD;" when there was no encoding and we weren't in an attribute. This was somewhat inconsistent.
Nick Wellnhofer 1cfc5b80 2024-07-12T03:07:57 entities: Rework serialization of numeric character references
Nick Wellnhofer 8d160626 2024-07-12T02:01:06 entities: Rework text escaping
Nick Wellnhofer cc45f618 2024-07-11T22:06:31 save: Rework text escaping Stop using xmlOutputBufferWriteEscape except when using deprecated xmlSaveSetEscape. Rewrite xmlOutputBufferWriteEscape to use an extra buffer and call xmlOutputBufferWrite. Introduce xmlSerializeText to serialize both text and attribute content. Don't read encoding from document when serializing and remove all hacks that temporarily changed the document's encoding.
Nick Wellnhofer e488695b 2024-07-11T20:23:49 save: Deprecate xmlSaveSet*Escape xmlSaveSetAttrEscape never had an effect.
Nick Wellnhofer 673ca0ed 2024-07-11T01:23:57 tests: Regenerate testapi.c
Nick Wellnhofer 96d850c3 2024-07-02T22:43:49 save: Fix "Factor out xmlSaveWriteIndent"
Nick Wellnhofer 35146ff3 2024-07-02T19:43:24 save: Implement xmlSaveSetIndentString Allow to set indent string without using global xmlTreeIndentString. See #736.
Nick Wellnhofer 7cc619d5 2024-07-02T19:22:32 save: Implement save options for indenting Implement XML_SAVE_NO_INDENT to disable and XML_SAVE_INDENT to enable indenting regardless of the global xmlIndentTreeOutput. Implement XML_SAVE_EMPTY to enable empty tags regardless of the global xmlSaveNoEmptyTags. See #736.
Nick Wellnhofer 2c4204ec 2024-07-02T19:14:40 save: Factor out xmlSaveWriteIndent
Nick Wellnhofer 202045f8 2024-07-02T18:51:59 save: Pass options to xmlSaveCtxtInit
Nick Wellnhofer 598ee0d2 2024-06-26T01:18:55 error: Remove underscores from xmlRaiseError
Nick Wellnhofer 5b893fa9 2024-06-22T19:15:17 encoding: Fix encoding lookup with xmlOpenCharEncodingHandler Make xmlOpenCharEncodingHandler call xmlParseCharEncoding first so we prefer our own handlers for names like "UTF8". Only UTF-16 needs an exception. Make callers check the return value. For UTF-8, a NULL encoding doesn't mean an error. Remove unnecessary UTF-8 check from htmlFindOutputEncoder. Don't try to look up ASCII handler since the HTML handler is always available. Fix return code of xmlParseCharEncoding. Should fix #744.
Rosen Penev 2def7b4b 2024-06-18T13:55:34 clang-tidy: move assignments out of if Found with bugprone-assignment-in-if-condition Signed-off-by: Rosen Penev <rosenp@gmail.com>
Nick Wellnhofer e75e878e 2024-05-20T13:58:22 doc: Update and fix documentation
Nick Wellnhofer f506ec66 2024-04-15T11:27:44 parser: Always decode entities in namespace URIs Also decode entities in namespace URIs if entity substitution wasn't requested. This should fix some corner cases when comparing namespace URIs. The Namespaces in XML 1.0 spec says: > In a namespace declaration, the URI reference is the normalized value > of the attribute, so replacement of XML character and entity > references has already been done before any comparison. Make the serialization code escape special characters in namespace URIs like in attribute values. This fixes serialization if entities were substituted when parsing. Fixes https://gitlab.gnome.org/GNOME/libxslt/-/issues/106
Nick Wellnhofer 20fca2bb 2024-04-09T15:39:06 save: Report malloc failure in xmlAttrSerializeTxtContent Flush buffer before checking for errors.
Nick Wellnhofer 86c27206 2024-04-02T14:41:15 save: Handle invalid parent pointers in xhtmlNodeDumpOutput See #255 and commit 85b1792e.
Nick Wellnhofer fb1e6302 2024-03-17T19:24:06 save: Check for NULL node->name in xhtmlIsEmpty
Nick Wellnhofer ee0c1f87 2024-02-29T14:51:49 fuzz: New tree API fuzzer
Nick Wellnhofer 10c202f9 2024-03-04T01:31:12 malloc-fail: Check for NULL pointer in xmlSaveNotation*
Nick Wellnhofer b1e75a91 2024-03-05T20:00:44 save: Report malloc failure in xmlAttrSerializeTxtContent
Nick Wellnhofer 3494aa4f 2024-03-04T01:39:00 save: Cast return code of xmlBufNodeDump Avoid implicit sign change.
Nick Wellnhofer 1d392fab 2024-03-05T18:06:02 save: Check for output buffer errors Report more error conditions.
Nick Wellnhofer d2f7ca53 2024-03-03T16:51:07 save: Add range check for level in xmlNodeDump
Nick Wellnhofer e314109a 2024-02-16T15:42:38 save: Don't write directly to internal buffer Make sure that OOM errors are reported.
Nick Wellnhofer fbe10a46 2024-02-01T19:01:57 save: Move DTD serialization code to xmlsave.c
Nick Wellnhofer c2b3294f 2024-01-04T21:20:51 fuzz: Abort on invalid UTF-8 The parser should never generate invalid UTF-8 these days even in recovery mode.
Nick Wellnhofer ca5965d5 2024-01-02T21:49:43 save: Report more malloc failures
Nick Wellnhofer 0821efc8 2024-01-02T18:33:57 encoding: Check whether encoding handlers support input/output The "HTML" encoding handler doesn't support input which could lead to a wrong error report.
Nick Wellnhofer 4dcc2d74 2024-01-02T14:04:44 save: Output U+FFFD replacement characters This degrades more gracefully and helps to diagnose errors. We stop raising errors for now, since there's no way to report malloc failures during error handling yet.
Nick Wellnhofer bc1e0306 2023-12-18T21:30:22 save: Improve error handling Handle malloc failure from xmlRaiseError. Use xmlRaiseMemoryError. Stop using xmlGenericError. Remove argument from memory error handler. Remove TODO macro.
Nick Wellnhofer 6c8acdec 2023-12-14T13:37:43 save: Fix build --without-html Fixes #646
Nick Wellnhofer 0d97e439 2023-12-10T17:14:57 save: Report malloc failures Fix places where malloc failures aren't reported. Introduce a new API function xmlSaveFinish which returns an error code.
Nick Wellnhofer 8c084ebd 2023-09-21T22:57:33 doc: Make apibuild.py happy
Nick Wellnhofer da274bfa 2023-09-21T01:29:40 build: Fix build when certain modules are disabled
Nick Wellnhofer 4e1c13eb 2023-09-18T14:45:10 debug: Remove debugging code This is barely useful these days and only clutters the code base.
Nick Wellnhofer c82701ff 2023-02-14T15:13:06 malloc-fail: Fix memory leak in xmlDocDumpFormatMemoryEnc Found with libFuzzer, see #344.
Nick Wellnhofer bdcf842c 2022-09-01T20:45:35 Move xmlIsXHTML to tree.c It's declared in tree.h and not guarded by LIBXML_OUTPUT_ENABLED like the other functions in xmlsave.c.
Nick Wellnhofer ad338ca7 2022-09-01T01:18:30 Remove explicit integer casts Remove explicit integer casts as final operation - in assignments - when passing arguments - when returning values Remove casts - to the same type - from certain range-bound values The main motivation is that these explicit casts don't change the result of operations and only render UBSan's implicit-conversion checks useless. Removing these casts allows UBSan to detect cases where truncation or sign-changes occur unexpectedly. Document some explicit casts as truncating and add a few missing ones.
Nick Wellnhofer 0f568c0b 2022-08-26T01:22:33 Consolidate private header files Private functions were previously declared - in header files in the root directory - in public headers guarded with IN_LIBXML - in libxml.h - redundantly in source files that used them. Consolidate all private header files in include/private.
Nick Wellnhofer 3e7b4f37 2022-05-20T23:28:25 Avoid calling xmlSetTreeDoc Create text nodes with xmlNewDocText or set the document directly to avoid xmlSetTreeDoc being called when the node is inserted.
Nick Wellnhofer d99ddd9b 2022-03-05T21:46:40 Improve buffer allocation scheme In most places, we really need the double-it scheme to avoid quadratic behavior. The hybrid scheme still can cause many reallocations and the bounded scheme doesn't seem to provide meaningful protection in xmlreader.c.
Nick Wellnhofer 346c3a93 2022-02-20T18:46:42 Remove elfgcchack.h The same optimization can be enabled with -fno-semantic-interposition since GCC 5. clang has always used this option by default.
Nick Wellnhofer 13ad8736 2021-05-25T10:55:25 Fix regression in xmlNodeDumpOutputInternal Commit 85b1792e could cause additional whitespace if xmlNodeDump was called with a non-zero starting level.
Nick Wellnhofer 85b1792e 2021-05-18T20:08:28 Work around lxml API abuse Make xmlNodeDumpOutput and htmlNodeDumpFormatOutput work with corrupted parent pointers. This used to work with the old recursive code but the non-recursive rewrite required parent pointers to be set correctly. Unfortunately, lxml relies on the old behavior and passes subtrees with a corrupted structure. Fall back to a recursive function call if an invalid parent pointer is detected. Fixes #255.
Nick Wellnhofer 0b3c64d9 2020-09-29T18:08:37 Handle dumps of corrupted documents more gracefully Check parent pointers for NULL after the non-recursive rewrite of the serialization code. This avoids segfaults with corrupted documents which can apparently be seen with lxml, see issue #187.
Nick Wellnhofer 00a86d41 2020-08-16T23:38:00 Don't add formatting newlines to XInclude nodes
Nick Wellnhofer 1a360c1c 2020-07-29T00:39:15 More *NodeDumpOutput fixes When leaving nodes, restrict more operations to XML_ELEMENT_NODEs.
Nick Wellnhofer 7b2e5172 2020-07-28T21:52:55 Fix *NodeDumpOutput functions Only output end tag for elements. Should fix serialization of document fragments.
Nick Wellnhofer dc6f0092 2020-07-28T19:07:19 Make xmlNodeDumpOutputInternal non-recursive Fixes stack overflow with deeply nested documents.
Nick Wellnhofer 5330153d 2020-07-28T18:33:50 Make xhtmlNodeDumpOutput non-recursive Fixes stack overflow with deeply nested documents.
Nick Wellnhofer 20c60886 2020-03-08T17:19:42 Fix typos Resolves #133.
Nick Wellnhofer c9faa292 2020-01-02T14:12:39 Fix overflow check in xmlNodeDump Store return value of xmlBufNodeDump in a size_t before checking for integer overflow. Found by lgtm.com
Nick Wellnhofer 42942066 2019-11-11T13:49:11 Fix memory leaks of encoding handlers in xmlsave.c Fix leak of iconv/ICU encoding handler in xmlSaveToBuffer. Fix leaks of iconv/ICU encoding handlers in xmlSaveTo* error paths. Closes #127.
Jared Yanovich 2a350ee9 2019-09-30T17:04:54 Large batch of typo fixes Closes #109.
Jan Pokorný 81958b6e 2019-07-11T19:24:11 Doc: do not mislead towards "infeasible" scenario wrt. xmlBufNodeDump At least when merely public API is to be leveraged, one cannot use xmlBufCreate function that would otherwise be a clear fit, and relying on some invariants wrt. how some other struct fields will get initialized along the construction/filling such parent struct and (ab)using that instead does not appear clever, either. Hence, instruct people what's the Right Thing for the moment, that is, make them use xmlNodeDumpOutput instead (together with likewise public xmlAllocOutputBuffer). Going forward, it's questionable what to do with xmlBuf* family of functions that are once public, since they, for any practical purpose, cannot be used by the library clients (that's how I've run into this). Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
Nick Wellnhofer 96125557 2019-05-10T12:30:03 Remove unused member `doc` in xmlSaveCtxt
Nick Wellnhofer ee501f54 2018-10-13T15:23:35 Stop using doc->charset outside parser code doc->charset does not specify the in-memory encoding which is always UTF-8.
Nick Wellnhofer cb5541c9 2017-11-13T17:08:38 Fix libz and liblzma detection If libz or liblzma are detected with pkg-config, AC_CHECK_HEADERS must not be run because the correct CPPFLAGS aren't set. It is actually not required to have separate checks for LIBXML_ZLIB_ENABLED and HAVE_ZLIB_H. Only check for LIBXML_ZLIB_ENABLED and remove HAVE_ZLIB_H macro. Fixes bug 764657, bug 787041.
Nick Wellnhofer 359e7504 2017-11-13T21:13:46 Fix -Wmisleading-indentation warnings