xmlIO.c


Log

Author Commit Date CI Message
Nick Wellnhofer 0a4fe2f9 2025-07-20T18:52:06 io: Fix argument type See c70d88f1 and #951.
Nick Wellnhofer c70d88f1 2025-07-20T13:03:59 io: Fix reading from pipes like stdin on Windows On Windows, lseek doesn't return an error on unseekable streams like pipes. Fixes #951.
Nick Wellnhofer 7bd8d1d9 2025-05-28T15:53:38 doc: Prefix autolinks with '#' Use `#func` instead of `func()` to ignore parameters and make all autolinks work.
Nick Wellnhofer 78454e30 2025-05-25T16:53:41 io: Remove xmlInputDefaultOpen Not necessary after removal of HTTP client.
Nick Wellnhofer 258d8706 2025-05-15T17:49:49 codegen: Consolidate tools for code generation Move tools, source files and output tables into codegen directory. Rename some files. Adjust tools to match modified files. Remove generation date and source files from output. Distribute all tools and sources.
Nick Wellnhofer adfbeb7e 2025-05-14T04:58:21 doc: Stop using *Ptr typedefs in documentation
Nick Wellnhofer a40f36e7 2025-05-14T04:04:28 include: Stop using *Ptr typedefs in public headers
Nick Wellnhofer 2d83a84c 2025-05-14T00:29:19 doc: Misc improvements
Nick Wellnhofer fcb7a777 2025-05-13T22:38:15 io: Make xmlOutputBufferCreate* not free encoder on error Revert a530ff12 which was an inadvertent API change.
Nick Wellnhofer 4df8d557 2025-05-12T17:31:14 io: Fix stack use after scope Short-lived regression.
Nick Wellnhofer f602c0c1 2025-05-12T00:04:22 html: Rework serialization of meta encoding attributes Don't allocate memory.
Nick Wellnhofer 825f3a9d 2025-05-11T21:38:16 html: Always serialize attributes with double quotes Align with HTML5.
Nick Wellnhofer 5f8ebc88 2025-05-10T00:56:18 save: Avoid xmlOutputBufferWriteQuotedString xmlOutputBufferWriteQuotedString should be reserved for things like system IDs.
Nick Wellnhofer 777e2adf 2025-05-09T23:53:03 io: Consolidate escaping code Use generated table approach of xmlSerializeText for xmlEscapeText. Move most code to xmlIO.c.
Nick Wellnhofer dad11630 2025-05-09T22:05:38 entities: Always replace invalid chars when escaping The previous refactor painstakingly recreated the different behavior of separate functions that were merged. It makes Optimize IS_CHAR check for non-ASCII chars.
Nick Wellnhofer 442c1903 2025-05-09T18:52:36 doc: Fix some damage from automated conversions Add some newlines, fix returns.
Nick Wellnhofer a1e83b24 2025-05-07T20:16:17 io: Fix negation of potentially unsigned value
Nick Wellnhofer 9bbffec5 2025-05-06T17:42:46 doc: Move brief to top, params to bottom of doc comments
Nick Wellnhofer f38f3e7b 2025-05-04T16:49:49 doc: Misc fixes to IO documentation
Nick Wellnhofer e78e05c9 2025-05-02T17:32:51 doc: Fix autolinks to functions Unfortunately, autolinks in .c files aren't converted by Doxygen for some reason.
Nick Wellnhofer f7c41287 2025-05-02T15:57:17 doc: Remove more comment block headers
Nick Wellnhofer e525564f 2025-05-01T19:20:06 doc: Remove empty lines at start of block These lines were left over after automatic conversion.
Nick Wellnhofer e549622b 2025-04-28T15:11:24 doc: Convert documentation to Doxygen Automated conversion based on a few regexes.
Nick Wellnhofer 69879da8 2025-04-28T14:04:30 doc: Remove email addresses from documentation Also remove authorship information from generated files, hash.c and globals.c which were rewritten.
Nick Wellnhofer 61890e39 2025-04-27T21:50:15 doc: Prepare for conversion to Doxygen Fix many params in internal functions (not really necessary but Doxygen warns about that in XML mode). Fix formatting in a few corner cases that automatic conversion can't handle. Rearrange some DOC_DISABLE blocks.
Nick Wellnhofer b85d77d1 2025-04-20T14:31:24 http: Remove built-in HTTP client Stubs are retained for ABI compatibility. Fixes #631. Obsoletes #160.
Nick Wellnhofer 2c2578b6 2025-03-31T13:10:00 io: Use switch statement in xmlIOErr
Collin Funk fa539305 2025-03-20T22:34:55 io: Remove duplicated conditionals.
Nick Wellnhofer b3492259 2025-03-14T00:01:11 include: Change some return types from int to enum This also affects some new functions from 2.13.
Nick Wellnhofer fd1b9391 2025-03-13T23:20:16 include: Convert some macros to enums
Nick Wellnhofer 69b83bb6 2025-03-10T02:18:51 encoding: Detect truncated multi-byte sequences with ICU Unlike iconv or the internal converters, ICU consumes truncated multi-byte sequences at the end of an input buffer. We currently check for a non-empty raw input buffer to detect truncated sequences, so this fails with ICU. It might be possible to inspect the pivot buffer pointers, but it seems cleaner to implement a `flush` flag for some encoding and I/O functions. After flushing, we can check for U_TRUNCATED_CHAR_FOUND with ICU, or detect remaining input with other converters. Also fix detection of truncated sequences for HTML, XML content and DTDs with iconv.
Nick Wellnhofer d96911f1 2025-03-08T23:00:29 doc: Documentation fixes
Nick Wellnhofer a0f156ff 2025-03-02T13:21:29 io: Fix `compressed` flag for uncompressed stdin This could cause xmlstarlet to generate compressed output unexpectedly. Regressed with a78843be. Should fix #869.
Nick Wellnhofer a78843be 2025-01-28T20:13:58 xmllint: Support compressed input from stdin Another regression related to reading from stdin. Making a "-" filename read from stdin was deeply baked into the core IO code but is inherently insecure. I really want to reenable this dangerous feature as sparingly as possible. This now enables compressed input when using the "Fd" API functions which wasn't supported before. But XML_PARSE_NO_UNZIP will be inverted later. Allow compressed stdin in xmlReadFile to support xmlstarlet and older versions of xsltproc. So far, these are the only known command-line tools that rely on "-" meaning stdin.
Nick Wellnhofer 1c82bca6 2025-01-17T22:54:51 xmllint: Improve error reports from reader
Nick Wellnhofer 41c10c0c 2025-01-03T19:49:37 io: Don't cast file descriptors to pointers This doesn't work if open() returns 0 which is rare but can happen. Wrap the fd in a context struct. Fixes #835.
Nick Wellnhofer b3871dd1 2024-12-21T21:50:13 io: Fix memory leaks of encoding handler in error cases xmlOutputBufferCreate* must always free the encoding handler.
Nick Wellnhofer 0dd910e8 2024-12-18T23:37:35 save: Fix handling of catastrophic errors Don't overwrite catastrophic errors xmlSaveErr. Overwrite non-catastrophic errors in xmlOutputBufferClose.
Nick Wellnhofer 1e4d8c55 2024-11-06T16:42:05 xmlIO: Fix reading from non-regular files like pipes Commit 7e14c05d removed unnecessary copying of uncompressed input through zlib or xzlib. This broke input from non-regular files like pipes which can't be reopened. Try to detect such files by checking whether they're seekable and always pipe them through zlib or xzlib. Also remove seemingly unnecessary calls to gzread and gzrewind to support unseekable files. Fixes https://gitlab.gnome.org/GNOME/libxslt/-/issues/124.
Nick Wellnhofer 55ddccb6 2024-09-14T00:03:56 io: Make sure not to pass partial UTF-8 to write callback We cannot split UTF-8 at arbitrary boundaries.
triallax 67ff748c 2024-08-26T23:53:29 io: don't set the executable bit when creating files Issue seems to have been introduced in 0bef93bf24def68c448af0e71844b942e0ed93ec.
Nick Wellnhofer f2c48847 2024-08-13T14:38:07 io: Add missing calls to xmlInitParser This is required after c9a46a91. Should fix #782.
Nick Wellnhofer a530ff12 2024-07-29T14:18:57 io: Always consume encoding handler when creating output buffers Also free encoding handler in error case. Remove xmlAllocOutputBufferInternal which was identical to xmlAllocOutputBuffer.
Nick Wellnhofer 36ea881b 2024-07-26T18:07:27 malloc-fail: Fix memory leak in xmlOutputBufferCreateFilename Close encoding handler on error.
Nick Wellnhofer 7b98e8d6 2024-07-18T01:54:22 io: Don't call getcwd in xmlParserGetDirectory The "directory" value isn't used internally. Calling getcwd is unnecessary and can cause problems in sandboxed environments. Fixes #770.
Nick Wellnhofer eb66d03e 2024-07-07T23:15:54 io: Deprecate a few functions
Nick Wellnhofer 97680d6c 2024-07-07T21:29:18 io: Rework xmlParserInputBufferGrow Remove dubious (len != 4) check. Remove compression-related code. This should already be set when opening the input.
Nick Wellnhofer a6f54f05 2024-07-07T18:52:17 io: Fine-tune initial IO buffer size
Nick Wellnhofer 7148b778 2024-07-07T16:11:08 parser: Optimize memory buffer I/O Reenable zero-copy IO for zero-terminated static memory buffers. Don't stream zero-terminated dynamic memory buffers on top of creating a copy.
Nick Wellnhofer 34c9108f 2024-07-07T18:38:31 encoding: Add sizeOut argument to xmlCharEncInput When push parsing, we want to convert as much of the input as possible. When pull parsing memory buffers, we want to convert data chunk by chunk to save memory.
Nick Wellnhofer a221cd78 2024-07-07T03:01:51 buf: Rework xmlBuf code Always use what the old implementation called the "IO" allocation scheme, allowing to move the content pointer past the initial allocation. This is inexpensive and allows efficient shrinking. Optimize xmlBufGrow, reusing shrunken memory as much as possible. Simplify xmlBufAdd. Make xmlBufBackToBuffer return an error on overflow. Make "size" exclude the terminating NULL byte. Always provide an initial size. Reintroduce static buffers. Remove xmlBufResize and several other functions.
Nick Wellnhofer 8d160626 2024-07-12T02:01:06 entities: Rework text escaping
Nick Wellnhofer cc45f618 2024-07-11T22:06:31 save: Rework text escaping Stop using xmlOutputBufferWriteEscape except when using deprecated xmlSaveSetEscape. Rewrite xmlOutputBufferWriteEscape to use an extra buffer and call xmlOutputBufferWrite. Introduce xmlSerializeText to serialize both text and attribute content. Don't read encoding from document when serializing and remove all hacks that temporarily changed the document's encoding.
Nick Wellnhofer 0ab07b21 2024-07-11T20:04:39 io: Rework xmlOutputBufferWrite Simplify code, handle short writes from callback.
Nick Wellnhofer e0494c0d 2024-07-15T15:10:18 io: Add some deprecation warnings
Nick Wellnhofer da686399 2024-07-09T12:29:53 io: Fix return value of xmlFileRead This broke in commit 6d27c54. Fixes #766.
Nick Wellnhofer 84a4f84c 2024-06-22T02:11:24 build: Don't check for required headers and functions Unless we are on Windows, the following POSIX headers are required. They're part of the earliest POSIX specs and it doesn't make sense to check for them. - fcntl.h - unistd.h - sys/stat.h - sys/time.h On Windows, io.h, fcntl.h and sys/stat.h are always available.
Nick Wellnhofer dba1ed85 2024-06-12T18:19:55 ftp: Remove FTP support Remove the built-in FTP client. If you configure --with-legacy, old symbols are retained for ABI compatibility.
Nick Wellnhofer ab5e6deb 2024-06-11T18:11:51 parser: Introduce XML_INPUT_NETWORK input flag This allows to disable network access when creating parser inputs with xmlInputCreateUrl.
Nick Wellnhofer 64ad2725 2024-06-11T03:51:43 parser: Introduce per-context resource loader
Nick Wellnhofer b9d2f3c9 2024-06-11T02:15:18 parser: Introduce new input API - xmlInputCreateUrl - xmlInputCreateMemory - xmlInputCreateString - xmlInputCreateFd - xmlInputCreateIO - xmlInputSetEncoding These functions don't take a parser context and work on xmlParserInputs, replacing functions working on xmlParserInputBuffers. xmlInputCreateUrl and xmlInputSetEncoding offer fine-grained error handling. Several XML_INPUT_* flags offer additional control.
Nick Wellnhofer ff3b0919 2024-06-11T00:00:32 parser: Implement XML_PARSE_NO_UNZIP option
Nick Wellnhofer 1432949d 2024-06-10T23:57:52 io: Pass input flags to xmlParserInputBufferCreateUrl
Nick Wellnhofer b5890cb4 2024-06-10T18:51:56 io: Remove xmlParserInputBufferCreateFilenameSafe
Nick Wellnhofer 1b1e8b3c 2024-06-10T16:39:57 io: Stop invoking generic error handler for IO errors
Nick Wellnhofer a331526c 2024-06-10T16:21:12 io: Don't report write errors twice
Nick Wellnhofer 717f3a7b 2024-06-10T18:50:28 io: Fix resetting xmlParserInputBufferCreateFilename hook We don't want to invoke the default function.
Nick Wellnhofer e75e878e 2024-05-20T13:58:22 doc: Update and fix documentation
Nick Wellnhofer a4c2b723 2024-05-05T17:26:31 io: Don't set close callback in xmlParserInputBufferCreateFd
Nick Wellnhofer a279aae3 2024-03-18T14:20:19 io: Allocate output buffer with XML_BUFFER_ALLOC_IO This allows efficient shrinking of memory buffers. Support IO buffers in xmlBufDetach.
Nick Wellnhofer c1fe9e72 2024-03-06T15:21:49 io: Report more malloc failures when writing to output buffer
Nick Wellnhofer 67e475b7 2024-02-19T11:09:39 http: Improve error message for HTTPS redirects
Nick Wellnhofer e314109a 2024-02-16T15:42:38 save: Don't write directly to internal buffer Make sure that OOM errors are reported.
Nick Wellnhofer 0d170aca 2024-02-01T11:51:58 io: Report malloc failure in xmlOutputBufferWrite Fixes #676.
Nick Wellnhofer d2b55a7a 2024-01-05T20:31:10 writer: Implement xmlTextWriterClose This function can be used to make sure that closing the output stream succeeded. Fixes #513.
Nick Wellnhofer e45a4d71 2023-12-29T00:00:21 io: Always forward IO errors to global handler The HTTP module raises errors without context. This won't be fixed, so send them to the global error handler.
Nick Wellnhofer 7e0bbbc1 2023-12-27T18:33:30 parser: New input API Provide a new set of functions to create xmlParserInputs. These can be used for the document entity or from external entity loaders. - Don't require xmlParserInputBuffer. - All functions take a base URI. - All functions take an encoding as string. - xmlNewInputURL also takes a public ID. - xmlNewInputMemory takes a size_t. - Optimization hints for memory buffers. Improve documentation. Only call xmlInitParser before allocating a new parser context. Call xmlCtxtUseOptions as early as possible.
Nick Wellnhofer c2ef78f7 2023-12-24T23:56:57 io: Fix close error handling There's no way to report error codes from closing an output buffer yet.
Nick Wellnhofer 6d27c549 2023-12-24T17:59:02 io: Fix read/write error handling Handle short reads/writes from fd. Fix stdio error handling.
Nick Wellnhofer 0bef93bf 2023-12-23T04:03:41 io: More refactoring and unescaping fixes Merge Windows wrappers into relevant functions. Remove more unnecessary unescaping. Merge *OpenW into *Open functions. Use unbuffered IO for output.
Nick Wellnhofer a2693410 2023-12-23T00:35:30 io: Move some code from xmlIO.c to parserInternals.c Move everything related to parser contexts to parserInternals.c.
Nick Wellnhofer 8ab1b122 2023-12-23T00:00:15 Fix filename and URI handling Many strings are passed to the library that could be either URIs or filesystem paths. We now assume that strings are a URI if they contain the substring "://". This means that they have a scheme and an authority. Otherwise, URI resolution wouldn't make much sense. Fix xmlBuildURI to work with filesystem paths. If the base URI doesn't contain "://" it is treated as filename. The resolved URI is unescaped, appended and the result is normalized. Rewrite xmlNormalizePath to handle Windows quirks. All special handling for Windows paths is removed in xmlCanonicPath. If the path looks like a URI, only escape characters allowed in Legacy Extended IRIs. Make xmlPathToURI only call xmlCanonicPath. The additional round-trip through URI parser and serializer seems useless. Add a helper function xmlConvertUriToPath in xmlIO.c which checks for file URIs and unescapes them. Always process strings with xmlCanonicPath in xmlLoadExternalEntity. This should be harmless now. Should help with #334, #387, #611.
Nick Wellnhofer 229e5ff7 2023-12-21T18:09:42 io: Remove support for HTTP POST This feature is unlikely to be used these days.
Nick Wellnhofer 0a658c0f 2023-12-20T23:53:19 io: Don't use "-" to read from stdin To implement this feature on such a low level is a disaster waiting to happen. Remove these checks from the IO code and move them to xmllint. Note that the serialization API will still treat "-" as stdout.
Nick Wellnhofer c9a46a91 2023-12-20T20:11:09 io: Rework initialization
Nick Wellnhofer b75fc1ab 2023-12-20T20:01:19 io: Rearrange code
Nick Wellnhofer 13043691 2023-12-20T00:33:34 parser: Rename xmlErrParser to xmlCtxtErr
Nick Wellnhofer 9fbe46ba 2023-12-19T20:10:10 io: Consolidate error messages
Nick Wellnhofer 23345a1c 2023-12-19T19:52:28 io: Report IO errors through xmlCtxtErrIO This is also a new public API function to be used in external entity loaders.
Nick Wellnhofer 1ef35663 2023-12-19T19:36:35 io: Always use unbuffered input Before, we often used unbuffered input via the lzma or gzip handlers, more or less inadvertently. Change the default file handlers from buffered (stdc FILE) to unbuffered (POSIX fds).
Nick Wellnhofer 7e14c05d 2023-12-19T17:05:08 io: Fix detection of compressed streams Make sure that we don't try to open uncompressed streams with a compression handler in copying mode.
Nick Wellnhofer 7e511f35 2023-12-19T15:41:37 io: Pass error codes from xmlFileOpenReal to xmlNewInputFromFile This allows to report the reason why opening a file failed to the parser context and improve error messages. Now we can also remove the stat call before opening a file.
Nick Wellnhofer b2dbcc43 2023-12-19T13:33:59 io: Rework default callbacks Register a dummy callback struct for default callbacks. Handle them in a separate function which will later allow to return meaningful error codes.
Nick Wellnhofer 54c70ed5 2023-12-18T19:31:29 parser: Improve error handling Introduce xmlCtxtSetErrorHandler allowing to set a structured error for a parser context. There already was the "serror" SAX handler but this always receives the parser context as argument. Start to use xmlRaiseMemoryError. Remove useless arguments from memory error functions. Rename xmlErrMemory to xmlCtxtErrMemory. Remove a few calls to xmlGenericError. Remove support for runtime entity debugging.
Nick Wellnhofer c5a8aef2 2023-12-18T19:12:08 error: Refactor error reporting Introduce xmlStrVASPrintf, trying to handle buggy snprintf implementations. Introduce xmlSetError to set errors atomically. Introduce xmlUpdateError to set an error, fixing up node, file and line. Introduce helper function xmlRaiseMemoryError. Make legacy error handlers call xmlReportError, avoiding checks in xmlVRaiseError. Remove fragile support for getting file and line info from XInclude nodes.
Nick Wellnhofer c2bbeed1 2023-12-12T23:51:32 io: Fix memory lifetime issue with input buffers xmlParserInputBufferCreateMem must make a copy of the buffer. This fixes a regression from 2.11 which could cause reads from freed memory depending on the use case. Undeprecate xmlParserInputBufferCreateStatic which can avoid copying the whole buffer.
Nick Wellnhofer f19a9510 2023-12-10T17:50:22 parser: Report malloc failures Fix many places where malloc failures aren't reported. Make xmlErrMemory public. This is useful for custom external entity loaders. Introduce new API function xmlSwitchEncodingName. Change the way how we store whether the parser is stopped. This used to be signaled by setting ctxt->instate to XML_PARSER_EOF which was misdesigned and error-prone. Set ctxt->disableSAX to 2 instead and introduce a macro PARSER_STOPPED. Also stop to remove parser inputs in xmlHaltParser. This allows to remove many checks of ctxt->instate. Introduce xmlErrParser to handle errors if a parser context is available.
Nick Wellnhofer 455c61d6 2023-11-23T15:59:41 Remove VMS support This was last updated 10 years ago and is most likely broken.
Nick Wellnhofer 11a1839d 2023-09-20T17:54:48 globals: Move remaining globals back to correct header files This undoes a lot of damage.
Nick Wellnhofer 4e1c13eb 2023-09-18T14:45:10 debug: Remove debugging code This is barely useful these days and only clutters the code base.
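
Two of the entries in this log describe small techniques that are easier to follow with code. The sketch below is illustrative only and is not taken from xmlIO.c: the names `FdContext`, `fdAsPointer`, `fdContextNew` and `looksLikeUri` are hypothetical. The first half shows why casting a file descriptor to a pointer (41c10c0c) is fragile, since descriptor 0 typically becomes a null pointer on common platforms and is then indistinguishable from a failed open, and how wrapping the descriptor in a small struct avoids that. The second half shows the "://" heuristic from 8ab1b122 as a plain substring check, which is an assumption about the simplest possible form of that test.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical context struct; libxml2's real definition may differ. */
typedef struct {
    int fd;
} FdContext;

/* Fragile approach: the descriptor is smuggled through a void pointer.
 * On common platforms fd 0 turns into a null pointer, which callers
 * usually interpret as "opening failed". */
static void *
fdAsPointer(int fd) {
    return (void *) (intptr_t) fd;
}

/* Safer approach: allocate a context that holds the descriptor.
 * A NULL return now only ever means allocation failure. */
static FdContext *
fdContextNew(int fd) {
    FdContext *ctx = malloc(sizeof(*ctx));

    if (ctx == NULL)
        return NULL;
    ctx->fd = fd;
    return ctx;
}

/* Sketch of the "://" heuristic: a string is assumed to be a URI
 * (with scheme and authority) if it contains "://", otherwise it is
 * treated as a filesystem path. Returns 1 for URI, 0 for path. */
static int
looksLikeUri(const char *str) {
    return strstr(str, "://") != NULL;
}
```

With these helpers, `fdContextNew(0)` yields a valid non-NULL context whose `fd` member is 0, while `looksLikeUri("/tmp/a.xml")` reports a path and `looksLikeUri("http://example.com/a.xml")` reports a URI. The caller owns the returned context and releases it with `free`.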