|
c2bbeed1
|
2023-12-12T23:51:32
|
|
io: Fix memory lifetime issue with input buffers
xmlParserInputBufferCreateMem must make a copy of the buffer.
This fixes a regression from 2.11 which could cause reads from freed
memory depending on the use case.
Undeprecate xmlParserInputBufferCreateStatic which can avoid copying
the whole buffer.
|
|
157df344
|
2023-12-10T18:23:53
|
|
xmlreader: Report malloc failures
Fix many places where malloc failures aren't reported.
Introduce a new API function xmlTextReaderGetLastError.
|
|
78eab7a1
|
2023-12-10T18:15:59
|
|
xinclude: Report malloc failures
Fix many places where malloc failures aren't reported.
Introduce a new API function xmlXIncludeGetLastError.
|
|
f19a9510
|
2023-12-10T17:50:22
|
|
parser: Report malloc failures
Fix many places where malloc failures aren't reported.
Make xmlErrMemory public. This is useful for custom external entity
loaders.
Introduce new API function xmlSwitchEncodingName.
Change the way how we store whether the the parser is stopped. This used
to be signaled by setting ctxt->instate to XML_PARSER_EOF which was
misdesigned and error-prone. Set ctxt->disableSAX to 2 instead and
introduce a macro PARSER_STOPPED. Also stop to remove parser inputs in
xmlHaltParser. This allows to remove many checks of ctxt->instate.
Introduce xmlErrParser to handle errors if a parser context is
available.
|
|
0d97e439
|
2023-12-10T17:14:57
|
|
save: Report malloc failures
Fix places where malloc failures aren't report.
Introduce a new API function xmlSaveFinish which returns an error code.
|
|
1a354d5b
|
2023-12-10T17:09:45
|
|
regexp: Report malloc failures
Fix places where malloc failures aren't reported.
|
|
e632d9f0
|
2023-12-10T16:56:16
|
|
xpath: Report malloc failures
Fix many places where malloc failures aren't reported.
Rework XPath object cache to store free objects in a linked list to
avoid allocating an additional array. Remove some unneeded object pools.
|
|
f3455ecd
|
2023-12-10T15:46:53
|
|
error: Report malloc failures
Don't ignore malloc failures in xmlRaiseError and xmlCopyError.
Don't print filename if context has no input.
Introduce xmlVRaiseError taking a va_list.
|
|
aca16fb3
|
2023-12-10T16:37:43
|
|
tree: Report malloc failures
Fix many places where malloc failures aren't reported.
Make some API function return an error code. Changing the return type
from void to int is technically an ABI break but should be safe on most
platforms.
- xmlNodeSetContent
- xmlNodeSetContentLen
- xmlNodeAddContent
- xmlNodeAddContentLen
- xmlNodeSetBase
Introduce new API functions that return a separate error code if a
memory allocation fails.
- xmlNodeGetAttrValue
- xmlNodeGetBaseSafe
- xmlGetNsListSafe
Introduce private functions xmlTreeEnsureXMLDecl and xmlSplitQName4.
Don't report low-level errors to the global error handler.
Fix tree
Introduce xmlGetNsListSafe
Fix tree
|
|
e34a49b7
|
2023-12-10T16:29:31
|
|
valid: Improve addition and deletion of IDs
Introduce a new API function xmlAddIDSafe that returns a separate error
code if a memory allocation fails.
Store a pointer to the ID struct in xmlAttr so attributes can be
freed without allocating memory. It's impossible to report malloc
failures in deallocation code.
|
|
e1560990
|
2023-12-10T15:30:47
|
|
pattern: Report malloc failures
Fix places where malloc failures aren't reported.
Introduce a new API function xmlPatternCompileSafe that returns a
separate error code if a memory allocation fails.
|
|
a1f7ecae
|
2023-12-10T15:25:42
|
|
entities: Report malloc failures
Fix places where malloc failures aren't reported.
Introduce new API function xmlAddEntity that returns separate error
codes.
Don't invoke global error handler for low-level errors which should be
handled by higher layers.
Invalid redelcaration warnings will be fixed later.
|
|
f313848b
|
2023-12-10T15:14:15
|
|
hash: Report malloc failures
Introduce new API functions that return a separate error code if a
memory allocation fails.
- xmlHashAdd
- xmlHashCopySafe
|
|
bd5ad030
|
2023-12-10T14:56:21
|
|
encoding: Report malloc failures
Introduce new API functions that return a separate error code if a
memory allocation fails.
- xmlOpenCharEncodingHandler
- xmlLookupCharEncodingHandler
Fix a few places where malloc failures weren't reported.
|
|
da996c8d
|
2023-12-10T14:46:59
|
|
uri: Report malloc failures
Fix many places where malloc failures weren't reported, for example
after calling xmlStrdup.
Introduce new public API functions that return a separate error code if
a memory allocation fails:
- xmlParseURISafe
- xmlBuildURISafe
- xmlBuildRelativeURISafe
Update the fuzzer to check whether malloc failures are reported.
|
|
df0b540b
|
2023-12-07T14:40:13
|
|
include: Rename XML_EMPTY helper macro
Avoid name clash with downstream projects.
|
|
a9738e31
|
2023-12-07T14:15:29
|
|
include: Move declaration of xmlInitGlobals
Fix downstream build issues after reworking globals.h.
|
|
52703ffd
|
2023-12-07T12:04:02
|
|
include: Add missing includes
|
|
9122ad0c
|
2023-12-06T19:56:50
|
|
include: Move globals from xmlsave.h to parser.h
Fix downstream build issues after reworking globals.h.
|
|
c011e760
|
2023-12-06T01:09:31
|
|
globals: Remove unused globals from thread storage
Setting these deprecated globals hasn't had an effect for a long time.
Make them constants. This reduces the size of per-thread storage from
~700 to ~250 bytes.
|
|
1c7f4c70
|
2023-11-27T12:10:09
|
|
nanohttp: Deprecate public API
The long-term plan is to remove the built-in HTTP client. There are
still a few downstream projects that use libxml2's client for other
purposes. These users should get deprecation warnings now.
|
|
7d6969d9
|
2023-11-23T15:48:52
|
|
Remove Trio
Trio is a rather old cross-platform printf library which was bundled with
libxml2. It was needed for ancient pre-C99 systems without snprintf and
should be safe to remove these days.
|
|
ff6c3188
|
2023-11-23T15:22:59
|
|
include: Remove useless 'const' from function arguments
|
|
6bc86405
|
2023-11-09T02:04:15
|
|
Avoid EDG deprecation warnings for LCC compiler
|
|
aca37d8c
|
2023-11-20T15:20:37
|
|
parser: Only enable SAX2 if there are SAX2 element handlers
This reverts part of commit 235b15a5 for backward compatibility and
adds some comments trying to clarify the whole mess.
Fixes #623.
|
|
58598494
|
2023-11-04T23:47:33
|
|
parser: Fix combination of hash values
This bug resulted in a stuck bit in hash values which can have a severe
performance impact.
|
|
61034116
|
2023-10-24T15:02:36
|
|
error: Make more xmlError structs constant
Prepare for future changes, see 45470611.
|
|
c082ef46
|
2023-08-09T16:59:36
|
|
parser: Stop switching to ISO-8859-1 on encoding errors
Use U+FFFD Replacement Character if invalid UTF-8 is encountered in
recovery mode.
Also rewrite xmlNextChar and xmlCurrentChar.
Fixes #598.
|
|
253f260b
|
2023-10-18T20:06:35
|
|
threads: Fix --with-thread-alloc
Fixes #606.
|
|
713ded60
|
2023-10-06T10:43:38
|
|
entities: Make xmlFreeEntity public
|
|
eb69c1d3
|
2023-10-02T12:16:05
|
|
parser: Fix initialization of namespace data
Move initialization to xmlInitSAXParserCtxt. Also add missing XML_HIDDEN
to xmlParserNsFree.
Fixes #597.
|
|
e0dd330b
|
2023-09-29T00:18:44
|
|
parser: Use hash tables to avoid quadratic behavior
Use a hash table to lookup namespaces by prefix. The hash table stores
an index into the namespace table. Auxiliary data for namespaces is
stored in a separate array along the main namespace table.
Use a hash table to verify attribute uniqueness. The hash table stores
an index into the attribute table.
Reuse hash value from the dictionary to avoid computing them twice.
See #346.
|
|
19161bab
|
2023-09-25T14:00:48
|
|
dict: Internal API to look up hash values
|
|
1425d8f6
|
2023-09-16T19:08:10
|
|
dict: Separate RNG code
|
|
b31813e6
|
2023-09-28T15:34:08
|
|
include: Add more missing stdio.h includes
|
|
84e1ffc8
|
2023-09-22T15:44:17
|
|
doc: Don't document internal macros in xmlversion.h
|
|
b94283fb
|
2023-09-22T14:23:27
|
|
regexp: Add missing include
|
|
45470611
|
2023-09-21T23:52:52
|
|
error: Make xmlGetLastError return a const error
This is a slight break of the API, but users really shouldn't modify the
global error struct. The goal is to make xmlLastError use static buffers
for its strings eventually. This should warn people if they're abusing
the struct.
|
|
8c084ebd
|
2023-09-21T22:57:33
|
|
doc: Make apibuild.py happy
|
|
72262030
|
2023-09-21T14:52:14
|
|
parser: Readd some includes to parser.h and xmlreader.h
Fix backward compatibility.
|
|
9fc5090c
|
2023-09-16T19:58:42
|
|
hash: Clean up libxml/hash.h
Rename variables, fix subincludes, whitespace.
|
|
da274bfa
|
2023-09-21T01:29:40
|
|
build: Fix build when certain modules are disabled
|
|
9b5cce7a
|
2023-09-21T00:44:50
|
|
include: Remove more unnecessary includes
|
|
d6ba4033
|
2023-09-20T20:49:59
|
|
globals: Move remaining declarations to correct places
globals.h is now deprecated. Sanity is restored.
|
|
1117fae0
|
2023-09-20T19:20:41
|
|
include: Remove unneeded includes
|
|
736327df
|
2023-09-20T19:09:15
|
|
include: Break inclusion cycle between tree.h and xmlregexp.h
|
|
699299ca
|
2023-09-20T18:54:39
|
|
globals: Stop including globals.h
|
|
11a1839d
|
2023-09-20T17:54:48
|
|
globals: Move remaining globals back to correct header files
This undoes a lot of damage.
|
|
7909ff08
|
2023-09-20T17:38:26
|
|
include: Remove unnecessary includes
- Don't include tree.h from encoding.h
- Don't include parser.h from xmlIO.h
|
|
eb985d6f
|
2023-09-20T17:17:49
|
|
globals: Move error globals back to xmlerror.c
|
|
d1336fd3
|
2023-09-20T17:00:50
|
|
globals: Move malloc hooks back to xmlmemory.h
|
|
a77f9ab8
|
2023-09-20T16:57:22
|
|
globals: Don't include SAX2.h from globals.h
|
|
2e6c49a7
|
2023-09-20T14:43:14
|
|
globals: Don't store xmlParserVersion in global state
This is a constant.
|
|
0830fcfa
|
2023-09-20T14:30:12
|
|
globals: Deprecate xmlLastError
The last error should be accessed with xmlGetLastError.
|
|
db8b9722
|
2023-09-20T13:56:16
|
|
parser: Deprecate global parser options
Note that setting global options has no effect anyway when using any of
the modern parser API functions which take an option argument like
xmlReadMemory or when using xmlCtxtUseOptions.
Global options only have an effect when using old API functions
xmlParse* or xmlSAXParse* or when using an xmlParserCtxt without calling
xmlCtxtUseOptions.
Unfortunately, many downstream projects still modify global parser
options often without realizing that it has no effect. If necessary,
switch to the modern API. Then you can safely remove all code that
changes global options.
Here's a list of deprecated functions and global variables together with
the corresponding parser options.
- xmlSubstituteEntitiesDefault, xmlSubstituteEntitiesDefaultValue
Parser option XML_PARSE_NOENT
- xmlKeepBlanksDefault, xmlKeepBlanksDefaultValue
Inverse of parser option XML_PARSE_NOBLANKS
- xmlPedanticParserDefault, xmlPedanticParserDefaultValue
Parser option XML_PARSE_PEDANTIC
- xmlLineNumbersDefault, xmlLineNumbersDefaultValue
Always enabled by new API
- xmlDoValidityCheckingDefaultValue
Parser option XML_PARSE_DTDVALID
- xmlGetWarningsDefaultValue
Inverse of parser option XML_PARSE_NOWARNING
- xmlLoadExtDtdDefaultValue
Parser options XML_PARSE_DTDLOAD and XML_PARSE_DTDATTR
|
|
868b94b8
|
2023-09-20T13:10:29
|
|
globals: Reformat libxml/globals.h
|
|
bbf08608
|
2023-09-20T13:05:02
|
|
globals: Move buffer callback declarations to xmlIO.h
|
|
dc3382ef
|
2023-09-20T12:58:03
|
|
globals: Move xmlRegisterNodeDefault to tree.c
Code in globals.c must not try to access globals itself since the
accessor macros aren't defined and we would only see the main
variable.
|
|
e7b6ca15
|
2023-09-18T13:25:06
|
|
globals: Rework global state destruction on Windows
If DllMain is used, rely on it working as expected. The old code seemed
to attempt to free global state of other threads if, for some reason,
the DllMain mechanism didn't work.
In a static build, register a destructor with
RegisterWaitForSingleObject.
Make public functions xmlGetGlobalState and xmlInitializeGlobalState
no-ops.
Move initialization and registration of global state objects to
xmlInitGlobalState. Lookup global state with xmlGetThreadLocalStorage
which can be inlined nicely.
Also cleanup global state when using TLS. xmlLastError must be reset.
|
|
39a275a5
|
2023-09-18T21:25:35
|
|
globals: Define globals using macros
Declare and define globals and helper functions by (ab)using the
preprocessor.
|
|
bf6bd161
|
2023-09-18T19:53:31
|
|
globals: Introduce xmlCheckThreadLocalStorage
Checks whether (emulated) thread-local storage could be allocated.
|
|
89f49767
|
2023-09-18T18:44:32
|
|
globals: Make xmlGlobalState private
This removes a public struct but it seems impossible to use its members
in a sensible way from external code.
|
|
a07ec7c1
|
2023-09-18T17:39:13
|
|
threads: Move library initialization code to threads.c
This allows to consolidate the initialization code since the global init
lock was already implemented in threads.c.
|
|
4e1c13eb
|
2023-09-18T14:45:10
|
|
debug: Remove debugging code
This is barely useful these days and only clutters the code base.
|
|
c19771c1
|
2023-09-18T00:54:39
|
|
globals: Move code from threads.c to globals.c
Move all code that handles globals to the place where it belongs.
|
|
2a4b8114
|
2023-09-17T23:16:49
|
|
globals: Rename members of xmlGlobalState
This is a deliberate first step to remove some internals from the
public API and to avoid issues when redefining tokens.
|
|
edc2dd48
|
2023-09-04T16:07:23
|
|
dict: Update hash function
Update hash function from classic Jenkins OAAT (dict.c) and a variant of
DJB2 (hash.c) to "GoodOAAT" taken from the SMHasher repo. This hash
function passes all SMHasher tests.
|
|
57cfd221
|
2023-09-01T14:52:04
|
|
dict: Use xoroshiro64** as PRNG
Stop using rand_r. This enables hash randomization on all platforms.
|
|
778cca38
|
2023-08-20T22:50:57
|
|
legacy: Add stubs for disabled modules
When legacy support is requested, always enable stubs for FTP and
XPointer location modules which were removed from the standard
configuration. Going forward, the --with-legacy configuration option
should be used to provide maximum ABI compatibility.
Fixes #433.
|
|
ed3bd052
|
2023-08-20T20:48:10
|
|
parser: Allow to set maximum amplification factor
|
|
f1c1f5c6
|
2023-08-16T19:43:02
|
|
parser: Revert change to doc->encoding
Fixes #579.
|
|
95e81a36
|
2023-08-08T15:21:31
|
|
parser: Decode all data in xmlCharEncInput
Even with flush set to true, xmlCharEncInput didn't guarantee to decode
all data. This complicated the push parser.
Remove the flush flag and always decode all available data.
Also fix ICU code where the flush flag has a different meaning. Always
set flush to false and retry even with empty input buffers.
|
|
834b8123
|
2023-08-08T15:21:28
|
|
parser: Stream data when reading from memory
Don't create a copy of the whole input buffer. Read the data chunk by
chunk to save memory.
Historically, it was probably envisioned to read data from memory
without additional copying. This doesn't work reliably with the current
design of the XML parser which requires a terminating null byte at the
end of input buffers. This lead to xmlReadMemory interfaces, which
expect pointer and size arguments, being changed to make a
zero-terminated copy of the input buffer. Interfaces based on
xmlReadDoc, which actually expect a zero-terminated string and
would make zero-copy operation work, were then simplified to rely on
xmlReadMemoryi, resulting in an unnecessary copy.
To avoid copying (possibly gigabytes) of memory temporarily, we now
stream in-memory input just like content read from files in a
chunk-by-chunk fashion (using a somewhat outdated INPUT_CHUNK size of
250 bytes). As a side effect, we also avoid another copy of the whole
input when handling non-UTF-8 data which was made possible by some
earlier commits.
Interfaces expecting zero-terminated strings now make use of strnlen
which unfortunately isn't part of the standard C library and only
mandated since POSIX 2008.
|
|
59fa0bb3
|
2023-08-08T15:21:14
|
|
parser: Simplify input pointer updates
The base member always points to the beginning of the buffer.
|
|
ec7be506
|
2023-08-08T15:19:46
|
|
parser: Rework encoding detection
Introduce XML_INPUT_HAS_ENCODING flag for xmlParserInput which is set
when xmlSwitchEncoding is called. The parser can use the flag to
reliably detect whether an encoding was already set via user override,
BOM or other auto-detection. In this case, the encoding declaration
won't be used to switch the encoding.
Before, an inscrutable mix of ctxt->charset, ctxt->input->encoding
and ctxt->input->buf->encoder was used.
Introduce private helper functions to switch encodings used by both the
XML and HTML parser:
- xmlDetectEncoding which skips over the BOM, allowing to remove the
BOM checks from other encoding functions.
- xmlSetDeclaredEncoding, replacing htmlCheckEncodingDirect, which warns
about encoding mismatches.
If users override the encoding, store the declared instead of the actual
encoding in xmlDoc. In this case, the actual encoding is known and the
raw value from the doc is more useful.
Also use the input flags to store the ISO-8859-1 fallback state.
Restrict the fallback to cases where no encoding was specified. (The
fallback is only useful in recovery mode and these days broken UTF-8 is
probably more likely than ISO-8859-1, so it might eventually be removed
completely.)
The 'charset' member of xmlParserCtxt is now unused. The 'encoding'
member of xmlParserInput is now unused.
The 'standalone' member of xmlParserInput is renamed to 'flags'.
A new parser state XML_PARSER_XML_DECL is added for the push parser.
|
|
b8961df6
|
2023-05-09T03:25:24
|
|
SAX: Always validate xml:ids
The behavior shouldn't depend on mostly random configuration options.
|
|
8d5e33ef
|
2023-05-03T20:42:10
|
|
Fix compiler warning on GCC < 8
-Wcast-function-type is only available since GCC 8.
|
|
fc69cf56
|
2023-04-30T17:51:29
|
|
parser: Move xmlFatalErr to parserInternals.c
|
|
3ff6abbf
|
2023-02-22T17:11:20
|
|
encoding: Rework error codes
Use an enum instead of magic numbers. Fix a few error codes. Simplify
handling of "space" and "partial" errors.
See #506.
|
|
fa993130
|
2023-04-30T12:57:09
|
|
xpath: Remove remaining references to valueFrame
Fixes #529.
|
|
3ffcc03b
|
2023-03-13T19:38:41
|
|
parser: Deprecate more internal functions
|
|
98840d40
|
2023-03-21T19:07:12
|
|
parser: Rework EBCDIC code page detection
To detect EBCDIC code pages, we used to switch the encoding twice and
had to be very careful not to decode data after the XML declaration
before the second switch. This relied on a hard-coded expected size of
the XML declaration and was complicated and unreliable.
Now we convert the first 200 bytes to EBCDIC-US and parse the encoding
declaration manually.
|
|
04d1bedd
|
2023-03-21T13:08:44
|
|
parser: Rework shrinking of input buffers
Don't try to grow the input buffer in xmlParserShrink. This makes sure
that no memory allocations are made and the function always succeeds.
Remove unnecessary invocations of SHRINK. Invoke SHRINK at the end of
DTD parsing loops.
Shrink before growing.
|
|
b167c731
|
2023-03-14T14:42:36
|
|
parser: Fix short-lived regression causing infinite loops
Fix 3eb6bf03. We really have to halt the parser, so the input buffer
gets reset.
|
|
f8efa589
|
2023-03-14T13:55:06
|
|
malloc-fail: Handle malloc failures in xmlSchemaInitTypes
Note that this changes the return value of public function
xmlSchemaInitTypes from void to int. This shouldn't break the ABI on
most platforms.
Found when investigating #500.
|
|
d7daf9fd
|
2023-03-14T13:02:36
|
|
xmllint: Fix use-after-free with --maxmem
Fixes #498.
|
|
e7c3a4ca
|
2023-03-13T19:19:46
|
|
parser: Deprecate some parser input functions
|
|
2099441f
|
2023-03-13T17:51:13
|
|
parser: Stop calling xmlParserInputShrink
Introduce xmlParserShrink which takes a parser context to simplify error
handling.
|
|
48379394
|
2023-03-13T17:11:27
|
|
malloc-fail: Stop using XPath stack frames
There's too much code which assumes that if ctxt->value is non-null,
a value can be successfully popped off the stack. This assumption can
break with stack frames when malloc fails.
Instead of trying to fix all call sites, remove the stack frame logic.
It only offered very little protection against misbehaving extension
functions. We already check the stack size after a function call which
should be enough.
Found by OSS-Fuzz.
|
|
bd63d730
|
2023-03-12T17:40:55
|
|
html: Impose some length limits
Impose length limits on names, attribute values, PIs and comments,
similar to the XML parser.
|
|
3eb6bf03
|
2023-03-12T16:47:15
|
|
parser: Stop calling xmlParserInputGrow
Introduce xmlParserGrow which takes a parser context to simplify error
handling.
|
|
b51478dc
|
2023-02-24T16:21:17
|
|
Revert "malloc-fail: Avoid use-after-free after unsuccessful valuePush"
This reverts commit 6a12be77c6a94c374ab7476087edcee2ba41d9b4.
There's too much code reading ctxt->value directly and making the wrong
assumptions.
|
|
4f0a0fb7
|
2023-02-22T14:24:24
|
|
xinclude: Fix include guard
|
|
905386ec
|
2023-02-13T11:14:34
|
|
autotools: Fix make distcheck
- Add private/xinclude.h to EXTRA_DIST
- Add runsuite.log to CLEANFILES
Fixes #485.
|
|
6a12be77
|
2023-01-31T12:46:30
|
|
malloc-fail: Avoid use-after-free after unsuccessful valuePush
In xpath.c there's a lot of code like:
valuePush(ctxt, xmlCacheNewX());
...
valuePop(ctxt);
If xmlCacheNewX fails, no value will be pushed on the stack. If there's
no error check in between, valuePop will pop an unrelated value which
can lead to use-after-free errors.
Instead of trying to fix all call sites, we simply stop popping values
if an error was signaled. This requires to change the CHECK_TYPE macro
which is often used to determine whether a value can be safely popped.
Found with libFuzzer, see #344.
|
|
59b33661
|
2022-12-27T14:15:51
|
|
error: Limit number of parser errors
Reporting errors is expensive and some abusive test cases can generate
an error for each invalid input byte. This causes the parser to spend
most of the time with error handling. Limit the number of errors and
warnings to 100.
|
|
a41b09c7
|
2022-12-23T21:29:28
|
|
parser: Improve detection of entity loops
Set a flag to detect entity loops at once instead of processing until
the depth limit is exceeded.
|
|
b47ebf04
|
2022-12-21T00:02:47
|
|
parser: Deprecate xmlString*DecodeEntities
These are internal functions.
|
|
ce76ebfd
|
2022-12-19T20:56:23
|
|
entities: Stop counting entities
This was only used in the old version of xmlParserEntityCheck.
|
|
a3c8b180
|
2022-12-19T20:51:52
|
|
entities: Add entity flag for loop check
|