|
84e1ffc8
|
2023-09-22T15:44:17
|
|
doc: Don't document internal macros in xmlversion.h
|
|
b94283fb
|
2023-09-22T14:23:27
|
|
regexp: Add missing include
|
|
45470611
|
2023-09-21T23:52:52
|
|
error: Make xmlGetLastError return a const error
This is a slight break of the API, but users really shouldn't modify the
global error struct. The goal is to make xmlLastError use static buffers
for its strings eventually. This should warn people if they're abusing
the struct.
|
|
8c084ebd
|
2023-09-21T22:57:33
|
|
doc: Make apibuild.py happy
|
|
72262030
|
2023-09-21T14:52:14
|
|
parser: Readd some includes to parser.h and xmlreader.h
Fix backward compatibility.
|
|
9fc5090c
|
2023-09-16T19:58:42
|
|
hash: Clean up libxml/hash.h
Rename variables, fix subincludes, whitespace.
|
|
da274bfa
|
2023-09-21T01:29:40
|
|
build: Fix build when certain modules are disabled
|
|
9b5cce7a
|
2023-09-21T00:44:50
|
|
include: Remove more unnecessary includes
|
|
d6ba4033
|
2023-09-20T20:49:59
|
|
globals: Move remaining declarations to correct places
globals.h is now deprecated. Sanity is restored.
|
|
1117fae0
|
2023-09-20T19:20:41
|
|
include: Remove unneeded includes
|
|
736327df
|
2023-09-20T19:09:15
|
|
include: Break inclusion cycle between tree.h and xmlregexp.h
|
|
eb985d6f
|
2023-09-20T17:17:49
|
|
globals: Move error globals back to xmlerror.c
|
|
d1336fd3
|
2023-09-20T17:00:50
|
|
globals: Move malloc hooks back to xmlmemory.h
|
|
a77f9ab8
|
2023-09-20T16:57:22
|
|
globals: Don't include SAX2.h from globals.h
|
|
2e6c49a7
|
2023-09-20T14:43:14
|
|
globals: Don't store xmlParserVersion in global state
This is a constant.
|
|
0830fcfa
|
2023-09-20T14:30:12
|
|
globals: Deprecate xmlLastError
The last error should be accessed with xmlGetLastError.
|
|
db8b9722
|
2023-09-20T13:56:16
|
|
parser: Deprecate global parser options
Note that setting global options has no effect anyway when using any of
the modern parser API functions which take an option argument like
xmlReadMemory or when using xmlCtxtUseOptions.
Global options only have an effect when using old API functions
xmlParse* or xmlSAXParse* or when using an xmlParserCtxt without calling
xmlCtxtUseOptions.
Unfortunately, many downstream projects still modify global parser
options often without realizing that it has no effect. If necessary,
switch to the modern API. Then you can safely remove all code that
changes global options.
Here's a list of deprecated functions and global variables together with
the corresponding parser options.
- xmlSubstituteEntitiesDefault, xmlSubstituteEntitiesDefaultValue
Parser option XML_PARSE_NOENT
- xmlKeepBlanksDefault, xmlKeepBlanksDefaultValue
Inverse of parser option XML_PARSE_NOBLANKS
- xmlPedanticParserDefault, xmlPedanticParserDefaultValue
Parser option XML_PARSE_PEDANTIC
- xmlLineNumbersDefault, xmlLineNumbersDefaultValue
Always enabled by new API
- xmlDoValidityCheckingDefaultValue
Parser option XML_PARSE_DTDVALID
- xmlGetWarningsDefaultValue
Inverse of parser option XML_PARSE_NOWARNING
- xmlLoadExtDtdDefaultValue
Parser options XML_PARSE_DTDLOAD and XML_PARSE_DTDATTR
|
|
868b94b8
|
2023-09-20T13:10:29
|
|
globals: Reformat libxml/globals.h
|
|
bbf08608
|
2023-09-20T13:05:02
|
|
globals: Move buffer callback declarations to xmlIO.h
|
|
dc3382ef
|
2023-09-20T12:58:03
|
|
globals: Move xmlRegisterNodeDefault to tree.c
Code in globals.c must not try to access globals itself since the
accessor macros aren't defined and we would only see the main
variable.
|
|
e7b6ca15
|
2023-09-18T13:25:06
|
|
globals: Rework global state destruction on Windows
If DllMain is used, rely on it working as expected. The old code seemed
to attempt to free global state of other threads if, for some reason,
the DllMain mechanism didn't work.
In a static build, register a destructor with
RegisterWaitForSingleObject.
Make public functions xmlGetGlobalState and xmlInitializeGlobalState
no-ops.
Move initialization and registration of global state objects to
xmlInitGlobalState. Lookup global state with xmlGetThreadLocalStorage
which can be inlined nicely.
Also cleanup global state when using TLS. xmlLastError must be reset.
|
|
39a275a5
|
2023-09-18T21:25:35
|
|
globals: Define globals using macros
Declare and define globals and helper functions by (ab)using the
preprocessor.
|
|
11a1839d
|
2023-09-20T17:54:48
|
|
globals: Move remaining globals back to correct header files
This undoes a lot of damage.
|
|
7909ff08
|
2023-09-20T17:38:26
|
|
include: Remove unnecessary includes
- Don't include tree.h from encoding.h
- Don't include parser.h from xmlIO.h
|
|
bf6bd161
|
2023-09-18T19:53:31
|
|
globals: Introduce xmlCheckThreadLocalStorage
Checks whether (emulated) thread-local storage could be allocated.
|
|
89f49767
|
2023-09-18T18:44:32
|
|
globals: Make xmlGlobalState private
This removes a public struct but it seems impossible to use its members
in a sensible way from external code.
|
|
4e1c13eb
|
2023-09-18T14:45:10
|
|
debug: Remove debugging code
This is barely useful these days and only clutters the code base.
|
|
c19771c1
|
2023-09-18T00:54:39
|
|
globals: Move code from threads.c to globals.c
Move all code that handles globals to the place where it belongs.
|
|
2a4b8114
|
2023-09-17T23:16:49
|
|
globals: Rename members of xmlGlobalState
This is a deliberate first step to remove some internals from the
public API and to avoid issues when redefining tokens.
|
|
778cca38
|
2023-08-20T22:50:57
|
|
legacy: Add stubs for disabled modules
When legacy support is requested, always enable stubs for FTP and
XPointer location modules which were removed from the standard
configuration. Going forward, the --with-legacy configuration option
should be used to provide maximum ABI compatibility.
Fixes #433.
|
|
ed3bd052
|
2023-08-20T20:48:10
|
|
parser: Allow to set maximum amplification factor
|
|
f1c1f5c6
|
2023-08-16T19:43:02
|
|
parser: Revert change to doc->encoding
Fixes #579.
|
|
ec7be506
|
2023-08-08T15:19:46
|
|
parser: Rework encoding detection
Introduce XML_INPUT_HAS_ENCODING flag for xmlParserInput which is set
when xmlSwitchEncoding is called. The parser can use the flag to
reliably detect whether an encoding was already set via user override,
BOM or other auto-detection. In this case, the encoding declaration
won't be used to switch the encoding.
Before, an inscrutable mix of ctxt->charset, ctxt->input->encoding
and ctxt->input->buf->encoder was used.
Introduce private helper functions to switch encodings used by both the
XML and HTML parser:
- xmlDetectEncoding which skips over the BOM, allowing to remove the
BOM checks from other encoding functions.
- xmlSetDeclaredEncoding, replacing htmlCheckEncodingDirect, which warns
about encoding mismatches.
If users override the encoding, store the declared instead of the actual
encoding in xmlDoc. In this case, the actual encoding is known and the
raw value from the doc is more useful.
Also use the input flags to store the ISO-8859-1 fallback state.
Restrict the fallback to cases where no encoding was specified. (The
fallback is only useful in recovery mode and these days broken UTF-8 is
probably more likely than ISO-8859-1, so it might eventually be removed
completely.)
The 'charset' member of xmlParserCtxt is now unused. The 'encoding'
member of xmlParserInput is now unused.
The 'standalone' member of xmlParserInput is renamed to 'flags'.
A new parser state XML_PARSER_XML_DECL is added for the push parser.
|
|
b8961df6
|
2023-05-09T03:25:24
|
|
SAX: Always validate xml:ids
The behavior shouldn't depend on mostly random configuration options.
|
|
8d5e33ef
|
2023-05-03T20:42:10
|
|
Fix compiler warning on GCC < 8
-Wcast-function-type is only available since GCC 8.
|
|
3ff6abbf
|
2023-02-22T17:11:20
|
|
encoding: Rework error codes
Use an enum instead of magic numbers. Fix a few error codes. Simplify
handling of "space" and "partial" errors.
See #506.
|
|
fa993130
|
2023-04-30T12:57:09
|
|
xpath: Remove remaining references to valueFrame
Fixes #529.
|
|
3ffcc03b
|
2023-03-13T19:38:41
|
|
parser: Deprecate more internal functions
|
|
98840d40
|
2023-03-21T19:07:12
|
|
parser: Rework EBCDIC code page detection
To detect EBCDIC code pages, we used to switch the encoding twice and
had to be very careful not to decode data after the XML declaration
before the second switch. This relied on a hard-coded expected size of
the XML declaration and was complicated and unreliable.
Now we convert the first 200 bytes to EBCDIC-US and parse the encoding
declaration manually.
|
|
f8efa589
|
2023-03-14T13:55:06
|
|
malloc-fail: Handle malloc failures in xmlSchemaInitTypes
Note that this changes the return value of public function
xmlSchemaInitTypes from void to int. This shouldn't break the ABI on
most platforms.
Found when investigating #500.
|
|
d7daf9fd
|
2023-03-14T13:02:36
|
|
xmllint: Fix use-after-free with --maxmem
Fixes #498.
|
|
e7c3a4ca
|
2023-03-13T19:19:46
|
|
parser: Deprecate some parser input functions
|
|
48379394
|
2023-03-13T17:11:27
|
|
malloc-fail: Stop using XPath stack frames
There's too much code which assumes that if ctxt->value is non-null,
a value can be successfully popped off the stack. This assumption can
break with stack frames when malloc fails.
Instead of trying to fix all call sites, remove the stack frame logic.
It only offered very little protection against misbehaving extension
functions. We already check the stack size after a function call which
should be enough.
Found by OSS-Fuzz.
|
|
bd63d730
|
2023-03-12T17:40:55
|
|
html: Impose some length limits
Impose length limits on names, attribute values, PIs and comments,
similar to the XML parser.
|
|
b51478dc
|
2023-02-24T16:21:17
|
|
Revert "malloc-fail: Avoid use-after-free after unsuccessful valuePush"
This reverts commit 6a12be77c6a94c374ab7476087edcee2ba41d9b4.
There's too much code reading ctxt->value directly and making the wrong
assumptions.
|
|
6a12be77
|
2023-01-31T12:46:30
|
|
malloc-fail: Avoid use-after-free after unsuccessful valuePush
In xpath.c there's a lot of code like:
valuePush(ctxt, xmlCacheNewX());
...
valuePop(ctxt);
If xmlCacheNewX fails, no value will be pushed on the stack. If there's
no error check in between, valuePop will pop an unrelated value which
can lead to use-after-free errors.
Instead of trying to fix all call sites, we simply stop popping values
if an error was signaled. This requires to change the CHECK_TYPE macro
which is often used to determine whether a value can be safely popped.
Found with libFuzzer, see #344.
|
|
59b33661
|
2022-12-27T14:15:51
|
|
error: Limit number of parser errors
Reporting errors is expensive and some abusive test cases can generate
an error for each invalid input byte. This causes the parser to spend
most of the time with error handling. Limit the number of errors and
warnings to 100.
|
|
b47ebf04
|
2022-12-21T00:02:47
|
|
parser: Deprecate xmlString*DecodeEntities
These are internal functions.
|
|
ce76ebfd
|
2022-12-19T20:56:23
|
|
entities: Stop counting entities
This was only used in the old version of xmlParserEntityCheck.
|
|
463bbeec
|
2022-12-19T18:39:45
|
|
entities: Rework entity amplification checks
This commit implements robust detection of entity amplification attacks,
better known as the "billion laughs" attack.
We now limit the size of the document after substitution of entities to
10 times the size before expansion. This guarantees linear behavior by
definition. There already was a similar check before, but the accounting
of "sizeentities" (size of external entities) and "sizeentcopy" (size of
all copies created by entity references) wasn't accurate.
We also need saturation arithmetic since we're historically limited to
"unsigned long" which is 32-bit on many platforms.
A maximum of 10 MB of substitutions is always allowed. This should make
use cases like DITA work which have caused problems in the past.
The old checks based on the number of entities were removed. This is
accounted for by adding a fixed cost to each entity reference.
Entity amplification checks are now enabled even if XML_PARSE_HUGE is
set. This option is mainly used to allow larger text nodes. Most users
were unaware that it also disabled entity expansion checks.
Some of the limits might be adjusted later. If this change turns out to
affect legitimate use cases, we can add a separate parser option to
disable the checks.
Fixes #294.
Fixes #345.
|
|
f34f184f
|
2022-12-19T15:24:53
|
|
entities: Add "flags" member to struct xmlEntity
This will hold various flags and eventually replace the "checked"
member.
|
|
a6debffd
|
2022-12-08T03:37:24
|
|
xmlexports.h: Disable docs for internal macro XMLPUBLIC
|
|
3b6cc47a
|
2022-12-08T02:51:52
|
|
xmlexports.h: Remove LIBXML_FASTCALL optimization
This was an experimental and undocumented micro-optimization for
Windows which apparently required different calling conventions for
variable-argument functions, making it impossible to maintain without
domain knowledge.
|
|
ce9baf94
|
2022-12-08T02:48:27
|
|
Remove XMLCALL and XMLCDECL macros from public headers
|
|
c73d464a
|
2022-11-24T15:00:03
|
|
threads: Deprecate some internal functions
|
|
707ade22
|
2022-11-22T14:56:58
|
|
Visual Studio builds: Allow silencing deprecation warnings
Define XML_IGNORE_DEPRECATION_WARNINGS and the corresponding XML_POP_WARNINGS
for Visual Studio, and consequently define XML_IGNORE_FPTR_CAST_WARNINGS so
that we do not get a compiler warning on Visual Studio by doing a
__pragma(warning(pop)) without a corresponding __pragma(warning(push)).
Also correct the documentation a bit for XML_POP_WARNINGS.
|
|
b9590d5d
|
2022-11-18T11:23:23
|
|
Visual Studio: Define XML_DEPRECATED
We can mark APIs as deprecated using __declspec(deprecated) with Visual Studio
2005 and later, so add a definition of that so that we can help users avoid
using deprecated APIs when using Visual Studio as well.
For the existing GCC definition, check whether we are on GCC 3.1+ before
enabling the definition.
|
|
68a6518c
|
2022-11-15T18:23:33
|
|
parser: Rewrite push parser boundary checks
Remove inaccurate xmlParseCheckTransition check.
Remove non-incremental xmlParseGetLasts check.
Add functions that check for several boundary constructs more
accurately, keeping track of progress in ctxt->checkIndex.
Fixes #439.
|
|
2059df53
|
2022-11-14T22:27:58
|
|
buf: Deprecate static/immutable buffers
|
|
b693905f
|
2022-11-04T14:50:39
|
|
doc: Remove xmlDllMain from documentation and version script
This is a Windows-only symbol.
|
|
a9669679
|
2022-09-09T01:44:00
|
|
error: Don't use initGenericErrorDefaultFunc
The code in xmlInitParser did only set the error handler if it was NULL
which should never happen.
|
|
0d901258
|
2022-09-04T16:41:43
|
|
Fix Windows compiler warnings in python/types.c
|
|
1e60c768
|
2022-09-04T01:49:41
|
|
Remove HAVE_WIN32_THREADS configuration flag
Check for LIBXML_THREAD_ENABLED and _WIN32 instead.
|
|
59f2f60e
|
2022-09-02T00:27:57
|
|
Remove "runtime debugging"
This doesn't seem useful as configuration option.
|
|
65dc8a63
|
2022-09-01T00:13:19
|
|
Make xmlNewSAXParserCtx take a const sax handler
Also improve documentation.
|
|
39dfb048
|
2022-08-26T02:41:15
|
|
Don't forget to install xmlversion.h
Fixes 7f7961df.
|
|
0f568c0b
|
2022-08-26T01:22:33
|
|
Consolidate private header files
Private functions were previously declared
- in header files in the root directory
- in public headers guarded with IN_LIBXML
- in libxml.h
- redundantly in source files that used them.
Consolidate all private header files in include/private.
|
|
48f84ea8
|
2022-08-25T21:31:08
|
|
Remove internal macros from parserInternals.h
Replace MOVETO_ENDTAG with code that updates line and column numbers.
|
|
58fc89e8
|
2022-08-25T20:57:30
|
|
Deprecate internal parser functions
|
|
a308c0cd
|
2022-08-25T20:18:16
|
|
Deprecate old HTML SAX API
|
|
51035c53
|
2022-08-25T19:53:04
|
|
Generate deprecation warnings for old SAX API
|
|
7f7961df
|
2022-08-25T13:47:16
|
|
Remove generated files from distribution
- libxml2.spec
- libxml-2.0.pc
- xml2-config
- include/libxml/xmlversion.h
- python/libxml2.py
- python/libxml2-export.c
- python/libxml2-py.c
- python/libxml2-py.h
- python/libxml2class.py
- python/libxml2class.txt
- python/setup.py
|
|
34a050cd
|
2022-08-24T16:35:58
|
|
Move some HTML functions to correct header file
|
|
9a82b94a
|
2022-08-24T04:21:58
|
|
Introduce xmlNewSAXParserCtxt and htmlNewSAXParserCtxt
Add API functions to create a parser context with a custom SAX handler
without having to mess with ctxt->sax manually.
|
|
92bb889b
|
2022-08-24T00:03:44
|
|
Don't index anything in DOC_DISABLE sections
Somewhat misleadingly, the DOC_DISABLE directive only disabled warnings.
Now we really stop the documentation generator from indexing.
This results in additional warnings for xmlThrDef* functions. This should
be fixed by documenting or deprecating them.
|
|
75b5bc2b
|
2022-08-23T20:57:49
|
|
Deprecate some global variables
Most of these should be completely unused.
|
|
5264fa11
|
2022-08-18T20:17:27
|
|
Fix warnings from apibuild.py
|
|
e986d09c
|
2022-07-15T14:02:26
|
|
Skip incorrectly opened HTML comments
Commit 4fd69f3e fixed handling of '<' characters not followed by an
ASCII letter. But a '<!' sequence followed by invalid characters should
be treated as bogus comment and skipped.
Fixes #380.
|
|
e9270ef0
|
2022-05-06T10:44:03
|
|
fix Schematron spelling
|
|
67070107
|
2022-04-20T23:17:14
|
|
Add configuration flag for XPointer locations support
Add a new configuration flag that controls whether the outdated support
for XPointer locations (ranges and points) is enabled.
--with-xptr-locs # Autotools
LIBXML2_WITH_XPTR_LOCS # CMake
The latest spec for what it essentially an XPath extension seems to be
this working draft from 2002:
https://www.w3.org/TR/xptr-xpointer/
The xpointer() scheme is listed as "being reviewed" in the XPointer
registry since at least 2006. libxml2 seems to be the only modern
software that tries to implement this spec, but the code has many bugs
and quality issues.
The flag defaults to "off" and support for this extensions has to be
requested explicitly. The relevant API functions are deprecated.
|
|
aab584dc
|
2022-03-06T23:23:43
|
|
Clean up encoding switching code
- Remove xmlSwitchToEncodingInt which was basically just a wrapper
around xmlSwitchInputEncodingInt.
- Simplify xmlSwitchEncoding.
- Improve error handling in xmlSwitchInputEncodingInt.
- Deprecate xmlSwitchInputEncoding.
|
|
2070ade6
|
2022-03-06T23:19:04
|
|
Undeprecate schema init functions
These functions aren't called from xmlInitParser, so it's better to keep
them public for now.
|
|
40483d0c
|
2022-03-06T13:55:48
|
|
Deprecate module init and cleanup functions
These functions shouldn't be part of the public API. Most init
functions are only thread-safe when called from xmlInitParser. Global
variables should only be cleaned up by calling xmlCleanupParser.
|
|
4a8c71eb
|
2022-03-04T03:35:57
|
|
Remove DOCBparser
This code has been broken and deprecated since version 2.6.0, released
in 2003. Because of a bug in commit 961b535c, DOCBparser.c was never
compiled since 2012. I couldn't find a Debian package using any of its
symbols, so it seems safe to remove this module.
|
|
ebb17970
|
2022-03-04T02:31:59
|
|
Remove unneeded #includes
|
|
d7b287b9
|
2021-07-17T14:36:53
|
|
htmlParseComment: handle abruptly-closed comments
See guidance provided on abrutply-closed comments here:
https://html.spec.whatwg.org/multipage/parsing.html#parse-error-abrupt-closing-of-empty-comment
|
|
b66ce0bb
|
2022-03-01T12:39:02
|
|
Don't include ICU headers in public headers
There's no need to make these implementation details public.
|
|
2489c1d0
|
2022-02-28T22:42:10
|
|
Remove useless __CYGWIN__ checks
From what I can tell, some really early Cygwin versions from around
1998-2000 used to erroneously define _WIN32. This was eventually fixed,
but these days, the `defined(_WIN32) && !defined(__CYGWIN__)` idiom is
unnecessary.
Now, we only check for __CYGWIN__ in xmlexports.h when deciding whether
to use __declspec.
|
|
004fe9de
|
2022-02-20T19:02:31
|
|
Deprecate IDREF-related functions in valid.h
These functions are only needed internally for validation.
xmlGetRefs is inherently unsafe because the ref table isn't updated
if attributes are removed (unlike the ids table).
None of the Ubuntu 20.04 packages depending on libxml2 use any of these
functions (except xmlFreeRefTable in libxslt), so it seems perfectly
safe to deprecate them.
Remove xmlIsRef and xmlRemoveRef from the Python bindings.
|
|
61de9297
|
2022-02-20T20:59:14
|
|
Deprecate all functions in DOCBparser.h
|
|
cf4893f7
|
2022-02-20T19:56:41
|
|
Deprecate legacy functions
|
|
9e0ca5a1
|
2022-02-20T19:29:01
|
|
Deprecate all functions in nanoftp.h
|
|
a2fe74c0
|
2022-02-20T18:19:27
|
|
Add XML_DEPRECATED macro
__attribute__((deprecated)) is available since at least GCC 3.1, so an
exact version check is probably unnecessary.
|
|
d7cb33cf
|
2022-01-13T17:06:14
|
|
Rework validation context flags
Use a bitmask instead of magic values to
- keep track whether the validation context is part of a parser context
- keep track whether xmlValidateDtdFinal was called
This allows to add addtional flags later.
Note that this deliberately changes the name of a public struct member,
assuming that this was always private data never to be used by client
code.
|
|
b041d829
|
2022-02-16T19:55:30
|
|
Remove xmlwin32version.h
This file was undocumented and never used anywhere. Maybe users were
supposed to rename this file to xmlversion.h manually. These days, both
CMake and win32/configure.js generate xmlversion.h from xmlversion.h.in,
just like the Autotools build.
|
|
7fe9addc
|
2022-02-13T23:29:51
|
|
Remove CVS and SVN-related code
|
|
7d4060d2
|
2021-05-16T18:00:21
|
|
Add missing file xmlwin32version.h.in to EXTRA_DIST
|
|
ce00c36e
|
2021-05-08T21:20:05
|
|
Store per-element parser state in a struct
Make the parser context's "pushTab" point to an array of structs
instead of void pointers. This avoids casting unrelated types to void
pointers, improving readability and portability, and allows for more
efficient packing. Ultimately, the struct could be extended to include
the contents of "nameTab" and "spaceTab", further simplifying the code.
Historically, "pushTab" was only used by the push parser (hence the
name), so the change to the public headers should be safe.
Also remove an unused parameter from xmlParseEndTag2.
|
|
fb08d9fe
|
2021-03-20T22:02:26
|
|
Fix include order in c14n.h
- Include xmlversion.h before testing feature flags.
- Include libxml headers before extern "C".
Fixes #226.
|
|
1fe38530
|
2020-12-16T15:27:13
|
|
Remove temporary members from struct _xmlXPathContext
These values are hardcoded now and the struct members, while public,
were recently introduced and never part of an official release.
|