|
e4cbc295
|
2025-05-20T21:57:01
|
|
parser: Check attribute normalization standalone constraint
To fully implement "VC: Standalone Document Declaration", we have to
check for normalization changes caused by non-CDATA attribute types
declared externally.
Fixes #119.
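As a rough illustration of the normalization in question: section 3.3.3 of XML 1.0 says that for non-CDATA attribute types, leading and trailing spaces are discarded and runs of spaces collapse to a single space. The sketch below is not the libxml2 implementation; `normalize_non_cdata` is a hypothetical helper that applies this step in place and reports whether the value changed, which is the condition the standalone check cares about for externally declared attributes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not libxml2 code. Applies the extra normalization
 * step for non-CDATA attribute values in place: leading and trailing
 * spaces are dropped and runs of spaces collapse to a single space.
 * Returns 1 if the value changed, 0 otherwise. */
static int
normalize_non_cdata(char *value)
{
    char *src = value;
    char *dst = value;
    int changed;

    assert(value != NULL);

    /* Skip leading spaces. */
    while (*src == ' ')
        src++;

    while (*src != '\0') {
        if (*src == ' ') {
            /* Consume the whole run of spaces. */
            while (*src == ' ')
                src++;
            /* A run at the end of the value is trailing: drop it. */
            if (*src == '\0')
                break;
            *dst++ = ' ';
        } else {
            *dst++ = *src++;
        }
    }

    /* If nothing was dropped, both pointers advanced in lockstep. */
    changed = (dst != src);
    *dst = '\0';
    return changed;
}
```

Under these assumptions, a validating parser handling a standalone document would report a validity error whenever this kind of function returns 1 for an attribute whose declaration came from the external subset or an external parameter entity.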
|
|
682195c8
|
2025-05-20T22:00:57
|
|
parser: Fix "Proper Declaration/PE Nesting" validity constraint
Now that we handle "WFC: PE Between Declarations" correctly, we can turn
"Proper Declaration/PE Nesting" from a WFC into a VC as specified.
Fixes #118.
|
|
2f3655c9
|
2025-05-20T19:40:06
|
|
parser: Pop PEs that start markup declarations explicitly
We currently only handle "Validity constraint: Proper Declaration/PE
Nesting", but we must detect "Well-formedness constraint: PE Between
Declarations" separately:
> The replacement text of a parameter entity reference in a DeclSep must
> match the production extSubsetDecl.
PEs in DeclSeps are PEs that start with a full markup declaration (or
another PE). These are handled in xmlParse{Internal|External}Subset. We
set a flag on these PEs and don't close them implicitly in
xmlSkipBlankCharsPE. This will make unterminated declarations in such
PEs cause a parser error. The PEs are closed explicitly in
xmlParse{Internal|External}Subset, the only location where they are
allowed to end.
|
|
dd1961e0
|
2025-05-20T16:37:18
|
|
valid: Skip more validity checks if not validating
|
|
6c2bd975
|
2025-05-20T15:51:18
|
|
valid: Don't validate unused default attributes
See erratum E9 of XML 1.0 Second Edition.
See #120.
|
|
2a60ca06
|
2025-05-20T16:50:32
|
|
valid: Don't check enum values
Rely on the parser to pass valid arguments.
|
|
fca0860d
|
2025-05-19T21:17:39
|
|
tree: Deprecate public struct members related to DTDs
Let's deprecate these members for now. If these are really used, they
can be undeprecated later.
|
|
74ff6c00
|
2025-05-20T22:00:29
|
|
error: Fix line number in entities
Allow line numbers from more domains, see code above.
|
|
4aa7192f
|
2025-05-21T16:32:17
|
|
tests: Add dtor for xmlElementContent in testapi.c
|
|
fc1cabc8
|
2025-05-25T14:03:50
|
|
valid: Also raise duplicate ID error without validation support
Whether an error is raised should not depend on config options.
|
|
3ab040c2
|
2025-05-24T01:12:15
|
|
Fix unidiomatic use of vsnprintf().
* Don't terminate an already-terminated buffer.
* Consistently use 1024-byte buffers.
* While here, consistently use ap for a va_list.
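A minimal sketch of the idiom these points describe, assuming a hypothetical helper named `format_message` (it is not a libxml2 function): `vsnprintf` already NUL-terminates the buffer whenever the size is nonzero, so no manual termination is needed, and the `va_list` is conventionally named `ap`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper, not part of libxml2. Formats a message into a
 * caller-supplied buffer and returns the value reported by vsnprintf,
 * i.e. the length the full message would have had. */
static int
format_message(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;   /* conventional name for a va_list */
    int len;

    assert(buf != NULL && size > 0);

    va_start(ap, fmt);
    /* vsnprintf writes at most size - 1 characters and always appends
     * the terminating NUL when size > 0, so writing buf[size - 1] = 0
     * afterwards would be redundant. */
    len = vsnprintf(buf, size, fmt, ap);
    va_end(ap);

    return len;
}
```

A caller would typically declare `char buf[1024];` and pass `sizeof(buf)` as the size. A return value greater than or equal to the size indicates that the output was truncated.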
|
|
8ea253b8
|
2025-05-24T01:00:25
|
|
Remove bogus casts.
* Casting a string literal to `char *` and then immediately passing or
assigning the result to a `const char *` makes no sense.
* There is no need to cast `int` to `Py_ssize_t` as they have the same
sign and the latter is at least as wide as the former.
|
|
7c9b5535
|
2025-05-19T19:10:55
|
|
doc: Document unused error domains
|
|
47aca2c6
|
2025-05-19T18:43:14
|
|
parser: Only check validity constraints when validating
|
|
3a68d0b7
|
2025-05-19T18:59:51
|
|
SAX2: Handle xml:id errors separately
|
|
172550d2
|
2025-05-18T17:45:11
|
|
parser: Only validate EnumerationTypes when requested
This has quadratic behavior and is only a validity constraint.
|
|
7008740a
|
2025-05-18T01:52:38
|
|
parser: Consolidate scanning of XML Names
Use new productions by default.
Fixes #194.
Fixes #364.
See #707.
|
|
657254a8
|
2025-05-18T01:21:43
|
|
parser: Factor out xmlIsNameCharNew/Old
|
|
315bd443
|
2025-05-17T18:59:52
|
|
meson: Switch to cfg_data.set10()
|
|
4e5945fc
|
2025-05-17T14:41:28
|
|
cmake: Avoid overlinking with non-CMake libxml2-config.cmake
Align libxml2-config.cmake generated by Autotools and Meson with the
CMake version and only add dependencies to libraries when linking
statically. Also set LIBXML_STATIC for static builds.
Fixes #918.
|
|
faaa01b8
|
2025-05-17T12:20:32
|
|
cmake: Make iconv a private dependency
This was only needed for the headers before 2.14.
|
|
70e5d664
|
2025-05-17T01:30:41
|
|
doc: Don't document deprecated headers
|
|
7c82391c
|
2025-05-17T01:01:03
|
|
codegen: Factor out code to generate range tables
|
|
502c5f65
|
2025-05-17T00:11:03
|
|
meson: Dependency on directory doesn't work
|
|
210f5a37
|
2025-05-16T21:18:16
|
|
chvalid: Mark functions as deprecated
|
|
954aae90
|
2025-05-16T21:13:17
|
|
doc: Improve regexp documentation
|
|
cbad60ff
|
2025-05-16T18:31:16
|
|
xmllint: Remove unused macros
|
|
2132150d
|
2025-05-16T18:27:00
|
|
xmllint: Switch to xmlCtxtGetDocument
|
|
c5b45fbc
|
2025-05-16T16:54:09
|
|
doc: Misc fixes
|
|
c4926b19
|
2025-05-16T02:12:23
|
|
codegen: Merge xmlunicode.c into xmlregexp.c
Include generated parts.
Generate xmlChRangeGroups instead of functions for Unicode blocks.
|
|
4cb767e9
|
2025-05-16T01:52:44
|
|
codegen: Only generate tables for character ranges
The rest can be easily maintained manually.
|
|
770c6dec
|
2025-05-16T01:19:19
|
|
buf: Remove ABI compatibility hack
I think this was required when some struct members like
xmlParserInputBuffer::buffer were changed from xmlBuffer to xmlBuf (20+
years ago).
Unfortunately, I missed the opportunity to align xmlBuffer with xmlBuf
before the ABI break.
|
|
344190db
|
2025-05-16T00:54:51
|
|
doc: Document deprecated xmlThrDef* functions
|
|
6f4b4527
|
2025-05-15T23:43:32
|
|
parser: Stop using ctxt->linenumbers
I think this was used to avoid setting the `line` member before it was
added (20+ years ago).
|
|
5ce48ec1
|
2025-05-15T22:51:54
|
|
SAX2: Rework xmlSAX2Text
Simplify and make more readable.
|
|
d834437b
|
2025-05-15T19:12:25
|
|
python: Add deprecation warning
|
|
a05fa9a9
|
2025-05-15T18:41:35
|
|
codegen: Rerun codegen scripts
|
|
258d8706
|
2025-05-15T17:49:49
|
|
codegen: Consolidate tools for code generation
Move tools, source files and output tables into codegen directory.
Rename some files.
Adjust tools to match modified files. Remove generation date and source
files from output.
Distribute all tools and sources.
|
|
0d34d690
|
2025-05-15T17:11:33
|
|
README: Update configuration options
Python is disabled by default now. Mention --prefix.
|
|
adfbeb7e
|
2025-05-14T04:58:21
|
|
doc: Stop using *Ptr typedefs in documentation
|
|
a40f36e7
|
2025-05-14T04:04:28
|
|
include: Stop using *Ptr typedefs in public headers
|
|
0da20b83
|
2025-05-14T04:20:07
|
|
autotools: Quote filenames in doc/Makefile.am
|
|
2d83a84c
|
2025-05-14T00:29:19
|
|
doc: Misc improvements
|
|
87087def
|
2025-05-13T16:19:42
|
|
tests: Remove result files committed by accident
|
|
d6151c23
|
2025-05-13T13:28:28
|
|
libxml2.doap: Remove inactive maintainer
|
|
af4fae5a
|
2025-05-13T12:05:15
|
|
html: Add some comments regarding HTML5 serialization
It seems that the specification of the HTML output method in XSLT 1.0
had a lot of influence on how the HTML serializer in libxml2 ended up:
https://www.w3.org/TR/xslt-10/#section-HTML-Output-Method
There are two remaining behaviors suggested by XSLT 1.0 that don't match
the HTML5 fragment serialization algorithm:
We escape non-ASCII characters in URI attributes (the list of which is
probably outdated). This was originally recommended in appendix B of the
HTML 4.01 spec, but only for user agents:
https://www.w3.org/TR/html401/appendix/notes.html#h-B.2.1
From my experience, any tool that processes HTML should escape as little
as possible. For example, we used to escape many more characters which
are invalid in URIs, but often used in template languages. (Note that we
still escape whitespace and control chars.) Nevertheless, I guess that
some libxslt users continue to expect this behavior from libxml2.
Then we collapse Boolean attributes using an outdated list. This is
mostly a cosmetic issue, but a somewhat important one for libxslt users.
We probably need a serialization option for the xmlsave module that
enables fully HTML5-conformant output.
|
|
b0234633
|
2025-05-13T20:19:39
|
|
encoding: Preserve original encoding label
When using built-in encodings, the label would be normalized which
causes various issues. We now create a copy of the handler with the
original name.
This is somewhat dangerous as it will require users to free built-in
encodings with xmlCharEncCloseFunc. But to handle the general case, this
was already required.
Fixes #916 in another way than originally proposed.
|
|
fcb7a777
|
2025-05-13T22:38:15
|
|
io: Make xmlOutputBufferCreate* not free encoder on error
Revert a530ff12 which was an inadvertent API change.
|
|
5b71dca6
|
2025-05-12T21:39:54
|
|
Fix -Wunterminated-string-initialization warnings
Don't use strings for table.
|
|
cdce17c3
|
2025-05-12T21:21:25
|
|
html: Only map HTML encodings from meta tag
|
|
19b99311
|
2025-05-12T21:07:41
|
|
encoding: Fix -Wswitch warning
|
|
39ae5d12
|
2025-05-12T21:04:41
|
|
save: Add NULL check in xmlBufDumpEntityContent
Short-lived regression.
|
|
c2929b5d
|
2025-05-12T21:01:35
|
|
html: Ignore namespaces when handling meta tags
Revert to old behavior to fix issues with XHTML documents.
|
|
4df8d557
|
2025-05-12T17:31:14
|
|
io: Fix stack use after scope
Short-lived regression.
|
|
f0983199
|
2025-05-12T13:00:20
|
|
html: Map some encodings according to HTML5
Windows-1252 is a superset of ISO-8859-1 and should be used instead.
Same for ASCII.
Also map UCS-2 and UTF-16 to UTF-16LE.
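The mapping described above could be sketched as a small lookup table. This is only an illustration: `map_html_encoding` is a hypothetical name, the table contains just the labels mentioned in this message, and matching is exact, whereas real encoding labels are compared case-insensitively and have many more aliases.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not libxml2 code. Maps an encoding label found
 * in an HTML document to the encoding that should actually be used.
 * Labels without a mapping are returned unchanged. */
static const char *
map_html_encoding(const char *label)
{
    static const struct {
        const char *from;
        const char *to;
    } map[] = {
        { "ISO-8859-1", "windows-1252" },  /* superset of ISO-8859-1 */
        { "ASCII",      "windows-1252" },
        { "UCS-2",      "UTF-16LE" },
        { "UTF-16",     "UTF-16LE" },
    };
    size_t i;

    assert(label != NULL);

    /* Simplification: exact, case-sensitive comparison. */
    for (i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
        if (strcmp(label, map[i].from) == 0)
            return map[i].to;
    }

    return label;
}
```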
|
|
93f67106
|
2025-05-12T12:27:54
|
|
encoding: Add HTML5 aliases
|
|
628006f4
|
2025-05-12T11:47:40
|
|
encoding: Add windows-1252
Fixes #915.
|
|
a7016bae
|
2025-05-12T02:40:36
|
|
tools: Remove unnecessary data from iso8859x.inc
|
|
c92374f1
|
2025-05-12T02:15:11
|
|
tools: Recreate script to generate iso8859x.inc
The script to create these tables was never committed to version
control.
|
|
f602c0c1
|
2025-05-12T00:04:22
|
|
html: Rework serialization of meta encoding attributes
Don't allocate memory.
|
|
7654c2ef
|
2025-05-11T23:37:38
|
|
html: Rework serialization of URIs
Don't allocate memory.
|
|
bd777e4f
|
2025-05-11T22:18:31
|
|
html: Speed up htmlIsBooleanAttr
This is used when serializing.
|
|
825f3a9d
|
2025-05-11T21:38:16
|
|
html: Always serialize attributes with double quotes
Align with HTML5.
|
|
5c4cc456
|
2025-05-11T21:19:22
|
|
html: Escape encoding in meta tags
|
|
0674ccb7
|
2025-05-11T20:55:57
|
|
html: Stop omitting end tags when serializing
Align with HTML5.
|
|
05b8fe0a
|
2025-04-12T23:10:40
|
|
html: Don't escape RAWTEXT and PLAINTEXT
Align with HTML5.
|
|
809ded58
|
2025-04-12T22:50:56
|
|
html: Add more empty elements
Add empty HTML5 elements <bgsound>, <keygen>, <source>, <track> and
<wbr>.
Make <embed> an empty element.
|
|
5f8ebc88
|
2025-05-10T00:56:18
|
|
save: Avoid xmlOutputBufferWriteQuotedString
xmlOutputBufferWriteQuotedString should be reserved for things like
system IDs.
|
|
0d81d6f8
|
2025-05-10T00:52:22
|
|
html: Use xmlOutputBufferWrite if possible
|
|
89fcfe3a
|
2025-05-10T00:14:05
|
|
html: Start to use xmlSerializeText
Avoid temporary copy to speed up serialization.
|
|
777e2adf
|
2025-05-09T23:53:03
|
|
io: Consolidate escaping code
Use generated table approach of xmlSerializeText for xmlEscapeText.
Move most code to xmlIO.c.
|
|
cdaf657f
|
2025-05-09T23:02:32
|
|
html: Don't escape < and > when serializing attribute values
Align with HTML5.
This will break some test suites.
|
|
e0e0a1f0
|
2025-05-09T22:44:54
|
|
html: Remove special handling of &{...} when serializing
See https://www.w3.org/TR/html401/appendix/notes.html#h-B.7.1
Align with HTML5.
|
|
dad11630
|
2025-05-09T22:05:38
|
|
entities: Always replace invalid chars when escaping
The previous refactor painstakingly recreated the different behavior of
the separate functions that were merged. Invalid chars are now always
replaced when escaping.
Optimize IS_CHAR check for non-ASCII chars.
|
|
c8cea39d
|
2025-05-09T21:31:07
|
|
save: Fix serialization of attribute defaults containing <
Long-standing bug that produced invalid XML.
|
|
971038e5
|
2025-05-09T20:26:33
|
|
html: Call lower-level escaping functions
Removes the need to pass a document around.
|
|
63535d39
|
2025-05-09T20:13:43
|
|
tree: Make xmlNodeListGetStringInternal work with escape flags
|
|
442c1903
|
2025-05-09T18:52:36
|
|
doc: Fix some damage from automated conversions
Add some newlines, fix returns.
|
|
98a61c9d
|
2025-05-09T16:48:09
|
|
doc: Fix briefs in tree docs
|
|
4b4bc15a
|
2025-05-09T16:24:35
|
|
doc: Misc fixes to buffer docs
|
|
ad390a5d
|
2025-05-09T15:34:53
|
|
parser: Set doc properties in endDocument SAX handler
|
|
c7c49643
|
2025-05-09T15:26:15
|
|
html: Move DTD creation to endDocument SAX callback
|
|
46f05ea4
|
2025-05-09T00:21:47
|
|
html: Rework meta charset handling
Don't use encoding from meta tags when serializing. Only use the value
in `doc->encoding`, matching the XML serializer. This is the actual
encoding used when parsing.
Stop modifying the input document by setting meta tags before
serializing. Meta tags are now injected during serialization.
Add full support for <meta charset=""> which is also used when adding
meta tags.
Align with HTML5 and implement the "algorithm for extracting a character
encoding from a meta element". Only modify the encoding substring in
Content-Type meta tags.
Only switch encoding once when parsing.
Fix htmlSaveFileFormat with a NULL encoding not to declare a misleading
UTF-8 charset.
Fixes #909.
|
|
9aaa52fe
|
2025-05-08T22:49:20
|
|
tree: Make xmlNodeAddContent work with attributes
|
|
655ac5f8
|
2025-05-07T16:35:09
|
|
html: Add comment regarding hack for XML documents
|
|
f3a080bc
|
2025-05-07T14:32:42
|
|
html: Ignore U+0000 in body text
Align with HTML5. Fixes #908.
|
|
a1e83b24
|
2025-05-07T20:16:17
|
|
io: Fix negation of potentially unsigned value
|
|
b3854fe9
|
2025-05-07T20:20:31
|
|
reader: Fix null deref on malloc failure
Short-lived regression from 177067ea.
|
|
6684eb93
|
2025-05-07T20:13:59
|
|
fuzz: Fix out-of-tree build
|
|
6bd380ce
|
2025-05-07T14:32:26
|
|
fuzz: Update README
|
|
967df734
|
2025-05-07T13:03:11
|
|
malloc-fail: Handle malloc failure in xmlSchemaCopyValue
Avoid null pointer dereference. Fixes #905.
|
|
4ed71574
|
2025-05-09T11:58:01
|
|
python: Fix use-after-free in xmlPythonFileReadRaw() and xmlPythonFileRead() with Python 2
Fixes #910.
|
|
38ea8fa9
|
2025-05-06T18:31:45
|
|
doc: Fix varargs
|
|
9bbffec5
|
2025-05-06T17:42:46
|
|
doc: Move brief to top, params to bottom of doc comments
|
|
7bc7ae9d
|
2025-05-06T15:30:46
|
|
doc: Enable Doxygen autobrief
|
|
ab13fbfd
|
2025-05-06T14:06:43
|
|
doc: Misc fixes to error docs
|
|
b1685459
|
2025-05-06T12:50:52
|
|
doc: Misc fixes to xmlsave docs
|
|
7d689fab
|
2025-05-06T10:54:46
|
|
doc: Fix doc installation with Autotools
|
|
7b59e74c
|
2025-05-06T10:54:18
|
|
doc: Always use case sensitive filenames with Doxygen
Avoid platform-specific behavior.
|
|
298f70b3
|
2025-05-05T21:36:36
|
|
doc: Misc fixes to HTML tree docs
|