|
46f05ea4
|
2025-05-09T00:21:47
|
|
html: Rework meta charset handling
Don't use encoding from meta tags when serializing. Only use the value
in `doc->encoding`, matching the XML serializer. This is the actual
encoding used when parsing.
Stop modifying the input document by setting meta tags before
serializing. Meta tags are now injected during serialization.
Add full support for <meta charset=""> which is also used when adding
meta tags.
Align with HTML5 and implement the "algorithm for extracting a character
encoding from a meta element". Only modify the encoding substring in
Content-Type meta tags.
Only switch encoding once when parsing.
Fix htmlSaveFileFormat with a NULL encoding not to declare a misleading
UTF-8 charset.
Fixes #909.
|
|
f933c898
|
2012-09-07T19:32:12
|
|
Keep non-significant blanks node in HTML parser
For https://bugzilla.gnome.org/show_bug.cgi?id=681822
Regardless if the option HTML_PARSE_NOBLANKS is set or not, blank nodes
are removed from a HTML document, for example:
<html>
<head>
<title>This is a test.</title>
</head>
<body>
<p>This is a test.</p>
</body>
</html>
is read as:
<html><head><title>This is a test.</title></head><body>
<p>This is a test.</p>
</body></html>
This changes the default behaviour but the old behaviour is available
as expected when using the parser flag HTML_PARSE_NOBLANKS
Based on original patch from Igor Ignatyuk <igor_ignatiouk@hotmail.com>
* HTMLparser.c: change various places in the parser where ignorable_space
SAX callback was called without checking for the parser flag preference
* xmllint.c: make sure we use the new flag even for HTML parsing
* result/HTML/*: this modifies the output of a number of tests
|
|
36d73403
|
2005-09-01T09:52:30
|
|
Applied the last patch from Gary Coady for #304637 changing the behaviour
* HTMLparser.c: Applied the last patch from Gary Coady for #304637
changing the behaviour when text nodes are found in body
* result/HTML/*: this changes the output of some tests
Daniel
|
|
42fd4126
|
2003-11-04T08:47:48
|
|
change --html to make sure we use the HTML serialization rule by default
* xmllint.c: change --html to make sure we use the HTML serialization
rule by default when HTML parser is used, add --xmlout to allow to
force the XML serializer on HTML.
* HTMLtree.c: ugly tweak to fix the output on <p> element and
solve #125093
* result/HTML/*: this changes the output of some tests
Daniel
|
|
8c9872ca
|
2002-07-05T18:17:10
|
|
trying to fix 87235 about discarded white spaces in the HTML parser. this
* HTMLparser.c: trying to fix 87235 about discarded white
spaces in the HTML parser.
* result/HTML/*: this changes the output of a number of HTML
regression tests
Daniel
|
|
02bb170a
|
2001-06-13T21:11:59
|
|
- HTMLparser.[ch] HTMLtree.c: stored the inline/block property
of element and use it to avoid outputting formatting spaces at
the wrong place. Implemented the format parameter for HTML save.
- result/HTML/doc2.htm result/HTML/doc3.htm result/HTML/fp40.htm
result/HTML/script.html result/HTML/test2.html result/HTML/test3.html
result/HTML/wired.html: of course this impact the result of a
number of HTML tests
Daniel
|
|
32bc74ef
|
2000-07-14T14:49:25
|
|
- doc/encoding.html doc/xml.html: added I18N doc
- encoding.[ch] HTMLtree.[ch] parser.c HTMLparser.c: I18N encoding
improvements, both parser and filters, added ASCII & HTML,
fixed the ISO-Latin-1 one
- xmllint.c testHTML.c: added/made visible --encode
- debugXML.c : cleanup
- most .c files: applied patches due to warning on Windows and
when using Sun Pro cc compiler
- xpath.c : cleanup memleaks
- nanoftp.c : added a TESTING preprocessor flag for standalong
compile so that people can report bugs more easilly
- nanohttp.c : ditched socklen_t which was a portability mess
and replaced it with unsigned int.
- tree.[ch]: added xmlHasProp()
- TODO: updated
- test/ : added more test for entities, NS, encoding, HTML, wap
- configure.in: preparing for 2.2.0 release
Daniel
|
|
663a607a
|
2000-07-01T09:08:24
|
|
Fixing one test suite result, Daniel.
|
|
71b656e0
|
2000-01-05T14:46:17
|
|
- added xmlRemoveID() and xmlRemoveRef()
- added check and handling when possibly removing an ID
- fixed some entities problems
- added xmlParseTryOrFinish()
- changed the way struct aredeclared to allow gtk-doc to expose those
- closed #4960
- fixes to libs detection from Albert Chin-A-Young
- preparing 1.8.3 release
Daniel
|
|
4a53eca2
|
1999-12-12T13:03:50
|
|
- Updated HTML test outputs
- Fixed taht f....g problem with C++ and includes,
Daniel
|
|
af78a0e1
|
1999-12-12T13:03:50
|
|
Large commit of changes done while travelling to XML'99
- cleanups on memory use and parsers
- start of Link interfaces HTML and XLink
- rebuild the doc
- released as 1.8.0
Daniel
|
|
7c1206fc
|
1999-10-14T09:10:25
|
|
Revamped HTML parsing, lots of bug fixes for HTML stuff,
Added xmlValidGetValidElements and xmlValidGetPotentialChildren,
Completed and cleaned up the tests,
Added doc for new modules gnome-xml-xmlmemory.html and gnome-xml-nanohttp.html,
Daniel
|
|
424af391
|
1999-08-10T19:10:03
|
|
Added and updated all the results for 1.5.0, Daniel
|