kmx git

Commit	Date	Message
46f05ea4	2025-05-09T00:21:47	html: Rework meta charset handling Don't use encoding from meta tags when serializing. Only use the value in `doc->encoding`, matching the XML serializer. This is the actual encoding used when parsing. Stop modifying the input document by setting meta tags before serializing. Meta tags are now injected during serialization. Add full support for <meta charset=""> which is also used when adding meta tags. Align with HTML5 and implement the "algorithm for extracting a character encoding from a meta element". Only modify the encoding substring in Content-Type meta tags. Only switch encoding once when parsing. Fix htmlSaveFileFormat with a NULL encoding not to declare a misleading UTF-8 charset. Fixes #909.
f933c898	2012-09-07T19:32:12	Keep non-significant blanks node in HTML parser For https://bugzilla.gnome.org/show_bug.cgi?id=681822 Regardless if the option HTML_PARSE_NOBLANKS is set or not, blank nodes are removed from a HTML document, for example: <html> <head> <title>This is a test.</title> </head> <body> <p>This is a test.</p> </body> </html> is read as: <html><head><title>This is a test.</title></head><body> <p>This is a test.</p> </body></html> This changes the default behaviour but the old behaviour is available as expected when using the parser flag HTML_PARSE_NOBLANKS Based on original patch from Igor Ignatyuk <igor_ignatiouk@hotmail.com> * HTMLparser.c: change various places in the parser where ignorable_space SAX callback was called without checking for the parser flag preference * xmllint.c: make sure we use the new flag even for HTML parsing * result/HTML/*: this modifies the output of a number of tests
36d73403	2005-09-01T09:52:30	Applied the last patch from Gary Coady for #304637 changing the behaviour * HTMLparser.c: Applied the last patch from Gary Coady for #304637 changing the behaviour when text nodes are found in body * result/HTML/*: this changes the output of some tests Daniel
42fd4126	2003-11-04T08:47:48	change --html to make sure we use the HTML serialization rule by default * xmllint.c: change --html to make sure we use the HTML serialization rule by default when HTML parser is used, add --xmlout to allow to force the XML serializer on HTML. * HTMLtree.c: ugly tweak to fix the output on <p> element and solve #125093 * result/HTML/*: this changes the output of some tests Daniel
8c9872ca	2002-07-05T18:17:10	trying to fix 87235 about discarded white spaces in the HTML parser. this * HTMLparser.c: trying to fix 87235 about discarded white spaces in the HTML parser. * result/HTML/*: this changes the output of a number of HTML regression tests Daniel
02bb170a	2001-06-13T21:11:59	- HTMLparser.[ch] HTMLtree.c: stored the inline/block property of element and use it to avoid outputting formatting spaces at the wrong place. Implemented the format parameter for HTML save. - result/HTML/doc2.htm result/HTML/doc3.htm result/HTML/fp40.htm result/HTML/script.html result/HTML/test2.html result/HTML/test3.html result/HTML/wired.html: of course this impact the result of a number of HTML tests Daniel
32bc74ef	2000-07-14T14:49:25	- doc/encoding.html doc/xml.html: added I18N doc - encoding.[ch] HTMLtree.[ch] parser.c HTMLparser.c: I18N encoding improvements, both parser and filters, added ASCII & HTML, fixed the ISO-Latin-1 one - xmllint.c testHTML.c: added/made visible --encode - debugXML.c : cleanup - most .c files: applied patches due to warning on Windows and when using Sun Pro cc compiler - xpath.c : cleanup memleaks - nanoftp.c : added a TESTING preprocessor flag for standalong compile so that people can report bugs more easilly - nanohttp.c : ditched socklen_t which was a portability mess and replaced it with unsigned int. - tree.[ch]: added xmlHasProp() - TODO: updated - test/ : added more test for entities, NS, encoding, HTML, wap - configure.in: preparing for 2.2.0 release Daniel
663a607a	2000-07-01T09:08:24	Fixing one test suite result, Daniel.
71b656e0	2000-01-05T14:46:17	- added xmlRemoveID() and xmlRemoveRef() - added check and handling when possibly removing an ID - fixed some entities problems - added xmlParseTryOrFinish() - changed the way struct aredeclared to allow gtk-doc to expose those - closed #4960 - fixes to libs detection from Albert Chin-A-Young - preparing 1.8.3 release Daniel
4a53eca2	1999-12-12T13:03:50	- Updated HTML test outputs - Fixed taht f....g problem with C++ and includes, Daniel
af78a0e1	1999-12-12T13:03:50	Large commit of changes done while travelling to XML'99 - cleanups on memory use and parsers - start of Link interfaces HTML and XLink - rebuild the doc - released as 1.8.0 Daniel
7c1206fc	1999-10-14T09:10:25	Revamped HTML parsing, lots of bug fixes for HTML stuff, Added xmlValidGetValidElements and xmlValidGetPotentialChildren, Completed and cleaned up the tests, Added doc for new modules gnome-xml-xmlmemory.html and gnome-xml-nanohttp.html, Daniel
424af391	1999-08-10T19:10:03	Added and updated all the results for 1.5.0, Daniel

kc3-lang/libxml2/result/HTML/fp40.htm

Log