kmx git

Commit	Date	Message
46f05ea4	2025-05-09T00:21:47	html: Rework meta charset handling Don't use encoding from meta tags when serializing. Only use the value in `doc->encoding`, matching the XML serializer. This is the actual encoding used when parsing. Stop modifying the input document by setting meta tags before serializing. Meta tags are now injected during serialization. Add full support for <meta charset=""> which is also used when adding meta tags. Align with HTML5 and implement the "algorithm for extracting a character encoding from a meta element". Only modify the encoding substring in Content-Type meta tags. Only switch encoding once when parsing. Fix htmlSaveFileFormat with a NULL encoding not to declare a misleading UTF-8 charset. Fixes #909.
8cf6129b	2025-02-13T18:20:46	html: Stop implying <p> start tags Only <html>, <head> or <body> should be implied. Opening extra <p> tags has always been a libxml2 quirk.
59511792	2024-09-03T15:52:44	html: Parse named character references according to HTML5
85c112a0	2017-06-12T18:26:11	Add test cases for bug 758518 test/HTML/758518-entity.html exposed a bug in pushParseTest() in runtest.c which assumed that an input file was at least 4 bytes long. That test case is only 3 bytes, so we now take the minimum of 4 bytes or the length of the test input. We also now use 'chunkSize' in place of the hard-coded value '1024' later in the function.

46f05ea4

2025-05-09T00:21:47

html: Rework meta charset handling Don't use encoding from meta tags when serializing. Only use the value in `doc->encoding`, matching the XML serializer. This is the actual encoding used when parsing. Stop modifying the input document by setting meta tags before serializing. Meta tags are now injected during serialization. Add full support for <meta charset=""> which is also used when adding meta tags. Align with HTML5 and implement the "algorithm for extracting a character encoding from a meta element". Only modify the encoding substring in Content-Type meta tags. Only switch encoding once when parsing. Fix htmlSaveFileFormat with a NULL encoding not to declare a misleading UTF-8 charset. Fixes #909.

8cf6129b

2025-02-13T18:20:46

html: Stop implying <p> start tags Only <html>, <head> or <body> should be implied. Opening extra <p> tags has always been a libxml2 quirk.

59511792

2024-09-03T15:52:44

html: Parse named character references according to HTML5

85c112a0

2017-06-12T18:26:11

Add test cases for bug 758518 test/HTML/758518-entity.html exposed a bug in pushParseTest() in runtest.c which assumed that an input file was at least 4 bytes long. That test case is only 3 bytes, so we now take the minimum of 4 bytes or the length of the test input. We also now use 'chunkSize' in place of the hard-coded value '1024' later in the function.

kc3-lang/libxml2/result/HTML/758518-entity.html

result/HTML/758518-entity.html

Log