fixed a nasty bug #119387, bad heuristic in the progressive HTML parser

* HTMLparser.c: fixed a nasty bug #119387: a bad heuristic in the
  progressive HTML parser front-end, triggered by large character data
  islands, led to an erroneous end-of-data detection by the parser.
  Some cleanup too, to bring the code closer to the XML progressive
  parser.
Daniel