|
aabc0847
|
2012-08-10T12:34:24
|
|
Fix compiler warnings of wincecompat.c
For https://bugzilla.gnome.org/show_bug.cgi?id=681592
*) Add an explicit cast when converting FILE* to int
*) Don't assign a c-string to the element of a char-array
|
|
fd4f6fdd
|
2012-08-13T17:54:20
|
|
Fix non __GNUC__ build
For https://bugzilla.gnome.org/show_bug.cgi?id=681590
Length member of _xmlDictEntry is called "len" and not "l"
|
|
3b666224
|
2012-08-13T17:49:15
|
|
Fix const qualifier in the definition of xmlBufferDetach
For https://bugzilla.gnome.org/show_bug.cgi?id=676629
As the buffer is being modified by the call the const doesn't make
sense.
|
|
5a82e48e
|
2012-08-13T17:39:06
|
|
Fix windows unicode build
For https://bugzilla.gnome.org/show_bug.cgi?id=638650
After much discussion on the list:
https://mail.gnome.org/archives/xml/2012-May/msg00062.html
The simplest at this point is to fall back to only officially
supporting ASCII names in those APIs, document it and use
the "A" entry points on Windows.
|
|
c3b1d09b
|
2012-08-13T16:50:48
|
|
clean redefinition of {v}snprintf in C-source
as those from *config.h are preferable (e.g. win32config.h)
|
|
1f0453f7
|
2012-08-13T16:56:11
|
|
minimize use of HAVE_CONFIG_H
as the build process for supported platforms provides the "config.h" header file
|
|
8886f335
|
2012-08-13T16:38:09
|
|
fixup regression in Various "make distcheck" and portability fixups
Was using the wrong variable and adds proper m4 quoting
|
|
968a03a2
|
2012-08-13T12:41:33
|
|
Add support for big line numbers in error reporting
Fix the lack of line number as reported by Johan Corveleyn <jcorvel@gmail.com>
* parser.c include/libxml/parser.h: add an XML_PARSE_BIG_LINES parser
option, not switched on by default, it's an opt-in
* SAX2.c: if XML_PARSE_BIG_LINES is set store the long line numbers
in the psvi field of text nodes
* tree.c: expand xmlGetLineNo to extract that information, also
make sure we can't fail on recursive behaviour
* error.c: in __xmlRaiseError, if a node is provided, call
xmlGetLineNo() if we can't get a valid line number.
* xmllint.c: switch on XML_PARSE_BIG_LINES in xmllint
|
|
264cee69
|
2012-08-13T12:40:53
|
|
Add a missing element check
|
|
aa017c54
|
2012-08-10T10:42:56
|
|
Release candidate 1 of libxml2-2.9.0
* configure.in libxml.spec.in python/setup.py: bumped release numbers
* doc//*: regenerated as part of the release
|
|
28cc42d0
|
2012-08-10T10:00:18
|
|
Regenerating docs and API files
Various cleanups
* configure.in: force regeneration of APIs in my environment
* buf.c buf.h enc.h encoding.c include/libxml/tree.h
include/libxml/xmlerror.h save.h tree.c: various comment cleanups
pointed by apibuild
* doc/apibuild.py: added the 3 new internal headers in the excludes
* doc/libxml2-api.xml doc/libxml2-refs.xml: regenerated the API
* doc/symbols.xml: listing new entry points for 2.9.0
* doc/devhelp/*: regenerated
|
|
3e62adbe
|
2012-08-09T14:24:02
|
|
Adding various checks on node type through the API
Specifically checking against namespace nodes before accessing node
pointers
|
|
6ca24a39
|
2012-08-08T15:31:55
|
|
Namespace nodes can't be unlinked with xmlUnlinkNode
|
|
89b6f73a
|
2012-08-04T05:09:56
|
|
use xmlBuf... if DEBUG_INPUT is defined
|
|
c15df7d4
|
2012-08-07T15:15:04
|
|
Avoid using xmlBuffer for serialization
Mostly an optimization to avoid xmlBuffer->xmlBuf conversions
and use the new code.
|
|
7f713494
|
2012-08-07T14:34:53
|
|
Improve compatibility between xmlBuf and xmlBuffer
An old xsltproc binary now works correctly with the new libxml2
|
|
495a73df
|
2012-08-07T10:14:56
|
|
fix runtests to use pthreads support for various Unix platforms
The runtests program currently fails with
Specific platform thread support not detected
on HP-UX, AIX and other Unix systems which do not match the conditional
#if defined(linux) || defined(__sun) || defined(__APPLE_CC__)
It is silly to try to enumerate all systems which use pthreads in a conditional
like this. I am attaching a patch (against git master) that rewrites the cpp
conditional structure so that pthreads is used if HAVE_PTHREAD_H is defined,
and moves that section of code down below the Win32 and BeOS cases so that
native thread libraries are used preferentially in those two cases.
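A rough sketch of the reordered conditional, assuming the native Win32 and BeOS cases are tested first and pthreads is keyed on the configure result instead of a list of platform macros. The DEMO_THREAD_BACKEND macro, the demo_thread_backend() helper and the exact platform tests are illustrative assumptions, not the code in runtest.c.

```c
#include <assert.h>
#include <string.h>

/* Native thread libraries come first; pthreads follows and depends only
 * on HAVE_PTHREAD_H, so no per-platform enumeration is needed. */
#if defined(_WIN32) && !defined(__CYGWIN__)
#  define DEMO_THREAD_BACKEND "win32"
#elif defined(__BEOS__)
#  define DEMO_THREAD_BACKEND "beos"
#elif defined(HAVE_PTHREAD_H)
#  define DEMO_THREAD_BACKEND "pthreads"
#else
#  define DEMO_THREAD_BACKEND "none"
#endif

/* Report which branch of the conditional was selected at compile time. */
static const char *demo_thread_backend(void) {
    return DEMO_THREAD_BACKEND;
}
```

Compiling with -DHAVE_PTHREAD_H on a Unix system would select the pthreads branch in this sketch.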
|
|
5d6c02ba
|
2012-08-07T10:05:34
|
|
Various "make distcheck" and portability fixups 2nd part
doc/examples/Makefile.am:
* Use $(VAR), not @VAR@
* Use $(MKDIR_P) instead of $(mkinstalldirs), as the latter is an
obsolete name
* Added $(srcdir) qualification to the various test program invocations
in the "tests" target. More work is needed here (notably, when the
reference output contains the path to the input file), but this gets
things a lot closer to working correctly in an out-of-source build.
doc/examples/reader4.res:
* Added "./" path qualifiers so that the reader4 test continues to pass
cleanly for in-source builds
python/tests/Makefile.am:
* Symlink in test input files for out-of-source builds
|
|
5706b6d8
|
2012-08-06T11:32:54
|
|
Various "make distcheck" and portability fixups
Makefile.am:
* Don't use @VAR@, use $(VAR). Autoconf's AC_SUBST provides us the Make
variable, it allows overriding the value at the command line, and
(notably) it avoids a Make parse error in the libxml2_la_LDFLAGS
assignment when @MODULE_PLATFORM_LIBS@ is empty
* Changed how the THREADS_W32 mechanism switches the build between
testThreads.c and testThreadsWin32.c as appropriate; using AM_CONDITIONAL
allows this to work cleanly and plays well with dependencies
* testapi.c should be specified as BUILT_SOURCES
* Create symlinks to the test/ and result/ subdirs so that the runtests
target is usable in out-of-source-tree builds
* Don't do MAKEFLAGS+=--silent as this is not portable to non-GNU Makes
* Fixed incorrect find(1) syntax in the "cleanup" rule, and doing "rm -f"
instead of just "rm" is good form
* (DIST)CLEANFILES needed a bit more coverage to allow "make distcheck" to
pass
configure.in:
* Need AC_PROG_LN_S to create test/ and result/ symlinks in Makefile.am
* AC_LIBTOOL_WIN32_DLL and AM_PROG_LIBTOOL are obsolete; these have been
superseded by LT_INIT
* Don't rebuild docs by default, as this requires GNU Make (as
implemented)
* Check for uint32_t as some platforms don't provide it
* Check for some more functions, and undefine HAVE_MMAP if we don't also
HAVE_MUNMAP (one system I tested on actually needed this)
* Changed THREADS_W32 from a filename insert into an Automake conditional
* The "Copyright" file will not be in the current directory if builddir !=
srcdir
doc/Makefile.am:
* EXTRA_DIST cannot use wildcards when they refer to generated files; this
breaks dependencies. What I did was define EXTRA_DIST_wc, which uses GNU
Make $(wildcard) directives to build up a list of files, and EXTRA_DIST,
as a literal expansion of EXTRA_DIST_wc. I also added a new rule,
"check-extra-dist", to simplify checking that the two variables are
equivalent. (Note that this works only when builddir == srcdir)
(I can implement this differently if desired; this is just one way of
doing it)
* Don't define an "all" target; this steps on Automake's toes
* Fixed up the "libxml2-api.xml ..." rule by using $(wildcard) for
dependencies (as Make doesn't process the wildcards otherwise) and
qualifying appropriate files with $(srcdir)
(Note that $(srcdir) is not needed in the dependencies, thanks to VPATH,
which we can count on as this is GNU-Make-only code anyway)
doc/devhelp/Makefile.am:
* Qualified appropriate files with $(srcdir)
* Added an "uninstall-local" rule so that "make distcheck" passes
doc/examples/Makefile.am:
* Rather than use a wildcard that doesn't work, use a substitution that
most Make programs can handle
doc/examples/index.py:
* Do the same here
include/libxml/nanoftp.h:
* Some platforms (e.g. MSVC 6) already #define INVALID_SOCKET:
user@host:/cygdrive/c/Program Files/Microsoft Visual Studio/VC98/\
Include$ grep -R INVALID_SOCKET .
./WINSOCK.H:#define INVALID_SOCKET (SOCKET)(~0)
./WINSOCK2.H:#define INVALID_SOCKET (SOCKET)(~0)
include/libxml/xmlversion.h.in:
* Support ancient GCCs (I was actually able to build the library with 2.5
but for this bit)
python/Makefile.am:
* Expanded CLEANFILES to allow "make distcheck" to pass
python/tests/Makefile.am:
* Define CLEANFILES instead of a "clean" rule, and added tmp.xml to allow
"make distcheck" to pass
testRelax.c:
* Use HAVE_MMAP instead of the less explicit HAVE_SYS_MMAN_H (as some
systems have the header but not the function)
testSchemas.c:
* Use HAVE_MMAP instead of the less explicit HAVE_SYS_MMAN_H
testapi.c:
* Don't use putenv() if it's not available
threads.c:
* This fixes the following build error on Solaris 8:
libtool: compile: cc -DHAVE_CONFIG_H -I. -I./include -I./include \
-D_REENTRANT -D__EXTENSIONS__ -D_REENTRANT -Dsparc -Xa -mt -v \
-xarch=v9 -xcrossfile -xO5 -c threads.c -KPIC -DPIC -o threads.o
"threads.c", line 442: controlling expressions must have scalar type
"threads.c", line 512: controlling expressions must have scalar type
cc: acomp failed for threads.c
*** Error code 1
trio.c:
* Define isascii() if the system doesn't provide it
trio.h:
* The trio library's HAVE_CONFIG_H header is not the same as LibXML2's
HAVE_CONFIG_H header; this change is needed to avoid a double-inclusion
win32/configure.js:
* Added support for the LZMA compression option
win32/Makefile.{bcb,mingw,msvc}:
* Added appropriate bits to support WITH_LZMA=1
* Install the header files under $(INCPREFIX)\libxml2\libxml instead of
$(INCPREFIX)\libxml, to mirror the install location on Unix+Autotools
xml2-config.in:
* @MODULE_PLATFORM_LIBS@ (usually "-ldl") needs to be in there in order for
`xml2-config --libs` to provide a complete set of dependencies
xmllint.c:
* Use HAVE_MMAP instead of the less-explicit HAVE_SYS_MMAN_H
|
|
e258adec
|
2012-08-06T11:16:30
|
|
Provide new accessors for xmlOutputBuffer
To avoid digging into the buf->buffer internal structure the two
new entry points xmlOutputBufferGetContent() and
xmlOutputBufferGetSize() should make the code cleaner.
* include/libxml/xmlIO.h: add two new functions
* xmlIO.c: implement the 2 functions based on the new buffer entry points
|
|
187e5290
|
2012-08-06T10:27:58
|
|
Fix make dist to include new private header files
|
|
18e1f1f1
|
2012-08-06T10:16:41
|
|
Improvements for old buffer compatibility
Now tree.h exports LIBXML2_NEW_BUFFER macro indicating that the
API uses the new buffers, important to keep code working with
both versions.
* tree.h buf.h: also export xmlBufContent(), xmlBufEnd(), and xmlBufUse()
to help port the old code
* buf.c: make sure the compatibility counters are updated on
buffer usage, to keep proper working of application compiled
against the old structures, but take care of int overflow
|
|
3f0c613f
|
2012-08-03T12:04:09
|
|
Expand the limit test program
|
|
5353bbf7
|
2012-08-03T12:03:31
|
|
More fixups on the push parser behaviour
|
|
2b52aa00
|
2012-07-31T10:53:47
|
|
Strengthen behaviour of the push parser in problematic situations
Implement the maximum lookahead strategy, and fix some handling
of DTD to speed up processing.
|
|
e7bf892d
|
2012-07-30T20:09:25
|
|
Improve error reporting on parser errors
The extra string was being dismissed when provided.
* parser.c: handle both cases properly
* result/: this changes a few error reports
|
|
48b4cdde
|
2012-07-30T16:16:04
|
|
Enforce XML_PARSER_EOF state handling through the parser
That condition is one raised when the parser should positively stop
processing further even to report errors. Best is to test it after
most GROW calls, especially within loops
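A minimal sketch of the pattern, assuming a parser whose refill step may decide that processing must stop. The types and functions (demo_parser, demo_grow, demo_count_a) are hypothetical stand-ins, not the parser.c macros.

```c
#include <assert.h>
#include <stddef.h>

enum demo_state { DEMO_RUNNING, DEMO_EOF };

typedef struct demo_parser {
    enum demo_state state;
    const char *cur;
    const char *end;
    int budget;     /* hypothetical: refills allowed before a forced stop */
} demo_parser;

/* Stand-in for a GROW step: it may decide that parsing must stop. */
static void demo_grow(demo_parser *p) {
    if (p->budget-- <= 0)
        p->state = DEMO_EOF;
}

/* Count leading 'a' bytes, re-checking the stop state after every grow. */
static size_t demo_count_a(demo_parser *p) {
    size_t n = 0;
    while (p->cur < p->end && *p->cur == 'a') {
        p->cur++;
        n++;
        demo_grow(p);
        if (p->state == DEMO_EOF)
            break;      /* stop at once, no further processing */
    }
    return n;
}
```

Without the check inside the loop the function would keep consuming input after the stop condition was raised.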
|
|
0df83cae
|
2012-07-30T15:41:10
|
|
Fixup limits parser
|
|
cd852ad1
|
2012-07-30T10:12:18
|
|
Implement some default limits in the XPath module
This adds some internal limitations on XPath expression complexity,
and limits at runtime like depth of the stack and maximum size
for nodeset.
* xpath.c: implement the above as well as the maximum Name length
|
|
52d8ade7
|
2012-07-30T10:08:45
|
|
Introduce some default parser limits
Those can be overridden by the XML_PARSE_HUGE option, they
are just default limits for Name length, dictionary size limits
and maximum amount of parser lookup.
* include/libxml/parserInternals.h: define the limits
* include/libxml/xmlerror.h: add a new error
* parser.c parserInternals.c: implements the new limits
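How a default limit and its override interact can be sketched as below. The option bit, the numeric limit and the function name are illustrative assumptions; the real constants live in include/libxml/parserInternals.h.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PARSE_HUGE      (1u << 0)  /* hypothetical option bit */
#define DEMO_MAX_NAME_LENGTH 50000u     /* illustrative default limit */

/* Return 1 if a name of this length is acceptable, 0 if it exceeds the
 * default limit. The huge option disables the check entirely. */
static int demo_name_length_ok(size_t len, unsigned options) {
    if (options & DEMO_PARSE_HUGE)
        return 1;
    return len <= DEMO_MAX_NAME_LENGTH;
}
```

A caller that fails this check would raise the new error instead of continuing to accumulate the name.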
|
|
7c693dad
|
2012-07-25T16:32:18
|
|
Cleanups and new limit APIs for dictionaries
* include/libxml/dict.h dict.c: adding 2 new functions xmlDictGetUsage
and xmlDictSetLimit allowing to review the amount of memory allocated
for dictionary strings. Also cleanup of various signed int used as
size values in the code.
|
|
6f6feba8
|
2012-07-25T16:30:56
|
|
Fixup for buf.c
|
|
57560386
|
2012-07-24T11:44:23
|
|
Cleanup URI module memory allocation code
* uri.c: cleanup the code doing the allocations, set up a structured
error handler to report memory errors, and set up an arbitrary
limit on URI saving size
* error.c include/libxml/xmlerror.h: add a new FROM_URI indication
for structured error reporting, also adding strings for schematron
and buffer which were missing
|
|
747c2c10
|
2012-07-19T20:36:43
|
|
Extend testlimits
|
|
f572a78d
|
2012-07-19T20:36:25
|
|
Avoid more quadratic behaviour
|
|
51304816
|
2012-07-19T20:34:26
|
|
Impose a reasonable limit on PI size
Unless the XML_PARSE_HUGE option is given to the parser,
the value is XML_MAX_TEXT_LENGTH, i.e. the same as for a
text node within content.
Also cleanup some unsigned int used for memory size.
|
|
0de1f311
|
2012-07-18T17:43:34
|
|
first version of testlimits new test
Used to check behaviour on various parsing limits
|
|
65686451
|
2012-07-19T18:25:01
|
|
Avoid quadratic behaviour in some push parsing cases
avoid rescanning over and over a very long input, just check
the incoming chunks
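The idea can be sketched as an incremental terminator search that remembers how far it has already looked. This is an assumption-laden illustration: demo_scan, demo_find and the scanned counter are hypothetical and only show the cost model, not the push parser code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct demo_scan {
    size_t checked;   /* bytes already known not to contain the terminator */
    size_t scanned;   /* total bytes handed to memchr, kept to show the cost */
} demo_scan;

/* Look for `term` in buf[0..len); returns its index or -1.
 * Only the bytes past `checked` are examined on each call. */
static long demo_find(demo_scan *s, const char *buf, size_t len, char term) {
    const char *hit;
    if (s->checked > len)
        s->checked = len;
    s->scanned += len - s->checked;
    hit = memchr(buf + s->checked, term, len - s->checked);
    if (hit != NULL)
        return (long) (hit - buf);
    s->checked = len;   /* everything seen so far is terminator free */
    return -1;
}
```

With this bookkeeping the total work is proportional to the input size, rather than to the input size times the number of chunks.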
|
|
58f73aca
|
2012-07-19T11:58:47
|
|
Impose a reasonable limit on comment size
Unless the XML_PARSE_HUGE option is given to the parser,
the value is XML_MAX_TEXT_LENGTH, i.e. the same as for a
text node within content.
Also cleanup some unsigned int used for memory size.
|
|
e17db994
|
2012-07-19T11:25:16
|
|
Impose a reasonable limit on attribute size
Unless the XML_PARSE_HUGE option is given to the parser,
the value is XML_MAX_TEXT_LENGTH, i.e. the same as for a
text node within content.
|
|
b60e612e
|
2012-07-18T16:21:17
|
|
Small cleanup of unused variables in test
|
|
9ee02f80
|
2012-07-16T19:57:42
|
|
Harden the buffer code and make it more compatible
Mimic the old xmlBuffer structure in xmlBuf to avoid catastrophic
failures in case of old code directly reading ctxt->input->buf->buffer
Check on all buffer entry points if an error previously occurred on
the buffer, and fail the operation if this is the case, the buffer
becomes immutable and unreadable.
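The sticky error check can be sketched as follows, assuming a fixed-capacity buffer for brevity. The type and function names (demo_ebuf, demo_ebuf_add, demo_ebuf_content) are hypothetical and do not match the buf.c entry points.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct demo_ebuf {
    char data[16];
    size_t use;
    int error;      /* set once, never cleared */
} demo_ebuf;

/* Append bytes; any failure poisons the buffer for all later calls. */
static int demo_ebuf_add(demo_ebuf *b, const char *s, size_t len) {
    if (b == NULL || b->error)
        return -1;
    if (len > sizeof(b->data) - b->use) {
        b->error = 1;
        return -1;
    }
    memcpy(b->data + b->use, s, len);
    b->use += len;
    return 0;
}

/* Readers refuse a poisoned buffer as well. */
static const char *demo_ebuf_content(const demo_ebuf *b) {
    if (b == NULL || b->error)
        return NULL;
    return b->data;
}
```

Every entry point starts with the same test, so one failed operation makes the buffer both immutable and unreadable.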
|
|
00ac0d3b
|
2012-07-16T18:03:01
|
|
More cleanups for input/buffers code
When calling xmlParserInputBufferPush, the buffer may be reallocated
and at the input level the pointers for base, cur and end need to
be reevaluated.
* buf.c buf.h: add two new functions, one to get the base from the
input of the buffer, and another one to reset the pointers based
on the cur and base index
* HTMLparser.c parser.c: cleanup to use the new helper functions
as well as making sure size_t is used for the index computations
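Why the pointers have to be recomputed can be sketched like this: positions are saved as size_t offsets before the block may move, then rebuilt from the new base. The demo_input struct and demo_push function are hypothetical simplifications, not the helpers added to buf.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct demo_input {
    char *base;   /* start of the data */
    char *cur;    /* current read position */
    char *end;    /* one past the last valid byte */
} demo_input;

/* Append data. The block may move, so positions are saved as size_t
 * offsets before realloc() and the three pointers are rebuilt afterwards.
 * Returns 0 on success, -1 on allocation failure (input left untouched). */
static int demo_push(demo_input *in, const char *data, size_t len) {
    size_t cur_off = in->base ? (size_t) (in->cur - in->base) : 0;
    size_t used = in->base ? (size_t) (in->end - in->base) : 0;
    char *nbase = realloc(in->base, used + len + 1);
    if (nbase == NULL)
        return -1;
    memcpy(nbase + used, data, len);
    nbase[used + len] = '\0';
    in->base = nbase;
    in->cur = nbase + cur_off;
    in->end = nbase + used + len;
    return 0;
}
```

Keeping cur as a raw pointer across the realloc() would leave it dangling whenever the allocator moves the block.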
|
|
61551a1e
|
2012-07-16T16:28:47
|
|
Cleanup function xmlBufResetInput() to set input from Buffer
This was scattered in a number of modules, an xmlParserInputPtr
usually has its base, cur and end pointers set from an
xmlBuf used as input.
* buf.c buf.h: add a new function implementing this setup
* parser.c HTMLparser.c catalog.c parserInternals.c xmlreader.c
use the new function instead of digging into the buffer in
all those modules
|
|
145477d8
|
2012-07-16T14:59:29
|
|
Switch the test program for characters to new input buffers
it was manipulating the buffer content and structures directly
this cleans it up
|
|
7b9b0719
|
2012-07-16T14:58:02
|
|
Convert the HTML tree module to the new buffers
The new input buffers induced a couple of changes, the others
are related to the switch to xmlBuf in saving routines.
|
|
a78d8036
|
2012-07-16T14:56:50
|
|
Convert the HTML parser to new input buffers
Changes similar to the ones done in the XML parser for the
routines which are not shared.
|
|
dbf5411b
|
2012-07-16T14:54:45
|
|
Convert the writer to new output buffer and save APIs
Only a handful of places had to be converted for xmlBuf and
the new saving entry point.
|
|
8aebce3e
|
2012-07-16T14:42:31
|
|
Convert XMLReader to the new input buffers
A few direct accesses were replaced, and also one internal
xmlBuffer structure is converted to use xmlBuf instead
|
|
50cdab55
|
2012-07-16T14:52:00
|
|
New saving functions using xmlBuf and conversion
* save.h: new header providing new functions currently internal
and xmlBuf counterparts of old xmlBuffer based ones
* xmlsave.c: convert functions to use xmlBuf as much as possible
|
|
dddeede0
|
2012-07-16T14:44:26
|
|
Provide new xmlBuf based saving functions
* include/libxml/tree.h: adds xmlBufGetNodeContent and xmlBufNodeDump
as xmlBuf based equivalents of xmlNodeGetContent and xmlNodeDump
* tree.c: implements one new routine and converts xmlNodeBufGetContent
to use the xmlBuf equivalent. It should behave better as a result
in case of data larger than 2GB.
|
|
345ee8b6
|
2012-07-16T14:40:37
|
|
Convert XInclude to the new input buffers
A few xmlBuffer...() calls changed to their xmlBuf...() counterparts
|
|
2a1d2422
|
2012-07-16T14:38:14
|
|
Convert catalog code to the new input buffers
Only one place where the buffer fields were accessed directly
|
|
53aa293d
|
2012-07-16T14:37:00
|
|
Convert C14N to the new Input buffer
one case of direct access cleaned up
|
|
a6a6e70c
|
2012-07-16T14:22:54
|
|
Convert xmlIO.c to the new input and output buffers
Relatively mechanical changes, this also led to a couple of fixes
upon review of the I/O code on buffer usage.
|
|
768eb3b8
|
2012-07-16T14:19:49
|
|
Convert XML parser to the new input buffers
The main changes are where the internals of the buffer structure
were addressed directly, we now use routines coming from buf.h
The routine xmlParserInputRead() which wasn't used anywhere is
deprecated too.
|
|
65c7d3b2
|
2012-07-16T14:13:58
|
|
Incompatible change to the Input and Output buffers
Since the whole set of structures was public, the only way
to switch to size_t clean buffer is to introduce an incompatible
API change. Modifying the xmlParserInputBuffer and xmlOutputBuffer
structures is the best place to make this change as those
structures are deep into the parser feeding data, and no public
API suggests building those manually.
|
|
18d0db25
|
2012-07-13T19:51:15
|
|
Adding new encoding function to deal with the new structures
* encoding.c: adds xmlCharEncFirstLineInput, xmlCharEncInput and
xmlCharEncOutput
* enc.h: the functions are not made public but added to this new header
|
|
ade10f2c
|
2012-07-12T09:43:27
|
|
Convert XPath to xmlBuf
Easy as no buffer was exported in the APIs
|
|
bca22f40
|
2012-07-11T16:48:47
|
|
Adding a new buf module for buffers
This also adds converter functions between xmlBuf and xmlBuffer
* buf.c buf.h: the old xmlBuffer routines but modified for size_t
and using xmlBuf instead of xmlBuffer
* Makefile.am: add the 2 new files
* include/libxml/xmlerror.h: add an entry for the new module
* include/libxml/tree.h: expose the xmlBufPtr type but not the
structure which stays private
|
|
4629ee02
|
2012-07-23T14:15:40
|
|
Do not fetch external parsed entities
Unless explicitly asked for when validating or replacing entities
with their value. Problem pointed out by Tom Lane <tgl@redhat.com>
* parser.c: do not load external parsed entities unless needed
* test/errors/extparsedent.xml result/errors/extparsedent.xml*:
add a regression test to avoid change of the behaviour in the future
|
|
baaf03f8
|
2012-07-20T15:41:34
|
|
Fix an error in previous commit
|
|
4f9fdc70
|
2012-07-18T11:38:17
|
|
Fix entities local buffers size problems
|
|
459eeb9d
|
2012-07-17T16:19:17
|
|
Fix parser local buffers size problems
|
|
740cb1a4
|
2012-07-18T16:05:37
|
|
Memory error within SAX2 reuse common framework
There is no reason for that class of errors to not use
the same handling allowing structured error processing.
|
|
c508fa3f
|
2012-07-18T17:39:56
|
|
Fix a failure to report xmlreader parsing failures
Related to https://bugzilla.gnome.org/show_bug.cgi?id=654567
the problem is that the provided patch failed to raise an error
on xmlTextReaderRead() return when an actual parsing error occurred
|
|
549f06a8
|
2012-07-11T15:21:12
|
|
Expand .gitignore with more files
|
|
8fc913fc
|
2012-06-06T11:29:29
|
|
Fix compilation on older Visual Studio
For https://bugzilla.gnome.org/show_bug.cgi?id=666491
Reported by Matt Budd <matt.budd@gmail.com>, the added support
for VS 2010 broke older versions 2005 and 2008 because it assumed
some of the defines were present in all versions, fix that
to check the version of VS
|
|
2e1eaca6
|
2012-05-25T16:44:20
|
|
Fix xmllint --xpath node initialization
By default it's more sensible to initialize it to the document itself
than the root element
|
|
c943f708
|
2012-05-23T17:10:59
|
|
Release of libxml2-2.8.0
- Makefile.am: don't package .git
- configure.in : update to new release
- doc/xml.html: added the new release
- doc/* testapi.c: regenerated
|
|
22030ef8
|
2012-05-23T15:52:45
|
|
Restore code for Windows compilation
Try to keep as close to rc1 but still allow the change from Roumen for
mingw
|
|
ee8f1d4c
|
2012-05-21T11:16:12
|
|
Cleanups before 2.8.0-rc2
new symbols, a missing comment and a fix on symbol release
|
|
978ff224
|
2012-05-20T16:07:54
|
|
use mingw C99 compatible functions {v}snprintf instead of those from MSVC runtime
|
|
f27c6683
|
2012-05-21T10:15:40
|
|
New symbols added for the next release
|
|
59df1e4f
|
2012-05-21T10:14:34
|
|
Avoid an extra operation
In the catalog code, tsan also complained of testing
the variable without locking and that was done a few lines below
|
|
d495e6a8
|
2012-05-20T20:48:34
|
|
Part for rand_r checking missing
Forgot to push that change in previous commit
|
|
379ebc1d
|
2012-05-18T15:41:31
|
|
Cleanup on randomization
tsan reported that rand() is not thread safe, so create
a thread safe wrapper, use rand_r() if available.
Consolidate the function, initialization and cleanup in
dict.c and make sure it is initialized in xmlInitParser()
|
|
9d9685ad
|
2012-05-15T20:10:25
|
|
xmlTextReader bails too quickly on error
For https://bugzilla.gnome.org/show_bug.cgi?id=654567
I use xmlTextReader to parse files that might be incomplete. These files are
the beginning of a well-formed file, but the end is missing so the file as a
whole is not well-formed.
The problem is that xmlTextReader starts returning errors when it encounters
the early EOF, even though I haven't finished reading all of the valid data in
the file. It would be helpful if xmlTextReader kept working until the very
end.
|
|
1ea6b141
|
2012-05-15T19:36:02
|
|
Fix undefined reference in python module
For https://bugzilla.gnome.org/show_bug.cgi?id=622023
when compiled with LDFLAGS="${LDFLAGS} -Wl,-z,-defs -Wl,--no-undefined"
the python module would fail due to the undefined reference. This adds an
explicit reference to python lib.
|
|
0d51cfeb
|
2012-05-15T11:18:40
|
|
Fix a race in xmlNewInputStream
For https://bugzilla.gnome.org/show_bug.cgi?id=643148
Reported by Bill Clarke <llib@computer.org>, it used a global variable
as a counter for the input id and this was not thread safe. To avoid the
race without adding unneeded locking in the parser path, move the id to
the parser context instead.
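The fix can be sketched by keeping the counter in the per-parse context rather than in a global. The types and the function name below are hypothetical and much smaller than the real parser context.

```c
#include <assert.h>

typedef struct demo_ctxt {
    int input_id;   /* next id to hand out, private to this context */
} demo_ctxt;

typedef struct demo_input {
    int id;
} demo_input;

/* No global state is touched, so two contexts used by two threads
 * never contend on the counter and no lock is needed. */
static void demo_init_input(demo_ctxt *ctxt, demo_input *in) {
    in->id = ctxt->input_id++;
}
```

Ids are then only unique within one context, which is enough as long as inputs from different contexts are never compared.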
|
|
9313ae85
|
2012-05-15T11:03:46
|
|
Fix weird streaming RelaxNG errors
For https://bugzilla.gnome.org/show_bug.cgi?id=512454
The bug was to use compiled deterministic automata when
the content model was found to be non-deterministic, leading
to random parsing errors.
|
|
94431ecb
|
2012-05-15T10:45:05
|
|
Fix various bugs in new code raised by the API checking
* testapi.c: regenerated and covering new APIs
* tree.c: xmlBufferDetach can't work on immutable buffers
* xzlib.c: fix a deallocation error
|
|
79ee284a
|
2012-05-15T10:25:31
|
|
Fix various problems with "make dist"
* tree.c: missing documentation for xmlBufferDetach
* doc/symbols.xml: add two new symbols xmlTextReaderRelaxNGValidateCtxt
and xmlBufferDetach
* doc/apibuild.py: ignore internal header xzlib.h
|
|
9f3cdef0
|
2012-05-15T09:38:13
|
|
Fix a memory leak in the xzlib code
The freeing function wasn't called due to a bogus #ifdef surrounding
value. Also switch the code to use the normal libxml2 allocation and
freeing routines.
|
|
7d0d2a50
|
2012-05-14T14:18:58
|
|
Use a hybrid allocation scheme in xmlNodeSetContent
On Fri, May 11, 2012 at 9:10 AM, Daniel Veillard <veillard@redhat.com> wrote:
> Hi Conrad,
>
> that's interesting ! I was initially afraid of a sudden explosion of
> memory allocations for building a tree since by default buffers tend to
> "waste" memory by using doubling allocations, but that's not the case.
> xmllint --noout doc/libxml2-api.xml
> when compiled with memory debug produce
>
> paphio:~/XML -> cat .memdump
> MEMORY ALLOCATED : 0, MAX was 12756699
>
> and without your patch 12755657, i.e. the increase is minimal.
Heh, I thought that too. Actually you're looking at the result with XML_ALLOC_EXACT! This
is because EXACT adds 10bytes "spare" on each alloc, and that interestingly wastes about the
same amount of space as XML_ALLOC_DOUBLEIT on this example (see below).
So it turns out that the default realloc() on my system actually handles this case really
well — and I guess that all the time in xmlRealloc() was actually in xmlStrlen, not the
underlying realloc() after all (sorry for misleading you). If you replace the realloc()
with a bad one (like valgrind's), then the performance degrades severely.
This patch implements a HYBRID allocator which has the behaviour you describe (it's
like EXACT to start with, though without the spare 10 bytes; and switches to DOUBLEIT
after 4kb) — that gets the memory back down to 12755657, with no noticeable impact on the
performance of the synthetic pathological example under valgrind.
In summary:
max_memory on ./xmllint --noout doc/libxml2-api.xml,
valgrind time on https://gist.github.com/2656940
max_memory valgrind time
before | 12755657 | 29:18.2
EXACT | 12756699 | 2:58.6 <-- this is the state after the first patch.
DOUBLEIT | 12756727 | 0:02.7
HYBRID | 12755754 | 0:02.7 <-- this is the state with both patches.
>
> There is also the cost of creating the buffers all the time.
> I need to read the code and check but I may be interested in an hybrid
> approach where we switch to buffer only when the text node starts to
> become too big (4k would remove nearly all usual types of "document"
> usage, i.e. not blocks of data)
I tried to avoid too much buffer creation by introducing the xmlBufferDetach function,
which allows re-using one buffer to construct many strings. It's maybe a bit of a "hack"
in API terms though I thought the gains would be worth it.
Conrad
------8<------
To keep memory usage tight in normal conditions it's desirable to only
allocate as much space as is needed. Unfortunately this can lead to
problems when constructing a long string out of small chunks, because
every chunk you add will need to resize the buffer.
To fix this XML_ALLOC_HYBRID will switch (when the buffer is 4kb big)
from using exact allocations to doubling buffer size every time it is
full. This limits the number of buffer resizes to O(log n) (down from
O(n)), and thus greatly increases the performance of constructing very
large strings in this manner.
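The growth policy described above can be sketched as follows. This is not the tree.c implementation: the function names are hypothetical, the switch is made on the current capacity, and only the 4kb threshold is taken from the message.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_HYBRID_THRESHOLD 4096u   /* switch point named in the message */

/* New capacity for a buffer of capacity `size` that must hold `needed`. */
static size_t demo_hybrid_grow(size_t size, size_t needed) {
    size_t newsize;
    if (needed <= size)
        return size;
    if (size < DEMO_HYBRID_THRESHOLD)
        return needed;                /* exact allocation while small */
    newsize = size;
    while (newsize < needed) {
        if (newsize > ((size_t) -1) / 2)
            return needed;            /* doubling would overflow */
        newsize *= 2;                 /* doubling once past the threshold */
    }
    return newsize;
}

/* Count the resizes needed to append `chunks` pieces of `chunk` bytes. */
static size_t demo_count_resizes(size_t chunk, size_t chunks) {
    size_t size = 0, use = 0, resizes = 0, i;
    for (i = 0; i < chunks; i++) {
        size_t next = demo_hybrid_grow(size, use + chunk);
        if (next != size) {
            size = next;
            resizes++;
        }
        use += chunk;
    }
    return resizes;
}
```

Calling demo_count_resizes(80, 200000) models the 200,000 short appends mentioned in the thread: the resize count stays far below the number of appends, since only the first few kilobytes grow exactly.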
|
|
7d553f83
|
2012-05-10T20:17:25
|
|
Use buffers when constructing string node lists.
Hi Veillard and all,
Firstly, thanks for libxml: it's awesome!
I noticed recently that libxml was taking a surprisingly long time to perform some
operations (many minutes instead of milliseconds), and so I did some digging. It turns out
that the problem was caused by the realloc()ing done in xmlNodeAddContentLen() which can
be called many (many) times when assigning some content into a node.
For background, I'm dealing with XML that contains emails, these can have large
attachments (~6MB) which are base-64 encoded, line-wrapped at 78 chars, and each line ends
with . This means that xmlNodeAddContentLen() is being called about 200,000 times,
and so there are 200,000 reallocs of a 6MB string, which takes a while... (I put a synthetic
example of this at https://gist.github.com/2656940)
The attached patch works around that problem by using the existing buffer API to merge the
strings together before even creating the text node, this keeps the number of realloc()s
at a manageable level.
I'd love feedback on the patch, and am happy to fix problems with it, or explore other
solutions if you think that this is barking up the wrong tree :).
Thanks,
Conrad
P.S. Should I create a bug for this too?
------8<------
Before this change xmlStringGetNodeList would perform a realloc() of the
entire new content for every XML entity in the assigned text in order to
merge together adjacent text nodes. This had the effect of making
xmlNodeSetContent O(n^2), which led to unexpectedly bad performance on
inputs that contained a large number of XML entities.
After this change the memory management is done by the buffer API,
avoiding the need to continually re-measure and realloc() the string.
For my test data (6MB of 80 character lines, each ending with )
this takes the time to xmlNodeSetContent from about 500 seconds to
around 50ms. I have not profiled smaller cases, though I tried to
minimize the performance impact of my change by avoiding unnecessary
string copying.
Signed-off-by: Conrad Irwin <conrad.irwin@gmail.com>
|
|
a0cd075d
|
2012-05-11T19:31:12
|
|
HTML parser error with <noscript> in the <head>
For https://bugzilla.gnome.org/show_bug.cgi?id=615785
When the <noscript> is found, <head> is closed and a <body> element is created.
The real <body id="xxx"> gets skipped over, so I can't see any of the
body's attributes.
Just don't close <head> when encountering a <noscript>
Add a regression test too
|
|
4609e6c9
|
2012-05-11T15:31:05
|
|
XSD: optional element in complex type extension
For https://bugzilla.gnome.org/show_bug.cgi?id=609796
Libxml2 fails to validate an instance document against a schema if an element
whose type is a complex extension of some base type with an optional child
element and that child element is not specified in the instance document. For
example, suppose I have some complex type BaseType that is defined to have one
child element in a sequence group that has minOccurs set to 0
|
|
39d027cd
|
2012-05-11T12:38:23
|
|
Fix html serialization error and htmlSetMetaEncoding()
For https://bugzilla.gnome.org/show_bug.cgi?id=630682
The python tests were reporting errors, some of it was due to
a small change in case encoding, but the main one was about
htmlSetMetaEncoding(doc, NULL) being broken by not removing
the associated meta tag anymore
|
|
2c437da7
|
2012-05-11T12:08:15
|
|
Fix a wrong return value in previous patch
|
|
ed35d3d7
|
2012-05-11T10:52:27
|
|
Fix an uninitialized variable use
When compiled without SAX1 support
|
|
0c7109c8
|
2012-05-11T10:50:59
|
|
Fix a compilation problem with --minimum
For https://bugzilla.gnome.org/show_bug.cgi?id=636750
Moved a #endif /* LIBXML_OUTPUT_ENABLED */ a few lines down
to avoid referencing an undefined variable
|
|
399aaba1
|
2012-05-11T10:09:32
|
|
Remove redundant and unguarded include of resolv.h
For https://bugzilla.gnome.org/show_bug.cgi?id=617053
This broke the build on Interix-6.0
|
|
040dcb59
|
2012-05-10T22:55:07
|
|
Remove git error message during configure
For https://bugzilla.gnome.org/show_bug.cgi?id=635531
If git is not installed but .git was found configure would emit an
error message
|
|
023206fc
|
2012-05-10T22:17:51
|
|
xmllint: Build fix for endTimer if !defined(HAVE_GETTIMEOFDAY)
For https://bugzilla.gnome.org/show_bug.cgi?id=638649
code was broken !
|
|
a4fe9b26
|
2012-05-10T22:12:46
|
|
Remove a bashism in configure.in
Not portable, broke on old FreeBSD
|
|
4cf7325e
|
2012-05-10T20:59:33
|
|
xinclude with parse="text" does not use the entity loader
For https://bugzilla.gnome.org/show_bug.cgi?id=552479
The code for xinclude parse="text" was not using the registered
entity loader, defeating attempts to control loading of files.
|
|
fdf990c2
|
2012-05-10T20:40:49
|
|
Allow to parse 1 byte HTML files
For https://bugzilla.gnome.org/show_bug.cgi?id=605740
Files 1 byte long were not accepted by the HTML push parser
|
|
204f1f14
|
2012-05-10T20:24:00
|
|
undef ERROR if already defined
|
|
b91111b4
|
2012-05-10T18:52:37
|
|
Patch that fixes the skipping of the HTML_PARSE_NOIMPLIED flag
For https://bugzilla.gnome.org/show_bug.cgi?id=642916
I just noticed that the HTML_PARSE_NOIMPLIED flag that you can pass to the
HTML-Parser methods doesn't do anything. Its intended purpose is to stop the
HTML-parser from forcibly adding a pair of html/body tags if the stream does
not contain any.
This is highly useful when you don't need this level of strictness.
Unfortunately, specifying it doesn't work, because the option is not
copied into the parsing context.
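The missing step can be sketched as an option handler that copies the flag into the context and returns whatever bits it did not recognise. The flag values, the context struct and the function name are hypothetical, not the HTMLparser.c definitions.

```c
#include <assert.h>

#define DEMO_NOIMPLIED (1u << 0)   /* hypothetical flag values */
#define DEMO_UNKNOWN   (1u << 7)

typedef struct demo_html_ctxt {
    unsigned options;   /* flags the parser consults while running */
} demo_html_ctxt;

/* Copy recognised flags into the context and return the unhandled ones.
 * A flag that is never copied here has no effect on parsing. */
static unsigned demo_use_options(demo_html_ctxt *ctxt, unsigned options) {
    if (options & DEMO_NOIMPLIED) {
        ctxt->options |= DEMO_NOIMPLIED;
        options &= ~DEMO_NOIMPLIED;
    }
    return options;
}
```

The parser would then test ctxt->options before inserting implied html/body elements.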
|