Hash :
faea2fa9
Author :
Date :
2020-11-21T01:21:56
Avoid quadratic checking of identity-constraints key/unique/keyref schema attributes currently use qudratic loops to check their various constraints (that keys are unique and that keyrefs refer to existing keys). That becomes extremely slow if there are many elements with keys. This happens in the wild with e.g. the OVAL XML descriptions of security patches. You need the openscap schemata, and then an example xml file: % zypper in openscap-utils % wget ftp://ftp.suse.com/pub/projects/security/oval/opensuse.leap.15.1.xml % time xmllint --schema /usr/share/openscap/schemas/oval/5.5/oval-definitions-schema.xsd opensuse.leap.15.1.xml > /dev/null opensuse.leap.15.1.xml validates real 16m59,857s user 16m55,787s sys 0m1,060s This patch makes libxml use a hash table to avoid the quadratic behaviour. The existing hash table only accepts strings as keys, so we're mostly reusing the canonical representation of key values to derive such strings (with the caveat given in a comment). The alternative would be to rework the hash table code to accept either numbers or free functions as hash workers, but the code is fast enough as is. With the patch we have this then: % time LD_LIBRARY_PATH=./libxml2/.libs/ ./libxml2/.libs/xmllint --schema /usr/share/openscap/schemas/oval/5.5/oval-definitions-schema.xsd opensuse.leap.15.1.xml > /dev/null opensuse.leap.15.1.xml validates real 0m3,531s user 0m3,427s sys 0m0,103s So, a ~300x speedup. This patch survives 'make check' and 'make tests'.
XML toolkit from the GNOME project
Full documentation is available on-line at
http://xmlsoft.org/
This code is released under the MIT Licence see the Copyright file.
To build on an Unixised setup:
./configure ; make ; make install
if the ./configure file does not exist, run ./autogen.sh instead.
To build on Windows:
see instructions on win32/Readme.txt
To assert build quality:
on an Unixised setup:
run make tests
otherwise:
There is 3 standalone tools runtest.c runsuite.c testapi.c, which
should compile as part of the build or as any application would.
Launch them from this directory to get results, runtest checks
the proper functioning of libxml2 main APIs while testapi does
a full coverage check. Report failures to the list.
To report bugs, follow the instructions at:
http://xmlsoft.org/bugs.html
A mailing-list xml@gnome.org is available, to subscribe:
http://mail.gnome.org/mailman/listinfo/xml
The list archive is at:
http://mail.gnome.org/archives/xml/
All technical answers asked privately will be automatically answered on
the list and archived for public access unless privacy is explicitly
required and justified.
Daniel Veillard
$Id$