hash.c


Log

Author Commit Date CI Message
Nick Wellnhofer e75e878e 2024-05-20T13:58:22 doc: Update and fix documentation
Nick Wellnhofer f313848b 2023-12-10T15:14:15 hash: Report malloc failures Introduce new API functions that return a separate error code if a memory allocation fails. - xmlHashAdd - xmlHashCopySafe
Nick Wellnhofer a2b5c90a 2023-11-21T14:35:54 hash: Fix deletion of entries during scan Functions like xmlCleanSpecialAttr scan a hash table and possibly delete entries in the callback. xmlHashScanFull must detect such deletions and rescan the entry. This regressed when rewriting the hash table code in 4a513d56. Fixes #626.
Nick Wellnhofer a7b03795 2023-11-04T19:04:23 doc: Minor fixes for apibuild.py
Nick Wellnhofer 61e29b69 2023-09-30T17:02:46 malloc-fail: Grow hash tables before making allocations Fix short-lived memory leak found by OSS-Fuzz.
Nick Wellnhofer 4a513d56 2023-09-16T19:12:25 hash: Rewrite hash table code This is a complete rewrite of the code in hash.c Move from a chained hash table implementation to open addressing with Robin Hood probing. This allows to increase the maximum fill factor and further reduce the growth factor, saving considerable amounts of memory without sacrificing performance. To make this work, hash values are now cached in the table entry also avoiding many key comparisons. Tables are created lazily with a smaller minimum size. Insertion functions now report an error if growing the table resulted in a memory allocation failure. Some string comparisons were optimized to call directly into libc instead of using the xmlstring API. The length of inserted keys is computed along with the hash improving allocation performance. Bounds checking was made more robust. In dictionary-based mode, unneeded interning of strings is avoided.
Nick Wellnhofer 699299ca 2023-09-20T18:54:39 globals: Stop including globals.h
Nick Wellnhofer efcaeadc 2023-09-04T16:00:53 hash: Fix use-of-uninitialized-value Short-lived regression.
Nick Wellnhofer edc2dd48 2023-09-04T16:07:23 dict: Update hash function Update hash function from classic Jenkins OAAT (dict.c) and a variant of DJB2 (hash.c) to "GoodOAAT" taken from the SMHasher repo. This hash function passes all SMHasher tests.
Nick Wellnhofer 57cfd221 2023-09-01T14:52:04 dict: Use xoroshiro64** as PRNG Stop using rand_r. This enables hash randomization on all platforms.
Nick Wellnhofer 6d7aaaa8 2023-09-01T14:51:55 dict: Tune hash table growth Introduce load factor as main trigger and increase MAX_HASH_LEN. This should make growth behavior more predictable. Raise size limit to INT_MAX. This avoids quadratic behavior with larger tables.
Nick Wellnhofer 4b8f7cf0 2023-09-01T13:07:27 hash: Fix integer overflow of nbElems
Nick Wellnhofer 06a2c251 2023-05-06T15:28:13 hash: Fix possible startup crash with old libxslt versions Call xmlInitParser in xmlHashCreate to make it work if the library wasn't initialized yet. Otherwise, exsltRegisterAll from libxslt 1.1.24 or older might cause a crash. See #534.
Nick Wellnhofer 8c2e508b 2023-03-12T14:45:14 gitlab-ci: Enable all "integer" sanitizers
Nick Wellnhofer 4499143a 2023-02-26T15:43:50 malloc-fail: Check for malloc failure in xmlHashAddEntry Found with libFuzzer, see #344.
Nick Wellnhofer ad338ca7 2022-09-01T01:18:30 Remove explicit integer casts Remove explicit integer casts as final operation - in assignments - when passing arguments - when returning values Remove casts - to the same type - from certain range-bound values The main motivation is that these explicit casts don't change the result of operations and only render UBSan's implicit-conversion checks useless. Removing these casts allows UBSan to detect cases where truncation or sign-changes occur unexpectedly. Document some explicit casts as truncating and add a few missing ones.
Nick Wellnhofer 0f568c0b 2022-08-26T01:22:33 Consolidate private header files Private functions were previously declared - in header files in the root directory - in public headers guarded with IN_LIBXML - in libxml.h - redundantly in source files that used them. Consolidate all private header files in include/private.
Nick Wellnhofer 72119afe 2022-03-02T01:14:08 Don't check for standard C89 library functions Don't check for - fprintf - localtime - printf - rand - sprintf - srand - sscanf - strftime - time - vfprintf - vsprintf If the C99 functions snprintf and vsnprintf are missing, Trio is enabled.
Nick Wellnhofer 776d15d3 2022-03-02T00:29:17 Don't check for standard C89 headers Don't check for - ctype.h - errno.h - float.h - limits.h - math.h - signal.h - stdarg.h - stdlib.h - string.h - time.h Stop including non-standard headers - malloc.h - strings.h
Nick Wellnhofer 346c3a93 2022-02-20T18:46:42 Remove elfgcchack.h The same optimization can be enabled with -fno-semantic-interposition since GCC 5. clang has always used this option by default.
Nick Wellnhofer 67c2e78b 2022-01-25T02:44:37 Fix integer conversion warnings in hash.c Use unsigned long for temporary variable to avoid integer conversion warnings with UBSan. Note that this does change the computation of hash values for input bytes larger than 0x7F. Before, these bytes were first converted to a (typically) signed char with a negative value, then to a large unsigned long near ULONG_MAX. I doubt that this was intentional. Input bytes larger than 0x7F are now converted to unsigned long unchanged.
Nick Wellnhofer 20c60886 2020-03-08T17:19:42 Fix typos Resolves #133.
Nick Wellnhofer b88ae6d2 2019-10-14T15:38:28 Avoid ignored attribute warnings under GCC GCC doesn't support the unsigned-integer-overflow sanitizer.
Nick Wellnhofer 44e7a0d5 2019-05-16T21:17:28 Annotate functions with __attribute__((no_sanitize))
Nick Wellnhofer fa3166c2 2019-04-12T12:03:04 Disable hash randomization when fuzzing Use the FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION macro proposed by libFuzzer.
Nick Wellnhofer e03f0a19 2017-11-09T16:42:47 Fix hash callback signatures Make sure that all parameters and return values of hash callback functions exactly match the callback function type. This is required to pass clang's Control Flow Integrity checks and to allow compilation to asm.js with Emscripten. Fixes bug 784861.
Nick Wellnhofer 8bbe4508 2017-06-17T16:15:09 Spelling and grammar fixes Fixes bug 743172, bug 743489, bug 769632, bug 782400 and a few other misspellings.
Gaurav Gupta 1811add7 2014-07-14T17:50:27 Fix various Missing Null checks For https://bugzilla.gnome.org/show_bug.cgi?id=732823
Daniel Franke b1237111 2013-04-12T18:53:53 Improve the hashing functions
Daniel Veillard f8e3db04 2012-09-11T13:26:36 Big space and tab cleanup Remove all space before tabs and space and tabs at end of lines.
Daniel Veillard 379ebc1d 2012-05-18T15:41:31 Cleanup on randomization tsan reported that rand() is not thread safe, so create a thread safe wrapper, use rand_r() if available. Consolidate the function, initialization and cleanup in dict.c and make sure it is initialized in xmlInitParser()
Daniel Veillard 8973d58b 2012-02-04T19:07:44 Add hash randomization to hash and dict structures Following http://www.ocert.org/advisories/ocert-2011-003.html it seems that having hash randomization might be a good idea when using XML with untrusted data * configure.in: lookup for rand, srand and time * dict.c: add randomization to dictionaries hash tables * hash.c: add randomization to normal hash tables
Daniel Veillard 594e5dfb 2009-09-07T14:58:47 Chasing dead assignments reported by clang-scan * SAX2.c dict.c error.c hash.c nanohttp.c parser.c python/libxml.c relaxng.c runtest.c tree.c valid.c xinclude.c xmlregexp.c xmlsave.c xmlschemas.c xpath.c xpointer.c: mostly removing unneeded assignments, but this led to a few real bugs and some part not yet understood (relaxng/interleave)
Daniel Veillard ac4118d5 2008-01-11T05:27:32 handle an erroneous parsing of attributes in case said attribute has been * parser.c: handle an erroneous parsing of attributes in case said attribute has been redeclared in the DTD with a different type * hash.c: fix the hash scanner to not crash if a first element from the hash list has been removed in the callback Daniel svn path=/trunk/; revision=3669
Daniel Veillard 5d4644ef 2005-04-01T13:11:58 revamped the elfgcchack.h format to cope with gcc4 change of aliasing * doc/apibuild.py doc/elfgcchack.xsl: revamped the elfgcchack.h format to cope with gcc4 change of aliasing allowed scopes, had to add extra informations to doc/libxml2-api.xml to separate the header from the c module source. * *.c: updated all c library files to add a #define bottom_xxx and reimport elfgcchack.h thereafter, and a bit of cleanups. * doc//* testapi.c: regenerated when rebuilding the API Daniel
Daniel Veillard 316a5c39 2005-01-23T22:56:39 added xmlHashCreateDict where the hash reuses the dictionary for internal * hash.c include/libxml/hash.h: added xmlHashCreateDict where the hash reuses the dictionary for internal strings * entities.c valid.c parser.c: reuse that new API, leads to a decent speedup when parsing for example DocBook documents. Daniel
Daniel Veillard e991fe95 2003-10-29T11:18:37 change suggested by Anthony Carrico when unregistering a namespace prefix * xpath.c: change suggested by Anthony Carrico when unregistering a namespace prefix to a context * hash.c: be more careful about calling callbacks with NULL payloads. Daniel
Daniel Veillard 092643b5 2003-09-25T14:29:29 preparing a beta3 solving the ABI problems make sure the global variables * configure.in: preparing a beta3 solving the ABI problems * globals.c parser.c parserInternals.c testHTML.c HTMLparser.c SAX.c include/libxml/globals.h include/libxml/SAX.h: make sure the global variables for the default SAX handler are V1 ones to avoid ABI compat problems. * xmlreader.c: cleanup of unneeded code * hash.c: fix a comment Daniel
Daniel Veillard 7a02cfe0 2003-09-25T12:18:34 fixing some comments to avoid warnings from apibuild.py Daniel * SAX2.c hash.c parser.c include/libxml/xmlexports.h include/libxml/xmlmemory.h include/libxml/xmlversion.h.in: fixing some comments to avoid warnings from apibuild.py Daniel
Daniel Veillard 8e36e6a0 2003-09-10T10:50:59 2.6.0beta1 changes Fixing attribute normalization, might not be totally * configure.in doc/* : 2.6.0beta1 changes * SAX2.c hash.c parser.c parserInternals.c: Fixing attribute normalization, might not be totally fixed but this should make sure SAX event provide the right strings for attributes except entities for which libxml2 is different by default This should fix #109564 * result/attrib.xml.sax result/ent3.sax result/p3p.sax: minor changes in attribute callback values * result/c14n/with-comments/example-4 result/c14n/without-comments/example-4: this also fixes a subtle bug in the canonicalization tests. Daniel
Daniel Veillard e57ec790 2003-09-10T10:50:59 Time to commit 3 days of work rewriting the parser internal, fixing bugs and migrating to SAX2 interface by default. There is some work left TODO, like namespace validation and attributes normalization (this breaks C14N right now) * Makefile.am: fixed the test rules * include/libxml/SAX2.h include/libxml/parser.h include/libxml/parserInternals.h SAX2.c parser.c parserInternals.c: changing the parser, migrating to SAX2, adding new interface to switch back to SAX1 or initialize a SAX block for v1 or v2. Most of the namespace work is done below SAX, as well as attribute defaulting * globals.c: changed initialization of the default SAX handlers * hash.c tree.c include/libxml/hash.h: added QName specific handling * xmlIO.c: small fix * xmllint.c testSAX.c: provide a --sax1 switch to test the old version code path * result/p3p result/p3p.sax result/noent/p3p test/p3p: the new code pointed out a typo in a very old test namespace Daniel
Daniel Veillard 6155d8aa 2003-08-19T15:01:28 optimization when freeing hash tables. some tuning of buffer allocations * dict.c hash.c: optimization when freeing hash tables. * parser.c xmlIO.c include/libxml/tree.h: some tuning of buffer allocations * parser.c parserInternals.c include/libxml/parser.h: keep a single allocated block for all the attributes callbacks, avoid useless malloc()/free() * tree.c: do not realloc() when growing a buffer if the buffer ain't full, malloc/memcpy/free avoid copying memory. Daniel
Daniel Veillard 01c13b5b 2002-12-10T15:19:08 code cleanup, especially the function comments. fixed a small bug when * DOCBparser.c HTMLparser.c c14n.c debugXML.c encoding.c hash.c nanoftp.c nanohttp.c parser.c parserInternals.c testC14N.c testDocbook.c threads.c tree.c valid.c xmlIO.c xmllint.c xmlmemory.c xmlreader.c xmlregexp.c xmlschemas.c xmlschemastypes.c xpath.c: code cleanup, especially the function comments. * tree.c: fixed a small bug when freeing nodes which are XInclude ones. Daniel
Daniel Veillard aeb258a9 2002-09-13T14:48:12 cosmetic cleanup started integrating a DTD validation layer based on the * hash.c: cosmetic cleanup * valid.c include/libxml/tree.h include/libxml/valid.h: started integrating a DTD validation layer based on the regexps Daniel
Daniel Veillard fdc9156a 2002-07-01T21:52:03 applied patch from Richard Jinks for the namespace axis + fixed a memory * xpath.c: applied patch from Richard Jinks for the namespace axis + fixed a memory error. * parser.c parserInternals.c: applied patches from Peter Jacobi removing ctxt->token for good. * xmlschemas.c xmlschemastypes.c: fixed a few memory leaks popped out by the regression tests. * Makefile.am: patch for threads makefile from Gary Pennington Daniel
Daniel Veillard 153120c4 2002-06-18T07:58:35 applied a patch from Peter Jacobi to solve a problem when compiling with * hash.c: applied a patch from Peter Jacobi to solve a problem when compiling with the Watcom C on Win32 * result/schemas/*.err: the change of hashing algo generated permutations in the output Daniel
Daniel Veillard 5f7f991a 2002-06-17T17:03:00 applied patch from Sander Vesik improving the quality of the hash * hash.c: applied patch from Sander Vesik improving the quality of the hash function. Daniel
Daniel Veillard 34ce8bec 2002-03-18T19:37:11 preparing 2.4.18 updated and rebuilt the web site implement the new * configure.in: preparing 2.4.18 * doc/*: updated and rebuilt the web site * *.c libxml.h: implement the new IN_LIBXML scheme discussed with the Windows and Cygwin maintainers. * parser.c: humm, changed the way the SAX parser work when xmlSubstituteEntitiesDefault(1) is set, it will then do the entity registration and loading by itself in case the user provided SAX getEntity() returns NULL. * testSAX.c: added --noent to test the behaviour. Daniel
Daniel Veillard 314cfa08 2002-01-14T17:58:01 patch from Anthony Jones for hash.c allocation size trying to work around * hash.c: patch from Anthony Jones for hash.c allocation size * Makefile.am: trying to work around Yet Another Libtool Madness and build the 2.4.13 release finally ... daniel
Daniel Veillard cbaf3995 2001-12-31T16:16:02 applied 42 documentation patches from Charlie Bozeman. Regenerated the * *.c include/libxml/*.h doc/html/*: applied 42 documentation patches from Charlie Bozeman. Regenerated the HTML docs. Daniel
Daniel Veillard 3c01b1d8 2001-10-17T15:58:35 - include/libxml/globals.h include/libxml/threads.h threads.c testThreads.c: far more testing, cleaning up bugs - *.c : make sure globals.h is always included. Daniel
Thomas Broyer e8126247 2001-07-22T03:54:15 added xmlHashScannerFull, xmlHashScanFull and xmlHashScanFull3 to get * hash.c include/libxml/hash.h: added xmlHashScannerFull, xmlHashScanFull and xmlHashScanFull3 to get passed the three keys as arguments to the callback function
Daniel Veillard 5e2dace1 2001-07-18T19:30:27 Cleanup, cleanup .. removed libxml softlink for good cleanup to get 100% Cleanup, cleanup .. * configure.in Makefile.am: removed libxml softlink for good * include/libxml/*.h *.c doc/Makefile.am: cleanup to get 100% coverage by gtk-doc Daniel
Daniel Veillard f69bb4b5 2001-05-19T13:24:56 - HTMLparser.c: Closed bug #54891 - result/HTML/cf_128.html* test/HTML/cf_128.html: added the test to the suite forgot to commit this one yesterday - encoding.h hash.c nanoftp.h parser.h tree.h uri.h xlink.h xpointer.c: applied a documentation patch from LotR and filled in a few missing descriptions Daniel
Bjorn Reese 70a9da54 2001-04-21T16:57:29 trio upgrade and integration
Daniel Veillard dab4cb37 2001-04-20T13:03:48 Geez, this one was painful ! I still need to handle entities references for the validation step but I have a clean way to add this without touching the algorithm: - valid.[ch] tree.h: worked *hard* to get non-determinist content validation without using an ugly NFA -> DFA algo in the source. Made a specific algorithm easier to maintain, using a single stack and without recursion. - Makefile.am test/VCM/*.xml: added more tests to "make Validtests" - hash.c: made the growing routine static - tree.h parser.c: added the parent information to an xmlElementContent node. Daniel
Daniel Veillard a10efa8a 2001-04-18T13:09:01 - debugXML.c hash.c tree.h valid.c : some changes related to the validation support to improve speed with DocBook - result/VC/OneID2 result/VC/OneID3 : this slightly changes the way validation errors get reported Daniel
Daniel Veillard 9e7160d4 2001-03-18T23:17:47 Completely changed the way the XPath evaluation is done, likely to break stuff like libxslt right now: - Makefile.am: detect XPath memleaks in regression tests - error.c: fixed an error w.r.t. error reporting still using stderr - hash.c: added new line at end of file - tree.h: minor cleanup - xpath.[ch] xpointer.[ch]: Major changes ! Separated XPath expression parsing from evaluation, resulted in a number of changes internally, and in XPointer. Likely to break stuff using xpathInternals.h but should remain binary compatible, new interfaces will be added. Daniel
Owen Taylor 3473f88a 2001-02-23T17:55:21 Revert directory structure changes
CET 2001 Tomasz Kłoczko 64636e7f 2001-02-23T01:37:32 moved to libxml directory - this allow simplify automake/autoconf. Now Thu Feb 23 02:03:56 CET 2001 Tomasz Kłoczko <kloczek@pld.org.pl> * *.c *.h libxml files: moved to libxml directory - this allow simplify automake/autoconf. Now isn't necessary hack on am/ac level for make and remove libxml symlink (modified for this also configure.in and main Makefile.am). Now automake abilities are used in best way (like in many other projects with libraries). * include/win32config.h: moved to libxml directory (now include directory isn't necessary). * Makefile.am, examples/Makefile.am, libxml/Makefile.am: added empty DEFS and in INCLUDES rest only -I$(top_builddir) - this allow minimize parameters count passed to libtool script (now compilation is also slightly more quiet). * configure.in: simplifies libz detection - prepare separated variables for keep libz name and path to libz header files isn't really necessary (if someone have libz installed in non standard prefix path to header files and library can be passed as: $ CFLAGS="-I</libz.h/path>" LDFLAGS="-L</libz/path>" ./configure * autogen.sh: check now for libxml/entities.h. After above building libxml pass correctly and also pass "make install DESTDIR=</install/prefix>" from tar ball generated by "make dist". Seems ac/am reorganization is finished. This changes not touches any other things on *.{c,h} files level.
Daniel Veillard d194dd28 2001-02-14T10:37:43 - hash.[ch]: added Paolo Casarini patch to provide Delete from hash functionalities. - doc/html/* : rebuild the doc Daniel
Daniel Veillard 1f83d39f 2001-02-08T09:37:42 - hash.[ch]: added a first version of xmlHashSize() - valid.c: another bug fix from Gary Pennington Daniel
Daniel Veillard c2def84b 2000-11-07T14:21:01 Various patches and bug fixes, and XInclude progresses: - nanohttp.[ch]: applied Wayne Davison patches to access the WWW-Authorization header. - parser.c: Closed Bug#30847: Problems when switching encoding in short files by applying Simon Berg's patch. - valid.c: fixed a validation problem - hash.c parser.h parserInternals.h testHTML.c testSAX.c tree.h xmlerror.h xmlmemory.h xmlversion.h.in: applied a DLL patch from Wayne Davison - xpointer.[ch]: added first version of xmlXPtrBuildNodeList() need to be extended to non full nodes selections. - xinclude.c: starts to work decently Daniel
Daniel Veillard 9e8bfae5 2000-11-06T16:43:11 XInclude and other stuff while travelling. Contributed patches: - tree.[ch] xinclude.[ch] xmllint.c configure.in valid.c debugXML.c xmlversion.h.in: Started adding XInclude support, this is a new xmllint option - tree.c xpath.c: applied TOM patches for XPath - xpointer.c: fixed a couple of errors. - uri.c: added an escaping function needed for xinclude - testXPath.c hash.c HTMLtree.c: minor cleanups raised by new warning from RH70 gcc's version Daniel
Daniel Veillard 126f2799 2000-10-24T17:10:12 Bunch of fixes, finishing moving datastructures to the hash stuff: - hash.[ch] debugXML.c: expanded/enhanced the API, added multikey tuples, made hash structure opaque - valid.[ch]: moved elements, attributes, notations declarations as well as ID and refs to hash tables. - entities.c: hash cleanup - xmlmemory.c: fixed a dump problem in debug mode - include/Makefile.am: problem passing in DESTDIR= values patch from Marc Christensen <marc@calderasystems.com> - nanohttp.c: removed debugging remains - HTMLparser.c: the bogus tag should be ignored (Wayne) - HTMLparser.c parser.c: fixing a number of problems with the macros in the *parser.c files (Wayne). - HTMLparser.c: close the previous option when opening a new one (Marc Sanfacon). - result/HTML/*: updated the HTML results accordingly Daniel
Daniel Veillard 3fe87689 2000-10-23T08:10:05 Ooops forgot the hash module on last commit, Daniel.
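
Commit 4a513d56 above describes moving from chaining to open addressing with Robin Hood probing, with the hash value cached in each table entry. The following is a minimal sketch of that general technique, not the code in hash.c. All names here (`RhTable`, `rh_insert`, `rh_find`, `rh_hash`) are hypothetical. It also makes several simplifying assumptions: a fixed-size table that never grows, a toy FNV-1a hash instead of the library's hash function, no deletion, and no hash randomization.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RH_SIZE 8u              /* fixed power of two; this sketch never grows */
#define RH_MASK (RH_SIZE - 1u)

typedef struct {
    uint32_t hash;              /* cached hash value; 0 marks an empty slot */
    const char *key;            /* borrowed pointer, not owned by the table */
    int value;
} RhEntry;

typedef struct {
    RhEntry slots[RH_SIZE];
    unsigned count;
} RhTable;

/* Toy FNV-1a hash, for illustration only (assumption, not libxml2's hash). */
static uint32_t rh_hash(const char *s) {
    uint32_t h = 2166136261u;
    while (*s != '\0') {
        h ^= (unsigned char) *s++;
        h *= 16777619u;
    }
    return h != 0 ? h : 1;      /* 0 is reserved for empty slots */
}

/* How far the entry stored at `pos` sits from its ideal slot. */
static unsigned rh_dist(uint32_t hash, unsigned pos) {
    return (pos - (hash & RH_MASK)) & RH_MASK;
}

/* Look up a key. The cached hash is compared first, so strcmp only runs
 * on entries whose full hash matches. */
static RhEntry *rh_find(RhTable *t, const char *key) {
    uint32_t hash = rh_hash(key);
    unsigned pos = hash & RH_MASK;
    unsigned dist = 0;

    for (;;) {
        RhEntry *e = &t->slots[pos];
        /* An empty slot, or an entry closer to its ideal slot than we
         * are to ours, means the key cannot be further along. */
        if (e->hash == 0 || rh_dist(e->hash, pos) < dist)
            return NULL;
        if (e->hash == hash && strcmp(e->key, key) == 0)
            return e;
        pos = (pos + 1) & RH_MASK;
        dist++;
    }
}

/* Returns 1 if added, 0 if an existing key was updated, -1 if full. */
static int rh_insert(RhTable *t, const char *key, int value) {
    RhEntry cur;
    RhEntry *found;
    unsigned pos, dist = 0;

    assert(key != NULL);
    found = rh_find(t, key);
    if (found != NULL) {
        found->value = value;
        return 0;
    }
    if (t->count >= RH_SIZE - 1)   /* keep one slot free so probes terminate */
        return -1;

    cur.hash = rh_hash(key);
    cur.key = key;
    cur.value = value;
    pos = cur.hash & RH_MASK;

    for (;;) {
        RhEntry *e = &t->slots[pos];
        unsigned edist;

        if (e->hash == 0) {
            *e = cur;
            t->count++;
            return 1;
        }
        /* Robin Hood step: if the resident entry is closer to its ideal
         * slot than the one being inserted, swap them and keep probing
         * with the displaced entry. */
        edist = rh_dist(e->hash, pos);
        if (edist < dist) {
            RhEntry tmp = *e;
            *e = cur;
            cur = tmp;
            dist = edist;
        }
        pos = (pos + 1) & RH_MASK;
        dist++;
    }
}
```

A zero-initialized `RhTable` (for example `RhTable t = {0};`) is an empty table. Because entries stay ordered by probe distance, an unsuccessful lookup can stop as soon as it meets an entry closer to its ideal slot, which is one reason this scheme tolerates a higher fill factor than plain linear probing.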