src/util.h


Log

Author Commit Date CI Message
Edward Thomson add30a83 2021-11-18T12:36:25 date: rfc2822 formatting uses a `git_buf` instead of a static string
Edward Thomson b2c40314 2021-11-18T12:19:32 date: make it a proper `git_date` utility class Instead of `git__date`, just use `git_date`.
Edward Thomson f0e693b1 2021-09-07T17:53:49 str: introduce `git_str` for internal, `git_buf` is external libgit2 has two distinct requirements that were previously solved by `git_buf`. We require: 1. A general purpose string class that provides a number of utility APIs for manipulating data (eg, concatenating, truncating, etc). 2. A structure that we can use to return strings to callers that they can take ownership of. By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult. Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr"). The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.) Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.
Duncan Thomson 6c53d6ab 2021-09-08T18:42:42 Use __typeof__ GNUC keyword for ISO C compatibility
Edward Thomson 1196de4f 2021-08-31T15:22:44 util: introduce `git__strlcmp` Introduce a utility function that compares a NUL terminated string to a possibly not-NUL terminated string with length. This is similar to `strncmp` but with an added check to ensure that the lengths match (not just the `size` portion of the two strings).
Edward Thomson 6a7f0403 2021-07-16T08:47:37 Merge pull request #5941 from NattyNarwhal/stdintification stdintification: use int64_t and INT64_C instead of long long
Peter Pettersson e4e173e8 2021-07-15T21:00:02 Allow compilation on systems without CLOCK_MONOTONIC Makes usage of CLOCK_MONOTONIC conditional and makes functions that uses git__timer handle clock resynchronization. Call gettimeofday with tzp set to NULL as required by https://pubs.opengroup.org/onlinepubs/9699919799/functions/gettimeofday.html
Calvin Buckley 52505ab5 2021-07-07T19:12:02 Convert long long constant specifiers to stdint macros
Peter Pettersson b444918a 2021-07-06T20:51:14 Limit ITimer usage to AmigaOS4
Edward Thomson 1d95b59b 2021-04-14T15:47:27 utf8: refactor utf8 functions Move the utf8 functions into a proper namespace `git_utf8` instead of being in the namespaceless `git__` function group. Update them to have out-params first and use `char *` instead of `uint8_t *` to match our API treating strings as `char *` (even if they truly contain `uchar`s inside).
Edward Thomson 404dd024 2020-12-05T15:57:48 threads: rename thread files to thread.[ch]
Edward Thomson ab772974 2020-12-05T15:49:30 threads: give atomic functions the git_atomic prefix
Edward Thomson 37763d38 2020-12-05T15:26:59 threads: rename git_atomic to git_atomic32 Clarify the `git_atomic` type and functions now that we have a 64 bit version as well (`git_atomic64`).
Edward Thomson 9800728a 2020-12-05T16:08:34 util: move git_online_cpus into util The number of CPUs is useful information for creating a thread pool or a number of workers, but it's not really about threading directly. Evict it from the thread file
Edward Thomson 8413c0f9 2020-12-05T21:32:48 util: move git__noop into the util header The git__noop function is more largely useful; move it into the util header. (And reduce the number of underscores.)
lhchavez cc1d7f5c 2020-08-01T17:47:20 Improve the support of atomics This change: * Starts using GCC's and clang's `__atomic_*` intrinsics instead of the `__sync_*` ones, since the former supercede the latter (and can be safely replaced by their equivalent `__atomic_*` version with the sequentially consistent model). * Makes `git_atomic64`'s value `volatile`. Otherwise, this will make ThreadSanitizer complain. * Adds ways to load the values from atomics. As it turns out, unsynchronized read are okay only in some architectures, but if we want to be correct (and make ThreadSanitizer happy), those loads should also be performed with the atomic builtins. * Fixes two ThreadSanitizer warnings, as a proof-of-concept that this works: - Avoid directly accessing `git_refcount`'s `owner` directly, and instead makes all callers go through the `GIT_REFCOUNT_*()` macros, which also use the atomic utilities. - Makes `pool_system_page_size()` race-free. Part of: #5592
Edward Thomson 60319788 2019-08-23T09:58:15 Merge pull request #5054 from tniessen/util-use-64-bit-timer util: use 64 bit timer on Windows
Patrick Steinhardt 8cbef12d 2019-08-08T11:52:54 util: do not perform allocations in insertsort Our hand-rolled fallback sorting function `git__insertsort_r` does an in-place sort of the given array. As elements may not necessarily be pointers, it needs a way of swapping two values of arbitrary size, which is currently implemented by allocating a temporary buffer of the element's size. This is problematic, though, as the emulated `qsort` interface doesn't provide any return values and thus cannot signal an error if allocation of that temporary buffer has failed. Convert the function to swap via a temporary buffer allocated on the stack. Like this, it can `memcpy` contents of both elements in small batches without requiring a heap allocation. The buffer size has been chosen such that in most cases, a single iteration of copying will suffice. Most importantly, it can fully contain `git_oid` structures and pointers. Add a bunch of tests for the `git__qsort_r` interface to verify nothing breaks. Furthermore, this removes the declaration of `git__insertsort_r` and makes it static as it is not used anywhere else.
Tobias Nießen fb0730f1 2019-04-16T23:49:16 util: use 64 bit timer on Windows git__timer was originally implemented using a 32 bit timer since Windows XP did not support GetTickCount64. Windows XP was discontinued five years ago, so it should be safe to use the new API. As a benefit, we do not need to care about overflows for the next 585 million years.
Patrick Steinhardt b5f40441 2019-04-16T13:21:03 util: introduce GIT_CONTAINER_OF macro In some parts of our code, we make rather heavy use of casting structures to their respective specialized implementation. One example is the configuration code with the general `git_config_backend` and the specialized `diskfile_header` structures. At some occasions, it can get confusing though with regards to the correct inheritance structure, which led to the recent bug fixed in 2424e64c4 (config: harden our use of the backend objects a bit, 2018-02-28). Object-oriented programming in C is hard, but we can at least try to have some checks when it comes to casting around stuff. Thus, this commit introduces a `GIT_CONTAINER_OF` macro, which accepts as parameters the pointer that is to be casted, the pointer it should be cast to as well as the member inside of the target structure that is the containing structure. This macro then tries hard to detect mis-casts: - It checks whether the source and target pointers are of the same type. This requires support by the compiler, as it makes use of the builtin `__builtin_types_compatible_p`. - It checks whether the embedded member of the target structure is the first member. In order to make this a compile-time constant, the compiler-provided `__builtin_offsetof` is being used for this. - It ties these two checks together by the compiler-builtin `__builtin_choose_expr`. Based on whether the previous two checks evaluate to `true`, the compiler will either compile in the correct cast, or it will output `(void)0`. The second case results in a compiler error, resulting in a compile-time check for wrong casts. The only downside to this is that it relies heavily on compiler-specific extensions. As both GCC and Clang support these features, only define this macro like explained above in case `__GNUC__` is set (Clang also defines `__GNUC__`). If the compiler is not Clang or GCC, just go with a simple cast without any additional checks.
romkatv 30a56ba6 2019-03-14T14:54:47 optimize string comparisons
Patrick Steinhardt 623647af 2018-10-26T12:33:59 Merge pull request #4864 from pks-t/pks/object-parse-fixes Object parse fixes
Patrick Steinhardt 83e8a6b3 2018-10-18T16:08:46 util: provide `git__memmem` function Unfortunately, neither the `memmem` nor the `strnstr` functions are part of any C standard but are merely extensions of C that are implemented by e.g. glibc. Thus, there is no standardized way to search for a string in a block of memory with a limited size, and using `strstr` is to be considered unsafe in case where the buffer has not been sanitized. In fact, there are some uses of `strstr` in exactly that unsafe way in our codebase. Provide a new function `git__memmem` that implements the `memmem` semantics. That is in a given haystack of `n` bytes, search for the occurrence of a byte sequence of `m` bytes and return a pointer to the first occurrence. The implementation chosen is the "Not So Naive" algorithm from [1]. It was chosen as the implementation is comparably simple while still being reasonably efficient in most cases. Preprocessing happens in constant time and space, searching has a time complexity of O(n*m) with a slightly sub-linear average case. [1]: http://www-igm.univ-mlv.fr/~lecroq/string/
Patrick Steinhardt 8d7fa88a 2018-10-18T12:04:07 util: remove `git__strtol32` The function `git__strtol32` can easily be misused when untrusted data is passed to it that may not have been sanitized with trailing `NUL` bytes. As all usages of this function have now been removed, we can remove this function altogether to avoid future misuse of it.
Patrick Steinhardt 68deb2cc 2018-10-18T11:37:10 util: remove unsafe `git__strtol64` function The function `git__strtol64` does not take a maximum buffer length as parameter. This has led to some unsafe usages of this function, and as such we may consider it as being unsafe to use. As we have now eradicated all usages of this function, let's remove it completely to avoid future misuse.
Patrick Steinhardt d2e996fa 2018-03-14T10:36:14 util: extract allocators into its own "alloc.h" header Our "util.h" header is a grabbag of various different functions, where many don't have a clear group they belong to. Our set of allocator functions though can be clearly singled out as a single group of functions that always belongs together. Furthermore, we will need to implement additional functions relating to our allocators subsystem when moving to pluggable allocators. Thus, we should just move these functions into their own "alloc" module.
Patrick Steinhardt c47f7155 2018-03-14T10:34:59 util: extract `stdalloc` allocator into its own module Right now, the standard allocator is being declared as part of the "util.h" header as a set of inline functions. As with the crtdbg allocator functions, these inline functions make it hard to convert to function pointers for our allocators. Create a new "stdalloc" module containing our standard allocations functions to split these out. Convert the existing allocators to macros which make use of the stdalloc functions.
Patrick Steinhardt 496b0df2 2018-03-14T10:28:50 win32: crtdbg: provide independent `free` function Currently, the `git__free` function is being defined in a single place, only, disregarding whether we use our standard allocators or the crtdbg allocators. This makes it a bit harder to convert our code base to use pluggable allocators, and furthermore makes the border between our two allocators a bit more blurry. Implement a separate `git__crtdbg__free` function for the crtdbg allocator in order to completely separate both allocator implementations.
Stan Hu 9d83a2b0 2018-02-22T22:55:50 Sanitize the hunk header to ensure it contains UTF-8 valid data The diff driver truncates the hunk header text to 80 bytes, which can truncate 4-byte Unicode characters and introduce garbage characters in the diff output. This change sanitizes the hunk header before it is displayed. This mirrors the test in git: https://github.com/git/git/blob/master/t/t4025-hunk-header.sh Closes https://github.com/libgit2/rugged/issues/716
Patrick Steinhardt 92324d84 2018-02-16T11:28:53 util: clean up header includes While "util.h" declares the macro `git__tolower`, which simply resorts to tolower(3P) on Unix-like systems, the <ctype.h> header is only being included in "util.c". Thus, anybody who has included "util.h" without having <ctype.h> included will fail to compile as soon as the macro is in use. Furthermore, we can clean up additional includes in "util.c" and simply replace them with an include for "common.h".
Edward Thomson abb04caa 2018-02-01T15:55:48 consistent header guards use consistent names for the #include / #define header guard pattern.
Edward Thomson 86219f40 2017-11-30T15:40:13 util: introduce `git__prefixncmp` and consolidate implementations Introduce `git_prefixncmp` that will search up to the first `n` characters of a string to see if it is prefixed by another string. This is useful for examining if a non-null terminated character array is prefixed by a particular substring. Consolidate the various implementations of `git__prefixcmp` around a single core implementation and add some test cases to validate its behavior.
Patrick Steinhardt 585b5dac 2017-11-18T15:43:11 refcount: make refcounting conform to aliasing rules Strict aliasing rules dictate that for most data types, you are not allowed to cast them to another data type and then access the casted pointers. While this works just fine for most compilers, technically we end up in undefined behaviour when we hurt that rule. Our current refcounting code makes heavy use of casting and thus violates that rule. While we didn't have any problems with that code, Travis started spitting out a lot of warnings due to a change in their toolchain. In the refcounting case, the code is also easy to fix: as all refcounting-statements are actually macros, we can just access the `rc` field directly instead of casting. There are two outliers in our code where that doesn't work. Both the `git_diff` and `git_patch` structures have specializations for generated and parsed diffs/patches, which directly inherit from them. Because of that, the refcounting code is only part of the base structure and not of the children themselves. We can help that by instead passing their base into `GIT_REFCOUNT_INC`, though.
Patrick Steinhardt 0c7f49dd 2017-06-30T13:39:01 Make sure to always include "common.h" first Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.
Patrick Steinhardt 0fb4b351 2017-06-30T13:27:26 Fix missing include for header files Some of our header files are not included at all by any of their implementing counter-parts. Including them inside of these files leads to some compile errors mostly due to unknown types because of missing includes. But there's also one case where a declared function does not match the implementation's prototype. Fix all these errors by fixing up the prototype and adding missing includes. This is preparatory work for fixing up missing includes in the implementation files.
Patrick Steinhardt 459fb8fe 2017-06-30T15:35:46 win32: fix circular include deps with w32_crtdbg The current order of declarations and includes between "common.h" and "w32_crtdbg_stacktrace.h" is rather complicated. Both header files make use of things defined in the other one and are thus circularly dependent on each other. This makes it currently impossible to compile the "w32_crtdbg_stacktrace.c" file when including "common.h" inside of "w32_crtdbg_stacktrace.h". We can disentangle the mess by moving declaration of the inline crtdbg functions into the "w32_crtdbg_stacktrace.h" file and adding additional includes inside of it, such that all required functions are available to it. This allows us to break the dependency cycle.
Edward Thomson d34f6826 2014-04-08T17:18:47 Patch parsing from patch files
Carlos Martín Nieto ea445e06 2015-07-07T00:48:17 Merge pull request #3288 from ethomson/getenv git__getenv: utf-8 aware env reader
Edward Thomson e069c621 2015-07-02T09:25:48 git__getenv: utf-8 aware env reader Introduce `git__getenv` which is a UTF-8 aware `getenv` everywhere. Make `cl_getenv` use this to keep consistent memory handling around return values (free everywhere, as opposed to only some platforms).
Jeff Hostetler 93b42728 2015-06-09T14:38:30 Include stacktrace summary in memory leak output.
Edward Thomson 75a4636f 2015-05-29T16:56:38 git__tolower: a tolower() that isn't dumb Some brain damaged tolower() implementations appear to want to take the locale into account, and this may require taking some insanely aggressive lock on the locale and slowing down what should be the most trivial of trivial calls for people who just want to downcase ASCII.
Jeff Hostetler d06c589f 2015-04-10T06:15:06 Add MSVC CRTDBG memory leak reporting.
Edward Thomson f1453c59 2015-02-12T12:19:37 Make our overflow check look more like gcc/clang's Make our overflow checking look more like gcc and clang's, so that we can substitute it out with the compiler instrinsics on platforms that support it. This means dropping the ability to pass `NULL` as an out parameter. As a result, the macros also get updated to reflect this as well.
Edward Thomson 190b76a6 2015-02-11T14:52:08 Introduce git__add_sizet_overflow and friends Add some helper functions to check for overflow in a type-specific manner.
Edward Thomson 8d534b47 2015-02-11T13:01:00 p_read: ensure requested len is ssize_t Ensure that the given length to `p_read` is of ssize_t and ensure that callers test the return as if it were an `ssize_t`.
Edward Thomson 2884cc42 2015-02-11T09:39:38 overflow checking: don't make callers set oom Have the ALLOC_OVERFLOW testing macros also simply set_oom in the case where a computation would overflow, so that callers don't need to.
Edward Thomson 3603cb09 2015-02-10T23:13:49 git__*allocarray: safer realloc and malloc Introduce git__reallocarray that checks the product of the number of elements and element size for overflow before allocation. Also introduce git__mallocarray that behaves like calloc, but without the `c`. (It does not zero memory, for those truly worried about every cycle.)
Edward Thomson 392702ee 2015-02-09T23:41:13 allocations: test for overflow of requested size Introduce some helper macros to test integer overflow from arithmetic and set error message appropriately.
Maks Naumov d8b5c8c3 2015-01-15T15:16:19 Remove strlen() calls from loop condition Avoid str length recalculation every iteration
Sebastian Bauer 7cf86f92 2014-12-28T10:35:26 Added AmigaOS-specific implementation of git__timer(). The clock_gettime() function is normally not available under AmigaOS, hence another solution is required. We are using now GetUpTime() that is present in current versions of this operating system.
Edward Thomson fe5f7722 2014-12-23T11:27:01 don't treat 0x85 as whitespace A byte value of 0x85 is not whitespace, we were conflating that with U+0085 (UTF8: 0xc2 0x85). This caused us to incorrectly treat valid multibyte characters like U+88C5 (UTF8: 0xe8 0xa3 0x85) as whitespace.
Vicent Marti 8e35527d 2014-12-16T13:03:02 path: Use UTF8 iteration for HFS chars
Edward Thomson a64119e3 2014-11-25T18:13:00 checkout: disallow bad paths on win32 Disallow: 1. paths with trailing dot 2. paths with trailing space 3. paths with trailing colon 4. paths that are 8.3 short names of .git folders ("GIT~1") 5. paths that are reserved path names (COM1, LPT1, etc). 6. paths with reserved DOS characters (colons, asterisks, etc) These paths would (without \\?\ syntax) be elided to other paths - for example, ".git." would be written as ".git". As a result, writing these paths literally (using \\?\ syntax) makes them hard to operate with from the shell, Windows Explorer or other tools. Disallow these.
Russell Belfer 3822d2cc 2014-08-05T15:06:45 Fix typo in timer normalization constants The effect of this would be that various update callbacks would not be made at the correct interval.
Russell Belfer 947a58c1 2014-05-30T13:19:49 Clean up the handling of large binary diffs
Russell Belfer cd424ad5 2014-04-28T16:39:53 Add GIT_STATUS_OPT_UPDATE_INDEX and use trace API This adds an option to refresh the stat cache while generating status. It also rips out the GIT_PERF stuff I had an makes use of the trace API to keep statistics about what happens during diff.
Russell Belfer 240f4af3 2014-04-28T14:04:29 Add build option for diff internal statistics
Philip Kelley 7110000d 2014-04-22T10:21:19 React to feedback for UTF-8 <-> WCHAR and reparse work
Russell Belfer 3b4c401a 2014-02-10T13:20:08 Decouple index iterator sort from index This makes the index iterator honor the GIT_ITERATOR_IGNORE_CASE and GIT_ITERATOR_DONT_IGNORE_CASE flags without modifying the index data itself. To take advantage of this, I had to export a number of the internal index entry comparison functions. I also wrote some new tests to exercise the capability.
Jacques Germishuys 8e14b47f 2014-04-10T12:42:29 Introduce git__date_rfc2822_fmt. Allows for RFC2822 date headers
Carlos Martín Nieto 24f3024f 2014-02-05T14:31:01 Split p_strlen into its own header We need this from util.h and posix.h, but the latter includes common.h which includes util.h, which means p_strlen is not defined by the time we get to git__strndup(). Split the definition on p_strlen() off into its own header so we can use it in util.h.
Carlos Martín Nieto 1e6f0ac4 2014-02-05T13:49:51 utils: don't reimplement strnlen The standard library provides a very nice strnlen function, which knows to use SSE, let's not reimplement it ourselves.
Ben Straub 816d28e7 2013-10-01T12:56:47 Mark git__timer as inline on OSX
Jameson Miller b176eded 2013-09-19T14:52:57 Initial Implementation of progress reports during push This adds the basics of progress reporting during push. While progress for all aspects of a push operation are not reported with this change, it lays the foundation to add these later. Push progress reporting can be improved in the future - and consumers of the API should just get more accurate information at that point. The main areas where this is lacking are: 1) packbuilding progress: does not report progress during deltafication, as this involves coordinating progress from multiple threads. 2) network progress: reports progress as objects and bytes are going to be written to the subtransport (instead of as client gets confirmation that they have been received by the server) and leaves out some of the bytes that are transfered as part of the push protocol. Basically, this reports the pack bytes that are written to the subtransport. It does not report the bytes sent on the wire that are received by the server. This should be a good estimate of progress (and an improvement over no progress).
nulltoken 191adce8 2013-08-27T20:00:28 vector: Teach git_vector_uniq() to free while deduplicating
Edward Thomson a1f69452 2013-08-08T12:36:11 git_strndup fix when OOM
Russell Belfer d730d3f4 2013-07-31T16:40:42 Major rename detection changes After doing further profiling, I found that a lot of time was being spent attempting to insert hashes into the file hash signature when using the rolling hash because the rolling hash approach generates a hash per byte of the file instead of one per run/line of data. To optimize this, I decided to convert back to a run-based file signature algorithm which would be more like core Git. After changing this, a number of the existing tests started to fail. In some cases, this appears to have been because the test was coded to be too specific to the particular results of the file similarity metric and in some cases there appear to have been bugs in the core rename detection code where only by the coincidence of the file similarity scoring were the expected results being generated. This renames all the variables in the core rename detection code to be more consistent and hopefully easier to follow which made it a bit easier to reason about the behavior of that code and fix the problems that I was seeing. I think it's in better shape now. There are a couple of tests now that attempt to stress test the rename detection code and they are quite slow. Most of the time is spent setting up the test data on disk and in the index. When we roll out performance improvements for index insertion, it should also speed up these tests I hope.
Russell Belfer 302a04b0 2013-06-29T12:41:39 Add accessors for refcount value
Edward Thomson e3b4a47c 2013-05-31T16:30:09 git__strcasesort_cmp: strcasecmp sorting rules but requires strict equality
yorah 3425fee6 2013-06-17T14:27:34 util: git__memzero() tweaks On Linux: fix a warning message related to the volatile qualifier (cast) On Windows: use SecureZeroMemory() On both, inline the call, so that no entry point can lead back to this "secure" memory zeroing.
Vicent Marti 6de9b2ee 2013-06-12T21:10:33 util: It's called `memzero`
Vicent Marti eb58e2d0 2013-06-12T21:05:48 Merge remote-tracking branch 'arrbee/minor-paranoia' into development
Russell Belfer 114f5a6c 2013-06-10T10:10:39 Reorganize diff and add basic diff driver This is a significant reorganization of the diff code to break it into a set of more clearly distinct files and to document the new organization. Hopefully this will make the diff code easier to understand and to extend. This adds a new `git_diff_driver` object that looks of diff driver information from the attributes and the config so that things like function content in diff headers can be provided. The full driver spec is not implemented in the commit - this is focused on the reorganization of the code and putting the driver hooks in place. This also removes a few #includes from src/repository.h that were overbroad, but as a result required extra #includes in a variety of places since including src/repository.h no longer results in pulling in the whole world.
Russell Belfer 3e9e6cda 2013-06-07T09:54:33 Add safe memset and use it This adds a `git__memset` routine that will not be optimized away and updates the places where I memset() right before a free() call to use it.
Russell Belfer d958e37a 2013-05-17T17:21:45 Fix issues with git_diff_find_similar There are a number of bugs in the rename code that only were obvious when I started testing it against large old repos with more complex patterns. (The code to do that testing is not ready to merge with libgit2, but I do plan to add more thorough tests.) This contains a significant number of changes and also tweaks the public API slightly to make emulating core git easier. Most notably, this separates the GIT_DIFF_FIND_AND_BREAK_REWRITES flag into FIND_REWRITES (which adds a self-similarity score to every modified file) and BREAK_REWRITES (which splits the modified deltas into add/remove pairs in the diff list). When you do a raw output of core git, rewrites show up as M090 or such, not at A and D output, so I wanted to be able to emulate that. Publicly, this also changes the flags to be uint16_t since we don't need values out of that range. Internally, this contains significant changes from a number of small bug fixes (like using the wrong side of the diff to decide if the object could be found in the ODB vs the workdir) to larger issues about which files can and should be compared and how the various edge cases of similarity scores should be treated. Honestly, I don't think this is the last update that will have to be made to this code, but I think this moves us closer to correct behavior and I tried to document the code so it would be easier to follow..
Linquize 0cb16fe9 2013-05-15T20:26:55 Unify whitespaces to tabs
Carlos Martín Nieto 05b17964 2013-04-21T19:26:35 Make refcounting atomic
Russell Belfer e976b56d 2013-04-15T14:27:53 Add git__compare_and_swap and use it This removes the lock from the repository object and changes the internals to use the new atomic git__compare_and_swap to update the _odb, _config, _index, and _refdb variables in a threadsafe manner.
Russell Belfer 53607868 2013-04-15T00:09:03 Further threading fixes This builds on the earlier thread safety work to make it so that setting the odb, index, refdb, or config for a repository is done in a threadsafe manner with minimized locking time. This is done by adding a lock to the repository object and using it to guard the assignment of the above listed pointers. The lock is only held to assign the pointer value. This also contains some minor fixes to the other work with pack files to reduce the time that locks are being held to and fix an apparently memory leak.
Russell Belfer 4dcd8780 2013-04-19T17:17:44 Move refdb_backend to include/git2/sys This moves most of the refdb stuff over to the include/git2/sys directory, with some minor shifts in function organization. While I was making the necessary updates, I also removed the trailing whitespace in a few files that I modified just because I was there and it was bugging me.
Russell Belfer 62beacd3 2013-03-11T16:43:58 Sorting function cleanup and MinGW fix Clean up some sorting function stuff including fixing qsort_r on MinGW, common function pointer type for comparison, and basic insertion sort implementation (which we, regrettably, fall back on for MinGW).
Russell Belfer e40f1c2d 2013-03-08T16:39:57 Make tree iterator handle icase equivalence There is a serious bug in the previous tree iterator implementation. If case insensitivity resulted in member elements being equivalent to one another, and those member elements were trees, then the children of the colliding elements would be processed in sequence instead of in a single flattened list. This meant that the tree iterator was not truly acting like a case-insensitive list. This completely reworks the tree iterator to manage lists with case insensitive equivalence classes and advance through the items in a unified manner in a single sorted frame. It is possible that at a future date we might want to update this to separate the case insensitive and case sensitive tree iterators so that the case sensitive one could be a minimal amount of code and the insensitive one would always know what it needed to do without checking flags. But there would be so much shared code between the two, that I'm not sure it that's a win. For now, this gets what we need. More tests are needed, though.
Vicent Marti 41051e3f 2013-02-20T17:09:51 signature: Shut up MSVC, you silly goose
Ben Straub 15760c59 2013-02-01T19:21:55 Use malloc rather than calloc
Ben Straub c4beee76 2013-02-01T10:00:55 Introduce git__substrdup
Philip Kelley 11d9f6b3 2013-01-27T14:17:07 Vector improvements and their fallout
Russell Belfer 851ad650 2013-01-09T16:00:16 Add payload "_r" versions of bsearch and tsort git__bsearch and git__tsort did not pass a payload through to the comparison function. This makes it impossible to implement sorted lists where the sort order depends on external data (e.g. building a secondary sort order for the entries in a tree). This commit adds git__bsearch_r and git__tsort_r versions that pass a third parameter to the cmp function of a user payload.
Edward Thomson 359fc2d2 2013-01-08T17:07:25 update copyrights
Edward Thomson 7fcec834 2012-12-11T22:31:21 fetchhead reading/iterating
Russell Belfer a277345e 2012-11-14T22:37:13 Create internal strcmp variants for function ptrs Using the builtin strcmp and strcasecmp as function pointers is problematic on win32. This adds internal implementations and divorces us from the platform linkage.
Russell Belfer 55cbd05b 2012-11-08T16:56:34 Some diff refactorings to help code reuse There are some diff functions that are useful in a rewritten checkout and this lays some groundwork for that. This contains three main things: 1. Share the function diff uses to calculate the OID for a file in the working directory (now named `git_diff__oid_for_file` 2. Add a `git_diff__paired_foreach` function to iterator over two diff lists concurrently. Convert status to use it. 3. Move all the string/prefix/index entry comparisons into function pointers inside the `git_diff_list` object so they can be switched between case sensitive and insensitive versions. This makes them easier to reuse in various functions without replicating logic. As part of this, move a couple of index functions out of diff.c and into index.c.
Philip Kelley 2364735c 2012-11-09T15:39:10 Fix implementation of strndup to not overrun
Philip Kelley ec40b7f9 2012-09-17T15:42:41 Support for core.ignorecase
yorah 02a0d651 2012-07-12T16:31:59 Add git_buf_unescape and git__unescape to unescape all characters in a string (in-place)
Vicent Martí 48bcf81d 2012-07-12T09:32:44 Merge pull request #812 from arrbee/assorted-tweaks Assorted goodies
Russell Belfer c3a875c9 2012-07-10T15:20:08 Adding unicode space to match crlf patterns Adding 0x85 to `git__isspace` since we also look for that in filter.c as a whitespace character.
nulltoken 98d6a1fd 2012-07-04T16:24:09 util: add git__isdigit()
nulltoken 494ae940 2012-07-02T17:51:02 revparse: fix parsing of date specifiers
Ben Straub 8a385c04 2012-06-06T12:25:22 Move git__date_parse declaration to util.h.
Ben Straub 56a5000d 2012-06-05T12:52:44 Merge branch 'development' into rev-parse Conflicts: src/util.h tests-clar/refs/branches/listall.c