src


Log

Author Commit Date CI Message
Edward Thomson 3d9e82fd 2019-05-21T14:59:55 Merge pull request #4935 from libgit2/ethomson/pcre Use PCRE for our fallback regex engine when regcomp_l is unavailable
Erik Aigner 59647e1a 2019-04-08T15:54:25 remote: add callback to resolve URLs before connecting Since libssh2 doesn't read host configuration from the config file, this callback can be used to hand over URL resolving to the client without touching the SSH implementation itself.
Edward Thomson ac2b235e 2019-05-21T12:22:40 regex: use REGEX_BACKEND as the cmake option name This avoids any misunderstanding with the REGEX keyword in cmake.
Patrick Steinhardt 4aa36ff2 2019-05-21T12:18:47 Merge pull request #5075 from libgit2/ethomson/ignore_skip_bom Skip UTF8 BOM in ignore files
David Brooks ada1cd01 2019-05-21T21:35:57 Use tabs for indentation (#5079).
David Brooks b575c242 2019-05-21T20:20:04 Fix indentation (#5079).
David Brooks 2c2e924b 2019-05-21T20:17:48 We still need to update pkgconfig variables when zlib is unbundled (#5079).
David Brooks 06dbf734 2019-05-21T15:44:32 We've already added `ZLIB_LIBRARIES` to `LIBGIT2_LIBS` so don't also add the `z` library (libgit2/#5079).
Jacques Germishuys 0fd259ed 2019-05-20T12:44:37 define SYMBOLIC_LINK_FLAG_DIRECTORY if not defined
Edward Thomson 133bceba 2019-05-19T13:57:13 ignore: skip UTF8 BOM in ignore file
Edward Thomson ce6d624a 2019-05-19T10:30:04 regex: optionally use PCRE2 Use PCRE2 and its POSIX compatibility layer if requested by the user. Although PCRE2 is adequate for our needs, the PCRE2 POSIX layer as installed on Debian and Ubuntu systems is broken, so we do not opt-in to it by default to avoid breaking users on those platforms.
Edward Thomson 69ecdad5 2019-05-19T10:09:55 regex: use system PCRE if available Attempt to locate a system-installed version of PCRE and use its POSIX compatibility layer, if possible.
Edward Thomson 622166c4 2019-05-18T19:37:59 regex: disambiguate builtin vs system pcre
Edward Thomson c6e48fef 2019-02-17T21:51:34 regex: allow regex selection in cmake Users can now select which regex implementation they want to use: one of the system `regcomp_l`, the system PCRE, the builtin PCRE or the system's `regcomp`. By default the system `regcomp_l` will be used if it exists, otherwise the system PCRE will be used. If neither of those exist, then the builtin PCRE implementation will be used. The system's `regcomp` is not used by default due to problems with locales.
Edward Thomson fe1fb36e 2019-01-13T21:10:50 win32: move type definitions for improved inclusion Move some win32 type definitions to a standalone file so that they can be included before other header files try to use the definitions.
Patrick Steinhardt 31f8f82a 2018-03-02T12:18:59 diff_driver: detect memory allocation errors when loading diff driver When searching for a configuration key for the diff driver, we construct the config key by modifying a buffer and then passing it to `git_config_get_multivar_foreach`. We do not check though whether the modification of the buffer actually succeded, so we could in theory end up passing the OOM buffer to the config function. Fix that by checking return codes. While at it, switch to use `git_buf_PUTS` to avoid repetition of the appended string to calculate its length.
Edward Thomson 9ceafb57 2019-01-12T22:55:31 regexec: use pcre as our fallback/builtin regex Use PCRE 8.42 as the builtin regex implementation, using its POSIX compatibility layer. PCRE uses ASCII by default and the users locale will not influence its behavior, so its `regcomp` implementation is similar to `regcomp_l` with a C locale.
Edward Thomson 02683b20 2019-01-12T23:06:39 regexec: prefix all regexec function calls with p_ Prefix all the calls to the the regexec family of functions with `p_`. This allows us to swap out all the regular expression functions with our own implementation. Move the declarations to `posix_regex.h` for simpler inclusion.
Edward Thomson c9f116f1 2019-05-12T22:06:00 Merge branch 'pr/5061'
Edward Thomson ab27c835 2019-05-12T22:05:26 revwalk: update error message for clarity
Edward Thomson 1e3a639d 2019-05-12T21:54:39 Merge pull request #5065 from danielgindi/feature/win32_symlink_dir Support symlinks for directories in win32
Edward Thomson 7f562f2c 2019-05-12T11:00:31 Merge pull request #5057 from eaigner/merge-rebase-onto-name rebase: orig_head and onto accessors
Heiko Voigt 6990a492 2019-05-06T11:39:51 revwalk: fix memory leak in error handling This is not implemented and should fail, but it should also not leak. To allow the memory debugger to find leaks and fix this one we test this.
Daniel Cohen Gindi 336e98bb 2019-05-06T14:51:52 Moved dwFlags declaration to beginning of scope
Daniel Cohen Gindi 37a7adb5 2019-05-05T07:49:09 Support symlinks for directories in win32
Heiko Voigt d55bb479 2019-04-26T15:59:49 git_revwalk_push_range: do not crash if range is missing If someone passes just one ref (i.e. "master") and misses passing the range we should be nice and return an error code instead of crashing.
Patrick Steinhardt 604e2811 2019-05-02T12:09:23 Merge pull request #5063 from pks-t/pks/cmake-regcomp-fix cmake: correctly detect if system provides `regcomp`
Patrick Steinhardt ee3d71fb 2019-04-26T08:01:56 cmake: fix include ordering issues with bundled deps When linking against bundled libraries, we include their header directories by using "-isystem". The reason for that is that we want to handle our vendored library headers specially, most importantly to ignore warnings generated by including them. By using "-isystem", though, we screw up the order of searched include directories by moving those bundled dependencies towards the end of the lookup order. Like this, chances are high that any other specified include directory contains a file that collides with the actual desired include file. Fix this by not treating the bundled dependencies' include directories as system includes. This will move them to the front of the lookup order and thus cause them to override system-provided headers. While this may cause the compiler to generate additional warnings when processing bundled headers, this is a tradeoff we should make regardless to fix builds on systems hitting this issue.
Patrick Steinhardt 13cb9f7a 2019-02-25T11:35:16 cmake: correctly detect if system provides `regcomp` We assume that if we are on Win32, Amiga OS, Solaris or SunOS, that the regcomp(3P) function cannot be provided by the system. Thus we will in these cases always include our own, bundled regex sources to make a regcomp implementation available. This test is obviously very fragile, and we have seen it fail on MSYS2/MinGW systems, which do in fact provide the regcomp symbol. The effect is that during compilation, we will use the "regex.h" header provided by MinGW, but use symbols provided by ourselves. This in fact may cause subtle memory layout issues, as the structure made available via MinGW doesn't match what our bundled code expects. There's one more problem with our regex detection: on the listed platforms, we will incorrectly include the bundled regex code even in case where the system provides regcomp_l(3), but it will never be used for anything. Fix the issue by improving our regcomp detection code. Instead of relying on a fragile listing of platforms, we can just use `CHECK_FUNCTION_EXISTS` instead. This will not in fact avoid the header-ordering problem. But we can assume that as soon as a system-provided "regex.h" header is provided, that `CHECK_FUNCTION_EXISTS` will now correctly find the desired symbol and thus not include our bundled regex code.
Ian Hattendorf e44110db 2019-03-20T12:28:45 Correctly write to missing locked global config Opening a default config when ~/.gitconfig doesn't exist, locking it, and attempting to write to it causes an assertion failure. Treat non-existent global config file content as an empty string.
Patrick Steinhardt bc5b19e6 2019-04-29T09:01:45 Merge pull request #4561 from pks-t/pks/downcasting [RFC] util: introduce GIT_DOWNCAST macro
Erik Aigner e215f475 2019-04-21T21:36:36 rebase: orig_head and onto accessors The rebase struct stores fields with information about the current rebase process, which were not accessible via a public interface. Accessors for getting the `orig_head` and `onto` branch names and object ids have been added.
Edward Thomson b3923cf7 2019-04-17T13:43:52 Merge pull request #5050 from libgit2/ethomson/windows_init_traversal git_repository_init: stop traversing at windows root
Edward Thomson 45f24e78 2019-04-12T08:54:06 git_repository_init: stop traversing at windows root Stop traversing the filesystem at the Windows directory root. We were calculating the filesystem root for the given directory to create, and walking up the filesystem hierarchy. We intended to stop when the traversal path length is equal to the root path length (ie, stopping at the root, since no path may be shorter than the root path). However, on Windows, the root path may be specified in two different ways, as either `Z:` or `Z:\`, where `Z:` is the current drive letter. `git_path_dirname_r` returns the path _without_ a trailing slash, even for the Windows root. As a result, during traversal, we need to test that the traversal path is _less than or equal to_ the root path length to determine if we've hit the root to ensure that we stop when our traversal path is `Z:` and our calculated root path was `Z:\`.
Tobias Nießen cc8a9892 2019-04-16T18:13:31 config_file: check result of git_array_alloc git_array_alloc can return NULL if no memory is available, causing a segmentation fault in memset. This adds GIT_ERROR_CHECK_ALLOC similar to how other parts of the code base deal with the return value of git_array_alloc.
Etienne Samson 431601f2 2019-04-05T15:05:10 iterator: make use the `GIT_CONTAINER_OF` macro
Etienne Samson b51789ac 2019-04-16T13:20:08 transports: make use of the `GIT_CONTAINER_OF` macro
Etienne Samson 2e246474 2019-04-16T13:19:53 refdb_fs: make use of the `GIT_CONTAINER_OF` macro
Patrick Steinhardt 65203b5a 2019-04-16T13:21:16 config_file: make use of `GIT_CONTAINER_OF` macro
Patrick Steinhardt b5f40441 2019-04-16T13:21:03 util: introduce GIT_CONTAINER_OF macro In some parts of our code, we make rather heavy use of casting structures to their respective specialized implementation. One example is the configuration code with the general `git_config_backend` and the specialized `diskfile_header` structures. At some occasions, it can get confusing though with regards to the correct inheritance structure, which led to the recent bug fixed in 2424e64c4 (config: harden our use of the backend objects a bit, 2018-02-28). Object-oriented programming in C is hard, but we can at least try to have some checks when it comes to casting around stuff. Thus, this commit introduces a `GIT_CONTAINER_OF` macro, which accepts as parameters the pointer that is to be casted, the pointer it should be cast to as well as the member inside of the target structure that is the containing structure. This macro then tries hard to detect mis-casts: - It checks whether the source and target pointers are of the same type. This requires support by the compiler, as it makes use of the builtin `__builtin_types_compatible_p`. - It checks whether the embedded member of the target structure is the first member. In order to make this a compile-time constant, the compiler-provided `__builtin_offsetof` is being used for this. - It ties these two checks together by the compiler-builtin `__builtin_choose_expr`. Based on whether the previous two checks evaluate to `true`, the compiler will either compile in the correct cast, or it will output `(void)0`. The second case results in a compiler error, resulting in a compile-time check for wrong casts. The only downside to this is that it relies heavily on compiler-specific extensions. As both GCC and Clang support these features, only define this macro like explained above in case `__GNUC__` is set (Clang also defines `__GNUC__`). If the compiler is not Clang or GCC, just go with a simple cast without any additional checks.
Patrick Steinhardt ed959ca2 2019-04-16T12:36:24 Merge pull request #5027 from ddevault/master patch_parse.c: Handle CRLF in parse_header_start
Edward Thomson c4cd69b2 2019-04-07T19:10:16 Merge pull request #5039 from libgit2/ethomson/win32_hash sha1: don't inline `git_hash_global_init` for win32
Drew DeVault 30c06b60 2019-03-22T23:56:10 patch_parse.c: Handle CRLF in parse_header_start
Patrick Steinhardt 9d117e20 2019-04-05T10:22:46 ignore: treat paths with trailing "/" as directories The function `git_ignore_path_is_ignored` is there to test the ignore status of paths that need not necessarily exist inside of a repository. This has the implication that for a given path, we cannot always decide whether it references a directory or a file, and we need to distinguish those cases because ignore rules may treat those differently. E.g. given the following gitignore file: * !/**/ we'd only want to unignore directories, while keeping files ignored. But still, calling `git_ignore_path_is_ignored("dir/")` will say that this directory is ignored because it treats "dir/" as a file path. As said, the `is_ignored` function cannot always decide whether the given path is a file or directory, and thus it may produce wrong results in some cases. While this is unfixable in the general case, we can do better when we are being passed a path name with a trailing path separator (e.g. "dir/") and always treat them as directories.
Edward Thomson aeea1c46 2019-04-04T15:06:44 Merge pull request #4874 from tiennou/test/4615 Test that largefiles can be read through the tree API
Edward Thomson 6bcb7357 2019-04-04T14:04:59 Merge pull request #5035 from pks-t/pks/diff-with-space-in-filenames patch_parse: fix parsing addition/deletion of file with space
Edward Thomson 18e836cb 2019-04-04T10:55:38 Merge pull request #5018 from romkatv/strings Optimize string comparisons
Edward Thomson e5aecaf6 2019-04-04T18:45:30 sha1: don't inline `git_hash_global_init` for win32 Users of the Win32 hash cannot be inlined, as it uses a static struct. Don't inline it, but continue to declare the function in the header.
romkatv 30a56ba6 2019-03-14T14:54:47 optimize string comparisons
Patrick Steinhardt 9aa049d4 2019-03-29T13:28:59 Merge pull request #5020 from implausible/fix/gitignore-negation Negation of subdir ignore causes other subdirs to be unignored
Patrick Steinhardt b3497344 2019-03-29T12:15:20 patch_parse: fix parsing addition/deletion of file with space The diff header format is a strange beast in that it is inherently unparseable in an unambiguous way. While parsing a/file.txt b/file.txt is obvious and trivially doable, parsing a diff header of a/file b/file ab.txt b/file b/file ab.txt is not (but in fact valid and created by git.git). Due to that, we have relaxed our diff header parser in commit 80226b5f6 (patch_parse: allow parsing ambiguous patch headers, 2017-09-22), so that we started to bail out when seeing diff headers with spaces in their file names. Instead, we try to use the "---" and "+++" lines, which are unambiguous. In some cases, though, we neither have a useable file name from the header nor from the "---" or "+++" lines. This is the case when we have a deletion or addition of a file with spaces: the header is unparseable and the other lines will simply show "/dev/null". This trips our parsing logic when we try to extract the prefix (the "a/" part) that is being used in the path line, where we unconditionally try to dereference a NULL pointer in such a scenario. We can fix this by simply not trying to parse the prefix in cases where we have no useable path name. That'd leave the parsed patch without either `old_prefix` or `new_prefix` populated. But in fact such cases are already handled by users of the patch object, which simply opt to use the default prefixes in that case.
Patrick Steinhardt 131cd9b1 2019-03-29T11:58:50 patch_parse: improve formatting
Patrick Steinhardt 5f188c48 2019-03-29T11:52:39 Merge pull request #5024 from stewid/xdiff-fix-typo xdiff: fix typo
Aaron Patterson be9a386c 2019-03-22T17:04:32 Each hash implementation should define `git_hash_global_init` This means the forward declaration isn't necessary. The forward declaration can cause compilation errors as it conflicts with the `GIT_INLINE` declaration (the signatures are different).
Stefan Widgren 1a349003 2019-03-20T21:20:01 xdiff: fix typo
Steve King Jr e3d7bccb 2019-03-14T15:51:15 ignore: Do not match on prefix of negated patterns Matching on the prefix of a negated pattern was triggering false negatives on siblings of that pattern. e.g. Given the .gitignore: dir/* !dir/sub1/sub2/** The path `dir/a.text` would not be ignored.
Edward Thomson 7b083d3c 2019-03-02T18:14:36 Merge pull request #5005 from libgit2/ethomson/odb_backend_allocations odb: provide a free function for custom backends
Edward Thomson 68729289 2019-02-25T09:25:34 Merge pull request #5000 from augfab/branch_lookup_all Have git_branch_lookup accept GIT_BRANCH_ALL
Edward Thomson 459ac856 2019-02-23T18:42:53 odb: provide a free function for custom backends Custom backends can allocate memory when reading objects and providing them to libgit2. However, if an error occurs in the custom backend after the memory has been allocated for the custom object but before it's returned to libgit2, the custom backend has no way to free that memory and it must be leaked. Provide a free function that corresponds to the alloc function so that custom backends have an opportunity to free memory before they return an error.
Edward Thomson 790aae77 2019-02-23T18:40:43 odb: rename git_odb_backend_malloc for consistency The `git_odb_backend_malloc` name is a system function that is provided for custom ODB backends and allows them to allocate memory for an ODB object in the read callback. This is important so that libgit2 can later free the memory used by an ODB object that was read from the custom backend. However, the name _suggests_ that it actually allocates a `git_odb_backend`. It does not; rename it to make it clear that it actually allocates backend _data_.
Augustin Fabre c5d8e300 2019-02-21T21:46:39 branch: have git_branch_lookup accept GIT_BRANCH_ALL
Edward Thomson bd132046 2019-02-22T20:10:52 p_fallocate: compatibility fixes for macOS On macOS, fcntl(..., F_PREALLOCATE, ...) will only succeed when followed by an ftruncate(), even when it reports success. However, that syscall will fail when the file already exists. Thus, we must ignore the error code and simply let ftruncate extend the size of the file itself (albeit slowly). By calling ftruncate, we also need to prevent against file shrinkage, for compatibility with posix_ftruncate, which will only extend files, never shrink them.
Edward Thomson 7ab7bf46 2019-02-22T11:32:01 p_fallocate: don't duplicate definitions for win32
Edward Thomson 32f50452 2019-02-22T11:22:28 p_fallocate: add Windows emulation Emulate `p_fallocate` on Windows by seeking beyond the end of the file and setting the size to the current seek position.
Edward Thomson 59001e83 2019-02-21T11:41:19 remote: rename git_push_transfer_progress callback The `git_push_transfer_progress` is a callback and as such should be suffixed with `_cb` for consistency. Rename `git_push_transfer_progress` to `git_push_transfer_progress_cb`.
Edward Thomson a1ef995d 2019-02-21T10:33:30 indexer: use git_indexer_progress throughout Update internal usage of `git_transfer_progress` to `git_indexer_progreses`.
Edward Thomson 4069f924 2019-02-22T10:56:08 Merge pull request #4901 from pks-t/pks/uniform-map-api High-level map APIs
Edward Thomson 75dd7f2a 2019-02-22T10:13:00 Merge pull request #4984 from pks-t/pks/refdb-fs-race refdb_fs: fix loose/packed refs lookup racing with repacks
Edward Thomson c5594852 2019-02-22T10:06:24 Merge pull request #4998 from pks-t/pks/allocator-restructuring Allocator restructuring
Patrick Steinhardt bbdcd450 2019-02-20T10:40:06 cache: fix misnaming of `git_cache_free` Functions that free a structure's contents but not the structure itself shall be named `dispose` in the libgit2 project, but the function `git_cache_free` does not follow this naming pattern. Fix this by renaming it to `git_cache_dispose` and adjusting all callers to make use of the new name.
Patrick Steinhardt 48727e5d 2019-02-21T12:27:42 allocators: extract crtdbg allocator into its own file The Windows-specific crtdbg allocator is currently mixed into the crtdbg stacktracing compilation unit, making it harder to find than necessary. Extract it and move it into the new "allocators/" subdirectory to improve discoverability. This change means that the crtdbg compilation unit is now compiled unconditionally, whereas it has previously only been compiled on Windows platforms. Thus we now have additional guards around the code so that it will only be compiled if GIT_MSVC_CRTDBG is defined. This also allows us to move over the fallback-implementation of `git_win32_crtdbg_init_allocator` into the same compilation unit.
Patrick Steinhardt 765ff6e0 2019-02-21T12:35:48 allocators: make crtdbg allocator reuse its own realloc In commit 6e0dfc6ff (Make stdalloc__reallocarray call stdalloc__realloc, 2019-02-16), we have changed the stdalloc allocator to reuse `stdalloc__realloc` to implement `stdalloc__reallocarray`. This commit is making the same change for the Windows-specific crtdbg allocator to avoid code duplication.
Patrick Steinhardt b63396b7 2019-02-21T12:13:59 allocators: move standard allocator into subdirectory Right now, our two allocator implementations are scattered around the tree in "stdalloc.h" and "win32/w32_crtdbg_stacktrace.h". Start grouping them together in a single directory "allocators/", similar to how e.g. our streams are organized.
Patrick Steinhardt 9eb098d8 2019-02-21T11:37:04 Merge pull request #4991 from libgit2/ethomson/inttypes Remove public 'inttypes.h' header
Edward Thomson 247e6d90 2019-02-18T07:22:20 Remove public 'inttypes.h' header Remove an `inttypes.h` header that is too large in scope, and far too public. For Visual Studio 2012 and earlier (ie, `_MSC_VER < 1800`), we do need to include `stdint.h` in our public headers, for types like `uint32_t`. Internally, we also need to define `PRId64` as a printf formatting string when it is not available.
Patrick Steinhardt 554b3b9a 2019-02-21T10:31:21 Merge pull request #4996 from eaigner/master Prevent reading out of bounds memory
Erik Aigner 014d4955 2019-02-20T15:30:11 apply: prevent OOB read when parsing source buffer When parsing the patch image from a string, we split the string by newlines to get a line-based view of it. To split, we use `memchr` on the buffer and limit the buffer length by the original length provided by the caller. This works just fine for the first line, but for every subsequent line we need to actually subtract the amount of bytes that we have already read. The above issue can be easily triggered by having a source buffer with at least two lines, where the second line does _not_ end in a newline. Given a string "foo\nb", we have an original length of five bytes. After having extracted the first line, we will point to 'b' and again try to `memchr(p, '\n', 5)`, resulting in an out-of-bounds read of four bytes. Fix the issue by correctly subtracting the amount of bytes already read.
lhchavez 6b3730d4 2019-02-16T19:55:30 Fix a memory leak in odb_otype_fast() This change frees a copy of a cached object in odb_otype_fast().
Patrick Steinhardt 12c6e1fa 2019-02-20T10:54:00 Merge pull request #4986 from lhchavez/realloc Make stdalloc__reallocarray call stdalloc__realloc
Patrick Steinhardt 9f388e9f 2019-02-20T10:51:33 Merge pull request #4990 from libgit2/remove_time_monotonic Remove `git_time_monotonic`
Edward Thomson e6c6d3bb 2019-02-17T22:31:37 Remove `git_time_monotonic` `git_time_monotonic` was added so that non-native bindings like rugged could get high-resolution timing for benchmarking. However, this is outside the scope of libgit2 *and* rugged decided not to use this function in the first place. Google suggests that absolutely _nobody_ is using this function and we don't want to be in the benchmarking business. Remove the function.
lhchavez dd45539d 2019-02-16T22:06:58 Fix a _very_ improbable memory leak in git_odb_new() This change fixes a mostly theoretical memory leak in got_odb_new() that can only manifest if git_cache_init() fails due to running out of memory or not being able to acquire its lock.
lhchavez 6e0dfc6f 2019-02-16T20:26:17 Make stdalloc__reallocarray call stdalloc__realloc This change avoids calling realloc(3) in more than one place.
Patrick Steinhardt df42f368 2018-12-01T10:54:57 idxmap: remove legacy low-level interface Remove the low-level interface that was exposing implementation details of `git_idxmap` to callers. From now on, only the high-level functions shall be used to retrieve or modify values of a map. Adjust remaining existing callers.
Patrick Steinhardt bd66925a 2018-12-01T10:29:32 oidmap: remove legacy low-level interface Remove the low-level interface that was exposing implementation details of `git_oidmap` to callers. From now on, only the high-level functions shall be used to retrieve or modify values of a map. Adjust remaining existing callers.
Patrick Steinhardt 4713e7c8 2018-12-01T09:58:30 offmap: remove legacy low-level interface Remove the low-level interface that was exposing implementation details of `git_offmap` to callers. From now on, only the high-level functions shall be used to retrieve or modify values of a map. Adjust remaining existing callers.
Patrick Steinhardt fdfabdc4 2018-12-01T09:49:10 strmap: remove legacy low-level interface Remove the low-level interface that was exposing implementation details of `git_strmap` to callers. From now on, only the high-level functions shall be used to retrieve or modify values of a map. Adjust remaining existing callers.
Patrick Steinhardt 6a9117f5 2018-12-01T10:18:42 cache: use iteration interface for cache eviction To relieve us from memory pressure, we may regularly call `cache_evict_entries` to remove some entries from it. Unfortunately, our cache does not support a least-recently-used mode or something similar, which is why we evict entries completeley at random right now. Thing is, this is only possible due to the map interfaces exposing the entry indices, and we intend to completely remove those to decouple map users from map implementations. As soon as that is done, we are unable to do this random eviction anymore. Convert this to make use of an iterator for now. Obviously, there is no random eviction possible like that anymore, but we'll always start by evicting from the beginning of the map. Due to hashing, one may hope that the selected buckets will be evicted at least in some way unpredictably. But more likely than not, this will not be the case. But let's see what happens and if any users complain about degraded performance. If so, we might come up with a different scheme than random removal, e.g. by using an LRU cache.
Patrick Steinhardt c976b4f9 2018-12-01T10:18:26 indexer: use map iterator to delete expected OIDs To compute whether there are objects missing in a packfile, the indexer keeps around a map of OIDs that it still expects to see. This map does not store any values at all, but in fact the keys are owned by the map itself. Right now, we free these keys by iterating over the map and freeing the key itself, which is kind of awkward as keys are expected to be constant. We can make this a bit prettier by inserting the OID as value, too. As we already store the `NULL` pointer either way, this does not increase memory usage, but makes the code a tad more clear. Furthermore, we convert the previously existing map iteration via indices to make use of an iterator, instead.
Patrick Steinhardt 18cf5698 2018-12-01T09:37:40 maps: provide high-level iteration interface Currently, our headers need to leak some implementation details of maps due to their direct use of indices in the implementation of their foreach macros. This makes it impossible to completely hide the map structures away, and also makes it impossible to include the khash implementation header in the C files of the respective map only. This is now being fixed by providing a high-level iteration interface `map_iterate`, which takes as inputs the map that shall be iterated over, an iterator as well as the locations where keys and values shall be put into. For simplicity's sake, the iterator is a simple `size_t` that shall initialized to `0` on the first call. All existing foreach macros are then adjusted to make use of this new function.
Patrick Steinhardt c50a8ac2 2018-12-01T08:59:24 maps: use high-level function to check existence of keys Some callers were still using the tightly-coupled pattern of `lookup_index` and `valid_index` to verify that an entry exists in a map. Instead, use the more high-level `exists` functions to decouple map users from its implementation.
Patrick Steinhardt 84a089da 2018-12-01T08:50:36 maps: provide return value when deleting entries Currently, the delete functions of maps do not provide a return value. Like this, it is impossible to tell whether the entry has really been deleted or not. Change the implementation to provide either a return value of zero if the entry has been successfully deleted or `GIT_ENOTFOUND` if the key could not be found. Convert callers to the `delete_at` functions to instead use this higher-level interface.
Patrick Steinhardt 8da93944 2018-12-01T10:52:44 idxmap: have `resize` functions return proper error code The currently existing function `git_idxmap_resize` and `git_idxmap_icase_resize` do not return any error codes at all due to their previous implementation making use of a macro. Due to that, it is impossible to see whether the resize operation might have failed due to an out-of-memory situation. Fix this by providing a proper error code. Adjust callers to make use of it.
Patrick Steinhardt 661fc57b 2018-12-01T01:16:25 idxmap: introduce high-level setter for key/value pairs Currently, one would use the function `git_idxmap_insert` to insert key/value pairs into a map. This function has historically been a macro, which is why its syntax is kind of weird: instead of returning an error code directly, it instead has to be passed a pointer to where the return value shall be stored. This does not match libgit2's common idiom of directly returning error codes. Introduce a new function `git_idxmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert all callers of `git_idxmap_insert` to make use of it.
Patrick Steinhardt d00c24a9 2019-01-23T10:49:25 idxmap: introduce high-level getter for values The current way of looking up an entry from a map is tightly coupled with the map implementation, as one first has to look up the index of the key and then retrieve the associated value by using the index. As a caller, you usually do not care about any indices at all, though, so this is more complicated than really necessary. Furthermore, it invites for errors to happen if the correct error checking sequence is not being followed. Introduce new high-level functions `git_idxmap_get` and `git_idxmap_icase_get` that take a map and a key and return a pointer to the associated value if such a key exists. Otherwise, a `NULL` pointer is returned. Adjust all callers that can trivially be converted.
Patrick Steinhardt b9d0b664 2018-12-17T09:10:53 offmap: introduce high-level setter for key/value pairs Currently, there is only one caller that adds entries into an offset map, and this caller first uses `git_offmap_put` to add a key and then set the value at the returned index by using `git_offmap_set_value_at`. This is just too tighlty coupled with implementation details of the map as it exposes the index of inserted entries, which we really do not care about at all. Introduce a new function `git_offmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert the caller to make use of it instead.
Patrick Steinhardt aa245623 2018-11-30T18:28:05 offmap: introduce high-level getter for values The current way of looking up an entry from a map is tightly coupled with the map implementation, as one first has to look up the index of the key and then retrieve the associated value by using the index. As a caller, you usually do not care about any indices at all, though, so this is more complicated than really necessary. Furthermore, it invites for errors to happen if the correct error checking sequence is not being followed. Introduce a new high-level function `git_offmap_get` that takes a map and a key and returns a pointer to the associated value if such a key exists. Otherwise, a `NULL` pointer is returned. Adjust all callers that can trivially be converted.
Patrick Steinhardt 2e0a3048 2019-01-23T10:48:55 oidmap: introduce high-level setter for key/value pairs Currently, one would use either `git_oidmap_insert` to insert key/value pairs into a map or `git_oidmap_put` to insert a key only. These function have historically been macros, which is why their syntax is kind of weird: instead of returning an error code directly, they instead have to be passed a pointer to where the return value shall be stored. This does not match libgit2's common idiom of directly returning error codes.Furthermore, `git_oidmap_put` is tightly coupled with implementation details of the map as it exposes the index of inserted entries. Introduce a new function `git_oidmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert all trivial callers of `git_oidmap_insert` and `git_oidmap_put` to make use of it.
Patrick Steinhardt 9694ef20 2018-12-17T09:01:53 oidmap: introduce high-level getter for values The current way of looking up an entry from a map is tightly coupled with the map implementation, as one first has to look up the index of the key and then retrieve the associated value by using the index. As a caller, you usually do not care about any indices at all, though, so this is more complicated than really necessary. Furthermore, it invites for errors to happen if the correct error checking sequence is not being followed. Introduce a new high-level function `git_oidmap_get` that takes a map and a key and returns a pointer to the associated value if such a key exists. Otherwise, a `NULL` pointer is returned. Adjust all callers that can trivially be converted.
Patrick Steinhardt 03555830 2019-01-23T10:44:33 strmap: introduce high-level setter for key/value pairs Currently, one would use the function `git_strmap_insert` to insert key/value pairs into a map. This function has historically been a macro, which is why its syntax is kind of weird: instead of returning an error code directly, it instead has to be passed a pointer to where the return value shall be stored. This does not match libgit2's common idiom of directly returning error codes. Introduce a new function `git_strmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert all callers of `git_strmap_insert` to make use of it.