kmx git

Commit	Date	Message
c4cd69b2	2019-04-07T19:10:16	Merge pull request #5039 from libgit2/ethomson/win32_hash sha1: don't inline `git_hash_global_init` for win32
9d117e20	2019-04-05T10:22:46	ignore: treat paths with trailing "/" as directories The function `git_ignore_path_is_ignored` is there to test the ignore status of paths that need not necessarily exist inside of a repository. This has the implication that for a given path, we cannot always decide whether it references a directory or a file, and we need to distinguish those cases because ignore rules may treat those differently. E.g. given a gitignore file with the two rules `*` and `!/**/`, we'd only want to unignore directories, while keeping files ignored. But still, calling `git_ignore_path_is_ignored("dir/")` will say that this directory is ignored because it treats "dir/" as a file path. As said, the `is_ignored` function cannot always decide whether the given path is a file or directory, and thus it may produce wrong results in some cases. While this is unfixable in the general case, we can do better when we are being passed a path name with a trailing path separator (e.g. "dir/") and always treat them as directories.
aeea1c46	2019-04-04T15:06:44	Merge pull request #4874 from tiennou/test/4615 Test that largefiles can be read through the tree API
6bcb7357	2019-04-04T14:04:59	Merge pull request #5035 from pks-t/pks/diff-with-space-in-filenames patch_parse: fix parsing addition/deletion of file with space
18e836cb	2019-04-04T10:55:38	Merge pull request #5018 from romkatv/strings Optimize string comparisons
e5aecaf6	2019-04-04T18:45:30	sha1: don't inline `git_hash_global_init` for win32 Users of the Win32 hash cannot be inlined, as it uses a static struct. Don't inline it, but continue to declare the function in the header.
30a56ba6	2019-03-14T14:54:47	optimize string comparisons
9aa049d4	2019-03-29T13:28:59	Merge pull request #5020 from implausible/fix/gitignore-negation Negation of subdir ignore causes other subdirs to be unignored
b3497344	2019-03-29T12:15:20	patch_parse: fix parsing addition/deletion of file with space The diff header format is a strange beast in that it is inherently unparseable in an unambiguous way. While parsing a/file.txt b/file.txt is obvious and trivially doable, parsing a diff header of a/file b/file ab.txt b/file b/file ab.txt is not (but in fact valid and created by git.git). Due to that, we have relaxed our diff header parser in commit 80226b5f6 (patch_parse: allow parsing ambiguous patch headers, 2017-09-22), so that we started to bail out when seeing diff headers with spaces in their file names. Instead, we try to use the "---" and "+++" lines, which are unambiguous. In some cases, though, we neither have a useable file name from the header nor from the "---" or "+++" lines. This is the case when we have a deletion or addition of a file with spaces: the header is unparseable and the other lines will simply show "/dev/null". This trips our parsing logic when we try to extract the prefix (the "a/" part) that is being used in the path line, where we unconditionally try to dereference a NULL pointer in such a scenario. We can fix this by simply not trying to parse the prefix in cases where we have no useable path name. That'd leave the parsed patch without either `old_prefix` or `new_prefix` populated. But in fact such cases are already handled by users of the patch object, which simply opt to use the default prefixes in that case.
131cd9b1	2019-03-29T11:58:50	patch_parse: improve formatting
5f188c48	2019-03-29T11:52:39	Merge pull request #5024 from stewid/xdiff-fix-typo xdiff: fix typo
be9a386c	2019-03-22T17:04:32	Each hash implementation should define `git_hash_global_init` This means the forward declaration isn't necessary. The forward declaration can cause compilation errors as it conflicts with the `GIT_INLINE` declaration (the signatures are different).
1a349003	2019-03-20T21:20:01	xdiff: fix typo
e3d7bccb	2019-03-14T15:51:15	ignore: Do not match on prefix of negated patterns Matching on the prefix of a negated pattern was triggering false negatives on siblings of that pattern. E.g. given a .gitignore with the two rules `dir/*` and `!dir/sub1/sub2/**`, the path `dir/a.text` would not be ignored.
7b083d3c	2019-03-02T18:14:36	Merge pull request #5005 from libgit2/ethomson/odb_backend_allocations odb: provide a free function for custom backends
68729289	2019-02-25T09:25:34	Merge pull request #5000 from augfab/branch_lookup_all Have git_branch_lookup accept GIT_BRANCH_ALL
459ac856	2019-02-23T18:42:53	odb: provide a free function for custom backends Custom backends can allocate memory when reading objects and providing them to libgit2. However, if an error occurs in the custom backend after the memory has been allocated for the custom object but before it's returned to libgit2, the custom backend has no way to free that memory and it must be leaked. Provide a free function that corresponds to the alloc function so that custom backends have an opportunity to free memory before they return an error.
790aae77	2019-02-23T18:40:43	odb: rename git_odb_backend_malloc for consistency The `git_odb_backend_malloc` function is provided for custom ODB backends and allows them to allocate memory for an ODB object in the read callback. This is important so that libgit2 can later free the memory used by an ODB object that was read from the custom backend. However, the name _suggests_ that it actually allocates a `git_odb_backend`. It does not; rename it to make it clear that it actually allocates backend _data_.
c5d8e300	2019-02-21T21:46:39	branch: have git_branch_lookup accept GIT_BRANCH_ALL
bd132046	2019-02-22T20:10:52	p_fallocate: compatibility fixes for macOS On macOS, fcntl(..., F_PREALLOCATE, ...) will only succeed when followed by an ftruncate(), even when it reports success. However, that syscall will fail when the file already exists. Thus, we must ignore the error code and simply let ftruncate extend the size of the file itself (albeit slowly). By calling ftruncate, we also need to guard against file shrinkage, for compatibility with posix_fallocate, which will only extend files, never shrink them.
7ab7bf46	2019-02-22T11:32:01	p_fallocate: don't duplicate definitions for win32
32f50452	2019-02-22T11:22:28	p_fallocate: add Windows emulation Emulate `p_fallocate` on Windows by seeking beyond the end of the file and setting the size to the current seek position.
59001e83	2019-02-21T11:41:19	remote: rename git_push_transfer_progress callback The `git_push_transfer_progress` is a callback and as such should be suffixed with `_cb` for consistency. Rename `git_push_transfer_progress` to `git_push_transfer_progress_cb`.
a1ef995d	2019-02-21T10:33:30	indexer: use git_indexer_progress throughout Update internal usage of `git_transfer_progress` to `git_indexer_progress`.
4069f924	2019-02-22T10:56:08	Merge pull request #4901 from pks-t/pks/uniform-map-api High-level map APIs
75dd7f2a	2019-02-22T10:13:00	Merge pull request #4984 from pks-t/pks/refdb-fs-race refdb_fs: fix loose/packed refs lookup racing with repacks
c5594852	2019-02-22T10:06:24	Merge pull request #4998 from pks-t/pks/allocator-restructuring Allocator restructuring
bbdcd450	2019-02-20T10:40:06	cache: fix misnaming of `git_cache_free` Functions that free a structure's contents but not the structure itself shall be named `dispose` in the libgit2 project, but the function `git_cache_free` does not follow this naming pattern. Fix this by renaming it to `git_cache_dispose` and adjusting all callers to make use of the new name.
765ff6e0	2019-02-21T12:35:48	allocators: make crtdbg allocator reuse its own realloc In commit 6e0dfc6ff (Make stdalloc__reallocarray call stdalloc__realloc, 2019-02-16), we have changed the stdalloc allocator to reuse `stdalloc__realloc` to implement `stdalloc__reallocarray`. This commit is making the same change for the Windows-specific crtdbg allocator to avoid code duplication.
48727e5d	2019-02-21T12:27:42	allocators: extract crtdbg allocator into its own file The Windows-specific crtdbg allocator is currently mixed into the crtdbg stacktracing compilation unit, making it harder to find than necessary. Extract it and move it into the new "allocators/" subdirectory to improve discoverability. This change means that the crtdbg compilation unit is now compiled unconditionally, whereas it has previously only been compiled on Windows platforms. Thus we now have additional guards around the code so that it will only be compiled if GIT_MSVC_CRTDBG is defined. This also allows us to move over the fallback-implementation of `git_win32_crtdbg_init_allocator` into the same compilation unit.
b63396b7	2019-02-21T12:13:59	allocators: move standard allocator into subdirectory Right now, our two allocator implementations are scattered around the tree in "stdalloc.h" and "win32/w32_crtdbg_stacktrace.h". Start grouping them together in a single directory "allocators/", similar to how e.g. our streams are organized.
9eb098d8	2019-02-21T11:37:04	Merge pull request #4991 from libgit2/ethomson/inttypes Remove public 'inttypes.h' header
247e6d90	2019-02-18T07:22:20	Remove public 'inttypes.h' header Remove an `inttypes.h` header that is too large in scope, and far too public. For Visual Studio 2012 and earlier (ie, `_MSC_VER < 1800`), we do need to include `stdint.h` in our public headers, for types like `uint32_t`. Internally, we also need to define `PRId64` as a printf formatting string when it is not available.
554b3b9a	2019-02-21T10:31:21	Merge pull request #4996 from eaigner/master Prevent reading out of bounds memory
014d4955	2019-02-20T15:30:11	apply: prevent OOB read when parsing source buffer When parsing the patch image from a string, we split the string by newlines to get a line-based view of it. To split, we use `memchr` on the buffer and limit the buffer length by the original length provided by the caller. This works just fine for the first line, but for every subsequent line we need to actually subtract the amount of bytes that we have already read. The above issue can be easily triggered by having a source buffer with at least two lines, where the second line does _not_ end in a newline. Given a string "foo\nb", we have an original length of five bytes. After having extracted the first line, we will point to 'b' and again try to `memchr(p, '\n', 5)`, resulting in an out-of-bounds read of four bytes. Fix the issue by correctly subtracting the amount of bytes already read.
6b3730d4	2019-02-16T19:55:30	Fix a memory leak in odb_otype_fast() This change frees a copy of a cached object in odb_otype_fast().
12c6e1fa	2019-02-20T10:54:00	Merge pull request #4986 from lhchavez/realloc Make stdalloc__reallocarray call stdalloc__realloc
9f388e9f	2019-02-20T10:51:33	Merge pull request #4990 from libgit2/remove_time_monotonic Remove `git_time_monotonic`
e6c6d3bb	2019-02-17T22:31:37	Remove `git_time_monotonic` `git_time_monotonic` was added so that non-native bindings like rugged could get high-resolution timing for benchmarking. However, this is outside the scope of libgit2 and rugged decided not to use this function in the first place. Google suggests that absolutely _nobody_ is using this function and we don't want to be in the benchmarking business. Remove the function.
dd45539d	2019-02-16T22:06:58	Fix a _very_ improbable memory leak in git_odb_new() This change fixes a mostly theoretical memory leak in git_odb_new() that can only manifest if git_cache_init() fails due to running out of memory or not being able to acquire its lock.
6e0dfc6f	2019-02-16T20:26:17	Make stdalloc__reallocarray call stdalloc__realloc This change avoids calling realloc(3) in more than one place.
df42f368	2018-12-01T10:54:57	idxmap: remove legacy low-level interface Remove the low-level interface that was exposing implementation details of `git_idxmap` to callers. From now on, only the high-level functions shall be used to retrieve or modify values of a map. Adjust remaining existing callers.
bd66925a	2018-12-01T10:29:32	oidmap: remove legacy low-level interface Remove the low-level interface that was exposing implementation details of `git_oidmap` to callers. From now on, only the high-level functions shall be used to retrieve or modify values of a map. Adjust remaining existing callers.
4713e7c8	2018-12-01T09:58:30	offmap: remove legacy low-level interface Remove the low-level interface that was exposing implementation details of `git_offmap` to callers. From now on, only the high-level functions shall be used to retrieve or modify values of a map. Adjust remaining existing callers.
fdfabdc4	2018-12-01T09:49:10	strmap: remove legacy low-level interface Remove the low-level interface that was exposing implementation details of `git_strmap` to callers. From now on, only the high-level functions shall be used to retrieve or modify values of a map. Adjust remaining existing callers.
6a9117f5	2018-12-01T10:18:42	cache: use iteration interface for cache eviction To relieve us from memory pressure, we may regularly call `cache_evict_entries` to remove some entries from it. Unfortunately, our cache does not support a least-recently-used mode or something similar, which is why we evict entries completely at random right now. Thing is, this is only possible due to the map interfaces exposing the entry indices, and we intend to completely remove those to decouple map users from map implementations. As soon as that is done, we are unable to do this random eviction anymore. Convert this to make use of an iterator for now. Obviously, there is no random eviction possible like that anymore, but we'll always start by evicting from the beginning of the map. Due to hashing, one may hope that the selected buckets will be evicted at least in some way unpredictably. But more likely than not, this will not be the case. But let's see what happens and if any users complain about degraded performance. If so, we might come up with a different scheme than random removal, e.g. by using an LRU cache.
c976b4f9	2018-12-01T10:18:26	indexer: use map iterator to delete expected OIDs To compute whether there are objects missing in a packfile, the indexer keeps around a map of OIDs that it still expects to see. This map does not store any values at all, but in fact the keys are owned by the map itself. Right now, we free these keys by iterating over the map and freeing the key itself, which is kind of awkward as keys are expected to be constant. We can make this a bit prettier by inserting the OID as value, too. As we already store the `NULL` pointer either way, this does not increase memory usage, but makes the code a tad more clear. Furthermore, we convert the previously existing map iteration via indices to make use of an iterator, instead.
18cf5698	2018-12-01T09:37:40	maps: provide high-level iteration interface Currently, our headers need to leak some implementation details of maps due to their direct use of indices in the implementation of their foreach macros. This makes it impossible to completely hide the map structures away, and also makes it impossible to include the khash implementation header in the C files of the respective map only. This is now being fixed by providing a high-level iteration interface `map_iterate`, which takes as inputs the map that shall be iterated over, an iterator as well as the locations where keys and values shall be put into. For simplicity's sake, the iterator is a simple `size_t` that shall be initialized to `0` on the first call. All existing foreach macros are then adjusted to make use of this new function.
c50a8ac2	2018-12-01T08:59:24	maps: use high-level function to check existence of keys Some callers were still using the tightly-coupled pattern of `lookup_index` and `valid_index` to verify that an entry exists in a map. Instead, use the more high-level `exists` functions to decouple map users from its implementation.
84a089da	2018-12-01T08:50:36	maps: provide return value when deleting entries Currently, the delete functions of maps do not provide a return value. Like this, it is impossible to tell whether the entry has really been deleted or not. Change the implementation to provide either a return value of zero if the entry has been successfully deleted or `GIT_ENOTFOUND` if the key could not be found. Convert callers to the `delete_at` functions to instead use this higher-level interface.
8da93944	2018-12-01T10:52:44	idxmap: have `resize` functions return proper error code The currently existing functions `git_idxmap_resize` and `git_idxmap_icase_resize` do not return any error codes at all due to their previous implementation making use of a macro. Due to that, it is impossible to see whether the resize operation might have failed due to an out-of-memory situation. Fix this by providing a proper error code. Adjust callers to make use of it.
661fc57b	2018-12-01T01:16:25	idxmap: introduce high-level setter for key/value pairs Currently, one would use the function `git_idxmap_insert` to insert key/value pairs into a map. This function has historically been a macro, which is why its syntax is kind of weird: instead of returning an error code directly, it instead has to be passed a pointer to where the return value shall be stored. This does not match libgit2's common idiom of directly returning error codes. Introduce a new function `git_idxmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert all callers of `git_idxmap_insert` to make use of it.
d00c24a9	2019-01-23T10:49:25	idxmap: introduce high-level getter for values The current way of looking up an entry from a map is tightly coupled with the map implementation, as one first has to look up the index of the key and then retrieve the associated value by using the index. As a caller, you usually do not care about any indices at all, though, so this is more complicated than really necessary. Furthermore, it invites for errors to happen if the correct error checking sequence is not being followed. Introduce new high-level functions `git_idxmap_get` and `git_idxmap_icase_get` that take a map and a key and return a pointer to the associated value if such a key exists. Otherwise, a `NULL` pointer is returned. Adjust all callers that can trivially be converted.
b9d0b664	2018-12-17T09:10:53	offmap: introduce high-level setter for key/value pairs Currently, there is only one caller that adds entries into an offset map, and this caller first uses `git_offmap_put` to add a key and then sets the value at the returned index by using `git_offmap_set_value_at`. This is just too tightly coupled with implementation details of the map as it exposes the index of inserted entries, which we really do not care about at all. Introduce a new function `git_offmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert the caller to make use of it instead.
aa245623	2018-11-30T18:28:05	offmap: introduce high-level getter for values The current way of looking up an entry from a map is tightly coupled with the map implementation, as one first has to look up the index of the key and then retrieve the associated value by using the index. As a caller, you usually do not care about any indices at all, though, so this is more complicated than really necessary. Furthermore, it invites for errors to happen if the correct error checking sequence is not being followed. Introduce a new high-level function `git_offmap_get` that takes a map and a key and returns a pointer to the associated value if such a key exists. Otherwise, a `NULL` pointer is returned. Adjust all callers that can trivially be converted.
2e0a3048	2019-01-23T10:48:55	oidmap: introduce high-level setter for key/value pairs Currently, one would use either `git_oidmap_insert` to insert key/value pairs into a map or `git_oidmap_put` to insert a key only. These functions have historically been macros, which is why their syntax is kind of weird: instead of returning an error code directly, they instead have to be passed a pointer to where the return value shall be stored. This does not match libgit2's common idiom of directly returning error codes. Furthermore, `git_oidmap_put` is tightly coupled with implementation details of the map as it exposes the index of inserted entries. Introduce a new function `git_oidmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert all trivial callers of `git_oidmap_insert` and `git_oidmap_put` to make use of it.
9694ef20	2018-12-17T09:01:53	oidmap: introduce high-level getter for values The current way of looking up an entry from a map is tightly coupled with the map implementation, as one first has to look up the index of the key and then retrieve the associated value by using the index. As a caller, you usually do not care about any indices at all, though, so this is more complicated than really necessary. Furthermore, it invites for errors to happen if the correct error checking sequence is not being followed. Introduce a new high-level function `git_oidmap_get` that takes a map and a key and returns a pointer to the associated value if such a key exists. Otherwise, a `NULL` pointer is returned. Adjust all callers that can trivially be converted.
03555830	2019-01-23T10:44:33	strmap: introduce high-level setter for key/value pairs Currently, one would use the function `git_strmap_insert` to insert key/value pairs into a map. This function has historically been a macro, which is why its syntax is kind of weird: instead of returning an error code directly, it instead has to be passed a pointer to where the return value shall be stored. This does not match libgit2's common idiom of directly returning error codes. Introduce a new function `git_strmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert all callers of `git_strmap_insert` to make use of it.
ef507bc7	2019-01-23T10:44:02	strmap: introduce `git_strmap_get` and use it throughout the tree The current way of looking up an entry from a map is tightly coupled with the map implementation, as one first has to look up the index of the key and then retrieve the associated value by using the index. As a caller, you usually do not care about any indices at all, though, so this is more complicated than really necessary. Furthermore, it invites for errors to happen if the correct error checking sequence is not being followed. Introduce a new high-level function `git_strmap_get` that takes a map and a key and returns a pointer to the associated value if such a key exists. Otherwise, a `NULL` pointer is returned. Adjust all callers that can trivially be converted.
7e926ef3	2018-11-30T12:14:43	maps: provide a uniform entry count interface There currently exist two different function names for getting the entry count of maps, where offset and string maps use `num_entries` and OID maps use `size`. In most programming languages with built-in map types, this is simply called `size`, which is also shorter to type. Thus, this commit renames the other two `num_entries` functions to `size` to match the common way and adjusts all callers.
351eeff3	2019-01-23T10:42:46	maps: use uniform lifecycle management functions Currently, the lifecycle functions for maps (allocation, deallocation, resize) are not named in a uniform way and do not have a uniform function signature. Rename the functions to fix that, and stick to libgit2's naming scheme of saying `git_foo_new`. This results in the following new interface for allocation: - `int git_<t>map_new(git_<t>map **out)` to allocate a new map, returning an error code if we ran out of memory - `void git_<t>map_free(git_<t>map *map)` to free a map - `void git_<t>map_clear(git_<t>map *map)` to remove all entries from a map This commit also fixes all existing callers.
94743daf	2019-02-15T11:16:46	refdb_fs: fix race when migrating loose to packed refs in iteration Right now, we first load the packed refs cache and only afterwards load the loose references. This is susceptible to a race when the loose ref is being migrated to a packed cache by e.g. git-pack-refs(1): 1. libgit2 loads the packed refs, which do not yet have the migrated reference. 2. git-pack-refs updates the packed refs file to have the migrated ref. 3. git-pack-refs deletes the old loose ref. 4. libgit2 looks up the loose ref. So we now do not find the reference at all and will never iterate over it. Fix the issue by reversing the order: instead of first loading the packed refs, we will now look up the loose reference first. If it has already been deleted, then it must already be present in the packed-refs by definition, as git.git will only delete the reference after updating the packed refs file.
3ff0e3b5	2019-02-15T11:16:05	refdb_fs: remove ordering dependency on loose/packed refs loading Right now, loading loose refs has the side-effect of setting the `PACKREF_SHADOWED` flag for references that exist both in the loose and the packed refs. Because of this, we are forced to first look up packed refs and only afterwards load the loose refs. This is susceptible to a race, though, when refs are being repacked: when first loading the packed cache, then it may not yet have the migrated loose ref. But when now trying to look up the loose reference afterwards, then it may already have been migrated. Thus, we would fail to find this reference in this scenario. Remove this ordering dependency to allow fixing the above race. Instead of setting the flag when loading loose refs, we will now instead set it lazily when iterating over the loose refs. This even has the added benefit of not requiring us to lock the packed refs cache, as we already have an owned copy of it.
83333814	2019-02-15T10:56:50	refdb_fs: do not lazily copy packed ref cache When creating a new iterator, we eagerly load loose refs but only lazily create a copy of packed refs. The lazy load only happens as soon as we have iterated over all loose refs, opening up a potentially wide window for races. This may lead to an inconsistent view e.g. when the caller decides to reload packed references at some point while iterating the loose refs, which is unexpected. Fix the issue by eagerly copying the sorted cache. Note that right now, we are heavily dependent on ordering here: we first need to reload packed refs, then we have to load loose refs and only as a last step are we allowed to copy the cache. This is because loading loose refs has the side-effect of setting the `PACKREF_SHADOWED` flag in the packed refs cache, which we require to avoid outputting packed refs that already exist as loose refs.
32063d82	2019-02-15T10:41:30	refdb_fs: refactor error handling in iterator creation Refactor the error handling in `refdb_fs_backend__iterator` to always return the correct error code returned by the failing function.
8c773438	2019-02-15T10:15:39	refdb_fs: fix potential race with ref repacking in `exists` callback When repacking references, git.git will first update the packed refs and only afterwards delete any existing loose references that have now been moved to the new packed refs file. Due to this, there is a potential for racing if one first reads the packfile (which has not been updated yet) and only then trying to read the loose reference (which has just been deleted). In this case, one will incorrectly fail to lookup the reference and it will be reported as missing. Naturally, this is exactly what we've been doing in `refdb_fs_backend__exists`. Fix the race by reversing the lookup: we will now first check if the loose reference exists and only afterwards refresh the packed file.
bda08397	2019-02-14T16:57:47	Merge pull request #4982 from pks-t/pks/worktree-add-bare-head Enable creation of worktree from bare repo's default branch
48005936	2019-02-14T16:55:18	Merge pull request #4965 from hackworks/eliminate-check-for-keep-file Allow bypassing check for '.keep' file
bf013fc0	2019-02-14T13:30:33	branch: fix `branch_is_checked_out` with bare repos In a bare repository, HEAD usually points to the branch that is considered the "default" branch. As the current implementation for `git_branch_is_checked_out` only does a comparison of HEAD with the branch that is to be checked, it will say that the branch pointed to by HEAD in such a bare repo is checked out. Fix this by skipping the main repo's HEAD when it is bare.
efb20825	2019-02-14T13:05:49	branches: introduce flag to skip enumeration of certain HEADs Right now, the function `git_repository_foreach_head` will always iterate over all HEADs of the main repository and its worktrees. In some cases, it might be required to skip either of those, though. Add a flag in preparation for the following commit that enables this behaviour.
788cd2d5	2019-02-14T13:49:35	branches: do not assert that the given ref is a branch Libraries should use assert(3P) only very sparingly. First, we usually shouldn't cause the caller of our library to abort in case the assert fails. Second, if code is compiled with -DNDEBUG, then the assert will not be included at all. In our `git_branch_is_checked_out` function, we have an assert that verifies that the given reference parameter is non-NULL and in fact a branch. While the first check is fine, the second is not. E.g. when compiled with -DNDEBUG, we'd proceed and treat the given reference as a branch in all cases. Fix the issue by instead treating a non-branch reference as not being checked out. This is the obvious solution, as references other than branches cannot be directly checked out.
698eae13	2019-02-14T12:52:25	worktree: error out early if given ref is not valid When adding a new worktree, we only verify that an optionally given reference is valid half-way through the function. At this point, some data structures have already been created on-disk. If we bail out due to an invalid reference, these will be left behind and need to be manually cleaned up by the user. Improve the situation by moving the reference checks to the function's preamble. Like this, we error out as early as possible and will not leave behind any files.
24ac9e0c	2019-02-13T23:26:54	deprecation: ensure we GIT_EXTERN deprecated funcs Although the error functions were deprecated, we did not properly mark them as deprecated. We need to include the `deprecated.h` file in order to ensure that the functions get their export attributes. Similarly, do not define `GIT_DEPRECATE_HARD` within the library, or those functions will also not get their export attributes. Define that only on the tests and examples.
004a3398	2019-01-28T18:31:21	Allow bypassing the check for '.keep' files using the libgit2 option 'GIT_OPT_IGNORE_PACK_KEEP_FILE_CHECK'
0ceac0d0	2019-01-23T14:45:19	mbedtls: fix potential size overflow when reading or writing data The mbedtls library uses a callback mechanism to allow downstream users to plug in their own receive and send functions. We implement `bio_read` and `bio_write` functions, which simply wrap the `git_stream_read` and `git_stream_write` functions, respectively. The problem arises due to the return value of the callback functions: mbedtls expects us to return an `int` containing the actual number of bytes that were read or written. But this is in fact completely misdesigned, as callers are allowed to pass in a buffer with length `SIZE_MAX`. We thus may be unable to represent the number of bytes written via the return value. Fix this by only ever reading or writing at most `INT_MAX` bytes.
75918aba	2019-01-23T14:43:54	mbedtls: make global variables static The mbedtls stream implementation makes use of some global variables which are not marked as `static`, even though they're only used in this compilation unit. Fix this and remove a duplicate declaration.
657197e6	2019-01-23T15:54:05	openssl: fix potential size overflow when writing data Our `openssl_write` function calls `SSL_write` by passing in both `data` and `len` arguments directly. Thing is, our `len` parameter is of type `size_t` and theirs is of type `int`. We thus need to clamp our length to be at most `INT_MAX`.
7613086d	2019-01-23T15:49:28	streams: handle short writes only in generic stream Now that the function `git_stream__write_full` exists and callers of `git_stream_write` have been adjusted, we can lift logic for short writes out of the stream implementations. Instead, this is now handled either by `git_stream__write_full` or by callers of `git_stream_write` directly.
5265b31c	2019-01-23T15:00:20	streams: fix callers potentially only writing partial data Similar to the write(3) function, implementations of `git_stream_write` do not guarantee that all bytes are written. Instead, they return the number of bytes that actually have been written, which may be smaller than the total number of bytes. Furthermore, due to an interface design issue, we cannot ever write more than `SSIZE_MAX` bytes at once, as otherwise we cannot represent the number of bytes written to the caller. Unfortunately, no caller of `git_stream_write` ever checks the return value, except to verify that no error occurred. Due to this, they are susceptible to the case where only partial data has been written. Fix this by introducing a new function `git_stream__write_full`. In contrast to `git_stream_write`, it will always return either success or failure, without returning the number of bytes written. Thus, it is able to write all `SIZE_MAX` bytes and loop around `git_stream_write` until all data has been written. Adjust all callers except the BIO callbacks in our mbedtls and OpenSSL streams, which already do the right thing and require the amount of bytes written.
193e7ce9	2019-01-23T15:42:07	streams: make file-local functions static The callback functions that implement the `git_stream` structure are only used inside of their respective implementation files, but they are not marked as `static`. Fix this.
4e3949b7	2019-01-30T02:14:11	tests: test that largefiles can be read through the tree API
fac08837	2019-01-21T11:38:46	filter: return an int Validate that the return value of the read does not exceed INT_MAX, then cast.
89bd4ddb	2019-01-21T11:32:53	diff_generate: validate oid file size Index entries are 32 bit unsigned ints, not `size_t`s.
fd9d4e28	2019-01-21T11:29:16	describe: don't mix and match abbreviated size types The git_describe_format_options.abbreviated_size type is an unsigned int. There's no need for it to be anything else; keep it what it is.
751eb462	2019-01-21T11:20:18	delta: validate sizes and cast safely Quiet down a warning from MSVC about how we're potentially losing data. Validate that our data will fit into the type provided then cast.
4947216f	2019-01-21T11:11:27	git transport: only write INT_MAX bytes The transport code returns an `int` with the number of bytes written; thus only attempt to write at most `INT_MAX`.
a861839d	2019-01-21T10:55:59	windows: add SSIZE_MAX Windows doesn't include ssize_t or its _MAX value by default. We are already declaring ssize_t as SSIZE_T, which is __int64_t on Win64 and long otherwise. Include its _MAX value as a correspondence to its type.
f1986a23	2019-01-21T09:56:23	streams: don't write more than SSIZE_MAX Our streams implementation takes a `size_t` that indicates the length of the data buffer to be written, and returns an `ssize_t` that indicates the length that _was_ written. Clearly no such implementation can write more than `SSIZE_MAX` bytes. Ensure that each TLS stream implementation does not try to write more than `SSIZE_MAX` bytes (or smaller; if the given implementation takes a smaller size).
e5e2fac8	2019-01-21T00:57:39	buffer: explicitly cast Quiet down a warning from MSVC about how we're potentially losing data. This is safe since we've explicitly tested it.
f4ebb2d4	2019-01-21T00:56:35	blame: make hunk_cmp handle unsigned differences
ae681d3f	2019-01-21T00:49:07	apply: make update_hunk accept a size_t
1d4ddb8e	2019-01-20T23:42:08	iterator: cast filesystem iterator entry values explicitly The filesystem iterator takes `stat` data from disk and puts them into index entries, which use 32 bit ints for time (the seconds portion) and filesize. However, on most systems these are not 32 bit, thus will typically invoke a warning. Most users ignore these fields entirely. Diff and checkout code do use the values, however only for the cache to determine if they should check file modification. Thus, this is not a critical error (and will cause a hash recomputation at worst).
c6cac733	2019-01-20T22:40:38	blob: validate that blob sizes fit in a size_t Our blob size is a `git_off_t`, which is a signed 64 bit int. This may be erroneously negative or larger than `SIZE_MAX`. Ensure that the blob size fits into a `size_t` before casting.
3aa6d96a	2019-01-20T20:38:25	tree: cast filename length in git_tree__parse_raw Quiet down a warning from MSVC about how we're potentially losing data. Ensure that we're within a uint16_t before we do.
759502ed	2019-01-20T20:30:42	odb_loose: explicitly cast to size_t Quiet down a warning from MSVC about how we're potentially losing data. This is safe since we've explicitly tested that it's positive and less than SIZE_MAX.
80c3867b	2019-01-20T19:20:12	patch: explicitly cast down in parse_header_percent Quiet down a warning from MSVC about how we're potentially losing data. This is safe since we've explicitly tested that it's within the range of 0-100.
494448a5	2019-01-20T19:10:08	index: explicitly cast down to a size_t Quiet down a warning from MSVC about how we're potentially losing data. This cast is safe since we've explicitly tested that `strip_len` <= `last_len`.
c3866fa8	2019-01-20T18:54:16	diff: explicitly cast in flush_hunk Quiet down a warning from MSVC about how we're potentially losing data.
826d9a4d	2019-01-25T09:43:20	Merge pull request #4858 from tiennou/fix/index-ext-read index: preserve extension parsing errors
e09f0c10	2019-01-23T10:21:42	deprecation: don't use deprecated stream cb Avoid the deprecated `git_stream_cb` typedef since we want to compile the library without deprecated functions or types. Instead, we can unroll the alias to its actual type.
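
The stream commits above (5265b31c, 7613086d, 0ceac0d0, 657197e6) describe two related ideas: never attempt to write more bytes than the return type can report, and loop until every byte has been written. The following is a minimal sketch of that pattern, not libgit2's actual code. `sketch_sink`, `sketch_write` and `sketch_write_full` are hypothetical names invented for illustration, and the in-memory sink merely stands in for a real stream.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sink standing in for a stream: it accepts at most
 * `max_chunk` bytes per call, like a socket performing short writes. */
struct sketch_sink {
	char data[64];
	size_t used;
	size_t max_chunk;
	unsigned calls;
};

/* Writes up to `len` bytes; returns the number written, or -1 on error. */
static int sketch_write(struct sketch_sink *s, const char *buf, size_t len)
{
	/* The return type is `int`, so never attempt more than INT_MAX bytes. */
	size_t n = len > (size_t)INT_MAX ? (size_t)INT_MAX : len;

	if (n > s->max_chunk)
		n = s->max_chunk;
	if (n > sizeof(s->data) - s->used)
		return -1; /* the sink is full */

	memcpy(s->data + s->used, buf, n);
	s->used += n;
	s->calls++;
	return (int)n;
}

/* Loops around the short-writing function until all bytes are written.
 * Returns 0 on success and -1 on failure, never a byte count. */
static int sketch_write_full(struct sketch_sink *s, const char *buf, size_t len)
{
	size_t off = 0;

	assert(s != NULL && buf != NULL);

	while (off < len) {
		int written = sketch_write(s, buf + off, len - off);

		if (written <= 0)
			return -1; /* error, or no progress was made */
		off += (size_t)written;
	}
	return 0;
}
```

Because the full-write helper reports only success or failure, its caller no longer needs to represent a byte count in a narrower type than `size_t`.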

b3497344

2019-03-29T12:15:20

patch_parse: fix parsing addition/deletion of file with space The diff header format is a strange beast in that it is inherently unparseable in an unambiguous way. While parsing `a/file.txt b/file.txt` is obvious and trivially doable, parsing a diff header of `a/file b/file ab.txt b/file b/file ab.txt` is not (but in fact valid and created by git.git). Due to that, we have relaxed our diff header parser in commit 80226b5f6 (patch_parse: allow parsing ambiguous patch headers, 2017-09-22), so that we started to bail out when seeing diff headers with spaces in their file names. Instead, we try to use the "---" and "+++" lines, which are unambiguous. In some cases, though, we neither have a usable file name from the header nor from the "---" or "+++" lines. This is the case when we have a deletion or addition of a file with spaces: the header is unparseable and the other lines will simply show "/dev/null". This trips our parsing logic when we try to extract the prefix (the "a/" part) that is being used in the path line, where we unconditionally try to dereference a NULL pointer in such a scenario. We can fix this by simply not trying to parse the prefix in cases where we have no usable path name. That'd leave the parsed patch without either `old_prefix` or `new_prefix` populated. But in fact such cases are already handled by users of the patch object, which simply opt to use the default prefixes in that case.

131cd9b1

2019-03-29T11:58:50

patch_parse: improve formatting

5f188c48

2019-03-29T11:52:39

Merge pull request #5024 from stewid/xdiff-fix-typo xdiff: fix typo

be9a386c

2019-03-22T17:04:32

Each hash implementation should define `git_hash_global_init` This means the forward declaration isn't necessary. The forward declaration can cause compilation errors as it conflicts with the `GIT_INLINE` declaration (the signatures are different).

1a349003

2019-03-20T21:20:01

xdiff: fix typo

e3d7bccb

2019-03-14T15:51:15

ignore: Do not match on prefix of negated patterns Matching on the prefix of a negated pattern was triggering false negatives on siblings of that pattern. E.g. given a .gitignore with the two lines `dir/*` and `!dir/sub1/sub2/**`, the path `dir/a.text` would not be ignored.

7b083d3c

2019-03-02T18:14:36

Merge pull request #5005 from libgit2/ethomson/odb_backend_allocations odb: provide a free function for custom backends

68729289

2019-02-25T09:25:34

Merge pull request #5000 from augfab/branch_lookup_all Have git_branch_lookup accept GIT_BRANCH_ALL

459ac856

2019-02-23T18:42:53

odb: provide a free function for custom backends Custom backends can allocate memory when reading objects and providing them to libgit2. However, if an error occurs in the custom backend after the memory has been allocated for the custom object but before it's returned to libgit2, the custom backend has no way to free that memory and it must be leaked. Provide a free function that corresponds to the alloc function so that custom backends have an opportunity to free memory before they return an error.

790aae77

2019-02-23T18:40:43

odb: rename git_odb_backend_malloc for consistency `git_odb_backend_malloc` is a function that is provided for custom ODB backends and allows them to allocate memory for an ODB object in the read callback. This is important so that libgit2 can later free the memory used by an ODB object that was read from the custom backend. However, the name _suggests_ that it actually allocates a `git_odb_backend`. It does not; rename it to make it clear that it actually allocates backend _data_.

c5d8e300

2019-02-21T21:46:39

branch: have git_branch_lookup accept GIT_BRANCH_ALL

bd132046

2019-02-22T20:10:52

p_fallocate: compatibility fixes for macOS On macOS, fcntl(..., F_PREALLOCATE, ...) will only succeed when followed by an ftruncate(), even when it reports success. However, that syscall will fail when the file already exists. Thus, we must ignore the error code and simply let ftruncate extend the size of the file itself (albeit slowly). By calling ftruncate, we also need to guard against file shrinkage, for compatibility with posix_fallocate, which will only extend files, never shrink them.

7ab7bf46

2019-02-22T11:32:01

p_fallocate: don't duplicate definitions for win32

32f50452

2019-02-22T11:22:28

p_fallocate: add Windows emulation Emulate `p_fallocate` on Windows by seeking beyond the end of the file and setting the size to the current seek position.

59001e83

2019-02-21T11:41:19

remote: rename git_push_transfer_progress callback The `git_push_transfer_progress` is a callback and as such should be suffixed with `_cb` for consistency. Rename `git_push_transfer_progress` to `git_push_transfer_progress_cb`.

a1ef995d

2019-02-21T10:33:30

indexer: use git_indexer_progress throughout Update internal usage of `git_transfer_progress` to `git_indexer_progress`.

4069f924

2019-02-22T10:56:08

Merge pull request #4901 from pks-t/pks/uniform-map-api High-level map APIs

75dd7f2a

2019-02-22T10:13:00

Merge pull request #4984 from pks-t/pks/refdb-fs-race refdb_fs: fix loose/packed refs lookup racing with repacks

c5594852

2019-02-22T10:06:24

Merge pull request #4998 from pks-t/pks/allocator-restructuring Allocator restructuring

bbdcd450

2019-02-20T10:40:06

cache: fix misnaming of `git_cache_free` Functions that free a structure's contents but not the structure itself shall be named `dispose` in the libgit2 project, but the function `git_cache_free` does not follow this naming pattern. Fix this by renaming it to `git_cache_dispose` and adjusting all callers to make use of the new name.

765ff6e0

2019-02-21T12:35:48

allocators: make crtdbg allocator reuse its own realloc In commit 6e0dfc6ff (Make stdalloc__reallocarray call stdalloc__realloc, 2019-02-16), we have changed the stdalloc allocator to reuse `stdalloc__realloc` to implement `stdalloc__reallocarray`. This commit is making the same change for the Windows-specific crtdbg allocator to avoid code duplication.

48727e5d

2019-02-21T12:27:42

allocators: extract crtdbg allocator into its own file The Windows-specific crtdbg allocator is currently mixed into the crtdbg stacktracing compilation unit, making it harder to find than necessary. Extract it and move it into the new "allocators/" subdirectory to improve discoverability. This change means that the crtdbg compilation unit is now compiled unconditionally, whereas it has previously only been compiled on Windows platforms. Thus we now have additional guards around the code so that it will only be compiled if GIT_MSVC_CRTDBG is defined. This also allows us to move over the fallback-implementation of `git_win32_crtdbg_init_allocator` into the same compilation unit.

b63396b7

2019-02-21T12:13:59

allocators: move standard allocator into subdirectory Right now, our two allocator implementations are scattered around the tree in "stdalloc.h" and "win32/w32_crtdbg_stacktrace.h". Start grouping them together in a single directory "allocators/", similar to how e.g. our streams are organized.

9eb098d8

2019-02-21T11:37:04

Merge pull request #4991 from libgit2/ethomson/inttypes Remove public 'inttypes.h' header

247e6d90

2019-02-18T07:22:20

Remove public 'inttypes.h' header Remove an `inttypes.h` header that is too large in scope, and far too public. For Visual Studio 2012 and earlier (ie, `_MSC_VER < 1800`), we do need to include `stdint.h` in our public headers, for types like `uint32_t`. Internally, we also need to define `PRId64` as a printf formatting string when it is not available.

554b3b9a

2019-02-21T10:31:21

Merge pull request #4996 from eaigner/master Prevent reading out of bounds memory

014d4955

2019-02-20T15:30:11

apply: prevent OOB read when parsing source buffer When parsing the patch image from a string, we split the string by newlines to get a line-based view of it. To split, we use `memchr` on the buffer and limit the buffer length by the original length provided by the caller. This works just fine for the first line, but for every subsequent line we need to actually subtract the amount of bytes that we have already read. The above issue can be easily triggered by having a source buffer with at least two lines, where the second line does _not_ end in a newline. Given a string "foo\nb", we have an original length of five bytes. After having extracted the first line, we will point to 'b' and again try to `memchr(p, '\n', 5)`, resulting in an out-of-bounds read of four bytes. Fix the issue by correctly subtracting the amount of bytes already read.
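
The fix described in 014d4955 comes down to passing `memchr` the number of bytes that remain rather than the original buffer length. A minimal sketch of a line splitter written that way is shown below. `sketch_count_lines` is a hypothetical helper invented for illustration and is not the function changed by the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts the lines in `buf`; a final line without a trailing newline
 * still counts as a line. */
static size_t sketch_count_lines(const char *buf, size_t len)
{
	const char *p = buf;
	const char *end;
	size_t lines = 0;

	assert(buf != NULL);
	end = buf + len;

	while (p < end) {
		/* Search only the bytes that remain, never the original `len`. */
		const char *nl = memchr(p, '\n', (size_t)(end - p));

		lines++;
		if (nl == NULL)
			break; /* last line has no newline */
		p = nl + 1;
	}
	return lines;
}
```

For the commit's own example, `"foo\nb"` with a length of five, the second search covers exactly one byte instead of five.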

6b3730d4

2019-02-16T19:55:30

Fix a memory leak in odb_otype_fast() This change frees a copy of a cached object in odb_otype_fast().

12c6e1fa

2019-02-20T10:54:00

Merge pull request #4986 from lhchavez/realloc Make stdalloc__reallocarray call stdalloc__realloc

9f388e9f

2019-02-20T10:51:33

Merge pull request #4990 from libgit2/remove_time_monotonic Remove `git_time_monotonic`

e6c6d3bb

2019-02-17T22:31:37

Remove `git_time_monotonic` `git_time_monotonic` was added so that non-native bindings like rugged could get high-resolution timing for benchmarking. However, this is outside the scope of libgit2 *and* rugged decided not to use this function in the first place. Google suggests that absolutely _nobody_ is using this function and we don't want to be in the benchmarking business. Remove the function.

dd45539d

2019-02-16T22:06:58

Fix a _very_ improbable memory leak in git_odb_new() This change fixes a mostly theoretical memory leak in git_odb_new() that can only manifest if git_cache_init() fails due to running out of memory or not being able to acquire its lock.

6e0dfc6f

2019-02-16T20:26:17

Make stdalloc__reallocarray call stdalloc__realloc This change avoids calling realloc(3) in more than one place.

df42f368

2018-12-01T10:54:57

idxmap: remove legacy low-level interface Remove the low-level interface that was exposing implementation details of `git_idxmap` to callers. From now on, only the high-level functions shall be used to retrieve or modify values of a map. Adjust remaining existing callers.

bd66925a

2018-12-01T10:29:32

oidmap: remove legacy low-level interface Remove the low-level interface that was exposing implementation details of `git_oidmap` to callers. From now on, only the high-level functions shall be used to retrieve or modify values of a map. Adjust remaining existing callers.

4713e7c8

2018-12-01T09:58:30

offmap: remove legacy low-level interface Remove the low-level interface that was exposing implementation details of `git_offmap` to callers. From now on, only the high-level functions shall be used to retrieve or modify values of a map. Adjust remaining existing callers.

fdfabdc4

2018-12-01T09:49:10

strmap: remove legacy low-level interface Remove the low-level interface that was exposing implementation details of `git_strmap` to callers. From now on, only the high-level functions shall be used to retrieve or modify values of a map. Adjust remaining existing callers.

6a9117f5

2018-12-01T10:18:42

cache: use iteration interface for cache eviction To relieve us from memory pressure, we may regularly call `cache_evict_entries` to remove some entries from it. Unfortunately, our cache does not support a least-recently-used mode or something similar, which is why we evict entries completely at random right now. Thing is, this is only possible due to the map interfaces exposing the entry indices, and we intend to completely remove those to decouple map users from map implementations. As soon as that is done, we are unable to do this random eviction anymore. Convert this to make use of an iterator for now. Obviously, there is no random eviction possible like that anymore, but we'll always start by evicting from the beginning of the map. Due to hashing, one may hope that the selected buckets will be evicted at least in some way unpredictably. But more likely than not, this will not be the case. But let's see what happens and if any users complain about degraded performance. If so, we might come up with a different scheme than random removal, e.g. by using an LRU cache.

c976b4f9

2018-12-01T10:18:26

indexer: use map iterator to delete expected OIDs To compute whether there are objects missing in a packfile, the indexer keeps around a map of OIDs that it still expects to see. This map does not store any values at all, but in fact the keys are owned by the map itself. Right now, we free these keys by iterating over the map and freeing the key itself, which is kind of awkward as keys are expected to be constant. We can make this a bit prettier by inserting the OID as value, too. As we already store the `NULL` pointer either way, this does not increase memory usage, but makes the code a tad more clear. Furthermore, we convert the previously existing map iteration via indices to make use of an iterator, instead.

18cf5698

2018-12-01T09:37:40

maps: provide high-level iteration interface Currently, our headers need to leak some implementation details of maps due to their direct use of indices in the implementation of their foreach macros. This makes it impossible to completely hide the map structures away, and also makes it impossible to include the khash implementation header in the C files of the respective map only. This is now being fixed by providing a high-level iteration interface `map_iterate`, which takes as inputs the map that shall be iterated over, an iterator as well as the locations where keys and values shall be put into. For simplicity's sake, the iterator is a simple `size_t` that shall be initialized to `0` on the first call. All existing foreach macros are then adjusted to make use of this new function.
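
The iteration interface described in 18cf5698 can be pictured with a toy map. The sketch below is an assumption-laden illustration, not libgit2's implementation: `sketch_map`, `sketch_map_iterate` and `SKETCH_ITEROVER` are hypothetical names, the fixed-size slot array replaces the real hash table, and the argument order is chosen for readability only. What it shows is the contract from the commit message: the caller owns a `size_t` iterator that starts at `0`, and the map's internal indices never leave the function.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SLOTS 4
#define SKETCH_ITEROVER (-31) /* stand-in for an "iteration finished" code */

/* Toy map: a NULL key marks an empty slot. */
struct sketch_map {
	const char *keys[SKETCH_SLOTS];
	int values[SKETCH_SLOTS];
};

/* Advances `*iter` to the next occupied slot and stores its key and value.
 * Returns 0 while entries remain and SKETCH_ITEROVER once exhausted. */
static int sketch_map_iterate(const struct sketch_map *map, size_t *iter,
	const char **key, int *value)
{
	assert(map != NULL && iter != NULL && key != NULL && value != NULL);

	while (*iter < SKETCH_SLOTS) {
		size_t i = (*iter)++;

		if (map->keys[i] != NULL) {
			*key = map->keys[i];
			*value = map->values[i];
			return 0;
		}
	}
	return SKETCH_ITEROVER;
}
```

A caller initializes `size_t iter = 0;` and calls the function in a `while` loop until it stops returning 0; a foreach macro can be layered on top of exactly that loop.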

c50a8ac2

2018-12-01T08:59:24

maps: use high-level function to check existence of keys Some callers were still using the tightly-coupled pattern of `lookup_index` and `valid_index` to verify that an entry exists in a map. Instead, use the more high-level `exists` functions to decouple map users from its implementation.

84a089da

2018-12-01T08:50:36

maps: provide return value when deleting entries Currently, the delete functions of maps do not provide a return value. Like this, it is impossible to tell whether the entry has really been deleted or not. Change the implementation to provide either a return value of zero if the entry has been successfully deleted or `GIT_ENOTFOUND` if the key could not be found. Convert callers to the `delete_at` functions to instead use this higher-level interface.

8da93944

2018-12-01T10:52:44

idxmap: have `resize` functions return proper error code The currently existing functions `git_idxmap_resize` and `git_idxmap_icase_resize` do not return any error codes at all due to their previous implementation making use of a macro. Due to that, it is impossible to see whether the resize operation might have failed due to an out-of-memory situation. Fix this by providing a proper error code. Adjust callers to make use of it.

661fc57b

2018-12-01T01:16:25

idxmap: introduce high-level setter for key/value pairs Currently, one would use the function `git_idxmap_insert` to insert key/value pairs into a map. This function has historically been a macro, which is why its syntax is kind of weird: instead of returning an error code directly, it instead has to be passed a pointer to where the return value shall be stored. This does not match libgit2's common idiom of directly returning error codes. Introduce a new function `git_idxmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert all callers of `git_idxmap_insert` to make use of it.

d00c24a9

2019-01-23T10:49:25

idxmap: introduce high-level getter for values The current way of looking up an entry from a map is tightly coupled with the map implementation, as one first has to look up the index of the key and then retrieve the associated value by using the index. As a caller, you usually do not care about any indices at all, though, so this is more complicated than really necessary. Furthermore, it invites for errors to happen if the correct error checking sequence is not being followed. Introduce new high-level functions `git_idxmap_get` and `git_idxmap_icase_get` that take a map and a key and return a pointer to the associated value if such a key exists. Otherwise, a `NULL` pointer is returned. Adjust all callers that can trivially be converted.

b9d0b664

2018-12-17T09:10:53

offmap: introduce high-level setter for key/value pairs Currently, there is only one caller that adds entries into an offset map, and this caller first uses `git_offmap_put` to add a key and then set the value at the returned index by using `git_offmap_set_value_at`. This is just too tightly coupled with implementation details of the map as it exposes the index of inserted entries, which we really do not care about at all. Introduce a new function `git_offmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert the caller to make use of it instead.

aa245623

2018-11-30T18:28:05

offmap: introduce high-level getter for values The current way of looking up an entry from a map is tightly coupled with the map implementation, as one first has to look up the index of the key and then retrieve the associated value by using the index. As a caller, you usually do not care about any indices at all, though, so this is more complicated than really necessary. Furthermore, it invites for errors to happen if the correct error checking sequence is not being followed. Introduce a new high-level function `git_offmap_get` that takes a map and a key and returns a pointer to the associated value if such a key exists. Otherwise, a `NULL` pointer is returned. Adjust all callers that can trivially be converted.

2e0a3048

2019-01-23T10:48:55

oidmap: introduce high-level setter for key/value pairs Currently, one would use either `git_oidmap_insert` to insert key/value pairs into a map or `git_oidmap_put` to insert a key only. These functions have historically been macros, which is why their syntax is kind of weird: instead of returning an error code directly, they instead have to be passed a pointer to where the return value shall be stored. This does not match libgit2's common idiom of directly returning error codes. Furthermore, `git_oidmap_put` is tightly coupled with implementation details of the map as it exposes the index of inserted entries. Introduce a new function `git_oidmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert all trivial callers of `git_oidmap_insert` and `git_oidmap_put` to make use of it.

9694ef20

2018-12-17T09:01:53

oidmap: introduce high-level getter for values The current way of looking up an entry from a map is tightly coupled with the map implementation, as one first has to look up the index of the key and then retrieve the associated value by using the index. As a caller, you usually do not care about any indices at all, though, so this is more complicated than really necessary. Furthermore, it invites for errors to happen if the correct error checking sequence is not being followed. Introduce a new high-level function `git_oidmap_get` that takes a map and a key and returns a pointer to the associated value if such a key exists. Otherwise, a `NULL` pointer is returned. Adjust all callers that can trivially be converted.

03555830

2019-01-23T10:44:33

strmap: introduce high-level setter for key/value pairs Currently, one would use the function `git_strmap_insert` to insert key/value pairs into a map. This function has historically been a macro, which is why its syntax is kind of weird: instead of returning an error code directly, it instead has to be passed a pointer to where the return value shall be stored. This does not match libgit2's common idiom of directly returning error codes. Introduce a new function `git_strmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert all callers of `git_strmap_insert` to make use of it.

ef507bc7

2019-01-23T10:44:02

strmap: introduce `git_strmap_get` and use it throughout the tree The current way of looking up an entry from a map is tightly coupled with the map implementation, as one first has to look up the index of the key and then retrieve the associated value by using the index. As a caller, you usually do not care about any indices at all, though, so this is more complicated than really necessary. Furthermore, it invites for errors to happen if the correct error checking sequence is not being followed. Introduce a new high-level function `git_strmap_get` that takes a map and a key and returns a pointer to the associated value if such a key exists. Otherwise, a `NULL` pointer is returned. Adjust all callers that can trivially be converted.
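
The getter and setter commits above share one shape: the setter returns an error code directly, and the getter returns the value pointer or `NULL`, so callers never handle slot indices. A minimal sketch of that shape follows. It is an illustration under stated assumptions rather than libgit2's code: `sketch_strmap`, `sketch_strmap_get` and `sketch_strmap_set` are hypothetical names, and a linear scan over a fixed array stands in for the real hash table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_CAPACITY 8

/* Toy string map: a NULL key marks an empty slot. */
struct sketch_strmap {
	const char *keys[SKETCH_CAPACITY];
	void *values[SKETCH_CAPACITY];
};

/* Internal helper: slot holding `key`, or SKETCH_CAPACITY if absent.
 * Indices stay private to the map implementation. */
static size_t sketch_find(const struct sketch_strmap *map, const char *key)
{
	size_t i;

	for (i = 0; i < SKETCH_CAPACITY; i++)
		if (map->keys[i] != NULL && strcmp(map->keys[i], key) == 0)
			return i;
	return SKETCH_CAPACITY;
}

/* Returns the value stored for `key`, or NULL if the key does not exist. */
static void *sketch_strmap_get(const struct sketch_strmap *map, const char *key)
{
	size_t i;

	assert(map != NULL && key != NULL);
	i = sketch_find(map, key);
	return i < SKETCH_CAPACITY ? map->values[i] : NULL;
}

/* Inserts or overwrites `key`; returns 0 on success, -1 if the map is full. */
static int sketch_strmap_set(struct sketch_strmap *map, const char *key, void *value)
{
	size_t i;

	assert(map != NULL && key != NULL);
	i = sketch_find(map, key);
	if (i == SKETCH_CAPACITY) {
		/* Key is new: claim the first empty slot. */
		for (i = 0; i < SKETCH_CAPACITY && map->keys[i] != NULL; i++)
			;
		if (i == SKETCH_CAPACITY)
			return -1;
		map->keys[i] = key;
	}
	map->values[i] = value;
	return 0;
}
```

One consequence of the `NULL`-on-miss convention is that a stored `NULL` value cannot be told apart from a missing key through the getter alone.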

7e926ef3

2018-11-30T12:14:43

maps: provide a uniform entry count interface There currently exist two different function names for getting the entry count of maps, where offset and string maps use `num_entries` and OID maps use `size`. In most programming languages with built-in map types, this is simply called `size`, which is also shorter to type. Thus, this commit renames the other two functions `num_entries` to match the common way and adjusts all callers.

351eeff3

2019-01-23T10:42:46

maps: use uniform lifecycle management functions Currently, the lifecycle functions for maps (allocation, deallocation, resize) are not named in a uniform way and do not have a uniform function signature. Rename the functions to fix that, and stick to libgit2's naming scheme of saying `git_foo_new`. This results in the following new interface for allocation: - `int git_<t>map_new(git_<t>map **out)` to allocate a new map, returning an error code if we ran out of memory - `void git_<t>map_free(git_<t>map *map)` to free a map - `void git_<t>map_clear(git_<t>map *map)` to remove all entries from a map This commit also fixes all existing callers.

94743daf

2019-02-15T11:16:46

refdb_fs: fix race when migrating loose to packed refs in iteration Right now, we first load the packed refs cache and only afterwards load the loose references. This is susceptible to a race when the loose ref is being migrated to a packed cache by e.g. git-pack-refs(1): 1. libgit2 loads the packed refs, which do not yet have the migrated reference. 2. git-pack-refs updates the packed refs file to have the migrated ref. 3. git-pack-refs deletes the old loose ref. 4. libgit2 looks up the loose ref. So we now do not find the reference at all and will never iterate over it. Fix the issue by reversing the order: instead of first loading the packed refs, we will now look up the loose reference first. If it has already been deleted, then it must already be present in the packed-refs by definition, as git.git will only delete the reference after updating the packed refs file.

3ff0e3b5

2019-02-15T11:16:05

refdb_fs: remove ordering dependency on loose/packed refs loading Right now, loading loose refs has the side-effect of setting the `PACKREF_SHADOWED` flag for references that exist both in the loose and the packed refs. Because of this, we are forced to first load the packed refs and only afterwards the loose refs. This is susceptible to a race, though, when refs are being repacked: when first loading the packed cache, then it may not yet have the migrated loose ref. But when now trying to look up the loose reference afterwards, then it may already have been migrated. Thus, we would fail to find this reference in this scenario. Remove this ordering dependency to allow fixing the above race. Instead of setting the flag when loading loose refs, we will now instead set it lazily when iterating over the loose refs. This even has the added benefit of not requiring us to lock the packed refs cache, as we already have an owned copy of it.

83333814

2019-02-15T10:56:50

refdb_fs: do not lazily copy packed ref cache When creating a new iterator, we eagerly load loose refs but only lazily create a copy of packed refs. The lazy load only happens as soon as we have iterated over all loose refs, opening up a potentially wide window for races. This may lead to an inconsistent view e.g. when the caller decides to reload packed references at some point while iterating over the loose refs, which is unexpected. Fix the issue by eagerly copying the sorted cache. Note that right now, we are heavily dependent on ordering here: we first need to reload packed refs, then we have to load loose refs and only as a last step are we allowed to copy the cache. This is because loading loose refs has the side-effect of setting the `PACKREF_SHADOWED` flag in the packed refs cache, which we require to avoid outputting packed refs that already exist as loose refs.

32063d82

2019-02-15T10:41:30

refdb_fs: refactor error handling in iterator creation Refactor the error handling in `refdb_fs_backend__iterator` to always return the correct error code returned by the failing function.

8c773438

2019-02-15T10:15:39

refdb_fs: fix potential race with ref repacking in `exists` callback When repacking references, git.git will first update the packed refs and only afterwards delete any existing loose references that have now been moved to the new packed refs file. Due to this, there is a potential for racing if one first reads the packed refs file (which has not been updated yet) and only then tries to read the loose reference (which has just been deleted). In this case, one will incorrectly fail to look up the reference and it will be reported as missing. Naturally, this is exactly what we've been doing in `refdb_fs_backend__exists`. Fix the race by reversing the lookup: we will now first check if the loose reference exists and only afterwards refresh the packed file.

bda08397

2019-02-14T16:57:47

Merge pull request #4982 from pks-t/pks/worktree-add-bare-head Enable creation of worktree from bare repo's default branch

48005936

2019-02-14T16:55:18

Merge pull request #4965 from hackworks/eliminate-check-for-keep-file Allow bypassing check for '.keep' file

bf013fc0

2019-02-14T13:30:33

branch: fix `branch_is_checked_out` with bare repos In a bare repository, HEAD usually points to the branch that is considered the "default" branch. As the current implementation for `git_branch_is_checked_out` only does a comparison of HEAD with the branch that is to be checked, it will say that the branch pointed to by HEAD in such a bare repo is checked out. Fix this by skipping the main repo's HEAD when it is bare.

efb20825

2019-02-14T13:05:49

branches: introduce flag to skip enumeration of certain HEADs Right now, the function `git_repository_foreach_head` will always iterate over all HEADs of the main repository and its worktrees. In some cases, it might be required to skip either of those, though. Add a flag in preparation for the following commit that enables this behaviour.

788cd2d5

2019-02-14T13:49:35

branches: do not assert that the given ref is a branch Libraries should use assert(3P) only very sparingly. First, we usually shouldn't cause the caller of our library to abort in case the assert fails. Second, if code is compiled with -DNDEBUG, then the assert will not be included at all. In our `git_branch_is_checked_out` function, we have an assert that verifies that the given reference parameter is non-NULL and in fact a branch. While the first check is fine, the second is not. E.g. when compiled with -DNDEBUG, we'd proceed and treat the given reference as a branch in all cases. Fix the issue by instead treating a non-branch reference as not being checked out. This is the obvious solution, as references other than branches cannot be directly checked out.
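A minimal sketch of the pattern, assuming a branch is any ref under `refs/heads/` and that "checked out" means "equal to what HEAD points at". The `demo_` functions are hypothetical and work on plain strings rather than `git_reference` objects.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: treat any name under refs/heads/ as a branch. */
static bool demo_is_branch(const char *refname)
{
	return strncmp(refname, "refs/heads/", strlen("refs/heads/")) == 0;
}

/* A non-branch ref is reported as "not checked out" instead of asserting,
 * so the result is the same with and without -DNDEBUG. */
bool demo_branch_is_checked_out(const char *refname, const char *head_target)
{
	/* Programming errors may still be asserted on. */
	assert(refname != NULL && head_target != NULL);

	if (!demo_is_branch(refname))
		return false;

	return strcmp(refname, head_target) == 0;
}
```

The runtime check costs one string comparison and removes the behavioural difference between debug and release builds.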

698eae13

2019-02-14T12:52:25

worktree: error out early if given ref is not valid When adding a new worktree, we only verify that an optionally given reference is valid half-way through the function. At this point, some data structures have already been created on-disk. If we bail out due to an invalid reference, these will be left behind and need to be manually cleaned up by the user. Improve the situation by moving the reference checks to the function's preamble. Like this, we error out as early as possible and will not leave behind any files.

24ac9e0c

2019-02-13T23:26:54

deprecation: ensure we GIT_EXTERN deprecated funcs Although the error functions were deprecated, we did not properly mark them as deprecated. We need to include the `deprecated.h` file in order to ensure that the functions get their export attributes. Similarly, do not define `GIT_DEPRECATE_HARD` within the library, or those functions will also not get their export attributes. Define that only on the tests and examples.

004a3398

2019-01-28T18:31:21

Allow bypassing the check for '.keep' files using the libgit2 option 'GIT_OPT_IGNORE_PACK_KEEP_FILE_CHECK'

0ceac0d0

2019-01-23T14:45:19

mbedtls: fix potential size overflow when reading or writing data The mbedtls library uses a callback mechanism to allow downstream users to plug in their own receive and send functions. We implement `bio_read` and `bio_write` functions, which simply wrap the `git_stream_read` and `git_stream_write` functions, respectively. The problem arises due to the return value of the callback functions: mbedtls expects us to return an `int` containing the actual number of bytes that were read or written. But this is in fact completely misdesigned, as callers are allowed to pass in a buffer with length `SIZE_MAX`. We thus may be unable to represent the number of bytes written via the return value. Fix this by only ever reading or writing at most `INT_MAX` bytes.
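The clamping idea can be sketched like this. `demo_clamp_len` and `demo_bio_write` are hypothetical names, and the callback type is an assumption made for illustration, not the mbedtls or libgit2 signature.

```c
#include <limits.h>
#include <stddef.h>

/* Clamp a length so that a byte count for it always fits in an int. */
size_t demo_clamp_len(size_t len)
{
	return len > (size_t)INT_MAX ? (size_t)INT_MAX : len;
}

/* Hypothetical underlying writer: returns bytes written, or -1 on error. */
typedef int (*demo_write_cb)(void *ctx, const unsigned char *buf, size_t len);

/* Callback with an int return value: never ask the writer for more bytes
 * than the return type can report. The caller sees a short write and is
 * expected to call again for the remainder. */
int demo_bio_write(demo_write_cb write_cb, void *ctx,
	const unsigned char *buf, size_t len)
{
	return write_cb(ctx, buf, demo_clamp_len(len));
}
```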

75918aba

2019-01-23T14:43:54

mbedtls: make global variables static The mbedtls stream implementation makes use of some global variables which are not marked as `static`, even though they're only used in this compilation unit. Fix this and remove a duplicate declaration.

657197e6

2019-01-23T15:54:05

openssl: fix potential size overflow when writing data Our `openssl_write` function calls `SSL_write` by passing in both `data` and `len` arguments directly. Thing is, our `len` parameter is of type `size_t` and theirs is of type `int`. We thus need to clamp our length to be at most `INT_MAX`.

7613086d

2019-01-23T15:49:28

streams: handle short writes only in generic stream Now that the function `git_stream__write_full` exists and callers of `git_stream_write` have been adjusted, we can lift logic for short writes out of the stream implementations. Instead, this is now handled either by `git_stream__write_full` or by callers of `git_stream_write` directly.

5265b31c

2019-01-23T15:00:20

streams: fix callers potentially only writing partial data Similar to the write(3) function, implementations of `git_stream_write` do not guarantee that all bytes are written. Instead, they return the number of bytes that actually have been written, which may be smaller than the total number of bytes. Furthermore, due to an interface design issue, we cannot ever write more than `SSIZE_MAX` bytes at once, as otherwise we cannot represent the number of bytes written to the caller. Unfortunately, no caller of `git_stream_write` ever checks the return value, except to verify that no error occurred. Due to this, they are susceptible to the case where only partial data has been written. Fix this by introducing a new function `git_stream__write_full`. In contrast to `git_stream_write`, it will always return either success or failure, without returning the number of bytes written. Thus, it is able to write all `SIZE_MAX` bytes and loop around `git_stream_write` until all data has been written. Adjust all callers except the BIO callbacks in our mbedtls and OpenSSL streams, which already do the right thing and require the amount of bytes written.
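The looping behaviour described for `git_stream__write_full` can be sketched as below. This is not the libgit2 implementation: `demo_write_full`, the callback type, and the `demo_sink` used to force short writes are all hypothetical, and treating a zero-byte write as an error is a choice made here to guarantee termination.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h> /* ssize_t (POSIX) */

/* Hypothetical writer: returns bytes written (possibly fewer than len), or -1. */
typedef ssize_t (*demo_write_cb)(void *ctx, const char *data, size_t len);

/* Keep calling the writer until every byte is written. Returns 0 or -1,
 * so the total length is not limited by what ssize_t can represent. */
int demo_write_full(demo_write_cb write_cb, void *ctx, const char *data, size_t len)
{
	size_t total = 0;

	while (total < len) {
		ssize_t written = write_cb(ctx, data + total, len - total);
		if (written <= 0)
			return -1;
		total += (size_t)written;
	}
	return 0;
}

/* Example sink that accepts at most 3 bytes per call. */
struct demo_sink {
	char buf[64];
	size_t used;
};

ssize_t demo_short_write(void *ctx, const char *data, size_t len)
{
	struct demo_sink *sink = ctx;
	size_t n = len < 3 ? len : 3;

	if (sink->used + n > sizeof(sink->buf))
		return -1;
	memcpy(sink->buf + sink->used, data, n);
	sink->used += n;
	return (ssize_t)n;
}
```

A caller that only checks for a negative return from the short writer would silently lose data; going through `demo_write_full` delivers all bytes or reports failure.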

193e7ce9

2019-01-23T15:42:07

streams: make file-local functions static The callback functions that implement the `git_stream` structure are only used inside of their respective implementation files, but they are not marked as `static`. Fix this.

4e3949b7

2019-01-30T02:14:11

tests: test that largefiles can be read through the tree API

fac08837

2019-01-21T11:38:46

filter: return an int Validate that the return value of the read is not greater than INT_MAX, then cast.

89bd4ddb

2019-01-21T11:32:53

diff_generate: validate oid file size Index entries are 32 bit unsigned ints, not `size_t`s.

fd9d4e28

2019-01-21T11:29:16

describe: don't mix and match abbreviated size types The git_describe_format_options.abbreviated_size type is an unsigned int. There's no need for it to be anything else; keep it what it is.

751eb462

2019-01-21T11:20:18

delta: validate sizes and cast safely Quiet down a warning from MSVC about how we're potentially losing data. Validate that our data will fit into the type provided then cast.

4947216f

2019-01-21T11:11:27

git transport: only write INT_MAX bytes The transport code returns an `int` with the number of bytes written; thus only attempt to write at most `INT_MAX`.

a861839d

2019-01-21T10:55:59

windows: add SSIZE_MAX Windows doesn't include ssize_t or its _MAX value by default. We are already declaring ssize_t as SSIZE_T, which is __int64_t on Win64 and long otherwise. Include its _MAX value as a correspondence to its type.

f1986a23

2019-01-21T09:56:23

streams: don't write more than SSIZE_MAX Our streams implementation takes a `size_t` that indicates the length of the data buffer to be written, and returns an `ssize_t` that indicates the length that _was_ written. Clearly no such implementation can write more than `SSIZE_MAX` bytes. Ensure that each TLS stream implementation does not try to write more than `SSIZE_MAX` bytes (or fewer, if the given implementation takes a smaller size).

e5e2fac8

2019-01-21T00:57:39

buffer: explicitly cast Quiet down a warning from MSVC about how we're potentially losing data. This is safe since we've explicitly tested it.

f4ebb2d4

2019-01-21T00:56:35

blame: make hunk_cmp handle unsigned differences
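The commit gives no body, so the following is only a guess at the kind of problem the title refers to: a comparator that subtracts unsigned values and returns the difference as an `int` can wrap around or truncate. A sketch of a comparator that avoids subtraction, using a hypothetical `demo_hunk` struct rather than the real blame hunk type:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical hunk with an unsigned start line. */
struct demo_hunk {
	size_t start_line;
};

/* qsort-compatible comparator: compare explicitly instead of returning
 * (int)(a - b), which is not a reliable sign for unsigned operands. */
int demo_hunk_cmp(const void *a, const void *b)
{
	const struct demo_hunk *ha = a;
	const struct demo_hunk *hb = b;

	if (ha->start_line < hb->start_line)
		return -1;
	if (ha->start_line > hb->start_line)
		return 1;
	return 0;
}

/* Sort hunks by ascending start line. */
void demo_sort_hunks(struct demo_hunk *hunks, size_t count)
{
	qsort(hunks, count, sizeof(*hunks), demo_hunk_cmp);
}
```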

ae681d3f

2019-01-21T00:49:07

apply: make update_hunk accept a size_t

1d4ddb8e

2019-01-20T23:42:08

iterator: cast filesystem iterator entry values explicitly The filesystem iterator takes `stat` data from disk and puts them into index entries, which use 32 bit ints for time (the seconds portion) and filesize. However, on most systems these are not 32 bit, thus will typically invoke a warning. Most users ignore these fields entirely. Diff and checkout code do use the values, however only for the cache to determine if they should check file modification. Thus, this is not a critical error (and will cause a hash recomputation at worst).

c6cac733

2019-01-20T22:40:38

blob: validate that blob sizes fit in a size_t Our blob size is a `git_off_t`, which is a signed 64 bit int. This may be erroneously negative or larger than `SIZE_MAX`. Ensure that the blob size fits into a `size_t` before casting.
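A validate-then-cast helper in the spirit of this commit might look as follows. `demo_size_from_off` is a hypothetical name, and `int64_t` stands in for `git_off_t`, which the message describes as a signed 64 bit int.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert a signed 64-bit size to size_t.
 * Returns 0 and stores the result on success, -1 if the value is negative
 * or does not fit into a size_t (possible where size_t is 32 bits wide). */
int demo_size_from_off(size_t *out, int64_t size)
{
	if (size < 0 || (uint64_t)size > SIZE_MAX)
		return -1;

	*out = (size_t)size;
	return 0;
}
```

On platforms with a 64-bit `size_t` the upper-bound check can never fail, but keeping it makes the cast provably safe everywhere.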

3aa6d96a

2019-01-20T20:38:25

tree: cast filename length in git_tree__parse_raw Quiet down a warning from MSVC about how we're potentially losing data. Ensure that we're within a uint16_t before we do.

759502ed

2019-01-20T20:30:42

odb_loose: explicitly cast to size_t Quiet down a warning from MSVC about how we're potentially losing data. This is safe since we've explicitly tested that it's positive and less than SIZE_MAX.

80c3867b

2019-01-20T19:20:12

patch: explicitly cast down in parse_header_percent Quiet down a warning from MSVC about how we're potentially losing data. This is safe since we've explicitly tested that it's within the range of 0-100.

494448a5

2019-01-20T19:10:08

index: explicitly cast down to a size_t Quiet down a warning from MSVC about how we're potentially losing data. This cast is safe since we've explicitly tested that `strip_len` <= `last_len`.

c3866fa8

2019-01-20T18:54:16

diff: explicitly cast in flush_hunk Quiet down a warning from MSVC about how we're potentially losing data.

826d9a4d

2019-01-25T09:43:20

Merge pull request #4858 from tiennou/fix/index-ext-read index: preserve extension parsing errors

e09f0c10

2019-01-23T10:21:42

deprecation: don't use deprecated stream cb Avoid the deprecated `git_stream_cb` typedef since we want to compile the library without deprecated functions or types. Instead, we can unroll the alias to its actual type.
