kmx git

Commit	Date	Message
63307cba	2019-09-28T17:32:18	Merge pull request #5226 from pks-t/pks/regexp-api regexp: implement a new regular expression API
70325370	2019-09-27T11:16:02	Merge pull request #5106 from tiennou/fix/ref-api-fixes git_refdb API fixes
452b7f8f	2019-09-25T20:29:21	Don't use enum for flags Using an `enum` causes trouble when used with C++ as bitwise operations are not possible w/o casting (e.g., `opts.flags &= ~GIT_BLOB_FILTER_CHECK_FOR_BINARY;` is invalid as there is no `&=` operator for `enum`). Signed-off-by: Sven Strickroth <email@cs-ware.de>
3c1aa232	2019-09-21T16:09:00	Merge pull request #5232 from pks-t/pks/buffer-ensure-size-oom buffer: fix writes into out-of-memory buffers
f585b129	2019-09-12T14:29:28	posix: remove superseded POSIX regex wrappers The old POSIX regex wrappers have been superseded by our own regexp API that provides a higher-level abstraction. Remove the POSIX wrappers in favor of the new one.
7aacf027	2019-09-13T08:55:33	global: convert all users of POSIX regex to use our new regexp API The old POSIX regex API has been superseded by our new regexp API. Convert all users to make use of the new one.
d77378eb	2019-09-13T08:54:26	regexp: implement new regular expression API We currently support a set of different regular expression backends with PCRE, PCRE2, regcomp(3P) and regcomp_l(3). The current implementation of this is done via a simple POSIX wrapper that either directly uses supplied functions or that is a very small wrapper. To support PCRE and PCRE2, we use their provided <pcreposix.h> and <pcre2posix.h> wrappers. These wrappers are implemented in such a way that the accompanying libraries pcre-posix and pcre2-posix provide the same symbols as the libc ones, namely regcomp(3P) et al. This works out on some systems just fine, most importantly on glibc-based ones, where the regular expression functions are implemented as weak aliases and thus get overridden by linking in the pcre{,2}-posix library. On other systems we depend on the linking order of libc and pcre library, and as libc always comes first we will end up with the functions of the libc implementation. As a result, we may use the structures `regex_t` and `regmatch_t` declared by <pcre{,2}posix.h>, but use functions defined by the libc, leading to segfaults. The issue is not easily solvable. Somed distributions like Debian have resolved this by patching PCRE and PCRE2 to carry custom prefixes to all the POSIX function wrappers. But this is not supported by upstream and thus inherently unportable between distributions. We could instead try to modify linking order, but this starts becoming fragile and will not work e.g. when libgit2 is loaded via dlopen(3P) or similar ways. In the end, this means that we simply cannot use the POSIX wrappers provided by the PCRE libraries at all. Thus, this commit introduces a new regular expression API. The new API is on a tad higher level than the previous POSIX abstraction layer, as it tries to abstract away any non-portable flags like e.g. REG_EXTENDED, which has no equivalents in all of our supported backends. As there are no users of POSIX regular expressions that do _not_ reguest REG_EXTENDED this is fine to be abstracted away, though. Due to the API being higher-level than before, it should generally be a tad easier to use than the previous one. Note: ideally, the new API would've been called `git_regex_foobar` with a file "regex.h" and "regex.c". Unfortunately, this is currently impossible to implement due to naming clashes between the then-existing "regex.h" and <regex.h> provided by the libc. As we add the source directory of libgit2 to the header search path, an include of <regex.h> would always find our own "regex.h". Thus, we have to take the bitter pill of adding one more character to all the functions to disambiguate the includes. To improve guarantees around cross-backend compatibility, this commit also brings along an improved regular expression test suite core::regexp.
174b7a32	2019-09-19T12:24:06	buffer: fix printing into out-of-memory buffer Before printing into a `git_buf` structure, we always call `ENSURE_SIZE` first. This macro will reallocate the buffer as-needed depending on whether the current amount of allocated bytes is sufficient or not. If `asize` is big enough, then it will just do nothing, otherwise it will call out to `git_buf_try_grow`. But in fact, it is insufficient to only check `asize`. When we fail to allocate any more bytes e.g. via `git_buf_try_grow`, then we set the buffer's pointer to `git_buf__oom`. Note that we touch neither `asize` nor `size`. So if we just check `asize > targetsize`, then we will happily let the caller of `ENSURE_SIZE` proceed with an out-of-memory buffer. As a result, we will print all bytes into the out-of-memory buffer instead, resulting in an out-of-bounds write. Fix the issue by having `ENSURE_SIZE` verify that the buffer is not marked as OOM. Add a test to verify that we're not writing into the OOM buffer.
208f1d7a	2019-09-19T12:46:37	buffer: fix infinite loop when growing buffers When growing buffers, we repeatedly multiply the currently allocated number of bytes by 1.5 until it exceeds the requested number of bytes. This has two major problems: 1. If the current number of bytes is tiny and one wishes to resize to a comparatively huge number of bytes, then we may need to loop thousands of times. 2. If resizing to a value close to `SIZE_MAX` (which would fail anyway), then we probably hit an infinite loop as multiplying the current amount of bytes will repeatedly result in integer overflows. When reallocating buffers, one typically chooses values close to 1.5 to enable re-use of resulting memory holes in later reallocations. But because of this, it really only makes sense to use a factor of 1.5 _once_, but not looping until we finally are able to fit it. Thus, we can completely avoid the loop and just opt for the much simpler algorithm of multiplying with 1.5 once and, if the result doesn't fit, just use the target size. This avoids both problems of looping extensively and hitting overflows. This commit also adds a test that would've previously resulted in an infinite loop.
3e8a17b0	2019-09-21T15:18:42	buffer: fix memory leak if unable to grow buffer If growing a buffer fails, we set its pointer to the static `git_buf__oom` structure. While we correctly free the old pointer if `git__malloc` returned an error, we do not free it if there was an integer overflow while calculating the new allocation size. Fix this issue by freeing the pointer to plug the memory leak.
49a3289e	2019-09-21T08:25:23	cred: add missing private header in GSSAPI block Should have been part of 8bf0f7eb26c65b2b937b1f40a384b9b269b0b76d
aa407ca3	2019-09-19T13:23:59	Merge pull request #5206 from tiennou/cmake/pkgconfig-building CMake pkg-config modulification
564b3ffc	2019-08-17T12:34:59	cmake: add missing requires to the .pc file
d80d9d56	2019-08-17T12:17:21	cmake: streamline *.pc file handling via a module
8bf0f7eb	2019-09-09T13:00:27	cred: separate public interface from low-level details
5d8a4659	2019-09-13T10:31:49	Merge pull request #5195 from tiennou/fix/commitish-smart-push smart: use push_glob instead of manual filtering
dde6d9c7	2019-09-10T17:09:57	open:move all cleanup code to cleanup label in git_repository_open_ext
b545be3d	2019-09-10T11:14:36	open:fix memory leak when passing NULL to git_repository_open_ext
c3a7892f	2019-09-09T13:10:24	Merge pull request #5209 from mkostyuk/apply-wrong-patch apply: Fix a patch corruption related to EOFNL handling
17d6cd45	2019-09-09T13:06:22	Merge pull request #5210 from buddyspike/master ignore: correct handling of nested rules overriding wild card unignore
4d3392dd	2019-09-09T13:03:42	Merge pull request #5214 from pks-t/pks/diff-iterator-allocation-fixes Memory allocation fixes for diff generator
39028eb6	2019-09-09T13:00:53	Merge pull request #5212 from libgit2/ethomson/creds_for_scheme Use an HTTP scheme that supports the given credentials
8c142241	2019-06-14T08:20:05	refdb: make sure to remove packed refs first This fixes part of the issue where, given a concurrent `git pack-refs`, a ref lookup could return an old, vestigial value from the packed file, as the valid loose one would have been deleted.
171116e7	2019-06-14T06:50:41	refdb: repurpose filesystem prune function
8fd855fd	2019-02-02T19:00:51	refdb: reorder parameters for consistency
9b25cf15	2019-02-02T19:00:49	refdb: fix packed_delete clobbering some errors In the case of a failed lookup, we'd paper over that by writing back the packed-refs successfully.
0a88c83d	2019-02-02T19:00:47	refdb: make low-level deletion helpers explicit
baf411e7	2019-02-02T19:00:45	refdb: ensure all mandatory functions are provided at setup time
c2cf9844	2019-02-02T19:00:43	refdb: check the version of the backend we're about to set
8db9fd3b	2019-02-02T19:00:41	refdb: documentation
a7b4b639	2019-08-24T12:14:31	ignore: correct handling of nested rules overriding wild card unignore problem: filesystem_iterator loads .gitignore files in top-down order. subsequently, ignore module evaluates them in the order they are loaded. this creates a problem if we have unignored a rule (using a wild card) in a sub dir and ignored it again in a level further below (see the test included in this patch). solution: process ignores in reverse order. closes #4963
5fc27aac	2019-08-27T13:38:08	Merge pull request #5208 from mkostyuk/apply-removed-new-file apply: git_apply_to_tree fails to apply patches that add new files
6de48085	2019-08-27T11:29:24	Merge pull request #5189 from libgit2/ethomson/attrs_from_head Optionally read `.gitattributes` from HEAD
aaa48d06	2019-08-27T11:26:50	Merge pull request #5196 from pks-t/pks/config-include-onbranch config: implement "onbranch" conditional
9ca7a60e	2019-08-27T10:36:20	iterator: avoid leaving partially initialized frame on stack When allocating tree iterator entries, we use GIT_ERROR_ALLOC_CHECK` to check whether the allocation has failed. The macro will cause the function to immediately return, though, leaving behind a partially initialized iterator frame. Fix the issue by manually checking for memory allocation errors and using `goto done` in case of an error, popping the iterator frame.
fe241071	2019-08-27T10:36:19	diff_generate: detect memory allocation errors when preparing opts When preparing options for the two iterators that are about to be diffed, we allocate a common prefix for both iterators depending on the options passed by the user. We do not check whether the allocation was successful, though. In fact, this isn't much of a problem, as using a `NULL` prefix is perfectly fine. But in the end, we probably want to detect that the system doesn't have any memory left, as we're unlikely to be able to continue afterwards anyway. While the issue is being fixed in the newly created function `diff_prepare_iterator_opts`, it has been previously existing in the previous macro `DIFF_FROM_ITERATORS` already.
699de9c5	2019-08-27T10:36:17	iterator: remove duplicate memset When allocating new tree iterator frames, we zero out the allocated memory twice. Remove one of the `memset` calls.
8a23597b	2019-08-27T10:36:18	diff_generate: refactor `DIFF_FROM_ITERATORS` macro of doom While the `DIFF_FROM_ITERATORS` does make it shorter to implement the various `git_diff_foo_to_bar` functions, it is a complex and unreadable beast that implicitly assumes certain local variable names. This is not something desirable to have at all and obstructs understanding and more importantly debugging the code by quite a bit. The `DIFF_FROM_ITERATORS` macro basically removed the burden of having to derive the options for both iterators from a pair of iterator flags and the diff options. This patch introduces a new function that does the that exact and refactors all callers to manage the iterators by themselves. As we potentially need to allocate a shared prefix for the iterator, we need to tell the caller to allocate that prefix as soon as the options aren't required anymore. Thus, the function has a `char **prefix` out pointer that will get set to the allocated string and subsequently be free'd by the caller. While this patch increases the line count, I personally deem this to an acceptable tradeoff for increased readbiblity.
4e20c7b1	2019-08-25T22:11:39	Merge pull request #5213 from boardwalk/dskorupski/fix_include_case Fix include casing for case-sensitive filesystems.
44d5e47d	2019-08-24T10:39:56	Fix include casing for case-sensitive filesystems.
4de51f9e	2019-08-23T16:05:28	http: ensure the scheme supports the credentials When a server responds with multiple scheme support - for example, Negotiate and NTLM are commonly used together - we need to ensure that we choose a scheme that supports the credentials.
60319788	2019-08-23T09:58:15	Merge pull request #5054 from tniessen/util-use-64-bit-timer util: use 64 bit timer on Windows
53f51c60	2019-08-21T19:48:05	smart: implement by-date insertion when revwalking
4b91f058	2019-08-21T19:43:06	revwalk: expose more ways of scheduling commits Before we can tweak the revwalk to be more efficent when negotiating, we need to add an "insertion mode" option. Since there's already an implicit set of those, make it visible, at least privately.
8cbef12d	2019-08-08T11:52:54	util: do not perform allocations in insertsort Our hand-rolled fallback sorting function `git__insertsort_r` does an in-place sort of the given array. As elements may not necessarily be pointers, it needs a way of swapping two values of arbitrary size, which is currently implemented by allocating a temporary buffer of the element's size. This is problematic, though, as the emulated `qsort` interface doesn't provide any return values and thus cannot signal an error if allocation of that temporary buffer has failed. Convert the function to swap via a temporary buffer allocated on the stack. Like this, it can `memcpy` contents of both elements in small batches without requiring a heap allocation. The buffer size has been chosen such that in most cases, a single iteration of copying will suffice. Most importantly, it can fully contain `git_oid` structures and pointers. Add a bunch of tests for the `git__qsort_r` interface to verify nothing breaks. Furthermore, this removes the declaration of `git__insertsort_r` and makes it static as it is not used anywhere else.
f3b3e543	2019-08-08T11:34:01	xdiff: catch memory allocation errors The xdiff code contains multiple call sites where the results of `xdl_malloc` are not being checked for memory allocation errors. Add checks to fix possible segfaults due to `NULL` pointer accesses.
c2dd895a	2019-08-08T10:47:29	transports: http: check for memory allocation failures When allocating a chunk that is used to write to HTTP streams, we do not check for memory allocation errors. This may lead us to write to a `NULL` pointer and thus cause a segfault. Fix this by adding a call to `GIT_ERROR_CHECK_ALLOC`.
08699541	2019-08-08T10:46:42	trailer: check for memory allocation errors The "trailer.c" code has been copied mostly verbatim from git.git with minor adjustments, only. As git.git's `xmalloc` function, which aborts on memory allocation errors, has been swapped out for `git_malloc`, which doesn't abort, we may inadvertently access `NULL` pointers. Add checks to fix this.
8c7d9761	2019-08-08T10:45:12	posix: fix direct use of `malloc` In "posix.c" there are multiple callsites which execute `malloc` instead of `git__malloc`. Thus, users of library are not able to track these allocations with a custom allocator. Convert these call sites to use `git__malloc` instead.
a477bff1	2019-08-08T10:44:57	indexer: catch OOM when adding expected OIDs When adding OIDs to the indexer's map of yet-to-be-seen OIDs to verify that packfiles are complete, we do so by first allocating a new OID and then calling `git_oidmap_set` on it. There was no check for memory allocation errors in place, though, leading to possible segfaults due to trying to copy data to a `NULL` pointer. Verify the result of `git__malloc` with `GIT_ERROR_CHECK_ALLOC` to fix the issue.
d4fe402b	2019-08-08T10:36:33	merge: check return value of `git_commit_list_insert` The function `git_commit_list_insert` dynamically allocates memory and may thus fail to insert a given commit, but we didn't check for that in several places in "merge.c". Convert surrounding functions to return error codes and check whether `git_commit_list_insert` was successful, returning an error if not.
c0486188	2019-08-08T10:28:09	blame_git: detect memory allocation errors The code in "blame_git.c" was mostly imported from git.git with only minor changes. One of these changes was to use our own allocators instead of git's `xmalloc`, but there's a subtle difference: `xmalloc` would abort the program if unable to allocate any memory, bit `git__malloc` doesn't. As we didn't check for memory allocation errors in some places, we might inadvertently dereference a `NULL` pointer in out-of-memory situations. Convert multiple functions to return proper error codes and add calls to `GIT_ERROR_CHECK_ALLOC` to fix this.
1c847169	2019-08-21T16:38:59	http: allow dummy negotiation scheme to fail to act The dummy negotiation scheme is used for known authentication strategies that do not wish to act. For example, when a server requests the "Negotiate" scheme but libgit2 is not built with Negotiate support, and will use the "dummy" strategy which will simply not act. Instead of setting `out` to NULL and returning a successful code, return `GIT_PASSTHROUGH` to indicate that it did not act and catch that error code.
39d18fe6	2019-07-31T08:37:10	smart: use push_glob instead of manual filtering The code worked under the assumption that anything under `refs/tags` are tag objects, and all the rest would be peelable to a commit. As it is completely valid to have tags to blobs under a non `refs/tags` ref, this would cause failures when trying to peel a tag to a commit. Fix the broken filtering by switching to `git_revwalk_push_glob`, which already handles this case.
de4bc2bd	2019-08-20T03:29:45	apply: git_apply_to_tree fails to apply patches that add new files git_apply_to_tree() cannot be used apply patches with new files. An attempt to apply such a patch fails because git_apply_to_tree() tries to remove a non-existing file from an old index. The solution is to modify git_apply_to_tree() to git_index_remove() when the patch states that the modified files is removed.
630127e3	2019-08-20T03:08:32	apply: Fix a patch corruption related to EOFNL handling Use of apply's API can lead to an improper patch application and a corruption of the modified file. The issue is caused by mishandling of the end of file changes if there are several hunks to apply. The new line character is added to a line from a wrong hunk. The solution is to modify apply_hunk() to add the newline character at the end of a line from a right hunk.
071750a3	2019-08-15T14:18:26	cmake: move _WIN32_WINNT definitions to root
0f40e68e	2019-08-14T09:05:07	Merge pull request #5187 from ianhattendorf/fix/clone-whitespace clone: don't decode URL percent encodings
57a9ccd5	2019-06-21T15:53:54	commit_list: fix possible buffer overflow in `commit_quick_parse` The function `commit_quick_parse` provides a way to quickly parse parts of a commit without storing or verifying most of its metadata. The first thing it does is calculating the number of parents by skipping "parent " lines until it finds the first non-parent line. Afterwards, this parent count is passed to `alloc_parents`, which will allocate an array to store all the parent. To calculate the amount of storage required for the parents array, `alloc_parents` simply multiplicates the number of parents with the respective elements's size. This already screams "buffer overflow", and in fact this problem is getting worse by the result being cast to an `uint32_t`. In fact, triggering this is possible: git-hash-object(1) will happily write a commit with multiple millions of parents for you. I've stopped at 67,108,864 parents as git-hash-object(1) unfortunately soaks up the complete object without streaming anything to disk and thus will cause an OOM situation at a later point. The point here is: this commit was about 4.1GB of size but compressed down to 24MB and thus easy to distribute. The above doesn't yet trigger the buffer overflow, thus. As the array's elements are all pointers which are 8 bytes on 64 bit, we need a total of 536,870,912 parents to trigger the overflow to `0`. The effect is that we're now underallocating the array and do an out-of-bound writes. As the buffer is kindly provided by the adversary, this may easily result in code execution. Extrapolating from the test file with 67m commits to the one with 536m commits results in a factor of 8. Thus the uncompressed contents would be about 32GB in size and the compressed ones 192MB. While still easily distributable via the network, only servers will have that amount of RAM and not cause an out-of-memory condition previous to triggering the overflow. This at least makes this attack not an easy vector for client-side use of libgit2.
cb1439c9	2019-06-19T12:59:27	config: validate ownership of C:\ProgramData\Git\config before using it When the VirtualStore feature is in effect, it is safe to let random users write into C:\ProgramData because other users won't see those files. This seemed to be the case when we introduced support for C:\ProgramData\Git\config. However, when that feature is not in effect (which seems to be the case in newer Windows 10 versions), we'd rather not use those files unless they come from a trusted source, such as an administrator. This change imitates the strategy chosen by PowerShell's native OpenSSH port to Windows regarding host key files: if a system file is owned neither by an administrator, a system account, or the current user, it is ignored.
5774b2b1	2019-08-11T23:42:45	Merge pull request #5113 from pks-t/pks/stash-perf stash: avoid recomputing tree when committing worktree
42bacbc6	2019-08-11T21:06:19	Merge pull request #5121 from pks-t/pks/variadic-errors Variadic macros
fba3bf79	2019-07-21T14:15:12	blob: optionally read attributes from repository When `GIT_BLOB_FILTER_ATTTRIBUTES_FROM_HEAD` is passed to `git_blob_filter`, read attributes from `gitattributes` files that are checked in to the repository at the HEAD revision. This passes the flag `GIT_FILTER_ATTRIBUTES_FROM_HEAD` to the filter functions.
f0f27c1c	2019-07-21T14:13:25	filter: optionally read attributes from repository When `GIT_FILTER_ATTRIBUTES_FROM_HEAD` is specified, configure the filter to read filter attributes from `gitattributes` files that are checked in to the repository at the HEAD revision. This passes the flag `GIT_ATTR_CHECK_INCLUDE_HEAD` to the attribute reading functions.
4fd5748c	2019-07-21T14:11:03	attr: optionally read attributes from repository When `GIT_ATTR_CHECK_INCLUDE_HEAD` is specified, read `gitattribute` files that are checked into the repository at the HEAD revision.
a5392eae	2019-07-21T12:13:07	blob: allow blob filtering to ignore system gitattributes Introduce `GIT_BLOB_FILTER_NO_SYSTEM_ATTRIBUTES`, which tells `git_blob_filter` to ignore the system-wide attributes file, usually `/etc/gitattributes`. This simply passes the appropriate flag to the attribute loading code.
22eb12af	2019-07-21T12:12:05	filter: add GIT_FILTER_NO_SYSTEM_ATTRIBUTES option Allow system-wide attributes (the ones specified in `/etc/gitattributes`) to be ignored if the flag `GIT_FILTER_NO_SYSTEM_ATTRIBUTES` is specified.
fa1a4c77	2019-07-21T11:03:01	blob: deprecate `git_blob_filtered_content` Users should now use `git_blob_filter`.
a32ab076	2019-07-21T10:56:42	blob: introduce git_blob_filter Provide a function to filter blobs that allows for more functionality than the existing `git_blob_filtered_content` function.
b0692d6b	2019-08-09T09:01:56	Merge pull request #4913 from implausible/feature/signing-rebase-commits Add sign capability to git_rebase_commit
998f9c15	2019-08-07T07:21:27	fixup: strange indentation
f627ba6c	2019-08-02T13:18:07	Merge pull request #5197 from pks-t/pks/remote-ifdeffed-block remote: remove unused block of code
e23c0b18	2019-08-02T07:52:58	remote: remove unused block of code In "remote.c", we have a chunk of code that is #ifdef'fed out via `#if 0` with a comment that we could export it as a helper function. The code was implemented in 2013 and ifdef'fed in 2014, which shows that there's clearly no interest in having such a helper at all. As this block has recently created some confusion about `p_getenv` due to it containing the only reference to that function in our codebase, let's remove this block altogether.
d588de7c	2019-08-02T07:51:02	Merge pull request #5191 from eaigner/master config: check if we are running in a sandboxed environment
952fbbfb	2019-08-01T20:04:11	config: check if we are running in a sandboxed environment On macOS the $HOME environment variable returns the path to the sandbox container instead of the actual user $HOME for sandboxed apps. To get the correct path, we have to get it from the password file entry.
722ba93f	2019-08-01T15:14:06	config: implement "onbranch" conditional With Git v2.23.0, the conditional include mechanism gained another new conditional "onbranch". As the name says, it will cause a file to be included if the "onbranch" pattern matches the currently checked out branch. Implement this new condition and add a bunch of tests.
1721ab04	2019-06-16T11:25:47	unix: posix: avoid use of variadic macro `p_snprintf` The macro `p_snprintf` is implemented as a variadic macro that calls `snprintf` directly with `__VA_ARGS__`. In C89, variadic macros are not allowed, but as the arguments of `p_snprintf` and `snprintf` are matching 1:1, we can fix this by simply removing the parameter list from `p_snprintf`.
63d8cd18	2019-06-16T11:17:17	apply: remove use of variadic error macro The macro `apply_err` is implemented as a variadic macro, which are not defined by C89. Convert it to a variadic function, instead.
27b8b31e	2019-08-01T11:57:03	parse: remove use of variadic macros which are not C89 compliant The macro `git_parse_error` is implemented in a variadic way so that it's possible to pass printf-style parameters. Unfortunately, variadic macros are not defined by C89 and thus we cannot use that functionality. But as we have implemented `git_error_vset` in the previous commit, we can now just use that instead. Convert `git_parse_error` to a variadic function and use `git_error_vset` to fix the compliance violation. While at it, move the function to "patch_parse.c".
c8e63812	2019-06-16T11:03:08	errors: introduce `git_error_vset` function Right now, we only provide a `git_error_set` that has a variadic function signature. It's impossible to drive this function in a C89-compliant way from other functions that have a variadic signature, though, like for example `git_parse_error`. Implement a new `git_error_vset` function that gets a `va_list` as parameter, fixing the above problem.
e8f63411	2019-08-01T11:29:58	Merge pull request #5186 from pks-t/pks/config-snapshot-separation config: separate file and snapshot backends
fb0730f1	2019-04-16T23:49:16	util: use 64 bit timer on Windows git__timer was originally implemented using a 32 bit timer since Windows XP did not support GetTickCount64. Windows XP was discontinued five years ago, so it should be safe to use the new API. As a benefit, we do not need to care about overflows for the next 585 million years.
c8e249b0	2019-07-29T10:51:22	object: deprecate git_object__size for removal In #5118 we remove the double-underscore to make it a normally-named public function. However, this is not an interesting function outside of the library and it takes up a name for something that could be more useful. Remove the single-underscore version as we have not done any releases with it.
37ebe9ad	2019-07-24T18:49:08	config_backend: rename internal structures The internal backend structures are kind-of legacy and do not really speak for themselves. Rename them accordingly to make them easier to understand.
2bff84ba	2019-07-26T21:02:56	config_file: separate out read-only backend To further distinguish the file writeable and readonly backends, separate the readonly backend into its own "config_snapshot.c" implementation. The snapshot backend can be generically used to snapshot any type of backend.
f0b10066	2019-07-24T18:37:14	config_file: fix cast of readonly backend In `backend_readonly_free`, the passed in config backend is being cast to a `diskfile_backend` instead of to a `diskfile_readonly_backend`. While this works out just fine because we only access its header values, which were shared between both backends, it is undefined behaviour. Use the correct type to fix this.
a3159df8	2019-07-24T18:31:43	config_file: remove shared `diskfile_header` struct The `diskfile_header` structure is shared between both `diskfile_backend` and `diskfile_readonly_backend`. The separation and resulting casting is confusing at times and a source for programming errors. Remove the shared structure and inline them directly.
271e5fba	2019-07-24T18:18:18	config_file: duplicate accessors for readonly backend While most functions of the readonly configuration backend are implemented separately from the writeable configuration backend, the two functions `config_iterator_new` and `config_get` are shared between both. This sharing makes it necessary to have some shared data structures, which is the `diskfile_header` structure. Unfortunately, this makes the backends harder to grasp than necessary due to all the casting between structs and also quite error prone. Reimplement those functions for the readonly backends. As readonly backends cannot be refreshed anyway, we can remove the calls to `config_refresh` in there.
4e7ce1fb	2019-07-24T18:13:52	config_file: reimplement `config_readonly_open` generically The `config_readonly_open` function currently receives as input a diskfile backend and will copy its entries to a new snapshot. This is rather intimate, as we need to assume that the source config backend is in fact a diskfile entry. We can do better than this though by using generic methods to copy contents of the provided backend, e.g. by using a config iterator. This also allows us to decouple the read-only backend from the read-write backend.
76182e84	2019-07-24T18:04:38	config_entries: fix possible segfault when duplicating entries When duplicating a configuration entry, we allocate a new entry but do not verify that we get a valid pointer back. As we're dereferencing the pointer afterwards, we might thus run into a segfault in out-of-memory situations. Extract a new function `git_config_entries_dup_entry` that handles the complete entry duplication. Fix the error by using `GIT_ERROR_CHECK_ALLOC`.
ba2885da	2019-07-24T18:05:28	git_net_url_parse: don't git_buf_decode_percent for path
2766b92d	2019-07-21T15:10:34	config_file: refresh when creating an iterator When creating a new iterator for a config file backend, then we should always make sure that we're up to date by calling `config_refresh`. Otherwise, we might not notice when another process has modified the configuration file and thus will represent outdated values. Add two tests to config::stress that verify that we get up-to-date values when reading configuration entries via `git_config_iterator`.
9fac8b78	2019-07-21T15:08:22	config_file: do not refresh read-only backends If calling `config_refresh` on a read-only configuration file backend, then we will segfault when comparing the timestamp of the file due to `path` being uninitialized. As a read-only snapshot should not be refreshed anyway and stay consistent, we can simply return early when calling `config_refresh` on a read-only snapshot.
28d11b59	2019-07-21T14:41:21	config_file: consistently use `GIT_CONTAINER_OF`
6be5ac23	2019-07-11T15:30:51	checkout: postpone creation of symlinks to the end On most platforms it's fine to create symlinks to nonexisting files. Not so on Windows, where the type of a symlink (file or directory) needs to be set at creation time. So depending on whether the target file exists or not, we may end up with different symlink types. This creates a problem when performing checkouts, where we simply iterate over all blobs that need to be updated without treating symlinks any special. If the target file of the symlink is going to be checked out after the symlink itself, then the symlink will be created as directory symlink and not as file symlink. Fix the issue by iterating over blobs twice: once to perform postponed deletions and updates to non-symlink blobs, and once to perform updates to symlink blobs.
50194dcd	2019-07-11T15:14:42	win32: fix symlinks to relative file targets When creating a symlink in Windows, one needs to tell Windows whether the symlink should be a file or directory symlink. To determine which flag to pass, we call `GetFileAttributesW` on the target file to see whether it is a directory and then pass the flag accordingly. The problem though is if create a symlink with a relative target path, then we will check that relative path while not necessarily being inside of the working directory where the symlink is to be created. Thus, getting its attributes will either fail or return attributes of the wrong target. Fix this by resolving the target path relative to the directory in which the symlink is to be created.
a00842c4	2019-06-29T09:59:14	win32: correctly unlink symlinks to directories When deleting a symlink on Windows, then the way to delete it depends on whether it is a directory symlink or a file symlink. In the first case, we need to use `DeleteFile`, in the second `RemoveDirectory`. Right now, `p_unlink` will only ever try to use `DeleteFile`, though, and thus fail to remove directory symlinks. This mismatches how unlink(3P) is expected to behave, though, as it shall remove any symlink disregarding whether it is a file or directory symlink. In order to correctly unlink a symlink, we thus need to check what kind of file this is. If we were to first query file attributes of every file upon calling `p_unlink`, then this would penalize the common case though. Instead, we can try to first delete the file with `DeleteFile` and only if the error returned is `ERROR_ACCESS_DENIED` will we query file attributes and determine whether it is a directory symlink to use `RemoveDirectory` instead.
ded77bb1	2019-06-29T09:58:34	path: extract function to check whether a path supports symlinks When initializing a repository, we need to check whether its working directory supports symlinks to correctly set the initial value of the "core.symlinks" config variable. The code to check the filesystem is reusable in other parts of our codebase, like for example in our tests to determine whether certain tests can be expected to succeed or not. Extract the code into a new function `git_path_supports_symlinks` to avoid duplicate implementations. Remove a duplicate implementation in the repo test helper code.
e54343a4	2019-06-29T09:17:32	fileops: rename to "futils.h" to match function signatures Our file utils functions all have a "futils" prefix, e.g. `git_futils_touch`. One would thus naturally guess that their definitions and implementation would live in files "futils.h" and "futils.c", respectively, but in fact they live in "fileops.h". Rename the files to match expectations.
a7d32d60	2019-07-20T18:46:32	stash: avoid recomputing tree when committing worktree When creating a new stash, we need to create there separate commits storing differences stored in the index, untracked changes as well as differences in the working directory. The first two will only be done conditionally if the equivalent options "git stash --keep-index --include-untracked" are being passed to `git_stash_save`, but even when only creating a stash of worktree changes we're much slower than git.git. Using our new stash example: $ time git stash Saved working directory and index state WIP on (no branch): 2f7d9d47575e Linux 5.1.7 real 0m0.528s user 0m0.309s sys 0m0.381s $ time lg2 stash real 0m27.165s user 0m13.645s sys 0m6.403s As can be seen, libgit2 is more than 50x slower than git.git! When creating the stash commit that includes all worktree changes, we create a completely new index to prepare for the new commit and populate it with the entries contained in the index' tree. Here comes the catch: by populating the index with a tree's contents, we do not have any stat caches in the index. This means that we have to re-validate every single file from the worktree and see whether it has changed. The issue can be fixed by populating the new index with the repo's existing index instead of with the tree. This retains all stat cache information, and thus we really only need to check files that have changed stat information. This is semantically equivalent to what we previously did: previously, we used the tree of the commit computed from the index. Now we're just using the index directly. And, in fact, the cache is doing wonders: time lg2 stash real 0m1.836s user 0m1.166s sys 0m0.663s We're now performing 15x faster than before and are only 3x slower than git.git now.

63307cba

2019-09-28T17:32:18

Merge pull request #5226 from pks-t/pks/regexp-api regexp: implement a new regular expression API

70325370

2019-09-27T11:16:02

Merge pull request #5106 from tiennou/fix/ref-api-fixes git_refdb API fixes

452b7f8f

2019-09-25T20:29:21

Don't use enum for flags Using an `enum` causes trouble when used with C++ as bitwise operations are not possible w/o casting (e.g., `opts.flags &= ~GIT_BLOB_FILTER_CHECK_FOR_BINARY;` is invalid as there is no `&=` operator for `enum`). Signed-off-by: Sven Strickroth <email@cs-ware.de>

3c1aa232

2019-09-21T16:09:00

Merge pull request #5232 from pks-t/pks/buffer-ensure-size-oom buffer: fix writes into out-of-memory buffers

f585b129

2019-09-12T14:29:28

posix: remove superseded POSIX regex wrappers The old POSIX regex wrappers have been superseded by our own regexp API that provides a higher-level abstraction. Remove the POSIX wrappers in favor of the new one.

7aacf027

2019-09-13T08:55:33

global: convert all users of POSIX regex to use our new regexp API The old POSIX regex API has been superseded by our new regexp API. Convert all users to make use of the new one.

d77378eb

2019-09-13T08:54:26

regexp: implement new regular expression API We currently support a set of different regular expression backends with PCRE, PCRE2, regcomp(3P) and regcomp_l(3). The current implementation of this is done via a simple POSIX wrapper that either directly uses supplied functions or that is a very small wrapper. To support PCRE and PCRE2, we use their provided <pcreposix.h> and <pcre2posix.h> wrappers. These wrappers are implemented in such a way that the accompanying libraries pcre-posix and pcre2-posix provide the same symbols as the libc ones, namely regcomp(3P) et al. This works out on some systems just fine, most importantly on glibc-based ones, where the regular expression functions are implemented as weak aliases and thus get overridden by linking in the pcre{,2}-posix library. On other systems we depend on the linking order of libc and pcre library, and as libc always comes first we will end up with the functions of the libc implementation. As a result, we may use the structures `regex_t` and `regmatch_t` declared by <pcre{,2}posix.h>, but use functions defined by the libc, leading to segfaults. The issue is not easily solvable. Somed distributions like Debian have resolved this by patching PCRE and PCRE2 to carry custom prefixes to all the POSIX function wrappers. But this is not supported by upstream and thus inherently unportable between distributions. We could instead try to modify linking order, but this starts becoming fragile and will not work e.g. when libgit2 is loaded via dlopen(3P) or similar ways. In the end, this means that we simply cannot use the POSIX wrappers provided by the PCRE libraries at all. Thus, this commit introduces a new regular expression API. The new API is on a tad higher level than the previous POSIX abstraction layer, as it tries to abstract away any non-portable flags like e.g. REG_EXTENDED, which has no equivalents in all of our supported backends. As there are no users of POSIX regular expressions that do _not_ reguest REG_EXTENDED this is fine to be abstracted away, though. Due to the API being higher-level than before, it should generally be a tad easier to use than the previous one. Note: ideally, the new API would've been called `git_regex_foobar` with a file "regex.h" and "regex.c". Unfortunately, this is currently impossible to implement due to naming clashes between the then-existing "regex.h" and <regex.h> provided by the libc. As we add the source directory of libgit2 to the header search path, an include of <regex.h> would always find our own "regex.h". Thus, we have to take the bitter pill of adding one more character to all the functions to disambiguate the includes. To improve guarantees around cross-backend compatibility, this commit also brings along an improved regular expression test suite core::regexp.

174b7a32

2019-09-19T12:24:06

buffer: fix printing into out-of-memory buffer Before printing into a `git_buf` structure, we always call `ENSURE_SIZE` first. This macro will reallocate the buffer as-needed depending on whether the current amount of allocated bytes is sufficient or not. If `asize` is big enough, then it will just do nothing, otherwise it will call out to `git_buf_try_grow`. But in fact, it is insufficient to only check `asize`. When we fail to allocate any more bytes e.g. via `git_buf_try_grow`, then we set the buffer's pointer to `git_buf__oom`. Note that we touch neither `asize` nor `size`. So if we just check `asize > targetsize`, then we will happily let the caller of `ENSURE_SIZE` proceed with an out-of-memory buffer. As a result, we will print all bytes into the out-of-memory buffer instead, resulting in an out-of-bounds write. Fix the issue by having `ENSURE_SIZE` verify that the buffer is not marked as OOM. Add a test to verify that we're not writing into the OOM buffer.

208f1d7a

2019-09-19T12:46:37

buffer: fix infinite loop when growing buffers When growing buffers, we repeatedly multiply the currently allocated number of bytes by 1.5 until it exceeds the requested number of bytes. This has two major problems: 1. If the current number of bytes is tiny and one wishes to resize to a comparatively huge number of bytes, then we may need to loop thousands of times. 2. If resizing to a value close to `SIZE_MAX` (which would fail anyway), then we probably hit an infinite loop as multiplying the current amount of bytes will repeatedly result in integer overflows. When reallocating buffers, one typically chooses values close to 1.5 to enable re-use of resulting memory holes in later reallocations. But because of this, it really only makes sense to use a factor of 1.5 _once_, but not looping until we finally are able to fit it. Thus, we can completely avoid the loop and just opt for the much simpler algorithm of multiplying with 1.5 once and, if the result doesn't fit, just use the target size. This avoids both problems of looping extensively and hitting overflows. This commit also adds a test that would've previously resulted in an infinite loop.

3e8a17b0

2019-09-21T15:18:42

buffer: fix memory leak if unable to grow buffer If growing a buffer fails, we set its pointer to the static `git_buf__oom` structure. While we correctly free the old pointer if `git__malloc` returned an error, we do not free it if there was an integer overflow while calculating the new allocation size. Fix this issue by freeing the pointer to plug the memory leak.

49a3289e

2019-09-21T08:25:23

cred: add missing private header in GSSAPI block Should have been part of 8bf0f7eb26c65b2b937b1f40a384b9b269b0b76d

aa407ca3

2019-09-19T13:23:59

Merge pull request #5206 from tiennou/cmake/pkgconfig-building CMake pkg-config modulification

564b3ffc

2019-08-17T12:34:59

cmake: add missing requires to the .pc file

d80d9d56

2019-08-17T12:17:21

cmake: streamline *.pc file handling via a module

8bf0f7eb

2019-09-09T13:00:27

cred: separate public interface from low-level details

5d8a4659

2019-09-13T10:31:49

Merge pull request #5195 from tiennou/fix/commitish-smart-push smart: use push_glob instead of manual filtering

dde6d9c7

2019-09-10T17:09:57

open:move all cleanup code to cleanup label in git_repository_open_ext

b545be3d

2019-09-10T11:14:36

open:fix memory leak when passing NULL to git_repository_open_ext

c3a7892f

2019-09-09T13:10:24

Merge pull request #5209 from mkostyuk/apply-wrong-patch apply: Fix a patch corruption related to EOFNL handling

17d6cd45

2019-09-09T13:06:22

Merge pull request #5210 from buddyspike/master ignore: correct handling of nested rules overriding wild card unignore

4d3392dd

2019-09-09T13:03:42

Merge pull request #5214 from pks-t/pks/diff-iterator-allocation-fixes Memory allocation fixes for diff generator

39028eb6

2019-09-09T13:00:53

Merge pull request #5212 from libgit2/ethomson/creds_for_scheme Use an HTTP scheme that supports the given credentials

8c142241

2019-06-14T08:20:05

refdb: make sure to remove packed refs first This fixes part of the issue where, given a concurrent `git pack-refs`, a ref lookup could return an old, vestigial value from the packed file, as the valid loose one would have been deleted.

171116e7

2019-06-14T06:50:41

refdb: repurpose filesystem prune function

8fd855fd

2019-02-02T19:00:51

refdb: reorder parameters for consistency

9b25cf15

2019-02-02T19:00:49

refdb: fix packed_delete clobbering some errors In the case of a failed lookup, we'd paper over that by writing back the packed-refs successfully.

0a88c83d

2019-02-02T19:00:47

refdb: make low-level deletion helpers explicit

baf411e7

2019-02-02T19:00:45

refdb: ensure all mandatory functions are provided at setup time

c2cf9844

2019-02-02T19:00:43

refdb: check the version of the backend we're about to set

8db9fd3b

2019-02-02T19:00:41

refdb: documentation

a7b4b639

2019-08-24T12:14:31

ignore: correct handling of nested rules overriding wild card unignore problem: filesystem_iterator loads .gitignore files in top-down order. subsequently, ignore module evaluates them in the order they are loaded. this creates a problem if we have unignored a rule (using a wild card) in a sub dir and ignored it again in a level further below (see the test included in this patch). solution: process ignores in reverse order. closes #4963

5fc27aac

2019-08-27T13:38:08

Merge pull request #5208 from mkostyuk/apply-removed-new-file apply: git_apply_to_tree fails to apply patches that add new files

6de48085

2019-08-27T11:29:24

Merge pull request #5189 from libgit2/ethomson/attrs_from_head Optionally read `.gitattributes` from HEAD

aaa48d06

2019-08-27T11:26:50

Merge pull request #5196 from pks-t/pks/config-include-onbranch config: implement "onbranch" conditional

9ca7a60e

2019-08-27T10:36:20

iterator: avoid leaving partially initialized frame on stack When allocating tree iterator entries, we use GIT_ERROR_ALLOC_CHECK` to check whether the allocation has failed. The macro will cause the function to immediately return, though, leaving behind a partially initialized iterator frame. Fix the issue by manually checking for memory allocation errors and using `goto done` in case of an error, popping the iterator frame.

fe241071

2019-08-27T10:36:19

diff_generate: detect memory allocation errors when preparing opts When preparing options for the two iterators that are about to be diffed, we allocate a common prefix for both iterators depending on the options passed by the user. We do not check whether the allocation was successful, though. In fact, this isn't much of a problem, as using a `NULL` prefix is perfectly fine. But in the end, we probably want to detect that the system doesn't have any memory left, as we're unlikely to be able to continue afterwards anyway. While the issue is being fixed in the newly created function `diff_prepare_iterator_opts`, it has been previously existing in the previous macro `DIFF_FROM_ITERATORS` already.

699de9c5

2019-08-27T10:36:17

iterator: remove duplicate memset When allocating new tree iterator frames, we zero out the allocated memory twice. Remove one of the `memset` calls.

8a23597b

2019-08-27T10:36:18

diff_generate: refactor `DIFF_FROM_ITERATORS` macro of doom While the `DIFF_FROM_ITERATORS` does make it shorter to implement the various `git_diff_foo_to_bar` functions, it is a complex and unreadable beast that implicitly assumes certain local variable names. This is not something desirable to have at all and obstructs understanding and more importantly debugging the code by quite a bit. The `DIFF_FROM_ITERATORS` macro basically removed the burden of having to derive the options for both iterators from a pair of iterator flags and the diff options. This patch introduces a new function that does the that exact and refactors all callers to manage the iterators by themselves. As we potentially need to allocate a shared prefix for the iterator, we need to tell the caller to allocate that prefix as soon as the options aren't required anymore. Thus, the function has a `char **prefix` out pointer that will get set to the allocated string and subsequently be free'd by the caller. While this patch increases the line count, I personally deem this to an acceptable tradeoff for increased readbiblity.

4e20c7b1

2019-08-25T22:11:39

Merge pull request #5213 from boardwalk/dskorupski/fix_include_case Fix include casing for case-sensitive filesystems.

44d5e47d

2019-08-24T10:39:56

Fix include casing for case-sensitive filesystems.

4de51f9e

2019-08-23T16:05:28

http: ensure the scheme supports the credentials When a server responds with multiple scheme support - for example, Negotiate and NTLM are commonly used together - we need to ensure that we choose a scheme that supports the credentials.

60319788

2019-08-23T09:58:15

Merge pull request #5054 from tniessen/util-use-64-bit-timer util: use 64 bit timer on Windows

53f51c60

2019-08-21T19:48:05

smart: implement by-date insertion when revwalking

4b91f058

2019-08-21T19:43:06

revwalk: expose more ways of scheduling commits Before we can tweak the revwalk to be more efficent when negotiating, we need to add an "insertion mode" option. Since there's already an implicit set of those, make it visible, at least privately.

8cbef12d

2019-08-08T11:52:54

util: do not perform allocations in insertsort Our hand-rolled fallback sorting function `git__insertsort_r` does an in-place sort of the given array. As elements may not necessarily be pointers, it needs a way of swapping two values of arbitrary size, which is currently implemented by allocating a temporary buffer of the element's size. This is problematic, though, as the emulated `qsort` interface doesn't provide any return values and thus cannot signal an error if allocation of that temporary buffer has failed. Convert the function to swap via a temporary buffer allocated on the stack. Like this, it can `memcpy` contents of both elements in small batches without requiring a heap allocation. The buffer size has been chosen such that in most cases, a single iteration of copying will suffice. Most importantly, it can fully contain `git_oid` structures and pointers. Add a bunch of tests for the `git__qsort_r` interface to verify nothing breaks. Furthermore, this removes the declaration of `git__insertsort_r` and makes it static as it is not used anywhere else.

f3b3e543

2019-08-08T11:34:01

xdiff: catch memory allocation errors The xdiff code contains multiple call sites where the results of `xdl_malloc` are not being checked for memory allocation errors. Add checks to fix possible segfaults due to `NULL` pointer accesses.

c2dd895a

2019-08-08T10:47:29

transports: http: check for memory allocation failures When allocating a chunk that is used to write to HTTP streams, we do not check for memory allocation errors. This may lead us to write to a `NULL` pointer and thus cause a segfault. Fix this by adding a call to `GIT_ERROR_CHECK_ALLOC`.

08699541

2019-08-08T10:46:42

trailer: check for memory allocation errors The "trailer.c" code has been copied mostly verbatim from git.git with minor adjustments, only. As git.git's `xmalloc` function, which aborts on memory allocation errors, has been swapped out for `git_malloc`, which doesn't abort, we may inadvertently access `NULL` pointers. Add checks to fix this.

8c7d9761

2019-08-08T10:45:12

posix: fix direct use of `malloc` In "posix.c" there are multiple callsites which execute `malloc` instead of `git__malloc`. Thus, users of library are not able to track these allocations with a custom allocator. Convert these call sites to use `git__malloc` instead.

a477bff1

2019-08-08T10:44:57

indexer: catch OOM when adding expected OIDs When adding OIDs to the indexer's map of yet-to-be-seen OIDs to verify that packfiles are complete, we do so by first allocating a new OID and then calling `git_oidmap_set` on it. There was no check for memory allocation errors in place, though, leading to possible segfaults due to trying to copy data to a `NULL` pointer. Verify the result of `git__malloc` with `GIT_ERROR_CHECK_ALLOC` to fix the issue.

d4fe402b

2019-08-08T10:36:33

merge: check return value of `git_commit_list_insert` The function `git_commit_list_insert` dynamically allocates memory and may thus fail to insert a given commit, but we didn't check for that in several places in "merge.c". Convert surrounding functions to return error codes and check whether `git_commit_list_insert` was successful, returning an error if not.

c0486188

2019-08-08T10:28:09

blame_git: detect memory allocation errors The code in "blame_git.c" was mostly imported from git.git with only minor changes. One of these changes was to use our own allocators instead of git's `xmalloc`, but there's a subtle difference: `xmalloc` would abort the program if unable to allocate any memory, bit `git__malloc` doesn't. As we didn't check for memory allocation errors in some places, we might inadvertently dereference a `NULL` pointer in out-of-memory situations. Convert multiple functions to return proper error codes and add calls to `GIT_ERROR_CHECK_ALLOC` to fix this.

1c847169

2019-08-21T16:38:59

http: allow dummy negotiation scheme to fail to act The dummy negotiation scheme is used for known authentication strategies that do not wish to act. For example, when a server requests the "Negotiate" scheme but libgit2 is not built with Negotiate support, and will use the "dummy" strategy which will simply not act. Instead of setting `out` to NULL and returning a successful code, return `GIT_PASSTHROUGH` to indicate that it did not act and catch that error code.

39d18fe6

2019-07-31T08:37:10

smart: use push_glob instead of manual filtering The code worked under the assumption that anything under `refs/tags` are tag objects, and all the rest would be peelable to a commit. As it is completely valid to have tags to blobs under a non `refs/tags` ref, this would cause failures when trying to peel a tag to a commit. Fix the broken filtering by switching to `git_revwalk_push_glob`, which already handles this case.

de4bc2bd

2019-08-20T03:29:45

apply: git_apply_to_tree fails to apply patches that add new files git_apply_to_tree() cannot be used apply patches with new files. An attempt to apply such a patch fails because git_apply_to_tree() tries to remove a non-existing file from an old index. The solution is to modify git_apply_to_tree() to git_index_remove() when the patch states that the modified files is removed.

630127e3

2019-08-20T03:08:32

apply: Fix a patch corruption related to EOFNL handling Use of apply's API can lead to an improper patch application and a corruption of the modified file. The issue is caused by mishandling of the end of file changes if there are several hunks to apply. The new line character is added to a line from a wrong hunk. The solution is to modify apply_hunk() to add the newline character at the end of a line from a right hunk.

071750a3

2019-08-15T14:18:26

cmake: move _WIN32_WINNT definitions to root

0f40e68e

2019-08-14T09:05:07

Merge pull request #5187 from ianhattendorf/fix/clone-whitespace clone: don't decode URL percent encodings

57a9ccd5

2019-06-21T15:53:54

commit_list: fix possible buffer overflow in `commit_quick_parse` The function `commit_quick_parse` provides a way to quickly parse parts of a commit without storing or verifying most of its metadata. The first thing it does is calculating the number of parents by skipping "parent " lines until it finds the first non-parent line. Afterwards, this parent count is passed to `alloc_parents`, which will allocate an array to store all the parent. To calculate the amount of storage required for the parents array, `alloc_parents` simply multiplicates the number of parents with the respective elements's size. This already screams "buffer overflow", and in fact this problem is getting worse by the result being cast to an `uint32_t`. In fact, triggering this is possible: git-hash-object(1) will happily write a commit with multiple millions of parents for you. I've stopped at 67,108,864 parents as git-hash-object(1) unfortunately soaks up the complete object without streaming anything to disk and thus will cause an OOM situation at a later point. The point here is: this commit was about 4.1GB of size but compressed down to 24MB and thus easy to distribute. The above doesn't yet trigger the buffer overflow, thus. As the array's elements are all pointers which are 8 bytes on 64 bit, we need a total of 536,870,912 parents to trigger the overflow to `0`. The effect is that we're now underallocating the array and do an out-of-bound writes. As the buffer is kindly provided by the adversary, this may easily result in code execution. Extrapolating from the test file with 67m commits to the one with 536m commits results in a factor of 8. Thus the uncompressed contents would be about 32GB in size and the compressed ones 192MB. While still easily distributable via the network, only servers will have that amount of RAM and not cause an out-of-memory condition previous to triggering the overflow. This at least makes this attack not an easy vector for client-side use of libgit2.

cb1439c9

2019-06-19T12:59:27

config: validate ownership of C:\ProgramData\Git\config before using it When the VirtualStore feature is in effect, it is safe to let random users write into C:\ProgramData because other users won't see those files. This seemed to be the case when we introduced support for C:\ProgramData\Git\config. However, when that feature is not in effect (which seems to be the case in newer Windows 10 versions), we'd rather not use those files unless they come from a trusted source, such as an administrator. This change imitates the strategy chosen by PowerShell's native OpenSSH port to Windows regarding host key files: if a system file is owned neither by an administrator, a system account, or the current user, it is ignored.

5774b2b1

2019-08-11T23:42:45

Merge pull request #5113 from pks-t/pks/stash-perf stash: avoid recomputing tree when committing worktree

42bacbc6

2019-08-11T21:06:19

Merge pull request #5121 from pks-t/pks/variadic-errors Variadic macros

fba3bf79

2019-07-21T14:15:12

blob: optionally read attributes from repository When `GIT_BLOB_FILTER_ATTTRIBUTES_FROM_HEAD` is passed to `git_blob_filter`, read attributes from `gitattributes` files that are checked in to the repository at the HEAD revision. This passes the flag `GIT_FILTER_ATTRIBUTES_FROM_HEAD` to the filter functions.

f0f27c1c

2019-07-21T14:13:25

filter: optionally read attributes from repository When `GIT_FILTER_ATTRIBUTES_FROM_HEAD` is specified, configure the filter to read filter attributes from `gitattributes` files that are checked in to the repository at the HEAD revision. This passes the flag `GIT_ATTR_CHECK_INCLUDE_HEAD` to the attribute reading functions.

4fd5748c

2019-07-21T14:11:03

attr: optionally read attributes from repository When `GIT_ATTR_CHECK_INCLUDE_HEAD` is specified, read `gitattribute` files that are checked into the repository at the HEAD revision.

a5392eae

2019-07-21T12:13:07

blob: allow blob filtering to ignore system gitattributes Introduce `GIT_BLOB_FILTER_NO_SYSTEM_ATTRIBUTES`, which tells `git_blob_filter` to ignore the system-wide attributes file, usually `/etc/gitattributes`. This simply passes the appropriate flag to the attribute loading code.

22eb12af

2019-07-21T12:12:05

filter: add GIT_FILTER_NO_SYSTEM_ATTRIBUTES option Allow system-wide attributes (the ones specified in `/etc/gitattributes`) to be ignored if the flag `GIT_FILTER_NO_SYSTEM_ATTRIBUTES` is specified.

fa1a4c77

2019-07-21T11:03:01

blob: deprecate `git_blob_filtered_content` Users should now use `git_blob_filter`.

a32ab076

2019-07-21T10:56:42

blob: introduce git_blob_filter Provide a function to filter blobs that allows for more functionality than the existing `git_blob_filtered_content` function.

b0692d6b

2019-08-09T09:01:56

Merge pull request #4913 from implausible/feature/signing-rebase-commits Add sign capability to git_rebase_commit

998f9c15

2019-08-07T07:21:27

fixup: strange indentation

f627ba6c

2019-08-02T13:18:07

Merge pull request #5197 from pks-t/pks/remote-ifdeffed-block remote: remove unused block of code

e23c0b18

2019-08-02T07:52:58

remote: remove unused block of code In "remote.c", we have a chunk of code that is #ifdef'fed out via `#if 0` with a comment that we could export it as a helper function. The code was implemented in 2013 and ifdef'fed in 2014, which shows that there's clearly no interest in having such a helper at all. As this block has recently created some confusion about `p_getenv` due to it containing the only reference to that function in our codebase, let's remove this block altogether.

d588de7c

2019-08-02T07:51:02

Merge pull request #5191 from eaigner/master config: check if we are running in a sandboxed environment

952fbbfb

2019-08-01T20:04:11

config: check if we are running in a sandboxed environment On macOS the $HOME environment variable returns the path to the sandbox container instead of the actual user $HOME for sandboxed apps. To get the correct path, we have to get it from the password file entry.

722ba93f

2019-08-01T15:14:06

config: implement "onbranch" conditional With Git v2.23.0, the conditional include mechanism gained another new conditional "onbranch". As the name says, it will cause a file to be included if the "onbranch" pattern matches the currently checked out branch. Implement this new condition and add a bunch of tests.

1721ab04

2019-06-16T11:25:47

unix: posix: avoid use of variadic macro `p_snprintf` The macro `p_snprintf` is implemented as a variadic macro that calls `snprintf` directly with `__VA_ARGS__`. In C89, variadic macros are not allowed, but as the arguments of `p_snprintf` and `snprintf` are matching 1:1, we can fix this by simply removing the parameter list from `p_snprintf`.

63d8cd18

2019-06-16T11:17:17

apply: remove use of variadic error macro The macro `apply_err` is implemented as a variadic macro, which are not defined by C89. Convert it to a variadic function, instead.

27b8b31e

2019-08-01T11:57:03

parse: remove use of variadic macros which are not C89 compliant The macro `git_parse_error` is implemented in a variadic way so that it's possible to pass printf-style parameters. Unfortunately, variadic macros are not defined by C89 and thus we cannot use that functionality. But as we have implemented `git_error_vset` in the previous commit, we can now just use that instead. Convert `git_parse_error` to a variadic function and use `git_error_vset` to fix the compliance violation. While at it, move the function to "patch_parse.c".

c8e63812

2019-06-16T11:03:08

errors: introduce `git_error_vset` function Right now, we only provide a `git_error_set` that has a variadic function signature. It's impossible to drive this function in a C89-compliant way from other functions that have a variadic signature, though, like for example `git_parse_error`. Implement a new `git_error_vset` function that gets a `va_list` as parameter, fixing the above problem.

e8f63411

2019-08-01T11:29:58

Merge pull request #5186 from pks-t/pks/config-snapshot-separation config: separate file and snapshot backends

fb0730f1

2019-04-16T23:49:16

util: use 64 bit timer on Windows git__timer was originally implemented using a 32 bit timer since Windows XP did not support GetTickCount64. Windows XP was discontinued five years ago, so it should be safe to use the new API. As a benefit, we do not need to care about overflows for the next 585 million years.

c8e249b0

2019-07-29T10:51:22

object: deprecate git_object__size for removal In #5118 we remove the double-underscore to make it a normally-named public function. However, this is not an interesting function outside of the library and it takes up a name for something that could be more useful. Remove the single-underscore version as we have not done any releases with it.

37ebe9ad

2019-07-24T18:49:08

config_backend: rename internal structures The internal backend structures are kind-of legacy and do not really speak for themselves. Rename them accordingly to make them easier to understand.

2bff84ba

2019-07-26T21:02:56

config_file: separate out read-only backend To further distinguish the file writeable and readonly backends, separate the readonly backend into its own "config_snapshot.c" implementation. The snapshot backend can be generically used to snapshot any type of backend.

f0b10066

2019-07-24T18:37:14

config_file: fix cast of readonly backend In `backend_readonly_free`, the passed in config backend is being cast to a `diskfile_backend` instead of to a `diskfile_readonly_backend`. While this works out just fine because we only access its header values, which were shared between both backends, it is undefined behaviour. Use the correct type to fix this.

a3159df8

2019-07-24T18:31:43

config_file: remove shared `diskfile_header` struct The `diskfile_header` structure is shared between both `diskfile_backend` and `diskfile_readonly_backend`. The separation and resulting casting is confusing at times and a source for programming errors. Remove the shared structure and inline them directly.

271e5fba

2019-07-24T18:18:18

config_file: duplicate accessors for readonly backend While most functions of the readonly configuration backend are implemented separately from the writeable configuration backend, the two functions `config_iterator_new` and `config_get` are shared between both. This sharing makes it necessary to have some shared data structures, which is the `diskfile_header` structure. Unfortunately, this makes the backends harder to grasp than necessary due to all the casting between structs and also quite error prone. Reimplement those functions for the readonly backends. As readonly backends cannot be refreshed anyway, we can remove the calls to `config_refresh` in there.

4e7ce1fb

2019-07-24T18:13:52

config_file: reimplement `config_readonly_open` generically The `config_readonly_open` function currently receives as input a diskfile backend and will copy its entries to a new snapshot. This is rather intimate, as we need to assume that the source config backend is in fact a diskfile entry. We can do better than this though by using generic methods to copy contents of the provided backend, e.g. by using a config iterator. This also allows us to decouple the read-only backend from the read-write backend.

76182e84

2019-07-24T18:04:38

config_entries: fix possible segfault when duplicating entries When duplicating a configuration entry, we allocate a new entry but do not verify that we get a valid pointer back. As we're dereferencing the pointer afterwards, we might thus run into a segfault in out-of-memory situations. Extract a new function `git_config_entries_dup_entry` that handles the complete entry duplication. Fix the error by using `GIT_ERROR_CHECK_ALLOC`.

ba2885da

2019-07-24T18:05:28

git_net_url_parse: don't git_buf_decode_percent for path

2766b92d

2019-07-21T15:10:34

config_file: refresh when creating an iterator When creating a new iterator for a config file backend, then we should always make sure that we're up to date by calling `config_refresh`. Otherwise, we might not notice when another process has modified the configuration file and thus will represent outdated values. Add two tests to config::stress that verify that we get up-to-date values when reading configuration entries via `git_config_iterator`.

9fac8b78

2019-07-21T15:08:22

config_file: do not refresh read-only backends If calling `config_refresh` on a read-only configuration file backend, then we will segfault when comparing the timestamp of the file due to `path` being uninitialized. As a read-only snapshot should not be refreshed anyway and stay consistent, we can simply return early when calling `config_refresh` on a read-only snapshot.

28d11b59

2019-07-21T14:41:21

config_file: consistently use `GIT_CONTAINER_OF`

6be5ac23

2019-07-11T15:30:51

checkout: postpone creation of symlinks to the end On most platforms it's fine to create symlinks to nonexisting files. Not so on Windows, where the type of a symlink (file or directory) needs to be set at creation time. So depending on whether the target file exists or not, we may end up with different symlink types. This creates a problem when performing checkouts, where we simply iterate over all blobs that need to be updated without treating symlinks any special. If the target file of the symlink is going to be checked out after the symlink itself, then the symlink will be created as directory symlink and not as file symlink. Fix the issue by iterating over blobs twice: once to perform postponed deletions and updates to non-symlink blobs, and once to perform updates to symlink blobs.

50194dcd

2019-07-11T15:14:42

win32: fix symlinks to relative file targets When creating a symlink in Windows, one needs to tell Windows whether the symlink should be a file or directory symlink. To determine which flag to pass, we call `GetFileAttributesW` on the target file to see whether it is a directory and then pass the flag accordingly. The problem though is if create a symlink with a relative target path, then we will check that relative path while not necessarily being inside of the working directory where the symlink is to be created. Thus, getting its attributes will either fail or return attributes of the wrong target. Fix this by resolving the target path relative to the directory in which the symlink is to be created.

a00842c4

2019-06-29T09:59:14

win32: correctly unlink symlinks to directories When deleting a symlink on Windows, then the way to delete it depends on whether it is a directory symlink or a file symlink. In the first case, we need to use `DeleteFile`, in the second `RemoveDirectory`. Right now, `p_unlink` will only ever try to use `DeleteFile`, though, and thus fail to remove directory symlinks. This mismatches how unlink(3P) is expected to behave, though, as it shall remove any symlink disregarding whether it is a file or directory symlink. In order to correctly unlink a symlink, we thus need to check what kind of file this is. If we were to first query file attributes of every file upon calling `p_unlink`, then this would penalize the common case though. Instead, we can try to first delete the file with `DeleteFile` and only if the error returned is `ERROR_ACCESS_DENIED` will we query file attributes and determine whether it is a directory symlink to use `RemoveDirectory` instead.

ded77bb1

2019-06-29T09:58:34

path: extract function to check whether a path supports symlinks When initializing a repository, we need to check whether its working directory supports symlinks to correctly set the initial value of the "core.symlinks" config variable. The code to check the filesystem is reusable in other parts of our codebase, like for example in our tests to determine whether certain tests can be expected to succeed or not. Extract the code into a new function `git_path_supports_symlinks` to avoid duplicate implementations. Remove a duplicate implementation in the repo test helper code.

e54343a4

2019-06-29T09:17:32

fileops: rename to "futils.h" to match function signatures Our file utils functions all have a "futils" prefix, e.g. `git_futils_touch`. One would thus naturally guess that their definitions and implementation would live in files "futils.h" and "futils.c", respectively, but in fact they live in "fileops.h". Rename the files to match expectations.

a7d32d60

2019-07-20T18:46:32

stash: avoid recomputing tree when committing worktree When creating a new stash, we need to create there separate commits storing differences stored in the index, untracked changes as well as differences in the working directory. The first two will only be done conditionally if the equivalent options "git stash --keep-index --include-untracked" are being passed to `git_stash_save`, but even when only creating a stash of worktree changes we're much slower than git.git. Using our new stash example: $ time git stash Saved working directory and index state WIP on (no branch): 2f7d9d47575e Linux 5.1.7 real 0m0.528s user 0m0.309s sys 0m0.381s $ time lg2 stash real 0m27.165s user 0m13.645s sys 0m6.403s As can be seen, libgit2 is more than 50x slower than git.git! When creating the stash commit that includes all worktree changes, we create a completely new index to prepare for the new commit and populate it with the entries contained in the index' tree. Here comes the catch: by populating the index with a tree's contents, we do not have any stat caches in the index. This means that we have to re-validate every single file from the worktree and see whether it has changed. The issue can be fixed by populating the new index with the repo's existing index instead of with the tree. This retains all stat cache information, and thus we really only need to check files that have changed stat information. This is semantically equivalent to what we previously did: previously, we used the tree of the commit computed from the index. Now we're just using the index directly. And, in fact, the cache is doing wonders: time lg2 stash real 0m1.836s user 0m1.166s sys 0m0.663s We're now performing 15x faster than before and are only 3x slower than git.git now.

thodg/libgit2/src

src

Log