kmx git

Commit	Date	Message
f2f5ec84	2018-11-23T19:27:09	khash: move khash include into implementation files The current map implementations directly include the "khash.h" headers into their own headers to make available a set of static functions, defines et cetera. Besides leaking the complete khash namespace into files wherever khashes are used, this also triggers Clang's -Wunused-function warnings when some of the static functions are not being used at all. Fix the issue by moving the includes into the respective map implementation files. Add forward declares for all the map types to make them known.
5bfb3b58	2018-11-23T18:48:40	khash: implement map-specific foreach macros The current foreach map macros simply redirect to the type-indifferent `kh_foreach` macro. As this type-indifferent macro directly accesses the structures, the current implementation makes it impossible to make the structures private to the implementation only. And making them private is required to move out the khash include into the implementations to decrease the namespace leak.
b2af13f2	2018-11-21T12:07:23	iterator: remove unused function `tree_iterator_entry_cmp` The function `tree_iterator_entry_cmp` has been introduced in commit be30387e8 (iterators: refactored tree iterator, 2016-02-25), but in fact it has never been used at all. Remove it to avoid unused function warnings as soon as we re-enable "-Wunused-functions".
382b668b	2018-11-23T18:38:18	khash: implement begin/end via functions instead of macros Right now, the `git_map_begin()` and `git_map_end()` helpers are implemented via macros which simply redirect to `kh_begin` and `kh_end`. As these macros refer to members of the map structures, they make it impossible to move the khash include into the implementation files. Implement these helpers as real functions instead to further decouple the headers from implementations.
852bc9f4	2018-11-23T19:26:24	khash: remove intricate knowledge of khash types Instead of using the `khiter_t`, `git_strmap_iter` and `khint_t` types, simply use `size_t` instead. This decouples code from the khash stuff and makes it possible to move the khash includes into the implementation files.
ae765d00	2018-11-23T19:26:48	submodule: remove string map implementation that strips trailing slashes The submodule code currently has its own implementation of a string map, which overrides the hashing and hash equals functions with functions that ignore potential trailing slashes. These functions aren't actually used by our code, making them useless.
02789782	2018-11-23T18:37:57	idxmap: remove unused foreach macros The foreach macros of the idxmap types are not used anywhere. As we are about to open-code all foreach macros for the maps in order to be able to make the khash structure internal, removing these unused macros will leave fewer places that need conversion.
f0714daf	2018-11-25T13:36:29	Fix warning C4133 incompatible types in MSVC Introduced in commit b433a22a979ae78c28c8b16f8c3487e2787cb73e. Signed-off-by: Sven Strickroth <email@cs-ware.de>
0e3e832d	2018-11-21T13:30:01	Merge pull request #4884 from libgit2/ethomson/index_iterator index: introduce git_index_iterator
cb23c3ef	2018-11-21T10:54:29	commit: fix out-of-bound reads when parsing truncated author fields While commit objects usually should have only one author field, our commit parser actually handles the case where a commit has multiple author fields because some tools that exist in the wild actually write them. Detection of those additional author fields is done by using a simple `git__prefixcmp`, checking whether the current line starts with the string "author ". In case where we are handed a non-NUL-terminated string that ends directly after the space, though, we may have an out-of-bounds read of one byte when trying to compare the expected final NUL byte. Fix the issue by using `git__prefixncmp` instead of `git__prefixcmp`. Unfortunately, a test cannot be easily written to catch this case. While we could test the last error message and verify that it didn't in fact fail parsing a signature (because that would indicate that it has in fact tried to parse the additional "author " field, which it shouldn't be able to detect in the first place), this doesn't work as the next line needs to be the "committer" field, which would error out with the same error message even if we hadn't done an out-of-bounds read. As objects read from the object database are always NUL terminated, this issue cannot be triggered in normal code and thus it's not security critical.
11d33df8	2018-11-18T23:39:43	Merge branch 'tiennou/fix/logallrefupdates-always'
e226ad8f	2018-11-17T17:55:10	refs: add support for core.logAllRefUpdates=always Since we were not expecting this config entry to contain a string, we would fail as soon as its (cached) value would be accessed. Hence, provide some constants for the 4 states we use, and account for "always" when we decide to reflog changes.
646a94be	2018-11-18T23:15:56	Merge pull request #4847 from noahp/noahp/null-arg-fixes tests: 🌀 address two null argument instances
5c213e29	2018-11-18T22:59:03	Merge pull request #4875 from tiennou/fix/openssl-errors Some OpenSSL issues
4ef2b889	2018-11-18T22:56:28	Merge pull request #4882 from kc8apf/include_port_in_host_header transport/http: Include non-default ports in Host header
7321cff0	2018-11-15T09:17:51	Merge pull request #4713 from libgit2/ethomson/win_symlinks Support symlinks on Windows when core.symlinks=true
c358bbc5	2018-11-12T17:22:47	index: introduce git_index_iterator Provide a public git_index_iterator API that is backed by an index snapshot. This allows consumers to provide a stable iteration even while manipulating the index during iteration.
4b84db6a	2018-11-14T12:33:38	patch_parse: remove unused function `parse_number` The function `parse_number` was replaced by `git_parse_advance_digit` which is provided by the parser interface in commit 252f2eeee (parse: implement and use `git_parse_advance_digit`, 2017-07-14). As there are no remaining callers, remove it.
4209a512	2018-11-14T12:04:42	strntol: fix out-of-bounds reads when parsing numbers with leading sign When parsing a number, we accept a leading plus or minus sign to return a positive or negative number. When the parsed string has such a leading sign, we set up a flag indicating that the number is negative and advance the pointer to the next character in that string. This misses updating the number of bytes in the string, though, which is why the parser may later on do an out-of-bounds read. Fix the issue by correctly updating both the pointer and the number of remaining bytes. Furthermore, we need to check whether we actually have any bytes left after having advanced the pointer, as otherwise the auto-detection of the base may do an out-of-bounds access. Add a test that detects the out-of-bound read. Note that this is not actually security critical. While there are a lot of places where the function is called, all of these places are guarded or irrelevant: - commit list: this operates on objects from the ODB, which are always NUL terminated and may thus not trigger the off-by-one OOB read. - config: the configuration is NUL terminated. - curl stream: user input is being parsed that is always NUL terminated - index: the index is read via `git_futils_readbuffer`, which always NUL terminates it. - loose objects: used to parse the length from the object's header. As we check previously that the buffer contains a NUL byte, this is safe. - rebase: this parses numbers from the rebase instruction sheet. As the rebase code uses `git_futils_readbuffer`, the buffer is always NUL terminated. - revparse: this parses a user provided buffer that is NUL terminated. - signature: this parses the header information of objects. As objects read from the ODB are always NUL terminated, this is a non-issue. The constructor `git_signature_from_buffer` does not accept a length parameter for the buffer, so the buffer needs to be NUL terminated, as well. - smart transport: the buffer that is parsed is NUL terminated - tree cache: this parses the tree cache from the index extension. The index itself is read via `git_futils_readbuffer`, which always NUL terminates it. - winhttp transport: user input is being parsed that is always NUL terminated
cf83809b	2018-11-13T14:26:26	Merge pull request #4883 from pks-t/pks/signature-tz-oob signature: fix out-of-bounds read when parsing timezone offset
f127ce35	2018-11-13T08:22:25	tests: address two null argument instances Handle two null argument cases that occur in the unit tests. One is in library code, the other is in test code. Detected by running unit tests with undefined behavior sanitizer: ```bash # build mkdir build && cd build cmake -DBUILD_CLAR=ON -DCMAKE_C_FLAGS="-fsanitize=address \ -fsanitize=undefined -fstack-usage -static-libasan" .. cmake --build . # run with asan ASAN_OPTIONS="allocator_may_return_null=1" ./libgit2_clar ... ............../libgit2/src/apply.c:316:3: runtime error: null pointer \ passed as argument 1, which is declared to never be null ...................../libgit2/tests/apply/fromfile.c:46:3: runtime \ error: null pointer passed as argument 1, which is declared to never be null ```
20cb30b6	2018-11-13T13:40:17	Merge pull request #4667 from tiennou/feature/remote-create-api Remote creation API
28239be3	2018-11-13T13:27:41	Merge pull request #4818 from pks-t/pks/index-collision Index collision fixes
11fbead8	2018-11-11T16:40:56	Merge pull request #4705 from libgit2/ethomson/apply Patch (diff) application
83b35181	2018-10-19T10:54:38	transport/http: Include non-default ports in Host header When the port is omitted, the server assumes the default port for the service is used (see https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Host). In cases where the client provided a non-default port, it should be passed along. This hasn't been an issue so far as the git protocol doesn't include server-generated URIs. I encountered this when implementing Rust registry support for Sonatype Nexus. Rust's registry uses a git repository for the package index. Clients look at a file in the root of the package index to find the base URL for downloading the packages. Sonatype Nexus looks at the incoming HTTP request (Host header and URL) to determine the client-facing URL base as it may be running behind a load balancer or reverse proxy. This client-facing URL base is then used to construct the package download base URL. When libgit2 fetches the index from Nexus on a non-default port, Nexus trusts the incorrect Host header and generates an incorrect package download base URL.
58b60fcc	2018-11-08T09:31:28	netops: add method to return default http port for a connection Constant strings and logic for HTTP(S) default ports were starting to be spread throughout netops.c. Instead of duplicating this again to determine if a Host header should include the port, move the default port constants and logic into an internal method in netops.{c,h}.
52f859fd	2018-11-09T19:32:08	signature: fix out-of-bounds read when parsing timezone offset When parsing a signature's timezone offset, we first check whether there is a timezone at all by verifying that there are still bytes left to read following the time itself. The check thus looks like `time_end + 1 < buffer_end`, which is actually correct in this case. After setting the timezone's start pointer to that location, we compute the remaining bytes by using the formula `buffer_end - tz_start + 1`, re-using the previous `time_end + 1`. But this is in fact missing the braces around `(tz_start + 1)`, thus leading to an overestimation of the remaining bytes by a length of two. In case of a non-NUL terminated buffer, this will result in an overflow. The function `git_signature__parse` is only used in two locations. First is `git_signature_from_buffer`, which only accepts a string without a length. The string thus necessarily has to be NUL terminated and cannot trigger the issue. The other function is `git_commit__parse_raw`, which can in fact trigger the error as it may receive non-NUL terminated commit data. But as objects read from the ODB are always NUL-terminated by us as a cautionary measure, it cannot trigger the issue either. In other words, this error does not have any impact on security.
9ad96367	2018-11-07T15:31:21	smart transport: only clear url on hard reset After creating a transport for a server, we expect to be able to call `connect`, then invoke subsequent `action` calls. We provide the URL to these `action` calls, although our built-in transports happen to ignore it since they've already parsed it into an internal format that they intend to use (`gitno_connection_data`). In ca2eb4608243162a13c427e74526b6422d5a6659, we began clearing the URL field after a connection, meaning that subsequent calls to transport `action` callbacks would get a NULL URL, which went undetected since the builtin transports ignore the URL when they're already connected (instead of re-parsing it into an internal format). Downstream custom transport implementations (eg, LibGit2Sharp) did notice this change, however. Since `reset_stream` is called even when we're not closing the subtransport, update to only clear the URL when we're closing the subtransport. This ensures that `action` calls will get the correct URL information even after a connection.
f8b9493b	2018-11-05T15:46:08	apply: test re-adding a file after removing it Ensure that we can add a file back after it's been removed. Update the renamed/deleted validation in application to not apply to deltas that are adding files to support this.
78580ad3	2018-11-05T15:34:59	apply: test modifying a file after renaming it Ensure that we cannot modify a file after it's been renamed out of the way. If multiple deltas exist for a single path, ensure that we do not attempt to modify a file after it's been renamed out of the way. To support this, we must track the paths that have been removed or renamed; add to a string map when we remove a path and remove from the string map if we recreate a path. Validate that we are not applying to a path that is in this map, unless the delta is a rename, since git supports renaming one file to two different places in two different deltas. Further, test that we cannot apply a modification delta to a path that will be created in the future by a rename (a path that does not yet exist.)
af33210b	2018-07-10T16:10:03	apply: introduce a delta callback Introduce a callback to the application options that allow callers to add a per-delta callback. The callback can return an error code to stop patch application, or can return a value to skip the application of a particular delta.
df4258ad	2018-11-04T13:01:03	apply: handle multiple deltas to the same file git allows a patch file to contain multiple deltas to the same file: although it does not produce files in this format itself, this could be the result of concatenating two different patch files that affected the same file. git apply behaves by applying this next delta to the existing postimage of the file. We should do the same. If we have previously seen a file, and produced a postimage for it, we will load that postimage and apply the current delta to that. If we have not, get the file from the preimage.
6fecf4d1	2018-11-04T11:47:46	apply: handle exact renames Deltas containing exact renames are special; they simply indicate that a file was renamed without providing additional metadata (like the filemode). Teach the reader to provide the file mode and use the preimage's filemode in the case that the delta does not provide one.
12f9ac17	2018-11-04T11:26:42	apply: validate unchanged mode when applying both When applying to both the index and the working directory, ensure that the working directory's mode matches the index's mode. It's not sufficient to look only at the hashed object id to determine that the file is unchanged, git also takes the mode into account.
52e27b84	2018-10-10T12:42:54	reader: free is unused and unnecessary None of the reader implementations actually allocate anything themselves, so they don't need a free function. Remove it.
47cc5f85	2018-09-29T19:32:51	apply: introduce a hunk callback Introduce a callback to patch application that allows consumers to cancel hunk application.
37b25ac5	2018-07-08T16:12:58	apply: move location to an argument, not the opts Move the location option to an argument, out of the options structure. This allows the options structure to be re-used for functions that don't need to know the location, since it's implicit in their functionality. For example, `git_apply_tree` should not take a location, but is expected to take all the other options.
2d27ddc0	2018-07-01T21:35:51	apply: use an indexwriter Place the entire `git_apply` operation inside an indexwriter, so that we lock the index before we begin performing patch application. This ensures that there are no other processes modifying things in the working directory.
9be89bbd	2018-07-01T11:08:26	reader: apply working directory filters When reading a file from the working directory, ensure that we apply any necessary filters to the item. This ensures that we get the repository-normalized data as the preimage, and further ensures that we can accurately compare the working directory contents to the index contents for accurate safety validation in the `BOTH` case.
813f0802	2018-07-01T15:14:36	apply: validate workdir contents match index for BOTH When applying to both the index and the working directory, ensure that the index contents match the working directory. This mirrors the requirement in `git apply --index`. This also means that - along with the prior commit that uses the working directory contents as the checkout baseline - we no longer expect conflicts during checkout. So remove the special-case error handling for checkout conflicts. (Any checkout conflict now would be because the file was actually modified between the start of patch application and the checkout.)
0f4b2f02	2018-07-01T15:13:50	reader: optionally validate index matches workdir When using a workdir reader, optionally validate that the index contents match the working directory contents.
5b8d5a22	2018-07-01T13:42:53	apply: use preimage as the checkout baseline Use the preimage as the checkout's baseline. This allows us to support applying patches to files that are modified in the working directory (those that differ from the HEAD and index). Without this, files will be reported as (checkout) conflicts. With this, we expect the on-disk data when we began the patch application (the "preimage") to be on-disk during checkout. We could have also simply used the `FORCE` flag to checkout to accomplish a similar mechanism. However, `FORCE` ignores all differences, while providing a preimage ensures that we will only overwrite the file contents that we actually read. Modify the reader interface to provide the OID to support this.
dddfff77	2018-06-30T17:12:16	apply: convert checkout conflicts to apply failures When there's a checkout conflict during apply, that means that the working directory was modified in a conflicting manner and the postimage cannot be written. During application, convert this to an application failure for consistency across workdir/index/both applications.
5b66b667	2018-06-29T12:39:41	apply: when preimage file is missing, return EAPPLYFAIL The preimage file being missing entirely is simply a case of an application failure; return the correct error value for the caller.
e0224121	2018-06-29T12:09:02	apply: simplify checkout vs index application Separate the concerns of applying via checkout and updating the repository's index. This results in simpler functionality and allows us to not build the temporary collection of paths in the index case.
20f8a6db	2018-06-28T17:26:21	apply: remove deleted paths from index We update the index with the new_file side of the delta, but we need to explicitly remove the old_file path in the case where an item was deleted or renamed.
c3077ea0	2018-06-25T21:24:49	apply: return a specific exit code on failure Return `GIT_EAPPLYFAIL` on patch application failure so that users can determine that patch application failed due to a malformed/conflicting patch by looking at the error code.
d54aa9ae	2018-06-26T15:25:30	iterator: introduce `git_iterator_foreach` Introduce a `git_iterator_foreach` helper function which invokes a callback on all files for a given iterator.
9c34c996	2018-06-25T17:03:14	apply: handle file additions Don't attempt to read the postimage file during a file addition, simply use an empty buffer as the postimage. Also, test that we can handle file additions.
3b5378c5	2018-06-25T16:27:06	apply: handle file deletions If the file was deleted in the postimage, do not attempt to update the target. Instead, ignore it and simply allow it to stay removed in our computed postimage. Also, test that we can handle file deletions.
f83bbe0a	2018-03-19T19:50:45	apply: introduce `git_apply` Introduce `git_apply`, which will take a `git_diff` and apply it to the working directory (akin to `git apply`), the index (akin to `git apply --cached`), or both (akin to `git apply --index`).
664cda6f	2018-03-19T20:10:38	apply: reimplement `git_apply_tree` with readers The generic `git_reader` interface simplifies `git_apply_tree` somewhat. Reimplement `git_apply_tree` with them.
d73043a2	2018-03-19T20:10:31	reader: a generic way to read files from repos Similar to the `git_iterator` interface, the `git_reader` interface will allow us to read file contents from an arbitrary repository-backed data source (trees, index, or working directory).
02b1083a	2018-01-28T23:25:07	apply: introduce `git_apply_tree` Introduce `git_apply_tree`, which will apply a `git_diff` to a given `git_tree`, allowing an in-memory patch application for a repository.
2b12dcf6	2018-03-19T19:45:11	iterator: optionally hash filesystem iterators Optionally hash the contents of files encountered in the filesystem or working directory iterators. This is not expected to be used in production code paths, but may allow us to simplify some test contexts. For working directory iterators, apply filters as appropriate, since we have the context able to do it.
37acffac	2018-10-08T20:51:20	remote: remove static create_internal function
10cba764	2018-07-06T21:58:34	remote: lower the default vector size to 8 As it is, this is space for 32 refs pointers, which feels a little much. Lower it to 8, as it is the minimum vector size anyway.
d3650294	2018-06-20T02:27:14	remote: add a flag to prevent generation of the default fetchspec
fdb116b3	2018-06-20T02:27:12	remote: add a creation flag for ignoring url.insteadOf
3cbaebdf	2018-06-20T02:27:11	remote: provide a generic API for creating remotes This supersedes the functionality of remote_create_with_fetchspec, remote_create_anonymous and remote_create_detached.
43b4b2fa	2018-06-20T02:27:09	remote: merge if-statements We need a repo/config and a name to be able to do anything to the configuration. As such, those two tests can be merged so their conditions are shared.
b741bb89	2018-06-20T02:27:04	remote: add a helper for generating the default fetchspec
b2640c36	2018-06-20T02:27:03	config: add asserts for non-null parameters in lock/unlock
de2af3c2	2018-06-20T02:27:00	remote: move static method
7fafec0e	2018-10-29T18:32:39	tree: fix integer overflow when reading unreasonably large filemodes The `parse_mode` option uses an open-coded octal number parser. The parser is quite naive in that it simply parses until hitting a character that is not in the accepted range of '0' - '7', completely ignoring the fact that we can at most accept a 16 bit unsigned integer as filemode. If the filemode is bigger than UINT16_MAX, it will thus overflow and provide an invalid filemode for the object entry. Fix the issue by using `git__strntol32` instead and doing a bounds check. As this function already handles overflows, it neatly solves the problem. Note that previously, `parse_mode` was also skipping the character immediately after the filemode. In proper trees, this should be a simple space, but in fact the parser accepted any character and simply skipped over it. As a consequence of using `git__strntol32`, we now need an explicit check for a trailing whitespace after having parsed the filemode. Because of the newly introduced error message, the test object::tree::parse::mode_doesnt_cause_oob_read needs adjustment to its error message check, which in fact is a good thing as it demonstrates that we now fail looking for the whitespace immediately following the filemode. Add a test that shows that we will fail to parse such invalid filemodes now.
f647bbc8	2018-10-29T17:25:09	tree: fix mode parsing reading out-of-bounds When parsing a tree entry's mode, we will eagerly parse until we hit a character that is not in the accepted set of octal digits '0' - '7'. If the provided buffer is not a NUL terminated one, we may thus read out-of-bounds. Fix the issue by passing the buffer length to `parse_mode` and paying attention to it. Note that this is not a vulnerability in our usual code paths, as all object data read from the ODB is NUL terminated.
50d09407	2018-10-29T18:05:27	strntol: fix detection and skipping of base prefixes The `git__strntol` family of functions has the ability to auto-detect a number's base if the string has either the common '0x' prefix for hexadecimal numbers or '0' prefix for octal numbers. The detection of such prefixes and following handling has two major issues though that are being fixed in one go now. - We do not do any bounds checking previous to verifying the '0x' base. While we do verify that there is at least one digit available previously, we fail to verify that there are two digits available and thus may do an out-of-bounds read when parsing this two-character-prefix. - When skipping the prefix of such numbers, we only update the pointer length without also updating the number of remaining bytes. Thus if we try to parse a number '0x1' of total length 3, we will first skip the first two bytes and then try to read 3 bytes starting at '1'. Fix both issues by disentangling the logic. Instead of doing the detection and skipping of such prefixes in one go, we will now first try to detect the base while also honoring how many bytes are left. Only if we have a valid base that is either 8 or 16 and have one of the known prefixes, we will now advance the pointer and update the remaining bytes in one step. Add some tests that verify that no out-of-bounds parsing happens and that autodetection works as advertised.
41863a00	2018-10-29T17:19:58	strntol: fix out-of-bounds read when skipping leading spaces The `git__strntol` family of functions accepts leading spaces and will simply skip them. The skipping will not honor the provided buffer's length, though, which may lead it to read outside of the provided buffer's bounds if it is not a simple NUL-terminated string. Furthermore, if leading space is trimmed, the function will further advance the pointer but not update the number of remaining bytes, which may also lead to out-of-bounds reads. Fix the issue by properly paying attention to the buffer length and updating it when stripping leading whitespace characters. Add a test that verifies that we won't read past the provided buffer length.
03994912	2018-08-29T01:57:24	openssl: only say we're connected if the connection succeeded ssl_close uses this boolean to know if SSL_shutdown should be called. It turns out OpenSSL auto-shutdowns on failure, so if the call to SSL_connect fails, it will complain about "shutdown while in init", trampling the original error.
caee0a66	2018-08-29T01:57:22	openssl: set the error class to GITERR_SSL
623647af	2018-10-26T12:33:59	Merge pull request #4864 from pks-t/pks/object-parse-fixes Object parse fixes
7655b2d8	2018-10-19T10:29:19	commit: fix reading out of bounds when parsing encoding The commit message encoding is currently being parsed by the `git__prefixcmp` function. As this function does not accept a buffer length, it will happily skip over a buffer's end if it is not `NUL` terminated. Fix the issue by using `git__prefixncmp` instead. Add a test that verifies that we are unable to parse the encoding field if it's cut off by the supplied buffer length.
ee11d47e	2018-10-19T09:47:50	tag: fix out of bounds read when searching for tag message When parsing tags, we skip all unknown fields that appear before the tag message. This skipping is done by using a plain `strstr(buffer, "\n\n")` to search for the two newlines that separate tag fields from tag message. As it is not possible to supply a buffer length to `strstr`, this call may skip over the buffer's end and thus result in an out of bounds read. As `strstr` may return a pointer that is out of bounds, the following computation of `buffer_end - buffer` will overflow and result in an allocation of an invalid length. Fix the issue by using `git__memmem` instead. Add a test that verifies parsing the tag fails not due to the allocation failure but due to the tag having no message.
83e8a6b3	2018-10-18T16:08:46	util: provide `git__memmem` function Unfortunately, neither the `memmem` nor the `strnstr` functions are part of any C standard but are merely extensions of C that are implemented by e.g. glibc. Thus, there is no standardized way to search for a string in a block of memory with a limited size, and using `strstr` is to be considered unsafe in case where the buffer has not been sanitized. In fact, there are some uses of `strstr` in exactly that unsafe way in our codebase. Provide a new function `git__memmem` that implements the `memmem` semantics. That is in a given haystack of `n` bytes, search for the occurrence of a byte sequence of `m` bytes and return a pointer to the first occurrence. The implementation chosen is the "Not So Naive" algorithm from [1]. It was chosen as the implementation is comparably simple while still being reasonably efficient in most cases. Preprocessing happens in constant time and space, searching has a time complexity of O(n*m) with a slightly sub-linear average case. [1]: http://www-igm.univ-mlv.fr/~lecroq/string/
bea65980	2018-10-25T11:21:14	Merge pull request #4851 from pks-t/pks/strtol-removal strtol removal
305e801a	2018-10-21T09:52:32	util: allow callers to reset custom allocators Provide a utility to reset custom allocators back to their default. This is particularly useful for testing.
7c791f3d	2018-10-20T20:25:51	Merge pull request #4852 from libgit2/ethomson/unc_paths Win32 path canonicalization refactoring
6cc14ae3	2018-10-20T20:22:04	Merge pull request #4840 from libgit2/cmn/validity-tree-from-unowned-index Check object existence when creating a tree from an index
a2f9f94b	2018-10-20T20:18:04	Merge branch 'issue-4203'
32b81661	2018-10-20T20:16:32	merge: don't leak the index during reloads
820e1e93	2018-10-20T02:17:22	repository: load_config for non-repo configs Teach `load_config` how to load all the configurations except (optionally) the repository configuration. This allows the new repository codepath to load the global/xdg/system configuration paths so that they can be inspected during repository initialization.
b433a22a	2018-10-19T03:14:53	win32: emulate Git for Windows in symlink support Emulate the Git for Windows `core.symlinks` support. Since symbolic links are generally enabled for Administrator (and _may_ be enabled due to enabling Developer mode) but symbolic links are still sufficiently uncommon on Windows that Git users are expected to explicitly opt-in to symbolic links by enabling `core.symlinks=true` in a global (or xdg or system) configuration. When `core.symlinks=true` is set globally _and_ symbolic links support is detected then new repositories created will not have a `core.symlinks` set. If `core.symlinks` is _not_ set then no detection will be performed, and `core.symlinks=false` will be set in the repository configuration.
204cce66	2018-07-03T02:30:34	win32: add symbolic link support Enable `p_symlink` to actually create symbolic links, not just create a fake link (a text file containing the link target). This now means that `core.symlinks=true` works on Windows platforms where symbolic links are enabled (likely due to running in Developer Mode).
30771261	2018-07-03T02:21:17	win32: use GetFinalPathNameByHandle directly Now that we've updated to WIN32_WINNT version of Vista or better, we don't need to dynamically load GetFinalPathNameByHandle and can simply invoke it directly.
b8bdffb5	2018-07-02T07:27:09	cmake: increase WIN32_WINNT to Vista Increase the WIN32_WINNT level to 0x0600, which enables support for new APIs from Windows 6.0 (Vista). We had previously set this to 0x0501, which was Windows XP. Although we removed XP support many years ago, there was no need to update this level previously. We're doing so now explicitly so that we can get support for the `CreateSymbolicLink` API.
0a4284b1	2018-10-19T14:54:13	Merge pull request #4819 from libgit2/cmn/config-nonewline Configuration variables can appear on the same line as the section header
8b6e2895	2018-09-21T15:18:03	index: fix adding index entries with conflicting files When adding an index entry "a/b/c" while an index entry "a/b" already exists, git will happily remove "a/b" and only add the new index entry: $ git init test Initialized empty Git repository in /tmp/test.repo/test/.git/ $ touch x $ git add x $ rm x $ mkdir x $ touch x/y $ git add x/y $ git status A x/y The other way round, adding an index entry "a/b" with an entry "a/b/c" already existing is equivalent, where git will remove "a/b/c" and add "a/b". In contrast, libgit2 will currently fail to add these properly and instead complain about the entry appearing as both a file and a directory. This is a programming error, though: our current code already tries to detect and, in the case of `git_index_add`, to automatically replace such index entries. Funnily enough, we already remove the conflicting index entries, but instead of adding the new entry we then bail out afterwards. This leaves callers with the worst of both worlds: we remove the old entry but fail to add the new one. The root cause is the weird semantics of the `has_file_name` and `has_dir_name` functions. While these functions only sound like they are responsible for detecting such conflicts, they will also already remove them when their `ok_to_replace` parameter is set. But even if we tell them to replace such entries, they will return an error code. Fix the error by returning success when the entries have been replaced. Fix an already existing test which tested for wrong behaviour. Note that the test didn't notice that the resulting tree had no entries. Thus it is fine to change existing behaviour here, as the previous result could've led to silently losing data. Also add a new test that verifies behaviour in the reverse conflicting case.
923317db	2018-09-21T12:57:02	index: modernize error handling of `index_insert` The current error handling of the function `index_insert` is very fragile. Instead of erroring out in case an error has happened, it will instead verify that no error has happened for each statement. This makes adding new code to that function an adventurous task. Improve the situation by converting the function to use our typical `goto out` pattern.
ea19efc1	2018-10-18T15:08:56	util: fix out of bounds read in error message When an integer that is parsed with `git__strntol32` is too big to fit into an int32, we will generate an error message that includes the actual string that failed to parse. This does not acknowledge the fact that the string may either not be NUL terminated or alternatively include additional characters after the number that is to be parsed. We may thus end up printing characters into the buffer that aren't the number or, worse, read out of bounds. Fix the issue by utilizing the `endptr` that was set by `git__strntol64`. This pointer is guaranteed to be set to the first character following the number, and we can thus use it to compute the width of the number that shall be printed. Create a test to verify that we correctly truncate the number.
a34f5b0d	2018-10-18T08:57:27	win32: refactor `git_win32_path_remove_namespace` Update `git_win32_path_remove_namespace` to disambiguate the prefix being removed versus the prefix being added. Now we remove the "namespace", and (may) add a "prefix" in its place. Eg, we remove the `\\?\` namespace. We remove the `\\?\UNC\` namespace, and replace it with the `\\` prefix. This aids readability somewhat. Additionally, use pointer arithmetic instead of offsets, which seems to also help readability.
b2e85f98	2018-10-17T08:48:43	win32: rename `git_win32__canonicalize_path` The internal API `git_win32__canonicalize_path` is far, far too easily confused with the internal API `git_win32_path_canonicalize`. The former removes the namespace prefix from a path (eg, given `\\?\C:\Temp\foo`, it returns `C:\Temp\foo`, and given `\\?\UNC\server\share`, it returns `\\server\share`). As such, rename it to `git_win32_path_remove_namespace`. `git_win32_path_canonicalize` remains unchanged.
b09c1c7b	2018-10-18T14:37:55	util: avoid signed integer overflows in `git__strntol64` While `git__strntol64` tries to detect integer overflows when doing the necessary arithmetics to come up with the final result, it does the detection only after the fact. This check thus relies on undefined behavior of signed integer overflows. Fix this by instead checking up-front whether the multiplications or additions will overflow. Note that a detected overflow will not cause us to abort parsing the current sequence of digits. In the case of an overflow, previous behavior was to still set up the end pointer correctly to point to the first character immediately after the currently parsed number. We do not want to change this now as code may rely on the end pointer being set up correctly even if the parsed number is too big to be represented as 64 bit integer.
8d7fa88a	2018-10-18T12:04:07	util: remove `git__strtol32` The function `git__strtol32` can easily be misused when untrusted data is passed to it that may not have been sanitized with trailing `NUL` bytes. As all usages of this function have now been removed, we can remove this function altogether to avoid future misuse of it.
2613fbb2	2018-10-18T11:58:14	global: replace remaining use of `git__strtol32` Replace remaining uses of the `git__strtol32` function. While these uses are all safe as the strings were either sanitized or from a trusted source, we want to remove `git__strtol32` altogether to avoid future misuse.
21652ee9	2018-10-18T11:43:30	tree-cache: avoid out-of-bound reads when parsing trees We use the `git__strtol32` function to parse the child and entry count of treecaches from the index, which does not accept a buffer length. As the buffer that is being passed in is untrusted data and may thus be malformed and may not contain a terminating `NUL` byte, we can overrun the buffer and thus perform an out-of-bounds read. Fix the issue by using `git__strntol32` instead.
68deb2cc	2018-10-18T11:37:10	util: remove unsafe `git__strtol64` function The function `git__strtol64` does not take a maximum buffer length as parameter. This has led to some unsafe usages of this function, and as such we may consider it as being unsafe to use. As we have now eradicated all usages of this function, let's remove it completely to avoid future misuse.
1a2efd10	2018-10-18T11:35:08	config: remove last instance of `git__strtol64` When parsing integers from configuration values, we use `git__strtol64`. This is fine to do, as we always sanitize values and can thus be sure that they'll have a terminating `NUL` byte. But as this is the last call-site of `git__strtol64`, let's just pass in the length explicitly by calling `strlen` on the value to be able to remove `git__strtol64` altogether.
3db9aa6f	2018-10-18T11:32:48	signature: avoid out-of-bounds reads when parsing signature dates We use `git__strtol64` and `git__strtol32` to parse the trailing commit or author date and timezone of signatures. As signatures are usually part of a commit or tag object and thus essentially untrusted data, the buffer may be misformatted and may not be `NUL` terminated. This may lead to an out-of-bounds read. Fix the issue by using `git__strntol64` and `git__strntol32` instead.
600ceadd	2018-10-18T11:29:06	index: avoid out-of-bounds read when reading reuc entry stage We use `git__strtol64` to parse file modes of the index entries, which does not limit the parsed buffer length. As the index can be essentially treated as "untrusted" in that the data stems from the file system, it may be misformatted and may not contain terminating `NUL` bytes. This may lead to out-of-bounds reads when trying to parse index entries with such malformatted modes. Fix the issue by using `git__strntol64` instead.
1a3fa1f5	2018-10-18T11:25:59	commit_list: avoid use of strtol64 without length limit When quick-parsing a commit, we use `git__strtol64` to parse the commit's time. The buffer that's passed to `commit_quick_parse` is the raw data of an ODB object, though, whose data may not be properly formatted and also does not have to be `NUL` terminated. This may lead to out-of-bound reads. Use `git__strntol64` to avoid this problem.
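
The run of commits above replaces NUL-dependent number parsing with length-bounded parsing and moves the overflow check in front of the arithmetic. A minimal sketch of that idea follows. The function `parse_decimal_bounded` is a hypothetical name for illustration and is not libgit2's `git__strntol64`; it handles only unsigned base-10 digits and, as the b09c1c7b message describes, still sets the end pointer after an overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of libgit2): parse a run of decimal digits
 * from a buffer of `len` bytes that need not be NUL terminated.
 * Returns 0 on success, -1 if there are no digits or the value overflows.
 * `*endptr` always points just past the last digit, even on overflow.
 */
int parse_decimal_bounded(int64_t *out, const char *buf, size_t len,
                          const char **endptr)
{
	const char *p = buf, *end = buf + len;
	int64_t n = 0;
	int overflow = 0;

	assert(out && buf && endptr);

	/* Never look at a byte beyond buf + len. */
	while (p < end && *p >= '0' && *p <= '9') {
		int digit = *p - '0';

		/* Check before multiplying or adding, so no signed overflow occurs. */
		if (!overflow && (n > INT64_MAX / 10 || n * 10 > INT64_MAX - digit))
			overflow = 1;
		if (!overflow)
			n = n * 10 + digit;
		p++;
	}

	*endptr = p;
	if (p == buf || overflow)
		return -1;

	*out = n;
	return 0;
}
```

Passing the length explicitly means a caller holding raw object data never relies on a terminating NUL byte to stop the parser.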

f2f5ec84

2018-11-23T19:27:09

khash: move khash include into implementation files The current map implementations directly include the "khash.h" headers into their own headers to make available a set of static functions, defines et cetera. Besides leaking the complete khash namespace into files wherever khashes are used, this also triggers Clang's -Wunused-function warnings when some of the static functions are not being used at all. Fix the issue by moving the includes into the respective map implementation files. Add forward declares for all the map types to make them known.

5bfb3b58

2018-11-23T18:48:40

khash: implement map-specific foreach macros The current foreach map macros simply redirect to the type-indifferent `kh_foreach` macro. As this type-indifferent macro directly accesses the structures, the current implementation makes it impossible to make the structures private to the implementation only. And making them private is required to move out the khash include into the implementations to decrease the namespace leak.

b2af13f2

2018-11-21T12:07:23

iterator: remove unused function `tree_iterator_entry_cmp` The function `tree_iterator_entry_cmp` has been introduced in commit be30387e8 (iterators: refactored tree iterator, 2016-02-25), but in fact it has never been used at all. Remove it to avoid unused function warnings as soon as we re-enable "-Wunused-function".

382b668b

2018-11-23T18:38:18

khash: implement begin/end via functions instead of macros Right now, the `git_*map_begin()` and `git_*map_end()` helpers are implemented via macros which simply redirect to `kh_begin` and `kh_end`. As these macros refer to members of the map structures, they make it impossible to move the khash include into the implementation files. Implement these helpers as real functions instead to further decouple the headers from implementations.

852bc9f4

2018-11-23T19:26:24

khash: remove intricate knowledge of khash types Instead of using the `khiter_t`, `git_strmap_iter` and `khint_t` types, simply use `size_t` instead. This decouples code from the khash stuff and makes it possible to move the khash includes into the implementation files.
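
The khash commits above share one goal: headers expose only a forward declaration plus real functions, and callers iterate with plain `size_t` indices. A rough sketch of that shape is below. All `demo_map` names are hypothetical and stand in for the real map types; the bucket layout is invented for illustration and is not khash's.

```c
#include <assert.h>
#include <stddef.h>

/* What a header would expose: a forward declaration and functions only. */
typedef struct demo_map demo_map;
size_t demo_map_begin(const demo_map *map);
size_t demo_map_end(const demo_map *map);
int demo_map_has_data(const demo_map *map, size_t idx);

/* What the implementation file would keep private (hypothetical layout). */
struct demo_map {
	size_t n_buckets;
	const unsigned char *used; /* 1 if the bucket holds an entry */
};

size_t demo_map_begin(const demo_map *map)
{
	(void)map;
	return 0;
}

size_t demo_map_end(const demo_map *map)
{
	assert(map);
	return map->n_buckets;
}

int demo_map_has_data(const demo_map *map, size_t idx)
{
	assert(map && idx < map->n_buckets);
	return map->used[idx];
}

/* Caller-side iteration needs no knowledge of the struct members. */
size_t demo_map_count(const demo_map *map)
{
	size_t i, count = 0;

	for (i = demo_map_begin(map); i != demo_map_end(map); i++)
		if (demo_map_has_data(map, i))
			count++;
	return count;
}
```

Because `demo_map_count` only calls functions, it would still compile if the struct definition lived in a separate implementation file.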

ae765d00

2018-11-23T19:26:48

submodule: remove string map implementation that strips trailing slashes The submodule code currently has its own implementation of a string map, which overrides the hashing and hash equals functions with functions that ignore potential trailing slashes. These functions aren't actually used by our code, making them useless.

02789782

2018-11-23T18:37:57

idxmap: remove unused foreach macros The foreach macros of the idxmap types are not used anywhere. As we are about to open-code all foreach macros for the maps in order to be able to make the khash structure internal, removing these unused macros will leave fewer places that need conversion.

f0714daf

2018-11-25T13:36:29

Fix warning C4133 incompatible types in MSVC Introduced in commit b433a22a979ae78c28c8b16f8c3487e2787cb73e. Signed-off-by: Sven Strickroth <email@cs-ware.de>

0e3e832d

2018-11-21T13:30:01

Merge pull request #4884 from libgit2/ethomson/index_iterator index: introduce git_index_iterator

cb23c3ef

2018-11-21T10:54:29

commit: fix out-of-bound reads when parsing truncated author fields While commit objects usually should have only one author field, our commit parser actually handles the case where a commit has multiple author fields because some tools that exist in the wild actually write them. Detection of those additional author fields is done by using a simple `git__prefixcmp`, checking whether the current line starts with the string "author ". In case where we are handed a non-NUL-terminated string that ends directly after the space, though, we may have an out-of-bounds read of one byte when trying to compare the expected final NUL byte. Fix the issue by using `git__prefixncmp` instead of `git__prefixcmp`. Unfortunately, a test cannot be easily written to catch this case. While we could test the last error message and verify that it didn't in fact fail parsing a signature (because that would indicate that it has in fact tried to parse the additional "author " field, which it shouldn't be able to detect in the first place), this doesn't work as the next line needs to be the "committer" field, which would error out with the same error message even if we hadn't done an out-of-bounds read. As objects read from the object database are always NUL terminated, this issue cannot be triggered in normal code and thus it's not security critical.

11d33df8

2018-11-18T23:39:43

Merge branch 'tiennou/fix/logallrefupdates-always'

e226ad8f

2018-11-17T17:55:10

refs: add support for core.logAllRefUpdates=always Since we were not expecting this config entry to contain a string, we would fail as soon as its (cached) value would be accessed. Hence, provide some constants for the 4 states we use, and account for "always" when we decide to reflog changes.

646a94be

2018-11-18T23:15:56

Merge pull request #4847 from noahp/noahp/null-arg-fixes tests: 🌀 address two null argument instances

5c213e29

2018-11-18T22:59:03

Merge pull request #4875 from tiennou/fix/openssl-errors Some OpenSSL issues

4ef2b889

2018-11-18T22:56:28

Merge pull request #4882 from kc8apf/include_port_in_host_header transport/http: Include non-default ports in Host header

7321cff0

2018-11-15T09:17:51

Merge pull request #4713 from libgit2/ethomson/win_symlinks Support symlinks on Windows when core.symlinks=true

c358bbc5

2018-11-12T17:22:47

index: introduce git_index_iterator Provide a public git_index_iterator API that is backed by an index snapshot. This allows consumers to provide a stable iteration even while manipulating the index during iteration.

4b84db6a

2018-11-14T12:33:38

patch_parse: remove unused function `parse_number` The function `parse_number` was replaced by `git_parse_advance_digit` which is provided by the parser interface in commit 252f2eeee (parse: implement and use `git_parse_advance_digit`, 2017-07-14). As there are no remaining callers, remove it.

4209a512

2018-11-14T12:04:42

strntol: fix out-of-bounds reads when parsing numbers with leading sign When parsing a number, we accept a leading plus or minus sign to return a positive or negative number. When the parsed string has such a leading sign, we set up a flag indicating that the number is negative and advance the pointer to the next character in that string. This misses updating the number of bytes in the string, though, which is why the parser may later on do an out-of-bounds read. Fix the issue by correctly updating both the pointer and the number of remaining bytes. Furthermore, we need to check whether we actually have any bytes left after having advanced the pointer, as otherwise the auto-detection of the base may do an out-of-bounds access. Add a test that detects the out-of-bound read. Note that this is not actually security critical. While there are a lot of places where the function is called, all of these places are guarded or irrelevant:
- commit list: this operates on objects from the ODB, which are always NUL terminated and may thus not trigger the off-by-one OOB read.
- config: the configuration is NUL terminated.
- curl stream: user input is being parsed that is always NUL terminated
- index: the index is read via `git_futils_readbuffer`, which always NUL terminates it.
- loose objects: used to parse the length from the object's header. As we check previously that the buffer contains a NUL byte, this is safe.
- rebase: this parses numbers from the rebase instruction sheet. As the rebase code uses `git_futils_readbuffer`, the buffer is always NUL terminated.
- revparse: this parses a user provided buffer that is NUL terminated.
- signature: this parses the header information of objects. As objects read from the ODB are always NUL terminated, this is a non-issue. The constructor `git_signature_from_buffer` does not accept a length parameter for the buffer, so the buffer needs to be NUL terminated, as well.
- smart transport: the buffer that is parsed is NUL terminated
- tree cache: this parses the tree cache from the index extension. The index itself is read via `git_futils_readbuffer`, which always NUL terminates it.
- winhttp transport: user input is being parsed that is always NUL terminated
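
The sign-handling fix described in this message can be illustrated with a small sketch. The helper `skip_sign` is a hypothetical name and not the code in libgit2; it only shows the pointer and the remaining length being advanced together, followed by the check that bytes remain before any base detection would look at the next character.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: consume an optional leading '+' or '-' from a
 * length-bounded buffer. The pointer and the remaining length are updated
 * together. Returns -1 if no bytes are left to parse afterwards.
 */
int skip_sign(const char **p, size_t *len, int *negative)
{
	assert(p && *p && len && negative);

	*negative = 0;
	if (*len > 0 && (**p == '-' || **p == '+')) {
		*negative = (**p == '-');
		(*p)++;
		(*len)--; /* the step the original code missed */
	}

	/* Nothing left means later code must not dereference *p. */
	return *len > 0 ? 0 : -1;
}
```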

cf83809b

2018-11-13T14:26:26

Merge pull request #4883 from pks-t/pks/signature-tz-oob signature: fix out-of-bounds read when parsing timezone offset

f127ce35

2018-11-13T08:22:25

tests: address two null argument instances Handle two null argument cases that occur in the unit tests. One is in library code, the other is in test code. Detected by running unit tests with undefined behavior sanitizer:

```bash
# build
mkdir build && cd build
cmake -DBUILD_CLAR=ON -DCMAKE_C_FLAGS="-fsanitize=address \
-fsanitize=undefined -fstack-usage -static-libasan" ..
cmake --build .

# run with asan
ASAN_OPTIONS="allocator_may_return_null=1" ./libgit2_clar
...
............../libgit2/src/apply.c:316:3: runtime error: null pointer \
passed as argument 1, which is declared to never be null
...................../libgit2/tests/apply/fromfile.c:46:3: runtime \
error: null pointer passed as argument 1, which is declared to never be null
```

20cb30b6

2018-11-13T13:40:17

Merge pull request #4667 from tiennou/feature/remote-create-api Remote creation API

28239be3

2018-11-13T13:27:41

Merge pull request #4818 from pks-t/pks/index-collision Index collision fixes

11fbead8

2018-11-11T16:40:56

Merge pull request #4705 from libgit2/ethomson/apply Patch (diff) application

83b35181

2018-10-19T10:54:38

transport/http: Include non-default ports in Host header When the port is omitted, the server assumes the default port for the service is used (see https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Host). In cases where the client provided a non-default port, it should be passed along. This hasn't been an issue so far as the git protocol doesn't include server-generated URIs. I encountered this when implementing Rust registry support for Sonatype Nexus. Rust's registry uses a git repository for the package index. Clients look at a file in the root of the package index to find the base URL for downloading the packages. Sonatype Nexus looks at the incoming HTTP request (Host header and URL) to determine the client-facing URL base as it may be running behind a load balancer or reverse proxy. This client-facing URL base is then used to construct the package download base URL. When libgit2 fetches the index from Nexus on a non-default port, Nexus trusts the incorrect Host header and generates an incorrect package download base URL.

58b60fcc

2018-11-08T09:31:28

netops: add method to return default http port for a connection Constant strings and logic for HTTP(S) default ports were starting to be spread throughout netops.c. Instead of duplicating this again to determine if a Host header should include the port, move the default port constants and logic into an internal method in netops.{c,h}.
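
The two commits above can be read together: a single helper decides the default port, and the Host header carries the port only when it differs. The sketch below assumes that reading. Both function names are hypothetical and are not the names used in netops.{c,h}.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: default port for plain or TLS HTTP. */
const char *default_http_port(int use_ssl)
{
	return use_ssl ? "443" : "80";
}

/*
 * Hypothetical helper: write a Host header line into `out`, appending the
 * port only when it is not the default for the scheme. Returns whatever
 * snprintf returns, so callers can detect truncation.
 */
int format_host_header(char *out, size_t out_len, const char *host,
                       const char *port, int use_ssl)
{
	assert(out && host && port);

	if (strcmp(port, default_http_port(use_ssl)) == 0)
		return snprintf(out, out_len, "Host: %s\r\n", host);

	return snprintf(out, out_len, "Host: %s:%s\r\n", host, port);
}
```

Keeping the default-port decision in one function avoids repeating the "80"/"443" constants at every call site, which is the duplication the netops commit set out to remove.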

52f859fd

2018-11-09T19:32:08

signature: fix out-of-bounds read when parsing timezone offset When parsing a signature's timezone offset, we first check whether there is a timezone at all by verifying that there are still bytes left to read following the time itself. The check thus looks like `time_end + 1 < buffer_end`, which is actually correct in this case. After setting the timezone's start pointer to that location, we compute the remaining bytes by using the formula `buffer_end - tz_start + 1`, re-using the previous `time_end + 1`. But this is in fact missing the braces around `(tz_start + 1)`, thus leading to an overestimation of the remaining bytes by a length of two. In case of a non-NUL terminated buffer, this will result in an overflow. The function `git_signature__parse` is only used in two locations. First is `git_signature_from_buffer`, which only accepts a string without a length. The string thus necessarily has to be NUL terminated and cannot trigger the issue. The other function is `git_commit__parse_raw`, which can in fact trigger the error as it may receive non-NUL terminated commit data. But as objects read from the ODB are always NUL-terminated by us as a cautionary measure, it cannot trigger the issue either. In other words, this error does not have any impact on security.
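
The off-by-two in this message comes down to operator precedence. The sketch below uses two hypothetical functions, written only for illustration, to compare the expression without parentheses against the parenthesized form, with the timezone assumed to start at `time_end + 1`.

```c
#include <assert.h>
#include <stddef.h>

/* Remaining bytes computed without parentheses: (buffer_end - time_end) + 1. */
ptrdiff_t remaining_unparenthesized(const char *buffer_end, const char *time_end)
{
	return buffer_end - time_end + 1;
}

/* Remaining bytes from the timezone start: buffer_end - (time_end + 1). */
ptrdiff_t remaining_parenthesized(const char *buffer_end, const char *time_end)
{
	assert(time_end < buffer_end);
	return buffer_end - (time_end + 1);
}
```

For any pair of pointers the first result is larger than the second by exactly two, which matches the overestimation the message describes.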

9ad96367

2018-11-07T15:31:21

smart transport: only clear url on hard reset After creating a transport for a server, we expect to be able to call `connect`, then invoke subsequent `action` calls. We provide the URL to these `action` calls, although our built-in transports happen to ignore it since they've already parsed it into an internal format that they intend to use (`gitno_connection_data`). In ca2eb4608243162a13c427e74526b6422d5a6659, we began clearing the URL field after a connection, meaning that subsequent calls to transport `action` callbacks would get a NULL URL, which went undetected since the builtin transports ignore the URL when they're already connected (instead of re-parsing it into an internal format). Downstream custom transport implementations (eg, LibGit2Sharp) did notice this change, however. Since `reset_stream` is called even when we're not closing the subtransport, update to only clear the URL when we're closing the subtransport. This ensures that `action` calls will get the correct URL information even after a connection.

f8b9493b

2018-11-05T15:46:08

apply: test re-adding a file after removing it Ensure that we can add a file back after it's been removed. Update the renamed/deleted validation in application to not apply to deltas that are adding files to support this.

78580ad3

2018-11-05T15:34:59

apply: test modifying a file after renaming it Ensure that we cannot modify a file after it's been renamed out of the way. If multiple deltas exist for a single path, ensure that we do not attempt to modify a file after it's been renamed out of the way. To support this, we must track the paths that have been removed or renamed; add to a string map when we remove a path and remove from the string map if we recreate a path. Validate that we are not applying to a path that is in this map, unless the delta is a rename, since git supports renaming one file to two different places in two different deltas. Further, test that we cannot apply a modification delta to a path that will be created in the future by a rename (a path that does not yet exist.)

af33210b

2018-07-10T16:10:03

apply: introduce a delta callback Introduce a callback to the application options that allow callers to add a per-delta callback. The callback can return an error code to stop patch application, or can return a value to skip the application of a particular delta.

df4258ad

2018-11-04T13:01:03

apply: handle multiple deltas to the same file git allows a patch file to contain multiple deltas to the same file: although it does not produce files in this format itself, this could be the result of concatenating two different patch files that affected the same file. git apply behaves by applying this next delta to the existing postimage of the file. We should do the same. If we have previously seen a file, and produced a postimage for it, we will load that postimage and apply the current delta to that. If we have not, get the file from the preimage.

6fecf4d1

2018-11-04T11:47:46

apply: handle exact renames Deltas containing exact renames are special; they simply indicate that a file was renamed without providing additional metadata (like the filemode). Teach the reader to provide the file mode and use the preimage's filemode in the case that the delta does not provide one.

12f9ac17

2018-11-04T11:26:42

apply: validate unchanged mode when applying both When applying to both the index and the working directory, ensure that the working directory's mode matches the index's mode. It's not sufficient to look only at the hashed object id to determine that the file is unchanged, git also takes the mode into account.

52e27b84

2018-10-10T12:42:54

reader: free is unused and unnecessary None of the reader implementations actually allocate anything themselves, so they don't need a free function. Remove it.

47cc5f85

2018-09-29T19:32:51

apply: introduce a hunk callback Introduce a callback to patch application that allows consumers to cancel hunk application.

37b25ac5

2018-07-08T16:12:58

apply: move location to an argument, not the opts Move the location option to an argument, out of the options structure. This allows the options structure to be re-used for functions that don't need to know the location, since it's implicit in their functionality. For example, `git_apply_tree` should not take a location, but is expected to take all the other options.

2d27ddc0

2018-07-01T21:35:51

apply: use an indexwriter Place the entire `git_apply` operation inside an indexwriter, so that we lock the index before we begin performing patch application. This ensures that there are no other processes modifying things in the working directory.

9be89bbd

2018-07-01T11:08:26

reader: apply working directory filters When reading a file from the working directory, ensure that we apply any necessary filters to the item. This ensures that we get the repository-normalized data as the preimage, and further ensures that we can accurately compare the working directory contents to the index contents for accurate safety validation in the `BOTH` case.

813f0802

2018-07-01T15:14:36

apply: validate workdir contents match index for BOTH When applying to both the index and the working directory, ensure that the index contents match the working directory. This mirrors the requirement in `git apply --index`. This also means that - along with the prior commit that uses the working directory contents as the checkout baseline - we no longer expect conflicts during checkout. So remove the special-case error handling for checkout conflicts. (Any checkout conflict now would be because the file was actually modified between the start of patch application and the checkout.)

0f4b2f02

2018-07-01T15:13:50

reader: optionally validate index matches workdir When using a workdir reader, optionally validate that the index contents match the working directory contents.

5b8d5a22

2018-07-01T13:42:53

apply: use preimage as the checkout baseline Use the preimage as the checkout's baseline. This allows us to support applying patches to files that are modified in the working directory (those that differ from the HEAD and index). Without this, files will be reported as (checkout) conflicts. With this, we expect the on-disk data when we began the patch application (the "preimage") to be on-disk during checkout. We could have also simply used the `FORCE` flag to checkout to accomplish a similar mechanism. However, `FORCE` ignores all differences, while providing a preimage ensures that we will only overwrite the file contents that we actually read. Modify the reader interface to provide the OID to support this.

dddfff77

2018-06-30T17:12:16

apply: convert checkout conflicts to apply failures When there's a checkout conflict during apply, that means that the working directory was modified in a conflicting manner and the postimage cannot be written. During application, convert this to an application failure for consistency across workdir/index/both applications.

5b66b667

2018-06-29T12:39:41

apply: when preimage file is missing, return EAPPLYFAIL The preimage file being missing entirely is simply a case of an application failure; return the correct error value for the caller.

e0224121

2018-06-29T12:09:02

apply: simplify checkout vs index application Separate the concerns of applying via checkout and updating the repository's index. This results in simpler functionality and allows us to not build the temporary collection of paths in the index case.

20f8a6db

2018-06-28T17:26:21

apply: remove deleted paths from index We update the index with the new_file side of the delta, but we need to explicitly remove the old_file path in the case where an item was deleted or renamed.

c3077ea0

2018-06-25T21:24:49

apply: return a specific exit code on failure Return `GIT_EAPPLYFAIL` on patch application failure so that users can determine that patch application failed due to a malformed/conflicting patch by looking at the error code.

d54aa9ae

2018-06-26T15:25:30

iterator: introduce `git_iterator_foreach` Introduce a `git_iterator_foreach` helper function which invokes a callback on all files for a given iterator.

9c34c996

2018-06-25T17:03:14

apply: handle file additions Don't attempt to read the postimage file during a file addition, simply use an empty buffer as the postimage. Also, test that we can handle file additions.

3b5378c5

2018-06-25T16:27:06

apply: handle file deletions If the file was deleted in the postimage, do not attempt to update the target. Instead, ignore it and simply allow it to stay removed in our computed postimage. Also, test that we can handle file deletions.

f83bbe0a

2018-03-19T19:50:45

apply: introduce `git_apply` Introduce `git_apply`, which will take a `git_diff` and apply it to the working directory (akin to `git apply`), the index (akin to `git apply --cached`), or both (akin to `git apply --index`).

664cda6f

2018-03-19T20:10:38

apply: reimplement `git_apply_tree` with readers The generic `git_reader` interface simplifies `git_apply_tree` somewhat. Reimplement `git_apply_tree` with them.

d73043a2

2018-03-19T20:10:31

reader: a generic way to read files from repos Similar to the `git_iterator` interface, the `git_reader` interface will allow us to read file contents from an arbitrary repository-backed data source (trees, index, or working directory).

02b1083a

2018-01-28T23:25:07

apply: introduce `git_apply_tree` Introduce `git_apply_tree`, which will apply a `git_diff` to a given `git_tree`, allowing an in-memory patch application for a repository.

2b12dcf6

2018-03-19T19:45:11

iterator: optionally hash filesystem iterators Optionally hash the contents of files encountered in the filesystem or working directory iterators. This is not expected to be used in production code paths, but may allow us to simplify some test contexts. For working directory iterators, apply filters as appropriate, since we have the context able to do it.

37acffac

2018-10-08T20:51:20

remote: remove static create_internal function

10cba764

2018-07-06T21:58:34

remote: lower the default vector size to 8 As it is, this is space for 32 refs pointers, which feels a little much. Lower it to 8, as it is the minimum vector size anyway.

d3650294

2018-06-20T02:27:14

remote: add a flag to prevent generation of the default fetchspec

fdb116b3

2018-06-20T02:27:12

remote: add a creation flag for ignoring url.insteadOf

3cbaebdf

2018-06-20T02:27:11

remote: provide a generic API for creating remotes This supersedes the functionality of remote_create_with_fetchspec, remote_create_anonymous and remote_create_detached.

43b4b2fa

2018-06-20T02:27:09

remote: merge if-statements We need a repo/config and a name to be able to do anything to the configuration. As such, those two tests can be merged so their conditions are shared.

b741bb89

2018-06-20T02:27:04

remote: add a helper for generating the default fetchspec

b2640c36

2018-06-20T02:27:03

config: add asserts for non-null parameters in lock/unlock

de2af3c2

2018-06-20T02:27:00

remote: move static method

7fafec0e

2018-10-29T18:32:39

tree: fix integer overflow when reading unreasonably large filemodes The `parse_mode` function uses an open-coded octal number parser. The parser is quite naive in that it simply parses until hitting a character that is not in the accepted range of '0' - '7', completely ignoring the fact that we can at most accept a 16 bit unsigned integer as filemode. If the filemode is bigger than UINT16_MAX, it will thus overflow and provide an invalid filemode for the object entry. Fix the issue by using `git__strntol32` instead and doing a bounds check. As this function already handles overflows, it neatly solves the problem. Note that previously, `parse_mode` was also skipping the character immediately after the filemode. In proper trees, this should be a simple space, but in fact the parser accepted any character and simply skipped over it. As a consequence of using `git__strntol32`, we now need to add an explicit check for a trailing whitespace after having parsed the filemode. Because of the newly introduced error message, the test object::tree::parse::mode_doesnt_cause_oob_read needs adjustment to its error message check, which in fact is a good thing as it demonstrates that we now fail looking for the whitespace immediately following the filemode. Add a test that shows that we will fail to parse such invalid filemodes now.
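The checks this message describes (a length bound, a 16-bit range check, and a required trailing space) can be illustrated with a minimal sketch. It is not libgit2's code: the actual fix calls `git__strntol32`, whereas this sketch open-codes the octal loop to stay self-contained. The name `parse_mode_sketch` and its signature are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical illustration, not libgit2's parse_mode: parse an octal
 * filemode from a length-bounded buffer. Returns 0 on success and -1 if
 * there are no digits, the value exceeds UINT16_MAX, or the digits are
 * not followed by a single space inside the buffer.
 */
static int parse_mode_sketch(uint16_t *out, const char *buf, size_t len,
                             const char **end)
{
    uint32_t mode = 0;
    size_t i = 0;

    /* Never look at buf[i] unless i is within the buffer. */
    while (i < len && buf[i] >= '0' && buf[i] <= '7') {
        mode = (mode << 3) | (uint32_t)(buf[i] - '0');
        if (mode > UINT16_MAX)
            return -1; /* would not fit into a 16 bit filemode */
        i++;
    }

    /* Require at least one digit and an explicit trailing space. */
    if (i == 0 || i == len || buf[i] != ' ')
        return -1;

    *out = (uint16_t)mode;
    *end = buf + i + 1; /* first byte after the space */
    return 0;
}
```

Because the accumulator is a 32-bit value that is rejected as soon as it exceeds `UINT16_MAX`, the shift itself can never overflow.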

f647bbc8

2018-10-29T17:25:09

tree: fix mode parsing reading out-of-bounds When parsing a tree entry's mode, we will eagerly parse until we hit a character that is not in the accepted set of octal digits '0' - '7'. If the provided buffer is not a NUL terminated one, we may thus read out-of-bounds. Fix the issue by passing the buffer length to `parse_mode` and paying attention to it. Note that this is not a vulnerability in our usual code paths, as all object data read from the ODB is NUL terminated.

50d09407

2018-10-29T18:05:27

strntol: fix detection and skipping of base prefixes The `git__strntol` family of functions has the ability to auto-detect a number's base if the string has either the common '0x' prefix for hexadecimal numbers or '0' prefix for octal numbers. The detection of such prefixes and following handling has two major issues though that are being fixed in one go now. - We do not do any bounds checking previous to verifying the '0x' base. While we do verify that there is at least one digit available previously, we fail to verify that there are two digits available and thus may do an out-of-bounds read when parsing this two-character-prefix. - When skipping the prefix of such numbers, we only update the pointer length without also updating the number of remaining bytes. Thus if we try to parse a number '0x1' of total length 3, we will first skip the first two bytes and then try to read 3 bytes starting at '1'. Fix both issues by disentangling the logic. Instead of doing the detection and skipping of such prefixes in one go, we will now first try to detect the base while also honoring how many bytes are left. Only if we have a valid base that is either 8 or 16 and have one of the known prefixes, we will now advance the pointer and update the remaining bytes in one step. Add some tests that verify that no out-of-bounds parsing happens and that autodetection works as advertised.

41863a00

2018-10-29T17:19:58

strntol: fix out-of-bounds read when skipping leading spaces The `git__strntol` family of functions accepts leading spaces and will simply skip them. The skipping will not honor the provided buffer's length, though, which may lead it to read outside of the provided buffer's bounds if it is not a simple NUL-terminated string. Furthermore, if leading space is trimmed, the function will further advance the pointer but not update the number of remaining bytes, which may also lead to out-of-bounds reads. Fix the issue by properly paying attention to the buffer length and updating it when stripping leading whitespace characters. Add a test that verifies that we won't read past the provided buffer length.
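The core of this fix is advancing the pointer and shrinking the remaining length together. A minimal sketch follows, assuming a hypothetical helper `skip_leading_space` that is not part of libgit2:

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, not libgit2's code: skip leading whitespace in a
 * length-bounded buffer. The pointer and the remaining length are always
 * updated in the same step, so later parsing cannot run past the end.
 */
static void skip_leading_space(const char **p, size_t *len)
{
    while (*len > 0 && isspace((unsigned char)**p)) {
        (*p)++;
        (*len)--;
    }
}
```

If the buffer consists only of whitespace, the loop stops with a remaining length of zero instead of reading the byte after the buffer.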

03994912

2018-08-29T01:57:24

openssl: only say we're connected if the connection succeeded ssl_close uses this boolean to know if SSL_shutdown should be called. It turns out OpenSSL auto-shutdowns on failure, so if the call to SSL_connect fails, it will complain about "shutdown while in init", trampling the original error.

caee0a66

2018-08-29T01:57:22

openssl: set the error class to GITERR_SSL

623647af

2018-10-26T12:33:59

Merge pull request #4864 from pks-t/pks/object-parse-fixes Object parse fixes

7655b2d8

2018-10-19T10:29:19

commit: fix reading out of bounds when parsing encoding The commit message encoding is currently being parsed by the `git__prefixcmp` function. As this function does not accept a buffer length, it will happily skip over a buffer's end if it is not `NUL` terminated. Fix the issue by using `git__prefixncmp` instead. Add a test that verifies that we are unable to parse the encoding field if it's cut off by the supplied buffer length.

ee11d47e

2018-10-19T09:47:50

tag: fix out of bounds read when searching for tag message When parsing tags, we skip all unknown fields that appear before the tag message. This skipping is done by using a plain `strstr(buffer, "\n\n")` to search for the two newlines that separate tag fields from tag message. As it is not possible to supply a buffer length to `strstr`, this call may skip over the buffer's end and thus result in an out of bounds read. As `strstr` may return a pointer that is out of bounds, the following computation of `buffer_end - buffer` will overflow and result in an allocation of an invalid length. Fix the issue by using `git__memmem` instead. Add a test that verifies parsing the tag fails not due to the allocation failure but due to the tag having no message.

83e8a6b3

2018-10-18T16:08:46

util: provide `git__memmem` function Unfortunately, neither the `memmem` nor the `strnstr` functions are part of any C standard but are merely extensions of C that are implemented by e.g. glibc. Thus, there is no standardized way to search for a string in a block of memory with a limited size, and using `strstr` is to be considered unsafe in case where the buffer has not been sanitized. In fact, there are some uses of `strstr` in exactly that unsafe way in our codebase. Provide a new function `git__memmem` that implements the `memmem` semantics. That is in a given haystack of `n` bytes, search for the occurrence of a byte sequence of `m` bytes and return a pointer to the first occurrence. The implementation chosen is the "Not So Naive" algorithm from [1]. It was chosen as the implementation is comparably simple while still being reasonably efficient in most cases. Preprocessing happens in constant time and space, searching has a time complexity of O(n*m) with a slightly sub-linear average case. [1]: http://www-igm.univ-mlv.fr/~lecroq/string/
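The `memmem` semantics described above can be sketched with a plain naive search. This is deliberately not the "Not So Naive" algorithm the commit chose, and `bounded_memmem` is a hypothetical name. Returning NULL for an empty needle is an assumption of this sketch, not a documented property of `git__memmem`.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical illustration of memmem semantics using a naive search;
 * libgit2's git__memmem uses a different algorithm. Returns a pointer to
 * the first occurrence of needle within the first haystack_len bytes of
 * haystack, or NULL if there is none (or if needle is empty).
 */
static const void *bounded_memmem(const void *haystack, size_t haystack_len,
                                  const void *needle, size_t needle_len)
{
    const unsigned char *h = haystack;
    size_t i;

    if (needle_len == 0 || needle_len > haystack_len)
        return NULL;

    /* i + needle_len <= haystack_len keeps every memcmp in bounds. */
    for (i = 0; i + needle_len <= haystack_len; i++) {
        if (memcmp(h + i, needle, needle_len) == 0)
            return h + i;
    }

    return NULL;
}
```

Unlike `strstr`, the search never depends on a terminating `NUL` byte, so it can be used on raw object data with an explicit length.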

bea65980

2018-10-25T11:21:14

Merge pull request #4851 from pks-t/pks/strtol-removal strtol removal

305e801a

2018-10-21T09:52:32

util: allow callers to reset custom allocators Provide a utility to reset custom allocators back to their default. This is particularly useful for testing.

7c791f3d

2018-10-20T20:25:51

Merge pull request #4852 from libgit2/ethomson/unc_paths Win32 path canonicalization refactoring

6cc14ae3

2018-10-20T20:22:04

Merge pull request #4840 from libgit2/cmn/validity-tree-from-unowned-index Check object existence when creating a tree from an index

a2f9f94b

2018-10-20T20:18:04

Merge branch 'issue-4203'

32b81661

2018-10-20T20:16:32

merge: don't leak the index during reloads

820e1e93

2018-10-20T02:17:22

repository: load_config for non-repo configs Teach `load_config` how to load all the configurations except (optionally) the repository configuration. This allows the new repository codepath to load the global/xdg/system configuration paths so that they can be inspected during repository initialization.

b433a22a

2018-10-19T03:14:53

win32: emulate Git for Windows in symlink support Emulate the Git for Windows `core.symlinks` support. Since symbolic links are generally enabled for Administrator (and _may_ be enabled due to enabling Developer mode) but symbolic links are still sufficiently uncommon on Windows that Git users are expected to explicitly opt-in to symbolic links by enabling `core.symlinks=true` in a global (or xdg or system) configuration. When `core.symlinks=true` is set globally _and_ symbolic links support is detected then new repositories created will not have a `core.symlinks` set. If `core.symlinks` is _not_ set then no detection will be performed, and `core.symlinks=false` will be set in the repository configuration.

204cce66

2018-07-03T02:30:34

win32: add symbolic link support Enable `p_symlink` to actually create symbolic links, not just create a fake link (a text file containing the link target). This now means that `core.symlinks=true` works on Windows platforms where symbolic links are enabled (likely due to running in Developer Mode).

30771261

2018-07-03T02:21:17

win32: use GetFinalPathNameByHandle directly Now that we've updated to WIN32_WINNT version of Vista or better, we don't need to dynamically load GetFinalPathNameByHandle and can simply invoke it directly.

b8bdffb5

2018-07-02T07:27:09

cmake: increase WIN32_WINNT to Vista Increase the WIN32_WINNT level to 0x0600, which enables support for new APIs from Windows 6.0 (Vista). We had previously set this to 0x0501, which was Windows XP. Although we removed XP support many years ago, there was no need to update this level previously. We're doing so now explicitly so that we can get support for the `CreateSymbolicLink` API.

0a4284b1

2018-10-19T14:54:13

Merge pull request #4819 from libgit2/cmn/config-nonewline Configuration variables can appear on the same line as the section header

8b6e2895

2018-09-21T15:18:03

index: fix adding index entries with conflicting files When adding an index entry "a/b/c" while an index entry "a/b" already exists, git will happily remove "a/b" and only add the new index entry: $ git init test Initialized empty Git repository in /tmp/test.repo/test/.git/ $ touch x $ git add x $ rm x $ mkdir x $ touch x/y $ git add x/y $ git status A x/y The other way round, adding an index entry "a/b" with an entry "a/b/c" already existing is equivalent, where git will remove "a/b/c" and add "a/b". In contrast, libgit2 will currently fail to add these properly and instead complain about the entry appearing as both a file and a directory. This is a programming error, though: our current code already tries to detect and, in the case of `git_index_add`, to automatically replace such index entries. Funnily enough, we already remove the conflicting index entries, but instead of adding the new entry we then bail out afterwards. This leaves callers with the worst of both worlds: we both remove the old entry but fail to add the new one. The root cause is the weird semantics of the `has_file_name` and `has_dir_name` functions. While these functions only sound like they are responsible for detecting such conflicts, they will also already remove them in case their `ok_to_replace` parameter is set. But even if we tell them to replace such entries, they will return an error code. Fix the error by returning success in case the entries have been replaced. Fix an already existing test which tested for wrong behaviour. Note that the test didn't notice that the resulting tree had no entries. Thus it is fine to change existing behaviour here, as the previous result could've led to silently losing data. Also add a new test that verifies behaviour in the reverse conflicting case.

923317db

2018-09-21T12:57:02

index: modernize error handling of `index_insert` The error handling of the function `index_insert` is currently very fragile. Instead of erroring out in case an error has happened, it will instead verify that no error has happened for each statement. This makes adding new code to that function an adventurous task. Improve the situation by converting the function to use our typical `goto out` pattern.

ea19efc1

2018-10-18T15:08:56

util: fix out of bounds read in error message When an integer that is parsed with `git__strntol32` is too big to fit into an int32, we will generate an error message that includes the actual string that failed to parse. This does not acknowledge the fact that the string may either not be NUL terminated or alternatively include additional characters after the number that is to be parsed. We may thus end up printing characters into the buffer that aren't the number or, worse, read out of bounds. Fix the issue by utilizing the `endptr` that was set by `git__strntol64`. This pointer is guaranteed to be set to the first character following the number, and we can thus use it to compute the width of the number that shall be printed. Create a test to verify that we correctly truncate the number.

a34f5b0d

2018-10-18T08:57:27

win32: refactor `git_win32_path_remove_namespace` Update `git_win32_path_remove_namespace` to disambiguate the prefix being removed versus the prefix being added. Now we remove the "namespace", and (may) add a "prefix" in its place. Eg, we remove the `\\?\` namespace. We remove the `\\?\UNC\` namespace, and replace it with the `\\` prefix. This aids readability somewhat. Additionally, use pointer arithmetic instead of offsets, which seems to also help readability.

b2e85f98

2018-10-17T08:48:43

win32: rename `git_win32__canonicalize_path` The internal API `git_win32__canonicalize_path` is far, far too easily confused with the internal API `git_win32_path_canonicalize`. The former removes the namespace prefix from a path (eg, given `\\?\C:\Temp\foo`, it returns `C:\Temp\foo`, and given `\\?\UNC\server\share`, it returns `\\server\share`). As such, rename it to `git_win32_path_remove_namespace`. `git_win32_path_canonicalize` remains unchanged.

b09c1c7b

2018-10-18T14:37:55

util: avoid signed integer overflows in `git__strntol64` While `git__strntol64` tries to detect integer overflows when doing the necessary arithmetics to come up with the final result, it does the detection only after the fact. This check thus relies on undefined behavior of signed integer overflows. Fix this by instead checking up-front whether the multiplications or additions will overflow. Note that a detected overflow will not cause us to abort parsing the current sequence of digits. In the case of an overflow, previous behavior was to still set up the end pointer correctly to point to the first character immediately after the currently parsed number. We do not want to change this now as code may rely on the end pointer being set up correctly even if the parsed number is too big to be represented as 64 bit integer.
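Checking up-front means testing whether `acc * base + digit` would exceed the maximum before computing it. A minimal sketch for the non-negative case follows, assuming a hypothetical helper `accumulate_digit` that is not libgit2's code:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not libgit2's code: append one digit to a
 * non-negative accumulator. Assumes *acc >= 0, base >= 2 and
 * 0 <= digit < base. Returns 0 on success, or -1 without touching *acc
 * if the result would not fit into an int64_t.
 */
static int accumulate_digit(int64_t *acc, int base, int digit)
{
    /*
     * acc * base + digit <= INT64_MAX holds exactly when
     * acc <= (INT64_MAX - digit) / base, and the right-hand side
     * cannot itself overflow.
     */
    if (*acc > (INT64_MAX - digit) / base)
        return -1;

    *acc = *acc * base + digit;
    return 0;
}
```

A caller that wants the behavior described in the message would record the overflow and keep consuming digits, so that the end pointer still lands on the first character after the number.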

8d7fa88a

2018-10-18T12:04:07

util: remove `git__strtol32` The function `git__strtol32` can easily be misused when untrusted data is passed to it that may not have been sanitized with trailing `NUL` bytes. As all usages of this function have now been removed, we can remove this function altogether to avoid future misuse of it.

2613fbb2

2018-10-18T11:58:14

global: replace remaining use of `git__strtol32` Replace remaining uses of the `git__strtol32` function. While these uses are all safe as the strings were either sanitized or from a trusted source, we want to remove `git__strtol32` altogether to avoid future misuse.

21652ee9

2018-10-18T11:43:30

tree-cache: avoid out-of-bound reads when parsing trees We use the `git__strtol32` function to parse the child and entry count of treecaches from the index, which does not accept a buffer length. As the buffer that is being passed in is untrusted data and may thus be malformed and may not contain a terminating `NUL` byte, we can overrun the buffer and thus perform an out-of-bounds read. Fix the issue by using `git__strntol32` instead.

68deb2cc

2018-10-18T11:37:10

util: remove unsafe `git__strtol64` function The function `git__strtol64` does not take a maximum buffer length as parameter. This has led to some unsafe usages of this function, and as such we may consider it as being unsafe to use. As we have now eradicated all usages of this function, let's remove it completely to avoid future misuse.

1a2efd10

2018-10-18T11:35:08

config: remove last instance of `git__strtol64` When parsing integers from configuration values, we use `git__strtol64`. This is fine to do, as we always sanitize values and can thus be sure that they'll have a terminating `NUL` byte. But as this is the last call-site of `git__strtol64`, let's just pass in the length explicitly by calling `strlen` on the value to be able to remove `git__strtol64` altogether.

3db9aa6f

2018-10-18T11:32:48

signature: avoid out-of-bounds reads when parsing signature dates We use `git__strtol64` and `git__strtol32` to parse the trailing commit or author date and timezone of signatures. As signatures are usually part of a commit or tag object and thus essentially untrusted data, the buffer may be misformatted and may not be `NUL` terminated. This may lead to an out-of-bounds read. Fix the issue by using `git__strntol64` and `git__strntol32` instead.

600ceadd

2018-10-18T11:29:06

index: avoid out-of-bounds read when reading reuc entry stage We use `git__strtol64` to parse file modes of the index entries, which does not limit the parsed buffer length. As the index can be essentially treated as "untrusted" in that the data stems from the file system, it may be misformatted and may not contain terminating `NUL` bytes. This may lead to out-of-bounds reads when trying to parse index entries with such malformatted modes. Fix the issue by using `git__strntol64` instead.

1a3fa1f5

2018-10-18T11:25:59

commit_list: avoid use of strtol64 without length limit When quick-parsing a commit, we use `git__strtol64` to parse the commit's time. The buffer that's passed to `commit_quick_parse` is the raw data of an ODB object, though, whose data may not be properly formatted and also does not have to be `NUL` terminated. This may lead to out-of-bound reads. Use `git__strntol64` to avoid this problem.
