Log

Author Commit Date Message
Carlos Martín Nieto 98e8b11c 2018-05-23T08:40:17 path: check for a symlinked .gitmodules in fs-agnostic code We still compare case-insensitively to protect more thoroughly, as we don't know what specifics we'll see on the system, and it matches git's behaviour.
Carlos Martín Nieto 17df18ae 2018-05-22T15:48:38 index: stat before creating the entry This is so we have it available for the path validity checking. In a later commit we will start rejecting `.gitmodules` files as symlinks.
Carlos Martín Nieto dc5591b4 2018-05-18T15:16:53 path: hide the dotgit file functions These can't go into the public API yet as we don't want to introduce API or ABI changes in a security release.
Carlos Martín Nieto f98d140b 2018-05-16T15:56:04 path: add functions to detect .gitconfig and .gitattributes
Carlos Martín Nieto 644c973f 2018-05-22T15:21:08 path: accept the name length as a parameter We may take in names from the middle of a string so we want the caller to let us know how long the path component is that we should be checking.
Carlos Martín Nieto 4656e9c4 2018-05-16T15:42:08 path: add a function to detect a .gitmodules file Given a path component, it knows what to pass to the filesystem-specific functions, so we're protected even from trees which try to use the 8.3 naming rules to get around us matching on the filename exactly. The logic and test strings come from the equivalent git change.
Carlos Martín Nieto 9893f56b 2018-05-16T14:47:04 path: provide a generic function for checking dotgit files on NTFS It checks against the 8.3 shortname variants, including the one which includes the checksum as part of its name.
Carlos Martín Nieto 5b855194 2018-05-22T16:13:47 path: reject .gitmodules as a symlink Any part of the library which asks the question can pass in the mode to have it checked against `.gitmodules` being a symlink. This is particularly relevant for adding entries to the index from the worktree and for checking out files.
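A minimal sketch of the check this introduces, assuming a hypothetical helper `is_dotgit_modules` (the real filesystem-specific matchers live elsewhere in libgit2):

    #include <sys/stat.h>
    #include <stdbool.h>
    #include <stddef.h>

    /* Reject an entry that names ".gitmodules" (or a filesystem-specific
     * variant of it) when its mode says it is a symbolic link. */
    static bool entry_mode_is_valid(const char *name, size_t name_len,
                                    unsigned int mode,
                                    bool (*is_dotgit_modules)(const char *, size_t))
    {
        if (S_ISLNK(mode) && is_dotgit_modules(name, name_len))
            return false;
        return true;
    }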
Carlos Martín Nieto f43ade0e 2018-05-16T11:56:04 path: provide a generic dotgit checking function for HFS This lets us check for other kinds of reserved files.
Carlos Martín Nieto 4a1753c2 2018-05-14T16:03:15 submodule: also validate Windows-separated paths for validity Otherwise we would also admit `..\..\foo\bar` as a valid path and fail to protect Windows users. Ideally we would check for both separators without the need for the copied string, but this'll get us over the RCE.
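The commit notes that ideally both separators would be checked without copying the string; a hypothetical single-pass sketch of that idea (not the code this commit adds):

    #include <stdbool.h>
    #include <stddef.h>

    static bool component_is_dotdot(const char *p, size_t len)
    {
        return len == 2 && p[0] == '.' && p[1] == '.';
    }

    /* Walk path components and reject ".." regardless of whether the
     * separator is '/' or '\'. */
    static bool submodule_path_is_valid(const char *path)
    {
        const char *start = path, *p = path;

        for (;; p++) {
            if (*p == '/' || *p == '\\' || *p == '\0') {
                if (component_is_dotdot(start, (size_t)(p - start)))
                    return false;
                if (*p == '\0')
                    break;
                start = p + 1;
            }
        }
        return true;
    }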
Carlos Martín Nieto 3adb0dc3 2018-05-22T13:58:24 path: expose dotgit detection functions per filesystem These will be used by the checkout code to detect them for the particular filesystem they're on.
Carlos Martín Nieto f77e40a1 2018-04-30T13:47:15 submodule: ignore submodules which include path traversal in their name If we decide that the "name" of the submodule (i.e. its path inside `.git/modules/`) is trying to escape that directory or otherwise trick us, we ignore the configuration for that submodule. This leaves us with a half-configured submodule when looking it up by path, but it's the same result as if the configuration really were missing. The name check is potentially more strict than it needs to be, but it lets us re-use the check we're doing for the checkout. The function that encapsulates this logic is ready to be exported but we don't want to do that in a security release so it remains internal for now.
Carlos Martín Nieto e29ab6fe 2017-09-27T15:17:26 proxy: add a free function for the options' pointers When we duplicate a user-provided options struct, we're stuck with freeing the url in it. In case we add stuff to the proxy struct, let's add a function in which to put the logic.
Slava Karpenko c3fbf905 2017-09-11T21:34:41 Clear the remote_ref_name buffer in git_push_update_tips() If fetch_spec was a non-pattern, and it is not the first iteration over the push_status vector, then git_refspec_transform would result in the new value appended via git_buf_puts to the previous iteration's value. Forcibly clear the buffer on each iteration to prevent this behavior.
Tyrie Vella b3c0d43c 2018-01-22T14:44:31 merge: virtual commit should be last argument to merge-base Our virtual commit must be the last argument to merge-base: since our algorithm pushes _both_ parents of the virtual commit, it needs to come last, because merge-base documents: > Given three commits A, B and C, git merge-base A B C will compute the > merge base between A and a hypothetical commit M We want to calculate the merge base between the actual commit ("two") and the virtual commit ("one") - since one actually pushes its parents to the merge-base calculation, we need to calculate the merge base of "two" and the parents of one.
Jeff King 3ca2bb39 2017-08-09T16:34:02 sha1_position: convert do-while to while If we enter the sha1_position() function with "lo == hi", we have no elements. But the do-while loop means that we'll enter the loop body once anyway, picking "mi" at that same value and comparing nonsense to our desired key. This is unlikely to match in practice, but we still shouldn't be looking at the memory in the first place. This bug is inherited from git.git; it was fixed there in e01580cfe01526ec2c4eb4899f776a82ade7e0e1.
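The shape of the fix, sketched as a self-contained binary search over sorted 20-byte SHA-1 entries (names and return convention are illustrative):

    #include <string.h>
    #include <stddef.h>

    /* Return the index of `key` among `nr` sorted 20-byte entries, or -1.
     * Using `while` instead of `do ... while` means an empty range
     * (lo == hi) never looks at table memory at all. */
    static int sha1_position(const unsigned char *table, size_t nr,
                             const unsigned char *key)
    {
        size_t lo = 0, hi = nr;

        while (lo < hi) {
            size_t mi = lo + (hi - lo) / 2;
            int cmp = memcmp(table + mi * 20, key, 20);

            if (cmp == 0)
                return (int)mi;
            if (cmp < 0)
                lo = mi + 1;
            else
                hi = mi;
        }
        return -1;
    }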
Carlos Martín Nieto 21f77af9 2017-07-12T07:40:16 signature: don't leave a dangling pointer to the strings on parse failure If the signature is invalid but we detect that after allocating the strings, we free them. We however leave that pointer dangling in the structure the caller gave us, which can lead to double-free. Set these pointers to `NULL` after freeing their memory to avoid this.
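The pattern being fixed, in minimal form (struct and field names are illustrative):

    #include <stdlib.h>

    struct signature { char *name; char *email; };

    /* On parse failure, free what was allocated and also reset the caller's
     * struct so a later free of the same struct cannot double-free. */
    static int signature_parse_fail(struct signature *sig)
    {
        free(sig->name);
        free(sig->email);
        sig->name = NULL;
        sig->email = NULL;
        return -1;
    }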
Patrick Steinhardt 4296a36b 2017-07-10T09:36:19 ignore: honor case insensitivity for negative ignores When computing negative ignores, we throw away any rule which does not undo a previous rule to optimize. But on case insensitive file systems, we need to keep in mind that a negative ignore can also undo a previous rule with different case, which we did not yet honor while determining whether a rule undoes a previous one. So in the following example, we fail to unignore the "/Case" directory: /case !/Case Make both paths checking whether a plain- or wildcard-based rule undo a previous rule aware of case-insensitivity. This fixes the described issue.
Patrick Steinhardt 5c15cd94 2017-07-07T13:27:27 ignore: keep negative rules containing wildcards Ignore rules allow for reverting a previously ignored rule by prefixing it with an exclamation mark. As such, a negative rule can only override previously ignored files. While computing all ignore patterns, we try to use this fact to optimize away some negative rules which do not override any previous patterns, as they won't change the outcome anyway. In some cases, though, this optimization causes us to get the actual ignores wrong for some files. This may happen whenever the pattern contains a wildcard, as we are unable to reason about whether a pattern overrides a previous pattern in a sane way. This happens for example in the case where a gitignore file contains "*.c" and "!src/*.c", where we wouldn't un-ignore files inside of the "src/" subdirectory. In this case, the first solution coming to mind may be to just strip the "src/" prefix and simply compare the basenames. While that would work here, it would stop working as soon as the basename pattern itself is different, like for example with "*x.c" and "!src/*.c". As such, we settle for the easier fix of just not optimizing away rules that contain a wildcard.
Patrick Steinhardt 8d86cdd4 2017-07-07T12:27:43 ignore: return early to avoid useless indentation
Edward Thomson b2b37077 2018-01-21T18:05:45 merge: reverse merge bases for recursive merge When the commits being merged have multiple merge bases, reverse the order when creating the virtual merge base. This is for compatibility with git's merge-recursive algorithm, and ensures that we build identical trees. Git does this to try to use older merge bases first. Per 8918b0c: > It seems to be the only sane way to do it: when a two-head merge is > done, and the merge-base and one of the two branches agree, the > merge assumes that the other branch has something new. > > If we start creating virtual commits from newer merge-bases, and go > back to older merge-bases, and then merge with newer commits again, > chances are that a patch is lost, _because_ the merge-base and the > head agree on it. Unlikely, yes, but it happened to me.
Edward Thomson 457a81bb 2018-01-21T18:01:20 oidarray: introduce git_oidarray__reverse Provide a simple function to reverse an oidarray.
Adrián Medraño Calvo 5e97bdaf 2018-01-22T11:55:28 odb: export mempack backend Fixes #4492, #4496.
lhchavez e83efde4 2017-12-23T14:59:07 Fix unpack double free If an element has been cached, but then the call to packfile_unpack_compressed() fails, the very next thing that happens is that its data is freed and then the element is not removed from the cache, which frees the data again. This change sets obj->data to NULL to avoid the double-free. It also stops trying to resolve deltas after two consecutive failed rounds of resolution, and adds a test for this.
Patrick Steinhardt a521f5b1 2017-12-15T10:47:01 diff_file: properly refcount blobs when initializing file contents When initializing a `git_diff_file_content` from a source whose data is derived from a blob, we simply assign the blob's pointer to the resulting struct without incrementing its refcount. Thus, the structure can only be used as long as the blob is kept alive by the caller. Fix the issue by using `git_blob_dup` instead of a direct assignment. This function will increment the refcount of the blob without allocating new memory, so it does exactly what we want. As `git_diff_file_content__unload` already frees the blob when `GIT_DIFF_FLAG__FREE_BLOB` is set, we don't need to add new code handling the free but only have to set that flag correctly.
Patrick Steinhardt b35c3098 2018-02-28T12:06:59 curl: explicitly initialize and cleanup global curl state Our curl-based streams make use of the easy curl interface. This interface automatically initializes and de-initializes the global curl state by calling out to `curl_global_init` and `curl_global_cleanup`. Thus, all global state will be repeatedly re-initialized when creating multiple curl streams in succession. Besides being inefficient, this is not thread-safe due to `curl_global_init` being not thread-safe itself. Thus a multi-threaded program handling multiple curl streams at the same time is inherently racy. Fix the issue by globally initializing and cleaning up curl's state.
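A sketch of the intended lifecycle using libcurl's documented global functions; where libgit2 hooks these into its own init/shutdown path is simplified here:

    #include <curl/curl.h>

    /* Call once, before any thread creates a curl-based stream. */
    int transport_global_init(void)
    {
        return curl_global_init(CURL_GLOBAL_DEFAULT) == CURLE_OK ? 0 : -1;
    }

    /* Call once at shutdown, after all curl streams are gone. */
    void transport_global_shutdown(void)
    {
        curl_global_cleanup();
    }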
Etienne Samson 34f1ded9 2017-12-13T00:19:41 stransport: provide error message on trust failures Fixes #4440
Patrick Steinhardt 7cc80546 2018-01-03T12:54:42 streams: openssl: fix thread-safety for OpenSSL error messages The function `ERR_error_string` can be invoked without providing a buffer, in which case OpenSSL will simply return a string printed into a static buffer. Obviously and as documented in ERR_error_string(3), this is not thread-safe at all. As libgit2 is a library, though, it is easily possible that other threads may be using OpenSSL at the same time, which might lead to clobbered error strings. Fix the issue by instead using a stack-allocated buffer. According to the documentation, the caller has to provide a buffer of at least 256 bytes of size. While we do so, make sure that the buffer will never get overflown by switching to `ERR_error_string_n` to specify the buffer's size.
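The gist of the change, using OpenSSL's documented `ERR_error_string_n`; the surrounding reporting call is illustrative:

    #include <openssl/err.h>
    #include <stdio.h>

    static void report_last_openssl_error(void)
    {
        char errbuf[256];   /* ERR_error_string(3) requires >= 256 bytes */
        unsigned long e = ERR_get_error();

        /* Thread-safe: writes into our stack buffer and never overflows it. */
        ERR_error_string_n(e, errbuf, sizeof(errbuf));
        fprintf(stderr, "OpenSSL error: %s\n", errbuf);
    }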
Patrick Steinhardt 7ad0cee6 2017-12-08T10:10:19 hash: openssl: check return values of SHA1_* functions The OpenSSL functions `SHA1_Init`, `SHA1_Update` and `SHA1_Final` all return 1 for success and 0 otherwise, but we never check their return values. Do so.
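A short sketch of checking those return values (OpenSSL documents 1 for success and 0 for failure):

    #include <openssl/sha.h>
    #include <stddef.h>

    static int sha1_digest(unsigned char out[SHA_DIGEST_LENGTH],
                           const void *data, size_t len)
    {
        SHA_CTX ctx;

        if (SHA1_Init(&ctx) != 1 ||
            SHA1_Update(&ctx, data, len) != 1 ||
            SHA1_Final(out, &ctx) != 1)
            return -1;   /* report failure instead of silently continuing */
        return 0;
    }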
lhchavez 8f189cbf 2017-12-15T15:01:50 Simplified overflow condition
Edward Thomson c24b15c3 2018-02-28T12:20:23 win32: strncmp -> git__strncmp The win32 C library is compiled cdecl, however when configured with `STDCALL=ON`, our functions (and function pointers) will use the stdcall calling convention. You cannot set a `__stdcall` function pointer to a `__cdecl` function, so it's easier to just use our `git__strncmp` instead of sorting that mess out.
lhchavez feb00daf 2017-12-09T05:26:27 Using unsigned instead
lhchavez a42e11ae 2017-12-08T06:00:27 libFuzzer: Prevent a potential shift overflow The type of |base_offset| in get_delta_base() is `git_off_t`, which is a signed `long`. That means that we need to make sure that the 8 most significant bits are zero (instead of 7) to avoid an overflow when it is shifted by 7 bits. Found using libFuzzer.
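The arithmetic in isolation: before `base_offset` is shifted left by 7 bits, its top 8 bits must be clear, or a signed 64-bit offset overflows. A simplified check (names follow the description, not necessarily the code):

    #include <stdint.h>

    /* base_offset is signed (git_off_t); make sure shifting left by 7 cannot
     * overflow by requiring the 8 most significant bits to be zero first. */
    static int can_shift_by_7(int64_t base_offset)
    {
        return (base_offset & ~(INT64_MAX >> 7)) == 0;
    }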
Edward Thomson 9ab8d153 2018-02-25T15:46:51 winhttp: enable TLS 1.2 on Windows 7 and earlier Versions of Windows prior to Windows 8 do not enable TLS 1.2 by default, though support may exist. Try to enable TLS 1.2 support explicitly on connections. This request may fail if the operating system does not have TLS 1.2 support - the initial release of Vista lacks TLS 1.2 support (though it is available as a software update) and XP completely lacks TLS 1.2 support. If this request does fail, the HTTP context is still valid, and still maintains the original protocol support. So we ignore the failure from this operation.
lhchavez a3cd5e94 2017-12-06T03:03:18 libFuzzer: Fix missing trailer crash This change fixes an invalid memory access when the trailer is missing / corrupt. Found using libFuzzer.
lhchavez 5cc3971a 2017-12-06T03:22:58 libFuzzer: Fix a git_packfile_stream leak This change ensures that the git_packfile_stream object in git_indexer_append() does not leak when the stream has errors. Found using libFuzzer.
Edward Thomson aa0127c0 2018-02-27T11:24:30 winhttp: include constants for TLS 1.1/1.2 support For platforms that do not define `WINHTTP_FLAG_SECURE_PROTOCOL_TLS1_1` and/or `WINHTTP_FLAG_SECURE_PROTOCOL_TLS1_2`.
Patrick Steinhardt 049e1de5 2017-11-30T18:10:28 openssl: fix thread-safety on non-glibc POSIX systems While the OpenSSL library provides all means to work safely in a multi-threaded application, we fail to do so correctly. Quoting from crypto_lock(3): OpenSSL can safely be used in multi-threaded applications provided that at least two callback functions are set, locking_function and threadid_func. We do in fact provide the means to set up the locking function via `git_openssl_set_locking()`, where we initialize a set of locks by using the POSIX threads API and set the correct callback function to lock and unlock them. But what we do not do is setting the `threadid_func` callback. This function is being used to correctly locate thread-local data of the OpenSSL library and should thus return per-thread identifiers. Digging deeper into OpenSSL's documentation, the library does provide a fallback in case that locking function is not provided by the user. On Windows and BeOS we should be safe, as it simply "uses the system's default thread identifying API". On other platforms though OpenSSL will fall back to using the address of `errno`, assuming it is thread-local. While this assumption holds true for glibc-based systems, POSIX in fact does not specify whether it is thread-local or not. Quoting from errno(3p): It is unspecified whether errno is a macro or an identifier declared with external linkage. And in fact, with musl there is at least one libc implementation which simply declares `errno` as a simple `int` without being thread-local. On those systems, the fallback threadid function of OpenSSL will not be thread-safe. Fix this by setting up our own callback for this setting. As users of libgit2 may want to set it themselves, we obviously cannot always set that function on initialization. But as we already set up primitives for threading in `git_openssl_set_locking()`, this function becomes the obvious choice where to implement the additional setup.
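A sketch of the additional setup using OpenSSL 1.0's documented `CRYPTO_THREADID` callbacks and POSIX threads; the exact integration into `git_openssl_set_locking()` is simplified, and the `pthread_self()` cast is only valid where `pthread_t` is an arithmetic type:

    #include <openssl/crypto.h>
    #include <pthread.h>

    /* Give OpenSSL a per-thread identifier that does not rely on &errno
     * being thread-local. */
    static void openssl_threadid(CRYPTO_THREADID *id)
    {
        CRYPTO_THREADID_set_numeric(id, (unsigned long)pthread_self());
    }

    static void setup_openssl_threading(void)
    {
        CRYPTO_THREADID_set_callback(openssl_threadid);
        /* ... locking callback setup as before ... */
    }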
Patrick Steinhardt 3c4e0cee 2017-11-30T15:12:48 diff_generate: fix unsetting diff flags The macro `DIFF_FLAG_SET` can be used to set or unset a flag by modifying the diff's bitmask. While the case of setting the flag is handled correctly, the case of unsetting the flag was not. Instead of inverting the flags, we are inverting the value which is used to decide whether we want to set or unset the bits. The value being used here is a simple `bool` which is `false`. As that is being uplifted to `int` when getting the bitwise-complement, we will end up retaining all bits inside of the bitmask. As that's only ever used to set `GIT_DIFF_IGNORE_CASE`, we were actually always ignoring case for generated diffs. Fix that by instead getting the bitwise-complement of `FLAG`, not `VAL`.
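The macro bug in miniature: complementing the boolean `VAL` instead of the flag mask leaves every bit set. A corrected shape (not the exact libgit2 definition):

    /* Set or clear FLAG in the diff's bitmask depending on the boolean VAL. */
    #define DIFF_FLAG_SET(DIFF, FLAG, VAL) \
        ((DIFF)->opts.flags = (VAL) ? \
            ((DIFF)->opts.flags | (FLAG)) : \
            ((DIFF)->opts.flags & ~(FLAG)))   /* complement FLAG, not VAL */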
Edward Thomson 9bdc00b1 2018-02-27T10:32:29 mingw: update TLS option flags Include the constants for `WINHTTP_FLAG_SECURE_PROTOCOL_TLS1_1` and `WINHTTP_FLAG_SECURE_PROTOCOL_TLS1_2` so that they can be used by mingw. This updates both the `deps/winhttp` framework (for classic mingw) and adds the defines for mingw64, which does not use that framework.
Patrick Steinhardt 05a753d4 2017-11-30T15:09:05 diff: remove unused macros `DIFF_FLAG_*` In commit 9be638ecf (git_diff_generated: abstract generated diffs, 2016-04-19), the code for generated diffs was moved out of the generic "diff.c" and instead into its own module. During that conversion, it was forgotten to remove the macros `DIFF_FLAG_IS_SET`, `DIFF_FLAG_ISNT_SET` and `DIFF_FLAG_SET`, which are now only used in "diff_generated.c". Remove those macros now.
David Turner 68842cbb 2017-10-29T12:28:43 Ignore trailing whitespace in .gitignore files (as git itself does)
Edward Thomson 8631357e 2017-10-07T12:23:33 checkout: do not test file mode on Windows On Windows, we do not support file mode changes, so do not test for type changes between the disk and tree being checked out. We could have false positives since the on-disk file can only have an (effective) mode of 0100644 since NTFS does not support executable files. If the tree being checked out did have an executable file, we would erroneously decide that the file on disk had been changed.
Edward Thomson 24388179 2016-06-15T02:00:35 checkout: treat files as modified if mode differs When performing a forced checkout, treat files as modified when the workdir or the index is identical except for the mode. This ensures that force checkout will update the mode to the target. (Apply this check for regular files only, if one of the items was a file and the other was another type of item then this would be a typechange and handled independently.)
Edward Thomson 3983fc1d 2018-02-18T16:10:33 checkout: take mode into account when comparing index to baseline When checking out a file, we determine whether the baseline (what we expect to be in the working directory) actually matches the contents of the working directory. This is safe behavior to prevent us from overwriting changes in the working directory. We look at the index to optimize this test: if we know that the index matches the working directory, then we can simply look at the index data compared to the baseline. We have historically compared the baseline to the index entry by oid. However, we must also compare the mode of the two items to ensure that they are identical. Otherwise, we will refuse to update the working directory for a mode change.
Patrick Steinhardt f41e86d6 2017-10-06T12:05:26 transports: smart: fix memory leak when skipping symbolic refs When we setup the revision walk for negotiating references with a remote, we iterate over all references, ignoring tags and symbolic references. While skipping over symbolic references, we forget to free the looked up reference, resulting in a memory leak when the next iteration simply overwrites the variable. Fix that issue by freeing the reference at the beginning of each iteration and collapsing return paths for error and success.
Patrick Steinhardt cda18f9b 2017-10-06T11:24:11 refs: do not use peeled OID if peeling to a tag If a reference stored in a packed-refs file does not directly point to a commit, tree or blob, the packed-refs file will also include a fully-peeled OID pointing to the first underlying object of that type. If we try to peel a reference to an object, we will use that peeled OID to speed up resolving the object. As a reference for an annotated tag does not directly point to a commit, tree or blob but instead to the tag object, the packed-refs file will have an accommodating fully-peeled OID pointing to the object referenced by that tag. If we use the fully-peeled OID pointing to the referenced object when peeling, we obviously cannot peel that to the tag anymore. Fix this issue by not using the fully-peeled OID whenever we want to peel to a tag. Note that this does not include the case where we want to resolve to _any_ object type. Existing code may make use of the fact that we resolve those to commit objects instead of tag objects, even though that behaviour is inconsistent between packed and loose references. Furthermore, some tests of ours make the assumption that we in fact resolve those references to a commit.
Patrick Steinhardt e74e05ed 2018-02-20T10:38:27 diff_tform: fix rename detection with rewrite/delete pair A rewritten file can either be classified as a modification of its contents or of a delete of the complete file followed by an addition of the new content. This distinction becomes important when we want to detect renames for rewrites. Given a scenario where a file "a" has been deleted and another file "b" has been renamed to "a", this should be detected as a deletion of "a" followed by a rename of "a" -> "b". Thus, splitting of the original rewrite into a delete/add pair is important here. This splitting is represented by a flag we can set at the current delta. While the flag is already being set in case we want to break rewrites, we do not do so in case where the `GIT_DIFF_FIND_RENAMES_FROM_REWRITES` flag is set. This can trigger an assert when we try to match the source and target deltas. Fix the issue by setting the `GIT_DIFF_FLAG__TO_SPLIT` flag at the delta when it is a rename target and `GIT_DIFF_FIND_RENAMES_FROM_REWRITES` is set.
Andreas Smas c2702235 2017-01-20T23:14:19 Use SOCK_CLOEXEC when creating sockets
Carlos Martín Nieto 93ecb61a 2017-10-07T11:25:12 proxy: rename the options freeing function
Carlos Martín Nieto 27a8092b 2017-09-27T15:30:19 curl: free the user-provided proxy credentials
Carlos Martín Nieto 8d7dcb10 2017-09-27T15:27:32 curl: free the proxy options
Patrick Steinhardt 58197758 2017-07-07T12:27:18 ignore: fix indentation of comment block
Ian Douglas Scott f908bb8e 2017-06-23T10:10:29 Convert port with htons() in p_getaddrinfo() `sin_port` should be in network byte order.
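The fix in context, as a reduced address fill-in (not the full `p_getaddrinfo` emulation):

    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <string.h>

    static void fill_sockaddr_in(struct sockaddr_in *sin,
                                 in_addr_t addr, unsigned short port)
    {
        memset(sin, 0, sizeof(*sin));
        sin->sin_family = AF_INET;
        sin->sin_addr.s_addr = addr;
        sin->sin_port = htons(port);   /* sin_port must be network byte order */
    }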
Ariel Davis e4517af3 2017-06-16T23:19:31 repository: remove trailing whitespace
Ariel Davis 82bb59b4 2017-06-16T21:02:26 repository: do not initialize templates if dir is an empty string
Patrick Steinhardt 6f4d04b5 2018-03-08T12:36:46 index: error out on unreasonable prefix-compressed path lengths When computing the complete path length from the encoded prefix-compressed path, we end up just allocating the complete path without ever checking what the encoded path length actually is. This can easily lead to a denial of service by just encoding an unreasonable long path name inside of the index. Git already enforces a maximum path length of 4096 bytes. As we also have that enforcement ready in some places, just make sure that the resulting path is smaller than GIT_PATH_MAX. Reported-by: Krishna Ram Prakash R <krp@gtux.in> Reported-by: Vivek Parikh <viv0411.parikh@gmail.com>
Patrick Steinhardt 6ddd286e 2018-03-08T12:00:27 index: fix out-of-bounds read with invalid index entry prefix length The index format in version 4 has prefix-compressed entries, where every index entry can compress its path by using a path prefix of the previous entry. Since implementing support for this index format version in commit 5625d86b9 (index: support index v4, 2016-05-17), though, we do not correctly verify that the prefix length that we want to reuse is actually smaller than or equal to the length of the previous index entry's path. This can lead to an integer underflow and subsequently to an out-of-bounds read. Fix this by verifying that the prefix is actually smaller than the previous entry's path length. Reported-by: Krishna Ram Prakash R <krp@gtux.in> Reported-by: Vivek Parikh <viv0411.parikh@gmail.com>
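The validations described in this and the previous entry, reduced to their essence (constant and variable names are illustrative):

    #include <stddef.h>

    #define MAX_PATH_LEN 4096   /* mirrors the 4096-byte limit mentioned above */

    /* prefix_len is the number of bytes reused from the previous entry's
     * path, suffix_len the length of this entry's encoded suffix. */
    static int check_compressed_path(size_t prefix_len, size_t prev_path_len,
                                     size_t suffix_len)
    {
        if (prefix_len > prev_path_len)
            return -1;   /* cannot reuse more bytes than the previous path has */
        if (prefix_len + suffix_len + 1 > MAX_PATH_LEN)
            return -1;   /* reject unreasonably long resulting paths */
        return 0;
    }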
Patrick Steinhardt b6756821 2018-03-08T11:49:19 index: convert `read_entry` to return entry size via an out-param The function `read_entry` does not conform to our usual coding style of returning stuff via the out parameter and to use the return value for reporting errors. Due to most of our code conforming to that pattern, it has become quite natural for us to actually return `-1` in case there is any error, which has also slipped in with commit 5625d86b9 (index: support index v4, 2016-05-17). As the function returns a `size_t` only, though, the return value is wrapped around, causing the caller of `read_tree` to continue with an invalid index entry. Ultimately, this can lead to a double-free. Improve code and fix the bug by converting the function to return the index entry size via an out parameter and only using the return value to indicate errors. Reported-by: Krishna Ram Prakash R <krp@gtux.in> Reported-by: Vivek Parikh <viv0411.parikh@gmail.com>
Mohseen Mukaddam a78441bc 2017-06-13T11:05:40 Adding git_filter_init for initializing `git_filter` struct + unit test
Edward Thomson 99e40a67 2017-06-12T21:23:44 Merge pull request #4263 from libgit2/ethomson/config_for_inmemory_repo Allow creation of a configuration object in an in-memory repository
Edward Thomson 2d486781 2017-06-12T12:02:27 repository: don't fail to create config option in inmemory repo When in an in-memory repository - without a configuration file - do not fail to create a configuration object.
Edward Thomson 9d49a43c 2017-06-12T12:01:10 repository_item_path: return ENOTFOUND when appropriate Disambiguate error values: return `GIT_ENOTFOUND` when the item cannot exist in the repository (perhaps because the repository is inmemory or otherwise not backed by a filesystem), return `-1` when there is a hard failure.
Edward Thomson 9927e958 2017-06-12T16:01:22 Merge pull request #4261 from RogerGee/fix_wait_while_ack smart_protocol: fix parsing of server ACK responses
Edward Thomson cb3010c5 2017-06-12T12:56:40 odb_read_prefix: reset error in backends loop When looking for an object by prefix, we query all the backends so that we can ensure that there is no ambiguity. We need to reset the `error` value between backends; otherwise the first backend may find an object by prefix, but subsequent backends may not. If we do not reset the `error` value then it will remain at `GIT_ENOTFOUND` and `read_prefix_1` will fail, despite having actually found an object.
Edward Thomson fb3fc837 2017-06-12T11:45:09 repository_item_path: error messages lowercased
Edward Thomson 6f960b55 2017-06-11T10:37:46 Merge pull request #4088 from chescock/packfile-name-using-complete-hash Ensure packfiles with different contents have different names
Edward Thomson d2c4f764 2017-06-11T09:54:04 Merge pull request #4260 from libgit2/ethomson/forced_checkout_2 Update to forced checkout and untracked files
Edward Thomson 4a0df574 2017-06-10T18:46:35 git_futils_rmdir: only allow `EBUSY` when asked Only ignore `EBUSY` from `rmdir` when the `GIT_RMDIR_SKIP_NONEMPTY` bit is set.
Edward Thomson 83989d70 2017-06-08T22:23:53 checkout: cope with untracked files in directory deletion When deleting a directory during checkout, do not simply delete the directory, since there may be untracked files. Instead, go into the iterator and examine each file. In the original code (the code with the faulty assumption), we look to see if there's an index entry beneath the directory that we want to remove. Eg, it looks to see if we have a workdir entry foo and an index entry foo/bar.txt. If this is not the case, then the working directory must have precious files in that directory. This part is okay. The part that's not okay is if there is an index entry foo/bar.txt. It just blows away the whole damned directory. That's not cool. Instead, by simply pushing the directory itself onto the stack and iterating each entry, we will deal with the files one by one - whether they're in the index (and can be force removed) or not (and are precious). The original code was a bad optimization, assuming that we didn't need to git_iterator_advance_into if there was any index entry in the folder. That's wrong - we could have optimized this iff all folder entries are in the index. Instead, we need to simply dig into the directory and analyze its entries.
Roger Gee e141f079 2017-06-10T11:46:09 smart_protocol: fix parsing of server ACK responses Fix ACK parsing in wait_while_ack() internal function. This patch handles the case where multi_ack_detailed mode sends 'ready' ACKs. The existing functionality would bail out too early, thus causing the processing of the ensuing packfile to fail if/when 'ready' ACKs were sent.
Patrick Steinhardt 6c23704d 2017-06-08T21:40:18 settings: rename `GIT_OPT_ENABLE_SYNCHRONOUS_OBJECT_CREATION` Initially, the setting was solely used to enable the use of `fsync()` when creating objects. Since then, the use has been extended to also cover references and index files. As the option is not yet part of any release, we can still correct this by renaming the option to something more sensible that no longer refers only to objects. This commit renames the option to `GIT_OPT_ENABLE_FSYNC_GITDIR`. We also move the variable from the object to the repository source code.
Edward Thomson 458cea5c 2017-06-08T14:22:24 Merge pull request #4255 from pks-t/pks/buffer-grow-errors Buffer growing cleanups
Edward Thomson 90500d81 2017-06-08T13:56:22 Merge pull request #4253 from pks-t/pks/cov-fixes Coverity fixes
Patrick Steinhardt 90388aa8 2017-06-06T15:02:23 refdb_fs: be explicit about using null-OID if we cannot resolve ref
Patrick Steinhardt 78a8f68f 2017-06-06T14:57:31 path: only set dotgit flags when configs were read
Patrick Steinhardt 9be4c303 2017-06-06T14:54:48 worktree: use `git__free` instead of `free`
Patrick Steinhardt 0f642f31 2017-06-06T14:54:19 refs: properly report errors from `update_wt_heads`
Patrick Steinhardt 0c28c72d 2017-06-06T14:53:45 fileops: check return value of `git_path_dirname`
Patrick Steinhardt a693b873 2017-06-07T10:20:44 buffer: use `git_buf_init` with length The `git_buf_init` function has an optional length parameter, which will cause the buffer to be initialized and allocated in one step. This can be used instead of static initialization with `GIT_BUF_INIT` followed by a `git_buf_grow`. This patch does so for two functions where it is applicable.
Patrick Steinhardt 4796c916 2017-06-07T09:56:31 buffer: return errors for `git_buf_init` and `git_buf_attach` Both the `git_buf_init` and `git_buf_attach` functions may call `git_buf_grow` in case they were given an allocation length as parameter. As such, it is possible for these functions to fail when we run out of memory. While it probably won't be used anytime soon, it does indeed make sense to also record this fact by returning an error code from both functions. As they belong to the internal API only, this change does not break our interface.
Patrick Steinhardt 9a8386a2 2017-06-07T09:50:54 buffer: consistently use `ENSURE_SIZE` to grow buffers on-demand The `ENSURE_SIZE` macro can be used to grow a buffer if its currently allocated size does not suffice for a required target size. While most of the code already uses this macro, the `git_buf_join` and `git_buf_join3` functions do not yet use it. Due to the macro first checking whether we have to grow the buffer at all, this has the benefit of saving a function call when it is not needed. While this is nice to have, it will probably not matter at all performance-wise -- instead, this only serves for consistency across the code.
Patrick Steinhardt e82dd813 2017-06-08T11:52:32 buffer: fix `ENSURE_SIZE` macro referencing wrong variable While the `ENSURE_SIZE` macro gets a reference to both the buffer that is to be resized and a new size, we were not consistently referencing the passed buffer, but instead a variable `buf`, which is not passed in. Funnily enough, we never noticed because our buffers seem to always be named `buf` whenever the macro was being used. Fix the macro by always using the passed-in buffer. While at it, add braces around all mentions of passed-in variables as should be done with macros to avoid subtle errors. Found-by: Edward Thomson
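A corrected shape for such a macro, referencing only its parameters and parenthesizing them (a sketch, not the exact libgit2 definition):

    /* Grow BUF to at least TARGETSIZE, returning -1 on allocation failure.
     * Every mention of the parameters is parenthesized and no hidden `buf`
     * variable is assumed. */
    #define ENSURE_SIZE(BUF, TARGETSIZE) do {                         \
        if ((TARGETSIZE) > (BUF)->asize &&                            \
            git_buf_grow((BUF), (TARGETSIZE)) < 0)                    \
            return -1;                                                \
    } while (0)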
Patrick Steinhardt 97eb5ef0 2017-06-07T10:05:54 buffer: rely on `GITERR_OOM` set by `git_buf_try_grow` The function `git_buf_try_grow` consistently calls `giterr_set_oom` whenever growing the buffer fails due to insufficient memory being available. So in fact, we do not have to do this ourselves when a call to any buffer-growing function has failed due to an OOM situation. But we still do so in two functions, which this patch cleans up.
Edward Thomson 3a8801ae 2017-06-08T10:55:47 Merge pull request #4258 from pks-t/pks/sha1dc-update SHA1DC update
Patrick Steinhardt 63d86c27 2017-06-07T14:50:16 sha1dc: update to fix errors with endianess and unaligned access This updates our version of SHA1DC to e139984 (Merge pull request #35 from lidl/master, 2017-05-30).
Edward Thomson 3bc95cfe 2017-06-07T14:42:12 Merge pull request #4236 from pks-t/pks/index-v4-fixes Fix path computations for compressed index entries
Marc-Antoine Perennou f28744a5 2017-06-05T10:11:20 openssl_stream: fix building with libressl OpenSSL v1.1 has introduced a new way of initializing the library without having to call various functions of different subsystems. In libgit2, we have been adapting to that change with 88520151f (openssl_stream: use new initialization function on OpenSSL version >=1.1, 2017-04-07), where we added an #ifdef depending on the OpenSSL version. This change broke building with libressl, though, which has not changed its API in the same way. Fix the issue by expanding the #ifdef condition to use the old way of initializing with libressl. Signed-off-by: Marc-Antoine Perennou <Marc-Antoine@Perennou.com>
Patrick Steinhardt 064a60e9 2017-05-19T14:06:15 index: verify we have enough space left when writing index entries In our code writing index entries, we carry around a `disk_size` representing how much memory we have in total and pass this value to `git_encode_varint` to do bounds checks. This does not make much sense, as at the time when passing on this variable it is already out of date. Fix this by subtracting used memory from `disk_size` as we go along. Furthermore, assert we've actually got enough space left to do the final path memcpy.
Patrick Steinhardt c71dff7e 2017-05-19T13:49:34 index: fix shared prefix computation when writing index entry When using compressed index entries, each entry's path is preceded by a varint encoding how long the shared prefix with the previous index entry actually is. We currently encode a length of `(path_len - same_len)`, which is doubly wrong. First, `path_len` is already set to `path_len - same_len` previously. Second, we want to encode the shared prefix rather than the un-shared suffix length. Fix this by using `same_len` as the varint value instead.
Patrick Steinhardt 83e0392c 2017-05-19T13:39:05 index: also sanity check entry size with compressed entries We have a check in place whether the index has enough data left for the required footer after reading an index entry, but this was only used for uncompressed entries. Move the check down a bit so that it is executed for both compressed and uncompressed index entries.
Patrick Steinhardt 350d2c47 2017-05-19T14:22:35 index: remove file-scope entry size macros All index entry size computations are now performed in `index_entry_size`. As such, we do not need the file-scope macros for computing these sizes anymore. Remove them and move the `entry_size` macro into the `index_entry_size` function.
Patrick Steinhardt 46b67034 2017-05-19T13:59:53 index: don't right-pad paths when writing compressed entries Our code to write index entries to disk does not check whether the entry that is to be written should use prefix compression for the path. As such, we were overallocating memory and added bogus right-padding into the resulting index entries. As there is no padding allowed in the index version 4 format, this should actually result in an invalid index. Fix this by re-using the newly extracted `index_entry_size` function.
Patrick Steinhardt 29f498e0 2017-05-19T13:38:34 index: move index entry size computation into its own function Create a new function `index_entry_size` which encapsulates the logic to calculate how much space is needed for an index entry, whether it is simple/extended or compressed/uncompressed. This can later be re-used by our code writing index entries.
Patrick Steinhardt 8ceb890b 2017-05-19T12:35:21 index: set last written index entry in foreach-entry-loop The last written disk entry is currently being set inside of the function `write_disk_entry`. Make behavior a bit more obvious by instead setting it inside of `write_entries` while iterating all entries.
Patrick Steinhardt 11d0be23 2017-05-12T10:01:43 index: set last entry when reading compressed entries To calculate the path of a compressed index entry, we need to know the preceding entry's path. While we do actually set the first predecessor correctly to "", we fail to update this while reading the entries. Fix the issue by updating `last` inside of the loop. Previously, we've been passing a double-pointer to `read_entry`, which it didn't update. As it is more obvious to update the pointer inside the loop itself, though, we can simply convert it to a normal pointer.
Patrick Steinhardt febe8c14 2017-05-10T14:27:12 index: fix confusion with shared prefix in compressed path names The index version 4 introduced compressed path names for the entries. From the git.git index-format documentation: At the beginning of an entry, an integer N in the variable width encoding [...] is stored, followed by a NUL-terminated string S. Removing N bytes from the end of the path name for the previous entry, and replacing it with the string S yields the path name for this entry. But instead of stripping N bytes from the previous path's string and using the remaining prefix, we were instead simply concatenating the previous path with the current entry path, which is obviously wrong. Fix the issue by correctly copying the first N bytes of the previous entry only and concatenating the result with our current entry's path.
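A concrete reading of that rule: with a previous path of "src/attr.c", a varint value N = 6 strips "attr.c", and a suffix S of "buf.h" yields "src/buf.h". In code form (illustrative, not the libgit2 implementation):

    #include <stdlib.h>
    #include <string.h>

    /* Build this entry's path from the previous path, the number of bytes N
     * to drop from its end, and the NUL-terminated suffix S. */
    static char *expand_path(const char *prev, size_t n, const char *suffix)
    {
        size_t prev_len = strlen(prev), suffix_len = strlen(suffix);
        size_t keep;
        char *out;

        if (n > prev_len)
            return NULL;                  /* corrupt: would underflow */
        keep = prev_len - n;

        if ((out = malloc(keep + suffix_len + 1)) == NULL)
            return NULL;
        memcpy(out, prev, keep);          /* keep only the shared prefix */
        memcpy(out + keep, suffix, suffix_len + 1);
        return out;
    }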
Patrick Steinhardt 8a5e7aae 2017-05-22T12:53:44 varint: fix computation for remaining buffer space When encoding varints to a buffer, we want to remain sure that the required buffer space does not exceed what is actually available. Our current check does not do the right thing, though, in that it does not honor that our `pos` variable counts the position down instead of up. As such, we will require too much memory for small varints and not enough memory for big varints. Fix the issue by correctly calculating the required size as `(sizeof(varint) - pos)`. Add a test which failed before.
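The encoder writes the varint backwards into a small scratch array with `pos` counting down, so the bytes actually used are `sizeof(varint) - pos`, and that is what must fit in the destination. A simplified encoder modelled on git's varint encoding (not copied from libgit2), with the corrected space check:

    #include <stdint.h>
    #include <string.h>

    static int encode_varint(unsigned char *buf, size_t bufsize, uintmax_t value)
    {
        unsigned char varint[16];
        size_t pos = sizeof(varint) - 1;

        varint[pos] = value & 127;
        while (value >>= 7)
            varint[--pos] = 128 | (--value & 127);

        /* bytes used = sizeof(varint) - pos, not pos itself */
        if (bufsize < sizeof(varint) - pos)
            return -1;

        memcpy(buf, varint + pos, sizeof(varint) - pos);
        return (int)(sizeof(varint) - pos);
    }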
Edward Thomson dd0aa811 2017-06-04T22:46:07 Merge branch 'pr/4228'
Edward Thomson 82e929a8 2017-06-04T19:35:39 Merge pull request #4239 from roblg/toplevel-dir-ignore-fix Fix issue with directory glob ignore in subdirectories