kmx git

Commit	Date	Message
7392799d	2018-05-30T08:35:06	submodule: detect duplicated submodule paths When loading submodule names, we build a map of submodule paths and their respective names. While looping over the configuration keys, we do not check though whether a submodule path was seen already. This leads to a memory leak in case we have multiple submodules with the same path, as we just overwrite the old value in the map in that case. Fix the error by verifying that the path to be added is not yet part of the string map. Git does not allow to have multiple submodules for a path anyway, so we now do the same and detect this duplication, reporting it to the user.
f2e5c092	2018-04-27T15:31:43	cmake: remove now-useless LIBGIT2_LIBDIRS handling With the recent change of always resolving pkg-config libraries to their full path, we do not have to manage the LIBGIT2_LIBDIRS variable anymore. The only other remaining user of LIBGIT2_LIBDIRS is winhttp, which is a CMake-style library target and can thus be resolved by CMake automatically. Remove the variable to simplify our build system a bit.
0c8ff50f	2018-04-27T10:38:49	cmake: resolve libraries found by pkg-config Libraries found by CMake modules are usually handled with their full path. This makes linking against those libraries a lot more robust when it comes to libraries in non-standard locations, as otherwise we might mix up libraries from different locations when link directories are given. One excemption are libraries found by PKG_CHECK_MODULES. Instead of returning libraries with their complete path, it will return the variable names as well as a set of link directories. In case where multiple sets of the same library are installed in different locations, this can lead the compiler to link against the wrong libraries in the end, when link directories of other dependencies are added. To fix this shortcoming, we need to manually resolve library paths returned by CMake against their respective library directories. This is an easy task to do with `FIND_LIBRARY`.
8fa0b34b	2018-04-19T01:05:05	local: fix a leaking reference when iterating over a symref Valgrind log : ==17702== 18 bytes in 1 blocks are indirectly lost in loss record 69 of 1,123 ==17702== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==17702== by 0x5FDBB49: strdup (strdup.c:42) ==17702== by 0x632B3E: git__strdup (util.h:106) ==17702== by 0x632D2C: git_reference__alloc_symbolic (refs.c:64) ==17702== by 0x62E0AF: loose_lookup (refdb_fs.c:408) ==17702== by 0x62E636: refdb_fs_backend__iterator_next (refdb_fs.c:565) ==17702== by 0x62CD8E: git_refdb_iterator_next (refdb.c:147) ==17702== by 0x6347F2: git_reference_next (refs.c:838) ==17702== by 0x6345CB: git_reference_foreach (refs.c:748) ==17702== by 0x66BE62: local_download_pack (local.c:579) ==17702== by 0x5DB48F: git_fetch_download_pack (fetch.c:148) ==17702== by 0x639028: git_remote_download (remote.c:932) ==17702== by 0x63919A: git_remote_fetch (remote.c:969) ==17702== by 0x4ABEDD: test_fetchhead_nonetwork__fetch_into_repo_with_symrefs (nonetwork.c:362) ==17702== by 0x4125D9: clar_run_test (clar.c:222) ==17702== by 0x41287C: clar_run_suite (clar.c:286) ==17702== by 0x412DDE: clar_test_run (clar.c:433) ==17702== by 0x4105E1: main (main.c:24)
16b62dd4	2018-04-04T21:28:31	diff: Add missing GIT_DELTA_TYPECHANGE -> 'T' mapping. This adds the 'T' status character to git_diff_status_char() for diff entries that change type.
2fe887e6	2018-04-18T20:57:16	remote: repo is optional here As per CID:1378747, we might be called with a NULL repo, which would be deferenced in write_add_refspec
07011e60	2018-04-12T13:32:27	revwalk: fix uninteresting revs sometimes not limiting graphwalk When we want to limit our graphwalk, we use the heuristic of checking whether the newest limiting (uninteresting) revision is newer than the oldest interesting revision. We do so by inspecting whether the first item's commit time of the user-supplied list of revisions is newer than the last added interesting revision. This is wrong though, as the user supplied list is in no way guaranteed to be sorted by increasing commit dates. This could lead us to abort the revwalk early before applying all relevant limiting revisions, outputting revisions which should in fact have been hidden. Fix the heuristic by instead checking whether _any_ of the limiting commits was made earlier than the last interesting commit. Add a test.
0f88adb6	2018-02-08T12:36:47	Submodule API should report .gitmodules parse errors Signed-off-by: Sven Strickroth <email@cs-ware.de>
96329606	2018-03-11T15:35:56	worktree: Read worktree specific reflog for HEAD Signed-off-by: Sven Strickroth <email@cs-ware.de>
a137cdbd	2018-04-18T21:41:44	refspec: check for valid parameters in git_refspec__dwim_one CID:1383993, "In git_refspec__dwim_one: All paths that lead to this null pointer comparison already dereference the pointer earlier (CWE-476)"
b2f3ff56	2018-04-19T01:08:18	worktree: fix calloc of the wrong object type
7fa6c8ce	2018-03-29T10:18:51	util: fix missing headers for MinGW environments There are multiple references to undefined functions in the Microsoft builds. Add headers to make them known.
0f09d9f5	2018-04-02T20:00:07	Fix build with LibreSSL 2.7 LibreSSL 2.7 adds OpenSSL 1.1 API Signed-off-by: Bernard Spil <brnrd@FreeBSD.org>
e2a80124	2018-04-10T21:16:43	refs: preserve the owning refdb when duping reference This fixes a segfault in git_reference_owner on references returned from git_reference__read_head and git_reference_dup ones.
b260fdc8	2018-04-06T12:24:10	attr_file: fix handling of directory patterns with trailing spaces When comparing whether a path matches a directory rule, we pass the both the path and directory name to `fnmatch` with `GIT_ATTR_FNMATCH_DIRECTORY` being set. `fnmatch` expects the pattern to contain no trailing directory '/', which is why we try to always strip patterns of trailing slashes. We do not handle that case correctly though when the pattern itself has trailing spaces, causing the match to fail. Fix the issue by stripping trailing spaces and tabs for a rule previous to checking whether the pattern is a directory pattern with a trailing '/'. This replaces the whitespace-stripping in our ignore file parsing code, which was stripping whitespaces too late. Add a test to catch future breakage.
a714e836	2018-04-06T10:39:16	transports: local: fix assert when fetching into repo with symrefs When fetching into a repository which has symbolic references via the "local" transport we run into an assert. The assert is being triggered while we negotiate the packfile between the two repositories. When hiding known revisions from the packbuilder revwalk, we unconditionally hide all references of the local refdb. In case one of these references is a symbolic reference, though, this means we're trying to hide a `NULL` OID, which triggers the assert. Fix the issue by only hiding OID references from the revwalk. Add a test to catch this issue in the future.
59012bf4	2018-03-29T09:15:48	odb: mempack: fix leaking objects when freeing mempacks When a ODB mempack gets free'd, we take no measures at all to free its contents, most notably the objects added to the database, resulting in a memory leak. Call `git_mempack_reset` previous to freeing the ODB structures themselves, which takes care of releasing all associated data structures.
4d4a7dbf	2018-03-28T17:37:39	sha1dc: update to fix errors with endianess This updates the version of SHA1DC to c3e1304ea3.
b89988c7	2018-03-27T15:03:15	transports: ssh: replace deprecated function `libssh2_session_startup` The function `libssh2_session_startup` has been deprecated since libssh2 version 1.2.8 in favor of `libssh2_session_handshake` introduced in the same version. libssh2 1.2.8 was released in April 2011, so it is already seven years old. It is available in Debian Wheezy, Ubuntu Trusty and CentOS 7.4, so the most important and conservative distros already have it available. As such, it seems safe to just use the new function.
b2e7d8c2	2018-03-27T14:49:21	transports: ssh: disconnect session before freeing it The function `ssh_stream_free` takes over the responsibility of closing channels and streams just before freeing their memory, but it does not do so for the session. In fact, we never disconnect the session ourselves at all, as libssh2 will not do so itself upon freeing the structure. Quoting the documentation of `libssh2_session_free`: > Frees all resources associated with a session instance. Typically > called after libssh2_session_disconnect_ex, The missing disconnect probably stems from a misunderstanding what it actually does. As we are already closing the TCP socket ourselves, the assumption was that no additional disconnect is required. But calling `libssh2_session_disconnect` will notify the server that we are cleanly closing the connection, such that the server can free his own resources. Add a call to `libssh2_session_disconnect` to fix that issue. [1]: https://www.libssh2.org/libssh2_session_free.html
cfed1be8	2018-05-24T20:28:36	submodule: plug leaks from the escape detection
f650153a	2018-05-24T19:05:59	submodule: replace index with strchr which exists on Windows
a3df20cf	2018-05-24T19:00:13	submodule: the repostiory for _name_is_valid should not be const We might modify caches due to us trying to load the configuration to figure out what kinds of filesystem protections we should have.
a9e60994	2018-05-23T08:40:17	path: check for a symlinked .gitmodules in fs-agnostic code We still compare case-insensitively to protect more thoroughly as we don't know what specifics we'll see on the system and it's the behaviour from git.
aa003557	2018-05-22T16:13:47	path: reject .gitmodules as a symlink Any part of the library which asks the question can pass in the mode to have it checked against `.gitmodules` being a symlink. This is particularly relevant for adding entries to the index from the worktree and for checking out files.
08d4b459	2018-05-22T15:48:38	index: stat before creating the entry This is so we have it available for the path validity checking. In a later commit we will start rejecting `.gitmodules` files as symlinks.
c7cac088	2018-05-22T15:21:08	path: accept the name length as a parameter We may take in names from the middle of a string so we want the caller to let us know how long the path component is that we should be checking.
d7ee21ee	2018-05-22T13:58:24	path: expose dotgit detection functions per filesystem These will be used by the checkout code to detect them for the particular filesystem they're on.
f907a6f5	2018-05-18T15:16:53	path: hide the dotgit file functions These can't go into the public API yet as we don't want to introduce API or ABI changes in a security release.
0cc14627	2018-05-16T15:56:04	path: add functions to detect .gitconfig and .gitattributes
26b3cec0	2018-05-16T15:42:08	path: add a function to detect an .gitmodules file Given a path component it knows what to pass to the filesystem-specific functions so we're protected even from trees which try to use the 8.3 naming rules to get around us matching on the filename exactly. The logic and test strings come from the equivalent git change.
dd364dde	2018-05-16T14:47:04	path: provide a generic function for checking dogit files on NTFS It checks against the 8.3 shortname variants, including the one which includes the checksum as part of its name.
37dc60b6	2018-05-16T11:56:04	path: provide a generic dogit checking function for HFS This lets us check for other kinds of reserved files.
916af8ea	2018-05-14T16:03:15	submodule: also validate Windows-separated paths for validity Otherwise we would also admit `..\..\foo\bar` as a valid path and fail to protect Windows users. Ideally we would check for both separators without the need for the copied string, but this'll get us over the RCE.
e6c757a7	2018-04-30T13:47:15	submodule: ignore submodules which include path traversal in their name If the we decide that the "name" of the submodule (i.e. its path inside `.git/modules/`) is trying to escape that directory or otherwise trick us, we ignore the configuration for that submodule. This leaves us with a half-configured submodule when looking it up by path, but it's the same result as if the configuration really were missing. The name check is potentially more strict than it needs to be, but it lets us re-use the check we're doing for the checkout. The function that encapsulates this logic is ready to be exported but we don't want to do that in a security release so it remains internal for now.
a52b4c51	2018-03-23T09:59:46	odb: fix writing to fake write streams In commit 7ec7aa4a7 (odb: assert on logic errors when writing objects, 2018-02-01), the check for whether we are trying to overflowing the fake stream buffer was changed from returning an error to raising an assert. The conversion forgot though that the logic around `assert`s are basically inverted. Previously, if the statement stream->written + len > steram->size evaluated to true, we would return a `-1`. Now we are asserting that this statement is true, and in case it is not we will raise an error. So the conversion to the `assert` in fact changed the behaviour to the complete opposite intention. Fix the assert by inverting its condition again and add a regression test.
16210877	2018-02-28T12:59:47	Unescape repo before constructing ssh request
8a2cdbd3	2018-02-28T12:58:58	Rename unescape and make non-static
0e4f3d9d	2018-03-03T21:47:22	gitno_extract_url_parts: decode hostnames RFC 3986 says that hostnames can be percent encoded. Percent decode hostnames in our URLs.
05551ca0	2018-03-03T20:14:54	Remove now unnecessary `gitno_unescape`
60e7848e	2018-03-03T20:13:30	gitno_extract_url_parts: use `git_buf`s Now that we can decode percent-encoded strings as part of `git_buf`s, use that decoder in `gitno_extract_url_parts`.
6f577906	2018-03-03T20:09:09	ssh urls: use `git_buf_decode_percent` Use `git_buf_decode_percent` so that we can avoid allocating a temporary buffer.
8070a357	2018-03-03T18:47:35	Introduce `git_buf_decode_percent` Introduce a function to take a percent-encoded string (URI encoded, described by RFC 1738) and decode it into a `git_buf`.
3db1af1f	2018-03-08T12:36:46	index: error out on unreasonable prefix-compressed path lengths When computing the complete path length from the encoded prefix-compressed path, we end up just allocating the complete path without ever checking what the encoded path length actually is. This can easily lead to a denial of service by just encoding an unreasonable long path name inside of the index. Git already enforces a maximum path length of 4096 bytes. As we also have that enforcement ready in some places, just make sure that the resulting path is smaller than GIT_PATH_MAX. Reported-by: Krishna Ram Prakash R <krp@gtux.in> Reported-by: Vivek Parikh <viv0411.parikh@gmail.com>
3207ddb0	2018-03-08T12:00:27	index: fix out-of-bounds read with invalid index entry prefix length The index format in version 4 has prefix-compressed entries, where every index entry can compress its path by using a path prefix of the previous entry. Since implmenting support for this index format version in commit 5625d86b9 (index: support index v4, 2016-05-17), though, we do not correctly verify that the prefix length that we want to reuse is actually smaller or equal to the amount of characters than the length of the previous index entry's path. This can lead to a an integer underflow and subsequently to an out-of-bounds read. Fix this by verifying that the prefix is actually smaller than the previous entry's path length. Reported-by: Krishna Ram Prakash R <krp@gtux.in> Reported-by: Vivek Parikh <viv0411.parikh@gmail.com>
58a6fe94	2018-03-08T11:49:19	index: convert `read_entry` to return entry size via an out-param The function `read_entry` does not conform to our usual coding style of returning stuff via the out parameter and to use the return value for reporting errors. Due to most of our code conforming to that pattern, it has become quite natural for us to actually return `-1` in case there is any error, which has also slipped in with commit 5625d86b9 (index: support index v4, 2016-05-17). As the function returns an `size_t` only, though, the return value is wrapped around, causing the caller of `read_tree` to continue with an invalid index entry. Ultimately, this can lead to a double-free. Improve code and fix the bug by converting the function to return the index entry size via an out parameter and only using the return value to indicate errors. Reported-by: Krishna Ram Prakash R <krp@gtux.in> Reported-by: Vivek Parikh <viv0411.parikh@gmail.com>
53e692af	2018-03-02T12:49:54	worktree: rename parameter creason to reason
12356076	2018-03-02T12:41:04	worktree: lock reason should be const
8a8ea1db	2018-02-28T18:14:52	Merge pull request #4552 from libgit2/cmn/config-header-common Cast less blindly between configuration objects
e8e490b2	2018-02-28T17:01:47	Merge pull request #4554 from pks-t/pks/curl-init curl: initialize and cleanup global curl state
9cd0c6f1	2018-02-28T16:01:16	config: return an error if config_refresh is called on a snapshot Instead of treating it as a no-op, treat it as a programming error and return the same kind of error as if you called to set or delete variables on a snapshot.
2022b004	2018-02-28T12:06:59	curl: explicitly initialize and cleanup global curl state Our curl-based streams make use of the easy curl interface. This interface automatically initializes and de-initializes the global curl state by calling out to `curl_global_init` and `curl_global_cleanup`. Thus, all global state will be repeatedly re-initialized when creating multiple curl streams in succession. Despite being inefficient, this is not thread-safe due to `curl_global_init` being not thread-safe itself. Thus a multi-threaded programing handling multiple curl streams at the same time is inherently racy. Fix the issue by globally initializing and cleaning up curl's state.
a33deeb4	2018-02-28T12:20:23	win32: strncmp -> git__strncmp The win32 C library is compiled cdecl, however when configured with `STDCALL=ON`, our functions (and function pointers) will use the stdcall calling convention. You cannot set a `__stdcall` function pointer to a `__cdecl` function, so it's easier to just use our `git__strncmp` instead of sorting that mess out.
2424e64c	2018-02-28T12:06:02	config: harden our use of the backend objects a bit When we create an iterator we don't actually know that we have a live config object and we must instead only rely on the header. We fixed it to use this in a previous commit, but this makes it harder to misuse by converting to use the header object in the typecast. We also guard inside the `config_refresh` function against being given a snapshot (although callers right now do check).
1785de4e	2018-02-28T11:46:17	config: move the level field into the header We use it in a few places where we might have a full object or a snapshot so move it to where we can actually access it.
c1524b2e	2018-02-28T11:33:11	config: move the repository to the diskfile header We pass this around and when creating a new iterator we need to read the repository pointer. Put it in a common place so we can reach it regardless of whether we got a full object or a snapshot.
c9d59c61	2018-02-27T12:45:21	Merge pull request #4545 from libgit2/ethomson/checkout_filemode Respect core.filemode in checkout
5ecb6220	2018-02-25T15:46:51	winhttp: enable TLS 1.2 on Windows 7 and earlier Versions of Windows prior to Windows 8 do not enable TLS 1.2 by default, though support may exist. Try to enable TLS 1.2 support explicitly on connections. This request may fail if the operating system does not have TLS 1.2 support - the initial release of Vista lacks TLS 1.2 support (though it is available as a software update) and XP completely lacks TLS 1.2 support. If this request does fail, the HTTP context is still valid, and still maintains the original protocol support. So we ignore the failure from this operation.
934e6a3b	2018-02-27T11:24:30	winhttp: include constants for TLS 1.1/1.2 support For platforms that do not define `WINHTTP_FLAG_SECURE_PROTOCOL_TLS1_1` and/or `WINHTTP_FLAG_SECURE_PROTOCOL_TLS1_2`.
8c8db980	2018-02-27T10:32:29	mingw: update TLS option flags Include the constants for `WINHTTP_FLAG_SECURE_PROTOCOL_TLS1_1` and `WINHTTP_FLAG_SECURE_PROTOCOL_TLS1_2` so that they can be used by mingw. This updates both the `deps/winhttp` framework (for classic mingw) and adds the defines for mingw64, which does not use that framework.
c214ba19	2018-02-20T00:35:27	checkout: respect core.filemode when comparing filemodes Fixes #4504
894ccf4b	2018-02-20T16:14:54	Merge pull request #4535 from libgit2/ethomson/checkout_typechange_with_index_and_wd checkout: when examining index (instead of workdir), also examine mode
ce7080a0	2018-02-20T10:38:27	diff_tform: fix rename detection with rewrite/delete pair A rewritten file can either be classified as a modification of its contents or of a delete of the complete file followed by an addition of the new content. This distinction becomes important when we want to detect renames for rewrites. Given a scenario where a file "a" has been deleted and another file "b" has been renamed to "a", this should be detected as a deletion of "a" followed by a rename of "a" -> "b". Thus, splitting of the original rewrite into a delete/add pair is important here. This splitting is represented by a flag we can set at the current delta. While the flag is already being set in case we want to break rewrites, we do not do so in case where the `GIT_DIFF_FIND_RENAMES_FROM_REWRITES` flag is set. This can trigger an assert when we try to match the source and target deltas. Fix the issue by setting the `GIT_DIFF_FLAG__TO_SPLIT` flag at the delta when it is a rename target and `GIT_DIFF_FIND_RENAMES_FROM_REWRITES` is set.
d7fea1e1	2018-02-18T16:10:33	checkout: take mode into account when comparing index to baseline When checking out a file, we determine whether the baseline (what we expect to be in the working directory) actually matches the contents of the working directory. This is safe behavior to prevent us from overwriting changes in the working directory. We look at the index to optimize this test: if we know that the index matches the working directory, then we can simply look at the index data compared to the baseline. We have historically compared the baseline to the index entry by oid. However, we must also compare the mode of the two items to ensure that they are identical. Otherwise, we will refuse to update the working directory for a mode change.
f1ad004c	2018-02-18T22:29:48	Merge pull request #4529 from libgit2/ethomson/index_add_requires_files git_index_add_frombuffer: only accept files/links
5f774dbf	2018-02-11T10:14:13	git_index_add_frombuffer: only accept files/links Ensure that the buffer given to `git_index_add_frombuffer` represents a regular blob, an executable blob, or a link. Explicitly reject commit entries (submodules) - it makes little sense to allow users to add a submodule from a string; there's no possible path to success.
92324d84	2018-02-16T11:28:53	util: clean up header includes While "util.h" declares the macro `git__tolower`, which simply resorts to tolower(3P) on Unix-like systems, the <ctype.h> header is only being included in "util.c". Thus, anybody who has included "util.h" without having <ctype.h> included will fail to compile as soon as the macro is in use. Furthermore, we can clean up additional includes in "util.c" and simply replace them with an include for "common.h".
06b8a40f	2018-02-16T11:29:46	Explicitly mark fallthrough cases with comments A lot of compilers nowadays generate warnings when there are cases in a switch statement which implicitly fall through to the next case. To avoid this warning, the last line in the case that is falling through can have a comment matching a regular expression, where one possible comment body would be `/* fall through */`. An alternative to the comment would be an explicit attribute like e.g. `[[clang::fallthrough]` or `__attribute__ ((fallthrough))`. But GCC only introduced support for such an attribute recently with GCC 7. Thus, and also because the fallthrough comment is supported by most compilers, we settle for using comments instead. One shortcoming of that method is that compilers are very strict about that. Most interestingly, that comment _really_ has to be the last line. In case a closing brace follows the comment, the heuristic will fail.
7c6e9175	2018-02-16T11:11:11	index: shut up warning on uninitialized variable Even though the `entry` variable will always be initialized when `read_entry` returns success and even though we never dereference `entry` in case `read_entry` fails, GCC prints a warning about uninitialized use. Just initialize the pointer to `NULL` in order to shut GCC up.
84f03b3a	2018-02-16T10:48:55	streams: openssl: fix use of uninitialized variable When verifying the server certificate, we do try to make sure that the hostname actually matches the certificate alternative names. In cases where the host is either an IPv4 or IPv6 address, we have to compare the binary representations of the hostname with the declared IP address of the certificate. We only do that comparison in case we were successfully able to parse the hostname as an IP, which would always result in the memory region being initialized. Still, GCC 6.4.0 was complaining about usage of non-initialized memory. Fix the issue by simply asserting that `addr` needs to be initialized. This shuts up the GCC warning.
ee6be190	2018-01-31T08:36:19	http: standardize user-agent addition The winhttp and posix http each need to add the user-agent to their requests. Standardize on a single function to include this so that we do not get the version numbers we're sending out of sync. Assemble the complete user agent in `git_http__user_agent`, returning assembled strings. Co-authored-by: Patrick Steinhardt <ps@pks.im>
178fda8a	2018-02-09T17:55:18	hash: win32: fix missing comma in `giterr_set`
638c6b8c	2018-02-09T17:32:15	odb_loose: only close file descriptor if it was opened successfully
a43bcd2c	2018-02-09T17:31:50	odb: fix memory leaks due to not freeing hash context
9985edb5	2018-02-01T06:32:55	hash: set error messages on failure
619f61a8	2018-02-01T06:22:36	odb: error when we can't create object header Return an error to the caller when we can't create an object header for some reason (printf failure) instead of simply asserting.
7ec7aa4a	2018-02-01T05:54:57	odb: assert on logic errors when writing objects There's no recovery possible if we're so confused or corrupted that we're trying to overwrite our memory. Simply assert.
138e4c2b	2018-02-01T06:35:31	git_odb__hashfd: propagate error on failures
35ed256b	2018-02-01T05:11:05	git_odb__hashobj: provide errors messages on failures Provide error messages on hash failures: assert when given invalid input instead of failing with a user error; provide error messages on program errors.
59d99adc	2018-01-31T09:34:52	odb: check for alloc errors on hardcoded objects It's unlikely that we'll fail to allocate a single byte, but let's check for allocation failures for good measure. Untangle `-1` being a marker of not having found the hardcoded odb object; use that to reflect actual errors.
ef902864	2018-01-31T09:30:51	odb: error when we can't alloc an object At the moment, we're swallowing the allocation failure. We need to return the error to the caller.
0fd0bfe4	2018-02-08T22:51:46	Merge pull request #4450 from libgit2/ethomson/odb_loose_readstream Streaming read support for the loose ODB backend
d749822c	2018-02-08T22:50:58	Merge pull request #4491 from libgit2/ethomson/recursive Recursive merge: reverse the order of merge bases
ba4faf6e	2018-02-08T17:15:33	buf_text: remove `offset` parameter of BOM detection function The function to detect a BOM takes an offset where it shall look for a BOM. No caller uses that, and searching for the BOM in the middle of a buffer seems to be very unlikely, as a BOM should only ever exist at file start. Remove the parameter, as it has already caused confusion due to its weirdness.
2eea5f1c	2018-02-08T10:27:31	config_parse: fix reading files with BOM The function `skip_bom` is being used to detect and skip BOM marks previously to parsing a configuration file. To do so, it simply uses `git_buf_text_detect_bom`. But since the refactoring to use the parser interface in commit 9e66590bd (config_parse: use common parser interface, 2017-07-21), the BOM detection was actually broken. The issue stems from a misunderstanding of `git_buf_text_detect_bom`. It was assumed that its third parameter limits the length of the character sequence that is to be analyzed, while in fact it was an offset at which we want to detect the BOM. Fix the parameter to be `0` instead of the buffer length, as we always want to check the beginning of the configuration file.
848153f3	2018-02-08T10:02:29	config_parse: handle empty lines with CRLF Currently, the configuration parser will fail reading empty lines with just an CRLF-style line ending. Special-case the '\r' character in order to handle it the same as Unix-style line endings. Add tests to spot this regression in the future.
5340ca77	2018-02-08T09:31:51	config_parse: add comment to clarify logic getting next character Upon each line, the configuration parser tries to get either the first non-whitespace character or the first whitespace character, in case there is no non-whitespace character. The logic handling this looks rather odd and doesn't immediately convey this meaning, so add a comment to clarify what happens.
1403c612	2018-01-22T14:44:31	merge: virtual commit should be last argument to merge-base Our virtual commit must be the last argument to merge-base: since our algorithm pushes _both_ parents of the virtual commit, it needs to be the last argument, since merge-base: > Given three commits A, B and C, git merge-base A B C will compute the > merge base between A and a hypothetical commit M We want to calculate the merge base between the actual commit ("two") and the virtual commit ("one") - since one actually pushes its parents to the merge-base calculation, we need to calculate the merge base of "two" and the parents of one.
b924df1e	2018-01-21T18:05:45	merge: reverse merge bases for recursive merge When the commits being merged have multiple merge bases, reverse the order when creating the virtual merge base. This is for compatibility with git's merge-recursive algorithm, and ensures that we build identical trees. Git does this to try to use older merge bases first. Per 8918b0c: > It seems to be the only sane way to do it: when a two-head merge is > done, and the merge-base and one of the two branches agree, the > merge assumes that the other branch has something new. > > If we start creating virtual commits from newer merge-bases, and go > back to older merge-bases, and then merge with newer commits again, > chances are that a patch is lost, _because_ the merge-base and the > head agree on it. Unlikely, yes, but it happened to me.
ed51feb7	2018-01-21T18:01:20	oidarray: introduce git_oidarray__reverse Provide a simple function to reverse an oidarray.
26f5d36d	2018-02-04T10:27:39	Merge pull request #4489 from libgit2/ethomson/conflicts_crlf Conflict markers should match EOL style in conflicting files
8abd514c	2018-02-02T17:37:12	Merge pull request #4499 from pks-t/pks/setuid-config sysdir: do not use environment in setuid case
2553cbe3	2018-02-02T11:33:46	Merge pull request #4512 from libgit2/ethomson/header_guards Consistent header guards
53454b68	2018-02-02T11:31:15	Merge pull request #4510 from pks-t/pks/attr-file-bare-stat attr: avoid stat'ting files for bare repositories
0967459e	2018-01-25T13:11:34	sysdir: do not use environment in setuid case In order to derive the location of some Git directories, we currently use the environment variables $HOME and $XDG_CONFIG_HOME. This might prove to be problematic whenever the binary is run with setuid, that is when the effective user does not equal the real user. In case the environment variables do not get sanitized by the caller, we thus might end up using the real user's configuration when doing stuff as the effective user. The fix is to use the passwd entry's directory instead of $HOME in this situation. As this might break scenarios where the user explicitly sets $HOME to another path, this fix is only applied in case the effective user does not equal the real user.
09df354e	2018-02-01T16:52:43	odb_loose: HEADER_LEN -> MAX_HEADER_LEN `MAX_HEADER_LEN` is a more descriptive constant name.
624614b2	2017-12-19T00:43:49	odb_loose: validate length when checking for zlib content When checking to see if a file has zlib deflate content, make sure that we actually have read at least two bytes before examining the array.
1118ba3e	2017-12-18T23:08:40	odb_loose: `read_header` for packlike loose objects Support `read_header` for "packlike loose objects", which were a temporarily and uncommonly used format loose object format that encodes the header before the zlib deflate data. This will never actually be seen in the wild, but add support for it for completeness and (more importantly) because our corpus of test data has objects in this format, so it's easier to support it than to try to special case it.
4c7a16b7	2017-12-18T15:56:21	odb_loose: read_header should use zstream Make `read_header` use the common zstream implementation. Remove the now unnecessary zlib wrapper in odb_loose.
6155e06b	2017-12-17T18:44:02	zstream: introduce a single chunk reader Introduce `get_output_chunk` that will inflate/deflate all the available input buffer into the output buffer. `get_output` will call `get_output_chunk` in a loop, while other consumers can use it to inflate only a piece of the data.

7392799d

2018-05-30T08:35:06

submodule: detect duplicated submodule paths When loading submodule names, we build a map of submodule paths and their respective names. While looping over the configuration keys, we do not check though whether a submodule path was seen already. This leads to a memory leak in case we have multiple submodules with the same path, as we just overwrite the old value in the map in that case. Fix the error by verifying that the path to be added is not yet part of the string map. Git does not allow to have multiple submodules for a path anyway, so we now do the same and detect this duplication, reporting it to the user.

f2e5c092

2018-04-27T15:31:43

cmake: remove now-useless LIBGIT2_LIBDIRS handling With the recent change of always resolving pkg-config libraries to their full path, we do not have to manage the LIBGIT2_LIBDIRS variable anymore. The only other remaining user of LIBGIT2_LIBDIRS is winhttp, which is a CMake-style library target and can thus be resolved by CMake automatically. Remove the variable to simplify our build system a bit.

0c8ff50f

2018-04-27T10:38:49

cmake: resolve libraries found by pkg-config Libraries found by CMake modules are usually handled with their full path. This makes linking against those libraries a lot more robust when it comes to libraries in non-standard locations, as otherwise we might mix up libraries from different locations when link directories are given. One excemption are libraries found by PKG_CHECK_MODULES. Instead of returning libraries with their complete path, it will return the variable names as well as a set of link directories. In case where multiple sets of the same library are installed in different locations, this can lead the compiler to link against the wrong libraries in the end, when link directories of other dependencies are added. To fix this shortcoming, we need to manually resolve library paths returned by CMake against their respective library directories. This is an easy task to do with `FIND_LIBRARY`.

8fa0b34b

2018-04-19T01:05:05

local: fix a leaking reference when iterating over a symref Valgrind log : ==17702== 18 bytes in 1 blocks are indirectly lost in loss record 69 of 1,123 ==17702== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==17702== by 0x5FDBB49: strdup (strdup.c:42) ==17702== by 0x632B3E: git__strdup (util.h:106) ==17702== by 0x632D2C: git_reference__alloc_symbolic (refs.c:64) ==17702== by 0x62E0AF: loose_lookup (refdb_fs.c:408) ==17702== by 0x62E636: refdb_fs_backend__iterator_next (refdb_fs.c:565) ==17702== by 0x62CD8E: git_refdb_iterator_next (refdb.c:147) ==17702== by 0x6347F2: git_reference_next (refs.c:838) ==17702== by 0x6345CB: git_reference_foreach (refs.c:748) ==17702== by 0x66BE62: local_download_pack (local.c:579) ==17702== by 0x5DB48F: git_fetch_download_pack (fetch.c:148) ==17702== by 0x639028: git_remote_download (remote.c:932) ==17702== by 0x63919A: git_remote_fetch (remote.c:969) ==17702== by 0x4ABEDD: test_fetchhead_nonetwork__fetch_into_repo_with_symrefs (nonetwork.c:362) ==17702== by 0x4125D9: clar_run_test (clar.c:222) ==17702== by 0x41287C: clar_run_suite (clar.c:286) ==17702== by 0x412DDE: clar_test_run (clar.c:433) ==17702== by 0x4105E1: main (main.c:24)

16b62dd4

2018-04-04T21:28:31

diff: Add missing GIT_DELTA_TYPECHANGE -> 'T' mapping. This adds the 'T' status character to git_diff_status_char() for diff entries that change type.

2fe887e6

2018-04-18T20:57:16

remote: repo is optional here As per CID:1378747, we might be called with a NULL repo, which would be deferenced in write_add_refspec

07011e60

2018-04-12T13:32:27

revwalk: fix uninteresting revs sometimes not limiting graphwalk When we want to limit our graphwalk, we use the heuristic of checking whether the newest limiting (uninteresting) revision is newer than the oldest interesting revision. We do so by inspecting whether the first item's commit time of the user-supplied list of revisions is newer than the last added interesting revision. This is wrong though, as the user supplied list is in no way guaranteed to be sorted by increasing commit dates. This could lead us to abort the revwalk early before applying all relevant limiting revisions, outputting revisions which should in fact have been hidden. Fix the heuristic by instead checking whether _any_ of the limiting commits was made earlier than the last interesting commit. Add a test.

0f88adb6

2018-02-08T12:36:47

Submodule API should report .gitmodules parse errors Signed-off-by: Sven Strickroth <email@cs-ware.de>

96329606

2018-03-11T15:35:56

worktree: Read worktree specific reflog for HEAD Signed-off-by: Sven Strickroth <email@cs-ware.de>

a137cdbd

2018-04-18T21:41:44

refspec: check for valid parameters in git_refspec__dwim_one CID:1383993, "In git_refspec__dwim_one: All paths that lead to this null pointer comparison already dereference the pointer earlier (CWE-476)"

b2f3ff56

2018-04-19T01:08:18

worktree: fix calloc of the wrong object type

7fa6c8ce

2018-03-29T10:18:51

util: fix missing headers for MinGW environments There are multiple references to undefined functions in the Microsoft builds. Add headers to make them known.

0f09d9f5

2018-04-02T20:00:07

Fix build with LibreSSL 2.7 LibreSSL 2.7 adds OpenSSL 1.1 API Signed-off-by: Bernard Spil <brnrd@FreeBSD.org>

e2a80124

2018-04-10T21:16:43

refs: preserve the owning refdb when duping reference This fixes a segfault in git_reference_owner on references returned from git_reference__read_head and git_reference_dup ones.

b260fdc8

2018-04-06T12:24:10

attr_file: fix handling of directory patterns with trailing spaces When comparing whether a path matches a directory rule, we pass the both the path and directory name to `fnmatch` with `GIT_ATTR_FNMATCH_DIRECTORY` being set. `fnmatch` expects the pattern to contain no trailing directory '/', which is why we try to always strip patterns of trailing slashes. We do not handle that case correctly though when the pattern itself has trailing spaces, causing the match to fail. Fix the issue by stripping trailing spaces and tabs for a rule previous to checking whether the pattern is a directory pattern with a trailing '/'. This replaces the whitespace-stripping in our ignore file parsing code, which was stripping whitespaces too late. Add a test to catch future breakage.

a714e836

2018-04-06T10:39:16

transports: local: fix assert when fetching into repo with symrefs When fetching into a repository which has symbolic references via the "local" transport we run into an assert. The assert is being triggered while we negotiate the packfile between the two repositories. When hiding known revisions from the packbuilder revwalk, we unconditionally hide all references of the local refdb. In case one of these references is a symbolic reference, though, this means we're trying to hide a `NULL` OID, which triggers the assert. Fix the issue by only hiding OID references from the revwalk. Add a test to catch this issue in the future.

59012bf4

2018-03-29T09:15:48

odb: mempack: fix leaking objects when freeing mempacks When a ODB mempack gets free'd, we take no measures at all to free its contents, most notably the objects added to the database, resulting in a memory leak. Call `git_mempack_reset` previous to freeing the ODB structures themselves, which takes care of releasing all associated data structures.

4d4a7dbf

2018-03-28T17:37:39

sha1dc: update to fix errors with endianess This updates the version of SHA1DC to c3e1304ea3.

b89988c7

2018-03-27T15:03:15

transports: ssh: replace deprecated function `libssh2_session_startup` The function `libssh2_session_startup` has been deprecated since libssh2 version 1.2.8 in favor of `libssh2_session_handshake` introduced in the same version. libssh2 1.2.8 was released in April 2011, so it is already seven years old. It is available in Debian Wheezy, Ubuntu Trusty and CentOS 7.4, so the most important and conservative distros already have it available. As such, it seems safe to just use the new function.

b2e7d8c2

2018-03-27T14:49:21

transports: ssh: disconnect session before freeing it The function `ssh_stream_free` takes over the responsibility of closing channels and streams just before freeing their memory, but it does not do so for the session. In fact, we never disconnect the session ourselves at all, as libssh2 will not do so itself upon freeing the structure. Quoting the documentation of `libssh2_session_free`: > Frees all resources associated with a session instance. Typically > called after libssh2_session_disconnect_ex, The missing disconnect probably stems from a misunderstanding what it actually does. As we are already closing the TCP socket ourselves, the assumption was that no additional disconnect is required. But calling `libssh2_session_disconnect` will notify the server that we are cleanly closing the connection, such that the server can free his own resources. Add a call to `libssh2_session_disconnect` to fix that issue. [1]: https://www.libssh2.org/libssh2_session_free.html

cfed1be8

2018-05-24T20:28:36

submodule: plug leaks from the escape detection

f650153a

2018-05-24T19:05:59

submodule: replace index with strchr which exists on Windows

a3df20cf

2018-05-24T19:00:13

submodule: the repostiory for _name_is_valid should not be const We might modify caches due to us trying to load the configuration to figure out what kinds of filesystem protections we should have.

a9e60994

2018-05-23T08:40:17

path: check for a symlinked .gitmodules in fs-agnostic code We still compare case-insensitively to protect more thoroughly as we don't know what specifics we'll see on the system and it's the behaviour from git.

aa003557

2018-05-22T16:13:47

path: reject .gitmodules as a symlink Any part of the library which asks the question can pass in the mode to have it checked against `.gitmodules` being a symlink. This is particularly relevant for adding entries to the index from the worktree and for checking out files.

08d4b459

2018-05-22T15:48:38

index: stat before creating the entry This is so we have it available for the path validity checking. In a later commit we will start rejecting `.gitmodules` files as symlinks.

c7cac088

2018-05-22T15:21:08

path: accept the name length as a parameter We may take in names from the middle of a string so we want the caller to let us know how long the path component is that we should be checking.

d7ee21ee

2018-05-22T13:58:24

path: expose dotgit detection functions per filesystem These will be used by the checkout code to detect them for the particular filesystem they're on.

f907a6f5

2018-05-18T15:16:53

path: hide the dotgit file functions These can't go into the public API yet as we don't want to introduce API or ABI changes in a security release.

0cc14627

2018-05-16T15:56:04

path: add functions to detect .gitconfig and .gitattributes

26b3cec0

2018-05-16T15:42:08

path: add a function to detect an .gitmodules file Given a path component it knows what to pass to the filesystem-specific functions so we're protected even from trees which try to use the 8.3 naming rules to get around us matching on the filename exactly. The logic and test strings come from the equivalent git change.

dd364dde

2018-05-16T14:47:04

path: provide a generic function for checking dogit files on NTFS It checks against the 8.3 shortname variants, including the one which includes the checksum as part of its name.

37dc60b6

2018-05-16T11:56:04

path: provide a generic dogit checking function for HFS This lets us check for other kinds of reserved files.

916af8ea

2018-05-14T16:03:15

submodule: also validate Windows-separated paths for validity Otherwise we would also admit `..\..\foo\bar` as a valid path and fail to protect Windows users. Ideally we would check for both separators without the need for the copied string, but this'll get us over the RCE.

e6c757a7

2018-04-30T13:47:15

submodule: ignore submodules which include path traversal in their name If the we decide that the "name" of the submodule (i.e. its path inside `.git/modules/`) is trying to escape that directory or otherwise trick us, we ignore the configuration for that submodule. This leaves us with a half-configured submodule when looking it up by path, but it's the same result as if the configuration really were missing. The name check is potentially more strict than it needs to be, but it lets us re-use the check we're doing for the checkout. The function that encapsulates this logic is ready to be exported but we don't want to do that in a security release so it remains internal for now.

a52b4c51

2018-03-23T09:59:46

odb: fix writing to fake write streams In commit 7ec7aa4a7 (odb: assert on logic errors when writing objects, 2018-02-01), the check for whether we are trying to overflowing the fake stream buffer was changed from returning an error to raising an assert. The conversion forgot though that the logic around `assert`s are basically inverted. Previously, if the statement stream->written + len > steram->size evaluated to true, we would return a `-1`. Now we are asserting that this statement is true, and in case it is not we will raise an error. So the conversion to the `assert` in fact changed the behaviour to the complete opposite intention. Fix the assert by inverting its condition again and add a regression test.

16210877

2018-02-28T12:59:47

Unescape repo before constructing ssh request

8a2cdbd3

2018-02-28T12:58:58

Rename unescape and make non-static

0e4f3d9d

2018-03-03T21:47:22

gitno_extract_url_parts: decode hostnames RFC 3986 says that hostnames can be percent encoded. Percent decode hostnames in our URLs.

05551ca0

2018-03-03T20:14:54

Remove now unnecessary `gitno_unescape`

60e7848e

2018-03-03T20:13:30

gitno_extract_url_parts: use `git_buf`s Now that we can decode percent-encoded strings as part of `git_buf`s, use that decoder in `gitno_extract_url_parts`.

6f577906

2018-03-03T20:09:09

ssh urls: use `git_buf_decode_percent` Use `git_buf_decode_percent` so that we can avoid allocating a temporary buffer.

8070a357

2018-03-03T18:47:35

Introduce `git_buf_decode_percent` Introduce a function to take a percent-encoded string (URI encoded, described by RFC 1738) and decode it into a `git_buf`.

3db1af1f

2018-03-08T12:36:46

index: error out on unreasonable prefix-compressed path lengths When computing the complete path length from the encoded prefix-compressed path, we end up just allocating the complete path without ever checking what the encoded path length actually is. This can easily lead to a denial of service by just encoding an unreasonable long path name inside of the index. Git already enforces a maximum path length of 4096 bytes. As we also have that enforcement ready in some places, just make sure that the resulting path is smaller than GIT_PATH_MAX. Reported-by: Krishna Ram Prakash R <krp@gtux.in> Reported-by: Vivek Parikh <viv0411.parikh@gmail.com>

3207ddb0

2018-03-08T12:00:27

index: fix out-of-bounds read with invalid index entry prefix length The index format in version 4 has prefix-compressed entries, where every index entry can compress its path by using a path prefix of the previous entry. Since implmenting support for this index format version in commit 5625d86b9 (index: support index v4, 2016-05-17), though, we do not correctly verify that the prefix length that we want to reuse is actually smaller or equal to the amount of characters than the length of the previous index entry's path. This can lead to a an integer underflow and subsequently to an out-of-bounds read. Fix this by verifying that the prefix is actually smaller than the previous entry's path length. Reported-by: Krishna Ram Prakash R <krp@gtux.in> Reported-by: Vivek Parikh <viv0411.parikh@gmail.com>

58a6fe94

2018-03-08T11:49:19

index: convert `read_entry` to return entry size via an out-param The function `read_entry` does not conform to our usual coding style of returning stuff via the out parameter and to use the return value for reporting errors. Due to most of our code conforming to that pattern, it has become quite natural for us to actually return `-1` in case there is any error, which has also slipped in with commit 5625d86b9 (index: support index v4, 2016-05-17). As the function returns an `size_t` only, though, the return value is wrapped around, causing the caller of `read_tree` to continue with an invalid index entry. Ultimately, this can lead to a double-free. Improve code and fix the bug by converting the function to return the index entry size via an out parameter and only using the return value to indicate errors. Reported-by: Krishna Ram Prakash R <krp@gtux.in> Reported-by: Vivek Parikh <viv0411.parikh@gmail.com>

53e692af

2018-03-02T12:49:54

worktree: rename parameter creason to reason

12356076

2018-03-02T12:41:04

worktree: lock reason should be const

8a8ea1db

2018-02-28T18:14:52

Merge pull request #4552 from libgit2/cmn/config-header-common Cast less blindly between configuration objects

e8e490b2

2018-02-28T17:01:47

Merge pull request #4554 from pks-t/pks/curl-init curl: initialize and cleanup global curl state

9cd0c6f1

2018-02-28T16:01:16

config: return an error if config_refresh is called on a snapshot Instead of treating it as a no-op, treat it as a programming error and return the same kind of error as if you called to set or delete variables on a snapshot.

2022b004

2018-02-28T12:06:59

curl: explicitly initialize and cleanup global curl state Our curl-based streams make use of the easy curl interface. This interface automatically initializes and de-initializes the global curl state by calling out to `curl_global_init` and `curl_global_cleanup`. Thus, all global state will be repeatedly re-initialized when creating multiple curl streams in succession. Despite being inefficient, this is not thread-safe due to `curl_global_init` being not thread-safe itself. Thus a multi-threaded programing handling multiple curl streams at the same time is inherently racy. Fix the issue by globally initializing and cleaning up curl's state.

a33deeb4

2018-02-28T12:20:23

win32: strncmp -> git__strncmp The win32 C library is compiled cdecl, however when configured with `STDCALL=ON`, our functions (and function pointers) will use the stdcall calling convention. You cannot set a `__stdcall` function pointer to a `__cdecl` function, so it's easier to just use our `git__strncmp` instead of sorting that mess out.

2424e64c

2018-02-28T12:06:02

config: harden our use of the backend objects a bit When we create an iterator we don't actually know that we have a live config object and we must instead only rely on the header. We fixed it to use this in a previous commit, but this makes it harder to misuse by converting to use the header object in the typecast. We also guard inside the `config_refresh` function against being given a snapshot (although callers right now do check).

1785de4e

2018-02-28T11:46:17

config: move the level field into the header We use it in a few places where we might have a full object or a snapshot so move it to where we can actually access it.

c1524b2e

2018-02-28T11:33:11

config: move the repository to the diskfile header We pass this around and when creating a new iterator we need to read the repository pointer. Put it in a common place so we can reach it regardless of whether we got a full object or a snapshot.

c9d59c61

2018-02-27T12:45:21

Merge pull request #4545 from libgit2/ethomson/checkout_filemode Respect core.filemode in checkout

5ecb6220

2018-02-25T15:46:51

winhttp: enable TLS 1.2 on Windows 7 and earlier Versions of Windows prior to Windows 8 do not enable TLS 1.2 by default, though support may exist. Try to enable TLS 1.2 support explicitly on connections. This request may fail if the operating system does not have TLS 1.2 support - the initial release of Vista lacks TLS 1.2 support (though it is available as a software update) and XP completely lacks TLS 1.2 support. If this request does fail, the HTTP context is still valid, and still maintains the original protocol support. So we ignore the failure from this operation.

934e6a3b

2018-02-27T11:24:30

winhttp: include constants for TLS 1.1/1.2 support For platforms that do not define `WINHTTP_FLAG_SECURE_PROTOCOL_TLS1_1` and/or `WINHTTP_FLAG_SECURE_PROTOCOL_TLS1_2`.

8c8db980

2018-02-27T10:32:29

mingw: update TLS option flags Include the constants for `WINHTTP_FLAG_SECURE_PROTOCOL_TLS1_1` and `WINHTTP_FLAG_SECURE_PROTOCOL_TLS1_2` so that they can be used by mingw. This updates both the `deps/winhttp` framework (for classic mingw) and adds the defines for mingw64, which does not use that framework.

c214ba19

2018-02-20T00:35:27

checkout: respect core.filemode when comparing filemodes Fixes #4504

894ccf4b

2018-02-20T16:14:54

Merge pull request #4535 from libgit2/ethomson/checkout_typechange_with_index_and_wd checkout: when examining index (instead of workdir), also examine mode

ce7080a0

2018-02-20T10:38:27

diff_tform: fix rename detection with rewrite/delete pair A rewritten file can either be classified as a modification of its contents or of a delete of the complete file followed by an addition of the new content. This distinction becomes important when we want to detect renames for rewrites. Given a scenario where a file "a" has been deleted and another file "b" has been renamed to "a", this should be detected as a deletion of "a" followed by a rename of "a" -> "b". Thus, splitting of the original rewrite into a delete/add pair is important here. This splitting is represented by a flag we can set at the current delta. While the flag is already being set in case we want to break rewrites, we do not do so in case where the `GIT_DIFF_FIND_RENAMES_FROM_REWRITES` flag is set. This can trigger an assert when we try to match the source and target deltas. Fix the issue by setting the `GIT_DIFF_FLAG__TO_SPLIT` flag at the delta when it is a rename target and `GIT_DIFF_FIND_RENAMES_FROM_REWRITES` is set.

d7fea1e1

2018-02-18T16:10:33

checkout: take mode into account when comparing index to baseline When checking out a file, we determine whether the baseline (what we expect to be in the working directory) actually matches the contents of the working directory. This is safe behavior to prevent us from overwriting changes in the working directory. We look at the index to optimize this test: if we know that the index matches the working directory, then we can simply look at the index data compared to the baseline. We have historically compared the baseline to the index entry by oid. However, we must also compare the mode of the two items to ensure that they are identical. Otherwise, we will refuse to update the working directory for a mode change.

f1ad004c

2018-02-18T22:29:48

Merge pull request #4529 from libgit2/ethomson/index_add_requires_files git_index_add_frombuffer: only accept files/links

5f774dbf

2018-02-11T10:14:13

git_index_add_frombuffer: only accept files/links Ensure that the buffer given to `git_index_add_frombuffer` represents a regular blob, an executable blob, or a link. Explicitly reject commit entries (submodules) - it makes little sense to allow users to add a submodule from a string; there's no possible path to success.

92324d84

2018-02-16T11:28:53

util: clean up header includes While "util.h" declares the macro `git__tolower`, which simply resorts to tolower(3P) on Unix-like systems, the <ctype.h> header is only being included in "util.c". Thus, anybody who has included "util.h" without having <ctype.h> included will fail to compile as soon as the macro is in use. Furthermore, we can clean up additional includes in "util.c" and simply replace them with an include for "common.h".

06b8a40f

2018-02-16T11:29:46

Explicitly mark fallthrough cases with comments A lot of compilers nowadays generate warnings when there are cases in a switch statement which implicitly fall through to the next case. To avoid this warning, the last line in the case that is falling through can have a comment matching a regular expression, where one possible comment body would be `/* fall through */`. An alternative to the comment would be an explicit attribute like e.g. `[[clang::fallthrough]` or `__attribute__ ((fallthrough))`. But GCC only introduced support for such an attribute recently with GCC 7. Thus, and also because the fallthrough comment is supported by most compilers, we settle for using comments instead. One shortcoming of that method is that compilers are very strict about that. Most interestingly, that comment _really_ has to be the last line. In case a closing brace follows the comment, the heuristic will fail.

7c6e9175

2018-02-16T11:11:11

index: shut up warning on uninitialized variable Even though the `entry` variable will always be initialized when `read_entry` returns success and even though we never dereference `entry` in case `read_entry` fails, GCC prints a warning about uninitialized use. Just initialize the pointer to `NULL` in order to shut GCC up.

84f03b3a

2018-02-16T10:48:55

streams: openssl: fix use of uninitialized variable When verifying the server certificate, we do try to make sure that the hostname actually matches the certificate alternative names. In cases where the host is either an IPv4 or IPv6 address, we have to compare the binary representations of the hostname with the declared IP address of the certificate. We only do that comparison in case we were successfully able to parse the hostname as an IP, which would always result in the memory region being initialized. Still, GCC 6.4.0 was complaining about usage of non-initialized memory. Fix the issue by simply asserting that `addr` needs to be initialized. This shuts up the GCC warning.

ee6be190

2018-01-31T08:36:19

http: standardize user-agent addition The winhttp and posix http each need to add the user-agent to their requests. Standardize on a single function to include this so that we do not get the version numbers we're sending out of sync. Assemble the complete user agent in `git_http__user_agent`, returning assembled strings. Co-authored-by: Patrick Steinhardt <ps@pks.im>

178fda8a

2018-02-09T17:55:18

hash: win32: fix missing comma in `giterr_set`

638c6b8c

2018-02-09T17:32:15

odb_loose: only close file descriptor if it was opened successfully

a43bcd2c

2018-02-09T17:31:50

odb: fix memory leaks due to not freeing hash context

9985edb5

2018-02-01T06:32:55

hash: set error messages on failure

619f61a8

2018-02-01T06:22:36

odb: error when we can't create object header Return an error to the caller when we can't create an object header for some reason (printf failure) instead of simply asserting.

7ec7aa4a

2018-02-01T05:54:57

odb: assert on logic errors when writing objects There's no recovery possible if we're so confused or corrupted that we're trying to overwrite our memory. Simply assert.

138e4c2b

2018-02-01T06:35:31

git_odb__hashfd: propagate error on failures

35ed256b

2018-02-01T05:11:05

git_odb__hashobj: provide errors messages on failures Provide error messages on hash failures: assert when given invalid input instead of failing with a user error; provide error messages on program errors.

59d99adc

2018-01-31T09:34:52

odb: check for alloc errors on hardcoded objects It's unlikely that we'll fail to allocate a single byte, but let's check for allocation failures for good measure. Untangle `-1` being a marker of not having found the hardcoded odb object; use that to reflect actual errors.

ef902864

2018-01-31T09:30:51

odb: error when we can't alloc an object At the moment, we're swallowing the allocation failure. We need to return the error to the caller.

0fd0bfe4

2018-02-08T22:51:46

Merge pull request #4450 from libgit2/ethomson/odb_loose_readstream Streaming read support for the loose ODB backend

d749822c

2018-02-08T22:50:58

Merge pull request #4491 from libgit2/ethomson/recursive Recursive merge: reverse the order of merge bases

ba4faf6e

2018-02-08T17:15:33

buf_text: remove `offset` parameter of BOM detection function The function to detect a BOM takes an offset where it shall look for a BOM. No caller uses that, and searching for the BOM in the middle of a buffer seems to be very unlikely, as a BOM should only ever exist at file start. Remove the parameter, as it has already caused confusion due to its weirdness.

2eea5f1c

2018-02-08T10:27:31

config_parse: fix reading files with BOM The function `skip_bom` is being used to detect and skip BOM marks previously to parsing a configuration file. To do so, it simply uses `git_buf_text_detect_bom`. But since the refactoring to use the parser interface in commit 9e66590bd (config_parse: use common parser interface, 2017-07-21), the BOM detection was actually broken. The issue stems from a misunderstanding of `git_buf_text_detect_bom`. It was assumed that its third parameter limits the length of the character sequence that is to be analyzed, while in fact it was an offset at which we want to detect the BOM. Fix the parameter to be `0` instead of the buffer length, as we always want to check the beginning of the configuration file.

848153f3

2018-02-08T10:02:29

config_parse: handle empty lines with CRLF Currently, the configuration parser will fail reading empty lines with just an CRLF-style line ending. Special-case the '\r' character in order to handle it the same as Unix-style line endings. Add tests to spot this regression in the future.

5340ca77

2018-02-08T09:31:51

config_parse: add comment to clarify logic getting next character Upon each line, the configuration parser tries to get either the first non-whitespace character or the first whitespace character, in case there is no non-whitespace character. The logic handling this looks rather odd and doesn't immediately convey this meaning, so add a comment to clarify what happens.

1403c612

2018-01-22T14:44:31

merge: virtual commit should be last argument to merge-base Our virtual commit must be the last argument to merge-base: since our algorithm pushes _both_ parents of the virtual commit, it needs to be the last argument, since merge-base: > Given three commits A, B and C, git merge-base A B C will compute the > merge base between A and a hypothetical commit M We want to calculate the merge base between the actual commit ("two") and the virtual commit ("one") - since one actually pushes its parents to the merge-base calculation, we need to calculate the merge base of "two" and the parents of one.

b924df1e

2018-01-21T18:05:45

merge: reverse merge bases for recursive merge When the commits being merged have multiple merge bases, reverse the order when creating the virtual merge base. This is for compatibility with git's merge-recursive algorithm, and ensures that we build identical trees. Git does this to try to use older merge bases first. Per 8918b0c: > It seems to be the only sane way to do it: when a two-head merge is > done, and the merge-base and one of the two branches agree, the > merge assumes that the other branch has something new. > > If we start creating virtual commits from newer merge-bases, and go > back to older merge-bases, and then merge with newer commits again, > chances are that a patch is lost, _because_ the merge-base and the > head agree on it. Unlikely, yes, but it happened to me.

ed51feb7

2018-01-21T18:01:20

oidarray: introduce git_oidarray__reverse Provide a simple function to reverse an oidarray.

26f5d36d

2018-02-04T10:27:39

Merge pull request #4489 from libgit2/ethomson/conflicts_crlf Conflict markers should match EOL style in conflicting files

8abd514c

2018-02-02T17:37:12

Merge pull request #4499 from pks-t/pks/setuid-config sysdir: do not use environment in setuid case

2553cbe3

2018-02-02T11:33:46

Merge pull request #4512 from libgit2/ethomson/header_guards Consistent header guards

53454b68

2018-02-02T11:31:15

Merge pull request #4510 from pks-t/pks/attr-file-bare-stat attr: avoid stat'ting files for bare repositories

0967459e

2018-01-25T13:11:34

sysdir: do not use environment in setuid case In order to derive the location of some Git directories, we currently use the environment variables $HOME and $XDG_CONFIG_HOME. This might prove to be problematic whenever the binary is run with setuid, that is when the effective user does not equal the real user. In case the environment variables do not get sanitized by the caller, we thus might end up using the real user's configuration when doing stuff as the effective user. The fix is to use the passwd entry's directory instead of $HOME in this situation. As this might break scenarios where the user explicitly sets $HOME to another path, this fix is only applied in case the effective user does not equal the real user.

09df354e

2018-02-01T16:52:43

odb_loose: HEADER_LEN -> MAX_HEADER_LEN `MAX_HEADER_LEN` is a more descriptive constant name.

624614b2

2017-12-19T00:43:49

odb_loose: validate length when checking for zlib content When checking to see if a file has zlib deflate content, make sure that we actually have read at least two bytes before examining the array.

1118ba3e

2017-12-18T23:08:40

odb_loose: `read_header` for packlike loose objects Support `read_header` for "packlike loose objects", which were a temporarily and uncommonly used format loose object format that encodes the header before the zlib deflate data. This will never actually be seen in the wild, but add support for it for completeness and (more importantly) because our corpus of test data has objects in this format, so it's easier to support it than to try to special case it.

4c7a16b7

2017-12-18T15:56:21

odb_loose: read_header should use zstream Make `read_header` use the common zstream implementation. Remove the now unnecessary zlib wrapper in odb_loose.

6155e06b

2017-12-17T18:44:02

zstream: introduce a single chunk reader Introduce `get_output_chunk` that will inflate/deflate all the available input buffer into the output buffer. `get_output` will call `get_output_chunk` in a loop, while other consumers can use it to inflate only a piece of the data.

thodg/libgit2/src

src

Log