kmx git

Commit	Date	Message
d04c1aa0	2018-11-28T13:36:47	config: fix adding files if their parent directory is a file When we try to add a configuration file with `git_config_add_file_ondisk`, we treat nonexisting files as empty. We do this by performing a stat call, ignoring ENOENT errors. This works just fine in case the file or any of its parents simply does not exist, but there is also the case where any of the parent directories is not a directory, but a file. So e.g. trying to add a configuration file "/dev/null/.gitconfig" will fail, as `errno` will be ENOTDIR instead of ENOENT. Catch ENOTDIR in addition to ENOENT to fix the issue. Add a test that verifies we are able to add configuration files with such an invalid path file just fine.
ac5879de	2018-12-06T10:48:20	Typesetting conventions
82209544	2018-12-04T10:12:24	Fix segfault in loose_backend__readstream If the routine exits with error before stream or hash_ctx is initialized, the program will segfault when trying to free them.
c3b2053d	2018-12-04T21:48:12	make proxy_stream_close close target stream even on errors When git_filter_apply_fn callback returns a error while smudging proxy_stream_close ends up returning without closing the stream. This is turn makes blob_content_to_file crash as it asserts the stream being closed whether there are errors or not. Closing the target stream on error fixes this problem.
a6ffb1ce	2018-12-04T10:59:25	Removed one null check
8ce7272d	2018-12-14T14:43:09	annotated_commit: peel to commit instead of assuming we have one We want to allow the creation of annotated commits out of annotated tags and for that we have to peel the reference all the way to the commit instead of stopping at the first id it provides.
0573e78d	2018-12-14T14:41:17	refs: constify git_reference_peel We have no need to take a non-const reference. This does involve some other work to make sure we don't mix const and non-const variables, but by splitting what we want each variable to do we can also simplify the logic for when we do want to free a new reference we might have allocated.
42c4c624	2018-08-17T18:51:56	config: assert that our parameters are valid CID 1395011
a09a1028	2018-10-24T01:30:12	refs: assert that we're passed valid refs when renaming CID 1382962
c7469503	2018-10-24T01:26:48	diff: assert that we're passed a valid git_diff object CID 1386176, 1386177, 1388219
a4b332e8	2018-10-24T01:21:21	submodule: grab the error while loading from config Previously, an error in `git_config_next` would be mistaken as a successful load, because the previous call would have succeeded. Coverity saw the subsequent check for a completed iteration as dead, so let's make it useful again. CID 1391374
8bc913a2	2018-11-07T15:31:21	smart transport: only clear url on hard reset After creating a transport for a server, we expect to be able to call `connect`, then invoke subsequent `action` calls. We provide the URL to these `action` calls, although our built-in transports happen to ignore it since they've already parsed it into an internal format that they intend to use (`gitno_connection_data`). In ca2eb4608243162a13c427e74526b6422d5a6659, we began clearing the URL field after a connection, meaning that subsequent calls to transport `action` callbacks would get a NULL URL, which went undetected since the builtin transports ignore the URL when they're already connected (instead of re-parsing it into an internal format). Downstream custom transport implementations (eg, LibGit2Sharp) did notice this change, however. Since `reset_stream` is called even when we're not closing the subtransport, update to only clear the URL when we're closing the subtransport. This ensures that `action` calls will get the correct URL information even after a connection.
80890b9a	2018-07-20T08:20:48	winhttp: retry erroneously failing requests Early Windows TLS 1.2 implementations have an issue during key exchange with OpenSSL implementations that cause negotiation to fail with the error "the buffer supplied to a function was too small." This is a transient error on the connection, so when that error is received, retry up to 5 times to create a connection to the remote server before actually giving up. (cherry picked from commit dc371e3c5903760cc2334a0acfac9bce04479773)
ebdca44a	2018-09-02T11:38:43	git_remote_prune to be O(n * logn) (cherry picked from commit bfec6526e931d7f6ac5ecc38c37e76163092bfda)
a77b64a5	2018-10-06T12:58:06	ignore unsupported http authentication schemes auth_context_match returns 0 instead of -1 for unknown schemes to not fail in situations where some authentication schemes are supported and others are not. apply_credentials is adjusted to handle auth_context_match returning 0 without producing authentication context. (cherry picked from commit 475db39bb4c44a2221f340c66c227f555e478d10)
4ca4c7d1	2018-08-21T02:11:32	transport/http: do not return success if we failed to get a scheme Otherwise we return a NULL context, which will get dereferenced in apply_credentials. (cherry picked from commit 1c949ce1483ca22a29e8f523360999cbbe411a50)
01b8bbf4	2018-08-21T01:55:56	remote: set the error before cleanup Otherwise we'll return stack data to the caller. (cherry picked from commit 22d013b657c5957fde31641351cb72d08cc192ae)
c0e038a2	2018-08-21T01:12:11	revwalk: The left operand of '<' is a garbage value At line 594, we do this : if (error < 0) return error; but if nothing was pushed in a GIT_SORT_TIME revwalk, we'd return uninitialized stack data. (cherry picked from commit aa8cb5866f1eabd92c8c08f7a8610d42df07375f)
ef7d7def	2018-08-17T00:51:51	worktree: unlock should return 1 when the worktree isn't locked The documentation states that git_worktree_unlock returns 0 on success, and 1 on success if the worktree wasn't locked. Turns out we were returning 0 in any of those cases. (cherry picked from commit 59c2e70eeee8b2bae79d05060599114a5f6d737a)
a5a0347d	2018-08-16T22:45:43	Fix leak in index.c (cherry picked from commit 581d5492f6afdaf31a10e51187466a80ffc9f76f)
0fefd899	2018-08-17T15:56:30	util: make the qsort_r check work on macOS This performs a compile-check by using CMake support, to differentiate the GNU version from the BSD version of qsort_r. Module taken from 4f252abea5f1d17c60f6ff115c9c44cc0b6f1df6, which I've checked against CMake 2.8.11. (cherry picked from commit 1a9cc18260b68b149476adb6f39e37ab47d3d21f)
fad59658	2018-08-09T12:48:26	diff: fix OOM on AIX when finding similar deltas in empty diff The function `git_diff_find_similar` keeps a function of cache similarity metrics signatures, whose size depends on the number of deltas passed in via the `diff` parameter. In case where the diff is empty and thus doesn't have any deltas at all, we may end up allocating this cache via a call to `git__calloc(0, sizeof(void *))`. At least on AIX, allocating 0 bytes will result in a `NULL` pointer being returned, which causes us to erroneously return an OOM error. Fix this situation by simply returning early in case where we are being passed an empty diff, as we cannot find any similarities in that case anyway. (cherry picked from commit c65568d8c8c1bf4920393190e862819cd263f439)
831a709d	2018-08-05T14:37:08	Add a comment (cherry picked from commit ec76a1aa43321db2451e747d7a4408e780991c4a)
ab3f6099	2018-08-05T14:25:22	Don't error on missing section, just continue (cherry picked from commit 019409be004fb73071415750e98db03d33fada47)
9023f8f6	2018-07-22T23:31:19	config_file: Don't crash on options without a section (cherry picked from commit c4d7fa951acd066fd80d83954dd6082c1c7e9e1e)
e1295400	2018-08-04T19:30:40	parse: Do not initialize the content in context to NULL String operations in libgit2 are supposed to never receive `NULL`, e.g. they are not `NULL`-save. In the case of `git__linenlen()`, invocation with `NULL` leads to undefined behavior. In a `git_parse_ctx` however, the `content` field used in these operations was initialized to `NULL` if the `git_parse_ctx_init()` was called with `NULL` for `content` or `0` for `content_len`. For the latter case, the initialization function even contained some logic for initializing `content` with `NULL`. This commit mitigates triggering undefined behavior by rewriting the logic. Now `content` is always initialized to a non-null buffer. Instead of a null buffer, an empty string is used for denoting an empty buffer. (cherry picked from commit d1bfe614aa20a0bdaf76c6d418176320ab11baf4)
b8844566	2018-07-27T12:00:37	tree: rename from_tree to validate and clarify the tree in the test (cherry picked from commit f00db9ed67423b04976f8d20b0de2ee1fb7c3993)
e91d6b5e	2018-09-27T11:18:00	fix check if blob is uninteresting when inserting tree to packbuilder Blobs that have been marked as uninteresting should not be inserted into packbuilder when inserting a tree. The check as to whether a blob was uninteresting looked at the status for the tree itself instead of the blob. This could cause significantly larger packfiles. (cherry picked from commit b36cc7a4013a47856dade4226edc657906b82431)
b69089fd	2018-09-17T14:49:46	revwalk: only check the first commit in the list for an earlier timestamp This is not a big deal, but it does make us match git more closely by checking only the first. The lists are sorted already, so there should be no functional difference other than removing a possible check from every iteration in the loop. (cherry picked from commit 12a1790d8e71087056d2b2de936ddae439e1d94c)
3cabf8d1	2018-09-17T14:39:58	revwalk: use the max value for a signed integer When porting, we overlooked that the difference between git's and our's time representation and copied their way of getting the max value. Unfortunately git was using unsigned integers, so `~0ll` does correspond to their max value, whereas for us it corresponds to `-1`. This means that we always consider the last date to be smaller than the current commit's and always think commits are interesting. Change the initial value to the macro that gives us the maximum value on each platform so we can accurately consider commits interesting or not. (cherry picked from commit 46f35127b6fcfab87cb80d1b772ac7c662eafd38)
b133ab9b	2018-09-26T21:17:39	vector: do not realloc 0-size vectors (cherry picked from commit e0afd1c21c4421cec4f67162021f835e2bbb7df6)
ebb0e37e	2018-09-26T19:15:35	vector: do not malloc 0-length vectors on dup (cherry picked from commit fa48d2ea7d2d5dc9620e5c9f05ba8d788775582b)
5fc7d59c	2018-09-11T15:53:35	index: release the snapshot instead of freeing the index Previously we would assert in index_free because the reader incrementation would not be balanced. Release the snapshot normally, so the variable gets decremented before the index is freed. (cherry picked from commit c70713d6e4af563696563e410864eb4a6507757d)
c37c6ed8	2018-07-18T21:04:13	tree: accept null ids in existing trees when updating When we add entries to a treebuilder we validate them. But we validate even those that we're adding because they exist in the base tree. This disables using the normal mechanisms on these trees, even to fix them. Keep track of whether the entry we're appending comes from an existing tree and bypass the name and id validation if it's from existing data. (cherry picked from commit 2dff7e282da77f6b791e843ec267d9ddecabc187)
a051bce7	2018-06-18T20:37:18	patch_parse: populate line numbers while parsing diffs (cherry picked from commit f9e28026753f7b6c871a160ad584b2dc2639d30f)
c62c6379	2018-04-20T08:38:50	worktree: a worktree can be made from a bare repository
a7fbb051	2018-04-18T22:40:46	repository: being a worktree means we're not really bare We were previously conflating any error into GIT_ENOTFOUND, which might or might not be correct. This fixes the code so a config error is bubbled up, as well as preserving the semantics in the face of worktree-repositories
76e57b4e	2018-10-18T11:29:06	index: avoid out-of-bounds read when reading reuc entry stage We use `git__strtol64` to parse file modes of the index entries, which does not limit the parsed buffer length. As the index can be essentially treated as "untrusted" in that the data stems from the file system, it may be misformatted and may not contain terminating `NUL` bytes. This may lead to out-of-bounds reads when trying to parse index entries with such malformatted modes. Fix the issue by using `git__strntol64` instead. (cherry picked from commit 600ceadd1426b874ae0618651210a690a68b27e9)
4dddcafb	2018-10-18T11:25:59	commit_list: avoid use of strtol64 without length limit When quick-parsing a commit, we use `git__strtol64` to parse the commit's time. The buffer that's passed to `commit_quick_parse` is the raw data of an ODB object, though, whose data may not be properly formatted and also does not have to be `NUL` terminated. This may lead to out-of-bound reads. Use `git__strntol64` to avoid this problem. (cherry picked from commit 1a3fa1f5fafd433bdcf1834426d6963eff532125)
1c02b896	2018-07-20T21:50:58	smart subtransport: free url when resetting stream Free the url field when resetting the stream to avoid leaking it. (cherry picked from commit ca2eb4608243162a13c427e74526b6422d5a6659)
4f0e5f70	2018-10-19T10:29:19	commit: fix reading out of bounds when parsing encoding The commit message encoding is currently being parsed by the `git__prefixcmp` function. As this function does not accept a buffer length, it will happily skip over a buffer's end if it is not `NUL` terminated. Fix the issue by using `git__prefixncmp` instead. Add a test that verifies that we are unable to parse the encoding field if it's cut off by the supplied buffer length. (cherry picked from commit 7655b2d89e8275853d9921dd903dcdad9b3d4a7b)
6e40bb3a	2018-10-19T09:47:50	tag: fix out of bounds read when searching for tag message When parsing tags, we skip all unknown fields that appear before the tag message. This skipping is done by using a plain `strstr(buffer, "\n\n")` to search for the two newlines that separate tag fields from tag message. As it is not possible to supply a buffer length to `strstr`, this call may skip over the buffer's end and thus result in an out of bounds read. As `strstr` may return a pointer that is out of bounds, the following computation of `buffer_end - buffer` will overflow and result in an allocation of an invalid length. Fix the issue by using `git__memmem` instead. Add a test that verifies parsing the tag fails not due to the allocation failure but due to the tag having no message. (cherry picked from commit ee11d47e3d907b66eeff99e0ba1e1c71e05164b7)
ab0d9fb2	2018-10-18T16:08:46	util: provide `git__memmem` function Unfortunately, neither the `memmem` nor the `strnstr` functions are part of any C standard but are merely extensions of C that are implemented by e.g. glibc. Thus, there is no standardized way to search for a string in a block of memory with a limited size, and using `strstr` is to be considered unsafe in case where the buffer has not been sanitized. In fact, there are some uses of `strstr` in exactly that unsafe way in our codebase. Provide a new function `git__memmem` that implements the `memmem` semantics. That is in a given haystack of `n` bytes, search for the occurrence of a byte sequence of `m` bytes and return a pointer to the first occurrence. The implementation chosen is the "Not So Naive" algorithm from [1]. It was chosen as the implementation is comparably simple while still being reasonably efficient in most cases. Preprocessing happens in constant time and space, searching has a time complexity of O(n*m) with a slightly sub-linear average case. [1]: http://www-igm.univ-mlv.fr/~lecroq/string/ (cherry picked from commit 83e8a6b36acc67f2702cbbc7d4e334c7f7737719)
c8e94f84	2018-10-18T15:08:56	util: fix out of bounds read in error message When an integer that is parsed with `git__strntol32` is too big to fit into an int32, we will generate an error message that includes the actual string that failed to parse. This does not acknowledge the fact that the string may either not be NUL terminated or alternative include additional characters after the number that is to be parsed. We may thus end up printing characters into the buffer that aren't the number or, worse, read out of bounds. Fix the issue by utilizing the `endptr` that was set by `git__strntol64`. This pointer is guaranteed to be set to the first character following the number, and we can thus use it to compute the width of the number that shall be printed. Create a test to verify that we correctly truncate the number. (cherry picked from commit ea19efc19fa683d632af3e172868bc4350724813)
0f7663a1	2018-10-18T14:37:55	util: avoid signed integer overflows in `git__strntol64` While `git__strntol64` tries to detect integer overflows when doing the necessary arithmetics to come up with the final result, it does the detection only after the fact. This check thus relies on undefined behavior of signed integer overflows. Fix this by instead checking up-front whether the multiplications or additions will overflow. Note that a detected overflow will not cause us to abort parsing the current sequence of digits. In the case of an overflow, previous behavior was to still set up the end pointer correctly to point to the first character immediately after the currently parsed number. We do not want to change this now as code may rely on the end pointer being set up correctly even if the parsed number is too big to be represented as 64 bit integer. (cherry picked from commit b09c1c7b636c4112e247adc24245c65f3f9478d0)
e2e7a9cb	2018-10-18T12:04:07	util: remove `git__strtol32` The function `git__strtol32` can easily be misused when untrusted data is passed to it that may not have been sanitized with trailing `NUL` bytes. As all usages of this function have now been removed, we can remove this function altogether to avoid future misuse of it. (cherry picked from commit 8d7fa88a9d5011b653035497b0f523e0f177b6a6)
2052a3cd	2018-10-18T11:58:14	global: replace remaining use of `git__strtol32` Replace remaining uses of the `git__strtol32` function. While these uses are all safe as the strings were either sanitized or from a trusted source, we want to remove `git__strtol32` altogether to avoid future misuse. (cherry picked from commit 2613fbb26a3e1a34dda8a5d198c108626cfd6cc3)
61165dd4	2018-10-18T11:43:30	tree-cache: avoid out-of-bound reads when parsing trees We use the `git__strtol32` function to parse the child and entry count of treecaches from the index, which do not accept a buffer length. As the buffer that is being passed in is untrusted data and may thus be malformed and may not contain a terminating `NUL` byte, we can overrun the buffer and thus perform an out-of-bounds read. Fix the issue by uzing `git__strntol32` instead. (cherry picked from commit 21652ee9de439e042cc2e69b208aa2ef8ce31147)
6b2b63e5	2018-10-18T11:37:10	util: remove unsafe `git__strtol64` function The function `git__strtol64` does not take a maximum buffer length as parameter. This has led to some unsafe usages of this function, and as such we may consider it as being unsafe to use. As we have now eradicated all usages of this function, let's remove it completely to avoid future misuse. (cherry picked from commit 68deb2cc80ef19bf3a1915c26b5308b283a6d69a)
334e6c69	2018-10-18T11:35:08	config: remove last instance of `git__strntol64` When parsing integers from configuration values, we use `git__strtol64`. This is fine to do, as we always sanitize values and can thus be sure that they'll have a terminating `NUL` byte. But as this is the last call-site of `git__strtol64`, let's just pass in the length explicitly by calling `strlen` on the value to be able to remove `git__strtol64` altogether. (cherry picked from commit 1a2efd10bde66f798375e2f47ba57ef00ad5c193)
5ce26b18	2018-10-18T11:32:48	signature: avoid out-of-bounds reads when parsing signature dates We use `git__strtol64` and `git__strtol32` to parse the trailing commit or author date and timezone of signatures. As signatures are usually part of a commit or tag object and thus essentially untrusted data, the buffer may be misformatted and may not be `NUL` terminated. This may lead to an out-of-bounds read. Fix the issue by using `git__strntol64` and `git__strntol32` instead. (cherry picked from commit 3db9aa6f79711103a331a2bbbd044a3c37d4f136)
c7f91f39	2018-08-06T12:00:21	odb: fix use of wrong printf formatters The `git_odb_stream` members `declared_size` and `received_bytes` are both of the type `git_off_t`, which we usually defined to be a 64 bit signed integer. Thus, passing these members to "PRIdZ" formatters is not correct, as they are not guaranteed to accept big enough numbers. Instead, use the "PRId64" formatter, which is able to represent 64 bit signed integers. (cherry picked from commit 0fcd05631a1f59e156e613448262800c155e79d0)
a221f58e	2018-10-05T11:47:39	submodule: ignore path and url attributes if they look like options These can be used to inject options in an implementation which performs a recursive clone by executing an external command via crafted url and path attributes such that it triggers a local executable to be run. The library is not vulnerable as we do not rely on external executables but a user of the library might be relying on that so we add this protection. This matches this aspect of git's fix for CVE-2018-17456.
614c266d	2018-10-05T10:56:02	config_file: properly ignore includes without "path" value In case a configuration includes a key "include.path=" without any value, the generated configuration entry will have its value set to `NULL`. This is unexpected by the logic handling includes, and as soon as we try to calculate the included path we will unconditionally dereference that `NULL` pointer and thus segfault. Fix the issue by returning early in both `parse_include` and `parse_conditional_include` in case where the `file` argument is `NULL`. Add a test to avoid future regression. The issue has been found by the oss-fuzz project, issue 10810. (cherry picked from commit d06d4220eec035466d1a837972a40546b8904330)
f5c3442b	2018-10-03T16:17:21	smart_pkt: do not accept callers passing in no line length Right now, we simply ignore the `linelen` parameter of `git_pkt_parse_line` in case the caller passed in zero. But in fact, we never want to assume anything about the provided buffer length and always want the caller to pass in the available number of bytes. And in fact, checking all the callers, one can see that the funciton is never being called in case where the buffer length is zero, and thus we are safe to remove this check. (cherry picked from commit 1bc5b05c614c7b10de021fa392943e8e6bd12c77)
f7c3f6cc	2018-08-09T11:16:15	smart_pkt: return parsed length via out-parameter The `parse_len` function currently directly returns the parsed length of a packet line or an error code in case there was an error. Instead, convert this to our usual style of using the return value as error code only and returning the actual value via an out-parameter. Thus, we can now convert the output parameter to an unsigned type, as the size of a packet cannot ever be negative. While at it, we also move the check whether the input buffer is long enough into `parse_len` itself. We don't really want to pass around potentially non-NUL-terminated buffers to functions without also passing along the length, as this is dangerous in the unlikely case where other callers for that function get added. Note that we need to make sure though to not mess with `GIT_EBUFS` error codes, as these indicate not an error to the caller but that he needs to fetch more data. (cherry picked from commit c05790a8a8dd4351e61fc06c0a06c6a6fb6134dc)
7e3cd611	2018-08-09T11:13:59	smart_pkt: reorder and rename parameters of `git_pkt_parse_line` The parameters of the `git_pkt_parse_line` function are quite confusing. First, there is no real indicator what the `out` parameter is actually all about, and it's not really clear what the `bufflen` parameter refers to. Reorder and rename the parameters to make this more obvious. (cherry picked from commit 0b3dfbf425d689101663341beb94237614f1b5c2)
356f60f4	2018-08-09T11:04:42	smart_pkt: fix buffer overflow when parsing "unpack" packets When checking whether an "unpack" packet returned the "ok" status or not, we use a call to `git__prefixcmp`. In case where the passed line isn't properly NUL terminated, though, this may overrun the line buffer. Fix this by using `git__prefixncmp` instead. (cherry picked from commit 5fabaca801e1f5e7a1054be612e8fabec7cd6a7f)
b5b7c303	2018-08-09T11:03:37	smart_pkt: fix "ng" parser accepting non-space character When parsing "ng" packets, we blindly assume that the character immediately following the "ng" prefix is a space and skip it. As the calling function doesn't make sure that this is the case, we can thus end up blindly accepting an invalid packet line. Fix the issue by using `git__prefixncmp`, checking whether the line starts with "ng ". (cherry picked from commit b5ba7af2d30c958b090dcf135749d9afe89ec703)
319f0c03	2018-08-09T11:01:00	smart_pkt: fix buffer overflow when parsing "ok" packets There are two different buffer overflows present when parsing "ok" packets. First, we never verify whether the line already ends after "ok", but directly go ahead and also try to skip the expected space after "ok". Second, we then go ahead and use `strchr` to scan for the terminating newline character. But in case where the line isn't terminated correctly, this can overflow the line buffer. Fix the issues by using `git__prefixncmp` to check for the "ok " prefix and only checking for a trailing '\n' instead of using `memchr`. This also fixes the issue of us always requiring a trailing '\n'. Reported by oss-fuzz, issue 9749: Crash Type: Heap-buffer-overflow READ {*} Crash Address: 0x6310000389c0 Crash State: ok_pkt git_pkt_parse_line git_smart__store_refs Sanitizer: address (ASAN) (cherry picked from commit a9f1ca09178af0640963e069a2142d5ced53f0b4)
0599c267	2018-08-09T10:38:10	smart_pkt: fix buffer overflow when parsing "ACK" packets We are being quite lenient when parsing "ACK" packets. First, we didn't correctly verify that we're not overrunning the provided buffer length, which we fix here by using `git__prefixncmp` instead of `git__prefixcmp`. Second, we do not verify that the actual contents make any sense at all, as we simply ignore errors when parsing the ACKs OID and any unknown status strings. This may result in a parsed packet structure with invalid contents, which is being silently passed to the caller. This is being fixed by performing proper input validation and checking of return codes. (cherry picked from commit bc349045b1be8fb3af2b02d8554483869e54d5b8)
0fe87761	2018-08-09T10:57:06	smart_pkt: adjust style of "ref" packet parsing function While the function parsing ref packets doesn't have any immediately obvious buffer overflows, it's style is different to all the other parsing functions. Instead of checking buffer length while we go, it does a check up-front. This causes the code to seem a lot more magical than it really is due to some magic constants. Refactor the function to instead make use of the style of other packet parser and verify buffer lengths as we go. (cherry picked from commit 5edcf5d190f3b379740b223ff6a649d08fa49581)
97156614	2018-08-09T10:46:58	smart_pkt: check whether error packets are prefixed with "ERR " In the `git_pkt_parse_line` function, we determine what kind of packet a given packet line contains by simply checking for the prefix of that line. Except for "ERR" packets, we always only check for the immediate identifier without the trailing space (e.g. we check for an "ACK" prefix, not for "ACK "). But for "ERR" packets, we do in fact include the trailing space in our check. This is not really much of a problem at all, but it is inconsistent with all the other packet types and thus causes confusion when the `err_pkt` function just immediately skips the space without checking whether it overflows the line buffer. Adjust the check in `git_pkt_parse_line` to not include the trailing space and instead move it into `err_pkt` for consistency. (cherry picked from commit 786426ea6ec2a76ffe2515dc5182705fb3d44603)
5c0d1100	2018-08-09T10:46:26	smart_pkt: explicitly avoid integer overflows when parsing packets When parsing data, progress or error packets, we need to copy the contents of the rest of the current packet line into the flex-array of the parsed packet. To keep track of this array's length, we then assign the remaining length of the packet line to the structure. We do have a mismatch of types here, as the structure's `len` field is a signed integer, while the length that we are assigning has type `size_t`. On nearly all platforms, this shouldn't pose any problems at all. The line length can at most be 16^4, as the line's length is being encoded by exactly four hex digits. But on a platforms with 16 bit integers, this assignment could cause an overflow. While such platforms will probably only exist in the embedded ecosystem, we still want to avoid this potential overflow. Thus, we now simply change the structure's `len` member to be of type `size_t` to avoid any integer promotion. (cherry picked from commit 40fd84cca68db24f325e460a40dabe805e7a5d35)
20e58aac	2018-08-09T10:36:44	smart_pkt: honor line length when determining packet type When we parse the packet type of an incoming packet line, we do not verify that we don't overflow the provided line buffer. Fix this by using `git__prefixncmp` instead and passing in `len`. As we have previously already verified that `len <= linelen`, we thus won't ever overflow the provided buffer length. (cherry picked from commit 4a5804c983317100eed509537edc32d69c8d7aa2)
003cbc3f	2018-06-24T19:47:08	Verify ref_pkt's are long enough If the remote sends a too-short packet, we'll allow `len` to go negative and eventually issue a malloc for <= 0 bytes on ``` pkt->head.name = git__malloc(alloclen); ``` (cherry picked from commit 437ee5a70711ac2e027877d71ee4ae17e5ec3d6c)
4385aef3	2017-08-22T16:29:07	smart: typedef git_pkt_type and clarify recv_pkt return type (cherry picked from commit 08961c9d0d6927bfcc725bd64c9a87dbcca0c52c)
21ffc57d	2018-06-28T05:27:36	Small style tweak, and set an error (cherry picked from commit 895a668e19dc596e7b12ea27724ceb7b68556106)
be98c9e9	2018-06-26T02:32:50	Remove GIT_PKT_PACK entirely (cherry picked from commit 90cf86070046fcffd5306915b57786da054d8964)
bf4342f7	2018-08-11T13:06:14	Fix 'invalid packet line' for ng packets containing errors (cherry picked from commit 50dd7fea5ad1bf6c013b72ad0aa803a9c84cdede)
15e92284	2018-09-05T11:49:13	Prevent heap-buffer-overflow When running repack while doing repo writes, `packfile_load__cb()` can see some temporary files in the directory that are bigger than the usual, and makes `memcmp` overflow on the `p->pack_name` string. ASAN detected this. This just uses `strncmp`, that should not have any performance impact and is safe for comparing strings of different sizes. ``` ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61200001a3f3 at pc 0x7f4a9e1976ec bp 0x7ffc1f80e100 sp 0x7ffc1f80d8b0 READ of size 89 at 0x61200001a3f3 thread T0 SCARINESS: 26 (multi-byte-read-heap-buffer-overflow) #0 0x7f4a9e1976eb in __interceptor_memcmp.part.78 (/build/cfgr-admin#link-tree/libtools_build_sanitizers_asan-ubsan-py.so+0xcf6eb) #1 0x7f4a518c5431 in packfile_load__cb /build/libgit2/0.27.0/src/libgit2-0.27.0/src/odb_pack.c:213 #2 0x7f4a518d9582 in git_path_direach /build/libgit2/0.27.0/src/libgit2-0.27.0/src/path.c:1134 #3 0x7f4a518c58ad in pack_backend__refresh /build/libgit2/0.27.0/src/libgit2-0.27.0/src/odb_pack.c:347 #4 0x7f4a518c1b12 in git_odb_refresh /build/libgit2/0.27.0/src/libgit2-0.27.0/src/odb.c:1511 #5 0x7f4a518bff5f in git_odb__freshen /build/libgit2/0.27.0/src/libgit2-0.27.0/src/odb.c:752 #6 0x7f4a518c17d4 in git_odb_stream_finalize_write /build/libgit2/0.27.0/src/libgit2-0.27.0/src/odb.c:1415 #7 0x7f4a51b9d015 in Repository_write /build/pygit2/0.27.0/src/pygit2-0.27.0/src/repository.c:509 ``` (cherry picked from commit d22cd1f4a4c10ff47b04c57560e6765d77e5a8fd)
39706ded	2018-09-03T10:49:46	config_parse: refactor error handling when parsing multiline variables The current error handling for the multiline variable parser is a bit fragile, as each error condition has its own code to clear memory. Instead, unify error handling as far as possible to avoid this repetitive code. While at it, make use of `GITERR_CHECK_ALLOC` to correctly handle OOM situations and verify that the buffer we print into does not run out of memory either. (cherry picked from commit bc63e1ef521ab5900dc0b0dcd578b8bf18627fb1)
68823395	2018-09-01T03:50:26	config: Fix a leak parsing multi-line config entries (cherry picked from commit 38b852558eb518f96c313cdcd9ce5a7af6ded194)
24c7b23d	2018-08-25T17:04:39	config: convert unbounded recursion into a loop (cherry picked from commit a03113e80332fba6c77f43b21d398caad50b4b89)
1f9a8510	2018-07-19T13:00:42	smart_pkt: fix potential OOB-read when processing ng packet OSS-fuzz has reported a potential out-of-bounds read when processing a "ng" smart packet: ==1==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6310000249c0 at pc 0x000000493a92 bp 0x7ffddc882cd0 sp 0x7ffddc882480 READ of size 65529 at 0x6310000249c0 thread T0 SCARINESS: 26 (multi-byte-read-heap-buffer-overflow) #0 0x493a91 in __interceptor_strchr.part.35 /src/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc:673 #1 0x813960 in ng_pkt libgit2/src/transports/smart_pkt.c:320:14 #2 0x810f79 in git_pkt_parse_line libgit2/src/transports/smart_pkt.c:478:9 #3 0x82c3c9 in git_smart__store_refs libgit2/src/transports/smart_protocol.c:47:12 #4 0x6373a2 in git_smart__connect libgit2/src/transports/smart.c:251:15 #5 0x57688f in git_remote_connect libgit2/src/remote.c:708:15 #6 0x52e59b in LLVMFuzzerTestOneInput /src/download_refs_fuzzer.cc:145:9 #7 0x52ef3f in ExecuteFilesOnyByOne(int, char**) /src/libfuzzer/afl/afl_driver.cpp:301:5 #8 0x52f4ee in main /src/libfuzzer/afl/afl_driver.cpp:339:12 #9 0x7f6c910db82f in __libc_start_main /build/glibc-Cl5G7W/glibc-2.23/csu/libc-start.c:291 #10 0x41d518 in _start When parsing an "ng" packet, we keep track of both the current position as well as the remaining length of the packet itself. But instead of taking care not to exceed the length, we pass the current pointer's position to `strchr`, which will search for a certain character until hitting NUL. It is thus possible to create a crafted packet which doesn't contain a NUL byte to trigger an out-of-bounds read. Fix the issue by instead using `memchr`, passing the remaining length as restriction. Furthermore, verify that we actually have enough bytes left to produce a match at all. OSS-Fuzz-Issue: 9406
c1577110	2018-07-05T13:30:46	delta: fix overflow when computing limit When checking whether a delta base offset and length fit into the base we have in memory already, we can trigger an overflow which breaks the check. This would subsequently result in us reading memory from out of bounds of the base. The issue is easily fixed by checking for overflow when adding `off` and `len`, thus guaranteeting that we are never indexing beyond `base_len`. This corresponds to the git patch 8960844a7 (check patch_delta bounds more carefully, 2006-04-07), which adds these overflow checks. Reported-by: Riccardo Schirone <rschiron@redhat.com>
9844d38b	2018-06-29T09:11:02	delta: fix out-of-bounds read of delta When computing the offset and length of the delta base, we repeatedly increment the `delta` pointer without checking whether we have advanced past its end already, which can thus result in an out-of-bounds read. Fix this by repeatedly checking whether we have reached the end. Add a test which would cause Valgrind to produce an error. Reported-by: Riccardo Schirone <rschiron@redhat.com> Test-provided-by: Riccardo Schirone <rschiron@redhat.com>
3f461902	2018-06-29T07:45:18	delta: fix sign-extension of big left-shift Our delta code was originally adapted from JGit, which itself adapted it from git itself. Due to this heritage, we inherited a bug from git.git in how we compute the delta offset, which was fixed upstream in 48fb7deb5 (Fix big left-shifts of unsigned char, 2009-06-17). As explained by Linus: Shifting 'unsigned char' or 'unsigned short' left can result in sign extension errors, since the C integer promotion rules means that the unsigned char/short will get implicitly promoted to a signed 'int' due to the shift (or due to other operations). This normally doesn't matter, but if you shift things up sufficiently, it will now set the sign bit in 'int', and a subsequent cast to a bigger type (eg 'long' or 'unsigned long') will now sign-extend the value despite the original expression being unsigned. One example of this would be something like unsigned long size; unsigned char c; size += c << 24; where despite all the variables being unsigned, 'c << 24' ends up being a signed entity, and will get sign-extended when then doing the addition in an 'unsigned long' type. Since git uses 'unsigned char' pointers extensively, we actually have this bug in a couple of places. In our delta code, we inherited such a bogus shift when computing the offset at which the delta base is to be found. Due to the sign extension we can end up with an offset where all the bits are set. This can allow an arbitrary memory read, as the addition in `base_len < off + len` can now overflow if `off` has all its bits set. Fix the issue by casting the result of `*delta++ << 24UL` to an unsigned integer again. Add a test with a crafted delta that would actually succeed with an out-of-bounds read in case where the cast wouldn't exist. Reported-by: Riccardo Schirone <rschiron@redhat.com> Test-provided-by: Riccardo Schirone <rschiron@redhat.com>
7392799d	2018-05-30T08:35:06	submodule: detect duplicated submodule paths When loading submodule names, we build a map of submodule paths and their respective names. While looping over the configuration keys, we do not check though whether a submodule path was seen already. This leads to a memory leak in case we have multiple submodules with the same path, as we just overwrite the old value in the map in that case. Fix the error by verifying that the path to be added is not yet part of the string map. Git does not allow to have multiple submodules for a path anyway, so we now do the same and detect this duplication, reporting it to the user.
f2e5c092	2018-04-27T15:31:43	cmake: remove now-useless LIBGIT2_LIBDIRS handling With the recent change of always resolving pkg-config libraries to their full path, we do not have to manage the LIBGIT2_LIBDIRS variable anymore. The only other remaining user of LIBGIT2_LIBDIRS is winhttp, which is a CMake-style library target and can thus be resolved by CMake automatically. Remove the variable to simplify our build system a bit.
0c8ff50f	2018-04-27T10:38:49	cmake: resolve libraries found by pkg-config Libraries found by CMake modules are usually handled with their full path. This makes linking against those libraries a lot more robust when it comes to libraries in non-standard locations, as otherwise we might mix up libraries from different locations when link directories are given. One excemption are libraries found by PKG_CHECK_MODULES. Instead of returning libraries with their complete path, it will return the variable names as well as a set of link directories. In case where multiple sets of the same library are installed in different locations, this can lead the compiler to link against the wrong libraries in the end, when link directories of other dependencies are added. To fix this shortcoming, we need to manually resolve library paths returned by CMake against their respective library directories. This is an easy task to do with `FIND_LIBRARY`.
b2f3ff56	2018-04-19T01:08:18	worktree: fix calloc of the wrong object type
8fa0b34b	2018-04-19T01:05:05	local: fix a leaking reference when iterating over a symref Valgrind log : ==17702== 18 bytes in 1 blocks are indirectly lost in loss record 69 of 1,123 ==17702== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==17702== by 0x5FDBB49: strdup (strdup.c:42) ==17702== by 0x632B3E: git__strdup (util.h:106) ==17702== by 0x632D2C: git_reference__alloc_symbolic (refs.c:64) ==17702== by 0x62E0AF: loose_lookup (refdb_fs.c:408) ==17702== by 0x62E636: refdb_fs_backend__iterator_next (refdb_fs.c:565) ==17702== by 0x62CD8E: git_refdb_iterator_next (refdb.c:147) ==17702== by 0x6347F2: git_reference_next (refs.c:838) ==17702== by 0x6345CB: git_reference_foreach (refs.c:748) ==17702== by 0x66BE62: local_download_pack (local.c:579) ==17702== by 0x5DB48F: git_fetch_download_pack (fetch.c:148) ==17702== by 0x639028: git_remote_download (remote.c:932) ==17702== by 0x63919A: git_remote_fetch (remote.c:969) ==17702== by 0x4ABEDD: test_fetchhead_nonetwork__fetch_into_repo_with_symrefs (nonetwork.c:362) ==17702== by 0x4125D9: clar_run_test (clar.c:222) ==17702== by 0x41287C: clar_run_suite (clar.c:286) ==17702== by 0x412DDE: clar_test_run (clar.c:433) ==17702== by 0x4105E1: main (main.c:24)
2fe887e6	2018-04-18T20:57:16	remote: repo is optional here As per CID:1378747, we might be called with a NULL repo, which would be deferenced in write_add_refspec
96329606	2018-03-11T15:35:56	worktree: Read worktree specific reflog for HEAD Signed-off-by: Sven Strickroth <email@cs-ware.de>
a137cdbd	2018-04-18T21:41:44	refspec: check for valid parameters in git_refspec__dwim_one CID:1383993, "In git_refspec__dwim_one: All paths that lead to this null pointer comparison already dereference the pointer earlier (CWE-476)"
7fa6c8ce	2018-03-29T10:18:51	util: fix missing headers for MinGW environments There are multiple references to undefined functions in the Microsoft builds. Add headers to make them known.
0f09d9f5	2018-04-02T20:00:07	Fix build with LibreSSL 2.7 LibreSSL 2.7 adds OpenSSL 1.1 API Signed-off-by: Bernard Spil <brnrd@FreeBSD.org>
16b62dd4	2018-04-04T21:28:31	diff: Add missing GIT_DELTA_TYPECHANGE -> 'T' mapping. This adds the 'T' status character to git_diff_status_char() for diff entries that change type.
07011e60	2018-04-12T13:32:27	revwalk: fix uninteresting revs sometimes not limiting graphwalk When we want to limit our graphwalk, we use the heuristic of checking whether the newest limiting (uninteresting) revision is newer than the oldest interesting revision. We do so by inspecting whether the first item's commit time of the user-supplied list of revisions is newer than the last added interesting revision. This is wrong though, as the user supplied list is in no way guaranteed to be sorted by increasing commit dates. This could lead us to abort the revwalk early before applying all relevant limiting revisions, outputting revisions which should in fact have been hidden. Fix the heuristic by instead checking whether _any_ of the limiting commits was made earlier than the last interesting commit. Add a test.
0f88adb6	2018-02-08T12:36:47	Submodule API should report .gitmodules parse errors Signed-off-by: Sven Strickroth <email@cs-ware.de>
e2a80124	2018-04-10T21:16:43	refs: preserve the owning refdb when duping reference This fixes a segfault in git_reference_owner on references returned from git_reference__read_head and git_reference_dup ones.
b260fdc8	2018-04-06T12:24:10	attr_file: fix handling of directory patterns with trailing spaces When comparing whether a path matches a directory rule, we pass the both the path and directory name to `fnmatch` with `GIT_ATTR_FNMATCH_DIRECTORY` being set. `fnmatch` expects the pattern to contain no trailing directory '/', which is why we try to always strip patterns of trailing slashes. We do not handle that case correctly though when the pattern itself has trailing spaces, causing the match to fail. Fix the issue by stripping trailing spaces and tabs for a rule previous to checking whether the pattern is a directory pattern with a trailing '/'. This replaces the whitespace-stripping in our ignore file parsing code, which was stripping whitespaces too late. Add a test to catch future breakage.
a714e836	2018-04-06T10:39:16	transports: local: fix assert when fetching into repo with symrefs When fetching into a repository which has symbolic references via the "local" transport we run into an assert. The assert is being triggered while we negotiate the packfile between the two repositories. When hiding known revisions from the packbuilder revwalk, we unconditionally hide all references of the local refdb. In case one of these references is a symbolic reference, though, this means we're trying to hide a `NULL` OID, which triggers the assert. Fix the issue by only hiding OID references from the revwalk. Add a test to catch this issue in the future.
59012bf4	2018-03-29T09:15:48	odb: mempack: fix leaking objects when freeing mempacks When a ODB mempack gets free'd, we take no measures at all to free its contents, most notably the objects added to the database, resulting in a memory leak. Call `git_mempack_reset` previous to freeing the ODB structures themselves, which takes care of releasing all associated data structures.
4d4a7dbf	2018-03-28T17:37:39	sha1dc: update to fix errors with endianess This updates the version of SHA1DC to c3e1304ea3.
b89988c7	2018-03-27T15:03:15	transports: ssh: replace deprecated function `libssh2_session_startup` The function `libssh2_session_startup` has been deprecated since libssh2 version 1.2.8 in favor of `libssh2_session_handshake` introduced in the same version. libssh2 1.2.8 was released in April 2011, so it is already seven years old. It is available in Debian Wheezy, Ubuntu Trusty and CentOS 7.4, so the most important and conservative distros already have it available. As such, it seems safe to just use the new function.
b2e7d8c2	2018-03-27T14:49:21	transports: ssh: disconnect session before freeing it The function `ssh_stream_free` takes over the responsibility of closing channels and streams just before freeing their memory, but it does not do so for the session. In fact, we never disconnect the session ourselves at all, as libssh2 will not do so itself upon freeing the structure. Quoting the documentation of `libssh2_session_free`: > Frees all resources associated with a session instance. Typically > called after libssh2_session_disconnect_ex, The missing disconnect probably stems from a misunderstanding what it actually does. As we are already closing the TCP socket ourselves, the assumption was that no additional disconnect is required. But calling `libssh2_session_disconnect` will notify the server that we are cleanly closing the connection, such that the server can free his own resources. Add a call to `libssh2_session_disconnect` to fix that issue. [1]: https://www.libssh2.org/libssh2_session_free.html
cfed1be8	2018-05-24T20:28:36	submodule: plug leaks from the escape detection
f650153a	2018-05-24T19:05:59	submodule: replace index with strchr which exists on Windows

d04c1aa0

2018-11-28T13:36:47

config: fix adding files if their parent directory is a file When we try to add a configuration file with `git_config_add_file_ondisk`, we treat nonexisting files as empty. We do this by performing a stat call, ignoring ENOENT errors. This works just fine in case the file or any of its parents simply does not exist, but there is also the case where any of the parent directories is not a directory, but a file. So e.g. trying to add a configuration file "/dev/null/.gitconfig" will fail, as `errno` will be ENOTDIR instead of ENOENT. Catch ENOTDIR in addition to ENOENT to fix the issue. Add a test that verifies we are able to add configuration files with such an invalid path file just fine.

ac5879de

2018-12-06T10:48:20

Typesetting conventions

82209544

2018-12-04T10:12:24

Fix segfault in loose_backend__readstream If the routine exits with error before stream or hash_ctx is initialized, the program will segfault when trying to free them.

c3b2053d

2018-12-04T21:48:12

make proxy_stream_close close target stream even on errors When git_filter_apply_fn callback returns a error while smudging proxy_stream_close ends up returning without closing the stream. This is turn makes blob_content_to_file crash as it asserts the stream being closed whether there are errors or not. Closing the target stream on error fixes this problem.

a6ffb1ce

2018-12-04T10:59:25

Removed one null check

8ce7272d

2018-12-14T14:43:09

annotated_commit: peel to commit instead of assuming we have one We want to allow the creation of annotated commits out of annotated tags and for that we have to peel the reference all the way to the commit instead of stopping at the first id it provides.

0573e78d

2018-12-14T14:41:17

refs: constify git_reference_peel We have no need to take a non-const reference. This does involve some other work to make sure we don't mix const and non-const variables, but by splitting what we want each variable to do we can also simplify the logic for when we do want to free a new reference we might have allocated.

42c4c624

2018-08-17T18:51:56

config: assert that our parameters are valid CID 1395011

a09a1028

2018-10-24T01:30:12

refs: assert that we're passed valid refs when renaming CID 1382962

c7469503

2018-10-24T01:26:48

diff: assert that we're passed a valid git_diff object CID 1386176, 1386177, 1388219

a4b332e8

2018-10-24T01:21:21

submodule: grab the error while loading from config Previously, an error in `git_config_next` would be mistaken as a successful load, because the previous call would have succeeded. Coverity saw the subsequent check for a completed iteration as dead, so let's make it useful again. CID 1391374

8bc913a2

2018-11-07T15:31:21

smart transport: only clear url on hard reset After creating a transport for a server, we expect to be able to call `connect`, then invoke subsequent `action` calls. We provide the URL to these `action` calls, although our built-in transports happen to ignore it since they've already parsed it into an internal format that they intend to use (`gitno_connection_data`). In ca2eb4608243162a13c427e74526b6422d5a6659, we began clearing the URL field after a connection, meaning that subsequent calls to transport `action` callbacks would get a NULL URL, which went undetected since the builtin transports ignore the URL when they're already connected (instead of re-parsing it into an internal format). Downstream custom transport implementations (eg, LibGit2Sharp) did notice this change, however. Since `reset_stream` is called even when we're not closing the subtransport, update to only clear the URL when we're closing the subtransport. This ensures that `action` calls will get the correct URL information even after a connection.

80890b9a

2018-07-20T08:20:48

winhttp: retry erroneously failing requests Early Windows TLS 1.2 implementations have an issue during key exchange with OpenSSL implementations that cause negotiation to fail with the error "the buffer supplied to a function was too small." This is a transient error on the connection, so when that error is received, retry up to 5 times to create a connection to the remote server before actually giving up. (cherry picked from commit dc371e3c5903760cc2334a0acfac9bce04479773)

ebdca44a

2018-09-02T11:38:43

git_remote_prune to be O(n * logn) (cherry picked from commit bfec6526e931d7f6ac5ecc38c37e76163092bfda)

a77b64a5

2018-10-06T12:58:06

ignore unsupported http authentication schemes auth_context_match returns 0 instead of -1 for unknown schemes to not fail in situations where some authentication schemes are supported and others are not. apply_credentials is adjusted to handle auth_context_match returning 0 without producing authentication context. (cherry picked from commit 475db39bb4c44a2221f340c66c227f555e478d10)

4ca4c7d1

2018-08-21T02:11:32

transport/http: do not return success if we failed to get a scheme Otherwise we return a NULL context, which will get dereferenced in apply_credentials. (cherry picked from commit 1c949ce1483ca22a29e8f523360999cbbe411a50)

01b8bbf4

2018-08-21T01:55:56

remote: set the error before cleanup Otherwise we'll return stack data to the caller. (cherry picked from commit 22d013b657c5957fde31641351cb72d08cc192ae)

c0e038a2

2018-08-21T01:12:11

revwalk: The left operand of '<' is a garbage value At line 594, we do this : if (error < 0) return error; but if nothing was pushed in a GIT_SORT_TIME revwalk, we'd return uninitialized stack data. (cherry picked from commit aa8cb5866f1eabd92c8c08f7a8610d42df07375f)

ef7d7def

2018-08-17T00:51:51

worktree: unlock should return 1 when the worktree isn't locked The documentation states that git_worktree_unlock returns 0 on success, and 1 on success if the worktree wasn't locked. Turns out we were returning 0 in any of those cases. (cherry picked from commit 59c2e70eeee8b2bae79d05060599114a5f6d737a)

a5a0347d

2018-08-16T22:45:43

Fix leak in index.c (cherry picked from commit 581d5492f6afdaf31a10e51187466a80ffc9f76f)

0fefd899

2018-08-17T15:56:30

util: make the qsort_r check work on macOS This performs a compile-check by using CMake support, to differentiate the GNU version from the BSD version of qsort_r. Module taken from 4f252abea5f1d17c60f6ff115c9c44cc0b6f1df6, which I've checked against CMake 2.8.11. (cherry picked from commit 1a9cc18260b68b149476adb6f39e37ab47d3d21f)

fad59658

2018-08-09T12:48:26

diff: fix OOM on AIX when finding similar deltas in empty diff The function `git_diff_find_similar` keeps a function of cache similarity metrics signatures, whose size depends on the number of deltas passed in via the `diff` parameter. In case where the diff is empty and thus doesn't have any deltas at all, we may end up allocating this cache via a call to `git__calloc(0, sizeof(void *))`. At least on AIX, allocating 0 bytes will result in a `NULL` pointer being returned, which causes us to erroneously return an OOM error. Fix this situation by simply returning early in case where we are being passed an empty diff, as we cannot find any similarities in that case anyway. (cherry picked from commit c65568d8c8c1bf4920393190e862819cd263f439)

831a709d

2018-08-05T14:37:08

Add a comment (cherry picked from commit ec76a1aa43321db2451e747d7a4408e780991c4a)

ab3f6099

2018-08-05T14:25:22

Don't error on missing section, just continue (cherry picked from commit 019409be004fb73071415750e98db03d33fada47)

9023f8f6

2018-07-22T23:31:19

config_file: Don't crash on options without a section (cherry picked from commit c4d7fa951acd066fd80d83954dd6082c1c7e9e1e)

e1295400

2018-08-04T19:30:40

parse: Do not initialize the content in context to NULL String operations in libgit2 are supposed to never receive `NULL`, e.g. they are not `NULL`-save. In the case of `git__linenlen()`, invocation with `NULL` leads to undefined behavior. In a `git_parse_ctx` however, the `content` field used in these operations was initialized to `NULL` if the `git_parse_ctx_init()` was called with `NULL` for `content` or `0` for `content_len`. For the latter case, the initialization function even contained some logic for initializing `content` with `NULL`. This commit mitigates triggering undefined behavior by rewriting the logic. Now `content` is always initialized to a non-null buffer. Instead of a null buffer, an empty string is used for denoting an empty buffer. (cherry picked from commit d1bfe614aa20a0bdaf76c6d418176320ab11baf4)

b8844566

2018-07-27T12:00:37

tree: rename from_tree to validate and clarify the tree in the test (cherry picked from commit f00db9ed67423b04976f8d20b0de2ee1fb7c3993)

e91d6b5e

2018-09-27T11:18:00

fix check if blob is uninteresting when inserting tree to packbuilder Blobs that have been marked as uninteresting should not be inserted into packbuilder when inserting a tree. The check as to whether a blob was uninteresting looked at the status for the tree itself instead of the blob. This could cause significantly larger packfiles. (cherry picked from commit b36cc7a4013a47856dade4226edc657906b82431)

b69089fd

2018-09-17T14:49:46

revwalk: only check the first commit in the list for an earlier timestamp This is not a big deal, but it does make us match git more closely by checking only the first. The lists are sorted already, so there should be no functional difference other than removing a possible check from every iteration in the loop. (cherry picked from commit 12a1790d8e71087056d2b2de936ddae439e1d94c)

3cabf8d1

2018-09-17T14:39:58

revwalk: use the max value for a signed integer When porting, we overlooked that the difference between git's and our's time representation and copied their way of getting the max value. Unfortunately git was using unsigned integers, so `~0ll` does correspond to their max value, whereas for us it corresponds to `-1`. This means that we always consider the last date to be smaller than the current commit's and always think commits are interesting. Change the initial value to the macro that gives us the maximum value on each platform so we can accurately consider commits interesting or not. (cherry picked from commit 46f35127b6fcfab87cb80d1b772ac7c662eafd38)

b133ab9b

2018-09-26T21:17:39

vector: do not realloc 0-size vectors (cherry picked from commit e0afd1c21c4421cec4f67162021f835e2bbb7df6)

ebb0e37e

2018-09-26T19:15:35

vector: do not malloc 0-length vectors on dup (cherry picked from commit fa48d2ea7d2d5dc9620e5c9f05ba8d788775582b)

5fc7d59c

2018-09-11T15:53:35

index: release the snapshot instead of freeing the index Previously we would assert in index_free because the reader incrementation would not be balanced. Release the snapshot normally, so the variable gets decremented before the index is freed. (cherry picked from commit c70713d6e4af563696563e410864eb4a6507757d)

c37c6ed8

2018-07-18T21:04:13

tree: accept null ids in existing trees when updating When we add entries to a treebuilder we validate them. But we validate even those that we're adding because they exist in the base tree. This disables using the normal mechanisms on these trees, even to fix them. Keep track of whether the entry we're appending comes from an existing tree and bypass the name and id validation if it's from existing data. (cherry picked from commit 2dff7e282da77f6b791e843ec267d9ddecabc187)

a051bce7

2018-06-18T20:37:18

patch_parse: populate line numbers while parsing diffs (cherry picked from commit f9e28026753f7b6c871a160ad584b2dc2639d30f)

c62c6379

2018-04-20T08:38:50

worktree: a worktree can be made from a bare repository

a7fbb051

2018-04-18T22:40:46

repository: being a worktree means we're not really bare We were previously conflating any error into GIT_ENOTFOUND, which might or might not be correct. This fixes the code so a config error is bubbled up, as well as preserving the semantics in the face of worktree-repositories

76e57b4e

2018-10-18T11:29:06

index: avoid out-of-bounds read when reading reuc entry stage We use `git__strtol64` to parse file modes of the index entries, which does not limit the parsed buffer length. As the index can be essentially treated as "untrusted" in that the data stems from the file system, it may be misformatted and may not contain terminating `NUL` bytes. This may lead to out-of-bounds reads when trying to parse index entries with such malformatted modes. Fix the issue by using `git__strntol64` instead. (cherry picked from commit 600ceadd1426b874ae0618651210a690a68b27e9)

4dddcafb

2018-10-18T11:25:59

commit_list: avoid use of strtol64 without length limit When quick-parsing a commit, we use `git__strtol64` to parse the commit's time. The buffer that's passed to `commit_quick_parse` is the raw data of an ODB object, though, whose data may not be properly formatted and also does not have to be `NUL` terminated. This may lead to out-of-bound reads. Use `git__strntol64` to avoid this problem. (cherry picked from commit 1a3fa1f5fafd433bdcf1834426d6963eff532125)

1c02b896

2018-07-20T21:50:58

smart subtransport: free url when resetting stream Free the url field when resetting the stream to avoid leaking it. (cherry picked from commit ca2eb4608243162a13c427e74526b6422d5a6659)

4f0e5f70

2018-10-19T10:29:19

commit: fix reading out of bounds when parsing encoding The commit message encoding is currently being parsed by the `git__prefixcmp` function. As this function does not accept a buffer length, it will happily skip over a buffer's end if it is not `NUL` terminated. Fix the issue by using `git__prefixncmp` instead. Add a test that verifies that we are unable to parse the encoding field if it's cut off by the supplied buffer length. (cherry picked from commit 7655b2d89e8275853d9921dd903dcdad9b3d4a7b)

6e40bb3a

2018-10-19T09:47:50

tag: fix out of bounds read when searching for tag message When parsing tags, we skip all unknown fields that appear before the tag message. This skipping is done by using a plain `strstr(buffer, "\n\n")` to search for the two newlines that separate tag fields from tag message. As it is not possible to supply a buffer length to `strstr`, this call may skip over the buffer's end and thus result in an out of bounds read. As `strstr` may return a pointer that is out of bounds, the following computation of `buffer_end - buffer` will overflow and result in an allocation of an invalid length. Fix the issue by using `git__memmem` instead. Add a test that verifies parsing the tag fails not due to the allocation failure but due to the tag having no message. (cherry picked from commit ee11d47e3d907b66eeff99e0ba1e1c71e05164b7)

ab0d9fb2

2018-10-18T16:08:46

util: provide `git__memmem` function Unfortunately, neither the `memmem` nor the `strnstr` functions are part of any C standard but are merely extensions of C that are implemented by e.g. glibc. Thus, there is no standardized way to search for a string in a block of memory with a limited size, and using `strstr` is to be considered unsafe in case where the buffer has not been sanitized. In fact, there are some uses of `strstr` in exactly that unsafe way in our codebase. Provide a new function `git__memmem` that implements the `memmem` semantics. That is in a given haystack of `n` bytes, search for the occurrence of a byte sequence of `m` bytes and return a pointer to the first occurrence. The implementation chosen is the "Not So Naive" algorithm from [1]. It was chosen as the implementation is comparably simple while still being reasonably efficient in most cases. Preprocessing happens in constant time and space, searching has a time complexity of O(n*m) with a slightly sub-linear average case. [1]: http://www-igm.univ-mlv.fr/~lecroq/string/ (cherry picked from commit 83e8a6b36acc67f2702cbbc7d4e334c7f7737719)

c8e94f84

2018-10-18T15:08:56

util: fix out of bounds read in error message When an integer that is parsed with `git__strntol32` is too big to fit into an int32, we will generate an error message that includes the actual string that failed to parse. This does not acknowledge the fact that the string may either not be NUL terminated or alternative include additional characters after the number that is to be parsed. We may thus end up printing characters into the buffer that aren't the number or, worse, read out of bounds. Fix the issue by utilizing the `endptr` that was set by `git__strntol64`. This pointer is guaranteed to be set to the first character following the number, and we can thus use it to compute the width of the number that shall be printed. Create a test to verify that we correctly truncate the number. (cherry picked from commit ea19efc19fa683d632af3e172868bc4350724813)

0f7663a1

2018-10-18T14:37:55

util: avoid signed integer overflows in `git__strntol64` While `git__strntol64` tries to detect integer overflows when doing the necessary arithmetics to come up with the final result, it does the detection only after the fact. This check thus relies on undefined behavior of signed integer overflows. Fix this by instead checking up-front whether the multiplications or additions will overflow. Note that a detected overflow will not cause us to abort parsing the current sequence of digits. In the case of an overflow, previous behavior was to still set up the end pointer correctly to point to the first character immediately after the currently parsed number. We do not want to change this now as code may rely on the end pointer being set up correctly even if the parsed number is too big to be represented as 64 bit integer. (cherry picked from commit b09c1c7b636c4112e247adc24245c65f3f9478d0)

e2e7a9cb

2018-10-18T12:04:07

util: remove `git__strtol32` The function `git__strtol32` can easily be misused when untrusted data is passed to it that may not have been sanitized with trailing `NUL` bytes. As all usages of this function have now been removed, we can remove this function altogether to avoid future misuse of it. (cherry picked from commit 8d7fa88a9d5011b653035497b0f523e0f177b6a6)

2052a3cd

2018-10-18T11:58:14

global: replace remaining use of `git__strtol32` Replace remaining uses of the `git__strtol32` function. While these uses are all safe as the strings were either sanitized or from a trusted source, we want to remove `git__strtol32` altogether to avoid future misuse. (cherry picked from commit 2613fbb26a3e1a34dda8a5d198c108626cfd6cc3)

61165dd4

2018-10-18T11:43:30

tree-cache: avoid out-of-bound reads when parsing trees We use the `git__strtol32` function to parse the child and entry count of treecaches from the index, which do not accept a buffer length. As the buffer that is being passed in is untrusted data and may thus be malformed and may not contain a terminating `NUL` byte, we can overrun the buffer and thus perform an out-of-bounds read. Fix the issue by uzing `git__strntol32` instead. (cherry picked from commit 21652ee9de439e042cc2e69b208aa2ef8ce31147)

6b2b63e5

2018-10-18T11:37:10

util: remove unsafe `git__strtol64` function The function `git__strtol64` does not take a maximum buffer length as parameter. This has led to some unsafe usages of this function, and as such we may consider it as being unsafe to use. As we have now eradicated all usages of this function, let's remove it completely to avoid future misuse. (cherry picked from commit 68deb2cc80ef19bf3a1915c26b5308b283a6d69a)

334e6c69

2018-10-18T11:35:08

config: remove last instance of `git__strntol64` When parsing integers from configuration values, we use `git__strtol64`. This is fine to do, as we always sanitize values and can thus be sure that they'll have a terminating `NUL` byte. But as this is the last call-site of `git__strtol64`, let's just pass in the length explicitly by calling `strlen` on the value to be able to remove `git__strtol64` altogether. (cherry picked from commit 1a2efd10bde66f798375e2f47ba57ef00ad5c193)

5ce26b18

2018-10-18T11:32:48

signature: avoid out-of-bounds reads when parsing signature dates We use `git__strtol64` and `git__strtol32` to parse the trailing commit or author date and timezone of signatures. As signatures are usually part of a commit or tag object and thus essentially untrusted data, the buffer may be misformatted and may not be `NUL` terminated. This may lead to an out-of-bounds read. Fix the issue by using `git__strntol64` and `git__strntol32` instead. (cherry picked from commit 3db9aa6f79711103a331a2bbbd044a3c37d4f136)

c7f91f39

2018-08-06T12:00:21

odb: fix use of wrong printf formatters The `git_odb_stream` members `declared_size` and `received_bytes` are both of the type `git_off_t`, which we usually defined to be a 64 bit signed integer. Thus, passing these members to "PRIdZ" formatters is not correct, as they are not guaranteed to accept big enough numbers. Instead, use the "PRId64" formatter, which is able to represent 64 bit signed integers. (cherry picked from commit 0fcd05631a1f59e156e613448262800c155e79d0)

a221f58e

2018-10-05T11:47:39

submodule: ignore path and url attributes if they look like options These can be used to inject options in an implementation which performs a recursive clone by executing an external command via crafted url and path attributes such that it triggers a local executable to be run. The library is not vulnerable as we do not rely on external executables but a user of the library might be relying on that so we add this protection. This matches this aspect of git's fix for CVE-2018-17456.

614c266d

2018-10-05T10:56:02

config_file: properly ignore includes without "path" value In case a configuration includes a key "include.path=" without any value, the generated configuration entry will have its value set to `NULL`. This is unexpected by the logic handling includes, and as soon as we try to calculate the included path we will unconditionally dereference that `NULL` pointer and thus segfault. Fix the issue by returning early in both `parse_include` and `parse_conditional_include` in case where the `file` argument is `NULL`. Add a test to avoid future regression. The issue has been found by the oss-fuzz project, issue 10810. (cherry picked from commit d06d4220eec035466d1a837972a40546b8904330)

f5c3442b

2018-10-03T16:17:21

smart_pkt: do not accept callers passing in no line length Right now, we simply ignore the `linelen` parameter of `git_pkt_parse_line` in case the caller passed in zero. But in fact, we never want to assume anything about the provided buffer length and always want the caller to pass in the available number of bytes. And in fact, checking all the callers, one can see that the funciton is never being called in case where the buffer length is zero, and thus we are safe to remove this check. (cherry picked from commit 1bc5b05c614c7b10de021fa392943e8e6bd12c77)

f7c3f6cc

2018-08-09T11:16:15

smart_pkt: return parsed length via out-parameter The `parse_len` function currently directly returns the parsed length of a packet line or an error code in case there was an error. Instead, convert this to our usual style of using the return value as error code only and returning the actual value via an out-parameter. Thus, we can now convert the output parameter to an unsigned type, as the size of a packet cannot ever be negative. While at it, we also move the check whether the input buffer is long enough into `parse_len` itself. We don't really want to pass around potentially non-NUL-terminated buffers to functions without also passing along the length, as this is dangerous in the unlikely case where other callers for that function get added. Note that we need to make sure though to not mess with `GIT_EBUFS` error codes, as these indicate not an error to the caller but that he needs to fetch more data. (cherry picked from commit c05790a8a8dd4351e61fc06c0a06c6a6fb6134dc)

7e3cd611

2018-08-09T11:13:59

smart_pkt: reorder and rename parameters of `git_pkt_parse_line` The parameters of the `git_pkt_parse_line` function are quite confusing. First, there is no real indicator what the `out` parameter is actually all about, and it's not really clear what the `bufflen` parameter refers to. Reorder and rename the parameters to make this more obvious. (cherry picked from commit 0b3dfbf425d689101663341beb94237614f1b5c2)

356f60f4

2018-08-09T11:04:42

smart_pkt: fix buffer overflow when parsing "unpack" packets When checking whether an "unpack" packet returned the "ok" status or not, we use a call to `git__prefixcmp`. In case where the passed line isn't properly NUL terminated, though, this may overrun the line buffer. Fix this by using `git__prefixncmp` instead. (cherry picked from commit 5fabaca801e1f5e7a1054be612e8fabec7cd6a7f)

b5b7c303

2018-08-09T11:03:37

smart_pkt: fix "ng" parser accepting non-space character When parsing "ng" packets, we blindly assume that the character immediately following the "ng" prefix is a space and skip it. As the calling function doesn't make sure that this is the case, we can thus end up blindly accepting an invalid packet line. Fix the issue by using `git__prefixncmp`, checking whether the line starts with "ng ". (cherry picked from commit b5ba7af2d30c958b090dcf135749d9afe89ec703)

319f0c03

2018-08-09T11:01:00

smart_pkt: fix buffer overflow when parsing "ok" packets There are two different buffer overflows present when parsing "ok" packets. First, we never verify whether the line already ends after "ok", but directly go ahead and also try to skip the expected space after "ok". Second, we then go ahead and use `strchr` to scan for the terminating newline character. But in case where the line isn't terminated correctly, this can overflow the line buffer. Fix the issues by using `git__prefixncmp` to check for the "ok " prefix and only checking for a trailing '\n' instead of using `memchr`. This also fixes the issue of us always requiring a trailing '\n'. Reported by oss-fuzz, issue 9749: Crash Type: Heap-buffer-overflow READ {*} Crash Address: 0x6310000389c0 Crash State: ok_pkt git_pkt_parse_line git_smart__store_refs Sanitizer: address (ASAN) (cherry picked from commit a9f1ca09178af0640963e069a2142d5ced53f0b4)

0599c267

2018-08-09T10:38:10

smart_pkt: fix buffer overflow when parsing "ACK" packets We are being quite lenient when parsing "ACK" packets. First, we didn't correctly verify that we're not overrunning the provided buffer length, which we fix here by using `git__prefixncmp` instead of `git__prefixcmp`. Second, we do not verify that the actual contents make any sense at all, as we simply ignore errors when parsing the ACKs OID and any unknown status strings. This may result in a parsed packet structure with invalid contents, which is being silently passed to the caller. This is being fixed by performing proper input validation and checking of return codes. (cherry picked from commit bc349045b1be8fb3af2b02d8554483869e54d5b8)

0fe87761

2018-08-09T10:57:06

smart_pkt: adjust style of "ref" packet parsing function While the function parsing ref packets doesn't have any immediately obvious buffer overflows, it's style is different to all the other parsing functions. Instead of checking buffer length while we go, it does a check up-front. This causes the code to seem a lot more magical than it really is due to some magic constants. Refactor the function to instead make use of the style of other packet parser and verify buffer lengths as we go. (cherry picked from commit 5edcf5d190f3b379740b223ff6a649d08fa49581)

97156614

2018-08-09T10:46:58

smart_pkt: check whether error packets are prefixed with "ERR " In the `git_pkt_parse_line` function, we determine what kind of packet a given packet line contains by simply checking for the prefix of that line. Except for "ERR" packets, we always only check for the immediate identifier without the trailing space (e.g. we check for an "ACK" prefix, not for "ACK "). But for "ERR" packets, we do in fact include the trailing space in our check. This is not really much of a problem at all, but it is inconsistent with all the other packet types and thus causes confusion when the `err_pkt` function just immediately skips the space without checking whether it overflows the line buffer. Adjust the check in `git_pkt_parse_line` to not include the trailing space and instead move it into `err_pkt` for consistency. (cherry picked from commit 786426ea6ec2a76ffe2515dc5182705fb3d44603)

5c0d1100

2018-08-09T10:46:26

smart_pkt: explicitly avoid integer overflows when parsing packets When parsing data, progress or error packets, we need to copy the contents of the rest of the current packet line into the flex-array of the parsed packet. To keep track of this array's length, we then assign the remaining length of the packet line to the structure. We do have a mismatch of types here, as the structure's `len` field is a signed integer, while the length that we are assigning has type `size_t`. On nearly all platforms, this shouldn't pose any problems at all. The line length can at most be 16^4, as the line's length is being encoded by exactly four hex digits. But on a platforms with 16 bit integers, this assignment could cause an overflow. While such platforms will probably only exist in the embedded ecosystem, we still want to avoid this potential overflow. Thus, we now simply change the structure's `len` member to be of type `size_t` to avoid any integer promotion. (cherry picked from commit 40fd84cca68db24f325e460a40dabe805e7a5d35)

20e58aac

2018-08-09T10:36:44

smart_pkt: honor line length when determining packet type When we parse the packet type of an incoming packet line, we do not verify that we don't overflow the provided line buffer. Fix this by using `git__prefixncmp` instead and passing in `len`. As we have previously already verified that `len <= linelen`, we thus won't ever overflow the provided buffer length. (cherry picked from commit 4a5804c983317100eed509537edc32d69c8d7aa2)

003cbc3f

2018-06-24T19:47:08

Verify ref_pkt's are long enough If the remote sends a too-short packet, we'll allow `len` to go negative and eventually issue a malloc for <= 0 bytes on ``` pkt->head.name = git__malloc(alloclen); ``` (cherry picked from commit 437ee5a70711ac2e027877d71ee4ae17e5ec3d6c)

4385aef3

2017-08-22T16:29:07

smart: typedef git_pkt_type and clarify recv_pkt return type (cherry picked from commit 08961c9d0d6927bfcc725bd64c9a87dbcca0c52c)

21ffc57d

2018-06-28T05:27:36

Small style tweak, and set an error (cherry picked from commit 895a668e19dc596e7b12ea27724ceb7b68556106)

be98c9e9

2018-06-26T02:32:50

Remove GIT_PKT_PACK entirely (cherry picked from commit 90cf86070046fcffd5306915b57786da054d8964)

bf4342f7

2018-08-11T13:06:14

Fix 'invalid packet line' for ng packets containing errors (cherry picked from commit 50dd7fea5ad1bf6c013b72ad0aa803a9c84cdede)

15e92284

2018-09-05T11:49:13

Prevent heap-buffer-overflow When running repack while doing repo writes, `packfile_load__cb()` can see some temporary files in the directory that are bigger than the usual, and makes `memcmp` overflow on the `p->pack_name` string. ASAN detected this. This just uses `strncmp`, that should not have any performance impact and is safe for comparing strings of different sizes. ``` ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61200001a3f3 at pc 0x7f4a9e1976ec bp 0x7ffc1f80e100 sp 0x7ffc1f80d8b0 READ of size 89 at 0x61200001a3f3 thread T0 SCARINESS: 26 (multi-byte-read-heap-buffer-overflow) #0 0x7f4a9e1976eb in __interceptor_memcmp.part.78 (/build/cfgr-admin#link-tree/libtools_build_sanitizers_asan-ubsan-py.so+0xcf6eb) #1 0x7f4a518c5431 in packfile_load__cb /build/libgit2/0.27.0/src/libgit2-0.27.0/src/odb_pack.c:213 #2 0x7f4a518d9582 in git_path_direach /build/libgit2/0.27.0/src/libgit2-0.27.0/src/path.c:1134 #3 0x7f4a518c58ad in pack_backend__refresh /build/libgit2/0.27.0/src/libgit2-0.27.0/src/odb_pack.c:347 #4 0x7f4a518c1b12 in git_odb_refresh /build/libgit2/0.27.0/src/libgit2-0.27.0/src/odb.c:1511 #5 0x7f4a518bff5f in git_odb__freshen /build/libgit2/0.27.0/src/libgit2-0.27.0/src/odb.c:752 #6 0x7f4a518c17d4 in git_odb_stream_finalize_write /build/libgit2/0.27.0/src/libgit2-0.27.0/src/odb.c:1415 #7 0x7f4a51b9d015 in Repository_write /build/pygit2/0.27.0/src/pygit2-0.27.0/src/repository.c:509 ``` (cherry picked from commit d22cd1f4a4c10ff47b04c57560e6765d77e5a8fd)

39706ded

2018-09-03T10:49:46

config_parse: refactor error handling when parsing multiline variables The current error handling for the multiline variable parser is a bit fragile, as each error condition has its own code to clear memory. Instead, unify error handling as far as possible to avoid this repetitive code. While at it, make use of `GITERR_CHECK_ALLOC` to correctly handle OOM situations and verify that the buffer we print into does not run out of memory either. (cherry picked from commit bc63e1ef521ab5900dc0b0dcd578b8bf18627fb1)

68823395

2018-09-01T03:50:26

config: Fix a leak parsing multi-line config entries (cherry picked from commit 38b852558eb518f96c313cdcd9ce5a7af6ded194)

24c7b23d

2018-08-25T17:04:39

config: convert unbounded recursion into a loop (cherry picked from commit a03113e80332fba6c77f43b21d398caad50b4b89)

1f9a8510

2018-07-19T13:00:42

smart_pkt: fix potential OOB-read when processing ng packet OSS-fuzz has reported a potential out-of-bounds read when processing a "ng" smart packet: ==1==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6310000249c0 at pc 0x000000493a92 bp 0x7ffddc882cd0 sp 0x7ffddc882480 READ of size 65529 at 0x6310000249c0 thread T0 SCARINESS: 26 (multi-byte-read-heap-buffer-overflow) #0 0x493a91 in __interceptor_strchr.part.35 /src/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc:673 #1 0x813960 in ng_pkt libgit2/src/transports/smart_pkt.c:320:14 #2 0x810f79 in git_pkt_parse_line libgit2/src/transports/smart_pkt.c:478:9 #3 0x82c3c9 in git_smart__store_refs libgit2/src/transports/smart_protocol.c:47:12 #4 0x6373a2 in git_smart__connect libgit2/src/transports/smart.c:251:15 #5 0x57688f in git_remote_connect libgit2/src/remote.c:708:15 #6 0x52e59b in LLVMFuzzerTestOneInput /src/download_refs_fuzzer.cc:145:9 #7 0x52ef3f in ExecuteFilesOnyByOne(int, char**) /src/libfuzzer/afl/afl_driver.cpp:301:5 #8 0x52f4ee in main /src/libfuzzer/afl/afl_driver.cpp:339:12 #9 0x7f6c910db82f in __libc_start_main /build/glibc-Cl5G7W/glibc-2.23/csu/libc-start.c:291 #10 0x41d518 in _start When parsing an "ng" packet, we keep track of both the current position as well as the remaining length of the packet itself. But instead of taking care not to exceed the length, we pass the current pointer's position to `strchr`, which will search for a certain character until hitting NUL. It is thus possible to create a crafted packet which doesn't contain a NUL byte to trigger an out-of-bounds read. Fix the issue by instead using `memchr`, passing the remaining length as restriction. Furthermore, verify that we actually have enough bytes left to produce a match at all. OSS-Fuzz-Issue: 9406

c1577110

2018-07-05T13:30:46

delta: fix overflow when computing limit When checking whether a delta base offset and length fit into the base we have in memory already, we can trigger an overflow which breaks the check. This would subsequently result in us reading memory from out of bounds of the base. The issue is easily fixed by checking for overflow when adding `off` and `len`, thus guaranteeting that we are never indexing beyond `base_len`. This corresponds to the git patch 8960844a7 (check patch_delta bounds more carefully, 2006-04-07), which adds these overflow checks. Reported-by: Riccardo Schirone <rschiron@redhat.com>

9844d38b

2018-06-29T09:11:02

delta: fix out-of-bounds read of delta When computing the offset and length of the delta base, we repeatedly increment the `delta` pointer without checking whether we have advanced past its end already, which can thus result in an out-of-bounds read. Fix this by repeatedly checking whether we have reached the end. Add a test which would cause Valgrind to produce an error. Reported-by: Riccardo Schirone <rschiron@redhat.com> Test-provided-by: Riccardo Schirone <rschiron@redhat.com>

3f461902

2018-06-29T07:45:18

delta: fix sign-extension of big left-shift Our delta code was originally adapted from JGit, which itself adapted it from git itself. Due to this heritage, we inherited a bug from git.git in how we compute the delta offset, which was fixed upstream in 48fb7deb5 (Fix big left-shifts of unsigned char, 2009-06-17). As explained by Linus: Shifting 'unsigned char' or 'unsigned short' left can result in sign extension errors, since the C integer promotion rules means that the unsigned char/short will get implicitly promoted to a signed 'int' due to the shift (or due to other operations). This normally doesn't matter, but if you shift things up sufficiently, it will now set the sign bit in 'int', and a subsequent cast to a bigger type (eg 'long' or 'unsigned long') will now sign-extend the value despite the original expression being unsigned. One example of this would be something like unsigned long size; unsigned char c; size += c << 24; where despite all the variables being unsigned, 'c << 24' ends up being a signed entity, and will get sign-extended when then doing the addition in an 'unsigned long' type. Since git uses 'unsigned char' pointers extensively, we actually have this bug in a couple of places. In our delta code, we inherited such a bogus shift when computing the offset at which the delta base is to be found. Due to the sign extension we can end up with an offset where all the bits are set. This can allow an arbitrary memory read, as the addition in `base_len < off + len` can now overflow if `off` has all its bits set. Fix the issue by casting the result of `*delta++ << 24UL` to an unsigned integer again. Add a test with a crafted delta that would actually succeed with an out-of-bounds read in case where the cast wouldn't exist. Reported-by: Riccardo Schirone <rschiron@redhat.com> Test-provided-by: Riccardo Schirone <rschiron@redhat.com>

7392799d

2018-05-30T08:35:06

submodule: detect duplicated submodule paths When loading submodule names, we build a map of submodule paths and their respective names. While looping over the configuration keys, we do not check though whether a submodule path was seen already. This leads to a memory leak in case we have multiple submodules with the same path, as we just overwrite the old value in the map in that case. Fix the error by verifying that the path to be added is not yet part of the string map. Git does not allow to have multiple submodules for a path anyway, so we now do the same and detect this duplication, reporting it to the user.

f2e5c092

2018-04-27T15:31:43

cmake: remove now-useless LIBGIT2_LIBDIRS handling With the recent change of always resolving pkg-config libraries to their full path, we do not have to manage the LIBGIT2_LIBDIRS variable anymore. The only other remaining user of LIBGIT2_LIBDIRS is winhttp, which is a CMake-style library target and can thus be resolved by CMake automatically. Remove the variable to simplify our build system a bit.

0c8ff50f

2018-04-27T10:38:49

cmake: resolve libraries found by pkg-config Libraries found by CMake modules are usually handled with their full path. This makes linking against those libraries a lot more robust when it comes to libraries in non-standard locations, as otherwise we might mix up libraries from different locations when link directories are given. One excemption are libraries found by PKG_CHECK_MODULES. Instead of returning libraries with their complete path, it will return the variable names as well as a set of link directories. In case where multiple sets of the same library are installed in different locations, this can lead the compiler to link against the wrong libraries in the end, when link directories of other dependencies are added. To fix this shortcoming, we need to manually resolve library paths returned by CMake against their respective library directories. This is an easy task to do with `FIND_LIBRARY`.

b2f3ff56

2018-04-19T01:08:18

worktree: fix calloc of the wrong object type

8fa0b34b

2018-04-19T01:05:05

local: fix a leaking reference when iterating over a symref Valgrind log : ==17702== 18 bytes in 1 blocks are indirectly lost in loss record 69 of 1,123 ==17702== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==17702== by 0x5FDBB49: strdup (strdup.c:42) ==17702== by 0x632B3E: git__strdup (util.h:106) ==17702== by 0x632D2C: git_reference__alloc_symbolic (refs.c:64) ==17702== by 0x62E0AF: loose_lookup (refdb_fs.c:408) ==17702== by 0x62E636: refdb_fs_backend__iterator_next (refdb_fs.c:565) ==17702== by 0x62CD8E: git_refdb_iterator_next (refdb.c:147) ==17702== by 0x6347F2: git_reference_next (refs.c:838) ==17702== by 0x6345CB: git_reference_foreach (refs.c:748) ==17702== by 0x66BE62: local_download_pack (local.c:579) ==17702== by 0x5DB48F: git_fetch_download_pack (fetch.c:148) ==17702== by 0x639028: git_remote_download (remote.c:932) ==17702== by 0x63919A: git_remote_fetch (remote.c:969) ==17702== by 0x4ABEDD: test_fetchhead_nonetwork__fetch_into_repo_with_symrefs (nonetwork.c:362) ==17702== by 0x4125D9: clar_run_test (clar.c:222) ==17702== by 0x41287C: clar_run_suite (clar.c:286) ==17702== by 0x412DDE: clar_test_run (clar.c:433) ==17702== by 0x4105E1: main (main.c:24)

2fe887e6

2018-04-18T20:57:16

remote: repo is optional here As per CID:1378747, we might be called with a NULL repo, which would be deferenced in write_add_refspec

96329606

2018-03-11T15:35:56

worktree: Read worktree specific reflog for HEAD Signed-off-by: Sven Strickroth <email@cs-ware.de>

a137cdbd

2018-04-18T21:41:44

refspec: check for valid parameters in git_refspec__dwim_one CID:1383993, "In git_refspec__dwim_one: All paths that lead to this null pointer comparison already dereference the pointer earlier (CWE-476)"

7fa6c8ce

2018-03-29T10:18:51

util: fix missing headers for MinGW environments There are multiple references to undefined functions in the Microsoft builds. Add headers to make them known.

0f09d9f5

2018-04-02T20:00:07

Fix build with LibreSSL 2.7 LibreSSL 2.7 adds OpenSSL 1.1 API Signed-off-by: Bernard Spil <brnrd@FreeBSD.org>

16b62dd4

2018-04-04T21:28:31

diff: Add missing GIT_DELTA_TYPECHANGE -> 'T' mapping. This adds the 'T' status character to git_diff_status_char() for diff entries that change type.

07011e60

2018-04-12T13:32:27

revwalk: fix uninteresting revs sometimes not limiting graphwalk When we want to limit our graphwalk, we use the heuristic of checking whether the newest limiting (uninteresting) revision is newer than the oldest interesting revision. We do so by inspecting whether the first item's commit time of the user-supplied list of revisions is newer than the last added interesting revision. This is wrong though, as the user supplied list is in no way guaranteed to be sorted by increasing commit dates. This could lead us to abort the revwalk early before applying all relevant limiting revisions, outputting revisions which should in fact have been hidden. Fix the heuristic by instead checking whether _any_ of the limiting commits was made earlier than the last interesting commit. Add a test.

0f88adb6

2018-02-08T12:36:47

Submodule API should report .gitmodules parse errors Signed-off-by: Sven Strickroth <email@cs-ware.de>

e2a80124

2018-04-10T21:16:43

refs: preserve the owning refdb when duping reference This fixes a segfault in git_reference_owner on references returned from git_reference__read_head and git_reference_dup ones.

b260fdc8

2018-04-06T12:24:10

attr_file: fix handling of directory patterns with trailing spaces When comparing whether a path matches a directory rule, we pass the both the path and directory name to `fnmatch` with `GIT_ATTR_FNMATCH_DIRECTORY` being set. `fnmatch` expects the pattern to contain no trailing directory '/', which is why we try to always strip patterns of trailing slashes. We do not handle that case correctly though when the pattern itself has trailing spaces, causing the match to fail. Fix the issue by stripping trailing spaces and tabs for a rule previous to checking whether the pattern is a directory pattern with a trailing '/'. This replaces the whitespace-stripping in our ignore file parsing code, which was stripping whitespaces too late. Add a test to catch future breakage.

a714e836

2018-04-06T10:39:16

transports: local: fix assert when fetching into repo with symrefs When fetching into a repository which has symbolic references via the "local" transport we run into an assert. The assert is being triggered while we negotiate the packfile between the two repositories. When hiding known revisions from the packbuilder revwalk, we unconditionally hide all references of the local refdb. In case one of these references is a symbolic reference, though, this means we're trying to hide a `NULL` OID, which triggers the assert. Fix the issue by only hiding OID references from the revwalk. Add a test to catch this issue in the future.

59012bf4

2018-03-29T09:15:48

odb: mempack: fix leaking objects when freeing mempacks When a ODB mempack gets free'd, we take no measures at all to free its contents, most notably the objects added to the database, resulting in a memory leak. Call `git_mempack_reset` previous to freeing the ODB structures themselves, which takes care of releasing all associated data structures.

4d4a7dbf

2018-03-28T17:37:39

sha1dc: update to fix errors with endianess This updates the version of SHA1DC to c3e1304ea3.

b89988c7

2018-03-27T15:03:15

transports: ssh: replace deprecated function `libssh2_session_startup` The function `libssh2_session_startup` has been deprecated since libssh2 version 1.2.8 in favor of `libssh2_session_handshake` introduced in the same version. libssh2 1.2.8 was released in April 2011, so it is already seven years old. It is available in Debian Wheezy, Ubuntu Trusty and CentOS 7.4, so the most important and conservative distros already have it available. As such, it seems safe to just use the new function.

b2e7d8c2

2018-03-27T14:49:21

transports: ssh: disconnect session before freeing it The function `ssh_stream_free` takes over the responsibility of closing channels and streams just before freeing their memory, but it does not do so for the session. In fact, we never disconnect the session ourselves at all, as libssh2 will not do so itself upon freeing the structure. Quoting the documentation of `libssh2_session_free`: > Frees all resources associated with a session instance. Typically > called after libssh2_session_disconnect_ex, The missing disconnect probably stems from a misunderstanding what it actually does. As we are already closing the TCP socket ourselves, the assumption was that no additional disconnect is required. But calling `libssh2_session_disconnect` will notify the server that we are cleanly closing the connection, such that the server can free his own resources. Add a call to `libssh2_session_disconnect` to fix that issue. [1]: https://www.libssh2.org/libssh2_session_free.html

cfed1be8

2018-05-24T20:28:36

submodule: plug leaks from the escape detection

f650153a

2018-05-24T19:05:59

submodule: replace index with strchr which exists on Windows

thodg/libgit2/src

src

Log