src


Log

Author Commit Date CI Message
Edward Thomson af33210b 2018-07-10T16:10:03 apply: introduce a delta callback Introduce a callback to the application options that allow callers to add a per-delta callback. The callback can return an error code to stop patch application, or can return a value to skip the application of a particular delta.
Edward Thomson 37b25ac5 2018-07-08T16:12:58 apply: move location to an argument, not the opts Move the location option to an argument, out of the options structure. This allows the options structure to be re-used for functions that don't need to know the location, since it's implicit in their functionality. For example, `git_apply_tree` should not take a location, but is expected to take all the other options.
Edward Thomson 2d27ddc0 2018-07-01T21:35:51 apply: use an indexwriter Place the entire `git_apply` operation inside an indexwriter, so that we lock the index before we begin performing patch application. This ensures that there are no other processes modifying things in the working directory.
Edward Thomson 813f0802 2018-07-01T15:14:36 apply: validate workdir contents match index for BOTH When applying to both the index and the working directory, ensure that the index contents match the working directory. This mirrors the requirement in `git apply --index`. This also means that - along with the prior commit that uses the working directory contents as the checkout baseline - we no longer expect conflicts during checkout. So remove the special-case error handling for checkout conflicts. (Any checkout conflict now would be because the file was actually modified between the start of patch application and the checkout.)
Edward Thomson 0f4b2f02 2018-07-01T15:13:50 reader: optionally validate index matches workdir When using a workdir reader, optionally validate that the index contents match the working directory contents.
Edward Thomson 9be89bbd 2018-07-01T11:08:26 reader: apply working directory filters When reading a file from the working directory, ensure that we apply any necessary filters to the item. This ensures that we get the repository-normalized data as the preimage, and further ensures that we can accurately compare the working directory contents to the index contents for accurate safety validation in the `BOTH` case.
Edward Thomson 5b8d5a22 2018-07-01T13:42:53 apply: use preimage as the checkout baseline Use the preimage as the checkout's baseline. This allows us to support applying patches to files that are modified in the working directory (those that differ from the HEAD and index). Without this, files will be reported as (checkout) conflicts. With this, we expect the on-disk data when we began the patch application (the "preimage") to be on-disk during checkout. We could have also simply used the `FORCE` flag to checkout to accomplish a similar mechanism. However, `FORCE` ignores all differences, while providing a preimage ensures that we will only overwrite the file contents that we actually read. Modify the reader interface to provide the OID to support this.
Edward Thomson 5b66b667 2018-06-29T12:39:41 apply: when preimage file is missing, return EAPPLYFAIL The preimage file being missing entirely is simply a case of an application failure; return the correct error value for the caller.
Edward Thomson dddfff77 2018-06-30T17:12:16 apply: convert checkout conflicts to apply failures When there's a checkout conflict during apply, that means that the working directory was modified in a conflicting manner and the postimage cannot be written. During application, convert this to an application failure for consistency across workdir/index/both applications.
Edward Thomson e0224121 2018-06-29T12:09:02 apply: simplify checkout vs index application Separate the concerns of applying via checkout and updating the repository's index. This results in simpler functionality and allows us to not build the temporary collection of paths in the index case.
Edward Thomson 3b5378c5 2018-06-25T16:27:06 apply: handle file deletions If the file was deleted in the postimage, do not attempt to update the target. Instead, ignore it and simply allow it to stay removed in our computed postimage. Also, test that we can handle file deletions.
Edward Thomson f83bbe0a 2018-03-19T19:50:45 apply: introduce `git_apply` Introduce `git_apply`, which will take a `git_diff` and apply it to the working directory (akin to `git apply`), the index (akin to `git apply --cached`), or both (akin to `git apply --index`).
Edward Thomson 664cda6f 2018-03-19T20:10:38 apply: reimplement `git_apply_tree` with readers The generic `git_reader` interface simplifies `git_apply_tree` somewhat. Reimplement `git_apply_tree` with them.
Edward Thomson d73043a2 2018-03-19T20:10:31 reader: a generic way to read files from repos Similar to the `git_iterator` interface, the `git_reader` interface will allow us to read file contents from an arbitrary repository-backed data source (trees, index, or working directory).
Edward Thomson 20f8a6db 2018-06-28T17:26:21 apply: remove deleted paths from index We update the index with the new_file side of the delta, but we need to explicitly remove the old_file path in the case where an item was deleted or renamed.
Edward Thomson c3077ea0 2018-06-25T21:24:49 apply: return a specific exit code on failure Return `GIT_EAPPLYFAIL` on patch application failure so that users can determine that patch application failed due to a malformed/conflicting patch by looking at the error code.
Edward Thomson d54aa9ae 2018-06-26T15:25:30 iterator: introduce `git_iterator_foreach` Introduce a `git_iterator_foreach` helper function which invokes a callback on all files for a given iterator.
Edward Thomson 9c34c996 2018-06-25T17:03:14 apply: handle file additions Don't attempt to read the postimage file during a file addition, simply use an empty buffer as the postimage. Also, test that we can handle file additions.
Edward Thomson 02b1083a 2018-01-28T23:25:07 apply: introduce `git_apply_tree` Introduce `git_apply_tree`, which will apply a `git_diff` to a given `git_tree`, allowing an in-memory patch application for a repository.
Edward Thomson 2b12dcf6 2018-03-19T19:45:11 iterator: optionally hash filesystem iterators Optionally hash the contents of files encountered in the filesystem or working directory iterators. This is not expected to be used in production code paths, but may allow us to simplify some test contexts. For working directory iterators, apply filters as appropriate, since we have the context able to do it.
Patrick Steinhardt 623647af 2018-10-26T12:33:59 Merge pull request #4864 from pks-t/pks/object-parse-fixes Object parse fixes
Patrick Steinhardt 7655b2d8 2018-10-19T10:29:19 commit: fix reading out of bounds when parsing encoding The commit message encoding is currently being parsed by the `git__prefixcmp` function. As this function does not accept a buffer length, it will happily skip over a buffer's end if it is not `NUL` terminated. Fix the issue by using `git__prefixncmp` instead. Add a test that verifies that we are unable to parse the encoding field if it's cut off by the supplied buffer length.
Patrick Steinhardt ee11d47e 2018-10-19T09:47:50 tag: fix out of bounds read when searching for tag message When parsing tags, we skip all unknown fields that appear before the tag message. This skipping is done by using a plain `strstr(buffer, "\n\n")` to search for the two newlines that separate tag fields from tag message. As it is not possible to supply a buffer length to `strstr`, this call may skip over the buffer's end and thus result in an out of bounds read. As `strstr` may return a pointer that is out of bounds, the following computation of `buffer_end - buffer` will overflow and result in an allocation of an invalid length. Fix the issue by using `git__memmem` instead. Add a test that verifies parsing the tag fails not due to the allocation failure but due to the tag having no message.
Patrick Steinhardt 83e8a6b3 2018-10-18T16:08:46 util: provide `git__memmem` function Unfortunately, neither the `memmem` nor the `strnstr` functions are part of any C standard but are merely extensions of C that are implemented by e.g. glibc. Thus, there is no standardized way to search for a string in a block of memory with a limited size, and using `strstr` is to be considered unsafe in case where the buffer has not been sanitized. In fact, there are some uses of `strstr` in exactly that unsafe way in our codebase. Provide a new function `git__memmem` that implements the `memmem` semantics. That is in a given haystack of `n` bytes, search for the occurrence of a byte sequence of `m` bytes and return a pointer to the first occurrence. The implementation chosen is the "Not So Naive" algorithm from [1]. It was chosen as the implementation is comparably simple while still being reasonably efficient in most cases. Preprocessing happens in constant time and space, searching has a time complexity of O(n*m) with a slightly sub-linear average case. [1]: http://www-igm.univ-mlv.fr/~lecroq/string/
Patrick Steinhardt bea65980 2018-10-25T11:21:14 Merge pull request #4851 from pks-t/pks/strtol-removal strtol removal
Edward Thomson 305e801a 2018-10-21T09:52:32 util: allow callers to reset custom allocators Provide a utility to reset custom allocators back to their default. This is particularly useful for testing.
Edward Thomson 7c791f3d 2018-10-20T20:25:51 Merge pull request #4852 from libgit2/ethomson/unc_paths Win32 path canonicalization refactoring
Edward Thomson 6cc14ae3 2018-10-20T20:22:04 Merge pull request #4840 from libgit2/cmn/validity-tree-from-unowned-index Check object existence when creating a tree from an index
Edward Thomson a2f9f94b 2018-10-20T20:18:04 Merge branch 'issue-4203'
Edward Thomson 32b81661 2018-10-20T20:16:32 merge: don't leak the index during reloads
Patrick Steinhardt 0a4284b1 2018-10-19T14:54:13 Merge pull request #4819 from libgit2/cmn/config-nonewline Configuration variables can appear on the same line as the section header
Patrick Steinhardt ea19efc1 2018-10-18T15:08:56 util: fix out of bounds read in error message When an integer that is parsed with `git__strntol32` is too big to fit into an int32, we will generate an error message that includes the actual string that failed to parse. This does not acknowledge the fact that the string may either not be NUL terminated or alternative include additional characters after the number that is to be parsed. We may thus end up printing characters into the buffer that aren't the number or, worse, read out of bounds. Fix the issue by utilizing the `endptr` that was set by `git__strntol64`. This pointer is guaranteed to be set to the first character following the number, and we can thus use it to compute the width of the number that shall be printed. Create a test to verify that we correctly truncate the number.
Edward Thomson a34f5b0d 2018-10-18T08:57:27 win32: refactor `git_win32_path_remove_namespace` Update `git_win32_path_remove_namespace` to disambiguate the prefix being removed versus the prefix being added. Now we remove the "namespace", and (may) add a "prefix" in its place. Eg, we remove the `\\?\` namespace. We remove the `\\?\UNC\` namespace, and replace it with the `\\` prefix. This aids readability somewhat. Additionally, use pointer arithmetic instead of offsets, which seems to also help readability.
Edward Thomson b2e85f98 2018-10-17T08:48:43 win32: rename `git_win32__canonicalize_path` The internal API `git_win32__canonicalize_path` is far, far too easily confused with the internal API `git_win32_path_canonicalize`. The former removes the namespace prefix from a path (eg, given `\\?\C:\Temp\foo`, it returns `C:\Temp\foo`, and given `\\?\UNC\server\share`, it returns `\\server\share`). As such, rename it to `git_win32_path_remove_namespace`. `git_win32_path_canonicalize` remains unchanged.
Patrick Steinhardt b09c1c7b 2018-10-18T14:37:55 util: avoid signed integer overflows in `git__strntol64` While `git__strntol64` tries to detect integer overflows when doing the necessary arithmetics to come up with the final result, it does the detection only after the fact. This check thus relies on undefined behavior of signed integer overflows. Fix this by instead checking up-front whether the multiplications or additions will overflow. Note that a detected overflow will not cause us to abort parsing the current sequence of digits. In the case of an overflow, previous behavior was to still set up the end pointer correctly to point to the first character immediately after the currently parsed number. We do not want to change this now as code may rely on the end pointer being set up correctly even if the parsed number is too big to be represented as 64 bit integer.
Patrick Steinhardt 8d7fa88a 2018-10-18T12:04:07 util: remove `git__strtol32` The function `git__strtol32` can easily be misused when untrusted data is passed to it that may not have been sanitized with trailing `NUL` bytes. As all usages of this function have now been removed, we can remove this function altogether to avoid future misuse of it.
Patrick Steinhardt 2613fbb2 2018-10-18T11:58:14 global: replace remaining use of `git__strtol32` Replace remaining uses of the `git__strtol32` function. While these uses are all safe as the strings were either sanitized or from a trusted source, we want to remove `git__strtol32` altogether to avoid future misuse.
Patrick Steinhardt 21652ee9 2018-10-18T11:43:30 tree-cache: avoid out-of-bound reads when parsing trees We use the `git__strtol32` function to parse the child and entry count of treecaches from the index, which do not accept a buffer length. As the buffer that is being passed in is untrusted data and may thus be malformed and may not contain a terminating `NUL` byte, we can overrun the buffer and thus perform an out-of-bounds read. Fix the issue by uzing `git__strntol32` instead.
Patrick Steinhardt 68deb2cc 2018-10-18T11:37:10 util: remove unsafe `git__strtol64` function The function `git__strtol64` does not take a maximum buffer length as parameter. This has led to some unsafe usages of this function, and as such we may consider it as being unsafe to use. As we have now eradicated all usages of this function, let's remove it completely to avoid future misuse.
Patrick Steinhardt 1a2efd10 2018-10-18T11:35:08 config: remove last instance of `git__strntol64` When parsing integers from configuration values, we use `git__strtol64`. This is fine to do, as we always sanitize values and can thus be sure that they'll have a terminating `NUL` byte. But as this is the last call-site of `git__strtol64`, let's just pass in the length explicitly by calling `strlen` on the value to be able to remove `git__strtol64` altogether.
Patrick Steinhardt 3db9aa6f 2018-10-18T11:32:48 signature: avoid out-of-bounds reads when parsing signature dates We use `git__strtol64` and `git__strtol32` to parse the trailing commit or author date and timezone of signatures. As signatures are usually part of a commit or tag object and thus essentially untrusted data, the buffer may be misformatted and may not be `NUL` terminated. This may lead to an out-of-bounds read. Fix the issue by using `git__strntol64` and `git__strntol32` instead.
Patrick Steinhardt 600ceadd 2018-10-18T11:29:06 index: avoid out-of-bounds read when reading reuc entry stage We use `git__strtol64` to parse file modes of the index entries, which does not limit the parsed buffer length. As the index can be essentially treated as "untrusted" in that the data stems from the file system, it may be misformatted and may not contain terminating `NUL` bytes. This may lead to out-of-bounds reads when trying to parse index entries with such malformatted modes. Fix the issue by using `git__strntol64` instead.
Patrick Steinhardt 1a3fa1f5 2018-10-18T11:25:59 commit_list: avoid use of strtol64 without length limit When quick-parsing a commit, we use `git__strtol64` to parse the commit's time. The buffer that's passed to `commit_quick_parse` is the raw data of an ODB object, though, whose data may not be properly formatted and also does not have to be `NUL` terminated. This may lead to out-of-bound reads. Use `git__strntol64` to avoid this problem.
Edward Thomson f010b66b 2018-10-17T14:33:47 Merge pull request #4849 from libgit2/cmn/expose-gitfile-check path: export the dotgit-checking functions
Carlos Martín Nieto 7615794c 2018-10-15T18:08:13 Merge pull request #4845 from pks-t/pks/object-fuzzer Object parsing fuzzer
Carlos Martín Nieto 9b6e4081 2018-10-15T17:08:38 Merge commit 'afd10f0' (Follow 308 redirects)
Carlos Martín Nieto 05e54e00 2018-10-15T13:54:17 path: export the dotgit-checking functions These checks are preformed by libgit2 on checkout, but they're also useful for performing checks in applications which do not involve checkout. Expose them under `sys/` as it's still fairly in the weeds even for this library.
Carlos Martín Nieto 1d718fa5 2018-09-28T11:55:34 config: variables might appear on the same line as a section header While rare and a machine would typically not generate such a configuration file, it is nevertheless valid to write [foo "bar"] baz = true and we need to deal with that instead of assuming everything is on its own line.
Zander Brown afd10f0b 2018-10-13T09:31:20 Follow 308 redirects (as used by GitLab)
Patrick Steinhardt 814e7acb 2018-10-12T12:38:06 Merge pull request #4842 from nelhage/fuzz-config-memory config: Port config_file_fuzzer to the new in-memory backend.
Nelson Elhage 463c21e2 2018-10-11T13:27:06 Apply code review feedback
Patrick Steinhardt 6562cdda 2018-10-11T12:43:08 object: properly propagate errors on parsing failures When failing to parse a raw object fromits data, we free the partially parsed object but then fail to propagate the error to the caller. This may lead callers to operate on objects with invalid memory, which will sooner or later cause the program to segfault. Fix the issue by passing up the error code returned by `parse_raw`.
Nelson Elhage 2d449a11 2018-10-09T02:42:14 config: Refactor `git_config_backend_from_string` to take a length
Carlos Martín Nieto fd490d3e 2018-10-08T13:15:31 tree: unify the entry validity checks We have two similar functions, `git_treebuilder_insert` and `append_entry` which are used in different codepaths as part of creating a new tree. The former learnt to check for object existence under strict object creation, but the latter did not. This allowed the creation of a tree from an unowned index to bypass some of the checks and create a tree pointing to a nonexistent object. Extract a single function which performs these checks and call it from both codepaths. In `append_entry` we still do not validate when asked not to, as this is data which is already in the tree and we want to allow users to deal with repositories which already have some invalid data.
Edward Thomson 0cd976c8 2018-10-07T12:00:06 Merge pull request #4830 from pks-t/pks/diff-stats-rename-common diff_stats: use git's formatting of renames with common directories
Anders Borum 475db39b 2018-10-06T12:58:06 ignore unsupported http authentication schemes auth_context_match returns 0 instead of -1 for unknown schemes to not fail in situations where some authentication schemes are supported and others are not. apply_credentials is adjusted to handle auth_context_match returning 0 without producing authentication context.
Patrick Steinhardt a8d447f6 2018-10-05T20:13:34 Merge pull request #4837 from pks-t/cmn/reject-option-submodule-url-path submodule: ignore path and url attributes if they look like options
Patrick Steinhardt ce8803a2 2018-10-05T20:03:38 Merge pull request #4836 from pks-t/pks/smart-packets Smart packet security fixes
Carlos Martín Nieto c8ca3cae 2018-10-05T11:47:39 submodule: ignore path and url attributes if they look like options These can be used to inject options in an implementation which performs a recursive clone by executing an external command via crafted url and path attributes such that it triggers a local executable to be run. The library is not vulnerable as we do not rely on external executables but a user of the library might be relying on that so we add this protection. This matches this aspect of git's fix for CVE-2018-17456.
Patrick Steinhardt d06d4220 2018-10-05T10:56:02 config_file: properly ignore includes without "path" value In case a configuration includes a key "include.path=" without any value, the generated configuration entry will have its value set to `NULL`. This is unexpected by the logic handling includes, and as soon as we try to calculate the included path we will unconditionally dereference that `NULL` pointer and thus segfault. Fix the issue by returning early in both `parse_include` and `parse_conditional_include` in case where the `file` argument is `NULL`. Add a test to avoid future regression. The issue has been found by the oss-fuzz project, issue 10810.
Patrick Steinhardt e5090ee3 2018-10-04T11:19:28 diff_stats: use git's formatting of renames with common directories In cases where a file gets renamed such that the directories containing it previous and after the rename have a common prefix, then git will avoid printing this prefix twice and instead format the rename as "prefix/{old => new}". We currently didn't do anything like that, but simply printed "prefix/old -> prefix/new". Adjust our behaviour to instead match upstream. Adjust the test for this behaviour to expect the new format.
Patrick Steinhardt 1bc5b05c 2018-10-03T16:17:21 smart_pkt: do not accept callers passing in no line length Right now, we simply ignore the `linelen` parameter of `git_pkt_parse_line` in case the caller passed in zero. But in fact, we never want to assume anything about the provided buffer length and always want the caller to pass in the available number of bytes. And in fact, checking all the callers, one can see that the funciton is never being called in case where the buffer length is zero, and thus we are safe to remove this check.
Patrick Steinhardt c05790a8 2018-08-09T11:16:15 smart_pkt: return parsed length via out-parameter The `parse_len` function currently directly returns the parsed length of a packet line or an error code in case there was an error. Instead, convert this to our usual style of using the return value as error code only and returning the actual value via an out-parameter. Thus, we can now convert the output parameter to an unsigned type, as the size of a packet cannot ever be negative. While at it, we also move the check whether the input buffer is long enough into `parse_len` itself. We don't really want to pass around potentially non-NUL-terminated buffers to functions without also passing along the length, as this is dangerous in the unlikely case where other callers for that function get added. Note that we need to make sure though to not mess with `GIT_EBUFS` error codes, as these indicate not an error to the caller but that he needs to fetch more data.
Patrick Steinhardt 0b3dfbf4 2018-08-09T11:13:59 smart_pkt: reorder and rename parameters of `git_pkt_parse_line` The parameters of the `git_pkt_parse_line` function are quite confusing. First, there is no real indicator what the `out` parameter is actually all about, and it's not really clear what the `bufflen` parameter refers to. Reorder and rename the parameters to make this more obvious.
Patrick Steinhardt 5fabaca8 2018-08-09T11:04:42 smart_pkt: fix buffer overflow when parsing "unpack" packets When checking whether an "unpack" packet returned the "ok" status or not, we use a call to `git__prefixcmp`. In case where the passed line isn't properly NUL terminated, though, this may overrun the line buffer. Fix this by using `git__prefixncmp` instead.
Patrick Steinhardt b5ba7af2 2018-08-09T11:03:37 smart_pkt: fix "ng" parser accepting non-space character When parsing "ng" packets, we blindly assume that the character immediately following the "ng" prefix is a space and skip it. As the calling function doesn't make sure that this is the case, we can thus end up blindly accepting an invalid packet line. Fix the issue by using `git__prefixncmp`, checking whether the line starts with "ng ".
Patrick Steinhardt a9f1ca09 2018-08-09T11:01:00 smart_pkt: fix buffer overflow when parsing "ok" packets There are two different buffer overflows present when parsing "ok" packets. First, we never verify whether the line already ends after "ok", but directly go ahead and also try to skip the expected space after "ok". Second, we then go ahead and use `strchr` to scan for the terminating newline character. But in case where the line isn't terminated correctly, this can overflow the line buffer. Fix the issues by using `git__prefixncmp` to check for the "ok " prefix and only checking for a trailing '\n' instead of using `memchr`. This also fixes the issue of us always requiring a trailing '\n'. Reported by oss-fuzz, issue 9749: Crash Type: Heap-buffer-overflow READ {*} Crash Address: 0x6310000389c0 Crash State: ok_pkt git_pkt_parse_line git_smart__store_refs Sanitizer: address (ASAN)
Patrick Steinhardt bc349045 2018-08-09T10:38:10 smart_pkt: fix buffer overflow when parsing "ACK" packets We are being quite lenient when parsing "ACK" packets. First, we didn't correctly verify that we're not overrunning the provided buffer length, which we fix here by using `git__prefixncmp` instead of `git__prefixcmp`. Second, we do not verify that the actual contents make any sense at all, as we simply ignore errors when parsing the ACKs OID and any unknown status strings. This may result in a parsed packet structure with invalid contents, which is being silently passed to the caller. This is being fixed by performing proper input validation and checking of return codes.
Patrick Steinhardt 5edcf5d1 2018-08-09T10:57:06 smart_pkt: adjust style of "ref" packet parsing function While the function parsing ref packets doesn't have any immediately obvious buffer overflows, it's style is different to all the other parsing functions. Instead of checking buffer length while we go, it does a check up-front. This causes the code to seem a lot more magical than it really is due to some magic constants. Refactor the function to instead make use of the style of other packet parser and verify buffer lengths as we go.
Patrick Steinhardt 786426ea 2018-08-09T10:46:58 smart_pkt: check whether error packets are prefixed with "ERR " In the `git_pkt_parse_line` function, we determine what kind of packet a given packet line contains by simply checking for the prefix of that line. Except for "ERR" packets, we always only check for the immediate identifier without the trailing space (e.g. we check for an "ACK" prefix, not for "ACK "). But for "ERR" packets, we do in fact include the trailing space in our check. This is not really much of a problem at all, but it is inconsistent with all the other packet types and thus causes confusion when the `err_pkt` function just immediately skips the space without checking whether it overflows the line buffer. Adjust the check in `git_pkt_parse_line` to not include the trailing space and instead move it into `err_pkt` for consistency.
Patrick Steinhardt 40fd84cc 2018-08-09T10:46:26 smart_pkt: explicitly avoid integer overflows when parsing packets When parsing data, progress or error packets, we need to copy the contents of the rest of the current packet line into the flex-array of the parsed packet. To keep track of this array's length, we then assign the remaining length of the packet line to the structure. We do have a mismatch of types here, as the structure's `len` field is a signed integer, while the length that we are assigning has type `size_t`. On nearly all platforms, this shouldn't pose any problems at all. The line length can at most be 16^4, as the line's length is being encoded by exactly four hex digits. But on a platforms with 16 bit integers, this assignment could cause an overflow. While such platforms will probably only exist in the embedded ecosystem, we still want to avoid this potential overflow. Thus, we now simply change the structure's `len` member to be of type `size_t` to avoid any integer promotion.
Patrick Steinhardt 4a5804c9 2018-08-09T10:36:44 smart_pkt: honor line length when determining packet type When we parse the packet type of an incoming packet line, we do not verify that we don't overflow the provided line buffer. Fix this by using `git__prefixncmp` instead and passing in `len`. As we have previously already verified that `len <= linelen`, we thus won't ever overflow the provided buffer length.
Anders Borum b36cc7a4 2018-09-27T11:18:00 fix check if blob is uninteresting when inserting tree to packbuilder Blobs that have been marked as uninteresting should not be inserted into packbuilder when inserting a tree. The check as to whether a blob was uninteresting looked at the status for the tree itself instead of the blob. This could cause significantly larger packfiles.
Gabriel DeBacker 8ab11dd5 2018-09-30T16:40:22 Fix issue with path canonicalization for Win32 paths
Carlos Martín Nieto 0530d7d9 2018-09-28T18:04:23 Merge pull request #4767 from pks-t/pks/config-mem In-memory configuration
Patrick Steinhardt 2be39cef 2018-08-10T19:38:57 config: introduce new read-only in-memory backend Now that we have abstracted away how to store and retrieve config entries, it became trivial to implement a new in-memory backend by making use of this. And thus we do so. This commit implements a new read-only in-memory backend that can parse a chunk of memory into a `git_config_backend` structure.
Patrick Steinhardt b78f4ab0 2018-08-16T12:22:03 config_entries: refactor entries iterator memory ownership Right now, the config file code requires us to pass in its backend to the config entry iterator. This is required with the current code, as the config file backend will first create a read-only snapshot which is then passed to the iterator just for that purpose. So after the iterator is getting free'd, the code needs to make sure that the snapshot gets free'd, as well. By now, though, we can easily refactor the code to be more efficient and remove the reverse dependency from iterator to backend. Instead of creating a read-only snapshot (which also requires us to re-parse the complete configuration file), we can simply duplicate the config entries and pass those to the iterator. Like that, the iterator only needs to make sure to free the duplicated config entries, which is trivial to do and clears up memory ownership by a lot.
Patrick Steinhardt d49b1365 2018-08-10T19:01:37 config_entries: internalize structure declarations Access to the config entries is now completely done via the modules function interface and no caller messes with the struct's internals. We can thus completely move the structure declarations into the implementation file so that nobody even has a chance to mess with the members.
Patrick Steinhardt 123e5963 2018-08-10T18:59:59 config_entries: abstract away reference counting Instead of directly calling `git_atomic_inc` in users of the config entries store, provide a `git_config_entries_incref` function to further decouple the interfaces. Convert the refcount to a `git_refcount` structure while at it.
Patrick Steinhardt 5a7e0b3c 2018-08-10T18:49:38 config_entries: abstract away iteration over entries The nice thing about our `git_config_iterator` interfaces is that nobody needs to know anything about the implementation details. All that is required is to obtain the iterator via any backend and then use it by executing generic functions. We can thus completely internalize all the implementation details of how to iterate over entries into the config entries store and simply create such an iterator in our config file backend when we want to iterate its entries. This further decouples the config file backend from the config entries store.
Patrick Steinhardt 60ebc137 2018-08-10T14:53:09 config_entries: abstract away retrieval of config entries The code accessing config entries in the `git_config_entries` structure is still much too intimate with implementation details, directly accessing the maps and handling indices. Provide two new functions to get config entries from the internal map structure to decouple the interfaces and use them in the config file code. The function `git_config_entries_get` will simply look up the entry by name and, in the case of a multi-value, return the last occurrence of that entry. The second function, `git_config_entries_get_unique`, will only return an entry if it is unique and not included via another configuration file. This one is required to properly implement write operations for single entries, as we refuse to write to or delete a single entry if it is not clear which one was meant.
Patrick Steinhardt fb8a87da 2018-08-10T14:50:15 config_entries: rename functions and structure The previous commit simply moved all code that is required to handle config entries to a new module without yet adjusting any of the function and structure names to help readability. We now rename things accordingly to have a common "git_config_entries" entries instead of the old "diskfile_entries" one.
Patrick Steinhardt 04f57d51 2018-08-10T13:33:02 config_entries: pull out implementation of entry store The configuration entry store that is used for configuration files needs to keep track of all entries in two different structures: - a singly linked list is being used to be able to iterate through configuration files in the order they have been found - a string map is being used to efficiently look up configuration entries by their key This store is thus something that may be used by other, future backends as well to abstract away implementation details and iteration over the entries. Pull out the necessary functions from "config_file.c" and moves them into their own "config_entries.c" module. For now, this is simply moving over code without any renames and/or refactorings to help reviewing.
Patrick Steinhardt d75bbea1 2018-08-10T14:35:23 config_file: remove unnecessary snapshot indirection The implementation for config file snapshots has an unnecessary redirection from `config_snapshot` to `git_config_file__snapshot`. Inline the call to `git_config_file__snapshot` and remove it.
Patrick Steinhardt b944e137 2018-08-10T13:03:33 config: rename "config_file.h" to "config_backend.h" The header "config_file.h" has a list of inline-functions to access the contents of a config backend without directly messing with the struct's function pointers. While all these functions are called "git_config_file_*", they are in fact completely backend-agnostic and don't care whether it is a file or not. Rename all the function to instead be backend-agnostic versions called "git_config_backend_*" and rename the header to match.
Patrick Steinhardt 1aeff5d7 2018-08-10T12:52:18 config: move function normalizing section names into "config.c" The function `git_config_file_normalize_section` is never being used in any file different than "config.c", but it is implemented in "config_file.c". Move it over and make the symbol static.
Patrick Steinhardt a5562692 2018-08-10T12:49:50 config: make names backend-agnostic As a last step to make variables and structures more backend agnostic for our `git_config` structure, rename local variables to not be called `file` anymore.
Patrick Steinhardt ba1cd495 2018-09-28T11:10:49 Merge pull request #4784 from tiennou/fix/warnings Some warnings
Patrick Steinhardt 367f6243 2018-09-28T11:04:06 Merge pull request #4803 from tiennou/fix/4802 index: release the snapshot instead of freeing the index
Etienne Samson e0afd1c2 2018-09-26T21:17:39 vector: do not realloc 0-size vectors
Etienne Samson fa48d2ea 2018-09-26T19:15:35 vector: do not malloc 0-length vectors on dup
Etienne Samson be4717d2 2018-09-18T12:12:06 path: fix "comparison always true" warning
Etienne Samson 21496c30 2018-03-30T00:20:23 stransport: fix a warning on iOS "warning: values of type 'OSStatus' should not be used as format arguments; add an explicit cast to 'int' instead [-Wformat]"
Patrick Steinhardt a54043b7 2018-09-21T15:32:18 Merge pull request #4794 from marcin-krystianc/mkrystianc/prune_perf git_remote_prune to be O(n * logn)
Patrick Steinhardt 83733aeb 2018-08-10T12:43:21 config: rename `file_internal` and its `file` member Same as with the previous commit, the `file_internal` struct is used to keep track of all the backends that are added to a `git_config` struct. Rename it to `backend_internal` and rename its `file` member to `backend` to make the implementation more backend-agnostic.
Patrick Steinhardt 633cf40c 2018-08-10T12:39:17 config: rename `files` vector to `backends` Originally, the `git_config` struct is a collection of all the parsed configuration files from different scopes (system-wide config, user-specific config as well as the repo-specific config files). Historically, we didn't and don't yet have any other configuration backends than the one for files, which is why the field holding the config backends is called `files`. But in fact, nothing dictates that the vector of backends actually holds file backends only, as they are generic and custom backends can be implemented by users. Rename the member to be called `backends` to clarify that there is nothing specific to files here.
Patrick Steinhardt b9affa32 2018-08-10T19:23:00 config_parse: avoid unused static declared values The variables `git_config_escaped` and `git_config_escapes` are both defined as static const character pointers in "config_parse.h". In case where "config_parse.h" is included but those two variables are not being used, the compiler will thus complain about defined but unused variables. Fix this by declaring them as external and moving the actual initialization to the C file. Note that it is not possible to simply make this a #define, as we are indexing into those arrays.
Patrick Steinhardt 0b9c68b1 2018-08-16T14:10:58 submodule: fix submodule names depending on config-owned memory When populating the list of submodule names, we use the submodule configuration entry's name as the key in the map of submodule names. This creates a hidden dependency on the liveliness of the configuration that was used to parse the submodule, which is fragile and unexpected. Fix the issue by duplicating the string before writing it into the submodule name map.
Edward Thomson e181a649 2018-09-18T03:03:03 Merge pull request #4809 from libgit2/cmn/revwalk-sign-regression Fix revwalk limiting regression
Carlos Martín Nieto 12a1790d 2018-09-17T14:49:46 revwalk: only check the first commit in the list for an earlier timestamp This is not a big deal, but it does make us match git more closely by checking only the first. The lists are sorted already, so there should be no functional difference other than removing a possible check from every iteration in the loop.