src/index.c


Log

Author Commit Date CI Message
Patrick Steinhardt 17641f1f 2020-06-01T15:05:51 Merge pull request #5526 from libgit2/ethomson/poolinit git_pool_init: allow the function to fail
Edward Thomson 0f35efeb 2020-05-23T10:15:51 git_pool_init: handle failure cases Propagate failures caused by pool initialization errors.
Patrick Wang 8c96d56d 2020-05-26T04:53:09 index: write v4: bugfix: prefix path with strip_len, not same_len According to index-format.txt of git, the path of an entry is prefixed with N, where N indicates the length of bytes to be stripped.
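The difference between `same_len` and `strip_len` in the entry above can be sketched as follows. This is a minimal illustration rather than libgit2's writer: `common_prefix_len` and `v4_strip_len` are hypothetical helper names, and the varint encoding of N is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the leading bytes shared by two NUL-terminated paths. */
static size_t common_prefix_len(const char *a, const char *b)
{
	size_t n = 0;

	while (a[n] != '\0' && a[n] == b[n])
		n++;
	return n;
}

/*
 * Hypothetical helper: compute N, the number of bytes to strip from the end
 * of the previous path, and point *suffix at the string S that follows N in
 * the entry.
 */
static size_t v4_strip_len(const char *prev, const char *path, const char **suffix)
{
	size_t same_len = common_prefix_len(prev, path);

	assert(suffix != NULL);
	*suffix = path + same_len;

	/* N counts the bytes removed from prev, not the bytes kept. */
	return strlen(prev) - same_len;
}
```

For a previous path `a/b/c` and a new path `a/x`, the shared prefix is 2 bytes long, but the value to store is 3, followed by the suffix `x`. The two numbers coincide only when the shared prefix is exactly half of the previous path.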
Edward Thomson cb43274a 2020-01-18T17:42:52 index functions: return an int Stop returning a void for functions, future-proofing them to allow them to fail.
Patrick Steinhardt 7fc97eb3 2020-01-09T14:21:41 index: fix resizing index map twice on case-insensitive systems Depending on whether the index map is case-sensitive or insensitive, we need to call either `git_idxmap_icase_resize` or `git_idxmap_resize`. There are multiple locations where we thus use the following pattern: if (index->ignore_case && git_idxmap_icase_resize(map, length) < 0) return -1; else if (git_idxmap_resize(map, length) < 0) return -1; The funny thing is: on case-insensitive systems, we will try to resize the map twice in the case where `git_idxmap_icase_resize()` doesn't error. While this will still use the correct hashing function as both map types use the same, this bug will at least cause us to resize the map twice in a row. Fix the issue by introducing a new function `index_map_resize` that handles case-sensitivity, similar to how `index_map_set` and `index_map_delete` do. Convert all call sites where we were previously resizing the map to use that new function.
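The control-flow problem in the pattern quoted above can be reproduced with stand-in functions. This is a sketch only: `resize_icase` and `resize_plain` are hypothetical stubs that count their calls and always succeed, and `resize_fixed` shows one way to dispatch once. It is not the actual `index_map_resize`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two resize functions: count calls, succeed. */
static int icase_calls, plain_calls;

static int resize_icase(size_t n)
{
	assert(n > 0); /* stub sanity check on the size hint */
	(void)n;
	icase_calls++;
	return 0;
}

static int resize_plain(size_t n)
{
	assert(n > 0);
	(void)n;
	plain_calls++;
	return 0;
}

/*
 * The quoted pattern: when ignore_case is set and the icase resize succeeds,
 * the first condition is false, so control reaches the else branch and the
 * map is resized a second time.
 */
static int resize_buggy(int ignore_case, size_t n)
{
	if (ignore_case && resize_icase(n) < 0)
		return -1;
	else if (resize_plain(n) < 0)
		return -1;
	return 0;
}

/* Choose the variant first, then report its single result. */
static int resize_fixed(int ignore_case, size_t n)
{
	if (ignore_case)
		return resize_icase(n);
	return resize_plain(n);
}
```

With `ignore_case` set, `resize_buggy` calls both stubs once, while `resize_fixed` calls only the case-insensitive one.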
Patrick Steinhardt ab45887f 2020-01-09T14:15:02 index: replace map macros with inline functions Traditionally, our maps were mostly implemented via macros that had weird call semantics. This shows in our index code, where we have macros that insert into an index map case-sensitively or insensitively, as they still return error codes via an error parameter. This is unwieldy and, most importantly, not necessary anymore, due to the introduction of our high-level map API and removal of macros. Replace them with inlined functions to make code easier to read.
Patrick Steinhardt 658022c4 2019-07-18T13:53:41 configuration: cvar -> configmap `cvar` is an unhelpful name. Refactor its usage to `configmap` for more clarity.
Edward Thomson 7e49deba 2019-05-20T06:35:11 index: safely cast file size
Edward Thomson 6574cd00 2019-06-08T19:25:36 index: rename `frombuffer` to `from_buffer` The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the index functions for consistency with the rest of the library.
Edward Thomson 08f39208 2019-06-08T17:46:04 blob: add underscore to `from` functions The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the blob functions for consistency with the rest of the library.
Patrick Steinhardt 8da93944 2018-12-01T10:52:44 idxmap: have `resize` functions return proper error code The currently existing function `git_idxmap_resize` and `git_idxmap_icase_resize` do not return any error codes at all due to their previous implementation making use of a macro. Due to that, it is impossible to see whether the resize operation might have failed due to an out-of-memory situation. Fix this by providing a proper error code. Adjust callers to make use of it.
Patrick Steinhardt 661fc57b 2018-12-01T01:16:25 idxmap: introduce high-level setter for key/value pairs Currently, one would use the function `git_idxmap_insert` to insert key/value pairs into a map. This function has historically been a macro, which is why its syntax is kind of weird: instead of returning an error code directly, it instead has to be passed a pointer to where the return value shall be stored. This does not match libgit2's common idiom of directly returning error codes. Introduce a new function `git_idxmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert all callers of `git_idxmap_insert` to make use of it.
Patrick Steinhardt d00c24a9 2019-01-23T10:49:25 idxmap: introduce high-level getter for values The current way of looking up an entry from a map is tightly coupled with the map implementation, as one first has to look up the index of the key and then retrieve the associated value by using the index. As a caller, you usually do not care about any indices at all, though, so this is more complicated than really necessary. Furthermore, it invites for errors to happen if the correct error checking sequence is not being followed. Introduce new high-level functions `git_idxmap_get` and `git_idxmap_icase_get` that take a map and a key and return a pointer to the associated value if such a key exists. Otherwise, a `NULL` pointer is returned. Adjust all callers that can trivially be converted.
Patrick Steinhardt 351eeff3 2019-01-23T10:42:46 maps: use uniform lifecycle management functions Currently, the lifecycle functions for maps (allocation, deallocation, resize) are not named in a uniform way and do not have a uniform function signature. Rename the functions to fix that, and stick to libgit2's naming scheme of saying `git_foo_new`. This results in the following new interface for allocation: - `int git_<t>map_new(git_<t>map **out)` to allocate a new map, returning an error code if we ran out of memory - `void git_<t>map_free(git_<t>map *map)` to free a map - `void git_<t>map_clear(git_<t>map *map)` to remove all entries from a map This commit also fixes all existing callers.
Edward Thomson 494448a5 2019-01-20T19:10:08 index: explicitly cast down to a size_t Quiet down a warning from MSVC about how we're potentially losing data. This cast is safe since we've explicitly tested that `strip_len` <= `last_len`.
Etienne Samson 0bf7e043 2019-01-24T12:12:04 index: preserve extension parsing errors Previously, we would clobber any extension-specific error message with an "extension is truncated" message. This makes `read_extension` correctly preserve those errors, takes responsibility for truncation errors, and adds a new message with the actual extension signature for unsupported mandatory extensions.
Edward Thomson f673e232 2018-12-27T13:47:34 git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.
Edward Thomson 18e71e6d 2018-11-28T13:31:06 index: use new enum and structure names Use the new-style index names throughout our own codebase.
Patrick Steinhardt 852bc9f4 2018-11-23T19:26:24 khash: remove intricate knowledge of khash types Instead of using the `khiter_t`, `git_strmap_iter` and `khint_t` types, simply use `size_t` instead. This decouples code from the khash stuff and makes it possible to move the khash includes into the implementation files.
Patrick Steinhardt 0e3e832d 2018-11-21T13:30:01 Merge pull request #4884 from libgit2/ethomson/index_iterator index: introduce git_index_iterator
Edward Thomson c358bbc5 2018-11-12T17:22:47 index: introduce git_index_iterator Provide a public git_index_iterator API that is backed by an index snapshot. This allows consumers to provide a stable iteration even while manipulating the index during iteration.
Patrick Steinhardt 28239be3 2018-11-13T13:27:41 Merge pull request #4818 from pks-t/pks/index-collision Index collision fixes
Patrick Steinhardt 8b6e2895 2018-09-21T15:18:03 index: fix adding index entries with conflicting files When adding an index entry "a/b/c" while an index entry "a/b" already exists, git will happily remove "a/b" and only add the new index entry: $ git init test Initialized empty Git repository in /tmp/test.repo/test/.git/ $ touch x $ git add x $ rm x $ mkdir x $ touch x/y $ git add x/y $ git status A x/y The other way round, adding an index entry "a/b" with an entry "a/b/c" already existing is equivalent, where git will remove "a/b/c" and add "a/b". In contrast, libgit2 will currently fail to add these properly and instead complain about the entry appearing as both a file and a directory. This is a programming error, though: our current code already tries to detect and, in the case of `git_index_add`, to automatically replace such index entries. Funnily enough, we already remove the conflicting index entries, but instead of adding the new entry we then bail out afterwards. This leaves callers with the worst of both worlds: we both remove the old entry but fail to add the new one. The root cause is the weird semantics of the `has_file_name` and `has_dir_name` functions. While these functions only sound like they are responsible for detecting such conflicts, they will also already remove them in the case where their `ok_to_replace` parameter is set. But even if we tell them to replace such entries, they will return an error code. Fix the error by returning success in the case where the entries have been replaced. Fix an already existing test which tested for wrong behaviour. Note that the test didn't notice that the resulting tree had no entries. Thus it is fine to change existing behaviour here, as the previous result could've led to silently losing data. Also add a new test that verifies behaviour in the reverse conflicting case.
Patrick Steinhardt 923317db 2018-09-21T12:57:02 index: modernize error handling of `index_insert` The error handling of the function `index_insert` is currently very fragile. Instead of erroring out in case an error has happened, it will instead verify that no error has happened for each statement. This makes adding new code to that function an adventurous task. Improve the situation by converting the function to use our typical `goto out` pattern.
Patrick Steinhardt 600ceadd 2018-10-18T11:29:06 index: avoid out-of-bounds read when reading reuc entry stage We use `git__strtol64` to parse file modes of the index entries, which does not limit the parsed buffer length. As the index can be essentially treated as "untrusted" in that the data stems from the file system, it may be malformed and may not contain terminating `NUL` bytes. This may lead to out-of-bounds reads when trying to parse index entries with such malformed modes. Fix the issue by using `git__strntol64` instead.
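The idea of a length-limited parse can be illustrated without libgit2's internals. The sketch below assumes the mode is stored as ASCII octal digits. `parse_octal_n` is a hypothetical function, not `git__strntol64`, and its error conventions are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical bounded parser: reads octal digits from buf but never looks
 * past buf[len - 1], so the input does not need a terminating NUL.
 * Returns 0 and stores the value on success, -1 if no digit was found or
 * the value does not fit in 32 bits.
 */
static int parse_octal_n(uint32_t *out, const char *buf, size_t len)
{
	uint64_t value = 0;
	size_t i = 0;

	assert(out != NULL && (buf != NULL || len == 0));

	while (i < len && buf[i] >= '0' && buf[i] <= '7') {
		value = value * 8 + (uint64_t)(buf[i] - '0');
		if (value > UINT32_MAX)
			return -1;
		i++;
	}

	if (i == 0)
		return -1;

	*out = (uint32_t)value;
	return 0;
}
```

Because the loop condition tests `i < len` before touching `buf[i]`, a buffer such as the six bytes `100644` with no trailing `NUL` is parsed safely.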
Etienne Samson c70713d6 2018-09-11T15:53:35 index: release the snapshot instead of freeing the index Previously we would assert in index_free because the reader incrementation would not be balanced. Release the snapshot normally, so the variable gets decremented before the index is freed.
abyss7 581d5492 2018-08-16T22:45:43 Fix leak in index.c
Edward Thomson bfa1f022 2018-06-22T19:17:08 settings: optional unsaved index safety Add the `GIT_OPT_ENABLE_UNSAVED_INDEX_SAFETY` option, which will cause commands that reload the on-disk index to fail if the current `git_index` has changes that have not been saved. This will prevent users from - for example - adding a file to the index then calling a function like `git_checkout` and having that file be silently removed from the index since it was re-read from disk. Now calls that would re-read the index will fail if the index is "dirty", meaning changes have been made to it but have not been written. Users can either `git_index_read` to discard those changes explicitly, or `git_index_write` to write them.
Edward Thomson 787768c2 2018-06-22T19:07:54 index: return a unique error code on dirty index When the index is dirty, return GIT_EINDEXDIRTY so that consumers can identify the exact problem programmatically.
Edward Thomson b242cdbf 2017-11-17T00:19:07 index: commit the changes to the index properly Now that the index has a "dirty" state, where it has changes that have not yet been committed or rolled back, our tests need to be adapted to actually commit or rollback the changes instead of assuming that the index can be operated on in its indeterminate state.
Edward Thomson 7c56c49b 2017-11-12T08:09:35 index: add a dirty bit reflecting unsaved changes Teach the index when it is "dirty", and has unsaved changes. Consider the index dirty whenever a caller has added or removed an entry from the main index, REUC or NAME section, including when the index is completely cleared. Similarly, consider the index _not_ dirty immediately after it is written, or when it is read from the on-disk index. This allows us to ensure that unsaved changes are not lost when we automatically refresh the index.
Patrick Steinhardt ecf4f33a 2018-02-08T11:14:48 Convert usage of `git_buf_free` to new `git_buf_dispose`
John Paul Adrian Glaubitz 93271f59 2018-05-25T01:41:33 index: Fix alignment issues in write_disk_entry() In order to avoid alignment issues on certain target architectures, it is necessary to use memcpy() when modifying elements of a struct inside a buffer returned by git_filebuf_reserve().
Carlos Martín Nieto a7168b47 2018-05-22T16:13:47 path: reject .gitmodules as a symlink Any part of the library which asks the question can pass in the mode to have it checked against `.gitmodules` being a symlink. This is particularly relevant for adding entries to the index from the worktree and for checking out files.
Carlos Martín Nieto 58ff913a 2018-05-22T15:48:38 index: stat before creating the entry This is so we have it available for the path validity checking. In a later commit we will start rejecting `.gitmodules` files as symlinks.
Patrick Steinhardt 3db1af1f 2018-03-08T12:36:46 index: error out on unreasonable prefix-compressed path lengths When computing the complete path length from the encoded prefix-compressed path, we end up just allocating the complete path without ever checking what the encoded path length actually is. This can easily lead to a denial of service by just encoding an unreasonable long path name inside of the index. Git already enforces a maximum path length of 4096 bytes. As we also have that enforcement ready in some places, just make sure that the resulting path is smaller than GIT_PATH_MAX. Reported-by: Krishna Ram Prakash R <krp@gtux.in> Reported-by: Vivek Parikh <viv0411.parikh@gmail.com>
Patrick Steinhardt 3207ddb0 2018-03-08T12:00:27 index: fix out-of-bounds read with invalid index entry prefix length The index format in version 4 has prefix-compressed entries, where every index entry can compress its path by using a path prefix of the previous entry. Since implementing support for this index format version in commit 5625d86b9 (index: support index v4, 2016-05-17), though, we do not correctly verify that the prefix length that we want to reuse is actually smaller than or equal to the length of the previous index entry's path. This can lead to an integer underflow and subsequently to an out-of-bounds read. Fix this by verifying that the prefix is actually smaller than the previous entry's path length. Reported-by: Krishna Ram Prakash R <krp@gtux.in> Reported-by: Vivek Parikh <viv0411.parikh@gmail.com>
Patrick Steinhardt 58a6fe94 2018-03-08T11:49:19 index: convert `read_entry` to return entry size via an out-param The function `read_entry` does not conform to our usual coding style of returning stuff via the out parameter and using the return value for reporting errors. Due to most of our code conforming to that pattern, it has become quite natural for us to actually return `-1` in case there is any error, which has also slipped in with commit 5625d86b9 (index: support index v4, 2016-05-17). As the function returns a `size_t` only, though, the return value is wrapped around, causing the caller of `read_tree` to continue with an invalid index entry. Ultimately, this can lead to a double-free. Improve code and fix the bug by converting the function to return the index entry size via an out parameter and only using the return value to indicate errors. Reported-by: Krishna Ram Prakash R <krp@gtux.in> Reported-by: Vivek Parikh <viv0411.parikh@gmail.com>
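The wrap-around described in this entry is easy to demonstrate in isolation. The two functions below are hypothetical stand-ins rather than the real `read_entry`: they pretend every entry is 4 bytes long and only differ in how they report a short buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Old style (hypothetical): the size is the return value, so an error of -1
 * is converted to SIZE_MAX and is indistinguishable from a huge valid size.
 */
static size_t read_entry_old(const unsigned char *buf, size_t len)
{
	(void)buf;
	if (len < 4)
		return (size_t)-1; /* wraps to SIZE_MAX */
	return 4;
}

/*
 * Out-parameter style (hypothetical): the return value carries only the
 * error, the size is written to *out_size on success.
 */
static int read_entry_new(size_t *out_size, const unsigned char *buf, size_t len)
{
	assert(out_size != NULL);
	(void)buf;
	if (len < 4)
		return -1;
	*out_size = 4;
	return 0;
}
```

A caller that checks `read_entry_old(...) == 0` or `< 0` never sees the failure, since a `size_t` cannot be negative. With the second form, the usual `if (read_entry_new(&size, buf, len) < 0)` check works as expected.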
Edward Thomson f1ad004c 2018-02-18T22:29:48 Merge pull request #4529 from libgit2/ethomson/index_add_requires_files git_index_add_frombuffer: only accept files/links
Edward Thomson 5f774dbf 2018-02-11T10:14:13 git_index_add_frombuffer: only accept files/links Ensure that the buffer given to `git_index_add_frombuffer` represents a regular blob, an executable blob, or a link. Explicitly reject commit entries (submodules) - it makes little sense to allow users to add a submodule from a string; there's no possible path to success.
Patrick Steinhardt 7c6e9175 2018-02-16T11:11:11 index: shut up warning on uninitialized variable Even though the `entry` variable will always be initialized when `read_entry` returns success and even though we never dereference `entry` in case `read_entry` fails, GCC prints a warning about uninitialized use. Just initialize the pointer to `NULL` in order to shut GCC up.
Patrick Steinhardt 0c7f49dd 2017-06-30T13:39:01 Make sure to always include "common.h" first Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.
Patrick Steinhardt 064a60e9 2017-05-19T14:06:15 index: verify we have enough space left when writing index entries In our code writing index entries, we carry around a `disk_size` representing how much memory we have in total and pass this value to `git_encode_varint` to do bounds checks. This does not make much sense, as at the time when passing on this variable it is already out of date. Fix this by subtracting used memory from `disk_size` as we go along. Furthermore, assert we've actually got enough space left to do the final path memcpy.
Patrick Steinhardt c71dff7e 2017-05-19T13:49:34 index: fix shared prefix computation when writing index entry When using compressed index entries, each entry's path is preceded by a varint encoding how long the shared prefix with the previous index entry actually is. We currently encode a length of `(path_len - same_len)`, which is doubly wrong. First, `path_len` is already set to `path_len - same_len` previously. Second, we want to encode the shared prefix rather than the un-shared suffix length. Fix this by using `same_len` as the varint value instead.
Patrick Steinhardt 83e0392c 2017-05-19T13:39:05 index: also sanity check entry size with compressed entries We have a check in place whether the index has enough data left for the required footer after reading an index entry, but this was only used for uncompressed entries. Move the check down a bit so that it is executed for both compressed and uncompressed index entries.
Patrick Steinhardt 350d2c47 2017-05-19T14:22:35 index: remove file-scope entry size macros All index entry size computations are now performed in `index_entry_size`. As such, we do not need the file-scope macros for computing these sizes anymore. Remove them and move the `entry_size` macro into the `index_entry_size` function.
Patrick Steinhardt 46b67034 2017-05-19T13:59:53 index: don't right-pad paths when writing compressed entries Our code to write index entries to disk does not check whether the entry that is to be written should use prefix compression for the path. As such, we were overallocating memory and added bogus right-padding into the resulting index entries. As there is no padding allowed in the index version 4 format, this should actually result in an invalid index. Fix this by re-using the newly extracted `index_entry_size` function.
Patrick Steinhardt 29f498e0 2017-05-19T13:38:34 index: move index entry size computation into its own function Create a new function `index_entry_size` which encapsulates the logic to calculate how much space is needed for an index entry, whether it is simple/extended or compressed/uncompressed. This can later be re-used by our code writing index entries.
Patrick Steinhardt 8ceb890b 2017-05-19T12:35:21 index: set last written index entry in foreach-entry-loop The last written disk entry is currently being set inside of the function `write_disk_entry`. Make the behavior a bit more obvious by instead setting it inside of `write_entries` while iterating all entries.
Patrick Steinhardt 11d0be23 2017-05-12T10:01:43 index: set last entry when reading compressed entries To calculate the path of a compressed index entry, we need to know the preceding entry's path. While we do actually set the first predecessor correctly to "", we fail to update this while reading the entries. Fix the issue by updating `last` inside of the loop. Previously, we've been passing a double-pointer to `read_entry`, which it didn't update. As it is more obvious to update the pointer inside the loop itself, though, we can simply convert it to a normal pointer.
Patrick Steinhardt febe8c14 2017-05-10T14:27:12 index: fix confusion with shared prefix in compressed path names The index version 4 introduced compressed path names for the entries. From the git.git index-format documentation: At the beginning of an entry, an integer N in the variable width encoding [...] is stored, followed by a NUL-terminated string S. Removing N bytes from the end of the path name for the previous entry, and replacing it with the string S yields the path name for this entry. But instead of stripping N bytes from the previous path's string and using the remaining prefix, we were instead simply concatenating the previous path with the current entry path, which is obviously wrong. Fix the issue by correctly copying the first N bytes of the previous entry only and concatenating the result with our current entry's path.
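The decoding rule quoted from the index-format documentation can be sketched as a small function. This is an illustration under assumptions, not libgit2's reader: `decode_v4_path` is a hypothetical name, N is passed in already decoded from its variable-width form, and the caller supplies the output buffer. The check that N does not exceed the previous path's length mirrors the concern raised in the later out-of-bounds fix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical decoder: remove strip_len bytes from the end of prev, then
 * append suffix. Returns 0 on success, -1 if strip_len is larger than the
 * previous path or the result does not fit into out.
 */
static int decode_v4_path(char *out, size_t out_size,
	const char *prev, size_t strip_len, const char *suffix)
{
	size_t prev_len, suffix_len, keep_len;

	assert(out != NULL && prev != NULL && suffix != NULL);

	prev_len = strlen(prev);
	suffix_len = strlen(suffix);

	/* Without this check, prev_len - strip_len would underflow. */
	if (strip_len > prev_len)
		return -1;
	keep_len = prev_len - strip_len;

	if (suffix_len >= out_size || keep_len >= out_size - suffix_len)
		return -1;

	memcpy(out, prev, keep_len);                    /* kept prefix */
	memcpy(out + keep_len, suffix, suffix_len + 1); /* suffix and NUL */
	return 0;
}
```

Decoding with a previous path of `src/index.c`, N = 1 and S = `h` yields `src/index.h`. Plain concatenation of the previous path and S, as described for the old behaviour, would instead give `src/index.ch`.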
Patrick Steinhardt 8f1ff26b 2017-02-02T13:09:32 idxmap: remove GIT__USE_IDXMAP
Patrick Steinhardt f14f75d4 2017-02-02T13:08:52 khash: avoid using `kh_resize` directly
Patrick Steinhardt 73028af8 2017-01-27T14:20:24 khash: avoid using macro magic to get return address
Edward Thomson 909d5494 2016-12-29T12:25:15 giterr_set: consistent error messages Error messages should be sentence fragments, and therefore: 1. Should not begin with a capital letter, 2. Should not conclude with punctuation, and 3. Should not end a sentence and begin a new one
Pranit Bauva 65b78ea3 2016-11-17T01:08:49 use `giterr_set_str()` wherever possible `giterr_set()` is used when it is required to format a string, and since we don't really require it for this case, it is better to stick to `giterr_set_str()`. This also suppresses a warning(-Wformat-security) raised by the compiler. Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
David Turner 5625d86b 2016-05-17T15:40:32 index: support index v4 Support reading and writing index v4. Index v4 uses a very simple compression scheme for pathnames, but is otherwise similar to index v3. Signed-off-by: David Turner <dturner@twitter.com>
Edward Thomson 4aaae935 2016-07-22T12:53:13 index: cast to avoid warning
Edward Thomson 6249d960 2016-06-29T17:55:44 index: include conflicts in `git_index_read_index` Ensure that we include conflicts when calling `git_index_read_index`, which will remove conflicts in the index that do not exist in the new target, and will add conflicts from the new target.
Edward Thomson 6f7ec728 2016-06-29T17:01:47 index: refactor common `read_index` functionality Most of `git_index_read_index` is common to reading any iterator. Refactor it out in case we want to implement `read_tree` in terms of it in the future.
Patrick Steinhardt 13deb874 2016-06-07T08:35:26 index: fix NULL pointer access in index_remove_entry When removing an entry from the index by its position, we first retrieve the position from the index's entries and then try to remove the retrieved value from the index map with `DELETE_IN_MAP`. When `index_remove_entry` returns `NULL` we try to feed it into the `DELETE_IN_MAP` macro, which will unconditionally call `idxentry_hash` and then happily dereference the `NULL` entry pointer. Fix the issue by not passing a `NULL` entry into `DELETE_IN_MAP`.
Edward Thomson 46082c38 2016-06-02T02:34:03 index_read_index: invalidate new paths in tree cache When adding a new entry to an existing index via `git_index_read_index`, be sure to remove the tree cache entry for that new path. This will mark all parent trees as dirty.
Edward Thomson 9167c145 2016-06-02T01:04:58 index_read_index: set flags for path_len correctly Update the flags to reset the path_len (to emulate `index_insert`)
Edward Thomson 046ec3c9 2016-06-02T00:47:51 index_read_index: differentiate on mode Treat index entries with different modes as different, which they are, at least for the purposes of up-to-date calculations.
Edward Thomson 93de20b8 2016-06-01T14:56:27 index_read_index: reset error correctly Clear any error state upon each iteration. If one of the iterations ends (with an error of `GIT_ITEROVER`) we need to reset that error to 0, lest we stop the whole process prematurely.
Patrick Steinhardt f80852af 2016-05-02T14:30:14 index: fix memory leak on error case
Carlos Martín Nieto 60a194aa 2016-03-20T11:00:12 tree: re-use the id and filename in the odb object Instead of copying over the data into the individual entries, point to the originals, which are already in a format we can use.
Patrick Steinhardt 80a834a5 2016-03-01T16:00:49 index: assert required OID are non-NULL
Edward Thomson 6ddf533a 2016-02-23T18:29:16 git_index_add: validate objects in index entries (optionally) When `GIT_OPT_ENABLE_STRICT_OBJECT_CREATION` is turned on, validate the index entries given to `git_index_add`.
Carlos Martín Nieto 9f4e7c84 2016-02-25T18:42:09 Merge pull request #3638 from ethomson/nsec USE_NSECS fixes
Edward Thomson 3d6a42d1 2016-02-25T11:23:19 nsec: support NDK's crazy nanoseconds Android NDK does not have a `struct timespec` in its `struct stat` for nanosecond support, instead it has a single nanosecond member inside the struct stat itself. We will use that and use a macro to expand to the `st_mtim` / `st_mtimespec` definition on other systems (much like the existing `st_mtime` backcompat definition).
Patrick Steinhardt 0f1e2d20 2016-02-23T11:23:26 index: fix contradicting comparison The overflow check in `read_reuc` tries to verify if the `git__strtol32` parses an integer bigger than UINT_MAX. The `tmp` variable is cast to an unsigned int for this and then checked for being greater than UINT_MAX, which obviously can never be true. Fix this by instead fixing the `mode` field's size in `struct git_index_reuc_entry` to `uint32_t`. We can now parse the int with `git__strtol64`, which can never return a value bigger than `UINT32_MAX`, and additionally check if the returned value is smaller than zero. We do not need to handle overflows explicitly here, as `git__strtol64` returns an error when the returned value would overflow.
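Why the old comparison could never fire is shown by the sketch below. Both functions are hypothetical and only model the two checks: the first casts to `unsigned int` before comparing against `UINT_MAX`, the second range-checks a 64-bit value before it is narrowed.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Old-style check (hypothetical): after the cast, the value is an
 * unsigned int and therefore at most UINT_MAX, so this always returns 0.
 */
static int old_check_rejects(int64_t tmp)
{
	return (unsigned int)tmp > UINT_MAX;
}

/*
 * Range check on the wide value (hypothetical): decide first, narrow to
 * uint32_t only afterwards.
 */
static int mode_in_range(int64_t tmp)
{
	return tmp >= 0 && tmp <= (int64_t)UINT32_MAX;
}
```

Some compilers flag the first comparison as always false (for example GCC with `-Wtype-limits`), which is a useful hint that a check has become a no-op.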
Patrick Steinhardt 7808c937 2016-02-22T15:59:15 index: plug memory leak in `read_conflict_names`
Carlos Martín Nieto 5663d4f6 2016-02-18T12:31:56 Merge pull request #3613 from ethomson/fixups Remove most of the silly warnings
Carlos Martín Nieto 594a5d12 2016-02-18T12:28:06 Merge pull request #3619 from ethomson/win32_forbidden win32: allow us to read indexes with forbidden paths on win32
Edward Thomson 318b825e 2016-02-16T17:11:46 index: allow read of index w/ illegal entries Allow `git_index_read` to handle reading existing indexes with illegal entries. Allow the low-level `git_index_add` to add properly formed `git_index_entry`s even if they contain paths that would be illegal for the current filesystem (eg, `AUX`). Continue to disallow `git_index_add_bypath` from adding entries that are universally illegal (eg, `.git`, `foo/../bar`).
Edward Thomson b2ca8d9c 2016-02-12T10:22:54 index: explicitly cast the teeny index entry members
Edward Thomson 997e0301 2016-02-12T10:11:32 index: don't use `seek` return as an error code
Edward Thomson 9a634cba 2016-02-12T10:03:29 index: explicitly cast new hash size to an int
Arthur Schreiber 3679ebae 2016-02-11T23:37:52 Horrible fix for #3173.
Carlos Martín Nieto 9d81509a 2015-12-23T11:54:52 index: get rid of the locking We don't support using an index object from multiple threads at the same time, so the locking doesn't have any effect when following the rules. If not following the rules, things are going to break down anyway.
Vicent Marti ef8b7feb 2015-12-16T19:36:50 index: Also size-hint the hash table Note that we're not checking whether the resize succeeds; in OOM cases, we let it run with a "small" vector and hash table and see if by chance we can grow it dynamically as we insert the new entries. Nothing to lose really.
Vicent Marti d7d46cfb 2015-12-16T17:00:25 index: Preallocate the entries vector with size hint
Vicent Marti 0cc20a8c 2015-12-16T16:53:06 index: Adjust namemask & mode when filling
Vicent Marti 879ebab3 2015-12-16T12:30:52 merge: Use `git_index__fill` to populate the index Instead of calling `git_index_add` in a loop, use the new `git_index_fill` internal API to fill the index with the initial staged entries. The new `fill` helper assumes that all the entries will be unique and valid, so it can append them at the end of the entries vector and only sort it once at the end. It performs no validation checks. This prevents the quadratic behavior caused by having to sort the entries list once after every insertion.
Carlos Martín Nieto dc49eb58 2015-12-10T11:57:44 Merge pull request #3538 from pks-t/pks/index-memory-leak index: always queue `remove_entry` for removal
Patrick Steinhardt b057fdef 2015-12-08T16:00:35 index: always queue `remove_entry` for removal When replacing an index with a new one, we need to iterate through all index entries in order to determine which entries are equal. When it is not possible to re-use old entries for the new index, we move it into a list of entries that are to be removed and thus free'd. When we encounter a non-zero error code, though, we skip adding the current index entry to the remove-queue. `INSERT_MAP_EX`, which is the function last run before adding to the remove-queue, may return a positive non-zero code that indicates what exactly happened while inserting the element. In this case we skip adding the entry to the remove-queue but still continue the current operation, leading to a leak of the current entry. Fix this by checking for a negative return value instead of a non-zero one when we want to add the current index entry to the remove-queue.
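The difference between "non-zero" and "negative" as the error test can be made concrete with a toy loop. The return-code convention here (negative for failure, zero or positive for different kinds of success) follows the commit's description. `count_queued` and its `only_zero_is_ok` flag are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model: codes[] holds the result of each map insert.
 * Counts how many entries would be added to the remove-queue when success
 * is tested as "code == 0" (only_zero_is_ok != 0) or as "code >= 0".
 */
static size_t count_queued(const int *codes, size_t n, int only_zero_is_ok)
{
	size_t i, queued = 0;

	assert(codes != NULL || n == 0);

	for (i = 0; i < n; i++) {
		int ok = only_zero_is_ok ? (codes[i] == 0) : (codes[i] >= 0);

		if (ok)
			queued++; /* entry will be freed later */
		/* otherwise the entry is skipped; if the loop carries on
		 * regardless, nobody frees it */
	}
	return queued;
}
```

For the codes `{0, 1, 0, 2}`, the strict test queues two entries and silently drops the other two, while the `>= 0` test queues all four.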
Edward Thomson 626f9e24 2015-12-03T16:27:15 index: canonicalize inserted paths safely When adding to the index, we look to see if a portion of the given path matches a portion of a path in the index. If so, we will use the existing path information. For example, when adding `foo/bar.c`, if there is an index entry to `FOO/other` and the filesystem is case insensitive, then we will put `bar.c` into the existing tree instead of creating a new one with a different case. Use `strncmp` to do that instead of `memcmp`. When we `bsearch` into the index, we locate the position where the new entry would go. The index entry at that position does not necessarily have a relation to the entry we're adding, so we cannot make assumptions and use `memcmp`. Instead, compare them as strings. When canonicalizing paths, we look for the first index entry that matches a given substring.
Edward Thomson 25e84f95 2015-11-23T15:49:54 checkout: only consider nsecs when built that way When examining the working directory and determining whether it's up-to-date, only consider the nanoseconds in the index entry when built with `GIT_USE_NSEC`. This prevents us from believing that the working directory is always dirty when the index was originally written with a git client that understands nsecs (like git 2.x).
Edward Thomson 5f32c506 2015-11-16T18:06:52 racy: make git_index_read_index handle raciness Ensure that `git_index_read_index` clears the uptodate bit on files that it modifies. Further, do not propagate the cache from an on-disk index into another on-disk index. Although this should not be done, as `git_index_read_index` is used to bring an in-memory index into another index (that may or may not be on-disk), ensure that we do not accidentally bring in these bits when misused.
Edward Thomson 27bc41cf 2015-11-13T16:31:51 index: clear uptodate bit on save The uptodate bit should have a lifecycle of a single read->write on the index. Once the index is written, the files within it should be scanned for racy timestamps against the new index timestamp.
Edward Thomson d1101263 2015-11-13T15:32:48 index: don't detect raciness in uptodate entries Keep track of entries that we believe are up-to-date, because we added the index entries since the index was loaded. This prevents us from unnecessarily examining files that we wrote during the cleanup of racy entries (when we smudge racily clean files that have a timestamp newer than or equal to the index's timestamp when we read it). Without keeping track of this, we would examine every file that we just checked out for raciness, since all their timestamps would be newer than the index's timestamp.
Edward Thomson cb0ff012 2015-11-06T17:15:35 racy-git: do a single index->workdir diff When examining paths that are racily clean, do a single index->workdir diff over the entirety of the racily clean files, instead of a diff per file.
Carlos Martín Nieto 75a0ccf5 2015-11-12T19:53:09 Merge pull request #3170 from CmdrMoozy/nsec_fix git_index_entry__init_from_stat: set nsec fields in entry stats
Carlos Martín Nieto ad8509ef 2015-11-12T11:54:06 index: overwrite the path when inserting conflicts When we insert a conflict in a case-insensitive index, accept the new entry's path as the correct case instead of leaving the path we already had. This puts `git_index_conflict_add()` on the same level as `git_index_add()` in this respect.
Carlos Martín Nieto 16604d74 2015-11-11T00:36:15 index: correctly report which conflict stage has a wrong filemode When we're at offset 'i', we're dealing with the 'i+1' stage, since conflicts start at 1.
Edward Thomson 0bf77e32 2015-10-30T13:07:02 index: read_index must update hashes
Vicent Marti 1e5e02b4 2015-10-27T17:26:04 pool: Simplify implementation
Vicent Marti d307a013 2015-10-27T22:17:32 reuc: Be smarter when inserting new REUC entries Inserting new REUC entries can quickly become pathological given that each insert unsorts the REUC vector, and both subsequent lookups *and* insertions will require sorting it again before being successful. To avoid this, we're switching to `git_vector_insert_sorted`: this keeps the REUC vector constantly sorted and lets us use the `on_dup` callback to skip an extra binary search on each insertion.
Vicent Marti 128e94bb 2015-10-21T12:04:53 index: Remove unneeded consts