kmx git

Commit	Date	Message
11ef76a9	2022-01-22T13:31:02	index: use a byte array for checksum The index's checksum is not an object ID, so we should not use the `git_oid` type. Use a byte array for checksum calculation and storage. Deprecate the `git_index_checksum` function without a replacement. This is an abstraction that callers should not care about (and indeed do not seem to be using). Remove the unused `git_index__changed_relative_to` function.
90df4302	2022-01-05T12:18:05	Fix typos
adcf638c	2021-11-21T21:34:17	filebuf: use hashes not oids The filebuf functions should use hashes directly, not indirectly using the oid functions.
7dcc29fc	2021-10-22T22:51:59	Make enum in src,tests and examples C90 compliant by removing trailing comma.
63e36c53	2021-11-01T09:34:32	path: `validate` -> `is_valid` Since we're returning a boolean about validation, the name is more properly "is valid".
95117d47	2021-10-31T09:45:46	path: separate git-specific path functions from util Introduce `git_fs_path`, which operates on generic filesystem paths. `git_path` will be kept for only git-specific path functionality (for example, checking for `.git` in a path).
f0e693b1	2021-09-07T17:53:49	str: introduce `git_str` for internal, `git_buf` is external libgit2 has two distinct requirements that were previously solved by `git_buf`. We require: 1. A general purpose string class that provides a number of utility APIs for manipulating data (eg, concatenating, truncating, etc). 2. A structure that we can use to return strings to callers that they can take ownership of. By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult. Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr"). The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.) Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.
31ecaca2	2021-09-30T08:11:40	hash: hash functions operate on byte arrays not git_oids Separate the concerns of the hash functions from the git_oid functions. The git_oid structure will need to understand either SHA1 or SHA256; the hash functions should only deal with the appropriate one of these.
2a713da1	2021-09-29T21:31:17	hash: accept the algorithm in inputs
578aeba9	2021-03-20T17:00:33	use git_repository_workdir_path to generate paths Use `git_repository_workdir_path` to generate workdir paths since it will validate the length.
88323cd0	2021-03-20T09:52:17	path: git_path_isvalid -> git_path_validate If we want to validate more and different types of paths, the name `git_path_validate` makes that easier and more expressive. We can add, for example, `git_path_validate_foo` while the current name makes that less ergonomic.
40930508	2021-02-18T16:36:42	index: Initialize case_sorted to GIT_VECTOR_INIT This is for extra safety within write_entries
21981f28	2021-02-16T13:43:09	index: Check git_vector_dup error in write_entries If allocating case_sorted.contents fails, git_vector_sort will segfault.
ab772974	2020-12-05T15:49:30	threads: give atomic functions the git_atomic prefix
37763d38	2020-12-05T15:26:59	threads: rename git_atomic to git_atomic32 Clarify the `git_atomic` type and functions now that we have a 64 bit version as well (`git_atomic64`).
1fa339be	2020-04-05T16:50:13	index: use GIT_ASSERT
d0656ac8	2020-06-27T12:15:26	Make the tests run cleanly under UndefinedBehaviorSanitizer This change makes the tests run cleanly under `-fsanitize=undefined,nullability` and comprises of: * Avoids some arithmetic with NULL pointers (which UBSan does not like). * Avoids an overflow in a shift, due to an uint8_t being implicitly converted to a signed 32-bit signed integer after being shifted by a 32-bit signed integer. * Avoids a unaligned read in libgit2. * Ignores unaligned reads in the SHA1 library, since it only happens on Intel processors, where it is _still_ undefined behavior, but the semantics are moderately well-understood. Of notable omission is `-fsanitize=integer`, since there are lots of warnings in zlib and the SHA1 library which probably don't make sense to fix and I could not figure out how to silence easily. libgit2 itself also has ~100s of warnings which are mostly innocuous (e.g. use of enum constants that only fit on an `uint32_t`, but there is no way to do that in a simple fashion because the data type chosen for enumerated types is implementation-defined), and investigating whether there are worrying warnings would need reducing the noise significantly.
c6184f0c	2020-06-08T21:07:36	tree-wide: do not compile deprecated functions with hard deprecation When compiling libgit2 with -DDEPRECATE_HARD, we add a preprocessor definition `GIT_DEPRECATE_HARD` which causes the "git2/deprecated.h" header to be empty. As a result, no function declarations are made available to callers, but the implementations are still available to link against. This has the problem that function declarations also aren't visible to the implementations, meaning that the symbol's visibility will not be set up correctly. As a result, the resulting library may not expose those deprecated symbols at all on some platforms and thus cause linking errors. Fix the issue by conditionally compiling deprecated functions, only. While it becomes impossible to link against such a library in case one uses deprecated functions, distributors of libgit2 aren't expected to pass -DDEPRECATE_HARD anyway. Instead, users of libgit2 should manually define GIT_DEPRECATE_HARD to hide deprecated functions. Using "real" hard deprecation still makes sense in the context of CI to test we don't use deprecated symbols ourselves and in case a dependant uses libgit2 in a vendored way and knows it won't ever use any of the deprecated symbols anyway.
17641f1f	2020-06-01T15:05:51	Merge pull request #5526 from libgit2/ethomson/poolinit git_pool_init: allow the function to fail
0f35efeb	2020-05-23T10:15:51	git_pool_init: handle failure cases Propagate failures caused by pool initialization errors.
8c96d56d	2020-05-26T04:53:09	index: write v4: bugfix: prefix path with strip_len, not same_len According to index-format.txt of git, the path of an entry is prefixed with N, where N indicates the length of bytes to be stripped.
cb43274a	2020-01-18T17:42:52	index functions: return an int Stop returning a void for functions, future-proofing them to allow them to fail.
7fc97eb3	2020-01-09T14:21:41	index: fix resizing index map twice on case-insensitive systems Depending on whether the index map is case-sensitive or insensitive, we need to call either `git_idxmap_icase_resize` or `git_idxmap_resize`. There are multiple locations where we thus use the following pattern: if (index->ignore_case && git_idxmap_icase_resize(map, length) < 0) return -1; else if (git_idxmap_resize(map, length) < 0) return -1; The funny thing is: on case-insensitive systems, we will try to resize the map twice in case where `git_idxmap_icase_resize()` doesn't error. While this will still use the correct hashing function as both map types use the same, this bug will at least cause us to resize the map twice in a row. Fix the issue by introducing a new function `index_map_resize` that handles case-sensitivity, similar to how `index_map_set` and `index_map_delete`. Convert all call sites where we were previously resizing the map to use that new function.
ab45887f	2020-01-09T14:15:02	index: replace map macros with inline functions Traditionally, our maps were mostly implemented via macros that had weird call semantics. This shows in our index code, where we have macros that insert into an index map case-sensitively or insensitively, as they still return error codes via an error parameter. This is unwieldy and, most importantly, not necessary anymore, due to the introduction of our high-level map API and removal of macros. Replace them with inlined functions to make code easier to read.
658022c4	2019-07-18T13:53:41	configuration: cvar -> configmap `cvar` is an unhelpful name. Refactor its usage to `configmap` for more clarity.
7e49deba	2019-05-20T06:35:11	index: safely cast file size
6574cd00	2019-06-08T19:25:36	index: rename `frombuffer` to `from_buffer` The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the index functions for consistency with the rest of the library.
08f39208	2019-06-08T17:46:04	blob: add underscore to `from` functions The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the blob functions for consistency with the rest of the library.
8da93944	2018-12-01T10:52:44	idxmap: have `resize` functions return proper error code The currently existing function `git_idxmap_resize` and `git_idxmap_icase_resize` do not return any error codes at all due to their previous implementation making use of a macro. Due to that, it is impossible to see whether the resize operation might have failed due to an out-of-memory situation. Fix this by providing a proper error code. Adjust callers to make use of it.
661fc57b	2018-12-01T01:16:25	idxmap: introduce high-level setter for key/value pairs Currently, one would use the function `git_idxmap_insert` to insert key/value pairs into a map. This function has historically been a macro, which is why its syntax is kind of weird: instead of returning an error code directly, it instead has to be passed a pointer to where the return value shall be stored. This does not match libgit2's common idiom of directly returning error codes. Introduce a new function `git_idxmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert all callers of `git_idxmap_insert` to make use of it.
d00c24a9	2019-01-23T10:49:25	idxmap: introduce high-level getter for values The current way of looking up an entry from a map is tightly coupled with the map implementation, as one first has to look up the index of the key and then retrieve the associated value by using the index. As a caller, you usually do not care about any indices at all, though, so this is more complicated than really necessary. Furthermore, it invites for errors to happen if the correct error checking sequence is not being followed. Introduce new high-level functions `git_idxmap_get` and `git_idxmap_icase_get` that take a map and a key and return a pointer to the associated value if such a key exists. Otherwise, a `NULL` pointer is returned. Adjust all callers that can trivially be converted.
351eeff3	2019-01-23T10:42:46	maps: use uniform lifecycle management functions Currently, the lifecycle functions for maps (allocation, deallocation, resize) are not named in a uniform way and do not have a uniform function signature. Rename the functions to fix that, and stick to libgit2's naming scheme of saying `git_foo_new`. This results in the following new interface for allocation: - `int git_<t>map_new(git_<t>map *out)` to allocate a new map, returning an error code if we ran out of memory - `void git_<t>map_free(git_<t>map map)` to free a map - `void git_<t>map_clear(git<t>map *map)` to remove all entries from a map This commit also fixes all existing callers.
494448a5	2019-01-20T19:10:08	index: explicitly cast down to a size_t Quiet down a warning from MSVC about how we're potentially losing data. This cast is safe since we've explicitly tested that `strip_len` <= `last_len`.
0bf7e043	2019-01-24T12:12:04	index: preserve extension parsing errors Previously, we would clobber any extension-specific error message with an "extension is truncated" message. This makes `read_extension` correctly preserve those errors, takes responsibility for truncation errors, and adds a new message with the actual extension signature for unsupported mandatory extensions.
f673e232	2018-12-27T13:47:34	git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.
18e71e6d	2018-11-28T13:31:06	index: use new enum and structure names Use the new-style index names throughout our own codebase.
852bc9f4	2018-11-23T19:26:24	khash: remove intricate knowledge of khash types Instead of using the `khiter_t`, `git_strmap_iter` and `khint_t` types, simply use `size_t` instead. This decouples code from the khash stuff and makes it possible to move the khash includes into the implementation files.
0e3e832d	2018-11-21T13:30:01	Merge pull request #4884 from libgit2/ethomson/index_iterator index: introduce git_index_iterator
c358bbc5	2018-11-12T17:22:47	index: introduce git_index_iterator Provide a public git_index_iterator API that is backed by an index snapshot. This allows consumers to provide a stable iteration even while manipulating the index during iteration.
28239be3	2018-11-13T13:27:41	Merge pull request #4818 from pks-t/pks/index-collision Index collision fixes
8b6e2895	2018-09-21T15:18:03	index: fix adding index entries with conflicting files When adding an index entry "a/b/c" while an index entry "a/b" already exists, git will happily remove "a/b/c" and only add the new index entry: $ git init test Initialized empty Git repository in /tmp/test.repo/test/.git/ $ touch x $ git add x $ rm x $ mkdir x $ touch x/y $ git add x/y $ git status A x/y The other way round, adding an index entry "a/b" with an entry "a/b/c" already existing is equivalent, where git will remove "a/b/c" and add "a/b". In contrast, libgit2 will currently fail to add these properly and instead complain about the entry appearing as both a file and a directory. This is a programming error, though: our current code already tries to detect and, in the case of `git_index_add`, to automatically replace such index entries. Funnily enough, we already remove the conflicting index entries, but instead of adding the new entry we then bail out afterwards. This leaves callers with the worst of both worlds: we both remove the old entry but fail to add the new one. The root cause is weird semantics of the `has_file_name` and `has_dir_name` functions. While these functions only sound like they are responsible for detecting such conflicts, they will also already remove them in case where its `ok_to_replace` parameter is set. But even if we tell it to replace such entries, it will return an error code. Fix the error by returning success in case where the entries have been replaced. Fix an already existing test which tested for wrong behaviour. Note that the test didn't notice that the resulting tree had no entries. Thus it is fine to change existing behaviour here, as the previous result could've let to silently loosing data. Also add a new test that verifies behaviour in the reverse conflicting case.
923317db	2018-09-21T12:57:02	index: modernize error handling of `index_insert` The current error hanling of the function `index_insert` is currently very fragile. Instead of erroring out in case an error has happened, it will instead verify that no error has happened for each statement. This makes adding new code to that function an adventurous task. Improve the situation by converting the function to use our typical `goto out` pattern.
600ceadd	2018-10-18T11:29:06	index: avoid out-of-bounds read when reading reuc entry stage We use `git__strtol64` to parse file modes of the index entries, which does not limit the parsed buffer length. As the index can be essentially treated as "untrusted" in that the data stems from the file system, it may be misformatted and may not contain terminating `NUL` bytes. This may lead to out-of-bounds reads when trying to parse index entries with such malformatted modes. Fix the issue by using `git__strntol64` instead.
c70713d6	2018-09-11T15:53:35	index: release the snapshot instead of freeing the index Previously we would assert in index_free because the reader incrementation would not be balanced. Release the snapshot normally, so the variable gets decremented before the index is freed.
581d5492	2018-08-16T22:45:43	Fix leak in index.c
bfa1f022	2018-06-22T19:17:08	settings: optional unsaved index safety Add the `GIT_OPT_ENABLE_UNSAVED_INDEX_SAFETY` option, which will cause commands that reload the on-disk index to fail if the current `git_index` has changed that have not been saved. This will prevent users from - for example - adding a file to the index then calling a function like `git_checkout` and having that file be silently removed from the index since it was re-read from disk. Now calls that would re-read the index will fail if the index is "dirty", meaning changes have been made to it but have not been written. Users can either `git_index_read` to discard those changes explicitly, or `git_index_write` to write them.
787768c2	2018-06-22T19:07:54	index: return a unique error code on dirty index When the index is dirty, return GIT_EINDEXDIRTY so that consumers can identify the exact problem programatically.
b242cdbf	2017-11-17T00:19:07	index: commit the changes to the index properly Now that the index has a "dirty" state, where it has changes that have not yet been committed or rolled back, our tests need to be adapted to actually commit or rollback the changes instead of assuming that the index can be operated on in its indeterminate state.
7c56c49b	2017-11-12T08:09:35	index: add a dirty bit reflecting unsaved changes Teach the index when it is "dirty", and has unsaved changes. Consider the index dirty whenever a caller has added or removed an entry from the main index, REUC or NAME section, including when the index is completely cleared. Similarly, consider the index _not_ dirty immediately after it is written, or when it is read from the on-disk index. This allows us to ensure that unsaved changes are not lost when we automatically refresh the index.
ecf4f33a	2018-02-08T11:14:48	Convert usage of `git_buf_free` to new `git_buf_dispose`
93271f59	2018-05-25T01:41:33	index: Fix alignment issues in write_disk_entry() In order to avoid alignment issues on certain target architectures, it is necessary to use memcpy() when modifying elements of a struct inside a buffer returned by git_filebuf_reserve().
a7168b47	2018-05-22T16:13:47	path: reject .gitmodules as a symlink Any part of the library which asks the question can pass in the mode to have it checked against `.gitmodules` being a symlink. This is particularly relevant for adding entries to the index from the worktree and for checking out files.
58ff913a	2018-05-22T15:48:38	index: stat before creating the entry This is so we have it available for the path validity checking. In a later commit we will start rejecting `.gitmodules` files as symlinks.
3db1af1f	2018-03-08T12:36:46	index: error out on unreasonable prefix-compressed path lengths When computing the complete path length from the encoded prefix-compressed path, we end up just allocating the complete path without ever checking what the encoded path length actually is. This can easily lead to a denial of service by just encoding an unreasonable long path name inside of the index. Git already enforces a maximum path length of 4096 bytes. As we also have that enforcement ready in some places, just make sure that the resulting path is smaller than GIT_PATH_MAX. Reported-by: Krishna Ram Prakash R <krp@gtux.in> Reported-by: Vivek Parikh <viv0411.parikh@gmail.com>
3207ddb0	2018-03-08T12:00:27	index: fix out-of-bounds read with invalid index entry prefix length The index format in version 4 has prefix-compressed entries, where every index entry can compress its path by using a path prefix of the previous entry. Since implmenting support for this index format version in commit 5625d86b9 (index: support index v4, 2016-05-17), though, we do not correctly verify that the prefix length that we want to reuse is actually smaller or equal to the amount of characters than the length of the previous index entry's path. This can lead to a an integer underflow and subsequently to an out-of-bounds read. Fix this by verifying that the prefix is actually smaller than the previous entry's path length. Reported-by: Krishna Ram Prakash R <krp@gtux.in> Reported-by: Vivek Parikh <viv0411.parikh@gmail.com>
58a6fe94	2018-03-08T11:49:19	index: convert `read_entry` to return entry size via an out-param The function `read_entry` does not conform to our usual coding style of returning stuff via the out parameter and to use the return value for reporting errors. Due to most of our code conforming to that pattern, it has become quite natural for us to actually return `-1` in case there is any error, which has also slipped in with commit 5625d86b9 (index: support index v4, 2016-05-17). As the function returns an `size_t` only, though, the return value is wrapped around, causing the caller of `read_tree` to continue with an invalid index entry. Ultimately, this can lead to a double-free. Improve code and fix the bug by converting the function to return the index entry size via an out parameter and only using the return value to indicate errors. Reported-by: Krishna Ram Prakash R <krp@gtux.in> Reported-by: Vivek Parikh <viv0411.parikh@gmail.com>
f1ad004c	2018-02-18T22:29:48	Merge pull request #4529 from libgit2/ethomson/index_add_requires_files git_index_add_frombuffer: only accept files/links
5f774dbf	2018-02-11T10:14:13	git_index_add_frombuffer: only accept files/links Ensure that the buffer given to `git_index_add_frombuffer` represents a regular blob, an executable blob, or a link. Explicitly reject commit entries (submodules) - it makes little sense to allow users to add a submodule from a string; there's no possible path to success.
7c6e9175	2018-02-16T11:11:11	index: shut up warning on uninitialized variable Even though the `entry` variable will always be initialized when `read_entry` returns success and even though we never dereference `entry` in case `read_entry` fails, GCC prints a warning about uninitialized use. Just initialize the pointer to `NULL` in order to shut GCC up.
0c7f49dd	2017-06-30T13:39:01	Make sure to always include "common.h" first Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.
064a60e9	2017-05-19T14:06:15	index: verify we have enough space left when writing index entries In our code writing index entries, we carry around a `disk_size` representing how much memory we have in total and pass this value to `git_encode_varint` to do bounds checks. This does not make much sense, as at the time when passing on this variable it is already out of date. Fix this by subtracting used memory from `disk_size` as we go along. Furthermore, assert we've actually got enough space left to do the final path memcpy.
c71dff7e	2017-05-19T13:49:34	index: fix shared prefix computation when writing index entry When using compressed index entries, each entry's path is preceded by a varint encoding how long the shared prefix with the previous index entry actually is. We currently encode a length of `(path_len - same_len)`, which is doubly wrong. First, `path_len` is already set to `path_len - same_len` previously. Second, we want to encode the shared prefix rather than the un-shared suffix length. Fix this by using `same_len` as the varint value instead.
83e0392c	2017-05-19T13:39:05	index: also sanity check entry size with compressed entries We have a check in place whether the index has enough data left for the required footer after reading an index entry, but this was only used for uncompressed entries. Move the check down a bit so that it is executed for both compressed and uncompressed index entries.
350d2c47	2017-05-19T14:22:35	index: remove file-scope entry size macros All index entry size computations are now performed in `index_entry_size`. As such, we do not need the file-scope macros for computing these sizes anymore. Remove them and move the `entry_size` macro into the `index_entry_size` function.
46b67034	2017-05-19T13:59:53	index: don't right-pad paths when writing compressed entries Our code to write index entries to disk does not check whether the entry that is to be written should use prefix compression for the path. As such, we were overallocating memory and added bogus right-padding into the resulting index entries. As there is no padding allowed in the index version 4 format, this should actually result in an invalid index. Fix this by re-using the newly extracted `index_entry_size` function.
29f498e0	2017-05-19T13:38:34	index: move index entry size computation into its own function Create a new function `index_entry_size` which encapsulates the logic to calculate how much space is needed for an index entry, whether it is simple/extended or compressed/uncompressed. This can later be re-used by our code writing index entries.
8ceb890b	2017-05-19T12:35:21	index: set last written index entry in foreach-entry-loop The last written disk entry is currently being written inside of the function `write_disk_entry`. Make behavior a bit more obviously by instead setting it inside of `write_entries` while iterating all entries.
11d0be23	2017-05-12T10:01:43	index: set last entry when reading compressed entries To calculate the path of a compressed index entry, we need to know the preceding entry's path. While we do actually set the first predecessor correctly to "", we fail to update this while reading the entries. Fix the issue by updating `last` inside of the loop. Previously, we've been passing a double-pointer to `read_entry`, which it didn't update. As it is more obvious to update the pointer inside the loop itself, though, we can simply convert it to a normal pointer.
febe8c14	2017-05-10T14:27:12	index: fix confusion with shared prefix in compressed path names The index version 4 introduced compressed path names for the entries. From the git.git index-format documentation: At the beginning of an entry, an integer N in the variable width encoding [...] is stored, followed by a NUL-terminated string S. Removing N bytes from the end of the path name for the previous entry, and replacing it with the string S yields the path name for this entry. But instead of stripping N bytes from the previous path's string and using the remaining prefix, we were instead simply concatenating the previous path with the current entry path, which is obviously wrong. Fix the issue by correctly copying the first N bytes of the previous entry only and concatenating the result with our current entry's path.
8f1ff26b	2017-02-02T13:09:32	idxmap: remove GIT__USE_IDXMAP
f14f75d4	2017-02-02T13:08:52	khash: avoid using `kh_resize` directly
73028af8	2017-01-27T14:20:24	khash: avoid using macro magic to get return address
909d5494	2016-12-29T12:25:15	giterr_set: consistent error messages Error messages should be sentence fragments, and therefore: 1. Should not begin with a capital letter, 2. Should not conclude with punctuation, and 3. Should not end a sentence and begin a new one
65b78ea3	2016-11-17T01:08:49	use `giterr_set_str()` wherever possible `giterr_set()` is used when it is required to format a string, and since we don't really require it for this case, it is better to stick to `giterr_set_str()`. This also suppresses a warning(-Wformat-security) raised by the compiler. Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
5625d86b	2016-05-17T15:40:32	index: support index v4 Support reading and writing index v4. Index v4 uses a very simple compression scheme for pathnames, but is otherwise similar to index v3. Signed-off-by: David Turner <dturner@twitter.com>
4aaae935	2016-07-22T12:53:13	index: cast to avoid warning
6249d960	2016-06-29T17:55:44	index: include conflicts in `git_index_read_index` Ensure that we include conflicts when calling `git_index_read_index`, which will remove conflicts in the index that do not exist in the new target, and will add conflicts from the new target.
6f7ec728	2016-06-29T17:01:47	index: refactor common `read_index` functionality Most of `git_index_read_index` is common to reading any iterator. Refactor it out in case we want to implement `read_tree` in terms of it in the future.
13deb874	2016-06-07T08:35:26	index: fix NULL pointer access in index_remove_entry When removing an entry from the index by its position, we first retrieve the position from the index's entries and then try to remove the retrieved value from the index map with `DELETE_IN_MAP`. When `index_remove_entry` returns `NULL` we try to feed it into the `DELETE_IN_MAP` macro, which will unconditionally call `idxentry_hash` and then happily dereference the `NULL` entry pointer. Fix the issue by not passing a `NULL` entry into `DELETE_IN_MAP`.
46082c38	2016-06-02T02:34:03	index_read_index: invalidate new paths in tree cache When adding a new entry to an existing index via `git_index_read_index`, be sure to remove the tree cache entry for that new path. This will mark all parent trees as dirty.
9167c145	2016-06-02T01:04:58	index_read_index: set flags for path_len correctly Update the flags to reset the path_len (to emulate `index_insert`)
046ec3c9	2016-06-02T00:47:51	index_read_index: differentiate on mode Treat index entries with different modes as different, which they are, at least for the purposes of up-to-date calculations.
93de20b8	2016-06-01T14:56:27	index_read_index: reset error correctly Clear any error state upon each iteration. If one of the iterations ends (with an error of `GIT_ITEROVER`) we need to reset that error to 0, lest we stop the whole process prematurely.
f80852af	2016-05-02T14:30:14	index: fix memory leak on error case
60a194aa	2016-03-20T11:00:12	tree: re-use the id and filename in the odb object Instead of copying over the data into the individual entries, point to the originals, which are already in a format we can use.
80a834a5	2016-03-01T16:00:49	index: assert required OID are non-NULL
6ddf533a	2016-02-23T18:29:16	git_index_add: validate objects in index entries (optionally) When `GIT_OPT_ENABLE_STRICT_OBJECT_CREATION` is turned on, validate the index entries given to `git_index_add`.
9f4e7c84	2016-02-25T18:42:09	Merge pull request #3638 from ethomson/nsec USE_NSECS fixes
3d6a42d1	2016-02-25T11:23:19	nsec: support NDK's crazy nanoseconds Android NDK does not have a `struct timespec` in its `struct stat` for nanosecond support, instead it has a single nanosecond member inside the struct stat itself. We will use that and use a macro to expand to the `st_mtim` / `st_mtimespec` definition on other systems (much like the existing `st_mtime` backcompat definition).
0f1e2d20	2016-02-23T11:23:26	index: fix contradicting comparison The overflow check in `read_reuc` tries to verify if the `git__strtol32` parses an integer bigger than UINT_MAX. The `tmp` variable is casted to an unsigned int for this and then checked for being greater than UINT_MAX, which obviously can never be true. Fix this by instead fixing the `mode` field's size in `struct git_index_reuc_entry` to `uint32_t`. We can now parse the int with `git__strtol64`, which can never return a value bigger than `UINT32_MAX`, and additionally checking if the returned value is smaller than zero. We do not need to handle overflows explicitly here, as `git__strtol64` returns an error when the returned value would overflow.
7808c937	2016-02-22T15:59:15	index: plug memory leak in `read_conflict_names`
5663d4f6	2016-02-18T12:31:56	Merge pull request #3613 from ethomson/fixups Remove most of the silly warnings
594a5d12	2016-02-18T12:28:06	Merge pull request #3619 from ethomson/win32_forbidden win32: allow us to read indexes with forbidden paths on win32
318b825e	2016-02-16T17:11:46	index: allow read of index w/ illegal entries Allow `git_index_read` to handle reading existing indexes with illegal entries. Allow the low-level `git_index_add` to add properly formed `git_index_entry`s even if they contain paths that would be illegal for the current filesystem (eg, `AUX`). Continue to disallow `git_index_add_bypath` from adding entries that are illegal universally illegal (eg, `.git`, `foo/../bar`).
b2ca8d9c	2016-02-12T10:22:54	index: explicitly cast the teeny index entry members
997e0301	2016-02-12T10:11:32	index: don't use `seek` return as an error code
9a634cba	2016-02-12T10:03:29	index: explicitly cast new hash size to an int
3679ebae	2016-02-11T23:37:52	Horrible fix for #3173.
9d81509a	2015-12-23T11:54:52	index: get rid of the locking We don't support using an index object from multiple threads at the same time, so the locking doesn't have any effect when following the rules. If not following the rules, things are going to break down anyway.
ef8b7feb	2015-12-16T19:36:50	index: Also size-hint the hash table Note that we're not checking whether the resize succeeds; in OOM cases, we let it run with a "small" vector and hash table and see if by chance we can grow it dynamically as we insert the new entries. Nothing to lose really.

11ef76a9

2022-01-22T13:31:02

index: use a byte array for checksum The index's checksum is not an object ID, so we should not use the `git_oid` type. Use a byte array for checksum calculation and storage. Deprecate the `git_index_checksum` function without a replacement. This is an abstraction that callers should not care about (and indeed do not seem to be using). Remove the unused `git_index__changed_relative_to` function.

90df4302

2022-01-05T12:18:05

Fix typos

adcf638c

2021-11-21T21:34:17

filebuf: use hashes not oids The filebuf functions should use hashes directly, not indirectly using the oid functions.

7dcc29fc

2021-10-22T22:51:59

Make enum in src,tests and examples C90 compliant by removing trailing comma.

63e36c53

2021-11-01T09:34:32

path: `validate` -> `is_valid` Since we're returning a boolean about validation, the name is more properly "is valid".

95117d47

2021-10-31T09:45:46

path: separate git-specific path functions from util Introduce `git_fs_path`, which operates on generic filesystem paths. `git_path` will be kept for only git-specific path functionality (for example, checking for `.git` in a path).

f0e693b1

2021-09-07T17:53:49

str: introduce `git_str` for internal, `git_buf` is external libgit2 has two distinct requirements that were previously solved by `git_buf`. We require: 1. A general purpose string class that provides a number of utility APIs for manipulating data (eg, concatenating, truncating, etc). 2. A structure that we can use to return strings to callers that they can take ownership of. By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult. Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr"). The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.) Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.

31ecaca2

2021-09-30T08:11:40

hash: hash functions operate on byte arrays not git_oids Separate the concerns of the hash functions from the git_oid functions. The git_oid structure will need to understand either SHA1 or SHA256; the hash functions should only deal with the appropriate one of these.

2a713da1

2021-09-29T21:31:17

hash: accept the algorithm in inputs

578aeba9

2021-03-20T17:00:33

use git_repository_workdir_path to generate paths Use `git_repository_workdir_path` to generate workdir paths since it will validate the length.

88323cd0

2021-03-20T09:52:17

path: git_path_isvalid -> git_path_validate If we want to validate more and different types of paths, the name `git_path_validate` makes that easier and more expressive. We can add, for example, `git_path_validate_foo` while the current name makes that less ergonomic.

40930508

2021-02-18T16:36:42

index: Initialize case_sorted to GIT_VECTOR_INIT This is for extra safety within write_entries

21981f28

2021-02-16T13:43:09

index: Check git_vector_dup error in write_entries If allocating case_sorted.contents fails, git_vector_sort will segfault.

ab772974

2020-12-05T15:49:30

threads: give atomic functions the git_atomic prefix

37763d38

2020-12-05T15:26:59

threads: rename git_atomic to git_atomic32 Clarify the `git_atomic` type and functions now that we have a 64 bit version as well (`git_atomic64`).

1fa339be

2020-04-05T16:50:13

index: use GIT_ASSERT

d0656ac8

2020-06-27T12:15:26

Make the tests run cleanly under UndefinedBehaviorSanitizer This change makes the tests run cleanly under `-fsanitize=undefined,nullability` and comprises of: * Avoids some arithmetic with NULL pointers (which UBSan does not like). * Avoids an overflow in a shift, due to an uint8_t being implicitly converted to a signed 32-bit signed integer after being shifted by a 32-bit signed integer. * Avoids a unaligned read in libgit2. * Ignores unaligned reads in the SHA1 library, since it only happens on Intel processors, where it is _still_ undefined behavior, but the semantics are moderately well-understood. Of notable omission is `-fsanitize=integer`, since there are lots of warnings in zlib and the SHA1 library which probably don't make sense to fix and I could not figure out how to silence easily. libgit2 itself also has ~100s of warnings which are mostly innocuous (e.g. use of enum constants that only fit on an `uint32_t`, but there is no way to do that in a simple fashion because the data type chosen for enumerated types is implementation-defined), and investigating whether there are worrying warnings would need reducing the noise significantly.

c6184f0c

2020-06-08T21:07:36

tree-wide: do not compile deprecated functions with hard deprecation When compiling libgit2 with -DDEPRECATE_HARD, we add a preprocessor definition `GIT_DEPRECATE_HARD` which causes the "git2/deprecated.h" header to be empty. As a result, no function declarations are made available to callers, but the implementations are still available to link against. This has the problem that function declarations also aren't visible to the implementations, meaning that the symbol's visibility will not be set up correctly. As a result, the resulting library may not expose those deprecated symbols at all on some platforms and thus cause linking errors. Fix the issue by conditionally compiling deprecated functions, only. While it becomes impossible to link against such a library in case one uses deprecated functions, distributors of libgit2 aren't expected to pass -DDEPRECATE_HARD anyway. Instead, users of libgit2 should manually define GIT_DEPRECATE_HARD to hide deprecated functions. Using "real" hard deprecation still makes sense in the context of CI to test we don't use deprecated symbols ourselves and in case a dependant uses libgit2 in a vendored way and knows it won't ever use any of the deprecated symbols anyway.

17641f1f

2020-06-01T15:05:51

Merge pull request #5526 from libgit2/ethomson/poolinit git_pool_init: allow the function to fail

0f35efeb

2020-05-23T10:15:51

git_pool_init: handle failure cases Propagate failures caused by pool initialization errors.

8c96d56d

2020-05-26T04:53:09

index: write v4: bugfix: prefix path with strip_len, not same_len According to index-format.txt of git, the path of an entry is prefixed with N, where N indicates the length of bytes to be stripped.

cb43274a

2020-01-18T17:42:52

index functions: return an int Stop returning a void for functions, future-proofing them to allow them to fail.

7fc97eb3

2020-01-09T14:21:41

index: fix resizing index map twice on case-insensitive systems Depending on whether the index map is case-sensitive or insensitive, we need to call either `git_idxmap_icase_resize` or `git_idxmap_resize`. There are multiple locations where we thus use the following pattern: if (index->ignore_case && git_idxmap_icase_resize(map, length) < 0) return -1; else if (git_idxmap_resize(map, length) < 0) return -1; The funny thing is: on case-insensitive systems, we will try to resize the map twice in case where `git_idxmap_icase_resize()` doesn't error. While this will still use the correct hashing function as both map types use the same, this bug will at least cause us to resize the map twice in a row. Fix the issue by introducing a new function `index_map_resize` that handles case-sensitivity, similar to how `index_map_set` and `index_map_delete`. Convert all call sites where we were previously resizing the map to use that new function.

ab45887f

2020-01-09T14:15:02

index: replace map macros with inline functions Traditionally, our maps were mostly implemented via macros that had weird call semantics. This shows in our index code, where we have macros that insert into an index map case-sensitively or insensitively, as they still return error codes via an error parameter. This is unwieldy and, most importantly, not necessary anymore, due to the introduction of our high-level map API and removal of macros. Replace them with inlined functions to make code easier to read.

658022c4

2019-07-18T13:53:41

configuration: cvar -> configmap `cvar` is an unhelpful name. Refactor its usage to `configmap` for more clarity.

7e49deba

2019-05-20T06:35:11

index: safely cast file size

6574cd00

2019-06-08T19:25:36

index: rename `frombuffer` to `from_buffer` The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the index functions for consistency with the rest of the library.

08f39208

2019-06-08T17:46:04

blob: add underscore to `from` functions The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the blob functions for consistency with the rest of the library.

8da93944

2018-12-01T10:52:44

idxmap: have `resize` functions return proper error code The currently existing function `git_idxmap_resize` and `git_idxmap_icase_resize` do not return any error codes at all due to their previous implementation making use of a macro. Due to that, it is impossible to see whether the resize operation might have failed due to an out-of-memory situation. Fix this by providing a proper error code. Adjust callers to make use of it.

661fc57b

2018-12-01T01:16:25

idxmap: introduce high-level setter for key/value pairs Currently, one would use the function `git_idxmap_insert` to insert key/value pairs into a map. This function has historically been a macro, which is why its syntax is kind of weird: instead of returning an error code directly, it instead has to be passed a pointer to where the return value shall be stored. This does not match libgit2's common idiom of directly returning error codes. Introduce a new function `git_idxmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert all callers of `git_idxmap_insert` to make use of it.

d00c24a9

2019-01-23T10:49:25

idxmap: introduce high-level getter for values The current way of looking up an entry from a map is tightly coupled with the map implementation, as one first has to look up the index of the key and then retrieve the associated value by using the index. As a caller, you usually do not care about any indices at all, though, so this is more complicated than really necessary. Furthermore, it invites for errors to happen if the correct error checking sequence is not being followed. Introduce new high-level functions `git_idxmap_get` and `git_idxmap_icase_get` that take a map and a key and return a pointer to the associated value if such a key exists. Otherwise, a `NULL` pointer is returned. Adjust all callers that can trivially be converted.

351eeff3

2019-01-23T10:42:46

maps: use uniform lifecycle management functions Currently, the lifecycle functions for maps (allocation, deallocation, resize) are not named in a uniform way and do not have a uniform function signature. Rename the functions to fix that, and stick to libgit2's naming scheme of saying `git_foo_new`. This results in the following new interface for allocation: - `int git_<t>map_new(git_<t>map **out)` to allocate a new map, returning an error code if we ran out of memory - `void git_<t>map_free(git_<t>map *map)` to free a map - `void git_<t>map_clear(git<t>map *map)` to remove all entries from a map This commit also fixes all existing callers.

494448a5

2019-01-20T19:10:08

index: explicitly cast down to a size_t Quiet down a warning from MSVC about how we're potentially losing data. This cast is safe since we've explicitly tested that `strip_len` <= `last_len`.

0bf7e043

2019-01-24T12:12:04

index: preserve extension parsing errors Previously, we would clobber any extension-specific error message with an "extension is truncated" message. This makes `read_extension` correctly preserve those errors, takes responsibility for truncation errors, and adds a new message with the actual extension signature for unsupported mandatory extensions.

f673e232

2018-12-27T13:47:34

git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.

18e71e6d

2018-11-28T13:31:06

index: use new enum and structure names Use the new-style index names throughout our own codebase.

852bc9f4

2018-11-23T19:26:24

khash: remove intricate knowledge of khash types Instead of using the `khiter_t`, `git_strmap_iter` and `khint_t` types, simply use `size_t` instead. This decouples code from the khash stuff and makes it possible to move the khash includes into the implementation files.

0e3e832d

2018-11-21T13:30:01

Merge pull request #4884 from libgit2/ethomson/index_iterator index: introduce git_index_iterator

c358bbc5

2018-11-12T17:22:47

index: introduce git_index_iterator Provide a public git_index_iterator API that is backed by an index snapshot. This allows consumers to provide a stable iteration even while manipulating the index during iteration.

28239be3

2018-11-13T13:27:41

Merge pull request #4818 from pks-t/pks/index-collision Index collision fixes

8b6e2895

2018-09-21T15:18:03

index: fix adding index entries with conflicting files When adding an index entry "a/b/c" while an index entry "a/b" already exists, git will happily remove "a/b/c" and only add the new index entry: $ git init test Initialized empty Git repository in /tmp/test.repo/test/.git/ $ touch x $ git add x $ rm x $ mkdir x $ touch x/y $ git add x/y $ git status A x/y The other way round, adding an index entry "a/b" with an entry "a/b/c" already existing is equivalent, where git will remove "a/b/c" and add "a/b". In contrast, libgit2 will currently fail to add these properly and instead complain about the entry appearing as both a file and a directory. This is a programming error, though: our current code already tries to detect and, in the case of `git_index_add`, to automatically replace such index entries. Funnily enough, we already remove the conflicting index entries, but instead of adding the new entry we then bail out afterwards. This leaves callers with the worst of both worlds: we both remove the old entry but fail to add the new one. The root cause is weird semantics of the `has_file_name` and `has_dir_name` functions. While these functions only sound like they are responsible for detecting such conflicts, they will also already remove them in case where its `ok_to_replace` parameter is set. But even if we tell it to replace such entries, it will return an error code. Fix the error by returning success in case where the entries have been replaced. Fix an already existing test which tested for wrong behaviour. Note that the test didn't notice that the resulting tree had no entries. Thus it is fine to change existing behaviour here, as the previous result could've let to silently loosing data. Also add a new test that verifies behaviour in the reverse conflicting case.

923317db

2018-09-21T12:57:02

index: modernize error handling of `index_insert` The current error hanling of the function `index_insert` is currently very fragile. Instead of erroring out in case an error has happened, it will instead verify that no error has happened for each statement. This makes adding new code to that function an adventurous task. Improve the situation by converting the function to use our typical `goto out` pattern.

600ceadd

2018-10-18T11:29:06

index: avoid out-of-bounds read when reading reuc entry stage We use `git__strtol64` to parse file modes of the index entries, which does not limit the parsed buffer length. As the index can be essentially treated as "untrusted" in that the data stems from the file system, it may be misformatted and may not contain terminating `NUL` bytes. This may lead to out-of-bounds reads when trying to parse index entries with such malformatted modes. Fix the issue by using `git__strntol64` instead.

c70713d6

2018-09-11T15:53:35

index: release the snapshot instead of freeing the index Previously we would assert in index_free because the reader incrementation would not be balanced. Release the snapshot normally, so the variable gets decremented before the index is freed.

581d5492

2018-08-16T22:45:43

Fix leak in index.c

bfa1f022

2018-06-22T19:17:08

settings: optional unsaved index safety Add the `GIT_OPT_ENABLE_UNSAVED_INDEX_SAFETY` option, which will cause commands that reload the on-disk index to fail if the current `git_index` has changed that have not been saved. This will prevent users from - for example - adding a file to the index then calling a function like `git_checkout` and having that file be silently removed from the index since it was re-read from disk. Now calls that would re-read the index will fail if the index is "dirty", meaning changes have been made to it but have not been written. Users can either `git_index_read` to discard those changes explicitly, or `git_index_write` to write them.

787768c2

2018-06-22T19:07:54

index: return a unique error code on dirty index When the index is dirty, return GIT_EINDEXDIRTY so that consumers can identify the exact problem programatically.

b242cdbf

2017-11-17T00:19:07

index: commit the changes to the index properly Now that the index has a "dirty" state, where it has changes that have not yet been committed or rolled back, our tests need to be adapted to actually commit or rollback the changes instead of assuming that the index can be operated on in its indeterminate state.

7c56c49b

2017-11-12T08:09:35

index: add a dirty bit reflecting unsaved changes Teach the index when it is "dirty", and has unsaved changes. Consider the index dirty whenever a caller has added or removed an entry from the main index, REUC or NAME section, including when the index is completely cleared. Similarly, consider the index _not_ dirty immediately after it is written, or when it is read from the on-disk index. This allows us to ensure that unsaved changes are not lost when we automatically refresh the index.

ecf4f33a

2018-02-08T11:14:48

Convert usage of `git_buf_free` to new `git_buf_dispose`

93271f59

2018-05-25T01:41:33

index: Fix alignment issues in write_disk_entry() In order to avoid alignment issues on certain target architectures, it is necessary to use memcpy() when modifying elements of a struct inside a buffer returned by git_filebuf_reserve().

a7168b47

2018-05-22T16:13:47

path: reject .gitmodules as a symlink Any part of the library which asks the question can pass in the mode to have it checked against `.gitmodules` being a symlink. This is particularly relevant for adding entries to the index from the worktree and for checking out files.

58ff913a

2018-05-22T15:48:38

index: stat before creating the entry This is so we have it available for the path validity checking. In a later commit we will start rejecting `.gitmodules` files as symlinks.

3db1af1f

2018-03-08T12:36:46

index: error out on unreasonable prefix-compressed path lengths When computing the complete path length from the encoded prefix-compressed path, we end up just allocating the complete path without ever checking what the encoded path length actually is. This can easily lead to a denial of service by just encoding an unreasonable long path name inside of the index. Git already enforces a maximum path length of 4096 bytes. As we also have that enforcement ready in some places, just make sure that the resulting path is smaller than GIT_PATH_MAX. Reported-by: Krishna Ram Prakash R <krp@gtux.in> Reported-by: Vivek Parikh <viv0411.parikh@gmail.com>

3207ddb0

2018-03-08T12:00:27

index: fix out-of-bounds read with invalid index entry prefix length The index format in version 4 has prefix-compressed entries, where every index entry can compress its path by using a path prefix of the previous entry. Since implmenting support for this index format version in commit 5625d86b9 (index: support index v4, 2016-05-17), though, we do not correctly verify that the prefix length that we want to reuse is actually smaller or equal to the amount of characters than the length of the previous index entry's path. This can lead to a an integer underflow and subsequently to an out-of-bounds read. Fix this by verifying that the prefix is actually smaller than the previous entry's path length. Reported-by: Krishna Ram Prakash R <krp@gtux.in> Reported-by: Vivek Parikh <viv0411.parikh@gmail.com>

58a6fe94

2018-03-08T11:49:19

index: convert `read_entry` to return entry size via an out-param The function `read_entry` does not conform to our usual coding style of returning stuff via the out parameter and to use the return value for reporting errors. Due to most of our code conforming to that pattern, it has become quite natural for us to actually return `-1` in case there is any error, which has also slipped in with commit 5625d86b9 (index: support index v4, 2016-05-17). As the function returns an `size_t` only, though, the return value is wrapped around, causing the caller of `read_tree` to continue with an invalid index entry. Ultimately, this can lead to a double-free. Improve code and fix the bug by converting the function to return the index entry size via an out parameter and only using the return value to indicate errors. Reported-by: Krishna Ram Prakash R <krp@gtux.in> Reported-by: Vivek Parikh <viv0411.parikh@gmail.com>

f1ad004c

2018-02-18T22:29:48

Merge pull request #4529 from libgit2/ethomson/index_add_requires_files git_index_add_frombuffer: only accept files/links

5f774dbf

2018-02-11T10:14:13

git_index_add_frombuffer: only accept files/links Ensure that the buffer given to `git_index_add_frombuffer` represents a regular blob, an executable blob, or a link. Explicitly reject commit entries (submodules) - it makes little sense to allow users to add a submodule from a string; there's no possible path to success.

7c6e9175

2018-02-16T11:11:11

index: shut up warning on uninitialized variable Even though the `entry` variable will always be initialized when `read_entry` returns success and even though we never dereference `entry` in case `read_entry` fails, GCC prints a warning about uninitialized use. Just initialize the pointer to `NULL` in order to shut GCC up.

0c7f49dd

2017-06-30T13:39:01

Make sure to always include "common.h" first Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.

064a60e9

2017-05-19T14:06:15

index: verify we have enough space left when writing index entries In our code writing index entries, we carry around a `disk_size` representing how much memory we have in total and pass this value to `git_encode_varint` to do bounds checks. This does not make much sense, as at the time when passing on this variable it is already out of date. Fix this by subtracting used memory from `disk_size` as we go along. Furthermore, assert we've actually got enough space left to do the final path memcpy.

c71dff7e

2017-05-19T13:49:34

index: fix shared prefix computation when writing index entry When using compressed index entries, each entry's path is preceded by a varint encoding how long the shared prefix with the previous index entry actually is. We currently encode a length of `(path_len - same_len)`, which is doubly wrong. First, `path_len` is already set to `path_len - same_len` previously. Second, we want to encode the shared prefix rather than the un-shared suffix length. Fix this by using `same_len` as the varint value instead.

83e0392c

2017-05-19T13:39:05

index: also sanity check entry size with compressed entries We have a check in place whether the index has enough data left for the required footer after reading an index entry, but this was only used for uncompressed entries. Move the check down a bit so that it is executed for both compressed and uncompressed index entries.

350d2c47

2017-05-19T14:22:35

index: remove file-scope entry size macros All index entry size computations are now performed in `index_entry_size`. As such, we do not need the file-scope macros for computing these sizes anymore. Remove them and move the `entry_size` macro into the `index_entry_size` function.

46b67034

2017-05-19T13:59:53

index: don't right-pad paths when writing compressed entries Our code to write index entries to disk does not check whether the entry that is to be written should use prefix compression for the path. As such, we were overallocating memory and added bogus right-padding into the resulting index entries. As there is no padding allowed in the index version 4 format, this should actually result in an invalid index. Fix this by re-using the newly extracted `index_entry_size` function.

29f498e0

2017-05-19T13:38:34

index: move index entry size computation into its own function Create a new function `index_entry_size` which encapsulates the logic to calculate how much space is needed for an index entry, whether it is simple/extended or compressed/uncompressed. This can later be re-used by our code writing index entries.

8ceb890b

2017-05-19T12:35:21

index: set last written index entry in foreach-entry-loop The last written disk entry is currently being written inside of the function `write_disk_entry`. Make behavior a bit more obviously by instead setting it inside of `write_entries` while iterating all entries.

11d0be23

2017-05-12T10:01:43

index: set last entry when reading compressed entries To calculate the path of a compressed index entry, we need to know the preceding entry's path. While we do actually set the first predecessor correctly to "", we fail to update this while reading the entries. Fix the issue by updating `last` inside of the loop. Previously, we've been passing a double-pointer to `read_entry`, which it didn't update. As it is more obvious to update the pointer inside the loop itself, though, we can simply convert it to a normal pointer.

febe8c14

2017-05-10T14:27:12

index: fix confusion with shared prefix in compressed path names The index version 4 introduced compressed path names for the entries. From the git.git index-format documentation: At the beginning of an entry, an integer N in the variable width encoding [...] is stored, followed by a NUL-terminated string S. Removing N bytes from the end of the path name for the previous entry, and replacing it with the string S yields the path name for this entry. But instead of stripping N bytes from the previous path's string and using the remaining prefix, we were instead simply concatenating the previous path with the current entry path, which is obviously wrong. Fix the issue by correctly copying the first N bytes of the previous entry only and concatenating the result with our current entry's path.

8f1ff26b

2017-02-02T13:09:32

idxmap: remove GIT__USE_IDXMAP

f14f75d4

2017-02-02T13:08:52

khash: avoid using `kh_resize` directly

73028af8

2017-01-27T14:20:24

khash: avoid using macro magic to get return address

909d5494

2016-12-29T12:25:15

giterr_set: consistent error messages Error messages should be sentence fragments, and therefore: 1. Should not begin with a capital letter, 2. Should not conclude with punctuation, and 3. Should not end a sentence and begin a new one

65b78ea3

2016-11-17T01:08:49

use `giterr_set_str()` wherever possible `giterr_set()` is used when it is required to format a string, and since we don't really require it for this case, it is better to stick to `giterr_set_str()`. This also suppresses a warning(-Wformat-security) raised by the compiler. Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>

5625d86b

2016-05-17T15:40:32

index: support index v4 Support reading and writing index v4. Index v4 uses a very simple compression scheme for pathnames, but is otherwise similar to index v3. Signed-off-by: David Turner <dturner@twitter.com>

4aaae935

2016-07-22T12:53:13

index: cast to avoid warning

6249d960

2016-06-29T17:55:44

index: include conflicts in `git_index_read_index` Ensure that we include conflicts when calling `git_index_read_index`, which will remove conflicts in the index that do not exist in the new target, and will add conflicts from the new target.

6f7ec728

2016-06-29T17:01:47

index: refactor common `read_index` functionality Most of `git_index_read_index` is common to reading any iterator. Refactor it out in case we want to implement `read_tree` in terms of it in the future.

13deb874

2016-06-07T08:35:26

index: fix NULL pointer access in index_remove_entry When removing an entry from the index by its position, we first retrieve the position from the index's entries and then try to remove the retrieved value from the index map with `DELETE_IN_MAP`. When `index_remove_entry` returns `NULL` we try to feed it into the `DELETE_IN_MAP` macro, which will unconditionally call `idxentry_hash` and then happily dereference the `NULL` entry pointer. Fix the issue by not passing a `NULL` entry into `DELETE_IN_MAP`.

46082c38

2016-06-02T02:34:03

index_read_index: invalidate new paths in tree cache When adding a new entry to an existing index via `git_index_read_index`, be sure to remove the tree cache entry for that new path. This will mark all parent trees as dirty.

9167c145

2016-06-02T01:04:58

index_read_index: set flags for path_len correctly Update the flags to reset the path_len (to emulate `index_insert`)

046ec3c9

2016-06-02T00:47:51

index_read_index: differentiate on mode Treat index entries with different modes as different, which they are, at least for the purposes of up-to-date calculations.

93de20b8

2016-06-01T14:56:27

index_read_index: reset error correctly Clear any error state upon each iteration. If one of the iterations ends (with an error of `GIT_ITEROVER`) we need to reset that error to 0, lest we stop the whole process prematurely.

f80852af

2016-05-02T14:30:14

index: fix memory leak on error case

60a194aa

2016-03-20T11:00:12

tree: re-use the id and filename in the odb object Instead of copying over the data into the individual entries, point to the originals, which are already in a format we can use.

80a834a5

2016-03-01T16:00:49

index: assert required OID are non-NULL

6ddf533a

2016-02-23T18:29:16

git_index_add: validate objects in index entries (optionally) When `GIT_OPT_ENABLE_STRICT_OBJECT_CREATION` is turned on, validate the index entries given to `git_index_add`.

9f4e7c84

2016-02-25T18:42:09

Merge pull request #3638 from ethomson/nsec USE_NSECS fixes

3d6a42d1

2016-02-25T11:23:19

nsec: support NDK's crazy nanoseconds Android NDK does not have a `struct timespec` in its `struct stat` for nanosecond support, instead it has a single nanosecond member inside the struct stat itself. We will use that and use a macro to expand to the `st_mtim` / `st_mtimespec` definition on other systems (much like the existing `st_mtime` backcompat definition).

0f1e2d20

2016-02-23T11:23:26

index: fix contradicting comparison The overflow check in `read_reuc` tries to verify if the `git__strtol32` parses an integer bigger than UINT_MAX. The `tmp` variable is casted to an unsigned int for this and then checked for being greater than UINT_MAX, which obviously can never be true. Fix this by instead fixing the `mode` field's size in `struct git_index_reuc_entry` to `uint32_t`. We can now parse the int with `git__strtol64`, which can never return a value bigger than `UINT32_MAX`, and additionally checking if the returned value is smaller than zero. We do not need to handle overflows explicitly here, as `git__strtol64` returns an error when the returned value would overflow.

7808c937

2016-02-22T15:59:15

index: plug memory leak in `read_conflict_names`

5663d4f6

2016-02-18T12:31:56

Merge pull request #3613 from ethomson/fixups Remove most of the silly warnings

594a5d12

2016-02-18T12:28:06

Merge pull request #3619 from ethomson/win32_forbidden win32: allow us to read indexes with forbidden paths on win32

318b825e

2016-02-16T17:11:46

index: allow read of index w/ illegal entries Allow `git_index_read` to handle reading existing indexes with illegal entries. Allow the low-level `git_index_add` to add properly formed `git_index_entry`s even if they contain paths that would be illegal for the current filesystem (eg, `AUX`). Continue to disallow `git_index_add_bypath` from adding entries that are illegal universally illegal (eg, `.git`, `foo/../bar`).

b2ca8d9c

2016-02-12T10:22:54

index: explicitly cast the teeny index entry members

997e0301

2016-02-12T10:11:32

index: don't use `seek` return as an error code

9a634cba

2016-02-12T10:03:29

index: explicitly cast new hash size to an int

3679ebae

2016-02-11T23:37:52

Horrible fix for #3173.

9d81509a

2015-12-23T11:54:52

index: get rid of the locking We don't support using an index object from multiple threads at the same time, so the locking doesn't have any effect when following the rules. If not following the rules, things are going to break down anyway.

ef8b7feb

2015-12-16T19:36:50

index: Also size-hint the hash table Note that we're not checking whether the resize succeeds; in OOM cases, we let it run with a "small" vector and hash table and see if by chance we can grow it dynamically as we insert the new entries. Nothing to lose really.

thodg/libgit2/src/index.c

src/index.c

Log