kmx git

Commit	Date	Message
ceddeed8	2021-11-11T15:20:50	Merge pull request #6104 from libgit2/ethomson/path path: refactor utility path functions
95117d47	2021-10-31T09:45:46	path: separate git-specific path functions from util Introduce `git_fs_path`, which operates on generic filesystem paths. `git_path` will be kept for only git-specific path functionality (for example, checking for `.git` in a path).
94cb060c	2021-11-08T14:54:09	Add tests for ODB refresh Add optional refreshing in the fake backend, and count the number of refresh calls if enabled.
f0e693b1	2021-09-07T17:53:49	str: introduce `git_str` for internal, `git_buf` is external libgit2 has two distinct requirements that were previously solved by `git_buf`. We require: 1. A general purpose string class that provides a number of utility APIs for manipulating data (eg, concatenating, truncating, etc). 2. A structure that we can use to return strings to callers that they can take ownership of. By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult. Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr"). The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.) Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.
31ecaca2	2021-09-30T08:11:40	hash: hash functions operate on byte arrays not git_oids Separate the concerns of the hash functions from the git_oid functions. The git_oid structure will need to understand either SHA1 or SHA256; the hash functions should only deal with the appropriate one of these.
2a713da1	2021-09-29T21:31:17	hash: accept the algorithm in inputs
d095502e	2021-08-08T13:34:06	tests: don't generate false positives on empty path segments
e5975f36	2021-07-30T11:37:12	tests: reset odb backend priority
cd460522	2020-04-20T22:16:52	odb: Implement option for overriding of default odb backend priority Introduce GIT_OPT_SET_ODB_LOOSE_PRIORITY and GIT_OPT_SET_ODB_PACKED_PRIORITY to allow overriding the default priority values for the default ODB backends. Libgit2 has historically assumed that most objects for long- running operations will be packed, therefore GIT_LOOSE_PRIORITY is set to 1 by default, and GIT_PACKED_PRIORITY to 2. When a client allows libgit2 to set the default backends, they can specify an override for the two priority values in order to change the order in which each ODB backend is accessed.
08f39208	2019-06-08T17:46:04	blob: add underscore to `from` functions The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the blob functions for consistency with the rest of the library.
6b3730d4	2019-02-16T19:55:30	Fix a memory leak in odb_otype_fast() This change frees a copy of a cached object in odb_otype_fast().
f673e232	2018-12-27T13:47:34	git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.
f7416509	2019-01-20T20:15:31	Fix odb foreach to also close on positive error code In include/git2/odb.h it states that callback can also return positive value which should break looping. Implementations of git_odb_foreach() and pack_backend__foreach() did not respect that.
168fe39b	2018-11-28T14:26:57	object_type: use new enumeration names Use the new object_type enumeration names within the codebase.
ecf4f33a	2018-02-08T11:14:48	Convert usage of `git_buf_free` to new `git_buf_dispose`
a52b4c51	2018-03-23T09:59:46	odb: fix writing to fake write streams In commit 7ec7aa4a7 (odb: assert on logic errors when writing objects, 2018-02-01), the check for whether we are trying to overflowing the fake stream buffer was changed from returning an error to raising an assert. The conversion forgot though that the logic around `assert`s are basically inverted. Previously, if the statement stream->written + len > steram->size evaluated to true, we would return a `-1`. Now we are asserting that this statement is true, and in case it is not we will raise an error. So the conversion to the `assert` in fact changed the behaviour to the complete opposite intention. Fix the assert by inverting its condition again and add a regression test.
904307af	2018-03-23T09:58:57	tests: add tests for the mempack ODB backend Our mempack ODB backend has no test coverage at all right now. Add a simple test suite to at least have some coverage of the most basic operations on the ODB.
619f61a8	2018-02-01T06:22:36	odb: error when we can't create object header Return an error to the caller when we can't create an object header for some reason (printf failure) instead of simply asserting.
909a1992	2017-12-31T09:56:30	odb_loose: largefile tests only on 64 bit platforms Only run the large file tests on 64 bit platforms. Even though we support streaming reads on objects, and do not need to fit them in memory, we use `size_t` in various places to reflect the size of an object.
27078e58	2017-12-18T23:11:42	odb_loose: test read_header on large blobs Test that we can read_header on large blobs. This should succeed on all platforms since we read only a few bytes into memory to be able to parse the header.
e118231b	2017-12-18T23:11:24	odb_loose: test read_header explicitly
b1e66bfc	2017-12-17T16:31:35	odb: test loose object streaming
dbe3d3e9	2017-12-17T02:12:19	odb_loose: test reading a large file in stream Since some test situations may have generous disk space, but limited RAM (eg hosted build agents), test that we can stream a large file into a loose object, and then stream it out of the loose object storage.
275f103d	2018-01-12T08:59:40	odb: reject reading and writing null OIDs The null OID (hash with all zeroes) indicates a missing object in upstream git and is thus not a valid object ID. Add defensive measurements to avoid writing such a hash to the object database in the very unlikely case where some data results in the null OID. Furthermore, add shortcuts when reading the null OID from the ODB to avoid ever returning an object when a faulty repository may contain the null OID.
456e5218	2017-12-20T16:13:31	tests: add GITTEST_SLOW env var check Writing very large files may be slow, particularly on inefficient filesystems and when running instrumented code to detect invalid memory accesses (eg within valgrind or similar tools). Introduce `GITTEST_SLOW` so that tests that are slow can be skipped by the CI system.
3e6533ba	2017-12-10T17:25:00	odb_loose: reject objects that cannot fit in memory Check the size of objects being read from the loose odb backend and reject those that would not fit in memory with an error message that reflects the actual problem, instead of error'ing later with an unintuitive error message regarding truncation or invalid hashes.
dacc3291	2017-11-30T15:49:05	odb: test loose reading/writing large objects Introduce a test for very large objects in the ODB. Write a large object (5 GB) and ensure that the write succeeds and provides us the expected object ID. Introduce a test that writes that file and ensures that we can subsequently read it.
a180e7d9	2017-06-13T11:10:19	tests: odb: add more low-level backend tests Introduce a new test suite "odb::backend::simple", which utilizes the fake backend to exercise the ODB abstraction layer. While such tests already exist for the case where multiple backends are put together, no direct testing for functionality with a single backend exist yet.
b2e53f36	2017-06-13T11:39:36	tests: odb: implement `exists_prefix` for the fake backend The fake backend currently implements all reading functions except for the `exists_prefix` one. Implement it to enable further testing of the ODB layer.
983e627d	2017-06-13T11:38:59	tests: odb: use correct OID length The `search_object` function takes the OID length as one of its parameters, where its maximum length is `GIT_OID_HEXSZ`. The `exists` function of the fake backend used `GIT_OID_RAWSZ` though, leading to only the first half of the OID being used when finding the correct object.
c4cbb3b1	2017-06-13T11:38:14	tests: odb: have the fake backend detect ambiguous prefixes In order to be able to test the ODB prefix functions, we need to be able to detect ambiguous prefixes in case multiple objects with the same prefix exist in the fake ODB. Extend `search_object` to detect ambiguous queries and have callers return its error code instead of always returning `GIT_ENOTFOUND`.
f148258a	2017-06-12T16:19:45	tests: odb: add tests with multiple backends Previous to pulling out and extending the fake backend, it was quite cumbersome to write tests for very specific scenarios regarding backends. But as we have made it more generic, it has become much easier to do so. As such, this commit adds multiple tests for scenarios with multiple backends for the ODB. The changes also include a test for a very targeted scenario. When one backend found a matching object via `read_prefix`, but the last backend returns `GIT_ENOTFOUND` and when object hash verification is turned off, we fail to reset the error code to `GIT_OK`. This causes us to segfault later on, when doing a double-free on the returned object.
6e010bb1	2017-06-12T15:43:56	tests: odb: allow passing fake objects to the fake backend Right now, the fake backend is quite restrained in the way how it works: we pass it an OID which it is to return later as well as an error code we want it to return. While this is sufficient for existing tests, we can make the fake backend a little bit more generic in order to allow us testing for additional scenarios. To do so, we change the backend to not accept an error code and OID which it is to return for queries, but instead a simple array of OIDs with their respective blob contents. On each query, the fake backend simply iterates through this array and returns the first matching object.
369cb45f	2017-06-12T15:21:58	tests: do not reuse OID from backend In order to make the fake backend more useful, we want to enable it holding multiple object references. To do so, we need to decouple it from the single fake OID it currently holds, which we simply move up into the calling tests.
2add34d0	2017-06-12T14:53:46	tests: odb: move fake backend into its own file The fake backend used by the test suite `odb::backend::nonrefreshing` is useful to have some low-level tests for the ODB layer. As such, we move the implementation into its own `backend_helpers` module.
6c23704d	2017-06-08T21:40:18	settings: rename `GIT_OPT_ENABLE_SYNCHRONOUS_OBJECT_CREATION` Initially, the setting has been solely used to enable the use of `fsync()` when creating objects. Since then, the use has been extended to also cover references and index files. As the option is not yet part of any release, we can still correct this by renaming the option to something more sensible, indicating not only correlation to objects. This commit renames the option to `GIT_OPT_ENABLE_FSYNC_GITDIR`. We also move the variable from the object to repository source code.
28a0741f	2017-04-10T09:30:08	odb: verify object hashes The upstream git.git project verifies objects when looking them up from disk. This avoids scenarios where objects have somehow become corrupt on disk, e.g. due to hardware failures or bit flips. While our mantra is usually to follow upstream behavior, we do not do so in this case, as we never check hashes of objects we have just read from disk. To fix this, we create a new error class `GIT_EMISMATCH` which denotes that we have looked up an object with a hashsum mismatch. `odb_read_1` will then, after having read the object from its backend, hash the object and compare the resulting hash to the expected hash. If hashes do not match, it will return an error. This obviously introduces another computation of checksums and could potentially impact performance. Note though that we usually perform I/O operations directly before doing this computation, and as such the actual overhead should be drowned out by I/O. Running our test suite seems to confirm this guess. On a Linux system with best-of-five timings, we had 21.592s with the check enabled and 21.590s with the ckeck disabled. Note though that our test suite mostly contains very small blobs only. It is expected that repositories with bigger blobs may notice an increased hit by this check. In addition to a new test, we also had to change the odb::backend::nonrefreshing test suite, which now triggers a hashsum mismatch when looking up the commit "deadbeef...". This is expected, as the fake backend allocated inside of the test will return an empty object for the OID "deadbeef...", which will obviously not hash back to "deadbeef..." again. We can simply adjust the hash to equal the hash of the empty object here to fix this test.
e29e8029	2017-04-10T10:31:22	tests: odb: make hash of fake backend configurable In the odb::backend::nonrefreshing test suite, we set up a fake backend so that we are able to determine if backend functions are called correctly. During the setup, we also parse an OID which is later on used to read out the pseudo-object. While this procedure works right now, it will create problems later when we implement hash verification for looked up objects. The current OID ("deadbeef") will not match the hash of contents we give back to the ODB layer and thus cannot be verified. Make the hash configurable so that we can simply switch the returned for single tests.
89d403cc	2017-04-05T09:50:12	win32: enable `p_utimes` for readonly files Instead of failing to set the timestamp of a read-only file (like any object file), set it writable temporarily to update the timestamp.
6fd6c678	2017-03-22T20:29:22	Merge pull request #4030 from libgit2/ethomson/fsync fsync all the things
52d03f37	2017-03-03T13:26:29	git_commit_create: freshen tree objects in commit Freshen the tree object that a commit points to during commit time.
1c04a96b	2017-02-28T12:29:29	Honor `core.fsyncObjectFiles`
2a5ad7d0	2017-02-17T16:42:40	fsync: call it "synchronous" object writing Rename `GIT_OPT_ENABLE_SYNCHRONIZED_OBJECT_CREATION` -> `GIT_OPT_ENABLE_SYNCHRONOUS_OBJECT_CREATION`.
e6ed0d2f	2016-12-13T11:31:38	odb_loose: fsync tests Introduce a simple counter that `p_fsync` implements. This is useful for ensuring that `p_fsync` is called when we expect it to be, for example when we have enabled an odb backend to perform `fsync`s when writing objects.
8f0d5cde	2016-12-29T12:55:49	tests: update error message checking
565fb8dc	2016-06-25T20:02:45	revwalk: introduce tests that hide old commits Introduce some tests that show some commits, while hiding some commits that have a timestamp older than the common ancestors of these two commits.
becadafc	2016-08-05T19:30:56	odb: only provide the empty tree Only provide the empty tree internally, which matches git's behavior. If we provide the empty blob then any users trying to write it with libgit2 would omit it from actually landing in the odb, which appear to git proper as a broken repository (missing that object).
27051d4e	2016-07-22T13:34:19	odb: only freshen pack files every 2 seconds Since writing multiple objects may all already exist in a single packfile, avoid freshening that packfile repeatedly in a tight loop. Instead, only freshen pack files every 2 seconds.
8f09a98e	2016-07-14T16:23:24	odb: freshen existing objects when writing When writing an object, we calculate its OID and see if it exists in the object database. If it does, we need to freshen the file that contains it.
9a786650	2016-03-09T11:00:27	odb: Handle corner cases in `git_odb_expand_ids` The old implementation had two issues: 1. OIDs that were too short as to be ambiguous were not being handled properly. 2. If the last OID to expand in the array was missing from the ODB, we would leak a `GIT_ENOTFOUND` error code from the function.
62484f52	2016-03-08T14:09:55	git_odb_expand_ids: accept git_odb_expand_id array Take (and write to) an array of a struct, `git_odb_expand_id`.
4b1f0f79	2016-03-08T11:44:21	git_odb_expand_ids: rename func, return the type
6c04269c	2016-03-04T00:50:35	git_odb_exists_many_prefixes: query odb for multiple short ids Query the object database for multiple objects at a time, given their object ID (which may be abbreviated) and optional type.
a0a1b19a	2015-10-14T19:31:54	odb: Prioritize alternate backends For most real use cases, repositories with alternates use them as main object storage. Checking the alternate for objects before the main repository should result in measurable speedups. Because of this, we're changing the sorting algorithm to prioritize alternates in cases where two backends have the same priority. This means that the pack backend for the alternate will be checked before the pack backend for the main repository but both of them will be checked before any loose backends.
d3b29fb9	2015-10-01T00:50:37	refdb and odb backends must provide `free` function As refdb and odb backends can be allocated by client code, libgit2 can’t know whether an alternative memory allocator was used, and thus should not try to call `git__free` on those objects. Instead, odb and refdb backend implementations must always provide their own `free` functions to ensure memory gets freed correctly.
ac2fba0e	2015-09-16T15:07:27	git_futils_mkdir_*: make a relative-to-base mkdir Untangle git_futils_mkdir from git_futils_mkdir_ext - the latter assumes that we own everything beneath the base, as if it were being called with a base of the repository or working directory, and is tailored towards checkout and ensuring that there is no bogosity beneath the base that must be cleaned up. This is (at best) slow and (at worst) unsafe in the larger context of a filesystem where we do not own things and cannot do things like unlink symlinks that are in our way.
8da44047	2015-06-06T03:55:28	path: error out if the callback returns an error When the callback returns an error, we should stop immediately. This broke when trying to make sure we pass specific errors up the chain. This broke cancelling out of the loose backend's foreach.
e0156651	2014-11-21T13:50:46	odb: `git_odb_object` contents are never NULL This is a contract that we made in the library and that we need to uphold. The contents of a blob can never be NULL because several parts of the library (including the filter and attributes code) expect `git_blob_rawcontent` to always return a valid pointer.
e1ac0101	2014-11-08T14:40:53	odb: hardcode the empty blob and tree git hardocodes these as objects which exist regardless of whether they are in the odb and uses them in the shell interface as a way of expressing the lack of a blob or tree for one side of e.g. a diff. In the library we use each language's natural way of declaring a lack of value which makes a workaround like this unnecessary. Since git uses it, it does however mean each shell application would need to perform this check themselves. This makes it common work across a range of applications and an issue with compatibility with git, which fits right into what the library aims to provide. Thus we introduce the hard-coded empty blob and tree in the odb frontend. These hard-coded objects are checked for before going to the backends, but after the cache check, which means the second time they're used, they will be treated as normal cached objects instead of creating new ones.
7629ea5d	2014-06-11T16:00:04	Fixed odb foreach test failure for big-endian 64-bit
0cee70eb	2014-07-01T14:09:01	Introduce cl_assert_equal_oid
430866d2	2014-05-20T08:29:51	Fix a leak in the tests
ee311907	2014-05-05T16:04:14	odb: ignore files in the objects dir We assume that everything under GIT_DIR/objects/ is a directory. This is not necessarily the case if some process left a stray file in there. Check beforehand if we do have a directory and ignore the entry otherwise.
89499078	2014-03-10T10:53:39	Fix a number of git_odb_exists_prefix bugs The git_odb_exists_prefix API was not dealing correctly when a later backend returned GIT_ENOTFOUND even if an earlier backend had found the object. Additionally, the unit tests were not properly exercising the API and had a couple mistakes in checking the results. Lastly, since the backends are not expected to behavior correctly unless all bytes of the short id are zero except for the prefix, this makes the ODB prefix APIs explicitly clear out the extra bytes so the user doesn't have to be as careful.
ae32c54e	2014-03-05T20:28:49	Plug a few leaks in the tests
a064dc2d	2014-03-06T00:47:05	Merge pull request #2159 from libgit2/rb/odb-exists-prefix Add ODB API to check for existence by prefix and object id shortener
7bd2f401	2014-03-05T11:35:47	ODB writing fails gracefully when unsupported If no ODB backends support writing, we should fail gracefully.
f5753999	2014-03-04T15:34:23	Add exists_prefix to ODB backend and ODB API
25e0b157	2013-12-06T15:07:57	Remove converting user error to GIT_EUSER This changes the behavior of callbacks so that the callback error code is not converted into GIT_EUSER and instead we propagate the return value through to the caller. Instead of using the giterr_capture and giterr_restore functions, we now rely on all functions to pass back the return value from a callback. To avoid having a return value with no error message, the user can call the public giterr_set_str or some such function to set an error message. There is a new helper 'giterr_set_callback' that functions can invoke after making a callback which ensures that some error message was set in case the callback did not set one. In places where the sign of the callback return value is meaningful (e.g. positive to skip, negative to abort), only the negative values are returned back to the caller, obviously, since the other values allow for continuing the loop. The hardest parts of this were in the checkout code where positive return values were overloaded as meaningful values for checkout. I fixed this by adding an output parameter to many of the internal checkout functions and removing the overload. This added some code, but it is probably a better implementation. There is some funkiness in the network code where user provided callbacks could be returning a positive or a negative value and we want to rely on that to cancel the loop. There are still a couple places where an user error might get turned into GIT_EUSER there, I think, though none exercised by the tests.
83e1efbf	2013-11-14T14:10:32	Update files that reference tests-clar
17820381	2013-11-14T14:05:52	Rename tests-clar to tests

ceddeed8

2021-11-11T15:20:50

Merge pull request #6104 from libgit2/ethomson/path path: refactor utility path functions

95117d47

2021-10-31T09:45:46

path: separate git-specific path functions from util Introduce `git_fs_path`, which operates on generic filesystem paths. `git_path` will be kept for only git-specific path functionality (for example, checking for `.git` in a path).

94cb060c

2021-11-08T14:54:09

Add tests for ODB refresh Add optional refreshing in the fake backend, and count the number of refresh calls if enabled.

f0e693b1

2021-09-07T17:53:49

str: introduce `git_str` for internal, `git_buf` is external libgit2 has two distinct requirements that were previously solved by `git_buf`. We require: 1. A general purpose string class that provides a number of utility APIs for manipulating data (eg, concatenating, truncating, etc). 2. A structure that we can use to return strings to callers that they can take ownership of. By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult. Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr"). The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.) Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.

31ecaca2

2021-09-30T08:11:40

hash: hash functions operate on byte arrays not git_oids Separate the concerns of the hash functions from the git_oid functions. The git_oid structure will need to understand either SHA1 or SHA256; the hash functions should only deal with the appropriate one of these.

2a713da1

2021-09-29T21:31:17

hash: accept the algorithm in inputs

d095502e

2021-08-08T13:34:06

tests: don't generate false positives on empty path segments

e5975f36

2021-07-30T11:37:12

tests: reset odb backend priority

cd460522

2020-04-20T22:16:52

odb: Implement option for overriding of default odb backend priority Introduce GIT_OPT_SET_ODB_LOOSE_PRIORITY and GIT_OPT_SET_ODB_PACKED_PRIORITY to allow overriding the default priority values for the default ODB backends. Libgit2 has historically assumed that most objects for long- running operations will be packed, therefore GIT_LOOSE_PRIORITY is set to 1 by default, and GIT_PACKED_PRIORITY to 2. When a client allows libgit2 to set the default backends, they can specify an override for the two priority values in order to change the order in which each ODB backend is accessed.

08f39208

2019-06-08T17:46:04

blob: add underscore to `from` functions The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the blob functions for consistency with the rest of the library.

6b3730d4

2019-02-16T19:55:30

Fix a memory leak in odb_otype_fast() This change frees a copy of a cached object in odb_otype_fast().

f673e232

2018-12-27T13:47:34

git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.

f7416509

2019-01-20T20:15:31

Fix odb foreach to also close on positive error code In include/git2/odb.h it states that callback can also return positive value which should break looping. Implementations of git_odb_foreach() and pack_backend__foreach() did not respect that.

168fe39b

2018-11-28T14:26:57

object_type: use new enumeration names Use the new object_type enumeration names within the codebase.

ecf4f33a

2018-02-08T11:14:48

Convert usage of `git_buf_free` to new `git_buf_dispose`

a52b4c51

2018-03-23T09:59:46

odb: fix writing to fake write streams In commit 7ec7aa4a7 (odb: assert on logic errors when writing objects, 2018-02-01), the check for whether we are trying to overflowing the fake stream buffer was changed from returning an error to raising an assert. The conversion forgot though that the logic around `assert`s are basically inverted. Previously, if the statement stream->written + len > steram->size evaluated to true, we would return a `-1`. Now we are asserting that this statement is true, and in case it is not we will raise an error. So the conversion to the `assert` in fact changed the behaviour to the complete opposite intention. Fix the assert by inverting its condition again and add a regression test.

904307af

2018-03-23T09:58:57

tests: add tests for the mempack ODB backend Our mempack ODB backend has no test coverage at all right now. Add a simple test suite to at least have some coverage of the most basic operations on the ODB.

619f61a8

2018-02-01T06:22:36

odb: error when we can't create object header Return an error to the caller when we can't create an object header for some reason (printf failure) instead of simply asserting.

909a1992

2017-12-31T09:56:30

odb_loose: largefile tests only on 64 bit platforms Only run the large file tests on 64 bit platforms. Even though we support streaming reads on objects, and do not need to fit them in memory, we use `size_t` in various places to reflect the size of an object.

27078e58

2017-12-18T23:11:42

odb_loose: test read_header on large blobs Test that we can read_header on large blobs. This should succeed on all platforms since we read only a few bytes into memory to be able to parse the header.

e118231b

2017-12-18T23:11:24

odb_loose: test read_header explicitly

b1e66bfc

2017-12-17T16:31:35

odb: test loose object streaming

dbe3d3e9

2017-12-17T02:12:19

odb_loose: test reading a large file in stream Since some test situations may have generous disk space, but limited RAM (eg hosted build agents), test that we can stream a large file into a loose object, and then stream it out of the loose object storage.

275f103d

2018-01-12T08:59:40

odb: reject reading and writing null OIDs The null OID (hash with all zeroes) indicates a missing object in upstream git and is thus not a valid object ID. Add defensive measurements to avoid writing such a hash to the object database in the very unlikely case where some data results in the null OID. Furthermore, add shortcuts when reading the null OID from the ODB to avoid ever returning an object when a faulty repository may contain the null OID.

456e5218

2017-12-20T16:13:31

tests: add GITTEST_SLOW env var check Writing very large files may be slow, particularly on inefficient filesystems and when running instrumented code to detect invalid memory accesses (eg within valgrind or similar tools). Introduce `GITTEST_SLOW` so that tests that are slow can be skipped by the CI system.

3e6533ba

2017-12-10T17:25:00

odb_loose: reject objects that cannot fit in memory Check the size of objects being read from the loose odb backend and reject those that would not fit in memory with an error message that reflects the actual problem, instead of error'ing later with an unintuitive error message regarding truncation or invalid hashes.

dacc3291

2017-11-30T15:49:05

odb: test loose reading/writing large objects Introduce a test for very large objects in the ODB. Write a large object (5 GB) and ensure that the write succeeds and provides us the expected object ID. Introduce a test that writes that file and ensures that we can subsequently read it.

a180e7d9

2017-06-13T11:10:19

tests: odb: add more low-level backend tests Introduce a new test suite "odb::backend::simple", which utilizes the fake backend to exercise the ODB abstraction layer. While such tests already exist for the case where multiple backends are put together, no direct testing for functionality with a single backend exist yet.

b2e53f36

2017-06-13T11:39:36

tests: odb: implement `exists_prefix` for the fake backend The fake backend currently implements all reading functions except for the `exists_prefix` one. Implement it to enable further testing of the ODB layer.

983e627d

2017-06-13T11:38:59

tests: odb: use correct OID length The `search_object` function takes the OID length as one of its parameters, where its maximum length is `GIT_OID_HEXSZ`. The `exists` function of the fake backend used `GIT_OID_RAWSZ` though, leading to only the first half of the OID being used when finding the correct object.

c4cbb3b1

2017-06-13T11:38:14

tests: odb: have the fake backend detect ambiguous prefixes In order to be able to test the ODB prefix functions, we need to be able to detect ambiguous prefixes in case multiple objects with the same prefix exist in the fake ODB. Extend `search_object` to detect ambiguous queries and have callers return its error code instead of always returning `GIT_ENOTFOUND`.

f148258a

2017-06-12T16:19:45

tests: odb: add tests with multiple backends Previous to pulling out and extending the fake backend, it was quite cumbersome to write tests for very specific scenarios regarding backends. But as we have made it more generic, it has become much easier to do so. As such, this commit adds multiple tests for scenarios with multiple backends for the ODB. The changes also include a test for a very targeted scenario. When one backend found a matching object via `read_prefix`, but the last backend returns `GIT_ENOTFOUND` and when object hash verification is turned off, we fail to reset the error code to `GIT_OK`. This causes us to segfault later on, when doing a double-free on the returned object.

6e010bb1

2017-06-12T15:43:56

tests: odb: allow passing fake objects to the fake backend Right now, the fake backend is quite restrained in the way how it works: we pass it an OID which it is to return later as well as an error code we want it to return. While this is sufficient for existing tests, we can make the fake backend a little bit more generic in order to allow us testing for additional scenarios. To do so, we change the backend to not accept an error code and OID which it is to return for queries, but instead a simple array of OIDs with their respective blob contents. On each query, the fake backend simply iterates through this array and returns the first matching object.

369cb45f

2017-06-12T15:21:58

tests: do not reuse OID from backend In order to make the fake backend more useful, we want to enable it holding multiple object references. To do so, we need to decouple it from the single fake OID it currently holds, which we simply move up into the calling tests.

2add34d0

2017-06-12T14:53:46

tests: odb: move fake backend into its own file The fake backend used by the test suite `odb::backend::nonrefreshing` is useful to have some low-level tests for the ODB layer. As such, we move the implementation into its own `backend_helpers` module.

6c23704d

2017-06-08T21:40:18

settings: rename `GIT_OPT_ENABLE_SYNCHRONOUS_OBJECT_CREATION` Initially, the setting has been solely used to enable the use of `fsync()` when creating objects. Since then, the use has been extended to also cover references and index files. As the option is not yet part of any release, we can still correct this by renaming the option to something more sensible, indicating not only correlation to objects. This commit renames the option to `GIT_OPT_ENABLE_FSYNC_GITDIR`. We also move the variable from the object to repository source code.

28a0741f

2017-04-10T09:30:08

odb: verify object hashes The upstream git.git project verifies objects when looking them up from disk. This avoids scenarios where objects have somehow become corrupt on disk, e.g. due to hardware failures or bit flips. While our mantra is usually to follow upstream behavior, we do not do so in this case, as we never check hashes of objects we have just read from disk. To fix this, we create a new error class `GIT_EMISMATCH` which denotes that we have looked up an object with a hashsum mismatch. `odb_read_1` will then, after having read the object from its backend, hash the object and compare the resulting hash to the expected hash. If hashes do not match, it will return an error. This obviously introduces another computation of checksums and could potentially impact performance. Note though that we usually perform I/O operations directly before doing this computation, and as such the actual overhead should be drowned out by I/O. Running our test suite seems to confirm this guess. On a Linux system with best-of-five timings, we had 21.592s with the check enabled and 21.590s with the ckeck disabled. Note though that our test suite mostly contains very small blobs only. It is expected that repositories with bigger blobs may notice an increased hit by this check. In addition to a new test, we also had to change the odb::backend::nonrefreshing test suite, which now triggers a hashsum mismatch when looking up the commit "deadbeef...". This is expected, as the fake backend allocated inside of the test will return an empty object for the OID "deadbeef...", which will obviously not hash back to "deadbeef..." again. We can simply adjust the hash to equal the hash of the empty object here to fix this test.

e29e8029

2017-04-10T10:31:22

tests: odb: make hash of fake backend configurable In the odb::backend::nonrefreshing test suite, we set up a fake backend so that we are able to determine if backend functions are called correctly. During the setup, we also parse an OID which is later on used to read out the pseudo-object. While this procedure works right now, it will create problems later when we implement hash verification for looked up objects. The current OID ("deadbeef") will not match the hash of contents we give back to the ODB layer and thus cannot be verified. Make the hash configurable so that we can simply switch the returned for single tests.

89d403cc

2017-04-05T09:50:12

win32: enable `p_utimes` for readonly files Instead of failing to set the timestamp of a read-only file (like any object file), set it writable temporarily to update the timestamp.

6fd6c678

2017-03-22T20:29:22

Merge pull request #4030 from libgit2/ethomson/fsync fsync all the things

52d03f37

2017-03-03T13:26:29

git_commit_create: freshen tree objects in commit Freshen the tree object that a commit points to during commit time.

1c04a96b

2017-02-28T12:29:29

Honor `core.fsyncObjectFiles`

2a5ad7d0

2017-02-17T16:42:40

fsync: call it "synchronous" object writing Rename `GIT_OPT_ENABLE_SYNCHRONIZED_OBJECT_CREATION` -> `GIT_OPT_ENABLE_SYNCHRONOUS_OBJECT_CREATION`.

e6ed0d2f

2016-12-13T11:31:38

odb_loose: fsync tests Introduce a simple counter that `p_fsync` implements. This is useful for ensuring that `p_fsync` is called when we expect it to be, for example when we have enabled an odb backend to perform `fsync`s when writing objects.

8f0d5cde

2016-12-29T12:55:49

tests: update error message checking

565fb8dc

2016-06-25T20:02:45

revwalk: introduce tests that hide old commits Introduce some tests that show some commits, while hiding some commits that have a timestamp older than the common ancestors of these two commits.

becadafc

2016-08-05T19:30:56

odb: only provide the empty tree Only provide the empty tree internally, which matches git's behavior. If we provide the empty blob then any users trying to write it with libgit2 would omit it from actually landing in the odb, which appear to git proper as a broken repository (missing that object).

27051d4e

2016-07-22T13:34:19

odb: only freshen pack files every 2 seconds Since writing multiple objects may all already exist in a single packfile, avoid freshening that packfile repeatedly in a tight loop. Instead, only freshen pack files every 2 seconds.

8f09a98e

2016-07-14T16:23:24

odb: freshen existing objects when writing When writing an object, we calculate its OID and see if it exists in the object database. If it does, we need to freshen the file that contains it.

9a786650

2016-03-09T11:00:27

odb: Handle corner cases in `git_odb_expand_ids` The old implementation had two issues: 1. OIDs that were too short as to be ambiguous were not being handled properly. 2. If the last OID to expand in the array was missing from the ODB, we would leak a `GIT_ENOTFOUND` error code from the function.

62484f52

2016-03-08T14:09:55

git_odb_expand_ids: accept git_odb_expand_id array Take (and write to) an array of a struct, `git_odb_expand_id`.

4b1f0f79

2016-03-08T11:44:21

git_odb_expand_ids: rename func, return the type

6c04269c

2016-03-04T00:50:35

git_odb_exists_many_prefixes: query odb for multiple short ids Query the object database for multiple objects at a time, given their object ID (which may be abbreviated) and optional type.

a0a1b19a

2015-10-14T19:31:54

odb: Prioritize alternate backends For most real use cases, repositories with alternates use them as main object storage. Checking the alternate for objects before the main repository should result in measurable speedups. Because of this, we're changing the sorting algorithm to prioritize alternates *in cases where two backends have the same priority*. This means that the pack backend for the alternate will be checked before the pack backend for the main repository *but* both of them will be checked before any loose backends.

d3b29fb9

2015-10-01T00:50:37

refdb and odb backends must provide `free` function As refdb and odb backends can be allocated by client code, libgit2 can’t know whether an alternative memory allocator was used, and thus should not try to call `git__free` on those objects. Instead, odb and refdb backend implementations must always provide their own `free` functions to ensure memory gets freed correctly.

ac2fba0e

2015-09-16T15:07:27

git_futils_mkdir_*: make a relative-to-base mkdir Untangle git_futils_mkdir from git_futils_mkdir_ext - the latter assumes that we own everything beneath the base, as if it were being called with a base of the repository or working directory, and is tailored towards checkout and ensuring that there is no bogosity beneath the base that must be cleaned up. This is (at best) slow and (at worst) unsafe in the larger context of a filesystem where we do not own things and cannot do things like unlink symlinks that are in our way.

8da44047

2015-06-06T03:55:28

path: error out if the callback returns an error When the callback returns an error, we should stop immediately. This broke when trying to make sure we pass specific errors up the chain. This broke cancelling out of the loose backend's foreach.

e0156651

2014-11-21T13:50:46

odb: `git_odb_object` contents are never NULL This is a contract that we made in the library and that we need to uphold. The contents of a blob can never be NULL because several parts of the library (including the filter and attributes code) expect `git_blob_rawcontent` to always return a valid pointer.

e1ac0101

2014-11-08T14:40:53

odb: hardcode the empty blob and tree git hardocodes these as objects which exist regardless of whether they are in the odb and uses them in the shell interface as a way of expressing the lack of a blob or tree for one side of e.g. a diff. In the library we use each language's natural way of declaring a lack of value which makes a workaround like this unnecessary. Since git uses it, it does however mean each shell application would need to perform this check themselves. This makes it common work across a range of applications and an issue with compatibility with git, which fits right into what the library aims to provide. Thus we introduce the hard-coded empty blob and tree in the odb frontend. These hard-coded objects are checked for before going to the backends, but after the cache check, which means the second time they're used, they will be treated as normal cached objects instead of creating new ones.

7629ea5d

2014-06-11T16:00:04

Fixed odb foreach test failure for big-endian 64-bit

0cee70eb

2014-07-01T14:09:01

Introduce cl_assert_equal_oid

430866d2

2014-05-20T08:29:51

Fix a leak in the tests

ee311907

2014-05-05T16:04:14

odb: ignore files in the objects dir We assume that everything under GIT_DIR/objects/ is a directory. This is not necessarily the case if some process left a stray file in there. Check beforehand if we do have a directory and ignore the entry otherwise.

89499078

2014-03-10T10:53:39

Fix a number of git_odb_exists_prefix bugs The git_odb_exists_prefix API was not dealing correctly when a later backend returned GIT_ENOTFOUND even if an earlier backend had found the object. Additionally, the unit tests were not properly exercising the API and had a couple mistakes in checking the results. Lastly, since the backends are not expected to behavior correctly unless all bytes of the short id are zero except for the prefix, this makes the ODB prefix APIs explicitly clear out the extra bytes so the user doesn't have to be as careful.

ae32c54e

2014-03-05T20:28:49

Plug a few leaks in the tests

a064dc2d

2014-03-06T00:47:05

Merge pull request #2159 from libgit2/rb/odb-exists-prefix Add ODB API to check for existence by prefix and object id shortener

7bd2f401

2014-03-05T11:35:47

ODB writing fails gracefully when unsupported If no ODB backends support writing, we should fail gracefully.

f5753999

2014-03-04T15:34:23

Add exists_prefix to ODB backend and ODB API

25e0b157

2013-12-06T15:07:57

Remove converting user error to GIT_EUSER This changes the behavior of callbacks so that the callback error code is not converted into GIT_EUSER and instead we propagate the return value through to the caller. Instead of using the giterr_capture and giterr_restore functions, we now rely on all functions to pass back the return value from a callback. To avoid having a return value with no error message, the user can call the public giterr_set_str or some such function to set an error message. There is a new helper 'giterr_set_callback' that functions can invoke after making a callback which ensures that some error message was set in case the callback did not set one. In places where the sign of the callback return value is meaningful (e.g. positive to skip, negative to abort), only the negative values are returned back to the caller, obviously, since the other values allow for continuing the loop. The hardest parts of this were in the checkout code where positive return values were overloaded as meaningful values for checkout. I fixed this by adding an output parameter to many of the internal checkout functions and removing the overload. This added some code, but it is probably a better implementation. There is some funkiness in the network code where user provided callbacks could be returning a positive or a negative value and we want to rely on that to cancel the loop. There are still a couple places where an user error might get turned into GIT_EUSER there, I think, though none exercised by the tests.

83e1efbf

2013-11-14T14:10:32

Update files that reference tests-clar

17820381

2013-11-14T14:05:52

Rename tests-clar to tests

thodg/libgit2/tests/odb

tests/odb

Log