kmx git

Commit	Date	Message

28a0741f

2017-04-10T09:30:08

odb: verify object hashes The upstream git.git project verifies objects when looking them up from disk. This avoids scenarios where objects have somehow become corrupt on disk, e.g. due to hardware failures or bit flips. While our mantra is usually to follow upstream behavior, we do not do so in this case, as we never check hashes of objects we have just read from disk. To fix this, we create a new error class `GIT_EMISMATCH` which denotes that we have looked up an object with a hashsum mismatch. `odb_read_1` will then, after having read the object from its backend, hash the object and compare the resulting hash to the expected hash. If hashes do not match, it will return an error. This obviously introduces another computation of checksums and could potentially impact performance. Note though that we usually perform I/O operations directly before doing this computation, and as such the actual overhead should be drowned out by I/O. Running our test suite seems to confirm this guess. On a Linux system with best-of-five timings, we had 21.592s with the check enabled and 21.590s with the check disabled. Note though that our test suite mostly contains very small blobs only. It is expected that repositories with bigger blobs may notice an increased hit by this check. In addition to a new test, we also had to change the odb::backend::nonrefreshing test suite, which now triggers a hashsum mismatch when looking up the commit "deadbeef...". This is expected, as the fake backend allocated inside of the test will return an empty object for the OID "deadbeef...", which will obviously not hash back to "deadbeef..." again. We can simply adjust the hash to equal the hash of the empty object here to fix this test.
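
The read-then-verify flow described in this message can be sketched roughly as follows. This is a minimal illustration, not libgit2's code: every name here (`toy_hash`, `fake_backend_read`, `read_verified`, `SKETCH_EMISMATCH`) is hypothetical, the object id is reduced to a 32-bit value, and FNV-1a stands in for the real object hash only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error codes for this sketch; not libgit2's real values. */
enum { SKETCH_OK = 0, SKETCH_EMISMATCH = -1 };

/* FNV-1a, a stand-in for the real object hash (assumption of this sketch). */
uint32_t toy_hash(const unsigned char *data, size_t len)
{
    uint32_t h = 2166136261u;
    size_t i;

    for (i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* A fake backend in the spirit of the test: an empty object for any id. */
int fake_backend_read(const unsigned char **out, size_t *len, uint32_t id)
{
    (void)id;
    *out = (const unsigned char *)"";
    *len = 0;
    return SKETCH_OK;
}

/* Read the object, then re-hash it and compare against the requested id. */
int read_verified(const unsigned char **out, size_t *len, uint32_t id)
{
    int error;

    assert(out && len);
    error = fake_backend_read(out, len, id);
    if (error < 0)
        return error;
    if (toy_hash(*out, *len) != id)
        return SKETCH_EMISMATCH;
    return SKETCH_OK;
}
```

Asking this sketch for the id `0xdeadbeef` yields the mismatch error, because the empty object does not hash to that value. Asking for `toy_hash("", 0)` succeeds, which mirrors the test adjustment the message describes.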

e29e8029

2017-04-10T10:31:22

tests: odb: make hash of fake backend configurable In the odb::backend::nonrefreshing test suite, we set up a fake backend so that we are able to determine if backend functions are called correctly. During the setup, we also parse an OID which is later on used to read out the pseudo-object. While this procedure works right now, it will create problems later when we implement hash verification for looked up objects. The current OID ("deadbeef") will not match the hash of contents we give back to the ODB layer and thus cannot be verified. Make the hash configurable so that we can simply switch the returned hash for single tests.

89d403cc

2017-04-05T09:50:12

win32: enable `p_utimes` for readonly files Instead of failing to set the timestamp of a read-only file (like any object file), set it writable temporarily to update the timestamp.

6fd6c678

2017-03-22T20:29:22

Merge pull request #4030 from libgit2/ethomson/fsync fsync all the things

52d03f37

2017-03-03T13:26:29

git_commit_create: freshen tree objects in commit Freshen the tree object that a commit points to during commit time.

1c04a96b

2017-02-28T12:29:29

Honor `core.fsyncObjectFiles`

2a5ad7d0

2017-02-17T16:42:40

fsync: call it "synchronous" object writing Rename `GIT_OPT_ENABLE_SYNCHRONIZED_OBJECT_CREATION` -> `GIT_OPT_ENABLE_SYNCHRONOUS_OBJECT_CREATION`.

e6ed0d2f

2016-12-13T11:31:38

odb_loose: fsync tests Introduce a simple counter that `p_fsync` implements. This is useful for ensuring that `p_fsync` is called when we expect it to be, for example when we have enabled an odb backend to perform `fsync`s when writing objects.

8f0d5cde

2016-12-29T12:55:49

tests: update error message checking

565fb8dc

2016-06-25T20:02:45

revwalk: introduce tests that hide old commits Introduce some tests that show some commits, while hiding some commits that have a timestamp older than the common ancestors of these two commits.

becadafc

2016-08-05T19:30:56

odb: only provide the empty tree Only provide the empty tree internally, which matches git's behavior. If we provide the empty blob then any users trying to write it with libgit2 would omit it from actually landing in the odb, which appears to git proper as a broken repository (missing that object).

27051d4e

2016-07-22T13:34:19

odb: only freshen pack files every 2 seconds Since writing multiple objects may all already exist in a single packfile, avoid freshening that packfile repeatedly in a tight loop. Instead, only freshen pack files every 2 seconds.
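
A throttle like the one described here might look roughly like this. The struct, the function name, and the exact boundary handling (a call exactly 2 seconds later is allowed to freshen) are assumptions of this sketch, not taken from the libgit2 source.

```c
#include <assert.h>
#include <time.h>

#define FRESHEN_INTERVAL_SECS 2 /* the interval named in the commit message */

/* Hypothetical bookkeeping for one packfile. */
struct pack_state {
    time_t last_freshen; /* 0 means "never freshened" */
    int freshen_count;   /* stands in for the real timestamp update */
};

/*
 * Freshen the pack unless it was already freshened less than
 * FRESHEN_INTERVAL_SECS ago. Returns 1 if work was done, 0 if skipped.
 */
int pack_freshen(struct pack_state *p, time_t now)
{
    assert(p);
    if (p->last_freshen != 0 &&
        now - p->last_freshen < FRESHEN_INTERVAL_SECS)
        return 0;

    p->freshen_count++;
    p->last_freshen = now;
    return 1;
}
```

Passing the current time in as a parameter keeps the sketch easy to exercise: a tight loop of writes at the same second touches the pack once, and the next touch happens only after the interval has elapsed.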

8f09a98e

2016-07-14T16:23:24

odb: freshen existing objects when writing When writing an object, we calculate its OID and see if it exists in the object database. If it does, we need to freshen the file that contains it.

9a786650

2016-03-09T11:00:27

odb: Handle corner cases in `git_odb_expand_ids` The old implementation had two issues: 1. OIDs that were so short as to be ambiguous were not being handled properly. 2. If the last OID to expand in the array was missing from the ODB, we would leak a `GIT_ENOTFOUND` error code from the function.

62484f52

2016-03-08T14:09:55

git_odb_expand_ids: accept git_odb_expand_id array Take (and write to) an array of a struct, `git_odb_expand_id`.

4b1f0f79

2016-03-08T11:44:21

git_odb_expand_ids: rename func, return the type

6c04269c

2016-03-04T00:50:35

git_odb_exists_many_prefixes: query odb for multiple short ids Query the object database for multiple objects at a time, given their object ID (which may be abbreviated) and optional type.

a0a1b19a

2015-10-14T19:31:54

odb: Prioritize alternate backends For most real use cases, repositories with alternates use them as main object storage. Checking the alternate for objects before the main repository should result in measurable speedups. Because of this, we're changing the sorting algorithm to prioritize alternates *in cases where two backends have the same priority*. This means that the pack backend for the alternate will be checked before the pack backend for the main repository *but* both of them will be checked before any loose backends.
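
The ordering rule in this message (higher priority first, alternates first on a tie) can be illustrated with a small comparator. The struct layout, the names, and the priority values used in the usage note are assumptions for this sketch rather than libgit2's actual definitions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical view of a registered backend. */
struct backend_entry {
    const char *name;
    int priority;     /* higher values are consulted first */
    int is_alternate; /* 1 if the backend belongs to an alternate, else 0 */
};

/* Sort by descending priority; on equal priority, alternates come first. */
int backend_cmp(const void *a, const void *b)
{
    const struct backend_entry *x = a;
    const struct backend_entry *y = b;

    if (x->priority != y->priority)
        return (y->priority > x->priority) - (y->priority < x->priority);
    return y->is_alternate - x->is_alternate;
}

void sort_backends(struct backend_entry *v, size_t n)
{
    assert(v != NULL || n == 0);
    qsort(v, n, sizeof(*v), backend_cmp);
}
```

With pack backends at priority 2 and loose backends at priority 1 (assumed values), sorting the four backends of a repository with one alternate gives: alternate pack, main pack, alternate loose, main loose. Both pack backends still precede every loose backend, as the message states.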

d3b29fb9

2015-10-01T00:50:37

refdb and odb backends must provide `free` function As refdb and odb backends can be allocated by client code, libgit2 can’t know whether an alternative memory allocator was used, and thus should not try to call `git__free` on those objects. Instead, odb and refdb backend implementations must always provide their own `free` functions to ensure memory gets freed correctly.

ac2fba0e

2015-09-16T15:07:27

git_futils_mkdir_*: make a relative-to-base mkdir Untangle git_futils_mkdir from git_futils_mkdir_ext - the latter assumes that we own everything beneath the base, as if it were being called with a base of the repository or working directory, and is tailored towards checkout and ensuring that there is no bogosity beneath the base that must be cleaned up. This is (at best) slow and (at worst) unsafe in the larger context of a filesystem where we do not own things and cannot do things like unlink symlinks that are in our way.

8da44047

2015-06-06T03:55:28

path: error out if the callback returns an error When the callback returns an error, we should stop immediately. This broke when trying to make sure we pass specific errors up the chain. This broke cancelling out of the loose backend's foreach.

e0156651

2014-11-21T13:50:46

odb: `git_odb_object` contents are never NULL This is a contract that we made in the library and that we need to uphold. The contents of a blob can never be NULL because several parts of the library (including the filter and attributes code) expect `git_blob_rawcontent` to always return a valid pointer.

e1ac0101

2014-11-08T14:40:53

odb: hardcode the empty blob and tree git hardcodes these as objects which exist regardless of whether they are in the odb and uses them in the shell interface as a way of expressing the lack of a blob or tree for one side of e.g. a diff. In the library we use each language's natural way of declaring a lack of value which makes a workaround like this unnecessary. Since git uses it, it does however mean each shell application would need to perform this check themselves. This makes it common work across a range of applications and an issue with compatibility with git, which fits right into what the library aims to provide. Thus we introduce the hard-coded empty blob and tree in the odb frontend. These hard-coded objects are checked for before going to the backends, but after the cache check, which means the second time they're used, they will be treated as normal cached objects instead of creating new ones.

7629ea5d

2014-06-11T16:00:04

Fixed odb foreach test failure for big-endian 64-bit

0cee70eb

2014-07-01T14:09:01

Introduce cl_assert_equal_oid

430866d2

2014-05-20T08:29:51

Fix a leak in the tests

ee311907

2014-05-05T16:04:14

odb: ignore files in the objects dir We assume that everything under GIT_DIR/objects/ is a directory. This is not necessarily the case if some process left a stray file in there. Check beforehand if we do have a directory and ignore the entry otherwise.

89499078

2014-03-10T10:53:39

Fix a number of git_odb_exists_prefix bugs The git_odb_exists_prefix API was not dealing correctly when a later backend returned GIT_ENOTFOUND even if an earlier backend had found the object. Additionally, the unit tests were not properly exercising the API and had a couple mistakes in checking the results. Lastly, since the backends are not expected to behave correctly unless all bytes of the short id are zero except for the prefix, this makes the ODB prefix APIs explicitly clear out the extra bytes so the user doesn't have to be as careful.
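
Clearing the bytes beyond a short id prefix could be done along these lines. The function name and the raw 20-byte representation are assumptions of this sketch, and the prefix length is counted in hex digits, so an odd length leaves half a byte to mask.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OID_RAWSZ 20              /* a SHA-1 object id is 20 raw bytes */
#define OID_HEXSZ (OID_RAWSZ * 2) /* and 40 hex digits */

/* Keep the first `hex_len` hex digits of a raw id and zero the rest. */
void oid_clear_after_prefix(unsigned char id[OID_RAWSZ], size_t hex_len)
{
    size_t full_bytes = hex_len / 2;

    assert(hex_len <= OID_HEXSZ);
    if (hex_len == OID_HEXSZ)
        return; /* a full id has nothing to clear */

    if (hex_len % 2) {
        id[full_bytes] &= 0xf0; /* odd length: keep only the high nibble */
        full_bytes++;
    }
    memset(id + full_bytes, 0, OID_RAWSZ - full_bytes);
}
```

Normalizing the id once in the frontend means a caller can pass a buffer with stale trailing bytes and each backend still sees a clean prefix.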

ae32c54e

2014-03-05T20:28:49

Plug a few leaks in the tests

a064dc2d

2014-03-06T00:47:05

Merge pull request #2159 from libgit2/rb/odb-exists-prefix Add ODB API to check for existence by prefix and object id shortener

7bd2f401

2014-03-05T11:35:47

ODB writing fails gracefully when unsupported If no ODB backends support writing, we should fail gracefully.

f5753999

2014-03-04T15:34:23

Add exists_prefix to ODB backend and ODB API

25e0b157

2013-12-06T15:07:57

Remove converting user error to GIT_EUSER This changes the behavior of callbacks so that the callback error code is not converted into GIT_EUSER and instead we propagate the return value through to the caller. Instead of using the giterr_capture and giterr_restore functions, we now rely on all functions to pass back the return value from a callback. To avoid having a return value with no error message, the user can call the public giterr_set_str or some such function to set an error message. There is a new helper 'giterr_set_callback' that functions can invoke after making a callback which ensures that some error message was set in case the callback did not set one. In places where the sign of the callback return value is meaningful (e.g. positive to skip, negative to abort), only the negative values are returned back to the caller, obviously, since the other values allow for continuing the loop. The hardest parts of this were in the checkout code where positive return values were overloaded as meaningful values for checkout. I fixed this by adding an output parameter to many of the internal checkout functions and removing the overload. This added some code, but it is probably a better implementation. There is some funkiness in the network code where user provided callbacks could be returning a positive or a negative value and we want to rely on that to cancel the loop. There are still a couple places where a user error might get turned into GIT_EUSER there, I think, though none exercised by the tests.
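
The propagation behavior described above, for the simple case where any non-zero callback result stops the loop, can be sketched as follows. `foreach_item`, `item_cb` and the example callback are hypothetical names, not part of the libgit2 API.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*item_cb)(int item, void *payload);

/*
 * Call `cb` for each item. Stop at the first non-zero return value and
 * hand it back unchanged instead of mapping it to a generic error code.
 */
int foreach_item(const int *items, size_t n, item_cb cb, void *payload)
{
    size_t i;

    assert(cb != NULL);
    for (i = 0; i < n; i++) {
        int error = cb(items[i], payload);
        if (error != 0)
            return error;
    }
    return 0;
}

/* Example callback: counts visits, aborts with its own code on a negative item. */
int count_until_negative(int item, void *payload)
{
    int *visited = payload;

    (*visited)++;
    return item < 0 ? -42 : 0;
}
```

A caller iterating over `{1, 2, -3, 4}` with this callback gets `-42` back after three visits, so it can tell its own abort code apart from errors raised by the library.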

83e1efbf

2013-11-14T14:10:32

Update files that reference tests-clar

17820381

2013-11-14T14:05:52

Rename tests-clar to tests

thodg/libgit2/tests/odb
