tests/odb


Log

Author Commit Date CI Message
Edward Thomson ceddeed8 2021-11-11T15:20:50 Merge pull request #6104 from libgit2/ethomson/path path: refactor utility path functions
Edward Thomson 95117d47 2021-10-31T09:45:46 path: separate git-specific path functions from util Introduce `git_fs_path`, which operates on generic filesystem paths. `git_path` will be kept for only git-specific path functionality (for example, checking for `.git` in a path).
Josh Triplett 94cb060c 2021-11-08T14:54:09 Add tests for ODB refresh Add optional refreshing in the fake backend, and count the number of refresh calls if enabled.
Edward Thomson f0e693b1 2021-09-07T17:53:49 str: introduce `git_str` for internal, `git_buf` is external libgit2 has two distinct requirements that were previously solved by `git_buf`. We require: 1. A general purpose string class that provides a number of utility APIs for manipulating data (eg, concatenating, truncating, etc). 2. A structure that we can use to return strings to callers that they can take ownership of. By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult. Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr"). The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.) Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.
Edward Thomson 31ecaca2 2021-09-30T08:11:40 hash: hash functions operate on byte arrays not git_oids Separate the concerns of the hash functions from the git_oid functions. The git_oid structure will need to understand either SHA1 or SHA256; the hash functions should only deal with the appropriate one of these.
Edward Thomson 2a713da1 2021-09-29T21:31:17 hash: accept the algorithm in inputs
Peter Pettersson d095502e 2021-08-08T13:34:06 tests: don't generate false positives on empty path segments
Edward Thomson e5975f36 2021-07-30T11:37:12 tests: reset odb backend priority
Tony De La Nuez cd460522 2020-04-20T22:16:52 odb: Implement option for overriding of default odb backend priority Introduce GIT_OPT_SET_ODB_LOOSE_PRIORITY and GIT_OPT_SET_ODB_PACKED_PRIORITY to allow overriding the default priority values for the default ODB backends. Libgit2 has historically assumed that most objects for long- running operations will be packed, therefore GIT_LOOSE_PRIORITY is set to 1 by default, and GIT_PACKED_PRIORITY to 2. When a client allows libgit2 to set the default backends, they can specify an override for the two priority values in order to change the order in which each ODB backend is accessed.
Edward Thomson 08f39208 2019-06-08T17:46:04 blob: add underscore to `from` functions The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the blob functions for consistency with the rest of the library.
lhchavez 6b3730d4 2019-02-16T19:55:30 Fix a memory leak in odb_otype_fast() This change frees a copy of a cached object in odb_otype_fast().
Edward Thomson f673e232 2018-12-27T13:47:34 git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.
Marijan Šuflaj f7416509 2019-01-20T20:15:31 Fix odb foreach to also close on positive error code In include/git2/odb.h it states that callback can also return positive value which should break looping. Implementations of git_odb_foreach() and pack_backend__foreach() did not respect that.
Edward Thomson 168fe39b 2018-11-28T14:26:57 object_type: use new enumeration names Use the new object_type enumeration names within the codebase.
Patrick Steinhardt ecf4f33a 2018-02-08T11:14:48 Convert usage of `git_buf_free` to new `git_buf_dispose`
Patrick Steinhardt a52b4c51 2018-03-23T09:59:46 odb: fix writing to fake write streams In commit 7ec7aa4a7 (odb: assert on logic errors when writing objects, 2018-02-01), the check for whether we are trying to overflowing the fake stream buffer was changed from returning an error to raising an assert. The conversion forgot though that the logic around `assert`s are basically inverted. Previously, if the statement stream->written + len > steram->size evaluated to true, we would return a `-1`. Now we are asserting that this statement is true, and in case it is not we will raise an error. So the conversion to the `assert` in fact changed the behaviour to the complete opposite intention. Fix the assert by inverting its condition again and add a regression test.
Patrick Steinhardt 904307af 2018-03-23T09:58:57 tests: add tests for the mempack ODB backend Our mempack ODB backend has no test coverage at all right now. Add a simple test suite to at least have some coverage of the most basic operations on the ODB.
Edward Thomson 619f61a8 2018-02-01T06:22:36 odb: error when we can't create object header Return an error to the caller when we can't create an object header for some reason (printf failure) instead of simply asserting.
Edward Thomson 909a1992 2017-12-31T09:56:30 odb_loose: largefile tests only on 64 bit platforms Only run the large file tests on 64 bit platforms. Even though we support streaming reads on objects, and do not need to fit them in memory, we use `size_t` in various places to reflect the size of an object.
Edward Thomson 27078e58 2017-12-18T23:11:42 odb_loose: test read_header on large blobs Test that we can read_header on large blobs. This should succeed on all platforms since we read only a few bytes into memory to be able to parse the header.
Edward Thomson e118231b 2017-12-18T23:11:24 odb_loose: test read_header explicitly
Edward Thomson b1e66bfc 2017-12-17T16:31:35 odb: test loose object streaming
Edward Thomson dbe3d3e9 2017-12-17T02:12:19 odb_loose: test reading a large file in stream Since some test situations may have generous disk space, but limited RAM (eg hosted build agents), test that we can stream a large file into a loose object, and then stream it out of the loose object storage.
Patrick Steinhardt 275f103d 2018-01-12T08:59:40 odb: reject reading and writing null OIDs The null OID (hash with all zeroes) indicates a missing object in upstream git and is thus not a valid object ID. Add defensive measurements to avoid writing such a hash to the object database in the very unlikely case where some data results in the null OID. Furthermore, add shortcuts when reading the null OID from the ODB to avoid ever returning an object when a faulty repository may contain the null OID.
Edward Thomson 456e5218 2017-12-20T16:13:31 tests: add GITTEST_SLOW env var check Writing very large files may be slow, particularly on inefficient filesystems and when running instrumented code to detect invalid memory accesses (eg within valgrind or similar tools). Introduce `GITTEST_SLOW` so that tests that are slow can be skipped by the CI system.
Edward Thomson 3e6533ba 2017-12-10T17:25:00 odb_loose: reject objects that cannot fit in memory Check the size of objects being read from the loose odb backend and reject those that would not fit in memory with an error message that reflects the actual problem, instead of error'ing later with an unintuitive error message regarding truncation or invalid hashes.
Edward Thomson dacc3291 2017-11-30T15:49:05 odb: test loose reading/writing large objects Introduce a test for very large objects in the ODB. Write a large object (5 GB) and ensure that the write succeeds and provides us the expected object ID. Introduce a test that writes that file and ensures that we can subsequently read it.
Patrick Steinhardt a180e7d9 2017-06-13T11:10:19 tests: odb: add more low-level backend tests Introduce a new test suite "odb::backend::simple", which utilizes the fake backend to exercise the ODB abstraction layer. While such tests already exist for the case where multiple backends are put together, no direct testing for functionality with a single backend exist yet.
Patrick Steinhardt b2e53f36 2017-06-13T11:39:36 tests: odb: implement `exists_prefix` for the fake backend The fake backend currently implements all reading functions except for the `exists_prefix` one. Implement it to enable further testing of the ODB layer.
Patrick Steinhardt 983e627d 2017-06-13T11:38:59 tests: odb: use correct OID length The `search_object` function takes the OID length as one of its parameters, where its maximum length is `GIT_OID_HEXSZ`. The `exists` function of the fake backend used `GIT_OID_RAWSZ` though, leading to only the first half of the OID being used when finding the correct object.
Patrick Steinhardt c4cbb3b1 2017-06-13T11:38:14 tests: odb: have the fake backend detect ambiguous prefixes In order to be able to test the ODB prefix functions, we need to be able to detect ambiguous prefixes in case multiple objects with the same prefix exist in the fake ODB. Extend `search_object` to detect ambiguous queries and have callers return its error code instead of always returning `GIT_ENOTFOUND`.
Patrick Steinhardt f148258a 2017-06-12T16:19:45 tests: odb: add tests with multiple backends Previous to pulling out and extending the fake backend, it was quite cumbersome to write tests for very specific scenarios regarding backends. But as we have made it more generic, it has become much easier to do so. As such, this commit adds multiple tests for scenarios with multiple backends for the ODB. The changes also include a test for a very targeted scenario. When one backend found a matching object via `read_prefix`, but the last backend returns `GIT_ENOTFOUND` and when object hash verification is turned off, we fail to reset the error code to `GIT_OK`. This causes us to segfault later on, when doing a double-free on the returned object.
Patrick Steinhardt 6e010bb1 2017-06-12T15:43:56 tests: odb: allow passing fake objects to the fake backend Right now, the fake backend is quite restrained in the way how it works: we pass it an OID which it is to return later as well as an error code we want it to return. While this is sufficient for existing tests, we can make the fake backend a little bit more generic in order to allow us testing for additional scenarios. To do so, we change the backend to not accept an error code and OID which it is to return for queries, but instead a simple array of OIDs with their respective blob contents. On each query, the fake backend simply iterates through this array and returns the first matching object.
Patrick Steinhardt 369cb45f 2017-06-12T15:21:58 tests: do not reuse OID from backend In order to make the fake backend more useful, we want to enable it holding multiple object references. To do so, we need to decouple it from the single fake OID it currently holds, which we simply move up into the calling tests.
Patrick Steinhardt 2add34d0 2017-06-12T14:53:46 tests: odb: move fake backend into its own file The fake backend used by the test suite `odb::backend::nonrefreshing` is useful to have some low-level tests for the ODB layer. As such, we move the implementation into its own `backend_helpers` module.
Patrick Steinhardt 6c23704d 2017-06-08T21:40:18 settings: rename `GIT_OPT_ENABLE_SYNCHRONOUS_OBJECT_CREATION` Initially, the setting has been solely used to enable the use of `fsync()` when creating objects. Since then, the use has been extended to also cover references and index files. As the option is not yet part of any release, we can still correct this by renaming the option to something more sensible, indicating not only correlation to objects. This commit renames the option to `GIT_OPT_ENABLE_FSYNC_GITDIR`. We also move the variable from the object to repository source code.
Patrick Steinhardt 28a0741f 2017-04-10T09:30:08 odb: verify object hashes The upstream git.git project verifies objects when looking them up from disk. This avoids scenarios where objects have somehow become corrupt on disk, e.g. due to hardware failures or bit flips. While our mantra is usually to follow upstream behavior, we do not do so in this case, as we never check hashes of objects we have just read from disk. To fix this, we create a new error class `GIT_EMISMATCH` which denotes that we have looked up an object with a hashsum mismatch. `odb_read_1` will then, after having read the object from its backend, hash the object and compare the resulting hash to the expected hash. If hashes do not match, it will return an error. This obviously introduces another computation of checksums and could potentially impact performance. Note though that we usually perform I/O operations directly before doing this computation, and as such the actual overhead should be drowned out by I/O. Running our test suite seems to confirm this guess. On a Linux system with best-of-five timings, we had 21.592s with the check enabled and 21.590s with the ckeck disabled. Note though that our test suite mostly contains very small blobs only. It is expected that repositories with bigger blobs may notice an increased hit by this check. In addition to a new test, we also had to change the odb::backend::nonrefreshing test suite, which now triggers a hashsum mismatch when looking up the commit "deadbeef...". This is expected, as the fake backend allocated inside of the test will return an empty object for the OID "deadbeef...", which will obviously not hash back to "deadbeef..." again. We can simply adjust the hash to equal the hash of the empty object here to fix this test.
Patrick Steinhardt e29e8029 2017-04-10T10:31:22 tests: odb: make hash of fake backend configurable In the odb::backend::nonrefreshing test suite, we set up a fake backend so that we are able to determine if backend functions are called correctly. During the setup, we also parse an OID which is later on used to read out the pseudo-object. While this procedure works right now, it will create problems later when we implement hash verification for looked up objects. The current OID ("deadbeef") will not match the hash of contents we give back to the ODB layer and thus cannot be verified. Make the hash configurable so that we can simply switch the returned for single tests.
Edward Thomson 89d403cc 2017-04-05T09:50:12 win32: enable `p_utimes` for readonly files Instead of failing to set the timestamp of a read-only file (like any object file), set it writable temporarily to update the timestamp.
Edward Thomson 6fd6c678 2017-03-22T20:29:22 Merge pull request #4030 from libgit2/ethomson/fsync fsync all the things
Edward Thomson 52d03f37 2017-03-03T13:26:29 git_commit_create: freshen tree objects in commit Freshen the tree object that a commit points to during commit time.
Edward Thomson 1c04a96b 2017-02-28T12:29:29 Honor `core.fsyncObjectFiles`
Edward Thomson 2a5ad7d0 2017-02-17T16:42:40 fsync: call it "synchronous" object writing Rename `GIT_OPT_ENABLE_SYNCHRONIZED_OBJECT_CREATION` -> `GIT_OPT_ENABLE_SYNCHRONOUS_OBJECT_CREATION`.
Edward Thomson e6ed0d2f 2016-12-13T11:31:38 odb_loose: fsync tests Introduce a simple counter that `p_fsync` implements. This is useful for ensuring that `p_fsync` is called when we expect it to be, for example when we have enabled an odb backend to perform `fsync`s when writing objects.
Edward Thomson 8f0d5cde 2016-12-29T12:55:49 tests: update error message checking
Edward Thomson 565fb8dc 2016-06-25T20:02:45 revwalk: introduce tests that hide old commits Introduce some tests that show some commits, while hiding some commits that have a timestamp older than the common ancestors of these two commits.
Edward Thomson becadafc 2016-08-05T19:30:56 odb: only provide the empty tree Only provide the empty tree internally, which matches git's behavior. If we provide the empty blob then any users trying to write it with libgit2 would omit it from actually landing in the odb, which appear to git proper as a broken repository (missing that object).
Edward Thomson 27051d4e 2016-07-22T13:34:19 odb: only freshen pack files every 2 seconds Since writing multiple objects may all already exist in a single packfile, avoid freshening that packfile repeatedly in a tight loop. Instead, only freshen pack files every 2 seconds.
Edward Thomson 8f09a98e 2016-07-14T16:23:24 odb: freshen existing objects when writing When writing an object, we calculate its OID and see if it exists in the object database. If it does, we need to freshen the file that contains it.
Vicent Marti 9a786650 2016-03-09T11:00:27 odb: Handle corner cases in `git_odb_expand_ids` The old implementation had two issues: 1. OIDs that were too short as to be ambiguous were not being handled properly. 2. If the last OID to expand in the array was missing from the ODB, we would leak a `GIT_ENOTFOUND` error code from the function.
Edward Thomson 62484f52 2016-03-08T14:09:55 git_odb_expand_ids: accept git_odb_expand_id array Take (and write to) an array of a struct, `git_odb_expand_id`.
Edward Thomson 4b1f0f79 2016-03-08T11:44:21 git_odb_expand_ids: rename func, return the type
Edward Thomson 6c04269c 2016-03-04T00:50:35 git_odb_exists_many_prefixes: query odb for multiple short ids Query the object database for multiple objects at a time, given their object ID (which may be abbreviated) and optional type.
Vicent Marti a0a1b19a 2015-10-14T19:31:54 odb: Prioritize alternate backends For most real use cases, repositories with alternates use them as main object storage. Checking the alternate for objects before the main repository should result in measurable speedups. Because of this, we're changing the sorting algorithm to prioritize alternates *in cases where two backends have the same priority*. This means that the pack backend for the alternate will be checked before the pack backend for the main repository *but* both of them will be checked before any loose backends.
Arthur Schreiber d3b29fb9 2015-10-01T00:50:37 refdb and odb backends must provide `free` function As refdb and odb backends can be allocated by client code, libgit2 can’t know whether an alternative memory allocator was used, and thus should not try to call `git__free` on those objects. Instead, odb and refdb backend implementations must always provide their own `free` functions to ensure memory gets freed correctly.
Edward Thomson ac2fba0e 2015-09-16T15:07:27 git_futils_mkdir_*: make a relative-to-base mkdir Untangle git_futils_mkdir from git_futils_mkdir_ext - the latter assumes that we own everything beneath the base, as if it were being called with a base of the repository or working directory, and is tailored towards checkout and ensuring that there is no bogosity beneath the base that must be cleaned up. This is (at best) slow and (at worst) unsafe in the larger context of a filesystem where we do not own things and cannot do things like unlink symlinks that are in our way.
Carlos Martín Nieto 8da44047 2015-06-06T03:55:28 path: error out if the callback returns an error When the callback returns an error, we should stop immediately. This broke when trying to make sure we pass specific errors up the chain. This broke cancelling out of the loose backend's foreach.
Vicent Marti e0156651 2014-11-21T13:50:46 odb: `git_odb_object` contents are never NULL This is a contract that we made in the library and that we need to uphold. The contents of a blob can never be NULL because several parts of the library (including the filter and attributes code) expect `git_blob_rawcontent` to always return a valid pointer.
Carlos Martín Nieto e1ac0101 2014-11-08T14:40:53 odb: hardcode the empty blob and tree git hardocodes these as objects which exist regardless of whether they are in the odb and uses them in the shell interface as a way of expressing the lack of a blob or tree for one side of e.g. a diff. In the library we use each language's natural way of declaring a lack of value which makes a workaround like this unnecessary. Since git uses it, it does however mean each shell application would need to perform this check themselves. This makes it common work across a range of applications and an issue with compatibility with git, which fits right into what the library aims to provide. Thus we introduce the hard-coded empty blob and tree in the odb frontend. These hard-coded objects are checked for before going to the backends, but after the cache check, which means the second time they're used, they will be treated as normal cached objects instead of creating new ones.
Jakub Čajka 7629ea5d 2014-06-11T16:00:04 Fixed odb foreach test failure for big-endian 64-bit
Edward Thomson 0cee70eb 2014-07-01T14:09:01 Introduce cl_assert_equal_oid
Carlos Martín Nieto 430866d2 2014-05-20T08:29:51 Fix a leak in the tests
Carlos Martín Nieto ee311907 2014-05-05T16:04:14 odb: ignore files in the objects dir We assume that everything under GIT_DIR/objects/ is a directory. This is not necessarily the case if some process left a stray file in there. Check beforehand if we do have a directory and ignore the entry otherwise.
Russell Belfer 89499078 2014-03-10T10:53:39 Fix a number of git_odb_exists_prefix bugs The git_odb_exists_prefix API was not dealing correctly when a later backend returned GIT_ENOTFOUND even if an earlier backend had found the object. Additionally, the unit tests were not properly exercising the API and had a couple mistakes in checking the results. Lastly, since the backends are not expected to behavior correctly unless all bytes of the short id are zero except for the prefix, this makes the ODB prefix APIs explicitly clear out the extra bytes so the user doesn't have to be as careful.
Carlos Martín Nieto ae32c54e 2014-03-05T20:28:49 Plug a few leaks in the tests
Vicent Marti a064dc2d 2014-03-06T00:47:05 Merge pull request #2159 from libgit2/rb/odb-exists-prefix Add ODB API to check for existence by prefix and object id shortener
Edward Thomson 7bd2f401 2014-03-05T11:35:47 ODB writing fails gracefully when unsupported If no ODB backends support writing, we should fail gracefully.
Russell Belfer f5753999 2014-03-04T15:34:23 Add exists_prefix to ODB backend and ODB API
Russell Belfer 25e0b157 2013-12-06T15:07:57 Remove converting user error to GIT_EUSER This changes the behavior of callbacks so that the callback error code is not converted into GIT_EUSER and instead we propagate the return value through to the caller. Instead of using the giterr_capture and giterr_restore functions, we now rely on all functions to pass back the return value from a callback. To avoid having a return value with no error message, the user can call the public giterr_set_str or some such function to set an error message. There is a new helper 'giterr_set_callback' that functions can invoke after making a callback which ensures that some error message was set in case the callback did not set one. In places where the sign of the callback return value is meaningful (e.g. positive to skip, negative to abort), only the negative values are returned back to the caller, obviously, since the other values allow for continuing the loop. The hardest parts of this were in the checkout code where positive return values were overloaded as meaningful values for checkout. I fixed this by adding an output parameter to many of the internal checkout functions and removing the overload. This added some code, but it is probably a better implementation. There is some funkiness in the network code where user provided callbacks could be returning a positive or a negative value and we want to rely on that to cancel the loop. There are still a couple places where an user error might get turned into GIT_EUSER there, I think, though none exercised by the tests.
Ben Straub 83e1efbf 2013-11-14T14:10:32 Update files that reference tests-clar
Ben Straub 17820381 2013-11-14T14:05:52 Rename tests-clar to tests