tests/pack


Log

Author Commit Date CI Message
Edward Thomson f0e693b1 2021-09-07T17:53:49 str: introduce `git_str` for internal, `git_buf` is external libgit2 has two distinct requirements that were previously solved by `git_buf`. We require: 1. A general purpose string class that provides a number of utility APIs for manipulating data (eg, concatenating, truncating, etc). 2. A structure that we can use to return strings to callers that they can take ownership of. By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult. Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr"). The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.) Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.
Edward Thomson 31ecaca2 2021-09-30T08:11:40 hash: hash functions operate on byte arrays not git_oids Separate the concerns of the hash functions from the git_oid functions. The git_oid structure will need to understand either SHA1 or SHA256; the hash functions should only deal with the appropriate one of these.
Edward Thomson 2a713da1 2021-09-29T21:31:17 hash: accept the algorithm in inputs
Edward Thomson c3512fe6 2021-08-29T21:35:40 Merge branch 'main' into multi-pack-index-odb-write
lhchavez ea285904 2020-02-18T00:02:13 midx: Introduce git_odb_write_multi_pack_index() This change introduces git_odb_write_multi_pack_index(), which creates a `multi-pack-index` file from all the `.pack` files that have been loaded in the ODB. Fixes: #5399
lhchavez 9d117e38 2020-02-17T21:28:13 midx: Add a way to write multi-pack-index files This change adds the git_midx_writer_* functions to allow to write and create `multi-pack-index` files from `.idx`/`.pack` files. Part of: #5399
lhchavez fff209c4 2020-02-17T21:28:13 midx: Add a way to write multi-pack-index files This change adds the git_midx_writer_* functions to allow to write and create `multi-pack-index` files from `.idx`/`.pack` files. Part of: #5399
lhchavez d50d3db6 2021-01-07T05:43:30 midx: Fix a bug in `git_midx_needs_refresh()` The very last check in the freshness check for the midx was wrong >< This was also because this function was not tested.
Edward Thomson 0450e313 2020-12-05T22:03:59 tests: ifdef out unused function in no-thread builds
lhchavez 322c15ee 2020-08-01T18:24:41 Make the pack and mwindow implementations data-race-free This change fixes a packfile heap corruption that can happen when interacting with multiple packfiles concurrently across multiple threads. This is exacerbated by setting a lower mwindow open file limit. This change: * Renames most of the internal methods in pack.c to clearly indicate that they expect to be called with a certain lock held, making reasoning about the state of locks a bit easier. * Splits the `git_pack_file` lock in two: the one in `git_pack_file` only protects the `index_map`. The protection to `git_mwindow_file` is now in that struct. * Explicitly checks for freshness of the `git_pack_file` in `git_packfile_unpack_header`: this allows the mwindow implementation to close files whenever there is enough cache pressure, and `git_packfile_unpack_header` will reopen the packfile if needed. * After a call to `p_munmap()`, the `data` and `len` fields are poisoned with `NULL` to make use-after-frees more evident and crash rather than being open to the possibility of heap corruption. * Adds a test case to prevent this from regressing in the future. Fixes: #5591
lhchavez f847fa7b 2020-02-16T02:00:56 midx: Support multi-pack-index files in odb_pack.c This change adds support for reading multi-pack-index files from the packfile odb backend. This also makes git_pack_file objects open their backing failes lazily in more scenarios, since the multi-pack-index can avoid having to open them in some cases (yay!). This change also refreshes the documentation found in src/odb_pack.c to match the updated code. Part of: #5399
Edward Thomson e316b0d3 2020-05-15T11:47:09 runtime: move init/shutdown into the "runtime" Provide a mechanism for system components to register for initialization and shutdown of the libgit2 runtime.
lhchavez 005e7715 2020-02-23T22:28:52 multipack: Introduce a parser for multi-pack-index files This change is the first in a series to add support for git's multi-pack-index. This should speed up large repositories significantly. Part of: #5399
lhchavez 92d42eb3 2020-07-12T09:53:10 Minor nits and style formatting
lhchavez d88994da 2020-06-27T07:34:36 Create the repository within the test This change moves the humongous static repository with 1025 commits into the test file, now with a more modest 16 commits.
lhchavez eab2b044 2020-06-26T16:10:30 Review feedback * Change the default of the file limit to 0 (unlimited). * Changed the heuristic to close files to be the file that contains the least-recently-used window such that the window is the most-recently-used in the file, and the file does not have in-use windows. * Parameterized the filelimit test to check for a limit of 1 and 100 open windows.
lhchavez 9679df57 2020-02-08T20:47:24 mwindow: set limit on number of open files There are some cases in which repositories accrue a large number of packfiles. The existing mwindow limit applies only to the total size of mmap'd files, not on their number. This leads to a situation in which having lots of small packfiles could exhaust the allowed number of open files, particularly on macOS, where the default ulimit is very low (256). This change adds a new configuration parameter (GIT_OPT_SET_MWINDOW_FILE_LIMIT) that sets the maximum number of open packfiles, with a default of 128. This is low enough so that even macOS users should not hit it during normal use. Based on PR #5386, originally written by @josharian. Fixes: #2758
Josh Triplett 5278a006 2020-05-23T16:07:54 git_packbuilder_write: Allow setting path to NULL to use the default path If given a NULL path, write to the object path of the repository. Add tests for the new behavior.
Patrick Steinhardt 2dc7b5ef 2019-12-14T12:53:04 tests: pack: add missing asserts around `git_packbuilder_write`
Patrick Steinhardt e54343a4 2019-06-29T09:17:32 fileops: rename to "futils.h" to match function signatures Our file utils functions all have a "futils" prefix, e.g. `git_futils_touch`. One would thus naturally guess that their definitions and implementation would live in files "futils.h" and "futils.c", respectively, but in fact they live in "fileops.h". Rename the files to match expectations.
Edward Thomson a1ef995d 2019-02-21T10:33:30 indexer: use git_indexer_progress throughout Update internal usage of `git_transfer_progress` to `git_indexer_progreses`.
Patrick Steinhardt 18cf5698 2018-12-01T09:37:40 maps: provide high-level iteration interface Currently, our headers need to leak some implementation details of maps due to their direct use of indices in the implementation of their foreach macros. This makes it impossible to completely hide the map structures away, and also makes it impossible to include the khash implementation header in the C files of the respective map only. This is now being fixed by providing a high-level iteration interface `map_iterate`, which takes as inputs the map that shall be iterated over, an iterator as well as the locations where keys and values shall be put into. For simplicity's sake, the iterator is a simple `size_t` that shall initialized to `0` on the first call. All existing foreach macros are then adjusted to make use of this new function.
Patrick Steinhardt 7e926ef3 2018-11-30T12:14:43 maps: provide a uniform entry count interface There currently exist two different function names for getting the entry count of maps, where offmaps offset and string maps use `num_entries` and OID maps use `size`. In most programming languages with built-in map types, this is simply called `size`, which is also shorter to type. Thus, this commit renames the other two functions `num_entries` to match the common way and adjusts all callers.
Dhruva Krishnamurthy 004a3398 2019-01-28T18:31:21 Allow bypassing check '.keep' files using libgit2 option 'GIT_OPT_IGNORE_PACK_KEEP_FILE_CHECK'
Edward Thomson f673e232 2018-12-27T13:47:34 git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.
Edward Thomson 168fe39b 2018-11-28T14:26:57 object_type: use new enumeration names Use the new object_type enumeration names within the codebase.
Patrick Steinhardt 852bc9f4 2018-11-23T19:26:24 khash: remove intricate knowledge of khash types Instead of using the `khiter_t`, `git_strmap_iter` and `khint_t` types, simply use `size_t` instead. This decouples code from the khash stuff and makes it possible to move the khash includes into the implementation files.
Patrick Steinhardt 65a4b06e 2017-11-12T10:43:28 tests: indexer: add test to exercise our connectivity checking The new connectivity tests are not currently being verified at all due to being turned off by default. Create two test cases for a pack file which fails our checks and one which suceeds.
Patrick Steinhardt c16556aa 2017-11-12T10:31:48 indexer: introduce options struct to `git_indexer_new` We strive to keep an options structure to many functions to be able to extend options in the future without breaking the API. `git_indexer_new` doesn't have one right now, but we want to be able to add an option for enabling strict packfile verification. Add a new `git_indexer_options` structure and adjust callers to use that.
Patrick Steinhardt ecf4f33a 2018-02-08T11:14:48 Convert usage of `git_buf_free` to new `git_buf_dispose`
lhchavez c3514b0b 2017-12-23T14:59:07 Fix unpack double free If an element has been cached, but then the call to packfile_unpack_compressed() fails, the very next thing that happens is that its data is freed and then the element is not removed from the cache, which frees the data again. This change sets obj->data to NULL to avoid the double-free. It also stops trying to resolve deltas after two continuous failed rounds of resolution, and adds a test for this.
lhchavez c8aaba24 2017-12-06T03:03:18 libFuzzer: Fix missing trailer crash This change fixes an invalid memory access when the trailer is missing / corrupt. Found using libFuzzer.
lhchavez 400caed3 2017-12-06T03:22:58 libFuzzer: Fix a git_packfile_stream leak This change ensures that the git_packfile_stream object in git_indexer_append() does not leak when the stream has errors. Found using libFuzzer.
Edward Thomson 6f960b55 2017-06-11T10:37:46 Merge pull request #4088 from chescock/packfile-name-using-complete-hash Ensure packfiles with different contents have different names
Patrick Steinhardt 6c23704d 2017-06-08T21:40:18 settings: rename `GIT_OPT_ENABLE_SYNCHRONOUS_OBJECT_CREATION` Initially, the setting has been solely used to enable the use of `fsync()` when creating objects. Since then, the use has been extended to also cover references and index files. As the option is not yet part of any release, we can still correct this by renaming the option to something more sensible, indicating not only correlation to objects. This commit renames the option to `GIT_OPT_ENABLE_FSYNC_GITDIR`. We also move the variable from the object to repository source code.
Chris Hescock c0e54155 2017-01-11T10:39:59 indexer: name pack files after trailer hash Upstream git.git has changed the way how packfiles are named. Previously, they were using a hash of the contained object's OIDs, which has then been changed to use the hash of the complete packfile instead. See 1190a1acf (pack-objects: name pack files after trailer hash, 2013-12-05) in the git.git repository for more information on this change. This commit changes our logic to match the behavior of core git.
Edward Thomson 1c04a96b 2017-02-28T12:29:29 Honor `core.fsyncObjectFiles`
Edward Thomson 3ac05d11 2017-02-17T16:48:03 win32: don't fsync parent directories on Windows Windows doesn't support it.
Edward Thomson 2a5ad7d0 2017-02-17T16:42:40 fsync: call it "synchronous" object writing Rename `GIT_OPT_ENABLE_SYNCHRONIZED_OBJECT_CREATION` -> `GIT_OPT_ENABLE_SYNCHRONOUS_OBJECT_CREATION`.
Edward Thomson 1229e1c4 2017-02-17T16:36:53 fsync parent directories when fsyncing When fsync'ing files, fsync the parent directory in the case where we rename a file into place, or create a new file, to ensure that the directory entry is flushed correctly.
Edward Thomson 1c2c0ae2 2016-12-14T12:51:40 packbuilder: honor git_object__synchronized_writing Honor `git_object__synchronized_writing` when creating a packfile and corresponding index.
lhchavez f5586f5c 2017-01-14T16:37:00 Addressed review feedback
lhchavez a7ff6e5e 2017-01-03T18:24:51 Fix the memory leak
lhchavez def644e4 2017-01-01T17:35:29 Add a test
Jacques Germishuys 0478b7f4 2014-09-25T15:35:00 Silence unused return value warning
Ciro Santilli 3b2cb2c9 2014-09-16T11:49:25 Factor 40 and 41 constants from source.
Edward Thomson 0cee70eb 2014-07-01T14:09:01 Introduce cl_assert_equal_oid
Carlos Martín Nieto b3b66c57 2014-06-18T17:13:12 Share packs across repository instances Opening the same repository multiple times will currently open the same file multiple times, as well as map the same region of the file multiple times. This is not necessary, as the packfile data is immutable. Instead of opening and closing packfiles directly, introduce an indirection and allocate packfiles globally. This does mean locking on each packfile open, but we already use this lock for the global mwindow list so it doesn't introduce a new contention point.
Philip Kelley d7a29463 2014-05-17T16:58:09 Fix a bug in the pack::packbuilder suite
Russell Belfer 7697e541 2013-12-11T15:02:20 Test cancel from indexer progress callback This adds tests that try canceling an indexer operation from within the progress callback. After writing the tests, I wanted to run this under valgrind and had a number of errors in that situation because mmap wasn't working. I added a CMake option to force emulation of mmap and consolidated the Amiga-specific code into that new place (so we don't actually need separate Amiga code now, just have to turn on -DNO_MMAP). Additionally, I made the indexer code propagate error codes more reliably than it used to.
Ben Straub 83e1efbf 2013-11-14T14:10:32 Update files that reference tests-clar
Ben Straub 17820381 2013-11-14T14:05:52 Rename tests-clar to tests