kmx git

Commit	Date	Message
d2458af7	2022-01-22T14:19:13	indexer: use a byte array for checksum The index's checksum is not an object ID, so we should not use the `git_oid` type. Use a byte array for checksum calculation and storage. Deprecate the `git_indexer_hash` function. Callers should use the new `git_indexer_name` function which provides a unique packfile name.
90df4302	2022-01-05T12:18:05	Fix typos
fc1a3f45	2021-11-29T13:36:36	object: return GIT_EINVALID on parse errors Return `GIT_EINVALID` on parse errors so that direct callers of parse functions can determine when there was a failure to parse the object. The object parser functions will swallow this error code to prevent it from propagating down the chain to end-users. (`git_merge` should not return `GIT_EINVALID` when a commit it tries to look up is not valid, this would be too vague to be useful.) The only public function that this affects is `git_signature_from_buffer`, which is now documented as returning `GIT_EINVALID` when appropriate.
adcf638c	2021-11-21T21:34:17	filebuf: use hashes not oids The filebuf functions should use hashes directly, not indirectly using the oid functions.
f0e693b1	2021-09-07T17:53:49	str: introduce `git_str` for internal, `git_buf` is external libgit2 has two distinct requirements that were previously solved by `git_buf`. We require: 1. A general purpose string class that provides a number of utility APIs for manipulating data (eg, concatenating, truncating, etc). 2. A structure that we can use to return strings to callers that they can take ownership of. By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult. Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr"). The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.) Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.
31ecaca2	2021-09-30T08:11:40	hash: hash functions operate on byte arrays not git_oids Separate the concerns of the hash functions from the git_oid functions. The git_oid structure will need to understand either SHA1 or SHA256; the hash functions should only deal with the appropriate one of these.
2a713da1	2021-09-29T21:31:17	hash: accept the algorithm in inputs
c65eb24c	2021-09-08T08:47:39	Avoid double negatives in the justification for truncation Turns out, double negatives are harder to parse than positive statements.
6571ba7b	2021-09-08T06:29:58	Only avoid `mmap(2)`/`ftruncate(2)` when in non-Windows It turns out that if we use `mmap(2)`, non-Windows remote filesystems break due to permissions. If we don't, _Windows_ remote filesystems break due to lack of coherence between memory mapped views of the file and direct I/O operations done to the files. To break out of this impossible situation, conditionally-compile versions of Windows-specific `write_at` and `append_to_pack`.
eeceaac0	2021-09-07T08:38:35	Also remove a `ftruncate(2)` call in `git_indexer_commit` Now that we're not using `mmap(2)` for writing stuff, we don't need to truncate the file afterwards, since it'll have the correct size at the end of the process. Whee~!
66a75fde	2021-09-07T07:14:39	indexer: Avoid one `mmap(2)`/`munmap(2)` pair per `git_indexer_append` call This change makes `append_to_pack` completely rely on `p_pwrite` to do all its I/O instead of splitting it between `p_pwrite` and a `mmap(2)`/`munmap(2)`+`memcpy(3)`. This saves a good chunk of user CPU time and avoids making two syscalls per round, but doesn't really cut down a lot of wall time (~1% on cloning the [git](https://github.com/git/git.git) repository).
ff6f6754	2021-01-07T05:44:16	Use `p_pwrite`/`p_pread` consistently throughout the codebase This change stops using the seek+read/write combo to perform I/O with an offset, since this is faster by one system call (and also more atomic and therefore safer).
4ce8e01a	2020-06-17T14:31:11	Support build with NO_MMAP to disable use of system mmap * Use pread/pwrite to avoid updating position in file descriptor * Emulate missing pread/pwrite on win32 using overlapped file IO
322c15ee	2020-08-01T18:24:41	Make the pack and mwindow implementations data-race-free This change fixes a packfile heap corruption that can happen when interacting with multiple packfiles concurrently across multiple threads. This is exacerbated by setting a lower mwindow open file limit. This change: * Renames most of the internal methods in pack.c to clearly indicate that they expect to be called with a certain lock held, making reasoning about the state of locks a bit easier. * Splits the `git_pack_file` lock in two: the one in `git_pack_file` only protects the `index_map`. The protection to `git_mwindow_file` is now in that struct. * Explicitly checks for freshness of the `git_pack_file` in `git_packfile_unpack_header`: this allows the mwindow implementation to close files whenever there is enough cache pressure, and `git_packfile_unpack_header` will reopen the packfile if needed. * After a call to `p_munmap()`, the `data` and `len` fields are poisoned with `NULL` to make use-after-frees more evident and crash rather than being open to the possibility of heap corruption. * Adds a test case to prevent this from regressing in the future. Fixes: #5591
7cd0bf65	2020-04-05T18:26:52	pack: use GIT_ASSERT
a3e8b7cd	2020-04-05T17:18:20	mwindow: use GIT_ASSERT
cd2fe662	2020-04-05T16:56:55	indexer: use GIT_ASSERT
3a197ea7	2020-06-27T12:33:32	Make the tests pass cleanly with MemorySanitizer This change: * Initializes a few variables that were being read before being initialized. * Includes https://github.com/madler/zlib/pull/393. As such, it only works reliably with `-DUSE_BUNDLED_ZLIB=ON`.
c6184f0c	2020-06-08T21:07:36	tree-wide: do not compile deprecated functions with hard deprecation When compiling libgit2 with -DDEPRECATE_HARD, we add a preprocessor definition `GIT_DEPRECATE_HARD` which causes the "git2/deprecated.h" header to be empty. As a result, no function declarations are made available to callers, but the implementations are still available to link against. This has the problem that function declarations also aren't visible to the implementations, meaning that the symbol's visibility will not be set up correctly. As a result, the resulting library may not expose those deprecated symbols at all on some platforms and thus cause linking errors. Fix the issue by conditionally compiling deprecated functions, only. While it becomes impossible to link against such a library in case one uses deprecated functions, distributors of libgit2 aren't expected to pass -DDEPRECATE_HARD anyway. Instead, users of libgit2 should manually define GIT_DEPRECATE_HARD to hide deprecated functions. Using "real" hard deprecation still makes sense in the context of CI to test we don't use deprecated symbols ourselves and in case a dependant uses libgit2 in a vendored way and knows it won't ever use any of the deprecated symbols anyway.
ba59a4a2	2020-04-01T12:34:16	Making get_delta_base() conform to the general error-handling pattern This makes get_delta_base() return the error code as the return value and the delta base as an out-parameter.
90450d88	2020-02-07T12:10:12	indexer: check return code of `git_hash_ctx_init` Initialization of the hashing context may fail on some systems, most notably on Win32 via the legacy hashing context. As such, we need to always check the error code of `git_hash_ctx_init`, which is not done when creating a new indexer. Fix the issue by adding checks.
6460e8ab	2019-06-23T18:13:29	internal: use off64_t instead of git_off_t Prefer `off64_t` internally.
a477bff1	2019-08-08T10:44:57	indexer: catch OOM when adding expected OIDs When adding OIDs to the indexer's map of yet-to-be-seen OIDs to verify that packfiles are complete, we do so by first allocating a new OID and then calling `git_oidmap_set` on it. There was no check for memory allocation errors in place, though, leading to possible segfaults due to trying to copy data to a `NULL` pointer. Verify the result of `git__malloc` with `GIT_ERROR_CHECK_ALLOC` to fix the issue.
0b5ba0d7	2019-06-06T16:36:23	Rename opt init functions to `options_init` In libgit2 nomenclature, when we need to verb a direct object, we name a function `git_directobject_verb`. Thus, if we need to init an options structure named `git_foo_options`, then the name of the function that does that should be `git_foo_options_init`. The previous names of `git_foo_init_options` is close - it _sounds_ as if it's initializing the options of a `foo`, but in fact `git_foo_options` is its own noun that should be respected. Deprecate the old names; they'll now call directly to the new ones.
a1ef995d	2019-02-21T10:33:30	indexer: use git_indexer_progress throughout Update internal usage of `git_transfer_progress` to `git_indexer_progress`.
c976b4f9	2018-12-01T10:18:26	indexer: use map iterator to delete expected OIDs To compute whether there are objects missing in a packfile, the indexer keeps around a map of OIDs that it still expects to see. This map does not store any values at all, but in fact the keys are owned by the map itself. Right now, we free these keys by iterating over the map and freeing the key itself, which is kind of awkward as keys are expected to be constant. We can make this a bit prettier by inserting the OID as value, too. As we already store the `NULL` pointer either way, this does not increase memory usage, but makes the code a tad more clear. Furthermore, we convert the previously existing map iteration via indices to make use of an iterator, instead.
2e0a3048	2019-01-23T10:48:55	oidmap: introduce high-level setter for key/value pairs Currently, one would use either `git_oidmap_insert` to insert key/value pairs into a map or `git_oidmap_put` to insert a key only. These functions have historically been macros, which is why their syntax is kind of weird: instead of returning an error code directly, they instead have to be passed a pointer to where the return value shall be stored. This does not match libgit2's common idiom of directly returning error codes. Furthermore, `git_oidmap_put` is tightly coupled with implementation details of the map as it exposes the index of inserted entries. Introduce a new function `git_oidmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert all trivial callers of `git_oidmap_insert` and `git_oidmap_put` to make use of it.
351eeff3	2019-01-23T10:42:46	maps: use uniform lifecycle management functions Currently, the lifecycle functions for maps (allocation, deallocation, resize) are not named in a uniform way and do not have a uniform function signature. Rename the functions to fix that, and stick to libgit2's naming scheme of saying `git_foo_new`. This results in the following new interface for allocation: - `int git_<t>map_new(git_<t>map **out)` to allocate a new map, returning an error code if we ran out of memory - `void git_<t>map_free(git_<t>map *map)` to free a map - `void git_<t>map_clear(git_<t>map *map)` to remove all entries from a map This commit also fixes all existing callers.
f673e232	2018-12-27T13:47:34	git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.
168fe39b	2018-11-28T14:26:57	object_type: use new enumeration names Use the new object_type enumeration names within the codebase.
852bc9f4	2018-11-23T19:26:24	khash: remove intricate knowledge of khash types Instead of using the `khiter_t`, `git_strmap_iter` and `khint_t` types, simply use `size_t` instead. This decouples code from the khash stuff and makes it possible to move the khash includes into the implementation files.
50186ce8	2018-08-26T11:26:45	Merge pull request #4374 from pks-t/pks/pack-file-verify Pack file verification
32810348	2018-07-20T08:43:54	Use UINT32_MAX as the default object limit This replicates the old behavior of limiting to 2³² by default.
bfe34242	2018-07-16T03:12:01	See if this fixes 32-bit build
efe3f37d	2018-07-12T04:20:15	Add a git_libgit2_opts option to set the max indexer object count
912c59c9	2018-06-24T06:51:08	while fuzzing, limit # objects read
6b51f380	2018-06-22T13:19:31	indexer: correctly initialize struct with {0}
5ec4aee9	2017-11-12T10:35:18	indexer: add ability to select connectivity checks Right now, we simply turn on connectivity checks in the indexer as soon as we have access to an object database. But seeing that the connectivity checks may incur additional overhead, we do want the user to decide for himself whether he wants to allow those checks. Furthermore, it might also be desirable to check connectivity in case where no object database is given at all, e.g. in case where a fully connected pack file is expected. Add a flag `verify` to `git_indexer_options` to enable additional verification checks. Also avoid to query the ODB in case none is given to allow users to enable checks when they do not have an ODB.
c16556aa	2017-11-12T10:31:48	indexer: introduce options struct to `git_indexer_new` We strive to keep an options structure to many functions to be able to extend options in the future without breaking the API. `git_indexer_new` doesn't have one right now, but we want to be able to add an option for enabling strict packfile verification. Add a new `git_indexer_options` structure and adjust callers to use that.
a616fb16	2017-10-13T13:53:05	indexer: check pack file connectivity When passing `--strict` to `git-unpack-objects`, core git will verify the pack file that is currently being read. In addition to the typical checksum verification, this will especially cause it to verify object connectivity of the received pack file. So it checks, for every received object, if all the objects it references are either part of the local object database or part of the pack file. In libgit2, we currently have no such mechanism, which leaves us unable to verify received pack files prior to writing them into our local object database. This commit introduces the concept of `expected_oids` to the indexer. When pack file verification is turned on by a new flag, the indexer will try to parse each received object first. If the object has any links to other objects, it will check if those links are already satisfied by known objects either part of the object database or objects it has already seen as part of that pack file. If not, it will add them to the list of `expected_oids`. Furthermore, the indexer will remove the current object from the `expected_oids` if it is currently being expected. Like this, we are able to verify whether all object links are being satisfied. As soon as we hit the end of the object stream and have resolved all objects as well as deltified objects, we assert that `expected_oids` is in fact empty. This should always be the case for a valid pack file with full connectivity.
be41c384	2017-11-12T09:25:49	indexer: extract function reading stream objects The loop inside of `git_indexer_append` iterates over every object that is to be stored as part of the index. While the logic to retrieve every object from the packfile stream is rather involved, it is currently just part of the loop, making it unnecessarily hard to follow. Move the logic into its own function `read_stream_object`, which unpacks a single object from the stream. Note that there is some subtlety here involving the special error `GIT_EBUFS`, which indicates to the indexer that no more data is currently available. So instead of returning an error and aborting the whole loop in that case, we do have to catch that value and return successfully to wait for more data to be read.
6568f374	2017-10-11T13:20:19	indexer: remove useless local variable The `processed` variable local to `git_indexer_append` counts how many objects have already been processed. But actually, whenever it gets assigned to, we are also assigning the same value to the `stats->indexed_objects` struct member. So in fact, it is being quite useless due to always having the same value as the `indexed_objects` member and makes it a bit harder to understand the code. We can just remove the variable to fix that.
ecf4f33a	2018-02-08T11:14:48	Convert usage of `git_buf_free` to new `git_buf_dispose`
c8ee5270	2017-12-08T09:05:58	pack: rename `git_packfile_stream_free` The function `git_packfile_stream_free` frees all state of the packfile stream without freeing the structure itself. This naming makes it hard to spot whether it will try to free the pointer itself or not, causing potential future errors. Due to this reason, we have decided to name a function freeing state without freeing the actual structure a "dispose" function. Rename `git_packfile_stream_free` to `git_packfile_stream_dispose` as a first example following this rule.
619f61a8	2018-02-01T06:22:36	odb: error when we can't create object header Return an error to the caller when we can't create an object header for some reason (printf failure) instead of simply asserting.
c3514b0b	2017-12-23T14:59:07	Fix unpack double free If an element has been cached, but then the call to packfile_unpack_compressed() fails, the very next thing that happens is that its data is freed and then the element is not removed from the cache, which frees the data again. This change sets obj->data to NULL to avoid the double-free. It also stops trying to resolve deltas after two continuous failed rounds of resolution, and adds a test for this.
c8aaba24	2017-12-06T03:03:18	libFuzzer: Fix missing trailer crash This change fixes an invalid memory access when the trailer is missing / corrupt. Found using libFuzzer.
400caed3	2017-12-06T03:22:58	libFuzzer: Fix a git_packfile_stream leak This change ensures that the git_packfile_stream object in git_indexer_append() does not leak when the stream has errors. Found using libFuzzer.
0c7f49dd	2017-06-30T13:39:01	Make sure to always include "common.h" first Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.
6f960b55	2017-06-11T10:37:46	Merge pull request #4088 from chescock/packfile-name-using-complete-hash Ensure packfiles with different contents have different names
6c23704d	2017-06-08T21:40:18	settings: rename `GIT_OPT_ENABLE_SYNCHRONOUS_OBJECT_CREATION` Initially, the setting has been solely used to enable the use of `fsync()` when creating objects. Since then, the use has been extended to also cover references and index files. As the option is not yet part of any release, we can still correct this by renaming the option to something more sensible, indicating not only correlation to objects. This commit renames the option to `GIT_OPT_ENABLE_FSYNC_GITDIR`. We also move the variable from the object to repository source code.
c0e54155	2017-01-11T10:39:59	indexer: name pack files after trailer hash Upstream git.git has changed the way how packfiles are named. Previously, they were using a hash of the contained object's OIDs, which has then been changed to use the hash of the complete packfile instead. See 1190a1acf (pack-objects: name pack files after trailer hash, 2013-12-05) in the git.git repository for more information on this change. This commit changes our logic to match the behavior of core git.
1c04a96b	2017-02-28T12:29:29	Honor `core.fsyncObjectFiles`
2a5ad7d0	2017-02-17T16:42:40	fsync: call it "synchronous" object writing Rename `GIT_OPT_ENABLE_SYNCHRONIZED_OBJECT_CREATION` -> `GIT_OPT_ENABLE_SYNCHRONOUS_OBJECT_CREATION`.
1229e1c4	2017-02-17T16:36:53	fsync parent directories when fsyncing When fsync'ing files, fsync the parent directory in the case where we rename a file into place, or create a new file, to ensure that the directory entry is flushed correctly.
1c2c0ae2	2016-12-14T12:51:40	packbuilder: honor git_object__synchronized_writing Honor `git_object__synchronized_writing` when creating a packfile and corresponding index.
0d716905	2017-01-27T15:23:15	oidmap: remove GIT__USE_OIDMAP macro
85d2748c	2017-01-27T14:05:10	khash: avoid using `kh_key`/`kh_val` as lvalue
f31cb45a	2017-01-25T15:31:12	khash: avoid using `kh_put` directly
cb18386f	2017-01-25T14:26:58	khash: avoid using `kh_val`/`kh_value` directly
036daa59	2017-01-25T14:11:42	khash: use `git_map_exists` where applicable
9694d9ba	2017-01-25T14:09:17	khash: avoid using `kh_foreach`/`kh_foreach_value` directly
048c5ea7	2017-01-21T23:55:21	Merge pull request #4053 from chescock/extend-packfile-by-pages Extend packfile in increments of page_size.
87b7a705	2017-01-21T15:44:57	indexer: avoid warning about `idx->pack` It must be non-NULL to have a valid `git_indexer`.
bf339ab0	2017-01-21T14:51:31	indexer: introduce `git_packfile_close` Encapsulation!
52949c80	2017-01-21T18:30:12	Merge branch 'pr/4060'
d030bba9	2017-01-21T17:15:33	indexer: only delete temp file if it was unused Only try to `unlink` our temp file when we know that we didn't copy it into its permanent location.
f5586f5c	2017-01-14T16:37:00	Addressed review feedback
96df833b	2017-01-03T19:15:09	Close the file before unlinking I forgot that Windows chokes while trying to delete open files.
db535d0a	2017-01-01T12:45:02	Delete temporary packfile in indexer This change deletes the temporary packfile that the indexer creates to avoid littering the pack/ directory with garbage.
c7a1535f	2016-12-29T11:47:52	Extend packfile in increments of page_size. This improves performance by reducing the number of I/O operations.
909d5494	2016-12-29T12:25:15	giterr_set: consistent error messages Error messages should be sentence fragments, and therefore: 1. Should not begin with a capital letter, 2. Should not conclude with punctuation, and 3. Should not end a sentence and begin a new one
d53cc13e	2016-03-31T04:12:46	Merge pull request #3575 from pmq20/master-13jan16 Remove duplicated calls to git_mwindow_close
e50a49ee	2016-03-22T01:54:49	Merge pull request #3559 from yongthecoder/master Add a sanity check in git_indexer_commit to avoid subtraction overflow.
87c18197	2016-03-16T19:05:11	Split the page size from the mmap alignment While often similar, these are not the same on Windows. We want to use the page size on Windows for the pools, but for mmap we need to use the allocation granularity as the alignment. On the other platforms these values remain the same.
d4e4f272	2016-01-13T11:07:14	Remove duplicated calls to git_mwindow_close
b3eb2cde	2015-12-24T10:04:44	Avoid subtraction overflow in git_indexer_commit
c369b379	2015-07-31T16:23:11	Remove extra semicolon outside of a function Without this change, compiling with gcc and pedantic generates warning: ISO C does not allow extra ‘;’ outside of a function.
3e8c5e45	2015-06-10T16:43:48	Merge pull request #3174 from libgit2/cmn/idx-fill-hole indexer: use lseek to extend the packfile
02980bdc	2015-06-09T16:53:07	Initialize a few variables Coverity complains about the git_rawobj ones because we use a loop in which we keep remembering the old version, and we end up copying our object as the base, so we want to have the data pointer be NULL.
aa57231f	2015-06-02T10:25:22	indexer: use lseek to extend the packfile We've been using `p_ftruncate()` to extend the packfile in order to mmap it and write the new data into it. This works well in the general case, but as truncation does not allocate space in the filesystem, it must do so when we write data to it. The only way the OS has to indicate a failure to allocate space is via SIGBUS which means we tried to write outside the file. This will cause everyone to crash as they don't expect to handle this signal. Switch to using `p_lseek()` and `p_write()` to extend the file in a way which tells the filesystem to allocate the space for the missing data. We can then be sure that we have space to write into.
e2dd3735	2015-05-22T11:20:47	indexer: avoid loading already existent bases When thickening a pack, avoid loading already loaded bases and trying to insert them all over again.
7800048a	2015-03-17T10:06:50	Merge pull request #2972 from libgit2/cmn/pack-objects-walk [WIP] Smarter pack-building
7c63a33f	2015-03-13T19:41:40	indexer: bring back the error message on duplicate commits It turns out that erroring out on duplicate commits is the right thing to do, but git was not hitting the bug on the server-side. Bring back a descriptive error message in case of duplicate entries and error out.
dccf59ad	2015-03-13T18:28:07	indexer: don't worry about duplicate objects If a packfile includes duplicate objects, we can choose to use the second copy instead of the first by using the same logic as if it were the first. Change the error condition from 0 to -1, which indicates a bad resize, and set the OOM message in that case. This does mean we will leak the first copy of the object. We can deal with that later, but making fetches work is more important.
a34692c4	2015-03-13T18:00:15	indexer: set an error message on duplicate objects in pack While this is not even close to a fix, we can at least set an error message so we know which error we are facing. Up to now we just returned an error without a message.
b63b76e0	2014-10-12T11:42:31	Reorder some khash declarations Keep the definitions in the headers, while putting the declarations in the C files. Putting the function definitions in headers causes them to be duplicated if you include two headers with them.
c251f3bb	2014-12-08T16:05:47	win32: remember to cleanup our hash_ctx
ec7e680c	2014-11-20T12:07:55	Fix for misleading "missing delta bases" error - Fix #2721.
7561f98d	2014-11-19T14:54:30	Fix for memory leak issue in indexer.c, that surfaces on windows
177a29d8	2014-10-27T10:39:45	Merge commit 'refs/pull/2366/head' of github.com:libgit2/libgit2
01b432cf	2014-07-09T14:12:30	Properly report failure when expanding a packfile
bc8a0886	2014-06-27T11:51:35	Fix assert when receiving uncommon sideband packet
b3b66c57	2014-06-18T17:13:12	Share packs across repository instances Opening the same repository multiple times will currently open the same file multiple times, as well as map the same region of the file multiple times. This is not necessary, as the packfile data is immutable. Instead of opening and closing packfiles directly, introduce an indirection and allocate packfiles globally. This does mean locking on each packfile open, but we already use this lock for the global mwindow list so it doesn't introduce a new contention point.
62e562f9	2014-05-18T07:54:41	Fix compiler warning (git_off_t cast to size_t). Use size_t for page size, instead of long. Check result of sysconf. Use size_t for page offset so no cast to size_t (second arg to p_mmap). Use mod instead div/mult pair, so no cast to size_t is necessary.
9c4feef9	2014-05-17T12:44:21	Fix warning on uninitialized variable.
0731a5b4	2014-05-14T19:12:48	indexer: mmap fixes for Windows Windows has its own ftruncate() called _chsize_s(). p_mkstemp() is changed to use p_open() so we can make sure we open for writing; the addition of exclusive create is a good thing to do regardless, as we want a temporary path for ourselves. Lastly, MSVC doesn't quite know how to add two numbers if one of them is a void pointer, so let's alias it to unsigned char.
f7310540	2014-05-13T02:41:48	indexer: use mmap for writing Some OSs cannot keep their ideas about file content straight when mixing standard IO with file mapping. As we use mmap for reading from the packfile, let's make writing to the pack file use mmap.
b3f27c43	2014-05-13T21:08:50	Initialize local variable
2dde1e0c	2014-05-08T22:31:59	indexer: avoid memory moves Our vector does a move of the rest of the array when we remove an item. Doing this repeatedly can be expensive, and we do this a lot in the indexer. Instead, set the value to NULL and skip those entries. perf reported around 30% of `index-pack` time was going into memmove. With this change, that goes away and we spent most of the time hashing and inflating data.
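
The idea behind commit 2dde1e0c (the last row above) is to clear a slot and skip it later, rather than removing it and shifting the rest of the array. A minimal sketch in C follows. It is an illustration only and not the code from libgit2's indexer: the `pending_list` type and the `pending_add`, `pending_clear` and `pending_next` helpers are hypothetical names, and a fixed-size array stands in for the real vector.

```c
#include <assert.h>
#include <stddef.h>

#define PENDING_MAX 8

/* Hypothetical list of pending entries; a stand-in, not libgit2's git_vector. */
struct pending_list {
	const char *items[PENDING_MAX];
	size_t length; /* slots handed out so far, cleared ones included */
	size_t live;   /* slots that still hold an entry */
};

/* Append an entry; returns 0 on success, -1 when the list is full. */
static int pending_add(struct pending_list *list, const char *item)
{
	if (list->length == PENDING_MAX)
		return -1;
	list->items[list->length++] = item;
	list->live++;
	return 0;
}

/* Mark a slot as done in O(1); nothing after it is shifted with memmove. */
static void pending_clear(struct pending_list *list, size_t idx)
{
	assert(idx < list->length);
	if (list->items[idx] != NULL) {
		list->items[idx] = NULL;
		list->live--;
	}
}

/* Return the first live entry at or after `from`, or NULL if none is left. */
static const char *pending_next(const struct pending_list *list, size_t from,
	size_t *idx_out)
{
	size_t i;

	for (i = from; i < list->length; i++) {
		if (list->items[i] != NULL) { /* skip cleared slots */
			*idx_out = i;
			return list->items[i];
		}
	}
	return NULL;
}
```

The trade-off in this sketch is that cleared slots keep occupying space until the whole list is discarded, and every pass over the list has to check for `NULL`. That is cheap compared with moving the tail of the array on each removal.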

d2458af7

2022-01-22T14:19:13

indexer: use a byte array for checksum The index's checksum is not an object ID, so we should not use the `git_oid` type. Use a byte array for checksum calculation and storage. Deprecate the `git_indexer_hash` function. Callers should use the new `git_indexer_name` function which provides a unique packfile name.

90df4302

2022-01-05T12:18:05

Fix typos

fc1a3f45

2021-11-29T13:36:36

object: return GIT_EINVALID on parse errors Return `GIT_EINVALID` on parse errors so that direct callers of parse functions can determine when there was a failure to parse the object. The object parser functions will swallow this error code to prevent it from propagating down the chain to end-users. (`git_merge` should not return `GIT_EINVALID` when a commit it tries to look up is not valid, this would be too vague to be useful.) The only public function that this affects is `git_signature_from_buffer`, which is now documented as returning `GIT_EINVALID` when appropriate.

adcf638c

2021-11-21T21:34:17

filebuf: use hashes not oids The filebuf functions should use hashes directly, not indirectly using the oid functions.

f0e693b1

2021-09-07T17:53:49

str: introduce `git_str` for internal, `git_buf` is external libgit2 has two distinct requirements that were previously solved by `git_buf`. We require: 1. A general purpose string class that provides a number of utility APIs for manipulating data (eg, concatenating, truncating, etc). 2. A structure that we can use to return strings to callers that they can take ownership of. By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult. Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr"). The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.) Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.

31ecaca2

2021-09-30T08:11:40

hash: hash functions operate on byte arrays not git_oids Separate the concerns of the hash functions from the git_oid functions. The git_oid structure will need to understand either SHA1 or SHA256; the hash functions should only deal with the appropriate one of these.

2a713da1

2021-09-29T21:31:17

hash: accept the algorithm in inputs

c65eb24c

2021-09-08T08:47:39

Avoid double negatives in the justification for truncation Turns out, double negatives are harder to parse than positive statements.

6571ba7b

2021-09-08T06:29:58

Only avoid `mmap(2)`/`ftruncate(2)` when in non-Windows It turns out that if we use `mmap(2)`, non-Windows remote filesystems break due to permissions. If we don't, _Windows_ remote filesystems break due to lack of coherence between memory mapped views of the file and direct I/O operations done to the files. To break out of this impossible situation, conditionally-compile versions of Windows-specific `write_at` and `append_to_pack`.

eeceaac0

2021-09-07T08:38:35

Also remove a `ftruncate(2)` call in `git_indexer_commit` Now that we're not using `mmap(2)` for writing stuff, we don't need to truncate the file afterwards, since it'll have the correct size at the end of the process. Whee~!

66a75fde

2021-09-07T07:14:39

indexer: Avoid one `mmap(2)`/`munmap(2)` pair per `git_indexer_append` call This change makes `append_to_pack` completely rely on `p_pwrite` to do all its I/O instead of splitting it between `p_pwrite` and a `mmap(2)`/`munmap(2)`+`memcpy(3)`. This saves a good chunk of user CPU time and avoids making two syscalls per round, but doesn't really cut down a lot of wall time (~1% on cloning the [git](https://github.com/git/git.git) repository).

ff6f6754

2021-01-07T05:44:16

Use `p_pwrite`/`p_pread` consistently throughout the codebase This change stops using the seek+read/write combo to perform I/O with an offset, since this is faster by one system call (and also more atomic and therefore safer).
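
The difference between the two approaches can be sketched with the plain POSIX calls. This is only an illustration: it uses `lseek(2)`, `write(2)`, `pwrite(2)` and `pread(2)` directly rather than libgit2's `p_*` wrappers, and the helper names `write_at_seek`, `write_at_pwrite` and `write_at_demo` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Two system calls; also moves the descriptor's shared file position. */
static ssize_t write_at_seek(int fd, const void *buf, size_t len, off_t offset)
{
	if (lseek(fd, offset, SEEK_SET) == (off_t)-1)
		return -1;
	return write(fd, buf, len);
}

/* One system call; the file position is left untouched. */
static ssize_t write_at_pwrite(int fd, const void *buf, size_t len, off_t offset)
{
	return pwrite(fd, buf, len, offset);
}

/* Hypothetical demo: returns 0 if both helpers behave as described. */
static int write_at_demo(void)
{
	char path[] = "/tmp/write_at_XXXXXX";
	char out[6] = { 0 };
	int fd = mkstemp(path);
	int ok;

	if (fd < 0)
		return -1;
	unlink(path); /* the file disappears once the descriptor is closed */

	ok = write_at_seek(fd, "hello", 5, 0) == 5 &&
	     write_at_pwrite(fd, "HE", 2, 0) == 2 &&
	     lseek(fd, 0, SEEK_CUR) == 5 && /* pwrite did not move the position */
	     pread(fd, out, 5, 0) == 5 &&
	     memcmp(out, "HEllo", 5) == 0;

	close(fd);
	return ok ? 0 : -1;
}
```

Because `pwrite` never touches the file position, two threads sharing one descriptor cannot interleave a seek from one with a write from the other.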

4ce8e01a

2020-06-17T14:31:11

Support build with NO_MMAP to disable use of system mmap * Use pread/pwrite to avoid updating position in file descriptor * Emulate missing pread/pwrite on win32 using overlapped file IO

322c15ee

2020-08-01T18:24:41

Make the pack and mwindow implementations data-race-free This change fixes a packfile heap corruption that can happen when interacting with multiple packfiles concurrently across multiple threads. This is exacerbated by setting a lower mwindow open file limit. This change: * Renames most of the internal methods in pack.c to clearly indicate that they expect to be called with a certain lock held, making reasoning about the state of locks a bit easier. * Splits the `git_pack_file` lock in two: the one in `git_pack_file` only protects the `index_map`. The protection to `git_mwindow_file` is now in that struct. * Explicitly checks for freshness of the `git_pack_file` in `git_packfile_unpack_header`: this allows the mwindow implementation to close files whenever there is enough cache pressure, and `git_packfile_unpack_header` will reopen the packfile if needed. * After a call to `p_munmap()`, the `data` and `len` fields are poisoned with `NULL` to make use-after-frees more evident and crash rather than being open to the possibility of heap corruption. * Adds a test case to prevent this from regressing in the future. Fixes: #5591

7cd0bf65

2020-04-05T18:26:52

pack: use GIT_ASSERT

a3e8b7cd

2020-04-05T17:18:20

mwindow: use GIT_ASSERT

cd2fe662

2020-04-05T16:56:55

indexer: use GIT_ASSERT

3a197ea7

2020-06-27T12:33:32

Make the tests pass cleanly with MemorySanitizer This change: * Initializes a few variables that were being read before being initialized. * Includes https://github.com/madler/zlib/pull/393. As such, it only works reliably with `-DUSE_BUNDLED_ZLIB=ON`.

c6184f0c

2020-06-08T21:07:36

tree-wide: do not compile deprecated functions with hard deprecation When compiling libgit2 with -DDEPRECATE_HARD, we add a preprocessor definition `GIT_DEPRECATE_HARD` which causes the "git2/deprecated.h" header to be empty. As a result, no function declarations are made available to callers, but the implementations are still available to link against. This has the problem that function declarations also aren't visible to the implementations, meaning that the symbol's visibility will not be set up correctly. As a result, the resulting library may not expose those deprecated symbols at all on some platforms and thus cause linking errors. Fix the issue by conditionally compiling deprecated functions, only. While it becomes impossible to link against such a library in case one uses deprecated functions, distributors of libgit2 aren't expected to pass -DDEPRECATE_HARD anyway. Instead, users of libgit2 should manually define GIT_DEPRECATE_HARD to hide deprecated functions. Using "real" hard deprecation still makes sense in the context of CI to test we don't use deprecated symbols ourselves and in case a dependant uses libgit2 in a vendored way and knows it won't ever use any of the deprecated symbols anyway.

ba59a4a2

2020-04-01T12:34:16

Making get_delta_base() conform to the general error-handling pattern This makes get_delta_base() return the error code as the return value and the delta base as an out-parameter.
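
The general pattern named here, with the error code as the return value and the result delivered through an out-parameter, can be sketched as follows. The function `find_base` and its lookup table are hypothetical stand-ins, not the actual `get_delta_base()` code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table of { delta offset, base offset } pairs. */
static const long bases[][2] = { { 100, 12 }, { 250, 100 } };

/*
 * Returns 0 on success and -1 when no base is known. The base offset
 * is written to `*out` only on success.
 */
static int find_base(long *out, long delta_offset)
{
	size_t i;

	assert(out);

	for (i = 0; i < sizeof(bases) / sizeof(bases[0]); i++) {
		if (bases[i][0] == delta_offset) {
			*out = bases[i][1];
			return 0;
		}
	}

	return -1;
}
```

Keeping the return value for the error code means every offset value remains a legal result, and callers can check for failure the same way they do for other functions following this convention.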

90450d88

2020-02-07T12:10:12

indexer: check return code of `git_hash_ctx_init` Initialization of the hashing context may fail on some systems, most notably on Win32 via the legacy hashing context. As such, we need to always check the error code of `git_hash_ctx_init`, which is not done when creating a new indexer. Fix the issue by adding checks.

6460e8ab

2019-06-23T18:13:29

internal: use off64_t instead of git_off_t Prefer `off64_t` internally.

a477bff1

2019-08-08T10:44:57

indexer: catch OOM when adding expected OIDs When adding OIDs to the indexer's map of yet-to-be-seen OIDs to verify that packfiles are complete, we do so by first allocating a new OID and then calling `git_oidmap_set` on it. There was no check for memory allocation errors in place, though, leading to possible segfaults due to trying to copy data to a `NULL` pointer. Verify the result of `git__malloc` with `GIT_ERROR_CHECK_ALLOC` to fix the issue.

0b5ba0d7

2019-06-06T16:36:23

Rename opt init functions to `options_init` In libgit2 nomenclature, when we need to verb a direct object, we name a function `git_directobject_verb`. Thus, if we need to init an options structure named `git_foo_options`, then the name of the function that does that should be `git_foo_options_init`. The previous names of `git_foo_init_options` is close - it _sounds_ as if it's initializing the options of a `foo`, but in fact `git_foo_options` is its own noun that should be respected. Deprecate the old names; they'll now call directly to the new ones.

a1ef995d

2019-02-21T10:33:30

indexer: use git_indexer_progress throughout Update internal usage of `git_transfer_progress` to `git_indexer_progress`.

c976b4f9

2018-12-01T10:18:26

indexer: use map iterator to delete expected OIDs To compute whether there are objects missing in a packfile, the indexer keeps around a map of OIDs that it still expects to see. This map does not store any values at all, but in fact the keys are owned by the map itself. Right now, we free these keys by iterating over the map and freeing the key itself, which is kind of awkward as keys are expected to be constant. We can make this a bit prettier by inserting the OID as value, too. As we already store the `NULL` pointer either way, this does not increase memory usage, but makes the code a tad more clear. Furthermore, we convert the previously existing map iteration via indices to make use of an iterator, instead.

2e0a3048

2019-01-23T10:48:55

oidmap: introduce high-level setter for key/value pairs Currently, one would use either `git_oidmap_insert` to insert key/value pairs into a map or `git_oidmap_put` to insert a key only. These functions have historically been macros, which is why their syntax is kind of weird: instead of returning an error code directly, they instead have to be passed a pointer to where the return value shall be stored. This does not match libgit2's common idiom of directly returning error codes. Furthermore, `git_oidmap_put` is tightly coupled with implementation details of the map as it exposes the index of inserted entries. Introduce a new function `git_oidmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert all trivial callers of `git_oidmap_insert` and `git_oidmap_put` to make use of it.

351eeff3

2019-01-23T10:42:46

maps: use uniform lifecycle management functions Currently, the lifecycle functions for maps (allocation, deallocation, resize) are not named in a uniform way and do not have a uniform function signature. Rename the functions to fix that, and stick to libgit2's naming scheme of saying `git_foo_new`. This results in the following new interface for allocation: - `int git_<t>map_new(git_<t>map **out)` to allocate a new map, returning an error code if we ran out of memory - `void git_<t>map_free(git_<t>map *map)` to free a map - `void git_<t>map_clear(git_<t>map *map)` to remove all entries from a map This commit also fixes all existing callers.

f673e232

2018-12-27T13:47:34

git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.

168fe39b

2018-11-28T14:26:57

object_type: use new enumeration names Use the new object_type enumeration names within the codebase.

852bc9f4

2018-11-23T19:26:24

khash: remove intricate knowledge of khash types Instead of using the `khiter_t`, `git_strmap_iter` and `khint_t` types, simply use `size_t` instead. This decouples code from the khash stuff and makes it possible to move the khash includes into the implementation files.

50186ce8

2018-08-26T11:26:45

Merge pull request #4374 from pks-t/pks/pack-file-verify Pack file verification

32810348

2018-07-20T08:43:54

Use UINT32_MAX as the default object limit This replicates the old behavior of limiting to 2³² by default.

bfe34242

2018-07-16T03:12:01

See if this fixes 32-bit build

efe3f37d

2018-07-12T04:20:15

Add a git_libgit2_opts option to set the max indexer object count

912c59c9

2018-06-24T06:51:08

while fuzzing, limit # objects read

6b51f380

2018-06-22T13:19:31

indexer: correctly initialize struct with {0}

5ec4aee9

2017-11-12T10:35:18

indexer: add ability to select connectivity checks Right now, we simply turn on connectivity checks in the indexer as soon as we have access to an object database. But seeing that the connectivity checks may incur additional overhead, we do want the user to decide for himself whether he wants to allow those checks. Furthermore, it might also be desirable to check connectivity in case where no object database is given at all, e.g. in case where a fully connected pack file is expected. Add a flag `verify` to `git_indexer_options` to enable additional verification checks. Also avoid querying the ODB in case none is given to allow users to enable checks when they do not have an ODB.

c16556aa

2017-11-12T10:31:48

indexer: introduce options struct to `git_indexer_new` We strive to keep an options structure to many functions to be able to extend options in the future without breaking the API. `git_indexer_new` doesn't have one right now, but we want to be able to add an option for enabling strict packfile verification. Add a new `git_indexer_options` structure and adjust callers to use that.

a616fb16

2017-10-13T13:53:05

indexer: check pack file connectivity When passing `--strict` to `git-unpack-objects`, core git will verify the pack file that is currently being read. In addition to the typical checksum verification, this will especially cause it to verify object connectivity of the received pack file. So it checks, for every received object, if all the objects it references are either part of the local object database or part of the pack file. In libgit2, we currently have no such mechanism, which leaves us unable to verify received pack files prior to writing them into our local object database. This commit introduces the concept of `expected_oids` to the indexer. When pack file verification is turned on by a new flag, the indexer will try to parse each received object first. If the object has any links to other objects, it will check if those links are already satisfied by known objects either part of the object database or objects it has already seen as part of that pack file. If not, it will add them to the list of `expected_oids`. Furthermore, the indexer will remove the current object from the `expected_oids` if it is currently being expected. Like this, we are able to verify whether all object links are being satisfied. As soon as we hit the end of the object stream and have resolved all objects as well as deltified objects, we assert that `expected_oids` is in fact empty. This should always be the case for a valid pack file with full connectivity.
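
The bookkeeping described above can be sketched in a few lines. This is a simplified model under stated assumptions, not the indexer's code: objects are identified by small integers below 64 instead of OIDs, the sets are bitmasks rather than an OID map, and `struct object`, `struct checker`, `checker_add` and `checker_is_connected` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LINKS 4

/* Hypothetical object: a small integer id plus the ids it references. */
struct object {
	unsigned id;
	unsigned links[MAX_LINKS];
	size_t nlinks;
};

struct checker {
	uint64_t odb;      /* ids already in the object database */
	uint64_t seen;     /* ids received so far in this pack */
	uint64_t expected; /* ids referenced but not yet satisfied */
};

static uint64_t bit(unsigned id)
{
	assert(id < 64);
	return UINT64_C(1) << id;
}

/* Process one received object. */
static void checker_add(struct checker *c, const struct object *obj)
{
	size_t i;

	/* Any link not satisfied by the ODB or the pack so far is expected. */
	for (i = 0; i < obj->nlinks; i++) {
		uint64_t link = bit(obj->links[i]);

		if (!((c->odb | c->seen) & link))
			c->expected |= link;
	}

	/* The object itself has now arrived, so it is no longer expected. */
	c->seen |= bit(obj->id);
	c->expected &= ~bit(obj->id);
}

/* At the end of the stream, the pack is connected if nothing is expected. */
static int checker_is_connected(const struct checker *c)
{
	return c->expected == 0;
}
```

Objects may arrive in any order: an object referenced before it is received sits in the expected set until it shows up, and one that never shows up leaves the set non-empty at the end of the stream.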

be41c384

2017-11-12T09:25:49

indexer: extract function reading stream objects The loop inside of `git_indexer_append` iterates over every object that is to be stored as part of the index. While the logic to retrieve every object from the packfile stream is rather involved, it is currently just part of the loop, making it unnecessarily hard to follow. Move the logic into its own function `read_stream_object`, which unpacks a single object from the stream. Note that there is some subtlety here involving the special error `GIT_EBUFS`, which indicates to the indexer that no more data is currently available. So instead of returning an error and aborting the whole loop in that case, we do have to catch that value and return successfully to wait for more data to be read.

6568f374

2017-10-11T13:20:19

indexer: remove useless local variable The `processed` variable local to `git_indexer_append` counts how many objects have already been processed. But actually, whenever it gets assigned to, we are also assigning the same value to the `stats->indexed_objects` struct member. So in fact, it is quite useless due to always having the same value as the `indexed_objects` member and makes it a bit harder to understand the code. We can just remove the variable to fix that.

ecf4f33a

2018-02-08T11:14:48

Convert usage of `git_buf_free` to new `git_buf_dispose`

c8ee5270

2017-12-08T09:05:58

pack: rename `git_packfile_stream_free` The function `git_packfile_stream_free` frees all state of the packfile stream without freeing the structure itself. This naming makes it hard to spot whether it will try to free the pointer itself or not, causing potential future errors. Due to this reason, we have decided to name a function freeing state without freeing the actual structure a "dispose" function. Rename `git_packfile_stream_free` to `git_packfile_stream_dispose` as a first example following this rule.
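
A minimal sketch of the naming rule, assuming a hypothetical `struct stream` rather than libgit2's `git_packfile_stream`: the "dispose" function releases what the structure owns, while the "free" function also releases the structure itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stream type; not libgit2's `git_packfile_stream`. */
struct stream {
	char *buffer;
	size_t len;
};

static int stream_init(struct stream *s, size_t len)
{
	s->buffer = calloc(1, len);
	s->len = s->buffer ? len : 0;
	return s->buffer ? 0 : -1;
}

/*
 * "dispose": release what the struct owns but not the struct itself,
 * so it is safe for stack-allocated or embedded instances.
 */
static void stream_dispose(struct stream *s)
{
	free(s->buffer);
	memset(s, 0, sizeof(*s));
}

/* "free": dispose, then release the struct's own heap storage. */
static void stream_free(struct stream *s)
{
	if (!s)
		return;

	stream_dispose(s);
	free(s);
}
```

With this split, a caller holding a `struct stream` on the stack calls `stream_dispose`, and only a caller that allocated the structure with `malloc` calls `stream_free`.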

619f61a8

2018-02-01T06:22:36

odb: error when we can't create object header Return an error to the caller when we can't create an object header for some reason (printf failure) instead of simply asserting.

c3514b0b

2017-12-23T14:59:07

Fix unpack double free If an element has been cached, but then the call to packfile_unpack_compressed() fails, the very next thing that happens is that its data is freed and then the element is not removed from the cache, which frees the data again. This change sets obj->data to NULL to avoid the double-free. It also stops trying to resolve deltas after two continuous failed rounds of resolution, and adds a test for this.

c8aaba24

2017-12-06T03:03:18

libFuzzer: Fix missing trailer crash This change fixes an invalid memory access when the trailer is missing / corrupt. Found using libFuzzer.

400caed3

2017-12-06T03:22:58

libFuzzer: Fix a git_packfile_stream leak This change ensures that the git_packfile_stream object in git_indexer_append() does not leak when the stream has errors. Found using libFuzzer.

0c7f49dd

2017-06-30T13:39:01

Make sure to always include "common.h" first Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.

6f960b55

2017-06-11T10:37:46

Merge pull request #4088 from chescock/packfile-name-using-complete-hash Ensure packfiles with different contents have different names

6c23704d

2017-06-08T21:40:18

settings: rename `GIT_OPT_ENABLE_SYNCHRONOUS_OBJECT_CREATION` Initially, the setting has been solely used to enable the use of `fsync()` when creating objects. Since then, the use has been extended to also cover references and index files. As the option is not yet part of any release, we can still correct this by renaming the option to something more sensible, indicating not only correlation to objects. This commit renames the option to `GIT_OPT_ENABLE_FSYNC_GITDIR`. We also move the variable from the object to repository source code.

c0e54155

2017-01-11T10:39:59

indexer: name pack files after trailer hash Upstream git.git has changed the way how packfiles are named. Previously, they were using a hash of the contained object's OIDs, which has then been changed to use the hash of the complete packfile instead. See 1190a1acf (pack-objects: name pack files after trailer hash, 2013-12-05) in the git.git repository for more information on this change. This commit changes our logic to match the behavior of core git.

1c04a96b

2017-02-28T12:29:29

Honor `core.fsyncObjectFiles`

2a5ad7d0

2017-02-17T16:42:40

fsync: call it "synchronous" object writing Rename `GIT_OPT_ENABLE_SYNCHRONIZED_OBJECT_CREATION` -> `GIT_OPT_ENABLE_SYNCHRONOUS_OBJECT_CREATION`.

1229e1c4

2017-02-17T16:36:53

fsync parent directories when fsyncing When fsync'ing files, fsync the parent directory in the case where we rename a file into place, or create a new file, to ensure that the directory entry is flushed correctly.

1c2c0ae2

2016-12-14T12:51:40

packbuilder: honor git_object__synchronized_writing Honor `git_object__synchronized_writing` when creating a packfile and corresponding index.

0d716905

2017-01-27T15:23:15

oidmap: remove GIT__USE_OIDMAP macro

85d2748c

2017-01-27T14:05:10

khash: avoid using `kh_key`/`kh_val` as lvalue

f31cb45a

2017-01-25T15:31:12

khash: avoid using `kh_put` directly

cb18386f

2017-01-25T14:26:58

khash: avoid using `kh_val`/`kh_value` directly

036daa59

2017-01-25T14:11:42

khash: use `git_map_exists` where applicable

9694d9ba

2017-01-25T14:09:17

khash: avoid using `kh_foreach`/`kh_foreach_value` directly

048c5ea7

2017-01-21T23:55:21

Merge pull request #4053 from chescock/extend-packfile-by-pages Extend packfile in increments of page_size.

87b7a705

2017-01-21T15:44:57

indexer: avoid warning about `idx->pack` It must be non-NULL to have a valid `git_indexer`.

bf339ab0

2017-01-21T14:51:31

indexer: introduce `git_packfile_close` Encapsulation!

52949c80

2017-01-21T18:30:12

Merge branch 'pr/4060'

d030bba9

2017-01-21T17:15:33

indexer: only delete temp file if it was unused Only try to `unlink` our temp file when we know that we didn't copy it into its permanent location.

f5586f5c

2017-01-14T16:37:00

Addressed review feedback

96df833b

2017-01-03T19:15:09

Close the file before unlinking I forgot that Windows chokes while trying to delete open files.

db535d0a

2017-01-01T12:45:02

Delete temporary packfile in indexer This change deletes the temporary packfile that the indexer creates to avoid littering the pack/ directory with garbage.

c7a1535f

2016-12-29T11:47:52

Extend packfile in increments of page_size. This improves performance by reducing the number of I/O operations.

909d5494

2016-12-29T12:25:15

giterr_set: consistent error messages Error messages should be sentence fragments, and therefore: 1. Should not begin with a capital letter, 2. Should not conclude with punctuation, and 3. Should not end a sentence and begin a new one

d53cc13e

2016-03-31T04:12:46

Merge pull request #3575 from pmq20/master-13jan16 Remove duplicated calls to git_mwindow_close

e50a49ee

2016-03-22T01:54:49

Merge pull request #3559 from yongthecoder/master Add a sanity check in git_indexer_commit to avoid subtraction overflow.

87c18197

2016-03-16T19:05:11

Split the page size from the mmap alignment While often similar, these are not the same on Windows. We want to use the page size on Windows for the pools, but for mmap we need to use the allocation granularity as the alignment. On the other platforms these values remain the same.

d4e4f272

2016-01-13T11:07:14

Remove duplicated calls to git_mwindow_close

b3eb2cde

2015-12-24T10:04:44

Avoid subtraction overflow in git_indexer_commit

c369b379

2015-07-31T16:23:11

Remove extra semicolon outside of a function Without this change, compiling with gcc and pedantic generates warning: ISO C does not allow extra ‘;’ outside of a function.

3e8c5e45

2015-06-10T16:43:48

Merge pull request #3174 from libgit2/cmn/idx-fill-hole indexer: use lseek to extend the packfile

02980bdc

2015-06-09T16:53:07

Initialize a few variables Coverity complains about the git_rawobj ones because we use a loop in which we keep remembering the old version, and we end up copying our object as the base, so we want to have the data pointer be NULL.

aa57231f

2015-06-02T10:25:22

indexer: use lseek to extend the packfile We've been using `p_ftruncate()` to extend the packfile in order to mmap it and write the new data into it. This works well in the general case, but as truncation does not allocate space in the filesystem, it must do so when we write data to it. The only way the OS has to indicate a failure to allocate space is via SIGBUS which means we tried to write outside the file. This will cause everyone to crash as they don't expect to handle this signal. Switch to using `p_lseek()` and `p_write()` to extend the file in a way which tells the filesystem to allocate the space for the missing data. We can then be sure that we have space to write into.

e2dd3735

2015-05-22T11:20:47

indexer: avoid loading already existent bases When thickening a pack, avoid loading already loaded bases and trying to insert them all over again.

7800048a

2015-03-17T10:06:50

Merge pull request #2972 from libgit2/cmn/pack-objects-walk [WIP] Smarter pack-building

7c63a33f

2015-03-13T19:41:40

indexer: bring back the error message on duplicate commits It turns out that erroring out on duplicate commits is the right thing to do, but git was not hitting the bug on the server-side. Bring back a descriptive error message in case of duplicate entries and error out.

dccf59ad

2015-03-13T18:28:07

indexer: don't worry about duplicate objects If a packfile includes duplicate objects, we can choose to use the second copy instead of the first by using the same logic as if it were the first. Change the error condition from 0 to -1, which indicates a bad resize, and set the OOM message in that case. This does mean we will leak the first copy of the object. We can deal with that later, but making fetches work is more important.

a34692c4

2015-03-13T18:00:15

indexer: set an error message on duplicate objects in pack While this is not even close to a fix, we can at least set an error message so we know which error we are facing. Up to now we just returned an error without a message.

b63b76e0

2014-10-12T11:42:31

Reorder some khash declarations Keep the definitions in the headers, while putting the declarations in the C files. Putting the function definitions in headers causes them to be duplicated if you include two headers with them.

c251f3bb

2014-12-08T16:05:47

win32: remember to cleanup our hash_ctx

ec7e680c

2014-11-20T12:07:55

Fix for misleading "missing delta bases" error - Fix #2721.

7561f98d

2014-11-19T14:54:30

Fix for memory leak issue in indexer.c, that surfaces on windows

177a29d8

2014-10-27T10:39:45

Merge commit 'refs/pull/2366/head' of github.com:libgit2/libgit2

01b432cf

2014-07-09T14:12:30

Properly report failure when expanding a packfile

bc8a0886

2014-06-27T11:51:35

Fix assert when receiving uncommon sideband packet

b3b66c57

2014-06-18T17:13:12

Share packs across repository instances Opening the same repository multiple times will currently open the same file multiple times, as well as map the same region of the file multiple times. This is not necessary, as the packfile data is immutable. Instead of opening and closing packfiles directly, introduce an indirection and allocate packfiles globally. This does mean locking on each packfile open, but we already use this lock for the global mwindow list so it doesn't introduce a new contention point.

62e562f9

2014-05-18T07:54:41

Fix compiler warning (git_off_t cast to size_t). Use size_t for page size, instead of long. Check result of sysconf. Use size_t for page offset so no cast to size_t (second arg to p_mmap). Use mod instead div/mult pair, so no cast to size_t is necessary.

9c4feef9

2014-05-17T12:44:21

Fix warning on uninitialized variable.

0731a5b4

2014-05-14T19:12:48

indexer: mmap fixes for Windows Windows has its own ftruncate() called _chsize_s(). p_mkstemp() is changed to use p_open() so we can make sure we open for writing; the addition of exclusive create is a good thing to do regardless, as we want a temporary path for ourselves. Lastly, MSVC doesn't quite know how to add two numbers if one of them is a void pointer, so let's alias it to unsigned char.

f7310540

2014-05-13T02:41:48

indexer: use mmap for writing Some OSs cannot keep their ideas about file content straight when mixing standard IO with file mapping. As we use mmap for reading from the packfile, let's make writing to the pack file use mmap.

b3f27c43

2014-05-13T21:08:50

Initialize local variable

2dde1e0c

2014-05-08T22:31:59

indexer: avoid memory moves Our vector does a move of the rest of the array when we remove an item. Doing this repeatedly can be expensive, and we do this a lot in the indexer. Instead, set the value to NULL and skip those entries. perf reported around 30% of `index-pack` time was going into memmove. With this change, that goes away and we spent most of the time hashing and inflating data.
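
The idea of clearing a slot instead of removing it can be sketched on a plain pointer array. This is an illustration only: `retire` and `count_live` are hypothetical helpers, and the array stands in for the vector type the commit refers to.

```c
#include <assert.h>
#include <stddef.h>

/* Retire entry `i` in O(1): the tail is not shifted, so no memmove. */
static void retire(void **items, size_t i)
{
	assert(items);
	items[i] = NULL;
}

/* Walk the array, skipping retired (NULL) slots; returns the live count. */
static size_t count_live(void *const *items, size_t len)
{
	size_t i, live = 0;

	for (i = 0; i < len; i++) {
		if (items[i] == NULL)
			continue;
		live++;
	}

	return live;
}
```

The cost is that every later traversal has to check for `NULL`, which is cheap compared with shifting the rest of the array on each removal.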

thodg/libgit2/src/indexer.c
