|
86b85492
|
2020-06-03T15:40:37
|
|
Merge pull request #5537 from libgit2/ethomson/clar_exactmatch
tests: offer exact name matching with a `$` suffix
|
|
53a8f463
|
2020-06-03T07:40:59
|
|
Merge pull request #5536 from libgit2/ethomson/http
httpclient: support googlesource
|
|
6de8aa7f
|
2020-06-02T12:21:22
|
|
Merge pull request #5532 from joshtriplett/pack-default-path
git_packbuilder_write: Allow setting path to NULL to use the default path
|
|
22f9a0fc
|
2020-06-02T12:12:41
|
|
Merge pull request #5531 from joshtriplett/mempack-threads
mempack: Use threads when building the pack
|
|
0d3ce2ac
|
2020-06-02T10:23:41
|
|
offer exact name matching with a `$` suffix
When using `-s` to specify a particular test, clar does a prefix match.
Thus, `-sapply::both::rename_a_to_b_to_c` will match both a test named
`test_apply_both__rename_a_to_b_to_c` and a test that begins with that
name, like `test_apply_both__rename_a_to_b_to_c_exact`.
Permit a trailing `$` to `-s` syntax. This allows a user to specify
`-sapply::both::rename_a_to_b_to_c$` to match _only_ the
`test_apply_both__rename_a_to_b_to_c` function.
We already filter to ensure that the given prefix matches the current
test name. Also ensure that the length of the test name matches the
length of the filter, sans trailing `$`.
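A minimal sketch of the matching rule, assuming the test name and the filter have already been normalized to the same spelling. `filter_matches` is a hypothetical helper written for illustration; it is not clar's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: returns true when `name` is selected by `filter`.
 * A plain filter is a prefix match; a filter ending in `$` must match
 * the whole name.
 */
bool filter_matches(const char *name, const char *filter)
{
	size_t filter_len, name_len;
	bool exact = false;

	assert(name != NULL && filter != NULL);

	filter_len = strlen(filter);
	name_len = strlen(name);

	/* A trailing `$` requests an exact match and is not part of the name. */
	if (filter_len > 0 && filter[filter_len - 1] == '$') {
		exact = true;
		filter_len--;
	}

	/* The existing check: the filter must be a prefix of the name. */
	if (strncmp(name, filter, filter_len) != 0)
		return false;

	/* The additional check: for exact matches the lengths must agree. */
	return !exact || name_len == filter_len;
}
```

With this rule, `rename_a_to_b_to_c$` selects only the test of that exact name, while the same filter without `$` also selects `rename_a_to_b_to_c_exact`.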
|
|
d4b953f8
|
2020-06-02T09:26:11
|
|
Merge pull request #5528 from libgit2/ethomson/clar_internal
clar: use internal functions instead of /bin/cp and /bin/rm
|
|
ee9e9163
|
2020-05-23T15:56:29
|
|
clar: remove files internally instead of /bin/rm
Similar to how clar has used `/bin/cp` to copy files, it's used
`/bin/rm` to remove them. This has similar deficiencies: the leak
detector output is noisy and it's slow. Move it to an internal function.
|
|
d03fd331
|
2020-05-23T15:42:51
|
|
clar: copy files with sendfile on linux
|
|
8df4f519
|
2020-05-23T15:04:54
|
|
clar: copy files internally instead of /bin/cp
clar has historically shelled out to `/bin/cp` to copy test fixtures
into a sandbox. This has two deficiencies:
1. It's slower than simply opening the source and destination and
copying them in a read/write loop. On my Mac, the `/bin/cp` based
approach takes ~2:40 for a full test pass. Using a read/write loop
to copy the files ourselves takes ~1:50.
2. It's noisy. Since the leak detector follows fork/exec, we'll end up
running the leak detector on `/bin/cp`. This would be fine, except
that the leak detector spams the console on startup and shutdown, so
it adds a _lot_ of additional information to the test runs that is
useless. By not forking and using this internal system, we see much
less output.
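The read/write loop mentioned above could look roughly like the following on a POSIX system. `copy_file` is a hypothetical name and the buffer size is an arbitrary assumption; this is a sketch of the technique, not the function that clar uses.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy `src` to `dst` with a plain read/write loop,
 * without forking. Returns 0 on success, -1 on failure.
 */
int copy_file(const char *src, const char *dst)
{
	char buf[65536];
	ssize_t nread;
	int in = -1, out = -1, ret = -1;

	assert(src != NULL && dst != NULL);

	if ((in = open(src, O_RDONLY)) < 0)
		goto done;
	if ((out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644)) < 0)
		goto done;

	while ((nread = read(in, buf, sizeof(buf))) != 0) {
		char *p = buf;

		if (nread < 0) {
			if (errno == EINTR)
				continue;
			goto done;
		}

		/* write() may be partial, so loop until the chunk is flushed. */
		while (nread > 0) {
			ssize_t nwritten = write(out, p, (size_t)nread);

			if (nwritten < 0) {
				if (errno == EINTR)
					continue;
				goto done;
			}
			p += nwritten;
			nread -= nwritten;
		}
	}

	ret = 0;

done:
	if (in >= 0)
		close(in);
	if (out >= 0)
		close(out);
	return ret;
}
```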
|
|
2a2c5b40
|
2020-05-23T15:57:48
|
|
clar: remove unused shell_out function
|
|
849f371e
|
2020-06-02T00:29:34
|
|
Merge pull request #5535 from libgit2/ethomson/strarray
strarray refactoring
|
|
04c7bdb4
|
2020-06-01T22:44:14
|
|
httpclient: clear the read_buf on new requests
The httpclient implementation keeps a `read_buf` that holds the data
in the body of the response after the headers have been written. We
store that data for subsequent calls to `git_http_client_read_body`. If
we want to stop reading body data and send another request, we need to
clear that cached data.
Clear the cached body data on new requests, just like we read any
outstanding data from the socket.
|
|
aa8b2c0f
|
2020-06-01T23:53:55
|
|
httpclient: don't read more than the client wants
When `git_http_client_read_body` is invoked, it provides the size of the
buffer that can be read into. This will be set as the parser context's
`output_size` member. Use this as an upper limit on our reads, and
ensure that we do not read more than the client requests.
|
|
a9746b30
|
2020-05-29T11:21:55
|
|
strarray: move to its own file
|
|
5eb48a14
|
2020-05-29T13:17:39
|
|
strarray: deprecate git_strarray_copy
We should not be in the business of copying strings around for users.
We either return a strarray that can be freed, or we take one (and do
not mutate it).
|
|
51eff5a5
|
2020-05-29T13:13:19
|
|
strarray: we should `dispose` instead of `free`
We _dispose_ the contents of objects; we _free_ objects (and their
contents). Update `git_strarray_free` to be `git_strarray_dispose`.
`git_strarray_free` remains as a deprecated proxy function.
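The dispose/free distinction can be illustrated with a stand-in type. `str_array`, `str_array_dispose` and `str_array_free` below are hypothetical names made up for this sketch; they are not libgit2's definitions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string array. */
typedef struct {
	char **strings;
	size_t count;
} str_array;

/* Dispose: release the contents, leave the caller-owned struct alone. */
void str_array_dispose(str_array *array)
{
	size_t i;

	if (array == NULL)
		return;

	/* An empty array is the only one allowed to have no storage. */
	assert(array->strings != NULL || array->count == 0);

	for (i = 0; i < array->count; i++)
		free(array->strings[i]);
	free(array->strings);

	/* Reset so that the struct can be reused or disposed again. */
	memset(array, 0, sizeof(*array));
}

/* Free: release the contents and the heap-allocated struct itself. */
void str_array_free(str_array *array)
{
	str_array_dispose(array);
	free(array);
}
```

A struct that lives on the caller's stack is disposed; only a struct that was itself heap-allocated would be freed.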
|
|
570f0340
|
2020-06-01T19:10:38
|
|
httpclient: read_body should return 0 at EOF
When users call `git_http_client_read_body`, it should return 0 at the
end of a message. When the `on_message_complete` callback is called,
this will set `client->state` to `DONE`. In our read loop, we look for
this condition and exit.
Without this, when there is no data left except the end of message chunk
(`0\r\n`) in the http stream, we would block by reading the three bytes
off the stream but not making progress in any `on_body` callbacks.
Listening to the `on_message_complete` callback allows us to stop trying
to read from the socket when we've read the end of message chunk.
|
|
b7bdb071
|
2020-05-30T15:21:48
|
|
online::clone: test a googlesource URL
Google Git (googlesource.com) behaves differently than git proper.
Test that we can communicate with it.
|
|
629515a8
|
2020-06-01T15:06:29
|
|
Merge pull request #5481 from pks-t/pks/cmake-cleanups
CMake cleanups
|
|
17641f1f
|
2020-06-01T15:05:51
|
|
Merge pull request #5526 from libgit2/ethomson/poolinit
git_pool_init: allow the function to fail
|
|
0f35efeb
|
2020-05-23T10:15:51
|
|
git_pool_init: handle failure cases
Propagate failures caused by pool initialization errors.
|
|
511fb9e6
|
2020-04-03T22:53:23
|
|
cmake: always disable deprecation-sync warnings
We currently disable deprecation synchronization warnings in case we're
building with Clang. We check for Clang by doing a string comparison on
the compiler identification, but this seems to have been broken by an
update in macOS' image as the compiler ID has changed to "AppleClang".
Let's just unconditionally disable this warning on Unix platforms. We
never add the deprecated attribute anyway, so the warning doesn't help
us at all.
|
|
3956679c
|
2020-04-03T20:08:02
|
|
cmake: remove policies
The `CMAKE_MINIMUM_REQUIRED()` function not only sets up the minimum
required CMake version of a project, but it will also at the same time
set the CMake policy version. In effect this means that all policies
that have been introduced before the minimum CMake version will be
enabled automatically.
When updating our minimum required version in ebabb88f2 (cmake: update
minimum CMake version to v3.5.1, 2019-10-10), we didn't remove any of
the policies we've been manually enabling. The newest CMake policy we've
been enabling is CMP0054, which was introduced back in CMake v3.1. As a
result, we can now just remove all manual calls to `CMAKE_POLICY()`.
|
|
2e7d4579
|
2020-04-03T19:59:39
|
|
cmake: remove option to add profiling flags
We currently have an option that adds options for profiling to both our
CFLAGS and LDFLAGS. Having such flags behind various build options is
not really sensible at all, since users should instead set up those
flags via environment variables supported by CMake itself.
Let's remove this option.
|
|
2551b1b0
|
2020-04-03T19:53:35
|
|
cmake: remove support for creating tags
We currently have support for generating tags via ctags as part of our
build system. We aren't really in the place of supporting any tooling
that exists apart from the actual build environment, as doing so adds
additional complexity and maintenance burden to our build instructions.
This is in fact nicely demonstrated by this particular option, as it
hasn't been working anymore since commit e5c9723d0 (cmake: move library
build instructions into subdirectory, 2017-06-30).
As a result, this commit removes support for building CTags.
|
|
bc02bcd9
|
2020-04-03T19:51:22
|
|
cmake: move modules into the "cmake/" top level dir
Our custom CMake modules currently live in "cmake/Modules". As the
"cmake/" directory doesn't contain anything except the "Modules"
directory, it doesn't really make sense to have the additional
intermediate directory. So let's instead move the modules one level up
into the "cmake/" top level directory.
|
|
172a2886
|
2020-06-01T14:04:15
|
|
Merge pull request #5529 from libgit2/ethomson/difftest
diff::workdir: actually test the buffers
|
|
1bbdf15d
|
2020-06-01T13:57:12
|
|
Merge pull request #5527 from libgit2/ethomson/config_unreadable
Handle unreadable configuration files
|
|
9df69223
|
2020-05-23T11:42:19
|
|
config: test that unreadable files are treated as notfound
|
|
d1409f48
|
2020-05-06T19:57:07
|
|
config: ignore unreadable configuration files
Modified `config_file_open()` so it returns 0 if the config file is
not readable, which happens on global config files under macOS
sandboxing (note that for some reason `access(F_OK)` DOES work with
sandboxing, but it is lying). Without this read check sandboxed
applications on macOS can not open any repository, because
`config_file_read()` will return GIT_ERROR when it cannot read the
global /Users/username/.gitconfig file, and the upper layers will
just completely abort on GIT_ERROR when attempting to load the
global config file, so no repositories can be opened.
|
|
89ddd0fc
|
2020-06-01T13:19:01
|
|
Merge pull request #5533 from pjw91/fix-index-write
Make git_index_write() generate valid v4 index
|
|
1a899008
|
2020-05-26T20:36:13
|
|
tests: index::version: write v4 index: re-open repo to read written v4 index
`git_index_free()` merely decrements the reference counter from 2 to
1, and does not "free" the index.
Thus, the following `git_repository_index()` merely increases the counter
to 2, instead of reading the index from disk.
The written index is not read and parsed, which effectively makes this
test case a no-op.
|
|
8c96d56d
|
2020-05-26T04:53:09
|
|
index: write v4: bugfix: prefix path with strip_len, not same_len
According to index-format.txt of git, the path of an entry is prefixed
with N, where N indicates the length of bytes to be stripped.
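To make the difference between the two lengths concrete, here is a small sketch of the prefix compression. `path_prefix_compress` is a hypothetical helper, not the function in libgit2's index writer; it only shows which of the two numbers ends up as N.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: given the previous and current entry paths, compute
 * the number of bytes to strip from the end of `prev` and the suffix that
 * must be appended to rebuild `cur`.
 */
void path_prefix_compress(const char *prev, const char *cur,
	size_t *strip_len, const char **suffix)
{
	size_t same_len = 0;

	assert(prev != NULL && cur != NULL);
	assert(strip_len != NULL && suffix != NULL);

	/* Length of the prefix shared by both paths. */
	while (prev[same_len] != '\0' && prev[same_len] == cur[same_len])
		same_len++;

	/* N is what gets removed from `prev`, not what is kept. */
	*strip_len = strlen(prev) - same_len;
	*suffix = cur + same_len;
}
```

For `src/index.c` followed by `src/index.h`, the shared prefix is 10 bytes long but the value to write is 1, followed by the suffix `h`.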
|
|
5278a006
|
2020-05-23T16:07:54
|
|
git_packbuilder_write: Allow setting path to NULL to use the default path
If given a NULL path, write to the object path of the repository.
Add tests for the new behavior.
|
|
0bc091dd
|
2020-05-23T15:35:38
|
|
git_packbuilder_write: Unify cleanup path
Clean up and return via a single label, to avoid duplicate error
handling before each return, and to make it easier to extend the set of
cleanups needed.
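The single-label cleanup pattern looks roughly like this in C. `count_lines` is a made-up example function that has nothing to do with the packbuilder; it only shows how every exit funnels through one label.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical example: count the newlines in a file. Every exit goes
 * through the `cleanup` label, so adding another resource only needs one
 * more line there instead of one per early return.
 */
int count_lines(const char *path, size_t *out)
{
	FILE *fp = NULL;
	char *buf = NULL;
	size_t nread, i, lines = 0;
	int error = -1;

	assert(path != NULL && out != NULL);

	if ((fp = fopen(path, "rb")) == NULL)
		goto cleanup;
	if ((buf = malloc(4096)) == NULL)
		goto cleanup;

	while ((nread = fread(buf, 1, 4096, fp)) > 0) {
		for (i = 0; i < nread; i++) {
			if (buf[i] == '\n')
				lines++;
		}
	}

	if (ferror(fp))
		goto cleanup;

	*out = lines;
	error = 0;

cleanup:
	/* free(NULL) is a no-op, so no guard is needed for `buf`. */
	free(buf);
	if (fp != NULL)
		fclose(fp);
	return error;
}
```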
|
|
30285a3c
|
2020-05-23T15:04:19
|
|
mempack: Use threads when building the pack
The mempack ODB backend creates a packbuilder internally to write out a
pack; call git_packbuilder_set_threads on that packbuilder, to use
threads for packing if available.
|
|
3414d470
|
2020-05-23T16:27:56
|
|
diff::workdir: actually test the buffers
The static test data is erroneously initialized with a length of 0 for
three of the strings. This means the tests are not actually examining
those strings. Provide the length.
|
|
27cb4e0e
|
2020-05-23T11:02:07
|
|
Merge pull request #5522 from pks-t/pks/openssl-cert-memleak
OpenSSL certificate memory leak
|
|
abfdb8a6
|
2020-05-23T10:15:37
|
|
git_pool_init: return an int
Let `git_pool_init` return an int so that it could fail.
|
|
e4bdba56
|
2020-05-23T09:57:22
|
|
Merge pull request #5515 from pks-t/pks/flaky-checkout-test
tests: checkout: fix flaky test due to mtime race
|
|
3b7b4d27
|
2020-05-23T09:40:55
|
|
Merge pull request #5523 from libgit2/pks/cmake-sort-reproducible-builds
cmake: Sort source files for reproducible builds
|
|
f88e12db
|
2020-05-23T09:35:53
|
|
checkout::index: free the index
|
|
3f201f75
|
2020-05-16T13:48:04
|
|
checkout: fix file being treated as unmodified due to racy index
When trying to determine whether a file changed, we try to avoid heavy
operations by first taking a look at the index, seeing whether the index
entry is modified already. This doesn't seem to cut it, though, as we
currently have the racy checkout::index::can_disable_pathspec_match test
case: sometimes the files get restored to their original contents,
sometimes they aren't.
The issue is caused by a racy index [1]: in case we modify a file, add
it to the index and then modify it again in-place without changing its
size, then we may end up with a modified file that has the same stat(3P)
info as we've currently got it in its corresponding index entry. The
mitigation for this is to treat files with the same mtime as the index
as racily modified. We already have this logic in place for
the index, but not when doing a checkout.
Fix the issue by only consulting the index entry in case it has an older
mtime than the index. Previously, the following script reliably had at
least 20 failures, while now there is no failure to be observed anymore:
```bash
j=0
for i in $(seq 100)
do
	if ! ./libgit2_clar -scheckout::index::can_disable_pathspec_match >/dev/null
	then
		j=$(($j + 1))
	fi
done
echo "Failures: $j"
```
[1]: https://git-scm.com/docs/racy-git
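The racy-index rule described in this commit can be sketched as follows. `file_stamp` and `is_unmodified_per_index` are hypothetical names, and the stat data is reduced to mtime and size for brevity; the real checkout code compares more fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical, reduced view of the stat data kept for a file. */
typedef struct {
	time_t mtime;
	long size;
} file_stamp;

/*
 * Returns true only when the cached index entry is enough to conclude that
 * the working directory file is unmodified. A false result means the caller
 * has to fall back to comparing file contents.
 */
bool is_unmodified_per_index(const file_stamp *entry,
	const file_stamp *workdir, time_t index_mtime)
{
	assert(entry != NULL && workdir != NULL);

	/* Racy entry: not older than the index itself, so do not trust it. */
	if (entry->mtime >= index_mtime)
		return false;

	return entry->mtime == workdir->mtime && entry->size == workdir->size;
}
```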
|
|
915f8860
|
2020-05-16T14:00:11
|
|
tests: checkout: fix stylistic issues and static variable
The test case checkout::index::can_disable_pathspec_match has some
shortcomings when it comes to coding style, which didn't fit our own
coding style. Furthermore, it had an unnecessary static local variable.
The test has been refactored to address these issues.
|
|
b85eefb4
|
2020-05-15T19:52:40
|
|
cmake: Sort source files for reproducible builds
We currently use `FILE(GLOB ...)` in most places to find source and
header files. This is problematic in that the order of files returned
depends on the operating system's directory iteration order and may thus
not be deterministic. As a result, we link object files in unspecified
order, which may cause the linker to emit different code across runs.
Fix this issue by sorting all code used as input to the libgit2 library
to improve the reliability of reproducible builds.
|
|
b43a9e66
|
2020-05-15T17:46:24
|
|
streams: openssl: fix memleak due to us not free'ing certs
When creating a `git_cert` from the OpenSSL X509 certificate of a given
stream, we do not call `X509_free()` on the certificate, leading to a
memory leak as soon as the certificate is requested e.g. by the
certificate check callback.
Fix the issue by properly calling `X509_free()`.
|
|
b7b872f5
|
2020-05-12T22:39:27
|
|
Merge pull request #5517 from libgit2/pks/futils-symlink-args
futils: fix order of declared parameters for `git_futils_fake_symlink`
|
|
a2eca682
|
2020-05-12T21:35:07
|
|
futils: fix order of declared parameters for `git_futils_fake_symlink`
While the function `git_futils_fake_symlink` is declared with arguments
`new, old`, the implementation uses the reverse order `old, new`. Let's
fix the ordering issues to be `new, old` for both, which matches what
symlink(3P) has. While at it, we also rename these parameters: `old` and
`new` don't really make a lot of sense in the context of symlinks,
which is why this commit renames them to be called `target` and `path`.
|
|
3f90fcd6
|
2020-05-12T21:22:48
|
|
Merge pull request #5516 from suhaibmujahid/update-release
Check the version in package.json
|
|
f1c1458c
|
2020-05-12T10:55:14
|
|
feat: Check the version in package.json
|
|
896abfc8
|
2020-05-12T11:14:10
|
|
Merge pull request #5513 from libgit2/pks/tests-fix-32-bit-formatter
tests: merge: fix printf formatter on 32 bit arches
|
|
0cf9b666
|
2020-05-12T11:41:44
|
|
tests: merge: fix printf formatter on 32 bit arches
We currently use `PRIuMAX` to print an integer of type `size_t` in
merge::trees::rename::cache_recomputation. While this works just fine on
64 bit arches, it doesn't on 32 bit ones. As a result, our nightly
builds on x86 and arm32 fail.
Fix the issue by using `PRIuZ` instead.
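`PRIuZ` is a libgit2-internal macro. In standard C the same mismatch can be avoided in two ways, shown in the sketch below; `format_size` and `format_size_z` are hypothetical helper names.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/*
 * `PRIuMAX` expects a `uintmax_t`. On typical 32-bit targets that type is
 * wider than `size_t`, so passing a `size_t` without a cast is undefined
 * behavior. Casting the argument makes the call correct everywhere.
 */
int format_size(char *out, size_t outlen, size_t value)
{
	assert(out != NULL);
	return snprintf(out, outlen, "%" PRIuMAX, (uintmax_t)value);
}

/* C99's `%zu` conversion takes a `size_t` directly. */
int format_size_z(char *out, size_t outlen, size_t value)
{
	assert(out != NULL);
	return snprintf(out, outlen, "%zu", value);
}
```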
|
|
51a2bc43
|
2020-05-12T08:22:31
|
|
Merge pull request #5511 from suhaibmujahid/patch-1
Update package.json
|
|
045efb7b
|
2020-05-11T21:20:52
|
|
Merge pull request #5509 from libgit2/ethomson/assert_macros
Introduce GIT_ASSERT macros
|
|
31ddf163
|
2020-05-11T21:06:42
|
|
Merge pull request #5512 from A-Ovchinnikov-mx/patch-1
README.md: Add instructions for building in MinGW environment
|
|
cbae1c21
|
2020-04-01T22:12:07
|
|
assert: allow non-int returning functions to assert
Include GIT_ASSERT_WITH_RETVAL and GIT_ASSERT_ARG_WITH_RETVAL so that
functions that do not return int (or more precisely, where `-1` would
not be an error code) can assert.
This allows functions that return, e.g., NULL on error to do that
by passing the return value (in this example, `NULL`) as a second
parameter to the GIT_ASSERT_WITH_RETVAL functions.
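A sketch of how such macros can be built. The `SOFT_ASSERT*` names, the `last_error` buffer and the `SOFT_ASSERT_HARD` switch are all hypothetical stand-ins; the real `GIT_ASSERT` macros use libgit2's own error reporting and may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical error slot; a real library would use its own error API. */
char last_error[128];

#if defined(SOFT_ASSERT_HARD) && SOFT_ASSERT_HARD
/* Hard mode: defer to the system assert(3), which aborts. */
# define SOFT_ASSERT_WITH_RETVAL(expr, retval) assert(expr)
#else
/* Soft mode: record the failed expression and return `retval`. */
# define SOFT_ASSERT_WITH_RETVAL(expr, retval) \
	do { \
		if (!(expr)) { \
			snprintf(last_error, sizeof(last_error), \
				"unrecoverable internal error: '%s'", #expr); \
			return (retval); \
		} \
	} while (0)
#endif

/* Convenience form for functions where -1 is the error code. */
#define SOFT_ASSERT(expr) SOFT_ASSERT_WITH_RETVAL(expr, -1)

/* Returns int, so the plain form works. */
int checked_length(const char *str, size_t *out)
{
	size_t len = 0;

	SOFT_ASSERT(str != NULL);
	SOFT_ASSERT(out != NULL);

	while (str[len] != '\0')
		len++;

	*out = len;
	return 0;
}

/* Returns a pointer, so failure is signalled with NULL instead of -1. */
const char *skip_chars(const char *str, size_t n)
{
	SOFT_ASSERT_WITH_RETVAL(str != NULL, NULL);

	while (n > 0 && *str != '\0') {
		str++;
		n--;
	}
	return str;
}
```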
|
|
a95096ba
|
2020-01-12T10:31:07
|
|
assert: optionally fall-back to assert(3)
Fall back to the system assert(3) in debug builds, which may aid
in debugging.
"Safe" assertions can be enabled in debug builds by setting
GIT_ASSERT_HARD=0. Similarly, hard assertions can be enabled in
release builds by setting GIT_ASSERT_HARD to nonzero.
|
|
abe2efe1
|
2019-12-09T12:37:34
|
|
Introduce GIT_ASSERT macros
Provide macros to replace usages of `assert`. A true `assert` is
punishing as a library. Instead we should do our best to not crash.
GIT_ASSERT_ARG(x) will now assert that the given argument complies with
some format and sets an error message and returns `-1` if it does not.
GIT_ASSERT(x) is for internal usage, and available as an internal
consistency check. It will set an error message and return `-1` in the
event of failure.
|
|
4ad36338
|
2020-05-11T19:10:11
|
|
Update README.md
Add instructions for building libgit2 in MinGW environment
|
|
3453c3b1
|
2020-05-11T05:14:35
|
|
Update package.json
|
|
b83bc6d4
|
2020-05-11T09:18:36
|
|
Merge pull request #5510 from phkelley/stash-to-index-crash
Fix uninitialized stack memory and NULL ptr dereference in stash_to_index
|
|
56c95cf6
|
2020-05-10T21:43:38
|
|
Fix uninitialized stack memory and NULL ptr dereference in stash_to_index
Caught by static analysis.
|
|
d62e44cb
|
2019-06-03T18:35:08
|
|
checkout: Fix removing untracked files by path in subdirectories
The checkout code didn't iterate into a subdir if it didn't match the
pathspec, but since the pathspec might match files in the subdir we
should recurse into it (In contrast to gitignore handling).
Fixes #5089
|
|
2a1d97e6
|
2020-05-11T00:09:18
|
|
Merge pull request #5378 from libgit2/ethomson/checkout_pathspecs
Honor GIT_CHECKOUT_DISABLE_PATHSPEC_MATCH for all checkout types
|
|
63de2128
|
2020-02-02T20:20:19
|
|
checkout: filter pathspecs for _all_ checkout types
We were previously applying the pathspec filter for the baseline
iterator during checkout, as well as the target tree. This was an
oversight; in fact, we should apply the pathspec filter to _all_
checkout targets, not just trees.
Add a helper function to set the iterator pathspecs from the given
checkout pathspecs, and call it everywhere.
|
|
8731e1f4
|
2020-02-02T19:01:15
|
|
tests::checkout: only examine test10 and test11.txt
The checkout::index::can_disable_pathspec_match test attempts to set a
path filter of `test11.txt` and `test12.txt`, but then validates that
`test10.txt` and `test11.txt` were left unmodified. Update the test's
path filter to match the expectation.
|
|
24bd12c4
|
2020-02-02T01:00:15
|
|
Create test case demonstrating checkout bug w/ pathspec match disabled
|
|
02d27f61
|
2020-05-10T23:42:43
|
|
Merge pull request #5482 from pks-t/pks/coding-style
docs: add documentation for our coding style
|
|
d08f72eb
|
2020-05-10T23:38:48
|
|
Merge pull request #5500 from phkelley/enable-control-flow-guard
MSVC: Enable Control Flow Guard (CFG)
|
|
898caead
|
2020-05-10T19:03:10
|
|
Merge pull request #5431 from libgit2/ethomson/hexdump
git__hexdump: better mimic `hexdump -C`
|
|
63f9fbee
|
2020-04-25T15:37:45
|
|
MSVC: Enable Control Flow Guard (CFG)
This feature requires Visual Studio 2015 (MSVC_VERSION = 1900) or later. As the
minimum required CMake version is currently less than 3.7, GREATER_EQUAL is not
available to us and we must invert the result of the LESS operator.
|
|
66137ff6
|
2020-04-19T12:08:24
|
|
Merge pull request #5383 from ognarb/feature/blame-ignore-whitespace
Feature: Allow blame to ignore whitespace change
|
|
9830ab3d
|
2020-01-29T02:00:04
|
|
blame: add option to ignore whitespace changes
|
|
918a7d19
|
2020-04-14T12:26:36
|
|
Merge pull request #5487 from niacat/master
deps: ntlmclient: use htobe64 on NetBSD too
|
|
ffb6a576
|
2020-04-04T14:36:27
|
|
docs: add documentation for our coding style
For years, we've repeatedly had confusion about what our actual coding
style is, not only among newcomers but also across the core contributors.
This can mostly be attributed to the fact that we do not have any coding
conventions written down. This is now a thing of the past with the
introduction of a new document that gives an initial overview of our
style and most important best practices for both our C codebase as well
as for CMake.
While the proposed coding style for our C codebase should be rather
uncontroversial, the coding style for CMake might be. This can be
attributed to multiple facts. First, the CMake code base doesn't really
have any uniform coding style and is quite outdated in a lot of places.
Second, the proposed coding style actually breaks with our existing one:
we currently use all-uppercase function names and variables, but the
documented coding style says we use all-lowercase function names but
all-uppercase variables.
It's common practice in CMake to write variables in all upper-case, and
in fact all variables made available by CMake are exactly that. As
variables are case-sensitive in CMake, we cannot and shouldn't break
with this. In contrast, function calls are case insensitive, and modern
CMake always uses all-lowercase ones. I argue we should do the same to
get in line with other codebases and to reduce the likelihood of
repetitive strain injuries.
So especially for CMake, the proposed coding style says something we
don't have yet. I'm fine with that, as the document explicitly says that
it's what we want to have and not what we have right now.
|
|
465e10ce
|
2020-04-05T18:33:14
|
|
deps: ntlmclient: use htobe64 on NetBSD too
|
|
e9b0cfc0
|
2020-04-05T13:24:13
|
|
Merge pull request #5485 from libgit2/ethomson/sysdir_unused
sysdir: remove unused git_sysdir_get_str
|
|
b6f18db9
|
2020-04-05T11:16:29
|
|
sysdir: remove unused git_sysdir_get_str
|
|
e56d48be
|
2020-04-05T12:07:17
|
|
Merge pull request #5483 from xSetech/master
Fix typo causing removal of symbol 'git_worktree_prune_init_options'
|
|
ce2ab78f
|
2020-04-04T16:35:33
|
|
Fix typo causing removal of symbol 'git_worktree_prune_init_options'
Commit 0b5ba0d replaced this function with an "option_init"
equivalent, but misspelled the replacement function. As a result, this
symbol has been missing from libgit2.so ever since.
|
|
ad341eb7
|
2020-04-04T13:40:14
|
|
Merge pull request #5425 from lhchavez/fix-get-delta-base
pack: Improve error handling for get_delta_base()
|
|
5a1ec7ab
|
2020-04-04T13:37:13
|
|
Merge pull request #5480 from libgit2/ethomson/coverity
repo::open: ensure we can open the repository
|
|
7d9b1f07
|
2020-04-04T13:36:24
|
|
Merge pull request #5421 from petersalomonsen/examples-fixes-and-additions
examples: additions and fixes
|
|
966db47d
|
2020-04-04T13:21:02
|
|
Merge pull request #5477 from pks-t/pks/rename-detection-negative-caches
merge: cache negative cache results for similarity metrics
|
|
cb0cfc5a
|
2020-04-03T09:17:52
|
|
repo::open: ensure we can open the repository
Update the test cases to check the `git_repository_open` return code.
|
|
dc2beb7e
|
2020-02-24T18:30:16
|
|
examples: additions and fixes
add example for git commit
fix example for git add
add example for git push
|
|
4d4c8e0a
|
2020-04-02T07:34:55
|
|
Re-adding the "delta offset is zero" error case
|
|
dfd7fcc4
|
2020-04-02T13:26:13
|
|
Merge pull request #5388 from bk2204/repo-format-v1
Handle repository format v1
|
|
e1299171
|
2020-04-02T13:13:52
|
|
Merge pull request #5440 from pks-t/pks/cmake-streamlining
CMake: backend selection streamlining
|
|
b8eec0b2
|
2020-04-01T22:22:38
|
|
Merge pull request #5461 from pks-t/pks/refdb-fs-unused-header
refdb_fs: remove unused header file
|
|
5d37128d
|
2020-03-01T10:34:15
|
|
git__hexdump: better mimic `hexdump -C`
|
|
ba59a4a2
|
2020-04-01T12:34:16
|
|
Making get_delta_base() conform to the general error-handling pattern
This makes get_delta_base() return the error code as the return value
and the delta base as an out-parameter.
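The general pattern, return the error code and hand back the result through an out-parameter, can be sketched like this. `find_delta_base` and its validity checks are simplified assumptions made for illustration, not the real `get_delta_base()`.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical illustration: the return value carries only success (0) or
 * failure (negative), and the result travels through `out`. On failure
 * `out` is left untouched, so callers never see a garbage offset.
 */
int find_delta_base(size_t *out, size_t delta_offset, size_t base_distance)
{
	assert(out != NULL);

	/* A zero distance, or one pointing before the start, is malformed. */
	if (base_distance == 0 || base_distance > delta_offset)
		return -1;

	*out = delta_offset - base_distance;
	return 0;
}
```

A caller then only needs `if ((error = find_delta_base(&base, ...)) < 0)` to detect and propagate the failure.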
|
|
f3273725
|
2020-02-25T20:58:09
|
|
pack: Improve error handling for get_delta_base()
This change moves the responsibility of setting the error upon failures
of get_delta_base() to get_delta_base() instead of its callers. That
way, the caller can always check if the return value is negative and
mark the whole operation as an error instead of using garbage values,
which can lead to crashes if the .pack files are malformed.
|
|
1c7fb212
|
2020-04-01T20:00:24
|
|
Merge pull request #5466 from pks-t/pks/patch-modechange-with-rename
patch: correctly handle mode changes for renames
|
|
85533f37
|
2020-04-01T19:59:31
|
|
Merge pull request #5474 from pks-t/pks/gitignore-cleanup
gitignore: clean up patterns from old times
|
|
2662da48
|
2020-04-01T18:03:39
|
|
Merge pull request #5478 from pks-t/pks/readme-ci-update
README.md: update build matrix to reflect our latest releases
|
|
541de515
|
2020-04-01T17:36:13
|
|
cmake: streamline backend detection
We're currently doing unnecessary work to auto-detect backends even if
the functionality is disabled altogether. Let's fix this by removing the
extraneous FOO_BACKEND variables, instead letting auto-detection modify
the variable itself.
|
|
7a6c4122
|
2020-04-01T16:15:38
|
|
README.md: update build matrix to reflect our latest releases
|
|
7d3c7057
|
2020-04-01T15:49:12
|
|
Merge pull request #5471 from pks-t/pks/v1.0
Release v1.0
|
|
4dfcc50f
|
2020-04-01T15:16:18
|
|
merge: cache negative cache results for similarity metrics
When computing renames, we cache the hash signatures for each of the
potentially conflicting entries so that we do not need to repeatedly
read the file and can at least halfway efficiently determine whether two
files are similar enough to be deemed a rename. In order to make the
hash signatures meaningful, we require at least four lines of data to be
present, resulting in at least four different hashes that can be
compared. Files that are deemed too small are not cached at all and
will thus be repeatedly re-hashed, which is usually not a huge issue.
The issue with the above heuristic arises when a file does _not_ have at
least four lines, where a line is anything separated by a consecutive
run of "\n" or "\0" characters. For example "a\nb" is two lines, but
"a\0\0b" is also just two lines. Taken to the extreme, a file that has
megabytes of consecutive space- or NUL-only data may also be deemed as too
small and thus not get cached. As a result, we will repeatedly load its
blob, calculate its hash signature just to finally throw it away as we
notice it's not of any value. When you've got a comparatively big file
that you compare against a big set of potentially renamed files, then
the cost simply explodes.
The issue can be trivially fixed by introducing negative cache entries.
Whenever we determine that a given blob does not have a meaningful
representation via a hash signature, we store this negative cache marker
and will from then on not hash it again, but also ignore it as a
potential rename target. This should help the "normal" case already
where you have a lot of small files as rename candidates, but in the
above scenario its savings are extraordinarily high.
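A minimal sketch of a cache with negative entries. All names are hypothetical, and `compute_signature` is a toy stand-in for the expensive similarity hashing; the only point is the extra `SIG_TOO_SMALL` state that remembers a failed computation.

```c
#include <assert.h>
#include <stddef.h>

enum sig_state { SIG_UNKNOWN = 0, SIG_CACHED, SIG_TOO_SMALL };

typedef struct {
	enum sig_state state;
	unsigned long signature;
} sig_cache_entry;

/* Counts how often the expensive step ran; only here for illustration. */
unsigned int sig_computations;

/* Toy stand-in: hash the data, fail if it has fewer than four lines. */
static int compute_signature(unsigned long *out, const char *data, size_t len)
{
	unsigned long hash = 5381;
	size_t i, lines = 0;
	int in_line = 0;

	sig_computations++;

	for (i = 0; i < len; i++) {
		if (data[i] == '\n' || data[i] == '\0') {
			in_line = 0; /* runs of separators end a line */
			continue;
		}
		if (!in_line)
			lines++;
		in_line = 1;
		hash = hash * 33 + (unsigned char)data[i];
	}

	if (lines < 4)
		return -1;

	*out = hash;
	return 0;
}

int cached_signature(unsigned long *out, sig_cache_entry *entry,
	const char *data, size_t len)
{
	assert(out != NULL && entry != NULL && data != NULL);

	/* Negative hit: we already know this blob has no usable signature. */
	if (entry->state == SIG_TOO_SMALL)
		return -1;

	if (entry->state == SIG_UNKNOWN) {
		if (compute_signature(&entry->signature, data, len) < 0) {
			entry->state = SIG_TOO_SMALL; /* remember the failure */
			return -1;
		}
		entry->state = SIG_CACHED;
	}

	*out = entry->signature;
	return 0;
}
```

However often a NUL-only blob is looked up, the hashing step runs once for it.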
To verify we do not hit the issue anymore with the described solution, this
commit adds a test that uses the exact same setup described above with
one 50 megabyte blob of '\0' characters and 1000 other files that get
renamed. Without the negative cache:
    $ time ./libgit2_clar -smerge::trees::renames::cache_recomputation >/dev/null
    real 11m48.377s
    user 11m11.576s
    sys 0m35.187s
And with the negative cache:
    $ time ./libgit2_clar -smerge::trees::renames::cache_recomputation >/dev/null
    real 0m1.972s
    user 0m1.851s
    sys 0m0.118s
So this represents a ~350-fold performance improvement, but it obviously
depends on how many files you have and how big the blob is. The test
numbers were chosen in a way that one will immediately notice as soon as
the bug resurfaces.
|