|
0cf9b666
|
2020-05-12T11:41:44
|
|
tests: merge: fix printf formatter on 32 bit arches
We currently use `PRIuMAX` to print an integer of type `size_t` in
merge::trees::rename::cache_recomputation. While this works just fine on
64 bit arches, it doesn't on 32 bit ones. As a result, our nightly
builds on x86 and arm32 fail.
Fix the issue by using `PRIuZ` instead.
|
|
4dfcc50f
|
2020-04-01T15:16:18
|
|
merge: cache negative cache results for similarity metrics
When computing renames, we cache the hash signatures for each of the
potentially conflicting entries so that we do not need to repeatedly
read the file and can at least halfway efficiently determine whether two
files are similar enough to be deemed a rename. In order to make the
hash signatures meaningful, we require at least four lines of data to be
present, resulting in at least four different hashes that can be
compared. Files that are deemed too small are not cached at all and
will thus be repeatedly re-hashed, which is usually not a huge issue.
The issue with above heuristic is in case a file does _not_ have at
least four lines, where a line is anything separated by a consecutive
run of "\n" or "\0" characters. For example "a\nb" is two lines, but
"a\0\0b" is also just two lines. Taken to the extreme, a file that has
megabytes of consecutive space- or NUL-only may also be deemed as too
small and thus not get cached. As a result, we will repeatedly load its
blob, calculate its hash signature just to finally throw it away as we
notice it's not of any value. When you've got a comparitively big file
that you compare against a big set of potentially renamed files, then
the cost simply expodes.
The issue can be trivially fixed by introducing negative cache entries.
Whenever we determine that a given blob does not have a meaningful
representation via a hash signature, we store this negative cache marker
and will from then on not hash it again, but also ignore it as a
potential rename target. This should help the "normal" case already
where you have a lot of small files as rename candidates, but in the
above scenario it's savings are extraordinarily high.
To verify we do not hit the issue anymore with described solution, this
commit adds a test that uses the exact same setup described above with
one 50 megabyte blob of '\0' characters and 1000 other files that get
renamed. Without the negative cache:
$ time ./libgit2_clar -smerge::trees::renames::cache_recomputation >/dev/null
real 11m48.377s
user 11m11.576s
sys 0m35.187s
And with the negative cache:
$ time ./libgit2_clar -smerge::trees::renames::cache_recomputation >/dev/null
real 0m1.972s
user 0m1.851s
sys 0m0.118s
So this represents a ~350-fold performance improvement, but it obviously
depends on how many files you have and how big the blob is. The test
number were chosen in a way that one will immediately notice as soon as
the bug resurfaces.
|
|
e54343a4
|
2019-06-29T09:17:32
|
|
fileops: rename to "futils.h" to match function signatures
Our file utils functions all have a "futils" prefix, e.g.
`git_futils_touch`. One would thus naturally guess that their
definitions and implementation would live in files "futils.h" and
"futils.c", respectively, but in fact they live in "fileops.h".
Rename the files to match expectations.
|
|
70fae43c
|
2019-06-13T11:57:16
|
|
tests: merge::analysis: use variants to deduplicate test suites
Since commit 394951ad4 (tests: allow for simple data-driven
tests, 2019-06-07), we have the ability to run a given test suite
with multiple variants. Use this new feature to deduplicate the
test suites for merge::{trees,workdir}::analysis into a single
test suite.
|
|
21ddeabe
|
2019-06-07T15:22:42
|
|
Review fixes:
- whitespace -> tabs
- comment style
- improve repo naming in merge/trees/analysis tests.
|
|
7b27b6cf
|
2019-06-06T16:32:09
|
|
Refactor testing:
- move duplication between merge/trees/ and merge/workdir/ into merge/analysis{.c,.h}
- remove merge-resolve.git resource, open the existing merge-resolve as a bare repo instead.
|
|
5427461f
|
2019-03-20T11:51:24
|
|
merge: add doc header to analysis tests
|
|
1d04f477
|
2019-03-19T23:43:56
|
|
merge: tests for bare repo merge analysis
dupe of workdir/analysis.c against a bare repo.
|
|
168fe39b
|
2018-11-28T14:26:57
|
|
object_type: use new enumeration names
Use the new object_type enumeration names within the codebase.
|
|
9994cd3f
|
2018-06-25T11:56:52
|
|
treewide: remove use of C++ style comments
C++ style comment ("//") are not specified by the ISO C90 standard and
thus do not conform to it. While libgit2 aims to conform to C90, we did
not enforce it until now, which is why quite a lot of these
non-conforming comments have snuck into our codebase. Do a tree-wide
conversion of all C++ style comments to the supported C style comments
to allow us enforcing strict C90 compliance in a later commit.
|
|
ecf4f33a
|
2018-02-08T11:14:48
|
|
Convert usage of `git_buf_free` to new `git_buf_dispose`
|
|
b8823c2b
|
2018-01-22T23:56:22
|
|
Add failing test case for virtual commit merge base issue
|
|
afcaf35e
|
2018-01-21T16:50:40
|
|
merge::trees::recursive: test for virtual base building
Virtual base building: ensure that the virtual base is created and
revwalked in the same way as git.
|
|
b924df1e
|
2018-01-21T18:05:45
|
|
merge: reverse merge bases for recursive merge
When the commits being merged have multiple merge bases, reverse the
order when creating the virtual merge base. This is for compatibility
with git's merge-recursive algorithm, and ensures that we build
identical trees.
Git does this to try to use older merge bases first. Per 8918b0c:
> It seems to be the only sane way to do it: when a two-head merge is
> done, and the merge-base and one of the two branches agree, the
> merge assumes that the other branch has something new.
>
> If we start creating virtual commits from newer merge-bases, and go
> back to older merge-bases, and then merge with newer commits again,
> chances are that a patch is lost, _because_ the merge-base and the
> head agree on it. Unlikely, yes, but it happened to me.
|
|
185b0d08
|
2018-01-20T19:41:28
|
|
merge: recursive uses larger conflict markers
Git uses longer conflict markers in the recursive merge base - two more
than the default (thus, 9 character long conflict markers). This allows
users to tell the difference between the recursive merge conflicts and
conflicts between the ours and theirs branches.
This was introduced in git d694a17986a28bbc19e2a6c32404ca24572e400f.
Update our tests to expect this as well.
|
|
49806e9b
|
2017-02-09T16:52:03
|
|
merge_trees: introduce test for submodule renames
Test that shows that submodules are incorrectly considered in renames,
and `git_merge_trees` will fail to lookup the submodule as a blob.
|
|
19ed4d0c
|
2017-01-01T22:19:23
|
|
merge: set default rename threshold
When `GIT_MERGE_FIND_RENAMES` is set, provide a default for
`rename_threshold` when it is unset.
|
|
9be638ec
|
2016-04-19T15:12:18
|
|
git_diff_generated: abstract generated diffs
|
|
5b9c63c3
|
2015-11-20T19:01:42
|
|
recursive merge: add a recursion limit
|
|
78859c63
|
2015-11-20T17:33:49
|
|
merge: handle conflicts in recursive base building
When building a recursive merge base, allow conflicts to occur.
Use the file (with conflict markers) as the common ancestor.
The user has already seen and dealt with this conflict by virtue
of having a criss-cross merge. If they resolved this conflict
identically in both branches, then there will be no conflict in the
result. This is the best case scenario.
If they did not resolve the conflict identically in the two branches,
then we will generate a new conflict. If the user is simply using
standard conflict output then the results will be fairly sensible.
But if the user is using a mergetool or using diff3 output, then the
common ancestor will be a conflict file (itself with diff3 output,
haha!). This is quite terrible, but it matches git's behavior.
|
|
34a51428
|
2015-11-09T11:55:26
|
|
merge tests: add complex recursive example
|
|
dcde5720
|
2015-11-09T08:23:27
|
|
merge tests: move expected data into own file
|
|
b1eef912
|
2015-10-27T18:00:30
|
|
merge: add recursive test with conflicting contents
|
|
fccad82e
|
2015-10-27T14:23:35
|
|
merge: add recursive test with three merge bases
|
|
99d9d9a4
|
2015-10-26T17:44:36
|
|
merge: improve test names in recursive merge tests
|
|
a200bcf7
|
2015-10-26T17:25:42
|
|
merge: add a third-level recursive merge
|
|
cdb6c1c8
|
2015-10-26T17:14:28
|
|
merge: add a second-level recursive merge
|
|
86c8d02c
|
2015-10-22T20:20:07
|
|
merge: add simple recursive test
Add a simple recursive test - where multiple ancestors exist and
creating a virtual merge base from them would prevent a conflict.
|
|
fa78782f
|
2015-10-22T17:00:09
|
|
merge: rename `git_merge_tree_flags_t` -> `git_merge_flags_t`
|
|
8683d31f
|
2015-10-22T14:39:20
|
|
merge: add GIT_MERGE_TREE_FAIL_ON_CONFLICT
Provide a new merge option, GIT_MERGE_TREE_FAIL_ON_CONFLICT, which
will stop on the first conflict and fail the merge operation with
GIT_EMERGECONFLICT.
|
|
ed1c6446
|
2015-07-28T11:41:27
|
|
iterator: use an options struct instead of args
|
|
9f545b9d
|
2015-05-19T11:23:59
|
|
introduce `git_index_entry_is_conflict`
It's not always obvious the mapping between stage level and
conflict-ness. More importantly, this can lead otherwise sane
people to write constructs like `if (!git_index_entry_stage(entry))`,
which (while technically correct) is unreadable.
Provide a nice method to help avoid such messy thinking.
|
|
9ebb5a3f
|
2015-02-18T22:53:40
|
|
merge: merge iterators
|
|
13de9363
|
2015-03-12T12:36:09
|
|
Collapse whitespace flags into git_merge_file_flags_t
|
|
f29dde68
|
2015-03-12T12:29:47
|
|
Renamed git_merge_options 'flags' to 'tree_flags'
|
|
0f24cac2
|
2015-03-09T17:03:03
|
|
Added tests to merge files and branches with whitespace problems and fixes
|
|
737b5051
|
2014-10-01T12:03:24
|
|
hashsig: Export as a `sys` header
|
|
0cee70eb
|
2014-07-01T14:09:01
|
|
Introduce cl_assert_equal_oid
|
|
5aa2ac6d
|
2014-03-11T22:47:39
|
|
Update git_merge_tree_opts to git_merge_options
|
|
05d47768
|
2014-03-10T22:30:41
|
|
Introduce git_merge_file for consumers
|
|
d541170c
|
2014-01-24T11:36:41
|
|
index: rename an entry's id to 'id'
This was not converted when we converted the rest, so do it now.
|
|
0e1ba46c
|
2014-01-19T20:03:13
|
|
Remove the "merge none" flag
The "merge none" (don't automerge) flag was only to aide in
merge trivial tests. We can easily determine whether merge
trivial resulted in a trivial merge or an automerge by examining
the REUC after automerge has completed.
|
|
c1d648c5
|
2014-01-08T18:29:42
|
|
merge_file should use more aggressive levels
The default merge_file level was XDL_MERGE_MINIMAL, which will
produce conflicts where there should not be in the case where
both sides were changed identically. Change the defaults to be
more aggressive (XDL_MERGE_ZEALOUS) which will more aggressively
compress non-conflicts. This matches git.git's defaults.
Increase testing around reverting a previously reverted commit to
illustrate this problem.
|
|
5588f073
|
2013-12-09T10:25:36
|
|
Clean up warnings
|
|
eac938d9
|
2013-12-02T14:10:04
|
|
Bare naked merge and rebase
|
|
17820381
|
2013-11-14T14:05:52
|
|
Rename tests-clar to tests
|