kmx git

Commit	Date	Message
0cf9b666	2020-05-12T11:41:44	tests: merge: fix printf formatter on 32 bit arches We currently use `PRIuMAX` to print an integer of type `size_t` in merge::trees::rename::cache_recomputation. While this works just fine on 64 bit arches, it doesn't on 32 bit ones. As a result, our nightly builds on x86 and arm32 fail. Fix the issue by using `PRIuZ` instead.
4dfcc50f	2020-04-01T15:16:18	merge: cache negative cache results for similarity metrics When computing renames, we cache the hash signatures for each of the potentially conflicting entries so that we do not need to repeatedly read the file and can at least halfway efficiently determine whether two files are similar enough to be deemed a rename. In order to make the hash signatures meaningful, we require at least four lines of data to be present, resulting in at least four different hashes that can be compared. Files that are deemed too small are not cached at all and will thus be repeatedly re-hashed, which is usually not a huge issue. The issue with above heuristic is in case a file does _not_ have at least four lines, where a line is anything separated by a consecutive run of "\n" or "\0" characters. For example "a\nb" is two lines, but "a\0\0b" is also just two lines. Taken to the extreme, a file that has megabytes of consecutive space- or NUL-only may also be deemed as too small and thus not get cached. As a result, we will repeatedly load its blob, calculate its hash signature just to finally throw it away as we notice it's not of any value. When you've got a comparatively big file that you compare against a big set of potentially renamed files, then the cost simply explodes. The issue can be trivially fixed by introducing negative cache entries. Whenever we determine that a given blob does not have a meaningful representation via a hash signature, we store this negative cache marker and will from then on not hash it again, but also ignore it as a potential rename target. This should help the "normal" case already where you have a lot of small files as rename candidates, but in the above scenario its savings are extraordinarily high. To verify we do not hit the issue anymore with described solution, this commit adds a test that uses the exact same setup described above with one 50 megabyte blob of '\0' characters and 1000 other files that get renamed. Without the negative cache: $ time ./libgit2_clar -smerge::trees::renames::cache_recomputation >/dev/null real 11m48.377s user 11m11.576s sys 0m35.187s And with the negative cache: $ time ./libgit2_clar -smerge::trees::renames::cache_recomputation >/dev/null real 0m1.972s user 0m1.851s sys 0m0.118s So this represents a ~350-fold performance improvement, but it obviously depends on how many files you have and how big the blob is. The test numbers were chosen in a way that one will immediately notice as soon as the bug resurfaces.
e54343a4	2019-06-29T09:17:32	fileops: rename to "futils.h" to match function signatures Our file utils functions all have a "futils" prefix, e.g. `git_futils_touch`. One would thus naturally guess that their definitions and implementation would live in files "futils.h" and "futils.c", respectively, but in fact they live in "fileops.h". Rename the files to match expectations.
70fae43c	2019-06-13T11:57:16	tests: merge::analysis: use variants to deduplicate test suites Since commit 394951ad4 (tests: allow for simple data-driven tests, 2019-06-07), we have the ability to run a given test suite with multiple variants. Use this new feature to deduplicate the test suites for merge::{trees,workdir}::analysis into a single test suite.
21ddeabe	2019-06-07T15:22:42	Review fixes: - whitespace -> tabs - comment style - improve repo naming in merge/trees/analysis tests.
7b27b6cf	2019-06-06T16:32:09	Refactor testing: - move duplication between merge/trees/ and merge/workdir/ into merge/analysis{.c,.h} - remove merge-resolve.git resource, open the existing merge-resolve as a bare repo instead.
5427461f	2019-03-20T11:51:24	merge: add doc header to analysis tests
1d04f477	2019-03-19T23:43:56	merge: tests for bare repo merge analysis dupe of workdir/analysis.c against a bare repo.
168fe39b	2018-11-28T14:26:57	object_type: use new enumeration names Use the new object_type enumeration names within the codebase.
9994cd3f	2018-06-25T11:56:52	treewide: remove use of C++ style comments C++ style comments ("//") are not specified by the ISO C90 standard and thus do not conform to it. While libgit2 aims to conform to C90, we did not enforce it until now, which is why quite a lot of these non-conforming comments have snuck into our codebase. Do a tree-wide conversion of all C++ style comments to the supported C style comments to allow us enforcing strict C90 compliance in a later commit.
ecf4f33a	2018-02-08T11:14:48	Convert usage of `git_buf_free` to new `git_buf_dispose`
b8823c2b	2018-01-22T23:56:22	Add failing test case for virtual commit merge base issue
afcaf35e	2018-01-21T16:50:40	merge::trees::recursive: test for virtual base building Virtual base building: ensure that the virtual base is created and revwalked in the same way as git.
b924df1e	2018-01-21T18:05:45	merge: reverse merge bases for recursive merge When the commits being merged have multiple merge bases, reverse the order when creating the virtual merge base. This is for compatibility with git's merge-recursive algorithm, and ensures that we build identical trees. Git does this to try to use older merge bases first. Per 8918b0c: > It seems to be the only sane way to do it: when a two-head merge is > done, and the merge-base and one of the two branches agree, the > merge assumes that the other branch has something new. > > If we start creating virtual commits from newer merge-bases, and go > back to older merge-bases, and then merge with newer commits again, > chances are that a patch is lost, _because_ the merge-base and the > head agree on it. Unlikely, yes, but it happened to me.
185b0d08	2018-01-20T19:41:28	merge: recursive uses larger conflict markers Git uses longer conflict markers in the recursive merge base - two more than the default (thus, 9 character long conflict markers). This allows users to tell the difference between the recursive merge conflicts and conflicts between the ours and theirs branches. This was introduced in git d694a17986a28bbc19e2a6c32404ca24572e400f. Update our tests to expect this as well.
49806e9b	2017-02-09T16:52:03	merge_trees: introduce test for submodule renames Test that shows that submodules are incorrectly considered in renames, and `git_merge_trees` will fail to lookup the submodule as a blob.
19ed4d0c	2017-01-01T22:19:23	merge: set default rename threshold When `GIT_MERGE_FIND_RENAMES` is set, provide a default for `rename_threshold` when it is unset.
9be638ec	2016-04-19T15:12:18	git_diff_generated: abstract generated diffs
5b9c63c3	2015-11-20T19:01:42	recursive merge: add a recursion limit
78859c63	2015-11-20T17:33:49	merge: handle conflicts in recursive base building When building a recursive merge base, allow conflicts to occur. Use the file (with conflict markers) as the common ancestor. The user has already seen and dealt with this conflict by virtue of having a criss-cross merge. If they resolved this conflict identically in both branches, then there will be no conflict in the result. This is the best case scenario. If they did not resolve the conflict identically in the two branches, then we will generate a new conflict. If the user is simply using standard conflict output then the results will be fairly sensible. But if the user is using a mergetool or using diff3 output, then the common ancestor will be a conflict file (itself with diff3 output, haha!). This is quite terrible, but it matches git's behavior.
34a51428	2015-11-09T11:55:26	merge tests: add complex recursive example
dcde5720	2015-11-09T08:23:27	merge tests: move expected data into own file
b1eef912	2015-10-27T18:00:30	merge: add recursive test with conflicting contents
fccad82e	2015-10-27T14:23:35	merge: add recursive test with three merge bases
99d9d9a4	2015-10-26T17:44:36	merge: improve test names in recursive merge tests
a200bcf7	2015-10-26T17:25:42	merge: add a third-level recursive merge
cdb6c1c8	2015-10-26T17:14:28	merge: add a second-level recursive merge
86c8d02c	2015-10-22T20:20:07	merge: add simple recursive test Add a simple recursive test - where multiple ancestors exist and creating a virtual merge base from them would prevent a conflict.
fa78782f	2015-10-22T17:00:09	merge: rename `git_merge_tree_flags_t` -> `git_merge_flags_t`
8683d31f	2015-10-22T14:39:20	merge: add GIT_MERGE_TREE_FAIL_ON_CONFLICT Provide a new merge option, GIT_MERGE_TREE_FAIL_ON_CONFLICT, which will stop on the first conflict and fail the merge operation with GIT_EMERGECONFLICT.
ed1c6446	2015-07-28T11:41:27	iterator: use an options struct instead of args
9f545b9d	2015-05-19T11:23:59	introduce `git_index_entry_is_conflict` It's not always obvious the mapping between stage level and conflict-ness. More importantly, this can lead otherwise sane people to write constructs like `if (!git_index_entry_stage(entry))`, which (while technically correct) is unreadable. Provide a nice method to help avoid such messy thinking.
9ebb5a3f	2015-02-18T22:53:40	merge: merge iterators
13de9363	2015-03-12T12:36:09	Collapse whitespace flags into git_merge_file_flags_t
f29dde68	2015-03-12T12:29:47	Renamed git_merge_options 'flags' to 'tree_flags'
0f24cac2	2015-03-09T17:03:03	Added tests to merge files and branches with whitespace problems and fixes
737b5051	2014-10-01T12:03:24	hashsig: Export as a `sys` header
0cee70eb	2014-07-01T14:09:01	Introduce cl_assert_equal_oid
5aa2ac6d	2014-03-11T22:47:39	Update git_merge_tree_opts to git_merge_options
05d47768	2014-03-10T22:30:41	Introduce git_merge_file for consumers
d541170c	2014-01-24T11:36:41	index: rename an entry's id to 'id' This was not converted when we converted the rest, so do it now.
0e1ba46c	2014-01-19T20:03:13	Remove the "merge none" flag The "merge none" (don't automerge) flag was only to aid in merge trivial tests. We can easily determine whether merge trivial resulted in a trivial merge or an automerge by examining the REUC after automerge has completed.
c1d648c5	2014-01-08T18:29:42	merge_file should use more aggressive levels The default merge_file level was XDL_MERGE_MINIMAL, which will produce conflicts where there should not be in the case where both sides were changed identically. Change the defaults to be more aggressive (XDL_MERGE_ZEALOUS) which will more aggressively compress non-conflicts. This matches git.git's defaults. Increase testing around reverting a previously reverted commit to illustrate this problem.
5588f073	2013-12-09T10:25:36	Clean up warnings
eac938d9	2013-12-02T14:10:04	Bare naked merge and rebase
17820381	2013-11-14T14:05:52	Rename tests-clar to tests
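
The negative cache described in commit 4dfcc50f can be illustrated with a minimal sketch. This is not libgit2's implementation: the names `sig_state`, `sig_cache_entry`, `count_lines`, `compute_signature` and `cached_signature` are hypothetical, and the hash is a simple stand-in for a real similarity signature. The sketch only assumes what the commit message states: a line is a run of bytes delimited by runs of "\n" or "\0", and at least four lines are required for a meaningful signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache states; the third one is the "negative" entry. */
enum sig_state { SIG_UNCOMPUTED = 0, SIG_VALID, SIG_TOO_SMALL };

struct sig_cache_entry {
	enum sig_state state;
	unsigned long signature; /* stand-in for a real hash signature */
};

/* Counts how often the expensive computation actually runs. */
static size_t hash_calls;

/* A "line" is a run of bytes other than '\n' and '\0'. */
static size_t count_lines(const char *buf, size_t len)
{
	size_t lines = 0, i;
	int in_line = 0;

	for (i = 0; i < len; i++) {
		int sep = (buf[i] == '\n' || buf[i] == '\0');
		if (!sep && !in_line)
			lines++;
		in_line = !sep;
	}
	return lines;
}

/* Returns 0 and fills *out, or -1 if the data has fewer than four lines. */
static int compute_signature(unsigned long *out, const char *buf, size_t len)
{
	unsigned long h = 5381;
	size_t i;

	assert(out != NULL);
	hash_calls++;

	if (count_lines(buf, len) < 4)
		return -1;

	for (i = 0; i < len; i++)
		h = h * 33 + (unsigned char)buf[i];
	*out = h;
	return 0;
}

/* Looks up the signature, remembering failures so they are never retried. */
static int cached_signature(struct sig_cache_entry *e, unsigned long *out,
	const char *buf, size_t len)
{
	if (e->state == SIG_TOO_SMALL)
		return -1; /* negative cache hit: skip the re-hash */

	if (e->state == SIG_UNCOMPUTED) {
		if (compute_signature(&e->signature, buf, len) < 0) {
			e->state = SIG_TOO_SMALL;
			return -1;
		}
		e->state = SIG_VALID;
	}

	*out = e->signature;
	return 0;
}
```

With this shape, a large NUL-only buffer is scanned once; every later lookup returns immediately from the `SIG_TOO_SMALL` branch instead of reloading and re-hashing the data.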

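The format-specifier problem fixed in commit 0cf9b666 can be sketched in standard C. `PRIuMAX` expects a `uintmax_t` argument, so passing a bare `size_t` only happens to work where both types have the same width, as on typical 64 bit targets. The macro `EXAMPLE_PRIuZ` below is a hypothetical stand-in for libgit2's `PRIuZ`, assumed here to expand to the C99 `zu` conversion; the function names are illustrative as well.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a size_t format macro (C99 "%zu"). */
#define EXAMPLE_PRIuZ "zu"

/* Correct on any width: the conversion matches the argument type. */
static int format_size(char *out, size_t outlen, size_t value)
{
	return snprintf(out, outlen, "%" EXAMPLE_PRIuZ, value);
}

/*
 * Also correct: PRIuMAX may be used if the argument is explicitly
 * converted to uintmax_t. Without the cast, the call is undefined
 * wherever size_t is narrower than uintmax_t.
 */
static int format_size_via_uintmax(char *out, size_t outlen, size_t value)
{
	return snprintf(out, outlen, "%" PRIuMAX, (uintmax_t)value);
}
```

Either form avoids the mismatch; a dedicated `size_t` format macro keeps call sites free of casts.
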
thodg/libgit2/tests/merge/trees
