tests-clar


Log

Author Commit Date CI Message
yorah 0d32f39e 2013-03-04T11:31:50 Notify '*' pathspec correctly when diffing I also moved all tests related to notifying in their own file.
Russell Belfer ad26434b 2013-04-09T14:52:32 Tests and more fixes for submodule diffs This adds tests for diffs with submodules in them and (perhaps unsurprisingly) requires further fixes to be made. Specifically, this fixes: - when considering if a submodule is dirty in the workdir, it was being treated as dirty even if only the index was dirty. - git_diff_patch_to_str (and git_diff_patch_print) were "printing" the headers for files (and submodules) that were unmodified or had no meaningful content. - added comment to previous fix and removed unneeded parens.
Vicent Marti 812e5aea 2013-04-07T07:23:08 test: Add missing NULLs
Vicent Marti d9ecaf8c 2013-04-07T07:22:38 Merge remote-tracking branch 'gnprice/revwalk' into development
Greg Price b208d900 2013-03-20T10:01:58 revparse: Parse range-like syntax Signed-off-by: Greg Price <price@mit.edu>
Greg Price af079d8b 2013-03-03T20:54:23 revwalk: Parse revision ranges All the hard work is already in revparse. Signed-off-by: Greg Price <price@mit.edu>
Greg Price 2932c882 2013-03-04T02:17:04 revwalk: refactor tests a bit Signed-off-by: Greg Price <price@mit.edu>
Greg Price 06e6eab0 2013-03-19T12:02:19 revwalk tests: better diagram of example repo The purported command output was already inaccurate, as the refs aren't where it shows. In any event, the labels a reader of this file really needs are the indices used in commit_sorting_*, to make it possible to understand them by referring directly from those arrays to the diagram rather than from the index arrays, to commit_ids, to the diagram. Add those. Signed-off-by: Greg Price <price@mit.edu>
nulltoken 24cb87e2 2013-03-31T13:27:43 tag: Fix parsing when no tagger nor message
nulltoken 5a5bd640 2013-03-31T13:53:40 tests: Fix indentations
Edward Thomson 0e60b637 2013-03-29T18:36:11 free!
Edward Thomson 54a1a042 2013-03-29T11:26:12 remove unmerged files during reset hard
Vicent Martí 0b061b5b 2013-03-26T11:05:57 Merge pull request #1436 from schu/opts-cache-size opts: allow configuration of odb cache size
Russell Belfer ccfa6805 2013-03-25T23:58:40 Fix some diff ignores and submodule dirty workdir This started out trying to look at the problems from issue #1425 and gradually grew to a broader set of fixes. There are two core things fixed here: 1. When you had an ignore like "/bin" which is rooted at the top of your tree, instead of immediately adding the "bin/" entry as an ignored item in the diff, we were returning all of the direct descendants of the directory as ignored items. This changes things to immediately ignore the directory. Note that this effects the behavior in test_status_ignore__subdirectories so that we no longer exactly match core gits ignore behavior, but the new behavior probably makes more sense (i.e. we now will include an ignored directory inside an untracked directory that we previously would have left off). 2. When a submodule only contained working directory changes, the diff code was always considering it unmodified which was just an outright bug. The HEAD SHA of the submodule matches the SHA in the parent repo index, and since the SHAs matches, the diff code was overwriting the actual status with UNMODIFIED. These fixes broke existing tests test_diff_workdir__submodules and test_status_ignore__subdirectories but looking it over, I actually think the new results are correct and the old results were wrong. @nulltoken had actually commented on the subdirectory ignore issue previously. I also included in the tests some debugging versions of the shared iteration callback routines that print status or diff information. These aren't used actively in the tests, but can be quickly swapped in to test code to give a better picture of what is being scanned in some of the complex test scenarios.
Russell Belfer 37ee70fa 2013-03-25T22:19:39 Implement GIT_STATUS_OPT_EXCLUDE_SUBMODULES This option has been sitting unimplemented for a while, so I finally went through and implemented it along with some tests. As part of this, I improved the implementation of GIT_DIFF_IGNORE_SUBMODULES so it be more diligent about avoiding extra work and about leaving off delta records for submodules to the greatest extent possible (though it may include them still if you are request TYPECHANGE records).
Russell Belfer 0c289dd7 2013-03-25T16:40:16 Recursing into ignored dirs for diff and status This implements working versions of GIT_DIFF_RECURSE_IGNORED_DIRS and GIT_STATUS_OPT_RECURSE_IGNORED_DIRS along with some tests for the newly available behaviors. This is not turned on by default for status, but can be accessed via the options to the extended version of the command.
Russell Belfer 3658e81e 2013-03-25T14:20:07 Move crlf conversion into buf_text This adds crlf/lf conversion functions into buf_text with more efficient implementations that bypass the high level buffer functions. They attempt to minimize the number of reallocations done and they directly write the buffer data as needed if they know that there is enough memory allocated to memcpy data. Tests are added for these new functions. The crlf.c code is updated to use the new functions. Removed the include of buf_text.h from filter.h and just include it more narrowly in the places that need it.
Russell Belfer 050ab995 2013-03-25T14:13:53 Fix up checkout file contents checks This fixes of the file contents checks in checkout to give slightly better error messages by directly calling the underlying clar assertions so the file and line number of the top level call can be reported correctly, and renames the helpers to not start with "test_" since that is kind of reserved by clar. This also enables some of the CRLF tests on all platforms that were previously Windows only (by pushing a check of the native line endings into the test body).
Edward Thomson 4a15ea86 2013-03-21T14:02:25 don't convert CRLF to CRCRLF
Russell Belfer 1098cfae 2013-03-22T14:52:29 Test fixes and cleanup This fixes some places where the new tests were leaving the test area in a bad state or were freeing data they should not free. It also removes code that is extraneous to the core issue and fixes an invalid SHA being looked up in one of the tests (which was failing, but for the wrong reason).
Sven Strickroth b8acb775 2013-03-07T22:15:40 Added some tests for issue #1397 Signed-off-by: Sven Strickroth <email@cs-ware.de>
Vicent Marti 13640d1b 2013-03-25T21:39:11 oid: Do not parse OIDs longer than 40
Vicent Martí 1f107478 2013-03-25T13:26:50 Merge pull request #1428 from xavier-l/nul-terminated-oid Nul terminated oid
Michael Schubert f5e28202 2013-03-25T13:38:43 opts: allow configuration of odb cache size Currently, the odb cache has a fixed size of 128 slots as defined by GIT_DEFAULT_CACHE_SIZE. Allow users to set the size of the cache via git_libgit2_opts(). Fixes #1035.
Russell Belfer 1323c6d1 2013-03-22T14:27:56 Add cl_repo_set_bool and cleanup tests This adds a helper function for the cases where you want to quickly set a single boolean config value for a repository. This allowed me to remove a lot of code.
Russell Belfer 3ba01362 2013-03-20T11:46:03 Update cl_assert_equal_sz to be nicer This makes the size_t comparison test nicer (assuming that the values are actually not using the full length), and converts some cases that were using it for pointer comparison to use the macro that is designed for pointer comparison.
Russell Belfer 7202ec29 2013-03-20T11:47:34 Update to latest Clar
Xavier L b3c17483 2013-03-21T14:50:28 Clarified string value
Xavier L 7e527ca7 2013-03-21T12:16:31 Added test case for new function
Russell Belfer 65025cb8 2013-03-18T17:24:13 Three submodule status bug fixes 1. Fix sort order problem with submodules where "mod" was sorting after "mod-plus" because they were being sorted as "mod/" and "mod-plus/". This involved pushing the "contains a .git entry" test significantly lower in the stack. 2. Reinstate behavior that a directory which contains a .git entry will be treated as a submodule during iteration even if it is not yet added to the .gitmodules. 3. Now that any directory containing .git is reported as submodule, we have to be more careful checking for GIT_EEXISTS when we do a submodule lookup, because that is the error code that is returned by git_submodule_lookup when you try to look up a directory containing .git that has no record in gitmodules or the index.
Vicent Martí 5b27bf7e 2013-03-18T16:17:14 Merge pull request #1417 from arrbee/opts-for-paths Implement opts interface for global/system file search paths
Russell Belfer 32460251 2013-03-18T15:54:35 Fixes and cleanups Get rid of some dead code, tighten things up a bit, and fix a bug with core::env test.
Russell Belfer 41954a49 2013-03-18T14:19:35 Switch search paths to classic delimited strings This switches the APIs for setting and getting the global/system search paths from using git_strarray to using a simple string with GIT_PATH_LIST_SEPARATOR delimited paths, just as the environment PATH variable would contain. This makes it simpler to get and set the value. I also added code to expand "$PATH" when setting a new value to embed the old value of the path. This means that I no longer require separate actions to PREPEND to the value.
Vicent Martí 677dce8a 2013-03-18T14:00:09 Merge pull request #1080 from carlosmn/config-set-null Failing config related test
Russell Belfer 5540d947 2013-03-15T16:39:00 Implement global/system file search paths The goal of this work is to expose the search logic for "global", "system", and "xdg" files through the git_libgit2_opts() interface. Behind the scenes, I changed the logic for finding files to have a notion of a git_strarray that represents a search path and to store a separate search path for each of the three tiers of config file. For each tier, I implemented a function to initialize it to default values (generally based on environment variables), and then general interfaces to get it, set it, reset it, and prepend new directories to it. Next, I exposed these interfaces through the git_libgit2_opts interface, reusing the GIT_CONFIG_LEVEL_SYSTEM, etc., constants for the user to control which search path they were modifying. There are alternative designs for the opts interface / argument ordering, so I'm putting this phase out for discussion. Additionally, I ended up doing a little bit of clean up regarding attr.h and attr_file.h, adding a new attrcache.h so the other two files wouldn't have to be included in so many places.
Vicent Martí 5b229e20 2013-03-15T04:06:31 Merge pull request #1413 from arrbee/more-iterator-refactor Further tree_iterator refactoring
Russell Belfer 55e0f53d 2013-03-14T15:09:29 Fix various build warnings This fixes various build warnings on Mac and Windows (64-bit).
Russell Belfer d85296ab 2013-03-14T13:50:54 Fix valgrind issues (and mmap fallback for diff) This fixes a number of issues identified by valgrind - mostly missed free calls. Inside valgrind, mmap() may fail which causes some of the diff tests to fail. This adds a fallback code path to diff_output.c:get_workdir_content() where is the mmap() fails the code will now try to read the file data directly into allocated memory (which is what it would do if the data needed to be filtered anyhow).
Russell Belfer 0c468633 2013-03-14T13:40:15 Improved tree iterator internals This updates the tree iterator internals to be more efficient. The tree_iterator_entry objects are now kept as pointers that are allocated from a git_pool, so that we may use git__tsort_r for sorting (which is better than qsort, given that the tree is likely mostly ordered already). Those tree_iterator_entry objects now keep direct pointers to the data they refer to instead of keeping indirect index values. This simplifies a lot of the data structure traversal code. This also adds bsearch to find the start item position for range- limited tree iterators, and is more explicit about using git_path_cmp instead of reimplementing it. The git_path_cmp changed a bit to make it easier for tree_iterators to use it (but it was barely being used previously, so not a big deal). This adds a git_pool_free_array function that efficiently frees a list of pool allocated pointers (which the tree_iterator keeps). Also, added new tests for the git_pool free list functionality that was not previously being tested (or used).
Russell Belfer bbb13646 2013-03-13T14:59:51 Fix workdir iterator bugs This fixes two bugs with the workdir iterator depth check: first that the depth was not being decremented and second that empty directories were counting against the depth even though a frame was not being created for them. This also fixes a bug with the ENOTFOUND return code for workdir iterators when you attempt to advance_into an empty directory. Actually, that works correctly, but it was incorrectly being propogated into regular advance() calls in some circumstances. Added new tests for the above that create a huge hierarchy on the fly and try using the workdir iterator to traverse it.
Vicent Martí 1ac10aae 2013-03-12T09:23:53 Merge pull request #1408 from arrbee/refactor-iterators Refactor iterators
Russell Belfer a5eea2d7 2013-03-11T11:31:50 Stabilize order for equiv tree iterator entries Given a group of case-insensitively equivalent tree iterator entries, this ensures that the case-sensitively first trees will be used as the representative items. I.e. if you have conflicting entries "A/B/x", "a/b/x", and "A/b/x", this change ensures that the earliest entry "A/B/x" will be returned. The actual choice is not that important, but it is nice to have it stable and to have it been either the first or last item, as opposed to a random item from within the equivalent span.
Edward Thomson aa408cbf 2013-03-11T11:18:00 handle small files in similarity metrics
Russell Belfer aec4f663 2013-03-11T10:37:12 Fix tree iterator advance using wrong name compare Tree iterator advance was moving forward without taking the filemode of the entries into account, equating "a" and "a/". This makes the tree entry comparison code more easily reusable and fixes the problem.
Russell Belfer 92028ea5 2013-03-11T09:53:49 Fix tree iterator path for tree issue + cleanups This fixes an off by one error for generating full paths for tree entries in tree iterators when INCLUDE_TREES is set. Also, contains a bunch of small code cleanups with a couple of small utility functions and macro changes to eliminate redundant code.
Russell Belfer 61c7b61e 2013-03-10T22:38:53 Use correct case path in icase tree iterator If there are case-ambiguities in the path of a case insensitive tree iterator, it will now rewrite the entire path when it gives the path name to an entry, so a tree with "A/b/C/d.txt" and "a/B/c/E.txt" will give the true full paths (instead of case- folding them both to "A/B/C/d.txt" or "a/b/c/E.txt" or something like that.
Russell Belfer a03beb7b 2013-03-10T21:04:35 Add tests for case insensitive tree iterator This adds a test case for ci tree iteration when there is a name conflict. This points out a behavior quirk in the current version that I'd like to fix - namely, all tree entries get mapped to one version of the case pattern in the ci code - i.e. even if you have A/1.txt and a/2.txt, both will be reported as a/1.txt and a/2.txt because we only copy the name of a file at a given frame once. It would be nice to fix this, but I'm worried about how complex that is if you get a/B/c/1.txt and A/b/C/2.txt. It may require a walk up the frames whenever you advance to the next item in a blended equivalence class.
Carlos Martín Nieto 1aa5318a 2013-03-09T16:04:34 diff: allow asking for diffs with no context Previously, 0 meant default. This is problematic, as asking for 0 context lines is a valid thing to do. Change GIT_DIFF_OPTIONS_INIT to default to three and stop treating 0 as a magic value. In case no options are provided, make sure the options in the diff object default to 3.
Carlos Martín Nieto 48bde2f1 2013-03-08T02:11:34 config: don't allow passing NULL as a value to set Passing NULL is non-sensical. The error message leaves to be desired, though, as it leaks internal implementation details. Catch it at the `git_config_set_string` level and set an appropriate error message.
Russell Belfer e40f1c2d 2013-03-08T16:39:57 Make tree iterator handle icase equivalence There is a serious bug in the previous tree iterator implementation. If case insensitivity resulted in member elements being equivalent to one another, and those member elements were trees, then the children of the colliding elements would be processed in sequence instead of in a single flattened list. This meant that the tree iterator was not truly acting like a case-insensitive list. This completely reworks the tree iterator to manage lists with case insensitive equivalence classes and advance through the items in a unified manner in a single sorted frame. It is possible that at a future date we might want to update this to separate the case insensitive and case sensitive tree iterators so that the case sensitive one could be a minimal amount of code and the insensitive one would always know what it needed to do without checking flags. But there would be so much shared code between the two, that I'm not sure it that's a win. For now, this gets what we need. More tests are needed, though.
Vicent Martí 6f83a781 2013-03-07T11:14:03 Merge pull request #1403 from ethomson/tracing Optional tracing back to consumers
Edward Thomson b5ec5430 2013-03-04T23:52:30 optional tracing
Edward Thomson d00d5464 2013-03-01T15:37:33 immutable references and a pluggable ref database
Carlos Martín Nieto bb45c57f 2013-03-07T16:38:44 refs: explicitly catch leading slashes It's somewhat common to try to write "/refs/tags/something". There is no easy way to catch it during the main body of the function, as there is no way to distinguish whether it's a leading slash or a double slash somewhere in the middle. Catch this at the beginning so we don't trigger the assert in is_all_caps_and_underscore().
Russell Belfer 9bea03ce 2013-03-06T15:16:34 Add INCLUDE_TREES, DONT_AUTOEXPAND iterator flags This standardizes iterator behavior across all three iterators (index, tree, and working directory). Previously the working directory iterator behaved differently from the other two. Each iterator can now operate in one of three modes: 1. *No tree results, auto expand trees* means that only non- tree items will be returned and when a tree/directory is encountered, we will automatically descend into it. 2. *Tree results, auto expand trees* means that results will be given for every item found, including trees, but you only need to call normal git_iterator_advance to yield every item (i.e. trees returned with pre-order iteration). 3. *Tree results, no auto expand* means that calling the normal git_iterator_advance when looking at a tree will not descend into the tree, but will skip over it to the next entry in the parent. Previously, behavior 1 was the only option for index and tree iterators, and behavior 3 was the only option for workdir. The main public API implications of this are that the `git_iterator_advance_into()` call is now valid for all iterators, not just working directory iterators, and all the existing uses of working directory iterators explicitly use the GIT_ITERATOR_DONT_AUTOEXPAND (for now). Interestingly, the majority of the implementation was in the index iterator, since there are no tree entries there and now have to fake them. The tree and working directory iterators only required small modifications.
Russell Belfer 169dc616 2013-03-05T16:10:05 Make iterator APIs consistent with standards The iterator APIs are not currently consistent with the parameter ordering of the rest of the codebase. This rearranges the order of parameters, simplifies the naming of a number of functions, and makes somewhat better use of macros internally to clean up the iterator code. This also expands the test coverage of iterator functionality, making sure that case sensitive range-limited iteration works correctly.
Russell Belfer 9952f24e 2013-03-06T16:02:26 No longer need clar_main.c
Nico von Geyso aa518c70 2013-03-06T22:51:20 added missing free for git_note in clar tests
Nico von Geyso f7b18502 2013-03-06T22:25:01 fixed minor issues with new note iterator * fixed style issues * use new iterator functions for git_note_foreach()
Nico von Geyso 1a90dcf6 2013-03-06T19:07:56 use git_note_iterator type instead of non-public git_iterator one
Nico von Geyso 6edb427b 2013-03-06T16:43:21 basic note iterator implementation * git_note_iterator_new() - create a new note iterator * git_note_next() - retrieves the next item of the iterator
Edward Thomson 4cc326e9 2013-03-05T22:45:26 remote push test fix
Vicent Martí b72f5d40 2013-03-05T15:35:28 Merge pull request #1369 from arrbee/repo-init-template-hooks More tests (and fixes) for initializing repo from template
Edward Thomson 5bddabcc 2013-03-04T17:40:48 clear REUC on checkout
Carlos Martín Nieto 323bb885 2013-03-04T00:21:56 Fix a few leaks `git_diff_get_patch()` would unconditionally load the patch object and then simply leak it if the user hadn't requested it. Short-circuit loading the object if the user doesn't want it. The rest of the plugs are simply calling the free functions of objects allocated during the tests.
Carlos Martín Nieto 447ae791 2013-03-03T15:19:21 indexer: kill git_indexer This was the first implementation and its goal was simply to have something that worked. It is slow and now it's just taking up space. Remove it and switch the one known usage to use the streaming indexer.
Russell Belfer 487fc724 2013-03-01T13:41:53 Allow empty config object and use it This removes assertions that prevent us from having an empty git_config object and then updates some tests that were dependent on global config state to use an empty config before running anything.
Philip Kelley 47f70846 2013-03-01T13:27:46 Merge pull request #1379 from arrbee/fix-tests-with-autocrlf-input-on-windows Control for core.autocrlf during testing
Russell Belfer 7d46b34b 2013-03-01T12:26:05 Control for core.autocrlf during testing
Jameson Miller 926acbcf 2013-03-01T11:07:53 Clone should not delete directories it did not create
Vicent Martí e68e33f3 2013-02-27T14:50:32 Merge pull request #1233 from arrbee/file-similarity-metric Add file similarity scoring to diff rename/copy detection
Russell Belfer 18f08264 2013-02-27T13:44:15 Make mode handling during init more like git When creating files, instead of actually using GIT_FILEMODE_BLOB and the other various constants that happen to correspond to mode values, apparently I should be just using 0666 and 0777, and relying on the umask to clear bits and make the value sane. This fixes the rules for copying a template directory and fixes the checks to match that new behavior. (Further changes to the checkout logic to follow separately.)
Edward Thomson 395509ff 2013-02-27T14:47:39 don't dereference at the end of the workdir iterator
Russell Belfer 0d1b094b 2013-02-26T13:15:06 Fix portability issues on Windows The new tests were not taking core.filemode into account when testing file modes after repo initialization. Fixed that and some other Windows warnings that have crept in.
Russell Belfer 3c42e4ef 2013-02-26T11:43:14 Fix initialization of repo directories When PR #1359 removed the hooks from the test resources/template directory, it made me realize that the tests for git_repository_init_ext using templates must be pretty shabby because we could not have been testing if the hooks were getting created correctly. So, this started with me recreating a couple of hooks, including a sample and symlink, and adding tests that they got created correctly in the various circumstances, including with the SHARED modes, etc. Unfortunately this uncovered some issues with how directories and symlinks were copied and chmod'ed. Also, there was a FIXME in the code related to the chmod behavior as well. Going back over the directory creation logic for setting up a repository, I found it was a little difficult to read and could result in creating and/or chmod'ing directories that the user almost certainly didn't intend. So that let to this work which makes repo initialization much more careful (and hopefully easier to follow). It required a couple of extensions / changes to core fileops utilities, but I also think those are for the better, at least for git_futils_cp_r in terms of being careful about what actions it takes.
Michael Schubert 8005c6d4 2013-02-26T01:03:56 Revert "hash: remove git_hash_init from internal api" This reverts commit efe7fad6c96a3d6197a218aeaa561ec676794499, except for the indentation fixes.
Michael Schubert efe7fad6 2013-02-26T00:05:28 hash: remove git_hash_init from internal api Along with that, fix indentation in tests-clar/object/raw/hash.c
Michael Schubert be225be7 2013-02-25T23:36:25 tests/pack: fixup 6774b10 Initialize the hash ctx with git_hash_ctx_init, not git_hash_init.
Michael Schubert 6774b107 2013-02-17T17:52:16 tests/pack: do strict check of testpack's SHA1 hash
Martin Woodward fc6c5b50 2013-02-25T17:03:05 Remove sample hook files Getting rid of sample hook files from test repos as they just take up space with no value.
Russell Belfer 37d91686 2013-02-22T12:21:54 Do not fail if .gitignore is directory This is designed to fix libgit2sharp #350 where if .gitignore is a directory we abort all operations that process ignores instead of just skipping it as core git does. Also added test that fails without this change and passes with it.
Russell Belfer 1be4ba98 2013-02-22T11:13:01 More rename detection tests This includes tests for crlf changes, whitespace changes with the default comparison and with the ignore whitespace comparison, and more sensitivity checking for the comparison code.
Philip Kelley 7beeb3f4 2013-02-22T14:03:44 Rename 'exp' so it doesn't conflict with exp()
Russell Belfer 6f9d5ce8 2013-02-22T10:17:08 Fix tests for find_similar and related This fixes both a test that I broke in diff::patch where I was relying on the current state of the working directory for the renames test data and fixes an unstable test in diff::rename where the environment setting for the "diff.renames" config was being allowed to influence the test results.
Vicent Martí 06eaa06f 2013-02-22T09:48:47 Merge pull request #1343 from nulltoken/topic/remote_orphaned_branch Teach git_branch_remote_name() to work with orphaned heads
nulltoken bbc53e4f 2013-02-15T12:43:03 branch: refactor git_branch_remote_name() tests
nulltoken c1b5e8c4 2013-02-15T11:35:33 branch: Make git_branch_remote_name() cope with orphaned heads
nulltoken 9ccab8df 2013-02-22T15:25:06 stash: Update the reference when dropping the topmost stash
nulltoken 39bcb4de 2013-02-22T14:44:57 stash: Refactor stash::drop tests
nulltoken d788499a 2013-02-22T15:02:37 ignore: enhance git_ignore_path_is_ignored() test coverage
Russell Belfer d4b747c1 2013-02-21T16:44:44 Add diff rename tests with partial similarity This adds some new tests that actually exercise the similarity metric between files to detect renames, copies, and split modified files that are too heavily modified. There is still more testing to do - these tests are just partially covering the cases. There is also one bug fix in this where a change set with only MODIFY being broken into ADD/DELETE (due to low self-similarity) without any additional RENAMED entries would end up not processing the split requests (because the num_rewrites counter got reset).
Russell Belfer 960a04dd 2013-02-21T12:40:33 Initial integration of similarity metric to diff This is the initial integration of the similarity metric into the `git_diff_find_similar()` code path. The existing tests all pass, but the new functionality isn't currently well tested. The integration does go through the pluggable metric interface, so it should be possible to drop in an alternative to the internal metric that libgit2 implements. This comes along with a behavior change for an existing interface; namely, passing two NULLs to git_diff_blobs (or passing NULLs to git_diff_blob_to_buffer) will now call the file_cb parameter zero times instead of one time. I know it's strange that that change is paired with this other change, but it emerged from some initialization changes that I ended up making.
Russell Belfer 71a3d27e 2013-02-08T10:06:47 Replace diff delta binary with flags Previously the git_diff_delta recorded if the delta was binary. This replaces that (with no net change in structure size) with a full set of flags. The flag values that were already in use for individual git_diff_file objects are reused for the delta flags, too (along with renaming those flags to make it clear that they are used more generally). This (a) makes things somewhat more consistent (because I was using a -1 value in the "boolean" binary field to indicate unset, whereas now I can just use the flags that are easier to understand), and (b) will make it easier for me to add some additional flags to the delta object in the future, such as marking the results of a copy/rename detection or other deltas that might want a special indicator. While making this change, I officially moved some of the flags that were internal only into the private diff header. This also allowed me to remove a gross hack in rename/copy detect code where I was overwriting the status field with an internal value.
Russell Belfer 9bc8be3d 2013-02-19T10:25:41 Refine pluggable similarity API This plugs in the three basic similarity strategies for handling whitespace via internal use of the pluggable API. In so doing, I realized that the use of git_buf in the hashsig API was not needed and actually just made it harder to use, so I tweaked that API as well. Note that the similarity metric is still not hooked up in the find_similarity code - this is just setting out the function that will be used.
Russell Belfer aa643260 2013-02-15T11:08:02 More tests of file signatures with whitespace opts Seems to be working pretty well...
Russell Belfer 5e5848eb 2013-02-14T17:25:10 Change similarity metric to sampled hashes This moves the similarity metric code out of buf_text and into a new file. Also, this implements a different approach to similarity measurement based on a Rabin-Karp rolling hash where we only keep the top 100 and bottom 100 hashes. In theory, that should be sufficient samples to given a fairly accurate measurement while limiting the amount of data we keep for file signatures no matter how large the file is.
Russell Belfer 9c454b00 2013-01-11T22:13:02 Initial implementation of similarity scoring algo This adds a new `git_buf_text_hashsig` type and functions to generate these hash signatures and compare them to give a similarity score. This can be plugged into diff similarity scoring.
Vicent Martí f2e1d060 2013-02-20T12:00:51 Merge pull request #1351 from arrbee/moar-treebuilder-tests Add more treebuilder tests
Russell Belfer 0cfce06d 2013-02-20T11:58:21 Add more treebuilder tests The recent changes with git_treebuilder_entrycount point out that the test coverage for git_treebuilder_remove and git_treebuilder_entrycount is completely absent. This adds tests.
Russell Belfer f7511c2c 2013-02-20T10:19:58 Merge pull request #1348 from libgit2/signatures-2 Simplify signature parsing