src/diff.c


Log

Author Commit Date CI Message
Edward Thomson f0e693b1 2021-09-07T17:53:49 str: introduce `git_str` for internal, `git_buf` is external libgit2 has two distinct requirements that were previously solved by `git_buf`. We require: 1. A general purpose string class that provides a number of utility APIs for manipulating data (eg, concatenating, truncating, etc). 2. A structure that we can use to return strings to callers that they can take ownership of. By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult. Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr"). The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.) Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.
Edward Thomson 31ecaca2 2021-09-30T08:11:40 hash: hash functions operate on byte arrays not git_oids Separate the concerns of the hash functions from the git_oid functions. The git_oid structure will need to understand either SHA1 or SHA256; the hash functions should only deal with the appropriate one of these.
Edward Thomson 2a713da1 2021-09-29T21:31:17 hash: accept the algorithm in inputs
Edward Thomson ba3595af 2021-09-13T16:25:00 diff: deprecate diff_format_email `git_diff_format_email` is deprecated in favor of `git_email_create`.
Edward Thomson c443495b 2021-09-13T13:29:46 diff: use `git_email_create` in `diff_format_email`
Edward Thomson f407d3fa 2021-09-13T10:51:42 diff_commit_as_email: use `email_create` Move the `git_diff_commit_as_email` function to use `email_create`.
Edward Thomson 5d6c2f26 2020-04-05T14:59:54 diff: use GIT_ASSERT
Patrick Steinhardt c6184f0c 2020-06-08T21:07:36 tree-wide: do not compile deprecated functions with hard deprecation When compiling libgit2 with -DDEPRECATE_HARD, we add a preprocessor definition `GIT_DEPRECATE_HARD` which causes the "git2/deprecated.h" header to be empty. As a result, no function declarations are made available to callers, but the implementations are still available to link against. This has the problem that function declarations also aren't visible to the implementations, meaning that the symbol's visibility will not be set up correctly. As a result, the resulting library may not expose those deprecated symbols at all on some platforms and thus cause linking errors. Fix the issue by conditionally compiling deprecated functions, only. While it becomes impossible to link against such a library in case one uses deprecated functions, distributors of libgit2 aren't expected to pass -DDEPRECATE_HARD anyway. Instead, users of libgit2 should manually define GIT_DEPRECATE_HARD to hide deprecated functions. Using "real" hard deprecation still makes sense in the context of CI to test we don't use deprecated symbols ourselves and in case a dependant uses libgit2 in a vendored way and knows it won't ever use any of the deprecated symbols anyway.
Patrick Steinhardt a6c9e0b3 2020-06-08T12:40:47 tree-wide: mark local functions as static We've accumulated quite some functions which are never used outside of their respective code unit, but which are lacking the `static` keyword. Add it to reduce their linkage scope and allow the compiler to optimize better.
Patrick Steinhardt 7c499b54 2020-06-08T12:39:09 tree-wide: remove unused functions We have some functions which aren't used anywhere. Let's remove them to get rid of unneeded baggage.
Gregory Herrero ece5bb5e 2019-11-07T14:10:00 diff: make patchid computation work with all types of commits. Current implementation of patchid is not computing a correct patchid when given a patch where, for example, a new file is added or removed. Some more corner cases need to be handled to have same behavior as git patch-id command. Add some more tests to cover those corner cases. Signed-off-by: Gregory Herrero <gregory.herrero@oracle.com>
Sven Strickroth 452b7f8f 2019-09-25T20:29:21 Don't use enum for flags Using an `enum` causes trouble when used with C++ as bitwise operations are not possible w/o casting (e.g., `opts.flags &= ~GIT_BLOB_FILTER_CHECK_FOR_BINARY;` is invalid as there is no `&=` operator for `enum`). Signed-off-by: Sven Strickroth <email@cs-ware.de>
Patrick Steinhardt 001d76e1 2019-07-11T11:34:40 diff: ignore EOFNL for computing patch IDs The patch ID is supposed to be mostly context-insignificant and thus only includes added or deleted lines. As such, we shouldn't honor end-of-file-without-newline markers in diffs. Ignore such lines to fix how we compute the patch ID for such diffs.
Edward Thomson 0b5ba0d7 2019-06-06T16:36:23 Rename opt init functions to `options_init` In libgit2 nomenclature, when we need to verb a direct object, we name a function `git_directobject_verb`. Thus, if we need to init an options structure named `git_foo_options`, then the name of the function that does that should be `git_foo_options_init`. The previous names of `git_foo_init_options` is close - it _sounds_ as if it's initializing the options of a `foo`, but in fact `git_foo_options` is its own noun that should be respected. Deprecate the old names; they'll now call directly to the new ones.
Edward Thomson c3866fa8 2019-01-20T18:54:16 diff: explicitly cast in flush_hunk Quiet down a warning from MSVC about how we're potentially losing data.
Edward Thomson f673e232 2018-12-27T13:47:34 git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.
Patrick Steinhardt ecf4f33a 2018-02-08T11:14:48 Convert usage of `git_buf_free` to new `git_buf_dispose`
Patrick Steinhardt 90fc7f53 2017-11-30T15:09:05 diff: remove unused macros `DIFF_FLAG_*` In commit 9be638ecf (git_diff_generated: abstract generated diffs, 2016-04-19), the code for generated diffs was moved out of the generic "diff.c" and instead into its own module. During that conversion, it was forgotten to remove the macros `DIFF_FLAG_IS_SET`, `DIFF_FLAG_ISNT_SET` and `DIFF_FLAG_SET`, which are now only used in "diff_generated.c". Remove those macros now.
Patrick Steinhardt 046b081a 2017-09-15T10:46:26 diff: cleanup hash ctx in `git_diff_patchid` After initializing the hash context in `git_diff_patchid`, we never proceed to call `git_hash_ctx_cleanup` on it. While this doesn't really matter on most hash implementations, this causes a memory leak on Win32 due to CNG system requiring a `malloc` call. Fix the memory leak by always calling `git_hash_ctx_cleanup` before exiting.
Edward Thomson 1560b580 2017-08-15T10:35:47 Merge pull request #4288 from pks-t/pks/include-fixups Include fixups
Patrick Steinhardt 0c7f49dd 2017-06-30T13:39:01 Make sure to always include "common.h" first Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.
Patrick Steinhardt 89a34828 2017-06-16T13:34:43 diff: implement function to calculate patch ID The upstream git project provides the ability to calculate a so-called patch ID. Quoting from git-patch-id(1): A "patch ID" is nothing but a sum of SHA-1 of the file diffs associated with a patch, with whitespace and line numbers ignored." Patch IDs can be used to identify two patches which are probably the same thing, e.g. when a patch has been cherry-picked to another branch. This commit implements a new function `git_diff_patchid`, which gets a patch and derives an OID from the diff. Note the different terminology here: a patch in libgit2 are the differences in a single file and a diff can contain multiple patches for different files. The implementation matches the upstream implementation and should derive the same OID for the same diff. In fact, some code has been directly derived from the upstream implementation. The upstream implementation has two different modes to calculate patch IDs, which is the stable and unstable mode. The old way of calculating the patch IDs was unstable in a sense that a different ordering the diffs was leading to different results. This oversight was fixed in git 1.9, but as git tries hard to never break existing workflows, the old and unstable way is still default. The newer and stable way does not care for ordering of the diff hunks, and in fact it is the mode that should probably be used today. So right now, we only implement the stable way of generating the patch ID.
Patrick Steinhardt 62a2fc06 2017-03-14T13:06:25 patch_generate: move `git_diff_foreach` to diff.c Now that the `git_diff_foreach` function does not depend on internals of the `git_patch_generated` structure anymore, we can easily move it to the actual diff code.
Edward Thomson 7166bb16 2016-04-25T00:35:48 introduce `git_diff_from_buffer` to parse diffs Parse diff files into a `git_diff` structure.
Edward Thomson 9be638ec 2016-04-19T15:12:18 git_diff_generated: abstract generated diffs
Edward Thomson d68cb736 2015-09-22T18:25:03 diff: include oid length in deltas Now that `git_diff_delta` data can be produced by reading patch file data, which may have an abbreviated oid, allow consumers to know that the id is abbreviated.
Patrick Steinhardt fe3057b4 2016-05-03T17:36:09 diff: simplify code for handling empty dirs When determining diffs between two iterators we may need to recurse into an unmatched directory for the "new" iterator when it is either a prefix to the current item of the "old" iterator or when untracked/ignored changes are requested by the user and the directory is untracked/ignored. When advancing into the directory and no files are found, we will get back `GIT_ENOTFOUND`. If so, we simply skip the directory, handling resulting unmatched old items in the next iteration. The other case of `iterator_advance_into` returning either `GIT_NOERROR` or any other error but `GIT_ENOTFOUND` will be handled by the caller, which will now either compare the first directory entry of the "new" iterator in case of `GIT_ENOERROR` or abort on other cases. Improve readability of the code to make the above logic more clear.
Edward Thomson 9eb9e5fa 2016-03-21T17:19:24 iterator: cleanups Remove some unused functions, refactor some ugliness.
Edward Thomson 67885532 2016-03-17T15:29:21 diff: stop processing nitem when its removed When a directory is removed out from underneath us, stop trying to manipulate it.
Edward Thomson 0e0589fc 2016-03-10T00:04:26 iterator: combine fs+workdir iterators more completely Drop some of the layers of indirection between the workdir and the filesystem iterators. This makes the code a little bit easier to follow, and reduces the number of unnecessary allocations a bit as well. (Prior to this, when we filter entries, we would allocate them, filter them and then free them; now we do the filtering before allocation.) Also, rename `git_iterator_advance_over_with_status` to just `git_iterator_advance_over`. Mostly because it's a fucking long-ass function name otherwise.
Arthur Schreiber 3679ebae 2016-02-11T23:37:52 Horrible fix for #3173.
Patrick Steinhardt 254e0a33 2015-11-24T13:43:43 diff: include commit message when formatting patch When formatting a patch as email we do not include the commit's message in the formatted patch output. Implement this and add a test that verifies behavior.
Edward Thomson 25e84f95 2015-11-23T15:49:54 checkout: only consider nsecs when built that way When examining the working directory and determining whether it's up-to-date, only consider the nanoseconds in the index entry when built with `GIT_USE_NSEC`. This prevents us from believing that the working directory is always dirty when the index was originally written with a git client that uinderstands nsecs (like git 2.x).
Carlos Martín Nieto 75a0ccf5 2015-11-12T19:53:09 Merge pull request #3170 from CmdrMoozy/nsec_fix git_index_entry__init_from_stat: set nsec fields in entry stats
Jason Haslam 3138ad93 2015-07-16T10:17:16 Add diff progress callback.
Vicent Marti 1e5e02b4 2015-10-27T17:26:04 pool: Simplify implementation
Edward Thomson 7499eae9 2015-10-22T09:30:41 diff: ignore nsecs when diffing Although our index contains the literal time present in the index, we do not read nanoseconds from disk, and thus we should not use them in any comparisons, lest we always think our working directory is dirty. Guard this behind a `GIT_USE_NSECS` for future improvement.
Axel Rasmussen 28659e50 2015-10-01T18:36:10 diff: refactor complex timestamp check into its own function
Axel Rasmussen 0226f7dd 2015-08-29T13:59:20 diff/index: respect USE_NSEC for racily clean file detection
Axel Rasmussen e7de893e 2015-06-01T13:43:54 cmake: add USE_NSEC, and only check nanosec m/ctime if enabled
Edward Thomson 8ab4d0e1 2015-09-12T15:32:18 diff: check pathspec on non-files When we're not doing pathspec matching, we let the iterator handle file matching for us. However, we can only trust the iterator to return *files* that match the pattern, because the iterator must return directories that are not strictly in the pathlist, but that are the parents of files that match the pattern, so that diff can later recurse into them. Thus, diff must examine non-files explicitly before including them in the delta list.
Edward Thomson 4a0dbeb0 2015-08-30T17:06:26 diff: use new iterator pathlist handling When using literal pathspecs in diff with `GIT_DIFF_DISABLE_PATHSPEC_MATCH` turn on the faster iterator pathlist handling. Updates iterator pathspecs to include directory prefixes (eg, `foo/`) for compatibility with `GIT_DIFF_DISABLE_PATHSPEC_MATCH`.
Edward Thomson ef206124 2015-07-28T19:55:37 Move filelist into the iterator handling itself.
Edward Thomson ed1c6446 2015-07-28T11:41:27 iterator: use an options struct instead of args
Pierre-Olivier Latour ccef5adb 2015-06-30T09:30:20 Added git_diff_index_to_index()
Pierre-Olivier Latour c2e1b058 2015-06-05T18:26:49 Only write index if updated when passing GIT_DIFF_UPDATE_INDEX When diffing the index with the workdir and GIT_DIFF_UPDATE_INDEX has been passed, the previous implementation was always writing the index to disk even if it wasn't modified.
Edward Thomson cc605e73 2015-06-23T23:52:03 Merge pull request #3222 from git-up/conflicted Fixed GIT_DELTA_CONFLICTED not returned in some cases
Pierre-Olivier Latour 8d8a2eef 2015-06-15T11:14:40 Fixed GIT_DELTA_CONFLICTED not returned in some cases If an index entry for a file that is not in HEAD is in conflicted state, when diffing HEAD with the index, the status field of the corresponding git_diff_delta was incorrectly reported as GIT_DELTA_ADDED instead of GIT_DELTA_CONFLICTED. This was due to handle_unmatched_new_item() initially setting the status to GIT_DELTA_CONFLICTED but then overriding it later with GIT_DELTA_ADDED.
Carlos Martín Nieto ff475375 2015-06-17T14:34:10 diff: check files with the same or newer timestamps When a file on the workdir has the same or a newer timestamp than the index, we need to perform a full check of the contents, as the update of the file may have happened just after we wrote the index. The iterator changes are such that we can reach inside the workdir iterator from the diff, though it may be better to have an accessor instead of moving these structs into the header.
Edward Thomson 96dd171e 2015-06-19T08:32:26 diff: preserve original mode in the index When updating the index during a diff, preserve the original mode, which prevents us from dropping the mode to what we have interpreted as on our system (eg, what the working directory claims it to be, which may be a lie on some systems.)
Edward Thomson 10549a2d 2015-05-19T18:26:04 Introduce `GIT_DIFF_FLAG_EXISTS` Mark the `old_file` and `new_file` sides of a delta with a new bit, `GIT_DIFF_FLAG_EXISTS`, that introduces that a particular side of the delta exists in the diff. This is useful for indicating whether a working directory item exists or not, in the presence of a conflict. Diff users may have previously used DELETED to determine this information.
Edward Thomson 253a05f7 2015-05-19T11:31:15 diff: prettify `maybe_modified` a little
Edward Thomson 9f545b9d 2015-05-19T11:23:59 introduce `git_index_entry_is_conflict` It's not always obvious the mapping between stage level and conflict-ness. More importantly, this can lead otherwise sane people to write constructs like `if (!git_index_entry_stage(entry))`, which (while technically correct) is unreadable. Provide a nice method to help avoid such messy thinking.
Edward Thomson 191e97a0 2015-05-18T18:15:17 diff conflicts: don't include incorrect ID Since a diff entry only concerns a single entry, zero the information for the index side of a conflict. (The index entry would otherwise erroneously include the lowest-stage index entry - generally the ancestor of a conflict.) Test that during status, the index side of the conflict is empty.
Edward Thomson 7877146f 2015-05-18T15:13:43 diff: for conflicts w/o workdir, blank nitem side Make sure that we provide a blanked nitem side when the item does not exist in the working directory.
Edward Thomson 7c948014 2015-05-14T14:00:29 diff/status: introduce conflicts When diffing against an index, return a new `GIT_DELTA_CONFLICTED` delta type for items that are conflicted. For a single file path, only one delta will be produced (despite the fact that there are multiple entries in the index). Index iterators now have the (optional) ability to return conflicts in the index. Prior to this change, they would be omitted, and callers (like diff) would omit conflicted index entries entirely.
Edward Thomson 1b6c26db 2015-05-13T17:47:26 diff: wrap the iterator functions Wrap the iterator current / advance functions so that we can extend them, but also handle GIT_ITEROVER cases in the iterator funcs instead of the callers.
Edward Thomson 4beab1f8 2015-03-31T16:29:35 checkout: break case-changes into delete/add When checking out with a case-insensitive working directory, we want to change the case of items in the working directory to reflect changes that occured in the checkout target. Diff now has an option to break case-changing renames into delete/add.
Pierre-Olivier Latour db853748 2015-04-15T15:28:03 Fixed GIT_DIFF_UPDATE_INDEX not being aware of executable bit changes In the prior implementation, enabling GIT_DIFF_UPDATE_INDEX would overwrite entries in the index with the ones generated from scanning the working if the OID was the same. Because this OID comparison ignores file modes, this means an file in the workdir with only an exec bit difference with the one in the index would end up being overwritten, resulting in the exec bit being loss. There might be other related bugs but the fix of comparing OIDs and file modes should address them all.
Pierre-Olivier Latour cc93ad16 2015-04-15T15:27:59 Removed unnecessary condition The variable noid is guaranteed to be zero at this point of the code path.
Pierre-Olivier Latour 35df76bd 2015-04-15T15:27:56 Use git_oid_cpy() instead of memcpy()
Pierre-Olivier Latour 8a3934e4 2015-03-11T19:29:36 Avoid retaining / releasing the index more than necessary when GIT_DIFF_UPDATE_INDEX is enabled
Carlos Martín Nieto 9a97f49e 2014-12-21T15:31:03 config: borrow refcounted references This changes the get_entry() method to return a refcounted version of the config entry, which you have to free when you're done. This allows us to avoid freeing the memory in which the entry is stored on a refresh, which may happen at any time for a live config. For this reason, get_string() has been forbidden on live configs and a new function get_string_buf() has been added, which stores the string in a git_buf which the user then owns. The functions which parse the string value takea advantage of the borrowing to parse safely and then release the entry.
Edward Thomson 795eaccd 2015-02-19T11:09:54 git_filter_opt_t -> git_filter_flag_t For consistency with the rest of the library, where an opt is an options *structure*.
Edward Thomson f1453c59 2015-02-12T12:19:37 Make our overflow check look more like gcc/clang's Make our overflow checking look more like gcc and clang's, so that we can substitute it out with the compiler instrinsics on platforms that support it. This means dropping the ability to pass `NULL` as an out parameter. As a result, the macros also get updated to reflect this as well.
Edward Thomson 392702ee 2015-02-09T23:41:13 allocations: test for overflow of requested size Introduce some helper macros to test integer overflow from arithmetic and set error message appropriately.
Jacques Germishuys 0beb7fe4 2014-12-24T11:44:17 Added missing error handling path
Carlos Martín Nieto f7fcb18f 2014-11-23T14:12:54 Plug leaks Valgrind is now clean except for libssl and libgcrypt.
Carlos Martín Nieto 62a617dc 2014-11-06T16:16:46 iterator: submodules are determined by an index or tree We cannot know from looking at .gitmodules whether a directory is a submodule or not. We need the index or tree we are comparing against to tell us. Otherwise we have to assume the entry in .gitmodules is stale or otherwise invalid. Thus we pass the index of the repository into the workdir iterator, even if we do not want to compare against it. This follows what git does, which even for `git diff <tree>`, it will consider staged submodules as such.
Pierre-Olivier Latour d88766c4 2014-10-26T10:40:46 Changed context_lines and interhunk_lines to uint32_t to match struct s_xdemitconf
Alan Rogers dc49e1b5 2014-06-04T15:36:28 Merge remote-tracking branch 'origin/development' into fix-git-status-list-new-unreadable-folder Conflicts: include/git2/diff.h
Alan Rogers 5e654200 2014-06-04T11:53:44 Implement GIT_DIFF_INCLUDE_UNREADABLE_AS_UNTRACKED
Alan Rogers c4366096 2014-05-23T00:28:12 Don't need to duplicate this code.
Alan Rogers e8cc3032 2014-05-22T19:43:58 Return GIT_DELTA_UNREADABLE for a file with a mode change
Alan Rogers 9067d5af 2014-05-21T23:13:24 Remove errant whitespace.
Alan Rogers 86c9d3da 2014-05-21T22:54:34 Return GIT_FILEMODE_UNREADABLE for files that fail to stat.
Alan Rogers 61bef72d 2014-05-20T23:57:40 Start adding GIT_DELTA_UNREADABLE and GIT_STATUS_WT_UNREADABLE.
Alan Rogers f47bc8ff 2014-05-20T18:16:04 Skip unreadable files for now.
Russell Belfer 2b52a0bf 2014-05-13T16:32:27 Increase use of config snapshots And decrease extra reload checks of config data.
Vicent Marti 03fcef18 2014-05-13T12:40:13 Merge pull request #2328 from libgit2/rb/how-broken-can-ignores-be Improve checks for ignore containment
Russell Belfer 5269008c 2014-05-06T16:01:49 Add filter options and ALLOW_UNSAFE Diff and status do not want core.safecrlf to actually raise an error regardless of the setting, so this extends the filter API with an additional options flags parameter and adds a flag so that filters can be applied with GIT_FILTER_OPT_ALLOW_UNSAFE, indicating that unsafe filter application should be downgraded from a failure to a warning.
Russell Belfer f554611a 2014-05-06T12:41:26 Improve checks for ignore containment The diff code was using an "ignored_prefix" directory to track if a parent directory was ignored that contained untracked files alongside tracked files. Unfortunately, when negative ignore rules were used for directories inside ignored parents, the wrong rules were applied to untracked files inside the negatively ignored child directories. This commit moves the logic for ignore containment into the workdir iterator (which is a better place for it), so the ignored-ness of a directory is contained in the frame stack during traversal. This allows a child directory to override with a negative ignore and yet still restore the ignored state of the parent when we traverse out of the child. Along with this, there are some problems with "directory only" ignore rules on container directories. Given "a/*" and "!a/b/c/" (where the second rule is a directory rule but the first rule is just a generic prefix rule), then the directory only constraint was having "a/b/c/d/file" match the first rule and not the second. This was fixed by having ignore directory-only rules test a rule against the prefix of a file with LEADINGDIR enabled. Lastly, spot checks for ignores using `git_ignore_path_is_ignored` were tested from the top directory down to the bottom to deal with the containment problem, but this is wrong. We have to test bottom to top so that negative subdirectory rules will be checked before parent ignore rules. This does change the behavior of some existing tests, but it seems only to bring us more in line with core Git, so I think those changes are acceptable.
Russell Belfer 7a2e56a3 2014-04-29T14:30:15 Get rid of redundant git_diff_options_init fn Since git_diff_init_options was introduced, remove this old fn.
Russell Belfer bc91347b 2014-04-30T11:16:31 Fix remaining init_options inconsistencies There were a couple of "init_opts()" functions a few more cases of structure initialization that I somehow missed.
Russell Belfer 702efc89 2014-04-30T10:57:42 Make init_options fns use unsigned ints and macro Use an unsigned int for the version and add a helper macro so the code is simplified (and so the error message is a common string).
Russell Belfer 9c8ed499 2014-04-29T15:05:58 Remove trace / add git_diff_perfdata struct + api
Russell Belfer b23b112d 2014-04-29T11:29:49 Add payloads, bitmaps to trace API This is a proposed adjustment to the trace APIs. This makes the trace levels into a bitmask so that they can be selectively enabled and adds a callback-level payload, plus a message-level payload. This makes it easier for me to a GIT_TRACE_PERF callbacks that are simply bypassed if the PERF level is not set.
Russell Belfer cd424ad5 2014-04-28T16:39:53 Add GIT_STATUS_OPT_UPDATE_INDEX and use trace API This adds an option to refresh the stat cache while generating status. It also rips out the GIT_PERF stuff I had an makes use of the trace API to keep statistics about what happens during diff.
Russell Belfer 94fb4aad 2014-04-28T14:48:41 Add diff option to update index stat cache When diff is scanning the working directory, if it finds a file where it is not sure if the index entry matches the working dir, it will recalculate the OID (which is pretty expensive). This adds a new flag to diff so that if the OID calculation finds that the file actually has not changed (i.e. just the modified time was altered or such), then it will refresh the stat cache in the index so that future calls to diff will not have to check the oid again.
Russell Belfer 0fc8e1f6 2014-04-28T14:34:55 Lay groundwork for updating stat cache in diff This reorganized the diff OID calculation to make it easier to correctly update the stat cache during a diff once the flags to do so are enabled. This includes marking the path of a git_index_entry as const so we can make a "fake" git_index_entry with a "const char *" path and not get warnings. I was a little surprised at how unobtrusive this change was, but I think it's probably a good thing.
Russell Belfer 8ef4e11a 2014-04-28T14:16:26 Skip diff oid calc when size definitely changed When we think the stat cache in the index seems valid and the size or mode of a file has definitely changed, then don't bother trying to recalculate the OID of the workdir bits to confirm that it is modified - just accept that it is modified. This can result in files that show as modified with no actual diff, but the behavior actually appears to match Git on the command line. This also includes a minor optimization to not perform a submodule lookup on the ".git" directory itself.
Russell Belfer 240f4af3 2014-04-28T14:04:29 Add build option for diff internal statistics
Vicent Marti 2ad51b81 2014-04-25T02:04:12 Merge pull request #2241 from libgit2/rb/stash-skip-submodules Improve stash and checkout for ignored + untracked items
Russell Belfer 219c89d1 2014-04-23T16:28:45 Treat ignored, empty, and untracked dirs different In the iterator, distinguish between ignores and empty directories so that diff and status can ignore empty directories, but checkout and stash can treat them as untracked items.
Russell Belfer 37da3685 2014-04-22T21:51:54 Make checkout match diff for untracked/ignored dir When diff finds an untracked directory, it emulates Git behavior by looking inside the directory to see if there are any untracked items inside it. If there are only ignored items inside the dir, then diff considers it ignored, even if there is no direct ignore rule for it. Checkout was not copying this behavior - when it found an untracked directory, it just treated it as untracked. Unfortunately, when combined with GIT_CHECKOUT_REMOVE_UNTRACKED, this made is seem that checkout (and stash, which uses checkout) was removing ignored items when you had only asked it to remove untracked ones. This commit moves the logic for advancing past an untracked dir while scanning for non-ignored items into an iterator helper fn, and uses that for both diff and checkout.
Russell Belfer 8d09efa2 2014-04-22T12:33:27 Use git_diff_get_stats in example/diff + refactor This takes the `--stat` and related example options in the example diff.c program and converts them to use the `git_diff_get_stats` API which nicely formats stats for you. I went to add bar-graph scaling to the stats formatter and noticed that the `git_diff_stats` structure was holding on to all of the `git_patch` objects. Unfortunately, each of these objects keeps the full text of the diff in memory, so this is very expensive. I ended up modifying `git_diff_stats` to keep just the data that it needs to keep and allowed it to release the patches. Then, I added width scaling to the output on top of that. In making the diff example program match 'git diff' output, I ended up removing an newline from the sumamry output which I then had to compensate for in the email formatting to match the expectations. Lastly, I went through and refactored the tests to use a couple of helper functions and reduce the overall amount of code there.
Russell Belfer 3b4c401a 2014-02-10T13:20:08 Decouple index iterator sort from index This makes the index iterator honor the GIT_ITERATOR_IGNORE_CASE and GIT_ITERATOR_DONT_IGNORE_CASE flags without modifying the index data itself. To take advantage of this, I had to export a number of the internal index entry comparison functions. I also wrote some new tests to exercise the capability.
Jacques Germishuys d8cc1fb6 2014-04-11T19:15:15 Introduce git_diff_format_email and git_diff_commit_as_email
Jacques Germishuys a56b418d 2014-04-11T22:57:15 Sanitize git_diff_format_email_options' summary parameter It will form part of the subject line and should thus be one line.
Russell Belfer eb7e17cc 2014-04-08T14:47:20 Update submodules with parent-tracked content This updates how libgit2 treats submodule-like directories that actually have tracked content inside of them. This is a strange corner case, but it seems that many people have abortive submodule setups and then just went ahead and added the files into the parent repository. In this case, we should just treat the submodule as if it was a normal directory. Libgit2 will still try to skip over real submodules and contained repositories that do not have tracked files inside them, but this adds some new handling for cases where the apparently submodule data is in conflict with the actual list of tracked files.