src/iterator.c


Log

Author Commit Date CI Message
Russell Belfer fc57471a 2013-04-18T14:13:53 More filesystem iterator cleanup Renamed the callback functions and made some minor rearrangements to clean up the flow of some code.
Russell Belfer ff0ddfa4 2013-04-17T15:56:31 Add filesystem iterator variant This adds a new variant iterator that is a raw filesystem iterator for scanning directories from a root. There is still more work to do to blend this with the working directory iterator.
Russell Belfer 71f85226 2013-04-18T11:11:38 Make workdir iterator use filesystem iterator This adds some hooks into the filesystem iterator so that the workdir iterator can just become a wrapper around it. Then we remove most of the workdir iterator code and just have it augment the filesystem iterator with skipping .git entries, updating the ignore stack, and checking for submodules.
Vicent Marti 575a54db 2013-04-10T16:55:29 object: Export git_object_dup
Russell Belfer 65025cb8 2013-03-18T17:24:13 Three submodule status bug fixes 1. Fix sort order problem with submodules where "mod" was sorting after "mod-plus" because they were being sorted as "mod/" and "mod-plus/". This involved pushing the "contains a .git entry" test significantly lower in the stack. 2. Reinstate behavior that a directory which contains a .git entry will be treated as a submodule during iteration even if it is not yet added to the .gitmodules. 3. Now that any directory containing .git is reported as submodule, we have to be more careful checking for GIT_EEXISTS when we do a submodule lookup, because that is the error code that is returned by git_submodule_lookup when you try to look up a directory containing .git that has no record in gitmodules or the index.
Russell Belfer 0c468633 2013-03-14T13:40:15 Improved tree iterator internals This updates the tree iterator internals to be more efficient. The tree_iterator_entry objects are now kept as pointers that are allocated from a git_pool, so that we may use git__tsort_r for sorting (which is better than qsort, given that the tree is likely mostly ordered already). Those tree_iterator_entry objects now keep direct pointers to the data they refer to instead of keeping indirect index values. This simplifies a lot of the data structure traversal code. This also adds bsearch to find the start item position for range- limited tree iterators, and is more explicit about using git_path_cmp instead of reimplementing it. The git_path_cmp changed a bit to make it easier for tree_iterators to use it (but it was barely being used previously, so not a big deal). This adds a git_pool_free_array function that efficiently frees a list of pool allocated pointers (which the tree_iterator keeps). Also, added new tests for the git_pool free list functionality that was not previously being tested (or used).
Russell Belfer bbb13646 2013-03-13T14:59:51 Fix workdir iterator bugs This fixes two bugs with the workdir iterator depth check: first that the depth was not being decremented and second that empty directories were counting against the depth even though a frame was not being created for them. This also fixes a bug with the ENOTFOUND return code for workdir iterators when you attempt to advance_into an empty directory. Actually, that works correctly, but it was incorrectly being propogated into regular advance() calls in some circumstances. Added new tests for the above that create a huge hierarchy on the fly and try using the workdir iterator to traverse it.
Russell Belfer a5eea2d7 2013-03-11T11:31:50 Stabilize order for equiv tree iterator entries Given a group of case-insensitively equivalent tree iterator entries, this ensures that the case-sensitively first trees will be used as the representative items. I.e. if you have conflicting entries "A/B/x", "a/b/x", and "A/b/x", this change ensures that the earliest entry "A/B/x" will be returned. The actual choice is not that important, but it is nice to have it stable and to have it been either the first or last item, as opposed to a random item from within the equivalent span.
Russell Belfer aec4f663 2013-03-11T10:37:12 Fix tree iterator advance using wrong name compare Tree iterator advance was moving forward without taking the filemode of the entries into account, equating "a" and "a/". This makes the tree entry comparison code more easily reusable and fixes the problem.
Russell Belfer 92028ea5 2013-03-11T09:53:49 Fix tree iterator path for tree issue + cleanups This fixes an off by one error for generating full paths for tree entries in tree iterators when INCLUDE_TREES is set. Also, contains a bunch of small code cleanups with a couple of small utility functions and macro changes to eliminate redundant code.
Russell Belfer 61c7b61e 2013-03-10T22:38:53 Use correct case path in icase tree iterator If there are case-ambiguities in the path of a case insensitive tree iterator, it will now rewrite the entire path when it gives the path name to an entry, so a tree with "A/b/C/d.txt" and "a/B/c/E.txt" will give the true full paths (instead of case- folding them both to "A/B/C/d.txt" or "a/b/c/E.txt" or something like that.
Russell Belfer e40f1c2d 2013-03-08T16:39:57 Make tree iterator handle icase equivalence There is a serious bug in the previous tree iterator implementation. If case insensitivity resulted in member elements being equivalent to one another, and those member elements were trees, then the children of the colliding elements would be processed in sequence instead of in a single flattened list. This meant that the tree iterator was not truly acting like a case-insensitive list. This completely reworks the tree iterator to manage lists with case insensitive equivalence classes and advance through the items in a unified manner in a single sorted frame. It is possible that at a future date we might want to update this to separate the case insensitive and case sensitive tree iterators so that the case sensitive one could be a minimal amount of code and the insensitive one would always know what it needed to do without checking flags. But there would be so much shared code between the two, that I'm not sure it that's a win. For now, this gets what we need. More tests are needed, though.
Russell Belfer 9bea03ce 2013-03-06T15:16:34 Add INCLUDE_TREES, DONT_AUTOEXPAND iterator flags This standardizes iterator behavior across all three iterators (index, tree, and working directory). Previously the working directory iterator behaved differently from the other two. Each iterator can now operate in one of three modes: 1. *No tree results, auto expand trees* means that only non- tree items will be returned and when a tree/directory is encountered, we will automatically descend into it. 2. *Tree results, auto expand trees* means that results will be given for every item found, including trees, but you only need to call normal git_iterator_advance to yield every item (i.e. trees returned with pre-order iteration). 3. *Tree results, no auto expand* means that calling the normal git_iterator_advance when looking at a tree will not descend into the tree, but will skip over it to the next entry in the parent. Previously, behavior 1 was the only option for index and tree iterators, and behavior 3 was the only option for workdir. The main public API implications of this are that the `git_iterator_advance_into()` call is now valid for all iterators, not just working directory iterators, and all the existing uses of working directory iterators explicitly use the GIT_ITERATOR_DONT_AUTOEXPAND (for now). Interestingly, the majority of the implementation was in the index iterator, since there are no tree entries there and now have to fake them. The tree and working directory iterators only required small modifications.
Russell Belfer cc216a01 2013-03-05T16:29:04 Retire spoolandsort iterator Since the case sensitivity is moved into the respective iterators, this removes the spoolandsort iterator code.
Russell Belfer 169dc616 2013-03-05T16:10:05 Make iterator APIs consistent with standards The iterator APIs are not currently consistent with the parameter ordering of the rest of the codebase. This rearranges the order of parameters, simplifies the naming of a number of functions, and makes somewhat better use of macros internally to clean up the iterator code. This also expands the test coverage of iterator functionality, making sure that case sensitive range-limited iteration works correctly.
Philip Kelley 11d9f6b3 2013-01-27T14:17:07 Vector improvements and their fallout
Russell Belfer cce548e3 2013-01-22T15:28:25 Fix case sensitivity bug with tree iterators With the new code to make tree iterators support ignore_case, there is a bug in setting the start entry for range bounded iterators where memcmp was being used instead of strncasecmp. This fixes that and expands the tree iterator test to cover the cases that were broken.
Russell Belfer fffe429a 2013-01-10T11:11:42 Shortcut spool and sort for empty iterator
Russell Belfer 25423d03 2013-01-09T16:07:54 Support case insensitive tree iterators and status This makes tree iterators directly support case insensitivity by using a secondary index that can be sorted by icase. Also, this fixes the ambiguity check in the git_status_file API to also be case insensitive. Lastly, this adds new test cases for case insensitive range boundary checking for all types of iterators. With this change, it should be possible to deprecate the spool and sort iterator, but I haven't done that yet.
Russell Belfer 134d8c91 2013-01-08T15:53:13 Update iterator API with flags for ignore_case This changes the iterator API so that flags can be passed in to the constructor functions to control the ignore_case behavior. At this point, the flags are not supported on tree iterators (i.e. there is no functional change over the old API), but the API changes are all made to accomodate this. By the way, I went with a flags parameter because in the future I have a couple of other ideas for iterator flags that will make it easier to fix some diff/status/checkout bugs.
Russell Belfer 4b181037 2013-01-08T13:39:15 Minor iterator API cleanups In preparation for further iterator changes, this cleans up a few small things in the iterator API: * removed the git_iterator_for_repo_index_range API * made git_iterator_free not be inlined * minor param name and test function name tweaks
Edward Thomson 359fc2d2 2013-01-08T17:07:25 update copyrights
Russell Belfer 546d65a8 2013-01-02T17:01:34 Fix up spoolandsort iterator usage The spoolandsort iterator changes got sort-of cherry picked out of this branch and so I dropped the commit when rebasing; however, there were a few small changes that got dropped as well (since the version merged upstream wasn't quite the same as what I dropped).
Russell Belfer 5cf9875a 2012-12-18T15:19:24 Add index updating to checkout Make checkout update entries in the index for all files that are updated and/or removed, unless flag GIT_CHECKOUT_DONT_UPDATE_INDEX is given. To do this, iterators were extended to allow a little more introspection into the index being iterated over, etc.
Russell Belfer f616a36b 2012-12-27T22:25:52 Make spoolandsort a pushable iterator behavior An earlier change to `git_diff_from_iterators` introduced a memory leak where the allocated spoolandsort iterator was not returned to the caller and thus not freed. One proposal changes all iterator APIs to use git_iterator** so we can reallocate the iterator at will, but that seems unexpected. This commit makes it so that an iterator can be changed in place. The callbacks are isolated in a separate structure and a pointer to that structure can be reassigned by the spoolandsort extension. This means that spoolandsort doesn't create a new iterator; it just allocates a new block of callbacks (along with space for its own extra data) and swaps that into the iterator. Additionally, since spoolandsort is only needed to switch the case sensitivity of an iterator, this simplifies the API to only take the ignore_case boolean and to be a no-op if the iterator already matches the requested case sensitivity.
Russell Belfer 91e7d263 2012-12-10T15:29:44 Fix iterator reset and add reset ranges The `git_iterator_reset` command has not been working in all cases particularly when there is a start and end range. This fixes it and adds tests for it, and also extends it with the ability to update the start/end range strings when an iterator is reset.
Russell Belfer 9950d27a 2012-12-06T13:26:58 Clean up iterator APIs This removes the need to explicitly pass the repo into iterators where the repo is implied by the other parameters. This moves the repo to be owned by the parent struct. Also, this has some iterator related updates to the internal diff API to lay the groundwork for checkout improvements.
Edward Thomson b2414661 2012-11-28T22:43:55 status should ignore conflicts entries in the index
Vicent Martí e2934db2 2012-11-29T02:05:46 Merge pull request #1090 from arrbee/ignore-invalid-by-default Ignore invalid entries by default
Russell Belfer a8122b5d 2012-11-21T15:39:03 Fix warnings on Win64 build
Russell Belfer 16248ee2 2012-11-21T11:03:07 Fix up some missing consts in tree & index This fixes some missed places where we can apply const-ness to various public APIs. There are still some index and tree APIs that cannot take const pointers because we sort our `git_vectors` lazily and so we can't reliably bsearch the index and tree content without applying a `git_vector_sort()` first. This also fixes some missed places where size_t can be used and where const can be applied to a couple internal functions.
Ben Straub f45d51ff 2012-11-20T19:57:46 API updates for index.h
Russell Belfer e120123e 2012-11-20T14:01:46 API review / update for tree.h
Russell Belfer d46b0a04 2012-11-19T16:34:44 Improve iterator ignoring .git file The workdir iterator has always tried to ignore .git files, but it turns out there were some bugs. This makes it more robust at ignoring .git files. This also makes iterators always check ".git" case insensitively regardless of the properties of the system. This will make libgit2 skip ".GIT" and the like. This is different from core git, but on systems with case insensitive but case preserving file systems, allowing ".GIT" to be added is problematic.
Russell Belfer bad68c0a 2012-11-13T14:02:59 Add iterator for git_index object The index iterator could previously only be created from a repo object, but this allows creating an iterator from a `git_index` object instead (while keeping, though renaming, the old function).
Russell Belfer 55cbd05b 2012-11-08T16:56:34 Some diff refactorings to help code reuse There are some diff functions that are useful in a rewritten checkout and this lays some groundwork for that. This contains three main things: 1. Share the function diff uses to calculate the OID for a file in the working directory (now named `git_diff__oid_for_file` 2. Add a `git_diff__paired_foreach` function to iterator over two diff lists concurrently. Convert status to use it. 3. Move all the string/prefix/index entry comparisons into function pointers inside the `git_diff_list` object so they can be switched between case sensitive and insensitive versions. This makes them easier to reuse in various functions without replicating logic. As part of this, move a couple of index functions out of diff.c and into index.c.
Russell Belfer 220d5a6c 2012-11-08T16:45:25 Make iterator ignore eval lazy This makes it so that the check if a file is ignored will be deferred until requested on the workdir iterator, instead of aggressively evaluating the ignore rules for each entry. This should improve performance because there will be no need to check ignore rules for files that are already in the index.
Edward Thomson f45ec1a0 2012-10-29T20:04:21 index refactoring
Russell Belfer 0d64bef9 2012-10-05T15:56:57 Add complex checkout test and then fix checkout This started as a complex new test for checkout going through the "typechanges" test repository, but that revealed numerous issues with checkout, including: * complete failure with submodules * failure to create blobs with exec bits * problems when replacing a tree with a blob because the tree "example/" sorts after the blob "example" so the delete was being processed after the single file blob was created This fixes most of those problems and includes a number of other minor changes that made it easier to do that, including improving the TYPECHANGE support in diff/status, etc.
Russell Belfer dfbff793 2012-10-08T15:14:12 Fix a few diff bugs with directory content There are a few cases where diff should leave directories in the diff list if we want to match core git, such as when the directory contains a .git dir. That feature was lost when I introduced some of the new submodule handling. This restores that and then fixes a couple of related to diff output that are triggered by having diffs with directories in them. Also, this adds a new flag that can be passed to diff if you want diff output to actually include the file content of any untracked files.
Philip Kelley ec40b7f9 2012-09-17T15:42:41 Support for core.ignorecase
nulltoken ced8d142 2012-08-22T11:30:55 errors: deploy GIT_EBAREREPO usage
Russell Belfer 5f4a61ae 2012-08-09T19:43:25 Working implementation of git_submodule_status This is a big redesign of the git_submodule_status API and the implementation of the redesigned API. It also fixes a number of bugs that I found in other parts of the submodule API while writing the tests for the status part. This also fixes a couple of bugs in the iterators that had not been noticed before - one with iterating when there is a gitlink (i.e. separate-work-dir) and one where I was treating anything even vaguely submodule-like as a submodule, more aggressively than core git does.
Vicent Martí 904b67e6 2012-05-18T01:48:50 errors: Rename error codes
Vicent Martí e172cf08 2012-05-18T01:21:06 errors: Rename the generic return codes
Russell Belfer 6e5c4af0 2012-05-17T14:21:10 Fix workdir iterators on empty directories Creating a workdir iterator on a directory with absolutely no files was returning an error (GIT_ENOTFOUND) instead of an iterator for nothing. This fixes that and includes two new tests that cover that case.
Vicent Martí 9d0011fd 2012-05-16T19:23:47 tree: Naming conventions
Russell Belfer 41a82592 2012-05-15T14:17:39 Ranged iterators and rewritten git_status_file The goal of this work is to rewrite git_status_file to use the same underlying code as git_status_foreach. This is done in 3 phases: 1. Extend iterators to allow ranged iteration with start and end prefixes for the range of file names to be covered. 2. Improve diff so that when there is a pathspec and there is a common non-wildcard prefix of the pathspec, it will use ranged iterators to minimize excess iteration. 3. Rewrite git_status_file to call git_status_foreach_ext with a pathspec that covers just the one file being checked. Since ranged iterators underlie the status & diff implementation, this is actually fairly efficient. The workdir iterator does end up loading the contents of all the directories down to the single file, which should ideally be avoided, but it is pretty good.
Russell Belfer 7e000ab2 2012-05-08T15:03:59 Add support for diffing index with no HEAD When a repo is first created, there is no HEAD yet and attempting to diff files in the index was showing nothing because a tree iterator could not be constructed. This adds an "empty" iterator and falls back on that when the head cannot be looked up.
Russell Belfer bfc9ca59 2012-03-28T16:45:36 Added submodule API and use in status When processing status for a newly checked out repo, it is possible that there will be submodules that have not yet been initialized. The only way to distinguish these from untracked directories is to have some knowledge of submodules. This commit adds a new submodule API which, given a name or path, can determine if it appears to be a submodule and can give information about the submodule.
Russell Belfer 875bfc5f 2012-03-25T21:26:48 Fix error in tree iterator when popping up trees There was an error in the tree iterator where it would delete two tree levels instead of just one when popping up a tree level. Unfortunately the test data for the tree iterator did not have any deep trees with subtrees in the middle of the tree items, so this problem went unnoticed. This contains the 1-line fix plus new test data and tests that reveal the issue.
Russell Belfer 66142ae0 2012-03-22T10:44:36 New status fixes This adds support for roughly-right tracking of submodules (although it does not recurse into submodules to detect internal modifications a la core git), and it adds support for including unmodified files in diff iteration if requested.
Russell Belfer 7c7ff7d1 2012-03-19T16:10:11 Migrate index, oid, and utils to new errors This includes a few cleanups that came up while converting these files. This commit introduces a could new git error classes, including the catchall class: GITERR_INVALID which I'm using as the class for invalid and out of range values which are detected at too low a level of library to use a higher level classification. For example, an overflow error in parsing an integer or a bad letter in parsing an OID string would generate an error in this class.
Russell Belfer 0d0fa7c3 2012-03-16T15:56:01 Convert attr, ignore, mwindow, status to new errors Also cleaned up some previously converted code that still had little things to polish.
Russell Belfer deafee7b 2012-03-14T17:36:15 Continue error conversion This converts blob.c, fileops.c, and all of the win32 files. Also, various minor cleanups throughout the code. Plus, in testing the win32 build, I cleaned up a bunch (although not all) of the warnings with the 64-bit build.
Vicent Martí 9d160ba8 2012-03-06T01:37:56 diff: Fix rebase breackage
Russell Belfer 74fa4bfa 2012-02-28T16:14:47 Update diff to use iterators This is a major reorganization of the diff code. This changes the diff functions to use the iterators for traversing the content. This allowed a lot of code to be simplified. Also, this moved the functions relating to outputting a diff into a new file (diff_output.c). This includes a number of other changes - adding utility functions, extending iterators, etc. plus more tests for the diff code. This also takes the example diff.c program much further in terms of emulating git-diff command line options.
schu 01269540 2012-02-23T16:51:07 Fix -Wuninitialized warning Signed-off-by: schu <schu-github@schulog.org>
Russell Belfer 0534641d 2012-02-22T15:15:35 Fix iterators based on pull request feedback This update addresses all of the feedback in pull request #570. The biggest change was to create actual linked list stacks for storing the tree and workdir iterator state. This cleaned up the code a ton. Additionally, all of the static functions had their 'git_' prefix removed, and a lot of other unnecessary changes were removed from the original patch.
Russell Belfer da337c80 2012-02-22T11:22:33 Iterator improvements from diff implementation This makes two changes to iterator behavior: first, advance can optionally do the work of returning the new current value. This is such a common pattern that it really cleans up usage. Second, for workdir iterators, this removes automatically iterating into directories. That seemed like a good idea, but when an entirely new directory hierarchy is introduced into the workdir, there is no reason to iterate into it if there are no corresponding entries in the tree/index that it is being compared to. This second change actually wasn't a lot of code because not descending into directories was already the behavior for ignored directories. This just extends that to all directories.
Russell Belfer b6c93aef 2012-02-21T14:46:24 Uniform iterators for trees, index, and workdir This create a new git_iterator type of object that provides a uniform interface for iterating over the index, an arbitrary tree, or the working directory of a repository. As part of this, git ignore support was extended to support push and pop of directory-based ignore files as the working directory is being traversed (so the array of ignores does not have to be recreated at each directory during traveral). There are a number of other small utility functions in buffer, path, vector, and fileops that are included in this patch that made the iterator implementation cleaner.