src/tree.c


Log

Author Commit Date CI Message
Edward Thomson aadad405 2016-02-11T14:28:31 tree: zap warnings around `size_t` vs `uint16_t`
Carlos Martín Nieto fc436469 2015-12-06T22:51:00 tree: mark a tree as already sorted The trees are sorted on-disk, so we don't have to go over them again. This cuts almost a fifth of time spent parsing trees.
Carlos Martín Nieto 0174f21b 2015-12-02T18:56:31 tree: use a specialised mode parse function Instead of going out to strtol, which is made to parse generic numbers, copy a parse function from git which is specialised for file modes.
Patrick Steinhardt 9487585d 2015-12-01T14:19:29 tree: mark cloned tree entries as un-pooled When duplicating a `struct git_tree_entry` with `git_tree_entry_dup` the resulting structure is not allocated inside a memory pool. As we do a 1:1 copy of the original struct, though, we also copy the `pooled` field, which is set to `true` for pooled entries. This results in a huge memory leak as we never free tree entries that were duplicated from a pooled tree entry. Fix this by marking the newly duplicated entry as un-pooled.
Carlos Martín Nieto 95ae3520 2015-11-30T17:32:18 tree: ensure the entry filename fits in 16 bits Return an error in case the length is too big. Also take this opportunity to have a single allocating function for the size and overflow logic.
Carlos Martín Nieto ee42bb0e 2015-11-28T19:18:29 tree: make path len uint16_t and avoid holes This reduces the size of the struct from 32 to 26 bytes, and leaves a single padding byte at the end of the struct (which comes from the zero-length array).
Carlos Martín Nieto 2580077f 2015-11-15T00:44:02 tree: calculate the filename length once We already know the size due to the `memchr()` so use that information instead of calling `strlen()` on it.
Carlos Martín Nieto ed970748 2015-11-14T23:50:06 tree: pool the entry memory allocations These are rather small allocations, so we end up spending a non-trivial amount of time asking the OS for memory. Since these entries are tied to the lifetime of their tree, we can give the tree a pool so we speed up the allocations.
Carlos Martín Nieto 7132150d 2015-11-14T23:46:21 tree: avoid advancing over the filename multiple times We've already looked at the filename with `memchr()` and then used `strlen()` to allocate the entry. We already know how much we have to advance to get to the object id, so add the filename length instead of looking at each byte again.
Carlos Martín Nieto 84511143 2015-03-12T01:49:07 tree: add more correct error messages for not found Don't use the full path, as that's not what we are asserting does not exist, but just the subpath we were looking up.
Stefan Widgren c8e02b87 2015-02-15T21:07:05 Remove extra semicolon outside of a function Without this change, compiling with gcc and pedantic generates warning: ISO C does not allow extra ‘;’ outside of a function.
Edward Thomson f1453c59 2015-02-12T12:19:37 Make our overflow check look more like gcc/clang's Make our overflow checking look more like gcc and clang's, so that we can substitute it out with the compiler intrinsics on platforms that support it. This means dropping the ability to pass `NULL` as an out parameter. As a result, the macros also get updated to reflect this as well.
Edward Thomson 2884cc42 2015-02-11T09:39:38 overflow checking: don't make callers set oom Have the ALLOC_OVERFLOW testing macros also simply set_oom in the case where a computation would overflow, so that callers don't need to.
Edward Thomson 392702ee 2015-02-09T23:41:13 allocations: test for overflow of requested size Introduce some helper macros to test integer overflow from arithmetic and set error message appropriately.
Carlos Martín Nieto 208a2c8a 2014-12-27T12:09:11 treebuilder: rename _create() to _new() This function is a constructor, so let's name it like one and leave _create() for the reference functions, which do create/write the reference.
Edward Thomson dce7b1a4 2014-12-16T19:24:04 treebuilder: take a repository for path validation Path validation may be influenced by `core.protectHFS` and `core.protectNTFS` configuration settings, thus treebuilders can take a repository to influence their configuration.
Vicent Marti 62155257 2014-11-25T00:14:52 tree: Check for `.git` with case insensitivity
Carlos Martín Nieto 7465e873 2014-09-29T09:07:41 index: fill the tree cache on write-tree An obvious place to fill the tree cache is on write-tree, as we're guaranteed to be able to fill in the whole tree cache. The way this commit does this is not the most efficient, as we read the root tree from the odb instead of filling in the cache as we go along, but it fills the cache such that successive operations (and persisting the index to disk) will be able to take advantage of the cache, and it reuses the code we already have for filling the cache. Filling in the cache as we create the trees would require some reallocation of the children vector, which is currently not possible with our pool implementation. A different data structure would likely allow us to perform this operation at a later date.
Carlos Martín Nieto c2f8b215 2014-09-28T07:00:49 index: write out the tree cache extension Keeping the cache around after read-tree is only one part of the optimisation opportunities. In order to share the cache between program instances, we need to write the TREE extension to the index. Do so, taking the opportunity to rename 'entries' to 'entry_count' to match the name given in the format description. The included test is rather trivial, but works as a sanity check.
Carlos Martín Nieto 966fb207 2014-06-25T21:25:44 tree: free in error conditions As reported by coverity, we would leak some memory in error conditions.
Carlos Martín Nieto fcc60066 2014-06-09T22:59:32 treentry: no need for manual size book-keeping We can simply ask the hashmap.
Carlos Martín Nieto 978fbb4c 2014-06-09T22:45:23 treebuilder: don't keep removed entries around If the user wants to keep a copy for themselves, they should make a copy. It adds unnecessary complexity to make sure the returned entries are valid until the builder is cleared.
Carlos Martín Nieto 4d3f1f97 2014-06-09T04:38:22 treebuilder: use a map instead of vector to store the entries Finding a filename in a vector means we need to resort it every time we want to read from it, which includes every time we want to write to it as well, as we want to find duplicate keys. A hash-map fits what we want to do much more accurately, as we do not care about sorting, but just the particular filename. We still keep removed entries around, as the interface lets you assume they were going to be around until the treebuilder is cleared or freed, but in this case that involves an append to a vector in the filter case, which can now fail. The only time we care about sorting is when we write out the tree, so let's make that the only time we do any sorting.
Carlos Martín Nieto 2c11d2ee 2014-06-09T23:23:53 treebuilder: insert sorted By inserting in the right position, we can keep the vector sorted, making entry insertion almost twice as fast.
Russell Belfer 882c7742 2014-02-04T10:01:37 Convert pqueue to just be a git_vector This updates the git_pqueue to simply be a set of specialized init/insert/pop functions on a git_vector. To preserve the pqueue feature of having a fixed size heap, I converted the "sorted" field in git_vectors to a more general "flags" field so that pqueue could mix in its own flag. This had a bunch of ramifications because a number of places were directly looking at the vector "sorted" field - I added a couple new git_vector helpers (is_sorted, set_sorted) so the specific representation of this information could be abstracted.
Carlos Martín Nieto f000ee4e 2014-01-24T18:23:46 tree: remove legacy 'oid' naming Rename git_tree_entry_byoid() to _byid() as per the convention.
Carlos Martín Nieto d541170c 2014-01-24T11:36:41 index: rename an entry's id to 'id' This was not converted when we converted the rest, so do it now.
Arthur Schreiber 529f342a 2014-01-14T21:33:59 Align git_tree_entry_dup.
Russell Belfer 26c1cb91 2013-12-09T09:44:03 One more rename/cleanup for callback err functions
Russell Belfer 25e0b157 2013-12-06T15:07:57 Remove converting user error to GIT_EUSER This changes the behavior of callbacks so that the callback error code is not converted into GIT_EUSER and instead we propagate the return value through to the caller. Instead of using the giterr_capture and giterr_restore functions, we now rely on all functions to pass back the return value from a callback. To avoid having a return value with no error message, the user can call the public giterr_set_str or some such function to set an error message. There is a new helper 'giterr_set_callback' that functions can invoke after making a callback which ensures that some error message was set in case the callback did not set one. In places where the sign of the callback return value is meaningful (e.g. positive to skip, negative to abort), only the negative values are returned back to the caller, obviously, since the other values allow for continuing the loop. The hardest parts of this were in the checkout code where positive return values were overloaded as meaningful values for checkout. I fixed this by adding an output parameter to many of the internal checkout functions and removing the overload. This added some code, but it is probably a better implementation. There is some funkiness in the network code where user provided callbacks could be returning a positive or a negative value and we want to rely on that to cancel the loop. There are still a couple places where a user error might get turned into GIT_EUSER there, I think, though none exercised by the tests.
Russell Belfer dab89f9b 2013-12-04T21:22:57 Further EUSER and error propagation fixes This continues auditing all the places where GIT_EUSER is being returned and making sure to clear any existing error using the new giterr_user_cancel helper. As a result, places that relied on intercepting GIT_EUSER but having the old error preserved also needed to be cleaned up to correctly stash and then retrieve the actual error. Additionally, as I encountered places where error codes were not being propagated correctly, I tried to fix them up. A number of those fixes are included in this commit as well.
Carlos Martín Nieto 13f670a5 2013-04-15T09:07:57 tree: allow retrieval of raw attributes When a tool needs to recreate the tree object (for example an interface to another VCS), it needs to use the raw attributes, forgoing any normalization.
wilke d7fc2eb2 2013-09-13T21:36:39 Fix memory leak in git_tree_walk on error or when stopping the walk from the supplied callback
wilke 4e01e302 2013-09-13T21:21:33 Prevent git_tree_walk 'skip entry' callback return code from leaking through as the return value of git_tree_walk
Russell Belfer a7fcc44d 2013-09-05T16:14:32 Better macro name for is-exec-bit-set test
Russell Belfer f240acce 2013-09-05T11:20:12 Add more file mode permissions macros This adds some more macros for some standard operations on file modes, particularly related to permissions, and then updates a number of places around the code base to use the new macros.
Russell Belfer 114f5a6c 2013-06-10T10:10:39 Reorganize diff and add basic diff driver This is a significant reorganization of the diff code to break it into a set of more clearly distinct files and to document the new organization. Hopefully this will make the diff code easier to understand and to extend. This adds a new `git_diff_driver` object that looks up diff driver information from the attributes and the config so that things like function content in diff headers can be provided. The full driver spec is not implemented in the commit - this is focused on the reorganization of the code and putting the driver hooks in place. This also removes a few #includes from src/repository.h that were overbroad, but as a result required extra #includes in a variety of places since including src/repository.h no longer results in pulling in the whole world.
Russell Belfer 58206c9a 2013-05-16T10:38:27 Add cat-file example and increase const use in API This adds an example implementation that emulates git cat-file. It is a convenient and relatively simple example of getting data out of a repository. Implementing this also revealed that there are a number of APIs that are still not using const pointers to objects that really ought to be. The main cause of this is that `git_vector_bsearch` may need to call `git_vector_sort` before doing the search, so a const pointer to the vector is not allowed. However, for tree objects, with a little care, we can ensure that the vector of tree entries is always sorted and allow lookups to take a const pointer. Also, the missing const in commit objects just looks like an oversight.
Russell Belfer b60d95c7 2013-05-01T15:55:54 clarify error propagation
Vicent Marti 0b726701 2013-04-30T13:13:38 object: Explicitly define helper API methods for all obj types
Russell Belfer 203d5b0e 2013-04-29T18:20:58 Some cleanups Removed useless prototype and renamed object typecast functions declaration macro.
Russell Belfer d7761102 2013-04-29T14:22:06 Standardize cast versions of git_object accessors This removes the GIT_INLINE versions of the simple git_object accessors and standardizes them with a helper macro in src/object.h to build the function bodies.
Russell Belfer 116bbdf0 2013-04-16T12:08:21 clean up tree pointer casting
Russell Belfer 3f27127d 2013-04-16T11:51:02 Simplify object table parse functions This unifies the object parse functions into one signature that takes an odb_object.
Russell Belfer 78606263 2013-04-15T00:05:44 Add callback to git_objects_table This adds create and free callback to the git_objects_table so that more of the creation and destruction of objects can be table driven instead of using switch statements. This also makes the semantics of certain object creation functions consistent so that we can make better use of function pointers. This also fixes a theoretical error case where an object allocation fails and we end up storing NULL into the cache.
Russell Belfer badd85a6 2013-04-10T17:10:17 Use git_odb_object_data/_size wherever possible This uses the odb object accessors so we can change the internals more easily...
Vicent Marti 8842c75f 2013-04-03T22:30:07 What has science done.
Carlos Martín Nieto f90391ea 2013-04-18T14:47:54 treebuilder: don't overwrite the error message
Russell Belfer 0c468633 2013-03-14T13:40:15 Improved tree iterator internals This updates the tree iterator internals to be more efficient. The tree_iterator_entry objects are now kept as pointers that are allocated from a git_pool, so that we may use git__tsort_r for sorting (which is better than qsort, given that the tree is likely mostly ordered already). Those tree_iterator_entry objects now keep direct pointers to the data they refer to instead of keeping indirect index values. This simplifies a lot of the data structure traversal code. This also adds bsearch to find the start item position for range- limited tree iterators, and is more explicit about using git_path_cmp instead of reimplementing it. The git_path_cmp changed a bit to make it easier for tree_iterators to use it (but it was barely being used previously, so not a big deal). This adds a git_pool_free_array function that efficiently frees a list of pool allocated pointers (which the tree_iterator keeps). Also, added new tests for the git_pool free list functionality that was not previously being tested (or used).
Philip Kelley cb53669e 2013-03-01T16:38:13 Rename function to __ prefix
Philip Kelley 3f0d0c85 2013-03-01T15:44:18 Disable ignore_case when writing the index to a tree
Russell Belfer e2237179 2013-02-20T10:58:56 Some code cleanups in tree.c This replaces most of the explicit vector iteration with calls to git_vector_foreach, adds in some git__free and giterr_clear calls to clean up during some error paths, and a couple of other code simplifications.
Russell Belfer 93ab370b 2013-02-20T10:50:01 Store treebuilder length separately from entries vec The treebuilder entries vector flags removed items which means we can't rely on the entries vector length to accurately get the number of entries. This adds an entrycount value and maintains it while updating the treebuilder entries.
nulltoken 3ad05221 2013-02-05T16:52:56 Fix MSVC compilation warnings Fix #1308
Russell Belfer 4657fc1c 2013-01-29T13:54:08 Merge pull request #1285 from phkelley/vector Vector improvements and their fallout
John Wiegley 5fb98206 2013-01-28T15:56:04 Added git_treebuilder_entrycount Conflicts: src/tree.c
Philip Kelley 11d9f6b3 2013-01-27T14:17:07 Vector improvements and their fallout
Russell Belfer 98527b5b 2013-01-09T16:03:35 Add git_tree_entry_cmp and git_tree_entry_icmp This adds a new external API git_tree_entry_cmp and a new internal API git_tree_entry_icmp for sorting tree entries. The case insensitive one is internal only because general users should never be seeing case-insensitively sorted trees.
Edward Thomson 359fc2d2 2013-01-08T17:07:25 update copyrights
Russell Belfer 91e7d263 2012-12-10T15:29:44 Fix iterator reset and add reset ranges The `git_iterator_reset` command has not been working in all cases particularly when there is a start and end range. This fixes it and adds tests for it, and also extends it with the ability to update the start/end range strings when an iterator is reset.
Russell Belfer 9950d27a 2012-12-06T13:26:58 Clean up iterator APIs This removes the need to explicitly pass the repo into iterators where the repo is implied by the other parameters. This moves the repo to be owned by the parent struct. Also, this has some iterator related updates to the internal diff API to lay the groundwork for checkout improvements.
Carlos Martín Nieto f1c75b94 2012-12-07T15:16:41 tree: relax the filemode parser There are many different broken filemodes in the wild so we need to protect against them and give something useful up the chain. Don't fail when reading a tree from the ODB but normalize the mode as best we can. As 664 is no longer a mode that we consider to be valid and gets normalized to 644, we can stop accepting it in the treebuilder. The library won't expose it to the user, so any invalid modes are a bug.
Vicent Martí e2934db2 2012-11-29T02:05:46 Merge pull request #1090 from arrbee/ignore-invalid-by-default Ignore invalid entries by default
Russell Belfer a8122b5d 2012-11-21T15:39:03 Fix warnings on Win64 build
Russell Belfer 16248ee2 2012-11-21T11:03:07 Fix up some missing consts in tree & index This fixes some missed places where we can apply const-ness to various public APIs. There are still some index and tree APIs that cannot take const pointers because we sort our `git_vectors` lazily and so we can't reliably bsearch the index and tree content without applying a `git_vector_sort()` first. This also fixes some missed places where size_t can be used and where const can be applied to a couple internal functions.
Ben Straub f45d51ff 2012-11-20T19:57:46 API updates for index.h
Russell Belfer e120123e 2012-11-20T14:01:46 API review / update for tree.h
Russell Belfer cfeef7ce 2012-11-19T13:40:08 Minor optimization to tree entry validity check This checks for a leading '.' before looking for the invalid tree entry names. Even on pretty high levels of optimization, this seems to make a measurable improvement. I accidentally used && in the check initially instead of || and while debugging ended up improving the error reporting of issues with adding tree entries. I thought I'd leave those changes, too.
Scott J. Goldman 0d778b1a 2012-11-18T16:52:04 Catch invalid filenames in append_entry() This prevents the index api from calling write_tree() with a bogus tree.
Scott J. Goldman 19af78bb 2012-11-18T15:15:24 Prevent creating `..`, `.`, and `.git` with tree builder As per core git.
nulltoken f92bcaea 2012-11-08T17:39:23 index: prevent tree creation from a non merged state Fix libgit2/libgit2sharp#243
Vicent Marti 43eeca04 2012-11-01T20:24:43 index: Fix tests
Vicent Marti 276ea401 2012-11-01T20:15:53 index: Add git_index_write_tree
Edward Thomson f45ec1a0 2012-10-29T20:04:21 index refactoring
Russell Belfer 0d64bef9 2012-10-05T15:56:57 Add complex checkout test and then fix checkout This started as a complex new test for checkout going through the "typechanges" test repository, but that revealed numerous issues with checkout, including: * complete failure with submodules * failure to create blobs with exec bits * problems when replacing a tree with a blob because the tree "example/" sorts after the blob "example" so the delete was being processed after the single file blob was created This fixes most of those problems and includes a number of other minor changes that made it easier to do that, including improving the TYPECHANGE support in diff/status, etc.
nulltoken 9d7ac675 2012-08-21T11:45:16 tree entry: rename git_tree_entry_attributes() into git_tree_entry_filemode()
nulltoken a7dbac0b 2012-08-17T21:10:32 filemode: deploy enum usage
nulltoken 66439b0b 2012-08-17T11:21:49 treebuilder: enhance attributes handling on insertion
Carlos Martín Nieto a6bf1687 2012-08-13T14:07:47 tree: allow the user to skip an entry or cancel the walk Returning a negative cancels the walk, and returning a positive one causes us to skip an entry, which was previously done by a negative value. This allows us to stay consistent with the rest of the functions that take a callback and keeps the skipping functionality.
Carlos Martín Nieto 53ae1235 2012-08-13T14:00:53 tree: bring back the documented behaviour for a walk However, there should be a way to cancel the walk and another to skip the entry.
Vicent Marti 51e1d808 2012-08-06T12:41:08 Merge remote-tracking branch 'arrbee/tree-walk-fixes' into development Conflicts: src/notes.c src/transports/git.c src/transports/http.c src/transports/local.c tests-clar/odb/foreach.c
Russell Belfer b0d37669 2012-08-03T17:24:59 Add new iteration behavior to git_tree_walk Missed this one, ironically enough.
Russell Belfer 2031760c 2012-07-26T16:10:22 Fix git_tree_walk to return user error This makes sure that an error code returned by the callback function of `git_tree_walk` will stop the iteration and get propagated back to the caller verbatim. Also, this adds a minor helper function `git_tree_entry_byoid` that searches a `git_tree` for an entry with the given OID. This isn't a fast function, but it's easier than writing the loop yourself as an external user of the library.
nulltoken b8457baa 2012-07-24T07:57:58 portability: Improve x86/amd64 compatibility
Michael Schubert c6f42953 2012-07-19T17:33:48 tree: fix ordering for git_tree_walk Josh Triplett noticed libgit2 actually does preorder entries in tree_walk_post instead of postorder. Also, we continued walking even when an error occurred in the callback. Fix #773; also, allow both pre- and postorder walking.
nulltoken dc1f4b32 2012-07-12T10:52:19 tree: unfound tree entry returns GIT_ENOTFOUND
nulltoken 1c3edb30 2012-07-12T09:46:45 tree: prevent git_tree_entry_free() from segfaulting when being passed a NULL tree_entry
Vicent Marti 46ea40d9 2012-06-29T17:08:36 tree: Rename `entry_copy` to `entry_dup`
Vicent Marti 0e2fcca8 2012-06-29T02:21:12 tree: Bring back `entry_bypath` Smaller, simpler, faster.
Vicent Marti b93688d0 2012-06-19T02:33:03 Merge remote-tracking branch 'yorah/fix/notes-creation' into development Conflicts: src/notes.c
nulltoken b0b3b4e3 2012-05-29T16:19:15 treebuilder: prevent git_treebuilder_free() from segfaulting when being passed a NULL treebuilder
Vicent Martí 3f035860 2012-06-07T22:43:03 misc: Fix warnings from PVS Studio trial
Vicent Martí 904b67e6 2012-05-18T01:48:50 errors: Rename error codes
Vicent Martí e172cf08 2012-05-18T01:21:06 errors: Rename the generic return codes
Vicent Martí 9d0011fd 2012-05-16T19:23:47 tree: Naming conventions
Vicent Martí cedf9ca9 2012-05-16T19:16:35 tree: Kill the `git_tree_diff` functions These are deprecated and replaced with the diffing code in git2/diff.h
Russell Belfer 41a82592 2012-05-15T14:17:39 Ranged iterators and rewritten git_status_file The goal of this work is to rewrite git_status_file to use the same underlying code as git_status_foreach. This is done in 3 phases: 1. Extend iterators to allow ranged iteration with start and end prefixes for the range of file names to be covered. 2. Improve diff so that when there is a pathspec and there is a common non-wildcard prefix of the pathspec, it will use ranged iterators to minimize excess iteration. 3. Rewrite git_status_file to call git_status_foreach_ext with a pathspec that covers just the one file being checked. Since ranged iterators underlie the status & diff implementation, this is actually fairly efficient. The workdir iterator does end up loading the contents of all the directories down to the single file, which should ideally be avoided, but it is pretty good.
Vicent Martí 3fbcac89 2012-05-02T19:56:38 Remove old and unused error codes
Vicent Martí 40879fac 2012-05-02T15:59:02 Merge branch 'new-error-handling' into development Conflicts: .travis.yml include/git2/diff.h src/config_file.c src/diff.c src/diff_output.c src/mwindow.c src/path.c tests-clar/clar_helpers.c tests-clar/object/tree/frompath.c tests/t00-core.c tests/t03-objwrite.c tests/t08-tag.c tests/t10-refs.c tests/t12-repo.c tests/t18-status.c tests/test_helpers.c tests/test_main.c
Vicent Martí b8802146 2012-05-01T19:16:14 Merge remote-tracking branch 'carlosmn/remaining-errors' into new-error-handling Conflicts: src/refspec.c
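
Example

Commit 0174f21b above replaces `strtol` with a parser specialised for file modes. The following is only a minimal sketch of that idea, not the code from libgit2 or git: the name `parse_octal_mode`, its signature, and its error convention are all assumptions made for illustration. It reads octal digits up to the space that separates the mode from the filename in a tree entry and rejects anything else.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not libgit2's actual function): parse an octal
 * file mode such as "100644" that is terminated by a single space.
 * On success, stores the mode in *mode_out, points *end_out at the
 * terminating space, and returns 0. Returns -1 on malformed input.
 */
static int parse_octal_mode(uint32_t *mode_out, const char *buf,
                            size_t len, const char **end_out)
{
    uint32_t mode = 0;
    size_t i = 0;

    assert(mode_out != NULL && buf != NULL && end_out != NULL);

    /* An empty mode field is malformed. */
    if (len == 0 || buf[0] == ' ')
        return -1;

    while (i < len && buf[i] != ' ') {
        /* Only octal digits are valid in a file mode. */
        if (buf[i] < '0' || buf[i] > '7')
            return -1;

        /* Refuse input whose value would not fit in 32 bits. */
        if (mode > (UINT32_MAX >> 3))
            return -1;

        mode = (mode << 3) + (uint32_t)(buf[i] - '0');
        i++;
    }

    /* Reaching the end of the buffer means the space was missing. */
    if (i == len)
        return -1;

    *mode_out = mode;
    *end_out = buf + i;
    return 0;
}
```

Because the function only has to handle octal digits and one terminator, it avoids the sign, base, and whitespace handling that a generic number parser performs. A caller would pass the remaining bytes of the tree buffer and continue parsing the filename from `*end_out + 1`.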