src/revwalk.c


Log

Author Commit Date CI Message
Vicent Marti 21d73e71 2011-03-22T20:26:01 Always free the parents of a revwalk commit Thanks to Carlos Martín Nieto for spotting this.
Vicent Marti 72a3fe42 2011-03-18T19:38:49 I broke your bindings

Hey. Apologies in advance -- I broke your bindings.

This is a major commit that includes a long-overdue redesign of the whole object-database structure. This is expected to be the last major external API redesign of the library until the first non-alpha release.

Please get your bindings up to date with these changes. They will be included in the next minor release. Sorry again!

Major features include:

- Real caching and refcounting on parsed objects
- Real caching and refcounting on objects read from the ODB
- Streaming writes & reads from the ODB
- Single-method writes for all object types
- The external API is now partially thread-safe

The speed increases are significant in all aspects, especially when reading an object several times from the ODB (revwalking) and when writing big objects to the ODB.

Here's a full changelog for the external API:

blob.h
------
- Remove `git_blob_new`
- Remove `git_blob_set_rawcontent`
- Remove `git_blob_set_rawcontent_fromfile`
- Rename `git_blob_writefile` -> `git_blob_create_fromfile`
- Change `git_blob_create_fromfile`: The `path` argument is now relative to the repository's working dir
- Add `git_blob_create_frombuffer`

commit.h
--------
- Remove `git_commit_new`
- Remove `git_commit_add_parent`
- Remove `git_commit_set_message`
- Remove `git_commit_set_committer`
- Remove `git_commit_set_author`
- Remove `git_commit_set_tree`
- Add `git_commit_create`
- Add `git_commit_create_v`
- Add `git_commit_create_o`
- Add `git_commit_create_ov`

tag.h
-----
- Remove `git_tag_new`
- Remove `git_tag_set_target`
- Remove `git_tag_set_name`
- Remove `git_tag_set_tagger`
- Remove `git_tag_set_message`
- Add `git_tag_create`
- Add `git_tag_create_o`

tree.h
------
- Change `git_tree_entry_2object`: New signature is `(git_object **object_out, git_repository *repo, git_tree_entry *entry)`
- Remove `git_tree_new`
- Remove `git_tree_add_entry`
- Remove `git_tree_remove_entry_byindex`
- Remove `git_tree_remove_entry_byname`
- Remove `git_tree_clearentries`
- Remove `git_tree_entry_set_id`
- Remove `git_tree_entry_set_name`
- Remove `git_tree_entry_set_attributes`

object.h
--------
- Remove `git_object_new`
- Remove `git_object_write`
- Change `git_object_close`: This method is now *mandatory*. Not closing an object causes a memory leak.

odb.h
-----
- Remove type `git_rawobj`
- Remove `git_rawobj_close`
- Rename `git_rawobj_hash` -> `git_odb_hash`
- Change `git_odb_hash`: New signature is `(git_oid *id, const void *data, size_t len, git_otype type)`
- Add type `git_odb_object`
- Add `git_odb_object_close`
- Change `git_odb_read`: New signature is `(git_odb_object **out, git_odb *db, const git_oid *id)`
- Change `git_odb_read_header`: New signature is `(size_t *len_p, git_otype *type_p, git_odb *db, const git_oid *id)`
- Remove `git_odb_write`
- Add `git_odb_open_wstream`
- Add `git_odb_open_rstream`

odb_backend.h
-------------
- Change type `git_odb_backend`: New internal signatures are as follows

    int (* read)(void **, size_t *, git_otype *, struct git_odb_backend *, const git_oid *)
    int (* read_header)(size_t *, git_otype *, struct git_odb_backend *, const git_oid *)
    int (* writestream)(struct git_odb_stream **, struct git_odb_backend *, size_t, git_otype)
    int (* readstream)(struct git_odb_stream **, struct git_odb_backend *, const git_oid *)

- Add type `git_odb_stream`
- Add enum `git_odb_streammode`

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti b5c5f0f8 2011-03-16T23:59:09 Fix headers for the new Revision Walker The "oid.h" header is now included instead of "object.h". The old "revwalk.h" header has been removed; it was empty.
Vicent Marti 36aaf1ff 2011-03-16T01:53:25 Change the Revwalk reset behavior to the old version The `reset` call now removes the pushed commits so we can reuse the revwalker. The API documentation has been updated with the details.
Vicent Marti 36b31329 2011-03-16T01:04:17 Properly free a commit list in revwalk The commit list was not being properly free'd when a walk was stopped halfway through.
Vicent Marti 71db842f 2011-03-08T14:57:03 Rewrite the Revision Walker The new revision walker uses an internal Commit object storage system, a custom memory allocator and much improved topological and time sorting algorithms. It's about 20 times faster than the previous implementation when browsing big repositories. The following external API calls have changed: `git_revwalk_next` returns an OID instead of a full commit object. The initial call to `git_revwalk_next` is no longer blocking when iterating through a repo with a time-sorting mode. Iterating with Topological or inverted modes still makes the initial call blocking to preprocess the commit list, but this block should be mostly unnoticeable on most repositories (topological preprocessing takes 0.3s on the git.git repo). `git_revwalk_push` and `git_revwalk_hide` now take an OID instead of a full commit object.
Vicent Marti f335b42c 2011-03-05T01:17:59 Fix segmentation fault when freeing a repository Disable garbage collection of cross-references to prevent double-freeing. Internal reference management is now done with a separate method. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 48c27f86 2011-02-28T16:51:17 Implement reference counting for git_objects All `git_object` instances looked up from the repository are reference counted. User is expected to use the new `git_object_close` when an object is no longer needed to force freeing it. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti fc658755 2011-02-22T21:59:36 Rewrite git_hashtable internals The old hash table with chained buckets has been replaced by a new one using Cuckoo hashing, which offers guaranteed constant lookup times. This should improve speeds on most use cases, since hash tables in libgit2 are usually used as caches where the objects are stored once and queried several times. The Cuckoo hash implementation is based off the one in the Basekit library [1] for the IO language, but rewritten to support an arbitrary number of hashes. We currently use 3 to maximize the usage of the nodes pool. [1]: https://github.com/stevedekorte/basekit/blob/master/source/CHash.c Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti cb77ad0d 2011-02-18T12:23:53 Fix segfault when iterating a revlist backwards The `prev` and `next` pointers were not being updated after popping one of the list elements. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti c836c332 2011-02-05T09:29:37 Make more methods return error codes git_revwalk_next now returns an error code when the iteration is over. git_repository_index now returns an error code when the index file could not be opened. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti b5ced41e 2010-12-18T02:35:45 Merge branch 'timezone'
Vicent Marti 638c2ca4 2010-12-18T02:10:25 Rename 'git_person' to 'git_signature' The new signature struct is public, and contains information about the timezone offset. Must be free'd manually by the user. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 1f080e2d 2010-12-13T03:43:56 Fix initialization & freeing of nonexistent repos Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti eec95235 2010-12-02T04:58:22 Commit parents now use the common 'vector' code No more linked lists, no more O(n) access. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 1795f879 2010-11-05T03:20:17 Improve error handling All initialization functions now return error codes instead of pointers. Error codes are now properly propagated on most functions. Several new and more specific error codes have been added in common.h Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti a13bc8e7 2010-10-29T02:22:38 Add getter methods for object owners You can now access the owning repository of any existing object, or the repository on which a revision walker is working. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 0c3596f1 2010-09-20T01:57:53 Add setter methods & write support for git_commit All the required git_commit_set_XXX methods have been implemented; all the attributes of a commit object can now be modified in-memory. The new method git_object_write() automatically writes back the in-memory changes of any object to the repository. So far it only supports git_commit objects. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 3315782c 2010-08-08T14:12:17 Redesigned the walking/object lookup interface

The old 'git_revpool' object has been removed and split into two distinct objects with separate functionality, in order to have separate methods for object management and object walking.

* A new object 'git_repository' does the high-level management of a repository's objects (commits, trees, tags, etc) on top of a 'git_odb'. Eventually, it will also manage other repository attributes (e.g. tag resolution, references, etc).
  See: src/git/repository.h

* A new external method 'git_repository_lookup(repo, oid, type)' has been added to the 'git_repository' API. All object lookups (git_XXX_lookup()) are now wrappers to this method, and duplicated code has been removed. The method does automatic type checking and returns a generic 'git_revpool_object' that can be cast to any specific object.
  See: src/git/repository.h

* The external methods for object parsing of repository objects (git_XXX_parse()) have been removed. Loading objects from the repository is now managed through the 'lookup' functions. These objects are loaded with minimal information, and the relevant parsing is done automatically when the user requests any of the parsed attributes through accessor methods. An attribute has been added to 'git_repository' in order to force the parsing of all the repository objects immediately after lookup.
  See: src/git/commit.h
  See: src/git/tag.h
  See: src/git/tree.h

* The previous walking functionality of the revpool is now found in 'git_revwalk', which does the actual revision walking on a repository; the attributes when walking through commits in a database have been decoupled from the actual commit objects. This increases performance when accessing commits during the walk and allows several 'git_revwalk' instances to work at the same time on top of the same repository, without having to load commits in memory several times.
  See: src/git/revwalk.h

* The old 'git_revpool_table' has been renamed to 'git_hashtable' and now works as a generic hashtable with support for any kind of object and custom hash functions.
  See: src/hashtable.h

* All the relevant unit tests have been updated, renamed and grouped accordingly.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 52f2390b 2010-07-07T14:56:05 Add external API to access detailed commit attributes

The following new external methods have been added:

    GIT_EXTERN(const char *) git_commit_message_short(git_commit *commit);
    GIT_EXTERN(const char *) git_commit_message(git_commit *commit);
    GIT_EXTERN(time_t) git_commit_time(git_commit *commit);
    GIT_EXTERN(const git_commit_person *) git_commit_committer(git_commit *commit);
    GIT_EXTERN(const git_commit_person *) git_commit_author(git_commit *commit);
    GIT_EXTERN(const git_tree *) git_commit_tree(git_commit *commit);

A new structure, git_commit_person has been added to represent a commit's author or committer.

The parsing of a commit has been split in two phases.

When adding a commit to the revision pool:
- the commit's ODB object is opened
- its raw contents are parsed for commit TIME, PARENTS and TREE (the minimal amount of data required to traverse the pool)
- the commit's ODB object is closed

When querying for extended information on a commit:
- the commit's ODB object is reopened
- its raw contents are parsed for the requested information
- the commit's ODB object remains open to handle additional queries

New unit tests have been added for the new functionality:

In t0401-parse: parse_person_test
In t0402-details: query_details_test

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 225fe215 2010-06-18T13:06:34 Add support for tree objects in revision pools Commits now store pointers to their tree objects. Tree objects now work as separate git_revpool_object entities. Tree objects can be loaded and parsed independently from commits. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 40721f6b 2010-07-10T11:50:16 Changed revpool's object table to support arbitrary objects git_revpool_object now has a type identifier for each object type in a revpool (commits, trees, blobs, etc). Trees can now be stored in the revision pool. git_revpool_tableit now supports filtering objects by their type when iterating through the object table. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 088a731f 2010-06-09T14:54:22 Fixed memory leaks in test suite Created commit objects in t0401-parse weren't being freed properly. Updated the API documentation to note that commit objects are owned by the revision pool and should not be freed manually. The parents list of each commit was being freed twice after each test. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 58b0cbea 2010-07-10T12:14:30 Actually free all commits when freeing a commit pool Previously the objects table was being freed, but not the actual commits. All git_commit objects are freed and hence invalidated when freeing the git_rp object they belong to. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Ramsay Jones f2924934 2010-06-01T19:41:55 Style: Do not use (C99) // comments Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Andreas Ericsson <ae@op5.se>
Vicent Marti de141d4b 2010-05-28T02:02:02 Improved error handling on auxiliary functions. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Andreas Ericsson <ae@op5.se>
Vicent Marti 6bb7aa13 2010-05-28T01:48:59 Added new error codes. Improved error handling. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Andreas Ericsson <ae@op5.se>
Vicent Marti 9b3577ed 2010-05-28T00:23:43 Fixed brace placement and converted spaces to tabs. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Andreas Ericsson <ae@op5.se>
Vicent Marti 69dca959 2010-05-25T22:30:09 Fixed parsing commit times (they weren't being stored at all!) Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Andreas Ericsson <ae@op5.se>
Vicent Marti e5d1faef 2010-05-25T19:48:13 Add external API for revision sorting. The GIT_RPSORT_XXX flags have been moved to the external API, and a new method 'gitrp_sorting(...)' has been added to safely change the sorting method of a revision pool. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Andreas Ericsson <ae@op5.se>
Vicent Marti 9bdb7594 2010-05-23T17:12:28 Properly reset all commit properties when doing a gitrp_reset(). Add git_revpool_table_free() method. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Andreas Ericsson <ae@op5.se>
Vicent Marti 655d381a 2010-05-23T16:51:31 Add topological sorting and new insertion methods for commit lists. 'git_commit_list_toposort()' and 'git_commit_list_timesort()' now sort a commit list by topological and time order respectively. Both sorts are stable and in place. 'git_commit_list_append' has been replaced by 'git_commit_list_push_back' and 'git_commit_list_push_front'. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Andreas Ericsson <ae@op5.se>
Vicent Marti a7c182c5 2010-05-23T04:41:31 Add object cache to the revision pool. Fixed issue when generating pending commits list during iteration. The 'git_commit_lookup' function will now check the pool's cache for commits which have been previously loaded/parsed; there can only be a single 'git_commit' structure for each commit on the same pool. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Andreas Ericsson <ae@op5.se>
Vicent Marti 5e15176d 2010-05-23T02:39:57 Add commit caching on the commit table. Properly initialize the pending commits list. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Andreas Ericsson <ae@op5.se>
Vicent Marti c5696427 2010-05-22T23:21:10 Add 'git_revpool_object' and 'git_revpool_table' structures. All the objects which will eventually be traversable from a revision pool (commits, trees, etc) now inherit from the 'git_revpool_object' structure which identifies them with their own OID. Furthermore, the 'git_revpool_table' and related functions have been added, which allow for constant time lookup (hash table) of the loaded revpool objects based on their OID. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Andreas Ericsson <ae@op5.se>
Vicent Marti 36b7cdb6 2010-05-22T18:15:42 Changed 'git_commit_list' from a linked list to a doubly-linked list. Changed 'git_commit' to use bit fields instead of flags. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Andreas Ericsson <ae@op5.se>
Vicent Marti 1a895dd7 2010-05-22T14:32:59 Add arbitrary ordering revision walking. The 'gitrp_next()' method now correctly walks all the pushed revisions in arbitrary order. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Andreas Ericsson <ae@op5.se>
Vicent Marti 8add0153 2010-05-21T02:35:40 Split git_commit_lookup into separate functions. git_commit_lookup() now creates commit references without loading them from the ODB. git_commit_parse() creates a commit reference, loads it and parses it from the ODB. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Andreas Ericsson <ae@op5.se>
Vicent Marti 08d5d000 2010-05-18T20:55:19 Add commit parents to parsed commits and commit lists to the revpool. Basic support for iterating the revpool. The following functions of the revwalk API have been partially implemented: void gitrp_reset(git_revpool *pool); void gitrp_push(git_revpool *pool, git_commit *commit); void gitrp_prepare_walk(git_revpool *pool); git_commit *gitrp_next(git_revpool *pool); Parsed commits' parents are now also parsed and stored in a "git_commit_list" structure (linked list). Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Andreas Ericsson <ae@op5.se>
Shawn O. Pearce 64a47c01 2008-12-30T23:21:36 Wrap malloc and friends and report out of memory as GIT_ENOMEM We now forbid direct use of malloc, strdup or calloc within the library and instead use wrapper functions git__malloc, etc. to invoke the underlying library malloc and set git_errno to a no memory error code if the allocation fails. In the future once we have pack objects in memory we are likely to enhance these routines with garbage collection logic to purge cached pack data when allocations fail. Because the size of the function will grow somewhat large, we don't want to mark them for inline as gcc tends to aggressively inline, creating larger than expected executables. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Andreas Ericsson c215be41 2008-11-22T14:57:40 Rename git_revpool_* functions gitrp_* Otherwise their prototypes don't match their declarations. Detected by 'sparse', which is obviously good to run before each commit. Signed-off-by: Andreas Ericsson <ae@op5.se> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Andreas Ericsson 1b9e92c7 2008-11-18T01:02:27 s/git_revp/git_revpool/ git_revp is something I personally can't stop pronouncing "rev pointer". I'm sure others would suffer the same problem. Also, rename the git_revp_ sub-api "gitrp_". This is the first of many such renames, primarily done to prevent extreme inflation in the "git_" namespace, which we'd like to reserve for a higher-level API. While we're at it, we remove the noise-char "c" from a lot of functions. Since revision walking is all about commits, the common case should be that we're dealing with commits. Exceptions can get a more mnemonic description as needed. Signed-off-by: Andreas Ericsson <ae@op5.se> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce 50298f44 2008-11-01T15:55:01 Switch the license from BSD to GPL+libgcc exception Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce d1ea30c3 2008-11-01T15:42:23 Move include files to include/git/, drop git_ prefix from file names Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
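
Commit 64a47c01 above describes wrapping malloc and friends so that a failed allocation is recorded as an out-of-memory error code. The following is only a minimal sketch of that idea, not the library's actual code: the names `sketch_malloc`, `sketch_calloc`, `sketch_strdup`, `sketch_errno`, `SKETCH_SUCCESS` and `SKETCH_ENOMEM` are hypothetical stand-ins for `git__malloc`, `git_errno`, `GIT_ENOMEM` and so on, whose real definitions are not shown in this log.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical error codes; the real GIT_ENOMEM value is not given in the log. */
#define SKETCH_SUCCESS 0
#define SKETCH_ENOMEM  (-1)

/* Hypothetical stand-in for the library-wide error variable. */
static int sketch_errno = SKETCH_SUCCESS;

/*
 * Allocate `len` bytes with the underlying malloc. On failure, record
 * the out-of-memory code. A NULL result for a zero-byte request is
 * permitted by the C standard and is not treated as a failure here.
 */
void *sketch_malloc(size_t len)
{
	void *ptr = malloc(len);
	if (ptr == NULL && len != 0)
		sketch_errno = SKETCH_ENOMEM;
	return ptr;
}

/* Same policy for zero-initialized allocations. */
void *sketch_calloc(size_t nelem, size_t elsize)
{
	void *ptr = calloc(nelem, elsize);
	if (ptr == NULL && nelem != 0 && elsize != 0)
		sketch_errno = SKETCH_ENOMEM;
	return ptr;
}

/*
 * Duplicate a string through the wrapper, so a failed copy is reported
 * the same way as any other failed allocation.
 */
char *sketch_strdup(const char *str)
{
	size_t len;
	char *copy;

	assert(str != NULL);
	len = strlen(str) + 1; /* include the terminating NUL */
	copy = sketch_malloc(len);
	if (copy != NULL)
		memcpy(copy, str, len);
	return copy;
}
```

A caller would check the returned pointer as usual and, on NULL, read the error variable to tell an out-of-memory condition apart from other failures. The functions are deliberately not marked inline, in line with the commit's remark that they are expected to grow.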