src/repository.c


Log

Author Commit Date CI Message
Vicent Marti 72a3fe42 2011-03-18T19:38:49 I broke your bindings Hey. Apologies in advance -- I broke your bindings. This is a major commit that includes a long-overdue redesign of the whole object-database structure. This is expected to be the last major external API redesign of the library until the first non-alpha release. Please get your bindings up to date with these changes. They will be included in the next minor release. Sorry again! Major features include: - Real caching and refcounting on parsed objects - Real caching and refcounting on objects read from the ODB - Streaming writes & reads from the ODB - Single-method writes for all object types - The external API is now partially thread-safe The speed increases are significant in all aspects, especially when reading an object several times from the ODB (revwalking) and when writing big objects to the ODB. Here's a full changelog for the external API: blob.h ------ - Remove `git_blob_new` - Remove `git_blob_set_rawcontent` - Remove `git_blob_set_rawcontent_fromfile` - Rename `git_blob_writefile` -> `git_blob_create_fromfile` - Change `git_blob_create_fromfile`: The `path` argument is now relative to the repository's working dir - Add `git_blob_create_frombuffer` commit.h -------- - Remove `git_commit_new` - Remove `git_commit_add_parent` - Remove `git_commit_set_message` - Remove `git_commit_set_committer` - Remove `git_commit_set_author` - Remove `git_commit_set_tree` - Add `git_commit_create` - Add `git_commit_create_v` - Add `git_commit_create_o` - Add `git_commit_create_ov` tag.h ----- - Remove `git_tag_new` - Remove `git_tag_set_target` - Remove `git_tag_set_name` - Remove `git_tag_set_tagger` - Remove `git_tag_set_message` - Add `git_tag_create` - Add `git_tag_create_o` tree.h ------ - Change `git_tree_entry_2object`: New signature is `(git_object **object_out, git_repository *repo, git_tree_entry *entry)` - Remove `git_tree_new` - Remove `git_tree_add_entry` - Remove `git_tree_remove_entry_byindex` - Remove `git_tree_remove_entry_byname` - Remove `git_tree_clearentries` - Remove `git_tree_entry_set_id` - Remove `git_tree_entry_set_name` - Remove `git_tree_entry_set_attributes` object.h ------------ - Remove `git_object_new` - Remove `git_object_write` - Change `git_object_close`: This method is now *mandatory*. Not closing an object causes a memory leak. odb.h ----- - Remove type `git_rawobj` - Remove `git_rawobj_close` - Rename `git_rawobj_hash` -> `git_odb_hash` - Change `git_odb_hash`: New signature is `(git_oid *id, const void *data, size_t len, git_otype type)` - Add type `git_odb_object` - Add `git_odb_object_close` - Change `git_odb_read`: New signature is `(git_odb_object **out, git_odb *db, const git_oid *id)` - Change `git_odb_read_header`: New signature is `(size_t *len_p, git_otype *type_p, git_odb *db, const git_oid *id)` - Remove `git_odb_write` - Add `git_odb_open_wstream` - Add `git_odb_open_rstream` odb_backend.h ------------- - Change type `git_odb_backend`: New internal signatures are as follows int (* read)(void **, size_t *, git_otype *, struct git_odb_backend *, const git_oid *) int (* read_header)(size_t *, git_otype *, struct git_odb_backend *, const git_oid *) int (* writestream)(struct git_odb_stream **, struct git_odb_backend *, size_t, git_otype) int (* readstream)(struct git_odb_stream **, struct git_odb_backend *, const git_oid *) - Add type `git_odb_stream` - Add enum `git_odb_streammode` Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 6b2a1941 2011-03-12T23:09:16 Fix the retarded object interdependency system It's no longer retarded. All object interdependencies are stored as OIDs instead of actual objects. This should be hundreds of times faster, especially on big repositories. Heck, who knows, maybe it doesn't even segfault -- wouldn't that be awesome? What has changed on the API? `git_commit_parent`, `git_commit_tree`, `git_tag_target` now return their values through a pointer-to-pointer, and have an error code. `git_commit_set_tree` and `git_tag_set_target` now return an error code and may fail. `git_repository_free__no_gc` has been deprecated because it's stupid. Since there are no longer any interdependencies between objects, we don't need internal reference counting, and GC never fails or double-frees pointers. `git_object_close` now does a very sane thing: marks an object as unused. Closed objects will eventually be freed from the object cache based on LRU. Please use `git_object_close` from the garbage collector `destroy` method on your bindings. It's 100% safe. `git_repository_gc` is a new method that forces a garbage collector pass through the repo, to free as many LRU objects as possible. This is useful if we are running out of memory.
Vicent Marti 71db842f 2011-03-08T14:57:03 Rewrite the Revision Walker The new revision walker uses an internal Commit object storage system, custom memory allocator and much improved topological and time sorting algorithms. It's about 20x faster than the previous implementation when browsing big repositories. The following external API calls have changed: `git_revwalk_next` returns an OID instead of a full commit object. The initial call to `git_revwalk_next` is no longer blocking when iterating through a repo with a time-sorting mode. Iterating with Topological or inverted modes still makes the initial call blocking to preprocess the commit list, but this block should be mostly unnoticeable on most repositories (topological preprocessing takes 0.3s on the git.git repo). `git_revwalk_push` and `git_revwalk_hide` now take an OID instead of a full commit object.
Vicent Marti e0011be3 2011-03-05T13:22:16 Fix the opening of empty repositories We were checking for the index file, which is not assured to exist on clean git repositories. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti f335b42c 2011-03-05T01:17:59 Fix segmentation fault when freeing a repository Disable garbage collection of cross-references to prevent double-freeing. Internal reference management is now done with a separate method. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 584f49a5 2011-03-01T01:37:28 Fix several issues with refcounting - Added several missing reference increases - Add new destructor to the repository that does not GC the objects Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 48c27f86 2011-02-28T16:51:17 Implement reference counting for git_objects All `git_object` instances looked up from the repository are reference counted. User is expected to use the new `git_object_close` when an object is no longer needed to force freeing it. Signed-off-by: Vicent Marti <tanoku@gmail.com>
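The reference-counting contract described in 48c27f86 can be illustrated with a minimal, self-contained sketch. Everything here is hypothetical (`demo_object`, `demo_object_lookup`, `demo_object_retain`, `demo_object_close`); it is not libgit2's actual structure or API, only the general pattern of "lookup hands out a counted reference, close releases it".

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted object (not libgit2's struct). */
typedef struct {
    int refcount;
    int *freed_flag; /* set to 1 when the object is freed, so callers can observe it */
} demo_object;

/* "Lookup" returns a new object holding one reference owned by the caller. */
demo_object *demo_object_lookup(int *freed_flag)
{
    demo_object *obj = malloc(sizeof(*obj));
    if (obj == NULL)
        return NULL;
    obj->refcount = 1;
    obj->freed_flag = freed_flag;
    *freed_flag = 0;
    return obj;
}

/* Take an additional reference to an object that is still alive. */
void demo_object_retain(demo_object *obj)
{
    assert(obj->refcount > 0);
    obj->refcount++;
}

/* Drop one reference; the object is freed when the last one goes away. */
void demo_object_close(demo_object *obj)
{
    if (obj == NULL)
        return;
    assert(obj->refcount > 0);
    if (--obj->refcount == 0) {
        *obj->freed_flag = 1;
        free(obj);
    }
}
```

In this sketch every lookup or retain must be paired with exactly one close, which is the obligation the commit message places on users of looked-up objects.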
Vicent Marti 5de079b8 2011-02-28T12:12:26 Change the object creation/lookup API The methods previously known as git_repository_lookup git_repository_newobject git_repository_lookup_ref are now part of their respective namespaces: git_object_lookup git_object_new git_reference_lookup This makes the API more consistent with the new references API. Signed-off-by: Vicent Marti <tanoku@gmail.com>
nulltoken d2d6912e 2011-02-26T13:56:16 Refactored the opening and the initialization of a repository.
Vicent Marti fc658755 2011-02-22T21:59:36 Rewrite git_hashtable internals The old hash table with chained buckets has been replaced by a new one using Cuckoo hashing, which offers guaranteed constant lookup times. This should improve speeds on most use cases, since hash tables in libgit2 are usually used as caches where the objects are stored once and queried several times. The Cuckoo hash implementation is based off the one in the Basekit library [1] for the IO language, but rewritten to support an arbitrary number of hashes. We currently use 3 to maximize the usage of the nodes pool. [1]: https://github.com/stevedekorte/basekit/blob/master/source/CHash.c Signed-off-by: Vicent Marti <tanoku@gmail.com>
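As a rough illustration of the Cuckoo hashing idea mentioned in fc658755, here is a minimal fixed-size sketch with three hash functions. All names, the table size, the multiplier constants and the kick limit are assumptions made for this example; it is not the `git_hashtable` or Basekit code, and it reports failure instead of growing the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CUCKOO_SLOTS     64 /* illustrative fixed capacity */
#define CUCKOO_HASHES    3  /* candidate slots per key */
#define CUCKOO_MAX_KICKS 32 /* evictions allowed before giving up */

typedef struct {
    uint32_t keys[CUCKOO_SLOTS]; /* 0 marks an empty slot, so keys must be nonzero */
    int      values[CUCKOO_SLOTS];
} cuckoo_table;

void cuckoo_init(cuckoo_table *t) { memset(t, 0, sizeof(*t)); }

/* i-th candidate slot: multiply by a different odd constant per hash. */
size_t cuckoo_slot(uint32_t key, int i)
{
    static const uint32_t mult[CUCKOO_HASHES] = { 2654435761u, 2246822519u, 3266489917u };
    return (size_t)(((key * mult[i]) >> 16) % CUCKOO_SLOTS);
}

/* Lookup inspects at most CUCKOO_HASHES slots, hence the constant bound. */
int cuckoo_get(const cuckoo_table *t, uint32_t key, int *value_out)
{
    for (int i = 0; i < CUCKOO_HASHES; i++) {
        size_t s = cuckoo_slot(key, i);
        if (key != 0 && t->keys[s] == key) {
            *value_out = t->values[s];
            return 0;
        }
    }
    return -1;
}

/* Exchange the pair "in hand" with the pair stored in slot s. */
void cuckoo_swap(cuckoo_table *t, size_t s, uint32_t *key, int *value)
{
    uint32_t k = t->keys[s];
    int v = t->values[s];
    t->keys[s] = *key;
    t->values[s] = *value;
    *key = k;
    *value = v;
}

int cuckoo_put(cuckoo_table *t, uint32_t key, int value)
{
    size_t path[CUCKOO_MAX_KICKS];
    int kicks = 0;

    assert(key != 0);

    /* Update in place if the key is already stored. */
    for (int i = 0; i < CUCKOO_HASHES; i++) {
        size_t s = cuckoo_slot(key, i);
        if (t->keys[s] == key) {
            t->values[s] = value;
            return 0;
        }
    }

    for (;;) {
        /* Use any empty candidate slot of the key currently in hand. */
        for (int i = 0; i < CUCKOO_HASHES; i++) {
            size_t s = cuckoo_slot(key, i);
            if (t->keys[s] == 0) {
                t->keys[s] = key;
                t->values[s] = value;
                return 0;
            }
        }
        if (kicks == CUCKOO_MAX_KICKS)
            break;
        /* All candidates are full: evict one occupant and re-place it next. */
        path[kicks] = cuckoo_slot(key, kicks % CUCKOO_HASHES);
        cuckoo_swap(t, path[kicks], &key, &value);
        kicks++;
    }

    /* Give up: undo the evictions in reverse so no stored key is lost.
     * A real table would grow and rehash at this point. */
    while (kicks > 0)
        cuckoo_swap(t, path[--kicks], &key, &value);
    return -1;
}
```

Because a key can only ever live in one of its three candidate slots, a lookup never scans a chain, which matches the "stored once, queried several times" cache usage the commit describes.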
Vicent Marti 874c3b6f 2011-02-18T14:11:53 Fix repository initialization Fixed several issues with path joining and bare repos. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 8212e2d7 2011-02-07T18:25:23 Fix detection of working dir on repositories Signed-off-by: Vicent Marti <tanoku@gmail.com>
Shuhei Tanuma 56ab8c54 2011-02-06T15:48:52 fix can't detect repository index issues.
Vicent Marti f725931b 2011-02-05T12:42:41 Fix directory/path manipulation methods The `dirname` and `dirbase` methods have been replaced with the Android implementation, which is actually compliant with some kind of standard. A new method `topdir` has been added, which returns the topmost directory in a path. These changes fix issue #49: `gitfo_prettify_dir_path` converts "./.git/" to ".git/", so the code at src/repository.c:190 goes out of bounds when trying to find the topmost directory. The new `git__topdir` method handles this gracefully, and the fixed `git__dirname` now returns the proper value for the repository's working dir. E.g. /repo/.git/ ==> working dir '/repo/' .git/ ==> working dir '.' Signed-off-by: Vicent Marti <tanoku@gmail.com>
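The two mappings quoted in f725931b ('/repo/.git/' to '/repo/' and '.git/' to '.') can be reproduced with a small bounds-checked helper. `demo_workdir` is a hypothetical function written for this example; it is not `git__dirname` or `git__topdir`, and it only handles '/' as a separator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: derive the working directory from the path of a
 * ".git" directory by dropping the last path component.
 * Returns 0 on success, -1 if `out` is too small.
 */
int demo_workdir(const char *gitdir, char *out, size_t outlen)
{
    size_t len;

    assert(gitdir != NULL && out != NULL);
    len = strlen(gitdir);

    /* Ignore trailing slashes, but keep a lone "/" intact. */
    while (len > 1 && gitdir[len - 1] == '/')
        len--;

    /* Drop the last component, stopping right after the previous slash. */
    while (len > 0 && gitdir[len - 1] != '/')
        len--;

    if (len == 0) {
        /* No parent component left: the working dir is the current dir. */
        if (outlen < 2)
            return -1;
        strcpy(out, ".");
        return 0;
    }

    if (len + 1 > outlen)
        return -1;
    memcpy(out, gitdir, len);
    out[len] = '\0';
    return 0;
}
```

The `len > 0` guards are the point of the sketch: a relative input such as ".git/" has no leading component to fall back on, which is the situation the commit describes as going out of bounds.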
Vicent Marti c836c332 2011-02-05T09:29:37 Make more methods return error codes git_revwalk_next now returns an error code when the iteration is over. git_repository_index now returns an error code when the index file could not be opened. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 2f8a8ab2 2011-01-29T01:56:25 Refactor reference parsing code Several changes have been committed to allow the user to create in-memory references and write back to disk. Peeling of symbolic references has been made explicit. Added getter and setter methods for all attributes on a reference. Added corresponding documentation. Signed-off-by: Vicent Marti <tanoku@gmail.com>
nulltoken 9282e921 2010-12-27T20:34:19 Merge nulltoken's reference parsing code All the commits have been squashed into a single one before refactoring the final code, to keep everything tidy. Individual commit messages are as follows: Added repository reference looking up functionality placeholder. Added basic reference database definition and caching infrastructure. Removed useless constant. Added GIT_EINVALIDREFNAME error and description. Added missing description for GIT_EBAREINDEX. Added GIT_EREFCORRUPTED error and description. Added GIT_ETOONESTEDSYMREF error and description. Added resolving of direct and symbolic references. Prepared the packed-refs parsing. Added parsing of the packed-refs file content. When no loose reference has been found, the full content of the packed-refs file is parsed. All of the new (i.e. not previously parsed as a loose reference) references are eagerly stored in the cached references storage. The method packed_reference_file__parse() is in dire need of some refactoring. :-) Extracted to a method the parsing of the peeled target of a tag. Extracted to a method the parsing of a standard packed ref. Fixed leaky removal of the cached references. Ensured that a previously parsed packed reference isn't returned if a more up-to-date loose reference exists. Enhanced documentation of git_repository_reference_lookup(). Moved some refs related constants from repository.c to refs.h. Made parsing of a packed tag reference more robust. Updated git_repository_reference_lookup() documentation. Added some references to the test repository. Added some tests covering tag references looking up. Added some tests covering symbolic and head references looking up. Added some tests covering packed references looking up.
nulltoken 2e6fd09c 2011-01-25T21:52:24 Fixed naming convention related issue.
nulltoken eb2f3b47 2011-01-23T19:01:57 Made git_repository_open2() and git_repository_open3() benefit from recently added path prettifying function.
nulltoken 9dd34b1e 2011-01-21T14:02:22 Made git_repository_open() and git_repository_init() benefit from recently added path prettifying function.
Vicent Marti c8f5ff8f 2011-01-20T14:43:27 Fix initialization of in-memory trees In-memory tree objects were not being properly initialized, because the internal entries vector was created on the 'parse' method. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti ec3c7a16 2011-01-13T04:54:14 Add new Repository initialization method Lets the user specify the ODB that will be used by the repository manually. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Alex Budovski e0c23b88 2011-01-11T17:50:37 Remove unused variable.
Vicent Marti e52ed7a5 2011-01-03T22:34:27 Split object methods from repository.c All the relevant git_object methods have been moved to object.c Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti fb3cd6bc 2011-01-03T21:46:18 Make internal methods static Keep all the repository init code as static. Signed-off-by: Vicent Marti <tanoku@gmail.com>
nulltoken f2d6a23a 2010-12-21T05:21:33 Small code maintainability improvement.
nulltoken 8ea2c83b 2010-12-20T16:46:13 Added creation of 'objects/info' and 'objects/pack' directories.
Vicent Marti 40c44d2f 2010-12-19T22:50:20 Fix issues in repository initialization Implemented recursive directory creation Fix style issues Signed-off-by: Vicent Marti <tanoku@gmail.com>
nulltoken 1c2c7c0d 2010-12-19T15:08:53 Added creation of refs/heads/ and refs/tags/ directories.
nulltoken 28990938 2010-12-17T20:03:20 Prettified HEAD symlink generation.
nulltoken e1f8cad0 2010-12-17T14:45:02 Added basic HEAD file creation.
nulltoken a67a096a 2010-12-17T10:41:56 Added creation of 'objects' and 'refs' directories.
nulltoken 58fcfc26 2010-12-17T10:36:58 Removed unnecessary git_repository_init_results handling.
nulltoken 08190e2a 2010-12-16T14:31:24 Simplified git_repository_init_results struct.
nulltoken 4b8e27c8 2010-12-15T18:25:15 Very first git_repository_init() draft.
Vicent Marti 1f080e2d 2010-12-13T03:43:56 Fix initialization & freeing of inexistent repos Signed-off-by: Vicent Marti <tanoku@gmail.com>
nulltoken 6c14d641 2010-12-09T20:55:54 Fixed a memory leak in git_repository_lookup() when provided git_otype is invalid.
Vicent Marti 44908fe7 2010-12-06T23:03:16 Change the library include file Libgit2 is now officially included as #include <git2.h>, or individual files may be included as #include <git2/index.h> Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti d12299fe 2010-12-03T22:22:10 Change include structure for the project The maze with include dependencies has been fixed. There is now a global include: #include <git.h> The git_odb_backend API has been exposed. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 7d7cd885 2010-12-03T18:01:30 Decouple storage from ODB logic Comes with two default backends: loose object and packfiles. Signed-off-by: Vicent Marti <tanoku@gmail.com>
nulltoken 6f02c3ba 2010-12-05T20:18:56 Small source code readability improvements. Replaced magic number "0" with GIT_SUCCESS constant wherever it made sense.
Vicent Marti 691aa968 2010-12-02T18:35:38 Add 'git_repository_open2' to customize repo folders Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 6b1eab39 2010-11-23T14:36:31 Fix MSVC warnings and errors Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 50dd6ca5 2010-11-17T04:58:32 Fix repository initialization We cannot assume that non-bare repositories have an index file, because 'git init' doesn't create it by default. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti c3a20d5c 2010-11-14T22:11:46 Add support for 'index add' Actually add files to the index by creating their corresponding blob and storing it on the repository, then getting the hash and updating the index file. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 237da401 2010-11-14T22:06:10 Add support for blob files Blob files can now be loaded from the repository like all the other base Git types. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 1795f879 2010-11-05T03:20:17 Improve error handling All initialization functions now return error codes instead of pointers. Error codes are now properly propagated on most functions. Several new and more specific error codes have been added in common.h Signed-off-by: Vicent Marti <tanoku@gmail.com>
Dave Borowitz 1544bc31 2010-11-02T16:02:37 Only require an index for non-bare repos.
Vicent Marti 6fd195d7 2010-11-02T18:42:42 Change git_repository initialization to use a path The constructor to git_repository is now called 'git_repository_open(path)' and takes a path to a git repository instead of an existing ODB object. Unit tests have been updated accordingly and the two test repositories have been merged into one. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti a13bc8e7 2010-10-29T02:22:38 Add getter methods for object owners You can now access the owning repository of any existing object, or the repository on which a revision walker is working. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 58519018 2010-10-28T02:07:18 Fix internal memory management on the library String memory is now managed in a much more sane manner. Fixes include: - git_person email and name is no longer limited to 64 characters - git_tree_entry filename is no longer limited to 255 characters - raw objects are properly opened & closed the minimum amount of times required for parsing - unit tests no longer leak - removed 5 other misc memory leaks as reported by Valgrind - tree writeback no longer segfaults on rare occasions The git_person struct is no longer public. It is now managed by the library, and getter methods are in place to access its internal attributes. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti ec25391d 2010-10-07T00:20:08 Add write-back support for Tag files Tag files can now be created and modified in-memory (all the setter methods have been implemented), and written back to disk using the generic git_object_write() method. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 2a884588 2010-09-21T17:17:10 Add write-back support for git_tree All the setter methods for git_tree have been added, including the setters for attributes on each git_tree_entry and methods to add/remove entries of the tree. Modified trees and trees created in-memory from scratch can be written back to the repository using git_object_write(). Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti d45b4a9a 2010-09-20T21:39:11 Add support for in-memory objects All repository objects can now be created from scratch in memory using either the git_object_new() method, or the corresponding git_XXX_new() for each object. So far, only git_commits can be written back to disk once created in memory. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 0c3596f1 2010-09-20T01:57:53 Add setter methods & write support for git_commit All the required git_commit_set_XXX methods have been implemented; all the attributes of a commit object can now be modified in-memory. The new method git_object_write() automatically writes back the in-memory changes of any object to the repository. So far it only supports git_commit objects. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti e802d8cc 2010-09-19T03:53:57 Implement internal methods to write on sources The new 'git__source_printf' does an overflow-safe printf on a source buffer. The new 'git__source_write' does an overflow-safe byte write on a source buffer. Signed-off-by: Vicent Marti <tanoku@gmail.com>
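One common way to get the overflow-safe behaviour described in e802d8cc is to grow the buffer before writing, measuring formatted output with `vsnprintf` first. The sketch below assumes that approach; `demo_source`, `demo_source_reserve`, `demo_source_write` and `demo_source_printf` are hypothetical names, and nothing here is taken from the actual `git__source_printf` / `git__source_write` implementation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; start with { NULL, 0, 0 }. */
typedef struct {
    char  *data;
    size_t used; /* bytes written so far (excluding any trailing NUL) */
    size_t cap;  /* bytes allocated */
} demo_source;

/* Make sure `extra` more bytes fit after `used`, growing if needed. */
int demo_source_reserve(demo_source *src, size_t extra)
{
    size_t needed = src->used + extra;
    char *grown;

    assert(src->used <= src->cap);
    if (needed <= src->cap)
        return 0;
    grown = realloc(src->data, needed * 2);
    if (grown == NULL)
        return -1;
    src->data = grown;
    src->cap = needed * 2;
    return 0;
}

/* Append raw bytes; never writes past the allocation. */
int demo_source_write(demo_source *src, const void *bytes, size_t len)
{
    if (len == 0)
        return 0;
    if (demo_source_reserve(src, len) < 0)
        return -1;
    memcpy(src->data + src->used, bytes, len);
    src->used += len;
    return 0;
}

/* Append formatted text: measure first, grow, then format for real. */
int demo_source_printf(demo_source *src, const char *fmt, ...)
{
    va_list ap, ap2;
    int needed;

    va_start(ap, fmt);
    va_copy(ap2, ap);
    needed = vsnprintf(NULL, 0, fmt, ap); /* length without the NUL */
    va_end(ap);

    if (needed < 0 || demo_source_reserve(src, (size_t)needed + 1) < 0) {
        va_end(ap2);
        return -1;
    }
    vsnprintf(src->data + src->used, (size_t)needed + 1, fmt, ap2);
    va_end(ap2);
    src->used += (size_t)needed;
    return 0;
}
```

After a `demo_source_printf` call the data is NUL-terminated, but the terminator is not counted in `used`, so a following write overwrites it and the buffer should then be treated as raw bytes of length `used`.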
Vicent Marti f49a2e49 2010-09-19T03:21:06 Give object structures more descriptive names The 'git_obj' structure is now called 'git_rawobj', since it represents a raw object read from the ODB. The 'git_repository_object' structure is now called 'git_object', since it's the base object class for all objects. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti a7a7ddbe 2010-09-18T19:16:04 Add generic methods for object writeback git_repository_object has now several internal methods to write back the object information in the repository. - git_repository__dbo_prepare_write() Prepares the DBO object to be modified - git_repository__dbo_write() Writes new bytes to the DBO object - git_repository__dbo_writeback() Writes back the changes to the repository Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 46f8566a 2010-09-12T23:43:21 Add methods to access internal attributes in git_repo Added several methods to access: - The ODB behind a repo - The SHA1 id behind a generic repo object - The type of a generic repo object Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 9c9f4fc1 2010-08-12T23:40:54 Add support for manually freeing repo objects A new method 'git_repository_object_free' allows to manually force the freeing of a repository object, even though they are still automatically managed by the repository and don't need to be freed by the user. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti f2408cc2 2010-08-12T19:59:32 Fix object handling in git_repository All loaded objects through git_repository_lookup are properly parsed & free'd on failure. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 3315782c 2010-08-08T14:12:17 Redesigned the walking/object lookup interface The old 'git_revpool' object has been removed and split into two distinct objects with separate functionality, in order to have separate methods for object management and object walking. * A new object 'git_repository' does the high-level management of a repository's objects (commits, trees, tags, etc) on top of a 'git_odb'. Eventually, it will also manage other repository attributes (e.g. tag resolution, references, etc). See: src/git/repository.h * A new external method 'git_repository_lookup(repo, oid, type)' has been added to the 'git_repository' API. All object lookups (git_XXX_lookup()) are now wrappers to this method, and duplicated code has been removed. The method does automatic type checking and returns a generic 'git_revpool_object' that can be cast to any specific object. See: src/git/repository.h * The external methods for object parsing of repository objects (git_XXX_parse()) have been removed. Loading objects from the repository is now managed through the 'lookup' functions. These objects are loaded with minimal information, and the relevant parsing is done automatically when the user requests any of the parsed attributes through accessor methods. An attribute has been added to 'git_repository' in order to force the parsing of all the repository objects immediately after lookup. See: src/git/commit.h See: src/git/tag.h See: src/git/tree.h * The previous walking functionality of the revpool is now found in 'git_revwalk', which does the actual revision walking on a repository; the attributes when walking through commits in a database have been decoupled from the actual commit objects. This increases performance when accessing commits during the walk and allows to have several 'git_revwalk' instances working at the same time on top of the same repository, without having to load commits in memory several times. 
See: src/git/revwalk.h * The old 'git_revpool_table' has been renamed to 'git_hashtable' and now works as a generic hashtable with support for any kind of object and custom hash functions. See: src/hashtable.h * All the relevant unit tests have been updated, renamed and grouped accordingly. Signed-off-by: Vicent Marti <tanoku@gmail.com>