src/odb.h


Log

Author Commit Date CI Message
Edward Thomson 359fc2d2 2013-01-08T17:07:25 update copyrights
Carlos Martín Nieto f56f8585 2012-11-19T22:23:16 indexer: use the packfile streaming API The new API allows us to read the object bit by bit from the packfile, instead of needing it all at once in the packfile. This also allows us to hash the object as it comes in from the network instead of having to try to read it all and failing repeatedly for larger objects. This is only the first step, but it already shows huge improvements when dealing with objects over a few megabytes in size. It reduces the memory needs in some cases, but delta objects still need to be completely in memory and the old inefficent method is still used for that.
Russell Belfer c6ac28fd 2012-09-10T12:24:05 Reorg internal odb read header and object lookup Often `git_odb_read_header` will "fail" and have to read the entire object into memory instead of just the header. When this happens, the object is loaded and then disposed of immediately, which makes it difficult to efficiently use the header information to decide if the object should be loaded (since attempting to do so will often result in loading the object twice). This commit takes the existing code and reorganizes it to have two new functions: - `git_odb__read_header_or_object` which acts just like the old read header function except that it returns the object, too, if it was forced to load the whole thing. It then becomes the callers responsibility to free the `git_odb_object`. - `git_object__from_odb_object` which was extracted from the old `git_object_lookup` and creates a subclass of `git_object` from an existing `git_odb_object` (separating the ODB lookup from the `git_object` creation). This allows you to use the first header reading function efficiently without instantiating the `git_odb_object` twice. There is no net change to the behavior of any of the existing functions, but this allows internal code to tap into the ODB lookup and object creation to be more efficient.
Russell Belfer 60b9d3fc 2012-09-05T15:00:40 Implement filters for status/diff blobs This adds support to diff and status for running filters (a la crlf) on blobs in the workdir before computing SHAs and before generating text diffs. This ended up being a bit more code change than I had thought since I had to reorganize some of the diff logic to minimize peak memory use when filtering blobs in a diff. This also adds a cap on the maximum size of data that will be loaded to diff. I set it at 512Mb which should match core git. Right now it is a #define in src/diff.h but it could be moved into the public API if desired.
Vicent Martí 904b67e6 2012-05-18T01:48:50 errors: Rename error codes
Vicent Martí e172cf08 2012-05-18T01:21:06 errors: Rename the generic return codes
Russell Belfer 282283ac 2012-05-04T16:46:46 Fix valgrind issues There are three changes here: - correctly propogate error code from failed object lookups - make zlib inflate use our allocators - add OID to notfound error in ODB lookups
Russell Belfer e1de726c 2012-03-12T22:55:40 Migrate ODB files to new error handling This migrates odb.c, odb_loose.c, odb_pack.c and pack.c to the new style of error handling. Also got the unix and win32 versions of map.c. There are some minor changes to other files but no others were completely converted. This also contains an update to filebuf so that a zeroed out filebuf will not think that the fd (== 0) is actually open (and inadvertently call close() on fd 0 if cleaned up). Lastly, this was built and tested on win32 and contains a bunch of fixes for the win32 build which was pretty broken.
schu 5e0de328 2012-02-13T17:10:24 Update Copyright header Signed-off-by: schu <schu-github@schulog.org>
Vicent Martí f19e3ca2 2012-02-10T20:16:42 odb: Proper symlink hashing
Vicent Martí 18e5b854 2012-02-10T19:47:02 odb: Add internal `git_odb__hashfd`
Vicent Marti 9462c471 2011-11-25T08:16:26 repository: Change ownership semantics The ownership semantics have been changed all over the library to be consistent. There are no more "borrowed" or duplicated references. Main changes: - `git_repository_open2` and `3` have been dropped. - Added setters and getters to hotswap all the repository owned objects: `git_repository_index` `git_repository_set_index` `git_repository_odb` `git_repository_set_odb` `git_repository_config` `git_repository_set_config` `git_repository_workdir` `git_repository_set_workdir` Now working directories/index files/ODBs and so on can be hot-swapped after creating a repository and between operations. - All these objects now have proper ownership semantics with refcounting: they all require freeing after they are no longer needed (the repository always keeps its internal reference). - Repository open and initialization has been updated to keep in mind the configuration files. Bare repositories are now always detected, and a default config file is created on init. - All the tests affected by these changes have been dropped from the old test suite and ported to the new one.
Brodie Rao 01ad7b3a 2011-09-06T15:48:45 *: correct and codify various file permissions The following files now have 0444 permissions: - loose objects - pack indexes - pack files - packs downloaded by fetch - packs downloaded by the HTTP transport And the following files now have 0666 permissions: - config files - repository indexes - reflogs - refs This brings libgit2 more in line with Git. Note that git_filebuf_commit() and git_filebuf_commit_at() have both gained a new mode parameter. The latter change fixes an important issue where filebufs created with GIT_FILEBUF_TEMPORARY received 0600 permissions (due to mkstemp(3) usage). Now we chmod() the file before renaming it into place. Tests have been added to confirm that new commit, tag, and tree objects are created with the right permissions. I don't have access to Windows, so for now I've guarded the tests with "#ifndef GIT_WIN32".
Brodie Rao ce8cd006 2011-09-07T15:32:44 fileops/repository: create (most) directories with 0777 permissions To further match how Git behaves, this change makes most of the directories libgit2 creates in a git repo have a file mode of 0777. Specifically: - Intermediate directories created with git_futils_mkpath2file() have 0777 permissions. This affects odb_loose, reflog, and refs. - The top level folder for bare repos is created with 0777 permissions. - The top level folder for non-bare repos is created with 0755 permissions. - /objects/info/, /objects/pack/, /refs/heads/, and /refs/tags/ are created with 0777 permissions. Additionally, the following changes have been made: - fileops functions that create intermediate directories have grown a new dirmode parameter. The only exception to this is filebuf's lock_file(), which unconditionally creates intermediate directories with 0777 permissions when GIT_FILEBUF_FORCE is set. - The test runner now sets the umask to 0 before running any tests. This ensurses all file mode checks are consistent across systems. - t09-tree.c now does a directory permissions check. I've avoided adding this check to other tests that might reuse existing directories from the prefabricated test repos. Because they're checked into the repo, they have 0755 permissions. - Other assorted directories created by tests have 0777 permissions.
Vicent Marti 87d9869f 2011-09-19T03:34:49 Tabify everything There were quite a few places were spaces were being used instead of tabs. Try to catch them all. This should hopefully not break anything. Except for `git blame`. Oh well.
Vicent Marti bb742ede 2011-09-19T01:54:32 Cleanup legal data 1. The license header is technically not valid if it doesn't have a copyright signature. 2. The COPYING file has been updated with the different licenses used in the project. 3. The full GPLv2 header in each file annoys me.
Vicent Marti c85e08b1 2011-08-16T13:05:05 odb: Do not pass around a header when hashing
Vicent Marti 72a3fe42 2011-03-18T19:38:49 I broke your bindings Hey. Apologies in advance -- I broke your bindings. This is a major commit that includes a long-overdue redesign of the whole object-database structure. This is expected to be the last major external API redesign of the library until the first non-alpha release. Please get your bindings up to date with these changes. They will be included in the next minor release. Sorry again! Major features include: - Real caching and refcounting on parsed objects - Real caching and refcounting on objects read from the ODB - Streaming writes & reads from the ODB - Single-method writes for all object types - The external API is now partially thread-safe The speed increases are significant in all aspects, specially when reading an object several times from the ODB (revwalking) and when writing big objects to the ODB. Here's a full changelog for the external API: blob.h ------ - Remove `git_blob_new` - Remove `git_blob_set_rawcontent` - Remove `git_blob_set_rawcontent_fromfile` - Rename `git_blob_writefile` -> `git_blob_create_fromfile` - Change `git_blob_create_fromfile`: The `path` argument is now relative to the repository's working dir - Add `git_blob_create_frombuffer` commit.h -------- - Remove `git_commit_new` - Remove `git_commit_add_parent` - Remove `git_commit_set_message` - Remove `git_commit_set_committer` - Remove `git_commit_set_author` - Remove `git_commit_set_tree` - Add `git_commit_create` - Add `git_commit_create_v` - Add `git_commit_create_o` - Add `git_commit_create_ov` tag.h ----- - Remove `git_tag_new` - Remove `git_tag_set_target` - Remove `git_tag_set_name` - Remove `git_tag_set_tagger` - Remove `git_tag_set_message` - Add `git_tag_create` - Add `git_tag_create_o` tree.h ------ - Change `git_tree_entry_2object`: New signature is `(git_object **object_out, git_repository *repo, git_tree_entry *entry)` - Remove `git_tree_new` - Remove `git_tree_add_entry` - Remove `git_tree_remove_entry_byindex` - Remove `git_tree_remove_entry_byname` - Remove `git_tree_clearentries` - Remove `git_tree_entry_set_id` - Remove `git_tree_entry_set_name` - Remove `git_tree_entry_set_attributes` object.h ------------ - Remove `git_object_new - Remove `git_object_write` - Change `git_object_close`: This method is now *mandatory*. Not closing an object causes a memory leak. odb.h ----- - Remove type `git_rawobj` - Remove `git_rawobj_close` - Rename `git_rawobj_hash` -> `git_odb_hash` - Change `git_odb_hash`: New signature is `(git_oid *id, const void *data, size_t len, git_otype type)` - Add type `git_odb_object` - Add `git_odb_object_close` - Change `git_odb_read`: New signature is `(git_odb_object **out, git_odb *db, const git_oid *id)` - Change `git_odb_read_header`: New signature is `(size_t *len_p, git_otype *type_p, git_odb *db, const git_oid *id)` - Remove `git_odb_write` - Add `git_odb_open_wstream` - Add `git_odb_open_rstream` odb_backend.h ------------- - Change type `git_odb_backend`: New internal signatures are as follows int (* read)(void **, size_t *, git_otype *, struct git_odb_backend *, const git_oid *) int (* read_header)(size_t *, git_otype *, struct git_odb_backend *, const git_oid *) int (* writestream)(struct git_odb_stream **, struct git_odb_backend *, size_t, git_otype) int (* readstream)( struct git_odb_stream **, struct git_odb_backend *, const git_oid *) - Add type `git_odb_stream` - Add enum `git_odb_streammode` Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti becff042 2011-02-07T08:09:11 Fix compilation in MSVC The git_odb_backend_* symbols were being redefined as external. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 44908fe7 2010-12-06T23:03:16 Change the library include file Libgit2 is now officially include as #include "<git2.h>" or indidividual files may be included as #include <git2/index.h> Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 7d7cd885 2010-12-03T18:01:30 Decouple storage from ODB logic Comes with two default backends: loose object and packfiles. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti d12299fe 2010-12-03T22:22:10 Change include structure for the project The maze with include dependencies has been fixed. There is now a global include: #include <git.h> The git_odb_backend API has been exposed. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Ramsay Jones 89217d8f 2010-04-28T20:20:00 Add functions to open a '*.pack' file and perform some basic validation Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Shawn O. Pearce a7c60cfc 2009-01-03T02:41:26 Add basic support to read pack-*.idx v1 and v2 files The index data is mapped into memory and then scanned using a binary search algorithm to locate the matching entry for the supplied git_oid. The standard fanout hash trick is applied to reduce the search space by 8 iterations. Since the v1 and v2 file formats differ in their search function, due to the different layouts used for the object records, we use two different search implementations and a virtual function pointer to jump to the correct version of code for the current pack index. The single function jump per-pack should be faster then computing a branch point inside the inner loop of a common binary search. To improve concurrency during read operations the pack lock is only held while verifying the index is actually open, or while opening the index for the first time. This permits multiple concurrent readers to scan through the same index. If an invalid index file is opened we close it and mark the git_pack's invalid bit to true. The git_pack structure is kept around in its parent git_packlist, but the invalid bit will cause all future readers to skip over the pack entirely. Pruning the invalid entries is relatively unimportant because they shouldn't be very common, a $GIT_DIRECTORY/objects/pack directory tends to only have valid pack files. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>