src/odb_pack.c


Log

Author Commit Date CI Message
Michael Schubert 3d7617e4 2012-09-14T21:33:50 odb_pack: fix race condition last_found is the last packfile a wanted object was found in. Since last_found is shared among all searching threads, it might changes while we're searching. As suggested by @arrbee, put a copy on the stack to fix the race condition.
David Michael Barr ab8a0402 2012-09-12T14:26:31 odb_pack: try lookup before refreshing packs This reduces the rate of syscalls for the common case of sequences of object reads from the same pack. Best of 5 timings for libgit2_clar before this patch: real 0m5.375s user 0m0.392s sys 0m3.564s After applying this patch: real 0m5.285s user 0m0.356s sys 0m3.544s 0.6% improvement in system time. 9.2% improvement in user time. 1.7% improvement in elapsed time. Confirmed a 0.6% reduction in number of system calls with strace. Expect greater improvement for graph-traversal with large packs.
Carlos Martín Nieto f9988d4e 2012-09-04T21:42:00 odb: pass the user's data pointer correctly in foreach
Vicent Marti 51e1d808 2012-08-06T12:41:08 Merge remote-tracking branch 'arrbee/tree-walk-fixes' into development Conflicts: src/notes.c src/transports/git.c src/transports/http.c src/transports/local.c tests-clar/odb/foreach.c
Russell Belfer 5dca2010 2012-08-03T17:08:01 Update iterators for consistency across library This updates all the `foreach()` type functions across the library that take callbacks from the user to have a consistent behavior. The rules are: * A callback terminates the loop by returning any non-zero value * Once the callback returns non-zero, it will not be called again (i.e. the loop stops all iteration regardless of state) * If the callback returns non-zero, the parent fn returns GIT_EUSER * Although the parent returns GIT_EUSER, no error will be set in the library and `giterr_last()` will return NULL if called. This commit makes those changes across the library and adds tests for most of the iteration APIs to make sure that they follow the above rules.
Vicent Marti e25dda51 2012-08-02T01:38:30 Merge remote-tracking branch 'nulltoken/topic/amd64-compat' into development Conflicts: src/netops.c src/netops.h src/oid.c
nulltoken b8457baa 2012-07-24T07:57:58 portability: Improve x86/amd64 compatibility
Carlos Martín Nieto 507523c3 2012-07-21T16:23:49 odb: allow creating an ODB backend from a packfile index git_odb_backend_one_packfile() allows us to create an ODB backend out of an .idx file.
Carlos Martín Nieto 521aedad 2012-06-05T14:48:51 odb: add git_odb_foreach() Go through each backend and list every objects that exists in them. This allows fsck-like uses.
Vicent Martí 904b67e6 2012-05-18T01:48:50 errors: Rename error codes
Vicent Martí e172cf08 2012-05-18T01:21:06 errors: Rename the generic return codes
Russell Belfer 282283ac 2012-05-04T16:46:46 Fix valgrind issues There are three changes here: - correctly propogate error code from failed object lookups - make zlib inflate use our allocators - add OID to notfound error in ODB lookups
nulltoken fa6420f7 2012-04-29T21:46:33 buf: deploy git_buf_len()
Russell Belfer 44ef8b1b 2012-04-13T13:00:10 Fix warnings on 64-bit windows builds This fixes all the warnings on win64 except those in deps, which come from the regex code.
Russell Belfer 0d0fa7c3 2012-03-16T15:56:01 Convert attr, ignore, mwindow, status to new errors Also cleaned up some previously converted code that still had little things to polish.
Russell Belfer e1de726c 2012-03-12T22:55:40 Migrate ODB files to new error handling This migrates odb.c, odb_loose.c, odb_pack.c and pack.c to the new style of error handling. Also got the unix and win32 versions of map.c. There are some minor changes to other files but no others were completely converted. This also contains an update to filebuf so that a zeroed out filebuf will not think that the fd (== 0) is actually open (and inadvertently call close() on fd 0 if cleaned up). Lastly, this was built and tested on win32 and contains a bunch of fixes for the win32 build which was pretty broken.
Vicent Martí 1a481123 2012-02-17T00:13:34 error-handling: References Yes, this is error handling solely for `refs.c`, but some of the abstractions leak all ofer the code base.
Russell Belfer 854eccbb 2012-02-29T12:04:59 Clean up GIT_UNUSED macros on all platforms It turns out that commit 31e9cfc4cbcaf1b38cdd3dbe3282a8f57e5366a5 did not fix the GIT_USUSED behavior on all platforms. This commit walks through and really cleans things up more thoroughly, getting rid of the unnecessary stuff. To remove the use of some GIT_UNUSED, I ended up adding a couple of new iterators for hashtables that allow you to iterator just over keys or just over values. In making this change, I found a bug in the clar tests (where we were doing *count++ but meant to do (*count)++ to increment the value). I fixed that but then found the test failing because it was not really using an empty repo. So, I took some of the code that I wrote for iterator testing and moved it to clar_helpers.c, then made use of that to make it easier to open fixtures on a per test basis even within a single test file.
Vicent Martí 0c3bae62 2012-02-15T16:56:56 zlib: Remove custom `git2/zlib.h` header This is legacy compat stuff for when `deflateBound` is not defined, but we're not embedding zlib and that function is always available. Kill that with fire.
schu 5e0de328 2012-02-13T17:10:24 Update Copyright header Signed-off-by: schu <schu-github@schulog.org>
Russell Belfer 1744fafe 2012-01-17T15:49:47 Move path related functions from fileops to path This takes all of the functions that look up simple data about paths (such as `git_futils_isdir`) and moves them over to path.h (becoming `git_path_isdir`). This leaves fileops.h just with functions that actually manipulate the filesystem or look at the file contents in some way. As part of this, the dir.h header which is really just for win32 support was moved into win32 (with some minor changes).
Russell Belfer 97769280 2011-11-30T11:27:15 Use git_buf for path storage instead of stack-based buffers This converts virtually all of the places that allocate GIT_PATH_MAX buffers on the stack for manipulating paths to use git_buf objects instead. The patch is pretty careful not to touch the public API for libgit2, so there are a few places that still use GIT_PATH_MAX. This extends and changes some details of the git_buf implementation to add a couple of extra functions and to make error handling easier. This includes serious alterations to all the path.c functions, and several of the fileops.c ones, too. Also, there are a number of new functions that parallel existing ones except that use a git_buf instead of a stack-based buffer (such as git_config_find_global_r that exists alongsize git_config_find_global). This also modifies the win32 version of p_realpath to allocate whatever buffer size is needed to accommodate the realpath instead of hardcoding a GIT_PATH_MAX limit, but that change needs to be tested still.
Vicent Marti 3286c408 2011-10-28T14:51:13 global: Properly use `git__` memory wrappers Ensure that all memory related functions (malloc, calloc, strdup, free, etc) are using their respective `git__` wrappers.
Sven Strickroth 599297fd 2011-10-03T23:12:43 ignore missing pack file as git does See http://code.google.com/p/tortoisegit/issues/detail?id=862 Signed-off-by: Sven Strickroth <email@cs-ware.de>
Vicent Martí 71a4c1f1 2011-09-18T20:07:59 Merge pull request #384 from kiryl/warnings Add more -W flags to CFLAGS
Vicent Marti 87d9869f 2011-09-19T03:34:49 Tabify everything There were quite a few places were spaces were being used instead of tabs. Try to catch them all. This should hopefully not break anything. Except for `git blame`. Oh well.
Vicent Marti bb742ede 2011-09-19T01:54:32 Cleanup legal data 1. The license header is technically not valid if it doesn't have a copyright signature. 2. The COPYING file has been updated with the different licenses used in the project. 3. The full GPLv2 header in each file annoys me.
Kirill A. Shutemov d568d585 2011-08-30T23:55:22 CMakefile: add -Wmissing-prototypes and fix warnings Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Kirill A. Shutemov 932669b8 2011-08-25T14:22:57 Drop STRLEN() macros There is no need in STRLEN macros. Compilers can do this trivial optimization on its own. Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Vicent Marti f6867e63 2011-08-08T16:56:28 Fix compilation in Windows
Carlos Martín Nieto b5b474dd 2011-07-28T11:45:46 Modify the given offset in git_packfile_unpack The callers immediately throw away the offset, so we don't need any logical changes in any of them. This will be useful for the indexer, as it does need to know where the compressed data ends. Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Carlos Martín Nieto a070f152 2011-07-29T01:08:02 Move pack functions to their own file
Carlos Martín Nieto 7d0cdf82 2011-07-09T02:25:01 Make packfile_unpack_header more generic On the way, store the fd and the size in the mwindow file. Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Carlos Martín Nieto c7c9e183 2011-07-07T10:17:40 Move the pack structs to an internal header
Carlos Martín Nieto 7bfdb3d2 2011-06-28T20:39:30 Factor out the mmap window code This code is useful for more things than just the packfile handling code. Factor it out so it can be reused. Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
Kirill A. Shutemov 03cdbab4 2011-07-15T16:44:25 odb_pack: fix cast warnings /home/kas/git/public/libgit2/src/odb_pack.c: In function ‘packfile_sort__cb’: /home/kas/git/public/libgit2/src/odb_pack.c:702:24: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual] /home/kas/git/public/libgit2/src/odb_pack.c:703:24: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual] /home/kas/git/public/libgit2/src/odb_pack.c: In function ‘nth_packed_object_offset’: /home/kas/git/public/libgit2/src/odb_pack.c:944:10: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual] /home/kas/git/public/libgit2/src/odb_pack.c:944:10: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual] /home/kas/git/public/libgit2/src/odb_pack.c:944:10: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual] /home/kas/git/public/libgit2/src/odb_pack.c:948:9: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual] /home/kas/git/public/libgit2/src/odb_pack.c:948:9: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual] /home/kas/git/public/libgit2/src/odb_pack.c:948:9: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual] /home/kas/git/public/libgit2/src/odb_pack.c:952:22: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual] /home/kas/git/public/libgit2/src/odb_pack.c:952:22: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual] /home/kas/git/public/libgit2/src/odb_pack.c:952:22: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual] /home/kas/git/public/libgit2/src/odb_pack.c:953:8: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual] /home/kas/git/public/libgit2/src/odb_pack.c:953:8: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual] /home/kas/git/public/libgit2/src/odb_pack.c:953:8: warning: cast discards ‘__attribute__((const))’ qualifier from pointer target type [-Wcast-qual] Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Vicent Marti de18f276 2011-07-07T01:46:20 vector: Timsort all of the things Drop the GLibc implementation of Merge Sort and replace it with Timsort. The algorithm has been tuned to work on arrays of pointers (void **), so there's no longer a need to abstract the byte-width of each element in the array. All the comparison callbacks now take pointers-to-elements, not pointers-to-pointers, so there's now one less level of dereferencing. E.g. int index_cmp(const void *a, const void *b) { - const git_index_entry *entry_a = *(const git_index_entry **)(a); + const git_index_entry *entry_a = (const git_index_entry *)(a); The result is up to a 40% speed-up when sorting vectors. Memory usage remains lineal. A new `bsearch` implementation has been added, whose callback also supplies pointer-to-elements, to uniform the Vector API again.
Carlos Martín Nieto 2ee318a7 2011-07-05T15:09:17 Small fixes in pack_window_open Check if the window structure has actually been allocated before trying to access it, and don't leak said structure if the map fails. Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Vicent Marti f79026b4 2011-07-04T11:43:34 fileops: Cleanup Cleaned up the structure of the whole OS-abstraction layer. fileops.c now contains a set of utility methods for file management used by the library. These are abstractions on top of the original POSIX calls. There's a new file called `posix.c` that contains emulations/reimplementations of all the POSIX calls the library uses. These are prefixed with `p_`. There's a specific posix file for each platform (win32 and unix). All the path-related methods have been moved from `utils.c` to `path.c` and have their own prefix.
Kirill A. Shutemov 932d1baf 2011-06-30T19:52:34 cleanup: remove trailing spaces Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Carlos Martín Nieto cdb6f9bf 2011-06-20T17:34:01 Allocate enough memory for the terminator in commit parsing Also allow space for the null-terminator when allocating the buffer in packfile_unpack_compressed. Up to now, the last newline had served as a terminator, but 858ef372 searches for a double-newline and exposes the problem. Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Vicent Marti f1d01851 2011-06-16T02:48:48 oid: Uniformize ncmp methods Drop redundant methods. The ncmp method is now public
Vicent Marti fa48608e 2011-06-16T02:36:21 oid: Rename methods Yeah. Finally. Fuck the old names, this ain't POSIX and they don't make any sense at all.
Vicent Martí 1b0d92b1 2011-06-06T18:28:11 Merge pull request #238 from pegonma/git_oid_ncmp Better name for git_oid_match
Marc Pegon c09093cc 2011-06-06T10:55:36 Renamed git_oid_match to git_oid_ncmp. As suggested by carlosmn, git_oid_ncmp would probably be a better name than git_oid_match, for it does the same as git_oid_cmp but only up to a certain amount of hex digits.
Johan 't Hart 393a9f9e 2011-06-06T00:33:23 Fix build errors on MSVC
Vicent Marti 7107b599 2011-06-02T01:03:52 odb-pack: More variable declarations
Vicent Marti 7b5fe049 2011-06-02T00:47:51 odb-pack: Do not declare variables mid-function
Vicent Marti d0323a5f 2011-06-01T21:25:56 short-oid: Cleanup
Marc Pegon 6c8ca697 2011-05-29T17:57:25 Fixed some error messages related to searching objects from a short oid. Fixed forgot to check that prefix length is greater than minimum prefix length in read_unique_short_oid method from pack backend.
Marc Pegon ac2b94ad 2011-05-28T21:24:25 Added a GIT_OID_MINPREFIXLEN constant to define the minimum length allowed for oid prefixes (set to 4, like in git). Consequently updated some object lookup methods and their documentation.
Marc Pegon dd453c4d 2011-05-27T22:46:41 Added git.git sha1 lookup method to replace simple binary search in pack backend. Implemented find_unique_short_oid for pack backend, based on git sha1 lookup method; finding an object given its full oid is just a particular case of searching the unique object matching an oid prefix (short oid). Added git_odb_read_unique_short_oid, which iterates over all the backends to find and read the unique object matching the given oid prefix. Added a git_object_lookup_short_oid method to find the unique object in the repository matching a given oid prefix : it generalizes git_object_lookup which now does nothing but calls git_object_lookup_short_oid.
Marc Pegon ecd6fdf1 2011-05-27T18:49:09 Added a read_unique_short_oid method to backends, to make it possible to find objects from sha1 prefixes in the future. Default implementations throw GIT_ENOTIMPLEMENTED for strict prefixes (i.e. length < GIT_OID_HEXSZ).
Vicent Marti f84d9819 2011-05-23T21:14:58 odb_pack: Reword errors
Jakob Pfender 267d539f 2011-05-19T13:38:12 odb_pack.c: Move to new error handling mechanism
Vicent Martí 273c8bc0 2011-05-01T14:59:11 Merge pull request #147 from nordsturm/fix_pack_backend_leak. Fix memory leak in pack_backend__free
Vicent Marti c7b79af3 2011-05-01T21:31:58 pack-odb: Check `mtime` instead of folder size Do not check the folder's size to detect new packfiles at runtime. This doesn't work on Win32.
Sergey Nikishin ed6c462c 2011-04-27T17:30:45 Fix memory leak in pack_backend__free
Vicent Marti 1d008781 2011-04-23T23:59:38 Fix conversion warning in MSVC
Vicent Marti 90d743cd 2011-04-15T15:12:37 Refresh the list of packfiles on each ODB query Fixes the issue where object lookups were failing right after a pull on an open repository.
nulltoken 56d8ca26 2011-03-20T18:36:25 Switch from time_t to git_time_t git_time_t is defined as a signed 64 integer. This allows a true predictable multiplatform behavior.
Vicent Marti 72a3fe42 2011-03-18T19:38:49 I broke your bindings Hey. Apologies in advance -- I broke your bindings. This is a major commit that includes a long-overdue redesign of the whole object-database structure. This is expected to be the last major external API redesign of the library until the first non-alpha release. Please get your bindings up to date with these changes. They will be included in the next minor release. Sorry again! Major features include: - Real caching and refcounting on parsed objects - Real caching and refcounting on objects read from the ODB - Streaming writes & reads from the ODB - Single-method writes for all object types - The external API is now partially thread-safe The speed increases are significant in all aspects, specially when reading an object several times from the ODB (revwalking) and when writing big objects to the ODB. Here's a full changelog for the external API: blob.h ------ - Remove `git_blob_new` - Remove `git_blob_set_rawcontent` - Remove `git_blob_set_rawcontent_fromfile` - Rename `git_blob_writefile` -> `git_blob_create_fromfile` - Change `git_blob_create_fromfile`: The `path` argument is now relative to the repository's working dir - Add `git_blob_create_frombuffer` commit.h -------- - Remove `git_commit_new` - Remove `git_commit_add_parent` - Remove `git_commit_set_message` - Remove `git_commit_set_committer` - Remove `git_commit_set_author` - Remove `git_commit_set_tree` - Add `git_commit_create` - Add `git_commit_create_v` - Add `git_commit_create_o` - Add `git_commit_create_ov` tag.h ----- - Remove `git_tag_new` - Remove `git_tag_set_target` - Remove `git_tag_set_name` - Remove `git_tag_set_tagger` - Remove `git_tag_set_message` - Add `git_tag_create` - Add `git_tag_create_o` tree.h ------ - Change `git_tree_entry_2object`: New signature is `(git_object **object_out, git_repository *repo, git_tree_entry *entry)` - Remove `git_tree_new` - Remove `git_tree_add_entry` - Remove `git_tree_remove_entry_byindex` - Remove `git_tree_remove_entry_byname` - Remove `git_tree_clearentries` - Remove `git_tree_entry_set_id` - Remove `git_tree_entry_set_name` - Remove `git_tree_entry_set_attributes` object.h ------------ - Remove `git_object_new - Remove `git_object_write` - Change `git_object_close`: This method is now *mandatory*. Not closing an object causes a memory leak. odb.h ----- - Remove type `git_rawobj` - Remove `git_rawobj_close` - Rename `git_rawobj_hash` -> `git_odb_hash` - Change `git_odb_hash`: New signature is `(git_oid *id, const void *data, size_t len, git_otype type)` - Add type `git_odb_object` - Add `git_odb_object_close` - Change `git_odb_read`: New signature is `(git_odb_object **out, git_odb *db, const git_oid *id)` - Change `git_odb_read_header`: New signature is `(size_t *len_p, git_otype *type_p, git_odb *db, const git_oid *id)` - Remove `git_odb_write` - Add `git_odb_open_wstream` - Add `git_odb_open_rstream` odb_backend.h ------------- - Change type `git_odb_backend`: New internal signatures are as follows int (* read)(void **, size_t *, git_otype *, struct git_odb_backend *, const git_oid *) int (* read_header)(size_t *, git_otype *, struct git_odb_backend *, const git_oid *) int (* writestream)(struct git_odb_stream **, struct git_odb_backend *, size_t, git_otype) int (* readstream)( struct git_odb_stream **, struct git_odb_backend *, const git_oid *) - Add type `git_odb_stream` - Add enum `git_odb_streammode` Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 58d06cf1 2011-03-10T01:06:24 Rewrite the Pack backend The new pack backend is an adaptation of the original git.git code in `sha1_file.c`. It's slightly faster than the previous version and severely less memory-hungry. The call-stack of a normal pack backend query has been properly documented in the top of the header for future reference. And by properly I mean with ASCII diagrams 'n shit.
Vicent Marti d4b5a4e2 2011-02-09T19:49:02 Internal changes on the backend system The priority value for different backends has been removed from the public `git_odb_backend` struct. We handle that internally. The priority value is specified on the `git_odb_add_alternate`. This is convenient because it allows us to poll a backend twice with different priorities without having to instantiate it twice. We also differentiate between main backends and alternates; alternates have lower priority and cannot be written to. These changes come with some unit tests to make sure that the backend sorting is consistent. The libgit2 version has been bumped to 0.4.0. This commit changes the external API: CHANGED: struct git_odb_backend No longer has a `priority` attribute; priority for the backend in managed internally by the library. git_odb_add_backend(git_odb *odb, git_odb_backend *backend, int priority) Now takes an additional priority parameter, the priority that will be given to the backend. ADDED: git_odb_add_alternate(git_odb *odb, git_odb_backend *backend, int priority) Add a backend as an alternate. Alternate backends have always lower priority than main backends, and writing is disabled on them. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Vicent Marti <tanoku@gmail.com>
Alex Budovski f0bde7fa 2011-01-11T16:07:45 Revised platform types to use 'best supported' size. This will allow graceful migration to 64 bit file sizes and timestamps should git's binary interface be extended to allow this.
Vicent Marti 44908fe7 2010-12-06T23:03:16 Change the library include file Libgit2 is now officially include as #include "<git2.h>" or indidividual files may be included as #include <git2/index.h> Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 7d7cd885 2010-12-03T18:01:30 Decouple storage from ODB logic Comes with two default backends: loose object and packfiles. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti d12299fe 2010-12-03T22:22:10 Change include structure for the project The maze with include dependencies has been fixed. There is now a global include: #include <git.h> The git_odb_backend API has been exposed. Signed-off-by: Vicent Marti <tanoku@gmail.com>