src/index.c


Log

Author Commit Date CI Message
Vicent Marti 817c2820 2011-02-21T17:05:16 Rewrite all file IO for more performance The new `git_filebuf` structure provides atomic high-performance writes to disk by using a write cache, and optionally a double-buffered scheme through a worker thread (not enabled yet). Writes can be done 3-layered, like in git.git (user code -> write cache -> disk), or 2-layered, by writing directly on the cache. This makes index writing considerably faster. The `git_filebuf` structure contains all the old functionality of `git_filelock` for atomic file writes and reads. The `git_filelock` structure has been removed. Additionally, the `git_filebuf` API allows to automatically hash (SHA1) all the data as it is written to disk (hashing is done smartly on big chunks to improve performance). Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti e822508a 2011-02-18T10:29:55 Disable threaded index writing by default The interlocking on the write threads was not being done properly (index entries were sometimes written out of order). With proper interlocking, the threaded write is only marginally faster on big index files, and slower on the smaller ones because of the overhead when creating threads. The threaded index writing has been temporarily disabled; after more accurate benchmarks, if might be possible to enable it again only when writing very large index files (> 1000 entries). Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 084c1935 2011-02-17T23:32:22 Fix type truncation in index entries 64-bit types stored in memory have to be truncated into 32 bits when writing to disk. Was causing warnings in MSVC. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 348c7335 2011-02-17T21:32:00 Improve the performance when writing Index files In response to issue #60 (git_index_write really slow), the write_index function has been rewritten to improve its performance -- it should now be in par with the performance of git.git. On top of that, if Posix Threads are available when compiling libgit2, a new threaded writing system will be used (3 separate threads take care of solving byte-endianness, hashing the contents of the index and writing to disk, respectively). For very long Index files, this method is up to 3x times faster than git.git. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 995f9c34 2011-02-09T12:43:19 Use the new git__joinpath to build paths in methods The `git__joinpath` function has been changed to use a statically allocated buffer; we assume the buffer to be 4096 bytes, because fuck you. The new method also supports an arbritrary number of paths to join, which may come in handy in the future. Some methods which were manually joining paths with `strcpy` now use the new function, namely those in `index.c` and `refs.c`. Based on Emeric Fermas' original patch, which was using the old `git__joinpath` because I'm stupid. Thanks! Signed-off-by: Vicent Marti <tanoku@gmail.com>
Alex Budovski f0bde7fa 2011-01-11T16:07:45 Revised platform types to use 'best supported' size. This will allow graceful migration to 64 bit file sizes and timestamps should git's binary interface be extended to allow this.
Alex Budovski 0a3bcad0 2011-01-10T14:57:06 Fix Windows build with forced bit truncation. Windows uses a 64 bit time_t by default and assigning to unsigned int causes a 64 -> 32 bit truncation warning. This change forces the truncation, acknowledging the implications detailed in the file comments. Also, blobs are limited to 32 bit file sizes for the same reason (on all platforms).
Vicent Marti a44fc1d4 2010-12-06T23:13:00 Fix type-conversion warnings The types in the git_index_entry struct are now system-defaults, and get truncated to uint32_t's when written back on the index. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 44908fe7 2010-12-06T23:03:16 Change the library include file Libgit2 is now officially include as #include "<git2.h>" or indidividual files may be included as #include <git2/index.h> Signed-off-by: Vicent Marti <tanoku@gmail.com>
nulltoken 6f02c3ba 2010-12-05T20:18:56 Small source code readability improvements. Replaced magic number "0" with GIT_SUCCESS constant wherever it made sense.
Vicent Marti c4034e63 2010-12-02T04:31:54 Refactor all 'vector' functions into common code All the operations on the 'git_index_entry' array and the 'git_tree_entry' array have been refactored into common code in the src/vector.c file. The new vector methods support: - insertion: O(1) (avg) - deletion: O(n) - searching: O(logn) - sorting: O(logn) - r. access: O(1) Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 91e88941 2010-11-29T18:06:22 Properly write Index Entry 'flags_extended' Always write the 'flags_extended' attribute to disk if it's available. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti c3dd69a9 2010-11-17T04:59:11 Fix resizing the index array No longer segfaults when resizing an empty array. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti c3a20d5c 2010-11-14T22:11:46 Add support for 'index add' Actually add files to the index by creating their corresponding blob and storing it on the repository, then getting the hash and updating the index file. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Scott Chacon 0be42199 2010-11-10T13:43:55 accessor for index entry count
Vicent Marti 3f43678e 2010-11-07T01:24:45 Make the Index API public Several private methods of the Index API are now public, including the methods to remove, get and add index entries. All the methods only take an integer value for the position of the entry to get/remove. To get or remove entries based on their path names, look them up first using the git_index_find method. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 1795f879 2010-11-05T03:20:17 Improve error handling All initialization functions now return error codes instead of pointers. Error codes are now properly propagated on most functions. Several new and more specific error codes have been added in common.h Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 6fd195d7 2010-11-02T18:42:42 Change git_repository initialization to use a path The constructor to git_repository is now called 'git_repository_open(path)' and takes a path to a git repository instead of an existing ODB object. Unit tests have been updated accordingly and the two test repositories have been merged into one. Signed-off-by: Vicent Marti <tanoku@gmail.com>
Vicent Marti 68535125 2010-07-09T20:19:56 Add support for git index files The new 'git_index' structure is an in-memory representation of a git index on disk; the 'git_index_entry' structures represent each one of the file entries on the index. The following calls for index instantiation have been added: git_index_alloc(): instantiate a new index structure git_index_free(): free an existing index git_index_clear(): clear all the entires in an existing file The following calls for index reading and writing have been added: git_index_read(): update the contents of the index structure from its file on disk. Internally implemented through: git_index__parse() Index files are stored on disk in network byte order; all integer fields inside them are properly converted to the machine's byte order when loading them in memory. The parsing engine also distinguishes between normal index entries and extended entries with 2 extra bytes of flags. The 'TREE' extension for index entries is also loaded into memory: Tree caches stored in Index files are loaded into the 'git_index_tree' structure pointed by the 'tree' pointer inside 'git_index'. 'index->tree' points to the root node of the tree cache; the full tree can be traversed through each of the node's 'tree->children'. Index files can be written back to disk through: git_index_write(): atomic writing of existing index objects backed by internal method git_index__write() The following calls for entry manipulation have been added: git_index_add(): insert an empty entry to the index git_index_find(): search an entry by its path name git_index__append(): appends a new index entry to the end of the list, resizing the entries array if required New index entries are always inserted at the end of the array; since the index entries must be sorted for it to be internally consistent, the index object is only sorted once, and if required, before accessing the whole entriea array (e.g. before writing to disk, before traversing, etc). git_index__remove_pos(): remove an index entry in a specific position git_index__sort(): sort the entries in the array by path name The entries array is sorted stably and in place using an insertion sort, which ought to be the most efficient approach since the entries array is always mostly-sorted. Signed-off-by: Vicent Marti <tanoku@gmail.com>