|
bed3229b
|
2009-01-03T03:34:09
|
|
Precompute the fanout decoding and the oid offset in a pack-*.idx
The fanout table is accessed frequently; we need to read it
twice for each object we look up in any given pack file. Most of
the processors running Git are running in little-endian mode, as
they are variants of the x86 platform, so reading the fanout is
a costly operation as we need to convert from network byte order
to local byte order. By decoding the fanout table into a malloc
obtained buffer we can save these 2 decode operations per lookup
and make search go more quickly.
This also cleans up the initialization of the search functions
by cutting out a few instructions, saving a small amount of time.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
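
As a rough illustration of the idea above, the following sketch decodes a 256-entry fanout table from network byte order into a malloc'd host-order array once, so that each lookup reads two plain integers. The names read_be32, decode_fanout and fanout_range are hypothetical and are not taken from the library; the raw pointer is assumed to point at the first fanout entry of a mapped index.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define FANOUT_ENTRIES 256

/* Read one 32-bit big-endian (network order) value from a byte buffer. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Decode the fanout table once into a malloc'd array of host-order
 * integers. Returns NULL if the allocation fails; the caller frees it.
 */
static uint32_t *decode_fanout(const unsigned char *raw)
{
    uint32_t *fanout;
    int i;

    assert(raw != NULL);
    fanout = malloc(FANOUT_ENTRIES * sizeof(*fanout));
    if (!fanout)
        return NULL;
    for (i = 0; i < FANOUT_ENTRIES; i++)
        fanout[i] = read_be32(raw + 4 * i);
    return fanout;
}

/*
 * The two reads per lookup: entries whose oid starts with first_byte
 * occupy the half-open range [*lo, *hi) of the sorted record table.
 */
static void fanout_range(const uint32_t *fanout, unsigned char first_byte,
                         uint32_t *lo, uint32_t *hi)
{
    *lo = first_byte ? fanout[first_byte - 1] : 0;
    *hi = fanout[first_byte];
}
```

With the decoded array in hand, fanout_range performs no byte swapping at all, which is the saving the commit message describes.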
|
|
a7c60cfc
|
2009-01-03T02:41:26
|
|
Add basic support to read pack-*.idx v1 and v2 files
The index data is mapped into memory and then scanned using a
binary search algorithm to locate the matching entry for the
supplied git_oid. The standard fanout hash trick is applied to
reduce the search space by 8 iterations.
Since the v1 and v2 file formats differ in their search function,
due to the different layouts used for the object records, we use
two different search implementations and a virtual function pointer
to jump to the correct version of code for the current pack index.
The single function jump per-pack should be faster than computing
a branch point inside the inner loop of a common binary search.
To improve concurrency during read operations the pack lock is only
held while verifying the index is actually open, or while opening
the index for the first time. This permits multiple concurrent
readers to scan through the same index.
If an invalid index file is opened we close it and mark the
git_pack's invalid bit to true. The git_pack structure is kept
around in its parent git_packlist, but the invalid bit will cause
all future readers to skip over the pack entirely. Pruning the
invalid entries is relatively unimportant because they shouldn't
be very common; a $GIT_DIRECTORY/objects/pack directory tends to
only have valid pack files.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
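
The search described above can be sketched roughly as follows. This is a simplified illustration, not the library's code: struct idx_view and every function name here are hypothetical, the fanout is assumed to be already decoded to host order, and records is assumed to point at the first object record (24-byte offset+oid records for v1, a table of bare 20-byte oids for v2). The fanout narrows the range by the first oid byte, which is where the saving of roughly 8 iterations comes from (256 buckets = 2^8).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OID_SIZE 20

/* Hypothetical, simplified view of an opened pack index. */
struct idx_view {
    const unsigned char *records;  /* first object record */
    uint32_t fanout[256];          /* host-order cumulative counts */
    /* per-pack search function, chosen once when the index is opened */
    long (*search)(const struct idx_view *idx, const unsigned char *oid);
};

/* Binary search over records [lo, hi); returns the record index or -1. */
static long search_range(const unsigned char *rec, size_t stride,
                         size_t oid_at, uint32_t lo, uint32_t hi,
                         const unsigned char *oid)
{
    while (lo < hi) {
        uint32_t mid = lo + (hi - lo) / 2;
        int cmp = memcmp(oid, rec + (size_t)mid * stride + oid_at, OID_SIZE);

        if (cmp == 0)
            return (long)mid;
        if (cmp < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return -1;
}

/* v1 records: 4-byte offset followed by the 20-byte oid. */
static long search_v1(const struct idx_view *idx, const unsigned char *oid)
{
    uint32_t lo = oid[0] ? idx->fanout[oid[0] - 1] : 0;
    return search_range(idx->records, 24, 4, lo, idx->fanout[oid[0]], oid);
}

/* v2 records: a sorted table of bare 20-byte oids. */
static long search_v2(const struct idx_view *idx, const unsigned char *oid)
{
    uint32_t lo = oid[0] ? idx->fanout[oid[0] - 1] : 0;
    return search_range(idx->records, 20, 0, lo, idx->fanout[oid[0]], oid);
}

/* One indirect jump per lookup; no version test inside the loop. */
static long idx_lookup(const struct idx_view *idx, const unsigned char *oid)
{
    assert(idx != NULL && idx->search != NULL);
    return idx->search(idx, oid);
}
```

Because the stride and oid offset are constants in each wrapper, a compiler is free to specialize the inner loop per format, which is in the spirit of keeping the version branch out of the loop.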
|
|
098ac57a
|
2009-01-03T00:02:25
|
|
Refactor pack memory management and locking to be safer
Using an atomic reference counter is difficult to make
cross-platform, as the reference count implementations
are generally processor specific. It's also hard to do
a proper multi-read/single-write implementation.
We now use a simple mutex around the reference count for the list
of packs. Readers grab the mutex and either build the list, or
increment the existing one's reference count. When the reader is
done with the list, the reference count is decremented. In this way
parallel readers are able to operate on the list without worrying
about it being deallocated out from under them.
Individual pack structures are held by reference counts, but we
only care about the list the pack structure is held in. There is
no need to increment/decrement the pack reference counts as we
scan through them during a read operation; the caller holds the
git_packlist and that is sufficient to hold the packs it references.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
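
A minimal sketch of the scheme described above, using a POSIX mutex around a plain integer reference count. The struct layouts and the names packlist_get, packlist_put and packlist_drop are hypothetical stand-ins, and the sketch assumes the owning database holds one reference of its own for as long as it points at the list.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the pack list and its owning database. */
struct packlist {
    unsigned refcnt;   /* protected by odb.lock */
    size_t n_packs;
};

struct odb {
    pthread_mutex_t lock;
    struct packlist *packs;
};

/* Reader entry: build the list if needed, otherwise take a reference. */
static struct packlist *packlist_get(struct odb *db)
{
    struct packlist *pl;

    pthread_mutex_lock(&db->lock);
    pl = db->packs;
    if (!pl) {
        pl = calloc(1, sizeof(*pl));
        if (pl) {
            pl->refcnt = 1;    /* the reference held by the odb itself */
            db->packs = pl;
        }
    }
    if (pl)
        pl->refcnt++;          /* the reference handed to this reader */
    pthread_mutex_unlock(&db->lock);
    return pl;
}

/* Reader exit: drop a reference and free the list when it was the last. */
static void packlist_put(struct odb *db, struct packlist *pl)
{
    int last;

    pthread_mutex_lock(&db->lock);
    assert(pl->refcnt > 0);
    last = (--pl->refcnt == 0);
    pthread_mutex_unlock(&db->lock);
    if (last)
        free(pl);
}

/* Detach the list from the odb, e.g. before a rescan or on close. */
static void packlist_drop(struct odb *db)
{
    struct packlist *pl;

    pthread_mutex_lock(&db->lock);
    pl = db->packs;
    db->packs = NULL;
    pthread_mutex_unlock(&db->lock);
    if (pl)
        packlist_put(db, pl);
}
```

A reader that still holds a reference when packlist_drop runs keeps the old list alive until its own packlist_put, so the list cannot be deallocated out from under it.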
|
|
3a33c7b3
|
2009-01-02T20:51:47
|
|
Fix snprintf compiler warning on cygwin
As far as gcc is concerned, the "z size specifier" is available as
an extension to the language, which is available with or without any
-std= switch. (I think you have to go back to 2.95 for a version
of gcc which doesn't work.) Many other compilers have this as an
extension as well (i.e. without the equivalent of -std=c99).
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
|
b438016e
|
2008-12-31T16:20:05
|
|
Find pack files in $GIT_DIR/objects/pack directory on git_odb_open
Currently we only catalog the available pack files into a table,
storing their path names relative to the pack directory.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
|
5614dc18
|
2008-12-31T13:27:51
|
|
Add basic locking to the git_odb structure
We grab the lock while accessing the alternates list, ensuring that
we only initialize it once for the given git_odb.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
|
d44cfd46
|
2008-12-31T13:16:31
|
|
Cleanup our header inclusion order to ensure pthread.h is early
If we are using threads we need to make sure pthread.h comes
in before just about anything else. Some platforms enable
macros that alter what other headers define.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
|
64a47c01
|
2008-12-30T23:21:36
|
|
Wrap malloc and friends and report out of memory as GIT_ENOMEM
We now forbid direct use of malloc, strdup or calloc within the
library and instead use wrapper functions git__malloc, etc. to
invoke the underlying library malloc and set git_errno to a no
memory error code if the allocation fails.
In the future once we have pack objects in memory we are likely
to enhance these routines with garbage collection logic to purge
cached pack data when allocations fail. Because the size of the
function will grow somewhat large, we don't want to mark them for
inline as gcc tends to aggressively inline, creating larger than
expected executables.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
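
The wrapper pattern described above might look roughly like this. The sketch_ names, the error variable and the error value are placeholders chosen for illustration; they are not the library's git__malloc, git_errno or GIT_ENOMEM definitions, and a real error variable would likely need to be per-thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder error code and error variable (assumptions for this sketch). */
#define SKETCH_ENOMEM (-2)
static int sketch_errno;

/*
 * Deliberately not marked inline: the message above expects these
 * functions to grow, and inlining them everywhere would bloat callers.
 */
void *sketch_malloc(size_t len)
{
    void *p = malloc(len);

    if (!p)
        sketch_errno = SKETCH_ENOMEM;
    return p;
}

/* strdup built on the wrapper, so a failure is reported the same way. */
char *sketch_strdup(const char *s)
{
    size_t len;
    char *p;

    assert(s != NULL);
    len = strlen(s) + 1;
    p = sketch_malloc(len);
    if (p)
        memcpy(p, s, len);
    return p;
}
```

Callers then check for NULL and consult the error variable, rather than each call site inventing its own out-of-memory handling.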
|
|
ffb55c53
|
2008-12-30T22:29:04
|
|
Rename the path of the objects directory to be more specific
We're likely to add additional path data, like the path of the
refs or the path to the config file into the git_odb structure,
as it may grow into the repository wrapper. Changing the name
of the objects directory reference makes it more clear should
we later add something else.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
|
4c67e2e9
|
2008-12-30T22:25:30
|
|
Change git_odb__read_packed to return ENOTFOUND until implemented
We didn't search for the object, so we cannot possibly promise it
to the caller of git_odb_read().
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
|
064301cc
|
2008-12-30T22:07:56
|
|
Fix size_t snprintf warning by using PRIuPTR format macro
This is the correct C99 format code for the size_t type when passed
as an argument to the *printf family. If the platform doesn't
define it, we assume %lu and just cross our fingers that it's the
proper setting for a size_t on this system. On most sane platforms,
"unsigned long" is the underlying type of "size_t".
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
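
A small sketch of the fallback described above. The function name format_size and the size_arg typedef are hypothetical. Strictly speaking, PRIuPTR is specified for uintptr_t rather than size_t, so this sketch casts the argument to the type that matches whichever format string ends up in use.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Use PRIuPTR when the platform defines it, otherwise assume %lu. */
#ifdef PRIuPTR
typedef uintptr_t size_arg;
#else
# define PRIuPTR "lu"
typedef unsigned long size_arg;
#endif

/*
 * Format a size_t into buf without a format warning. Returns what
 * snprintf returns: the length the full output needs, excluding the NUL.
 */
static int format_size(char *buf, size_t buflen, size_t n)
{
    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen, "%" PRIuPTR, (size_arg)n);
}
```

On compilers that accept it, the C99 "z" length modifier ("%zu") expresses the same thing directly for size_t.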
|
|
c960d6a3
|
2008-12-27T18:59:43
|
|
Add a routine to determine a git_oid given a git_obj
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
|
3d3552e8
|
2008-12-18T22:58:10
|
|
Implement git_odb__read_loose()
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
|
7b6e8067
|
2008-12-10T18:31:28
|
|
Add some git_otype string conversion and testing routines
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
|
dff79e27
|
2008-11-18T00:59:36
|
|
Rename "git_sobj" to "git_obj"
The 's' never really made sense, since it's not a "small"
object at all, but rather a plain object. As such, it should
have a "plain" object name.
Signed-off-by: Andreas Ericsson <ae@op5.se>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
|
1699efc4
|
2008-11-03T18:39:37
|
|
Implement some of the basic git_odb open and close API
Far from being complete, but it's a good start.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
|
50298f44
|
2008-11-01T15:55:01
|
|
Switch the license from BSD to GPL+libgcc exception
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
|
d1ea30c3
|
2008-11-01T15:42:23
|
|
Move include files to include/git/, drop git_ prefix from file names
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|