|
a97b769a
|
2016-04-27T12:00:31
|
|
odb: avoid inflating the full delta to read the header
When we read the header, we want to know the size and type of the
object. We're currently inflating the full delta in order to read the
first few bytes. This can mean hundreds of kB needlessly inflated for
large objects.
Instead use a packfile stream to read just enough so we can read the two
varints in the header and avoid inflating most of the delta.
|
|
d53cc13e
|
2016-03-31T04:12:46
|
|
Merge pull request #3575 from pmq20/master-13jan16
Remove duplicated calls to git_mwindow_close
|
|
e10144ae
|
2016-03-04T01:18:30
|
|
odb: improved not found error messages
When looking up an abbreviated oid, show the actual (abbreviated) oid
the caller passed instead of a full (but ambiguously truncated) oid.
|
|
6d97beb9
|
2016-02-25T15:46:59
|
|
pack: don't allow a negative offset
|
|
ea9e00cb
|
2016-02-23T18:15:43
|
|
pack: make sure we don't go out of bounds for extended entries
A corrupt index might have data that tells us to go look past the end of
the file for data. Catch these cases and return an appropriate error
message.
|
|
a53d2e39
|
2016-02-09T09:58:56
|
|
pack: do not free passed in poiter on error
The function `git_packfile_stream_open` tries to free the passed
in stream when an error occurs. The only call site is
`git_indexer_append`, though, which passes in the address of a
stream struct which has not been allocated on the heap.
Fix the issue by simply removing the call to free. In case of an
error we did not allocate any memory yet and otherwise it should
be the caller's responsibility to manage it's object's lifetime.
|
|
d4e4f272
|
2016-01-13T11:07:14
|
|
Remove duplicated calls to git_mwindow_close
|
|
b644e223
|
2016-01-13T11:02:38
|
|
Make packfile_unpack_compressed a private API
|
|
c369b379
|
2015-07-31T16:23:11
|
|
Remove extra semicolon outside of a function
Without this change, compiling with gcc and pedantic generates warning:
ISO C does not allow extra ‘;’ outside of a function.
|
|
878293f7
|
2015-06-10T10:44:14
|
|
pack: use git_buf when building the index name
The way we currently do it depends on the subtlety of strlen vs sizeof
and the fact that .pack is one longer than .idx. Let's use a git_buf so
we can express the manipulation we want much more clearly.
|
|
38c10ecd
|
2015-05-16T19:00:50
|
|
indexer: don't look for the index we're creating
When creating an index, know that we do not have an index for
our own packfile, preventing some unnecessary file opens and
error reporting.
|
|
b63b76e0
|
2014-10-12T11:42:31
|
|
Reorder some khash declarations
Keep the definitions in the headers, while putting the declarations in
the C files. Putting the function definitions in headers causes
them to be duplicated if you include two headers with them.
|
|
5091aff7
|
2015-02-20T08:40:40
|
|
Merge pull request #2907 from jasonhaslam/git_packfile_unpack_race
Fix race in git_packfile_unpack.
|
|
8588cb0c
|
2015-02-14T23:43:26
|
|
Fix race in git_packfile_unpack.
Increment refcount of newly added cache entries just like existing
entries looked up from the cache. Otherwise the new entry can be
evicted from the cache and destroyed while it's still in use.
|
|
f1453c59
|
2015-02-12T12:19:37
|
|
Make our overflow check look more like gcc/clang's
Make our overflow checking look more like gcc and clang's, so that
we can substitute it out with the compiler instrinsics on platforms
that support it. This means dropping the ability to pass `NULL` as
an out parameter.
As a result, the macros also get updated to reflect this as well.
|
|
392702ee
|
2015-02-09T23:41:13
|
|
allocations: test for overflow of requested size
Introduce some helper macros to test integer overflow from arithmetic
and set error message appropriately.
|
|
6f73e026
|
2014-12-24T11:42:50
|
|
Plug some leaks
|
|
ec7e680c
|
2014-11-20T12:07:55
|
|
Fix for misleading "missing delta bases" error - Fix #2721.
|
|
ea66215d
|
2014-10-26T10:29:19
|
|
Removed some useless variable assignments
|
|
e640a77c
|
2014-09-25T15:29:03
|
|
Silence uninitialized warning
|
|
5cd81bb3
|
2014-09-03T01:01:25
|
|
Several CppCat warnings fixed
|
|
b3d3459f
|
2014-08-26T15:09:47
|
|
pack: return the correct final offset
The callers of git_packfile_unpack() expect the obj_offset argument to
be set to the beginning of the next object. We were mistakenly returning
the the offset of the object's data, which causes the CRC function to
try to use the wrong offset.
Set obj_offset to curpos instead of elem->offset to point to the next
element and bring back expected behaviour.
|
|
5e0f47c3
|
2014-06-25T21:20:39
|
|
pack: free the new pack struct if we fail to insert
If we fail to insert the packfile in the map, make sure to free it.
This makes the free function only attempt to remove its mwindows from
the global list if we have opened the packfile to avoid accessing the
list unlocked.
|
|
b3b66c57
|
2014-06-18T17:13:12
|
|
Share packs across repository instances
Opening the same repository multiple times will currently open the same
file multiple times, as well as map the same region of the file multiple
times. This is not necessary, as the packfile data is immutable.
Instead of opening and closing packfiles directly, introduce an
indirection and allocate packfiles globally. This does mean locking on
each packfile open, but we already use this lock for the global mwindow
list so it doesn't introduce a new contention point.
|
|
649214be
|
2014-05-15T19:59:05
|
|
pack: init the cache on packfile alloc
When running multithreaded, it is not enough to check for the offmap
allocation. Move the call to cache_init() to packfile allocation so we
can be sure it is always allocated free of races.
This fixes #2355.
|
|
c968ce2c
|
2014-05-12T02:01:05
|
|
pack: don't forget to cache the base object
The base object is a good cache candidate, so we shouldn't forget to add
it to the cache.
|
|
15bcced2
|
2014-05-11T05:31:22
|
|
pack: use stack allocation for smaller delta chains
This avoid allocating the array on the heap for relatively small
chains. The expected performance increase is sadly not really
noticeable.
|
|
a3ffbf23
|
2014-05-11T03:50:34
|
|
pack: expose a cached delta base directly
Instead of going through a special entry in the chain, let's pass it as
an output parameter.
|
|
9dbd150f
|
2014-05-09T09:36:09
|
|
pack: simplify delta chain code
The switch makes the loop somewhat unwieldy. Let's assume it's fine and
perform the check when we're accessing the data.
This makes our code look a lot more like git's.
|
|
b2559f47
|
2014-05-08T17:14:59
|
|
pack: preallocate a 64-element chain
Dependency chains are often large and require a few
reallocations. Allocate a 64-element chain before doing anything else to
avoid allocations during the loop.
This value comes from the stack-allocated one git uses. We still
allocate this on the heap, but it does help performance a little bit.
|
|
e6d10c58
|
2014-05-08T16:24:54
|
|
pack: make sure not to leak the dep chain
|
|
a332e91c
|
2014-05-06T23:37:28
|
|
pack: use a cache for delta bases when unpacking
Bring back the use of the delta base cache for unpacking objects. When
generating the delta chain, we stop when we find a delta base in the
pack's cache and use that as the starting point.
|
|
2acdf4b8
|
2014-05-06T19:20:33
|
|
pack: unpack using a loop
We currently make use of recursive function calls to unpack an object,
resolving the deltas as we come back down the chain. This means that we
have unbounded stack growth as we look up objects in a pack.
This is now done in two steps: first we figure out what the dependency
chain is by looking up the delta bases until we reach a non-delta
object, pushing the information we need onto a stack and then we pop
from that stack and apply the deltas until there are no more left.
This version of the code does not make use of the delta base cache so it
is slower than what's in the mainline. A later commit will reintroduce
it.
|
|
ae081739
|
2014-05-06T21:21:04
|
|
pack: do not repeat the same error message four times
Repeating this error message makes it harder to find out where we
actually are finding the error, and they don't really describe what
we're trying to do.
|
|
86d5810b
|
2014-05-06T16:20:14
|
|
pack: remove misleading comment
|
|
8610487c
|
2014-01-23T23:28:28
|
|
Drop parsing pack filename SHA1 part, no one cares the filename
|
|
26c1cb91
|
2013-12-09T09:44:03
|
|
One more rename/cleanup for callback err functions
|
|
c7b3e1b3
|
2013-12-06T15:42:20
|
|
Some callback error check style cleanups
I find this easier to read...
|
|
25e0b157
|
2013-12-06T15:07:57
|
|
Remove converting user error to GIT_EUSER
This changes the behavior of callbacks so that the callback error
code is not converted into GIT_EUSER and instead we propagate the
return value through to the caller. Instead of using the
giterr_capture and giterr_restore functions, we now rely on all
functions to pass back the return value from a callback.
To avoid having a return value with no error message, the user
can call the public giterr_set_str or some such function to set
an error message. There is a new helper 'giterr_set_callback'
that functions can invoke after making a callback which ensures
that some error message was set in case the callback did not set
one.
In places where the sign of the callback return value is
meaningful (e.g. positive to skip, negative to abort), only the
negative values are returned back to the caller, obviously, since
the other values allow for continuing the loop.
The hardest parts of this were in the checkout code where positive
return values were overloaded as meaningful values for checkout.
I fixed this by adding an output parameter to many of the internal
checkout functions and removing the overload. This added some
code, but it is probably a better implementation.
There is some funkiness in the network code where user provided
callbacks could be returning a positive or a negative value and
we want to rely on that to cancel the loop. There are still a
couple places where an user error might get turned into GIT_EUSER
there, I think, though none exercised by the tests.
|
|
dab89f9b
|
2013-12-04T21:22:57
|
|
Further EUSER and error propagation fixes
This continues auditing all the places where GIT_EUSER is being
returned and making sure to clear any existing error using the
new giterr_user_cancel helper. As a result, places that relied
on intercepting GIT_EUSER but having the old error preserved also
needed to be cleaned up to correctly stash and then retrieve the
actual error.
Additionally, as I encountered places where error codes were not
being propagated correctly, I tried to fix them up. A number of
those fixes are included in the this commit as well.
|
|
51a3dfb5
|
2013-11-01T16:31:02
|
|
pack: `__object_header` always returns unsigned values
|
|
3343b5ff
|
2013-10-31T22:59:42
|
|
Fix warning on win64
|
|
51e82492
|
2013-10-03T16:54:25
|
|
pack: move the object header function here
|
|
67591c8c
|
2013-08-14T10:28:01
|
|
sha1_lookup: do not use the "experimental" lookup mode
|
|
3a2d48d5
|
2013-07-25T14:54:19
|
|
Close p->mwf.fd only if necessary
This fixes a regression introduced in revision 9d2f841a5d39fc25ce722a3904f6ebc9aa112222.
Signed-off-by: Sven Strickroth <email@cs-ware.de>
|
|
050af8bb
|
2013-07-15T16:00:00
|
|
pack: fix memory leak in error path
|
|
1a42dd17
|
2013-05-31T14:13:11
|
|
Mutex init can fail
It is obviously quite a serious problem if this happens, but mutex
initialization can fail and we should detect it. It's a bit like
a memory allocation failure, in that you're probably pretty screwed
if this occurs, but at least we'll catch it.
|
|
f658dc43
|
2013-05-31T14:09:58
|
|
Zero memory for major objects before freeing
By zeroing out the memory when we free larger objects (i.e. those
that serve as collections of other data, such as repos, odb, refdb),
I'm hoping that it will be easier for libgit2 bindings to find
errors in their object management code.
|
|
0ddfcb40
|
2013-05-02T18:06:14
|
|
Switch to index_version as "git_pack_file is ready" flag
We use p->index_map.data to check whether the struct has been set up
and all the information about the index is stored there. This variable
gets set up halfway through the setup process, however, and a thread
can come along and use fields that haven't been written to yet.
Crucially, pack_entry_find_offset() needs to read the index version
(which is written after index_map) to know the offset and stride
length to pass to sha1_entry_pos(). If these values are wrong,
assertions in it will fail, as it will be reading bogus data.
Make index_version the last field to be written and switch from using
p->index_map.data to p->index_version as "git_pack_file is ready" flag
as we can use it to know if every field has been written.
|
|
34bd5999
|
2013-05-02T17:14:05
|
|
Revert "Protect sha1_entry_pos call with mutex"
This reverts commit 8c535f3f6879c6796d8107d7eb80dd8b2105621b.
|
|
8c535f3f
|
2013-05-02T03:34:56
|
|
Protect sha1_entry_pos call with mutex
There is an occasional assertion failure in sha1_entry_pos from
pack_entry_find_index when running threaded. Holding the mutex
around the code that grabs the index_map data and processes it
makes this assertion failure go away.
|
|
9d2f841a
|
2013-05-02T03:03:54
|
|
Add extra locking around packfile open
We were still seeing a few issues in threaded access to packs.
This adds extra locks around the opening of the mwindow to
avoid a different race.
|
|
b7f167da
|
2013-04-29T13:52:12
|
|
Make git_oid_cmp public and add git_oid__cmp
|
|
38eef611
|
2013-04-16T14:19:27
|
|
Make indexer use shared packfile open code
The indexer was creating a packfile object separately from the
code in pack.c which was a problem since I put a call to
git_mutex_init into just pack.c. This commit updates the pack
function for creating a new pack object (i.e. git_packfile_check())
so that it can be used in both places and then makes indexer.c
use the shared initialization routine.
There are also a few minor formatting and warning message fixes.
|
|
5d2d21e5
|
2013-04-16T15:00:43
|
|
Consolidate packfile allocation further
Rename git_packfile_check to git_packfile_alloc since it is now
being used more in that capacity. Fix the various places that use
it. Consolidate some repeated code in odb_pack.c related to the
allocation of a new pack_backend.
|
|
53607868
|
2013-04-15T00:09:03
|
|
Further threading fixes
This builds on the earlier thread safety work to make it so that
setting the odb, index, refdb, or config for a repository is done
in a threadsafe manner with minimized locking time. This is done
by adding a lock to the repository object and using it to guard
the assignment of the above listed pointers. The lock is only
held to assign the pointer value.
This also contains some minor fixes to the other work with pack
files to reduce the time that locks are being held to and fix an
apparently memory leak.
|
|
24c70804
|
2013-04-12T12:59:38
|
|
Add mutex around mapping and unmapping pack files
When I was writing threading tests for the new cache, the main
error I kept running into was a pack file having it's content
unmapped underneath the running thread. This adds a lock around
the routines that map and unmap the pack data so that threads can
effectively reload the data when they need it.
This also required reworking the error handling paths in a couple
places in the code which I tried to make consistent.
|
|
0e040c03
|
2013-03-03T14:50:47
|
|
indexer: use a hashtable for keeping track of offsets
These offsets are needed for REF_DELTA objects, which encode which
object they use as a base, but not where it lies in the packfile, so
we need a list.
These objects are mostly from older packfiles, before OFS_DELTA was
widely spread. The time spent in indexing these packfiles is greatly
reduced, though remains above what git is able to do.
|
|
11d9f6b3
|
2013-01-27T14:17:07
|
|
Vector improvements and their fallout
|
|
aa3bf89d
|
2013-01-26T15:12:53
|
|
Fix a mutex leak in pack.c
|
|
9c62aaab
|
2013-01-14T17:42:12
|
|
pack: evict all of the pages at once
Somewhat surprisingly, this can increase the speed considerably, as we
don't bother trying to decide what to evict, and the most used entries
are quickly back into the cache.
|
|
ed6648ba
|
2013-01-14T16:39:22
|
|
pack: evict objects from the cache in groups of eight
This drops the cache eviction below libcrypto and zlib in the perf
output. The number has been chosen empirically.
|
|
09e29e47
|
2013-01-12T19:31:07
|
|
pack: fixes to the cache
The offset should be git_off_t, and we should check the return value
of the mutex lock function.
|
|
96c9b9f0
|
2013-01-12T18:38:19
|
|
indexer: properly free the packfile resources
The indexer needs to call the packfile's free function so it takes care of
freeing the caches.
We still need to close the mwf descriptor manually so we can rename the
packfile into its final name on Windows.
|
|
80d647ad
|
2013-01-11T20:15:06
|
|
Revert "pack: packfile_free -> git_packfile_free and use it in the indexers"
This reverts commit f289f886cb81bb570bed747053d5ebf8aba6bef7, which
makes the tests fail on Windows. Revert until we can figure out a
solution.
|
|
090d5e1f
|
2013-01-11T14:40:09
|
|
Fix MSVC compilation warnings
|
|
f289f886
|
2013-01-11T17:24:52
|
|
pack: packfile_free -> git_packfile_free and use it in the indexers
It turns out the indexers have been ignoring the pack's free function
and leaking data. Plug that.
|
|
0ed75620
|
2012-12-21T13:46:48
|
|
pack: limit the amount of memory the base delta cache can use
Currently limited to 16MB (like git) and to objects up to 1MB in
size.
|
|
c8f79c2b
|
2012-12-21T10:59:10
|
|
pack: abstract out the cache into its own functions
|
|
525d961c
|
2012-12-20T07:55:51
|
|
pack: refcount entries and add a mutex around cache access
|
|
c0f4a011
|
2012-12-19T16:48:12
|
|
pack: introduce a delta base cache
Many delta bases are re-used. Cache them to avoid inflating the same
data repeatedly.
This version doesn't limit the amount of entries to store, so it can
end up using a considerable amound of memory.
|
|
359fc2d2
|
2013-01-08T17:07:25
|
|
update copyrights
|
|
0249a503
|
2012-12-07T09:40:21
|
|
Merge pull request #1091 from carlosmn/stream-object
Indexer speedup with large objects
|
|
44f9f547
|
2012-11-30T13:33:30
|
|
pack: add git_packfile_resolve_header
To paraphrase @peff:
You can get both size and type from a packed object reasonably cheaply.
If you have:
* An object that is not a delta; both type and size are available in the
packfile header.
* An object that is a delta. The packfile type will be OBJ_*_DELTA, and
you have to resolve back to the base to find the real type. That means
potentially a lot of packfile index lookups, but each one is
relatively cheap. For the size, you inflate the first few bytes of the
delta, whose header will tell you the resulting size of applying the
delta to the base.
For simplicity, we just decompress the whole delta for now.
|
|
46635339
|
2012-11-19T22:22:33
|
|
pack: introduce a streaming API for raw objects
This allows us to take objects from the packfile as a stream instead
of having to keep it all in memory.
|
|
c3fb7d04
|
2012-11-27T15:00:49
|
|
Make git_odb_foreach_cb take const param
This makes the first OID param of the ODB callback a const pointer
and also propogates that change all the way to the backends.
|
|
fcb48e06
|
2012-11-24T15:48:17
|
|
Set p->mwf.fd to -1 on error
If p->mwf.fd is e.g. -2 then it is closed in packfile_free and an exception might be thrown.
Signed-off-by: Sven Strickroth <email@cs-ware.de>
|
|
826bc4a8
|
2012-11-23T13:31:22
|
|
Remove use of English expletives
Remove words such as fuck, crap, shit etc.
Remove other potentially offensive words from comments.
Tidy up other geopolicital terms in comments.
|
|
60ecdf59
|
2012-09-10T11:48:21
|
|
pack: iterate objects in offset order
Compute the ordering on demand and persist until the index is freed.
|
|
51e1d808
|
2012-08-06T12:41:08
|
|
Merge remote-tracking branch 'arrbee/tree-walk-fixes' into development
Conflicts:
src/notes.c
src/transports/git.c
src/transports/http.c
src/transports/local.c
tests-clar/odb/foreach.c
|
|
5dca2010
|
2012-08-03T17:08:01
|
|
Update iterators for consistency across library
This updates all the `foreach()` type functions across the library
that take callbacks from the user to have a consistent behavior.
The rules are:
* A callback terminates the loop by returning any non-zero value
* Once the callback returns non-zero, it will not be called again
(i.e. the loop stops all iteration regardless of state)
* If the callback returns non-zero, the parent fn returns GIT_EUSER
* Although the parent returns GIT_EUSER, no error will be set in
the library and `giterr_last()` will return NULL if called.
This commit makes those changes across the library and adds tests
for most of the iteration APIs to make sure that they follow the
above rules.
|
|
b8457baa
|
2012-07-24T07:57:58
|
|
portability: Improve x86/amd64 compatibility
|
|
521aedad
|
2012-06-05T14:48:51
|
|
odb: add git_odb_foreach()
Go through each backend and list every objects that exists in
them. This allows fsck-like uses.
|
|
1d8943c6
|
2012-06-28T12:05:49
|
|
mwindow: allow memory-window files to deregister
Once a file is registered, there is no way to deregister it, even
after the structure that contains it is no longer needed and has been
freed. This may be the source of #624.
Allow and use the deregister function to remove our file from the
global list.
|
|
2aeadb9c
|
2012-06-12T19:25:09
|
|
Actually do the mmap... unsurprisingly, this makes the indexer work on SFS
On RAM: the .idx and .pack files become links to a .lock and the original download respectively.
Assume some feature (such as record locking) supported by SFS but not JXFS or RAM: is required.
|
|
90490113
|
2012-06-10T18:08:15
|
|
Basic mmap/munmap compatiblity
|
|
904b67e6
|
2012-05-18T01:48:50
|
|
errors: Rename error codes
|
|
e172cf08
|
2012-05-18T01:21:06
|
|
errors: Rename the generic return codes
|
|
282283ac
|
2012-05-04T16:46:46
|
|
Fix valgrind issues
There are three changes here:
- correctly propogate error code from failed object lookups
- make zlib inflate use our allocators
- add OID to notfound error in ODB lookups
|
|
f9f2344b
|
2012-04-23T17:28:11
|
|
Merge pull request #632 from arrbee/win64-cleanup
Code clean up, including fixing warnings on Windows 64-bit build
|
|
44ef8b1b
|
2012-04-13T13:00:10
|
|
Fix warnings on 64-bit windows builds
This fixes all the warnings on win64 except those in deps, which
come from the regex code.
|
|
45d773ef
|
2012-04-05T23:36:14
|
|
pack: signal a short buffer when needed
|
|
0d0fa7c3
|
2012-03-16T15:56:01
|
|
Convert attr, ignore, mwindow, status to new errors
Also cleaned up some previously converted code that still had
little things to polish.
|
|
e1de726c
|
2012-03-12T22:55:40
|
|
Migrate ODB files to new error handling
This migrates odb.c, odb_loose.c, odb_pack.c and pack.c to
the new style of error handling. Also got the unix and win32
versions of map.c. There are some minor changes to other
files but no others were completely converted.
This also contains an update to filebuf so that a zeroed out
filebuf will not think that the fd (== 0) is actually open
(and inadvertently call close() on fd 0 if cleaned up).
Lastly, this was built and tested on win32 and contains a
bunch of fixes for the win32 build which was pretty broken.
|
|
1a481123
|
2012-02-17T00:13:34
|
|
error-handling: References
Yes, this is error handling solely for `refs.c`, but some of the
abstractions leak all ofer the code base.
|
|
0c3bae62
|
2012-02-15T16:56:56
|
|
zlib: Remove custom `git2/zlib.h` header
This is legacy compat stuff for when `deflateBound` is not defined, but
we're not embedding zlib and that function is always available. Kill
that with fire.
|
|
5e0de328
|
2012-02-13T17:10:24
|
|
Update Copyright header
Signed-off-by: schu <schu-github@schulog.org>
|
|
1744fafe
|
2012-01-17T15:49:47
|
|
Move path related functions from fileops to path
This takes all of the functions that look up simple data about
paths (such as `git_futils_isdir`) and moves them over to path.h
(becoming `git_path_isdir`). This leaves fileops.h just with
functions that actually manipulate the filesystem or look at
the file contents in some way.
As part of this, the dir.h header which is really just for win32
support was moved into win32 (with some minor changes).
|
|
1af56d7d
|
2012-01-15T15:48:36
|
|
Fix #534: 64-bit issues in Windows
off_t is always 32 bits in Windows, which is beyond stupid, but we just
don't care anymore because we're using `git_off_t` which is assured to
be 64 bits on all platforms, regardless of compilation mode. Just
ensure that no casts to `off_t` are performed.
Also, the check for `off_t` overflows has been dropped, once again,
because the size of our offsets is always 64 bits on all platforms.
Fixes #534
|
|
a15c550d
|
2011-11-16T14:09:44
|
|
threads: Fix the shared global state with TLS
See `global.c` for a description of what we're doing.
When libgit2 is built with GIT_THREADS support, the threading system
must be explicitly initialized with `git_threads_init()`.
|