kmx git

Commit	Date	Message
c8ee5270	2017-12-08T09:05:58	pack: rename `git_packfile_stream_free` The function `git_packfile_stream_free` frees all state of the packfile stream without freeing the structure itself. This naming makes it hard to spot whether it will try to free the pointer itself or not, causing potential future errors. Due to this reason, we have decided to name a function freeing state without freeing the actual structure a "dispose" function. Rename `git_packfile_stream_free` to `git_packfile_stream_dispose` as a first example following this rule.
0c7f49dd	2017-06-30T13:39:01	Make sure to always include "common.h" first Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.
bf339ab0	2017-01-21T14:51:31	indexer: introduce `git_packfile_close` Encapsulation!
27051d4e	2016-07-22T13:34:19	odb: only freshen pack files every 2 seconds Since multiple objects being written may all already exist in a single packfile, avoid freshening that packfile repeatedly in a tight loop. Instead, only freshen pack files every 2 seconds.
b644e223	2016-01-13T11:02:38	Make packfile_unpack_compressed a private API
b63b76e0	2014-10-12T11:42:31	Reorder some khash declarations Keep the declarations in the headers, while putting the definitions in the C files. Putting the function definitions in headers causes them to be duplicated if you include two headers with them.
c8e02b87	2015-02-15T21:07:05	Remove extra semicolon outside of a function Without this change, compiling with gcc in pedantic mode generates the warning: ISO C does not allow extra ‘;’ outside of a function.
b3b66c57	2014-06-18T17:13:12	Share packs across repository instances Opening the same repository multiple times will currently open the same file multiple times, as well as map the same region of the file multiple times. This is not necessary, as the packfile data is immutable. Instead of opening and closing packfiles directly, introduce an indirection and allocate packfiles globally. This does mean locking on each packfile open, but we already use this lock for the global mwindow list so it doesn't introduce a new contention point.
a3ffbf23	2014-05-11T03:50:34	pack: expose a cached delta base directly Instead of going through a special entry in the chain, let's pass it as an output parameter.
a332e91c	2014-05-06T23:37:28	pack: use a cache for delta bases when unpacking Bring back the use of the delta base cache for unpacking objects. When generating the delta chain, we stop when we find a delta base in the pack's cache and use that as the starting point.
2acdf4b8	2014-05-06T19:20:33	pack: unpack using a loop We currently make use of recursive function calls to unpack an object, resolving the deltas as we come back down the chain. This means that we have unbounded stack growth as we look up objects in a pack. This is now done in two steps: first we figure out what the dependency chain is by looking up the delta bases until we reach a non-delta object, pushing the information we need onto a stack and then we pop from that stack and apply the deltas until there are no more left. This version of the code does not make use of the delta base cache so it is slower than what's in the mainline. A later commit will reintroduce it.
8610487c	2014-01-23T23:28:28	Drop parsing pack filename SHA1 part, no one cares about the filename
51a3dfb5	2013-11-01T16:31:02	pack: `__object_header` always returns unsigned values
3343b5ff	2013-10-31T22:59:42	Fix warning on win64
51e82492	2013-10-03T16:54:25	pack: move the object header function here
5d2d21e5	2013-04-16T15:00:43	Consolidate packfile allocation further Rename git_packfile_check to git_packfile_alloc since it is now being used more in that capacity. Fix the various places that use it. Consolidate some repeated code in odb_pack.c related to the allocation of a new pack_backend.
53607868	2013-04-15T00:09:03	Further threading fixes This builds on the earlier thread safety work to make it so that setting the odb, index, refdb, or config for a repository is done in a threadsafe manner with minimized locking time. This is done by adding a lock to the repository object and using it to guard the assignment of the above listed pointers. The lock is only held to assign the pointer value. This also contains some minor fixes to the other work with pack files to reduce the time that locks are held and to fix an apparent memory leak.
24c70804	2013-04-12T12:59:38	Add mutex around mapping and unmapping pack files When I was writing threading tests for the new cache, the main error I kept running into was a pack file having its content unmapped underneath the running thread. This adds a lock around the routines that map and unmap the pack data so that threads can effectively reload the data when they need it. This also required reworking the error handling paths in a couple of places in the code, which I tried to make consistent.
0e040c03	2013-03-03T14:50:47	indexer: use a hashtable for keeping track of offsets These offsets are needed for REF_DELTA objects, which encode which object they use as a base, but not where it lies in the packfile, so we need a list. These objects are mostly from older packfiles, before OFS_DELTA was widely spread. The time spent in indexing these packfiles is greatly reduced, though remains above what git is able to do.
96c9b9f0	2013-01-12T18:38:19	indexer: properly free the packfile resources The indexer needs to call the packfile's free function so it takes care of freeing the caches. We still need to close the mwf descriptor manually so we can rename the packfile into its final name on Windows.
80d647ad	2013-01-11T20:15:06	Revert "pack: packfile_free -> git_packfile_free and use it in the indexers" This reverts commit f289f886cb81bb570bed747053d5ebf8aba6bef7, which makes the tests fail on Windows. Revert until we can figure out a solution.
d0b14cea	2013-01-11T18:21:09	pack: That declaration
c8f79c2b	2012-12-21T10:59:10	pack: abstract out the cache into its own functions
0ed75620	2012-12-21T13:46:48	pack: limit the amount of memory the base delta cache can use Currently limited to 16MB (like git) and to objects up to 1MB in size.
525d961c	2012-12-20T07:55:51	pack: refcount entries and add a mutex around cache access
c0f4a011	2012-12-19T16:48:12	pack: introduce a delta base cache Many delta bases are re-used. Cache them to avoid inflating the same data repeatedly. This version doesn't limit the number of entries to store, so it can end up using a considerable amount of memory.
359fc2d2	2013-01-08T17:07:25	update copyrights
0249a503	2012-12-07T09:40:21	Merge pull request #1091 from carlosmn/stream-object Indexer speedup with large objects
44f9f547	2012-11-30T13:33:30	pack: add git_packfile_resolve_header To paraphrase @peff: You can get both size and type from a packed object reasonably cheaply. There are two cases. (1) An object that is not a delta: both type and size are available in the packfile header. (2) An object that is a delta: the packfile type will be OBJ_*_DELTA, and you have to resolve back to the base to find the real type. That means potentially a lot of packfile index lookups, but each one is relatively cheap. For the size, you inflate the first few bytes of the delta, whose header will tell you the resulting size of applying the delta to the base. For simplicity, we just decompress the whole delta for now.
46635339	2012-11-19T22:22:33	pack: introduce a streaming API for raw objects This allows us to take objects from the packfile as a stream instead of having to keep it all in memory.
c3fb7d04	2012-11-27T15:00:49	Make git_odb_foreach_cb take const param This makes the first OID param of the ODB callback a const pointer and also propagates that change all the way to the backends.
60ecdf59	2012-09-10T11:48:21	pack: iterate objects in offset order Compute the ordering on demand and persist until the index is freed.
b8457baa	2012-07-24T07:57:58	portability: Improve x86/amd64 compatibility
521aedad	2012-06-05T14:48:51	odb: add git_odb_foreach() Go through each backend and list every object that exists in them. This allows fsck-like uses.
fa679339	2012-04-13T09:58:54	Add packfile_unpack_compressed() to the internal header
e1de726c	2012-03-12T22:55:40	Migrate ODB files to new error handling This migrates odb.c, odb_loose.c, odb_pack.c and pack.c to the new style of error handling. Also got the unix and win32 versions of map.c. There are some minor changes to other files but no others were completely converted. This also contains an update to filebuf so that a zeroed out filebuf will not think that the fd (== 0) is actually open (and inadvertently call close() on fd 0 if cleaned up). Lastly, this was built and tested on win32 and contains a bunch of fixes for the win32 build which was pretty broken.
5e0de328	2012-02-13T17:10:24	Update Copyright header Signed-off-by: schu <schu-github@schulog.org>
01ad7b3a	2011-09-06T15:48:45	*: correct and codify various file permissions The following files now have 0444 permissions: loose objects, pack indexes, pack files, packs downloaded by fetch, and packs downloaded by the HTTP transport. And the following files now have 0666 permissions: config files, repository indexes, reflogs, and refs. This brings libgit2 more in line with Git. Note that git_filebuf_commit() and git_filebuf_commit_at() have both gained a new mode parameter. The latter change fixes an important issue where filebufs created with GIT_FILEBUF_TEMPORARY received 0600 permissions (due to mkstemp(3) usage). Now we chmod() the file before renaming it into place. Tests have been added to confirm that new commit, tag, and tree objects are created with the right permissions. I don't have access to Windows, so for now I've guarded the tests with "#ifndef GIT_WIN32".
87d9869f	2011-09-19T03:34:49	Tabify everything There were quite a few places where spaces were being used instead of tabs. Try to catch them all. This should hopefully not break anything. Except for `git blame`. Oh well.
bb742ede	2011-09-19T01:54:32	Cleanup legal data 1. The license header is technically not valid if it doesn't have a copyright signature. 2. The COPYING file has been updated with the different licenses used in the project. 3. The full GPLv2 header in each file annoys me.
c1af5a39	2011-08-06T00:35:20	Implement cooperative caching When indexing a file with ref deltas, a temporary cache for the offsets has to be built, as we don't have an index file yet. If the user takes the responsibility for filling the cache, the packing code will look there first when it finds a ref delta. Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
a070f152	2011-07-29T01:08:02	Move pack functions to their own file
b5b474dd	2011-07-28T11:45:46	Modify the given offset in git_packfile_unpack The callers immediately throw away the offset, so we don't need any logical changes in any of them. This will be useful for the indexer, as it does need to know where the compressed data ends. Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
7d0cdf82	2011-07-09T02:25:01	Make packfile_unpack_header more generic On the way, store the fd and the size in the mwindow file. Signed-off-by: Carlos Martín Nieto <carlos@cmartin.tk>
c7c9e183	2011-07-07T10:17:40	Move the pack structs to an internal header
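
Commit c8ee5270 distinguishes a "dispose" function (release internal state, keep the structure) from a "free" function (release the structure as well). The following is a minimal sketch of that naming convention only. The `example_stream` type and all `example_*` functions are hypothetical and are not libgit2's `git_packfile_stream` API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a stream type; not libgit2's git_packfile_stream. */
typedef struct {
	unsigned char *buf; /* internal state owned by the stream */
	size_t len;
} example_stream;

/* Initialise caller-provided storage; returns 0 on success, -1 on allocation failure. */
int example_stream_init(example_stream *s, size_t len)
{
	assert(s != NULL);
	s->buf = calloc(len ? len : 1, 1);
	if (s->buf == NULL)
		return -1;
	s->len = len;
	return 0;
}

/* "dispose": release the internal state but leave the struct itself alone,
 * so it is safe for a stream that lives on the stack or inside another struct. */
void example_stream_dispose(example_stream *s)
{
	if (s == NULL)
		return;
	free(s->buf);
	s->buf = NULL;
	s->len = 0;
}

/* "free": release the internal state and the struct itself.
 * Only valid for a stream that was allocated on the heap. */
void example_stream_free(example_stream *s)
{
	if (s == NULL)
		return;
	example_stream_dispose(s);
	free(s);
}
```

With this split, a caller holding a stack-allocated stream calls the dispose function, and the name alone signals that the pointer itself is not released.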

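Commit 2acdf4b8 describes replacing recursive unpacking with two steps: walk the delta bases until a non-delta object is found, pushing each delta onto a stack, then pop the stack and apply the deltas. The sketch below illustrates that control flow on a toy model. The `example_obj` type, the fixed chain limit, and the "multiply by ten and add" delta rule are all assumptions made for illustration; real packfile deltas are byte-level copy/insert instructions and this is not libgit2's code.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_MAX_CHAIN 64 /* arbitrary limit chosen for this sketch */

/* Toy object: a base holds a full value; a delta holds a digit to append. */
typedef struct {
	int base;  /* index of the base object, or -1 if this is not a delta */
	int value; /* full value for a base, digit to append for a delta */
} example_obj;

/* Resolve objs[idx] without recursion.
 * Returns 0 on success, -1 on an invalid index or an over-long chain. */
int example_unpack(const example_obj *objs, size_t count, int idx, int *out)
{
	int stack[EXAMPLE_MAX_CHAIN];
	size_t depth = 0;
	int result;

	assert(objs != NULL && out != NULL);

	/* Step 1: follow delta bases until a non-delta object is reached,
	 * remembering every delta seen on the way. */
	for (;;) {
		if (idx < 0 || (size_t)idx >= count)
			return -1;
		if (objs[idx].base < 0)
			break;
		if (depth == EXAMPLE_MAX_CHAIN)
			return -1;
		stack[depth++] = idx;
		idx = objs[idx].base;
	}

	/* Step 2: start from the base and apply the deltas, innermost first. */
	result = objs[idx].value;
	while (depth > 0)
		result = result * 10 + objs[stack[--depth]].value;

	*out = result;
	return 0;
}
```

Because the chain lives in an explicit array rather than in call frames, stack usage stays bounded no matter how deep the delta chain is. A cached base, as reintroduced in a332e91c, would simply let step 1 stop earlier.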

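Commit 27051d4e limits how often a pack file is freshened to once every 2 seconds. One way to express such a throttle is sketched below. The `example_pack` type, the `last_freshen` field, and the function name are hypothetical, and the sketch only decides whether a freshen is due; it does not touch any file.

```c
#include <assert.h>
#include <time.h>

#define EXAMPLE_FRESHEN_INTERVAL 2 /* seconds, taken from the commit message */

/* Hypothetical per-pack bookkeeping; not libgit2's struct. */
typedef struct {
	time_t last_freshen; /* 0 means "never freshened" */
} example_pack;

/* Returns 1 if the caller should freshen the pack file now (and records
 * the time), or 0 if it was freshened less than the interval ago. */
int example_should_freshen(example_pack *p, time_t now)
{
	assert(p != NULL);
	if (p->last_freshen != 0 &&
	    now - p->last_freshen < EXAMPLE_FRESHEN_INTERVAL)
		return 0;
	p->last_freshen = now;
	return 1;
}
```

Passing the current time in as a parameter keeps the check easy to test; a caller would typically supply `time(NULL)`.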
thodg/libgit2/src/pack.h
