kmx git

Commit	Date	Message
6f544140	2021-01-05T19:45:23	commit-graph: Introduce `git_commit_list_generation_cmp` This change makes calculations of merge-bases a bit faster when there are complex graphs and the commit times cause visiting nodes multiple times. This is done by visiting the nodes in the graph in reverse generation order when the generation number is available instead of commit timestamp. If the generation number is missing in any pair of commits, it can safely fall back to the old heuristic with no negative side-effects. Part of: #5757
25b75cd9	2021-03-10T07:06:15	commit-graph: Create `git_commit_graph` as an abstraction for the file This change does a medium-size refactor of the git_commit_graph_file and the interaction with the ODB. Now instead of the ODB owning a direct reference to the git_commit_graph_file, there will be an intermediate git_commit_graph. The main advantage of that is that now end users can explicitly set a git_commit_graph that is eagerly checked for errors, while still being able to lazily use the commit-graph in a regular ODB, if the file is present.
248606eb	2021-01-05T17:20:27	commit-graph: Use the commit-graph in revwalks This change makes revwalks a bit faster by using the `commit-graph` file (if present). This is thanks to the `commit-graph` allow much faster parsing of the commit information by requiring near-zero I/O (aside from reading a few dozen bytes off of a `mmap(2)`-ed file) for each commit, instead of having to read the ODB, inflate the commit, and parse it. This is done by modifying `git_commit_list_parse()` and letting it use the ODB-owned commit-graph file. Part of: #5757
f04a58b0	2019-10-03T12:55:48	Merge pull request #4445 from tiennou/shallow/dry-commit-parsing DRY commit parsing
5cf17e0f	2019-10-03T09:39:42	commit_list: store in/out-degrees as uint16_t The commit list's in- and out-degrees are currently stored as `unsigned short`. When assigning it the value of `git_array_size`, which returns an `size_t`, this generates a warning on some Win32 platforms due to loosing precision. We could just cast the returned value of `git_array_size`, which would work fine for 99.99% of all cases as commits typically have less than 2^16 parents. For crafted commits though we might end up with a wrong value, and thus we should definitely check whether the array size actually fits into the field. To ease the check, let's convert the fields to store the degrees as `uint16_t`. We shouldn't rely on such unspecific types anyway, as it may lead to different behaviour across platforms. Furthermore, this commit introduces a new `git__is_uint16` function to check whether it actually fits -- if not, we return an error.
5988cf34	2017-12-15T18:11:51	commit_list: unify commit information parsing
57a9ccd5	2019-06-21T15:53:54	commit_list: fix possible buffer overflow in `commit_quick_parse` The function `commit_quick_parse` provides a way to quickly parse parts of a commit without storing or verifying most of its metadata. The first thing it does is calculating the number of parents by skipping "parent " lines until it finds the first non-parent line. Afterwards, this parent count is passed to `alloc_parents`, which will allocate an array to store all the parent. To calculate the amount of storage required for the parents array, `alloc_parents` simply multiplicates the number of parents with the respective elements's size. This already screams "buffer overflow", and in fact this problem is getting worse by the result being cast to an `uint32_t`. In fact, triggering this is possible: git-hash-object(1) will happily write a commit with multiple millions of parents for you. I've stopped at 67,108,864 parents as git-hash-object(1) unfortunately soaks up the complete object without streaming anything to disk and thus will cause an OOM situation at a later point. The point here is: this commit was about 4.1GB of size but compressed down to 24MB and thus easy to distribute. The above doesn't yet trigger the buffer overflow, thus. As the array's elements are all pointers which are 8 bytes on 64 bit, we need a total of 536,870,912 parents to trigger the overflow to `0`. The effect is that we're now underallocating the array and do an out-of-bound writes. As the buffer is kindly provided by the adversary, this may easily result in code execution. Extrapolating from the test file with 67m commits to the one with 536m commits results in a factor of 8. Thus the uncompressed contents would be about 32GB in size and the compressed ones 192MB. While still easily distributable via the network, only servers will have that amount of RAM and not cause an out-of-memory condition previous to triggering the overflow. This at least makes this attack not an easy vector for client-side use of libgit2.
d103f008	2019-05-21T13:44:47	pool: use `size_t` for sizes
f673e232	2018-12-27T13:47:34	git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.
168fe39b	2018-11-28T14:26:57	object_type: use new enumeration names Use the new object_type enumeration names within the codebase.
1a3fa1f5	2018-10-18T11:25:59	commit_list: avoid use of strtol64 without length limit When quick-parsing a commit, we use `git__strtol64` to parse the commit's time. The buffer that's passed to `commit_quick_parse` is the raw data of an ODB object, though, whose data may not be properly formatted and also does not have to be `NUL` terminated. This may lead to out-of-bound reads. Use `git__strntol64` to avoid this problem.
0c7f49dd	2017-06-30T13:39:01	Make sure to always include "common.h" first Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.
909d5494	2016-12-29T12:25:15	giterr_set: consistent error messages Error messages should be sentence fragments, and therefore: 1. Should not begin with a capital letter, 2. Should not conclude with punctuation, and 3. Should not end a sentence and begin a new one
5e2a29a7	2016-09-27T13:11:47	commit_list: fix the date comparison function This returns the integer-cast truth value comparing the dates. What we want instead of a (-1, 0, 1) output depending on how they compare.
d3416dfe	2015-10-28T10:50:25	pool: Dot not assume mallocs are zeroed out
5ffdea6f	2015-10-14T16:49:01	revwalk: make commit list use 64 bits for time We moved the "main" parsing to use 64 bits for the timestamp, but the quick parsing for the revwalk did not. This means that for large timestamps we fail to parse the time and thus the walk. Move this parser to use 64 bits as well.
b874629b	2014-12-04T21:06:59	Spelling fixes
4075e060	2014-02-03T21:02:08	Replace pqueue with code from hashsig heap I accidentally wrote a separate priority queue implementation when I was working on file rename detection as part of the file hash signature calculation code. To simplify licensing terms, I just adapted that to a general purpose priority queue and replace the old priority queue implementation that was borrowed from elsewhere. This also removes parts of the COPYING document that no longer apply to libgit2.
3736b64f	2013-06-25T18:36:37	Prefer younger merge bases over older ones. git-core prefers younger merge bases over older ones in case that multiple valid merge bases exists.
badd85a6	2013-04-10T17:10:17	Use git_odb_object_data/_size whereever possible This uses the odb object accessors so we can change the internals more easily...
8842c75f	2013-04-03T22:30:07	What has science done.
359fc2d2	2013-01-08T17:07:25	update copyrights
bdf3e6df	2012-11-29T17:34:41	Fix error condition typo
d5e44d84	2012-11-29T17:02:27	Fix function name and add real error check `revwalk.h:commit_lookup()` -> `git_revwalk__commit_lookup()` and make `git_commit_list_parse()` do real error checking that the item in the list is an actual commit object. Also fixed an apparent typo in a test name.
4ff192d3	2012-11-26T19:47:47	Move merge functions to merge.c In so doing, promote commit_list to git_commit_list, with its own internal API header.

6f544140

2021-01-05T19:45:23

commit-graph: Introduce `git_commit_list_generation_cmp` This change makes calculations of merge-bases a bit faster when there are complex graphs and the commit times cause visiting nodes multiple times. This is done by visiting the nodes in the graph in reverse generation order when the generation number is available instead of commit timestamp. If the generation number is missing in any pair of commits, it can safely fall back to the old heuristic with no negative side-effects. Part of: #5757

25b75cd9

2021-03-10T07:06:15

commit-graph: Create `git_commit_graph` as an abstraction for the file This change does a medium-size refactor of the git_commit_graph_file and the interaction with the ODB. Now instead of the ODB owning a direct reference to the git_commit_graph_file, there will be an intermediate git_commit_graph. The main advantage of that is that now end users can explicitly set a git_commit_graph that is eagerly checked for errors, while still being able to lazily use the commit-graph in a regular ODB, if the file is present.

248606eb

2021-01-05T17:20:27

commit-graph: Use the commit-graph in revwalks This change makes revwalks a bit faster by using the `commit-graph` file (if present). This is thanks to the `commit-graph` allow much faster parsing of the commit information by requiring near-zero I/O (aside from reading a few dozen bytes off of a `mmap(2)`-ed file) for each commit, instead of having to read the ODB, inflate the commit, and parse it. This is done by modifying `git_commit_list_parse()` and letting it use the ODB-owned commit-graph file. Part of: #5757

f04a58b0

2019-10-03T12:55:48

Merge pull request #4445 from tiennou/shallow/dry-commit-parsing DRY commit parsing

5cf17e0f

2019-10-03T09:39:42

commit_list: store in/out-degrees as uint16_t The commit list's in- and out-degrees are currently stored as `unsigned short`. When assigning it the value of `git_array_size`, which returns an `size_t`, this generates a warning on some Win32 platforms due to loosing precision. We could just cast the returned value of `git_array_size`, which would work fine for 99.99% of all cases as commits typically have less than 2^16 parents. For crafted commits though we might end up with a wrong value, and thus we should definitely check whether the array size actually fits into the field. To ease the check, let's convert the fields to store the degrees as `uint16_t`. We shouldn't rely on such unspecific types anyway, as it may lead to different behaviour across platforms. Furthermore, this commit introduces a new `git__is_uint16` function to check whether it actually fits -- if not, we return an error.

5988cf34

2017-12-15T18:11:51

commit_list: unify commit information parsing

57a9ccd5

2019-06-21T15:53:54

commit_list: fix possible buffer overflow in `commit_quick_parse` The function `commit_quick_parse` provides a way to quickly parse parts of a commit without storing or verifying most of its metadata. The first thing it does is calculating the number of parents by skipping "parent " lines until it finds the first non-parent line. Afterwards, this parent count is passed to `alloc_parents`, which will allocate an array to store all the parent. To calculate the amount of storage required for the parents array, `alloc_parents` simply multiplicates the number of parents with the respective elements's size. This already screams "buffer overflow", and in fact this problem is getting worse by the result being cast to an `uint32_t`. In fact, triggering this is possible: git-hash-object(1) will happily write a commit with multiple millions of parents for you. I've stopped at 67,108,864 parents as git-hash-object(1) unfortunately soaks up the complete object without streaming anything to disk and thus will cause an OOM situation at a later point. The point here is: this commit was about 4.1GB of size but compressed down to 24MB and thus easy to distribute. The above doesn't yet trigger the buffer overflow, thus. As the array's elements are all pointers which are 8 bytes on 64 bit, we need a total of 536,870,912 parents to trigger the overflow to `0`. The effect is that we're now underallocating the array and do an out-of-bound writes. As the buffer is kindly provided by the adversary, this may easily result in code execution. Extrapolating from the test file with 67m commits to the one with 536m commits results in a factor of 8. Thus the uncompressed contents would be about 32GB in size and the compressed ones 192MB. While still easily distributable via the network, only servers will have that amount of RAM and not cause an out-of-memory condition previous to triggering the overflow. This at least makes this attack not an easy vector for client-side use of libgit2.

d103f008

2019-05-21T13:44:47

pool: use `size_t` for sizes

f673e232

2018-12-27T13:47:34

git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.

168fe39b

2018-11-28T14:26:57

object_type: use new enumeration names Use the new object_type enumeration names within the codebase.

1a3fa1f5

2018-10-18T11:25:59

commit_list: avoid use of strtol64 without length limit When quick-parsing a commit, we use `git__strtol64` to parse the commit's time. The buffer that's passed to `commit_quick_parse` is the raw data of an ODB object, though, whose data may not be properly formatted and also does not have to be `NUL` terminated. This may lead to out-of-bound reads. Use `git__strntol64` to avoid this problem.

0c7f49dd

2017-06-30T13:39:01

Make sure to always include "common.h" first Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.

909d5494

2016-12-29T12:25:15

giterr_set: consistent error messages Error messages should be sentence fragments, and therefore: 1. Should not begin with a capital letter, 2. Should not conclude with punctuation, and 3. Should not end a sentence and begin a new one

5e2a29a7

2016-09-27T13:11:47

commit_list: fix the date comparison function This returns the integer-cast truth value comparing the dates. What we want instead of a (-1, 0, 1) output depending on how they compare.

d3416dfe

2015-10-28T10:50:25

pool: Dot not assume mallocs are zeroed out

5ffdea6f

2015-10-14T16:49:01

revwalk: make commit list use 64 bits for time We moved the "main" parsing to use 64 bits for the timestamp, but the quick parsing for the revwalk did not. This means that for large timestamps we fail to parse the time and thus the walk. Move this parser to use 64 bits as well.

b874629b

2014-12-04T21:06:59

Spelling fixes

4075e060

2014-02-03T21:02:08

Replace pqueue with code from hashsig heap I accidentally wrote a separate priority queue implementation when I was working on file rename detection as part of the file hash signature calculation code. To simplify licensing terms, I just adapted that to a general purpose priority queue and replace the old priority queue implementation that was borrowed from elsewhere. This also removes parts of the COPYING document that no longer apply to libgit2.

3736b64f

2013-06-25T18:36:37

Prefer younger merge bases over older ones. git-core prefers younger merge bases over older ones in case that multiple valid merge bases exists.

badd85a6

2013-04-10T17:10:17

Use git_odb_object_data/_size whereever possible This uses the odb object accessors so we can change the internals more easily...

8842c75f

2013-04-03T22:30:07

What has science done.

359fc2d2

2013-01-08T17:07:25

update copyrights

bdf3e6df

2012-11-29T17:34:41

Fix error condition typo

d5e44d84

2012-11-29T17:02:27

Fix function name and add real error check `revwalk.h:commit_lookup()` -> `git_revwalk__commit_lookup()` and make `git_commit_list_parse()` do real error checking that the item in the list is an actual commit object. Also fixed an apparent typo in a test name.

4ff192d3

2012-11-26T19:47:47

Move merge functions to merge.c In so doing, promote commit_list to git_commit_list, with its own internal API header.

thodg/libgit2/src/commit_list.c

src/commit_list.c

Log