kmx git

Commit	Date	Message
e54343a4	2019-06-29T09:17:32	fileops: rename to "futils.h" to match function signatures Our file utils functions all have a "futils" prefix, e.g. `git_futils_touch`. One would thus naturally guess that their definitions and implementation would live in files "futils.h" and "futils.c", respectively, but in fact they live in "fileops.h". Rename the files to match expectations.
3edbc441	2019-05-20T05:48:39	object: use literal constant in bigfile test Don't calculate 4 GiB as that will produce a compiler warning on MSVC. Just hardcode it.
8eb910b0	2019-06-23T11:26:10	largefile tests: only write 2GB on 32-bit platforms Don't try to feed 4 GB of data to APIs that only take a `size_t` on 32-bit platforms.
6574cd00	2019-06-08T19:25:36	index: rename `frombuffer` to `from_buffer` The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the index functions for consistency with the rest of the library.
08f39208	2019-06-08T17:46:04	blob: add underscore to `from` functions The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the blob functions for consistency with the rest of the library.
0c2d0d4b	2019-06-14T14:07:26	tests: object: refactor largefile test to not use `p_fallocate` The `p_fallocate` platform is currently in use in our tests, only, but it proved to be quite burdensome to get it implemented in a cross-platform way. The only "real" user is the test object::tree::read::largefile, where it's used to allocate a large file in the filesystem only to commit it to the repo and read its object back again. We can simplify this quite a bit by just using an in-memory buffer of 4GB. Sure, this cannot be used on platforms with low resources. But creating 4GB files is not any better, and we already skip the test if the environment variable "GITTEST_INVASIVE_FS_SIZE" is not set. So we're arguably not worse off than before.
1f47efc4	2019-06-07T14:20:54	tests: object: consolidate cache tests The object::cache test module has two tests that do nearly the same thing: given a cache limit, load a certain set of objects and verify if those objects have been cached or not. Convert those tests to the new data-driven initializers to demonstrate how these are to be used. Furthermore, add some additional test data. This conversion is mainly done to show this new facility.
fb7614c0	2019-04-04T13:51:52	tests: test largefiles on win32
4e3949b7	2019-01-30T02:14:11	tests: test that largefiles can be read through the tree API
f673e232	2018-12-27T13:47:34	git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.
cd350852	2019-01-17T10:40:13	object_type: GIT_OBJECT_BAD is now GIT_OBJECT_INVALID We use the term "invalid" to refer to bad or malformed data, eg `GIT_REF_INVALID` and `GIT_EINVALIDSPEC`. Since we're changing the names of the `git_object_t`s in this release, update it to be `GIT_OBJECT_INVALID` instead of `BAD`.
168fe39b	2018-11-28T14:26:57	object_type: use new enumeration names Use the new object_type enumeration names within the codebase.
18e71e6d	2018-11-28T13:31:06	index: use new enum and structure names Use the new-style index names throughout our own codebase.
7fafec0e	2018-10-29T18:32:39	tree: fix integer overflow when reading unreasonably large filemodes The `parse_mode` option uses an open-coded octal number parser. The parser is quite naive in that it simply parses until hitting a character that is not in the accepted range of '0' - '7', completely ignoring the fact that we can at most accept a 16 bit unsigned integer as filemode. If the filemode is bigger than UINT16_MAX, it will thus overflow and provide an invalid filemode for the object entry. Fix the issue by using `git__strntol32` instead and doing a bounds check. As this function already handles overflows, it neatly solves the problem. Note that previously, `parse_mode` was also skipping the character immediately after the filemode. In proper trees, this should be a simple space, but in fact the parser accepted any character and simply skipped over it. As a consequence of using `git__strntol32`, we now need to an explicit check for a trailing whitespace after having parsed the filemode. Because of the newly introduced error message, the test object::tree::parse::mode_doesnt_cause_oob_read needs adjustment to its error message check, which in fact is a good thing as it demonstrates that we now fail looking for the whitespace immediately following the filemode. Add a test that shows that we will fail to parse such invalid filemodes now.
f647bbc8	2018-10-29T17:25:09	tree: fix mode parsing reading out-of-bounds When parsing a tree entry's mode, we will eagerly parse until we hit a character that is not in the accepted set of octal digits '0' - '7'. If the provided buffer is not a NUL terminated one, we may thus read out-of-bounds. Fix the issue by passing the buffer length to `parse_mode` and paying attention to it. Note that this is not a vulnerability in our usual code paths, as all object data read from the ODB is NUL terminated.
d4ad658a	2018-10-29T17:24:47	tree: add various tests exercising the tree parser We currently don't have any tests that directly exercise the tree parser. This is due to the fact that the parsers for raw object data has only been recently introduce with commit ca4db5f4a (object: implement function to parse raw data, 2017-10-13), and previous to that the setup simply was too cumbersome as it always required going through the ODB. Now that we have the infrastructure, add a suite of tests that directly exercise the tree parser and various edge cases.
7655b2d8	2018-10-19T10:29:19	commit: fix reading out of bounds when parsing encoding The commit message encoding is currently being parsed by the `git__prefixcmp` function. As this function does not accept a buffer length, it will happily skip over a buffer's end if it is not `NUL` terminated. Fix the issue by using `git__prefixncmp` instead. Add a test that verifies that we are unable to parse the encoding field if it's cut off by the supplied buffer length.
c2e3d8ef	2018-10-25T12:01:18	tests: add tests that exercise commit parsing We currently do not have any test suites dedicated to parsing commits from their raw representations. Add one based on `git_object__from_raw` to be able to test special cases more easily.
ee11d47e	2018-10-19T09:47:50	tag: fix out of bounds read when searching for tag message When parsing tags, we skip all unknown fields that appear before the tag message. This skipping is done by using a plain `strstr(buffer, "\n\n")` to search for the two newlines that separate tag fields from tag message. As it is not possible to supply a buffer length to `strstr`, this call may skip over the buffer's end and thus result in an out of bounds read. As `strstr` may return a pointer that is out of bounds, the following computation of `buffer_end - buffer` will overflow and result in an allocation of an invalid length. Fix the issue by using `git__memmem` instead. Add a test that verifies parsing the tag fails not due to the allocation failure but due to the tag having no message.
4c738e56	2018-10-19T09:44:14	tests: add tests that exercise tag parsing While the tests in object::tag::read exercises reading and parsing valid tags from the ODB, they barely try to verify that the parser fails in a sane way when parsing invalid tags. Create a new test suite object::tag::parse that directly exercise the parser by using `git_object__from_raw` and add various tests for valid and invalid tags.
f00db9ed	2018-07-27T12:00:37	tree: rename from_tree to validate and clarify the tree in the test
2dff7e28	2018-07-18T21:04:13	tree: accept null ids in existing trees when updating When we add entries to a treebuilder we validate them. But we validate even those that we're adding because they exist in the base tree. This disables using the normal mechanisms on these trees, even to fix them. Keep track of whether the entry we're appending comes from an existing tree and bypass the name and id validation if it's from existing data.
9994cd3f	2018-06-25T11:56:52	treewide: remove use of C++ style comments C++ style comment ("//") are not specified by the ISO C90 standard and thus do not conform to it. While libgit2 aims to conform to C90, we did not enforce it until now, which is why quite a lot of these non-conforming comments have snuck into our codebase. Do a tree-wide conversion of all C++ style comments to the supported C style comments to allow us enforcing strict C90 compliance in a later commit.
ecf4f33a	2018-02-08T11:14:48	Convert usage of `git_buf_free` to new `git_buf_dispose`
a554d588	2018-02-28T12:21:08	tree: initialize the id we use for testing submodule insertions Instead of laving it uninitialized and relying on luck for it to be non-zero, let's give it a dummy hash so we make valgrind happy (in this case the hash comes from `sha1sum </dev/null`.
c0487bde	2018-01-12T08:23:43	tree: reject writing null-OID entries to a tree In commit a96d3cc3f (cache-tree: reject entries with null sha1, 2017-04-21), the git.git project has changed its stance on null OIDs in tree objects. Previously, null OIDs were accepted in tree entries to help tools repair broken history. This resulted in some problems though in that many code paths mistakenly passed null OIDs to be added to a tree, which was not properly detected. Align our own code base according to the upstream change and reject writing tree entries early when the OID is all-zero.
1dc89aab	2017-05-01T21:34:21	object validation: free some memleaks
35079f50	2017-04-21T07:31:56	odb: add option to turn off hash verification Verifying hashsums of objects we are reading from the ODB may be costly as we have to perform an additional hashsum calculation on the object. Especially when reading large objects, the penalty can be as high as 35%, as can be seen when executing the equivalent of `git cat-file` with and without verification enabled. To mitigate for this, we add a global option for libgit2 which enables the developer to turn off the verification, e.g. when he can be reasonably sure that the objects on disk won't be corrupted.
28a0741f	2017-04-10T09:30:08	odb: verify object hashes The upstream git.git project verifies objects when looking them up from disk. This avoids scenarios where objects have somehow become corrupt on disk, e.g. due to hardware failures or bit flips. While our mantra is usually to follow upstream behavior, we do not do so in this case, as we never check hashes of objects we have just read from disk. To fix this, we create a new error class `GIT_EMISMATCH` which denotes that we have looked up an object with a hashsum mismatch. `odb_read_1` will then, after having read the object from its backend, hash the object and compare the resulting hash to the expected hash. If hashes do not match, it will return an error. This obviously introduces another computation of checksums and could potentially impact performance. Note though that we usually perform I/O operations directly before doing this computation, and as such the actual overhead should be drowned out by I/O. Running our test suite seems to confirm this guess. On a Linux system with best-of-five timings, we had 21.592s with the check enabled and 21.590s with the ckeck disabled. Note though that our test suite mostly contains very small blobs only. It is expected that repositories with bigger blobs may notice an increased hit by this check. In addition to a new test, we also had to change the odb::backend::nonrefreshing test suite, which now triggers a hashsum mismatch when looking up the commit "deadbeef...". This is expected, as the fake backend allocated inside of the test will return an empty object for the OID "deadbeef...", which will obviously not hash back to "deadbeef..." again. We can simply adjust the hash to equal the hash of the empty object here to fix this test.
d59dabe5	2017-04-10T09:00:51	tests: object: test looking up corrupted objects We currently have no tests which check whether we fail reading corrupted objects. Add one which modifies contents of an object stored on disk and then tries to read the object.
86c03552	2017-04-10T09:27:04	tests: object: create sandbox The object::lookup tests do use the "testrepo.git" repository in a read-only way, so we do not set up the repository as a sandbox but simply open it. But in a future commit, we will want to test looking up objects which are corrupted in some way, which requires us to modify the on-disk data. Doing this in a repository without creating the sandbox will modify contents of our libgit2 repository, though. Create the repository in a sandbox to avoid this.
1d41b86c	2016-11-14T12:22:20	tree: add a failing test for unsorted input We do not currently use the sorted version of this input in the function, which means we produce bad results.
4006455f	2016-08-09T10:09:23	tests: blob: remove unused callback function
faebc1c6	2016-06-20T17:44:04	threads: split up OS-dependent thread code
a2cb4713	2016-05-24T14:30:43	tree: handle removal of all entries in the updater When we remove all entries in a tree, we should remove that tree from its parent rather than include the empty tree.
53412305	2016-05-19T15:29:53	tree: plug leaks in the tree updater
92249656	2016-05-19T15:21:26	tree: use testrepo2 for the tree updater tests This gives us trees with subdirectories, which the new test needs.
9464f9eb	2016-05-02T17:36:58	Introduce a function to create a tree based on a different one Instead of going through the usual steps of reading a tree recursively into an index, modifying it and writing it back out as a tree, introduce a function to perform simple updates more efficiently. `git_tree_create_updated` avoids reading trees which are not modified and supports upsert and delete operations. It is not as versatile as modifying the index, but it makes some common operations much more efficient.
eb39284b	2016-04-25T12:16:05	tag: ignore extra header fields While no extra header fields are defined for tags, git accepts them by ignoring them and continuing the search for the message. There are a few tags like this in the wild which git parses just fine, so we should do the same.
6669e3e8	2015-11-08T04:28:08	blob: remove _fromchunks() The callback mechanism makes it awkward to write data from an IO source; move to `_fromstream()` which lets the caller remain in control, in the same vein as we prefer iterators over foreach callbacks.
35e68606	2015-11-04T10:36:50	blob: fix fromchunks iteration counter By returning when the count goes to zero rather than below it, setting `howmany` to 7 in fact writes out the string 6 times. Correct the termination condition to write out the string the amount of times we specify.
0a5c6028	2015-11-04T10:30:48	blob: introduce creating a blob by writing into a stream The pair of `git_blob_create_frombuffer()` and `git_blob_create_frombuffer_commit()` is meant to replace `git_blob_create_fromchunks()` by providing a way for a user to write a new blob when they want filtering or they do not know the size. This approach allows the caller to retain control over when to add data to this buffer and a more natural fit into higher-level language's own stream abstractions instead of having to handle IO wait in the callback. The in-memory buffer size of 2MB is chosen somewhat arbitrarily to be a round multiple of usual page sizes and a value where most blobs seem likely to be either going to be way below or way over that size. It's also a round number of pages. This implementation re-uses the helper we have from `_fromchunks()` so we end up writing everything to disk, but hopefully more efficiently than with a default filebuf. A later optimisation can be to avoid writing the in-memory contents to disk, with some extra complexity.
60a194aa	2016-03-20T11:00:12	tree: re-use the id and filename in the odb object Instead of copying over the data into the individual entries, point to the originals, which are already in a format we can use.
ea5bf6bb	2016-03-04T12:34:38	treebuilder: don't try to verify submodules exist in the odb Submodules don't exist in the objectdb and the code is making us try to look for a blob with its commit id, which is obviously not going to work. Skip the test if the user wants to insert a submodule.
f2dddf52	2016-02-28T15:51:38	turn on strict object validation by default
4afe536b	2016-02-28T16:02:49	tests: use legitimate object ids Use legitimate (existing) object IDs in tests so that we have the ability to turn on strict object validation when running tests.
2bbc7d3e	2016-02-23T15:00:27	treebuilder: validate tree entries (optionally) When `GIT_OPT_ENABLE_STRICT_OBJECT_CREATION` is turned on, validate the tree and parent ids given to treebuilder insertion.
2f1080ea	2015-05-19T11:17:07	conflict tests: use GIT_IDXENTRY_STAGE_SET
c4a2fd5c	2015-01-04T17:39:43	Plug a couple of leaks
208a2c8a	2014-12-27T12:09:11	treebuilder: rename _create() to _new() This function is a constructor, so let's name it like one and leave _create() for the reference functions, which do create/write the reference.
dce7b1a4	2014-12-16T19:24:04	treebuilder: take a repository for path validation Path validation may be influenced by `core.protectHFS` and `core.protectNTFS` configuration settings, thus treebuilders can take a repository to influence their configuration.
753e17b0	2014-11-19T18:42:29	peel: reject bad queries with EINVALIDSPEC There are some combination of objects and target types which we know cannot be fulfilled. Return EINVALIDSPEC for those to signify that there is a mismatch in the user-provided data and what the object model is capable of satisfying. If we start at a tag and in the course of peeling find out that we cannot reach a particular type, we return EPEEL.
3b2cb2c9	2014-09-16T11:49:25	Factor 40 and 41 constants from source.
4ca0b566	2014-08-18T12:41:06	oid: Export `git_oid_tostr_s` instead of `_allocfmt` The old `allocfmt` is of no use to callers, as they are not able to free the returned buffer. Export a new API that returns a static string that doesn't need to be freed.
0cee70eb	2014-07-01T14:09:01	Introduce cl_assert_equal_oid
4d3f1f97	2014-06-09T04:38:22	treebuilder: use a map instead of vector to store the entries Finding a filename in a vector means we need to resort it every time we want to read from it, which includes every time we want to write to it as well, as we want to find duplicate keys. A hash-map fits what we want to do much more accurately, as we do not care about sorting, but just the particular filename. We still keep removed entries around, as the interface let you assume they were going to be around until the treebuilder is cleared or freed, but in this case that involves an append to a vector in the filter case, which can now fail. The only time we care about sorting is when we write out the tree, so let's make that the only time we do any sorting.
fb591767	2014-06-07T12:51:48	Win32: Fix object::cache::threadmania test on x64
49e369b2	2014-05-18T10:06:49	message: don't assume the comment char The comment char is configurable and we need to provide a way for the user to specify which comment char they chose for their message.
af567e88	2014-05-12T10:44:13	Merge pull request #2334 from libgit2/rb/fix-2333 Be more careful with user-supplied buffers
1e4976cb	2014-05-08T10:17:14	Be more careful with user-supplied buffers This adds in missing calls to `git_buf_sanitize` and fixes a number of places where `git_buf` APIs could inadvertently write NUL terminator bytes into invalid buffers. This also changes the behavior of `git_buf_sanitize` to NUL terminate a buffer if it can and of `git_buf_shorten` to do nothing if it can. Adds tests of filtering code with zeroed (i.e. unsanitized) buffer which was previously triggering a segfault.
5269008c	2014-05-06T16:01:49	Add filter options and ALLOW_UNSAFE Diff and status do not want core.safecrlf to actually raise an error regardless of the setting, so this extends the filter API with an additional options flags parameter and adds a flag so that filters can be applied with GIT_FILTER_OPT_ALLOW_UNSAFE, indicating that unsafe filter application should be downgraded from a failure to a warning.
217c029b	2014-04-09T14:08:22	commit: safer commit creation with reference update The current version of the commit creation and amend function are unsafe to use when passing the update_ref parameter, as they do not check that the reference at the moment of update points to what the user expects. Make sure that we're moving history forward when we ask the library to update the reference for us by checking that the first parent of the new commit is the current value of the reference. We also make sure that the ref we're updating hasn't moved between the read and the write. Similarly, when amending a commit, make sure that the current tip of the branch is the commit we're amending.
eb46fb2b	2014-03-08T00:49:18	Add failing test for git_object_short_id
13f7ecd7	2014-03-04T16:23:28	Add git_object_short_id API to get short id string This finds a short id string that will unambiguously select the given object, starting with the core.abbrev length (usually 7) and growing until it is no longer ambiguous.
80c29fe9	2014-01-17T10:45:11	Add git_commit_amend API This adds an API to amend an existing commit, basically a shorthand for creating a new commit filling in missing parameters from the values of an existing commit. As part of this, I also added a new "sys" API to create a commit using a callback to get the parents. This allowed me to rewrite all the other commit creation APIs so that temporary allocations are no longer needed.
629ba7f1	2014-02-05T13:07:46	Merge pull request #2027 from libgit2/rb/only-windows-is-windows Some tests of paths that can't actually be written to disk
93954245	2014-01-27T09:39:36	Merge pull request #2075 from libgit2/cmn/leftover-oid Leftover OID -> ID changes
e1d7f003	2014-01-26T16:32:49	messsage: use git_buf in prettify() A lot of the tests were checking for overflow, which we don't have anymore, so we can remove them.
d541170c	2014-01-24T11:36:41	index: rename an entry's id to 'id' This was not converted when we converted the rest, so do it now.
79ccb921	2014-01-03T14:26:02	Further tree building tests with hard paths
97bbf61e	2014-01-03T12:14:22	Tree accessor tests with hard path names
452c7de6	2013-12-12T14:16:40	Add git_treebuilder_insert test and clarify doc This wasn't being tested and since it has a callback, I fixed it even though the return value of this callback is not treated like any of the other callbacks in the API.
19853bdd	2013-12-10T13:01:34	Update git_blob_create_fromchunks callback behavr The callback to supply data chunks could return a negative value to stop creation of the blob, but we were neither using GIT_EUSER nor propagating the return value. This makes things use the new behavior of returning the negative value back to the user.
25e0b157	2013-12-06T15:07:57	Remove converting user error to GIT_EUSER This changes the behavior of callbacks so that the callback error code is not converted into GIT_EUSER and instead we propagate the return value through to the caller. Instead of using the giterr_capture and giterr_restore functions, we now rely on all functions to pass back the return value from a callback. To avoid having a return value with no error message, the user can call the public giterr_set_str or some such function to set an error message. There is a new helper 'giterr_set_callback' that functions can invoke after making a callback which ensures that some error message was set in case the callback did not set one. In places where the sign of the callback return value is meaningful (e.g. positive to skip, negative to abort), only the negative values are returned back to the caller, obviously, since the other values allow for continuing the loop. The hardest parts of this were in the checkout code where positive return values were overloaded as meaningful values for checkout. I fixed this by adding an output parameter to many of the internal checkout functions and removing the overload. This added some code, but it is probably a better implementation. There is some funkiness in the network code where user provided callbacks could be returning a positive or a negative value and we want to rely on that to cancel the loop. There are still a couple places where an user error might get turned into GIT_EUSER there, I think, though none exercised by the tests.
17820381	2013-11-14T14:05:52	Rename tests-clar to tests

e54343a4

2019-06-29T09:17:32

fileops: rename to "futils.h" to match function signatures Our file utils functions all have a "futils" prefix, e.g. `git_futils_touch`. One would thus naturally guess that their definitions and implementation would live in files "futils.h" and "futils.c", respectively, but in fact they live in "fileops.h". Rename the files to match expectations.

3edbc441

2019-05-20T05:48:39

object: use literal constant in bigfile test Don't calculate 4 GiB as that will produce a compiler warning on MSVC. Just hardcode it.

8eb910b0

2019-06-23T11:26:10

largefile tests: only write 2GB on 32-bit platforms Don't try to feed 4 GB of data to APIs that only take a `size_t` on 32-bit platforms.

6574cd00

2019-06-08T19:25:36

index: rename `frombuffer` to `from_buffer` The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the index functions for consistency with the rest of the library.

08f39208

2019-06-08T17:46:04

blob: add underscore to `from` functions The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the blob functions for consistency with the rest of the library.

0c2d0d4b

2019-06-14T14:07:26

tests: object: refactor largefile test to not use `p_fallocate` The `p_fallocate` platform is currently in use in our tests, only, but it proved to be quite burdensome to get it implemented in a cross-platform way. The only "real" user is the test object::tree::read::largefile, where it's used to allocate a large file in the filesystem only to commit it to the repo and read its object back again. We can simplify this quite a bit by just using an in-memory buffer of 4GB. Sure, this cannot be used on platforms with low resources. But creating 4GB files is not any better, and we already skip the test if the environment variable "GITTEST_INVASIVE_FS_SIZE" is not set. So we're arguably not worse off than before.

1f47efc4

2019-06-07T14:20:54

tests: object: consolidate cache tests The object::cache test module has two tests that do nearly the same thing: given a cache limit, load a certain set of objects and verify if those objects have been cached or not. Convert those tests to the new data-driven initializers to demonstrate how these are to be used. Furthermore, add some additional test data. This conversion is mainly done to show this new facility.

fb7614c0

2019-04-04T13:51:52

tests: test largefiles on win32

4e3949b7

2019-01-30T02:14:11

tests: test that largefiles can be read through the tree API

f673e232

2018-12-27T13:47:34

git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.

cd350852

2019-01-17T10:40:13

object_type: GIT_OBJECT_BAD is now GIT_OBJECT_INVALID We use the term "invalid" to refer to bad or malformed data, eg `GIT_REF_INVALID` and `GIT_EINVALIDSPEC`. Since we're changing the names of the `git_object_t`s in this release, update it to be `GIT_OBJECT_INVALID` instead of `BAD`.

168fe39b

2018-11-28T14:26:57

object_type: use new enumeration names Use the new object_type enumeration names within the codebase.

18e71e6d

2018-11-28T13:31:06

index: use new enum and structure names Use the new-style index names throughout our own codebase.

7fafec0e

2018-10-29T18:32:39

tree: fix integer overflow when reading unreasonably large filemodes The `parse_mode` option uses an open-coded octal number parser. The parser is quite naive in that it simply parses until hitting a character that is not in the accepted range of '0' - '7', completely ignoring the fact that we can at most accept a 16 bit unsigned integer as filemode. If the filemode is bigger than UINT16_MAX, it will thus overflow and provide an invalid filemode for the object entry. Fix the issue by using `git__strntol32` instead and doing a bounds check. As this function already handles overflows, it neatly solves the problem. Note that previously, `parse_mode` was also skipping the character immediately after the filemode. In proper trees, this should be a simple space, but in fact the parser accepted any character and simply skipped over it. As a consequence of using `git__strntol32`, we now need to an explicit check for a trailing whitespace after having parsed the filemode. Because of the newly introduced error message, the test object::tree::parse::mode_doesnt_cause_oob_read needs adjustment to its error message check, which in fact is a good thing as it demonstrates that we now fail looking for the whitespace immediately following the filemode. Add a test that shows that we will fail to parse such invalid filemodes now.

f647bbc8

2018-10-29T17:25:09

tree: fix mode parsing reading out-of-bounds When parsing a tree entry's mode, we will eagerly parse until we hit a character that is not in the accepted set of octal digits '0' - '7'. If the provided buffer is not a NUL terminated one, we may thus read out-of-bounds. Fix the issue by passing the buffer length to `parse_mode` and paying attention to it. Note that this is not a vulnerability in our usual code paths, as all object data read from the ODB is NUL terminated.

d4ad658a

2018-10-29T17:24:47

tree: add various tests exercising the tree parser We currently don't have any tests that directly exercise the tree parser. This is due to the fact that the parsers for raw object data has only been recently introduce with commit ca4db5f4a (object: implement function to parse raw data, 2017-10-13), and previous to that the setup simply was too cumbersome as it always required going through the ODB. Now that we have the infrastructure, add a suite of tests that directly exercise the tree parser and various edge cases.

7655b2d8

2018-10-19T10:29:19

commit: fix reading out of bounds when parsing encoding The commit message encoding is currently being parsed by the `git__prefixcmp` function. As this function does not accept a buffer length, it will happily skip over a buffer's end if it is not `NUL` terminated. Fix the issue by using `git__prefixncmp` instead. Add a test that verifies that we are unable to parse the encoding field if it's cut off by the supplied buffer length.

c2e3d8ef

2018-10-25T12:01:18

tests: add tests that exercise commit parsing We currently do not have any test suites dedicated to parsing commits from their raw representations. Add one based on `git_object__from_raw` to be able to test special cases more easily.

ee11d47e

2018-10-19T09:47:50

tag: fix out of bounds read when searching for tag message When parsing tags, we skip all unknown fields that appear before the tag message. This skipping is done by using a plain `strstr(buffer, "\n\n")` to search for the two newlines that separate tag fields from tag message. As it is not possible to supply a buffer length to `strstr`, this call may skip over the buffer's end and thus result in an out of bounds read. As `strstr` may return a pointer that is out of bounds, the following computation of `buffer_end - buffer` will overflow and result in an allocation of an invalid length. Fix the issue by using `git__memmem` instead. Add a test that verifies parsing the tag fails not due to the allocation failure but due to the tag having no message.

4c738e56

2018-10-19T09:44:14

tests: add tests that exercise tag parsing While the tests in object::tag::read exercises reading and parsing valid tags from the ODB, they barely try to verify that the parser fails in a sane way when parsing invalid tags. Create a new test suite object::tag::parse that directly exercise the parser by using `git_object__from_raw` and add various tests for valid and invalid tags.

f00db9ed

2018-07-27T12:00:37

tree: rename from_tree to validate and clarify the tree in the test

2dff7e28

2018-07-18T21:04:13

tree: accept null ids in existing trees when updating When we add entries to a treebuilder we validate them. But we validate even those that we're adding because they exist in the base tree. This disables using the normal mechanisms on these trees, even to fix them. Keep track of whether the entry we're appending comes from an existing tree and bypass the name and id validation if it's from existing data.

9994cd3f

2018-06-25T11:56:52

treewide: remove use of C++ style comments C++ style comment ("//") are not specified by the ISO C90 standard and thus do not conform to it. While libgit2 aims to conform to C90, we did not enforce it until now, which is why quite a lot of these non-conforming comments have snuck into our codebase. Do a tree-wide conversion of all C++ style comments to the supported C style comments to allow us enforcing strict C90 compliance in a later commit.

ecf4f33a

2018-02-08T11:14:48

Convert usage of `git_buf_free` to new `git_buf_dispose`

a554d588

2018-02-28T12:21:08

tree: initialize the id we use for testing submodule insertions Instead of laving it uninitialized and relying on luck for it to be non-zero, let's give it a dummy hash so we make valgrind happy (in this case the hash comes from `sha1sum </dev/null`.

c0487bde

2018-01-12T08:23:43

tree: reject writing null-OID entries to a tree In commit a96d3cc3f (cache-tree: reject entries with null sha1, 2017-04-21), the git.git project has changed its stance on null OIDs in tree objects. Previously, null OIDs were accepted in tree entries to help tools repair broken history. This resulted in some problems though in that many code paths mistakenly passed null OIDs to be added to a tree, which was not properly detected. Align our own code base according to the upstream change and reject writing tree entries early when the OID is all-zero.

1dc89aab

2017-05-01T21:34:21

object validation: free some memleaks

35079f50

2017-04-21T07:31:56

odb: add option to turn off hash verification Verifying hashsums of objects we are reading from the ODB may be costly as we have to perform an additional hashsum calculation on the object. Especially when reading large objects, the penalty can be as high as 35%, as can be seen when executing the equivalent of `git cat-file` with and without verification enabled. To mitigate for this, we add a global option for libgit2 which enables the developer to turn off the verification, e.g. when he can be reasonably sure that the objects on disk won't be corrupted.

28a0741f

2017-04-10T09:30:08

odb: verify object hashes The upstream git.git project verifies objects when looking them up from disk. This avoids scenarios where objects have somehow become corrupt on disk, e.g. due to hardware failures or bit flips. While our mantra is usually to follow upstream behavior, we do not do so in this case, as we never check hashes of objects we have just read from disk. To fix this, we create a new error class `GIT_EMISMATCH` which denotes that we have looked up an object with a hashsum mismatch. `odb_read_1` will then, after having read the object from its backend, hash the object and compare the resulting hash to the expected hash. If hashes do not match, it will return an error. This obviously introduces another computation of checksums and could potentially impact performance. Note though that we usually perform I/O operations directly before doing this computation, and as such the actual overhead should be drowned out by I/O. Running our test suite seems to confirm this guess. On a Linux system with best-of-five timings, we had 21.592s with the check enabled and 21.590s with the ckeck disabled. Note though that our test suite mostly contains very small blobs only. It is expected that repositories with bigger blobs may notice an increased hit by this check. In addition to a new test, we also had to change the odb::backend::nonrefreshing test suite, which now triggers a hashsum mismatch when looking up the commit "deadbeef...". This is expected, as the fake backend allocated inside of the test will return an empty object for the OID "deadbeef...", which will obviously not hash back to "deadbeef..." again. We can simply adjust the hash to equal the hash of the empty object here to fix this test.

d59dabe5

2017-04-10T09:00:51

tests: object: test looking up corrupted objects We currently have no tests which check whether we fail reading corrupted objects. Add one which modifies contents of an object stored on disk and then tries to read the object.

86c03552

2017-04-10T09:27:04

tests: object: create sandbox The object::lookup tests do use the "testrepo.git" repository in a read-only way, so we do not set up the repository as a sandbox but simply open it. But in a future commit, we will want to test looking up objects which are corrupted in some way, which requires us to modify the on-disk data. Doing this in a repository without creating the sandbox will modify contents of our libgit2 repository, though. Create the repository in a sandbox to avoid this.

1d41b86c

2016-11-14T12:22:20

tree: add a failing test for unsorted input We do not currently use the sorted version of this input in the function, which means we produce bad results.

4006455f

2016-08-09T10:09:23

tests: blob: remove unused callback function

faebc1c6

2016-06-20T17:44:04

threads: split up OS-dependent thread code

a2cb4713

2016-05-24T14:30:43

tree: handle removal of all entries in the updater When we remove all entries in a tree, we should remove that tree from its parent rather than include the empty tree.

53412305

2016-05-19T15:29:53

tree: plug leaks in the tree updater

92249656

2016-05-19T15:21:26

tree: use testrepo2 for the tree updater tests This gives us trees with subdirectories, which the new test needs.

9464f9eb

2016-05-02T17:36:58

Introduce a function to create a tree based on a different one Instead of going through the usual steps of reading a tree recursively into an index, modifying it and writing it back out as a tree, introduce a function to perform simple updates more efficiently. `git_tree_create_updated` avoids reading trees which are not modified and supports upsert and delete operations. It is not as versatile as modifying the index, but it makes some common operations much more efficient.

eb39284b

2016-04-25T12:16:05

tag: ignore extra header fields While no extra header fields are defined for tags, git accepts them by ignoring them and continuing the search for the message. There are a few tags like this in the wild which git parses just fine, so we should do the same.

6669e3e8

2015-11-08T04:28:08

blob: remove _fromchunks() The callback mechanism makes it awkward to write data from an IO source; move to `_fromstream()` which lets the caller remain in control, in the same vein as we prefer iterators over foreach callbacks.

35e68606

2015-11-04T10:36:50

blob: fix fromchunks iteration counter By returning when the count goes to zero rather than below it, setting `howmany` to 7 in fact writes out the string 6 times. Correct the termination condition to write out the string the amount of times we specify.

0a5c6028

2015-11-04T10:30:48

blob: introduce creating a blob by writing into a stream The pair of `git_blob_create_frombuffer()` and `git_blob_create_frombuffer_commit()` is meant to replace `git_blob_create_fromchunks()` by providing a way for a user to write a new blob when they want filtering or they do not know the size. This approach allows the caller to retain control over when to add data to this buffer and a more natural fit into higher-level language's own stream abstractions instead of having to handle IO wait in the callback. The in-memory buffer size of 2MB is chosen somewhat arbitrarily to be a round multiple of usual page sizes and a value where most blobs seem likely to be either going to be way below or way over that size. It's also a round number of pages. This implementation re-uses the helper we have from `_fromchunks()` so we end up writing everything to disk, but hopefully more efficiently than with a default filebuf. A later optimisation can be to avoid writing the in-memory contents to disk, with some extra complexity.

60a194aa

2016-03-20T11:00:12

tree: re-use the id and filename in the odb object Instead of copying over the data into the individual entries, point to the originals, which are already in a format we can use.

ea5bf6bb

2016-03-04T12:34:38

treebuilder: don't try to verify submodules exist in the odb Submodules don't exist in the objectdb and the code is making us try to look for a blob with its commit id, which is obviously not going to work. Skip the test if the user wants to insert a submodule.

f2dddf52

2016-02-28T15:51:38

turn on strict object validation by default

4afe536b

2016-02-28T16:02:49

tests: use legitimate object ids Use legitimate (existing) object IDs in tests so that we have the ability to turn on strict object validation when running tests.

2bbc7d3e

2016-02-23T15:00:27

treebuilder: validate tree entries (optionally) When `GIT_OPT_ENABLE_STRICT_OBJECT_CREATION` is turned on, validate the tree and parent ids given to treebuilder insertion.

2f1080ea

2015-05-19T11:17:07

conflict tests: use GIT_IDXENTRY_STAGE_SET

c4a2fd5c

2015-01-04T17:39:43

Plug a couple of leaks

208a2c8a

2014-12-27T12:09:11

treebuilder: rename _create() to _new() This function is a constructor, so let's name it like one and leave _create() for the reference functions, which do create/write the reference.

dce7b1a4

2014-12-16T19:24:04

treebuilder: take a repository for path validation Path validation may be influenced by `core.protectHFS` and `core.protectNTFS` configuration settings, thus treebuilders can take a repository to influence their configuration.

753e17b0

2014-11-19T18:42:29

peel: reject bad queries with EINVALIDSPEC There are some combination of objects and target types which we know cannot be fulfilled. Return EINVALIDSPEC for those to signify that there is a mismatch in the user-provided data and what the object model is capable of satisfying. If we start at a tag and in the course of peeling find out that we cannot reach a particular type, we return EPEEL.

3b2cb2c9

2014-09-16T11:49:25

Factor 40 and 41 constants from source.

4ca0b566

2014-08-18T12:41:06

oid: Export `git_oid_tostr_s` instead of `_allocfmt` The old `allocfmt` is of no use to callers, as they are not able to free the returned buffer. Export a new API that returns a static string that doesn't need to be freed.

0cee70eb

2014-07-01T14:09:01

Introduce cl_assert_equal_oid

4d3f1f97

2014-06-09T04:38:22

treebuilder: use a map instead of vector to store the entries Finding a filename in a vector means we need to resort it every time we want to read from it, which includes every time we want to write to it as well, as we want to find duplicate keys. A hash-map fits what we want to do much more accurately, as we do not care about sorting, but just the particular filename. We still keep removed entries around, as the interface let you assume they were going to be around until the treebuilder is cleared or freed, but in this case that involves an append to a vector in the filter case, which can now fail. The only time we care about sorting is when we write out the tree, so let's make that the only time we do any sorting.

fb591767

2014-06-07T12:51:48

Win32: Fix object::cache::threadmania test on x64

49e369b2

2014-05-18T10:06:49

message: don't assume the comment char The comment char is configurable and we need to provide a way for the user to specify which comment char they chose for their message.

af567e88

2014-05-12T10:44:13

Merge pull request #2334 from libgit2/rb/fix-2333 Be more careful with user-supplied buffers

1e4976cb

2014-05-08T10:17:14

Be more careful with user-supplied buffers This adds in missing calls to `git_buf_sanitize` and fixes a number of places where `git_buf` APIs could inadvertently write NUL terminator bytes into invalid buffers. This also changes the behavior of `git_buf_sanitize` to NUL terminate a buffer if it can and of `git_buf_shorten` to do nothing if it can. Adds tests of filtering code with zeroed (i.e. unsanitized) buffer which was previously triggering a segfault.

5269008c

2014-05-06T16:01:49

Add filter options and ALLOW_UNSAFE Diff and status do not want core.safecrlf to actually raise an error regardless of the setting, so this extends the filter API with an additional options flags parameter and adds a flag so that filters can be applied with GIT_FILTER_OPT_ALLOW_UNSAFE, indicating that unsafe filter application should be downgraded from a failure to a warning.

217c029b

2014-04-09T14:08:22

commit: safer commit creation with reference update The current version of the commit creation and amend function are unsafe to use when passing the update_ref parameter, as they do not check that the reference at the moment of update points to what the user expects. Make sure that we're moving history forward when we ask the library to update the reference for us by checking that the first parent of the new commit is the current value of the reference. We also make sure that the ref we're updating hasn't moved between the read and the write. Similarly, when amending a commit, make sure that the current tip of the branch is the commit we're amending.

eb46fb2b

2014-03-08T00:49:18

Add failing test for git_object_short_id

13f7ecd7

2014-03-04T16:23:28

Add git_object_short_id API to get short id string This finds a short id string that will unambiguously select the given object, starting with the core.abbrev length (usually 7) and growing until it is no longer ambiguous.

80c29fe9

2014-01-17T10:45:11

Add git_commit_amend API This adds an API to amend an existing commit, basically a shorthand for creating a new commit filling in missing parameters from the values of an existing commit. As part of this, I also added a new "sys" API to create a commit using a callback to get the parents. This allowed me to rewrite all the other commit creation APIs so that temporary allocations are no longer needed.

629ba7f1

2014-02-05T13:07:46

Merge pull request #2027 from libgit2/rb/only-windows-is-windows Some tests of paths that can't actually be written to disk

93954245

2014-01-27T09:39:36

Merge pull request #2075 from libgit2/cmn/leftover-oid Leftover OID -> ID changes

e1d7f003

2014-01-26T16:32:49

messsage: use git_buf in prettify() A lot of the tests were checking for overflow, which we don't have anymore, so we can remove them.

d541170c

2014-01-24T11:36:41

index: rename an entry's id to 'id' This was not converted when we converted the rest, so do it now.

79ccb921

2014-01-03T14:26:02

Further tree building tests with hard paths

97bbf61e

2014-01-03T12:14:22

Tree accessor tests with hard path names

452c7de6

2013-12-12T14:16:40

Add git_treebuilder_insert test and clarify doc This wasn't being tested and since it has a callback, I fixed it even though the return value of this callback is not treated like any of the other callbacks in the API.

19853bdd

2013-12-10T13:01:34

Update git_blob_create_fromchunks callback behavr The callback to supply data chunks could return a negative value to stop creation of the blob, but we were neither using GIT_EUSER nor propagating the return value. This makes things use the new behavior of returning the negative value back to the user.

25e0b157

2013-12-06T15:07:57

Remove converting user error to GIT_EUSER This changes the behavior of callbacks so that the callback error code is not converted into GIT_EUSER and instead we propagate the return value through to the caller. Instead of using the giterr_capture and giterr_restore functions, we now rely on all functions to pass back the return value from a callback. To avoid having a return value with no error message, the user can call the public giterr_set_str or some such function to set an error message. There is a new helper 'giterr_set_callback' that functions can invoke after making a callback which ensures that some error message was set in case the callback did not set one. In places where the sign of the callback return value is meaningful (e.g. positive to skip, negative to abort), only the negative values are returned back to the caller, obviously, since the other values allow for continuing the loop. The hardest parts of this were in the checkout code where positive return values were overloaded as meaningful values for checkout. I fixed this by adding an output parameter to many of the internal checkout functions and removing the overload. This added some code, but it is probably a better implementation. There is some funkiness in the network code where user provided callbacks could be returning a positive or a negative value and we want to rely on that to cancel the loop. There are still a couple places where an user error might get turned into GIT_EUSER there, I think, though none exercised by the tests.

17820381

2013-11-14T14:05:52

Rename tests-clar to tests

thodg/libgit2/tests/object

tests/object

Log