kmx git

Commit	Date	Message
f0e693b1	2021-09-07T17:53:49	str: introduce `git_str` for internal, `git_buf` is external libgit2 has two distinct requirements that were previously solved by `git_buf`. We require: 1. A general purpose string class that provides a number of utility APIs for manipulating data (eg, concatenating, truncating, etc). 2. A structure that we can use to return strings to callers that they can take ownership of. By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult. Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr"). The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.) Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.
31ecaca2	2021-09-30T08:11:40	hash: hash functions operate on byte arrays not git_oids Separate the concerns of the hash functions from the git_oid functions. The git_oid structure will need to understand either SHA1 or SHA256; the hash functions should only deal with the appropriate one of these.
2a713da1	2021-09-29T21:31:17	hash: accept the algorithm in inputs
7e7cfe8a	2021-09-26T20:20:03	buf: common_prefix takes a string array `git_strarray` is a public-facing type. Change `git_buf_text_common_prefix` to not use it, and just take an array of strings instead.
a24e656a	2021-09-04T10:16:41	common: support custom repository extensions Allow users to specify additional repository extensions that they want to support. For example, callers can specify that they support `preciousObjects` and then may open repositories that support `extensions.preciousObjects`. Similarly, callers may opt out of supporting extensions that the library itself supports.
1196de4f	2021-08-31T15:22:44	util: introduce `git__strlcmp` Introduce a utility function that compares a NUL terminated string to a possibly not-NUL terminated string with length. This is similar to `strncmp` but with an added check to ensure that the lengths match (not just the `size` portion of the two strings).
47c70fc5	2021-08-26T05:40:20	Merge remote-tracking branch 'origin/main' into cgraph-write
63f08e42	2021-08-26T05:29:34	Make the defaultable fields defaultable Also, add `git_commit_graph_writer_options_init`!
c7a195a1	2021-08-25T14:11:03	Merge pull request #6006 from boretrk/c11-warnings GCC C11 warnings
4bbe5e6e	2021-08-25T18:14:10	win32: name the dummy union in GIT_REPARSE_DATA_BUFFER Instead of buf->"typeofbuffer"ReparseBuffer the members will be referenced with buf->ReparseBuffer."typeofbuffer" https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/ns-ntifs-_reparse_data_buffer?redirectedfrom=MSDN calls the union DUMMYUNIONNAME but that looks a bit cluttered.
231ca4fa	2021-02-23T19:33:34	Proof-of-concept for a more aggressive GIT_UNUSED() This adds a `-Wunused-result`-proof `GIT_UNUSED()`, just to demonstrate that it works. With this, sortedcache.h is now completely `GIT_WARN_UNUSED_RESULT`-annotated!
9eb17d46	2021-02-16T19:38:34	Introduce GIT_WARN_UNUSED_RESULT This change adds the GIT_WARN_UNUSED_RESULT annotation, which makes the compiler warn when a return result is not used. This avoids bugs.
3f28eafe	2021-07-07T19:35:42	stdint constants in test suite Passes w/ gcc 11 on Fedora x64. Protip: So you don;t have to suffer, ``` perl -pe 's/(-?(?:0x)?[A-Fa-f0-9]+)([Uu])?[Ll][Ll]/\U$2INT64_C(\E$1)/mg' ```
d525e063	2021-05-10T23:04:59	buf: remove internal `git_buf_text` namespace The `git_buf_text` namespace is unnecessary and strange. Remove it, just keep the functions prefixed with `git_buf`.
cb136cdd	2021-04-14T22:22:11	utf8: introduce git_utf8_char_length Introduce a function to determine the number of Unicode characters in a given UTF-8 string.
d9c15387	2021-01-05T14:29:58	blob: add git_blob_filter_options_init The `git_blob_filter_options_init` function should be included, to allow callers in FFI environments to let us initialize an options structure for them.
5ab0736b	2020-12-19T09:30:26	Add tests for `git__multiply_int64_overflow` As it turns out, the implementation of `git__multiply_int64_overflow` is full of edge cases and tricky arithmetic. That means that it should have unit tests. As a result, a bug in `git__strntol64` was found (and fixed!) in clang+32-bit.
2cfa31c4	2020-04-05T18:30:07	path: remove unused git_path_topdir
6554b40e	2020-05-13T10:39:33	settings: localize global data Move the settings global data teardown into its own separate function, instead of intermingled with the global state.
cad7a1ba	2020-06-05T08:42:38	clar: include the function name
cbae1c21	2020-04-01T22:12:07	assert: allow non-int returning functions to assert Include GIT_ASSERT_WITH_RETVAL and GIT_ASSERT_ARG_WITH_RETVAL so that functions that do not return int (or more precisely, where `-1` would not be an error code) can assert. This allows functions that return, eg, NULL on an error code to do that by passing the return value (in this example, `NULL`) as a second parameter to the GIT_ASSERT_WITH_RETVAL functions.
a95096ba	2020-01-12T10:31:07	assert: optionally fall-back to assert(3) Fall back to the system assert(3) in debug builds, which may aide in debugging. "Safe" assertions can be enabled in debug builds by setting GIT_ASSERT_HARD=0. Similarly, hard assertions can be enabled in release builds by setting GIT_ASSERT_HARD to nonzero.
abe2efe1	2019-12-09T12:37:34	Introduce GIT_ASSERT macros Provide macros to replace usages of `assert`. A true `assert` is punishing as a library. Instead we should do our best to not crash. GIT_ASSERT_ARG(x) will now assert that the given argument complies to some format and sets an error message and returns `-1` if it does not. GIT_ASSERT(x) is for internal usage, and available as an internal consistency check. It will set an error message and return `-1` in the event of failure.
163db8f2	2020-02-28T18:53:22	win32: test relative symlinks Ensure that we don't canonicalize symlink targets.
7d55bee6	2020-01-10T12:44:51	win32: fix relative symlinks pointing into dirs On Windows platforms, we need some logic to emulate symlink(3P) defined by POSIX. As unprivileged symlinks on Windows are a rather new feature, our current implementation is comparatively new and still has some rough edges in special cases. One such case is relative symlinks. While relative symlinks to files in the same directory work as expected, libgit2 currently fails to create reltaive symlinks pointing into other directories. This is due to the fact that we forgot to translate the Unix-style target path to Windows-style. Most importantly, we are currently not converting directory separators from "/" to "\". Fix the issue by calling `git_win32_path_canonicalize` on the target. Add a test that verifies our ability to create such relative links across directories.
6460e8ab	2019-06-23T18:13:29	internal: use off64_t instead of git_off_t Prefer `off64_t` internally.
63307cba	2019-09-28T17:32:18	Merge pull request #5226 from pks-t/pks/regexp-api regexp: implement a new regular expression API
f585b129	2019-09-12T14:29:28	posix: remove superseded POSIX regex wrappers The old POSIX regex wrappers have been superseded by our own regexp API that provides a higher-level abstraction. Remove the POSIX wrappers in favor of the new one.
d77378eb	2019-09-13T08:54:26	regexp: implement new regular expression API We currently support a set of different regular expression backends with PCRE, PCRE2, regcomp(3P) and regcomp_l(3). The current implementation of this is done via a simple POSIX wrapper that either directly uses supplied functions or that is a very small wrapper. To support PCRE and PCRE2, we use their provided <pcreposix.h> and <pcre2posix.h> wrappers. These wrappers are implemented in such a way that the accompanying libraries pcre-posix and pcre2-posix provide the same symbols as the libc ones, namely regcomp(3P) et al. This works out on some systems just fine, most importantly on glibc-based ones, where the regular expression functions are implemented as weak aliases and thus get overridden by linking in the pcre{,2}-posix library. On other systems we depend on the linking order of libc and pcre library, and as libc always comes first we will end up with the functions of the libc implementation. As a result, we may use the structures `regex_t` and `regmatch_t` declared by <pcre{,2}posix.h>, but use functions defined by the libc, leading to segfaults. The issue is not easily solvable. Somed distributions like Debian have resolved this by patching PCRE and PCRE2 to carry custom prefixes to all the POSIX function wrappers. But this is not supported by upstream and thus inherently unportable between distributions. We could instead try to modify linking order, but this starts becoming fragile and will not work e.g. when libgit2 is loaded via dlopen(3P) or similar ways. In the end, this means that we simply cannot use the POSIX wrappers provided by the PCRE libraries at all. Thus, this commit introduces a new regular expression API. The new API is on a tad higher level than the previous POSIX abstraction layer, as it tries to abstract away any non-portable flags like e.g. REG_EXTENDED, which has no equivalents in all of our supported backends. As there are no users of POSIX regular expressions that do _not_ reguest REG_EXTENDED this is fine to be abstracted away, though. Due to the API being higher-level than before, it should generally be a tad easier to use than the previous one. Note: ideally, the new API would've been called `git_regex_foobar` with a file "regex.h" and "regex.c". Unfortunately, this is currently impossible to implement due to naming clashes between the then-existing "regex.h" and <regex.h> provided by the libc. As we add the source directory of libgit2 to the header search path, an include of <regex.h> would always find our own "regex.h". Thus, we have to take the bitter pill of adding one more character to all the functions to disambiguate the includes. To improve guarantees around cross-backend compatibility, this commit also brings along an improved regular expression test suite core::regexp.
174b7a32	2019-09-19T12:24:06	buffer: fix printing into out-of-memory buffer Before printing into a `git_buf` structure, we always call `ENSURE_SIZE` first. This macro will reallocate the buffer as-needed depending on whether the current amount of allocated bytes is sufficient or not. If `asize` is big enough, then it will just do nothing, otherwise it will call out to `git_buf_try_grow`. But in fact, it is insufficient to only check `asize`. When we fail to allocate any more bytes e.g. via `git_buf_try_grow`, then we set the buffer's pointer to `git_buf__oom`. Note that we touch neither `asize` nor `size`. So if we just check `asize > targetsize`, then we will happily let the caller of `ENSURE_SIZE` proceed with an out-of-memory buffer. As a result, we will print all bytes into the out-of-memory buffer instead, resulting in an out-of-bounds write. Fix the issue by having `ENSURE_SIZE` verify that the buffer is not marked as OOM. Add a test to verify that we're not writing into the OOM buffer.
208f1d7a	2019-09-19T12:46:37	buffer: fix infinite loop when growing buffers When growing buffers, we repeatedly multiply the currently allocated number of bytes by 1.5 until it exceeds the requested number of bytes. This has two major problems: 1. If the current number of bytes is tiny and one wishes to resize to a comparatively huge number of bytes, then we may need to loop thousands of times. 2. If resizing to a value close to `SIZE_MAX` (which would fail anyway), then we probably hit an infinite loop as multiplying the current amount of bytes will repeatedly result in integer overflows. When reallocating buffers, one typically chooses values close to 1.5 to enable re-use of resulting memory holes in later reallocations. But because of this, it really only makes sense to use a factor of 1.5 _once_, but not looping until we finally are able to fit it. Thus, we can completely avoid the loop and just opt for the much simpler algorithm of multiplying with 1.5 once and, if the result doesn't fit, just use the target size. This avoids both problems of looping extensively and hitting overflows. This commit also adds a test that would've previously resulted in an infinite loop.
8cbef12d	2019-08-08T11:52:54	util: do not perform allocations in insertsort Our hand-rolled fallback sorting function `git__insertsort_r` does an in-place sort of the given array. As elements may not necessarily be pointers, it needs a way of swapping two values of arbitrary size, which is currently implemented by allocating a temporary buffer of the element's size. This is problematic, though, as the emulated `qsort` interface doesn't provide any return values and thus cannot signal an error if allocation of that temporary buffer has failed. Convert the function to swap via a temporary buffer allocated on the stack. Like this, it can `memcpy` contents of both elements in small batches without requiring a heap allocation. The buffer size has been chosen such that in most cases, a single iteration of copying will suffice. Most importantly, it can fully contain `git_oid` structures and pointers. Add a bunch of tests for the `git__qsort_r` interface to verify nothing breaks. Furthermore, this removes the declaration of `git__insertsort_r` and makes it static as it is not used anywhere else.
50194dcd	2019-07-11T15:14:42	win32: fix symlinks to relative file targets When creating a symlink in Windows, one needs to tell Windows whether the symlink should be a file or directory symlink. To determine which flag to pass, we call `GetFileAttributesW` on the target file to see whether it is a directory and then pass the flag accordingly. The problem though is if create a symlink with a relative target path, then we will check that relative path while not necessarily being inside of the working directory where the symlink is to be created. Thus, getting its attributes will either fail or return attributes of the wrong target. Fix this by resolving the target path relative to the directory in which the symlink is to be created.
93d37a1d	2019-06-29T09:59:36	tests: core: improve symlink test coverage Add two more tests to verify that we're not deleting symlink targets, but the symlinks themselves. Furthermore, convert several `cl_skip`s on Win32 to conditional skips depending on whether the clar sandbox supports symlinks or not. Windows is grown up now and may allow unprivileged symlinks if the machine has been configured accordingly.
683ea2b0	2019-06-29T09:10:57	tests: core: add missing asserts for several function calls Several function calls to `p_stat` and `p_close` have no verification if they actually succeeded. As these functions _may_ fail and as we also want to make sure that we're not doing anything dumb, let's check them, too.
e54343a4	2019-06-29T09:17:32	fileops: rename to "futils.h" to match function signatures Our file utils functions all have a "futils" prefix, e.g. `git_futils_touch`. One would thus naturally guess that their definitions and implementation would live in files "futils.h" and "futils.c", respectively, but in fact they live in "fileops.h". Rename the files to match expectations.
c512d58f	2019-06-15T22:26:23	win32: cast WinAPI to void * before casting GetProcAddress is prototyped to return a `FARPROC`, which is meant to be a generic function pointer. It's literally `int (FAR WINAPI * FARPROC)()` which gcc complains if you attempt to cast to a `void ()(GIT_SRWLOCK )`. Cast to a `void *` before casting to avoid warnings about the arguments.
fef847ae	2019-06-15T15:47:41	Merge pull request #5110 from pks-t/pks/wildmatch Replace fnmatch with wildmatch
a9f57629	2019-06-13T15:03:00	wildmatch: import wildmatch from git.git In commit 70a8fc999d (stop using fnmatch (either native or compat), 2014-02-15), upstream git has switched over all code from their internal fnmatch copy to its new wildmatch code. We haven't followed suit, and thus have developed some incompatibilities in how we match regular expressions. Import git's wildmatch from v2.22.0 and add a test suite based on their t3070-wildmatch.sh tests.
2d85c7e8	2019-06-14T14:12:19	posix: remove `p_fallocate` abstraction By now, we have repeatedly failed to provide a nice cross-platform implementation of `p_fallocate`. Recent tries to do that escalated quite fast to a set of different CMake checks, implementations, fallbacks, etc., which started to look real awkward to maintain. In fact, `p_fallocate` had only been introduced in commit 4e3949b73 (tests: test that largefiles can be read through the tree API, 2019-01-30) to support a test with large files, but given the maintenance costs it just seems not to be worht it. As we have removed the sole user of `p_fallocate` in the previous commit, let's drop it altogether.
c0dd7122	2019-06-06T16:48:04	apply: add an options struct initializer
0b5ba0d7	2019-06-06T16:36:23	Rename opt init functions to `options_init` In libgit2 nomenclature, when we need to verb a direct object, we name a function `git_directobject_verb`. Thus, if we need to init an options structure named `git_foo_options`, then the name of the function that does that should be `git_foo_options_init`. The previous names of `git_foo_init_options` is close - it _sounds_ as if it's initializing the options of a `foo`, but in fact `git_foo_options` is its own noun that should be respected. Deprecate the old names; they'll now call directly to the new ones.
09902985	2019-01-13T21:12:10	core::posix: skip some locale tests on win32
8877d7d3	2019-01-13T02:08:43	tests: regcomp: use proper character classes The '[[:digit:]]' and '[[:alpha:]]' classes require double brackets, not single.
ca1b07a2	2019-01-13T02:05:58	tests: regcomp: test that regex functions succeed The regex functions return nonzero (not necessarily negative values) on failure.
aea9a712	2018-03-02T15:12:14	tests: regcomp: assert character groups do match normal alphabet In order to avoid us being unable to match characters which are part of the normal US alphabet in certain weird languages, add two tests to catch this behavior.
e207b2a2	2018-03-02T15:09:20	tests: regex: restructure setup of locales In order to make it easier adding more locale-related tests, add a generalized framework handling initial setup of languages as well as the cleanup of them afterwards.
b055a6b5	2019-01-13T01:24:39	tests: regex: add test with LC_COLLATE being set While we already have a test for `p_regexec` with `LC_CTYPE` being modified, `regexec` also alters behavior as soon as `LC_COLLATE` is being modified. Most importantly, `LC_COLLATE` changes the way how ranges are interpreted to just not handling them at all. Thus, ensure that either we use `regcomp_l` to avoid this, or that we've fallen back to our builtin regex functionality which also behaves properly.
ad4ede91	2018-03-02T13:51:57	tests: fix p_regcomp test not checking return type While the test asserts that the error value indcates a non-value, it is actually never getting assigned to. Fix this.
02683b20	2019-01-12T23:06:39	regexec: prefix all regexec function calls with p_ Prefix all the calls to the the regexec family of functions with `p_`. This allows us to swap out all the regular expression functions with our own implementation. Move the declarations to `posix_regex.h` for simpler inclusion.
aeea1c46	2019-04-04T15:06:44	Merge pull request #4874 from tiennou/test/4615 Test that largefiles can be read through the tree API
0345a380	2019-02-22T14:39:08	p_fallocate: add a test for our implementation
bd66925a	2018-12-01T10:29:32	oidmap: remove legacy low-level interface Remove the low-level interface that was exposing implementation details of `git_oidmap` to callers. From now on, only the high-level functions shall be used to retrieve or modify values of a map. Adjust remaining existing callers.
fdfabdc4	2018-12-01T09:49:10	strmap: remove legacy low-level interface Remove the low-level interface that was exposing implementation details of `git_strmap` to callers. From now on, only the high-level functions shall be used to retrieve or modify values of a map. Adjust remaining existing callers.
18cf5698	2018-12-01T09:37:40	maps: provide high-level iteration interface Currently, our headers need to leak some implementation details of maps due to their direct use of indices in the implementation of their foreach macros. This makes it impossible to completely hide the map structures away, and also makes it impossible to include the khash implementation header in the C files of the respective map only. This is now being fixed by providing a high-level iteration interface `map_iterate`, which takes as inputs the map that shall be iterated over, an iterator as well as the locations where keys and values shall be put into. For simplicity's sake, the iterator is a simple `size_t` that shall initialized to `0` on the first call. All existing foreach macros are then adjusted to make use of this new function.
2e0a3048	2019-01-23T10:48:55	oidmap: introduce high-level setter for key/value pairs Currently, one would use either `git_oidmap_insert` to insert key/value pairs into a map or `git_oidmap_put` to insert a key only. These function have historically been macros, which is why their syntax is kind of weird: instead of returning an error code directly, they instead have to be passed a pointer to where the return value shall be stored. This does not match libgit2's common idiom of directly returning error codes.Furthermore, `git_oidmap_put` is tightly coupled with implementation details of the map as it exposes the index of inserted entries. Introduce a new function `git_oidmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert all trivial callers of `git_oidmap_insert` and `git_oidmap_put` to make use of it.
9694ef20	2018-12-17T09:01:53	oidmap: introduce high-level getter for values The current way of looking up an entry from a map is tightly coupled with the map implementation, as one first has to look up the index of the key and then retrieve the associated value by using the index. As a caller, you usually do not care about any indices at all, though, so this is more complicated than really necessary. Furthermore, it invites for errors to happen if the correct error checking sequence is not being followed. Introduce a new high-level function `git_oidmap_get` that takes a map and a key and returns a pointer to the associated value if such a key exists. Otherwise, a `NULL` pointer is returned. Adjust all callers that can trivially be converted.
03555830	2019-01-23T10:44:33	strmap: introduce high-level setter for key/value pairs Currently, one would use the function `git_strmap_insert` to insert key/value pairs into a map. This function has historically been a macro, which is why its syntax is kind of weird: instead of returning an error code directly, it instead has to be passed a pointer to where the return value shall be stored. This does not match libgit2's common idiom of directly returning error codes. Introduce a new function `git_strmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert all callers of `git_strmap_insert` to make use of it.
ef507bc7	2019-01-23T10:44:02	strmap: introduce `git_strmap_get` and use it throughout the tree The current way of looking up an entry from a map is tightly coupled with the map implementation, as one first has to look up the index of the key and then retrieve the associated value by using the index. As a caller, you usually do not care about any indices at all, though, so this is more complicated than really necessary. Furthermore, it invites for errors to happen if the correct error checking sequence is not being followed. Introduce a new high-level function `git_strmap_get` that takes a map and a key and returns a pointer to the associated value if such a key exists. Otherwise, a `NULL` pointer is returned. Adjust all callers that can trivially be converted.
7e926ef3	2018-11-30T12:14:43	maps: provide a uniform entry count interface There currently exist two different function names for getting the entry count of maps, where offmaps offset and string maps use `num_entries` and OID maps use `size`. In most programming languages with built-in map types, this is simply called `size`, which is also shorter to type. Thus, this commit renames the other two functions `num_entries` to match the common way and adjusts all callers.
351eeff3	2019-01-23T10:42:46	maps: use uniform lifecycle management functions Currently, the lifecycle functions for maps (allocation, deallocation, resize) are not named in a uniform way and do not have a uniform function signature. Rename the functions to fix that, and stick to libgit2's naming scheme of saying `git_foo_new`. This results in the following new interface for allocation: - `int git_<t>map_new(git_<t>map *out)` to allocate a new map, returning an error code if we ran out of memory - `void git_<t>map_free(git_<t>map map)` to free a map - `void git_<t>map_clear(git<t>map *map)` to remove all entries from a map This commit also fixes all existing callers.
3fba5891	2019-01-20T23:53:33	test: cast to a char the zstream test
9c5e05ad	2019-01-23T10:43:29	deprecation: move deprecated tests into their own file Move the deprecated stream tests into their own compilation unit. This will allow us to disable any preprocessor directives that apply to deprecation just for these tests (eg, disabling `GIT_DEPRECATED_HARD`).
f673e232	2018-12-27T13:47:34	git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.
b5e8272f	2019-01-06T08:29:56	Attempt at fixing the MingW64 compilation It seems like MingW64's size_t is defined differently than in Linux.
487233fa	2018-11-29T07:21:41	Merge pull request #4895 from pks-t/pks/unused-warnings Unused function warnings
02bb39f4	2018-11-22T08:49:09	stream registration: take an enum type Accept an enum (`git_stream_t`) during custom stream registration that indicates whether the registration structure should be used for standard (non-TLS) streams or TLS streams.
df2cc108	2018-11-18T10:29:07	stream: provide generic registration API Update the new stream registration API to be `git_stream_register` which takes a registration structure and a TLS boolean. This allows callers to register non-TLS streams as well as TLS streams. Provide `git_stream_register_tls` that takes just the init callback for backward compatibliity.
43b592ac	2018-10-25T08:49:01	tls: introduce a wrap function Introduce `git_tls_stream_wrap` which will take an existing `stream` with an already connected socket and begin speaking TLS on top of it. This is useful if you've built a connection to a proxy server and you wish to begin CONNECT over it to tunnel a TLS connection. Also update the pluggable TLS stream layer so that it can accept a registration structure that provides an `init` and `wrap` function, instead of a single initialization function.
852bc9f4	2018-11-23T19:26:24	khash: remove intricate knowledge of khash types Instead of using the `khiter_t`, `git_strmap_iter` and `khint_t` types, simply use `size_t` instead. This decouples code from the khash stuff and makes it possible to move the khash includes into the implementation files.
4209a512	2018-11-14T12:04:42	strntol: fix out-of-bounds reads when parsing numbers with leading sign When parsing a number, we accept a leading plus or minus sign to return a positive or negative number. When the parsed string has such a leading sign, we set up a flag indicating that the number is negative and advance the pointer to the next character in that string. This misses updating the number of bytes in the string, though, which is why the parser may later on do an out-of-bounds read. Fix the issue by correctly updating both the pointer and the number of remaining bytes. Furthermore, we need to check whether we actually have any bytes left after having advanced the pointer, as otherwise the auto-detection of the base may do an out-of-bonuds access. Add a test that detects the out-of-bound read. Note that this is not actually security critical. While there are a lot of places where the function is called, all of these places are guarded or irrelevant: - commit list: this operates on objects from the ODB, which are always NUL terminated any may thus not trigger the off-by-one OOB read. - config: the configuration is NUL terminated. - curl stream: user input is being parsed that is always NUL terminated - index: the index is read via `git_futils_readbuffer`, which always NUL terminates it. - loose objects: used to parse the length from the object's header. As we check previously that the buffer contains a NUL byte, this is safe. - rebase: this parses numbers from the rebase instruction sheet. As the rebase code uses `git_futils_readbuffer`, the buffer is always NUL terminated. - revparse: this parses a user provided buffer that is NUL terminated. - signature: this parser the header information of objects. As objects read from the ODB are always NUL terminated, this is a non-issue. The constructor `git_signature_from_buffer` does not accept a length parameter for the buffer, so the buffer needs to be NUL terminated, as well. - smart transport: the buffer that is parsed is NUL terminated - tree cache: this parses the tree cache from the index extension. The index itself is read via `git_futils_readbuffer`, which always NUL terminates it. - winhttp transport: user input is being parsed that is always NUL terminated
50d09407	2018-10-29T18:05:27	strntol: fix detection and skipping of base prefixes The `git__strntol` family of functions has the ability to auto-detect a number's base if the string has either the common '0x' prefix for hexadecimal numbers or '0' prefix for octal numbers. The detection of such prefixes and following handling has two major issues though that are being fixed in one go now. - We do not do any bounds checking previous to verifying the '0x' base. While we do verify that there is at least one digit available previously, we fail to verify that there are two digits available and thus may do an out-of-bounds read when parsing this two-character-prefix. - When skipping the prefix of such numbers, we only update the pointer length without also updating the number of remaining bytes. Thus if we try to parse a number '0x1' of total length 3, we will first skip the first two bytes and then try to read 3 bytes starting at '1'. Fix both issues by disentangling the logic. Instead of doing the detection and skipping of such prefixes in one go, we will now first try to detect the base while also honoring how many bytes are left. Only if we have a valid base that is either 8 or 16 and have one of the known prefixes, we will now advance the pointer and update the remaining bytes in one step. Add some tests that verify that no out-of-bounds parsing happens and that autodetection works as advertised.
41863a00	2018-10-29T17:19:58	strntol: fix out-of-bounds read when skipping leading spaces The `git__strntol` family of functions accepts leading spaces and will simply skip them. The skipping will not honor the provided buffer's length, though, which may lead it to read outside of the provided buffer's bounds if it is not a simple NUL-terminated string. Furthermore, if leading space is trimmed, the function will further advance the pointer but not update the number of remaining bytes, which may also lead to out-of-bounds reads. Fix the issue by properly paying attention to the buffer length and updating it when stripping leading whitespace characters. Add a test that verifies that we won't read past the provided buffer length.
623647af	2018-10-26T12:33:59	Merge pull request #4864 from pks-t/pks/object-parse-fixes Object parse fixes
83e8a6b3	2018-10-18T16:08:46	util: provide `git__memmem` function Unfortunately, neither the `memmem` nor the `strnstr` functions are part of any C standard but are merely extensions of C that are implemented by e.g. glibc. Thus, there is no standardized way to search for a string in a block of memory with a limited size, and using `strstr` is to be considered unsafe in case where the buffer has not been sanitized. In fact, there are some uses of `strstr` in exactly that unsafe way in our codebase. Provide a new function `git__memmem` that implements the `memmem` semantics. That is in a given haystack of `n` bytes, search for the occurrence of a byte sequence of `m` bytes and return a pointer to the first occurrence. The implementation chosen is the "Not So Naive" algorithm from [1]. It was chosen as the implementation is comparably simple while still being reasonably efficient in most cases. Preprocessing happens in constant time and space, searching has a time complexity of O(n*m) with a slightly sub-linear average case. [1]: http://www-igm.univ-mlv.fr/~lecroq/string/
ea19efc1	2018-10-18T15:08:56	util: fix out of bounds read in error message When an integer that is parsed with `git__strntol32` is too big to fit into an int32, we will generate an error message that includes the actual string that failed to parse. This does not acknowledge the fact that the string may either not be NUL terminated or alternative include additional characters after the number that is to be parsed. We may thus end up printing characters into the buffer that aren't the number or, worse, read out of bounds. Fix the issue by utilizing the `endptr` that was set by `git__strntol64`. This pointer is guaranteed to be set to the first character following the number, and we can thus use it to compute the width of the number that shall be printed. Create a test to verify that we correctly truncate the number.
39087ab8	2018-10-18T12:11:33	tests: core::strtol: test for some more edge-cases Some edge cases were currently completely untested, e.g. parsing numbers greater than INT64_{MIN,MAX}, truncating buffers by length and invalid characters. Add tests to verify that the system under test performs as expected.
8d7fa88a	2018-10-18T12:04:07	util: remove `git__strtol32` The function `git__strtol32` can easily be misused when untrusted data is passed to it that may not have been sanitized with trailing `NUL` bytes. As all usages of this function have now been removed, we can remove this function altogether to avoid future misuse of it.
68deb2cc	2018-10-18T11:37:10	util: remove unsafe `git__strtol64` function The function `git__strtol64` does not take a maximum buffer length as parameter. This has led to some unsafe usages of this function, and as such we may consider it as being unsafe to use. As we have now eradicated all usages of this function, let's remove it completely to avoid future misuse.
838a2f29	2018-10-07T12:00:48	Merge pull request #4828 from csware/git_futils_rmdir_r_failing Add some more tests for git_futils_rmdir_r and some cleanup
ad273718	2018-10-04T10:32:07	tests: sanitize file hierarchy after running rmdir tests Currently, we do not clean up after ourselves after tests in core::rmdir have created new files in the directory hierarchy. This may leave stale files and/or directories after having run tests, confusing subsequent tests that expect a pristine test environment. Most importantly, it may cause the test initialization to fail which expects being able to re-create the testing hierarchy before each test in case where another test hasn't cleaned up after itself. Fix the issue by adding a cleanup function that removes the temporary testing hierarchy after each test if it still exists.
e886ab46	2018-10-02T19:50:29	tests: Add some more tests for git_futils_rmdir_r Signed-off-by: Sven Strickroth <email@cs-ware.de>
dbb4a586	2018-10-05T10:27:33	tests: fix warning for implicit conversion of integer to pointer GCC warns by default when implicitly converting integers to pointers or the other way round, and commit fa48d2ea7 (vector: do not malloc 0-length vectors on dup, 2018-09-26) introduced such an implicit conversion into our vector tests. While this is totally fine in this test, as the pointer's value is never being used in the first place, we can trivially avoid the warning by instead just inserting a pointer for a variable allocated on the stack into the vector.
ba1cd495	2018-09-28T11:10:49	Merge pull request #4784 from tiennou/fix/warnings Some warnings
fa48d2ea	2018-09-26T19:15:35	vector: do not malloc 0-length vectors on dup
be4717d2	2018-09-18T12:12:06	path: fix "comparison always true" warning
9994cd3f	2018-06-25T11:56:52	treewide: remove use of C++ style comments C++ style comment ("//") are not specified by the ISO C90 standard and thus do not conform to it. While libgit2 aims to conform to C90, we did not enforce it until now, which is why quite a lot of these non-conforming comments have snuck into our codebase. Do a tree-wide conversion of all C++ style comments to the supported C style comments to allow us enforcing strict C90 compliance in a later commit.
ecf4f33a	2018-02-08T11:14:48	Convert usage of `git_buf_free` to new `git_buf_dispose`
e3d764a4	2018-03-29T22:14:12	tests: clarify comment
86219f40	2017-11-30T15:40:13	util: introduce `git__prefixncmp` and consolidate implementations Introduce `git_prefixncmp` that will search up to the first `n` characters of a string to see if it is prefixed by another string. This is useful for examining if a non-null terminated character array is prefixed by a particular substring. Consolidate the various implementations of `git__prefixcmp` around a single core implementation and add some test cases to validate its behavior.
e9369856	2017-03-21T00:25:15	stream: Gather streams to src/streams
08c1b8fc	2017-08-28T21:24:13	cmake: simplify some HTTPS tests
89a34828	2017-06-16T13:34:43	diff: implement function to calculate patch ID The upstream git project provides the ability to calculate a so-called patch ID. Quoting from git-patch-id(1): A "patch ID" is nothing but a sum of SHA-1 of the file diffs associated with a patch, with whitespace and line numbers ignored." Patch IDs can be used to identify two patches which are probably the same thing, e.g. when a patch has been cherry-picked to another branch. This commit implements a new function `git_diff_patchid`, which gets a patch and derives an OID from the diff. Note the different terminology here: a patch in libgit2 are the differences in a single file and a diff can contain multiple patches for different files. The implementation matches the upstream implementation and should derive the same OID for the same diff. In fact, some code has been directly derived from the upstream implementation. The upstream implementation has two different modes to calculate patch IDs, which is the stable and unstable mode. The old way of calculating the patch IDs was unstable in a sense that a different ordering the diffs was leading to different results. This oversight was fixed in git 1.9, but as git tries hard to never break existing workflows, the old and unstable way is still default. The newer and stable way does not care for ordering of the diff hunks, and in fact it is the mode that should probably be used today. So right now, we only implement the stable way of generating the patch ID.
8296da5f	2017-06-14T10:49:28	Merge pull request #4267 from mohseenrm/master adding GIT_FILTER_VERSION to GIT_FILTER_INIT as part of convention
a78441bc	2017-06-13T11:05:40	Adding git_filter_init for initializing `git_filter` struct + unit test
95170294	2017-06-13T11:08:28	tests: core: test initialization of `git_proxy_options` Initialization of the `git_proxy_options` structure is never tested anywhere. Include it in our usual initialization test in "core::structinit::compare".
8a5e7aae	2017-05-22T12:53:44	varint: fix computation for remaining buffer space When encoding varints to a buffer, we want to remain sure that the required buffer space does not exceed what is actually available. Our current check does not do the right thing, though, in that it does not honor that our `pos` variable counts the position down instead of up. As such, we will require too much memory for small varints and not enough memory for big varints. Fix the issue by correctly calculating the required size as `(sizeof(varint) - pos)`. Add a test which failed before.
417319cc	2017-04-25T10:14:37	tests: core::features: only check for HTTPS if it is supported
983979fa	2017-03-22T19:52:38	inet_pton: don't assume addr families don't exist Address family 5 might exist on some crazy system like Haiku. Use `INT_MAX-1` as an unsupported address family.
31059923	2017-03-20T12:16:18	Merge pull request #4169 from csware/absolute-symlink

f0e693b1

2021-09-07T17:53:49

str: introduce `git_str` for internal, `git_buf` is external libgit2 has two distinct requirements that were previously solved by `git_buf`. We require: 1. A general purpose string class that provides a number of utility APIs for manipulating data (eg, concatenating, truncating, etc). 2. A structure that we can use to return strings to callers that they can take ownership of. By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult. Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr"). The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.) Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.

31ecaca2

2021-09-30T08:11:40

hash: hash functions operate on byte arrays not git_oids Separate the concerns of the hash functions from the git_oid functions. The git_oid structure will need to understand either SHA1 or SHA256; the hash functions should only deal with the appropriate one of these.

2a713da1

2021-09-29T21:31:17

hash: accept the algorithm in inputs

7e7cfe8a

2021-09-26T20:20:03

buf: common_prefix takes a string array `git_strarray` is a public-facing type. Change `git_buf_text_common_prefix` to not use it, and just take an array of strings instead.

a24e656a

2021-09-04T10:16:41

common: support custom repository extensions Allow users to specify additional repository extensions that they want to support. For example, callers can specify that they support `preciousObjects` and then may open repositories that support `extensions.preciousObjects`. Similarly, callers may opt out of supporting extensions that the library itself supports.

1196de4f

2021-08-31T15:22:44

util: introduce `git__strlcmp` Introduce a utility function that compares a NUL terminated string to a possibly not-NUL terminated string with length. This is similar to `strncmp` but with an added check to ensure that the lengths match (not just the `size` portion of the two strings).

47c70fc5

2021-08-26T05:40:20

Merge remote-tracking branch 'origin/main' into cgraph-write

63f08e42

2021-08-26T05:29:34

Make the defaultable fields defaultable Also, add `git_commit_graph_writer_options_init`!

c7a195a1

2021-08-25T14:11:03

Merge pull request #6006 from boretrk/c11-warnings GCC C11 warnings

4bbe5e6e

2021-08-25T18:14:10

win32: name the dummy union in GIT_REPARSE_DATA_BUFFER Instead of buf->"typeofbuffer"ReparseBuffer the members will be referenced with buf->ReparseBuffer."typeofbuffer" https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/ns-ntifs-_reparse_data_buffer?redirectedfrom=MSDN calls the union DUMMYUNIONNAME but that looks a bit cluttered.

231ca4fa

2021-02-23T19:33:34

Proof-of-concept for a more aggressive GIT_UNUSED() This adds a `-Wunused-result`-proof `GIT_UNUSED()`, just to demonstrate that it works. With this, sortedcache.h is now completely `GIT_WARN_UNUSED_RESULT`-annotated!

9eb17d46

2021-02-16T19:38:34

Introduce GIT_WARN_UNUSED_RESULT This change adds the GIT_WARN_UNUSED_RESULT annotation, which makes the compiler warn when a return result is not used. This avoids bugs.

3f28eafe

2021-07-07T19:35:42

stdint constants in test suite Passes w/ gcc 11 on Fedora x64. Protip: So you don;t have to suffer, ``` perl -pe 's/(-?(?:0x)?[A-Fa-f0-9]+)([Uu])?[Ll][Ll]/\U$2INT64_C(\E$1)/mg' ```

d525e063

2021-05-10T23:04:59

buf: remove internal `git_buf_text` namespace The `git_buf_text` namespace is unnecessary and strange. Remove it, just keep the functions prefixed with `git_buf`.

cb136cdd

2021-04-14T22:22:11

utf8: introduce git_utf8_char_length Introduce a function to determine the number of Unicode characters in a given UTF-8 string.

d9c15387

2021-01-05T14:29:58

blob: add git_blob_filter_options_init The `git_blob_filter_options_init` function should be included, to allow callers in FFI environments to let us initialize an options structure for them.

5ab0736b

2020-12-19T09:30:26

Add tests for `git__multiply_int64_overflow` As it turns out, the implementation of `git__multiply_int64_overflow` is full of edge cases and tricky arithmetic. That means that it should have unit tests. As a result, a bug in `git__strntol64` was found (and fixed!) in clang+32-bit.

2cfa31c4

2020-04-05T18:30:07

path: remove unused git_path_topdir

6554b40e

2020-05-13T10:39:33

settings: localize global data Move the settings global data teardown into its own separate function, instead of intermingled with the global state.

cad7a1ba

2020-06-05T08:42:38

clar: include the function name

cbae1c21

2020-04-01T22:12:07

assert: allow non-int returning functions to assert Include GIT_ASSERT_WITH_RETVAL and GIT_ASSERT_ARG_WITH_RETVAL so that functions that do not return int (or more precisely, where `-1` would not be an error code) can assert. This allows functions that return, eg, NULL on an error code to do that by passing the return value (in this example, `NULL`) as a second parameter to the GIT_ASSERT_WITH_RETVAL functions.

a95096ba

2020-01-12T10:31:07

assert: optionally fall-back to assert(3) Fall back to the system assert(3) in debug builds, which may aide in debugging. "Safe" assertions can be enabled in debug builds by setting GIT_ASSERT_HARD=0. Similarly, hard assertions can be enabled in release builds by setting GIT_ASSERT_HARD to nonzero.

abe2efe1

2019-12-09T12:37:34

Introduce GIT_ASSERT macros Provide macros to replace usages of `assert`. A true `assert` is punishing as a library. Instead we should do our best to not crash. GIT_ASSERT_ARG(x) will now assert that the given argument complies to some format and sets an error message and returns `-1` if it does not. GIT_ASSERT(x) is for internal usage, and available as an internal consistency check. It will set an error message and return `-1` in the event of failure.

163db8f2

2020-02-28T18:53:22

win32: test relative symlinks Ensure that we don't canonicalize symlink targets.

7d55bee6

2020-01-10T12:44:51

win32: fix relative symlinks pointing into dirs On Windows platforms, we need some logic to emulate symlink(3P) defined by POSIX. As unprivileged symlinks on Windows are a rather new feature, our current implementation is comparatively new and still has some rough edges in special cases. One such case is relative symlinks. While relative symlinks to files in the same directory work as expected, libgit2 currently fails to create reltaive symlinks pointing into other directories. This is due to the fact that we forgot to translate the Unix-style target path to Windows-style. Most importantly, we are currently not converting directory separators from "/" to "\". Fix the issue by calling `git_win32_path_canonicalize` on the target. Add a test that verifies our ability to create such relative links across directories.

6460e8ab

2019-06-23T18:13:29

internal: use off64_t instead of git_off_t Prefer `off64_t` internally.

63307cba

2019-09-28T17:32:18

Merge pull request #5226 from pks-t/pks/regexp-api regexp: implement a new regular expression API

f585b129

2019-09-12T14:29:28

posix: remove superseded POSIX regex wrappers The old POSIX regex wrappers have been superseded by our own regexp API that provides a higher-level abstraction. Remove the POSIX wrappers in favor of the new one.

d77378eb

2019-09-13T08:54:26

regexp: implement new regular expression API We currently support a set of different regular expression backends with PCRE, PCRE2, regcomp(3P) and regcomp_l(3). The current implementation of this is done via a simple POSIX wrapper that either directly uses supplied functions or that is a very small wrapper. To support PCRE and PCRE2, we use their provided <pcreposix.h> and <pcre2posix.h> wrappers. These wrappers are implemented in such a way that the accompanying libraries pcre-posix and pcre2-posix provide the same symbols as the libc ones, namely regcomp(3P) et al. This works out on some systems just fine, most importantly on glibc-based ones, where the regular expression functions are implemented as weak aliases and thus get overridden by linking in the pcre{,2}-posix library. On other systems we depend on the linking order of libc and pcre library, and as libc always comes first we will end up with the functions of the libc implementation. As a result, we may use the structures `regex_t` and `regmatch_t` declared by <pcre{,2}posix.h>, but use functions defined by the libc, leading to segfaults. The issue is not easily solvable. Somed distributions like Debian have resolved this by patching PCRE and PCRE2 to carry custom prefixes to all the POSIX function wrappers. But this is not supported by upstream and thus inherently unportable between distributions. We could instead try to modify linking order, but this starts becoming fragile and will not work e.g. when libgit2 is loaded via dlopen(3P) or similar ways. In the end, this means that we simply cannot use the POSIX wrappers provided by the PCRE libraries at all. Thus, this commit introduces a new regular expression API. The new API is on a tad higher level than the previous POSIX abstraction layer, as it tries to abstract away any non-portable flags like e.g. REG_EXTENDED, which has no equivalents in all of our supported backends. As there are no users of POSIX regular expressions that do _not_ reguest REG_EXTENDED this is fine to be abstracted away, though. Due to the API being higher-level than before, it should generally be a tad easier to use than the previous one. Note: ideally, the new API would've been called `git_regex_foobar` with a file "regex.h" and "regex.c". Unfortunately, this is currently impossible to implement due to naming clashes between the then-existing "regex.h" and <regex.h> provided by the libc. As we add the source directory of libgit2 to the header search path, an include of <regex.h> would always find our own "regex.h". Thus, we have to take the bitter pill of adding one more character to all the functions to disambiguate the includes. To improve guarantees around cross-backend compatibility, this commit also brings along an improved regular expression test suite core::regexp.

174b7a32

2019-09-19T12:24:06

buffer: fix printing into out-of-memory buffer Before printing into a `git_buf` structure, we always call `ENSURE_SIZE` first. This macro will reallocate the buffer as-needed depending on whether the current amount of allocated bytes is sufficient or not. If `asize` is big enough, then it will just do nothing, otherwise it will call out to `git_buf_try_grow`. But in fact, it is insufficient to only check `asize`. When we fail to allocate any more bytes e.g. via `git_buf_try_grow`, then we set the buffer's pointer to `git_buf__oom`. Note that we touch neither `asize` nor `size`. So if we just check `asize > targetsize`, then we will happily let the caller of `ENSURE_SIZE` proceed with an out-of-memory buffer. As a result, we will print all bytes into the out-of-memory buffer instead, resulting in an out-of-bounds write. Fix the issue by having `ENSURE_SIZE` verify that the buffer is not marked as OOM. Add a test to verify that we're not writing into the OOM buffer.

208f1d7a

2019-09-19T12:46:37

buffer: fix infinite loop when growing buffers When growing buffers, we repeatedly multiply the currently allocated number of bytes by 1.5 until it exceeds the requested number of bytes. This has two major problems: 1. If the current number of bytes is tiny and one wishes to resize to a comparatively huge number of bytes, then we may need to loop thousands of times. 2. If resizing to a value close to `SIZE_MAX` (which would fail anyway), then we probably hit an infinite loop as multiplying the current amount of bytes will repeatedly result in integer overflows. When reallocating buffers, one typically chooses values close to 1.5 to enable re-use of resulting memory holes in later reallocations. But because of this, it really only makes sense to use a factor of 1.5 _once_, but not looping until we finally are able to fit it. Thus, we can completely avoid the loop and just opt for the much simpler algorithm of multiplying with 1.5 once and, if the result doesn't fit, just use the target size. This avoids both problems of looping extensively and hitting overflows. This commit also adds a test that would've previously resulted in an infinite loop.

8cbef12d

2019-08-08T11:52:54

util: do not perform allocations in insertsort Our hand-rolled fallback sorting function `git__insertsort_r` does an in-place sort of the given array. As elements may not necessarily be pointers, it needs a way of swapping two values of arbitrary size, which is currently implemented by allocating a temporary buffer of the element's size. This is problematic, though, as the emulated `qsort` interface doesn't provide any return values and thus cannot signal an error if allocation of that temporary buffer has failed. Convert the function to swap via a temporary buffer allocated on the stack. Like this, it can `memcpy` contents of both elements in small batches without requiring a heap allocation. The buffer size has been chosen such that in most cases, a single iteration of copying will suffice. Most importantly, it can fully contain `git_oid` structures and pointers. Add a bunch of tests for the `git__qsort_r` interface to verify nothing breaks. Furthermore, this removes the declaration of `git__insertsort_r` and makes it static as it is not used anywhere else.

50194dcd

2019-07-11T15:14:42

win32: fix symlinks to relative file targets When creating a symlink in Windows, one needs to tell Windows whether the symlink should be a file or directory symlink. To determine which flag to pass, we call `GetFileAttributesW` on the target file to see whether it is a directory and then pass the flag accordingly. The problem though is if create a symlink with a relative target path, then we will check that relative path while not necessarily being inside of the working directory where the symlink is to be created. Thus, getting its attributes will either fail or return attributes of the wrong target. Fix this by resolving the target path relative to the directory in which the symlink is to be created.

93d37a1d

2019-06-29T09:59:36

tests: core: improve symlink test coverage Add two more tests to verify that we're not deleting symlink targets, but the symlinks themselves. Furthermore, convert several `cl_skip`s on Win32 to conditional skips depending on whether the clar sandbox supports symlinks or not. Windows is grown up now and may allow unprivileged symlinks if the machine has been configured accordingly.

683ea2b0

2019-06-29T09:10:57

tests: core: add missing asserts for several function calls Several function calls to `p_stat` and `p_close` have no verification if they actually succeeded. As these functions _may_ fail and as we also want to make sure that we're not doing anything dumb, let's check them, too.

e54343a4

2019-06-29T09:17:32

fileops: rename to "futils.h" to match function signatures Our file utils functions all have a "futils" prefix, e.g. `git_futils_touch`. One would thus naturally guess that their definitions and implementation would live in files "futils.h" and "futils.c", respectively, but in fact they live in "fileops.h". Rename the files to match expectations.

c512d58f

2019-06-15T22:26:23

win32: cast WinAPI to void * before casting GetProcAddress is prototyped to return a `FARPROC`, which is meant to be a generic function pointer. It's literally `int (FAR WINAPI * FARPROC)()` which gcc complains if you attempt to cast to a `void (*)(GIT_SRWLOCK *)`. Cast to a `void *` before casting to avoid warnings about the arguments.

fef847ae

2019-06-15T15:47:41

Merge pull request #5110 from pks-t/pks/wildmatch Replace fnmatch with wildmatch

a9f57629

2019-06-13T15:03:00

wildmatch: import wildmatch from git.git In commit 70a8fc999d (stop using fnmatch (either native or compat), 2014-02-15), upstream git has switched over all code from their internal fnmatch copy to its new wildmatch code. We haven't followed suit, and thus have developed some incompatibilities in how we match regular expressions. Import git's wildmatch from v2.22.0 and add a test suite based on their t3070-wildmatch.sh tests.

2d85c7e8

2019-06-14T14:12:19

posix: remove `p_fallocate` abstraction By now, we have repeatedly failed to provide a nice cross-platform implementation of `p_fallocate`. Recent tries to do that escalated quite fast to a set of different CMake checks, implementations, fallbacks, etc., which started to look real awkward to maintain. In fact, `p_fallocate` had only been introduced in commit 4e3949b73 (tests: test that largefiles can be read through the tree API, 2019-01-30) to support a test with large files, but given the maintenance costs it just seems not to be worht it. As we have removed the sole user of `p_fallocate` in the previous commit, let's drop it altogether.

c0dd7122

2019-06-06T16:48:04

apply: add an options struct initializer

0b5ba0d7

2019-06-06T16:36:23

Rename opt init functions to `options_init` In libgit2 nomenclature, when we need to verb a direct object, we name a function `git_directobject_verb`. Thus, if we need to init an options structure named `git_foo_options`, then the name of the function that does that should be `git_foo_options_init`. The previous names of `git_foo_init_options` is close - it _sounds_ as if it's initializing the options of a `foo`, but in fact `git_foo_options` is its own noun that should be respected. Deprecate the old names; they'll now call directly to the new ones.

09902985

2019-01-13T21:12:10

core::posix: skip some locale tests on win32

8877d7d3

2019-01-13T02:08:43

tests: regcomp: use proper character classes The '[[:digit:]]' and '[[:alpha:]]' classes require double brackets, not single.

ca1b07a2

2019-01-13T02:05:58

tests: regcomp: test that regex functions succeed The regex functions return nonzero (not necessarily negative values) on failure.

aea9a712

2018-03-02T15:12:14

tests: regcomp: assert character groups do match normal alphabet In order to avoid us being unable to match characters which are part of the normal US alphabet in certain weird languages, add two tests to catch this behavior.

e207b2a2

2018-03-02T15:09:20

tests: regex: restructure setup of locales In order to make it easier adding more locale-related tests, add a generalized framework handling initial setup of languages as well as the cleanup of them afterwards.

b055a6b5

2019-01-13T01:24:39

tests: regex: add test with LC_COLLATE being set While we already have a test for `p_regexec` with `LC_CTYPE` being modified, `regexec` also alters behavior as soon as `LC_COLLATE` is being modified. Most importantly, `LC_COLLATE` changes the way how ranges are interpreted to just not handling them at all. Thus, ensure that either we use `regcomp_l` to avoid this, or that we've fallen back to our builtin regex functionality which also behaves properly.

ad4ede91

2018-03-02T13:51:57

tests: fix p_regcomp test not checking return type While the test asserts that the error value indcates a non-value, it is actually never getting assigned to. Fix this.

02683b20

2019-01-12T23:06:39

regexec: prefix all regexec function calls with p_ Prefix all the calls to the the regexec family of functions with `p_`. This allows us to swap out all the regular expression functions with our own implementation. Move the declarations to `posix_regex.h` for simpler inclusion.

aeea1c46

2019-04-04T15:06:44

Merge pull request #4874 from tiennou/test/4615 Test that largefiles can be read through the tree API

0345a380

2019-02-22T14:39:08

p_fallocate: add a test for our implementation

bd66925a

2018-12-01T10:29:32

oidmap: remove legacy low-level interface Remove the low-level interface that was exposing implementation details of `git_oidmap` to callers. From now on, only the high-level functions shall be used to retrieve or modify values of a map. Adjust remaining existing callers.

fdfabdc4

2018-12-01T09:49:10

strmap: remove legacy low-level interface Remove the low-level interface that was exposing implementation details of `git_strmap` to callers. From now on, only the high-level functions shall be used to retrieve or modify values of a map. Adjust remaining existing callers.

18cf5698

2018-12-01T09:37:40

maps: provide high-level iteration interface Currently, our headers need to leak some implementation details of maps due to their direct use of indices in the implementation of their foreach macros. This makes it impossible to completely hide the map structures away, and also makes it impossible to include the khash implementation header in the C files of the respective map only. This is now being fixed by providing a high-level iteration interface `map_iterate`, which takes as inputs the map that shall be iterated over, an iterator as well as the locations where keys and values shall be put into. For simplicity's sake, the iterator is a simple `size_t` that shall initialized to `0` on the first call. All existing foreach macros are then adjusted to make use of this new function.

2e0a3048

2019-01-23T10:48:55

oidmap: introduce high-level setter for key/value pairs Currently, one would use either `git_oidmap_insert` to insert key/value pairs into a map or `git_oidmap_put` to insert a key only. These function have historically been macros, which is why their syntax is kind of weird: instead of returning an error code directly, they instead have to be passed a pointer to where the return value shall be stored. This does not match libgit2's common idiom of directly returning error codes.Furthermore, `git_oidmap_put` is tightly coupled with implementation details of the map as it exposes the index of inserted entries. Introduce a new function `git_oidmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert all trivial callers of `git_oidmap_insert` and `git_oidmap_put` to make use of it.

9694ef20

2018-12-17T09:01:53

oidmap: introduce high-level getter for values The current way of looking up an entry from a map is tightly coupled with the map implementation, as one first has to look up the index of the key and then retrieve the associated value by using the index. As a caller, you usually do not care about any indices at all, though, so this is more complicated than really necessary. Furthermore, it invites for errors to happen if the correct error checking sequence is not being followed. Introduce a new high-level function `git_oidmap_get` that takes a map and a key and returns a pointer to the associated value if such a key exists. Otherwise, a `NULL` pointer is returned. Adjust all callers that can trivially be converted.

03555830

2019-01-23T10:44:33

strmap: introduce high-level setter for key/value pairs Currently, one would use the function `git_strmap_insert` to insert key/value pairs into a map. This function has historically been a macro, which is why its syntax is kind of weird: instead of returning an error code directly, it instead has to be passed a pointer to where the return value shall be stored. This does not match libgit2's common idiom of directly returning error codes. Introduce a new function `git_strmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert all callers of `git_strmap_insert` to make use of it.

ef507bc7

2019-01-23T10:44:02

strmap: introduce `git_strmap_get` and use it throughout the tree The current way of looking up an entry from a map is tightly coupled with the map implementation, as one first has to look up the index of the key and then retrieve the associated value by using the index. As a caller, you usually do not care about any indices at all, though, so this is more complicated than really necessary. Furthermore, it invites for errors to happen if the correct error checking sequence is not being followed. Introduce a new high-level function `git_strmap_get` that takes a map and a key and returns a pointer to the associated value if such a key exists. Otherwise, a `NULL` pointer is returned. Adjust all callers that can trivially be converted.

7e926ef3

2018-11-30T12:14:43

maps: provide a uniform entry count interface There currently exist two different function names for getting the entry count of maps, where offmaps offset and string maps use `num_entries` and OID maps use `size`. In most programming languages with built-in map types, this is simply called `size`, which is also shorter to type. Thus, this commit renames the other two functions `num_entries` to match the common way and adjusts all callers.

351eeff3

2019-01-23T10:42:46

maps: use uniform lifecycle management functions Currently, the lifecycle functions for maps (allocation, deallocation, resize) are not named in a uniform way and do not have a uniform function signature. Rename the functions to fix that, and stick to libgit2's naming scheme of saying `git_foo_new`. This results in the following new interface for allocation: - `int git_<t>map_new(git_<t>map **out)` to allocate a new map, returning an error code if we ran out of memory - `void git_<t>map_free(git_<t>map *map)` to free a map - `void git_<t>map_clear(git<t>map *map)` to remove all entries from a map This commit also fixes all existing callers.

3fba5891

2019-01-20T23:53:33

test: cast to a char the zstream test

9c5e05ad

2019-01-23T10:43:29

deprecation: move deprecated tests into their own file Move the deprecated stream tests into their own compilation unit. This will allow us to disable any preprocessor directives that apply to deprecation just for these tests (eg, disabling `GIT_DEPRECATED_HARD`).

f673e232

2018-12-27T13:47:34

git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.

b5e8272f

2019-01-06T08:29:56

Attempt at fixing the MingW64 compilation It seems like MingW64's size_t is defined differently than in Linux.

487233fa

2018-11-29T07:21:41

Merge pull request #4895 from pks-t/pks/unused-warnings Unused function warnings

02bb39f4

2018-11-22T08:49:09

stream registration: take an enum type Accept an enum (`git_stream_t`) during custom stream registration that indicates whether the registration structure should be used for standard (non-TLS) streams or TLS streams.

df2cc108

2018-11-18T10:29:07

stream: provide generic registration API Update the new stream registration API to be `git_stream_register` which takes a registration structure and a TLS boolean. This allows callers to register non-TLS streams as well as TLS streams. Provide `git_stream_register_tls` that takes just the init callback for backward compatibliity.

43b592ac

2018-10-25T08:49:01

tls: introduce a wrap function Introduce `git_tls_stream_wrap` which will take an existing `stream` with an already connected socket and begin speaking TLS on top of it. This is useful if you've built a connection to a proxy server and you wish to begin CONNECT over it to tunnel a TLS connection. Also update the pluggable TLS stream layer so that it can accept a registration structure that provides an `init` and `wrap` function, instead of a single initialization function.

852bc9f4

2018-11-23T19:26:24

khash: remove intricate knowledge of khash types Instead of using the `khiter_t`, `git_strmap_iter` and `khint_t` types, simply use `size_t` instead. This decouples code from the khash stuff and makes it possible to move the khash includes into the implementation files.

4209a512

2018-11-14T12:04:42

strntol: fix out-of-bounds reads when parsing numbers with leading sign When parsing a number, we accept a leading plus or minus sign to return a positive or negative number. When the parsed string has such a leading sign, we set up a flag indicating that the number is negative and advance the pointer to the next character in that string. This misses updating the number of bytes in the string, though, which is why the parser may later on do an out-of-bounds read. Fix the issue by correctly updating both the pointer and the number of remaining bytes. Furthermore, we need to check whether we actually have any bytes left after having advanced the pointer, as otherwise the auto-detection of the base may do an out-of-bonuds access. Add a test that detects the out-of-bound read. Note that this is not actually security critical. While there are a lot of places where the function is called, all of these places are guarded or irrelevant: - commit list: this operates on objects from the ODB, which are always NUL terminated any may thus not trigger the off-by-one OOB read. - config: the configuration is NUL terminated. - curl stream: user input is being parsed that is always NUL terminated - index: the index is read via `git_futils_readbuffer`, which always NUL terminates it. - loose objects: used to parse the length from the object's header. As we check previously that the buffer contains a NUL byte, this is safe. - rebase: this parses numbers from the rebase instruction sheet. As the rebase code uses `git_futils_readbuffer`, the buffer is always NUL terminated. - revparse: this parses a user provided buffer that is NUL terminated. - signature: this parser the header information of objects. As objects read from the ODB are always NUL terminated, this is a non-issue. The constructor `git_signature_from_buffer` does not accept a length parameter for the buffer, so the buffer needs to be NUL terminated, as well. - smart transport: the buffer that is parsed is NUL terminated - tree cache: this parses the tree cache from the index extension. The index itself is read via `git_futils_readbuffer`, which always NUL terminates it. - winhttp transport: user input is being parsed that is always NUL terminated

50d09407

2018-10-29T18:05:27

strntol: fix detection and skipping of base prefixes The `git__strntol` family of functions has the ability to auto-detect a number's base if the string has either the common '0x' prefix for hexadecimal numbers or '0' prefix for octal numbers. The detection of such prefixes and following handling has two major issues though that are being fixed in one go now. - We do not do any bounds checking previous to verifying the '0x' base. While we do verify that there is at least one digit available previously, we fail to verify that there are two digits available and thus may do an out-of-bounds read when parsing this two-character-prefix. - When skipping the prefix of such numbers, we only update the pointer length without also updating the number of remaining bytes. Thus if we try to parse a number '0x1' of total length 3, we will first skip the first two bytes and then try to read 3 bytes starting at '1'. Fix both issues by disentangling the logic. Instead of doing the detection and skipping of such prefixes in one go, we will now first try to detect the base while also honoring how many bytes are left. Only if we have a valid base that is either 8 or 16 and have one of the known prefixes, we will now advance the pointer and update the remaining bytes in one step. Add some tests that verify that no out-of-bounds parsing happens and that autodetection works as advertised.

41863a00

2018-10-29T17:19:58

strntol: fix out-of-bounds read when skipping leading spaces The `git__strntol` family of functions accepts leading spaces and will simply skip them. The skipping will not honor the provided buffer's length, though, which may lead it to read outside of the provided buffer's bounds if it is not a simple NUL-terminated string. Furthermore, if leading space is trimmed, the function will further advance the pointer but not update the number of remaining bytes, which may also lead to out-of-bounds reads. Fix the issue by properly paying attention to the buffer length and updating it when stripping leading whitespace characters. Add a test that verifies that we won't read past the provided buffer length.

623647af

2018-10-26T12:33:59

Merge pull request #4864 from pks-t/pks/object-parse-fixes Object parse fixes

83e8a6b3

2018-10-18T16:08:46

util: provide `git__memmem` function Unfortunately, neither the `memmem` nor the `strnstr` functions are part of any C standard but are merely extensions of C that are implemented by e.g. glibc. Thus, there is no standardized way to search for a string in a block of memory with a limited size, and using `strstr` is to be considered unsafe in case where the buffer has not been sanitized. In fact, there are some uses of `strstr` in exactly that unsafe way in our codebase. Provide a new function `git__memmem` that implements the `memmem` semantics. That is in a given haystack of `n` bytes, search for the occurrence of a byte sequence of `m` bytes and return a pointer to the first occurrence. The implementation chosen is the "Not So Naive" algorithm from [1]. It was chosen as the implementation is comparably simple while still being reasonably efficient in most cases. Preprocessing happens in constant time and space, searching has a time complexity of O(n*m) with a slightly sub-linear average case. [1]: http://www-igm.univ-mlv.fr/~lecroq/string/

ea19efc1

2018-10-18T15:08:56

util: fix out of bounds read in error message When an integer that is parsed with `git__strntol32` is too big to fit into an int32, we will generate an error message that includes the actual string that failed to parse. This does not acknowledge the fact that the string may either not be NUL terminated or alternative include additional characters after the number that is to be parsed. We may thus end up printing characters into the buffer that aren't the number or, worse, read out of bounds. Fix the issue by utilizing the `endptr` that was set by `git__strntol64`. This pointer is guaranteed to be set to the first character following the number, and we can thus use it to compute the width of the number that shall be printed. Create a test to verify that we correctly truncate the number.

39087ab8

2018-10-18T12:11:33

tests: core::strtol: test for some more edge-cases Some edge cases were currently completely untested, e.g. parsing numbers greater than INT64_{MIN,MAX}, truncating buffers by length and invalid characters. Add tests to verify that the system under test performs as expected.

8d7fa88a

2018-10-18T12:04:07

util: remove `git__strtol32` The function `git__strtol32` can easily be misused when untrusted data is passed to it that may not have been sanitized with trailing `NUL` bytes. As all usages of this function have now been removed, we can remove this function altogether to avoid future misuse of it.

68deb2cc

2018-10-18T11:37:10

util: remove unsafe `git__strtol64` function The function `git__strtol64` does not take a maximum buffer length as parameter. This has led to some unsafe usages of this function, and as such we may consider it as being unsafe to use. As we have now eradicated all usages of this function, let's remove it completely to avoid future misuse.

838a2f29

2018-10-07T12:00:48

Merge pull request #4828 from csware/git_futils_rmdir_r_failing Add some more tests for git_futils_rmdir_r and some cleanup

ad273718

2018-10-04T10:32:07

tests: sanitize file hierarchy after running rmdir tests Currently, we do not clean up after ourselves after tests in core::rmdir have created new files in the directory hierarchy. This may leave stale files and/or directories after having run tests, confusing subsequent tests that expect a pristine test environment. Most importantly, it may cause the test initialization to fail which expects being able to re-create the testing hierarchy before each test in case where another test hasn't cleaned up after itself. Fix the issue by adding a cleanup function that removes the temporary testing hierarchy after each test if it still exists.

e886ab46

2018-10-02T19:50:29

tests: Add some more tests for git_futils_rmdir_r Signed-off-by: Sven Strickroth <email@cs-ware.de>

dbb4a586

2018-10-05T10:27:33

tests: fix warning for implicit conversion of integer to pointer GCC warns by default when implicitly converting integers to pointers or the other way round, and commit fa48d2ea7 (vector: do not malloc 0-length vectors on dup, 2018-09-26) introduced such an implicit conversion into our vector tests. While this is totally fine in this test, as the pointer's value is never being used in the first place, we can trivially avoid the warning by instead just inserting a pointer for a variable allocated on the stack into the vector.

ba1cd495

2018-09-28T11:10:49

Merge pull request #4784 from tiennou/fix/warnings Some warnings

fa48d2ea

2018-09-26T19:15:35

vector: do not malloc 0-length vectors on dup

be4717d2

2018-09-18T12:12:06

path: fix "comparison always true" warning

9994cd3f

2018-06-25T11:56:52

treewide: remove use of C++ style comments C++ style comment ("//") are not specified by the ISO C90 standard and thus do not conform to it. While libgit2 aims to conform to C90, we did not enforce it until now, which is why quite a lot of these non-conforming comments have snuck into our codebase. Do a tree-wide conversion of all C++ style comments to the supported C style comments to allow us enforcing strict C90 compliance in a later commit.

ecf4f33a

2018-02-08T11:14:48

Convert usage of `git_buf_free` to new `git_buf_dispose`

e3d764a4

2018-03-29T22:14:12

tests: clarify comment

86219f40

2017-11-30T15:40:13

util: introduce `git__prefixncmp` and consolidate implementations Introduce `git_prefixncmp` that will search up to the first `n` characters of a string to see if it is prefixed by another string. This is useful for examining if a non-null terminated character array is prefixed by a particular substring. Consolidate the various implementations of `git__prefixcmp` around a single core implementation and add some test cases to validate its behavior.

e9369856

2017-03-21T00:25:15

stream: Gather streams to src/streams

08c1b8fc

2017-08-28T21:24:13

cmake: simplify some HTTPS tests

89a34828

2017-06-16T13:34:43

diff: implement function to calculate patch ID The upstream git project provides the ability to calculate a so-called patch ID. Quoting from git-patch-id(1): A "patch ID" is nothing but a sum of SHA-1 of the file diffs associated with a patch, with whitespace and line numbers ignored." Patch IDs can be used to identify two patches which are probably the same thing, e.g. when a patch has been cherry-picked to another branch. This commit implements a new function `git_diff_patchid`, which gets a patch and derives an OID from the diff. Note the different terminology here: a patch in libgit2 are the differences in a single file and a diff can contain multiple patches for different files. The implementation matches the upstream implementation and should derive the same OID for the same diff. In fact, some code has been directly derived from the upstream implementation. The upstream implementation has two different modes to calculate patch IDs, which is the stable and unstable mode. The old way of calculating the patch IDs was unstable in a sense that a different ordering the diffs was leading to different results. This oversight was fixed in git 1.9, but as git tries hard to never break existing workflows, the old and unstable way is still default. The newer and stable way does not care for ordering of the diff hunks, and in fact it is the mode that should probably be used today. So right now, we only implement the stable way of generating the patch ID.

8296da5f

2017-06-14T10:49:28

Merge pull request #4267 from mohseenrm/master adding GIT_FILTER_VERSION to GIT_FILTER_INIT as part of convention

a78441bc

2017-06-13T11:05:40

Adding git_filter_init for initializing `git_filter` struct + unit test

95170294

2017-06-13T11:08:28

tests: core: test initialization of `git_proxy_options` Initialization of the `git_proxy_options` structure is never tested anywhere. Include it in our usual initialization test in "core::structinit::compare".

8a5e7aae

2017-05-22T12:53:44

varint: fix computation for remaining buffer space When encoding varints to a buffer, we want to remain sure that the required buffer space does not exceed what is actually available. Our current check does not do the right thing, though, in that it does not honor that our `pos` variable counts the position down instead of up. As such, we will require too much memory for small varints and not enough memory for big varints. Fix the issue by correctly calculating the required size as `(sizeof(varint) - pos)`. Add a test which failed before.

417319cc

2017-04-25T10:14:37

tests: core::features: only check for HTTPS if it is supported

983979fa

2017-03-22T19:52:38

inet_pton: don't assume addr families don't exist Address family 5 might exist on some crazy system like Haiku. Use `INT_MAX-1` as an unsupported address family.

31059923

2017-03-20T12:16:18

Merge pull request #4169 from csware/absolute-symlink

thodg/libgit2/tests/core

tests/core

Log