kmx git

Commit	Date	Message
b30dab8f	2019-07-11T12:10:48	apply: refactor to use a switch statement
001d76e1	2019-07-11T11:34:40	diff: ignore EOFNL for computing patch IDs The patch ID is supposed to be mostly context-insignificant and thus only includes added or deleted lines. As such, we shouldn't honor end-of-file-without-newline markers in diffs. Ignore such lines to fix how we compute the patch ID for such diffs.
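The idea behind 001d76e1 can be sketched as follows. This is a minimal illustration, not libgit2's implementation: the function names are hypothetical, FNV-1a stands in for the real hash, and the input is assumed to contain only hunk lines (no `---`/`+++` file headers).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fold one line into a running FNV-1a hash
 * (a stand-in for the SHA-1 used by real patch IDs). */
static uint32_t toy_fold_line(uint32_t h, const char *line, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= (unsigned char)line[i];
		h *= 16777619u;
	}
	return h;
}

/*
 * Compute a toy "patch ID" over hunk lines held in memory. Only added
 * ('+') and deleted ('-') lines contribute. Context lines, hunk headers
 * and "\ No newline at end of file" markers are skipped.
 */
uint32_t toy_patch_id(const char *diff)
{
	uint32_t h = 2166136261u;
	const char *p = diff;

	assert(diff != NULL);

	while (*p) {
		const char *eol = strchr(p, '\n');
		size_t len = eol ? (size_t)(eol - p) : strlen(p);

		if (len > 0 && p[0] == '\\') {
			/* EOFNL marker: deliberately ignored */
		} else if (len > 0 && (p[0] == '+' || p[0] == '-')) {
			h = toy_fold_line(h, p, len);
		}

		p = eol ? eol + 1 : p + len;
	}
	return h;
}
```

With this sketch, two diffs that differ only in a trailing EOFNL marker or in their context lines produce the same ID.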
2ba7020f	2019-06-27T09:23:59	config_file: avoid re-reading files on write When we rewrite the configuration file due to any of its values being modified, we call `config_refresh` to update the in-memory representation of our config file backend. This is needlessly wasteful though, as `config_refresh` will always open the on-disk representation to read the file contents while we already know the complete file contents at this point in time as we have just written it to disk. Implement a new function `config_refresh_from_buffer` that will refresh the backend's config entries from a buffer instead of from the config file itself. Note that this will thus _not_ update the backend's timestamp, which will cause us to re-read the buffer when performing a read operation on it. But this is still an improvement as we now lazily re-read the contents, and most importantly we will avoid constantly re-reading the contents if we perform multiple write operations. The following strace demonstrates this if we're re-writing a key multiple times. 
It uses our config example with `config_set` changed to update the file 10 times with different keys: $ strace lg2 config x.x z \|& grep '^open.config' open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/home/pks/.config/git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", 
O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 And now with the optimization of `config_refresh_from_buffer`: $ strace lg2 config x.x z \|& grep '^open.config' open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/home/pks/.config/git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", 
O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 As can be seen, the number of `open` calls drops considerably.
a0dc3027	2019-06-27T08:54:51	config_file: split out function that sets config entries Updating a config file backend's config entries is a bit more involved, as it requires clearing of the old config entries as well as handling locking correctly. As we will need this functionality in a future patch to refresh config entries from a buffer, let's extract this into its own function `config_set_entries`.
985f5cdf	2019-06-27T08:41:16	config_file: split out function that reads entries from a buffer The `config_read` function currently performs both reading the on-disk config file as well as parsing the retrieved buffer contents. To optimize how we refresh our config entries from an in-memory buffer, we need to be able to directly parse buffers, though, without involving any on-disk files at all. Extract a new function `config_read_buffer` that sets up the parsing logic and then parses config entries from a buffer, only. Have `config_read` use it to avoid duplicated logic.
3e1c137a	2019-06-27T08:24:21	config_file: move refresh into `write` function We are quite lazy in how we refresh our config file backend when updating any of its keys: instead of just updating our in-memory representation of the keys, we just discard the old set of keys and then re-read the config file contents from disk. This refresh currently happens separately at every callsite of `config_write`, but it is clear that we _always_ want to refresh if we have written the config file to disk. If we didn't, then we'd run around with an outdated config file backend that does not represent what we have on disk. By moving the refresh into `config_write`, we are also able to optimize the case where the config file is currently locked. Before, we would've tried to re-read the file even if we have only updated its cached contents without touching the on-disk file. Thus we'd have unnecessarily stat'd the file, even though we know that it shouldn't have been modified in the meantime due to its lock.
d7f58eab	2019-06-21T11:55:21	config_file: implement stat cache to avoid repeated rehashing To decide whether a config file has changed, we always hash its complete contents. This is unnecessarily expensive, as well-behaved filesystems will always update stat information for files which have changed. So before computing the hash, we should first check whether the stat info has actually changed for either the configuration file or any of its includes. This avoids having to re-read the configuration file and its includes every time when we check whether it's been modified. Tracing the for-each-ref example previous to this commit, one can see that we repeatedly re-open both the repo configuration as well as the global configuration: $ strace lg2 for-each-ref \|& grep config access("/home/pks/.gitconfig", F_OK) = -1 ENOENT (No such file or directory) access("/home/pks/.config/git/config", F_OK) = 0 access("/etc/gitconfig", F_OK) = -1 ENOENT (No such file or directory) stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 access("/tmp/repo/.git/config", F_OK) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffd15c05290) = -1 ENOENT (No such file or directory) access("/home/pks/.gitconfig", F_OK) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 access("/home/pks/.config/git/config", F_OK) = 0 stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffd15c051f0) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 
open("/home/pks/.config/git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffd15c05090) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffd15c05090) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffd15c05090) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY\|O_CLOEXEC) = 3 With the change, we only do stats for those files and open them a single time, only: $ strace lg2 for-each-ref \|& grep config access("/home/pks/.gitconfig", F_OK) = -1 ENOENT (No such file or directory) access("/home/pks/.config/git/config", F_OK) = 0 access("/etc/gitconfig", F_OK) = -1 ENOENT (No such file or directory) stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 access("/tmp/repo/.git/config", F_OK) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffe70540d20) = -1 ENOENT (No such file or directory) access("/home/pks/.gitconfig", F_OK) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", 
{st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 access("/home/pks/.config/git/config", F_OK) = 0 stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 stat("/home/pks/.gitconfig", 0x7ffe70540ca0) = -1 ENOENT (No such file or directory) stat("/home/pks/.gitconfig", 0x7ffe70540c80) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 stat("/home/pks/.gitconfig", 0x7ffe70540b40) = -1 ENOENT (No such file or directory) stat("/home/pks/.gitconfig", 0x7ffe70540b20) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 stat("/home/pks/.gitconfig", 0x7ffe70540b40) = -1 ENOENT (No such file or directory) stat("/home/pks/.gitconfig", 0x7ffe70540b20) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 stat("/home/pks/.gitconfig", 0x7ffe70540b40) = -1 ENOENT (No such file or directory) stat("/home/pks/.gitconfig", 0x7ffe70540b20) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 The following benchmark has been performed with and without the stat cache in a best-of-ten run: ``` int lg2_repro(git_repository *repo, int argc, char **argv) { git_config *cfg; int32_t dummy; int i; UNUSED(argc); UNUSED(argv); check_lg2(git_repository_config(&cfg, repo), "Could not obtain config", NULL); for (i = 1; i < 100000; ++i) 
git_config_get_int32(&dummy, cfg, "foo.bar"); git_config_free(cfg); return 0; } ``` Without stat cache: $ time lg2 repro real 0m1.528s user 0m0.568s sys 0m0.944s With stat cache: $ time lg2 repro real 0m0.526s user 0m0.268s sys 0m0.258s This benchmark shows a nearly three-fold performance improvement. This change requires that we check our configuration stress tests as we're now in fact becoming more racy. If somebody is writing a configuration file at nearly the same time (there is a window of 100ns on Windows-based systems), then it might be that we do not realize that this file has actually changed and thus may not re-read it. This will only happen if either an external process is rewriting the configuration file or if the same process has multiple `git_config` structures pointing to the same file, where one of them is being used to write and the other one is used to read values.
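The stat-before-hash check described in d7f58eab might look roughly like the sketch below. It assumes a POSIX system; `struct toy_stat_cache` and `toy_stat_cache_changed` are hypothetical names, and the fields compared here are an illustrative choice rather than the set libgit2 actually uses.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical cache entry: the stat fields compared between checks. */
struct toy_stat_cache {
	int valid;
	off_t size;
	time_t mtime;
	ino_t ino;
};

/*
 * Return 1 if `path` looks modified since the last call (and update the
 * cache), 0 if its stat data is unchanged, or -1 if stat() fails.
 * Only when this returns 1 would a caller go on to re-read and re-hash
 * the file contents.
 */
int toy_stat_cache_changed(struct toy_stat_cache *cache, const char *path)
{
	struct stat st;

	assert(cache != NULL && path != NULL);

	if (stat(path, &st) < 0)
		return -1;

	if (cache->valid &&
	    cache->size == st.st_size &&
	    cache->mtime == st.st_mtime &&
	    cache->ino == st.st_ino)
		return 0;

	cache->valid = 1;
	cache->size = st.st_size;
	cache->mtime = st.st_mtime;
	cache->ino = st.st_ino;
	return 1;
}
```

As the commit message notes, a check like this trades accuracy for speed: a same-size rewrite that lands within the timestamp granularity would go unnoticed.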
d0868646	2019-06-21T11:43:09	config: use `git_config_file` in favor of `struct config_file`
398412cc	2019-07-05T11:56:16	Merge pull request #5143 from libgit2/ethomson/warnings ci: build with ENABLE_WERROR on Windows
2f14c4fc	2019-06-28T14:39:20	w32_stack: convert buffer length param to `size_t` In both `git_win32__stack_format` and `git_win32__stack`, we handle buffer lengths via an integer variable. As we only ever pass buffer sizes to it, this should be a `size_t` though to avoid loss of precision. As we also use it to compare with other `size_t` variables, this also silences signed/unsigned comparison warnings.
1bbec26d	2019-07-04T11:41:21	attr_file: completely initialize attribute sessions The function `git_attr_session__init` currently only sets up the attribute session's key by incrementing the repo-global key by one. Most notably, all other members of the `git_attr_session` struct are not getting initialized at all. So if one is to allocate a session on the stack and then calls `git_attr_session__init`, the session will still not be fully initialized. We have fared just fine with that until now as all users of the function have allocated the session structure as part of bigger structs with `calloc`, and thus its contents have been zero-initialized implicitly already. Fix this by explicitly zeroing out the session to enable allocation of sessions on the stack.
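A minimal sketch of the fix described in 1bbec26d, using hypothetical stand-in structs (`toy_repo`, `toy_session`) rather than the real `git_attr_session` layout:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the repository and session structs. */
struct toy_repo {
	unsigned int attr_session_key;
};

struct toy_session {
	unsigned int key;
	unsigned int init_setup : 1;
	char scratch[32];
};

/*
 * Zero the whole session before assigning the key, so that a session
 * living on the stack (and therefore holding garbage) ends up fully
 * initialized, not just its key member.
 */
int toy_session_init(struct toy_session *session, struct toy_repo *repo)
{
	assert(session != NULL && repo != NULL);

	memset(session, 0, sizeof(*session));
	session->key = ++repo->attr_session_key;
	return 0;
}
```

Without the `memset`, only callers that obtained the struct from `calloc` would see zeroed members.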
18a6d9f3	2019-06-29T16:19:08	attr: Don't fail in attr_setup if there exists a system attributes file Regression introduced in commit 5452e49fce21f726bec19519da7f012e3f19e736 on PR #4967. Signed-off-by: Sven Strickroth <email@cs-ware.de>
7fd3f32b	2019-06-27T13:54:55	hash: fix missing error return on production builds When no hash algorithm has been initialized in a given hash context, then we will simply `assert` and not return a value at all. This works just fine in debug builds, but on non-debug builds the assert will be converted to a no-op and thus we do not have a proper return value. Fix this by returning an error code in addition to the asserts.
e9102def	2019-06-27T11:38:04	Merge pull request #4438 from pks-t/pks/hash-algorithm Multiple hash algorithms
501c51b2	2019-06-26T14:49:50	repo: commondir resolution can sometimes fall back to the repodir For example, https://git-scm.com/docs/gitrepository-layout says: info Additional information about the repository is recorded in this directory. This directory is ignored if $GIT_COMMON_DIR is set and "$GIT_COMMON_DIR/info" will be used instead. So when looking for `info/attributes`, we need to check the commondir first, or fall back to "our" `info/attributes`.
9f723c97	2019-06-26T14:49:37	docs: fixups
b883d370	2019-06-26T14:49:30	ignore: fix a missing commondir causing failures As with the preceding commit, the ignore code tries to load code from info/exclude, and we fail to ignore a non-existent file here.
82c7a9bc	2019-06-26T14:49:24	attr: fix attribute lookup if repo has no common directory If creating a repository without a common directory (e.g. by using `git_repository_new`), then `git_repository_item_path` will return `GIT_ENOTFOUND` for every file that's usually located in this directory. While we do not care for this case when looking up the "info/attributes" file, we fail to properly ignore these errors when setting up or collecting attributes files. Thus, the gitattributes lookup is broken and will only ever return `GIT_ENOTFOUND`. Fix this issue by properly ignoring `GIT_ENOTFOUND` returned by `git_repository_item_path`.
5452e49f	2019-06-26T14:49:17	attr: refactor setup to match current coding style The code in the `attr_setup` function is not really matching our current coding style. Besides alignment issues, it's also hard to see what functions calls depend on one another because they're split up over multiple conditional statements. Fix these issues by grouping together dependent function calls and adjusting the alignment.
f48cf5b3	2019-06-25T14:46:31	w32_stack: treat a len as a size_t
b7187ed7	2019-02-22T14:38:31	hash: add ability to distinguish algorithms Create an enum that allows us to distinguish between different hashing algorithms. This enum is embedded into each `git_hash_ctx` and will instruct the code to which hashing function the particular request shall be dispatched. As we do not yet have multiple hashing algorithms, we simply initialize the hash algorithm to always be SHA1. At a later point, we will have to extend the `git_hash_init_ctx` function to get as parameter which algorithm shall be used.
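The enum-plus-union dispatch described in b7187ed7 and the surrounding hash commits can be sketched like this. All `toy_` names are hypothetical, and 32-bit FNV-1a stands in for SHA1 to keep the example short; it is not how libgit2 hashes objects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical algorithm selector embedded in every context. */
typedef enum {
	TOY_HASH_ALGO_UNKNOWN = 0,
	TOY_HASH_ALGO_FNV1A
} toy_hash_algo_t;

/* Algorithm-specific state (stand-in for a SHA1-specific context). */
typedef struct {
	uint32_t state;
} toy_fnv1a_ctx;

/* Generic context: a union of backends plus the selector. */
typedef struct {
	union {
		toy_fnv1a_ctx fnv1a;
	} ctx;
	toy_hash_algo_t algo;
} toy_hash_ctx;

int toy_hash_init(toy_hash_ctx *c, toy_hash_algo_t algo)
{
	assert(c != NULL);
	c->algo = algo;

	switch (algo) {
	case TOY_HASH_ALGO_FNV1A:
		c->ctx.fnv1a.state = 2166136261u;
		return 0;
	default:
		/* Return an error instead of relying on assert() alone. */
		return -1;
	}
}

int toy_hash_update(toy_hash_ctx *c, const void *data, size_t len)
{
	const unsigned char *p = data;
	size_t i;

	assert(c != NULL);

	switch (c->algo) {
	case TOY_HASH_ALGO_FNV1A:
		for (i = 0; i < len; i++) {
			c->ctx.fnv1a.state ^= p[i];
			c->ctx.fnv1a.state *= 16777619u;
		}
		return 0;
	default:
		return -1;
	}
}

int toy_hash_final(uint32_t *out, toy_hash_ctx *c)
{
	assert(out != NULL && c != NULL);

	switch (c->algo) {
	case TOY_HASH_ALGO_FNV1A:
		*out = c->ctx.fnv1a.state;
		return 0;
	default:
		return -1;
	}
}
```

Adding a second algorithm would then mean one more enum value, one more union member and one more `case` per function, without touching callers of the generic API.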
8832172e	2019-02-22T14:32:40	hash: move SHA1 implementations to their own hashing context Create a separate `git_hash_sha1_ctx` structure that is specific to the SHA1 implementation and move all SHA1 functions over to use that one instead of the generic `git_hash_ctx`. The `git_hash_ctx` for now simply has a union containing this single SHA1 implementation, only, without any mechanism to distinguish between different algorithms.
d46d3b53	2019-04-05T10:59:46	hash: split into generic and SHA1-specific interface As a preparatory step to allow multiple hashing APIs to exist at the same time, split the hashing functions into one layer for generic hashing and one layer for SHA1-specific hashing. Right now, this is simply an additional indirection layer that doesn't yet serve any purpose. In the future, the generic API will be extended to allow for choosing which hash to use, though, by simply passing an enum to the hash context initialization function. This is necessary as a first step to be ready for Git's move to SHA256.
fda20622	2019-06-14T14:22:19	hash: move SHA1 implementations into 'sha1/' folder As we will include additional hash algorithms in the future due to upstream git discussing a move away from SHA1, we should accommodate that and prepare for the move. As a first step, move all SHA1 implementations into a common subdirectory. Also, create a SHA1-specific header file that lives inside the hash folder. This header will contain the SHA1-specific header includes, function declarations and the SHA1 context structure.
759ec7d4	2019-06-15T22:01:00	win32: cast GetProcAddress to void * before casting GetProcAddress is prototyped to return a `FARPROC`, which is meant to be a generic function pointer. It's literally `int (FAR WINAPI * FARPROC)()`, which gcc complains about if you attempt to cast it to a `void (*)(GIT_SRWLOCK *)`. Cast to a `void *` before casting to avoid warnings about the arguments.
3cd123e9	2019-06-15T21:56:53	win32: define DWORD_MAX if it's not defined MinGW does not define DWORD_MAX. Specify it when it's not defined.
d93b0aa0	2019-06-15T21:47:40	win32: decorate unused parameters
e2aba8ba	2019-06-15T20:45:22	wildmatch: explicitly cast to int
3dd1942b	2019-06-15T20:43:13	win32: don't re-define RtlCaptureStackBackTrace RtlCaptureStackBackTrace is well-defined in Windows, no need to redefine it.
cc9e47c9	2019-06-15T18:51:40	win32: support upgrading warnings to errors (/WX) For MSVC, support warnings as errors by providing the /WX compiler flag. (/WX is the moral equivalent of -Werror.) Disable warnings as errors for xdiff, since it contains warnings. But as a component of git itself, we want to avoid skew and keep our implementation as similar as possible to theirs. We'll work with upstream to fix these issues, but in the meantime, simply let those continue to warn.
f6530438	2019-05-25T16:44:59	win32: stop inlining file_attribute_to_stat Move `git_win32__file_attribute_to_stat` to a regular function instead of an inlined function. This helps avoid header ordering issues and declarations.
bbf034ab	2019-02-22T13:43:16	hash: move `git_hash_prov` into Win32 backend The structure `git_hash_prov` is only ever used by the Win32 SHA1 backend. As such, it doesn't make much sense to expose it via the generic "hash.h" header, as it is an implementation detail of the Win32 backend only. Move the typedef of `git_hash_prov` into "hash/sha1/win32.h" to fix this.
bd48bf3f	2019-06-14T14:21:32	hash: introduce source files to break include circles The hash source files have circular include dependencies right now, as shown by our broken generic hash implementation. The "hash.h" header declares two functions and the `git_hash_ctx` typedef before actually including the hash backend header and can only declare the remaining hash functions after the include due to possibly static function declarations inside of the implementation includes. Let's break this cycle and help maintainability by creating a real implementation file for each of the hash implementations. Instead of relying on the exact include order, we now especially avoid the use of `GIT_INLINE` for function declarations.
b11eb08f	2019-05-21T14:39:55	config parse: safely cast to int
6b349ecc	2019-05-21T14:36:57	odb loose: only read at most INT_MAX
8c925ef8	2019-05-21T14:30:28	smart protocol: validate progress message length Ensure that the server has not sent us overly-large sideband messages (ensure that they are no more than `INT_MAX` bytes), then cast to `int`.
7afe788c	2019-05-21T14:27:46	smart transport: use size_t for sizes
db7f1d9b	2019-05-21T14:21:58	local transport: cast message size to int explicitly Our progress information messages are short (and bounded by their format string), cast the length to int for callers.
8048ba70	2019-05-21T14:18:40	winhttp: safely cast length to DWORD
db6b8f7d	2019-05-21T14:15:58	strtol: cast error message length to int
d103f008	2019-05-21T13:44:47	pool: use `size_t` for sizes
c4a64b1b	2019-05-21T13:27:39	tree-cache: safely cast to uint32_t
2375be48	2019-05-21T12:57:28	tree: return `size_t` for treebuilder entrycount We keep the treebuilder entrycount as a `size_t` - return that instead of downcasting to an `unsigned int`. Callers who were storing this value in an `unsigned int` will continue to downcast themselves, so there should be no behavior change for callers.
ad6f2153	2019-05-21T12:50:46	utf8: use size_t for length of buffer The `git__utf8_charlen` now takes `size_t` as the buffer length, since it contains the full length of the buffer at the current position. It now returns `-1` in all cases where utf8 codepoints are invalid, since callers only care about a valid length of a sequence of codepoints, or if the current position is not valid utf8.
5d5b76df	2019-05-21T12:35:19	worktree: use size_t for sizes
f7597410	2019-05-21T10:57:30	netops: safely cast to int Only read at most INT_MAX from the underlying stream, so that we can accurately return the number of bytes read. Since callers are not guaranteed to get as many bytes as requested (due to availability of input), this is safe and callers should call in a loop until EOF.
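The "read at most INT_MAX and let callers loop" pattern from f7597410 can be sketched as below. The in-memory `toy_stream` and both function names are hypothetical and only stand in for a real network stream.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stream standing in for a network stream. */
struct toy_stream {
	const char *data;
	size_t remaining;
};

/*
 * Read up to `len` bytes, but never more than INT_MAX, so the byte
 * count always fits the `int` return value. Returns 0 at end of input.
 */
int toy_stream_read(struct toy_stream *s, void *buf, size_t len)
{
	size_t n;

	assert(s != NULL && buf != NULL);

	n = len < s->remaining ? len : s->remaining;
	if (n > INT_MAX)
		n = INT_MAX;

	memcpy(buf, s->data, n);
	s->data += n;
	s->remaining -= n;
	return (int)n;
}

/* Callers may get short reads, so they loop until the buffer is full
 * or the stream reports end of input. */
size_t toy_read_all(struct toy_stream *s, char *buf, size_t len)
{
	size_t total = 0;

	while (total < len) {
		int n = toy_stream_read(s, buf + total, len - total);
		if (n <= 0)
			break;
		total += (size_t)n;
	}
	return total;
}
```

Because a short read was always possible, capping a single read at `INT_MAX` does not change what correct callers have to do.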
cfd44d6a	2019-05-20T07:57:46	trailer: use size_t for sizes
fc3a94ba	2019-05-20T07:13:42	repository: use size_t for length
b4a173b5	2019-05-20T07:12:36	rebase: use size_t for path length
991c9454	2019-05-20T07:11:00	pool: cast arithmetic
aca3f701	2019-05-20T07:09:46	path: safely cast path calculation
f1d73189	2019-05-20T07:02:50	patch: use size_t for size when parsing
9a6992c4	2019-05-20T06:46:10	merge: safely cast size of merged file for index Explicitly truncate the file size to a `uint32_t`.
b205f538	2019-05-20T06:38:51	iterator: sanity-check path length and safely cast
7e49deba	2019-05-20T06:35:11	index: safely cast file size
d488c02c	2019-05-20T06:31:42	win32: safely cast path sizes for win api
cadddaed	2019-05-20T06:20:18	w32: safely cast to int during charset conversion
3a5a07fc	2019-05-20T05:37:16	idxmap: safely cast down to khiter_t
2a4bcf63	2019-06-23T18:24:23	errors: use lowercase Use lowercase for our error messages, per our custom.
91a300b7	2019-06-16T00:46:30	attr: rename constants and macros for consistency Our enumeration values are not generally suffixed with `T`. Further, our enumeration names are generally more descriptive.
c3bbbcf5	2019-06-16T12:30:56	Merge pull request #5117 from libgit2/ethomson/to_from Change API instances of `fromnoun` to `from_noun` (with an underscore)
e45350fe	2019-06-16T00:10:02	tag: add underscore to `from` function The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the tag function for consistency with the rest of the library.
6574cd00	2019-06-08T19:25:36	index: rename `frombuffer` to `from_buffer` The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the index functions for consistency with the rest of the library.
b7791d04	2019-06-16T00:23:01	object: rename git_object__size to git_object_size We don't use double-underscores in the public API.
08f39208	2019-06-08T17:46:04	blob: add underscore to `from` functions The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the blob functions for consistency with the rest of the library.
5d92e547	2019-06-08T17:28:35	oid: `is_zero` instead of `iszero` The only function that is named `issomething` (without underscore) was `git_oid_iszero`. Rename it to `git_oid_is_zero` for consistency with the rest of the library.
fef847ae	2019-06-15T15:47:41	Merge pull request #5110 from pks-t/pks/wildmatch Replace fnmatch with wildmatch
13ded47c	2019-06-13T19:57:17	fnmatch: remove unused code The `fnmatch` code has now been completely replaced by `wildmatch`, just as upstream git.git did in 2014. Remove it.
05f9986a	2019-06-14T08:06:05	attr_file: convert to use `wildmatch` Upstream git has converted to use `wildmatch` instead of `fnmatch`. Convert our gitattributes logic to use `wildmatch` as the last user of `fnmatch`. Please, don't expect I know what I'm doing here: the fnmatch parser is one of the most fun things to play around with as it has a shtload of weird cases. In all honesty, I'm simply relying on our tests that are by now rather comprehensive in that area. The conversion actually fixes compatibility with how git.git parses "**" patterns when the given path does not contain any directory separators. Previously, a pattern "*.foo" erroneously wouldn't match a file "x.foo", while git.git would match. Remove the now-unused LEADINGDIR/NOLEADINGDIR flags for `git_attr_fnmatch`.
5811e3ba	2019-06-13T19:16:32	config_file: use `wildmatch` to evaluate conditionals We currently use `p_fnmatch` to compute whether a given "gitdir:" or "gitdir/i:" conditional matches the current configuration file path. As git.git has moved to use `wildmatch` instead of `p_fnmatch` throughout its complete codebase, we evaluate conditionals inconsistently with git.git in some special cases. Convert `p_fnmatch` to use `wildmatch`. The `FNM_LEADINGDIR` flag cannot be translated to `wildmatch`, but in fact git.git doesn't use it here either. And in fact, dropping it while we go increases compatibility with git.git.
cf1a114b	2019-06-13T19:10:22	config_file: do not include trailing '/' for "gitdir" conditionals When evaluating "gitdir:" and "gitdir/i:" conditionals, we currently compare the given pattern with the value of `git_repository_path`. Thing is though that `git_repository_path` returns the gitdir path with trailing '/', while we actually need to match against the gitdir without it. Fix this issue by stripping the trailing '/' previous to matching. Add various tests to ensure we get this right.
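The normalization step from cf1a114b, stripping the trailing '/' from the gitdir before matching, might be sketched like this. `toy_strip_trailing_slash` is a hypothetical helper; keeping a lone "/" intact is an assumption of this sketch, not something the commit message states.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy `path` into `out` without trailing '/' characters. A lone "/"
 * is kept as-is. Returns 0 on success, -1 if `out` is too small.
 */
int toy_strip_trailing_slash(char *out, size_t outlen, const char *path)
{
	size_t len;

	assert(out != NULL && path != NULL);

	len = strlen(path);
	while (len > 1 && path[len - 1] == '/')
		len--;

	if (len + 1 > outlen)
		return -1;

	memcpy(out, path, len);
	out[len] = '\0';
	return 0;
}
```

The stripped path is what a "gitdir:" pattern would then be matched against.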
5d987f7d	2019-06-13T19:00:06	config_file: refactor `do_match_gitdir` to improve readability The function `do_match_gitdir` has some horribly named parameters and variables. Rename them to improve readability. Furthermore, fix a potentially undetected out-of-memory condition when appending "**" to the pattern.
de70bb46	2019-06-13T15:27:22	global: convert trivial `fnmatch` users to use `wildmatch` Upstream git.git has converted its codebase to use wildmatch in favor of fnmatch in commit 70a8fc999d (stop using fnmatch (either native or compat), 2014-02-15). To keep our own pattern matching in line with what git does, convert all trivial instances of `fnmatch` usage to use `wildmatch`, instead. Trivial usage is defined to be use of `fnmatch` with either no flags or flags that have a 1:1 equivalent in wildmatch (PATHNAME, IGNORECASE).
451df793	2019-06-13T15:20:23	posix: remove implicit include of "fnmatch.h" We're about to phase out our bundled fnmatch implementation as git.git has moved to wildmatch long ago in 2014. To make it easier to spot which files are still using fnmatch, remove the implicit "fnmatch.h" include in "posix.h" and instead include it explicitly.
a9f57629	2019-06-13T15:03:00	wildmatch: import wildmatch from git.git In commit 70a8fc999d (stop using fnmatch (either native or compat), 2014-02-15), upstream git has switched over all code from their internal fnmatch copy to its new wildmatch code. We haven't followed suit, and thus have developed some incompatibilities in how we match regular expressions. Import git's wildmatch from v2.22.0 and add a test suite based on their t3070-wildmatch.sh tests.
2d85c7e8	2019-06-14T14:12:19	posix: remove `p_fallocate` abstraction By now, we have repeatedly failed to provide a nice cross-platform implementation of `p_fallocate`. Recent tries to do that escalated quite fast to a set of different CMake checks, implementations, fallbacks, etc., which started to look real awkward to maintain. In fact, `p_fallocate` had only been introduced in commit 4e3949b73 (tests: test that largefiles can be read through the tree API, 2019-01-30) to support a test with large files, but given the maintenance costs it just seems not to be worth it. As we have removed the sole user of `p_fallocate` in the previous commit, let's drop it altogether.
94fc83b6	2019-06-13T16:48:35	cmake: Modulize our TLS & hash detection The interactions between `USE_HTTPS` and `SHA1_BACKEND` have been streamlined. Previously we would have accepted not quite working configurations (like, `-DUSE_HTTPS=OFF -DSHA1_BACKEND=OpenSSL`) and, as the OpenSSL detection only ran with `USE_HTTPS`, the link would fail. The detection was moved to a new `USE_SHA1`, modeled after `USE_HTTPS`, which takes the values "CollisionDetection/Backend/Generic", to better match how the "hashing backend" is selected, the default (ON) being "CollisionDetection". Note that, as `SHA1_BACKEND` is still used internally, you might need to check what customization you're using it for.
c0dd7122	2019-06-06T16:48:04	apply: add an options struct initializer
0b5ba0d7	2019-06-06T16:36:23	Rename opt init functions to `options_init` In libgit2 nomenclature, when we need to verb a direct object, we name a function `git_directobject_verb`. Thus, if we need to init an options structure named `git_foo_options`, then the name of the function that does that should be `git_foo_options_init`. The previous name, `git_foo_init_options`, is close - it _sounds_ as if it's initializing the options of a `foo`, but in fact `git_foo_options` is its own noun that should be respected. Deprecate the old names; they'll now call directly to the new ones.
a5ddae68	2019-06-13T22:00:48	Merge pull request #5097 from pks-t/pks/ignore-escapes gitignore with escapes
e277ff4d	2019-06-13T21:41:55	Merge pull request #5108 from libgit2/ethomson/urlparse_empty_port Handle URLs with a colon after host but no port
fb529a01	2019-06-11T22:03:29	http-parser: use our bundled http-parser by default Our bundled http-parser includes bugfixes, therefore we should prefer our http-parser until such time as we can identify that the system http-parser has these bugfixes (using a version check). Since these bugs are - at present - minor, retain the ability for users to force that they want to use the system http-parser anyway. This does change the cmake specification so that people _must_ opt-in to the new behavior knowingly.
0c1029be	2019-06-13T11:41:39	Merge pull request #5022 from rcoup/merge-analysis-bare-repo-5017 Merge analysis support for bare repos
3b517351	2019-06-07T10:13:34	attr_file: remove invalid TODO comment In our attributes pattern parsing code, we have a comment that states we might have to convert '\' characters to '/' to have proper POSIX paths. But in fact, '\' characters are valid inside the string and act as escape mechanism for various characters, which is why we never want to convert those to POSIX directory separators. Furthermore, gitignore patterns are specified to only treat '/' as directory separators. Remove the comment to avoid future confusion.
b3b6a39d	2019-06-07T11:12:54	attr_file: account for escaped escapes when searching trailing space When determining the trailing space length, we need to honor whether spaces are escaped or not. Currently, we do not check whether the escape itself is escaped, though, which might generate an off-by-one in that case as we will simply treat the space as escaped. Fix this by checking whether the backslashes preceding the space are themselves escaped.
10ac298c	2019-06-07T11:12:42	attr_file: fix unescaping of escapes required for fnmatch When parsing attribute patterns, we will eventually unescape the parsed pattern. This is required because we require custom escapes for whitespace characters, as normally they are used to terminate the current pattern. Thing is, we don't only unescape those whitespace characters, but in fact all escaped sequences. So for example if the pattern was "\*", we unescape that to "*". As this is directly passed to fnmatch(3) later, fnmatch would treat it as a simple glob matching all files where it should instead only match a file with name "*". Fix the issue by unescaping spaces, only. Add a bunch of tests to exercise escape parsing.
eb146e58	2019-06-07T09:17:23	attr_file: properly handle escaped '\' when searching non-escaped spaces When parsing attributes, we need to search for the first unescaped whitespace character to determine where the pattern is to be cut off. The scan fails to account for the case where the escaping '\' character is itself escaped, though, and thus we would not recognize the cut-off point in patterns like "\\ ". Refactor the scanning loop to remember whether the last character was an escape character. If it was and the next character is a '\', too, then we will reset to non-escaped mode again. Thus, we now handle escaped whitespaces as well as escaped wildcards correctly.
f7c6795f	2019-06-07T10:20:35	path: only treat paths starting with '\' as absolute on Win32 Windows-based systems treat paths starting with '\' as absolute, either referring to the current drive's root (e.g. "\foo" might refer to "C:\foo") or to a network path (e.g. "\\host\foo"). On the other hand, (most?) systems that are not based on Win32 accept backslashes as valid characters that may be part of the filename, and thus we cannot use them to identify absolute paths. Change the logic to only treat paths starting with '\' as absolute on the Win32 platform. Add tests to avoid regressions and document behaviour.
fd734f7d	2019-06-11T12:45:27	Merge pull request #5107 from pks-t/pks/sha1dc-update sha1dc: update to fix endianess issues on AIX/HP-UX
230a451e	2019-06-10T13:54:11	sha1dc: update to fix endianess issues on AIX/HP-UX Update our copy of sha1dc to the upstream commit 855827c (Detect endianess on HP-UX, 2019-05-09). Changes include fixes to endian detection on AIX and HP-UX systems as well as a define that allows us to force aligned access, which we're not using yet.
7ea8630e	2019-04-07T20:11:59	http: free auth context on failure When we send HTTP credentials but the server rejects them, tear down the authentication context so that we can start fresh. To maintain this state, additionally move all of the authentication handling into `on_auth_required`.
005b5bc2	2019-04-07T17:55:23	http: reconnect to proxy on connection close When we're issuing a CONNECT to a proxy, we expect to keep-alive to the proxy. However, during authentication negotiations, the proxy may close the connection. Reconnect if the server closes the connection.
d171fbee	2019-04-07T17:40:23	http: allow server to drop a keepalive connection When we have a keep-alive connection to the server, that server may legally drop the connection for any reason once a successful request and response has occurred. It's common for servers to drop the connection after some amount of time or number of requests have occurred.
9af1de5b	2019-03-24T20:49:57	http: stop on server EOF We stop the read loop when we have read all the data. We should also consider the server's feelings. If the server hangs up on us, we need to stop our read loop. Otherwise, we'll try to read from the server - and fail - ad infinitum.
539e6293	2019-03-22T19:06:46	http: teach auth mechanisms about connection affinity Instead of using `is_complete` to decide whether we have connection or request affinity for authentication mechanisms, set a boolean on the mechanism definition itself.
3e0b4b43	2019-03-22T18:52:03	http: maintain authentication across connections For request-based authentication mechanisms (Basic, Digest) we should keep the authentication context alive across socket connections, since the authentication headers must be transmitted with every request. However, we should continue to remove authentication contexts for mechanisms with connection affinity (NTLM, Negotiate) since we need to reauthenticate for every socket connection.
ce72ae95	2019-03-22T10:53:30	http: simplify authentication mechanisms Hold an individual authentication context instead of trying to maintain all the contexts; we can select the preferred context during the initial negotiation. Subsequent authentication steps will re-use the chosen authentication (until such time as it's rejected) instead of trying to manage multiple contexts when all but one will never be used (since we can only authenticate with a single mechanism at a time.) Also, when we're given a 401 or 407 in the middle of challenge/response handling, short-circuit immediately without incrementing the retry count. The multi-step authentication is expected, and not a "retry" and should not be penalized as such. This means that we don't need to keep the contexts around and ensures that we do not unnecessarily fail for too many retries when we have challenge/response auth on a proxy and a server and potentially redirects in play as well.
6d931ba7	2019-03-22T16:35:59	http: don't set the header in the auth token
10718526	2019-03-09T13:53:16	http: don't reset replay count after connection A "connection" to a server is transient, and we may reconnect to a server in the midst of authentication failures (if the remote indicates that we should, via `Connection: close`) or in a redirect.
3192e3c9	2019-03-07T16:57:11	http: provide an NTLM authentication provider
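
The scanning loop described for eb146e58 in the table above (find the first whitespace that is not escaped, while treating an escaped '\' as a plain character) can be sketched roughly as follows. This is only an illustration under stated assumptions: `first_unescaped_space` is a hypothetical helper name, not libgit2's actual function, and it only considers spaces and tabs.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the first space or tab in `pattern` that is not
 * escaped by a backslash, or the string length if there is none.
 * Hypothetical helper for illustration; not libgit2's real code.
 */
size_t first_unescaped_space(const char *pattern)
{
	size_t i;
	int escaped = 0;

	assert(pattern != NULL);

	for (i = 0; pattern[i] != '\0'; i++) {
		char c = pattern[i];

		if (escaped) {
			/* The character after an escape is literal, even '\\' itself. */
			escaped = 0;
			continue;
		}
		if (c == '\\') {
			escaped = 1;
			continue;
		}
		if (c == ' ' || c == '\t')
			return i;
	}

	return i;
}
```

With this state flag, a pattern whose text is `foo\\ bar` (two backslashes, then a space) is cut at the space, because the second backslash is consumed as an escaped character and resets the scanner to non-escaped mode.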

b30dab8f

2019-07-11T12:10:48

apply: refactor to use a switch statement

001d76e1

2019-07-11T11:34:40

diff: ignore EOFNL for computing patch IDs The patch ID is supposed to be mostly context-insignificant and thus only includes added or deleted lines. As such, we shouldn't honor end-of-file-without-newline markers in diffs. Ignore such lines to fix how we compute the patch ID for such diffs.

2ba7020f

2019-06-27T09:23:59

config_file: avoid re-reading files on write When we rewrite the configuration file due to any of its values being modified, we call `config_refresh` to update the in-memory representation of our config file backend. This is needlessly wasteful though, as `config_refresh` will always open the on-disk representation to read the file contents while we already know the complete file contents at this point in time as we have just written it to disk. Implement a new function `config_refresh_from_buffer` that will refresh the backend's config entries from a buffer instead of from the config file itself. Note that this will thus _not_ update the backend's timestamp, which will cause us to re-read the buffer when performing a read operation on it. But this is still an improvement as we now lazily re-read the contents, and most importantly we will avoid constantly re-reading the contents if we perform multiple write operations. The following strace demonstrates this if we're re-writing a key multiple times. 
It uses our config example with `config_set` changed to update the file 10 times with different keys: $ strace lg2 config x.x z |& grep '^open.*config' open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", 
O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 And now with the optimization of `config_refresh_from_buffer`: $ strace lg2 config x.x z |& grep '^open.*config' open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 
open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4 As can be seen, this is quite a lot of `open` calls less.

a0dc3027

2019-06-27T08:54:51

config_file: split out function that sets config entries Updating a config file backend's config entries is a bit more involved, as it requires clearing of the old config entries as well as handling locking correctly. As we will need this functionality in a future patch to refresh config entries from a buffer, let's extract this into its own function `config_set_entries`.

985f5cdf

2019-06-27T08:41:16

config_file: split out function that reads entries from a buffer The `config_read` function currently performs both reading the on-disk config file as well as parsing the retrieved buffer contents. To optimize how we refresh our config entries from an in-memory buffer, we need to be able to directly parse buffers, though, without involving any on-disk files at all. Extract a new function `config_read_buffer` that sets up the parsing logic and then parses config entries from a buffer, only. Have `config_read` use it to avoid duplicated logic.

3e1c137a

2019-06-27T08:24:21

config_file: move refresh into `write` function We are quite lazy in how we refresh our config file backend when updating any of its keys: instead of just updating our in-memory representation of the keys, we just discard the old set of keys and then re-read the config file contents from disk. This refresh currently happens separately at every callsite of `config_write`, but it is clear that we _always_ want to refresh if we have written the config file to disk. If we didn't, then we'd run around with an outdated config file backend that does not represent what we have on disk. By moving the refresh into `config_write`, we are also able to optimize the case where the config file is currently locked. Before, we would've tried to re-read the file even if we have only updated its cached contents without touching the on-disk file. Thus we'd have unnecessarily stat'd the file, even though we know that it shouldn't have been modified in the meantime due to its lock.

d7f58eab

2019-06-21T11:55:21

config_file: implement stat cache to avoid repeated rehashing To decide whether a config file has changed, we always hash its complete contents. This is unnecessarily expensive, as well-behaved filesystems will always update stat information for files which have changed. So before computing the hash, we should first check whether the stat info has actually changed for either the configuration file or any of its includes. This avoids having to re-read the configuration file and its includes every time when we check whether it's been modified. Tracing the for-each-ref example previous to this commit, one can see that we repeatedly re-open both the repo configuration as well as the global configuration: $ strace lg2 for-each-ref |& grep config access("/home/pks/.gitconfig", F_OK) = -1 ENOENT (No such file or directory) access("/home/pks/.config/git/config", F_OK) = 0 access("/etc/gitconfig", F_OK) = -1 ENOENT (No such file or directory) stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 access("/tmp/repo/.git/config", F_OK) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffd15c05290) = -1 ENOENT (No such file or directory) access("/home/pks/.gitconfig", F_OK) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 access("/home/pks/.config/git/config", F_OK) = 0 stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3 stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffd15c051f0) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3 
stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffd15c05090) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3 stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffd15c05090) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3 stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffd15c05090) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3 With the change, we only do stats for those files and open them a single time, only: $ strace lg2 for-each-ref |& grep config access("/home/pks/.gitconfig", F_OK) = -1 ENOENT (No such file or directory) access("/home/pks/.config/git/config", F_OK) = 0 access("/etc/gitconfig", F_OK) = -1 ENOENT (No such file or directory) stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 access("/tmp/repo/.git/config", F_OK) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffe70540d20) = -1 ENOENT (No such file or directory) access("/home/pks/.gitconfig", F_OK) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 
access("/home/pks/.config/git/config", F_OK) = 0 stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3 stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 stat("/home/pks/.gitconfig", 0x7ffe70540ca0) = -1 ENOENT (No such file or directory) stat("/home/pks/.gitconfig", 0x7ffe70540c80) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 stat("/home/pks/.gitconfig", 0x7ffe70540b40) = -1 ENOENT (No such file or directory) stat("/home/pks/.gitconfig", 0x7ffe70540b20) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 stat("/home/pks/.gitconfig", 0x7ffe70540b40) = -1 ENOENT (No such file or directory) stat("/home/pks/.gitconfig", 0x7ffe70540b20) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0 stat("/home/pks/.gitconfig", 0x7ffe70540b40) = -1 ENOENT (No such file or directory) stat("/home/pks/.gitconfig", 0x7ffe70540b20) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0 The following benchmark has been performed with and without the stat cache in a best-of-ten run: ``` int lg2_repro(git_repository *repo, int argc, char **argv) { git_config *cfg; int32_t dummy; int i; UNUSED(argc); UNUSED(argv); check_lg2(git_repository_config(&cfg, repo), "Could not obtain config", NULL); for (i = 1; i < 100000; ++i) git_config_get_int32(&dummy, cfg, "foo.bar"); git_config_free(cfg); 
return 0; } ``` Without stat cache: $ time lg2 repro real 0m1.528s user 0m0.568s sys 0m0.944s With stat cache: $ time lg2 repro real 0m0.526s user 0m0.268s sys 0m0.258s This benchmark shows a nearly three-fold performance improvement. This change requires that we check our configuration stress tests as we're now in fact becoming more racy. If somebody is writing a configuration file at nearly the same time (there is a window of 100ns on Windows-based systems), then we might not realize that this file has actually changed and thus may not re-read it. This will only happen if either an external process is rewriting the configuration file or if the same process has multiple `git_config` structures pointing to the same file, where one of the two is being used to write and the other one is used to read values.

d0868646

2019-06-21T11:43:09

config: use `git_config_file` in favor of `struct config_file`

398412cc

2019-07-05T11:56:16

Merge pull request #5143 from libgit2/ethomson/warnings ci: build with ENABLE_WERROR on Windows

2f14c4fc

2019-06-28T14:39:20

w32_stack: convert buffer length param to `size_t` In both `git_win32__stack_format` and `git_win32__stack`, we handle buffer lengths via an integer variable. As we only ever pass buffer sizes to it, this should be a `size_t` though to avoid loss of precision. As we also use it to compare with other `size_t` variables, this also silences signed/unsigned comparison warnings.

1bbec26d

2019-07-04T11:41:21

attr_file: completely initialize attribute sessions The function `git_attr_session__init` currently only sets up the attribute session's key by incrementing the repo-global key by one. Most notably, all other members of the `git_attr_session` struct are not getting initialized at all. So if one allocates a session on the stack and then calls `git_attr_session__init`, the session will still not be fully initialized. We have fared just fine with that until now as all users of the function have allocated the session structure as part of bigger structs with `calloc`, and thus its contents have been zero-initialized implicitly already. Fix this by explicitly zeroing out the session to enable allocation of sessions on the stack.
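
The fix pattern (zero the whole struct, then set the key) might look roughly like this. The struct below is a simplified stand-in: its member names and the `repo_session_key` counter are assumptions for illustration and do not mirror the real `git_attr_session` layout.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-in for a session struct; members are hypothetical. */
struct attr_session {
	unsigned int key;
	int init_setup;  /* hypothetical flag */
	char *sysdir;    /* hypothetical cached path */
};

/* Stand-in for the repo-global session key. */
static unsigned int repo_session_key;

void attr_session_init(struct attr_session *session)
{
	assert(session != NULL);

	/* Zero every member first so stack-allocated sessions start clean. */
	memset(session, 0, sizeof(*session));
	session->key = ++repo_session_key;
}
```

Without the `memset`, a session declared as a local variable would keep whatever bytes happened to be on the stack in every member except `key`.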

18a6d9f3

2019-06-29T16:19:08

attr: Don't fail in attr_setup if there exists a system attributes file Regression introduced in commit 5452e49fce21f726bec19519da7f012e3f19e736 on PR #4967. Signed-off-by: Sven Strickroth <email@cs-ware.de>

7fd3f32b

2019-06-27T13:54:55

hash: fix missing error return on production builds When no hash algorithm has been initialized in a given hash context, then we will simply `assert` and not return a value at all. This works just fine in debug builds, but on non-debug builds the assert will be converted to a no-op and thus we do not have a proper return value. Fix this by returning an error code in addition to the asserts.

e9102def

2019-06-27T11:38:04

Merge pull request #4438 from pks-t/pks/hash-algorithm Multiple hash algorithms

501c51b2

2019-06-26T14:49:50

repo: commondir resolution can sometimes fallback to the repodir For example, https://git-scm.com/docs/gitrepository-layout says: info Additional information about the repository is recorded in this directory. This directory is ignored if $GIT_COMMON_DIR is set and "$GIT_COMMON_DIR/info" will be used instead. So when looking for `info/attributes`, we need to check the commondir first, or fall back to "our" `info/attributes`.

9f723c97

2019-06-26T14:49:37

docs: fixups

b883d370

2019-06-26T14:49:30

ignore: fix a missing commondir causing failures As with the preceding commit, the ignore code tries to load code from info/exclude, and we fail to ignore a non-existent file here.

82c7a9bc

2019-06-26T14:49:24

attr: fix attribute lookup if repo has no common directory If creating a repository without a common directory (e.g. by using `git_repository_new`), then `git_repository_item_path` will return `GIT_ENOTFOUND` for every file that's usually located in this directory. While we do not care for this case when looking up the "info/attributes" file, we fail to properly ignore these errors when setting up or collecting attributes files. Thus, the gitattributes lookup is broken and will only ever return `GIT_ENOTFOUND`. Fix this issue by properly ignoring `GIT_ENOTFOUND` returned by `git_repository_item_path`.

5452e49f

2019-06-26T14:49:17

attr: refactor setup to match current coding style The code in the `attr_setup` function is not really matching our current coding style. Besides alignment issues, it's also hard to see what functions calls depend on one another because they're split up over multiple conditional statements. Fix these issues by grouping together dependent function calls and adjusting the alignment.

f48cf5b3

2019-06-25T14:46:31

w32_stack: treat a len as an size_t

b7187ed7

2019-02-22T14:38:31

hash: add ability to distinguish algorithms Create an enum that allows us to distinguish between different hashing algorithms. This enum is embedded into each `git_hash_ctx` and will instruct the code to which hashing function the particular request shall be dispatched. As we do not yet have multiple hashing algorithms, we simply initialize the hash algorithm to always be SHA1. At a later point, we will have to extend the `git_hash_init_ctx` function to get as parameter which algorithm shall be used.
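
A context that carries an algorithm enum and dispatches on it could be sketched as follows. All names here (`enum hash_algo`, `struct hash_ctx`, `hash_update`) are hypothetical, and the "SHA1" backend is a placeholder byte sum rather than a real hash; only the dispatch structure is the point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; only one algorithm exists, as in the commit. */
enum hash_algo {
	HASH_ALGO_UNKNOWN = 0,
	HASH_ALGO_SHA1
};

struct hash_ctx {
	enum hash_algo algo;
	union {
		uint32_t sha1_state; /* placeholder, not a real SHA1 state */
	} u;
};

int hash_ctx_init(struct hash_ctx *ctx)
{
	assert(ctx != NULL);

	ctx->algo = HASH_ALGO_SHA1; /* always SHA1 for now */
	ctx->u.sha1_state = 0;
	return 0;
}

/* Placeholder "update": a byte sum standing in for a real SHA1 backend. */
static int sha1_placeholder_update(uint32_t *state, const void *data, size_t len)
{
	const unsigned char *p = data;
	size_t i;

	for (i = 0; i < len; i++)
		*state += p[i];
	return 0;
}

int hash_update(struct hash_ctx *ctx, const void *data, size_t len)
{
	assert(ctx != NULL);

	switch (ctx->algo) {
	case HASH_ALGO_SHA1:
		return sha1_placeholder_update(&ctx->u.sha1_state, data, len);
	default:
		/* Unknown or uninitialized algorithm: report an error. */
		return -1;
	}
}
```

Returning an error code from the default branch, rather than relying on an `assert` alone, keeps the function well-defined in builds where assertions are compiled out.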

8832172e

2019-02-22T14:32:40

hash: move SHA1 implementations to its own hashing context Create a separate `git_hash_sha1_ctx` structure that is specific to the SHA1 implementation and move all SHA1 functions over to use that one instead of the generic `git_hash_ctx`. The `git_hash_ctx` for now simply has a union containing this single SHA1 implementation, only, without any mechanism to distinguish between different algorithms.

d46d3b53

2019-04-05T10:59:46

hash: split into generic and SHA1-specific interface As a preparatory step to allow multiple hashing APIs to exist at the same time, split the hashing functions into one layer for generic hashing and one layer for SHA1-specific hashing. Right now, this is simply an additional indirection layer that doesn't yet serve any purpose. In the future, the generic API will be extended to allow for choosing which hash to use, though, by simply passing an enum to the hash context initialization function. This is necessary as a first step to be ready for Git's move to SHA256.

fda20622

2019-06-14T14:22:19

hash: move SHA1 implementations into 'sha1/' folder As we will include additional hash algorithms in the future due to upstream git discussing a move away from SHA1, we should accommodate that and prepare for the move. As a first step, move all SHA1 implementations into a common subdirectory. Also, create a SHA1-specific header file that lives inside the hash folder. This header will contain the SHA1-specific header includes, function declarations and the SHA1 context structure.

759ec7d4

2019-06-15T22:01:00

win32: cast GetProcAddress to void * before casting GetProcAddress is prototyped to return a `FARPROC`, which is meant to be a generic function pointer. It's literally `int (FAR WINAPI * FARPROC)()`, which gcc complains about if you attempt to cast it to a `void (*)(GIT_SRWLOCK *)`. Cast to a `void *` before casting to avoid warnings about the arguments.

3cd123e9

2019-06-15T21:56:53

win32: define DWORD_MAX if it's not defined MinGW does not define DWORD_MAX. Specify it when it's not defined.

d93b0aa0

2019-06-15T21:47:40

win32: decorate unused parameters

e2aba8ba

2019-06-15T20:45:22

wildmatch: explicitly cast to int

3dd1942b

2019-06-15T20:43:13

win32: don't re-define RtlCaptureStackBackTrace RtlCaptureStackBackTrace is well-defined in Windows, no need to redefine it.

cc9e47c9

2019-06-15T18:51:40

win32: support upgrading warnings to errors (/WX) For MSVC, support warnings as errors by providing the /WX compiler flags. (/WX is the moral equivalent of -Werror.) Disable warnings as errors as part of xdiff, since it contains warnings. But as a component of git itself, we want to avoid skew and keep our implementation as similar as possible to theirs. We'll work with upstream to fix these issues, but in the meantime, simply let those continue to warn.

f6530438

2019-05-25T16:44:59

win32: stop inlining file_attribute_to_stat Move `git_win32__file_attribute_to_stat` to a regular function instead of an inlined function. This helps avoid header ordering issues and declarations.

bbf034ab

2019-02-22T13:43:16

hash: move `git_hash_prov` into Win32 backend The structure `git_hash_prov` is only ever used by the Win32 SHA1 backend. As such, it doesn't make much sense to expose it via the generic "hash.h" header, as it is an implementation detail of the Win32 backend only. Move the typedef of `git_hash_prov` into "hash/sha1/win32.h" to fix this.

bd48bf3f

2019-06-14T14:21:32

hash: introduce source files to break include circles The hash source files have circular include dependencies right now, as shown by our broken generic hash implementation. The "hash.h" header declares two functions and the `git_hash_ctx` typedef before actually including the hash backend header and can only declare the remaining hash functions after the include due to possibly static function declarations inside of the implementation includes. Let's break this cycle and help maintainability by creating a real implementation file for each of the hash implementations. Instead of relying on the exact include order, we now especially avoid the use of `GIT_INLINE` for function declarations.

b11eb08f

2019-05-21T14:39:55

config parse: safely cast to int

6b349ecc

2019-05-21T14:36:57

odb loose: only read at most INT_MAX

8c925ef8

2019-05-21T14:30:28

smart protocol: validate progress message length Ensure that the server has not sent us overly-large sideband messages (ensure that they are no more than `INT_MAX` bytes), then cast to `int`.
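The check-then-cast pattern described above can be sketched as follows. This is a minimal illustration, not the code from the commit; the helper name `checked_len_to_int` is hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a libgit2 function): store `len` in `*out`
 * only if it is representable as an int. Returns 0 on success and -1
 * when the value is too large, leaving `*out` untouched.
 */
int checked_len_to_int(int *out, size_t len)
{
	if (len > (size_t)INT_MAX)
		return -1;

	*out = (int)len;
	return 0;
}
```

Validating before the cast means an oversized sideband message is rejected rather than silently wrapped into a negative or truncated length.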

7afe788c

2019-05-21T14:27:46

smart transport: use size_t for sizes

db7f1d9b

2019-05-21T14:21:58

local transport: cast message size to int explicitly Our progress information messages are short (and bounded by their format string), cast the length to int for callers.

8048ba70

2019-05-21T14:18:40

winhttp: safely cast length to DWORD

db6b8f7d

2019-05-21T14:15:58

strtol: cast error message length to int

d103f008

2019-05-21T13:44:47

pool: use `size_t` for sizes

c4a64b1b

2019-05-21T13:27:39

tree-cache: safely cast to uint32_t

2375be48

2019-05-21T12:57:28

tree: return `size_t` for treebuilder entrycount We keep the treebuilder entrycount as a `size_t` - return that instead of downcasting to an `unsigned int`. Callers who were storing this value in an `unsigned int` will continue to downcast themselves, so there should be no behavior change for callers.

ad6f2153

2019-05-21T12:50:46

utf8: use size_t for length of buffer The `git__utf8_charlen` now takes `size_t` as the buffer length, since it contains the full length of the buffer at the current position. It now returns `-1` in all cases where utf8 codepoints are invalid, since callers only care about a valid length of a sequence of codepoints, or if the current position is not valid utf8.
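As a rough sketch of the contract described above (a `size_t` buffer length in, the sequence length or `-1` out), a simplified decoder might look like this. The function name `utf8_charlen_sketch` is hypothetical, and this version is an assumption-laden simplification: it checks only the lead byte, the available length, and the continuation bytes, not overlong encodings or surrogates.

```c
#include <stddef.h>

/*
 * Hypothetical, simplified variant: return the byte length of the UTF-8
 * sequence starting at `str`, or -1 if the sequence is invalid or does
 * not fit into the remaining `str_len` bytes.
 */
int utf8_charlen_sketch(const unsigned char *str, size_t str_len)
{
	size_t length, i;

	if (str_len == 0)
		return -1;

	if (str[0] < 0x80)
		length = 1;
	else if ((str[0] & 0xE0) == 0xC0)
		length = 2;
	else if ((str[0] & 0xF0) == 0xE0)
		length = 3;
	else if ((str[0] & 0xF8) == 0xF0)
		length = 4;
	else
		return -1; /* stray continuation byte or invalid lead byte */

	if (str_len < length)
		return -1; /* sequence is cut off by the end of the buffer */

	for (i = 1; i < length; i++) {
		if ((str[i] & 0xC0) != 0x80)
			return -1; /* not a continuation byte */
	}

	return (int)length;
}
```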

5d5b76df

2019-05-21T12:35:19

worktree: use size_t for sizes

f7597410

2019-05-21T10:57:30

netops: safely cast to int Only read at most INT_MAX from the underlying stream, so that we can accurately return the number of bytes read. Since callers are not guaranteed to get as many bytes as requested (due to availability of input), this is safe and callers should call in a loop until EOF.
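The reasoning above (clamp each read to `INT_MAX`, let callers loop until EOF) can be illustrated with a small sketch. The `mem_stream` type and both functions are hypothetical stand-ins for a real socket stream, not libgit2 APIs.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stream standing in for a network stream. */
struct mem_stream {
	const char *data;
	size_t len;
	size_t pos;
	size_t max_chunk; /* simulates short reads from the network */
};

/*
 * Read at most INT_MAX bytes so that the byte count always fits into
 * the int return value. Returns the number of bytes read, 0 on EOF.
 */
int stream_read(struct mem_stream *s, void *buf, size_t want)
{
	size_t avail = s->len - s->pos;

	if (want > (size_t)INT_MAX)
		want = (size_t)INT_MAX;
	if (want > s->max_chunk)
		want = s->max_chunk;
	if (want > avail)
		want = avail;

	memcpy(buf, s->data + s->pos, want);
	s->pos += want;
	return (int)want;
}

/* Callers loop, since a single read may return fewer bytes than asked. */
size_t read_all(struct mem_stream *s, char *buf, size_t cap)
{
	size_t total = 0;

	while (total < cap) {
		int n = stream_read(s, buf + total, cap - total);
		if (n <= 0)
			break; /* EOF */
		total += (size_t)n;
	}

	return total;
}
```

Because short reads are already part of the stream contract, clamping to `INT_MAX` does not change what a correctly written caller observes.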

cfd44d6a

2019-05-20T07:57:46

trailer: use size_t for sizes

fc3a94ba

2019-05-20T07:13:42

repository: use size_t for length

b4a173b5

2019-05-20T07:12:36

rebase: use size_t for path length

991c9454

2019-05-20T07:11:00

pool: cast arithmetic

aca3f701

2019-05-20T07:09:46

path: safely cast path calculation

f1d73189

2019-05-20T07:02:50

patch: use size_t for size when parsing

9a6992c4

2019-05-20T06:46:10

merge: safely cast size of merged file for index Explicitly truncate the file size to a `uint32_t`.

b205f538

2019-05-20T06:38:51

iterator: sanity-check path length and safely cast

7e49deba

2019-05-20T06:35:11

index: safely cast file size

d488c02c

2019-05-20T06:31:42

win32: safely cast path sizes for win api

cadddaed

2019-05-20T06:20:18

w32: safely cast to int during charset conversion

3a5a07fc

2019-05-20T05:37:16

idxmap: safely cast down to khiter_t

2a4bcf63

2019-06-23T18:24:23

errors: use lowercase Use lowercase for our error messages, per our custom.

91a300b7

2019-06-16T00:46:30

attr: rename constants and macros for consistency Our enumeration values are not generally suffixed with `T`. Further, our enumeration names are generally more descriptive.

c3bbbcf5

2019-06-16T12:30:56

Merge pull request #5117 from libgit2/ethomson/to_from Change API instances of `fromnoun` to `from_noun` (with an underscore)

e45350fe

2019-06-16T00:10:02

tag: add underscore to `from` function The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the tag function for consistency with the rest of the library.

6574cd00

2019-06-08T19:25:36

index: rename `frombuffer` to `from_buffer` The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the index functions for consistency with the rest of the library.

b7791d04

2019-06-16T00:23:01

object: rename git_object__size to git_object_size We don't use double-underscores in the public API.

08f39208

2019-06-08T17:46:04

blob: add underscore to `from` functions The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the blob functions for consistency with the rest of the library.

5d92e547

2019-06-08T17:28:35

oid: `is_zero` instead of `iszero` The only function that is named `issomething` (without underscore) was `git_oid_iszero`. Rename it to `git_oid_is_zero` for consistency with the rest of the library.

fef847ae

2019-06-15T15:47:41

Merge pull request #5110 from pks-t/pks/wildmatch Replace fnmatch with wildmatch

13ded47c

2019-06-13T19:57:17

fnmatch: remove unused code The `fnmatch` code has now been completely replaced by `wildmatch`, as upstream git.git did in 2014. Remove it.

05f9986a

2019-06-14T08:06:05

attr_file: convert to use `wildmatch` Upstream git has converted to use `wildmatch` instead of `fnmatch`. Convert our gitattributes logic to use `wildmatch` as the last user of `fnmatch`. Please, don't expect I know what I'm doing here: the fnmatch parser is one of the most fun things to play around with as it has a sh*tload of weird cases. In all honesty, I'm simply relying on our tests that are by now rather comprehensive in that area. The conversion actually fixes compatibility with how git.git parses "**" patterns when the given path does not contain any directory separators. Previously, a pattern "**.foo" erroneously wouldn't match a file "x.foo", while git.git would match. Remove the now-unused LEADINGDIR/NOLEADINGDIR flags for `git_attr_fnmatch`.

5811e3ba

2019-06-13T19:16:32

config_file: use `wildmatch` to evaluate conditionals We currently use `p_fnmatch` to compute whether a given "gitdir:" or "gitdir/i:" conditional matches the current configuration file path. As git.git has moved to use `wildmatch` instead of `p_fnmatch` throughout its complete codebase, we evaluate conditionals inconsistently with git.git in some special cases. Convert `p_fnmatch` to use `wildmatch`. The `FNM_LEADINGDIR` flag cannot be translated to `wildmatch`, but in fact git.git doesn't use it here either. And in fact, dropping it while we go increases compatibility with git.git.

cf1a114b

2019-06-13T19:10:22

config_file: do not include trailing '/' for "gitdir" conditionals When evaluating "gitdir:" and "gitdir/i:" conditionals, we currently compare the given pattern with the value of `git_repository_path`. Thing is though that `git_repository_path` returns the gitdir path with trailing '/', while we actually need to match against the gitdir without it. Fix this issue by stripping the trailing '/' previous to matching. Add various tests to ensure we get this right.
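A minimal sketch of the stripping step described above, assuming the gitdir path is handled as a plain C string. The helper name `len_without_trailing_slash` is hypothetical and is not the function used in the commit.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the length of `path` with a single
 * trailing '/' dropped, so that "/tmp/repo/.git/" is matched as
 * "/tmp/repo/.git". A lone "/" is kept as-is.
 */
size_t len_without_trailing_slash(const char *path)
{
	size_t len = strlen(path);

	if (len > 1 && path[len - 1] == '/')
		len--;

	return len;
}
```

The caller would then match the pattern against only the first `len` bytes of the path.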

5d987f7d

2019-06-13T19:00:06

config_file: refactor `do_match_gitdir` to improve readability The function `do_match_gitdir` has some horribly named parameters and variables. Rename them to improve readability. Furthermore, fix a potentially undetected out-of-memory condition when appending "**" to the pattern.

de70bb46

2019-06-13T15:27:22

global: convert trivial `fnmatch` users to use `wildmatch` Upstream git.git has converted its codebase to use wildmatch in favor of fnmatch in commit 70a8fc999d (stop using fnmatch (either native or compat), 2014-02-15). To keep our own pattern matching in line with what git does, convert all trivial instances of `fnmatch` usage to use `wildmatch` instead. Trivial usage is defined to be use of `fnmatch` with either no flags or flags that have a 1:1 equivalent in wildmatch (PATHNAME, IGNORECASE).

451df793

2019-06-13T15:20:23

posix: remove implicit include of "fnmatch.h" We're about to phase out our bundled fnmatch implementation, as git.git moved to wildmatch back in 2014. To make it easier to spot which files are still using fnmatch, remove the implicit "fnmatch.h" include in "posix.h" and instead include it explicitly.

a9f57629

2019-06-13T15:03:00

wildmatch: import wildmatch from git.git In commit 70a8fc999d (stop using fnmatch (either native or compat), 2014-02-15), upstream git has switched over all code from their internal fnmatch copy to its new wildmatch code. We haven't followed suit, and thus have developed some incompatibilities in how we match regular expressions. Import git's wildmatch from v2.22.0 and add a test suite based on their t3070-wildmatch.sh tests.

2d85c7e8

2019-06-14T14:12:19

posix: remove `p_fallocate` abstraction By now, we have repeatedly failed to provide a nice cross-platform implementation of `p_fallocate`. Recent attempts to do so escalated quite fast into a set of different CMake checks, implementations, fallbacks, etc., which started to look really awkward to maintain. In fact, `p_fallocate` had only been introduced in commit 4e3949b73 (tests: test that largefiles can be read through the tree API, 2019-01-30) to support a test with large files, but given the maintenance costs it just does not seem to be worth it. As we have removed the sole user of `p_fallocate` in the previous commit, let's drop it altogether.

94fc83b6

2019-06-13T16:48:35

cmake: Modulize our TLS & hash detection The interactions between `USE_HTTPS` and `SHA1_BACKEND` have been streamlined. Previously we would have accepted not quite working configurations (like, `-DUSE_HTTPS=OFF -DSHA1_BACKEND=OpenSSL`) and, as the OpenSSL detection only ran with `USE_HTTPS`, the link would fail. The detection was moved to a new `USE_SHA1`, modeled after `USE_HTTPS`, which takes the values "CollisionDetection/Backend/Generic", to better match how the "hashing backend" is selected, the default (ON) being "CollisionDetection". Note that, as `SHA1_BACKEND` is still used internally, you might need to check what customization you're using it for.

c0dd7122

2019-06-06T16:48:04

apply: add an options struct initializer

0b5ba0d7

2019-06-06T16:36:23

Rename opt init functions to `options_init` In libgit2 nomenclature, when we need to verb a direct object, we name a function `git_directobject_verb`. Thus, if we need to init an options structure named `git_foo_options`, then the name of the function that does that should be `git_foo_options_init`. The previous name, `git_foo_init_options`, is close - it _sounds_ as if it's initializing the options of a `foo`, but in fact `git_foo_options` is its own noun that should be respected. Deprecate the old names; they'll now call directly to the new ones.
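The naming scheme and the deprecation approach can be sketched with a made-up options type. Everything here (`foo_options`, `FOO_OPTIONS_VERSION`, both functions) is hypothetical and only mirrors the `git_foo_options` placeholder from the message; it is not a real libgit2 API.

```c
#include <string.h>

#define FOO_OPTIONS_VERSION 1u

/* Hypothetical options struct, named as the noun "foo options". */
typedef struct {
	unsigned int version;
	unsigned int flags;
} foo_options;

/* New-style name: <noun>_<verb>, i.e. "foo_options" + "init". */
int foo_options_init(foo_options *opts, unsigned int version)
{
	if (opts == NULL || version != FOO_OPTIONS_VERSION)
		return -1;

	memset(opts, 0, sizeof(*opts));
	opts->version = version;
	return 0;
}

/* Deprecated old-style name: kept only as a thin forwarding wrapper. */
int foo_init_options(foo_options *opts, unsigned int version)
{
	return foo_options_init(opts, version);
}
```

Keeping the old name as a forwarding wrapper lets existing callers keep compiling while new code adopts the consistent name.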

a5ddae68

2019-06-13T22:00:48

Merge pull request #5097 from pks-t/pks/ignore-escapes gitignore with escapes

e277ff4d

2019-06-13T21:41:55

Merge pull request #5108 from libgit2/ethomson/urlparse_empty_port Handle URLs with a colon after host but no port

fb529a01

2019-06-11T22:03:29

http-parser: use our bundled http-parser by default Our bundled http-parser includes bugfixes, therefore we should prefer our http-parser until such time as we can identify that the system http-parser has these bugfixes (using a version check). Since these bugs are - at present - minor, retain the ability for users to force that they want to use the system http-parser anyway. This does change the cmake specification so that people _must_ opt-in to the new behavior knowingly.

0c1029be

2019-06-13T11:41:39

Merge pull request #5022 from rcoup/merge-analysis-bare-repo-5017 Merge analysis support for bare repos

3b517351

2019-06-07T10:13:34

attr_file: remove invalid TODO comment In our attributes pattern parsing code, we have a comment that states we might have to convert '\' characters to '/' to have proper POSIX paths. But in fact, '\' characters are valid inside the string and act as escape mechanism for various characters, which is why we never want to convert those to POSIX directory separators. Furthermore, gitignore patterns are specified to only treat '/' as directory separators. Remove the comment to avoid future confusion.

b3b6a39d

2019-06-07T11:12:54

attr_file: account for escaped escapes when searching trailing space When determining the trailing space length, we need to honor whether spaces are escaped or not. Currently, we do not check whether the escape itself is escaped, though, which might generate an off-by-one in that case as we will simply treat the space as escaped. Fix this by checking whether the backslashes preceding the space are themselves escaped.
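One way to picture the fix is to count the backslashes directly in front of a trailing space: an odd count means the space is escaped, an even count means the backslashes escape each other. The following is a sketch under that assumption; `trimmed_len` is a hypothetical helper, not the function changed in the commit.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the length of `s` after dropping trailing
 * spaces that are not escaped. A space is escaped only if it is preceded
 * by an odd number of backslashes.
 */
size_t trimmed_len(const char *s)
{
	size_t len = strlen(s);

	while (len > 0 && s[len - 1] == ' ') {
		size_t slashes = 0;

		/* Count the run of backslashes right before the space. */
		while (slashes < len - 1 && s[len - 2 - slashes] == '\\')
			slashes++;

		if (slashes % 2 == 1)
			break; /* the space itself is escaped, keep it */

		len--;
	}

	return len;
}
```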

10ac298c

2019-06-07T11:12:42

attr_file: fix unescaping of escapes required for fnmatch When parsing attribute patterns, we will eventually unescape the parsed pattern. This is required because we require custom escapes for whitespace characters, as normally they are used to terminate the current pattern. Thing is, we don't only unescape those whitespace characters, but in fact all escaped sequences. So for example if the pattern was "\*", we unescape that to "*". As this is directly passed to fnmatch(3) later, fnmatch would treat it as a simple glob matching all files where it should instead only match a file with name "*". Fix the issue by unescaping spaces, only. Add a bunch of tests to exercise escape parsing.

eb146e58

2019-06-07T09:17:23

attr_file: properly handle escaped '\' when searching non-escaped spaces When parsing attributes, we need to search for the first unescaped whitespace character to determine where the pattern is to be cut off. The scan fails to account for the case where the escaping '\' character is itself escaped, though, and thus we would not recognize the cut-off point in patterns like "\\ ". Refactor the scanning loop to remember whether the last character was an escape character. If it was and the next character is a '\', too, then we will reset to non-escaped mode again. Thus, we now handle escaped whitespaces as well as escaped wildcards correctly.
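The refactored scan described above (remember whether the previous character was an escape, and reset when an escaped character is consumed) might look roughly like this. It is a sketch of the idea, with a hypothetical function name, not the loop from the commit.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the index of the first unescaped space or
 * tab in `s`, or the length of `s` if there is none.
 */
size_t first_unescaped_space(const char *s)
{
	int escaped = 0;
	size_t i;

	for (i = 0; s[i] != '\0'; i++) {
		if (escaped) {
			/* This character was escaped, whatever it is. */
			escaped = 0;
			continue;
		}
		if (s[i] == '\\') {
			escaped = 1;
			continue;
		}
		if (s[i] == ' ' || s[i] == '\t')
			break;
	}

	return i;
}
```

With this state machine, the pattern `\\ ` (an escaped backslash followed by a space) is cut off at the space, while `\ ` keeps the space as part of the pattern.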

f7c6795f

2019-06-07T10:20:35

path: only treat paths starting with '\' as absolute on Win32 Windows-based systems treat paths starting with '\' as absolute, either referring to the current drive's root (e.g. "\foo" might refer to "C:\foo") or to a network path (e.g. "\\host\foo"). On the other hand, (most?) systems that are not based on Win32 accept backslashes as valid characters that may be part of the filename, and thus we cannot use them to identify absolute paths. Change the logic to only treat paths starting with '\' as absolute on the Win32 platform. Add tests to avoid regressions and document behaviour.
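The platform distinction can be sketched as below. To keep the example testable on any system, the platform is passed as a hypothetical `win32` parameter instead of a compile-time check, and drive-letter prefixes such as "C:\" are deliberately left out; this is an illustration, not the library's path logic.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: a leading '/' is absolute everywhere, while a
 * leading '\' is absolute only when `win32` is true. Drive-letter
 * prefixes are not handled in this sketch.
 */
bool path_is_absolute(const char *path, bool win32)
{
	if (path[0] == '/')
		return true;

	if (win32 && path[0] == '\\')
		return true;

	return false;
}
```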

fd734f7d

2019-06-11T12:45:27

Merge pull request #5107 from pks-t/pks/sha1dc-update sha1dc: update to fix endianess issues on AIX/HP-UX

230a451e

2019-06-10T13:54:11

sha1dc: update to fix endianess issues on AIX/HP-UX Update our copy of sha1dc to the upstream commit 855827c (Detect endianess on HP-UX, 2019-05-09). Changes include fixes to endian detection on AIX and HP-UX systems as well as a define that allows us to force aligned access, which we're not using yet.

7ea8630e

2019-04-07T20:11:59

http: free auth context on failure When we send HTTP credentials but the server rejects them, tear down the authentication context so that we can start fresh. To maintain this state, additionally move all of the authentication handling into `on_auth_required`.

005b5bc2

2019-04-07T17:55:23

http: reconnect to proxy on connection close When we're issuing a CONNECT to a proxy, we expect to keep-alive to the proxy. However, during authentication negotiations, the proxy may close the connection. Reconnect if the server closes the connection.

d171fbee

2019-04-07T17:40:23

http: allow server to drop a keepalive connection When we have a keep-alive connection to the server, that server may legally drop the connection for any reason once a successful request and response has occurred. It's common for servers to drop the connection after some amount of time or number of requests have occurred.

9af1de5b

2019-03-24T20:49:57

http: stop on server EOF We stop the read loop when we have read all the data. We should also consider the server's feelings. If the server hangs up on us, we need to stop our read loop. Otherwise, we'll try to read from the server - and fail - ad infinitum.

539e6293

2019-03-22T19:06:46

http: teach auth mechanisms about connection affinity Instead of using `is_complete` to decide whether we have connection or request affinity for authentication mechanisms, set a boolean on the mechanism definition itself.

3e0b4b43

2019-03-22T18:52:03

http: maintain authentication across connections For request-based authentication mechanisms (Basic, Digest) we should keep the authentication context alive across socket connections, since the authentication headers must be transmitted with every request. However, we should continue to remove authentication contexts for mechanisms with connection affinity (NTLM, Negotiate) since we need to reauthenticate for every socket connection.

ce72ae95

2019-03-22T10:53:30

http: simplify authentication mechanisms Hold an individual authentication context instead of trying to maintain all the contexts; we can select the preferred context during the initial negotiation. Subsequent authentication steps will re-use the chosen authentication (until such time as it's rejected) instead of trying to manage multiple contexts when all but one will never be used (since we can only authenticate with a single mechanism at a time.) Also, when we're given a 401 or 407 in the middle of challenge/response handling, short-circuit immediately without incrementing the retry count. The multi-step authentication is expected, and not a "retry" and should not be penalized as such. This means that we don't need to keep the contexts around and ensures that we do not unnecessarily fail for too many retries when we have challenge/response auth on a proxy and a server and potentially redirects in play as well.

6d931ba7

2019-03-22T16:35:59

http: don't set the header in the auth token

10718526

2019-03-09T13:53:16

http: don't reset replay count after connection A "connection" to a server is transient, and we may reconnect to a server in the midst of authentication failures (if the remote indicates that we should, via `Connection: close`) or in a redirect.

3192e3c9

2019-03-07T16:57:11

http: provide an NTLM authentication provider
