kmx git

Commit	Date	Message
f0e693b1	2021-09-07T17:53:49	str: introduce `git_str` for internal, `git_buf` is external libgit2 has two distinct requirements that were previously solved by `git_buf`. We require: 1. A general purpose string class that provides a number of utility APIs for manipulating data (eg, concatenating, truncating, etc). 2. A structure that we can use to return strings to callers that they can take ownership of. By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult. Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr"). The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.) Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.
31ecaca2	2021-09-30T08:11:40	hash: hash functions operate on byte arrays not git_oids Separate the concerns of the hash functions from the git_oid functions. The git_oid structure will need to understand either SHA1 or SHA256; the hash functions should only deal with the appropriate one of these.
2a713da1	2021-09-29T21:31:17	hash: accept the algorithm in inputs
379c4646	2021-09-09T19:49:04	Fix coding style for pointer Make some syntax change to follow coding style.
4bf136b0	2021-06-23T16:53:53	config: fix included configs not refreshed more than once If an included config is refreshed twice, the second update is not taken into account. This is because the list of included files is cleared after re-reading the new configuration, instead of being cleared before. Fix it and add a test case to check for this bug.
e7604da8	2020-04-05T14:51:56	config: use GIT_ASSERT
d1409f48	2020-05-06T19:57:07	config: ignore unreadable configuration files Modified `config_file_open()` so it returns 0 if the config file is not readable, which happens on global config files under macOS sandboxing (note that for some reason `access(F_OK)` DOES work with sandboxing, but it is lying). Without this read check, sandboxed applications on macOS cannot open any repository, because `config_file_read()` will return GIT_ERROR when it cannot read the global /Users/username/.gitconfig file, and the upper layers will just completely abort on GIT_ERROR when attempting to load the global config file, so no repositories can be opened.
56b203a5	2019-10-24T12:20:27	config_file: keep reference to config entries when creating iterator When creating a configuration file iterator, then we first refresh the backend and then afterwards duplicate all refreshed configuration entries into the iterator in order to avoid seeing any concurrent modifications of the entries while iterating. The duplication of entries is not guarded, though, as we do not increase the refcount of the entries that we duplicate right now. This opens us up for a race, as another thread may concurrently refresh the repository configuration and thus swap out the current set of entries. As we didn't increase the refcount, this may lead to the entries being free'd while we iterate over them in the first thread. Fix the issue by properly handling the lifecycle of the backend's entries via `config_file_entries_take` and `git_config_entries_free`, respectively.
0927156a	2019-10-24T12:32:11	config_file: refactor taking entries ref to return an error code The function to take a reference to the config file's config entries currently returns the reference via return value. Due to this, it's harder than necessary to integrate into our typical coding style, as one needs to make sure that a proper error code is set before erroring out from the caller. This bites us in `config_file_delete`, where we call `goto out` directly when `config_file_entries_take` returns `NULL`, but we actually forget to set up the error code and thus return success. Fix the issue by refactoring the function to return an error code and pass the reference via an out-pointer.
db301087	2019-10-24T12:17:02	config_file: remove unused includes
c2749849	2019-10-24T12:00:11	config_file: rename function names As with the preceding commit, this commit renames backend functions of the configuration file backend. This helps to clearly separate functionality and also to be able to see from backtraces which backend is currently in use.
7aacf027	2019-09-13T08:55:33	global: convert all users of POSIX regex to use our new regexp API The old POSIX regex API has been superseded by our new regexp API. Convert all users to make use of the new one.
722ba93f	2019-08-01T15:14:06	config: implement "onbranch" conditional With Git v2.23.0, the conditional include mechanism gained another new conditional "onbranch". As the name says, it will cause a file to be included if the "onbranch" pattern matches the currently checked out branch. Implement this new condition and add a bunch of tests.
37ebe9ad	2019-07-24T18:49:08	config_backend: rename internal structures The internal backend structures are kind-of legacy and do not really speak for themselves. Rename them accordingly to make them easier to understand.
2bff84ba	2019-07-26T21:02:56	config_file: separate out read-only backend To further distinguish the file writeable and readonly backends, separate the readonly backend into its own "config_snapshot.c" implementation. The snapshot backend can be generically used to snapshot any type of backend.
f0b10066	2019-07-24T18:37:14	config_file: fix cast of readonly backend In `backend_readonly_free`, the passed in config backend is being cast to a `diskfile_backend` instead of to a `diskfile_readonly_backend`. While this works out just fine because we only access its header values, which were shared between both backends, it is undefined behaviour. Use the correct type to fix this.
a3159df8	2019-07-24T18:31:43	config_file: remove shared `diskfile_header` struct The `diskfile_header` structure is shared between both `diskfile_backend` and `diskfile_readonly_backend`. The separation and resulting casting is confusing at times and a source for programming errors. Remove the shared structure and inline them directly.
271e5fba	2019-07-24T18:18:18	config_file: duplicate accessors for readonly backend While most functions of the readonly configuration backend are implemented separately from the writeable configuration backend, the two functions `config_iterator_new` and `config_get` are shared between both. This sharing makes it necessary to have some shared data structures, which is the `diskfile_header` structure. Unfortunately, this makes the backends harder to grasp than necessary due to all the casting between structs and also quite error prone. Reimplement those functions for the readonly backends. As readonly backends cannot be refreshed anyway, we can remove the calls to `config_refresh` in there.
4e7ce1fb	2019-07-24T18:13:52	config_file: reimplement `config_readonly_open` generically The `config_readonly_open` function currently receives as input a diskfile backend and will copy its entries to a new snapshot. This is rather intimate, as we need to assume that the source config backend is in fact a diskfile entry. We can do better than this though by using generic methods to copy contents of the provided backend, e.g. by using a config iterator. This also allows us to decouple the read-only backend from the read-write backend.
2766b92d	2019-07-21T15:10:34	config_file: refresh when creating an iterator When creating a new iterator for a config file backend, then we should always make sure that we're up to date by calling `config_refresh`. Otherwise, we might not notice when another process has modified the configuration file and thus will represent outdated values. Add two tests to config::stress that verify that we get up-to-date values when reading configuration entries via `git_config_iterator`.
9fac8b78	2019-07-21T15:08:22	config_file: do not refresh read-only backends If calling `config_refresh` on a read-only configuration file backend, then we will segfault when comparing the timestamp of the file due to `path` being uninitialized. As a read-only snapshot should not be refreshed anyway and stay consistent, we can simply return early when calling `config_refresh` on a read-only snapshot.
28d11b59	2019-07-21T14:41:21	config_file: consistently use `GIT_CONTAINER_OF`
dbeadf8a	2019-07-11T10:56:05	config_parse: provide parser init and dispose functions Right now, all configuration file backends are expected to directly mess with the configuration parser's internals in order to set it up. Let's avoid doing that by implementing both a `git_config_parser_init` and `git_config_parser_dispose` function to clearly define the interface between configuration backends and the parser. Ideally, we would make the `git_config_parser` structure definition private to its implementation. But as that would require an additional memory allocation that was not required before we just live with it being visible to others.
32157526	2019-07-11T11:10:02	config_file: refactor error handling in `config_write` Error handling in `config_write` is rather convoluted and does not match our current code style. Refactor it to make it easier to understand.
820fa1a3	2019-07-11T11:04:33	config_file: internalize `git_config_file` struct With the previous commits, we have finally separated the config parsing logic from the specific configuration file backend. Due to that, we can now move the `git_config_file` structure into the config file backend's implementation so that no other code may accidentally start using it again. Furthermore, we rename the structure to `diskfile` to make it obvious that it is internal, only, and to unify it with naming scheme of the other diskfile structures.
6e6da75f	2019-07-11T11:00:05	config_parse: remove use of `git_config_file` The config parser code needs to keep track of the current parsed file's name so that we are able to provide proper error messages to the user. Right now, we do that by storing a `git_config_file` in the parser structure, but as that is a specific backend and the parser aims to be generic, it is a layering violation. Switch over to use a simple string to fix that.
54d350e0	2019-06-21T12:53:43	config_file: embed file in diskfile parse data The config file code needs to keep track of the actual `git_config_file` structure, as it not only contains the path of the current configuration file, but it also keeps track of all includes of that file. Right now, we keep track of that structure via the `git_config_parser`, but as that's supposed to be a backend generic implementation of configuration parsing it's a layering violation to have it in there. Switch over the config file backend to use its own config file structure that's embedded in the backend parse data. This allows us to switch over the generic config parser to avoid using the `git_config_file` structure.
2ba7020f	2019-06-27T09:23:59	config_file: avoid re-reading files on write When we rewrite the configuration file due to any of its values being modified, we call `config_refresh` to update the in-memory representation of our config file backend. This is needlessly wasteful though, as `config_refresh` will always open the on-disk representation to read the file contents while we already know the complete file contents at this point in time as we have just written it to disk. Implement a new function `config_refresh_from_buffer` that will refresh the backend's config entries from a buffer instead of from the config file itself. Note that this will thus _not_ update the backend's timestamp, which will cause us to re-read the buffer when performing a read operation on it. But this is still an improvement as we now lazily re-read the contents, and most importantly we will avoid constantly re-reading the contents if we perform multiple write operations. The following strace demonstrates this if we're re-writing a key multiple times. 
It uses our config example with `config_set` changed to update the file 10 times with different keys: $ strace lg2 config x.x z \|& grep '^open.config' open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/home/pks/.config/git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", 
O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 And now with the optimization of `config_refresh_from_buffer`: $ strace lg2 config x.x z \|& grep '^open.config' open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/home/pks/.config/git/config", O_RDONLY\|O_CLOEXEC) = 3 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", 
O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 open("/tmp/repo/.git/config.lock", O_WRONLY\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0666) = 3 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 4 As can be seen, this results in far fewer `open` calls.
a0dc3027	2019-06-27T08:54:51	config_file: split out function that sets config entries Updating a config file backend's config entries is a bit more involved, as it requires clearing of the old config entries as well as handling locking correctly. As we will need this functionality in a future patch to refresh config entries from a buffer, let's extract this into its own function `config_set_entries`.
985f5cdf	2019-06-27T08:41:16	config_file: split out function that reads entries from a buffer The `config_read` function currently performs both reading the on-disk config file as well as parsing the retrieved buffer contents. To optimize how we refresh our config entries from an in-memory buffer, we need to be able to directly parse buffers, though, without involving any on-disk files at all. Extract a new function `config_read_buffer` that sets up the parsing logic and then parses config entries from a buffer, only. Have `config_read` use it to avoid duplicated logic.
3e1c137a	2019-06-27T08:24:21	config_file: move refresh into `write` function We are quite lazy in how we refresh our config file backend when updating any of its keys: instead of just updating our in-memory representation of the keys, we just discard the old set of keys and then re-read the config file contents from disk. This refresh currently happens separately at every callsite of `config_write`, but it is clear that we _always_ want to refresh if we have written the config file to disk. If we didn't, then we'd run around with an outdated config file backend that does not represent what we have on disk. By moving the refresh into `config_write`, we are also able to optimize the case where the config file is currently locked. Before, we would've tried to re-read the file even if we have only updated its cached contents without touching the on-disk file. Thus we'd have unnecessarily stat'd the file, even though we know that it shouldn't have been modified in the meantime due to its lock.
d7f58eab	2019-06-21T11:55:21	config_file: implement stat cache to avoid repeated rehashing To decide whether a config file has changed, we always hash its complete contents. This is unnecessarily expensive, as well-behaved filesystems will always update stat information for files which have changed. So before computing the hash, we should first check whether the stat info has actually changed for either the configuration file or any of its includes. This avoids having to re-read the configuration file and its includes every time when we check whether it's been modified. Tracing the for-each-ref example previous to this commit, one can see that we repeatedly re-open both the repo configuration as well as the global configuration: $ strace lg2 for-each-ref \|& grep config access("/home/pks/.gitconfig", F_OK) = -1 ENOENT (No such file or directory) access("/home/pks/.config/git/config", F_OK) = 0 access("/etc/gitconfig", F_OK) = -1 ENOENT (No such file or directory) stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 access("/tmp/repo/.git/config", F_OK) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffd15c05290) = -1 ENOENT (No such file or directory) access("/home/pks/.gitconfig", F_OK) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 access("/home/pks/.config/git/config", F_OK) = 0 stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffd15c051f0) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 
open("/home/pks/.config/git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffd15c05090) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffd15c05090) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffd15c05090) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY\|O_CLOEXEC) = 3 With the change, we only do stats for those files and open them a single time, only: $ strace lg2 for-each-ref \|& grep config access("/home/pks/.gitconfig", F_OK) = -1 ENOENT (No such file or directory) access("/home/pks/.config/git/config", F_OK) = 0 access("/etc/gitconfig", F_OK) = -1 ENOENT (No such file or directory) stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 access("/tmp/repo/.git/config", F_OK) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 open("/tmp/repo/.git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/home/pks/.gitconfig", 0x7ffe70540d20) = -1 ENOENT (No such file or directory) access("/home/pks/.gitconfig", F_OK) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", 
{st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 access("/home/pks/.config/git/config", F_OK) = 0 stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 open("/home/pks/.config/git/config", O_RDONLY\|O_CLOEXEC) = 3 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 stat("/home/pks/.gitconfig", 0x7ffe70540ca0) = -1 ENOENT (No such file or directory) stat("/home/pks/.gitconfig", 0x7ffe70540c80) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 stat("/home/pks/.gitconfig", 0x7ffe70540b40) = -1 ENOENT (No such file or directory) stat("/home/pks/.gitconfig", 0x7ffe70540b20) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 stat("/home/pks/.gitconfig", 0x7ffe70540b40) = -1 ENOENT (No such file or directory) stat("/home/pks/.gitconfig", 0x7ffe70540b20) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 stat("/tmp/repo/.git/config", {st_mode=S_IFREG\|0644, st_size=92, ...}) = 0 stat("/home/pks/.gitconfig", 0x7ffe70540b40) = -1 ENOENT (No such file or directory) stat("/home/pks/.gitconfig", 0x7ffe70540b20) = -1 ENOENT (No such file or directory) stat("/home/pks/.config/git/config", {st_mode=S_IFREG\|0644, st_size=1154, ...}) = 0 The following benchmark has been performed with and without the stat cache in a best-of-ten run: ``` int lg2_repro(git_repository *repo, int argc, char **argv) { git_config *cfg; int32_t dummy; int i; UNUSED(argc); UNUSED(argv); check_lg2(git_repository_config(&cfg, repo), "Could not obtain config", NULL); for (i = 1; i < 100000; ++i) 
git_config_get_int32(&dummy, cfg, "foo.bar"); git_config_free(cfg); return 0; } ``` Without stat cache: $ time lg2 repro real 0m1.528s user 0m0.568s sys 0m0.944s With stat cache: $ time lg2 repro real 0m0.526s user 0m0.268s sys 0m0.258s This benchmark shows a nearly three-fold performance improvement. This change requires that we check our configuration stress tests as we're now in fact becoming more racy. If somebody is writing a configuration file at nearly the same time (there is a window of 100ns on Windows-based systems), then it might be that we do not realize that this file has actually changed and thus may not re-read it. This will only happen if either an external process is rewriting the configuration file or if the same process has multiple `git_config` structures pointing to the same file, where one of them is being used to write and the other one is used to read values.
d0868646	2019-06-21T11:43:09	config: use `git_config_file` in favor of `struct config_file`
5811e3ba	2019-06-13T19:16:32	config_file: use `wildmatch` to evaluate conditionals We currently use `p_fnmatch` to compute whether a given "gitdir:" or "gitdir/i:" conditional matches the current configuration file path. As git.git has moved to use `wildmatch` instead of `p_fnmatch` throughout its complete codebase, we evaluate conditionals inconsistently with git.git in some special cases. Convert `p_fnmatch` to use `wildmatch`. The `FNM_LEADINGDIR` flag cannot be translated to `wildmatch`, but in fact git.git doesn't use it here either. And in fact, dropping it while we go increases compatibility with git.git.
cf1a114b	2019-06-13T19:10:22	config_file: do not include trailing '/' for "gitdir" conditionals When evaluating "gitdir:" and "gitdir/i:" conditionals, we currently compare the given pattern with the value of `git_repository_path`. Thing is though that `git_repository_path` returns the gitdir path with trailing '/', while we actually need to match against the gitdir without it. Fix this issue by stripping the trailing '/' previous to matching. Add various tests to ensure we get this right.
5d987f7d	2019-06-13T19:00:06	config_file: refactor `do_match_gitdir` to improve readability The function `do_match_gitdir` has some horribly named parameters and variables. Rename them to improve readability. Furthermore, fix a potentially undetected out-of-memory condition when appending "**" to the pattern.
451df793	2019-06-13T15:20:23	posix: remove implicit include of "fnmatch.h" We're about to phase out our bundled fnmatch implementation as git.git has moved to wildmatch long ago in 2014. To make it easier to spot which files are still using fnmatch, remove the implicit "fnmatch.h" include in "posix.h" and instead include it explicitly.
02683b20	2019-01-12T23:06:39	regexec: prefix all regexec function calls with p_ Prefix all the calls to the regexec family of functions with `p_`. This allows us to swap out all the regular expression functions with our own implementation. Move the declarations to `posix_regex.h` for simpler inclusion.
e44110db	2019-03-20T12:28:45	Correctly write to missing locked global config Opening a default config when ~/.gitconfig doesn't exist, locking it, and attempting to write to it causes an assertion failure. Treat non-existent global config file content as an empty string.
bc5b19e6	2019-04-29T09:01:45	Merge pull request #4561 from pks-t/pks/downcasting [RFC] util: introduce GIT_DOWNCAST macro
cc8a9892	2019-04-16T18:13:31	config_file: check result of git_array_alloc git_array_alloc can return NULL if no memory is available, causing a segmentation fault in memset. This adds GIT_ERROR_CHECK_ALLOC similar to how other parts of the code base deal with the return value of git_array_alloc.
65203b5a	2019-04-16T13:21:16	config_file: make use of `GIT_CONTAINER_OF` macro
f673e232	2018-12-27T13:47:34	git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.
d06d4220	2018-10-05T10:56:02	config_file: properly ignore includes without "path" value In case a configuration includes a key "include.path=" without any value, the generated configuration entry will have its value set to `NULL`. This is unexpected by the logic handling includes, and as soon as we try to calculate the included path we will unconditionally dereference that `NULL` pointer and thus segfault. Fix the issue by returning early in both `parse_include` and `parse_conditional_include` in case where the `file` argument is `NULL`. Add a test to avoid future regression. The issue has been found by the oss-fuzz project, issue 10810.
b78f4ab0	2018-08-16T12:22:03	config_entries: refactor entries iterator memory ownership Right now, the config file code requires us to pass in its backend to the config entry iterator. This is required with the current code, as the config file backend will first create a read-only snapshot which is then passed to the iterator just for that purpose. So after the iterator is getting free'd, the code needs to make sure that the snapshot gets free'd, as well. By now, though, we can easily refactor the code to be more efficient and remove the reverse dependency from iterator to backend. Instead of creating a read-only snapshot (which also requires us to re-parse the complete configuration file), we can simply duplicate the config entries and pass those to the iterator. Like that, the iterator only needs to make sure to free the duplicated config entries, which is trivial to do and clears up memory ownership by a lot.
123e5963	2018-08-10T18:59:59	config_entries: abstract away reference counting Instead of directly calling `git_atomic_inc` in users of the config entries store, provide a `git_config_entries_incref` function to further decouple the interfaces. Convert the refcount to a `git_refcount` structure while at it.
5a7e0b3c	2018-08-10T18:49:38	config_entries: abstract away iteration over entries The nice thing about our `git_config_iterator` interfaces is that nobody needs to know anything about the implementation details. All that is required is to obtain the iterator via any backend and then use it by executing generic functions. We can thus completely internalize all the implementation details of how to iterate over entries into the config entries store and simply create such an iterator in our config file backend when we want to iterate its entries. This further decouples the config file backend from the config entries store.
60ebc137	2018-08-10T14:53:09	config_entries: abstract away retrieval of config entries The code accessing config entries in the `git_config_entries` structure is still much too intimate with implementation details, directly accessing the maps and handling indices. Provide two new functions to get config entries from the internal map structure to decouple the interfaces and use them in the config file code. The function `git_config_entries_get` will simply look up the entry by name and, in the case of a multi-value, return the last occurrence of that entry. The second function, `git_config_entries_get_unique`, will only return an entry if it is unique and not included via another configuration file. This one is required to properly implement write operations for single entries, as we refuse to write to or delete a single entry if it is not clear which one was meant.
fb8a87da	2018-08-10T14:50:15	config_entries: rename functions and structure The previous commit simply moved all code that is required to handle config entries to a new module without yet adjusting any of the function and structure names to help readability. We now rename things accordingly to have a common "git_config_entries" entries instead of the old "diskfile_entries" one.
04f57d51	2018-08-10T13:33:02	config_entries: pull out implementation of entry store The configuration entry store that is used for configuration files needs to keep track of all entries in two different structures: - a singly linked list is being used to be able to iterate through configuration files in the order they have been found - a string map is being used to efficiently look up configuration entries by their key This store is thus something that may be used by other, future backends as well to abstract away implementation details and iteration over the entries. Pull out the necessary functions from "config_file.c" and move them into their own "config_entries.c" module. For now, this is simply moving over code without any renames and/or refactorings to help reviewing.
d75bbea1	2018-08-10T14:35:23	config_file: remove unnecessary snapshot indirection The implementation for config file snapshots has an unnecessary redirection from `config_snapshot` to `git_config_file__snapshot`. Inline the call to `git_config_file__snapshot` and remove it.
b944e137	2018-08-10T13:03:33	config: rename "config_file.h" to "config_backend.h" The header "config_file.h" has a list of inline-functions to access the contents of a config backend without directly messing with the struct's function pointers. While all these functions are called "git_config_file_", they are in fact completely backend-agnostic and don't care whether it is a file or not. Rename all the function to instead be backend-agnostic versions called "git_config_backend_" and rename the header to match.
1aeff5d7	2018-08-10T12:52:18	config: move function normalizing section names into "config.c" The function `git_config_file_normalize_section` is never being used in any file different than "config.c", but it is implemented in "config_file.c". Move it over and make the symbol static.
f2694635	2018-09-06T14:17:54	config_file: fix quadratic behaviour when adding config multivars In case where we add multiple configuration entries with the same key to a diskfile backend, we always need to iterate the list of this key to find the last entry due to the list being a singly-linked list. This is obviously quadratic behaviour, and this has sure enough been found by oss-fuzz by generating a configuration file with 50k lines, where most of them have the same key. While the issue will not arise with "sane" configuration files, an adversary may trigger it by providing a crafted ".gitmodules" file, which is delivered as part of the repo and also parsed by the configuration parser. The fix is trivial: store a pointer to the last entry of the list in its head. As there are only two locations now where we append to this data structure, maintaining this pointer is trivial, too. We can also optimize retrieval of a single value via `config_get`, where we previously had to chase the `next` pointer to find the last entry that was added. Using our configuration file fuzzer with a corpus that has a single file with 50000 "-=" lines previously took around 21s. With this optimization the same file scans in about 0.053s, which is a nearly 400-fold improvement. But in most cases with a "normal" amount of same-named keys it's not going to matter anyway.
ec76a1aa	2018-08-05T14:37:08	Add a comment
019409be	2018-08-05T14:25:22	Don't error on missing section, just continue
c4d7fa95	2018-07-22T23:31:19	config_file: Don't crash on options without a section
e1e90dcc	2018-01-09T14:52:34	config_file: avoid free'ing OOM buffers Buffers which ran out of memory will never have any memory attached to them. As such, it is not necessary to call `git_buf_free` if the buffer is out of memory.
e51e29e8	2017-11-12T13:59:47	config_parse: have `git_config_parse` own entry value and name The function `git_config_parse` uses several callbacks to pass data along to the caller as it parses the file. One design shortcoming here is that strings passed to those callbacks are expected to be freed by them, which is really confusing. Fix the issue by changing memory ownership here. Instead of expecting the `on_variable` callbacks to free memory for `git_config_parse`, just do it inside of `git_config_parse`. While this obviously requires a bit more memory allocation churn due to having to copy both name and value at some places, this shouldn't be too much of a burden.
ecf4f33a	2018-02-08T11:14:48	Convert usage of `git_buf_free` to new `git_buf_dispose`
6a15f657	2018-02-09T13:02:26	config_file: iterate over keys in the order they were added Currently, all configuration entries were only held in a string map, making iteration order mostly based on the hash of each entry's key. Now that we have extended the `diskfile_entries` structure by a list of config entries, we can effectively iterate through entries in the order they were added, though.
3a82475f	2018-02-09T12:49:45	config_file: add list holding config entries in order of appearance Right now, we only keep all configuration entries in a string map. This is required to efficiently access configuration entries by keys. It has the disadvantage of not being able to iterate through configuration entries in the order they were read, though. Instead, we only iterate through entries in a seemingly random order. Extend `diskfile_entries` by another list holding configuration entries. Like this, we maintain all entries in two data structures and can use the required one based on the current use case.
8c0b0717	2018-02-09T12:32:24	config_file: pass complete entry structure into `append_entry` Currently, we only pass the entry map into `append_entry` to append new configuration entries to it. Instead, though, we can just pass the complete `diskfile_entries` structure into it. This allows us to easily extend `diskfile_entries` by another singly linked list of configuration entries.
eafb8402	2018-02-09T12:29:16	config_file: rename `refcounted_strmap` to `diskfile_entries` The config file parsing code all revolves around the `refcounted_strmap` structure, which is a map of entry names to their respective keys. This naming scheme made grasping the code quite hard, warranting a rename. A good alternative is `diskfile_entries`, making clear that this really only holds all configuration entries. Furthermore, we are about to introduce a new linked list of configuration entries into the configuration file code. This list will be used to iterate over configuration entries in the order they are listed inside of the parsed configuration file. After renaming `refcounted_strmap` to `diskfile_entries`, this struct also becomes the natural target where to add that new list. Like this, data structures holding all entries are neatly contained inside of it.
e3c8462c	2018-02-09T11:50:28	config_file: rename parse_data struct The struct `parse_data` sounds as if it was defined and passed to us from the configuration parser, which is not true. Instead, `parse_data` is specific to the diskfile backend parsing logic. Rename it to `diskfile_parse_state` to make that clearer. This also follows existing naming patterns with the "diskfile" prefix.
18117a6c	2018-02-09T11:43:13	config_file: use new line to declare new variable
b6f88706	2018-02-09T11:39:26	config_file: refactor freeing of config entry lists The interface for freeing config list entries is more tangled than required. Instead of calling `cvar_free` for every list entry in `free_vars`, we now just provide a function `config_entry_list_free`. This function will iterate through all list entries and free the associated configuration entry as well as the list entry itself.
2d1f6676	2018-02-09T11:35:54	config_file: rename cvar_t struct to config_entry_list The `cvar_t` structure is really awkward to grasp, because its name actively hinders discovery of what it actually is. As it is nothing more than a singly-linked list of configuration entries, rename it to just that: `config_entry_list`.
26cf48fc	2018-02-09T11:35:16	config_file: move include depth into config entry In order to reject writes to included configuration entries, we need to keep track of whether an entry was included via another configuration file or not. This information is being stored in the `cvar` structure, which is a rather weird location, as it is only used to create a list structure of config entries. Move the include depth into the structure `git_config_entry` instead. While this fixes the layering issue, it enables users of libgit2 to access the depth, too.
fcb0d841	2018-02-09T11:19:47	config_file: move cvar handling into `append_entry` The code appending new configuration entries to our current list first allocates the `cvar` structure and then passes it to `append_entry`. As we want to extend `append_entry` to store configuration entries in a map as well as in a list for ordered iteration, we will have to create two `cvar` structures, though. As such, the future change will become much easier when allocation of the `cvar` structure is done in `append_entry` itself.
dfcd923c	2018-02-09T11:15:32	config_file: remove unused list iteration macros We currently provide a lot of macros for the `cvar_t` structure which are never being used. In fact, the only macro we need is to access the `next` pointer of `cvar_t`, which really does not require a macro at all. Remove all these macros and replace usage of `CVAR_LIST_NEXT(cvar)` with `cvar->next`.
9cd0c6f1	2018-02-28T16:01:16	config: return an error if config_refresh is called on a snapshot Instead of treating it as a no-op, treat it as a programming error and return the same kind of error as if you called to set or delete variables on a snapshot.
2424e64c	2018-02-28T12:06:02	config: harden our use of the backend objects a bit When we create an iterator we don't actually know that we have a live config object and we must instead only rely on the header. We fixed it to use this in a previous commit, but this makes it harder to misuse by converting to use the header object in the typecast. We also guard inside the `config_refresh` function against being given a snapshot (although callers right now do check).
1785de4e	2018-02-28T11:46:17	config: move the level field into the header We use it in a few places where we might have a full object or a snapshot so move it to where we can actually access it.
c1524b2e	2018-02-28T11:33:11	config: move the repository to the diskfile header We pass this around and when creating a new iterator we need to read the repository pointer. Put it in a common place so we can reach it regardless of whether we got a full object or a snapshot.
9e66590b	2017-07-21T13:01:43	config_parse: use common parser interface As the config parser is now cleanly separated from the config file code, we can easily refactor the code and make use of the common parser module. This removes quite a lot of duplicated functionality previously used for handling the actual parser state and replaces it with the generic interface provided by the parser context.
1953c68b	2017-11-11T17:12:31	config_file: split out module to parse config files The configuration file code grew quite big and intermingles both actual configuration logic as well as the parsing logic of the configuration syntax. This makes it hard to refactor the parsing logic on its own and convert it to make use of our new parsing context module. Refactor the code and split it up into two parts. The config file code will only handle actual handling of configuration files, includes and writing new files. The newly created config parser module is then only responsible for parsing the actual contents of a configuration file, leaving everything else to callbacks provided to its provided function `git_config_parse`.
42627933	2017-11-04T18:03:26	Merge remote-tracking branch 'upstream/master' into pks/conditional-includes
1475b981	2017-11-04T18:00:56	config: keep the output parameter at the start of the function
94e30d9b	2017-10-30T15:55:18	config: check for OOM when writing
8ec806d7	2017-10-30T06:23:31	config: preserve the original case when writing out new sections and vars For sections we will still use the existing one even if the case disagrees, but the variable always gets written with the case given by the caller.
f7d837c8	2017-05-24T12:12:29	config_file: implement "gitdir/i" conditional Next to the "gitdir" conditional for including other configuration files, there's also a "gitdir/i" conditional. In contrast to the former one, path matching with "gitdir/i" is done case-insensitively. This commit implements the case-insensitive condition.
071b6c06	2017-05-24T11:13:36	config_file: implement conditional "gitdir" includes Upstream git.git has implemented the ability to include other configuration files based on conditions. Right now, this only includes the ability to include a file based on the gitdir-location of the repository the currently parsed configuration file belongs to. This commit implements handling these conditional includes for the case-sensitive "gitdir" condition.
9d7a75be	2017-08-25T19:15:00	config_file: make repo and config path accessible to reader The reader machinery will be extended to handle conditional includes. The only conditions that currently exist all match against the git directory of the repository the config file belongs to. As such, we need to have access to the repository when reading configuration files to properly handle these conditions. One specialty of these conditional includes is that the actual pattern may also be a relative pattern starting with "./". In this case, we have to match the pattern against the path relative to the config file which is currently being parsed. So besides the repository, we also have to pass down the path to the current config file that is being parsed.
d5b9d9e9	2017-05-23T10:53:49	config_file: extract function to parse include path The logic inside this function will be required later on, when implementing conditional includes. Extract it into its own function to ease the implementation.
529e873c	2017-05-23T11:51:00	config: pass repository when opening config files Our current configuration logic is completely oblivious of any repository, but only cares for actual file paths. Unfortunately, we are forced to break this assumption by the introduction of conditional includes, which are evaluated in the context of a repository. Right now, only one conditional exists with "gitdir:" -- it will only include the configuration if the current repository's git directory matches the value passed to "gitdir:". To support these conditionals, we have to break our API and make the repository available when opening a configuration file. This commit extends the `open` call of configuration backends to include another repository and adjusts existing code to have it available. This includes the user-visible functions `git_config_add_file_ondisk` and `git_config_add_backend`.
1560b580	2017-08-15T10:35:47	Merge pull request #4288 from pks-t/pks/include-fixups Include fixups
1b329089	2017-05-31T22:27:19	config_file: refuse modifying included variables Modifying variables pulled in by an included file currently succeeds, but it doesn't actually do what one would expect, as refreshing the configuration will cause the values to reappear. As we are currently not really able to support this use case, we will instead just return an error for deleting and setting variables which were included via an include.
28c2cc3d	2017-05-31T16:41:44	config_file: move reader into `config_read` only Right now, we have multiple call sites which initialize a `reader` structure. As the structure is only actually used inside of `config_read`, we can instead just move the reader inside of the `config_read` function. Instead, we can just pass in the configuration file into `config_read`, which eases code readability.
83bcd3a1	2017-05-31T22:45:25	config_file: refresh all files if includes were modified Currently, we only re-parse the top-level configuration file when it has changed itself. This can cause problems when an include is changed, as we were not updating all values correctly. Instead of conditionally reparsing only refreshed files, the logic becomes much clearer and easier to follow if we always re-parse the top-level configuration file when either the file itself or one of its included configuration files has changed on disk. This commit implements this logic. Note that this might impact performance in some cases, as we need to re-read all configuration files whenever any of the included files changed. It could increase performance to just re-parse include files which have actually changed, but this would compromise maintainability of the code without much gain. The only case where we will gain anything is when we actually use includes and when only these includes are updated, which will probably be quite an unusual scenario to actually be worthwhile to optimize.
56a7a264	2017-05-31T14:50:40	config_file: remove unused backend field from parse data The backend passed to `config_read` is never actually used anymore, so we can remove it from the function and the `parse_data` structure.
3a7f7a6e	2017-05-31T14:43:46	config_file: pass reader directly to callbacks Previously, the callbacks passed to `config_parse` got the reader via a pointer to a pointer. This allowed the callbacks to update the caller's `reader` variable when the array holding it has been reallocated. As the array is no longer present, we can simplify the code by making the reader a simple pointer.
73df75d8	2017-05-31T14:34:48	config_file: refactor include handling Current code for configuration files uses the `reader` structure to parse configuration files and store additional metadata like the file's path and checksum. These structures are stored within an array in the backend itself, which causes multiple problems. First, it does not make sense to keep around the file's contents with the backend itself. While this data is usually free'd before being added to the backend, this brings along somewhat intricate lifecycle problems. A better solution would be to store only the file paths as well as the checksum of the currently parsed content only. The second problem is that the `reader` structures are stored inside an array. When re-parsing configuration files due to changed contents, we may cause this array to be reallocated, requiring us to update pointers held by callers. Furthermore, we do not keep track of includes which are already associated to a reader inside of this array. This causes us to add readers multiple times to the backend, e.g. in the scenario of refreshing configurations. This commit fixes these shortcomings. We introduce a split between the parsing data and the configuration file's metadata. The `reader` will now only hold the file's contents and the parser state and the new `config_file` structure holds the file's path and checksum. Furthermore, the new structure is a recursive structure in that it will also hold references to the files it directly includes. The diskfile is changed to only store the top-level configuration file. These changes allow us further refactorings and greatly simplify understanding the code.
0c7f49dd	2017-06-30T13:39:01	Make sure to always include "common.h" first Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.
a693b873	2017-06-07T10:20:44	buffer: use `git_buf_init` with length The `git_buf_init` function has an optional length parameter, which will cause the buffer to be initialized and allocated in one step. This can be used instead of static initialization with `GIT_BUF_INIT` followed by a `git_buf_grow`. This patch does so for two functions where it is applicable.
a1023a43	2017-05-20T17:18:07	Merge pull request #4179 from libgit2/ethomson/expand_tilde Introduce home directory expansion function for config files, attribute files
4467aeac	2017-03-28T09:00:48	config_file: handle errors other than OOM while parsing section headers The current code in `parse_section_header_ext` is only prepared to properly handle out-of-memory conditions for the `git_buf` structure. While very unlikely and probably caused by a programming error, it is also possible to run into error conditions other than out-of-memory previous to reaching the actual parsing loop. In these cases, we will run into undefined behavior as the `rpos` variable is only initialized after these triggerable errors, but we use it in the cleanup-routine. Fix the issue by unifying the function's cleanup code with an `end_error` section, which will not use the `rpos` variable.
29aef948	2017-03-23T11:59:06	config, attrcache: don't fallback to dirs literally named `~` The config and attrcache file reading code would attempt to load a file in a home directory by expanding the `~` and looking for the file, using `git_sysdir_find_global_file`. If the file was not found, the error handling would look for the literal path, eg `~/filename.txt`. Use the new `git_config_expand_global_file` instead, which allows us to get the path to the file separately, when the path is prefixed with `~/`, and fail with a not found error without falling back to looking for the literal path.
301dc26a	2016-06-20T13:15:35	fix error when including a missing config file relative to the home directory
2cf48e13	2017-03-20T09:34:41	config_file: check if section header buffer runs out of memory While parsing section headers, we use a buffer to store the actual section name. We do not check though if the buffer runs out of memory at any stage. Do so.

e7604da8

2020-04-05T14:51:56

config: use GIT_ASSERT

d1409f48

2020-05-06T19:57:07

config: ignore unreadable configuration files Modified `config_file_open()` so it returns 0 if the config file is not readable, which happens on global config files under macOS sandboxing (note that for some reason `access(F_OK)` DOES work with sandboxing, but it is lying). Without this read check sandboxed applications on macOS can not open any repository, because `config_file_read()` will return GIT_ERROR when it cannot read the global /Users/username/.gitconfig file, and the upper layers will just completely abort on GIT_ERROR when attempting to load the global config file, so no repositories can be opened.

56b203a5

2019-10-24T12:20:27

config_file: keep reference to config entries when creating iterator When creating a configuration file iterator, then we first refresh the backend and then afterwards duplicate all refreshed configuration entries into the iterator in order to avoid seeing any concurrent modifications of the entries while iterating. The duplication of entries is not guarded, though, as we do not increase the refcount of the entries that we duplicate right now. This opens us up for a race, as another thread may concurrently refresh the repository configuration and thus swap out the current set of entries. As we didn't increase the refcount, this may lead to the entries being free'd while we iterate over them in the first thread. Fix the issue by properly handling the lifecycle of the backend's entries via `config_file_entries_take` and `git_config_entries_free`, respectively.

0927156a

2019-10-24T12:32:11

config_file: refactor taking entries ref to return an error code The function to take a reference to the config file's config entries currently returns the reference via return value. Due to this, it's harder than necessary to integrate into our typical coding style, as one needs to make sure that a proper error code is set before erroring out from the caller. This bites us in `config_file_delete`, where we call `goto out` directly when `config_file_entries_take` returns `NULL`, but we actually forget to set up the error code and thus return success. Fix the issue by refactoring the function to return an error code and pass the reference via an out-pointer.

db301087

2019-10-24T12:17:02

config_file: remove unused includes

c2749849

2019-10-24T12:00:11

config_file: rename function names As with the preceding commit, this commit renames backend functions of the configuration file backend. This helps to clearly separate functionality and also to be able to see from backtraces which backend is currently in use.

7aacf027

2019-09-13T08:55:33

global: convert all users of POSIX regex to use our new regexp API The old POSIX regex API has been superseded by our new regexp API. Convert all users to make use of the new one.

722ba93f

2019-08-01T15:14:06

config: implement "onbranch" conditional With Git v2.23.0, the conditional include mechanism gained another new conditional "onbranch". As the name says, it will cause a file to be included if the "onbranch" pattern matches the currently checked out branch. Implement this new condition and add a bunch of tests.

37ebe9ad

2019-07-24T18:49:08

config_backend: rename internal structures The internal backend structures are kind-of legacy and do not really speak for themselves. Rename them accordingly to make them easier to understand.

2bff84ba

2019-07-26T21:02:56

config_file: separate out read-only backend To further distinguish the file writeable and readonly backends, separate the readonly backend into its own "config_snapshot.c" implementation. The snapshot backend can be generically used to snapshot any type of backend.

f0b10066

2019-07-24T18:37:14

config_file: fix cast of readonly backend In `backend_readonly_free`, the passed in config backend is being cast to a `diskfile_backend` instead of to a `diskfile_readonly_backend`. While this works out just fine because we only access its header values, which were shared between both backends, it is undefined behaviour. Use the correct type to fix this.

a3159df8

2019-07-24T18:31:43

config_file: remove shared `diskfile_header` struct The `diskfile_header` structure is shared between both `diskfile_backend` and `diskfile_readonly_backend`. The separation and resulting casting is confusing at times and a source for programming errors. Remove the shared structure and inline them directly.

271e5fba

2019-07-24T18:18:18

config_file: duplicate accessors for readonly backend While most functions of the readonly configuration backend are implemented separately from the writeable configuration backend, the two functions `config_iterator_new` and `config_get` are shared between both. This sharing makes it necessary to have some shared data structures, which is the `diskfile_header` structure. Unfortunately, this makes the backends harder to grasp than necessary due to all the casting between structs and also quite error prone. Reimplement those functions for the readonly backends. As readonly backends cannot be refreshed anyway, we can remove the calls to `config_refresh` in there.

4e7ce1fb

2019-07-24T18:13:52

config_file: reimplement `config_readonly_open` generically The `config_readonly_open` function currently receives as input a diskfile backend and will copy its entries to a new snapshot. This is rather intimate, as we need to assume that the source config backend is in fact a diskfile entry. We can do better than this though by using generic methods to copy contents of the provided backend, e.g. by using a config iterator. This also allows us to decouple the read-only backend from the read-write backend.

2766b92d

2019-07-21T15:10:34

config_file: refresh when creating an iterator When creating a new iterator for a config file backend, then we should always make sure that we're up to date by calling `config_refresh`. Otherwise, we might not notice when another process has modified the configuration file and thus will represent outdated values. Add two tests to config::stress that verify that we get up-to-date values when reading configuration entries via `git_config_iterator`.

9fac8b78

2019-07-21T15:08:22

config_file: do not refresh read-only backends If calling `config_refresh` on a read-only configuration file backend, then we will segfault when comparing the timestamp of the file due to `path` being uninitialized. As a read-only snapshot should not be refreshed anyway and stay consistent, we can simply return early when calling `config_refresh` on a read-only snapshot.

28d11b59

2019-07-21T14:41:21

config_file: consistently use `GIT_CONTAINER_OF`

dbeadf8a

2019-07-11T10:56:05

config_parse: provide parser init and dispose functions Right now, all configuration file backends are expected to directly mess with the configuration parser's internals in order to set it up. Let's avoid doing that by implementing both a `git_config_parser_init` and `git_config_parser_dispose` function to clearly define the interface between configuration backends and the parser. Ideally, we would make the `git_config_parser` structure definition private to its implementation. But as that would require an additional memory allocation that was not required before we just live with it being visible to others.

32157526

2019-07-11T11:10:02

config_file: refactor error handling in `config_write` Error handling in `config_write` is rather convoluted and does not match our current code style. Refactor it to make it easier to understand.

820fa1a3

2019-07-11T11:04:33

config_file: internalize `git_config_file` struct With the previous commits, we have finally separated the config parsing logic from the specific configuration file backend. Due to that, we can now move the `git_config_file` structure into the config file backend's implementation so that no other code may accidentally start using it again. Furthermore, we rename the structure to `diskfile` to make it obvious that it is internal only, and to unify it with the naming scheme of the other diskfile structures.

6e6da75f

2019-07-11T11:00:05

config_parse: remove use of `git_config_file` The config parser code needs to keep track of the current parsed file's name so that we are able to provide proper error messages to the user. Right now, we do that by storing a `git_config_file` in the parser structure, but as that is a specific backend and the parser aims to be generic, it is a layering violation. Switch over to use a simple string to fix that.

54d350e0

2019-06-21T12:53:43

config_file: embed file in diskfile parse data The config file code needs to keep track of the actual `git_config_file` structure, as it not only contains the path of the current configuration file, but it also keeps track of all includes of that file. Right now, we keep track of that structure via the `git_config_parser`, but as that's supposed to be a backend-generic implementation of configuration parsing, it's a layering violation to have it in there. Switch over the config file backend to use its own config file structure that's embedded in the backend parse data. This allows us to switch over the generic config parser to avoid using the `git_config_file` structure.

2ba7020f

2019-06-27T09:23:59

config_file: avoid re-reading files on write When we rewrite the configuration file due to any of its values being modified, we call `config_refresh` to update the in-memory representation of our config file backend. This is needlessly wasteful though, as `config_refresh` will always open the on-disk representation to read the file contents while we already know the complete file contents at this point in time as we have just written it to disk.

Implement a new function `config_refresh_from_buffer` that will refresh the backend's config entries from a buffer instead of from the config file itself. Note that this will thus _not_ update the backend's timestamp, which will cause us to re-read the buffer when performing a read operation on it. But this is still an improvement as we now lazily re-read the contents, and most importantly we will avoid constantly re-reading the contents if we perform multiple write operations.

The following strace demonstrates this if we're re-writing a key multiple times. It uses our config example with `config_set` changed to update the file 10 times with different keys:

    $ strace lg2 config x.x z |& grep '^open.*config'
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3

And now with the optimization of `config_refresh_from_buffer`:

    $ strace lg2 config x.x z |& grep '^open.*config'
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3
    open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4
    open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4
    open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4
    open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4
    open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4
    open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4
    open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4
    open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4
    open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4
    open("/tmp/repo/.git/config.lock", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0666) = 3
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 4

As can be seen, this results in considerably fewer `open` calls.

a0dc3027

2019-06-27T08:54:51

config_file: split out function that sets config entries Updating a config file backend's config entries is a bit more involved, as it requires clearing of the old config entries as well as handling locking correctly. As we will need this functionality in a future patch to refresh config entries from a buffer, let's extract this into its own function `config_set_entries`.

985f5cdf

2019-06-27T08:41:16

config_file: split out function that reads entries from a buffer The `config_read` function currently performs both reading the on-disk config file as well as parsing the retrieved buffer contents. To optimize how we refresh our config entries from an in-memory buffer, we need to be able to directly parse buffers, though, without involving any on-disk files at all. Extract a new function `config_read_buffer` that sets up the parsing logic and then parses config entries from a buffer, only. Have `config_read` use it to avoid duplicated logic.

3e1c137a

2019-06-27T08:24:21

config_file: move refresh into `write` function We are quite lazy in how we refresh our config file backend when updating any of its keys: instead of just updating our in-memory representation of the keys, we just discard the old set of keys and then re-read the config file contents from disk. This refresh currently happens separately at every callsite of `config_write`, but it is clear that we _always_ want to refresh if we have written the config file to disk. If we didn't, then we'd run around with an outdated config file backend that does not represent what we have on disk. By moving the refresh into `config_write`, we are also able to optimize the case where the config file is currently locked. Before, we would've tried to re-read the file even if we have only updated its cached contents without touching the on-disk file. Thus we'd have unnecessarily stat'd the file, even though we know that it shouldn't have been modified in the meantime due to its lock.

d7f58eab

2019-06-21T11:55:21

config_file: implement stat cache to avoid repeated rehashing To decide whether a config file has changed, we always hash its complete contents. This is unnecessarily expensive, as well-behaved filesystems will always update stat information for files which have changed. So before computing the hash, we should first check whether the stat info has actually changed for either the configuration file or any of its includes. This avoids having to re-read the configuration file and its includes every time we check whether it's been modified.

Tracing the for-each-ref example prior to this commit, one can see that we repeatedly re-open both the repo configuration as well as the global configuration:

    $ strace lg2 for-each-ref |& grep config
    access("/home/pks/.gitconfig", F_OK) = -1 ENOENT (No such file or directory)
    access("/home/pks/.config/git/config", F_OK) = 0
    access("/etc/gitconfig", F_OK) = -1 ENOENT (No such file or directory)
    stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0
    access("/tmp/repo/.git/config", F_OK) = 0
    stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    stat("/home/pks/.gitconfig", 0x7ffd15c05290) = -1 ENOENT (No such file or directory)
    access("/home/pks/.gitconfig", F_OK) = -1 ENOENT (No such file or directory)
    stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0
    access("/home/pks/.config/git/config", F_OK) = 0
    stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0
    open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3
    stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    stat("/home/pks/.gitconfig", 0x7ffd15c051f0) = -1 ENOENT (No such file or directory)
    stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0
    open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3
    stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    stat("/home/pks/.gitconfig", 0x7ffd15c05090) = -1 ENOENT (No such file or directory)
    stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0
    open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3
    stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    stat("/home/pks/.gitconfig", 0x7ffd15c05090) = -1 ENOENT (No such file or directory)
    stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0
    open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3
    stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    stat("/home/pks/.gitconfig", 0x7ffd15c05090) = -1 ENOENT (No such file or directory)
    stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0
    open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3

With the change, we only stat those files and open each of them a single time:

    $ strace lg2 for-each-ref |& grep config
    access("/home/pks/.gitconfig", F_OK) = -1 ENOENT (No such file or directory)
    access("/home/pks/.config/git/config", F_OK) = 0
    access("/etc/gitconfig", F_OK) = -1 ENOENT (No such file or directory)
    stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0
    access("/tmp/repo/.git/config", F_OK) = 0
    stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0
    stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0
    open("/tmp/repo/.git/config", O_RDONLY|O_CLOEXEC) = 3
    stat("/home/pks/.gitconfig", 0x7ffe70540d20) = -1 ENOENT (No such file or directory)
    access("/home/pks/.gitconfig", F_OK) = -1 ENOENT (No such file or directory)
    stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0
    access("/home/pks/.config/git/config", F_OK) = 0
    stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0
    stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0
    open("/home/pks/.config/git/config", O_RDONLY|O_CLOEXEC) = 3
    stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0
    stat("/home/pks/.gitconfig", 0x7ffe70540ca0) = -1 ENOENT (No such file or directory)
    stat("/home/pks/.gitconfig", 0x7ffe70540c80) = -1 ENOENT (No such file or directory)
    stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0
    stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0
    stat("/home/pks/.gitconfig", 0x7ffe70540b40) = -1 ENOENT (No such file or directory)
    stat("/home/pks/.gitconfig", 0x7ffe70540b20) = -1 ENOENT (No such file or directory)
    stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0
    stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0
    stat("/home/pks/.gitconfig", 0x7ffe70540b40) = -1 ENOENT (No such file or directory)
    stat("/home/pks/.gitconfig", 0x7ffe70540b20) = -1 ENOENT (No such file or directory)
    stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0
    stat("/tmp/repo/.git/config", {st_mode=S_IFREG|0644, st_size=92, ...}) = 0
    stat("/home/pks/.gitconfig", 0x7ffe70540b40) = -1 ENOENT (No such file or directory)
    stat("/home/pks/.gitconfig", 0x7ffe70540b20) = -1 ENOENT (No such file or directory)
    stat("/home/pks/.config/git/config", {st_mode=S_IFREG|0644, st_size=1154, ...}) = 0

The following benchmark has been performed with and without the stat cache in a best-of-ten run:

```
int lg2_repro(git_repository *repo, int argc, char **argv)
{
	git_config *cfg;
	int32_t dummy;
	int i;

	UNUSED(argc);
	UNUSED(argv);

	check_lg2(git_repository_config(&cfg, repo),
		"Could not obtain config", NULL);

	for (i = 1; i < 100000; ++i)
		git_config_get_int32(&dummy, cfg, "foo.bar");

	git_config_free(cfg);

	return 0;
}
```

Without stat cache:

    $ time lg2 repro
    real    0m1.528s
    user    0m0.568s
    sys     0m0.944s

With stat cache:

    $ time lg2 repro
    real    0m0.526s
    user    0m0.268s
    sys     0m0.258s

This benchmark shows a nearly three-fold performance improvement.

This change requires that we check our configuration stress tests as we're now in fact becoming more racy. If somebody is writing a configuration file at nearly the same time (there is a window of 100ns on Windows-based systems), then it might be that we fail to realize that this file has actually changed and thus may not re-read it. This will only happen if either an external process is rewriting the configuration file or if the same process has multiple `git_config` structures pointing to the same file, where one of the two is being used to write and the other one is used to read values.
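The stat check described in this message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the `file_stamp` type, its functions, and the choice of fields (modification time, size, inode) are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical cache record; the field selection is an assumption. */
struct file_stamp {
	time_t mtime;
	off_t size;
	ino_t ino;
};

/* Remember the stat information that was current when the file was read. */
static void file_stamp_set(struct file_stamp *stamp, const struct stat *st)
{
	assert(stamp != NULL && st != NULL);
	stamp->mtime = st->st_mtime;
	stamp->size = st->st_size;
	stamp->ino = st->st_ino;
}

/*
 * Return true if the file has to be re-read (and re-hashed), updating the
 * stamp in that case. Return false if the cached stat info still matches.
 */
static bool file_stamp_changed(struct file_stamp *stamp, const struct stat *st)
{
	assert(stamp != NULL && st != NULL);
	if (stamp->mtime == st->st_mtime &&
	    stamp->size == st->st_size &&
	    stamp->ino == st->st_ino)
		return false;
	file_stamp_set(stamp, st);
	return true;
}
```

A caller would run `stat` on the configuration file and each include, and only open and hash a file when `file_stamp_changed` returns true. With a timestamp of one-second granularity, as in this sketch, a rewrite that keeps size and inode within the same second goes unnoticed, which is the kind of race the message warns about.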

d0868646

2019-06-21T11:43:09

config: use `git_config_file` in favor of `struct config_file`

5811e3ba

2019-06-13T19:16:32

config_file: use `wildmatch` to evaluate conditionals We currently use `p_fnmatch` to compute whether a given "gitdir:" or "gitdir/i:" conditional matches the current configuration file path. As git.git has moved to use `wildmatch` instead of `p_fnmatch` throughout its complete codebase, we evaluate conditionals inconsistently with git.git in some special cases. Convert `p_fnmatch` to use `wildmatch`. The `FNM_LEADINGDIR` flag cannot be translated to `wildmatch`, but in fact git.git doesn't use it here either. And in fact, dropping it while we go increases compatibility with git.git.

cf1a114b

2019-06-13T19:10:22

config_file: do not include trailing '/' for "gitdir" conditionals When evaluating "gitdir:" and "gitdir/i:" conditionals, we currently compare the given pattern with the value of `git_repository_path`. Thing is though that `git_repository_path` returns the gitdir path with trailing '/', while we actually need to match against the gitdir without it. Fix this issue by stripping the trailing '/' previous to matching. Add various tests to ensure we get this right.
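As an illustration of the stripping step, here is a small sketch. The helper name and its buffer-based interface are hypothetical and were chosen to keep the example self-contained; the actual fix operates on libgit2's own buffer type.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy `path` into `out` without trailing '/' characters, keeping a lone "/".
 * Returns 0 on success or -1 if `out` is too small. Hypothetical helper.
 */
static int strip_trailing_slashes(char *out, size_t outlen, const char *path)
{
	size_t len;

	assert(out != NULL && path != NULL);
	len = strlen(path);
	while (len > 1 && path[len - 1] == '/')
		len--;
	if (len + 1 > outlen)
		return -1;
	memcpy(out, path, len);
	out[len] = '\0';
	return 0;
}
```

With this, a gitdir of `/tmp/repo/.git/` is turned into `/tmp/repo/.git` before being compared against the pattern, so a pattern that names the directory without a trailing slash can match.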

5d987f7d

2019-06-13T19:00:06

config_file: refactor `do_match_gitdir` to improve readability The function `do_match_gitdir` has some horribly named parameters and variables. Rename them to improve readability. Furthermore, fix a potentially undetected out-of-memory condition when appending "**" to the pattern.

451df793

2019-06-13T15:20:23

posix: remove implicit include of "fnmatch.h" We're about to phase out our bundled fnmatch implementation, as git.git moved to wildmatch back in 2014. To make it easier to spot which files are still using fnmatch, remove the implicit "fnmatch.h" include in "posix.h" and instead include it explicitly.

02683b20

2019-01-12T23:06:39

regexec: prefix all regexec function calls with p_ Prefix all the calls to the regexec family of functions with `p_`. This allows us to swap out all the regular expression functions with our own implementation. Move the declarations to `posix_regex.h` for simpler inclusion.

e44110db

2019-03-20T12:28:45

Correctly write to missing locked global config Opening a default config when ~/.gitconfig doesn't exist, locking it, and attempting to write to it causes an assertion failure. Treat non-existent global config file content as an empty string.

bc5b19e6

2019-04-29T09:01:45

Merge pull request #4561 from pks-t/pks/downcasting [RFC] util: introduce GIT_DOWNCAST macro

cc8a9892

2019-04-16T18:13:31

config_file: check result of git_array_alloc git_array_alloc can return NULL if no memory is available, causing a segmentation fault in memset. This adds GIT_ERROR_CHECK_ALLOC similar to how other parts of the code base deal with the return value of git_array_alloc.

65203b5a

2019-04-16T13:21:16

config_file: make use of `GIT_CONTAINER_OF` macro

f673e232

2018-12-27T13:47:34

git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.

d06d4220

2018-10-05T10:56:02

config_file: properly ignore includes without "path" value In case a configuration includes a key "include.path=" without any value, the generated configuration entry will have its value set to `NULL`. This is unexpected by the logic handling includes, and as soon as we try to calculate the included path we will unconditionally dereference that `NULL` pointer and thus segfault. Fix the issue by returning early in both `parse_include` and `parse_conditional_include` in case where the `file` argument is `NULL`. Add a test to avoid future regression. The issue has been found by the oss-fuzz project, issue 10810.
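The early return can be pictured with the following sketch. The function `handle_include` and its parameters are hypothetical stand-ins, not the signatures of `parse_include` or `parse_conditional_include`; the path joining is simplified to a plain `dir/file` concatenation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for an include handler: `file` is the value of
 * "include.path" and may be NULL when the key was given without a value.
 * Returns 0 on success (including "nothing to do") and -1 on error.
 */
static int handle_include(char *out, size_t outlen, const char *dir, const char *file)
{
	int n;

	assert(out != NULL && outlen > 0 && dir != NULL);
	out[0] = '\0';

	/* Return early: there is no path to resolve, and that is not an error. */
	if (file == NULL)
		return 0;

	n = snprintf(out, outlen, "%s/%s", dir, file);
	if (n < 0 || (size_t)n >= outlen) {
		out[0] = '\0';
		return -1;
	}
	return 0;
}
```

Placing the check before any use of `file` means a valueless `include.path` is skipped rather than dereferenced.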

b78f4ab0

2018-08-16T12:22:03

config_entries: refactor entries iterator memory ownership Right now, the config file code requires us to pass in its backend to the config entry iterator. This is required with the current code, as the config file backend will first create a read-only snapshot which is then passed to the iterator just for that purpose. So after the iterator is getting free'd, the code needs to make sure that the snapshot gets free'd, as well. By now, though, we can easily refactor the code to be more efficient and remove the reverse dependency from iterator to backend. Instead of creating a read-only snapshot (which also requires us to re-parse the complete configuration file), we can simply duplicate the config entries and pass those to the iterator. Like that, the iterator only needs to make sure to free the duplicated config entries, which is trivial to do and clears up memory ownership by a lot.

123e5963

2018-08-10T18:59:59

config_entries: abstract away reference counting Instead of directly calling `git_atomic_inc` in users of the config entries store, provide a `git_config_entries_incref` function to further decouple the interfaces. Convert the refcount to a `git_refcount` structure while at it.
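A rough sketch of the idea, under the assumption of a much simpler store than the real one: the `entries` type and its functions are hypothetical, and a plain `int` stands in for the atomic counter, so this version is not thread-safe.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical entry store; a plain int counter stands in for an atomic one. */
struct entries {
	int refcount;
	int *values;
};

static struct entries *entries_new(void)
{
	struct entries *e = calloc(1, sizeof(*e));

	if (e != NULL)
		e->refcount = 1; /* the creator holds the first reference */
	return e;
}

/* Callers take a reference through this function instead of touching the counter. */
static void entries_incref(struct entries *e)
{
	assert(e != NULL && e->refcount > 0);
	e->refcount++;
}

/* Drop one reference; returns 1 if the store was freed, 0 otherwise. */
static int entries_decref(struct entries *e)
{
	assert(e != NULL && e->refcount > 0);
	if (--e->refcount > 0)
		return 0;
	free(e->values);
	free(e);
	return 1;
}
```

Because users only call `entries_incref` and `entries_decref`, the counter's representation can change later without touching any caller.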

5a7e0b3c

2018-08-10T18:49:38

config_entries: abstract away iteration over entries The nice thing about our `git_config_iterator` interfaces is that nobody needs to know anything about the implementation details. All that is required is to obtain the iterator via any backend and then use it by executing generic functions. We can thus completely internalize all the implementation details of how to iterate over entries into the config entries store and simply create such an iterator in our config file backend when we want to iterate its entries. This further decouples the config file backend from the config entries store.

60ebc137

2018-08-10T14:53:09

config_entries: abstract away retrieval of config entries The code accessing config entries in the `git_config_entries` structure is still much too intimate with implementation details, directly accessing the maps and handling indices. Provide two new functions to get config entries from the internal map structure to decouple the interfaces and use them in the config file code. The function `git_config_entries_get` will simply look up the entry by name and, in the case of a multi-value, return the last occurrence of that entry. The second function, `git_config_entries_get_unique`, will only return an entry if it is unique and not included via another configuration file. This one is required to properly implement write operations for single entries, as we refuse to write to or delete a single entry if it is not clear which one was meant.
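The difference between the two lookups can be sketched like this. The sketch uses a flat array and a linear scan instead of the real map, returns NULL where the real functions report an error, and all type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical entry type; `include_depth` is 0 for the top-level file. */
struct entry {
	const char *name;
	const char *value;
	unsigned int include_depth;
};

/* Return the last entry named `name`, or NULL if there is none. */
static const struct entry *entries_get(const struct entry *entries, size_t n, const char *name)
{
	const struct entry *found = NULL;
	size_t i;

	assert(entries != NULL && name != NULL);
	for (i = 0; i < n; i++)
		if (strcmp(entries[i].name, name) == 0)
			found = &entries[i];
	return found;
}

/*
 * Return the entry named `name` only if it occurs exactly once and was not
 * included from another file; otherwise return NULL.
 */
static const struct entry *entries_get_unique(const struct entry *entries, size_t n, const char *name)
{
	const struct entry *found = NULL;
	size_t i, count = 0;

	assert(entries != NULL && name != NULL);
	for (i = 0; i < n; i++) {
		if (strcmp(entries[i].name, name) != 0)
			continue;
		found = &entries[i];
		count++;
	}
	if (count != 1 || found->include_depth > 0)
		return NULL;
	return found;
}
```

Reads would go through `entries_get`, so a multi-value key yields its last value, while writes and deletes of a single entry would go through `entries_get_unique` and be refused when the target is ambiguous or comes from an included file.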

fb8a87da

2018-08-10T14:50:15

config_entries: rename functions and structure The previous commit simply moved all code that is required to handle config entries to a new module without yet adjusting any of the function and structure names to help readability. We now rename things accordingly to have a common "git_config_entries" prefix instead of the old "diskfile_entries" one.

04f57d51

2018-08-10T13:33:02

config_entries: pull out implementation of entry store The configuration entry store that is used for configuration files needs to keep track of all entries in two different structures:
- a singly linked list is being used to be able to iterate through configuration files in the order they have been found
- a string map is being used to efficiently look up configuration entries by their key
This store is thus something that may be used by other, future backends as well to abstract away implementation details and iteration over the entries. Pull out the necessary functions from "config_file.c" and move them into their own "config_entries.c" module. For now, this is simply moving over code without any renames and/or refactorings to help reviewing.

d75bbea1

2018-08-10T14:35:23

config_file: remove unnecessary snapshot indirection The implementation for config file snapshots has an unnecessary redirection from `config_snapshot` to `git_config_file__snapshot`. Inline the call to `git_config_file__snapshot` and remove it.

b944e137

2018-08-10T13:03:33

config: rename "config_file.h" to "config_backend.h" The header "config_file.h" has a list of inline functions to access the contents of a config backend without directly messing with the struct's function pointers. While all these functions are called "git_config_file_*", they are in fact completely backend-agnostic and don't care whether it is a file or not. Rename all the functions to instead be backend-agnostic versions called "git_config_backend_*" and rename the header to match.

1aeff5d7

2018-08-10T12:52:18

config: move function normalizing section names into "config.c" The function `git_config_file_normalize_section` is never being used in any file different than "config.c", but it is implemented in "config_file.c". Move it over and make the symbol static.

f2694635

2018-09-06T14:17:54

config_file: fix quadratic behaviour when adding config multivars In case where we add multiple configuration entries with the same key to a diskfile backend, we always need to iterate the list of this key to find the last entry due to the list being a singly-linked list. This is obviously quadratic behaviour, and this has sure enough been found by oss-fuzz by generating a configuration file with 50k lines, where most of them have the same key. While the issue will not arise with "sane" configuration files, an adversary may trigger it by providing a crafted ".gitmodules" file, which is delivered as part of the repo and also parsed by the configuration parser. The fix is trivial: store a pointer to the last entry of the list in its head. As there are only two locations now where we append to this data structure, maintaining this pointer is trivial, too. We can also optimize retrieval of a single value via `config_get`, where we previously had to chase the `next` pointer to find the last entry that was added. Using our configuration file fuzzer with a corpus that has a single file with 50000 "-=" lines previously took around 21s. With this optimization the same file scans in about 0.053s, which is a nearly 400-fold improvement. But in most cases with a "normal" amount of same-named keys it's not going to matter anyway.
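The effect of caching the last entry can be sketched as follows. The commit stores the pointer in the list's head node; this sketch keeps it in a separate, hypothetical `entry_list` wrapper to make the example short, and stores plain integers rather than config entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node and list types; not libgit2's actual structures. */
struct entry_node {
	int value;
	struct entry_node *next;
};

struct entry_list {
	struct entry_node *head;
	struct entry_node *tail; /* last node, so appends need no traversal */
	size_t len;
};

/* Append in O(1) by following the cached tail instead of walking `next`. */
static int entry_list_append(struct entry_list *list, int value)
{
	struct entry_node *node = calloc(1, sizeof(*node));

	if (node == NULL)
		return -1;
	node->value = value;

	if (list->tail != NULL)
		list->tail->next = node;
	else
		list->head = node;
	list->tail = node;
	list->len++;
	return 0;
}

/* The most recently added value is available without chasing pointers. */
static int entry_list_last(const struct entry_list *list)
{
	assert(list->tail != NULL);
	return list->tail->value;
}

static void entry_list_free(struct entry_list *list)
{
	struct entry_node *node = list->head, *next;

	while (node != NULL) {
		next = node->next;
		free(node);
		node = next;
	}
	list->head = list->tail = NULL;
	list->len = 0;
}
```

Appending n entries with the same key then costs O(n) in total instead of O(n²), and fetching the latest value no longer depends on the list length.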

ec76a1aa

2018-08-05T14:37:08

Add a comment

019409be

2018-08-05T14:25:22

Don't error on missing section, just continue

c4d7fa95

2018-07-22T23:31:19

config_file: Don't crash on options without a section

e1e90dcc

2018-01-09T14:52:34

config_file: avoid free'ing OOM buffers Buffers which ran out of memory will never have any memory attached to them. As such, it is not necessary to call `git_buf_free` if the buffer is out of memory.

e51e29e8

2017-11-12T13:59:47

config_parse: have `git_config_parse` own entry value and name The function `git_config_parse` uses several callbacks to pass data along to the caller as it parses the file. One design shortcoming here is that strings passed to those callbacks are expected to be freed by them, which is really confusing. Fix the issue by changing memory ownership here. Instead of expecting the `on_variable` callbacks to free memory for `git_config_parse`, just do it inside of `git_config_parse`. While this obviously requires a bit more memory allocation churn due to having to copy both name and value at some places, this shouldn't be too much of a burden.
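The ownership rule described here can be illustrated with a toy parser. Everything in this sketch is hypothetical (the callback type, `parse_line`, and the `name=value` input format); it only shows the parser freeing the strings it lends to the callback, and the callback copying what it wants to keep.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type: `name` and `value` are only borrowed. */
typedef int (*on_variable_cb)(const char *name, const char *value, void *payload);

/* Duplicate a string with malloc; returns NULL on allocation failure. */
static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy != NULL)
		memcpy(copy, s, len);
	return copy;
}

/*
 * Parse a single "name=value" line. The parser allocates the two strings,
 * lends them to the callback, and frees them itself afterwards.
 */
static int parse_line(const char *line, on_variable_cb cb, void *payload)
{
	const char *eq;
	char *name, *value;
	int error;

	assert(line != NULL && cb != NULL);
	eq = strchr(line, '=');
	if (eq == NULL)
		return -1;

	name = malloc((size_t)(eq - line) + 1);
	value = dup_string(eq + 1);
	if (name == NULL || value == NULL) {
		free(name);
		free(value);
		return -1;
	}
	memcpy(name, line, (size_t)(eq - line));
	name[eq - line] = '\0';

	error = cb(name, value, payload);

	free(name); /* owned by the parser, not by the callback */
	free(value);
	return error;
}

/* Example callback: keeps its own copy of the value because it outlives the call. */
static int keep_value(const char *name, const char *value, void *payload)
{
	char **slot = payload;

	(void)name;
	free(*slot);
	*slot = dup_string(value);
	return *slot != NULL ? 0 : -1;
}
```

The extra copy in `keep_value` is the allocation churn the message mentions; in exchange, no callback can leak or double-free the parser's strings.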

ecf4f33a

2018-02-08T11:14:48

Convert usage of `git_buf_free` to new `git_buf_dispose`

6a15f657

2018-02-09T13:02:26

config_file: iterate over keys in the order they were added Currently, all configuration entries were only held in a string map, making iteration order mostly based on the hash of each entry's key. Now that we have extended the `diskfile_entries` structure by a list of config entries, we can effectively iterate through entries in the order they were added, though.

3a82475f

2018-02-09T12:49:45

config_file: add list holding config entries in order of appearance Right now, we only keep all configuration entries in a string map. This is required to efficiently access configuration entries by keys. It has the disadvantage of not being able to iterate through configuration entries in the order they were read, though. Instead, we only iterate through entries in a seemingly random order. Extend `diskfile_entries` by another list holding configuration entries. Like this, we maintain all entries in two data structures and can use the required one based on the current use case.

8c0b0717

2018-02-09T12:32:24

config_file: pass complete entry structure into `append_entry` Currently, we only pass the entry map into `append_entry` to append new configuration entries to it. Instead, though, we can just pass the complete `diskfile_entries` structure into it. This allows us to easily extend `diskfile_entries` by another singly linked list of configuration entries.

eafb8402

2018-02-09T12:29:16

config_file: rename `refcounted_strmap` to `diskfile_entries` The config file parsing code all revolves around the `refcounted_strmap` structure, which is a map of entry names to their respective keys. This naming scheme made grasping the code quite hard, warranting a rename. A good alternative is `diskfile_entries`, making clear that this really only holds all configuration entries. Furthermore, we are about to introduce a new linked list of configuration entries into the configuration file code. This list will be used to iterate over configuration entries in the order they are listed inside of the parsed configuration file. After renaming `refcounted_strmap` to `diskfile_entries`, this struct also becomes the natural target where to add that new list. Like this, data structures holding all entries are neatly contained inside of it.

e3c8462c

2018-02-09T11:50:28

config_file: rename parse_data struct The struct `parse_data` sounds as if it was defined and passed to us from the configuration parser, which is not true. Instead, `parse_data` is specific to the diskfile backend parsing logic. Rename it to `diskfile_parse_state` to make that clearer. This also follows existing naming patterns with the "diskfile" prefix.

18117a6c

2018-02-09T11:43:13

config_file: use new line to declare new variable

b6f88706

2018-02-09T11:39:26

config_file: refactor freeing of config entry lists The interface for freeing config list entries is more tangled than required. Instead of calling `cvar_free` for every list entry in `free_vars`, we now just provide a function `config_entry_list_free`. This function will iterate through all list entries and free the associated configuration entry as well as the list entry itself.

2d1f6676

2018-02-09T11:35:54

config_file: rename cvar_t struct to config_entry_list The `cvar_t` structure is really awkward to grasp, because its name actively hinders discovery of what it actually is. As it is nothing more than a singly-linked list of configuration entries, rename it to just that: `config_entry_list`.

26cf48fc

2018-02-09T11:35:16

config_file: move include depth into config entry In order to reject writes to included configuration entries, we need to keep track of whether an entry was included via another configuration file or not. This information is being stored in the `cvar` structure, which is a rather weird location, as it is only used to create a list structure of config entries. Move the include depth into the structure `git_config_entry` instead. While this fixes the layering issue, it enables users of libgit2 to access the depth, too.

fcb0d841

2018-02-09T11:19:47

config_file: move cvar handling into `append_entry` The code appending new configuration entries to our current list first allocates the `cvar` structure and then passes it to `append_entry`. As we want to extend `append_entry` to store configuration entries in a map as well as in a list for ordered iteration, we will have to create two `cvar` structures, though. As such, the future change will become much easier when allocation of the `cvar` structure is done in `append_entry` itself.

dfcd923c

2018-02-09T11:15:32

config_file: remove unused list iteration macros We currently provide a lot of macros for the `cvar_t` structure which are never being used. In fact, the only macro we need is to access the `next` pointer of `cvar_t`, which really does not require a macro at all. Remove all these macros and replace usage of `CVAR_LIST_NEXT(cvar)` with `cvar->next`.

9cd0c6f1

2018-02-28T16:01:16

config: return an error if config_refresh is called on a snapshot Instead of treating it as a no-op, treat it as a programming error and return the same kind of error as if you called to set or delete variables on a snapshot.

2424e64c

2018-02-28T12:06:02

config: harden our use of the backend objects a bit When we create an iterator we don't actually know that we have a live config object and we must instead only rely on the header. We fixed it to use this in a previous commit, but this makes it harder to misuse by converting to use the header object in the typecast. We also guard inside the `config_refresh` function against being given a snapshot (although callers right now do check).

1785de4e

2018-02-28T11:46:17

config: move the level field into the header We use it in a few places where we might have a full object or a snapshot so move it to where we can actually access it.

c1524b2e

2018-02-28T11:33:11

config: move the repository to the diskfile header We pass this around and when creating a new iterator we need to read the repository pointer. Put it in a common place so we can reach it regardless of whether we got a full object or a snapshot.

9e66590b

2017-07-21T13:01:43

config_parse: use common parser interface As the config parser is now cleanly separated from the config file code, we can easily refactor the code and make use of the common parser module. This removes quite a lot of duplicated functionality previously used for handling the actual parser state and replaces it with the generic interface provided by the parser context.

1953c68b

2017-11-11T17:12:31

config_file: split out module to parse config files The configuration file code grew quite big and intermingles both actual configuration logic as well as the parsing logic of the configuration syntax. This makes it hard to refactor the parsing logic on its own and convert it to make use of our new parsing context module. Refactor the code and split it up into two parts. The config file code will only handle actual handling of configuration files, includes and writing new files. The newly created config parser module is then only responsible for parsing the actual contents of a configuration file, leaving everything else to callbacks provided to its provided function `git_config_parse`.

42627933

2017-11-04T18:03:26

Merge remote-tracking branch 'upstream/master' into pks/conditional-includes

1475b981

2017-11-04T18:00:56

config: keep the output parameter at the start of the function

94e30d9b

2017-10-30T15:55:18

config: check for OOM when writing

8ec806d7

2017-10-30T06:23:31

config: preserve the original case when writing out new sections and vars For sections we will still use the existing one even if the case disagrees, but the variable always gets written with the case given by the caller.

f7d837c8

2017-05-24T12:12:29

config_file: implement "gitdir/i" conditional Next to the "gitdir" conditional for including other configuration files, there's also a "gitdir/i" conditional. In contrast to the former one, path matching with "gitdir/i" is done case-insensitively. This commit implements the case-insensitive condition.
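
The difference between the two conditionals can be illustrated with a deliberately simplified comparison. The sketch below only compares a literal path prefix; glob-style pattern matching is omitted, and the function name and signature are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if `gitdir` starts with `prefix`, 0 otherwise.
 * With ignore_case != 0 the comparison folds ASCII case, which mirrors the
 * idea behind "gitdir/i"; with ignore_case == 0 it behaves like "gitdir".
 * Simplification: this is a literal prefix test, not a glob match. */
int example_gitdir_prefix_matches(const char *gitdir, const char *prefix, int ignore_case)
{
	size_t i;

	assert(gitdir != NULL && prefix != NULL);

	for (i = 0; prefix[i] != '\0'; i++) {
		unsigned char a = (unsigned char)gitdir[i];
		unsigned char b = (unsigned char)prefix[i];

		if (a == '\0')
			return 0; /* gitdir is shorter than the prefix */

		if (ignore_case) {
			a = (unsigned char)tolower(a);
			b = (unsigned char)tolower(b);
		}

		if (a != b)
			return 0;
	}
	return 1;
}
```

With a git directory of `/home/User/repo/.git` and a prefix of `/home/user/`, the case-sensitive call rejects the match while the case-insensitive call accepts it.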

071b6c06

2017-05-24T11:13:36

config_file: implement conditional "gitdir" includes Upstream git.git has implemented the ability to include other configuration files based on conditions. Right now, this only includes the ability to include a file based on the gitdir-location of the repository the currently parsed configuration file belongs to. This commit implements handling these conditional includes for the case-sensitive "gitdir" condition.

9d7a75be

2017-08-25T19:15:00

config_file: make repo and config path accessible to reader The reader machinery will be extended to handle conditional includes. The only conditions that currently exist all match against the git directory of the repository the config file belongs to. As such, we need to have access to the repository when reading configuration files to properly handle these conditions. One specialty of these conditional includes is that the actual pattern may also be a relative pattern starting with "./". In this case, we have to match the pattern against the path relative to the config file which is currently being parsed. So besides the repository, we also have to pass down the path to the current config file that is being parsed.

d5b9d9e9

2017-05-23T10:53:49

config_file: extract function to parse include path The logic inside this function will be required later on, when implementing conditional includes. Extract it into its own function to ease the implementation.

529e873c

2017-05-23T11:51:00

config: pass repository when opening config files Our current configuration logic is completely oblivious of any repository, but only cares for actual file paths. Unfortunately, we are forced to break this assumption by the introduction of conditional includes, which are evaluated in the context of a repository. Right now, only one conditional exists with "gitdir:" -- it will only include the configuration if the current repository's git directory matches the value passed to "gitdir:". To support these conditionals, we have to break our API and make the repository available when opening a configuration file. This commit extends the `open` call of configuration backends to include another repository and adjusts existing code to have it available. This includes the user-visible functions `git_config_add_file_ondisk` and `git_config_add_backend`.

1560b580

2017-08-15T10:35:47

Merge pull request #4288 from pks-t/pks/include-fixups Include fixups

1b329089

2017-05-31T22:27:19

config_file: refuse modifying included variables Modifying variables pulled in by an included file currently succeeds, but it doesn't actually do what one would expect, as refreshing the configuration will cause the values to reappear. As we are currently not really able to support this use case, we will instead just return an error for deleting and setting variables which were included via an include.
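
A minimal sketch of the guard described here, assuming each entry records an include depth as introduced by the earlier "move include depth into config entry" commit. The struct, function name and error value are illustrative and not the libgit2 implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced entry: only the fields needed for the example. */
typedef struct example_entry {
	const char *name;
	const char *value;
	unsigned int include_depth; /* 0 = defined in the top-level file */
} example_entry;

/* Set a new value, refusing entries that came from an included file.
 * Returns 0 on success and -1 if the entry was included. */
int example_entry_set(example_entry *entry, const char *value)
{
	assert(entry != NULL && value != NULL);

	if (entry->include_depth > 0)
		return -1; /* modifying included variables is rejected */

	entry->value = value;
	return 0;
}
```

Returning an error is preferable to silently succeeding, because the included value would otherwise reappear on the next refresh.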

28c2cc3d

2017-05-31T16:41:44

config_file: move reader into `config_read` only Right now, we have multiple call sites which initialize a `reader` structure. As the structure is only actually used inside of `config_read`, we can instead just move the reader inside of the `config_read` function. Instead, we can just pass in the configuration file into `config_read`, which eases code readability.

83bcd3a1

2017-05-31T22:45:25

config_file: refresh all files if includes were modified Currently, we only re-parse the top-level configuration file when it has changed itself. This can cause problems when an include is changed, as we were not updating all values correctly. Instead of conditionally reparsing only refreshed files, the logic becomes much clearer and easier to follow if we always re-parse the top-level configuration file when either the file itself or one of its included configuration files has changed on disk. This commit implements this logic. Note that this might impact performance in some cases, as we need to re-read all configuration files whenever any of the included files changed. It could increase performance to just re-parse include files which have actually changed, but this would compromise maintainability of the code without much gain. The only case where we will gain anything is when we actually use includes and when only these includes are updated, which will probably be quite an unusual scenario to actually be worthwhile to optimize.
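
The refresh decision can be sketched as a recursive check over the top-level file and its includes. Everything below is an assumption for illustration: the struct layout, the integer checksums, and the `disk_checksum` field that stands in for re-reading the file from disk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one configuration file and its includes. */
typedef struct example_config_file {
	const char *path;
	unsigned long parsed_checksum; /* checksum recorded at the last parse */
	unsigned long disk_checksum;   /* stand-in for the current on-disk state */
	struct example_config_file *includes;
	size_t includes_len;
} example_config_file;

/* Returns 1 if the file itself or any (transitively) included file has
 * changed since it was parsed, 0 otherwise. */
int example_config_file_is_modified(const example_config_file *file)
{
	size_t i;

	assert(file != NULL);

	if (file->parsed_checksum != file->disk_checksum)
		return 1;

	for (i = 0; i < file->includes_len; i++) {
		if (example_config_file_is_modified(&file->includes[i]))
			return 1;
	}
	return 0;
}
```

When this check reports a change, the caller re-parses from the top-level file rather than only the file that changed, which is the trade-off the commit message describes.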

56a7a264

2017-05-31T14:50:40

config_file: remove unused backend field from parse data The backend passed to `config_read` is never actually used anymore, so we can remove it from the function and the `parse_data` structure.

3a7f7a6e

2017-05-31T14:43:46

config_file: pass reader directly to callbacks Previously, the callbacks passed to `config_parse` got the reader via a pointer to a pointer. This allowed the callbacks to update the caller's `reader` variable when the array holding it has been reallocated. As the array is no longer present, we can simplify the code by making the reader a simple pointer.

73df75d8

2017-05-31T14:34:48

config_file: refactor include handling Current code for configuration files uses the `reader` structure to parse configuration files and store additional metadata like the file's path and checksum. These structures are stored within an array in the backend itself, which causes multiple problems. First, it does not make sense to keep around the file's contents with the backend itself. While this data is usually free'd before being added to the backend, this brings along somewhat intricate lifecycle problems. A better solution would be to store only the file paths as well as the checksum of the currently parsed content. The second problem is that the `reader` structures are stored inside an array. When re-parsing configuration files due to changed contents, we may cause this array to be reallocated, requiring us to update pointers held by callers. Furthermore, we do not keep track of includes which are already associated to a reader inside of this array. This causes us to add readers multiple times to the backend, e.g. in the scenario of refreshing configurations. This commit fixes these shortcomings. We introduce a split between the parsing data and the configuration file's metadata. The `reader` will now only hold the file's contents and the parser state and the new `config_file` structure holds the file's path and checksum. Furthermore, the new structure is a recursive structure in that it will also hold references to the files it directly includes. The diskfile is changed to only store the top-level configuration file. These changes allow us further refactorings and greatly simplify understanding the code.

0c7f49dd

2017-06-30T13:39:01

Make sure to always include "common.h" first Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.

a693b873

2017-06-07T10:20:44

buffer: use `git_buf_init` with length The `git_buf_init` function has an optional length parameter, which will cause the buffer to be initialized and allocated in one step. This can be used instead of static initialization with `GIT_BUF_INIT` followed by a `git_buf_grow`. This patch does so for two functions where it is applicable.

a1023a43

2017-05-20T17:18:07

Merge pull request #4179 from libgit2/ethomson/expand_tilde Introduce home directory expansion function for config files, attribute files

4467aeac

2017-03-28T09:00:48

config_file: handle errors other than OOM while parsing section headers The current code in `parse_section_header_ext` is only prepared to properly handle out-of-memory conditions for the `git_buf` structure. While very unlikely and probably caused by a programming error, it is also possible to run into error conditions other than out-of-memory previous to reaching the actual parsing loop. In these cases, we will run into undefined behavior as the `rpos` variable is only initialized after these triggerable errors, but we use it in the cleanup-routine. Fix the issue by unifying the function's cleanup code with an `end_error` section, which will not use the `rpos` variable.
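
The unified-cleanup pattern can be sketched as below. The function is a hypothetical, much reduced parser, not `parse_section_header_ext` itself; the point is that the `end_error` label only touches variables that are initialized before any early exit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy the text between the first pair of double quotes in `line` into a
 * newly allocated string stored in `*out`. Returns 0 on success, -1 on error. */
int example_parse_quoted(char **out, const char *line)
{
	char *buf = NULL; /* initialized up front, so cleanup is always safe */
	const char *start, *end;
	size_t len;

	assert(out != NULL && line != NULL);
	*out = NULL;

	/* Allocation can fail before the parsing positions are ever set. */
	buf = malloc(strlen(line) + 1);
	if (buf == NULL)
		goto end_error;

	start = strchr(line, '"');
	if (start == NULL)
		goto end_error;
	start++;

	end = strchr(start, '"');
	if (end == NULL)
		goto end_error;

	len = (size_t)(end - start);
	memcpy(buf, start, len);
	buf[len] = '\0';

	*out = buf;
	return 0;

end_error:
	/* Only `buf` is used here; `start`, `end` and `len` may be unset. */
	free(buf);
	return -1;
}
```

Because every error path jumps to the same label, adding a new early failure cannot accidentally read a position variable that has not been assigned yet.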

29aef948

2017-03-23T11:59:06

config, attrcache: don't fallback to dirs literally named `~` The config and attrcache file reading code would attempt to load a file in a home directory by expanding the `~` and looking for the file, using `git_sysdir_find_global_file`. If the file was not found, the error handling would look for the literal path, eg `~/filename.txt`. Use the new `git_config_expand_global_file` instead, which allows us to get the path to the file separately, when the path is prefixed with `~/`, and fail with a not found error without falling back to looking for the literal path.

301dc26a

2016-06-20T13:15:35

fix error when including a missing config file relative to the home directory

2cf48e13

2017-03-20T09:34:41

config_file: check if section header buffer runs out of memory While parsing section headers, we use a buffer to store the actual section name. We do not check though if the buffer runs out of memory at any stage. Do so.

thodg/libgit2/src/config_file.c