kmx git

Commit	Date	Message
5d92e547	2019-06-08T17:28:35	oid: `is_zero` instead of `iszero` The only function that is named `issomething` (without underscore) was `git_oid_iszero`. Rename it to `git_oid_is_zero` for consistency with the rest of the library.
f673e232	2018-12-27T13:47:34	git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.
168fe39b	2018-11-28T14:26:57	object_type: use new enumeration names Use the new object_type enumeration names within the codebase.
ecf4f33a	2018-02-08T11:14:48	Convert usage of `git_buf_free` to new `git_buf_dispose`
d8896bda	2018-01-03T16:07:36	diff_generate: avoid excessive stats of .gitattribute files When generating a diff between two trees, for each file that is to be diffed we have to determine whether it shall be treated as text or as binary files. While git has heuristics to determine which kind of diff to generate, users can also that default behaviour by setting or unsetting the 'diff' attribute for specific files. Because of that, we have to query gitattributes in order to determine how to diff the current files. Instead of hitting the '.gitattributes' file every time we need to query an attribute, which can get expensive especially on networked file systems, we try to cache them instead. This works perfectly fine for every '.gitattributes' file that is found, but we hit cache invalidation problems when we determine that an attribuse file is _not_ existing. We do create an entry in the cache for missing '.gitattributes' files, but as soon as we hit that file again we invalidate it and stat it again to see if it has now appeared. In the case of diffing large trees with each other, this behaviour is very suboptimal. For each pair of files that is to be diffed, we will repeatedly query every directory component leading towards their respective location for an attributes file. This leads to thousands or even hundreds of thousands of wasted syscalls. The attributes cache already has a mechanism to help in that scenario in form of the `git_attr_session`. As long as the same attributes session is still active, we will not try to re-query the gitmodules files at all but simply retain our currently cached results. To fix our problem, we can create a session at the top-most level, which is the initialization of the `git_diff` structure, and use it in order to look up the correct diff driver. As the `git_diff` structure is used to generate patches for multiple files at once, this neatly solves our problem by retaining the session until patches for all files have been generated. The fix has been tested with linux.git by calling `git_diff_tree_to_tree` and `git_diff_to_buf` with v4.10^{tree} and v4.14^{tree}. \| time \| .gitattributes stats without fix \| 33.201s \| 844614 with fix \| 30.327s \| 4441 While execution only improved by roughly 10%, the stat(3) syscalls for .gitattributes files decreased by 99.5%. The benchmarks were quite simple with best-of-three timings on Linux ext4 systems. One can assume that for network based file systems the performance gain will be a lot larger due to a much higher latency.
2388a9e2	2017-12-15T10:47:01	diff_file: properly refcount blobs when initializing file contents When initializing a `git_diff_file_content` from a source whose data is derived from a blob, we simply assign the blob's pointer to the resulting struct without incrementing its refcount. Thus, the structure can only be used as long as the blob is kept alive by the caller. Fix the issue by using `git_blob_dup` instead of a direct assignment. This function will increment the refcount of the blob without allocating new memory, so it does exactly what we want. As `git_diff_file_content__unload` already frees the blob when `GIT_DIFF_FLAG__FREE_BLOB` is set, we don't need to add new code handling the free but only have to set that flag correctly.
0c7f49dd	2017-06-30T13:39:01	Make sure to always include "common.h" first Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.
909d5494	2016-12-29T12:25:15	giterr_set: consistent error messages Error messages should be sentence fragments, and therefore: 1. Should not begin with a capital letter, 2. Should not conclude with punctuation, and 3. Should not end a sentence and begin a new one
9be638ec	2016-04-19T15:12:18	git_diff_generated: abstract generated diffs
d68cb736	2015-09-22T18:25:03	diff: include oid length in deltas Now that `git_diff_delta` data can be produced by reading patch file data, which may have an abbreviated oid, allow consumers to know that the id is abbreviated.
6b0fc6ab	2015-11-03T09:43:18	diff: on win32, treat fake "symlinks" specially On platforms that lack `core.symlinks`, we should not go looking for symbolic links and `p_readlink` their target. Instead, we should examine the file's contents.
c2418f46	2015-06-25T12:48:44	Rename FALLBACK to UNSPECIFIED Fallback describes the mechanism, while unspecified explains what the user is thinking.
c6f489c9	2015-05-04T17:29:12	submodule: add an ignore option to status This lets us specify in the status call which ignore rules we want to use (optionally falling back to whatever the submodule has in its configuration). This removes one of the reasons for having `_set_ignore()` set the value in-memory. We re-use the `IGNORE_RESET` value for this as it is no longer relevant but has a similar purpose to `IGNORE_FALLBACK`. Similarly, we remove `IGNORE_DEFAULT` which does not have use outside of initializers and move that to fall back to the configuration as well.
64bbd47a	2015-05-04T17:09:21	submodule: don't let status change an existing instance As submodules are becomes more like values, we should not let a status check to update its properties. Instead of taking a submodule, have status take a repo and submodule name.
8147b1af	2015-05-25T20:03:59	diff: introduce binary diff callbacks Introduce a new binary diff callback to provide the actual binary delta contents to callers. Create this data from the diff contents (instead of directly from the ODB) to support binary diffs including the workdir, not just things coming out of the ODB.
795eaccd	2015-02-19T11:09:54	git_filter_opt_t -> git_filter_flag_t For consistency with the rest of the library, where an opt is an options structure.
61bef72d	2014-05-20T23:57:40	Start adding GIT_DELTA_UNREADABLE and GIT_STATUS_WT_UNREADABLE.
5269008c	2014-05-06T16:01:49	Add filter options and ALLOW_UNSAFE Diff and status do not want core.safecrlf to actually raise an error regardless of the setting, so this extends the filter API with an additional options flags parameter and adds a flag so that filters can be applied with GIT_FILTER_OPT_ALLOW_UNSAFE, indicating that unsafe filter application should be downgraded from a failure to a warning.
591e8295	2014-03-25T16:52:01	Fix submodule leaks and invalid references This cleans up some places I missed that could hold onto submodule references and cleans up the way in which the repository cache is both reloaded and released so that existing submodule references aren't destroyed inappropriately.
a15c7802	2014-03-25T09:14:48	Make submodules externally refcounted `git_submodule` objects were already refcounted internally in case the submodule name was different from the path at which it was stored. This makes that refcounting externally used as well, so `git_submodule_lookup` and `git_submodule_add_setup` return an object that requires a `git_submodule_free` when done.
6789b7a7	2014-02-27T14:13:22	Add buffer to buffer diff and patch APIs This adds `git_diff_buffers` and `git_patch_from_buffers`. This also includes a bunch of internal refactoring to increase the shared code between these functions and the blob-to-blob and blob-to-buffer APIs, as well as some higher level assert helpers in the tests to also remove redundancy.
9950bb4e	2014-01-24T20:23:17	diff: rename the file's 'oid' to 'id' In the same vein as the previous commits in this series.
10672e3e	2013-10-15T15:10:07	Diff API cleanup This lays groundwork for separating formatting options from diff creation options. This groups the formatting flags separately from the diff list creation flags and reorders the options. This also tweaks some APIs to further separate code that uses patches from code that just looks at git_diffs.
3ff1d123	2013-10-11T14:51:54	Rename diff objects and split patch.h This makes no functional change to diff but renames a couple of the objects and splits the new git_patch (formerly git_diff_patch) into a new header file.
a9f51e43	2013-09-11T22:00:36	Merge git_buf and git_buffer This makes the git_buf struct that was used internally into an externally available structure and eliminates the git_buffer. As part of that, some of the special cases that arose with the externally used git_buffer were blended into the git_buf, such as being careful about git_buf objects that may have a NULL ptr and allowing for bufs with a valid ptr and size but zero asize as a way of referring to externally owned data.
4b11f25a	2013-09-11T16:38:33	Add ident filter This adds the ident filter (that knows how to replace $Id$) and tweaks the filter APIs and code so that git_filter_source objects actually have the updated OID of the object being filtered when it is a known value.
2a7d224f	2013-09-10T16:33:32	Extend public filter api with filter lists This moves the git_filter_list into the public API so that users can create, apply, and dispose of filter lists. This allows more granular application of filters to user data outside of libgit2 internals. This also converts all the internal usage of filters to the public APIs along with a few small tweaks to make it easier to use the public git_buffer stuff alongside the internal git_buf.
85d54812	2013-08-28T16:44:04	Create public filter object and use it This creates include/sys/filter.h with a basic definition of a git_filter and then converts the internal code to use it. There are related internal objects (git_filter_list) that we will want to publish at some point, but this is a first step.
effdbeb3	2013-07-24T17:48:37	Make rename detection file size fix better The previous fix for checking file sizes with rename detection always loads the blob. In this version, if the odb backend can get the object header without loading the whole thing into memory, then we'll just use that, so that we can eliminate possible rename sources & targets without loading them.
39a1a662	2013-07-24T13:09:07	Don't unload diff data unless loaded
74ded024	2013-06-17T17:03:34	Add "as_path" parameters to blob and buffer diffs This adds parameters to the four functions that allow for blob-to- blob and blob-to-buffer differencing (either via callbacks or by making a git_diff_patch object). These parameters let you say that filename we should pretend the blob has while doing the diff. If you pass NULL, there should be no change from the existing behavior, which is to skip using attributes for file type checks and just look at content. With the parameters, you can plug into the new diff driver functionality and get binary or non-binary behavior, plus function context regular expressions, etc. This commit also fixes things so that the git_diff_delta that is generated by these functions will actually be populated with the data that we know about the blobs (or buffers) so you can use it appropriately. It also fixes a bug in generating patches from the git_diff_patch objects created via these functions. Lastly, there is one other behavior change that may matter. If there is no difference between the two blobs, these functions no longer generate any diff callbacks / patches unless you have passed in GIT_DIFF_INCLUDE_UNMODIFIED. This is pretty natural, but could potentially change the behavior of existing usage.
360f42f4	2013-06-12T14:18:09	Fix diff header naming issues This makes the git_diff_patch definition private to diff_patch.c and fixes a number of other header file naming inconsistencies to use `git_` prefixes on functions and structures that are shared between files.
5dc98298	2013-06-11T11:22:22	Implement regex pattern diff driver This implements the loading of regular expression pattern lists for diff drivers that search for function context in that way. This also changes the way that diff drivers update options and interface with xdiff APIs to make them a little more flexible.
114f5a6c	2013-06-10T10:10:39	Reorganize diff and add basic diff driver This is a significant reorganization of the diff code to break it into a set of more clearly distinct files and to document the new organization. Hopefully this will make the diff code easier to understand and to extend. This adds a new `git_diff_driver` object that looks of diff driver information from the attributes and the config so that things like function content in diff headers can be provided. The full driver spec is not implemented in the commit - this is focused on the reorganization of the code and putting the driver hooks in place. This also removes a few #includes from src/repository.h that were overbroad, but as a result required extra #includes in a variety of places since including src/repository.h no longer results in pulling in the whole world.

5d92e547

2019-06-08T17:28:35

oid: `is_zero` instead of `iszero` The only function that is named `issomething` (without underscore) was `git_oid_iszero`. Rename it to `git_oid_is_zero` for consistency with the rest of the library.

f673e232

2018-12-27T13:47:34

git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.

168fe39b

2018-11-28T14:26:57

object_type: use new enumeration names Use the new object_type enumeration names within the codebase.

ecf4f33a

2018-02-08T11:14:48

Convert usage of `git_buf_free` to new `git_buf_dispose`

d8896bda

2018-01-03T16:07:36

diff_generate: avoid excessive stats of .gitattribute files When generating a diff between two trees, for each file that is to be diffed we have to determine whether it shall be treated as text or as binary files. While git has heuristics to determine which kind of diff to generate, users can also that default behaviour by setting or unsetting the 'diff' attribute for specific files. Because of that, we have to query gitattributes in order to determine how to diff the current files. Instead of hitting the '.gitattributes' file every time we need to query an attribute, which can get expensive especially on networked file systems, we try to cache them instead. This works perfectly fine for every '.gitattributes' file that is found, but we hit cache invalidation problems when we determine that an attribuse file is _not_ existing. We do create an entry in the cache for missing '.gitattributes' files, but as soon as we hit that file again we invalidate it and stat it again to see if it has now appeared. In the case of diffing large trees with each other, this behaviour is very suboptimal. For each pair of files that is to be diffed, we will repeatedly query every directory component leading towards their respective location for an attributes file. This leads to thousands or even hundreds of thousands of wasted syscalls. The attributes cache already has a mechanism to help in that scenario in form of the `git_attr_session`. As long as the same attributes session is still active, we will not try to re-query the gitmodules files at all but simply retain our currently cached results. To fix our problem, we can create a session at the top-most level, which is the initialization of the `git_diff` structure, and use it in order to look up the correct diff driver. As the `git_diff` structure is used to generate patches for multiple files at once, this neatly solves our problem by retaining the session until patches for all files have been generated. The fix has been tested with linux.git by calling `git_diff_tree_to_tree` and `git_diff_to_buf` with v4.10^{tree} and v4.14^{tree}. | time | .gitattributes stats without fix | 33.201s | 844614 with fix | 30.327s | 4441 While execution only improved by roughly 10%, the stat(3) syscalls for .gitattributes files decreased by 99.5%. The benchmarks were quite simple with best-of-three timings on Linux ext4 systems. One can assume that for network based file systems the performance gain will be a lot larger due to a much higher latency.

2388a9e2

2017-12-15T10:47:01

diff_file: properly refcount blobs when initializing file contents When initializing a `git_diff_file_content` from a source whose data is derived from a blob, we simply assign the blob's pointer to the resulting struct without incrementing its refcount. Thus, the structure can only be used as long as the blob is kept alive by the caller. Fix the issue by using `git_blob_dup` instead of a direct assignment. This function will increment the refcount of the blob without allocating new memory, so it does exactly what we want. As `git_diff_file_content__unload` already frees the blob when `GIT_DIFF_FLAG__FREE_BLOB` is set, we don't need to add new code handling the free but only have to set that flag correctly.

0c7f49dd

2017-06-30T13:39:01

Make sure to always include "common.h" first Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.

909d5494

2016-12-29T12:25:15

giterr_set: consistent error messages Error messages should be sentence fragments, and therefore: 1. Should not begin with a capital letter, 2. Should not conclude with punctuation, and 3. Should not end a sentence and begin a new one

9be638ec

2016-04-19T15:12:18

git_diff_generated: abstract generated diffs

d68cb736

2015-09-22T18:25:03

diff: include oid length in deltas Now that `git_diff_delta` data can be produced by reading patch file data, which may have an abbreviated oid, allow consumers to know that the id is abbreviated.

6b0fc6ab

2015-11-03T09:43:18

diff: on win32, treat fake "symlinks" specially On platforms that lack `core.symlinks`, we should not go looking for symbolic links and `p_readlink` their target. Instead, we should examine the file's contents.

c2418f46

2015-06-25T12:48:44

Rename FALLBACK to UNSPECIFIED Fallback describes the mechanism, while unspecified explains what the user is thinking.

c6f489c9

2015-05-04T17:29:12

submodule: add an ignore option to status This lets us specify in the status call which ignore rules we want to use (optionally falling back to whatever the submodule has in its configuration). This removes one of the reasons for having `_set_ignore()` set the value in-memory. We re-use the `IGNORE_RESET` value for this as it is no longer relevant but has a similar purpose to `IGNORE_FALLBACK`. Similarly, we remove `IGNORE_DEFAULT` which does not have use outside of initializers and move that to fall back to the configuration as well.

64bbd47a

2015-05-04T17:09:21

submodule: don't let status change an existing instance As submodules are becomes more like values, we should not let a status check to update its properties. Instead of taking a submodule, have status take a repo and submodule name.

8147b1af

2015-05-25T20:03:59

diff: introduce binary diff callbacks Introduce a new binary diff callback to provide the actual binary delta contents to callers. Create this data from the diff contents (instead of directly from the ODB) to support binary diffs including the workdir, not just things coming out of the ODB.

795eaccd

2015-02-19T11:09:54

git_filter_opt_t -> git_filter_flag_t For consistency with the rest of the library, where an opt is an options *structure*.

61bef72d

2014-05-20T23:57:40

Start adding GIT_DELTA_UNREADABLE and GIT_STATUS_WT_UNREADABLE.

5269008c

2014-05-06T16:01:49

Add filter options and ALLOW_UNSAFE Diff and status do not want core.safecrlf to actually raise an error regardless of the setting, so this extends the filter API with an additional options flags parameter and adds a flag so that filters can be applied with GIT_FILTER_OPT_ALLOW_UNSAFE, indicating that unsafe filter application should be downgraded from a failure to a warning.

591e8295

2014-03-25T16:52:01

Fix submodule leaks and invalid references This cleans up some places I missed that could hold onto submodule references and cleans up the way in which the repository cache is both reloaded and released so that existing submodule references aren't destroyed inappropriately.

a15c7802

2014-03-25T09:14:48

Make submodules externally refcounted `git_submodule` objects were already refcounted internally in case the submodule name was different from the path at which it was stored. This makes that refcounting externally used as well, so `git_submodule_lookup` and `git_submodule_add_setup` return an object that requires a `git_submodule_free` when done.

6789b7a7

2014-02-27T14:13:22

Add buffer to buffer diff and patch APIs This adds `git_diff_buffers` and `git_patch_from_buffers`. This also includes a bunch of internal refactoring to increase the shared code between these functions and the blob-to-blob and blob-to-buffer APIs, as well as some higher level assert helpers in the tests to also remove redundancy.

9950bb4e

2014-01-24T20:23:17

diff: rename the file's 'oid' to 'id' In the same vein as the previous commits in this series.

10672e3e

2013-10-15T15:10:07

Diff API cleanup This lays groundwork for separating formatting options from diff creation options. This groups the formatting flags separately from the diff list creation flags and reorders the options. This also tweaks some APIs to further separate code that uses patches from code that just looks at git_diffs.

3ff1d123

2013-10-11T14:51:54

Rename diff objects and split patch.h This makes no functional change to diff but renames a couple of the objects and splits the new git_patch (formerly git_diff_patch) into a new header file.

a9f51e43

2013-09-11T22:00:36

Merge git_buf and git_buffer This makes the git_buf struct that was used internally into an externally available structure and eliminates the git_buffer. As part of that, some of the special cases that arose with the externally used git_buffer were blended into the git_buf, such as being careful about git_buf objects that may have a NULL ptr and allowing for bufs with a valid ptr and size but zero asize as a way of referring to externally owned data.

4b11f25a

2013-09-11T16:38:33

Add ident filter This adds the ident filter (that knows how to replace $Id$) and tweaks the filter APIs and code so that git_filter_source objects actually have the updated OID of the object being filtered when it is a known value.

2a7d224f

2013-09-10T16:33:32

Extend public filter api with filter lists This moves the git_filter_list into the public API so that users can create, apply, and dispose of filter lists. This allows more granular application of filters to user data outside of libgit2 internals. This also converts all the internal usage of filters to the public APIs along with a few small tweaks to make it easier to use the public git_buffer stuff alongside the internal git_buf.

85d54812

2013-08-28T16:44:04

Create public filter object and use it This creates include/sys/filter.h with a basic definition of a git_filter and then converts the internal code to use it. There are related internal objects (git_filter_list) that we will want to publish at some point, but this is a first step.

effdbeb3

2013-07-24T17:48:37

Make rename detection file size fix better The previous fix for checking file sizes with rename detection always loads the blob. In this version, if the odb backend can get the object header without loading the whole thing into memory, then we'll just use that, so that we can eliminate possible rename sources & targets without loading them.

39a1a662

2013-07-24T13:09:07

Don't unload diff data unless loaded

74ded024

2013-06-17T17:03:34

Add "as_path" parameters to blob and buffer diffs This adds parameters to the four functions that allow for blob-to- blob and blob-to-buffer differencing (either via callbacks or by making a git_diff_patch object). These parameters let you say that filename we should pretend the blob has while doing the diff. If you pass NULL, there should be no change from the existing behavior, which is to skip using attributes for file type checks and just look at content. With the parameters, you can plug into the new diff driver functionality and get binary or non-binary behavior, plus function context regular expressions, etc. This commit also fixes things so that the git_diff_delta that is generated by these functions will actually be populated with the data that we know about the blobs (or buffers) so you can use it appropriately. It also fixes a bug in generating patches from the git_diff_patch objects created via these functions. Lastly, there is one other behavior change that may matter. If there is no difference between the two blobs, these functions no longer generate any diff callbacks / patches unless you have passed in GIT_DIFF_INCLUDE_UNMODIFIED. This is pretty natural, but could potentially change the behavior of existing usage.

360f42f4

2013-06-12T14:18:09

Fix diff header naming issues This makes the git_diff_patch definition private to diff_patch.c and fixes a number of other header file naming inconsistencies to use `git_` prefixes on functions and structures that are shared between files.

5dc98298

2013-06-11T11:22:22

Implement regex pattern diff driver This implements the loading of regular expression pattern lists for diff drivers that search for function context in that way. This also changes the way that diff drivers update options and interface with xdiff APIs to make them a little more flexible.

114f5a6c

2013-06-10T10:10:39

Reorganize diff and add basic diff driver This is a significant reorganization of the diff code to break it into a set of more clearly distinct files and to document the new organization. Hopefully this will make the diff code easier to understand and to extend. This adds a new `git_diff_driver` object that looks of diff driver information from the attributes and the config so that things like function content in diff headers can be provided. The full driver spec is not implemented in the commit - this is focused on the reorganization of the code and putting the driver hooks in place. This also removes a few #includes from src/repository.h that were overbroad, but as a result required extra #includes in a variety of places since including src/repository.h no longer results in pulling in the whole world.

thodg/libgit2/src/diff_file.c

src/diff_file.c

Log