|
89bd4ddb
|
2019-01-21T11:32:53
|
|
diff_generate: validate oid file size
Index entries are 32 bit unsigned ints, not `size_t`s.
|
|
c6cac733
|
2019-01-20T22:40:38
|
|
blob: validate that blob sizes fit in a size_t
Our blob size is a `git_off_t`, which is a signed 64 bit int. This may
be erroneously negative or larger than `SIZE_MAX`. Ensure that the blob
size fits into a `size_t` before casting.
|
|
f673e232
|
2018-12-27T13:47:34
|
|
git_error: use new names in internal APIs and usage
Move to the `git_error` name in the internal API for error-related
functions.
|
|
168fe39b
|
2018-11-28T14:26:57
|
|
object_type: use new enumeration names
Use the new object_type enumeration names within the codebase.
|
|
18e71e6d
|
2018-11-28T13:31:06
|
|
index: use new enum and structure names
Use the new-style index names throughout our own codebase.
|
|
3be73011
|
2018-06-11T18:26:22
|
|
Merge pull request #4436 from pks-t/pks/packfile-stream-free
pack: rename `git_packfile_stream_free`
|
|
ecf4f33a
|
2018-02-08T11:14:48
|
|
Convert usage of `git_buf_free` to new `git_buf_dispose`
|
|
5a7d454b
|
2018-06-04T12:56:08
|
|
Fix stash save bug with fast path index check
If the index contains stat data for a modified file, and the file is
not racily dirty, and there exists an untracked working tree directory
alphabetically after that file, and there are no other changes to the
repo, then git_stash_save would fail. It would confuse the untracked
working tree directory for the modified file, because they have the
same sha: zero. The wt directory has a sha of zero because it's a
directory, and the file would have a zero sha because we wouldn't read
the file -- we would just know that it doesn't match the index. To
fix this confusion, we simply check mode as well as SHA.
|
|
d8896bda
|
2018-01-03T16:07:36
|
|
diff_generate: avoid excessive stats of .gitattribute files
When generating a diff between two trees, for each file that is to be
diffed we have to determine whether it shall be treated as text or as
binary files. While git has heuristics to determine which kind of diff
to generate, users can also that default behaviour by setting or
unsetting the 'diff' attribute for specific files.
Because of that, we have to query gitattributes in order to determine
how to diff the current files. Instead of hitting the '.gitattributes'
file every time we need to query an attribute, which can get expensive
especially on networked file systems, we try to cache them instead. This
works perfectly fine for every '.gitattributes' file that is found, but
we hit cache invalidation problems when we determine that an attribuse
file is _not_ existing. We do create an entry in the cache for missing
'.gitattributes' files, but as soon as we hit that file again we
invalidate it and stat it again to see if it has now appeared.
In the case of diffing large trees with each other, this behaviour is
very suboptimal. For each pair of files that is to be diffed, we will
repeatedly query every directory component leading towards their
respective location for an attributes file. This leads to thousands or
even hundreds of thousands of wasted syscalls.
The attributes cache already has a mechanism to help in that scenario in
form of the `git_attr_session`. As long as the same attributes session
is still active, we will not try to re-query the gitmodules files at all
but simply retain our currently cached results. To fix our problem, we
can create a session at the top-most level, which is the initialization
of the `git_diff` structure, and use it in order to look up the correct
diff driver. As the `git_diff` structure is used to generate patches for
multiple files at once, this neatly solves our problem by retaining the
session until patches for all files have been generated.
The fix has been tested with linux.git by calling
`git_diff_tree_to_tree` and `git_diff_to_buf` with v4.10^{tree} and
v4.14^{tree}.
| time | .gitattributes stats
without fix | 33.201s | 844614
with fix | 30.327s | 4441
While execution only improved by roughly 10%, the stat(3) syscalls for
.gitattributes files decreased by 99.5%. The benchmarks were quite
simple with best-of-three timings on Linux ext4 systems. One can assume
that for network based file systems the performance gain will be a lot
larger due to a much higher latency.
|
|
5ca3f115
|
2017-11-30T15:12:48
|
|
diff_generate: fix unsetting diff flags
The macro `DIFF_FLAG_SET` can be used to set or unset a flag by
modifying the diff's bitmask. While the case of setting the flag is
handled correctly, the case of unsetting the flag was not. Instead of
inverting the flags, we are inverting the value which is used to decide
whether we want to set or unset the bits.
The value being used here is a simple `bool` which is `false`. As that
is being uplifted to `int` when getting the bitwise-complement, we will
end up retaining all bits inside of the bitmask. As that's only ever
used to set `GIT_DIFF_IGNORE_CASE`, we were actually always ignoring
case for generated diffs.
Fix that by instead getting the bitwise-complement of `FLAG`, not `VAL`.
|
|
585b5dac
|
2017-11-18T15:43:11
|
|
refcount: make refcounting conform to aliasing rules
Strict aliasing rules dictate that for most data types, you are not
allowed to cast them to another data type and then access the casted
pointers. While this works just fine for most compilers, technically we
end up in undefined behaviour when we hurt that rule.
Our current refcounting code makes heavy use of casting and thus
violates that rule. While we didn't have any problems with that code,
Travis started spitting out a lot of warnings due to a change in their
toolchain. In the refcounting case, the code is also easy to fix:
as all refcounting-statements are actually macros, we can just access
the `rc` field directly instead of casting.
There are two outliers in our code where that doesn't work. Both the
`git_diff` and `git_patch` structures have specializations for generated
and parsed diffs/patches, which directly inherit from them. Because of
that, the refcounting code is only part of the base structure and not of
the children themselves. We can help that by instead passing their base
into `GIT_REFCOUNT_INC`, though.
|
|
0c7f49dd
|
2017-06-30T13:39:01
|
|
Make sure to always include "common.h" first
Next to including several files, our "common.h" header also declares
various macros which are then used throughout the project. As such, we
have to make sure to always include this file first in all
implementation files. Otherwise, we might encounter problems or even
silent behavioural differences due to macros or defines not being
defined as they should be. So in fact, our header and implementation
files should make sure to always include "common.h" first.
This commit does so by establishing a common include pattern. Header
files inside of "src" will now always include "common.h" as its first
other file, separated by a newline from all the other includes to make
it stand out as special. There are two cases for the implementation
files. If they do have a matching header file, they will always include
this one first, leading to "common.h" being transitively included as
first file. If they do not have a matching header file, they instead
include "common.h" as first file themselves.
This fixes the outlined problems and will become our standard practice
for header and source files inside of the "src/" from now on.
|
|
909d5494
|
2016-12-29T12:25:15
|
|
giterr_set: consistent error messages
Error messages should be sentence fragments, and therefore:
1. Should not begin with a capital letter,
2. Should not conclude with punctuation, and
3. Should not end a sentence and begin a new one
|
|
b859faa6
|
2016-08-23T23:38:39
|
|
Teach `git_patch_from_diff` about parsed diffs
Ensure that `git_patch_from_diff` can return the patch for parsed diffs,
not just generate a patch for a generated diff.
|
|
7166bb16
|
2016-04-25T00:35:48
|
|
introduce `git_diff_from_buffer` to parse diffs
Parse diff files into a `git_diff` structure.
|
|
9be638ec
|
2016-04-19T15:12:18
|
|
git_diff_generated: abstract generated diffs
|