src/patch_generate.c


Log

Author Commit Date CI Message
Edward Thomson 1458fb56 2022-01-29T07:18:26 xdiff: include new xdiff from git Update to the xdiff used in git v2.35.0, with updates to our build configuration to ignore the sort of warnings that we normally care about (signed/unsigned mismatch, unused, etc.) Any git-specific abstraction bits are now redefined for our use in `git-xdiff.h`. It is a (wildly optimistic) hope that we can use that indirection layer to standardize on a shared xdiff implementation.
Edward Thomson f0e693b1 2021-09-07T17:53:49 str: introduce `git_str` for internal, `git_buf` is external libgit2 has two distinct requirements that were previously solved by `git_buf`. We require: 1. A general purpose string class that provides a number of utility APIs for manipulating data (eg, concatenating, truncating, etc). 2. A structure that we can use to return strings to callers that they can take ownership of. By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult. Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr"). The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.) Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.
Edward Thomson 1a724625 2020-04-05T18:27:51 patch: use GIT_ASSERT
Kevin Swinton e72ade87 2020-04-18T11:32:56 Fix binary diff showing /dev/null Fixes issue where a changed binary file's content in the working tree isn't displayed correctly, instead showing an oid of zero, and with its path being reported incorrectly as "/dev/null".
Patrick Steinhardt e54343a4 2019-06-29T09:17:32 fileops: rename to "futils.h" to match function signatures Our file utils functions all have a "futils" prefix, e.g. `git_futils_touch`. One would thus naturally guess that their definitions and implementation would live in files "futils.h" and "futils.c", respectively, but in fact they live in "fileops.h". Rename the files to match expectations.
Erik Aigner b0893282 2019-07-11T12:12:04 patch_parse: ensure valid patch output with EOFNL
Edward Thomson f673e232 2018-12-27T13:47:34 git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.
Patrick Steinhardt ecf4f33a 2018-02-08T11:14:48 Convert usage of `git_buf_free` to new `git_buf_dispose`
Patrick Steinhardt 585b5dac 2017-11-18T15:43:11 refcount: make refcounting conform to aliasing rules Strict aliasing rules dictate that for most data types, you are not allowed to cast them to another data type and then access the casted pointers. While this works just fine for most compilers, technically we end up in undefined behaviour when we hurt that rule. Our current refcounting code makes heavy use of casting and thus violates that rule. While we didn't have any problems with that code, Travis started spitting out a lot of warnings due to a change in their toolchain. In the refcounting case, the code is also easy to fix: as all refcounting-statements are actually macros, we can just access the `rc` field directly instead of casting. There are two outliers in our code where that doesn't work. Both the `git_diff` and `git_patch` structures have specializations for generated and parsed diffs/patches, which directly inherit from them. Because of that, the refcounting code is only part of the base structure and not of the children themselves. We can help that by instead passing their base into `GIT_REFCOUNT_INC`, though.
Edward Thomson 1560b580 2017-08-15T10:35:47 Merge pull request #4288 from pks-t/pks/include-fixups Include fixups
Patrick Steinhardt 9093ced6 2017-07-10T11:42:26 patch_generate: represent buffers as void pointers Pointers to general data should usually be used as a void pointer such that it is possible to hand in variables of a different pointer type without the need to cast. This is the same when creating patches from buffers, where the buffers may contain arbitrary data. Instead of requiring the caller to care whether his buffer is e.g. `char *` or `unsigned char *`, we should instead just accept a `void *`. This is also consistent in how we tread other types like for example `git_blob`, which also just has a void pointer as its raw contents.
Patrick Steinhardt 0c7f49dd 2017-06-30T13:39:01 Make sure to always include "common.h" first Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.
Patrick Steinhardt 62a2fc06 2017-03-14T13:06:25 patch_generate: move `git_diff_foreach` to diff.c Now that the `git_diff_foreach` function does not depend on internals of the `git_patch_generated` structure anymore, we can easily move it to the actual diff code.
Patrick Steinhardt ace3508f 2017-03-14T10:37:47 patch_generate: fix `git_diff_foreach` only working with generated diffs The current logic of `git_diff_foreach` makes the assumption that all diffs passed in are actually derived from generated diffs. With these assumptions we try to derive the actual diff by inspecting either the working directory files or blobs of a repository. This obviously cannot work for diffs parsed from a file, where we do not necessarily have a repository at hand. Since the introduced split of parsed and generated patches, there are multiple functions which help us to handle patches generically, being indifferent from where they stem from. Use these functions and remove the old logic specific to generated patches. This allows re-using the same code for invoking the callbacks on the deltas.
Patrick Steinhardt 41019152 2017-03-14T10:01:56 patch_generate: remove duplicated logic Under the existing logic, we try to load patch contents differently, depending on whether the patch files stem from the working directory or not. But actually, the executed code paths are completely equal to each other -- so we were always the code despite the condition. Remove the condition altogether and conflate both code paths.
Etienne Samson b0014063 2016-12-26T22:13:35 patch: memory leak of patch.base.diff_opts.new|old_prefix
Edward Thomson 909d5494 2016-12-29T12:25:15 giterr_set: consistent error messages Error messages should be sentence fragments, and therefore: 1. Should not begin with a capital letter, 2. Should not conclude with punctuation, and 3. Should not end a sentence and begin a new one
Patrick Steinhardt 34b32053 2016-11-25T15:02:34 Fix potential use of uninitialized values
Edward Thomson adedac5a 2016-09-02T02:03:45 diff: treat binary patches with no data special When creating and printing diffs, deal with binary deltas that have binary data specially, versus diffs that have a binary file but lack the actual binary data.
Patrick Steinhardt 4b34f687 2016-09-01T15:14:25 patch_generate: only calculate binary diffs if requested When generating diffs for binary files, we load and decompress the blobs in order to generate the actual diff, which can be very costly. While we cannot avoid this for the case when we are called with the `GIT_DIFF_SHOW_BINARY` flag, we do not have to load the blobs in the case where this flag is not set, as the caller is expected to have no interest in the actual content of binary files. Fix the issue by only generating a binary diff when the caller is actually interested in the diff. As libgit2 uses heuristics to determine that a blob contains binary data by inspecting its size without loading from the ODB, this saves us quite some time when diffing in a repository with binary files.
Edward Thomson b859faa6 2016-08-23T23:38:39 Teach `git_patch_from_diff` about parsed diffs Ensure that `git_patch_from_diff` can return the patch for parsed diffs, not just generate a patch for a generated diff.
Edward Thomson 60e15ecd 2016-07-15T17:18:39 packbuilder: `size_t` all the things After 1cd65991, we were passing a pointer to an `unsigned long` to a function that now expected a pointer to a `size_t`. These types differ on 64-bit Windows, which means that we trash the stack. Use `size_t`s in the packbuilder to avoid this.
Edward Thomson 9be638ec 2016-04-19T15:12:18 git_diff_generated: abstract generated diffs
Edward Thomson 8d44f8b7 2015-11-24T15:19:59 patch: `patch_diff` -> `patch_generated`