src/diff_print.c


Log

Author Commit Date CI Message
Edward Thomson f0e693b1 2021-09-07T17:53:49 str: introduce `git_str` for internal, `git_buf` is external libgit2 has two distinct requirements that were previously solved by `git_buf`. We require: 1. A general purpose string class that provides a number of utility APIs for manipulating data (eg, concatenating, truncating, etc). 2. A structure that we can use to return strings to callers that they can take ownership of. By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult. Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr"). The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.) Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.
Edward Thomson 5d6c2f26 2020-04-05T14:59:54 diff: use GIT_ASSERT
Edward Thomson cb4bfbc9 2020-04-05T11:07:54 buffer: git_buf_sanitize should return a value `git_buf_sanitize` is called with user-input, and wants to sanity-check that input. Allow it to return a value if the input was malformed in a way that we cannot cope.
Patrick Steinhardt 6256d023 2020-06-15T14:34:29 diff_print: adjust code to match current coding style
Patrick Steinhardt 490d0c9c 2020-06-15T14:26:13 diff_print: return out-of-memory situation when printing binary We currently don't check for out-of-memory situations on exiting `format_binary` and, as a result, may return a partially filled buffer. Fix this by checking the buffer via `git_buf_oom`.
Patrick Steinhardt bea5fd9f 2020-06-15T13:26:18 diff_print: do not call abort(3P) Calling abort(3P) in a library is rather rude and shouldn't happen, as we effectively prohibit any corrective actions made by the application linking to it. We thus shouldn't call it at all, but instead use our new `GIT_ASSERT` macros. Remove the call to abort(3P) in case a diff delta has an unexpected type to fix this.
Patrick Steinhardt 0cf1f444 2020-06-15T13:19:44 diff_print: handle errors when printing to file When printing the diff to a `FILE *` handle, we neither check the return value of fputc(3P) nor the one of fwrite(3P). As a result, we'll silently return successful even if we didn't print anything at all. Futhermore, the arguments to fwrite(3P) are reversed: we have one item of length `content_len`, and not `content_len` items of one byte. Fix both issues by checking return values as well as reversing the arguments to fwrite(3P).
Patrick Steinhardt a6c9e0b3 2020-06-08T12:40:47 tree-wide: mark local functions as static We've accumulated quite some functions which are never used outside of their respective code unit, but which are lacking the `static` keyword. Add it to reduce their linkage scope and allow the compiler to optimize better.
Patrick Steinhardt 5f47cb48 2020-03-26T14:16:41 patch: correctly handle mode changes for renames When generating a patch for a renamed file whose mode bits have changed in addition to the rename, then we currently fail to parse the generated patch. Furthermore, when generating a diff we output mode bits after the similarity metric, which is different to how upstream git handles it. Fix both issues by adding another state transition that allows similarity indices after mode changes and by printing mode changes before the similarity index.
Gregory Herrero b921964b 2019-11-07T13:08:51 diff_print: add support for GIT_DIFF_FORMAT_PATCH_ID. Git is generating patch-id using a stripped down version of a patch where hunk header and index information are not present. Signed-off-by: Gregory Herrero <gregory.herrero@oracle.com>
Gregory Herrero accd7848 2019-11-07T13:02:38 diff_print: add a new 'print_index' flag when printing diff. Add a new 'print_index' flag to let the caller decide whether or not 'index <oid>..<oid>' should be printed. Since patch id needs not to have index when hashing a patch, it will be useful soon. Signed-off-by: Gregory Herrero <gregory.herrero@oracle.com>
Patrick Steinhardt e54343a4 2019-06-29T09:17:32 fileops: rename to "futils.h" to match function signatures Our file utils functions all have a "futils" prefix, e.g. `git_futils_touch`. One would thus naturally guess that their definitions and implementation would live in files "futils.h" and "futils.c", respectively, but in fact they live in "fileops.h". Rename the files to match expectations.
Patrick Steinhardt 658022c4 2019-07-18T13:53:41 configuration: cvar -> configmap `cvar` is an unhelpful name. Refactor its usage to `configmap` for more clarity.
Edward Thomson 5d92e547 2019-06-08T17:28:35 oid: `is_zero` instead of `iszero` The only function that is named `issomething` (without underscore) was `git_oid_iszero`. Rename it to `git_oid_is_zero` for consistency with the rest of the library.
Edward Thomson f673e232 2018-12-27T13:47:34 git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.
Patrick Steinhardt ecf4f33a 2018-02-08T11:14:48 Convert usage of `git_buf_free` to new `git_buf_dispose`
Erik van Zijst bc5ced66 2018-04-04T21:28:31 diff: Add missing GIT_DELTA_TYPECHANGE -> 'T' mapping. This adds the 'T' status character to git_diff_status_char() for diff entries that change type.
Patrick Steinhardt 0c7f49dd 2017-06-30T13:39:01 Make sure to always include "common.h" first Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.
Edward Thomson 909d5494 2016-12-29T12:25:15 giterr_set: consistent error messages Error messages should be sentence fragments, and therefore: 1. Should not begin with a capital letter, 2. Should not conclude with punctuation, and 3. Should not end a sentence and begin a new one
Edward Thomson adedac5a 2016-09-02T02:03:45 diff: treat binary patches with no data special When creating and printing diffs, deal with binary deltas that have binary data specially, versus diffs that have a binary file but lack the actual binary data.
Edward Thomson f4e3dae7 2016-09-02T11:26:16 diff_print: change test for skipping binary printing Instead of skipping printing a binary diff when there is no data, skip printing when we have a status of `UNMODIFIED`. This is more in-line with our internal data model and allows us to expand the notion of binary data. In the future, there may have no data because the files were unmodified (there was no data to produce) or it may have no data because there was no data given to us in a patch. We want to treat these cases separately.
Edward Thomson 1a79cd95 2016-04-26T01:18:01 patch: show copy information for identical copies When showing copy information because we are duplicating contents, for example, when performing a `diff --find-copies-harder -M100 -B100`, then show copy from/to lines in a patch, and do not show context. Ensure that we can also parse such patches.
Edward Thomson 72827490 2016-04-25T12:40:19 Introduce `git_diff_to_buf` Like `git_patch_to_buf`, provide a simple helper method that can print an entire diff directory to a `git_buf`.
Edward Thomson 8d44f8b7 2015-11-24T15:19:59 patch: `patch_diff` -> `patch_generated`
Edward Thomson f941f035 2015-09-24T09:25:10 patch: drop some warnings
Edward Thomson 040ec883 2015-09-23T18:20:17 patch: use strlen to mean string length `oid_strlen` has meant one more than the length of the string. This is mighty confusing. Make it mean only the string length! Whomsoever needs to allocate a buffer to hold a string can null terminate it like normal.
Edward Thomson e2cdc145 2015-09-23T17:48:52 patch: show modes when only the mode has changed
Edward Thomson 4ac2d8ac 2015-09-23T16:33:02 patch: quote filenames when necessary
Edward Thomson 72806f4c 2015-09-23T13:56:48 patch: don't print some headers on pure renames
Edward Thomson 19e46645 2015-09-23T11:07:04 patch printing: include rename information
Edward Thomson 28f70443 2015-09-23T10:38:51 patch_parse: use names from `diff --git` header When a text file is added or deleted, use the file names from the `diff --git` header instead of the `---` or `+++` lines. This is for compatibility with git.
Edward Thomson d68cb736 2015-09-22T18:25:03 diff: include oid length in deltas Now that `git_diff_delta` data can be produced by reading patch file data, which may have an abbreviated oid, allow consumers to know that the id is abbreviated.
Edward Thomson 804d5fe9 2015-09-11T08:37:12 patch: abstract patches into diff'ed and parsed Patches can now come from a variety of sources - either internally generated (from diffing two commits) or as the results of parsing some external data.
Edward Thomson 3149ff6f 2015-06-17T18:13:10 patch application: apply binary patches Handle the application of binary patches. Include tests that produce a binary patch (an in-memory `git_patch` object), then enusre that the patch applies correctly.
Patrick Steinhardt be8479c9 2016-02-22T14:01:50 diff_print: assert patch is non-NULL When invoking `diff_print_info_init_frompatch` it is obvious that the patch should be non-NULL. We explicitly check if the variable is set and continue afterwards, happily dereferencing the potential NULL-pointer. Fix this by instead asserting that patch is set. This also silences Coverity.
Guille -bisho- e4b2b919 2015-09-25T10:37:41 Fix binary diffs git expects an empty line after the binary data: literal X ...binary data... <empty_line> The last literal block of the generated patches were not containing the required empty line. Example: diff --git a/binary_file b/binary_file index 3f1b3f9098131cfecea4a50ff8afab349ea66d22..86e5c1008b5ce635d3e3fffa4434c5eccd8f00b6 100644 GIT binary patch literal 8 Pc${NM&PdElPvrst3ey5{ literal 6 Nc${NM%g@i}0ssZ|0lokL diff --git a/binary_file2 b/binary_file2 index 31be99be19470da4af5b28b21e27896a2f2f9ee2..86e5c1008b5ce635d3e3fffa4434c5eccd8f00b6 100644 GIT binary patch literal 8 Pc${NM&PdElPvrst3ey5{ literal 13 Sc${NMEKbZyOexL+Qd|HZV+4u- git apply of that diff results in: error: corrupt binary patch at line 9: diff --git a/binary_file2 b/binary_file2 fatal: patch with only garbage at line 10 The proper formating is: diff --git a/binary_file b/binary_file index 3f1b3f9098131cfecea4a50ff8afab349ea66d22..86e5c1008b5ce635d3e3fffa4434c5eccd8f00b6 100644 GIT binary patch literal 8 Pc${NM&PdElPvrst3ey5{ literal 6 Nc${NM%g@i}0ssZ|0lokL diff --git a/binary_file2 b/binary_file2 index 31be99be19470da4af5b28b21e27896a2f2f9ee2..86e5c1008b5ce635d3e3fffa4434c5eccd8f00b6 100644 GIT binary patch literal 8 Pc${NM&PdElPvrst3ey5{ literal 13 Sc${NMEKbZyOexL+Qd|HZV+4u-
Edward Thomson 49840056 2015-06-30T14:20:31 diff: use size_t format
Carlos Martín Nieto 9568660f 2015-06-26T18:31:39 diff: fix leaks in diff printing
Pierre-Olivier Latour 0f4d9c03 2015-06-15T09:52:40 Fixed Xcode 6.1 build warnings
Edward Thomson 8147b1af 2015-05-25T20:03:59 diff: introduce binary diff callbacks Introduce a new binary diff callback to provide the actual binary delta contents to callers. Create this data from the diff contents (instead of directly from the ODB) to support binary diffs including the workdir, not just things coming out of the ODB.
Edward Thomson e003f83a 2014-07-31T15:14:56 Introduce git_buf_decode_base64 Decode base64-encoded text into a git_buf
Alan Rogers 85b7268e 2014-07-23T12:17:02 undo indentation change in diff_print.c
Alan Rogers dc49e1b5 2014-06-04T15:36:28 Merge remote-tracking branch 'origin/development' into fix-git-status-list-new-unreadable-folder Conflicts: include/git2/diff.h
Russell Belfer bc81220d 2014-05-31T10:19:55 minor cleanups
Russell Belfer 947a58c1 2014-05-30T13:19:49 Clean up the handling of large binary diffs
Alan Rogers 61bef72d 2014-05-20T23:57:40 Start adding GIT_DELTA_UNREADABLE and GIT_STATUS_WT_UNREADABLE.
Philip Kelley 4c9ffdff 2014-05-17T12:45:34 Fix printf format string from previous commit
Philip Kelley c6320bec 2014-05-17T12:19:32 print_binary_hunk: Treat types with respect
Russell Belfer 1e4976cb 2014-05-08T10:17:14 Be more careful with user-supplied buffers This adds in missing calls to `git_buf_sanitize` and fixes a number of places where `git_buf` APIs could inadvertently write NUL terminator bytes into invalid buffers. This also changes the behavior of `git_buf_sanitize` to NUL terminate a buffer if it can and of `git_buf_shorten` to do nothing if it can. Adds tests of filtering code with zeroed (i.e. unsanitized) buffer which was previously triggering a segfault.
Vicent Marti 212b6205 2014-04-23T09:27:15 Merge pull request #2291 from ethomson/patch_binary patch: emit deflated binary patches (optionally)
Edward Thomson e349ed50 2014-04-22T14:58:33 patch: emit binary patches (optionally)
Russell Belfer 12e422a0 2014-04-21T16:08:05 Some doc and examples/diff.c changes I was playing with "git diff-index" and wanted to be able to emulate that behavior a little more closely with the diff example. Also, I wanted to play with running `git_diff_tree_to_workdir` directly even though core Git doesn't exactly have the equivalent, so I added a command line option for that and tweaked some other things in the example code. This changes a minor output thing in that the "raw" print helper function will no longer add ellipses (...) if the OID is not actually abbreviated.
Russell Belfer 27e54bcf 2014-02-07T14:17:19 Add public diff print helpers The usefulness of these helpers came up for me while debugging some of the iterator changes that I was making, so since they have also been requested (albeit indirectly) I thought I'd include them.
Carlos Martín Nieto 86bfc3e1 2014-01-24T20:30:10 diff: change id abbrev option's name to id_abbrev Same as the other commits in the series, we use 'id' when talking about thing rather than the datatype.
Carlos Martín Nieto 9950bb4e 2014-01-24T20:23:17 diff: rename the file's 'oid' to 'id' In the same vein as the previous commits in this series.
Nicolas Hake c05cd792 2014-01-22T17:51:32 Drop git_patch_to_str It's hard or even impossible to correctly free the string buffer allocated by git_patch_to_str in some circumstances. Drop the function so people have to use git_patch_to_buf instead - git_buf has a dedicated destructor.
Nicolas Hake 450e8e9e 2014-01-22T13:22:15 Expose patch serialization to git_buf Returning library-allocated strings from libgit2 works fine on Linux, but may cause problems on Windows because there is no one C Runtime that everything links against. With libgit2 not exposing its own allocator, freeing the string is a gamble. git_patch_to_str already serializes to a buffer, then returns the underlying memory. Expose the functionality directly, so callers can use the git_buf_free function to free the memory later.
Russell Belfer 26c1cb91 2013-12-09T09:44:03 One more rename/cleanup for callback err functions
Russell Belfer 25e0b157 2013-12-06T15:07:57 Remove converting user error to GIT_EUSER This changes the behavior of callbacks so that the callback error code is not converted into GIT_EUSER and instead we propagate the return value through to the caller. Instead of using the giterr_capture and giterr_restore functions, we now rely on all functions to pass back the return value from a callback. To avoid having a return value with no error message, the user can call the public giterr_set_str or some such function to set an error message. There is a new helper 'giterr_set_callback' that functions can invoke after making a callback which ensures that some error message was set in case the callback did not set one. In places where the sign of the callback return value is meaningful (e.g. positive to skip, negative to abort), only the negative values are returned back to the caller, obviously, since the other values allow for continuing the loop. The hardest parts of this were in the checkout code where positive return values were overloaded as meaningful values for checkout. I fixed this by adding an output parameter to many of the internal checkout functions and removing the overload. This added some code, but it is probably a better implementation. There is some funkiness in the network code where user provided callbacks could be returning a positive or a negative value and we want to rely on that to cancel the loop. There are still a couple places where an user error might get turned into GIT_EUSER there, I think, though none exercised by the tests.
Russell Belfer 96869a4e 2013-12-03T16:45:39 Improve GIT_EUSER handling This adds giterr_user_cancel to return GIT_EUSER and clear any error message that is sitting around. As a result of using that in places, we need to be more thorough with capturing errors that happen inside a callback when used internally. To help with that, this also adds giterr_capture and giterr_restore so that when we internally use a foreach-type function that clears errors and converts them to GIT_EUSER, it is easier to restore not just the return value, but the actual error message text.
Russell Belfer 3b5f7954 2013-10-21T13:42:42 Create git_diff_line and extend git_diff_hunk Instead of having functions with so very many parameters to pass hunk and line data, this takes the existing git_diff_hunk struct and extends it with more hunk data, plus adds a git_diff_line. Those structs are used to pass back hunk and line data instead of the old APIs that took tons of parameters. Some work that was previously only being done for git_diff_patch creation (scanning the diff content for exact line counts) is now done for all callbacks, but the performance difference should not be noticable.
Russell Belfer 10672e3e 2013-10-15T15:10:07 Diff API cleanup This lays groundwork for separating formatting options from diff creation options. This groups the formatting flags separately from the diff list creation flags and reorders the options. This also tweaks some APIs to further separate code that uses patches from code that just looks at git_diffs.
Russell Belfer 3ff1d123 2013-10-11T14:51:54 Rename diff objects and split patch.h This makes no functional change to diff but renames a couple of the objects and splits the new git_patch (formerly git_diff_patch) into a new header file.
Russell Belfer 00e85927 2013-09-23T21:52:42 Clean up unnecessary git_buf_printf calls This replaces some git_buf_printf calls with simple calls to git_buf_put instead. Also, it fixes a missing va_end inside the git_buf_vprintf implementation.
Russell Belfer a7fcc44d 2013-09-05T16:14:32 Better macro name for is-exec-bit-set test
Russell Belfer f240acce 2013-09-05T11:20:12 Add more file mode permissions macros This adds some more macros for some standard operations on file modes, particularly related to permissions, and then updates a number of places around the code base to use the new macros.
Russell Belfer eb1c1707 2013-07-23T15:45:58 Restore GIT_DIFF_LINE_BINARY usage This restores the usage of GIT_DIFF_LINE_BINARY for the diff output line that reads "Binary files x and y differ" so that it can be optionally colorized independently of the file header.
Russell Belfer df40f398 2013-07-23T15:18:28 Make compact output more like core Git
Russell Belfer 197b8966 2013-07-23T14:34:31 Add hunk/file headers to git_diff_patch_size This allows git_diff_patch_size to account for hunk headers and file headers in the returned size. This required some refactoring of the code that is used to print file headers so that it could be invoked by the git_diff_patch_size API. Also this increases the test coverage and fixes an off-by-one bug in the size calculation when newline changes happen at the end of the file.
Russell Belfer e4acc3ba 2013-06-18T16:14:35 Fix rename looped reference issues This makes the diff rename tracking code more careful about the order in which it processes renames and more thorough in updating the mapping of correct renames when an earlier rename update alters the index of a later matched pair.
Russell Belfer 74ded024 2013-06-17T17:03:34 Add "as_path" parameters to blob and buffer diffs This adds parameters to the four functions that allow for blob-to- blob and blob-to-buffer differencing (either via callbacks or by making a git_diff_patch object). These parameters let you say that filename we should pretend the blob has while doing the diff. If you pass NULL, there should be no change from the existing behavior, which is to skip using attributes for file type checks and just look at content. With the parameters, you can plug into the new diff driver functionality and get binary or non-binary behavior, plus function context regular expressions, etc. This commit also fixes things so that the git_diff_delta that is generated by these functions will actually be populated with the data that we know about the blobs (or buffers) so you can use it appropriately. It also fixes a bug in generating patches from the git_diff_patch objects created via these functions. Lastly, there is one other behavior change that may matter. If there is no difference between the two blobs, these functions no longer generate any diff callbacks / patches unless you have passed in GIT_DIFF_INCLUDE_UNMODIFIED. This is pretty natural, but could potentially change the behavior of existing usage.
Russell Belfer a1683f28 2013-06-14T16:18:04 More tests and bug fixes for status with rename This changes the behavior of the status RENAMED flags so that they will be combined with the MODIFIED flags if appropriate. If a file is modified in the index and also renamed, then the status code will have both the GIT_STATUS_INDEX_MODIFIED and INDEX_RENAMED bits set. If it is renamed but the OID has not changed, then just the GIT_STATUS_INDEX_RENAMED bit will be set. Similarly, the flags GIT_STATUS_WT_MODIFIED and GIT_STATUS_WT_RENAMED can both be set independently of one another. This fixes a serious bug where the check for unmodified files that was done at data load time could end up erasing the RENAMED state of a file that was renamed with no changes. Lastly, this contains a bunch of new tests for status with renames, including tests where the only rename changes are case changes. The expected results of these tests have to vary by whether the platform uses a case sensitive filesystem or not, so the expected data covers those platform differences separately.
Russell Belfer 360f42f4 2013-06-12T14:18:09 Fix diff header naming issues This makes the git_diff_patch definition private to diff_patch.c and fixes a number of other header file naming inconsistencies to use `git_` prefixes on functions and structures that are shared between files.
Russell Belfer 114f5a6c 2013-06-10T10:10:39 Reorganize diff and add basic diff driver This is a significant reorganization of the diff code to break it into a set of more clearly distinct files and to document the new organization. Hopefully this will make the diff code easier to understand and to extend. This adds a new `git_diff_driver` object that looks of diff driver information from the attributes and the config so that things like function content in diff headers can be provided. The full driver spec is not implemented in the commit - this is focused on the reorganization of the code and putting the driver hooks in place. This also removes a few #includes from src/repository.h that were overbroad, but as a result required extra #includes in a variety of places since including src/repository.h no longer results in pulling in the whole world.
Russell Belfer 7000f3fa 2013-06-04T10:32:59 Move some diff helpers into separate file