src/commit.c


Log

Author Commit Date CI Message
Edward Thomson 3cca14b3 2021-12-23T14:13:34 Merge pull request #6125 from stforek/git_commit_summary_spaces git_commit_summary: ignore lines with spaces
Edward Thomson fc1a3f45 2021-11-29T13:36:36 object: return GIT_EINVALID on parse errors Return `GIT_EINVALID` on parse errors so that direct callers of parse functions can determine when there was a failure to parse the object. The object parser functions will swallow this error code to prevent it from propagating down the chain to end-users. (`git_merge` should not return `GIT_EINVALID` when a commit it tries to look up is not valid, this would be too vague to be useful.) The only public function that this affects is `git_signature_from_buffer`, which is now documented as returning `GIT_EINVALID` when appropriate.
Przemyslaw Ciezkowski 1e015088 2021-11-25T15:19:17 git_commit_summary: ignore lines with spaces Fixes libgit2/libgit2#6065
Edward Thomson f0e693b1 2021-09-07T17:53:49 str: introduce `git_str` for internal, `git_buf` is external libgit2 has two distinct requirements that were previously solved by `git_buf`. We require: 1. A general purpose string class that provides a number of utility APIs for manipulating data (eg, concatenating, truncating, etc). 2. A structure that we can use to return strings to callers that they can take ownership of. By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult. Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr"). The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.) Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.
Edward Thomson 3b2153fe 2020-04-05T14:42:44 commit: use GIT_ASSERT
Tobias Nießen 5e1b6eaf 2020-01-15T12:58:59 Make type mismatch errors consistent
Carlos Martín Nieto 718f24ad 2019-10-30T20:39:03 commit: verify objects exist in git_commit_with_signature There can be a significant difference between the system where we created the buffer (if at all) and when the caller provides us with the contents of a commit. Verify that the commit we are being asked to create references objects which do exist in the target repository.
Patrick Steinhardt f04a58b0 2019-10-03T12:55:48 Merge pull request #4445 from tiennou/shallow/dry-commit-parsing DRY commit parsing
Etienne Samson 1c847a6a 2018-10-25T19:40:19 commit: generic parse mechanism This allows us to pick which data from a commit we're interested in. This will be used by the revwalk code, which is only interested in parents' and committer data.
Tyler Ang-Wanek 998f9c15 2019-08-07T07:21:27 fixup: strange indentation
Tyler Ang-Wanek 75947105 2019-07-02T09:53:49 commit: git_commit_create_with_signature should support null signature If provided with a null signature, skip adding the signature header and create the commit anyway.
Edward Thomson f673e232 2018-12-27T13:47:34 git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.
Edward Thomson 168fe39b 2018-11-28T14:26:57 object_type: use new enumeration names Use the new object_type enumeration names within the codebase.
Patrick Steinhardt cb23c3ef 2018-11-21T10:54:29 commit: fix out-of-bound reads when parsing truncated author fields While commit objects usually should have only one author field, our commit parser actually handles the case where a commit has multiple author fields because some tools that exist in the wild actually write them. Detection of those additional author fields is done by using a simple `git__prefixcmp`, checking whether the current line starts with the string "author ". In case where we are handed a non-NUL-terminated string that ends directly after the space, though, we may have an out-of-bounds read of one byte when trying to compare the expected final NUL byte. Fix the issue by using `git__prefixncmp` instead of `git_prefixcmp`. Unfortunately, a test cannot be easily written to catch this case. While we could test the last error message and verify that it didn't in fact fail parsing a signature (because that would indicate that it has in fact tried to parse the additional "author " field, which it shouldn't be able to detect in the first place), this doesn't work as the next line needs to be the "committer" field, which would error out with the same error message even if we hadn't done an out-of-bounds read. As objects read from the object database are always NUL terminated, this issue cannot be triggered in normal code and thus it's not security critical.
Patrick Steinhardt 7655b2d8 2018-10-19T10:29:19 commit: fix reading out of bounds when parsing encoding The commit message encoding is currently being parsed by the `git__prefixcmp` function. As this function does not accept a buffer length, it will happily skip over a buffer's end if it is not `NUL` terminated. Fix the issue by using `git__prefixncmp` instead. Add a test that verifies that we are unable to parse the encoding field if it's cut off by the supplied buffer length.
Patrick Steinhardt ab265a35 2017-10-13T13:11:59 commit: implement function to parse raw data Currently, parsing objects is strictly tied to having an ODB object available. This makes it hard to parse an object when all that is available is its raw object and size. Furthermore, hacking around that limitation by directly creating an ODB structure either on stack or on heap does not really work that well due to ODB objects being reference counted and then automatically free'd when reaching a reference count of zero. Implement a function `git_commit__parse_raw` to parse a commit object from a pair of `data` and `size`.
Nika Layzell 56303e1a 2018-05-07T11:59:00 mailmap: API and style cleanup
Nika Layzell e3dcaca5 2018-03-17T18:15:01 mailmap: Integrate mailmaps with blame and signatures
Patrick Steinhardt ecf4f33a 2018-02-08T11:14:48 Convert usage of `git_buf_free` to new `git_buf_dispose`
Patrick Steinhardt 0c7f49dd 2017-06-30T13:39:01 Make sure to always include "common.h" first Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.
Edward Thomson 52d03f37 2017-03-03T13:26:29 git_commit_create: freshen tree objects in commit Freshen the tree object that a commit points to during commit time.
Patrick Steinhardt ade0d9c6 2017-02-13T13:46:17 commit: avoid possible use-after-free When extracting a commit's signature, we first free the object and only afterwards put its signature contents into the result buffer. This works in most cases - the free'd object will normally be cached anyway, so we only end up decrementing its reference count without actually freeing its contents. But in some more exotic setups, where caching is disabled, this can definitly be a problem, as we might be the only instance currently holding a reference to this object. Fix this issue by first extracting the contents and freeing the object afterwards only.
Patrick Steinhardt dc851d9e 2017-02-13T13:42:16 commit: clear user-provided buffers The functions `git_commit_header_field` and `git_commit_extract_signature` both receive buffers used to hand back the results to the user. While these functions called `git_buf_sanitize` on these buffers, this is not the right thing to do, as it will simply initialize or zero-terminate passed buffers. As we want to overwrite contents, we instead have to call `git_buf_clear` to completely reset them.
Edward Thomson 909d5494 2016-12-29T12:25:15 giterr_set: consistent error messages Error messages should be sentence fragments, and therefore: 1. Should not begin with a capital letter, 2. Should not conclude with punctuation, and 3. Should not end a sentence and begin a new one
Patrick Steinhardt a719ef5e 2016-10-07T09:31:41 commit: always initialize commit message When parsing a commit, we will treat all bytes left after parsing the headers as the commit message. When no bytes are left, we leave the commit's message uninitialized. While uncommon to have a commit without message, this is the right behavior as Git unfortunately allows for empty commit messages. Given that this scenario is so uncommon, most programs acting on the commit message will never check if the message is actually set, which may lead to errors. To work around the error and not lay the burden of checking for empty commit messages to the developer, initialize the commit message with an empty string when no commit message is given.
Edward Thomson 0d77a56f 2016-05-26T12:28:32 checkout: drop unused repo
John Haley 225cb880 2016-04-26T08:09:04 Fix `git_commit_create` for an initial commit When calling `git_commit_create` with an empty array of `parents` and `parent_count == 0` the call will segfault at https://github.com/libgit2/libgit2/blob/master/src/commit.c#L107 when it's trying to compare `current_id` to a null parent oid. This just puts in a check to stop that segfault.
Edward Thomson f0224772 2016-02-17T18:04:19 git_object_dup: introduce typesafe versions
Edward Thomson ba349322 2016-03-17T06:57:56 Merge pull request #3673 from libgit2/cmn/commit-with-signature commit: add function to attach a signature to a commit
Carlos Martín Nieto bf804d40 2016-03-17T10:45:22 commit: fix extraction of single-line signatures The function to extract signatures suffers from a similar bug to the header field finding one by having an unecessary line feed check as a break condition of its loop. Fix that and add a test for this single-line signature situation.
Carlos Martín Nieto 02d61a3b 2016-03-10T10:53:20 commit: add function to attach a signature to a commit In combination with the function which creates a commit into a buffer, this allows us to more easily create signed commits.
Carlos Martín Nieto 47cb42da 2016-03-03T22:56:02 commit: split creating the commit and writing it out Sometimes you want to create a commit but not write it out to the objectdb immediately. For these cases, provide a new function to retrieve the buffer instead of having to go through the db.
Edward Thomson ef63bab3 2016-02-23T13:34:35 git_commit: validate tree and parent ids When `GIT_OPT_ENABLE_STRICT_OBJECT_CREATION` is turned on, validate the tree and parent ids given to commit creation functions.
Carlos Martín Nieto eadd0f05 2016-02-16T14:06:48 commit: expose the different kinds of errors We should be checking whether the object we're looking up is a commit, and we should let the caller know whether the not-found return code comes from a bad object type or just a missing signature.
Carlos Martín Nieto 460ae11f 2016-02-11T22:19:20 commit: don't forget the last header field When we moved the logic to handle the first one, wrong loop logic was kept in place which meant we still finished early. But we now notice it because we're not reading past the last LF we find. This was not noticed before as the last field in the tested commit was multi-line which does not trigger the early break.
Vicent Marti 488e2b85 2016-02-09T16:26:58 Merge pull request #3599 from libgit2/gpgsign Introduce git_commit_extract_signature
Carlos Martín Nieto a65afb75 2016-02-08T18:51:13 Introduce git_commit_extract_signature This returns the GPG signature for a commit and its contents without the signature block, allowing for the verification of the commit's signature.
Carlos Martín Nieto f55eca16 2016-02-09T07:17:26 commit: also match the first header field when searching We were searching only past the first header field, which meant we were unable to find e.g. `tree` which is the first field. While here, make sure to set an error message in case we cannot find the field.
Patrick Steinhardt 7f8fe1d4 2015-12-01T10:03:56 commit: introduce `git_commit_body` It is already possible to get a commit's summary with the `git_commit_summary` function. It is not possible to get the remaining part of the commit message, that is the commit message's body. Fix this by introducing a new function `git_commit_body`.
Stjepan Rajko f5f96a23 2015-10-09T10:41:06 Fix git_commit_summary to convert newlines to spaces even after whitespace. Collapse spaces around newlines for the summary.
Carlos Martín Nieto a3f42fe8 2015-06-22T15:32:29 commit: allow retrieving an arbitrary header field This allows the user to look up fields which we don't parse in libgit2, and allows them to access gpgsig or mergetag fields if they wish to check the signature.
Carlos Martín Nieto 65d69fe8 2015-06-11T08:24:58 commit: ignore multiple author fields Some tools create multiple author fields. git is rather lax when parsing them, although fsck does complain about them. This means that they exist in the wild. As it's not too taxing to check for them, and there shouldn't be a noticeable slowdown when dealing with correct commits, add logic to skip over these extra fields when parsing the commit.
Carlos Martín Nieto 659cf202 2015-01-07T12:23:05 Remove the signature from ref-modifying functions The signature for the reflog is not something which changes dynamically. Almost all uses will be NULL, since we want for the repository's default identity to be used, making it noise. In order to allow for changing the identity, we instead provide git_repository_set_ident() and git_repository_ident() which allow a user to override the choice of signature.
Stefan Widgren c8e02b87 2015-02-15T21:07:05 Remove extra semicolon outside of a function Without this change, compiling with gcc and pedantic generates warning: ISO C does not allow extra ‘;’ outside of a function.
Edward Thomson a612a25f 2014-07-18T18:22:54 git_rebase_commit: write HEAD's reflog appropriately
Carlos Martín Nieto 217c029b 2014-04-09T14:08:22 commit: safer commit creation with reference update The current version of the commit creation and amend function are unsafe to use when passing the update_ref parameter, as they do not check that the reference at the moment of update points to what the user expects. Make sure that we're moving history forward when we ask the library to update the reference for us by checking that the first parent of the new commit is the current value of the reference. We also make sure that the ref we're updating hasn't moved between the read and the write. Similarly, when amending a commit, make sure that the current tip of the branch is the commit we're amending.
Carlos Martín Nieto 7c1ee212 2014-03-07T15:17:08 commit: simplify and correct refcounting in nth_gen_ancestor We can make use of git_object_dup to use refcounting instead of pointer comparison to make sure we don't free the caller's object. This also lets us simplify the case for '~0' which is now just an assignment instead of looking up the object we have at hand.
Edward Thomson 4f46a98b 2014-02-24T23:32:25 Remove now-duplicated stdarg.h include
Russell Belfer 80c29fe9 2014-01-17T10:45:11 Add git_commit_amend API This adds an API to amend an existing commit, basically a shorthand for creating a new commit filling in missing parameters from the values of an existing commit. As part of this, I also added a new "sys" API to create a commit using a callback to get the parents. This allowed me to rewrite all the other commit creation APIs so that temporary allocations are no longer needed.
Ben Straub 0de2c4e3 2014-02-05T13:15:57 Merge remote-tracking branch 'libgit2/development' into bs/more-reflog-stuff
Carlos Martín Nieto a6563619 2014-02-05T13:01:54 commit: faster parsing The current code issues a lot of strncmp() calls in order to check for the end of the header, simply in order to copy it and start going through it again. These are a lot of calls for something we can check as we go along. Knowing the amount of parents beforehand to reduce allocations in extreme cases does not make up for them. Instead start parsing immediately and check for the double-newline after each header field, leaving the raw_header allocation for the end, which lets us go through the header once and reduces the amount of strncmp() calls significantly. In unscientific testing, this has reduced a shortlog-like usage (walking though the whole history of a branch and extracting data from the commits) of git.git from ~830ms to ~700ms and makes the time we spend in strncmp() negligible.
Ben Straub 0adb0606 2014-02-04T15:32:57 Fix reflog message when creating commits
Carlos Martín Nieto 47e28349 2014-01-24T12:01:34 commit: remvoe legacy 'oid' naming
Edward Thomson 238e8149 2014-01-22T14:41:04 Summarize empty messages
Carlos Martín Nieto 0b28217b 2014-01-15T12:51:31 refs: remove the _with_log differentiation Any well-behaved program should write a descriptive message to the reflog whenever it updates a reference. Let's make this more prominent by removing the version without the reflog parameters.
Paul Holden be0a1a79 2013-12-08T02:03:05 commit: Fix potential segfault in git_commit_message Dereferencing commit pointer before asserting
Edward Thomson 300d192f 2013-12-02T11:15:27 Introduce git_revert to revert a single commit
nulltoken 598f069b 2013-10-02T12:42:41 commit: Introduce git_commit_message_raw()
nulltoken d27a441d 2013-09-30T11:30:28 commit: Trim message leading newlines Fix libgit2/libgit2sharp#522
Russell Belfer 584f2d30 2013-07-11T11:04:42 Fix warnings on Win64
Russell Belfer 9abc78ae 2013-07-07T21:56:11 Convert commit->parent_ids to git_array_t This converts the array of parent SHAs from a git_vector where each SHA has to be separately allocated to a git_array_t where all the SHAs can be kept in one block. Since the two collections have almost identical APIs, there isn't much involved in making the change. I did add an API to git_array_t so that it could be allocated at a precise initial size.
Russell Belfer f094f905 2013-07-01T15:41:01 Add raw header access to commit API
Russell Belfer 58206c9a 2013-05-16T10:38:27 Add cat-file example and increase const use in API This adds an example implementation that emulates git cat-file. It is a convenient and relatively simple example of getting data out of a repository. Implementing this also revealed that there are a number of APIs that are still not using const pointers to objects that really ought to be. The main cause of this is that `git_vector_bsearch` may need to call `git_vector_sort` before doing the search, so a const pointer to the vector is not allowed. However, for tree objects, with a little care, we can ensure that the vector of tree entries is always sorted and allow lookups to take a const pointer. Also, the missing const in commit objects just looks like an oversight.
Linquize e583334c 2013-05-10T21:42:22 Fix broken build when MSVC SDL checks is enabled
nulltoken 467cbec7 2013-05-05T16:48:34 commit: make create_from_oids() accept plain oid
nulltoken ce72e399 2013-05-05T16:45:38 commit: guard create() against not owned trees
Russell Belfer 3f27127d 2013-04-16T11:51:02 Simplify object table parse functions This unifies the object parse functions into one signature that takes an odb_object.
Russell Belfer 78606263 2013-04-15T00:05:44 Add callback to git_objects_table This adds create and free callback to the git_objects_table so that more of the creation and destruction of objects can be table driven instead of using switch statements. This also makes the semantics of certain object creation functions consistent so that we can make better use of function pointers. This also fixes a theoretical error case where an object allocation fails and we end up storing NULL into the cache.
Russell Belfer badd85a6 2013-04-10T17:10:17 Use git_odb_object_data/_size whereever possible This uses the odb object accessors so we can change the internals more easily...
Vicent Marti 8842c75f 2013-04-03T22:30:07 What has science done.
John Wiegley 92550398 2013-01-29T09:53:23 Added git_commit_create_oid
Russell Belfer 9233b3de 2013-04-19T13:17:29 Move git_commit_create_from_oids into sys/commit.h Actually this renames git_commit_create_oid to git_commit_create_from_oids and moves the API declaration to include/git2/sys/commit.h since it is a dangerous API for general use (because it doesn't check that the OID list items actually refer to real objects).
Carlos Martín Nieto 0efae3b2 2013-04-15T12:24:08 commit: correctly detect the start of the commit message The end of the header is signaled by to consecutive LFs and the commit message starts immediately after. Jumping over LFs at the start of the message is a bug and leads to creating different commits if when rebuilding history. This also fixes an empty commit message being returned as "\n".
Arkadiy Shapkin 10c06114 2013-03-17T04:46:46 Several warnings detected by static code analyzer fixed Implicit type conversion argument of function to size_t type Suspicious sequence of types castings: size_t -> int -> size_t Consider reviewing the expression of the 'A = B == C' kind. The expression is calculated as following: 'A = (B == C)' Unsigned type is never < 0
Edward Thomson d00d5464 2013-03-01T15:37:33 immutable references and a pluggable ref database
Philip Kelley 11d9f6b3 2013-01-27T14:17:07 Vector improvements and their fallout
Carlos Martín Nieto d47c6aab 2013-01-20T04:20:09 commit: don't include the LF in the header field value When the encoding header changed to be treated as an additional header, the EOL pointer started to point to the byte after the LF, making the git__strndup call copy the LF into the value. Increase the EOL pointer value after copying the data to keep the rest of the semantics but avoid copying LF.
Russell Belfer 291090a0 2013-01-17T13:19:09 Add skipping of unknown commit headers This moves the check for the "encoding" header into a loop which is just scanning for non-required headers at the end of a commit header. That loop will skip unrecognized lines (including header continuation lines) until a terminating completely blank line is found, and only then does it move to reading the commit message.
Edward Thomson 359fc2d2 2013-01-08T17:07:25 update copyrights
Ben Straub de70aea6 2012-12-03T12:41:50 Remove GIT_SIGNATURE_VERSION and friends
Ben Straub c7231c45 2012-11-30T16:31:42 Deploy GITERR_CHECK_VERSION
Ben Straub 4ec197f3 2012-11-30T12:52:42 Deploy GIT_SIGNATURE_INIT
Vicent Marti cfbe4be3 2012-11-17T19:54:47 More external API cleanup Conflicts: src/branch.c tests-clar/refs/branches/create.c
nulltoken b8457baa 2012-07-24T07:57:58 portability: Improve x86/amd64 compatibility
nulltoken 2b92a154 2012-07-11T11:20:20 commit: reduce code duplication
nulltoken b1aca6ea 2012-07-11T16:14:12 commit: introduce git_commit_nth_gen_ancestor()
Tim Clem e00b56eb 2012-06-15T10:15:57 Fix broken tests caused by no longer prettifying by default
Tim Clem e4031cb5 2012-06-15T09:26:56 Kill message_prettify - we will export instead
Tim Clem bc2deed0 2012-06-15T09:13:59 Don't strip comments (#) from commit messages by default
nulltoken edebceff 2012-05-01T13:57:45 Add git_reset() Currently supports Soft and Mixed modes.
Michael Schubert 54db1a18 2012-05-19T13:20:55 Cleanup * indexer: remove leftover printf * commit: remove unused macros COMMIT_BASIC_PARSE, COMMIT_FULL_PARSE and COMMIT_PRINT
Vicent Martí 904b67e6 2012-05-18T01:48:50 errors: Rename error codes
Vicent Martí e172cf08 2012-05-18T01:21:06 errors: Rename the generic return codes
nulltoken 458b9450 2012-03-01T17:03:32 commit/tag: ensure the message is cleaned up 'git commit' and 'git tag -a' enforce some conventions, like cleaning up excess whitespace and making sure that the last line ends with a '\n'. This fix replicates this behavior. Fix libgit2/libgit2sharp#117
Carlos Martín Nieto 3aa351ea 2012-04-26T15:05:07 error handling: move the missing parts over to the new error handling
nulltoken d4d648b0 2012-04-11T15:25:34 Fix compilation errors and warnings
Vicent Martí 73fe6a8e 2012-03-28T18:59:12 error-handling: Commit (WIP)
Vicent Martí cb8a7961 2012-03-07T00:02:55 error-handling: Repository This also includes droping `git_buf_lasterror` because it makes no sense in the new system. Note that in most of the places were it has been dropped, the code needs cleanup. I.e. GIT_ENOMEM is going away, so instead it should return a generic `-1` and obviously not throw anything.
schu b4b79ac3 2012-02-15T00:12:53 commit: actually allow yet to be born update_ref git_commit_create is supposed to update the given reference "update_ref", but segfaulted in case of a yet to be born reference. Fix it. Signed-off-by: schu <schu-github@schulog.org>
schu 5e0de328 2012-02-13T17:10:24 Update Copyright header Signed-off-by: schu <schu-github@schulog.org>