src


Log

Author Commit Date CI Message
Edward Thomson 53454b68 2018-02-02T11:31:15 Merge pull request #4510 from pks-t/pks/attr-file-bare-stat attr: avoid stat'ting files for bare repositories
Patrick Steinhardt e28e17e6 2018-02-01T10:36:33 attr: avoid stat'ting files for bare repositories Depending on whether the path we want to look up an attribute for is a file or a directory, the fnmatch function will be called with different flags. Because of this, we have to first stat(3) the path to determine whether it is a file or directory in `git_attr_path__init`. This is wasteful though in bare repositories, where we can already be assured that the path will never exist at all due to there being no worktree. In this case, we will execute an unnecessary syscall, which might be noticeable on networked file systems. What happens right now is that we always pass the `GIT_DIR_FLAG_UNKNOWN` flag to `git_attr_path__init`, which causes it to `stat` the file itself to determine its type. As it is calling `git_path_isdir` on the path, which will always return `false` in case the path does not exist, we end up with the path always being treated as a file in case of a bare repository. As such, we can just check the bare-repository case in all callers and then pass in `GIT_DIR_FLAG_FALSE` ourselves, avoiding the need to `stat`. While this may not always be correct, it at least is no different from our current behavior.
Edward Thomson 341608dc 2018-01-31T14:48:42 Merge pull request #4507 from tomas/patch-1 Honor 'GIT_USE_NSEC' option in `filesystem_iterator_set_current`
Edward Thomson 9d8510b3 2018-01-31T09:28:43 Merge pull request #4488 from libgit2/ethomson/conflict_marker_size Use longer conflict markers in recursive merge base
Tomás Pollak 054e4c08 2018-01-31T14:28:25 Set ctime/mtime nanosecs to 0 if USE_NSEC is not defined
Tomás Pollak 752006dd 2018-01-30T23:21:19 Honor 'GIT_USE_NSEC' option in `filesystem_iterator_set_current` This should have been part of PR #3638. Without this we still get nsec-related errors, even when using -DGIT_USE_NSEC: error: ‘struct stat’ has no member named ‘st_mtime_nsec’
Patrick Steinhardt 275f103d 2018-01-12T08:59:40 odb: reject reading and writing null OIDs The null OID (hash with all zeroes) indicates a missing object in upstream git and is thus not a valid object ID. Add defensive measures to avoid writing such a hash to the object database in the very unlikely case where some data results in the null OID. Furthermore, add shortcuts when reading the null OID from the ODB to avoid ever returning an object when a faulty repository may contain the null OID.
Patrick Steinhardt c0487bde 2018-01-12T08:23:43 tree: reject writing null-OID entries to a tree In commit a96d3cc3f (cache-tree: reject entries with null sha1, 2017-04-21), the git.git project has changed its stance on null OIDs in tree objects. Previously, null OIDs were accepted in tree entries to help tools repair broken history. This resulted in some problems though in that many code paths mistakenly passed null OIDs to be added to a tree, which was not properly detected. Align our own code base according to the upstream change and reject writing tree entries early when the OID is all-zero.
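The two null-OID entries above both come down to one check: is every byte of the object ID zero? The following is a minimal sketch of such a guard, assuming a 20-byte SHA-1 ID. The `example_oid` type and both function names are hypothetical and are not libgit2's own API.

```c
#include <stdbool.h>
#include <stddef.h>

#define EXAMPLE_OID_RAWSZ 20 /* SHA-1 digest size in bytes */

/* Hypothetical stand-in for an object ID; not libgit2's git_oid. */
typedef struct {
	unsigned char id[EXAMPLE_OID_RAWSZ];
} example_oid;

/* Returns true when every byte of the OID is zero. */
bool example_oid_is_null(const example_oid *oid)
{
	size_t i;

	for (i = 0; i < EXAMPLE_OID_RAWSZ; i++) {
		if (oid->id[i] != 0)
			return false;
	}
	return true;
}

/*
 * Hypothetical write guard: refuse early (-1) when asked to store an
 * entry whose OID is all-zero, otherwise report success (0).
 */
int example_tree_insert_check(const example_oid *oid)
{
	return example_oid_is_null(oid) ? -1 : 0;
}
```

A caller would run the guard before adding an entry to a tree or writing to the object database, so that a null OID is rejected at the boundary instead of being detected later.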
Adrián Medraño Calvo d23ce187 2018-01-22T11:55:28 odb: export mempack backend Fixes #4492, #4496.
Edward Thomson 185b0d08 2018-01-20T19:41:28 merge: recursive uses larger conflict markers Git uses longer conflict markers in the recursive merge base - two more than the default (thus, 9 character long conflict markers). This allows users to tell the difference between the recursive merge conflicts and conflicts between the ours and theirs branches. This was introduced in git d694a17986a28bbc19e2a6c32404ca24572e400f. Update our tests to expect this as well.
Edward Thomson b8e9467a 2018-01-20T19:39:34 merge: allow custom conflict marker size Allow for a custom conflict marker size, allowing callers to override the default size of the "<<<<<<<" and ">>>>>>>" markers in the conflicted output file.
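The two merge entries above describe a configurable marker length: 7 characters by default, and two more (9) for the recursive merge base. As a rough illustration only, a marker of arbitrary size could be built as below. The function name and the constant are assumptions made for this sketch and do not reflect how libgit2 writes its markers internally.

```c
#include <stddef.h>
#include <string.h>

#define EXAMPLE_DEFAULT_MARKER_SIZE 7 /* default length per the commit message */

/*
 * Fill `out` with `size` copies of `ch` and NUL-terminate it.
 * Returns 0 on success, -1 if the buffer cannot hold the marker.
 */
int example_make_marker(char *out, size_t out_len, char ch, size_t size)
{
	if (out_len < size + 1)
		return -1;

	memset(out, ch, size);
	out[size] = '\0';
	return 0;
}
```

Calling it with `EXAMPLE_DEFAULT_MARKER_SIZE` gives the usual seven-character marker, and `EXAMPLE_DEFAULT_MARKER_SIZE + 2` gives the longer one used for the recursive base.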
Edward Thomson 45f58409 2018-01-20T15:15:40 Merge pull request #4484 from pks-t/pks/fail-creating-branch-HEAD branch: refuse creating branches named 'HEAD'
Edward Thomson 4ea8035d 2018-01-20T14:56:51 Merge pull request #4478 from libgit2/cmn/packed-refs-sorted refs: include " sorted " in our packed-refs header
Patrick Steinhardt a9677e01 2018-01-19T09:20:59 branch: refuse creating branches named 'HEAD' Since a625b092c (branch: correctly reject refs/heads/{-dash,HEAD}, 2017-11-14), which is included in v2.16.0, upstream git refuses to create branches which are named HEAD to avoid ambiguity with the symbolic HEAD reference. Adjust our own code to match that behaviour and reject creating branches named HEAD.
Brian Lopez 4893a9c0 2018-01-17T13:54:42 Merge pull request #4451 from libgit2/charliesome/trailer-info Implement message trailer parsing API
Brian Lopez d4a3a4b5 2018-01-17T12:52:08 rename find_trailer to extract_trailer_block
Brian Lopez d43974fb 2018-01-16T13:40:26 Change trailer API to return a simple array
Carlos Martín Nieto 9bf37ddd 2018-01-12T15:17:41 refs: include " sorted " in our packed-refs header This lets git know that we have in fact written our packed-refs file sorted (which is apparently not necessarily the case) and it can then use the new-ish mmaped access which lets it avoid significant amounts of effort parsing potentially large files to get to a single piece of data.
Patrick Steinhardt 90f81f9f 2018-01-12T12:56:57 transports: local: fix memory leak in reference walk Upon downloading the pack file, the local transport will iterate through every reference using `git_reference_foreach`. The function is a bit tricky though in that it requires the passed callback to free the references, which does not currently happen. Fix the memory leak by freeing all passed references in the callback.
Brian Lopez 5734768b 2018-01-10T19:19:34 Merge remote-tracking branch 'origin/master' into charliesome/trailer-info
Carlos Martín Nieto b21c5408 2018-01-08T12:33:07 cmake: add openssl to the private deps list when it's the TLS implementation We might want OpenSSL to be the implementation for SHA-1 and/or TLS. If we only want it for TLS (e.g. we're building with the collision-detecting SHA-1 implementation) then we did not indicate this to the systems including us a static library. Add OpenSSL to the list also during the TLS decision to make sure we say we should link to it if we use it for TLS.
Carlos Martín Nieto b85548ed 2018-01-08T12:30:50 cmake: treat LIBGIT2_PC_REQUIRES as a list It is indeed a list of dependencies for those which include the static archive. This is in preparation for adding two possible places where we might add openssl as a dependency.
Edward Thomson 70db57d4 2018-01-05T15:31:51 Merge pull request #4398 from pks-t/pks/generic-sha1 cmake: allow explicitly choosing SHA1 backend
Patrick Steinhardt 70aa6146 2017-12-05T08:48:31 cmake: allow explicitly choosing SHA1 backend Right now, if SHA1DC is disabled, the SHA1 backend is mostly chosen based on which system libgit2 is being compiled on and which libraries have been found. To give developers and distributions more choice, enable them to request specific backends by passing in a `-DSHA1_BACKEND=<BACKEND>` option instead. This completely replaces the previous auto-selection.
Brian Lopez f315cd14 2018-01-03T18:44:12 make separators const a macro as well
Brian Lopez 1cda43ba 2018-01-03T18:30:04 make comment_line_char const a macro
Edward Thomson a223bae5 2018-01-03T14:57:25 Merge pull request #4437 from pks-t/pks/openssl-hash-errors hash: openssl: check return values of SHA1_* functions
Edward Thomson 399c0b19 2018-01-03T14:55:06 Merge pull request #4462 from pks-t/pks/diff-generated-excessive-stats diff_generate: avoid excessive stats of .gitattribute files
Patrick Steinhardt d8896bda 2018-01-03T16:07:36 diff_generate: avoid excessive stats of .gitattribute files When generating a diff between two trees, for each file that is to be diffed we have to determine whether it shall be treated as a text or as a binary file. While git has heuristics to determine which kind of diff to generate, users can also override that default behaviour by setting or unsetting the 'diff' attribute for specific files. Because of that, we have to query gitattributes in order to determine how to diff the current files. Instead of hitting the '.gitattributes' file every time we need to query an attribute, which can get expensive especially on networked file systems, we try to cache them instead. This works perfectly fine for every '.gitattributes' file that is found, but we hit cache invalidation problems when we determine that an attributes file is _not_ existing. We do create an entry in the cache for missing '.gitattributes' files, but as soon as we hit that file again we invalidate it and stat it again to see if it has now appeared. In the case of diffing large trees with each other, this behaviour is very suboptimal. For each pair of files that is to be diffed, we will repeatedly query every directory component leading towards their respective location for an attributes file. This leads to thousands or even hundreds of thousands of wasted syscalls. The attributes cache already has a mechanism to help in that scenario in form of the `git_attr_session`. As long as the same attributes session is still active, we will not try to re-query the attributes files at all but simply retain our currently cached results. To fix our problem, we can create a session at the top-most level, which is the initialization of the `git_diff` structure, and use it in order to look up the correct diff driver. As the `git_diff` structure is used to generate patches for multiple files at once, this neatly solves our problem by retaining the session until patches for all files have been generated. The fix has been tested with linux.git by calling `git_diff_tree_to_tree` and `git_diff_to_buf` with v4.10^{tree} and v4.14^{tree}.
                | time    | .gitattributes stats
    without fix | 33.201s | 844614
    with fix    | 30.327s | 4441
    While execution only improved by roughly 10%, the stat(3) syscalls for .gitattributes files decreased by 99.5%. The benchmarks were quite simple with best-of-three timings on Linux ext4 systems. One can assume that for network based file systems the performance gain will be a lot larger due to a much higher latency.
Patrick Steinhardt 30455a56 2018-01-03T13:09:21 Merge pull request #4439 from tiennou/fix/4352 cmake: create a dummy file for Xcode
Patrick Steinhardt ba56f781 2018-01-03T12:54:42 streams: openssl: fix thread-safety for OpenSSL error messages The function `ERR_error_string` can be invoked without providing a buffer, in which case OpenSSL will simply return a string printed into a static buffer. Obviously and as documented in ERR_error_string(3), this is not thread-safe at all. As libgit2 is a library, though, it is easily possible that other threads may be using OpenSSL at the same time, which might lead to clobbered error strings. Fix the issue by instead using a stack-allocated buffer. According to the documentation, the caller has to provide a buffer of at least 256 bytes of size. While we do so, make sure that the buffer will never get overflown by switching to `ERR_error_string_n` to specify the buffer's size.
Patrick Steinhardt 75e1737a 2017-12-08T10:10:19 hash: openssl: check return values of SHA1_* functions The OpenSSL functions `SHA1_Init`, `SHA1_Update` and `SHA1_Final` all return 1 for success and 0 otherwise, but we never check their return values. Do so.
Patrick Steinhardt 98303ea3 2018-01-03T11:27:12 Merge pull request #4457 from libgit2/ethomson/tree_error_messages tree: standard error messages are lowercase
Brian Lopez e8bc8558 2018-01-02T13:29:49 Merge remote-tracking branch 'origin/master' into charliesome/trailer-info
Edward Thomson 7610638e 2018-01-01T17:52:06 Merge pull request #4453 from libgit2/ethomson/spnego winhttp: properly support ntlm and negotiate
Edward Thomson 2c99011a 2017-12-31T09:33:19 tree: standard error messages are lowercase Our standard error messages begin with a lower case letter so that they can be prefixed or embedded nicely. These error messages were missed during the standardization pass since they use the `tree_error` helper function.
Edward Thomson d6210245 2017-12-30T13:09:43 Merge pull request #4159 from richardipsum/notes-commit Support using notes via a commit rather than a ref
Edward Thomson 8cdf439b 2017-12-30T13:07:03 Merge pull request #4028 from chescock/improve-local-fetch Transfer fewer objects on push and local fetch
Edward Thomson 2b7a3393 2017-12-30T12:47:57 Merge pull request #4455 from libgit2/ethomson/branch_symlinks refs: traverse symlinked directories
Edward Thomson e14bf97e 2017-12-30T08:09:22 Merge pull request #4443 from libgit2/ethomson/large_loose_blobs Inflate large loose blobs
Edward Thomson 9e94b6af 2017-12-30T00:12:46 iterator: cleanups with symlink dir handling Perform some error checking when examining symlink directories.
Andy Doan e9628e7b 2017-10-30T11:38:33 branches: Check symlinked subdirectories Native Git allows symlinked directories under .git/refs. This change allows libgit2 to also look for references that live under symlinked directories. Signed-off-by: Andy Doan <andy@opensourcefoundries.com>
Edward Thomson 526dea1c 2017-12-29T17:41:24 winhttp: properly support ntlm and negotiate When parsing unauthorized responses, properly parse headers looking for both NTLM and Negotiate challenges. Set the HTTP credentials to default credentials (using a `NULL` username and password) with the schemes supported by ourselves and the server.
Edward Thomson 083b1a2e 2017-12-28T10:38:31 Merge pull request #4021 from carlosmn/cmn/refspecs-fetchhead FETCH_HEAD and multiple refspecs
Carlos Martín Nieto 1b4fbf2e 2017-11-19T09:47:07 remote: append to FETCH_HEAD rather than overwrite for each refspec We treat each refspec on its own, but the code currently overwrites the contents of FETCH_HEAD so we end up with the entries for the last refspec we processed. Instead, truncate it before performing the updates and append to it when updating the references.
Carlos Martín Nieto 3ccc1a4d 2017-11-19T09:46:02 futils: add a function to truncate a file We want to do this in order to get FETCH_HEAD to be empty when we start updating it due to fetching from the remote.
Edward Thomson 4110fc84 2017-12-23T23:30:29 Merge pull request #4285 from pks-t/pks/patches-with-whitespace patch_parse: fix parsing unquoted filenames with spaces
lhchavez c3514b0b 2017-12-23T14:59:07 Fix unpack double free If an element has been cached, but then the call to packfile_unpack_compressed() fails, the very next thing that happens is that its data is freed and then the element is not removed from the cache, which frees the data again. This change sets obj->data to NULL to avoid the double-free. It also stops trying to resolve deltas after two continuous failed rounds of resolution, and adds a test for this.
Edward Thomson 9f7ad3c5 2017-12-23T10:55:13 Merge pull request #4430 from tiennou/fix/openssl-x509-leak Free OpenSSL peer certificate
Edward Thomson 30d91760 2017-12-23T10:52:08 Merge pull request #4435 from lhchavez/ubsan-shift-overflow libFuzzer: Prevent a potential shift overflow
Edward Thomson 1ddc57b3 2017-12-23T10:09:12 Merge pull request #4402 from libgit2/ethomson/iconv cmake: let USE_ICONV be optional on macOS
Edward Thomson 06f3aa5f 2017-12-23T10:07:44 Merge pull request #4429 from novalis/delete-modify-submodule-merge Do not attempt to check out submodule as blob when merging a submodule modify/delete conflict
Edward Thomson bdb54214 2017-12-11T16:46:05 hash: commoncrypto hash should support large files Teach the CommonCrypto hash mechanisms to support large files. The hash primitives take a `CC_LONG` (aka `uint32_t`) at a time. So loop to give the hash function at most an unsigned 32 bit's worth of bytes until we have hashed the entire file.
Edward Thomson a89560d5 2017-12-10T17:26:43 hash: win32 hash mechanism should support large files Teach the win32 hash mechanisms to support large files. The hash primitives take at most `ULONG_MAX` bytes at a time. Loop, giving the hash function the maximum supported number of bytes, until we have hashed the entire file.
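The CommonCrypto and win32 entries above use the same pattern: when a hash primitive only accepts a bounded length per call, loop and feed it at most that many bytes until the whole input is consumed. Here is a minimal sketch of that loop. The callback type, the counter, and all names are hypothetical; the per-call limit is a parameter so that the chunking can be exercised with small inputs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hash primitive that accepts at most a 32-bit length per call. */
typedef void (*example_update_fn)(void *ctx, const unsigned char *data, uint32_t len);

/*
 * Feed `len` bytes to `update` in pieces of at most `max_chunk` bytes.
 * Returns 0 on success, -1 if `max_chunk` is zero or exceeds what the
 * 32-bit primitive can take in one call.
 */
int example_hash_update(example_update_fn update, void *ctx,
	const unsigned char *data, size_t len, size_t max_chunk)
{
	if (max_chunk == 0 || max_chunk > UINT32_MAX)
		return -1;

	while (len > 0) {
		size_t chunk = len < max_chunk ? len : max_chunk;

		update(ctx, data, (uint32_t)chunk);
		data += chunk;
		len -= chunk;
	}
	return 0;
}

/* Demo primitive that only records how often and with how much it was called. */
typedef struct {
	size_t calls;
	size_t bytes;
} example_counter;

void example_count_update(void *ctx, const unsigned char *data, uint32_t len)
{
	example_counter *counter = ctx;

	(void)data; /* a real primitive would hash these bytes */
	counter->calls++;
	counter->bytes += len;
}
```

In real use `max_chunk` would be the primitive's own limit (for example `UINT32_MAX`), so a file larger than 4 GiB results in several update calls instead of a truncated length.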
Edward Thomson 3e6533ba 2017-12-10T17:25:00 odb_loose: reject objects that cannot fit in memory Check the size of objects being read from the loose odb backend and reject those that would not fit in memory with an error message that reflects the actual problem, instead of error'ing later with an unintuitive error message regarding truncation or invalid hashes.
Edward Thomson 8642feba 2017-12-10T17:23:44 zstream: use UINT_MAX sized chunks Instead of paging to zlib in INT_MAX sized chunks, we can give it as many as UINT_MAX bytes at a time. zlib doesn't care how big a buffer we give it, this simply results in fewer calls into zlib.
Edward Thomson ddefea75 2017-11-30T15:55:59 odb: support large loose objects zlib will only inflate/deflate an `int`s worth of data at a time. We need to loop through large files in order to ensure that we inflate the entire file, not just an `int`s worth of data. Thankfully, we already have this loop in our `git_zstream` layer. Handle large objects using the `git_zstream`.
Edward Thomson d1e44655 2017-11-30T15:52:47 object: introduce git_object_stringn2type Introduce an internal API to get the object type based on a length-specified (not null terminated) string representation. This can be used to compare the (space terminated) object type name in a loose object. Reimplement `git_object_string2type` based on this API.
Edward Thomson 86219f40 2017-11-30T15:40:13 util: introduce `git__prefixncmp` and consolidate implementations Introduce `git__prefixncmp` that will search up to the first `n` characters of a string to see if it is prefixed by another string. This is useful for examining if a non-null terminated character array is prefixed by a particular substring. Consolidate the various implementations of `git__prefixcmp` around a single core implementation and add some test cases to validate its behavior.
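To illustrate the idea behind a length-limited prefix comparison, here is a small sketch that checks whether a character array of known length (not necessarily NUL-terminated) starts with a NUL-terminated prefix. It is an illustration written for this log under the assumption of a `strcmp`-style return value, not a copy of libgit2's implementation. The name `example_prefixncmp` is hypothetical.

```c
#include <stddef.h>

/*
 * Compare the first `str_n` bytes of `str` against the NUL-terminated
 * `prefix`. Returns 0 when `str` begins with `prefix`, non-zero otherwise.
 * `str` does not need to be NUL-terminated.
 */
int example_prefixncmp(const char *str, size_t str_n, const char *prefix)
{
	size_t i;

	for (i = 0; prefix[i] != '\0'; i++) {
		if (i >= str_n)
			return -1; /* str ran out before the prefix did */
		if ((unsigned char)str[i] != (unsigned char)prefix[i])
			return (unsigned char)str[i] - (unsigned char)prefix[i];
	}
	return 0;
}
```

This is the kind of check needed for a loose object header, where the type name is terminated by a space and not by a NUL byte.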
Edward Thomson b7d36ef4 2017-12-12T12:24:11 zstream: treat `Z_BUF_ERROR` as non-fatal zlib will return `Z_BUF_ERROR` whenever there is more input to inflate or deflate than there is output to store the result. This is normal for us as we iterate through the input, particularly with very large input buffers.
Charlie Somerville 72fbf05c 2017-12-20T15:24:30 trailer: use git__prefixcmp instead of starts_with
Charlie Somerville 13722611 2017-12-20T15:24:23 trailer: remove inline specifier on is_blank_line
Charlie Somerville 1c43edca 2017-12-14T18:37:10 message: add routine for parsing trailers from messages This is implemented in trailer.c and borrows a large amount of logic from Git core to ensure compatibility.
Edward Thomson fa8cf14f 2017-12-16T21:49:45 Merge pull request #4447 from pks-t/pks/diff-file-contents-refcount-blob diff_file: properly refcount blobs when initializing file contents
Etienne Samson 8be2a790 2017-12-05T23:21:05 openssl: free the peer certificate Per SSL_get_peer_certificate docs: ``` The reference count of the X509 object is incremented by one, so that it will not be destroyed when the session containing the peer certificate is freed. The X509 object must be explicitly freed using X509_free(). ```
Etienne Samson 2518eb81 2017-11-24T14:04:10 openssl: merge all the exit paths of verify_server_cert This makes it easier to cleanup allocated resources on exit.
lhchavez 53f2c6b1 2017-12-15T15:01:50 Simplified overflow condition
Patrick Steinhardt 2482559d 2017-12-15T05:52:02 Merge pull request #4432 from lhchavez/fix-missing-trailer libFuzzer: Fix missing trailer crash
Patrick Steinhardt 2388a9e2 2017-12-15T10:47:01 diff_file: properly refcount blobs when initializing file contents When initializing a `git_diff_file_content` from a source whose data is derived from a blob, we simply assign the blob's pointer to the resulting struct without incrementing its refcount. Thus, the structure can only be used as long as the blob is kept alive by the caller. Fix the issue by using `git_blob_dup` instead of a direct assignment. This function will increment the refcount of the blob without allocating new memory, so it does exactly what we want. As `git_diff_file_content__unload` already frees the blob when `GIT_DIFF_FLAG__FREE_BLOB` is set, we don't need to add new code handling the free but only have to set that flag correctly.
Etienne Samson 4969a672 2017-12-10T02:19:34 cmake: create a dummy file for Xcode Otherwise Xcode will happily not-link our git2 target, resulting in a "missing file" error when building eg. examples
Etienne Samson 1b2e83a9 2017-12-13T00:19:41 stransport: provide error message on trust failures Fixes #4440
lhchavez e7fac2af 2017-12-09T05:26:27 Using unsigned instead
lhchavez c8aaba24 2017-12-06T03:03:18 libFuzzer: Fix missing trailer crash This change fixes an invalid memory access when the trailer is missing / corrupt. Found using libFuzzer.
lhchavez 28662c13 2017-12-08T06:00:27 libFuzzer: Prevent a potential shift overflow The type of |base_offset| in get_delta_base() is `git_off_t`, which is a signed `long`. That means that we need to make sure that the 8 most significant bits are zero (instead of 7) to avoid an overflow when it is shifted by 7 bits. Found using libFuzzer.
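The arithmetic in the entry above can be spelled out. Shifting a signed 64-bit value left by 7 pushes 7 bits out of the top, and the bit that lands in position 63 becomes the sign bit, so the top 8 bits must be clear beforehand. The following predicate is a sketch of that check, assuming a 64-bit signed offset. The function name is hypothetical and this is not the code from `get_delta_base()`.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if `value << 7` is representable as a non-negative
 * int64_t, i.e. the 8 most significant bits of `value` are all zero
 * (7 bits that would be shifted out, plus the sign bit).
 */
bool example_can_shift7(int64_t value)
{
	return value >= 0 && ((uint64_t)value >> 56) == 0;
}
```

Checking only 7 bits would accept a value with bit 56 set, which moves into the sign bit after the shift.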
lhchavez 400caed3 2017-12-06T03:22:58 libFuzzer: Fix a git_packfile_stream leak This change ensures that the git_packfile_stream object in git_indexer_append() does not leak when the stream has errors. Found using libFuzzer.
David Turner 2a3e0635 2017-12-04T16:56:07 Do not attempt to check out submodule as blob when merging a submodule modify/delete conflict
Richard Ipsum d788f42a 2017-04-09T14:06:23 notes: Rewrite funcs in terms of note_commit funcs
Edward Thomson 429bb357 2017-12-01T11:45:53 Merge pull request #4318 from Uncommon/amend_status Add git_status_file_at
Edward Thomson 344b4ead 2017-12-01T11:27:15 Merge pull request #4427 from pks-t/pks/openssl-threadid openssl: fix thread-safety on non-glibc POSIX systems
Edward Thomson 494a2f23 2017-11-30T21:45:27 Merge pull request #4426 from pks-t/pks/diff-flag-set-fix diff_generate: fix unsetting diff flags
Patrick Steinhardt 2d2e70f8 2017-11-30T18:10:28 openssl: fix thread-safety on non-glibc POSIX systems While the OpenSSL library provides all means to work safely in a multi-threaded application, we fail to do so correctly. Quoting from crypto_lock(3): OpenSSL can safely be used in multi-threaded applications provided that at least two callback functions are set, locking_function and threadid_func. We do in fact provide the means to set up the locking function via `git_openssl_set_locking()`, where we initialize a set of locks by using the POSIX threads API and set the correct callback function to lock and unlock them. But what we do not do is setting the `threadid_func` callback. This function is being used to correctly locate thread-local data of the OpenSSL library and should thus return per-thread identifiers. Digging deeper into OpenSSL's documentation, the library does provide a fallback in case that locking function is not provided by the user. On Windows and BeOS we should be safe, as it simply "uses the system's default thread identifying API". On other platforms though OpenSSL will fall back to using the address of `errno`, assuming it is thread-local. While this assumption holds true for glibc-based systems, POSIX in fact does not specify whether it is thread-local or not. Quoting from errno(3p): It is unspecified whether errno is a macro or an identifier declared with external linkage. And in fact, with musl there is at least one libc implementation which simply declares `errno` as a simple `int` without being thread-local. On those systems, the fallback threadid function of OpenSSL will not be thread-safe. Fix this by setting up our own callback for this setting. As users of libgit2 may want to set it themselves, we obviously cannot always set that function on initialization. But as we already set up primitives for threading in `git_openssl_set_locking()`, this function becomes the obvious choice where to implement the additional setup.
Patrick Steinhardt 5ca3f115 2017-11-30T15:12:48 diff_generate: fix unsetting diff flags The macro `DIFF_FLAG_SET` can be used to set or unset a flag by modifying the diff's bitmask. While the case of setting the flag is handled correctly, the case of unsetting the flag was not. Instead of inverting the flags, we are inverting the value which is used to decide whether we want to set or unset the bits. The value being used here is a simple `bool` which is `false`. As that is being uplifted to `int` when getting the bitwise-complement, we will end up retaining all bits inside of the bitmask. As that's only ever used to set `GIT_DIFF_IGNORE_CASE`, we were actually always ignoring case for generated diffs. Fix that by instead getting the bitwise-complement of `FLAG`, not `VAL`.
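The bug described above is easy to reproduce in isolation. The sketch below applies the same two macro shapes to a plain `uint32_t` bitmask instead of a diff structure. The macro names and the bit value chosen for `EXAMPLE_IGNORE_CASE` are assumptions made for this illustration.

```c
#include <stdint.h>

#define EXAMPLE_IGNORE_CASE (1u << 10) /* arbitrary bit for the sketch */

/*
 * Buggy shape: when VAL is 0, this ANDs with ~0 (all bits set),
 * so the flag is never cleared.
 */
#define EXAMPLE_FLAG_SET_BUGGY(FLAGS, FLAG, VAL) \
	((FLAGS) = (VAL) ? ((FLAGS) | (FLAG)) : ((FLAGS) & ~(VAL)))

/*
 * Fixed shape: complement FLAG, so unsetting clears exactly
 * that bit and leaves the others alone.
 */
#define EXAMPLE_FLAG_SET_FIXED(FLAGS, FLAG, VAL) \
	((FLAGS) = (VAL) ? ((FLAGS) | (FLAG)) : ((FLAGS) & ~(FLAG)))
```

With a mask that has `EXAMPLE_IGNORE_CASE` set, "unsetting" through the buggy macro leaves the bit in place, while the fixed macro clears it and keeps every other bit.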
Patrick Steinhardt 90fc7f53 2017-11-30T15:09:05 diff: remove unused macros `DIFF_FLAG_*` In commit 9be638ecf (git_diff_generated: abstract generated diffs, 2016-04-19), the code for generated diffs was moved out of the generic "diff.c" and instead into its own module. During that conversion, it was forgotten to remove the macros `DIFF_FLAG_IS_SET`, `DIFF_FLAG_ISNT_SET` and `DIFF_FLAG_SET`, which are now only used in "diff_generated.c". Remove those macros now.
David Catmull 4ccacdc8 2017-07-21T17:07:10 status: Add a baseline field to git_status_options for comparing to trees other than HEAD
Etienne Samson 38eaa7ab 2017-11-24T12:28:19 winhttp: pass the same payload as ssh & http transports when checking certificates
Carlos Martín Nieto 7e3faf58 2017-10-29T15:05:28 diff: expose the "indent heuristic" in the diff options We default to off, but we might want to consider changing `GIT_DIFF_NORMAL` to include it.
Patrick Steinhardt 585b5dac 2017-11-18T15:43:11 refcount: make refcounting conform to aliasing rules Strict aliasing rules dictate that for most data types, you are not allowed to cast them to another data type and then access the casted pointers. While this works just fine for most compilers, technically we end up in undefined behaviour when we violate that rule. Our current refcounting code makes heavy use of casting and thus violates that rule. While we didn't have any problems with that code, Travis started spitting out a lot of warnings due to a change in their toolchain. In the refcounting case, the code is also easy to fix: as all refcounting-statements are actually macros, we can just access the `rc` field directly instead of casting. There are two outliers in our code where that doesn't work. Both the `git_diff` and `git_patch` structures have specializations for generated and parsed diffs/patches, which directly inherit from them. Because of that, the refcounting code is only part of the base structure and not of the children themselves. We can help that by instead passing their base into `GIT_REFCOUNT_INC`, though.
Henry Kleynhans f063dafb 2017-11-12T10:56:50 signature: distinguish +0000 and -0000 UTC offsets Git considers '-0000' a valid offset for signature lines. They need to be treated as _not_ equal to a '+0000' signature offset. Parsing a signature line stores the offset in a signed integer which does not distinguish between `+0` and `-0`. This patch adds an additional flag `sign` to the `git_time` in the `signature` object which is populated with the sign of the offset. In addition to exposing this information to the user, this information is also used to compare signatures. /cc @pks-t @ethomson
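The point of the entry above is that an integer offset of 0 cannot record whether the text said `+0000` or `-0000`, so the sign has to be stored separately. A minimal sketch of that idea follows, assuming a strict five-character `+HHMM`/`-HHMM` input. The struct, its field names, and both functions are hypothetical and do not mirror libgit2's `git_time` layout or its parser.

```c
/* Hypothetical parsed timezone offset. */
typedef struct {
	int offset; /* minutes east of UTC; 0 for both +0000 and -0000 */
	char sign;  /* '+' or '-' exactly as written in the input */
} example_tz;

/* Parse "+HHMM" or "-HHMM". Returns 0 on success, -1 on malformed input. */
int example_parse_tz(example_tz *out, const char *s)
{
	int i, hhmm = 0;

	if (s[0] != '+' && s[0] != '-')
		return -1;

	for (i = 1; i <= 4; i++) {
		if (s[i] < '0' || s[i] > '9')
			return -1;
		hhmm = hhmm * 10 + (s[i] - '0');
	}
	if (s[5] != '\0')
		return -1;

	out->sign = s[0];
	out->offset = (hhmm / 100) * 60 + (hhmm % 100);
	if (s[0] == '-')
		out->offset = -out->offset;
	return 0;
}

/* Two offsets are equal only if both the value and the written sign match. */
int example_tz_equal(const example_tz *a, const example_tz *b)
{
	return a->offset == b->offset && a->sign == b->sign;
}
```

With this representation, `+0000` and `-0000` both parse to an offset of 0 minutes but compare as different.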
Patrick Steinhardt 80226b5f 2017-09-22T13:39:05 patch_parse: allow parsing ambiguous patch headers The git patch format allows for having unquoted paths with whitespaces inside. This format becomes ambiguous to parse, e.g. in the following example: diff --git a/file b/with spaces.txt b/file b/with spaces.txt While we cannot parse this in a correct way, we can instead use the "---" and "+++" lines to retrieve the file names, as the path is not followed by anything here but spans the complete remaining line. Because of this, we can simply bail out when parsing the "diff --git" header here without an actual error and then proceed to just take the paths from the other headers.
Patrick Steinhardt 3892f70d 2017-09-22T13:26:47 patch_parse: treat complete line after "---"/"+++" as path When parsing the "---" and "+++" line, we stop after the first whitespace inside of the filename. But as files containing whitespaces do not need to be quoted, we should instead use the complete line here. This fixes parsing patches with unquoted paths with whitespaces.
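The two patch_parse entries above rely on the fact that on a "---" or "+++" line the path runs to the end of the line, spaces included. The following sketch extracts such a path. It deliberately ignores quoted paths and any other special cases a real parser has to handle, and the function name is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Given a header line such as "+++ b/with spaces.txt\n", copy everything
 * after the 4-byte "--- " or "+++ " marker, up to the end of the line,
 * into `out`. Returns 0 on success, -1 if the line is not such a header,
 * the path is empty, or `out` is too small.
 */
int example_header_path(char *out, size_t out_len, const char *line)
{
	size_t len;

	if (strncmp(line, "--- ", 4) != 0 && strncmp(line, "+++ ", 4) != 0)
		return -1;

	line += 4;
	len = strcspn(line, "\r\n"); /* path spans the rest of the line */
	if (len == 0 || len >= out_len)
		return -1;

	memcpy(out, line, len);
	out[len] = '\0';
	return 0;
}
```

Stopping at the first space instead would turn `b/with spaces.txt` into `b/with`, which is the misparse the older commit fixes.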
Edward Thomson 1d7c15ad 2017-11-11T20:15:07 Merge pull request #4310 from pks-t/pks/common-parser Common parser interface
Edward Thomson bbb213c1 2017-11-11T13:19:24 cmake: let USE_ICONV be optional on macOS Instead of forcing iconv support on macOS (by forcing `USE_ICONV` on), honor the `USE_ICONV` option only on macOS. Although macOS includes iconv by default, some macOS users may have a deficient installation for some reason and they should be provided a workaround to use libgit2 even in this situation. iconv support is now disabled entirely on non-macOS platforms. No other platform supports core.precomposeunicode, and iconv should never be linked.
Patrick Steinhardt 9e66590b 2017-07-21T13:01:43 config_parse: use common parser interface As the config parser is now cleanly separated from the config file code, we can easily refactor the code and make use of the common parser module. This removes quite a lot of duplicated functionality previously used for handling the actual parser state and replaces it with the generic interface provided by the parser context.
Patrick Steinhardt 1953c68b 2017-11-11T17:12:31 config_file: split out module to parse config files The configuration file code grew quite big and intermingles both actual configuration logic as well as the parsing logic of the configuration syntax. This makes it hard to refactor the parsing logic on its own and convert it to make use of our new parsing context module. Refactor the code and split it up into two parts. The config file code will only handle actual handling of configuration files, includes and writing new files. The newly created config parser module is then only responsible for parsing the actual contents of a configuration file, leaving everything else to callbacks provided to its provided function `git_config_parse`.
Patrick Steinhardt 7bdfc0a6 2017-07-14T15:33:32 parse: always initialize line pointer Upon initializing the parser context, we do not currently initialize the current line, line length and line number. Do so in order to make the interface easier to use and more obvious for future consumers of the parsing API.
Patrick Steinhardt e72cb769 2017-07-14T14:37:07 parse: implement `git_parse_peek` Some code parts need to inspect the next few bytes without actually consuming it yet, for example to examine what content it has to expect next. Create a new function `git_parse_peek` which returns the next byte without modifying the parsing context and use it at multiple call sites.
Patrick Steinhardt 252f2eee 2017-07-14T13:45:05 parse: implement and use `git_parse_advance_digit` The patch parsing code has multiple recurring patterns where we want to parse an actual number. Create a new function `git_parse_advance_digit` and use it to avoid code duplication.
Patrick Steinhardt 65dcb645 2017-07-14T13:29:29 patch_parse: use git_parse_contains_s Instead of manually checking the parsing context's remaining length and comparing the leading bytes with a specific string, we can simply re-use the function `git_parse_ctx_contains_s`. Do so to avoid code duplication and to further decouple patch parsing from the parsing context's struct members.
Patrick Steinhardt ef1395f3 2017-11-11T15:30:43 parse: extract parse module The `git_patch_parse_ctx` encapsulates both parser state as well as options specific to patch parsing. To advance this state and keep it consistent, we provide a few functions which handle advancing the current position and accessing bytes of the patch contents. In fact, these functions are quite generic and not related to patch-parsing by themselves. Seeing that we have similar logic inside of other modules, it becomes quite enticing to extract this functionality into its own parser module. To do so, we create a new module `parse` with a central struct called `git_parse_ctx`. It encapsulates both the content that is to be parsed as well as its lengths and the current position. `git_patch_parse_ctx` now contains only this `parse_ctx`, which is then accessed whenever we need to touch the current parser. This is the first step towards re-using this functionality across other modules which require parsing functionality and remove code-duplication.
Henry Kleynhans a0b0b808 2017-11-11T14:03:14 cmake: Allow user to select bundled zlib Under some circumstances the installed / system version of zlib may not be desirable due to being too old or buggy. This patch adds the option `USE_BUNDLED_ZLIB` that will cause the bundled version of zlib to be used. We may also want to add similar functionality to allow the user to select other bundled 3rd-party dependencies instead of using the system versions. /cc @pks-t @ethomson