kmx git

Commit	Date	Message
19f1a8e6	2016-09-05T11:44:51	diff: improve positioning of add/delete blocks in diffs Some groups of added/deleted lines in diffs can be slid up or down, because lines at the edges of the group are not unique. Picking good shifts for such groups is not a matter of correctness but definitely has a big effect on aesthetics. For example, consider the following two diffs. The first is what standard Git emits: --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl @@ -231,6 +231,9 @@ if (!defined $initial_reply_to && $prompting) { } if (!$smtp_server) { + $smtp_server = $repo->config('sendemail.smtpserver'); +} +if (!$smtp_server) { foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) { if (-x $_) { $smtp_server = $_; The following diff is equivalent, but is obviously preferable from an aesthetic point of view: --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl @@ -230,6 +230,9 @@ if (!defined $initial_reply_to && $prompting) { $initial_reply_to =~ s/(^\s+\|\s+$)//g; } +if (!$smtp_server) { + $smtp_server = $repo->config('sendemail.smtpserver'); +} if (!$smtp_server) { foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) { if (-x $_) { This patch teaches Git to pick better positions for such "diff sliders" using heuristics that take the positions of nearby blank lines and the indentation of nearby lines into account. The existing Git code basically always shifts such "sliders" as far down in the file as possible. The only exception is when the slider can be aligned with a group of changed lines in the other file, in which case Git favors depicting the change as one add+delete block rather than one add and a slightly offset delete block. This naive algorithm often yields ugly diffs. Commit d634d61ed6 improved the situation somewhat by preferring to position add/delete groups to make their last line a blank line, when that is possible. This heuristic does more good than harm, but (1) it can only help if there are blank lines in the right places, and (2) always picks the last blank line, even if there are others that might be better. The end result is that it makes perhaps 1/3 as many errors as the default Git algorithm, but that still leaves a lot of ugly diffs. This commit implements a new and much better heuristic for picking optimal "slider" positions using the following approach: First observe that each hypothetical positioning of a diff slider introduces two splits: one between the context lines preceding the group and the first added/deleted line, and the other between the last added/deleted line and the first line of context following it. It tries to find the positioning that creates the least bad splits. Splits are evaluated based only on the presence and locations of nearby blank lines, and the indentation of lines near the split. Basically, it prefers to introduce splits adjacent to blank lines, between lines that are indented less, and between lines with the same level of indentation. In more detail: 1. It measures the following characteristics of a proposed splitting position in a `struct split_measurement`: * the number of blank lines above the proposed split * whether the line directly after the split is blank * the number of blank lines following that line * the indentation of the nearest non-blank line above the split * the indentation of the line directly below the split * the indentation of the nearest non-blank line after that line 2. It combines the measured attributes using a bunch of empirically-optimized weighting factors to derive a `struct split_score` that measures the "badness" of splitting the text at that position. 3. It combines the `split_score` for the top and the bottom of the slider at each of its possible positions, and selects the position that has the best `split_score`. I determined the initial set of weighting factors by collecting a corpus of Git histories from 29 open-source software projects in various programming languages. I generated many diffs from this corpus, and determined the best positioning "by eye" for about 6600 diff sliders. I used about half of the repositories in the corpus (corresponding to about 2/3 of the sliders) as a training set, and optimized the weights against this corpus using a crude automated search of the parameter space to get the best agreement with the manually-determined values. Then I tested the resulting heuristic against the full corpus. The results are summarized in the following table, in column `indent-1`: \| repository \| count \| Git 2.9.0 \| compaction \| compaction-fixed \| indent-1 \| indent-2 \| \| --------------------- \| ----- \| -------------- \| -------------- \| ---------------- \| -------------- \| -------------- \| \| afnetworking \| 109 \| 89 (81.7%) \| 37 (33.9%) \| 37 (33.9%) \| 2 (1.8%) \| 2 (1.8%) \| \| alamofire \| 30 \| 18 (60.0%) \| 14 (46.7%) \| 15 (50.0%) \| 0 (0.0%) \| 0 (0.0%) \| \| angular \| 184 \| 127 (69.0%) \| 39 (21.2%) \| 23 (12.5%) \| 5 (2.7%) \| 5 (2.7%) \| \| animate \| 313 \| 2 (0.6%) \| 2 (0.6%) \| 2 (0.6%) \| 2 (0.6%) \| 2 (0.6%) \| \| ant \| 380 \| 356 (93.7%) \| 152 (40.0%) \| 148 (38.9%) \| 15 (3.9%) \| 15 (3.9%) \| * \| bugzilla \| 306 \| 263 (85.9%) \| 109 (35.6%) \| 99 (32.4%) \| 14 (4.6%) \| 15 (4.9%) \| * \| corefx \| 126 \| 91 (72.2%) \| 22 (17.5%) \| 21 (16.7%) \| 6 (4.8%) \| 6 (4.8%) \| \| couchdb \| 78 \| 44 (56.4%) \| 26 (33.3%) \| 28 (35.9%) \| 6 (7.7%) \| 6 (7.7%) \| * \| cpython \| 937 \| 158 (16.9%) \| 50 (5.3%) \| 49 (5.2%) \| 5 (0.5%) \| 5 (0.5%) \| * \| discourse \| 160 \| 95 (59.4%) \| 42 (26.2%) \| 36 (22.5%) \| 18 (11.2%) \| 13 (8.1%) \| \| docker \| 307 \| 194 (63.2%) \| 198 (64.5%) \| 253 (82.4%) \| 8 (2.6%) \| 8 (2.6%) \| * \| electron \| 163 \| 132 (81.0%) \| 38 (23.3%) \| 39 (23.9%) \| 6 (3.7%) \| 6 (3.7%) \| \| git \| 536 \| 470 (87.7%) \| 73 (13.6%) \| 78 (14.6%) \| 16 (3.0%) \| 16 (3.0%) \| * \| gitflow \| 127 \| 0 (0.0%) \| 0 (0.0%) \| 0 (0.0%) \| 0 (0.0%) \| 0 (0.0%) \| \| ionic \| 133 \| 89 (66.9%) \| 29 (21.8%) \| 38 (28.6%) \| 1 (0.8%) \| 1 (0.8%) \| \| ipython \| 482 \| 362 (75.1%) \| 167 (34.6%) \| 169 (35.1%) \| 11 (2.3%) \| 11 (2.3%) \| * \| junit \| 161 \| 147 (91.3%) \| 67 (41.6%) \| 66 (41.0%) \| 1 (0.6%) \| 1 (0.6%) \| * \| lighttable \| 15 \| 5 (33.3%) \| 0 (0.0%) \| 2 (13.3%) \| 0 (0.0%) \| 0 (0.0%) \| \| magit \| 88 \| 75 (85.2%) \| 11 (12.5%) \| 9 (10.2%) \| 1 (1.1%) \| 0 (0.0%) \| \| neural-style \| 28 \| 0 (0.0%) \| 0 (0.0%) \| 0 (0.0%) \| 0 (0.0%) \| 0 (0.0%) \| \| nodejs \| 781 \| 649 (83.1%) \| 118 (15.1%) \| 111 (14.2%) \| 4 (0.5%) \| 5 (0.6%) \| * \| phpmyadmin \| 491 \| 481 (98.0%) \| 75 (15.3%) \| 48 (9.8%) \| 2 (0.4%) \| 2 (0.4%) \| * \| react-native \| 168 \| 130 (77.4%) \| 79 (47.0%) \| 81 (48.2%) \| 0 (0.0%) \| 0 (0.0%) \| \| rust \| 171 \| 128 (74.9%) \| 30 (17.5%) \| 27 (15.8%) \| 16 (9.4%) \| 14 (8.2%) \| \| spark \| 186 \| 149 (80.1%) \| 52 (28.0%) \| 52 (28.0%) \| 2 (1.1%) \| 2 (1.1%) \| \| tensorflow \| 115 \| 66 (57.4%) \| 48 (41.7%) \| 48 (41.7%) \| 5 (4.3%) \| 5 (4.3%) \| \| test-more \| 19 \| 15 (78.9%) \| 2 (10.5%) \| 2 (10.5%) \| 1 (5.3%) \| 1 (5.3%) \| * \| test-unit \| 51 \| 34 (66.7%) \| 14 (27.5%) \| 8 (15.7%) \| 2 (3.9%) \| 2 (3.9%) \| * \| xmonad \| 23 \| 22 (95.7%) \| 2 (8.7%) \| 2 (8.7%) \| 1 (4.3%) \| 1 (4.3%) \| * \| --------------------- \| ----- \| -------------- \| -------------- \| ---------------- \| -------------- \| -------------- \| \| totals \| 6668 \| 4391 (65.9%) \| 1496 (22.4%) \| 1491 (22.4%) \| 150 (2.2%) \| 144 (2.2%) \| \| totals (training set) \| 4552 \| 3195 (70.2%) \| 1053 (23.1%) \| 1061 (23.3%) \| 86 (1.9%) \| 88 (1.9%) \| \| totals (test set) \| 2116 \| 1196 (56.5%) \| 443 (20.9%) \| 430 (20.3%) \| 64 (3.0%) \| 56 (2.6%) \| In this table, the numbers are the count and percentage of human-rated sliders that the corresponding algorithm got wrong. The columns are * "repository" - the name of the repository used. I used the diffs between successive non-merge commits on the HEAD branch of the corresponding repository. * "count" - the number of sliders that were human-rated. I chose most, but not all, sliders to rate from those among which the various algorithms gave different answers. * "Git 2.9.0" - the default algorithm used by `git diff` in Git 2.9.0. * "compaction" - the heuristic used by `git diff --compaction-heuristic` in Git 2.9.0. * "compaction-fixed" - the heuristic used by `git diff --compaction-heuristic` after the fixes from earlier in this patch series. Note that the results are not dramatically different than those for "compaction". Both produce non-ideal diffs only about 1/3 as often as the default `git diff`. * "indent-1" - the new `--indent-heuristic` algorithm, using the first set of weighting factors, determined as described above. * "indent-2" - the new `--indent-heuristic` algorithm, using the final set of weighting factors, determined as described below. * `*` - indicates that repo was part of training set used to determine the first set of weighting factors. The fact that the heuristic performed nearly as well on the test set as on the training set in column "indent-1" is a good indication that the heuristic was not over-trained. Given that fact, I ran a second round of optimization, using the entire corpus as the training set. The resulting set of weights gave the results in column "indent-2". These are the weights included in this patch. The final result gives consistently and significantly better results across the whole corpus than either `git diff` or `git diff --compaction-heuristic`. It makes only about 1/30 as many errors as the former and about 1/10 as many errors as the latter. (And a good fraction of the remaining errors are for diffs that involve weirdly-formatted code, sometimes apparently machine-generated.) The tools that were used to do this optimization and analysis, along with the human-generated data values, are recorded in a separate project [1]. [1] https://github.com/mhagger/diff-slider-tools Original Git commit: 433860f3d0beb0c6f205290bd16cda413148f098
a49895b5	2016-08-22T13:22:44	xdl_change_compact(): introduce the concept of a change group The idea of xdl_change_compact() is fairly simple: * Proceed through groups of changed lines in the file to be compacted, keeping track of the corresponding location in the "other" file. * If possible, slide the group up and down to try to give the most aesthetically pleasing diff. Whenever it is slid, the current location in the other file needs to be adjusted. But these simple concepts are obfuscated by a lot of index handling that is written in terse, subtle, and varied patterns. I found it very hard to convince myself that the function was correct. So introduce a "struct group" that represents a group of changed lines in a file. Add some functions that perform elementary operations on groups: * Initialize a group to the first group in a file * Move to the next or previous group in a file * Slide a group up or down Even though the resulting code is longer, I think it is easier to understand and review. Its performance is not changed appreciably (though it would be if `group_next()` and `group_previous()` were not inlined). ...and in fact, the rewriting helped me discover another bug in the --compaction-heuristic code: The update of blank_lines was never done for the highest possible position of the group. This means that it could fail to slide the group to its highest possible position, even if that position had a blank line as its last line. So for example, it yielded the following diff: $ git diff --no-index --compaction-heuristic a.txt b.txt diff --git a/a.txt b/b.txt index e53969f..0d60c5fe 100644 --- a/a.txt +++ b/b.txt @@ -1,3 +1,7 @@ 1 A + +B + +A 2 when in fact the following diff is better (according to the rules of --compaction-heuristic): $ git diff --no-index --compaction-heuristic a.txt b.txt diff --git a/a.txt b/b.txt index e53969f..0d60c5fe 100644 --- a/a.txt +++ b/b.txt @@ -1,3 +1,7 @@ 1 +A + +B + A 2 The new code gives the bottom answer. Original Git commit: e8adf23d1ee97b57c8aea32ee8365203b77c0e42
09fb5b2a	2016-08-22T13:22:43	recs_match(): take two xrecord_t pointers as arguments There is no reason for it to take an array and two indexes as argument, as it only accesses two elements of the array. Original Git commit: 152598cbb667471c8f5be16e199922a41452b2d5
506bf09d	2016-04-15T16:01:45	xdiff: add recs_match helper function It is a common pattern in xdl_change_compact to check that hashes and strings match. The resulting code to perform this change causes very long lines and makes it hard to follow the intention. Introduce a helper function recs_match which performs both checks to increase code readability. Original Git commit: 92e5b62fec0e9b647429e8d3736c571c434dd375
2749ff46	2016-09-13T15:52:43	time: Export `git_time_monotonic`
9ad07fc0	2016-09-06T10:43:21	Merge pull request #3923 from libgit2/ethomson/diff-read-empty-binary Read binary patches (with no binary data)
46035d98	2016-09-06T11:21:29	Merge pull request #3882 from pks-t/pks/fix-fetch-refspec-dst-parsing refspec: do not set empty rhs for fetch refspecs
adedac5a	2016-09-02T02:03:45	diff: treat binary patches with no data special When creating and printing diffs, deal with binary deltas that have binary data specially, versus diffs that have a binary file but lack the actual binary data.
f4e3dae7	2016-09-02T11:26:16	diff_print: change test for skipping binary printing Instead of skipping printing a binary diff when there is no data, skip printing when we have a status of `UNMODIFIED`. This is more in-line with our internal data model and allows us to expand the notion of binary data. In the future, there may have no data because the files were unmodified (there was no data to produce) or it may have no data because there was no data given to us in a patch. We want to treat these cases separately.
4bfd7c63	2016-09-01T16:55:27	patch: error on diff callback failure
4b34f687	2016-09-01T15:14:25	patch_generate: only calculate binary diffs if requested When generating diffs for binary files, we load and decompress the blobs in order to generate the actual diff, which can be very costly. While we cannot avoid this for the case when we are called with the `GIT_DIFF_SHOW_BINARY` flag, we do not have to load the blobs in the case where this flag is not set, as the caller is expected to have no interest in the actual content of binary files. Fix the issue by only generating a binary diff when the caller is actually interested in the diff. As libgit2 uses heuristics to determine that a blob contains binary data by inspecting its size without loading from the ODB, this saves us quite some time when diffing in a repository with binary files.
88cfe614	2016-08-24T01:20:39	git_checkout_tree options fix According to the reference the git_checkout_tree and git_checkout_head functions should accept NULL in the opts field This was broken since the opts field was dereferenced and thus lead to a crash.
ace0d36b	2016-08-29T09:29:34	Merge pull request #3900 from pks-t/pks/http-close-substream-on-connect transports: http: set substream as disconnected after closing
b859faa6	2016-08-23T23:38:39	Teach `git_patch_from_diff` about parsed diffs Ensure that `git_patch_from_diff` can return the patch for parsed diffs, not just generate a patch for a generated diff.
7a3f1de5	2016-08-22T09:27:47	filesystem_iterator: fixed double free on error
c1b370e9	2016-08-17T09:24:44	Merge pull request #3837 from novalis/dturner/indexv4 Support index v4
635a9222	2016-08-17T08:54:48	Merge pull request #3895 from pks-t/pks/negate-basename-in-subdirs ignore: allow unignoring basenames in subdirectories
b1453601	2016-08-17T11:38:26	transports: http: reset `connected` flag when closing transport
c4cba4e9	2016-08-17T11:00:05	transports: http: reset `connected` flag when re-connecting transport When calling `http_connect` on a subtransport whose stream is already connected, we first close the stream in case no keep-alive is in use. When doing so, we do not reset the transport's connection state, though. Usually, this will do no harm in case the subsequent connect will succeed. But when the connection fails we are left with a substransport which is tagged as connected but which has no valid stream attached. Fix the issue by resetting the subtransport's connected-state when closing its stream in `http_connect`.
fcb2c1c8	2016-08-12T09:06:15	ignore: allow unignoring basenames in subdirectories The .gitignore file allows for patterns which unignore previous ignore patterns. When unignoring a previous pattern, there are basically three cases how this is matched when no globbing is used: 1. when a previous file has been ignored, it can be unignored by using its exact name, e.g. foo/bar !foo/bar 2. when a file in a subdirectory has been ignored, it can be unignored by using its basename, e.g. foo/bar !bar 3. when all files with a basename are ignored, a specific file can be unignored again by specifying its path in a subdirectory, e.g. bar !foo/bar The first problem in libgit2 is that we did not correctly treat the second case. While we verified that the negative pattern matches the tail of the positive one, we did not verify if it only matches the basename of the positive pattern. So e.g. we would have also negated a pattern like foo/fruz_bar !bar Furthermore, we did not check for the third case, where a basename is being unignored in a certain subdirectory again. Both issues are fixed with this commit.
5625d86b	2016-05-17T15:40:32	index: support index v4 Support reading and writing index v4. Index v4 uses a very simple compression scheme for pathnames, but is otherwise similar to index v3. Signed-off-by: David Turner <dturner@twitter.com>
aeb5ee5a	2016-05-17T15:40:46	varint: Add varint encoding/decoding This code is ported from git.git Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: David Turner <dturner@twopensource.com>
b9895144	2016-08-08T14:47:32	stransport: do not use `git_stream_free` on uninitialized stransport When failing to initialize a new stransport stream, we try to release already allocated memory by calling out to `git_stream_free`, which in turn called out to the stream's `free` function pointer. As we only initialize the function pointer later on, this leads to a `NULL` pointer exception. Furthermore, plug another memory leak when failing to create the SSL context.
97e57e87	2016-08-08T15:13:59	Merge pull request #3887 from libgit2/ethomson/empty_blob odb: only provide the empty tree
b47e79e2	2016-08-08T08:42:32	Merge pull request #3890 from pks-t/pks/stransport-static-linkage stransport: make internal functions static
067bf5dc	2016-08-08T13:49:17	stransport: make internal functions static
becadafc	2016-08-05T19:30:56	odb: only provide the empty tree Only provide the empty tree internally, which matches git's behavior. If we provide the empty blob then any users trying to write it with libgit2 would omit it from actually landing in the odb, which appear to git proper as a broken repository (missing that object).
9884dd61	2016-08-05T18:40:37	SecureTransport: handle NULL trust on success The `SSLCopyPeerTrust` call can succeed but fail to return a trust object if it can't load the certificate chain and thus cannot check the validity of a certificate. This can lead to us calling `CFRelease` on a `NULL` trust object, causing a crash. Handle this by returning ECERTIFICATE.
274a727e	2016-08-05T10:57:42	apply: fix warning when initializing patch images
844f5b20	2016-08-05T10:57:13	pool: provide macro to statically initialize git_pool
27051d4e	2016-07-22T13:34:19	odb: only freshen pack files every 2 seconds Since writing multiple objects may all already exist in a single packfile, avoid freshening that packfile repeatedly in a tight loop. Instead, only freshen pack files every 2 seconds.
8f09a98e	2016-07-14T16:23:24	odb: freshen existing objects when writing When writing an object, we calculate its OID and see if it exists in the object database. If it does, we need to freshen the file that contains it.
d2794b0e	2016-08-04T20:49:50	Merge pull request #3877 from libgit2/ethomson/paths_init sysdir: don't assume an empty dir is uninitialized
0d84de02	2016-08-04T13:20:49	Merge pull request #3869 from richardipsum/fix-outdated-comment Fix outdated comment
78b500bf	2016-08-04T12:45:19	Merge pull request #3850 from wildart/custom-tls Enable https transport for custom TLS streams
031d34b7	2016-07-29T12:59:42	sysdir: use the standard `init` pattern Don't try to determine when sysdirs are uninitialized. Instead, simply initialize them all at `git_libgit2_init` time and never try to reinitialize, except when consumers explicitly call `git_sysdir_set`. Looking at the buffer length is especially problematic, since there may no appropriate path for that value. (For example, the Windows-specific programdata directory has no value on non-Windows machines.) Previously we would continually trying to re-lookup these values, which could get racy if two different threads are each calling `git_sysdir_get` and trying to lookup / clear the value simultaneously.
da7f9feb	2016-08-04T11:51:06	Merge pull request #3879 from libgit2/ethomson/mwindow_init mwindow: init mwindow files in git_libgit2_init
2381d9e4	2016-08-03T17:01:48	mwindow: init mwindow files in git_libgit2_init
1eee631d	2016-08-04T13:45:28	refspec: do not set empty rhs for fetch refspecs According to git-fetch(1), "[t]he colon can be omitted when <dst> is empty." So according to git, the refspec "refs/heads/master" is the same as the refspec "refs/heads/master:" when fetching changes. When trying to fetch from a remote with a trailing colon with libgit2, though, the fetch actually fails while it works when the trailing colon is left out. So obviously, libgit2 does _not_ treat these two refspec formats the same for fetches. The problem results from parsing refspecs, where the resulting refspec has its destination set to an empty string in the case of a trailing colon and to a `NULL` pointer in the case of no trailing colon. When passing this to our DWIM machinery, the empty string gets translated to "refs/heads/", which is simply wrong. Fix the problem by having the parsing machinery treat both cases the same for fetch refspecs.
002c8e29	2016-08-03T17:09:41	git_diff_file: move `id_abbrev` Move `id_abbrev` to a more reasonable place where it packs more nicely (before anybody starts using it).
152efee2	2016-08-02T18:43:12	Merge pull request #3865 from libgit2/ethomson/leaks Fix leaks, some warnings and an error
df87648a	2016-07-24T16:10:30	crlf: set a safe crlf default
b118f647	2016-07-22T14:02:00	repository: don't cast to `int` for no reason And give it a default so that some compilers don't (unnecessarily) complain.
4aaae935	2016-07-22T12:53:13	index: cast to avoid warning
60e15ecd	2016-07-15T17:18:39	packbuilder: `size_t` all the things After 1cd65991, we were passing a pointer to an `unsigned long` to a function that now expected a pointer to a `size_t`. These types differ on 64-bit Windows, which means that we trash the stack. Use `size_t`s in the packbuilder to avoid this.
581a4d39	2016-07-14T23:32:35	apply: safety check files that dont end with eol
c065f6a1	2016-07-14T23:04:47	apply: check allocation properly
531be3e8	2016-07-14T22:59:37	apply: compare preimage to image Compare the preimage to the image; don't compare the preimage to itself.
8b2ad593	2016-07-23T11:55:43	Make comment conform to style guide Style guide says // style comments should be avoided.
877282ea	2016-07-23T11:47:59	Fix outdated comment SSH transport seems to be supported now.
d81cb2e4	2016-07-15T13:32:23	remote: Handle missing config values when deleting a remote Somehow I ended up with the following in my ~/.gitconfig: [branch "master"] remote = origin merge = master rebase = true I assume something went crazy while I was running the git.git tests some time ago, and that I never noticed until now. This is not a good configuration, but it shouldn't cause problems. But it does. Specifically, if you have this in your config, and you perform the following set of actions: create a remote fetch from that remote create a branch off of the remote master branch called "master" delete the branch delete the remote The remote delete fails with the message "Could not find key 'branch.master.rebase' to delete". This is because it's iterating over the config entries (including the ones in the global config) and believes that there is a master branch which must therefore have these config keys. https://github.com/libgit2/libgit2/issues/3856
bdec62dc	2016-07-06T13:06:25	remove conditions that prevent use of custom TLS stream
c18a2bc4	2016-07-05T15:51:01	Merge pull request #3851 from txdv/get-user-agent Add get user agent functionality.
b57c176a	2016-07-05T12:46:27	Merge pull request #3846 from rkrp/fix_bug_parsing_int64min Fixed bug while parsing INT64_MIN
f1dba144	2016-07-05T09:41:51	Add get user agent functionality.
d8243465	2016-07-01T18:47:06	Merge pull request #3836 from joshtriplett/cleanup-find_repo find_repo: Clean up and simplify logic
ebeb56f0	2016-07-01T18:45:10	Merge pull request #3711 from joshtriplett/git_repository_discover_default Add GIT_REPOSITORY_OPEN_FROM_ENV flag to respect $GIT_* environment vars
6249d960	2016-06-29T17:55:44	index: include conflicts in `git_index_read_index` Ensure that we include conflicts when calling `git_index_read_index`, which will remove conflicts in the index that do not exist in the new target, and will add conflicts from the new target.
6f7ec728	2016-06-29T17:01:47	index: refactor common `read_index` functionality Most of `git_index_read_index` is common to reading any iterator. Refactor it out in case we want to implement `read_tree` in terms of it in the future.
59a0005d	2016-06-29T10:01:26	Merge pull request #3813 from stinb/submodule-update-fetch submodule: Try to fetch when update fails to find the target commit.
21766702	2016-06-27T15:20:20	blame: do not decrement commit refcount in make_origin When we create a blame origin, we try to look up the blob that is to be blamed at a certain revision. When this lookup fails, e.g. because the file did not exist at that certain revision, we fail to create the blame origin and return `NULL`. The blame origin that we have just allocated is thereby free'd with `origin_decref`. The `origin_decref` function does not only decrement reference counts for the blame origin, though, but also for its commit and blob. When this is done in the error case, we will cause an uneven reference count for these objects. This may result in hard-to-debug failures at seemingly unrelated code paths, where we try to access these objects when they in fact have already been free'd. Fix the issue by refactoring `make_origin` such that we only allocate the object after the only function that may fail so that we do not have to call `origin_decref` at all. Also fix the `pass_blame` function, which indirectly calls `make_origin`, to free the commit when `make_origin` failed.
70b9b841	2016-06-28T20:19:52	Fixed bug while parsing INT64_MIN
de43efcf	2016-06-28T16:07:25	submodule: Try to fetch when update fails to find the target commit in the submodule.
20302aa4	2016-06-25T23:33:05	Merge pull request #3223 from ethomson/apply Reading patch files
1a79cd95	2016-04-26T01:18:01	patch: show copy information for identical copies When showing copy information because we are duplicating contents, for example, when performing a `diff --find-copies-harder -M100 -B100`, then show copy from/to lines in a patch, and do not show context. Ensure that we can also parse such patches.
38a347ea	2016-04-25T17:52:39	patch::parse: handle patches with no hunks Patches may have no hunks when there's no modifications (for example, in a rename). Handle them.
2b490284	2016-06-24T15:59:37	find_repo: Clean up and simplify logic find_repo had a complex loop and heavily nested conditionals, making it difficult to follow. Simplify this as much as possible: - Separate assignments from conditionals. - Check the complex loop condition in the only place it can change. - Break out of the loop on error, rather than going through the rest of the loop body first. - Handle error cases by immediately breaking, rather than nesting conditionals. - Free repo_link unconditionally on the way out of the function, rather than in multiple places. - Add more comments on the remaining complex steps.
0dd98b69	2016-04-03T17:22:07	Add GIT_REPOSITORY_OPEN_FROM_ENV flag to respect $GIT_* environment vars git_repository_open_ext provides parameters for the start path, whether to search across filesystems, and what ceiling directories to stop at. git commands have standard environment variables and defaults for each of those, as well as various other parameters of the repository. To avoid duplicate environment variable handling in users of libgit2, add a GIT_REPOSITORY_OPEN_FROM_ENV flag, which makes git_repository_open_ext automatically handle the appropriate environment variables. Commands that intend to act just like those built into git itself can use this flag to get the expected default behavior. git_repository_open_ext with the GIT_REPOSITORY_OPEN_FROM_ENV flag respects $GIT_DIR, $GIT_DISCOVERY_ACROSS_FILESYSTEM, $GIT_CEILING_DIRECTORIES, $GIT_INDEX_FILE, $GIT_NAMESPACE, $GIT_OBJECT_DIRECTORY, and $GIT_ALTERNATE_OBJECT_DIRECTORIES. In the future, when libgit2 gets worktree support, git_repository_open_env will also respect $GIT_WORK_TREE and $GIT_COMMON_DIR; until then, git_repository_open_ext with this flag will error out if either $GIT_WORK_TREE or $GIT_COMMON_DIR is set.
39c6fca3	2016-04-03T16:01:01	Add GIT_REPOSITORY_OPEN_NO_DOTGIT flag to avoid appending /.git GIT_REPOSITORY_OPEN_NO_SEARCH does not search up through parent directories, but still tries the specified path both directly and with /.git appended. GIT_REPOSITORY_OPEN_BARE avoids appending /.git, but opens the repository in bare mode even if it has a working directory. To support the semantics git uses when given $GIT_DIR in the environment, provide a new GIT_REPOSITORY_OPEN_NO_DOTGIT flag to not try appending /.git.
ed577134	2016-04-03T19:24:15	Fix repository discovery with ceiling_dirs at current directory git only checks ceiling directories when its search ascends to a parent directory. A ceiling directory matching the starting directory will not prevent git from finding a repository in the starting directory or a parent directory. libgit2 handled the former case correctly, but differed from git in the latter case: given a ceiling directory matching the starting directory, but no repository at the starting directory, libgit2 would stop the search at that point rather than finding a repository in a parent directory. Test case using git command-line tools: /tmp$ git init x Initialized empty Git repository in /tmp/x/.git/ /tmp$ cd x/ /tmp/x$ mkdir subdir /tmp/x$ cd subdir/ /tmp/x/subdir$ GIT_CEILING_DIRECTORIES=/tmp/x git rev-parse --git-dir fatal: Not a git repository (or any of the parent directories): .git /tmp/x/subdir$ GIT_CEILING_DIRECTORIES=/tmp/x/subdir git rev-parse --git-dir /tmp/x/.git Fix the testsuite to test this case (in one case fixing a test that depended on the current behavior), and then fix find_repo to handle this case correctly. In the process, simplify and document the logic in find_repo(): - Separate the concepts of "currently checking a .git directory" and "number of iterations left before going further counts as a search" into two separate variables, in_dot_git and min_iterations. - Move the logic to handle in_dot_git and append /.git to the top of the loop. - Only search ceiling_dirs and find ceiling_offset after running out of min_iterations; since ceiling_offset only tracks the longest matching ceiling directory, if ceiling_dirs contained both the current directory and a parent directory, this change makes find_repo stop the search at the parent directory.
fe345c73	2016-02-09T12:29:31	Remove unused static functions
8fd74c08	2016-02-09T12:18:28	Avoid old-style function definitions Avoid declaring old-style functions without any parameters. Functions not accepting any parameters should be declared with `void fn(void)`. See ISO C89 $3.5.4.3.
bb0edf87	2016-06-20T22:50:46	Merge pull request #3830 from pks-t/pks/thread-namespacing Thread namespacing
aab266c9	2016-06-20T20:07:33	threads: add platform-independent thread initialization function
8aaa9fb6	2016-06-20T18:21:42	win32: rename pthread.{c,h} to thread.{c,h} The old pthread-file did re-implement the pthreads API with exact symbol matching. As the thread-abstraction has now been split up between Unix- and Windows-specific files within the `git_` namespace to avoid symbol-clashes between libgit2 and pthreads, the rewritten wrappers have nothing to do with pthreads anymore. Rename the Windows-specific pthread-files to honor this change.
a342e870	2016-06-20T18:28:00	threads: remove now-useless typedefs
4f10c1e6	2016-06-20T19:40:45	threads: remove unused function pthread_num_processors_np The function pthread_num_processors_np is currently unused and superseded by the function `git_online_cpus`. Remove the function.
6551004f	2016-06-20T17:49:47	threads: split up OS-dependent rwlock code
139bffa0	2016-06-20T17:20:13	threads: split up OS-dependent thread-condition code
20d078df	2016-06-20T19:48:19	threads: remove unused function pthread_cond_broadcast
1c135405	2016-06-20T17:07:14	threads: split up OS-dependent mutex code
faebc1c6	2016-06-20T17:44:04	threads: split up OS-dependent thread code
c80efb5f	2016-06-20T11:16:49	Merge pull request #3818 from meatcoder/fix_odb_read_error Fix truncation of SHA in error message for git_odb_read
2076d329	2016-06-09T22:50:53	fix error message SHA truncation in git_odb__error_notfound()
6c9eb86f	2016-06-19T11:46:43	HTTP authentication scheme name is case insensitive.
bb0bd71a	2016-06-15T15:47:28	checkout: use empty baseline when no index When no index file exists and a baseline is not explicitly provided, use an empty baseline instead of trying to load `HEAD`.
abb6f72a	2016-06-14T11:42:00	Merge pull request #3812 from stinb/fetch-tag-update-callback fetch: Fixed spurious update callback for existing tags.
7f9673e4	2016-06-14T14:46:12	fetch: Fixed spurious update callback for existing tags.
2a09de91	2016-06-14T04:33:55	Merge pull request #3816 from pks-t/pks/memory-leaks Memory leak fixes
43c55111	2016-06-07T14:14:07	winhttp: plug several memory leaks
432af52b	2016-06-07T12:55:17	global: clean up crt only after freeing tls data The thread local storage is used to hold some global state that is dynamically allocated and should be freed upon exit. On Windows, we clean up the C run-time right after execution of registered shutdown callbacks and before cleaning up the TLS. When we clean up the CRT, we also cause it to analyze for memory leaks. As we did not free the TLS yet this will lead to false positives. Fix the issue by first freeing the TLS and cleaning up the CRT only afterwards.
13deb874	2016-06-07T08:35:26	index: fix NULL pointer access in index_remove_entry When removing an entry from the index by its position, we first retrieve the position from the index's entries and then try to remove the retrieved value from the index map with `DELETE_IN_MAP`. When `index_remove_entry` returns `NULL` we try to feed it into the `DELETE_IN_MAP` macro, which will unconditionally call `idxentry_hash` and then happily dereference the `NULL` entry pointer. Fix the issue by not passing a `NULL` entry into `DELETE_IN_MAP`.
7d02019a	2016-06-06T12:59:17	transports: smart: fix potential invalid memory dereferences When we receive a packet of exactly four bytes encoding its length as those four bytes it can be treated as an empty line. While it is not really specified how those empty lines should be treated, we currently ignore them and do not return an error when trying to parse it but simply advance the data pointer. Callers invoking `git_pkt_parse_line` are currently not prepared to handle this case as they do not explicitly check this case. While they could always reset the passed out-pointer to `NULL` before calling `git_pkt_parse_line` and determine if the pointer has been set afterwards, it makes more sense to update `git_pkt_parse_line` to set the out-pointer to `NULL` itself when it encounters such an empty packet. Like this it is guaranteed that there will be no invalid memory references to free'd pointers. As such, the issue has been fixed such that `git_pkt_parse_line` always sets the packet out pointer to `NULL` when an empty packet has been received and callers check for this condition, skipping such packets.
46082c38	2016-06-02T02:34:03	index_read_index: invalidate new paths in tree cache When adding a new entry to an existing index via `git_index_read_index`, be sure to remove the tree cache entry for that new path. This will mark all parent trees as dirty.
9167c145	2016-06-02T01:04:58	index_read_index: set flags for path_len correctly Update the flags to reset the path_len (to emulate `index_insert`)
046ec3c9	2016-06-02T00:47:51	index_read_index: differentiate on mode Treat index entries with different modes as different, which they are, at least for the purposes of up-to-date calculations.
93de20b8	2016-06-01T14:56:27	index_read_index: reset error correctly Clear any error state upon each iteration. If one of the iterations ends (with an error of `GIT_ITEROVER`) we need to reset that error to 0, lest we stop the whole process prematurely.
14cf05da	2016-05-26T12:52:29	win32: clean up unused warnings in DllMain
4505a42a	2016-05-26T12:42:43	rebase: change assertion to avoid It looks like we're getting the operation and not doing anything with it, when in fact we are asserting that it's not null. Simply assert that we are within the operation boundary instead of using the `git_array_get` macro to do this for us.
e3c42fee	2016-05-26T12:39:09	filebuf: fix uninitialized warning

19f1a8e6

2016-09-05T11:44:51

diff: improve positioning of add/delete blocks in diffs Some groups of added/deleted lines in diffs can be slid up or down, because lines at the edges of the group are not unique. Picking good shifts for such groups is not a matter of correctness but definitely has a big effect on aesthetics. For example, consider the following two diffs. The first is what standard Git emits: --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl @@ -231,6 +231,9 @@ if (!defined $initial_reply_to && $prompting) { } if (!$smtp_server) { + $smtp_server = $repo->config('sendemail.smtpserver'); +} +if (!$smtp_server) { foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) { if (-x $_) { $smtp_server = $_; The following diff is equivalent, but is obviously preferable from an aesthetic point of view: --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl @@ -230,6 +230,9 @@ if (!defined $initial_reply_to && $prompting) { $initial_reply_to =~ s/(^\s+|\s+$)//g; } +if (!$smtp_server) { + $smtp_server = $repo->config('sendemail.smtpserver'); +} if (!$smtp_server) { foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) { if (-x $_) { This patch teaches Git to pick better positions for such "diff sliders" using heuristics that take the positions of nearby blank lines and the indentation of nearby lines into account. The existing Git code basically always shifts such "sliders" as far down in the file as possible. The only exception is when the slider can be aligned with a group of changed lines in the other file, in which case Git favors depicting the change as one add+delete block rather than one add and a slightly offset delete block. This naive algorithm often yields ugly diffs. Commit d634d61ed6 improved the situation somewhat by preferring to position add/delete groups to make their last line a blank line, when that is possible. This heuristic does more good than harm, but (1) it can only help if there are blank lines in the right places, and (2) always picks the last blank line, even if there are others that might be better. The end result is that it makes perhaps 1/3 as many errors as the default Git algorithm, but that still leaves a lot of ugly diffs. This commit implements a new and much better heuristic for picking optimal "slider" positions using the following approach: First observe that each hypothetical positioning of a diff slider introduces two splits: one between the context lines preceding the group and the first added/deleted line, and the other between the last added/deleted line and the first line of context following it. It tries to find the positioning that creates the least bad splits. Splits are evaluated based only on the presence and locations of nearby blank lines, and the indentation of lines near the split. Basically, it prefers to introduce splits adjacent to blank lines, between lines that are indented less, and between lines with the same level of indentation. In more detail: 1. It measures the following characteristics of a proposed splitting position in a `struct split_measurement`: * the number of blank lines above the proposed split * whether the line directly after the split is blank * the number of blank lines following that line * the indentation of the nearest non-blank line above the split * the indentation of the line directly below the split * the indentation of the nearest non-blank line after that line 2. It combines the measured attributes using a bunch of empirically-optimized weighting factors to derive a `struct split_score` that measures the "badness" of splitting the text at that position. 3. It combines the `split_score` for the top and the bottom of the slider at each of its possible positions, and selects the position that has the best `split_score`. I determined the initial set of weighting factors by collecting a corpus of Git histories from 29 open-source software projects in various programming languages. I generated many diffs from this corpus, and determined the best positioning "by eye" for about 6600 diff sliders. I used about half of the repositories in the corpus (corresponding to about 2/3 of the sliders) as a training set, and optimized the weights against this corpus using a crude automated search of the parameter space to get the best agreement with the manually-determined values. Then I tested the resulting heuristic against the full corpus. The results are summarized in the following table, in column `indent-1`: | repository | count | Git 2.9.0 | compaction | compaction-fixed | indent-1 | indent-2 | | --------------------- | ----- | -------------- | -------------- | ---------------- | -------------- | -------------- | | afnetworking | 109 | 89 (81.7%) | 37 (33.9%) | 37 (33.9%) | 2 (1.8%) | 2 (1.8%) | | alamofire | 30 | 18 (60.0%) | 14 (46.7%) | 15 (50.0%) | 0 (0.0%) | 0 (0.0%) | | angular | 184 | 127 (69.0%) | 39 (21.2%) | 23 (12.5%) | 5 (2.7%) | 5 (2.7%) | | animate | 313 | 2 (0.6%) | 2 (0.6%) | 2 (0.6%) | 2 (0.6%) | 2 (0.6%) | | ant | 380 | 356 (93.7%) | 152 (40.0%) | 148 (38.9%) | 15 (3.9%) | 15 (3.9%) | * | bugzilla | 306 | 263 (85.9%) | 109 (35.6%) | 99 (32.4%) | 14 (4.6%) | 15 (4.9%) | * | corefx | 126 | 91 (72.2%) | 22 (17.5%) | 21 (16.7%) | 6 (4.8%) | 6 (4.8%) | | couchdb | 78 | 44 (56.4%) | 26 (33.3%) | 28 (35.9%) | 6 (7.7%) | 6 (7.7%) | * | cpython | 937 | 158 (16.9%) | 50 (5.3%) | 49 (5.2%) | 5 (0.5%) | 5 (0.5%) | * | discourse | 160 | 95 (59.4%) | 42 (26.2%) | 36 (22.5%) | 18 (11.2%) | 13 (8.1%) | | docker | 307 | 194 (63.2%) | 198 (64.5%) | 253 (82.4%) | 8 (2.6%) | 8 (2.6%) | * | electron | 163 | 132 (81.0%) | 38 (23.3%) | 39 (23.9%) | 6 (3.7%) | 6 (3.7%) | | git | 536 | 470 (87.7%) | 73 (13.6%) | 78 (14.6%) | 16 (3.0%) | 16 (3.0%) | * | gitflow | 127 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | | ionic | 133 | 89 (66.9%) | 29 (21.8%) | 38 (28.6%) | 1 (0.8%) | 1 (0.8%) | | ipython | 482 | 362 (75.1%) | 167 (34.6%) | 169 (35.1%) | 11 (2.3%) | 11 (2.3%) | * | junit | 161 | 147 (91.3%) | 67 (41.6%) | 66 (41.0%) | 1 (0.6%) | 1 (0.6%) | * | lighttable | 15 | 5 (33.3%) | 0 (0.0%) | 2 (13.3%) | 0 (0.0%) | 0 (0.0%) | | magit | 88 | 75 (85.2%) | 11 (12.5%) | 9 (10.2%) | 1 (1.1%) | 0 (0.0%) | | neural-style | 28 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | | nodejs | 781 | 649 (83.1%) | 118 (15.1%) | 111 (14.2%) | 4 (0.5%) | 5 (0.6%) | * | phpmyadmin | 491 | 481 (98.0%) | 75 (15.3%) | 48 (9.8%) | 2 (0.4%) | 2 (0.4%) | * | react-native | 168 | 130 (77.4%) | 79 (47.0%) | 81 (48.2%) | 0 (0.0%) | 0 (0.0%) | | rust | 171 | 128 (74.9%) | 30 (17.5%) | 27 (15.8%) | 16 (9.4%) | 14 (8.2%) | | spark | 186 | 149 (80.1%) | 52 (28.0%) | 52 (28.0%) | 2 (1.1%) | 2 (1.1%) | | tensorflow | 115 | 66 (57.4%) | 48 (41.7%) | 48 (41.7%) | 5 (4.3%) | 5 (4.3%) | | test-more | 19 | 15 (78.9%) | 2 (10.5%) | 2 (10.5%) | 1 (5.3%) | 1 (5.3%) | * | test-unit | 51 | 34 (66.7%) | 14 (27.5%) | 8 (15.7%) | 2 (3.9%) | 2 (3.9%) | * | xmonad | 23 | 22 (95.7%) | 2 (8.7%) | 2 (8.7%) | 1 (4.3%) | 1 (4.3%) | * | --------------------- | ----- | -------------- | -------------- | ---------------- | -------------- | -------------- | | totals | 6668 | 4391 (65.9%) | 1496 (22.4%) | 1491 (22.4%) | 150 (2.2%) | 144 (2.2%) | | totals (training set) | 4552 | 3195 (70.2%) | 1053 (23.1%) | 1061 (23.3%) | 86 (1.9%) | 88 (1.9%) | | totals (test set) | 2116 | 1196 (56.5%) | 443 (20.9%) | 430 (20.3%) | 64 (3.0%) | 56 (2.6%) | In this table, the numbers are the count and percentage of human-rated sliders that the corresponding algorithm got *wrong*. The columns are * "repository" - the name of the repository used. I used the diffs between successive non-merge commits on the HEAD branch of the corresponding repository. * "count" - the number of sliders that were human-rated. I chose most, but not all, sliders to rate from those among which the various algorithms gave different answers. * "Git 2.9.0" - the default algorithm used by `git diff` in Git 2.9.0. * "compaction" - the heuristic used by `git diff --compaction-heuristic` in Git 2.9.0. * "compaction-fixed" - the heuristic used by `git diff --compaction-heuristic` after the fixes from earlier in this patch series. Note that the results are not dramatically different than those for "compaction". Both produce non-ideal diffs only about 1/3 as often as the default `git diff`. * "indent-1" - the new `--indent-heuristic` algorithm, using the first set of weighting factors, determined as described above. * "indent-2" - the new `--indent-heuristic` algorithm, using the final set of weighting factors, determined as described below. * `*` - indicates that repo was part of training set used to determine the first set of weighting factors. The fact that the heuristic performed nearly as well on the test set as on the training set in column "indent-1" is a good indication that the heuristic was not over-trained. Given that fact, I ran a second round of optimization, using the entire corpus as the training set. The resulting set of weights gave the results in column "indent-2". These are the weights included in this patch. The final result gives consistently and significantly better results across the whole corpus than either `git diff` or `git diff --compaction-heuristic`. It makes only about 1/30 as many errors as the former and about 1/10 as many errors as the latter. (And a good fraction of the remaining errors are for diffs that involve weirdly-formatted code, sometimes apparently machine-generated.) The tools that were used to do this optimization and analysis, along with the human-generated data values, are recorded in a separate project [1]. [1] https://github.com/mhagger/diff-slider-tools Original Git commit: 433860f3d0beb0c6f205290bd16cda413148f098

a49895b5

2016-08-22T13:22:44

xdl_change_compact(): introduce the concept of a change group The idea of xdl_change_compact() is fairly simple: * Proceed through groups of changed lines in the file to be compacted, keeping track of the corresponding location in the "other" file. * If possible, slide the group up and down to try to give the most aesthetically pleasing diff. Whenever it is slid, the current location in the other file needs to be adjusted. But these simple concepts are obfuscated by a lot of index handling that is written in terse, subtle, and varied patterns. I found it very hard to convince myself that the function was correct. So introduce a "struct group" that represents a group of changed lines in a file. Add some functions that perform elementary operations on groups: * Initialize a group to the first group in a file * Move to the next or previous group in a file * Slide a group up or down Even though the resulting code is longer, I think it is easier to understand and review. Its performance is not changed appreciably (though it would be if `group_next()` and `group_previous()` were not inlined). ...and in fact, the rewriting helped me discover another bug in the --compaction-heuristic code: The update of blank_lines was never done for the highest possible position of the group. This means that it could fail to slide the group to its highest possible position, even if that position had a blank line as its last line. So for example, it yielded the following diff: $ git diff --no-index --compaction-heuristic a.txt b.txt diff --git a/a.txt b/b.txt index e53969f..0d60c5fe 100644 --- a/a.txt +++ b/b.txt @@ -1,3 +1,7 @@ 1 A + +B + +A 2 when in fact the following diff is better (according to the rules of --compaction-heuristic): $ git diff --no-index --compaction-heuristic a.txt b.txt diff --git a/a.txt b/b.txt index e53969f..0d60c5fe 100644 --- a/a.txt +++ b/b.txt @@ -1,3 +1,7 @@ 1 +A + +B + A 2 The new code gives the bottom answer. Original Git commit: e8adf23d1ee97b57c8aea32ee8365203b77c0e42

09fb5b2a

2016-08-22T13:22:43

recs_match(): take two xrecord_t pointers as arguments There is no reason for it to take an array and two indexes as argument, as it only accesses two elements of the array. Original Git commit: 152598cbb667471c8f5be16e199922a41452b2d5

506bf09d

2016-04-15T16:01:45

xdiff: add recs_match helper function It is a common pattern in xdl_change_compact to check that hashes and strings match. The resulting code to perform this change causes very long lines and makes it hard to follow the intention. Introduce a helper function recs_match which performs both checks to increase code readability. Original Git commit: 92e5b62fec0e9b647429e8d3736c571c434dd375

2749ff46

2016-09-13T15:52:43

time: Export `git_time_monotonic`

9ad07fc0

2016-09-06T10:43:21

Merge pull request #3923 from libgit2/ethomson/diff-read-empty-binary Read binary patches (with no binary data)

46035d98

2016-09-06T11:21:29

Merge pull request #3882 from pks-t/pks/fix-fetch-refspec-dst-parsing refspec: do not set empty rhs for fetch refspecs

adedac5a

2016-09-02T02:03:45

diff: treat binary patches with no data special When creating and printing diffs, deal with binary deltas that have binary data specially, versus diffs that have a binary file but lack the actual binary data.

f4e3dae7

2016-09-02T11:26:16

diff_print: change test for skipping binary printing Instead of skipping printing a binary diff when there is no data, skip printing when we have a status of `UNMODIFIED`. This is more in-line with our internal data model and allows us to expand the notion of binary data. In the future, there may have no data because the files were unmodified (there was no data to produce) or it may have no data because there was no data given to us in a patch. We want to treat these cases separately.

4bfd7c63

2016-09-01T16:55:27

patch: error on diff callback failure

4b34f687

2016-09-01T15:14:25

patch_generate: only calculate binary diffs if requested When generating diffs for binary files, we load and decompress the blobs in order to generate the actual diff, which can be very costly. While we cannot avoid this for the case when we are called with the `GIT_DIFF_SHOW_BINARY` flag, we do not have to load the blobs in the case where this flag is not set, as the caller is expected to have no interest in the actual content of binary files. Fix the issue by only generating a binary diff when the caller is actually interested in the diff. As libgit2 uses heuristics to determine that a blob contains binary data by inspecting its size without loading from the ODB, this saves us quite some time when diffing in a repository with binary files.

88cfe614

2016-08-24T01:20:39

git_checkout_tree options fix According to the reference the git_checkout_tree and git_checkout_head functions should accept NULL in the opts field This was broken since the opts field was dereferenced and thus lead to a crash.

ace0d36b

2016-08-29T09:29:34

Merge pull request #3900 from pks-t/pks/http-close-substream-on-connect transports: http: set substream as disconnected after closing

b859faa6

2016-08-23T23:38:39

Teach `git_patch_from_diff` about parsed diffs Ensure that `git_patch_from_diff` can return the patch for parsed diffs, not just generate a patch for a generated diff.

7a3f1de5

2016-08-22T09:27:47

filesystem_iterator: fixed double free on error

c1b370e9

2016-08-17T09:24:44

Merge pull request #3837 from novalis/dturner/indexv4 Support index v4

635a9222

2016-08-17T08:54:48

Merge pull request #3895 from pks-t/pks/negate-basename-in-subdirs ignore: allow unignoring basenames in subdirectories

b1453601

2016-08-17T11:38:26

transports: http: reset `connected` flag when closing transport

c4cba4e9

2016-08-17T11:00:05

transports: http: reset `connected` flag when re-connecting transport When calling `http_connect` on a subtransport whose stream is already connected, we first close the stream in case no keep-alive is in use. When doing so, we do not reset the transport's connection state, though. Usually, this will do no harm in case the subsequent connect will succeed. But when the connection fails we are left with a substransport which is tagged as connected but which has no valid stream attached. Fix the issue by resetting the subtransport's connected-state when closing its stream in `http_connect`.

fcb2c1c8

2016-08-12T09:06:15

ignore: allow unignoring basenames in subdirectories The .gitignore file allows for patterns which unignore previous ignore patterns. When unignoring a previous pattern, there are basically three cases how this is matched when no globbing is used: 1. when a previous file has been ignored, it can be unignored by using its exact name, e.g. foo/bar !foo/bar 2. when a file in a subdirectory has been ignored, it can be unignored by using its basename, e.g. foo/bar !bar 3. when all files with a basename are ignored, a specific file can be unignored again by specifying its path in a subdirectory, e.g. bar !foo/bar The first problem in libgit2 is that we did not correctly treat the second case. While we verified that the negative pattern matches the tail of the positive one, we did not verify if it only matches the basename of the positive pattern. So e.g. we would have also negated a pattern like foo/fruz_bar !bar Furthermore, we did not check for the third case, where a basename is being unignored in a certain subdirectory again. Both issues are fixed with this commit.

5625d86b

2016-05-17T15:40:32

index: support index v4 Support reading and writing index v4. Index v4 uses a very simple compression scheme for pathnames, but is otherwise similar to index v3. Signed-off-by: David Turner <dturner@twitter.com>

aeb5ee5a

2016-05-17T15:40:46

varint: Add varint encoding/decoding This code is ported from git.git Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: David Turner <dturner@twopensource.com>

b9895144

2016-08-08T14:47:32

stransport: do not use `git_stream_free` on uninitialized stransport When failing to initialize a new stransport stream, we try to release already allocated memory by calling out to `git_stream_free`, which in turn called out to the stream's `free` function pointer. As we only initialize the function pointer later on, this leads to a `NULL` pointer exception. Furthermore, plug another memory leak when failing to create the SSL context.

97e57e87

2016-08-08T15:13:59

Merge pull request #3887 from libgit2/ethomson/empty_blob odb: only provide the empty tree

b47e79e2

2016-08-08T08:42:32

Merge pull request #3890 from pks-t/pks/stransport-static-linkage stransport: make internal functions static

067bf5dc

2016-08-08T13:49:17

stransport: make internal functions static

becadafc

2016-08-05T19:30:56

odb: only provide the empty tree Only provide the empty tree internally, which matches git's behavior. If we provide the empty blob then any users trying to write it with libgit2 would omit it from actually landing in the odb, which appear to git proper as a broken repository (missing that object).

9884dd61

2016-08-05T18:40:37

SecureTransport: handle NULL trust on success The `SSLCopyPeerTrust` call can succeed but fail to return a trust object if it can't load the certificate chain and thus cannot check the validity of a certificate. This can lead to us calling `CFRelease` on a `NULL` trust object, causing a crash. Handle this by returning ECERTIFICATE.

274a727e

2016-08-05T10:57:42

apply: fix warning when initializing patch images

844f5b20

2016-08-05T10:57:13

pool: provide macro to statically initialize git_pool

27051d4e

2016-07-22T13:34:19

odb: only freshen pack files every 2 seconds Since writing multiple objects may all already exist in a single packfile, avoid freshening that packfile repeatedly in a tight loop. Instead, only freshen pack files every 2 seconds.

8f09a98e

2016-07-14T16:23:24

odb: freshen existing objects when writing When writing an object, we calculate its OID and see if it exists in the object database. If it does, we need to freshen the file that contains it.

d2794b0e

2016-08-04T20:49:50

Merge pull request #3877 from libgit2/ethomson/paths_init sysdir: don't assume an empty dir is uninitialized

0d84de02

2016-08-04T13:20:49

Merge pull request #3869 from richardipsum/fix-outdated-comment Fix outdated comment

78b500bf

2016-08-04T12:45:19

Merge pull request #3850 from wildart/custom-tls Enable https transport for custom TLS streams

031d34b7

2016-07-29T12:59:42

sysdir: use the standard `init` pattern Don't try to determine when sysdirs are uninitialized. Instead, simply initialize them all at `git_libgit2_init` time and never try to reinitialize, except when consumers explicitly call `git_sysdir_set`. Looking at the buffer length is especially problematic, since there may no appropriate path for that value. (For example, the Windows-specific programdata directory has no value on non-Windows machines.) Previously we would continually trying to re-lookup these values, which could get racy if two different threads are each calling `git_sysdir_get` and trying to lookup / clear the value simultaneously.

da7f9feb

2016-08-04T11:51:06

Merge pull request #3879 from libgit2/ethomson/mwindow_init mwindow: init mwindow files in git_libgit2_init

2381d9e4

2016-08-03T17:01:48

mwindow: init mwindow files in git_libgit2_init

1eee631d

2016-08-04T13:45:28

refspec: do not set empty rhs for fetch refspecs According to git-fetch(1), "[t]he colon can be omitted when <dst> is empty." So according to git, the refspec "refs/heads/master" is the same as the refspec "refs/heads/master:" when fetching changes. When trying to fetch from a remote with a trailing colon with libgit2, though, the fetch actually fails while it works when the trailing colon is left out. So obviously, libgit2 does _not_ treat these two refspec formats the same for fetches. The problem results from parsing refspecs, where the resulting refspec has its destination set to an empty string in the case of a trailing colon and to a `NULL` pointer in the case of no trailing colon. When passing this to our DWIM machinery, the empty string gets translated to "refs/heads/", which is simply wrong. Fix the problem by having the parsing machinery treat both cases the same for fetch refspecs.

002c8e29

2016-08-03T17:09:41

git_diff_file: move `id_abbrev` Move `id_abbrev` to a more reasonable place where it packs more nicely (before anybody starts using it).

152efee2

2016-08-02T18:43:12

Merge pull request #3865 from libgit2/ethomson/leaks Fix leaks, some warnings and an error

df87648a

2016-07-24T16:10:30

crlf: set a safe crlf default

b118f647

2016-07-22T14:02:00

repository: don't cast to `int` for no reason And give it a default so that some compilers don't (unnecessarily) complain.

4aaae935

2016-07-22T12:53:13

index: cast to avoid warning

60e15ecd

2016-07-15T17:18:39

packbuilder: `size_t` all the things After 1cd65991, we were passing a pointer to an `unsigned long` to a function that now expected a pointer to a `size_t`. These types differ on 64-bit Windows, which means that we trash the stack. Use `size_t`s in the packbuilder to avoid this.

581a4d39

2016-07-14T23:32:35

apply: safety check files that dont end with eol

c065f6a1

2016-07-14T23:04:47

apply: check allocation properly

531be3e8

2016-07-14T22:59:37

apply: compare preimage to image Compare the preimage to the image; don't compare the preimage to itself.

8b2ad593

2016-07-23T11:55:43

Make comment conform to style guide Style guide says // style comments should be avoided.

877282ea

2016-07-23T11:47:59

Fix outdated comment SSH transport seems to be supported now.

d81cb2e4

2016-07-15T13:32:23

remote: Handle missing config values when deleting a remote Somehow I ended up with the following in my ~/.gitconfig: [branch "master"] remote = origin merge = master rebase = true I assume something went crazy while I was running the git.git tests some time ago, and that I never noticed until now. This is not a good configuration, but it shouldn't cause problems. But it does. Specifically, if you have this in your config, and you perform the following set of actions: create a remote fetch from that remote create a branch off of the remote master branch called "master" delete the branch delete the remote The remote delete fails with the message "Could not find key 'branch.master.rebase' to delete". This is because it's iterating over the config entries (including the ones in the global config) and believes that there is a master branch which must therefore have these config keys. https://github.com/libgit2/libgit2/issues/3856

bdec62dc

2016-07-06T13:06:25

remove conditions that prevent use of custom TLS stream

c18a2bc4

2016-07-05T15:51:01

Merge pull request #3851 from txdv/get-user-agent Add get user agent functionality.

b57c176a

2016-07-05T12:46:27

Merge pull request #3846 from rkrp/fix_bug_parsing_int64min Fixed bug while parsing INT64_MIN

f1dba144

2016-07-05T09:41:51

Add get user agent functionality.

d8243465

2016-07-01T18:47:06

Merge pull request #3836 from joshtriplett/cleanup-find_repo find_repo: Clean up and simplify logic

ebeb56f0

2016-07-01T18:45:10

Merge pull request #3711 from joshtriplett/git_repository_discover_default Add GIT_REPOSITORY_OPEN_FROM_ENV flag to respect $GIT_* environment vars

6249d960

2016-06-29T17:55:44

index: include conflicts in `git_index_read_index` Ensure that we include conflicts when calling `git_index_read_index`, which will remove conflicts in the index that do not exist in the new target, and will add conflicts from the new target.

6f7ec728

2016-06-29T17:01:47

index: refactor common `read_index` functionality Most of `git_index_read_index` is common to reading any iterator. Refactor it out in case we want to implement `read_tree` in terms of it in the future.

59a0005d

2016-06-29T10:01:26

Merge pull request #3813 from stinb/submodule-update-fetch submodule: Try to fetch when update fails to find the target commit.

21766702

2016-06-27T15:20:20

blame: do not decrement commit refcount in make_origin When we create a blame origin, we try to look up the blob that is to be blamed at a certain revision. When this lookup fails, e.g. because the file did not exist at that certain revision, we fail to create the blame origin and return `NULL`. The blame origin that we have just allocated is thereby free'd with `origin_decref`. The `origin_decref` function does not only decrement reference counts for the blame origin, though, but also for its commit and blob. When this is done in the error case, we will cause an uneven reference count for these objects. This may result in hard-to-debug failures at seemingly unrelated code paths, where we try to access these objects when they in fact have already been free'd. Fix the issue by refactoring `make_origin` such that we only allocate the object after the only function that may fail so that we do not have to call `origin_decref` at all. Also fix the `pass_blame` function, which indirectly calls `make_origin`, to free the commit when `make_origin` failed.

70b9b841

2016-06-28T20:19:52

Fixed bug while parsing INT64_MIN

de43efcf

2016-06-28T16:07:25

submodule: Try to fetch when update fails to find the target commit in the submodule.

20302aa4

2016-06-25T23:33:05

Merge pull request #3223 from ethomson/apply Reading patch files

1a79cd95

2016-04-26T01:18:01

patch: show copy information for identical copies When showing copy information because we are duplicating contents, for example, when performing a `diff --find-copies-harder -M100 -B100`, then show copy from/to lines in a patch, and do not show context. Ensure that we can also parse such patches.

38a347ea

2016-04-25T17:52:39

patch::parse: handle patches with no hunks Patches may have no hunks when there's no modifications (for example, in a rename). Handle them.

2b490284

2016-06-24T15:59:37

find_repo: Clean up and simplify logic find_repo had a complex loop and heavily nested conditionals, making it difficult to follow. Simplify this as much as possible: - Separate assignments from conditionals. - Check the complex loop condition in the only place it can change. - Break out of the loop on error, rather than going through the rest of the loop body first. - Handle error cases by immediately breaking, rather than nesting conditionals. - Free repo_link unconditionally on the way out of the function, rather than in multiple places. - Add more comments on the remaining complex steps.

0dd98b69

2016-04-03T17:22:07

Add GIT_REPOSITORY_OPEN_FROM_ENV flag to respect $GIT_* environment vars git_repository_open_ext provides parameters for the start path, whether to search across filesystems, and what ceiling directories to stop at. git commands have standard environment variables and defaults for each of those, as well as various other parameters of the repository. To avoid duplicate environment variable handling in users of libgit2, add a GIT_REPOSITORY_OPEN_FROM_ENV flag, which makes git_repository_open_ext automatically handle the appropriate environment variables. Commands that intend to act just like those built into git itself can use this flag to get the expected default behavior. git_repository_open_ext with the GIT_REPOSITORY_OPEN_FROM_ENV flag respects $GIT_DIR, $GIT_DISCOVERY_ACROSS_FILESYSTEM, $GIT_CEILING_DIRECTORIES, $GIT_INDEX_FILE, $GIT_NAMESPACE, $GIT_OBJECT_DIRECTORY, and $GIT_ALTERNATE_OBJECT_DIRECTORIES. In the future, when libgit2 gets worktree support, git_repository_open_env will also respect $GIT_WORK_TREE and $GIT_COMMON_DIR; until then, git_repository_open_ext with this flag will error out if either $GIT_WORK_TREE or $GIT_COMMON_DIR is set.

39c6fca3

2016-04-03T16:01:01

Add GIT_REPOSITORY_OPEN_NO_DOTGIT flag to avoid appending /.git GIT_REPOSITORY_OPEN_NO_SEARCH does not search up through parent directories, but still tries the specified path both directly and with /.git appended. GIT_REPOSITORY_OPEN_BARE avoids appending /.git, but opens the repository in bare mode even if it has a working directory. To support the semantics git uses when given $GIT_DIR in the environment, provide a new GIT_REPOSITORY_OPEN_NO_DOTGIT flag to not try appending /.git.

ed577134

2016-04-03T19:24:15

Fix repository discovery with ceiling_dirs at current directory git only checks ceiling directories when its search ascends to a parent directory. A ceiling directory matching the starting directory will not prevent git from finding a repository in the starting directory or a parent directory. libgit2 handled the former case correctly, but differed from git in the latter case: given a ceiling directory matching the starting directory, but no repository at the starting directory, libgit2 would stop the search at that point rather than finding a repository in a parent directory. Test case using git command-line tools: /tmp$ git init x Initialized empty Git repository in /tmp/x/.git/ /tmp$ cd x/ /tmp/x$ mkdir subdir /tmp/x$ cd subdir/ /tmp/x/subdir$ GIT_CEILING_DIRECTORIES=/tmp/x git rev-parse --git-dir fatal: Not a git repository (or any of the parent directories): .git /tmp/x/subdir$ GIT_CEILING_DIRECTORIES=/tmp/x/subdir git rev-parse --git-dir /tmp/x/.git Fix the testsuite to test this case (in one case fixing a test that depended on the current behavior), and then fix find_repo to handle this case correctly. In the process, simplify and document the logic in find_repo(): - Separate the concepts of "currently checking a .git directory" and "number of iterations left before going further counts as a search" into two separate variables, in_dot_git and min_iterations. - Move the logic to handle in_dot_git and append /.git to the top of the loop. - Only search ceiling_dirs and find ceiling_offset after running out of min_iterations; since ceiling_offset only tracks the longest matching ceiling directory, if ceiling_dirs contained both the current directory and a parent directory, this change makes find_repo stop the search at the parent directory.

fe345c73

2016-02-09T12:29:31

Remove unused static functions

8fd74c08

2016-02-09T12:18:28

Avoid old-style function definitions Avoid declaring old-style functions without any parameters. Functions not accepting any parameters should be declared with `void fn(void)`. See ISO C89 $3.5.4.3.

bb0edf87

2016-06-20T22:50:46

Merge pull request #3830 from pks-t/pks/thread-namespacing Thread namespacing

aab266c9

2016-06-20T20:07:33

threads: add platform-independent thread initialization function

8aaa9fb6

2016-06-20T18:21:42

win32: rename pthread.{c,h} to thread.{c,h} The old pthread-file did re-implement the pthreads API with exact symbol matching. As the thread-abstraction has now been split up between Unix- and Windows-specific files within the `git_` namespace to avoid symbol-clashes between libgit2 and pthreads, the rewritten wrappers have nothing to do with pthreads anymore. Rename the Windows-specific pthread-files to honor this change.

a342e870

2016-06-20T18:28:00

threads: remove now-useless typedefs

4f10c1e6

2016-06-20T19:40:45

threads: remove unused function pthread_num_processors_np The function pthread_num_processors_np is currently unused and superseded by the function `git_online_cpus`. Remove the function.

6551004f

2016-06-20T17:49:47

threads: split up OS-dependent rwlock code

139bffa0

2016-06-20T17:20:13

threads: split up OS-dependent thread-condition code

20d078df

2016-06-20T19:48:19

threads: remove unused function pthread_cond_broadcast

1c135405

2016-06-20T17:07:14

threads: split up OS-dependent mutex code

faebc1c6

2016-06-20T17:44:04

threads: split up OS-dependent thread code

c80efb5f

2016-06-20T11:16:49

Merge pull request #3818 from meatcoder/fix_odb_read_error Fix truncation of SHA in error message for git_odb_read

2076d329

2016-06-09T22:50:53

fix error message SHA truncation in git_odb__error_notfound()

6c9eb86f

2016-06-19T11:46:43

HTTP authentication scheme name is case insensitive.

bb0bd71a

2016-06-15T15:47:28

checkout: use empty baseline when no index When no index file exists and a baseline is not explicitly provided, use an empty baseline instead of trying to load `HEAD`.

abb6f72a

2016-06-14T11:42:00

Merge pull request #3812 from stinb/fetch-tag-update-callback fetch: Fixed spurious update callback for existing tags.

7f9673e4

2016-06-14T14:46:12

fetch: Fixed spurious update callback for existing tags.

2a09de91

2016-06-14T04:33:55

Merge pull request #3816 from pks-t/pks/memory-leaks Memory leak fixes

43c55111

2016-06-07T14:14:07

winhttp: plug several memory leaks

432af52b

2016-06-07T12:55:17

global: clean up crt only after freeing tls data The thread local storage is used to hold some global state that is dynamically allocated and should be freed upon exit. On Windows, we clean up the C run-time right after execution of registered shutdown callbacks and before cleaning up the TLS. When we clean up the CRT, we also cause it to analyze for memory leaks. As we did not free the TLS yet this will lead to false positives. Fix the issue by first freeing the TLS and cleaning up the CRT only afterwards.

13deb874

2016-06-07T08:35:26

index: fix NULL pointer access in index_remove_entry When removing an entry from the index by its position, we first retrieve the position from the index's entries and then try to remove the retrieved value from the index map with `DELETE_IN_MAP`. When `index_remove_entry` returns `NULL` we try to feed it into the `DELETE_IN_MAP` macro, which will unconditionally call `idxentry_hash` and then happily dereference the `NULL` entry pointer. Fix the issue by not passing a `NULL` entry into `DELETE_IN_MAP`.

7d02019a

2016-06-06T12:59:17

transports: smart: fix potential invalid memory dereferences When we receive a packet of exactly four bytes encoding its length as those four bytes it can be treated as an empty line. While it is not really specified how those empty lines should be treated, we currently ignore them and do not return an error when trying to parse it but simply advance the data pointer. Callers invoking `git_pkt_parse_line` are currently not prepared to handle this case as they do not explicitly check this case. While they could always reset the passed out-pointer to `NULL` before calling `git_pkt_parse_line` and determine if the pointer has been set afterwards, it makes more sense to update `git_pkt_parse_line` to set the out-pointer to `NULL` itself when it encounters such an empty packet. Like this it is guaranteed that there will be no invalid memory references to free'd pointers. As such, the issue has been fixed such that `git_pkt_parse_line` always sets the packet out pointer to `NULL` when an empty packet has been received and callers check for this condition, skipping such packets.

46082c38

2016-06-02T02:34:03

index_read_index: invalidate new paths in tree cache When adding a new entry to an existing index via `git_index_read_index`, be sure to remove the tree cache entry for that new path. This will mark all parent trees as dirty.

9167c145

2016-06-02T01:04:58

index_read_index: set flags for path_len correctly Update the flags to reset the path_len (to emulate `index_insert`)

046ec3c9

2016-06-02T00:47:51

index_read_index: differentiate on mode Treat index entries with different modes as different, which they are, at least for the purposes of up-to-date calculations.

93de20b8

2016-06-01T14:56:27

index_read_index: reset error correctly Clear any error state upon each iteration. If one of the iterations ends (with an error of `GIT_ITEROVER`) we need to reset that error to 0, lest we stop the whole process prematurely.

14cf05da

2016-05-26T12:52:29

win32: clean up unused warnings in DllMain

4505a42a

2016-05-26T12:42:43

rebase: change assertion to avoid It looks like we're getting the operation and not doing anything with it, when in fact we are asserting that it's not null. Simply assert that we are within the operation boundary instead of using the `git_array_get` macro to do this for us.

e3c42fee

2016-05-26T12:39:09

filebuf: fix uninitialized warning

thodg/libgit2/src

src

Log