|
3658e81e
|
2013-03-25T14:20:07
|
|
Move crlf conversion into buf_text
This adds crlf/lf conversion functions into buf_text with more
efficient implementations that bypass the high level buffer
functions. They attempt to minimize the number of reallocations
done and they directly write the buffer data as needed if they
know that there is enough memory allocated to memcpy data.
Tests are added for these new functions. The crlf.c code is
updated to use the new functions.
Removed the include of buf_text.h from filter.h and just include
it more narrowly in the places that need it.
|
|
9bc8be3d
|
2013-02-19T10:25:41
|
|
Refine pluggable similarity API
This plugs in the three basic similarity strategies for handling
whitespace via internal use of the pluggable API. In so doing, I
realized that the use of git_buf in the hashsig API was not needed
and actually just made it harder to use, so I tweaked that API as
well.
Note that the similarity metric is still not hooked up in the
find_similarity code - this is just setting out the function that
will be used.
|
|
aa643260
|
2013-02-15T11:08:02
|
|
More tests of file signatures with whitespace opts
Seems to be working pretty well...
|
|
5e5848eb
|
2013-02-14T17:25:10
|
|
Change similarity metric to sampled hashes
This moves the similarity metric code out of buf_text and into a
new file. Also, this implements a different approach to similarity
measurement based on a Rabin-Karp rolling hash where we only keep
the top 100 and bottom 100 hashes. In theory, that should be
sufficient samples to given a fairly accurate measurement while
limiting the amount of data we keep for file signatures no matter
how large the file is.
|
|
9c454b00
|
2013-01-11T22:13:02
|
|
Initial implementation of similarity scoring algo
This adds a new `git_buf_text_hashsig` type and functions to
generate these hash signatures and compare them to give a
similarity score. This can be plugged into diff similarity
scoring.
|
|
17c92bea
|
2013-01-29T12:13:24
|
|
Test buf join with NULL behavior explicitly
|
|
0d65acad
|
2013-01-11T11:24:26
|
|
Match binary file check of core git in diff
Core git just looks for NUL bytes in files when deciding about
is-binary inside diff (although it uses a better algorithm in
checkout, when deciding if CRLF conversion should be done).
Libgit2 was using the better algorithm in both places, but that
is causing some confusion. For now, this makes diff just look
for NUL bytes to decide if a file is binary by content in diff.
|
|
7bf87ab6
|
2012-11-28T09:58:48
|
|
Consolidate text buffer functions
There are many scattered functions that look into the contents of
buffers to do various text manipulations (such as escaping or
unescaping data, calculating text stats, guessing if content is
binary, etc). This groups all those functions together into a
new file and converts the code to use that.
This has two enhancements to existing functionality. The old
text stats function is significantly rewritten and the BOM
detection code was extended (although largely we can't deal with
anything other than a UTF8 BOM).
|
|
2d3579be
|
2012-10-10T14:54:31
|
|
Add git_buf_put_base64 to buffer API
|
|
e9ca852e
|
2012-08-23T09:20:17
|
|
Fix warnings and merge issues on Win64
|
|
02a0d651
|
2012-07-12T16:31:59
|
|
Add git_buf_unescape and git__unescape to unescape all characters in a string (in-place)
|
|
465092ce
|
2012-07-12T11:56:50
|
|
Fix memory leak in test
|
|
039fc406
|
2012-07-10T15:10:14
|
|
Add a couple of useful git_buf utilities
* `git_buf_rfind` (with tests and tests for `git_buf_rfind_next`)
* `git_buf_puts_escaped` and `git_buf_puts_escaped_regex` (with tests)
to copy strings into a buffer while injecting an escape sequence
(e.g. '\') in front of particular characters.
|
|
41a82592
|
2012-05-15T14:17:39
|
|
Ranged iterators and rewritten git_status_file
The goal of this work is to rewrite git_status_file to use the
same underlying code as git_status_foreach.
This is done in 3 phases:
1. Extend iterators to allow ranged iteration with start and
end prefixes for the range of file names to be covered.
2. Improve diff so that when there is a pathspec and there is
a common non-wildcard prefix of the pathspec, it will use
ranged iterators to minimize excess iteration.
3. Rewrite git_status_file to call git_status_foreach_ext
with a pathspec that covers just the one file being checked.
Since ranged iterators underlie the status & diff implementation,
this is actually fairly efficient. The workdir iterator does
end up loading the contents of all the directories down to the
single file, which should ideally be avoided, but it is pretty
good.
|
|
1a6e8f8a
|
2012-04-13T10:42:00
|
|
Update clar and remove old helpers
This updates to the latest clar which includes the helpers
`cl_assert_equal_s` and `cl_assert_equal_i`. Convert the code
over to use those and remove the old libgit2-only helpers.
|
|
a4c291ef
|
2012-03-20T21:57:38
|
|
Convert reflog to new errors
Cleaned up some other issues.
|
|
13224ea4
|
2012-02-27T04:28:31
|
|
buffer: Unify `git_fbuffer` and `git_buf`
This makes so much sense that I can't believe it hasn't been done
before. Kill the old `git_fbuffer` and read files straight into
`git_buf` objects.
Also: In order to fully support 4GB files in 32-bit systems, the
`git_buf` implementation has been changed from using `ssize_t` for
storage and storing negative values on allocation failure, to using
`size_t` and changing the buffer pointer to a magical pointer on
allocation failure.
Hopefully this won't break anything.
|
|
3fd1520c
|
2012-01-24T20:35:15
|
|
Rename the Clay test suite to Clar
Clay is the name of a programming language on the makings, and we want
to avoid confusions. Sorry for the huge diff!
|