|
9bc8be3d
|
2013-02-19T10:25:41
|
|
Refine pluggable similarity API
This plugs in the three basic similarity strategies for handling
whitespace via internal use of the pluggable API. In so doing, I
realized that the use of git_buf in the hashsig API was not needed
and actually just made it harder to use, so I tweaked that API as
well.
Note that the similarity metric is still not hooked up in the
find_similarity code - this is just setting out the function that
will be used.
|
|
aa643260
|
2013-02-15T11:08:02
|
|
More tests of file signatures with whitespace opts
Seems to be working pretty well...
|
|
9c454b00
|
2013-01-11T22:13:02
|
|
Initial implementation of similarity scoring algo
This adds a new `git_buf_text_hashsig` type and functions to
generate these hash signatures and compare them to give a
similarity score. This can be plugged into diff similarity
scoring.
|
|
5e5848eb
|
2013-02-14T17:25:10
|
|
Change similarity metric to sampled hashes
This moves the similarity metric code out of buf_text and into a
new file. Also, this implements a different approach to similarity
measurement based on a Rabin-Karp rolling hash where we only keep
the top 100 and bottom 100 hashes. In theory, that should be
sufficient samples to given a fairly accurate measurement while
limiting the amount of data we keep for file signatures no matter
how large the file is.
|
|
56543a60
|
2013-02-15T16:02:45
|
|
Clear up warnings from cppcheck
The cppcheck static analyzer generates warnings for a bunch of
places in the libgit2 code base. All the ones fixed in this
commit are actually false positives, but I've reorganized the
code to hopefully make it easier for static analysis tools to
correctly understand the structure. I wouldn't do this if I
felt like it was making the code harder to read or worse for
humans, but in this case, these fixes don't seem too bad and will
hopefully make it easier for better analysis tools to get at any
real issues.
|
|
8c29dca6
|
2013-02-11T09:25:57
|
|
Fix some incorrect MSVC #ifdef's. Fixes #1305
|
|
17c92bea
|
2013-01-29T12:13:24
|
|
Test buf join with NULL behavior explicitly
|
|
0d52cb4a
|
2013-01-24T00:09:55
|
|
opts: Some basic tests
|
|
f63d0ee9
|
2013-01-17T15:47:10
|
|
Move all non-ascii test data to raw hex
This takes all of the characters in core::env and makes them use
hex sequences instead of keeping tricky character data inline in
the test.
|
|
0d65acad
|
2013-01-11T11:24:26
|
|
Match binary file check of core git in diff
Core git just looks for NUL bytes in files when deciding about
is-binary inside diff (although it uses a better algorithm in
checkout, when deciding if CRLF conversion should be done).
Libgit2 was using the better algorithm in both places, but that
is causing some confusion. For now, this makes diff just look
for NUL bytes to decide if a file is binary by content in diff.
|
|
b8a1ea7c
|
2013-01-03T11:04:03
|
|
Fix core::env cleanup code
Mark fake home directories that failed to be created, so we won't
try to remove them and have cleanup just use p_rmdir.
|
|
600d8dbf
|
2013-01-03T09:10:38
|
|
Move test cleanup into cleanup functions
|
|
6fef1ab3
|
2013-01-03T07:47:51
|
|
Tests should clean up after themselves
|
|
50a762a5
|
2012-12-26T12:03:07
|
|
path: Teach UNC paths to git_path_dirname_r()
Fix libgit2/libgit2sharp#256
|
|
34b6f05f
|
2012-12-26T11:59:07
|
|
path: enhance git_path_dirname_r() test coverage
|
|
7bf87ab6
|
2012-11-28T09:58:48
|
|
Consolidate text buffer functions
There are many scattered functions that look into the contents of
buffers to do various text manipulations (such as escaping or
unescaping data, calculating text stats, guessing if content is
binary, etc). This groups all those functions together into a
new file and converts the code to use that.
This has two enhancements to existing functionality. The old
text stats function is significantly rewritten and the BOM
detection code was extended (although largely we can't deal with
anything other than a UTF8 BOM).
|
|
16248ee2
|
2012-11-21T11:03:07
|
|
Fix up some missing consts in tree & index
This fixes some missed places where we can apply const-ness to
various public APIs.
There are still some index and tree APIs that cannot take const
pointers because we sort our `git_vectors` lazily and so we can't
reliably bsearch the index and tree content without applying a
`git_vector_sort()` first.
This also fixes some missed places where size_t can be used and
where const can be applied to a couple internal functions.
|
|
9094d30b
|
2012-11-23T11:41:56
|
|
Reset all static variables to NULL in clar's __cleanup
Without this change, any failed assertion in the second (or a later) test
inside a test suite has a chance of double deleting memory, resulting in
a heap corruption. See #1096 for details.
This leaves alone the test cases where we "just" use cl_git_sandbox_init()
and cl_git_sandbox_cleanup(). These methods already take good care to not
double delete a repository.
Fixes #1096
|
|
e566b609
|
2012-11-20T00:57:56
|
|
Update clar tests p_lstat_posixly and p_lstat
|
|
0e95e70a
|
2012-11-17T05:22:39
|
|
env: ensure git_futils_find_xxx() returns ENOTFOUND
|
|
cccacac5
|
2012-11-14T22:41:51
|
|
Add POSIX compat lstat() variant for win32
The existing p_lstat implementation on win32 is not quite POSIX
compliant when setting errno to ENOTDIR. This adds an option to
make is be compliant so that code (such as checkout) that cares
to have separate behavior for ENOTDIR can use it portably.
This also contains a couple of other minor cleanups in the
posix_w32.c implementations to avoid unnecessary work.
|
|
331e7de9
|
2012-10-24T17:32:50
|
|
Extensions to rmdir and mkdir utilities
* Rework GIT_DIRREMOVAL values to GIT_RMDIR flags, allowing
combinations of flags
* Add GIT_RMDIR_EMPTY_PARENTS flag to remove parent dirs that
are left empty after removal
* Add GIT_MKDIR_VERIFY_DIR to give an error if item is a file,
not a dir (previously an EEXISTS error was ignored, even for
files) and enable this flag for git_futils_mkpath2file call
* Improve accuracy of error messages from git_futils_mkdir
|
|
f45ec1a0
|
2012-10-29T20:04:21
|
|
index refactoring
|
|
0d422ec9
|
2012-10-19T15:40:43
|
|
Fix env variable tests with new Win32 path rules
The new Win32 global path search was not working with the
environment variable tests. But when I fixed the test, the new
codes use of getenv() was causing more failures (presumably because
of caching on Windows ???). This fixes the global file lookup to
always go directly to the Win32 API in a predictable way.
|
|
4c47a8bc
|
2012-10-17T14:14:51
|
|
Merge pull request #968 from arrbee/diff-support-typechange
Support TYPECHANGE records in status and adjust checkout accordingly
|
|
18217e7e
|
2012-10-16T19:34:29
|
|
test: Don't be so picky with failed lookups
Not found means not found, and the other way around.
|
|
2d3579be
|
2012-10-10T14:54:31
|
|
Add git_buf_put_base64 to buffer API
|
|
0d64bef9
|
2012-10-05T15:56:57
|
|
Add complex checkout test and then fix checkout
This started as a complex new test for checkout going through the
"typechanges" test repository, but that revealed numerous issues
with checkout, including:
* complete failure with submodules
* failure to create blobs with exec bits
* problems when replacing a tree with a blob because the tree
"example/" sorts after the blob "example" so the delete was
being processed after the single file blob was created
This fixes most of those problems and includes a number of other
minor changes that made it easier to do that, including improving
the TYPECHANGE support in diff/status, etc.
|
|
1a628100
|
2012-09-21T15:04:39
|
|
Make giterr_set_str public
There has been discussion for a while about making some set of
the `giterr_set` type functions part of the public API for code
that is implementing new backends to libgit2. This makes the
`giterr_set_str()` and `giterr_set_oom()` functions public.
|
|
2eb4edf5
|
2012-08-24T10:48:48
|
|
Fix errors on Win32 with new repo init
|
|
e9ca852e
|
2012-08-23T09:20:17
|
|
Fix warnings and merge issues on Win64
|
|
85bd1746
|
2012-08-22T16:03:35
|
|
Some cleanup suggested during review
This cleans up a number of items suggested during code review
with @vmg, including:
* renaming "outside repo" config API to `git_config_open_default`
* killing the `git_config_open_global` API
* removing the `git_` prefix from the static functions in fileops
* removing some unnecessary functionality from the "cp" command
|
|
b769e936
|
2012-08-01T14:49:47
|
|
Don't reference stack vars in cleanup callback
If you use the clar cleanup callback function, you can't pass a
reference pointer to a stack allocated variable because when the
cleanup function runs, the stack won't exist anymore.
|
|
ca1b6e54
|
2012-07-31T17:02:54
|
|
Add template dir and set gid to repo init
This extends git_repository_init_ext further with support for
initializing the repository from an external template directory
and with support for the "create shared" type flags that make a
set GID repository directory.
This also adds tests for much of the new functionality to the
existing `repo/init.c` test suite.
Also, this adds a bunch of new utility functions including a
very general purpose `git_futils_mkdir` (with the ability to
make paths and to chmod the paths post-creation) and a file
tree copying function `git_futils_cp_r`. Also, this includes
some new path functions that were useful to keep the code
simple.
|
|
02a0d651
|
2012-07-12T16:31:59
|
|
Add git_buf_unescape and git__unescape to unescape all characters in a string (in-place)
|
|
465092ce
|
2012-07-12T11:56:50
|
|
Fix memory leak in test
|
|
b0fe1129
|
2012-07-10T15:13:30
|
|
Add path utilities to resolve relative paths
This makes it easy to take a buffer containing a path with relative
references (i.e. .. or . path segments) and resolve all of those
into a clean path. This can be applied to URLs as well as file
paths which can be useful.
As part of this, I made the drive-letter detection apply on all
platforms, not just windows. If you give a path that looks like
"c:/..." on any platform, it seems like we might as well detect
that as a rooted path. I suppose if you create a directory named
"x:" on another platform and want to use that as the beginning
of a relative path under the root directory of your repo, this
could cause a problem, but then it seems like you're asking for
trouble.
|
|
039fc406
|
2012-07-10T15:10:14
|
|
Add a couple of useful git_buf utilities
* `git_buf_rfind` (with tests and tests for `git_buf_rfind_next`)
* `git_buf_puts_escaped` and `git_buf_puts_escaped_regex` (with tests)
to copy strings into a buffer while injecting an escape sequence
(e.g. '\') in front of particular characters.
|
|
cdca82c7
|
2012-06-20T00:46:26
|
|
Plug a few leaks
|
|
e272efcb
|
2012-06-08T11:24:37
|
|
Tests: wrap 'getenv' and friends for Win32 tests.
|
|
3f035860
|
2012-06-07T22:43:03
|
|
misc: Fix warnings from PVS Studio trial
|
|
6654dbe3
|
2012-06-07T14:09:25
|
|
tests: fix assertion
|
|
dbab0459
|
2012-05-26T14:59:07
|
|
tests-clar/core: fix non-null warning
gcc 4.7.0 apparently doesn't see that we won't call setenv with NULL as
second argument.
|
|
29ef309e
|
2012-05-25T09:44:56
|
|
Make errors for system and global files consistent
The error codes from failed lookups of system and global files
on Windows were not consistent with the codes returned on other
platforms. This makes the error detection patterns match and
adds a unit test for the various errors.
|
|
2a99df69
|
2012-05-24T17:14:56
|
|
Fix bugs for status with spaces and reloaded attrs
This fixes two bugs:
* Issue #728 where git_status_file was not working for files
that contain spaces. This was caused by reusing the "fnmatch"
parsing code from ignore and attribute files to interpret the
"pathspec" that constrained the files to apply the status to.
In that code, unescaped whitespace was considered terminal to
the pattern, so a file with internal whitespace was excluded
from the matched files. The fix was to add a mode to that code
that allows spaces and tabs inside patterns. This mode only
comes into play when parsing in-memory strings.
* The other issue was undetected, but it was in the recently
added code to reload gitattributes / gitignores when they were
changed on disk. That code was not clearing out the old values
from the cached file content before reparsing which meant that
newly added patterns would be read in, but deleted patterns
would not be removed. The fix was to clear the vector of
patterns in a cached file before reparsing the file.
|
|
9cde607c
|
2012-05-24T15:08:55
|
|
Clean up system file finding tests on Win32
|
|
9e35d7fd
|
2012-05-24T13:44:24
|
|
Fix bugs in UTF-8 <-> UTF-16 conversion
The function to convert UTF-16 to UTF-8 was only allocating a
buffer of wcslen(utf16str) bytes for the UTF-8 string, but that
is not sufficient if you have multibyte characters, and so when
those occured, the conversion was failing. This updates the
conversion functions to use the Win APIs to calculate the correct
buffer lengths.
Also fixes a comparison in the unit tests that would fail if
you did not have a particular environment variable set.
|
|
23059130
|
2012-05-24T12:45:20
|
|
Get user's home dir in UTF-16 clean manner
On Windows, we are having problems with home directories
that have non-ascii characters in them. This rewrites the
relevant code to fetch environment variables as UTF-16 and
then explicitly map then into UTF-8 for our internal usage.
|
|
904b67e6
|
2012-05-18T01:48:50
|
|
errors: Rename error codes
|
|
e172cf08
|
2012-05-18T01:21:06
|
|
errors: Rename the generic return codes
|
|
41a82592
|
2012-05-15T14:17:39
|
|
Ranged iterators and rewritten git_status_file
The goal of this work is to rewrite git_status_file to use the
same underlying code as git_status_foreach.
This is done in 3 phases:
1. Extend iterators to allow ranged iteration with start and
end prefixes for the range of file names to be covered.
2. Improve diff so that when there is a pathspec and there is
a common non-wildcard prefix of the pathspec, it will use
ranged iterators to minimize excess iteration.
3. Rewrite git_status_file to call git_status_foreach_ext
with a pathspec that covers just the one file being checked.
Since ranged iterators underlie the status & diff implementation,
this is actually fairly efficient. The workdir iterator does
end up loading the contents of all the directories down to the
single file, which should ideally be avoided, but it is pretty
good.
|
|
212eb09d
|
2012-05-13T23:12:51
|
|
Add a test to verify FILENAME_MAX
Since we now rely on it (at least under Solaris), I figured we probably
want to make sure it's accurate. The new test makes sure that creating a
file with a name of length FILENAME_MAX+1 fails.
|
|
9abb5bca
|
2012-05-07T13:58:01
|
|
compat: make p_realpath Windows implementation be a bit more POSIX compliant and fail if the provided path does not lead to an existing entry
|
|
3fbcac89
|
2012-05-02T19:56:38
|
|
Remove old and unused error codes
|
|
c2b67043
|
2012-04-25T15:20:28
|
|
Rename git_khash_str to git_strmap, etc.
This renamed `git_khash_str` to `git_strmap`, `git_hash_oid` to
`git_oidmap`, and deletes `git_hashtable` from the tree, plus
adds unit tests for `git_strmap`.
|
|
19fa2bc1
|
2012-04-17T15:12:50
|
|
Convert attrs and diffs to use string pools
This converts the git attr related code (including ignores) and
the git diff related code (and implicitly the status code) to use
`git_pools` for storing strings. This reduces the number of small
blocks allocated dramatically.
|
|
2bc8fa02
|
2012-04-17T10:14:24
|
|
Implement git_pool paged memory allocator
This adds a `git_pool` object that can do simple paged memory
allocation with free for the entire pool at once. Using this,
you can replace many small allocations with large blocks that
can then cheaply be doled out in small pieces. This is best
used when you plan to free the small blocks all at once - for
example, if they represent the parsed state from a file or data
stream that are either all kept or all discarded.
There are two real patterns of usage for `git_pools`: either
for "string" allocation, where the item size is a single byte
and you end up just packing the allocations in together, or for
"fixed size" allocation where you are allocating a large object
(e.g. a `git_oid`) and you generally just allocation single
objects that can be tightly packed. Of course, you can use it
for other things, but those two cases are the easiest.
|
|
1a6e8f8a
|
2012-04-13T10:42:00
|
|
Update clar and remove old helpers
This updates to the latest clar which includes the helpers
`cl_assert_equal_s` and `cl_assert_equal_i`. Convert the code
over to use those and remove the old libgit2-only helpers.
|
|
0a20eee9
|
2012-04-11T03:43:30
|
|
Merge pull request #619 from nulltoken/topic/branches
Basic branch management API
|
|
555aa453
|
2012-04-09T02:28:31
|
|
fileops: Make git_futils_mkdir_r() able to skip non-empty directories
|
|
17bd6de3
|
2012-04-04T13:59:58
|
|
Fix MSVC "unreferenced local variable" compilation warning.
|
|
a4c291ef
|
2012-03-20T21:57:38
|
|
Convert reflog to new errors
Cleaned up some other issues.
|
|
7c7ff7d1
|
2012-03-19T16:10:11
|
|
Migrate index, oid, and utils to new errors
This includes a few cleanups that came up while converting
these files.
This commit introduces a could new git error classes, including
the catchall class: GITERR_INVALID which I'm using as the class
for invalid and out of range values which are detected at too low
a level of library to use a higher level classification. For
example, an overflow error in parsing an integer or a bad letter
in parsing an OID string would generate an error in this class.
|
|
0d0fa7c3
|
2012-03-16T15:56:01
|
|
Convert attr, ignore, mwindow, status to new errors
Also cleaned up some previously converted code that still had
little things to polish.
|
|
7b93079b
|
2012-03-16T15:16:52
|
|
Make git_path_root() cope with windows network paths
Fix libgit2/libgit2sharp#125
|
|
e3c47510
|
2012-03-13T14:23:24
|
|
Resolve comments from pull request
This converts the map validation function into a macro, tweaks
the GITERR_OS system error automatic appending, and adds a
tentative new error access API and some quick unit tests for
both the old and new error APIs.
|
|
1a481123
|
2012-02-17T00:13:34
|
|
error-handling: References
Yes, this is error handling solely for `refs.c`, but some of the
abstractions leak all ofer the code base.
|
|
854eccbb
|
2012-02-29T12:04:59
|
|
Clean up GIT_UNUSED macros on all platforms
It turns out that commit 31e9cfc4cbcaf1b38cdd3dbe3282a8f57e5366a5
did not fix the GIT_USUSED behavior on all platforms. This commit
walks through and really cleans things up more thoroughly, getting
rid of the unnecessary stuff.
To remove the use of some GIT_UNUSED, I ended up adding a couple
of new iterators for hashtables that allow you to iterator just
over keys or just over values.
In making this change, I found a bug in the clar tests (where we
were doing *count++ but meant to do (*count)++ to increment the
value). I fixed that but then found the test failing because it
was not really using an empty repo. So, I took some of the code
that I wrote for iterator testing and moved it to clar_helpers.c,
then made use of that to make it easier to open fixtures on a
per test basis even within a single test file.
|
|
2705576b
|
2012-01-24T14:06:42
|
|
Simplify GIT_UNUSED macros
Since casting to void works to eliminate errors with unused
parameters on all platforms, avoid the various special cases.
Over time, it will make sense to eliminate the GIT_UNUSED
macro completely and just have GIT_UNUSED_ARG.
|
|
13224ea4
|
2012-02-27T04:28:31
|
|
buffer: Unify `git_fbuffer` and `git_buf`
This makes so much sense that I can't believe it hasn't been done
before. Kill the old `git_fbuffer` and read files straight into
`git_buf` objects.
Also: In order to fully support 4GB files in 32-bit systems, the
`git_buf` implementation has been changed from using `ssize_t` for
storage and storing negative values on allocation failure, to using
`size_t` and changing the buffer pointer to a magical pointer on
allocation failure.
Hopefully this won't break anything.
|
|
3fd1520c
|
2012-01-24T20:35:15
|
|
Rename the Clay test suite to Clar
Clay is the name of a programming language on the makings, and we want
to avoid confusions. Sorry for the huge diff!
|