kmx git

Commit	Date	Message
1fed6b07	2013-05-13T21:57:37	Fix trailing whitespaces
e40f1c2d	2013-03-08T16:39:57	Make tree iterator handle icase equivalence There is a serious bug in the previous tree iterator implementation. If case insensitivity resulted in member elements being equivalent to one another, and those member elements were trees, then the children of the colliding elements would be processed in sequence instead of in a single flattened list. This meant that the tree iterator was not truly acting like a case-insensitive list. This completely reworks the tree iterator to manage lists with case insensitive equivalence classes and advance through the items in a unified manner in a single sorted frame. It is possible that at a future date we might want to update this to separate the case insensitive and case sensitive tree iterators so that the case sensitive one could be a minimal amount of code and the insensitive one would always know what it needed to do without checking flags. But there would be so much shared code between the two, that I'm not sure it that's a win. For now, this gets what we need. More tests are needed, though.
97b71374	2013-02-28T14:14:45	Add GIT_STDLIB_CALL This removes the one-off GIT_CDECL and adds a new standard way of doing this named GIT_STDLIB_CALL with a src/win32 specific def when on the Windows platform.
f708c89f	2013-02-27T15:15:39	fixing some warnings on Windows
11b5beb7	2013-02-27T15:07:28	use cdecl for hashsig sorting functions on Windows
9bc8be3d	2013-02-19T10:25:41	Refine pluggable similarity API This plugs in the three basic similarity strategies for handling whitespace via internal use of the pluggable API. In so doing, I realized that the use of git_buf in the hashsig API was not needed and actually just made it harder to use, so I tweaked that API as well. Note that the similarity metric is still not hooked up in the find_similarity code - this is just setting out the function that will be used.
5e5848eb	2013-02-14T17:25:10	Change similarity metric to sampled hashes This moves the similarity metric code out of buf_text and into a new file. Also, this implements a different approach to similarity measurement based on a Rabin-Karp rolling hash where we only keep the top 100 and bottom 100 hashes. In theory, that should be sufficient samples to given a fairly accurate measurement while limiting the amount of data we keep for file signatures no matter how large the file is.

1fed6b07

2013-05-13T21:57:37

Fix trailing whitespaces

e40f1c2d

2013-03-08T16:39:57

Make tree iterator handle icase equivalence There is a serious bug in the previous tree iterator implementation. If case insensitivity resulted in member elements being equivalent to one another, and those member elements were trees, then the children of the colliding elements would be processed in sequence instead of in a single flattened list. This meant that the tree iterator was not truly acting like a case-insensitive list. This completely reworks the tree iterator to manage lists with case insensitive equivalence classes and advance through the items in a unified manner in a single sorted frame. It is possible that at a future date we might want to update this to separate the case insensitive and case sensitive tree iterators so that the case sensitive one could be a minimal amount of code and the insensitive one would always know what it needed to do without checking flags. But there would be so much shared code between the two, that I'm not sure it that's a win. For now, this gets what we need. More tests are needed, though.

97b71374

2013-02-28T14:14:45

Add GIT_STDLIB_CALL This removes the one-off GIT_CDECL and adds a new standard way of doing this named GIT_STDLIB_CALL with a src/win32 specific def when on the Windows platform.

f708c89f

2013-02-27T15:15:39

fixing some warnings on Windows

11b5beb7

2013-02-27T15:07:28

use cdecl for hashsig sorting functions on Windows

9bc8be3d

2013-02-19T10:25:41

Refine pluggable similarity API This plugs in the three basic similarity strategies for handling whitespace via internal use of the pluggable API. In so doing, I realized that the use of git_buf in the hashsig API was not needed and actually just made it harder to use, so I tweaked that API as well. Note that the similarity metric is still not hooked up in the find_similarity code - this is just setting out the function that will be used.

5e5848eb

2013-02-14T17:25:10

Change similarity metric to sampled hashes This moves the similarity metric code out of buf_text and into a new file. Also, this implements a different approach to similarity measurement based on a Rabin-Karp rolling hash where we only keep the top 100 and bottom 100 hashes. In theory, that should be sufficient samples to given a fairly accurate measurement while limiting the amount of data we keep for file signatures no matter how large the file is.

thodg/libgit2/src/hashsig.c

src/hashsig.c

Log