src/crlf.c


Log

Author Commit Date CI Message
Peter Pettersson 7dcc29fc 2021-10-22T22:51:59 Make enum in src,tests and examples C90 compliant by removing trailing comma.
Edward Thomson f0e693b1 2021-09-07T17:53:49 str: introduce `git_str` for internal, `git_buf` is external libgit2 has two distinct requirements that were previously solved by `git_buf`. We require: 1. A general purpose string class that provides a number of utility APIs for manipulating data (eg, concatenating, truncating, etc). 2. A structure that we can use to return strings to callers that they can take ownership of. By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult. Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr"). The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.) Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.
Edward Thomson c1f4f45e 2021-08-27T16:43:00 crlf: use streaming filters
Edward Thomson d525e063 2021-05-10T23:04:59 buf: remove internal `git_buf_text` namespace The `git_buf_text` namespace is unnecessary and strange. Remove it, just keep the functions prefixed with `git_buf`.
Edward Thomson 4334b177 2019-06-23T15:43:38 blob: use `git_object_size_t` for object size Instead of using a signed type (`off_t`) use a new `git_object_size_t` for the sizes of objects.
Patrick Steinhardt e54343a4 2019-06-29T09:17:32 fileops: rename to "futils.h" to match function signatures Our file utils functions all have a "futils" prefix, e.g. `git_futils_touch`. One would thus naturally guess that their definitions and implementation would live in files "futils.h" and "futils.c", respectively, but in fact they live in "fileops.h". Rename the files to match expectations.
Patrick Steinhardt 658022c4 2019-07-18T13:53:41 configuration: cvar -> configmap `cvar` is an unhelpful name. Refactor its usage to `configmap` for more clarity.
Edward Thomson 91a300b7 2019-06-16T00:46:30 attr: rename constants and macros for consistency Our enumeration values are not generally suffixed with `T`. Further, our enumeration names are generally more descriptive.
Edward Thomson f673e232 2018-12-27T13:47:34 git_error: use new names in internal APIs and usage Move to the `git_error` name in the internal API for error-related functions.
Sven Strickroth 45001906 2019-01-07T16:14:51 Fix warning 'function': incompatible types - from 'git_cvar_value *' to 'int *' (C4133) on VS Signed-off-by: Sven Strickroth <email@cs-ware.de>
Edward Thomson ef8f8ec6 2018-12-03T13:35:30 crlf: update to match git's logic Examine the recent CRLF changes to git by Torsten Bögershausen and include similar changes to update our CRLF logic to match. Note: Torsten Bögershausen has previously agreed to allow his changes to be included in libgit2.
Edward Thomson 99ec4fdb 2018-04-17T20:06:30 crlf: wrap line
Sven Strickroth a5115842 2017-01-28T18:31:11 crlf: update checkout logic to reflect Git 2.9+ behaviour Signed-off-by: Sven Strickroth <email@cs-ware.de>
Patrick Steinhardt 0c7f49dd 2017-06-30T13:39:01 Make sure to always include "common.h" first Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.
Edward Thomson 909d5494 2016-12-29T12:25:15 giterr_set: consistent error messages Error messages should be sentence fragments, and therefore: 1. Should not begin with a capital letter, 2. Should not conclude with punctuation, and 3. Should not end a sentence and begin a new one
Edward Thomson df87648a 2016-07-24T16:10:30 crlf: set a safe crlf default
Patrick Steinhardt 2129d6df 2016-02-22T13:33:48 crlf: do not ignore GIT_PASSTHROUGH error When no payload is set for `crlf_apply` we try to compute the crlf attributes ourselves with `crlf_check`. When the function determines that the current file does not require any treatment we return the GIT_PASSTHROUGH error code without actually allocating the out-pointer, which indicates the file should not be passed through the filter. The `crlf_apply` function explicitly checks for the GIT_PASSTHROUGH return code and ignores it. This means we will try to apply the crlf-filter to the current file, leading us to dereference the unallocated payload-pointer. Fix this obviously incorrect behavior by not treating GIT_PASSTHROUGH in any special way. This is the correct thing to do anyway, as the code indicates that the file should not be passed through the filter.
Edward Thomson 146d0d08 2015-06-09T00:42:28 crlf: give Unix the glory of autocrlf=true Perform LF->CRLF for core.autocrlf=true on non-Win32 because core git does.
Edward Thomson 47e9a6cb 2015-06-08T15:58:54 crlf: use statistics to control to workdir filter Use statistics (like core git) to control the behavior of the to workdir CRLF filter.
Edward Thomson 795eaccd 2015-02-19T11:09:54 git_filter_opt_t -> git_filter_flag_t For consistency with the rest of the library, where an opt is an options *structure*.
Jacques Germishuys dfda1cf5 2014-12-27T21:04:28 Check for OOM
Edward Thomson d412165f 2014-06-18T16:54:32 Update text=auto / core.autocrlf=false behavior Git for Windows 1.9.4 changed the behavior when the text=auto attribute is specified and core.autocrlf=false. Previous observed behavior would *not* filter files when going into the working directory, the new behavior *does* filter. Update our behavior to match.
Russell Belfer c094197b 2014-05-19T15:05:39 Just don't CRLF filter if there are no CRs
Russell Belfer 16798d08 2014-05-19T14:57:09 Make core.safecrlf work on LF-ending platforms If you enabled core.safecrlf on an LF-ending platform, we would error even for files with all LFs. We should only be warning on irreversible mappings, I think.
Russell Belfer 5269008c 2014-05-06T16:01:49 Add filter options and ALLOW_UNSAFE Diff and status do not want core.safecrlf to actually raise an error regardless of the setting, so this extends the filter API with an additional options flags parameter and adds a flag so that filters can be applied with GIT_FILTER_OPT_ALLOW_UNSAFE, indicating that unsafe filter application should be downgraded from a failure to a warning.
Edward Thomson 7be1caf7 2014-04-07T21:41:36 Determine crlf safety by statistics, not literal reversibility
Edward Thomson 855c66de 2014-03-27T16:34:20 Introduce core.safecrlf handling
Edward Thomson b033f3a3 2014-02-11T16:52:08 Never convert CRLF->LF Core git performs no conversion on systems that use LF, emulate that.
Edward Thomson 66b2626c 2014-02-09T13:43:56 core.autocrlf=input w/ text=auto attr to workdir
Carlos Martín Nieto d541170c 2014-01-24T11:36:41 index: rename an entry's id to 'id' This was not converted when we converted the rest, so do it now.
Russell Belfer 634f10f6 2013-09-24T10:11:20 Fix incorrect return code in crlf filter The git_buf_text_gather_stats call returns a boolean indicating if the file looks like binary data. That shouldn't be an error; it should be used to skip CRLF processing though.
Russell Belfer eefc32d5 2013-09-16T12:54:40 Bug fixes and cleanups This contains a few bug fixes and some header and API cleanups. The main API change is that filters should now use GIT_PASSTHROUGH to indicate that they wish to skip processing a file instead of GIT_ENOTFOUND. The bug fixes include a possible out-of-range buffer access in the ident filter, a filter ordering problem I introduced into the custom filter tests on Windows, and a filter buf NUL termination issue that was coming up on Linux.
Russell Belfer 0e32635f 2013-09-12T14:47:15 Move binary check to CRLF filter itself Checkout should not reject binary files from filters, as a filter may actually wish to operate on binary files. The CRLF filter should reject binary files itself if it wishes to. Moreover, the CRLF filter requires this logic so that users can emulate the checkout data in their odb -> workdir filtering. Conflicts: src/checkout.c src/crlf.c
Russell Belfer a9f51e43 2013-09-11T22:00:36 Merge git_buf and git_buffer This makes the git_buf struct that was used internally into an externally available structure and eliminates the git_buffer. As part of that, some of the special cases that arose with the externally used git_buffer were blended into the git_buf, such as being careful about git_buf objects that may have a NULL ptr and allowing for bufs with a valid ptr and size but zero asize as a way of referring to externally owned data.
Russell Belfer 4b11f25a 2013-09-11T16:38:33 Add ident filter This adds the ident filter (that knows how to replace $Id$) and tweaks the filter APIs and code so that git_filter_source objects actually have the updated OID of the object being filtered when it is a known value.
Russell Belfer 0646634e 2013-09-11T12:45:37 Update filter registry code This updates the git filter registry to be a little cleaner and plugs some memory leaks.
Russell Belfer 40cb40fa 2013-09-11T14:23:39 Add functions to manipulate filter lists Extend the git2/sys/filter API with functions to look up a filter and add it manually to a filter list. This requires some trickery because the regular attribute lookups and checks are bypassed when this happens, but in the right hands, it will allow a user to have granular control over applying filters.
Russell Belfer 2a7d224f 2013-09-10T16:33:32 Extend public filter api with filter lists This moves the git_filter_list into the public API so that users can create, apply, and dispose of filter lists. This allows more granular application of filters to user data outside of libgit2 internals. This also converts all the internal usage of filters to the public APIs along with a few small tweaks to make it easier to use the public git_buffer stuff alongside the internal git_buf.
Russell Belfer 974774c7 2013-09-09T16:57:34 Add attributes to filters and fix registry The filter registry as implemented was too primitive to actually work once multiple filters were coming into play. This expands the implementation of the registry to handle multiple prioritized filters correctly. Additionally, this adds an "attributes" field to a filter that makes it really really easy to implement filters that are based on one or more attribute values. The lookup and even simple value checking can all happen automatically without custom filter code. Lastly, with the registry improvements, this fills out the filter lifecycle callbacks, with initialize and shutdown callbacks that will be called before the filter is first used and after it is last invoked. This allows for system-wide initialization and cleanup by the filter.
Russell Belfer 570ba25c 2013-08-30T16:02:07 Make git_filter_source opaque
Russell Belfer 0cf77103 2013-08-26T23:17:07 Start of filter API + git_blob_filtered_content This begins the process of exposing git_filter objects to the public API. This includes: * new public type and API for `git_buffer` through which an allocated buffer can be passed to the user * new API `git_blob_filtered_content` * make the git_filter type and GIT_FILTER_TO_... constants public
Russell Belfer 85d54812 2013-08-28T16:44:04 Create public filter object and use it This creates include/sys/filter.h with a basic definition of a git_filter and then converts the internal code to use it. There are related internal objects (git_filter_list) that we will want to publish at some point, but this is a first step.
Russell Belfer 114f5a6c 2013-06-10T10:10:39 Reorganize diff and add basic diff driver This is a significant reorganization of the diff code to break it into a set of more clearly distinct files and to document the new organization. Hopefully this will make the diff code easier to understand and to extend. This adds a new `git_diff_driver` object that looks of diff driver information from the attributes and the config so that things like function content in diff headers can be provided. The full driver spec is not implemented in the commit - this is focused on the reorganization of the code and putting the driver hooks in place. This also removes a few #includes from src/repository.h that were overbroad, but as a result required extra #includes in a variety of places since including src/repository.h no longer results in pulling in the whole world.
Russell Belfer 3658e81e 2013-03-25T14:20:07 Move crlf conversion into buf_text This adds crlf/lf conversion functions into buf_text with more efficient implementations that bypass the high level buffer functions. They attempt to minimize the number of reallocations done and they directly write the buffer data as needed if they know that there is enough memory allocated to memcpy data. Tests are added for these new functions. The crlf.c code is updated to use the new functions. Removed the include of buf_text.h from filter.h and just include it more narrowly in the places that need it.
Russell Belfer 9733e80c 2013-03-22T10:44:45 Add has_cr_in_index check to CRLF filter This adds a check to the drop_crlf filter path to check it the file in the index already has a CR in it, in which case this will not drop the CRs from the workdir file contents. This uncovered a "bug" in `git_blob_create_fromworkdir` where the full path to the file was passed to look up the attributes instead of the relative path from the working directory root. This meant that the check in the index for a pre-existing entry of the same name was failing.
Edward Thomson 4a15ea86 2013-03-21T14:02:25 don't convert CRLF to CRCRLF
Edward Thomson 359fc2d2 2013-01-08T17:07:25 update copyrights
Russell Belfer 7bf87ab6 2012-11-28T09:58:48 Consolidate text buffer functions There are many scattered functions that look into the contents of buffers to do various text manipulations (such as escaping or unescaping data, calculating text stats, guessing if content is binary, etc). This groups all those functions together into a new file and converts the code to use that. This has two enhancements to existing functionality. The old text stats function is significantly rewritten and the BOM detection code was extended (although largely we can't deal with anything other than a UTF8 BOM).
Russell Belfer 47bfa0be 2012-09-07T13:27:49 Add git_repository_hashfile to hash with filters The existing `git_odb_hashfile` does not apply text filtering rules because it doesn't have a repository context to evaluate the correct rules to apply. This adds a new hashfile function that will apply repository-specific filters (based on config, attributes, and filename) before calculating the hash.
Russell Belfer f8e2cc9a 2012-08-31T15:53:47 Alternate test for autocrlf with status I couldn't get the last failing test to actually fail. This is a different test suggested by @nulltoken which should fail.
Ben Straub e4bac3c4 2012-07-31T15:38:12 Checkout: crlf filter.
Ben Straub 8651c10f 2012-07-17T19:57:37 Checkout: obey core.symlinks.
Ben Straub f2d42eea 2012-07-09T20:21:22 Checkout: add structure for CRLF.
Vicent Martí 904b67e6 2012-05-18T01:48:50 errors: Rename error codes
Vicent Martí e172cf08 2012-05-18T01:21:06 errors: Rename the generic return codes
Vicent Martí 29e948de 2012-05-10T10:38:10 global: Change parameter ordering in API Consistency is good.
Russell Belfer f917481e 2012-05-03T16:37:25 Support reading attributes from index Depending on the operation, we need to consider gitattributes in both the work dir and the index. This adds a parameter to all of the gitattributes related functions that allows user control of attribute reading behavior (i.e. prefer workdir, prefer index, only use index). This fix also covers allowing us to check attributes (and hence do diff and status) on bare repositories. This was a somewhat larger change that I hoped because it had to change the cache key used for gitattributes files.
Vicent Martí 3fbcac89 2012-05-02T19:56:38 Remove old and unused error codes
nulltoken fa6420f7 2012-04-29T21:46:33 buf: deploy git_buf_len()
Vicent Martí cb8a7961 2012-03-07T00:02:55 error-handling: Repository This also includes droping `git_buf_lasterror` because it makes no sense in the new system. Note that in most of the places were it has been dropped, the code needs cleanup. I.e. GIT_ENOMEM is going away, so instead it should return a generic `-1` and obviously not throw anything.
Russell Belfer ce49c7a8 2012-03-02T15:09:40 Add filter tests and fix some bugs This adds some initial unit tests for file filtering and fixes some simple bugs in filter application.
Vicent Martí f2c25d18 2012-03-02T20:08:00 config: Implement a proper cvar cache
Vicent Martí c63793ee 2012-03-02T03:51:45 attr: Change the attribute check macros The point of having `GIT_ATTR_TRUE` and `GIT_ATTR_FALSE` macros is to be able to change the way that true and false values are stored inside of the returned gitattributes value pointer. However, if these macros are implemented as a simple rename for the `git_attr__true` pointer, they will always be used with the `==` operator, and hence we cannot really change the implementation to any other way that doesn't imply using special pointer values and comparing them! We need to do the same thing that core Git does, which is using a function macro. With `GIT_ATTR_TRUE(attr)`, we can change internally the way that these values are stored to anything we want. This commit does that, and rewrites a large chunk of the attributes test suite to remove duplicated code for expected attributes, and to properly test the function macro behavior instead of comparing pointers.
Vicent Martí 47a899ff 2012-03-01T21:19:51 filter: Beautiful refactoring Comments soothe my soul.
Vicent Martí 27950fa3 2012-02-29T01:26:03 filter: Add write-to CRLF filter