src/filter.h


Log

Author Commit Date CI Message
Edward Thomson f0e693b1 2021-09-07T17:53:49 str: introduce `git_str` for internal, `git_buf` is external libgit2 has two distinct requirements that were previously solved by `git_buf`. We require: 1. A general purpose string class that provides a number of utility APIs for manipulating data (eg, concatenating, truncating, etc). 2. A structure that we can use to return strings to callers that they can take ownership of. By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult. Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr"). The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.) Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.
Edward Thomson a8943c04 2021-08-27T15:59:01 filter: proxy_stream is now git_filter_buffered_stream The filter's proxy_stream is used to adapt filters that only provided an `apply` function into a `stream` function. Make this internal to the library instead of private to the filter file. This will allow the filters to use it directly, instead of relying on the filter functionality to do the proxying.
Edward Thomson d7e8b934 2021-06-16T09:08:38 filter: add git_filter_options Allow filter users to provide an options structure instead of simply flags. This allows for future growth for filter options.
Edward Thomson 1db5b219 2021-06-16T09:06:26 filter: filter options are now "filter sessions" Filters use a short-lived structure to keep state during an operation to allow for caching and avoid unnecessary reallocations. This was previously called the "filter options", despite the fact that they contain no configurable options. Rename them to a "filter session" in keeping with an "attribute session", which more accurately describes their use (and allows us to create "filter options" in the future).
Edward Thomson 31d9c24b 2021-05-06T16:32:14 filter: internal git_buf filter handling function Introduce `git_filter_list__convert_buf` which behaves like the old implementation of `git_filter_list__apply_data`, where it might move the input data buffer over into the output data buffer space for efficiency. This new implementation will do so in a more predictible way, always freeing the given input buffer (either moving it to the output buffer or filtering it into the output buffer first). Convert internal users to it.
Edward Thomson ef8f8ec6 2018-12-03T13:35:30 crlf: update to match git's logic Examine the recent CRLF changes to git by Torsten Bögershausen and include similar changes to update our CRLF logic to match. Note: Torsten Bögershausen has previously agreed to allow his changes to be included in libgit2.
Patrick Steinhardt 0c7f49dd 2017-06-30T13:39:01 Make sure to always include "common.h" first Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.
Edward Thomson 2ed855a9 2016-02-07T13:16:30 filter: avoid races during filter registration Previously we would set the global filter registry structure before adding filters to the structure, without a lock, which is quite racy. Now, register default filters during global registration and use an rwlock to read and write the filter registry (as appopriate).
Edward Thomson 9c9aa1ba 2015-02-19T11:32:55 filter: take `temp_buf` in `git_filter_options`
Edward Thomson d05218b0 2015-02-19T11:25:26 filter: add `git_filter_list__load_ext` Refactor `git_filter_list__load_with_attr_reader` into `git_filter_list__load_ext`, which takes a `git_filter_options`.
Edward Thomson 795eaccd 2015-02-19T11:09:54 git_filter_opt_t -> git_filter_flag_t For consistency with the rest of the library, where an opt is an options *structure*.
Edward Thomson 646364e7 2015-02-17T20:25:31 checkout: maintain temporary buffer for filters Let the filters use the checkout data's temporary buffer, instead of having to allocate new buffers each time.
Edward Thomson 9f779aac 2015-01-29T14:40:55 attrcache: don't re-read attrs during checkout During checkout, assume that the .gitattributes files aren't modified during the checkout. Instead, create an "attribute session" during checkout. Assume that attribute data read in the same checkout "session" hasn't been modified since the checkout started. (But allow subsequent checkouts to invalidate the cache.) Further, cache nonexistent git_attr_file data even when .gitattributes files are not found to prevent re-scanning for nonexistent files.
Russell Belfer d0f00de4 2014-05-16T11:08:19 Increase binary detection len to 8k
Russell Belfer 4b11f25a 2013-09-11T16:38:33 Add ident filter This adds the ident filter (that knows how to replace $Id$) and tweaks the filter APIs and code so that git_filter_source objects actually have the updated OID of the object being filtered when it is a known value.
Russell Belfer 2a7d224f 2013-09-10T16:33:32 Extend public filter api with filter lists This moves the git_filter_list into the public API so that users can create, apply, and dispose of filter lists. This allows more granular application of filters to user data outside of libgit2 internals. This also converts all the internal usage of filters to the public APIs along with a few small tweaks to make it easier to use the public git_buffer stuff alongside the internal git_buf.
Russell Belfer 0cf77103 2013-08-26T23:17:07 Start of filter API + git_blob_filtered_content This begins the process of exposing git_filter objects to the public API. This includes: * new public type and API for `git_buffer` through which an allocated buffer can be passed to the user * new API `git_blob_filtered_content` * make the git_filter type and GIT_FILTER_TO_... constants public
Russell Belfer 85d54812 2013-08-28T16:44:04 Create public filter object and use it This creates include/sys/filter.h with a basic definition of a git_filter and then converts the internal code to use it. There are related internal objects (git_filter_list) that we will want to publish at some point, but this is a first step.
Russell Belfer 3658e81e 2013-03-25T14:20:07 Move crlf conversion into buf_text This adds crlf/lf conversion functions into buf_text with more efficient implementations that bypass the high level buffer functions. They attempt to minimize the number of reallocations done and they directly write the buffer data as needed if they know that there is enough memory allocated to memcpy data. Tests are added for these new functions. The crlf.c code is updated to use the new functions. Removed the include of buf_text.h from filter.h and just include it more narrowly in the places that need it.
Edward Thomson 359fc2d2 2013-01-08T17:07:25 update copyrights
Russell Belfer 7bf87ab6 2012-11-28T09:58:48 Consolidate text buffer functions There are many scattered functions that look into the contents of buffers to do various text manipulations (such as escaping or unescaping data, calculating text stats, guessing if content is binary, etc). This groups all those functions together into a new file and converts the code to use that. This has two enhancements to existing functionality. The old text stats function is significantly rewritten and the BOM detection code was extended (although largely we can't deal with anything other than a UTF8 BOM).
nulltoken 5e4cb4f4 2012-09-17T10:38:57 checkout : reduce memory usage when not filtering
nulltoken 3aa443a9 2012-08-20T16:56:45 checkout: introduce git_checkout_tree()
Ben Straub 9587895f 2012-07-16T12:06:23 Migrate code to git_filter_blob_contents. Also removes the unnecessary check for filter length, since git_filters_apply does the right thing when there are none, and it's more efficient than this.
Ben Straub f2d42eea 2012-07-09T20:21:22 Checkout: add structure for CRLF.
Vicent Martí e172cf08 2012-05-18T01:21:06 errors: Rename the generic return codes
Vicent Martí f2c25d18 2012-03-02T20:08:00 config: Implement a proper cvar cache
Vicent Martí 47a899ff 2012-03-01T21:19:51 filter: Beautiful refactoring Comments soothe my soul.
Vicent Martí c5266eba 2012-03-01T01:16:25 filter: Precache the filter config options on load
Vicent Martí 27950fa3 2012-02-29T01:26:03 filter: Add write-to CRLF filter
Vicent Martí 450b40ca 2012-02-28T01:13:32 filter: Load attributes for file
Vicent Martí 44b1ff4c 2012-02-27T04:31:05 filter: Apply filters before writing a file to the ODB Initial implementation. The relevant code is in `blob.c`: the blob write function has been split into smaller functions. - Directly write a file to the ODB in streaming mode - Directly write a symlink to the ODB in direct mode - Apply a filter, and write a file to the ODB in direct mode When trying to write a file, we first call `git_filter__load_for_file`, which populates a filters array with the required filters based on the filename. If no filters are resolved to the filename, we can write to the ODB in streaming mode straight from disk. Otherwise, we load the whole file in memory and use double-buffering to apply the filter chain. We finish by writing the file as a whole to the ODB.