src/attr_file.h


Log

Author Commit Date CI Message
Edward Thomson f0e693b1 2021-09-07T17:53:49 str: introduce `git_str` for internal, `git_buf` is external libgit2 has two distinct requirements that were previously solved by `git_buf`. We require: 1. A general purpose string class that provides a number of utility APIs for manipulating data (eg, concatenating, truncating, etc). 2. A structure that we can use to return strings to callers that they can take ownership of. By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult. Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr"). The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.) Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.
Edward Thomson 46508fe6 2021-09-26T11:28:47 attr_file: don't take the `repo` as an arg The `repo` argument is now unnecessary. Remove it.
Edward Thomson 90656858 2021-09-21T11:28:39 filter: use a `git_oid` in filter options, not a pointer Using a `git_oid *` in filter options was a mistake; it is a deviation from our typical pattern, and callers in some languages that GC may need very special treatment in order to pass both an options structure and a pointer outside of it.
Edward Thomson 0bd547a8 2021-07-22T15:29:46 attr: introduce GIT_ATTR_CHECK_INCLUDE_COMMIT Introduce `GIT_ATTR_CHECK_INCLUDE_COMMIT`, which like 4fd5748 allows attribute information to be read from files in the repository. 4fd5748 always reads the information from HEAD, while `GIT_ATTR_CHECK_INCLUDE_COMMIT` allows users to provide the commit to read the attributes from.
Edward Thomson 3779a047 2021-05-27T18:47:22 attr: introduce `git_attr_options` for extended queries Allow more advanced attribute queries using a `git_attr_options`, and extended functions to use it. Presently there is no additional configuration in a `git_attr_options` beyond the flags, but this is for future growth.
Edward Thomson 1cd863fd 2021-05-24T13:44:45 attr: include the filename in the attr source The attribute source object is now the type and the path.
Edward Thomson 96dc1ffd 2021-05-22T20:14:47 attr: the attr source is now a struct We may want to extend the attribute source; use a structure instead of an enum.
Edward Thomson 5ee50488 2021-05-22T18:47:03 attr: rename internal attr file source enum The enum `git_attr_file_source` is better suffixed with a `_t` since it's a type-of source. Similarly, its members should have a matching name.
Edward Thomson 9fb755d5 2021-04-04T19:59:57 attr: validate workdir paths for attribute files We should allow attribute files - inside working directories - to have names longer than MAX_PATH when core.longpaths is set. `git_attr_path__init` takes a repository to validate the path with.
Edward Thomson 4fd5748c 2019-07-21T14:11:03 attr: optionally read attributes from repository When `GIT_ATTR_CHECK_INCLUDE_HEAD` is specified, read `gitattribute` files that are checked into the repository at the HEAD revision.
Patrick Steinhardt e54343a4 2019-06-29T09:17:32 fileops: rename to "futils.h" to match function signatures Our file utils functions all have a "futils" prefix, e.g. `git_futils_touch`. One would thus naturally guess that their definitions and implementation would live in files "futils.h" and "futils.c", respectively, but in fact they live in "fileops.h". Rename the files to match expectations.
Patrick Steinhardt f8346905 2019-07-12T09:03:33 attr_file: ignore macros defined in subdirectories Right now, we are unconditionally applying all macros found in a gitatttributes file. But quoting gitattributes(5): Custom macro attributes can be defined only in top-level gitattributes files ($GIT_DIR/info/attributes, the .gitattributes file at the top level of the working tree, or the global or system-wide gitattributes files), not in .gitattributes files in working tree subdirectories. The built-in macro attribute "binary" is equivalent to: So gitattribute files in subdirectories of the working tree may explicitly _not_ contain macro definitions, but we do not currently enforce this limitation. This patch introduces a new parameter to the gitattributes parser that tells whether macros are allowed in the current file or not. If set to `false`, we will still parse macros, but silently ignore them instead of adding them to the list of defined macros. Update all callers to correctly determine whether the to-be-parsed file may contain macros or not. Most importantly, when walking up the directory hierarchy, we will only set it to `true` once it reaches the root directory of the repo itself. Add a test that verifies that we are indeed not applying macros from subdirectories. Previous to these changes, the test would've failed.
Patrick Steinhardt 05f9986a 2019-06-14T08:06:05 attr_file: convert to use `wildmatch` Upstream git has converted to use `wildmatch` instead of `fnmatch`. Convert our gitattributes logic to use `wildmatch` as the last user of `fnmatch`. Please, don't expect I know what I'm doing here: the fnmatch parser is one of the most fun things to play around with as it has a sh*tload of weird cases. In all honesty, I'm simply relying on our tests that are by now rather comprehensive in that area. The conversion actually fixes compatibility with how git.git parser "**" patterns when the given path does not contain any directory separators. Previously, a pattern "**.foo" erroneously wouldn't match a file "x.foo", while git.git would match. Remove the new-unused LEADINGDIR/NOLEADINGDIR flags for `git_attr_fnmatch`.
Patrick Steinhardt 0c7f49dd 2017-06-30T13:39:01 Make sure to always include "common.h" first Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.
Patrick Steinhardt c5f3da96 2016-11-11T14:36:43 repository: use `git_repository_item_path` The recent introduction of the commondir variable of a repository requires callers to distinguish whether their files are part of the dot-git directory or the common directory shared between multpile worktrees. In order to take the burden from callers and unify knowledge on which files reside where, the `git_repository_item_path` function has been introduced which encapsulate this knowledge. Modify most existing callers of `git_repository_path` to use `git_repository_item_path` instead, thus making them implicitly aware of the common directory.
J Wyman 4c09e19a 2015-03-30T14:07:44 Improvements to ignore performance on Windows. Minimizing the number directory and file opens, minimizes the amount of IO thus reducing the overall cost of performing ignore operations.
Edward Thomson f58cc280 2015-02-03T00:28:32 attr_session: keep a temp buffer
Edward Thomson d4b1b767 2015-02-03T00:03:49 checkout: cache system attributes file location
Edward Thomson 9f779aac 2015-01-29T14:40:55 attrcache: don't re-read attrs during checkout During checkout, assume that the .gitattributes files aren't modified during the checkout. Instead, create an "attribute session" during checkout. Assume that attribute data read in the same checkout "session" hasn't been modified since the checkout started. (But allow subsequent checkouts to invalidate the cache.) Further, cache nonexistent git_attr_file data even when .gitattributes files are not found to prevent re-scanning for nonexistent files.
Carlos Martín Nieto 6069042f 2014-11-05T16:51:39 ignore: don't leak rules into higher directories A rule "src" in src/.gitignore must only match subdirectories of src/. The current code does not include this context in the match rule and would thus consider this rule to match the top-level src/ directory instead of the intended src/src/. Keep track fo the context in which the rule was defined so we can perform a prefix match.
Russell Belfer f554611a 2014-05-06T12:41:26 Improve checks for ignore containment The diff code was using an "ignored_prefix" directory to track if a parent directory was ignored that contained untracked files alongside tracked files. Unfortunately, when negative ignore rules were used for directories inside ignored parents, the wrong rules were applied to untracked files inside the negatively ignored child directories. This commit moves the logic for ignore containment into the workdir iterator (which is a better place for it), so the ignored-ness of a directory is contained in the frame stack during traversal. This allows a child directory to override with a negative ignore and yet still restore the ignored state of the parent when we traverse out of the child. Along with this, there are some problems with "directory only" ignore rules on container directories. Given "a/*" and "!a/b/c/" (where the second rule is a directory rule but the first rule is just a generic prefix rule), then the directory only constraint was having "a/b/c/d/file" match the first rule and not the second. This was fixed by having ignore directory-only rules test a rule against the prefix of a file with LEADINGDIR enabled. Lastly, spot checks for ignores using `git_ignore_path_is_ignored` were tested from the top directory down to the bottom to deal with the containment problem, but this is wrong. We have to test bottom to top so that negative subdirectory rules will be checked before parent ignore rules. This does change the behavior of some existing tests, but it seems only to bring us more in line with core Git, so I think those changes are acceptable.
Russell Belfer ac16bd0a 2014-04-18T15:45:59 Minor fixes Only apply LEADING_DIR pattern munging to patterns in ignore and attribute files, not to pathspecs used to select files to operate on. Also, allow internal macro definitions to be evaluated before loading all external ones (important so that external ones can make use of internal `binary` definition).
Russell Belfer 916fcbd6 2014-04-18T14:42:40 Fix ignore difference from git with trailing /* Ignore patterns that ended with a trailing '/*' were still needing to match against another actual '/' character in the full path. This is not the same behavior as core Git. Instead, we strip a trailing '/*' off of any patterns that were matching and just take it to imply the FNM_LEADING_DIR behavior.
Russell Belfer 823c0e9c 2014-04-17T11:53:13 Fix broken logic for attr cache invalidation The checks to see if files were out of date in the attibute cache was wrong because the cache-breaker data wasn't getting stored correctly. Additionally, when the cache-breaker triggered, the old file data was being leaked.
Russell Belfer e6e8530a 2014-04-14T12:31:17 Lock attribute file while reparsing data I don't love this approach, but achieving thread-safety for attribute and ignore data while reloading files would require a larger rewrite in order to avoid this. If an attribute or ignore file is out of date, this holds a lock on the file while we are reloading the data so that another thread won't try to reload the data at the same time.
Russell Belfer 7d490872 2014-04-10T22:31:01 Attribute file cache refactor This is a big refactoring of the attribute file cache to be a bit simpler which in turn makes it easier to enforce a lock around any updates to the cache so that it can be used in a threaded env. Tons of changes to the attributes and ignores code.
Russell Belfer 40ed4990 2014-02-11T14:45:37 Add diff threading tests and attr file cache locks This adds a basic test of doing simultaneous diffs on multiple threads and adds basic locking for the attr file cache because that was the immediate problem that arose from these tests.
Russell Belfer 4ba64794 2013-08-09T10:52:35 Revert PR #1462 and provide alternative fix This rolls back the changes to fnmatch parsing from commit 2e40a60e847d6c128af23e24ea7a8efebd2427da except for the tests that were added. Instead this adds couple of new flags that can be passed in when attempting to parse an fnmatch pattern. Also, this changes the pathspec match logic to special case matching a filename with a '!' prefix against a negative pattern. This fixes the build.
Russell Belfer 33d532dc 2013-08-09T09:32:06 Merge pull request #1462 from yorah/fix/libgit2sharp-issue-379 status: fix handling of filenames with special prefixes
Linquize 0cb16fe9 2013-05-15T20:26:55 Unify whitespaces to tabs
yorah 2e40a60e 2013-04-11T17:29:05 status: fix handling of filenames with special prefixes Fix libgit2/libgit2sharp#379
yorah 0d32f39e 2013-03-04T11:31:50 Notify '*' pathspec correctly when diffing I also moved all tests related to notifying in their own file.
Russell Belfer 5540d947 2013-03-15T16:39:00 Implement global/system file search paths The goal of this work is to expose the search logic for "global", "system", and "xdg" files through the git_libgit2_opts() interface. Behind the scenes, I changed the logic for finding files to have a notion of a git_strarray that represents a search path and to store a separate search path for each of the three tiers of config file. For each tier, I implemented a function to initialize it to default values (generally based on environment variables), and then general interfaces to get it, set it, reset it, and prepend new directories to it. Next, I exposed these interfaces through the git_libgit2_opts interface, reusing the GIT_CONFIG_LEVEL_SYSTEM, etc., constants for the user to control which search path they were modifying. There are alternative designs for the opts interface / argument ordering, so I'm putting this phase out for discussion. Additionally, I ended up doing a little bit of clean up regarding attr.h and attr_file.h, adding a new attrcache.h so the other two files wouldn't have to be included in so many places.
Edward Thomson 359fc2d2 2013-01-08T17:07:25 update copyrights
Russell Belfer 55cbd05b 2012-11-08T16:56:34 Some diff refactorings to help code reuse There are some diff functions that are useful in a rewritten checkout and this lays some groundwork for that. This contains three main things: 1. Share the function diff uses to calculate the OID for a file in the working directory (now named `git_diff__oid_for_file` 2. Add a `git_diff__paired_foreach` function to iterator over two diff lists concurrently. Convert status to use it. 3. Move all the string/prefix/index entry comparisons into function pointers inside the `git_diff_list` object so they can be switched between case sensitive and insensitive versions. This makes them easier to reuse in various functions without replicating logic. As part of this, move a couple of index functions out of diff.c and into index.c.
Vicent Marti c1f61af6 2012-10-31T20:52:01 I LIKE THESE NAMES
Russell Belfer c8b511f3 2012-10-31T11:26:12 Better naming for file timestamp/size checker
Russell Belfer 744cc03e 2012-10-30T12:10:36 Add git_config_refresh() API to reload config This adds a new API that allows users to reload the config if the file has changed on disk. A new config callback function to refresh the config was added. The modified time and file size are used to test if the file needs to be reloaded (and are now stored in the disk backend object). In writing tests, just using mtime was a problem / race, so I wanted to check file size as well. To support that, I extended `git_futils_readbuffer_updated` to optionally check file size in addition to mtime, and I added a new function `git_filebuf_stats` to fetch the mtime and size for an open filebuf (so that the config could be easily refreshed after a write). Lastly, I moved some similar file checking code for attributes into filebuf. It is still only being used for attrs, but it seems potentially reusable, so I thought I'd move it over.
Russell Belfer 52032ae5 2012-10-15T12:48:43 Fix single-file ignore checks To answer if a single given file should be ignored, the path to that file has to be processed progressively checking that there are no intermediate ignored directories in getting to the file in question. This enables that, fixing the broken old behavior, and adds tests to exercise various ignore situations.
Philip Kelley ec40b7f9 2012-09-17T15:42:41 Support for core.ignorecase
Vicent Marti 0c9eacf3 2012-08-02T01:15:24 attr: Do not export variables externally Fixes #824 Exporting variables in a dynamic library is a PITA. Let's keep these values internally and wrap them through a helper method. This doesn't break the external API. @arrbee, aren't you glad I turned the `GIT_ATTR_` macros into function macros? :sparkles:
Russell Belfer 2a99df69 2012-05-24T17:14:56 Fix bugs for status with spaces and reloaded attrs This fixes two bugs: * Issue #728 where git_status_file was not working for files that contain spaces. This was caused by reusing the "fnmatch" parsing code from ignore and attribute files to interpret the "pathspec" that constrained the files to apply the status to. In that code, unescaped whitespace was considered terminal to the pattern, so a file with internal whitespace was excluded from the matched files. The fix was to add a mode to that code that allows spaces and tabs inside patterns. This mode only comes into play when parsing in-memory strings. * The other issue was undetected, but it was in the recently added code to reload gitattributes / gitignores when they were changed on disk. That code was not clearing out the old values from the cached file content before reparsing which meant that newly added patterns would be read in, but deleted patterns would not be removed. The fix was to clear the vector of patterns in a cached file before reparsing the file.
Russell Belfer dc13f1f7 2012-05-10T11:08:59 Add cache busting to attribute cache This makes the git attributes and git ignores cache check stat information before using the file contents from the cache. For cached files from the index, it checks the SHA of the file instead. This should reduce the need to ever call `git_attr_cache_flush()` in most situations. This commit also fixes the `git_status_should_ignore` API to use the libgit2 standard parameter ordering.
Russell Belfer f917481e 2012-05-03T16:37:25 Support reading attributes from index Depending on the operation, we need to consider gitattributes in both the work dir and the index. This adds a parameter to all of the gitattributes related functions that allows user control of attribute reading behavior (i.e. prefer workdir, prefer index, only use index). This fix also covers allowing us to check attributes (and hence do diff and status) on bare repositories. This was a somewhat larger change that I hoped because it had to change the cache key used for gitattributes files.
Russell Belfer d58336dd 2012-04-26T10:51:45 Fix leading slash behavior in attrs/ignores We were not following the git behavior for leading slashes in path names when matching git ignores and git attribute file patterns. This should fix issue #638.
Russell Belfer c2b67043 2012-04-25T15:20:28 Rename git_khash_str to git_strmap, etc. This renamed `git_khash_str` to `git_strmap`, `git_hash_oid` to `git_oidmap`, and deletes `git_hashtable` from the tree, plus adds unit tests for `git_strmap`.
Russell Belfer 19fa2bc1 2012-04-17T15:12:50 Convert attrs and diffs to use string pools This converts the git attr related code (including ignores) and the git diff related code (and implicitly the status code) to use `git_pools` for storing strings. This reduces the number of small blocks allocated dramatically.
Russell Belfer 14a513e0 2012-04-13T15:00:29 Add support for pathspec to diff and status This adds preliminary support for pathspecs to diff and status. The implementation is not very optimized (it still looks at every single file and evaluated the the pathspec match against them), but it works.
Russell Belfer 95dfb031 2012-03-30T14:40:50 Improve config handling for diff,submodules,attrs This adds support for a bunch of core.* settings that affect diff and status, plus fixes up some incorrect implementations of those settings from before. Also, this cleans up the handling of config settings in the new submodules code and in the old attrs/ignore code.
Russell Belfer ab43ad2f 2012-03-14T11:07:14 Convert attr and other files to new errors This continues to add other files to the new error handling style. I think the only real concerns here are that there are a couple of error return cases that I have converted to asserts, but I think that it was the correct thing to do given the new error style.
schu 5e0de328 2012-02-13T17:10:24 Update Copyright header Signed-off-by: schu <schu-github@schulog.org>
Russell Belfer adc9bdb3 2012-01-31T13:59:32 Fix attr path is_dir check When building an attr path object, the code that checks if the file is a directory was evaluating the file as a relative path to the current working directory, instead of using the repo root. This lead to inconsistent behavior.
Russell Belfer a51cd8e6 2012-01-16T16:58:27 Fix handling of relative paths for attrs Per issue #533, the handling of relative paths in attribute and ignore files was not right. Fixed this by pre-joining the relative path of the attribute/ignore file onto the match string when a full path match is required. Unfortunately, fixing this required a bit more code than I would have liked because I had to juggle things around so that the fnmatch parser would have sufficient information to prepend the relative path when it was needed.
Russell Belfer cfbc880d 2012-01-16T15:16:44 Patch cleanup for merge After reviewing the gitignore support with Vicent, we came up with a list of minor cleanups to prepare for merge, including: * checking git_repository_config error returns * renaming git_ignore_is_ignored and moving to status.h * fixing next_line skipping to include \r skips * commenting on where ignores are and are not included
Russell Belfer df743c7d 2012-01-09T15:37:19 Initial implementation of gitignore support Adds support for .gitignore files to git_status_foreach() and git_status_file(). This includes refactoring the gitattributes code to share logic where possible. The GIT_STATUS_IGNORED flag will now be passed in for files that are ignored (provided they are not already in the index or the head of repo).
Russell Belfer 73b51450 2011-12-28T23:28:50 Add support for macros and cache flush API. Add support for git attribute macro definitions. Also, add support for cache flush API to clear the attribute file content cache when needed. Additionally, improved the handling of global and system files, making common utility functions in fileops and converting config and attr to both use the common functions. Adds a bunch more tests and fixed some memory leaks. Note that adding macros required me to use refcounted attribute assignment definitions, which complicated, but probably improved memory usage.
Russell Belfer ee1f0b1a 2011-12-16T10:56:43 Add APIs for git attributes This adds APIs for querying git attributes. In addition to the new API in include/git2/attr.h, most of the action is in src/attr_file.[hc] which contains utilities for dealing with a single attributes file, and src/attr.[hc] which contains the implementation of the APIs that merge all applicable attributes files.