Hash: d8896bda
Author:
Date: 2018-01-03T16:07:36
diff_generate: avoid excessive stats of .gitattributes files

When generating a diff between two trees, for each file that is to be diffed we have to determine whether it shall be treated as a text or as a binary file. While git has heuristics to determine which kind of diff to generate, users can also override that default behaviour by setting or unsetting the 'diff' attribute for specific files. Because of that, we have to query gitattributes in order to determine how to diff the current files.

Instead of hitting the '.gitattributes' file every time we need to query an attribute, which can get expensive especially on networked file systems, we try to cache them instead. This works perfectly fine for every '.gitattributes' file that is found, but we hit cache invalidation problems when we determine that an attributes file does _not_ exist. We do create an entry in the cache for missing '.gitattributes' files, but as soon as we hit that file again we invalidate it and stat it again to see if it has now appeared.

In the case of diffing large trees with each other, this behaviour is very suboptimal. For each pair of files that is to be diffed, we will repeatedly query every directory component leading towards their respective location for an attributes file. This leads to thousands or even hundreds of thousands of wasted syscalls.

The attributes cache already has a mechanism to help in that scenario in the form of the `git_attr_session`. As long as the same attributes session is still active, we will not try to re-query the gitattributes files at all but simply retain our currently cached results. To fix our problem, we can create a session at the top-most level, which is the initialization of the `git_diff` structure, and use it in order to look up the correct diff driver. As the `git_diff` structure is used to generate patches for multiple files at once, this neatly solves our problem by retaining the session until patches for all files have been generated.
The fix has been tested with linux.git by calling `git_diff_tree_to_tree` and `git_diff_to_buf` with v4.10^{tree} and v4.14^{tree}.

| | time | .gitattributes stats |
|---|---|---|
| without fix | 33.201s | 844614 |
| with fix | 30.327s | 4441 |

While execution only improved by roughly 10%, the stat(3) syscalls for .gitattributes files decreased by 99.5%. The benchmarks were quite simple with best-of-three timings on Linux ext4 systems. One can assume that for network based file systems the performance gain will be a lot larger due to a much higher latency.
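The caching behaviour described in the commit message can be modelled in a few lines of C. The following is only a simplified sketch, not the libgit2 implementation: every name in it (`attr_cache`, `attr_file_exists`, `fake_stat`, `count_stats`) is hypothetical, and the filesystem is simulated by a counter. It assumes that a cached "missing" result is trusted only when it was recorded under the same non-zero session key.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; none of these are libgit2 API. */
#define MAX_ENTRIES 64

struct attr_entry {
    const char *path;      /* directory whose .gitattributes was looked up */
    bool exists;           /* false records a "file is missing" result */
    unsigned session_key;  /* session under which the result was recorded */
};

struct attr_cache {
    struct attr_entry entries[MAX_ENTRIES];
    size_t count;
    unsigned long stat_calls;  /* number of simulated filesystem lookups */
};

/* Simulated stat(): pretend that no directory has a .gitattributes file. */
static bool fake_stat(struct attr_cache *cache, const char *path)
{
    (void)path;
    cache->stat_calls++;
    return false;
}

/*
 * Report whether `path` has an attributes file. Found files are always
 * served from the cache. A cached "missing" result is reused only within
 * the same session; session key 0 means "no session", so missing entries
 * are re-checked on every query.
 */
static bool attr_file_exists(struct attr_cache *cache, const char *path,
                             unsigned session_key)
{
    struct attr_entry *e;

    for (size_t i = 0; i < cache->count; i++) {
        e = &cache->entries[i];
        if (strcmp(e->path, path) != 0)
            continue;
        if (e->exists || (session_key != 0 && e->session_key == session_key))
            return e->exists;               /* cache hit, no stat needed */
        e->exists = fake_stat(cache, path); /* re-validate missing entry */
        e->session_key = session_key;
        return e->exists;
    }

    assert(cache->count < MAX_ENTRIES);
    e = &cache->entries[cache->count++];
    e->path = path;
    e->exists = fake_stat(cache, path);
    e->session_key = session_key;
    return e->exists;
}

/* Query the same directory once per diffed file; return the stat count. */
unsigned long count_stats(unsigned files, unsigned session_key)
{
    struct attr_cache cache = {0};

    for (unsigned i = 0; i < files; i++)
        attr_file_exists(&cache, "src/", session_key);
    return cache.stat_calls;
}
```

In this model, `count_stats(1000, 0)` performs one simulated stat per query, while `count_stats(1000, 1)` performs a single stat for the whole run, which mirrors the reduction reported in the table above.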
Git HTTP | https://git.kmx.io/thodg/libgit2.git |
---|---|
Git SSH | git@git.kmx.io:thodg/libgit2.git |
Public access ? | public |
libgit2 is a portable, pure C implementation of the Git core methods provided as a linkable library with a solid API, allowing you to build Git functionality into your application. Language bindings like Rugged (Ruby), LibGit2Sharp (.NET), pygit2 (Python) and NodeGit (Node) allow you to build Git tooling in your favorite language.
libgit2 is used to power Git GUI clients like GitKraken and gmaster and on Git hosting providers like GitHub, GitLab and Visual Studio Team Services. We perform the merge every time you click “merge pull request”.
libgit2 is licensed under a very permissive license (GPLv2 with a special Linking Exception). This basically means that you can link it (unmodified) with any kind of software without having to release its source code. Additionally, the example code has been released to the public domain (see the separate license for more information).
Prerequisites for building libgit2: each required tool (such as CMake, described below) should be installed into your PATH.
Build
mkdir build && cd build
cmake ..
cmake --build .
Trouble with these steps? Read TROUBLESHOOTING.md. More detailed build guidance is available below.
Join us on Slack
Visit slack.libgit2.org to sign up, then join us in #libgit2. If you prefer IRC, you can also point your client to our Slack channel once you’ve registered.
Getting Help
If you have questions about the library, please be sure to check out the API documentation. If you still have questions, reach out to us on Slack or post a question on StackOverflow (with the libgit2 tag).
Reporting Bugs
Please open a GitHub Issue and include as much information as possible. If possible, provide sample code that illustrates the problem you’re seeing. If you’re seeing a bug only on a specific repository, please provide a link to it if possible.
We ask that you not open a GitHub Issue for help, only for bug reports.
libgit2 provides you with the ability to manage Git repositories in the programming language of your choice. It’s used in production to power many applications including GitHub.com, Plastic SCM and Visual Studio Team Services.
It does not aim to replace the git tool or its user-facing commands. Some APIs resemble the plumbing commands as those align closely with the concepts of the Git system, but most commands a user would type are out of scope for this library to implement directly.
The library provides:
As libgit2 is purely a consumer of the Git system, we have to adjust to changes made upstream. This has two major consequences:
While the library provides git functionality without the need for dependencies, it can make use of a few libraries to add to it:
The library needs to keep track of some global state. Call
git_libgit2_init();
before calling any other libgit2 functions. You can call this function many times. A matching number of calls to
git_libgit2_shutdown();
will free the resources. Note that if you have worker threads, you should call git_libgit2_shutdown after those threads have exited. If you require assistance coordinating this, simply have the worker threads call git_libgit2_init at startup and git_libgit2_shutdown at shutdown.
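The "matching number of calls" rule above is a reference-counting scheme. The sketch below models that idea in plain C. It is not libgit2 code: `lib_init`, `lib_shutdown` and `lib_is_live` are hypothetical stand-ins, and the sketch is deliberately not thread-safe, so a real library would need atomic counters or a lock around the count.

```c
#include <assert.h>

/* Hypothetical model of reference-counted global state; not libgit2 code. */
static int init_count = 0;
static int resources_live = 0;

/* Returns the number of outstanding initializations after this call. */
int lib_init(void)
{
    if (init_count++ == 0)
        resources_live = 1;  /* the first caller sets up the global state */
    return init_count;
}

/* Returns the number of initializations that are still outstanding. */
int lib_shutdown(void)
{
    assert(init_count > 0);  /* shutdown without a matching init is a bug */
    if (--init_count == 0)
        resources_live = 0;  /* the last caller tears the state down */
    return init_count;
}

/* Reports whether the modelled global state is currently allocated. */
int lib_is_live(void)
{
    return resources_live;
}
```

With this model, a worker thread that pairs its own init call at startup with a shutdown call at exit can never release the state while another thread still holds a reference, which is the reasoning behind the advice above.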
See THREADING for information
See CONVENTIONS for an overview of the external and internal API/coding conventions we use.
libgit2 builds cleanly on most platforms without any external dependencies. Under Unix-like systems, like Linux, *BSD and Mac OS X, libgit2 expects pthreads to be available; they should be installed by default on all systems. Under Windows, libgit2 uses the native Windows API for threading.
The libgit2 library is built using CMake (version 2.8 or newer) on all platforms.
On most systems you can build the library using the following commands:
$ mkdir build && cd build
$ cmake ..
$ cmake --build .
Alternatively, you can point the CMake GUI tool to the CMakeLists.txt file and generate a platform-specific build project or IDE workspace.
Once built, you can run the tests from the build directory with the command
$ ctest -V
Alternatively, you can run the test suite directly using:
$ ./libgit2_clar
Invoking the test suite directly is useful because it allows you to execute individual tests, or groups of tests, using the -s flag. For example, to run the index tests:
$ ./libgit2_clar -sindex
To run a single test named index::racy::diff, which corresponds to the test function test_index_racy__diff (https://github.com/libgit2/libgit2/blob/master/tests/index/racy.c#L23):
$ ./libgit2_clar -sindex::racy::diff
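The relationship between a test function name and the name passed to -s can be sketched as a small string conversion. This is an illustrative helper only: `clar_selector` is a hypothetical name and is not part of clar or libgit2. It assumes the pattern shown above, where the function is named `test_<suite>__<test>` and each single underscore in the suite part maps to `::`. Suite components that themselves contain underscores would be ambiguous under this assumption.

```c
#include <stdio.h>
#include <string.h>

/*
 * Convert a test function name such as "test_index_racy__diff" into the
 * "index::racy::diff" form. Returns 0 on success, or -1 if the input does
 * not match the assumed pattern or the output buffer is too small.
 */
int clar_selector(const char *func, char *out, size_t out_size)
{
    const char *prefix = "test_";
    const char *sep;
    size_t n = 0;
    int written;

    if (strncmp(func, prefix, strlen(prefix)) != 0)
        return -1;
    func += strlen(prefix);

    /* The first "__" separates the suite path from the test name. */
    sep = strstr(func, "__");
    if (sep == NULL || sep == func || sep[2] == '\0')
        return -1;

    /* Suite part: every single '_' becomes "::". */
    for (const char *p = func; p < sep; p++) {
        if (*p == '_') {
            if (n + 2 >= out_size)
                return -1;
            out[n++] = ':';
            out[n++] = ':';
        } else {
            if (n + 1 >= out_size)
                return -1;
            out[n++] = *p;
        }
    }

    /* Append the separator and copy the test name verbatim. */
    written = snprintf(out + n, out_size - n, "::%s", sep + 2);
    if (written < 0 || (size_t)written >= out_size - n)
        return -1;
    return 0;
}
```

Given a sufficiently large buffer, passing "test_index_racy__diff" fills it with "index::racy::diff", the same name used in the command above.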
The test suite will print a . for every passing test, and an F for any failing test. An S indicates that a test was skipped because it is not applicable to your platform or is particularly expensive.
Note: There should be no failing tests when you build an unmodified source tree from a release, or from the master branch. Please contact us or open an issue if you see test failures.
To install the library you can specify the install prefix by setting:
$ cmake .. -DCMAKE_INSTALL_PREFIX=/install/prefix
$ cmake --build . --target install
For more advanced use or questions about CMake please read https://cmake.org/Wiki/CMake_FAQ.
The following CMake variables are declared:
BIN_INSTALL_DIR: Where to install binaries to.
LIB_INSTALL_DIR: Where to install libraries to.
INCLUDE_INSTALL_DIR: Where to install headers to.
BUILD_SHARED_LIBS: Build libgit2 as a Shared Library (defaults to ON)
BUILD_CLAR: Build Clar-based test suite (defaults to ON)
THREADSAFE: Build libgit2 with threading support (defaults to ON)
STDCALL: Build libgit2 as stdcall. Turn off for cdecl (Windows; defaults to ON)
CMake lets you specify a few variables to control the behavior of the compiler and linker. These flags are rarely used but can be useful for 64-bit to 32-bit cross-compilation.
CMAKE_C_FLAGS: Set your own compiler flags
CMAKE_FIND_ROOT_PATH: Override the search path for libraries
ZLIB_LIBRARY, OPENSSL_SSL_LIBRARY and OPENSSL_CRYPTO_LIBRARY: Tell CMake where to find those specific libraries
If you want to build a universal binary for Mac OS X, CMake sets it all up for you if you use -DCMAKE_OSX_ARCHITECTURES="i386;x86_64" when configuring.
Extract the toolchain from the NDK using the make-standalone-toolchain.sh script. Optionally, cross-compile and install OpenSSL inside of it. Then create a CMake toolchain file that configures the paths to your cross-compiler (substitute {PATH} with the full path to the toolchain):
SET(CMAKE_SYSTEM_NAME Linux)
SET(CMAKE_SYSTEM_VERSION Android)
SET(CMAKE_C_COMPILER {PATH}/bin/arm-linux-androideabi-gcc)
SET(CMAKE_CXX_COMPILER {PATH}/bin/arm-linux-androideabi-g++)
SET(CMAKE_FIND_ROOT_PATH {PATH}/sysroot/)
SET(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
SET(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
SET(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)
Add -DCMAKE_TOOLCHAIN_FILE={pathToToolchainFile} to the cmake command when configuring.
Here are the bindings to libgit2 that are currently available:
If you start another language binding to libgit2, please let us know so we can add it to the list.
We welcome new contributors! We have a number of issues marked as “up for grabs” and “easy fix” that are good places to jump in and get started. There’s much more detailed information in our list of outstanding projects.
Please be sure to check the contribution guidelines to understand our workflow, and the libgit2 coding conventions.
libgit2 is under GPL2 with linking exception. This means you can link to and use the library from any program, proprietary or open source; paid or gratis. However, if you modify libgit2 itself, you must distribute the source to your modified version of libgit2.
See the COPYING file for the full license text.