|
4e151a4a
|
2025-08-26T21:11:07
|
|
Remove vestigial filenames from SIMD code headers
These were a relic of libjpeg/SIMD, which attempted to follow the
conventions of the libjpeg source code, but they are no longer relevant
(or even accurate in some cases.)
|
|
98c45838
|
2025-08-21T11:22:51
|
|
Fix issues with Windows Arm64EC builds
Arm64EC basically wraps native Arm64 functions with an emulated
Windows/x64 ABI, which can improve performance for Windows/x64
applications running under the x64 emulator on Windows/Arm. When
building for Arm64EC, the compiler defines _M_X64 and _M_ARM64EC but not
_M_ARM64.
|
|
ea4ee229
|
2024-12-11T16:51:39
|
|
Neon: Disable some strict compiler warnings
We use a standard set of strict compiler warnings with Clang and GCC to
continuously test and maintain C89 conformance in the libjpeg API code.
However, SIMD extensions need not comply with that. The Neon code
specifically uses some C99isms, so disable
-Wdeclaration-after-statement, -Wc99-extensions, and -Wpedantic in the
scope of that code. Also modify the Neon feature tests so that they
will succeed if any of the aforementioned compiler warnings are enabled.
|
|
e69dd40c
|
2024-01-23T13:26:41
|
|
Reorganize source to make things easier to find
- Move all libjpeg documentation, except for README.ijg, into the doc/
subdirectory.
- Move the TurboJPEG C API documentation from doc/html/ into
doc/turbojpeg/.
- Move all C source code and headers into a src/ subdirectory.
- Move turbojpeg-jni.c into the java/ subdirectory.
Referring to #226, there is no ideal solution to this problem. A
semantically ideal solution would have involved placing all source code,
including the SIMD and Java source code, under src/ (or perhaps placing
C library source code under lib/ and C test program source code under
test/), all header files under include/, and all documentation under
doc/. However:
- To me it makes more sense to have separate top-level directories for
each language, since the SIMD extensions and the Java API are
technically optional features. src/ now contains only the code that
is relevant to the core C API libraries and associated programs.
- I didn't want to bury the java/ and simd/ directories or add a level
of depth to them, since both directories already contain source code
that is 3-4 levels deep.
- I would prefer not to separate the header files from the C source
code, because:
1. It would be disruptive. libjpeg and libjpeg-turbo have
historically placed C source code and headers in the same
directory, and people who are familiar with both projects (self
included) are used to looking for the headers in the same directory
as the C source code.
2. In terms of how the headers are used internally in libjpeg-turbo,
the distinction between public and private headers is a bit fuzzy.
- It didn't make sense to separate the test source code from the library
source code, since there is not a clear distinction in some cases.
(For instance, the IJG image I/O functions are used by cjpeg and djpeg
as well as by the TurboJPEG API.)
This solution is minimally disruptive, since it keeps all C source code
and headers together and keeps java/ and simd/ as top-level directories.
It is a bit awkward, because java/ and simd/ technically contain source
code, even though they are not under src/. However, other solutions
would have been more awkward for different reasons.
Closes #226
|
|
c5f269eb
|
2021-09-03T11:52:40
|
|
Neon/AArch64: Explicitly unroll quant loop w/Clang
The loop in jsimd_quantize_neon() is only executed twice and should be
unrolled for AArch64 targets. GCC does that by default, but Clang 11
and later versions available at the time of this writing do not. This
patch adds an unroll pragma when targetting AArch64 with Clang. We do
not use the unroll pragma for AArch32 targets, because it causes the
Clang-generated assembly code to exhaust the available Neon registers
(32 x 64-bit) and spill to the stack. (DRC: Referring to the discussion
in #570, this is likely due to compiler confusion that results in poor
register allocation. It is possible to eliminate the spillage and
reduce the instruction count by loading the data on a just-in-time
basis, thus explicitly interleaving compute and I/O, but the performance
implications of that are currently unknown.)
The effects of unrolling the quantization loop are:
1) elimination of the loop control flow overhead and
2) enabling the use of LDP/STP instructions that work from a single
base pointer, instead of using double the number of LDR/STR
instructions, each requiring an address calculation.
Closes #570
|
|
951d3677
|
2018-08-24T18:04:21
|
|
Neon: Intrinsics impl. of int sample conv./quant.
The previous AArch32 and AArch64 GAS implementations have been removed,
since the intrinsics implementation provides the same or better
performance.
|