CMakeLists.txt


Log

Author Commit Date CI Message
DRC 8db03126 2024-08-02T08:45:36 example.c: Write correct dimensions to PPM header The dimensions in the PPM header of the output file generated by 'example decompress' were always 640 x 480, regardless of the size of the JPEG image being decompressed. Our regression tests (which this commit also fixes) missed the bug because they decompressed the 640 x 480 image generated by 'example compress'. Fixes #778
DRC 94c64ead 2024-06-17T20:27:57 Various doc tweaks - "bits per component" = "bits per sample" Describing the data precision of a JPEG image using "bits per component" is technically correct, but "bits per sample" is the terminology that the JPEG-1 spec uses. Also, "bits per component" is more commonly used to describe the precision of packed-pixel formats (as opposed to "bits per pixel") rather than planar formats, in which all components are grouped together. - Unmention legacy display technologies. Colormapped and monochrome displays aren't a thing anymore, and even when they were still a thing, it was possible to display full-color images to them. In 1991, when JPEG decompression time was measured in minutes per megapixel, it made sense to keep a decompressed copy of JPEG images on disk, in a format that could be displayed without further color conversion (since color conversion was slow and memory-intensive.) In 2024, JPEG decompression time is measured in milliseconds per megapixel, and color conversion is even faster. Thus, JPEG images can be decompressed, displayed, and color-converted (if necessary) "on the fly" at speeds too fast for human vision to perceive. (In fact, your TV performs much more complicated decompression algorithms at least 60 times per second.) - Document that color quantization (and associated features), GIF input/output, Targa input/output, and OS/2 BMP input/output are legacy features. Legacy status doesn't necessarily mean that the features are deprecated. Rather, it is meant to discourage users from using features that may be of little or no benefit on modern machines (such as low-quality modes that had significant performance advantages in the early 1990s but no longer do) and that are maintained on a break/fix basis only. - General wordsmithing, grammar/punctuation policing, and formatting tweaks - Clarify which data precisions each cjpeg input format and each djpeg output format supports. 
- cjpeg.1: Remove unnecessary and impolitic statement about the -targa switch. - Adjust or remove performance claims to reflect the fact that: * On modern machines, the djpeg "-fast" switch has a negligible effect on performance. * There is a measurable difference between the performance of Floyd- Steinberg dithering and no dithering, but it is not likely perceptible to most users. * There is a measurable difference between the performance of 1-pass and 2-pass color quantization, but it is not likely perceptible to most users. * There is a measurable difference between the performance of full-color and grayscale output when decompressing a full-color JPEG image, but it is not likely perceptible to most users. * IDCT scaling does not necessarily improve performance. (It generally does if the scaling factor is <= 1/2 and generally doesn't if the scaling factor is > 1/2, at least on my machine. The performance claim made in jpeg-6b was probably invalidated when we merged the additional scaling factors from jpeg-7.) - Clarify which djpeg switches/output formats cannot be used when decompressing lossless JPEG images. - Remove djpeg hints, since those involve quality vs. speed tradeoffs that are no longer relevant for modern machines. - Remove documentation regarding using color quantization with 16-bit data precision. (Color quantization requires lossy mode.) - Java: Fix typos in TJDecompressor.decompress12() and TJDecompressor.decompress16() documentation. - jpegtran.1: Fix truncated paragraph In a man page, a single quote at the start of a line is interpreted as a macro. Closes #775 - libjpeg.txt: * Mention J16SAMPLE data type (oversight.) * Remove statement about extending jdcolor.c. (libjpeg-turbo is not quite as DIY as libjpeg once was.) * Remove paragraph about tweaking the various typedefs in jmorecfg.h. It is no longer relevant for modern machines. * Remove caveat regarding systems with ints less than 16 bits wide. 
(ANSI/ISO C requires an int to be at least 16 bits wide, and libjpeg-turbo has never supported non-ANSI compilers.) - usage.txt: * Add copyright header. * Document cjpeg -icc, -memdst, -report, -strict, and -version switches. * Document djpeg -icc, -maxscans, -memsrc, -report, -skip, -crop, -strict, and -version switches. * Document jpegtran -icc, -maxscans, -report, -strict, and -version switches.
DRC bc491b16 2024-05-16T17:32:02 ChangeLog.md: Document previous commit
DRC 3f43bb33 2024-05-06T13:03:46 Build: Don't use COMPONENT w/install(INCLUDES ...) (Regression introduced by 24e09baaf024e71841a92d30d0e763242ed959ef) The install() INCLUDES option is not an artifact option. It specifies a list of directories that will be added to the INTERFACE_INCLUDE_DIRECTORIES target property when the target is exported using the install() EXPORT option, which occurs when CMake package config files are generated. Specifying 'COMPONENT include' with the install() INCLUDES option caused the INTERFACE_INCLUDE_DIRECTORIES properties in our CMake package config files to contain '${_IMPORT_PREFIX}/COMPONENT', which caused errors of the form 'Imported target "libjpeg-turbo::XXXX" includes non-existent path' when downstream build systems attempted to include libjpeg-turbo using find_package(). Fixes #759 Closes #760
Kleis Auke Wolthuizen 24e09baa 2024-04-12T11:46:21 Build: Add COMPONENT to all install() commands This makes it possible for downstream packagers and other integrators of libjpeg-turbo to include only specific directories from the libjpeg-turbo installation (or to install specific directories under a different prefix, etc.) The names of the components correspond to the directories into which they will be installed. Refer to libvips/libvips#3931, #265, #338 Closes #756
DRC 710865cf 2024-03-18T12:56:42 Build: Don't explicitly set CMP0065 to NEW This is no longer necessary, because of 1644bdb7d2fac66cd0ce25adef7754e008b5bc1e.
DRC fe218ca1 2024-03-18T11:27:30 Build: Handle CMAKE_C_COMPILER_ID=AppleClang Because of 1644bdb7d2fac66cd0ce25adef7754e008b5bc1e, we are now effectively using the NEW behavior for all CMake policies introduced in all CMake versions up to and including CMake 3.28. The NEW behavior for CMP0025, introduced in CMake 3.0, sets CMAKE_C_COMPILER_ID to "AppleClang" instead of "Clang" when using Apple's variant of Clang (in Xcode), so we need to match all values of CMAKE_C_COMPILER_ID that contain "Clang". This fixes three issues: - -O2 was not replaced with -O3 in CMAKE_C_FLAGS_RELWITHDEBINFO. This was a minor issue, since -O3 is now the default in CMAKE_C_FLAGS_RELEASE, and we use CMAKE_BUILD_TYPE=Release in our official builds. - The build system erroneously set the default value of FLOATTEST8 and FLOATTEST12 to no-fp-contract when compiling for PowerPC or Arm using Apple Clang 14+ (effectively reverting 5b2beb4bc4f41dd9dd2a905cb931b8d5054d909b.) Because Clang 14+ now enables -ffp-contract=on by default, this issue caused floating point test failures unless FLOATTEST8 and FLOATTEST12 were overridden. - The build system set MD5_PPM_3x2_FLOAT_FP_CONTRACT as appropriate for GCC, not as appropriate for Clang (effectively reverting 47656a082091f9c9efda054674522513f4768c6c.) This also caused floating point test failures. Fixes #753 Closes #755
DRC dfde1f85 2024-03-08T12:09:23 Fix (and test) more Clang 14 compiler warnings -Woverlength-strings, -Wshift-negative-value, -Wsign-compare
DRC 34c05585 2024-03-06T15:12:31 Fix warnings with -Wmissing-variable-declarations
DRC 26fc07c8 2024-02-08T12:03:37 Build: Set MSVC run-time lib based on IDE config
Alyssa Ross b6ee1016 2024-01-29T17:18:38 Build: Fix tests w/ emulators that don't check CWD While QEMU will run executables from the current working directory, other emulators may not. It is more reliable to pass the full executable path to the emulator. The add_test(NAME ... COMMAND ...) syntax automatically invokes the emulator (e.g. the command specified in CMAKE_CROSSCOMPILING_EMULATOR) and passes the full executable path to it, as long as the first COMMAND argument is the name of a target. This cleans up the CMake code somewhat as well, since it is no longer necessary to manually invoke CMAKE_CROSSCOMPILING_EMULATOR. Closes #747
DRC d59b1a3b 2024-01-30T15:40:51 Build: Reformat lines longer than 80 columns ... to ensure that no function argument starts beyond the 80th column.
DRC 7d67c349 2024-01-26T10:34:04 Build/Win: Report CMAKE_MSVC_RUNTIME_LIBRARY value ... when using CMake 3.15+
DRC 17df25f9 2024-01-25T13:52:58 Build/Win: Eliminate MSVC run-time DLL dependency (regression introduced by 1644bdb7d2fac66cd0ce25adef7754e008b5bc1e) Setting a maximum version in cmake_minimum_required() effectively sets the behavior to NEW for all policies introduced in all CMake versions up to and including that maximum version. The NEW behavior for CMP0091, introduced in CMake 3.15, uses CMake variables to specify the MSVC runtime library against which to link, rather than placing the relevant flags in CMAKE_C_FLAGS*. Thus, replacing /MD with /MT in CMAKE_C_FLAGS* no longer has any effect when using CMake 3.15+.
DRC 289df647 2024-01-23T17:35:53 Build: Add tjdoc target for building TurboJPEG dox
DRC 0ef07927 2024-01-23T10:46:04 Bump copyright year to 2024
DRC 1644bdb7 2024-01-22T14:33:31 Build: Silence CMake 3.28.x deprecation warning Closes #740
DRC fa2b6ea0 2024-01-12T18:21:41 Eliminate duplicate copies of jpeg_nbits_table ef9a4e05ba919494cbebe50e15f332de5ab97e82 (libjpeg-turbo 1.4.x), which was based on https://bug815473.bmoattachments.org/attachment.cgi?id=692126 (https://bugzilla.mozilla.org/show_bug.cgi?id=815473), modified the C baseline Huffman encoder so that it precomputes jpeg_nbits_table, in order to facilitate sharing the table among multiple processes. However, libjpeg-turbo never shared the table, and because the table was implemented as a static array, f3a8684cd1c28e557d394470962a7a224c76ddbc (libjpeg-turbo 1.5.x) and 37bae1a0e977ee1ba769e6f0aa27e519ab6e58c6 (libjpeg-turbo 2.0.x) each introduced a duplicate copy of the table for (respectively) the SSE2 baseline Huffman encoder and the C progressive Huffman encoder. This commit does the following: - Move the duplicated code in jchuff.c and jcphuff.c, originally introduced in 0cfc4c17b740cb2cbb11f9d85c8ab3745d5b913a and 37bae1a0e977ee1ba769e6f0aa27e519ab6e58c6, into a header (jpeg_nbits.h). - Credit the co-author of 0cfc4c17b740cb2cbb11f9d85c8ab3745d5b913a. (Refer to https://sourceforge.net/p/libjpeg-turbo/patches/57). - Modify the SSE2 baseline Huffman encoder so that the C Huffman encoders can share its definition of jpeg_nbits_table. - Move the definition of jpeg_nbits_table into a C source file (jpeg_nbits.c) rather than a header, and define the table only if USE_CLZ_INTRINSIC is undefined and the SSE2 baseline Huffman encoder will not be built. - Apply hidden symbol visibility to the shared definition of jpeg_nbits_table, if the compiler supports the necessary attribute. (In practice, only Visual C++ doesn't.) Closes #114 See also: https://bugzilla.mozilla.org/show_bug.cgi?id=1501523
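The commit above distinguishes between a precomputed bit-count table and a count-leading-zeros intrinsic (selected by USE_CLZ_INTRINSIC). The following is only a minimal sketch of those two approaches, not the libjpeg-turbo source: the function names, the 256-entry table size, and the 16-bit input range are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical illustration; names and table size are assumptions. */

/* Reference: number of bits needed to represent x (0 for x == 0). */
int nbits_loop(unsigned int x)
{
  int n = 0;

  while (x) {
    n++;
    x >>= 1;
  }
  return n;
}

/* Table-based variant: results for all 8-bit values, filled in once. */
static unsigned char nbits_table[256];

void nbits_table_init(void)
{
  size_t i;

  for (i = 0; i < 256; i++)
    nbits_table[i] = (unsigned char)nbits_loop((unsigned int)i);
}

/* Lookup for inputs below 65536, using the 8-bit table twice at most. */
int nbits_lookup(unsigned int x)
{
  return (x >> 8) ? 8 + nbits_table[(x >> 8) & 0xFF] : nbits_table[x & 0xFF];
}

/* Intrinsic-based variant: no table is needed when the compiler provides a
   count-leading-zeros builtin.  Falls back to the loop otherwise. */
int nbits_clz(unsigned int x)
{
#if defined(__GNUC__) || defined(__clang__)
  return x ? (int)(sizeof(unsigned int) * 8) - __builtin_clz(x) : 0;
#else
  return nbits_loop(x);
#endif
}
```

After calling nbits_table_init(), nbits_lookup() and nbits_clz() agree for every input below 65536, which is why a build that uses the intrinsic has no need to define the table at all.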
DRC 0e2d289f 2023-11-07T12:38:05 Bump version to 3.0.2 to prepare for new commits
DRC 5b2beb4b 2023-10-10T16:44:59 Build: Set FLOATTEST* by default for AArch64, PPC Because of 47656a082091f9c9efda054674522513f4768c6c, we can now reliably determine the correct default values for FLOATTEST8 and FLOATTEST12 when using Clang or GCC to build for AArch64 or PowerPC platforms. (Testing confirms that this is the case with GCC 5-13 and Clang 5-14 on Ubuntu/AArch64, GCC 4 on CentOS 7/PPC, and GCC 8-10 and Clang 6-12 on Ubuntu/PPCLE.) Other CPU architectures and compilers can be added on a case-by-case basis as they are tested.
DRC 47656a08 2023-10-02T18:03:50 Test: Fix float test errors w/ Clang & fp-contract The MD5 sums associated with FLOATTEST8=fp-contract and FLOATTEST12=fp-contract are appropriate for GCC (tested v5 through v13) with -ffp-contract=fast, which is the default when compiling for an architecture that has fused multiply-add (FMA) instructions. However, different MD5 sums are needed for Clang (tested v5 through v14) with -ffp-contract=on, which is now the default in Clang 14 when compiling for an architecture that has FMA instructions. Refer to #705, #709, #710
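To see why enabling floating point contraction can change the expected MD5 sums, it may help to look at a single multiply-add. The sketch below is an illustration only: it emulates a fused multiply-add for float operands by doing the arithmetic in double (where the product of two floats is exact) rather than calling a real FMA instruction, and the function names are hypothetical.

```c
/* Hypothetical illustration of fused vs. unfused multiply-add rounding. */

/* Unfused: the product is rounded to float before the addition.  The
   volatile store keeps the compiler from contracting the expression. */
float mul_add_unfused(float a, float b, float c)
{
  volatile float product = a * b;

  return product + c;
}

/* Emulated fused: the product of two floats is exact in double, so it is
   not rounded to float precision before the addition takes place. */
float mul_add_fused_emulated(float a, float b, float c)
{
  double sum = (double)a * (double)b + (double)c;

  return (float)sum;
}
```

With a = b = 1 + 2^-12 and c = -(1 + 2^-11), the unfused version returns 0 because the 2^-24 term of the product is lost when the product is rounded to float, whereas the emulated fused version returns 2^-24. A difference of that kind in a single sample is enough to change the MD5 sum of an output image.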
DRC e17fa3a2 2023-07-27T13:11:39 Bump version to 3.0.1 to prepare for new commits
DRC 63bd7188 2023-07-25T10:01:42 Build: Unset FLOATTEST* by default for non-x86 Because libjpeg-turbo 3.0.x now supports multiple data precisions in the same build, the regression test system can test the 8-bit and 12-bit floating point DCT/IDCT algorithms separately. The expected MD5 sums for those tests are communicated to the test system using the FLOATTEST8 and FLOATTEST12 CMake variables. Whereas it is possible to intelligently set a default value for FLOATTEST8 when building for x86[-64] and a default value for FLOATTEST12 when building for x86-64, it is not possible with other architectures. (Refer to #705, #709, and #710.) Clang 14, for example, now enables FMA (fused multiply-add) instructions by default on architectures that support them, but with AArch64 builds, the results are not the same as when using GCC/AArch64 with FMA instructions enabled. Thus, setting FLOATTEST12=fp-contract doesn't make the tests pass. It was already impossible to intelligently set a default for FLOATTEST8 with i386 builds, but referring to #710, that appears to be the case with other non-x86-64 builds as well. Back in 1991, when Tom Lane first released libjpeg, some CPUs had floating point units and some didn't. It could take minutes to compress or decompress a 1-megapixel JPEG image using the "slow" integer DCT/IDCT algorithms, and the floating point algorithms were significantly faster on systems that had an FPU. On systems without FPUs, floating point math was emulated and really slow, so Tom also developed "fast" integer DCT/IDCT algorithms to speed up JPEG performance, at the expense of accuracy, on those systems. Because of libjpeg-turbo's SIMD extensions, the floating point algorithms are now significantly slower than the "slow" integer algorithms without being significantly more accurate, and the "fast" integer algorithms fail the ISO/ITU-T conformance tests without being any faster than the "slow" integer algorithms on x86 systems. 
Thus, the floating point and "fast" integer algorithms are considered legacy features. In order for the floating point regression tests to be useful, the results of the tests must be validated against an independent metric. (In other words, it wouldn't be useful to use the floating point DCT/IDCT algorithms to determine the expected results of the floating point DCT/IDCT algorithms.) In the past, I attempted without success to develop a low-level floating point test that would run at configure time and determine the appropriate default value of FLOATTEST*. Barring that approach, the only other possibilities would be: 1. Develop a test framework that compares the floating point results with a margin of error, as TJUnitTest does. However, that effort isn't justified unless it could also benefit non-legacy features. 2. Compare the floating point results against an expected MD5 sum, as we currently do. However, as previously described, it isn't possible in most cases to determine an appropriate default value for the expected MD5 sum. For the moment, it makes the most sense to disable the 8-bit floating point tests by default except with x86[-64] builds and to disable the 12-bit floating point tests by default except with x86-64 builds. That means that the floating point algorithms will still be regression tested when performing x86[-64] builds, but other types of builds will have to opt in to the same regression tests. Since the floating point DCT/IDCT algorithms are unlikely to change ever again (the only reason they still exist at all is to maintain backward compatibility with libjpeg), this seems like a reasonable tradeoff.
DRC d6914b6b 2023-07-24T16:41:18 CMakeLists.txt: Fix comment buglet
DRC 035ea386 2023-07-06T12:04:22 Build: Fix regression test concurrency issues - The example-*bit-*-decompress test must run after the example-*bit-*-compress test, since the latter generates testout*-example.jpg. - Add -static to the filenames of all output files generated by the "static" regression tests, to avoid conflicts with the "shared" regression tests. - Add the PID to the filenames of all files generated by the tjunittest packed-pixel image I/O tests. - Check the return value of MD5File() in tjunittest to avoid a segfault if the file doesn't exist. (Prior to the fix described above, that could occur if two instances of tjunittest ran concurrently from the same directory with the same -bmp and -precision arguments.) Fixes #705
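Two of the fixes above (adding the PID to generated filenames and checking the return value of MD5File()) follow a common pattern for tests that may run concurrently. The sketch below shows that pattern under stated assumptions: it targets POSIX systems, the helper names are hypothetical, and it does not reproduce the tjunittest code.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: builds "<base>_<pid>.<ext>" in buf so that two test
   processes running from the same directory do not share output files.
   Returns 0 on success, -1 if the name would not fit in buf. */
int make_unique_name(char *buf, size_t size, const char *base, const char *ext)
{
  int n = snprintf(buf, size, "%s_%ld.%s", base, (long)getpid(), ext);

  return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Hypothetical checker: treats a NULL digest (for example, because the file
   did not exist) as a mismatch instead of passing it to strcmp(). */
int digest_matches(const char *actual, const char *expected)
{
  if (actual == NULL || expected == NULL)
    return 0;
  return strcmp(actual, expected) == 0;
}
```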
DRC bf9f319c 2023-06-29T16:07:42 Disallow color quantization with lossless decomp Color quantization is a legacy feature that serves little or no purpose with lossless JPEG images. 9f756bc67a84d4566bf74a0c2432aa55da404021 eliminated interaction issues between the lossless decompressor and the color quantizers related to out-of-range 12-bit samples, but referring to #701, other interaction issues apparently still exist. Such issues are likely, given the fact that the color quantizers were not designed with lossless decompression in mind. This commit reverts 9f756bc67a84d4566bf74a0c2432aa55da404021, since the issues it fixed are no longer relevant because of this commit and 2192560d74e6e6cf99dd05928885573be00a8208. Fixes #672 Fixes #673 Fixes #674 Fixes #676 Fixes #677 Fixes #678 Fixes #679 Fixes #681 Fixes #683 Fixes #701
DRC 0e9683c4 2023-06-12T14:36:18 Bump version to 3.0.0
DRC d491094b 2023-04-03T12:31:40 Build separate static/shared jpeg12/16 obj libs If PIC isn't enabled for the entire build (using CMAKE_POSITION_INDEPENDENT_CODE), then we need to enable it for any of the objects in the libjpeg-turbo shared libraries. Ideally, however, we don't want to enable PIC for any of the objects in the libjpeg-turbo static libraries, unless CMAKE_POSITION_INDEPENDENT_CODE is set. Thus, we need to build separate static and shared jpeg12 and jpeg16 object libraries. Fixes #684
DRC 58a3427f 2023-03-09T21:07:40 Bump version to 2.1.92 to prepare for new commits
DRC c13fe159 2023-02-23T09:35:12 Build: Clarify CMAKE_OSX_ARCHITECTURES error It's not that the build system doesn't support multiple values in CMAKE_OSX_ARCHITECTURES. It's that libjpeg-turbo, because of its SIMD extensions, *cannot* support multiple values in CMAKE_OSX_ARCHITECTURES.
DRC 2af984fd 2023-02-10T09:55:27 Build: Fail if included with add_subdirectory() Even though BUILDING.md and CONTRIBUTING.md explicitly state that the libjpeg-turbo build system does not and will not support being integrated into downstream build systems using add_subdirectory() (see 05655481917a2d2761cf2fe19b76f639b7f159ef), people continue to file bug reports, feature requests, and pull requests regarding that (see #265, #637, and #653 in addition to the issues listed in 05655481917a2d2761cf2fe19b76f639b7f159ef.) Responding to those issues wastes our project's limited resources. Hopefully people will get the hint if the build system explicitly tells them that it can't be included using add_subdirectory(), which will prompt them to read the comments in CMakeLists.txt explaining why. To anyone stumbling upon this commit message, please refer to the discussions under the issues listed above, as well as the issues listed in 05655481917a2d2761cf2fe19b76f639b7f159ef. Our project's position on this has been stated, explained, and defended numerous times.
DRC 6c610333 2023-02-08T09:23:51 ChangeLog.md: Document 4e028ecd + bump version to 3.0 beta2
DRC dd89ce6c 2023-02-01T11:54:09 Build: Define THREAD_LOCAL even if !WITH_TURBOJPEG The SIMD dispatchers use thread-local storage now as well, because of f579cc11b33e5bfeb9931e37cc74b4a33c95d2e6.
DRC fd8c4da0 2023-01-27T14:05:07 Bump revision to 2.1.90 to prepare for beta + acknowledge upcoming 2.1.5 release
DRC fc01f467 2023-01-05T06:36:46 TurboJPEG 3 API overhaul (ChangeLog update forthcoming) - Prefix all function names with "tj3" and remove version suffixes from function names. (Future API overhauls will increment the prefix to "tj4", etc., thus retaining backward API/ABI compatibility without versioning each individual function.) - Replace stateless boolean flags (including TJ*FLAG_ARITHMETIC and TJ*FLAG_LOSSLESS, which were never released) with stateful integer parameters, the value of which persists between function calls. * Use parameters for the JPEG quality and subsampling as well, in order to eliminate the awkwardness of specifying function arguments that weren't relevant for lossless compression. * tj3DecompressHeader() now stores all relevant information about the JPEG image, including the width, height, subsampling type, entropy coding type, etc. in parameters rather than returning that information in its arguments. * TJ*FLAG_LIMITSCANS has been reimplemented as an integer parameter (TJ*PARAM_SCANLIMIT) that allows the number of scans to be specified. - Use the const keyword for all pointer arguments to unmodified buffers, as well as for both dimensions of 2D pointers. Addresses #395. - Use size_t rather than unsigned long to represent buffer sizes, since unsigned long is a 32-bit type on Windows. Addresses #24. - Return 0 from all buffer size functions if an error occurs, rather than awkwardly trying to return -1 in an unsigned data type. - Implement 12-bit and 16-bit data precision using dedicated compression, decompression, and image I/O functions/methods. * Suffix the names of all data-precision-specific functions with 8, 12, or 16. * Because the YUV functions are intended to be used for video, they are currently only implemented with 8-bit data precision, but they can be expanded to 12-bit data precision in the future, if necessary. * Extend TJUnitTest and TJBench to test 12-bit and 16-bit data precision, using a new -precision option. 
* Add appropriate regression tests for all of the above to the 'test' target. * Extend tjbenchtest to test 12-bit and 16-bit data precision, and add separate 'tjtest12' and 'tjtest16' targets. * BufferedImage I/O in the Java API is currently limited to 8-bit data precision, since the BufferedImage class does not straightforwardly support higher data precisions. * Extend the PPM reader to convert 12-bit and 16-bit PBMPLUS files to grayscale or CMYK pixels, as it already does for 8-bit files. - Properly accommodate lossless JPEG using dedicated parameters (TJ*PARAM_LOSSLESS, TJ*PARAM_LOSSLESSPSV, and TJ*PARAM_LOSSLESSPT), rather than using a flag and awkwardly repurposing the JPEG quality. Update TJBench to properly reflect whether a JPEG image is lossless. - Re-organize the TJBench usage screen. - Update the Java docs using Java 11, to improve the formatting and eliminate HTML frames. - Use the accurate integer DCT algorithm by default for both compression and decompression, since the "fast" algorithm is a legacy feature, it does not pass the ISO compliance tests, and it is not actually faster on modern x86 CPUs. * Remove the -accuratedct option from TJBench and TJExample. - Re-implement the 'tjtest' target using a CMake script that enables the appropriate tests, depending on the data precision and whether or not the Java API is part of the build. - Consolidate the C and Java versions of tjbenchtest into one script. - Consolidate the C and Java versions of tjexampletest into one script. - Combine all initialization functions into a single function (tj3Init()) that accepts an integer parameter specifying the subsystems to initialize. - Enable decompression scaling explicitly, using a new function/method (tj3SetScalingFactor()/TJDecompressor.setScalingFactor()), rather than implicitly using awkward "desired width"/"desired height" parameters. - Introduce a new macro/constant (TJUNSCALED/TJ.UNSCALED) that maps to a scaling factor of 1/1. 
- Implement partial image decompression, using a new function/method (tj3SetCroppingRegion()/TJDecompressor.setCroppingRegion()) and TJBench option (-crop). Extend tjbenchtest to test the new feature. Addresses #1. - Allow the JPEG colorspace to be specified explicitly when compressing, using a new parameter (TJ*PARAM_COLORSPACE). This allows JPEG images with the RGB and CMYK colorspaces to be created. - Remove the error/difference image feature from TJBench. Identical images to the ones that TJBench created can be generated using ImageMagick with 'magick composite <original_image> <output_image> -compose difference <diff_image>' - Handle JPEG images with unknown subsampling types. TJ*PARAM_SUBSAMP is set to TJ*SAMP_UNKNOWN (== -1) for such images, but they can still be decompressed fully into packed-pixel images or losslessly transformed (with the exception of lossless cropping.) They cannot be partially decompressed or decompressed into planar YUV images. Note also that TJBench, due to its lack of support for imperfect transforms, requires that the subsampling type be known when rotating, flipping, or transversely transposing an image. Addresses #436 - The Java version of TJBench now has identical functionality to the C version. This was accomplished by (somewhat hackishly) calling the TurboJPEG C image I/O functions through JNI and copying the pixels between the C heap and the Java heap. - Add parameters (TJ*PARAM_RESTARTROWS and TJ*PARAM_RESTARTBLOCKS) and a TJBench option (-restart) to allow the restart marker interval to be specified when compressing. Eliminate the undocumented TJ_RESTART environment variable. - Add a parameter (TJ*PARAM_OPTIMIZE), a transform option (TJ*OPT_OPTIMIZE), and a TJBench option (-optimize) to allow optimized baseline Huffman coding to be specified when compressing. Eliminate the undocumented TJ_OPTIMIZE environment variable. 
- Add parameters (TJ*PARAM_XDENSITY, TJ*PARAM_YDENSITY, and TJ*PARAM_DENSITYUNITS) to allow the pixel density to be specified when compressing or saving a Windows BMP image and to be queried when decompressing or loading a Windows BMP image. Addresses #77. - Refactor the fuzz targets to use the new API. * Extend decompression coverage to 12-bit and 16-bit data precision. * Replace the awkward cjpeg12 and cjpeg16 targets with proper TurboJPEG-based compress12, compress12-lossless, and compress16-lossless targets. - Fix innocuous UBSan warnings uncovered by the new fuzzers. - Implement previous versions of the TurboJPEG API by wrapping the new functions (tested by running the 2.1.x versions of TJBench, via tjbenchtest, and TJUnitTest against the new implementation.) * Remove all JNI functions for deprecated Java methods and implement the deprecated methods using pure Java wrappers. It should be understood that backward API compatibility in Java applies only to the Java classes and that one cannot mix and match a JAR file from one version of libjpeg-turbo with a JNI library from another version. - tj3Destroy() now silently accepts a NULL handle. - tj3Alloc() and tj3Free() now return/accept void pointers, as malloc() and free() do. - The image I/O functions now accept a TurboJPEG instance handle, which is used to transmit/receive parameters and to receive error information. Closes #517
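Two of the conventions adopted in the overhaul above are using size_t for buffer sizes and returning 0 (rather than -1 converted to an unsigned type, which wraps to the largest representable value) when a size cannot be computed. The sketch below illustrates that convention only. The function name and its arguments are hypothetical, and it is not the TurboJPEG implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch; not a TurboJPEG function.
   Returns the number of bytes needed for a packed-pixel buffer, or 0 if the
   arguments are invalid or the size would overflow size_t. */
size_t packed_buf_size(int width, int height, int bytes_per_pixel)
{
  size_t w, h, bpp;

  if (width < 1 || height < 1 || bytes_per_pixel < 1)
    return 0;
  w = (size_t)width;
  h = (size_t)height;
  bpp = (size_t)bytes_per_pixel;
  /* Check each multiplication against SIZE_MAX before performing it. */
  if (w > SIZE_MAX / bpp || h > SIZE_MAX / (w * bpp))
    return 0;
  return w * h * bpp;
}
```

A caller can then test the result with a simple `if (size == 0)` before allocating, since a valid image never needs a zero-byte buffer.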
DRC d4589f4f 2023-01-14T18:07:53 Merge branch 'main' into dev
DRC d2608583 2023-01-05T10:51:12 TurboJPEG: Ensure 'pad' arg is a power of 2 Because the PAD() macro can only handle powers of 2, this is a necessary restriction (and a documented one, except in the case of tjCompressFromYUV()-- oops.) Failing to check the 'pad' argument caused tjBufSizeYUV2() to return bogus results if 'pad' was less than 1 or otherwise not a power of 2. tjEncodeYUV3() and tjDecodeYUV() effectively treated a 'pad' value of 0 as unpadded, but that was subtle and undocumented behavior. tjCompressFromYUV() did not check whether 'pad' was a power of 2, so the strides passed to tjCompressFromYUVPlanes() would have been incorrect if 'pad' was not a power of 2. That would not have caused tjCompressFromYUV() to overrun the source buffer, as long as the calling application allocated the buffer based on the return value of tjBufSizeYUV2() (which computes the strides in the same manner as tjCompressFromYUV().) However, if the calling application attempted to initialize the source buffer using correctly-computed strides, then it could have overrun its own buffer in certain cases or produced incorrect JPEG images in others. Realistically, there is no reason why an application would want to pass a non-power-of-2 'pad' value to a TurboJPEG API function, so this commit is about user-proofing the API rather than fixing any known issue.
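The restriction described above comes from the usual bit-masking way of rounding up to a multiple of an alignment, which is only correct when the alignment is a power of 2. The following is a minimal sketch of that technique with a checked wrapper. PAD_SKETCH, is_pow2(), and padded_width() are hypothetical names, not the TurboJPEG definitions.

```c
/* Sketch of a round-up macro that is only valid when p is a power of 2. */
#define PAD_SKETCH(v, p)  (((v) + (p) - 1) & (~((p) - 1)))

/* Nonzero if x is a positive power of 2. */
int is_pow2(int x)
{
  return x > 0 && (x & (x - 1)) == 0;
}

/* Hypothetical checked wrapper: returns -1 for an invalid argument instead
   of silently producing a bogus padded width. */
int padded_width(int width, int align)
{
  if (width < 0 || !is_pow2(align))
    return -1;
  return PAD_SKETCH(width, align);
}
```

For example, padded_width(13, 4) returns 16, whereas the unchecked PAD_SKETCH(13, 3) evaluates to 13, which is not a multiple of 3. Rejecting such values up front is the kind of user-proofing the commit describes.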
DRC 2241434e 2022-12-15T12:20:50 16-bit lossless JPEG support
DRC 382563a5 2022-11-21T22:46:30 Build: Add missing tjbenchtest -arithmetic tests ... if WITH_JAVA=0. (Oversight from 6002720c37ec724dc20971ec77d73547a0feed9f)
DRC 98ff1fd1 2022-11-21T20:57:39 TurboJPEG: Add lossless JPEG detection capability Add a new TurboJPEG C API function (tjDecompressHeader4()) and Java API method (TJDecompressor.getFlags()) that return the bitwise OR of any flags that are relevant to the JPEG image being decompressed (currently TJFLAG_PROGRESSIVE, TJFLAG_ARITHMETIC, TJFLAG_LOSSLESS, and their Java equivalents.) This allows a calling program to determine whether the image being decompressed is a lossless JPEG image, which means that the decompression scaling feature will not be available and that a full-sized destination buffer should be allocated. More specifically, this fixes a buffer overrun in TJBench, TJExample, and the decompress* fuzz targets that occurred when attempting (in vain) to decompress a lossless JPEG image with decompression scaling enabled.
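The buffer overrun described above arises when a caller sizes its destination buffer for a scaled image even though scaling is unavailable for lossless JPEG images. The sketch below shows how a caller might use the returned flags to decide. The flag values and function names are hypothetical stand-ins (the real constants are defined in the TurboJPEG header), and the round-up division is an assumption about how scaled dimensions are computed.

```c
/* Hypothetical flag values for illustration; not the TurboJPEG constants. */
#define FLAG_PROGRESSIVE  (1 << 0)
#define FLAG_ARITHMETIC   (1 << 1)
#define FLAG_LOSSLESS     (1 << 2)

/* Rounds dim * num / denom up to the next integer. */
int scaled_dim(int dim, int num, int denom)
{
  return (dim * num + denom - 1) / denom;
}

/* Computes the destination dimensions for a requested scaling factor of
   num/denom.  If the flags indicate a lossless JPEG image, the scaling
   factor is ignored and the full image size is used. */
void dest_dims(int flags, int width, int height, int num, int denom,
               int *out_w, int *out_h)
{
  if (flags & FLAG_LOSSLESS) {
    num = 1;
    denom = 1;
  }
  *out_w = scaled_dim(width, num, denom);
  *out_h = scaled_dim(height, num, denom);
}
```

With a 641 x 480 image and a requested factor of 1/2, a lossy image yields 321 x 240, while a lossless image yields the full 641 x 480, so the destination buffer is never undersized.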
DRC 1a31176e 2022-11-21T22:42:46 Merge branch 'main' into dev
DRC 74d5b168 2022-11-21T22:41:46 Build: Update tjtest target dependencies
DRC 25ccad99 2022-11-16T15:57:25 TurboJPEG: 8-bit lossless JPEG support
DRC 766910e8 2022-11-16T01:03:15 Merge branch 'ijg.lossless' into dev Fix segfault when decomp lossless JPEG w/ restarts The predict_process_restart() method in jpeg_lossless_decompressor was unset, because we now use the start_pass() method in jpeg_inverse_dct instead. Thus, a segfault occurred when attempting to decompress a lossless JPEG that contained restart markers.
DRC 6002720c 2022-11-15T23:10:35 TurboJPEG: Opt. enable arithmetic entropy coding
DRC bc086c44 2022-11-15T23:38:47 Merge branch 'main' into dev
DRC eb1fd4ad 2022-11-15T23:38:19 Build: Fix typo in tjtest target
DRC 97772cba 2022-11-14T15:36:25 Merge branch 'ijg.lossless' into dev Refer to #402
DRC b5a9ef64 2022-11-13T13:00:26 Don't allow 12-bit JPEG support to be disabled In libjpeg-turbo 2.1.x and prior, the WITH_12BIT CMake variable was used to enable 12-bit JPEG support at compile time, because the libjpeg API library could not handle multiple JPEG data precisions at run time. The initial approach to handling multiple JPEG data precisions at run time (7fec5074f962b20ed00b4f5da4533e1e8d4ed8ac) created a whole new API, library, and applications for 12-bit data precision, so it made sense to repurpose WITH_12BIT to allow 12-bit data precision to be disabled. e8b40f3c2ba187ba95c13c3e8ce21c8534256df7 made it so that the libjpeg API library can handle multiple JPEG data precisions at run time via a handful of straightforward API extensions. Referring to 6c2bc901e27b047440ed46920c4d3f0480b48268, it hasn't been possible to build libjpeg-turbo with both forward and backward libjpeg API/ABI compatibility since libjpeg-turbo 1.4.x. Thus, whereas we retain full backward API/ABI compatibility with libjpeg v6b-v8, forward libjpeg API/ABI compatibility ceased being realistic years ago, so it no longer makes sense to provide compile-time options that give a false sense of forward API/ABI compatibility by allowing some (but not all) of our libjpeg API extensions to be disabled. Such options are difficult to maintain and clutter the code with #ifdefs.
DRC bf01ed2f 2022-11-04T13:08:08 Fix build when SIMD extensions are disabled (Broken by previous commit)
DRC e8b40f3c 2022-11-01T21:45:39 Vastly improve 12-bit JPEG integration The Gordian knot that 7fec5074f962b20ed00b4f5da4533e1e8d4ed8ac attempted to unravel was caused by the fact that there are several data-precision-dependent (JSAMPLE-dependent) fields and methods in the exposed libjpeg API structures, and if you change the exposed libjpeg API structures, then you have to change the whole API. If you change the whole API, then you have to provide a whole new library to support the new API, and that makes it difficult to support multiple data precisions in the same application. (It is not impossible, as example.c demonstrated, but using data-precision-dependent libjpeg API structures would have made the cjpeg, djpeg, and jpegtran source code hard to read, so it made more sense to build, install, and package 12-bit-specific versions of those applications.) Unfortunately, the result of that initial integration effort was an unreadable and unmaintainable mess, which is a problem for a library that is an ISO/ITU-T reference implementation. Also, as I dug into the problem of lossless JPEG support, I realized that 16-bit lossless JPEG images are a thing, and supporting yet another version of the libjpeg API just for those images is untenable. In fact, however, the touch points for JSAMPLE in the exposed libjpeg API structures are minimal: - The colormap and sample_range_limit fields in jpeg_decompress_struct - The alloc_sarray() and access_virt_sarray() methods in jpeg_memory_mgr - jpeg_write_scanlines() and jpeg_write_raw_data() - jpeg_read_scanlines() and jpeg_read_raw_data() - jpeg_skip_scanlines() and jpeg_crop_scanline() (This is subtle, but both of those functions use JSAMPLE-dependent opaque structures behind the scenes.) It is much more readable and maintainable to provide 12-bit-specific versions of those six top-level API functions and to document that the aforementioned methods and fields must be type-cast when using 12-bit samples. 
Since that eliminates the need to provide a 12-bit-specific version of the exposed libjpeg API structures, we can: - Compile only the precision-dependent libjpeg modules (the coefficient buffer controllers, the colorspace converters, the DCT/IDCT managers, the main buffer controllers, the preprocessing and postprocessing controller, the downsampler and upsamplers, the quantizers, the integer DCT methods, and the IDCT methods) for multiple data precisions. - Introduce 12-bit-specific methods into the various internal structures defined in jpegint.h. - Create precision-independent data type, macro, method, field, and function names that are prefixed by an underscore, and use an internal header to convert those into precision-dependent data type, macro, method, field, and function names, based on the value of BITS_IN_JSAMPLE, when compiling the precision-dependent libjpeg modules. - Expose precision-dependent jinit*() functions for each of the precision-dependent libjpeg modules. - Abstract the precision-dependent libjpeg modules by calling the appropriate precision-dependent jinit*() function, based on the value of cinfo->data_precision, from top-level libjpeg API functions.
DRC 6c2bc901 2022-11-03T14:39:19 Don't allow disabling in-memory src/dest managers By default, libjpeg-turbo 1.3.x and later have enabled the in-memory source/destination manager functions from libjpeg v8 when emulating the libjpeg v6b or v7 API/ABI, which has allowed operating system distributors to provide those functions without adopting the backward-incompatible libjpeg v8 API/ABI. Prior to libjpeg-turbo 1.5.x, it made sense to allow users to disable the in-memory source/destination manager functions at build time and thus retain both backward and forward API/ABI compatibility relative to libjpeg v6b or v7. Since then, however, we have introduced several new libjpeg API functions that break forward API/ABI compatibility, so it no longer makes sense to allow the in-memory source/destination managers to be disabled. libjpeg-turbo only claims to be backward-API/ABI-compatible, i.e. to allow applications built against libjpeg or an older version of libjpeg-turbo to work properly with the current version of libjpeg-turbo.
DRC 664b64a9 2022-11-03T14:25:35 Merge branch 'main' into dev
DRC 4f7a8afb 2022-11-03T13:37:55 Build: Fix issues w/ Ninja Multi-Config generator - Fix an issue whereby a build with ENABLE_SHARED=0 could not be installed when using the Ninja Multi-Config CMake generator. - Fix an issue whereby a Windows installer could not be built when using the Ninja Multi-Config CMake generator. - Fix an issue whereby the Java regression tests failed when using the Ninja Multi-Config CMake generator. Based on: https://github.com/stilllman/libjpeg-turbo/commit/4f169deeb092a0513472b04f05f57bfe42b31ceb Closes #626
DRC 8c5e78ce 2022-11-03T11:22:50 Build: Document SO_AGE and TURBOJPEG_SO_AGE vars
DRC cb3642cb 2022-11-03T12:22:51 Bump version to 2.1.5 to prepare for new commits
DRC 8a3b0f70 2022-06-24T15:21:51 Implement 12-bit-specific error/warn/trace macros The macros in jerror.h refer to j_common_ptr, so it is unfortunately necessary to introduce a 12-bit-specific version of that header file (j12error.h) with 12-bit specific ERREXIT*(), WARNMS*(), and TRACEMS*() macros. (The message table is still shared between 8-bit and 12-bit implementations.) Fixes #607
DRC b98dabac 2022-04-27T12:38:58 Merge branch 'main' into dev
DRC d0e7c454 2022-04-18T11:34:07 Don't install libturbojpeg.pc if WITH_TURBOJPEG=0 Fixes #593
DRC 1b9edb5c 2022-03-10T23:57:11 Build: Fix 12-bit FP tests w/ 32-bit builds With x86-64 builds, the default value of FLOATTEST works with both the 8-bit-per-sample and 12-bit-per-sample flavors of the libjpeg API library. However, that is not the case with x86 builds. Thus, we need separate 8-bit-per-sample and 12-bit-per-sample FLOATTEST variables.
DRC 263386c2 2022-03-11T17:35:59 Merge branch 'main' into dev
DRC 2ee7264d 2022-03-11T17:28:36 Build: Don't set DEFAULT_FLOATTEST for x86 MSVC Newer versions of the 32-bit x86 Visual Studio compiler produce results compatible with FLOATTEST=no-fp-contract, so we can no longer intelligently set a default FLOATTEST value for that platform.
DRC a0148454 2022-03-11T10:50:47 Win: Fix build with Visual Studio 2010 (broken by 607b668ff96e40fdc749de9b1bb98e7f40c86d93) - Visual Studio 2010 apparently doesn't have the snprintf() inline function, so restore the macro that emulates that function using _snprintf_s(). - Explicitly include errno.h in strtest.c, since jinclude.h doesn't include it when building with Visual Studio.
DRC b3ae7779 2022-03-10T23:13:43 Fix in-tree builds (oops)
DRC 7fec5074 2022-03-08T12:34:11 Support 8-bit & 12-bit JPEGs using the same build Partially implements #199 This commit also implements a request from #178 (the ability to compile the libjpeg example as a standalone program.)
DRC fc562d11 2022-03-07T14:29:37 Bump version to 2.2 alpha1 ... ... to prepare for new features
DRC 607b668f 2022-02-10T11:33:49 MSVC: Eliminate C4996 warnings in API libs The primary purpose of this is to encourage adoption of libjpeg-turbo in downstream Windows projects that forbid the use of "deprecated" functions. libjpeg-turbo's usage of those functions was not actually unsafe, because: - libjpeg-turbo always checks the return value of fopen() and ensures that a NULL filename can never be passed to it. - libjpeg-turbo always checks the return value of getenv() and never passes a NULL argument to it. - The sprintf() calls in format_message() (jerror.c) could never overflow the destination string buffer or leave it unterminated as long as the buffer was at least JMSG_LENGTH_MAX bytes in length, as instructed. (Regardless, this commit replaces those calls with snprintf() calls.) - libjpeg-turbo never uses sscanf() to read strings or multi-byte character arrays. - Because of b7d6e84d6a9283dc2bc50ef9fcaadc0cdeb25c9f, wrjpgcom explicitly checks the bounds of the source and destination strings before calling strcat() and strcpy(). - libjpeg-turbo always ensures that the destination string is terminated when using strncpy(). (548490fe5e2aa31cb00f6602d5a478b068b99682 made this explicit.) Regarding thread safety: Technically speaking, getenv() is not thread-safe, because the returned pointer may be invalidated if another thread sets the same environment variable between the time that the first thread calls getenv() and the time that that thread uses the return value. In practice, however, this could only occur with libjpeg-turbo if: (1) A multithreaded calling application used the deprecated and undocumented TJFLAG_FORCEMMX/TJFLAG_FORCESSE/TJFLAG_FORCESSE2 flags in the TurboJPEG API or set one of the corresponding environment variables (which are only intended for testing purposes.) Since the TurboJPEG API library only ever passed string constants to putenv(), the only inherent risk (i.e. 
the only risk introduced by the library and not the calling application) was that the SIMD extensions may have read an incorrect value from one of the aforementioned environment variables. or (2) A multithreaded calling application modified the value of the JPEGMEM environment variable in one thread while another thread was reading the value of that environment variable (in the body of jpeg_create_compress() or jpeg_create_decompress().) Given that the libjpeg API provides a thread-safe way for applications to modify the default memory limit without using the JPEGMEM environment variable, direct modification of that environment variable by calling applications is not supported. Microsoft's implementation of getenv_s() does not claim to be thread-safe either, so this commit uses getenv_s() solely to mollify Visual Studio. New inline functions and macros (GETENV_S() and PUTENV_S()) wrap getenv_s()/_putenv_s() when building for Visual Studio and getenv()/setenv() otherwise, but GETENV_S()/PUTENV_S() provide no advantages over getenv()/setenv() other than parameter validation. They are implemented solely for convenience. Technically speaking, strerror() is not thread-safe, because the returned pointer may be invalidated if another thread changes the locale and/or calls strerror() between the time that the first thread calls strerror() and the time that that thread uses the return value. In practice, however, this could only occur with libjpeg-turbo if a multithreaded calling application encountered a file I/O error in tjLoadImage() or tjSaveImage(). Since both of those functions immediately copy the string returned from strerror() into a thread-local buffer, the risk is minimal, and the worst case would involve an incorrect error string being reported to the calling application. Regardless, this commit uses strerror_s() in the TurboJPEG API library when building for Visual Studio. 
Note that strerror_r() could have been used on Un*x systems, but it would have been necessary to handle both the POSIX and GNU implementations of that function and perform widespread compatibility testing. Such is left as an exercise for another day. Fixes #568
DRC a3d4aadd 2022-02-01T12:53:28 Build: Embed version/API/(C) info in MSVC DLLs Based on: https://github.com/TheDorkKnight/libjpeg-turbo/commit/da7a18801a5c305d3f8a71b065f179f1e22b73ae Closes #576
DRC 17297239 2022-01-06T09:17:30 Eliminate non-ANSI C compatibility macros libjpeg-turbo has never supported non-ANSI C compilers. Per the spec, ANSI C compilers must have locale.h, stddef.h, stdlib.h, memset(), memcpy(), unsigned char, and unsigned short. They must also handle undefined structures.
DRC 2ce32e0f 2021-11-30T10:54:24 cjpeg: automatically compress PGM-->grayscale JPEG (regression introduced by aa7459050d7a50e1d8a99488902d41fbc118a50f) cjpeg sets cinfo.in_color_space to JCS_RGB as an "arbitrary guess." Since tjLoadImage() never uses JCS_RGB, the PGM reader should treat JCS_RGB the same as JCS_UNKNOWN. Fixes #566
Alex Xu (Hello71) 18edeff4 2021-10-02T14:26:57 Build: Set CMP0065 NEW (respect ENABLE_EXPORTS) Referring to https://cmake.org/cmake/help/latest/policy/CMP0065.html, CMake 3.3 and earlier automatically added compiler/linker flags such as -rdynamic/-export-dynamic, which caused symbols to be exported from executables. The primary purpose of this is to allow plugins loaded via dlopen() to access symbols from the calling program. libjpeg-turbo does not need this functionality, and enabling it needlessly increases the size of the libjpeg-turbo executables. Setting CMP0065 to NEW when using CMake 3.4 and later prevents CMake from automatically adding the aforementioned compiler/linker flags unless the ENABLE_EXPORTS property is set for a target (or the CMAKE_ENABLE_EXPORTS variable is set, which causes ENABLE_EXPORTS to be set for all targets.) Closes #554
DRC 129f0cb7 2021-08-25T12:07:58 Neon/AArch64: Don't put GAS functions in .rodata Regression introduced by 240ba417aa4b3174850d05ea0d22dbe5f80553c1 Closes #546
DRC 84d6306f 2021-07-27T11:02:23 Fix build w/CMake 3.14+ when CMAKE_SYSTEM_NAME=iOS Closes #539
DRC 5135c2e2 2021-05-28T12:51:53 Build: Use PIC for jsimd_none.o in shared libs In theory, all objects that will be included in a Un*x shared library must be built using PIC. In practice, most compilers don't require PIC to be explicitly specified for jsimd_none.o, either because the compiler automatically enables PIC in all cases (Ubuntu) or because the size of the generated object is too small. But some rare compilers do require PIC to be explicitly specified for jsimd_none.o. Fixes #520
DRC 3932190c 2021-05-17T13:05:16 Fix build w/ non-GCC-compatible Un*x/Arm compilers Regression introduced by d2c407995992be1f128704ae2479adfd7906c158 Closes #519
DRC 4f51f36e 2021-04-23T11:42:40 Bump version to 2.1.0 to prepare for final release
DRC 2f9e8a11 2021-03-29T18:54:12 OSS-Fuzz integration This commit integrates OSS-Fuzz targets directly into the libjpeg-turbo source tree, thus obsoleting and improving code coverage relative to Google's OSS-Fuzz target for libjpeg-turbo (previously available here: https://github.com/google/oss-fuzz). I hope to eventually create fuzz targets for the BMP, GIF, and PPM readers as well, which would allow for fuzz-testing compression, but since those readers all require an input file, it is unclear how to build an efficient fuzzer around them. It doesn't make sense to fuzz-test compression in isolation, because compression can't accept arbitrary input data.
DRC e795afc3 2021-03-25T22:36:15 SSE2: Fix prog Huff enc err if Sl%32==0 && Al!=0 (regression introduced by 16bd984557fa2c490be0b9665e2ea0d4274528a8) This implements the same fix for jsimd_encode_mcu_AC_refine_prepare_sse2() that a81a8c137b3f1c65082aa61f236aa88af61b3ad4 implemented for jsimd_encode_mcu_AC_first_prepare_sse2(). Based on: https://github.com/MegaByte/libjpeg-turbo/commit/1a59587397150c9ef9dffc5813cb3891db4bc0c8 https://github.com/MegaByte/libjpeg-turbo/commit/eb176a91d87a470bf8c987be786668aa944dd1dd Fixes #509 Closes #510
Adrian Bunk 2c01200c 2021-03-15T19:56:53 Build: Fix incorrect regexes w/ if(...MATCHES...) "arm*" as a regex means 'ar' followed by zero or more 'm' characters, which matches 'parisc' and 'sparc64' as well.
DRC 8a2cad02 2021-01-21T10:51:49 Build: Handle CMAKE_OSX_ARCHITECTURES=(i386|ppc) We don't officially support i386 or PowerPC Mac builds of libjpeg-turbo anymore, but they still work (bearing in mind that PowerPC builds require GCC v4.0 in Xcode 3.2.6, and i386 builds require Xcode 9.x or earlier.) Referring to #495, apparently MacPorts needs this functionality.
DRC 399aa374 2021-01-19T12:25:11 Build: Support CMAKE_OSX_ARCHITECTURES ... as long as it contains only a singular value, which must equal "x86_64" or "arm64". Refer to #495
DRC 3e8911aa 2021-01-11T13:56:01 Build: Use correct SIMD exts w/VStudio IDE + Arm64 When configuring a Visual Studio IDE build and passing -A arm64 to CMake, CMAKE_SYSTEM_PROCESSOR will be amd64, so we should set CPU_TYPE based on the value of CMAKE_GENERATOR_PLATFORM rather than the value of CMAKE_SYSTEM_PROCESSOR.
DRC cfc7e6e5 2020-11-25T14:10:55 Bump revision to 2.0.91 for post-beta fixes
DRC 8cf6f716 2020-11-24T21:32:48 Bump revision to 2.0.90 to prepare for beta
Jonathan Wright eb14189c 2020-11-17T12:48:49 Fix Neon SIMD build issues with Visual Studio - Use the _M_ARM and _M_ARM64 macros provided by Visual Studio for compile-time detection of Arm builds, since __arm__ and __aarch64__ are only present in GNU-compatible compilers. - Neon/intrinsics: Use the _CountLeadingZeros() and _CountLeadingZeros64() intrinsics provided by Visual Studio, since __builtin_clz() and __builtin_clzl() are only present in GNU-compatible compilers. - Neon/intrinsics: Since Visual Studio does not support static vector initialization, replace static initialization of Neon vectors with the appropriate intrinsics. Compared to the static initialization approach, this produces identical assembly code with both GCC and Clang. - Neon/intrinsics: Since Visual Studio does not support inline assembly code, provide alternative code paths for Visual Studio whenever inline assembly is used. - Build: Set FLOATTEST appropriately for AArch64 Visual Studio builds (Visual Studio does not emit fused multiply-add [FMA] instructions by default for such builds.) - Neon/intrinsics: Move temporary buffer allocation outside of nested loops. Since Visual Studio configures Arm builds with a relatively small amount of stack memory, attempting to allocate those buffers within the inner loops caused a stack overflow. Closes #461 Closes #475
DRC 292d78e7 2020-11-16T15:28:02 Merge branch 'master' into dev
DRC 88bf1d16 2020-11-16T14:38:15 Build: Set FLOATTEST more intelligently The "32bit" vs. "64bit" floating point test results actually have nothing to do with the FPU. That was a fallacious assumption based on the observation that, with multiple CPU types, 32-bit and 64-bit builds produce different floating point test results. It seems that this is, in fact, due to differing compiler behavior-- more specifically, whether fused multiply-add (FMA) instructions are used to combine multiple floating point operations into a single instruction ("floating point expression contraction".) GCC does this by default if the target supports FMA instructions, which PowerPC and AArch64 targets both do. Fixes #468
DRC 8f830598 2020-11-13T15:21:26 Merge branch 'master' into dev
DRC 33859880 2020-11-13T12:12:47 Neon: Auto-detect compiler intrinsics completeness This allows the Neon intrinsics code to be built successfully (albeit likely with reduced run-time performance) with Xcode 5.0-6.2 (iOS/AArch64) and Android NDK < r19 (AArch32). Note that Xcode 5.0-6.2 will not build the Armv8 GAS code without gas-preprocessor.pl, and no version of Xcode will build the Armv7 GAS code without gas-preprocessor.pl, so we always use the full Neon intrinsics implementation by default with macOS and iOS builds. Auto-detecting the completeness of the compiler's set of Neon intrinsics also allows us to more intelligently set the default value of NEON_INTRINSICS, based on the values of HAVE_VLD1*. This is a reasonable, albeit imperfect, proxy for whether a compiler has a full and optimal set of Neon intrinsics. Specific notes: - 64-bit RGB-to-YCbCr color conversion does not use any of the intrinsics in question, regresses with GCC - 64-bit accurate integer forward DCT uses vld1_s16_x3(), regresses with GCC - 64-bit Huffman encoding uses vld1q_u8_x4(), regresses with GCC - 64-bit YCbCr-to-RGB color conversion does not use any of the intrinsics in question, regresses with GCC - 64-bit accurate integer inverse DCT uses vld1_s16_x3(), regresses with GCC - 64-bit 4x4 inverse DCT uses vld1_s16_x3(). I did not test this algorithm in isolation, so it may in fact regress with GCC, but the regression may be hidden by the speedup from the new SIMD-accelerated upsampling algorithms. - 32-bit RGB-to-YCbCr color conversion: uses vld1_u16_x2(), regresses with GCC - 32-bit accurate integer forward DCT uses vld1_s16_x3(), regression irrelevant because there was no previous implementation - 32-bit accurate integer inverse DCT uses vld1_s16_x3(), regresses with GCC - 32-bit fast integer inverse DCT does not use any of the intrinsics in question, regresses with GCC - 32-bit 4x4 inverse DCT uses vld1_s16_x3(). 
I did not test this algorithm in isolation, so it may in fact regress with GCC, but the regression may be hidden by the speedup from the new SIMD-accelerated upsampling algorithms. Presumably when GCC includes a full and optimal set of Neon intrinsics, the HAVE_VLD1* tests will pass, and the full Neon intrinsics implementation will be enabled automatically.
DRC 3e9e7c70 2020-11-11T17:54:06 Fix build if WITH_12BIT==1 && WITH_JPEG(7|8)==1 Fixes #466
Jonathan Wright ba52a3de 2018-07-19T18:46:24 Neon: Intrinsics impl of h2v1 & h2v2 merged upsamp There was no previous GAS implementation. This commit also reverts 40557b23015d2f8b576420231b8dd1f39f2ceed8 and 7723d7f7d0aa40349d5bdd1fbe4f8631fd5a2b57. 7723d7f7d0aa40349d5bdd1fbe4f8631fd5a2b57 was only necessary because there was no Neon implementation of merged upsampling/color conversion, and 40557b23015d2f8b576420231b8dd1f39f2ceed8 was only necessary because of 7723d7f7d0aa40349d5bdd1fbe4f8631fd5a2b57.
Jonathan Wright 2acfb93c 2019-05-08T15:43:26 Neon: Intrinsics impl. of h1v2 fancy upsampling There was no previous GAS implementation.
Jonathan Wright 4f2216b4 2019-11-26T18:14:33 Neon: Intrinsics implementation of RGB->YCbCr The previous AArch32 and AArch64 GAS implementations are retained by default when using GCC, in order to avoid a performance regression. The intrinsics implementation can be forced on or off using a new NEON_INTRINSICS CMake variable.
DRC c7dd1912 2020-11-08T15:15:02 Merge branch 'master' into dev
DRC 40557b23 2020-11-06T18:51:55 Build: Fix test failures w/ Arm Neon SIMD exts Regression caused by a46c111d9f3642f0ef3819e7298846ccc61869e0 Because of 7723d7f7d0aa40349d5bdd1fbe4f8631fd5a2b57, which was introduced in libjpeg-turbo 1.5.1 in response to #81, merged upsampling/color conversion is disabled on platforms that have SIMD-accelerated YCbCr -> RGB color conversion but not SIMD-accelerated merged upsampling/color conversion. This was intended to improve performance with the Neon SIMD extensions, since those are the only SIMD extensions for which those circumstances apply. Under normal circumstances, the separate "plain" (non-fancy) upsampling and color conversion routines will produce bitwise-identical output to the merged upsampling/color conversion routines, but that is not the case when skipping scanlines starting at an odd-numbered scanline. The modified test introduced in a46c111d9f3642f0ef3819e7298846ccc61869e0 does precisely that in order to validate the fixes introduced in 9120a247436e84c0b4eea828cb11e8f665fcde30 and a46c111d9f3642f0ef3819e7298846ccc61869e0. Because of 7723d7f7d0aa40349d5bdd1fbe4f8631fd5a2b57, the segfault fixed in 9120a247436e84c0b4eea828cb11e8f665fcde30 and a46c111d9f3642f0ef3819e7298846ccc61869e0 didn't affect the Neon SIMD extensions, so this commit effectively reverts the test modifications in a46c111d9f3642f0ef3819e7298846ccc61869e0 when using those SIMD extensions. We can get rid of this hack, as well as 7723d7f7d0aa40349d5bdd1fbe4f8631fd5a2b57, once a Neon implementation of merged upsampling/color conversion is available.
DRC 59352195 2020-10-19T21:17:46 Merge branch 'master' into dev
DRC f7ca3c5a 2020-10-19T15:34:03 Build: Improve Arm 32-bit cross-comp./packaging - Set CPU_TYPE=arm if performing a 32-bit build on an AArch64 system. This eliminates the need to use a CMake toolchain file. - Set RPMARCH=armv7hl if building on a 32-bit Arm system with an FPU. - Set RPMARCH=armv7hl and DEBARCH=armhf if performing a 32-bit build using a gnueabihf toolchain. - If performing a 32-bit Arm build, generate a 32-bit supplementary DEB package for AArch64 systems.
DRC b8200c66 2019-03-08T11:57:54 Build: Add CMake package config files Based on: https://github.com/hjmallon/libjpeg-turbo/commit/d34b89b41134bd2b581e222514ee493594193d87 Closes #339 Closes #342
DRC fe79f56b 2020-07-28T15:09:00 Merge branch 'master' into dev
DRC a46c111d 2020-07-27T14:21:23 Further jpeg_skip_scanlines() fixes - Introduce a partial image decompression regression test script that validates the correctness of jpeg_skip_scanlines() and jpeg_crop_scanline() for a variety of cropping regions and libjpeg settings. This regression test catches the following issues: #182, fixed in 5bc43c7821df982f65aa1c738f67fbf7cba8bd69; #237, fixed in 6e95c08649794f5018608f37250026a45ead2db8; #244, fixed in 398c1e9acc9b4531edceb3d77da0de5744675052; #441, fully fixed in this commit. It does not catch the following issues: #194, fixed in 773040f9d949d5f313caf7507abaf4bd5d7ffa12; #244 (additional segfault), fixed in 9120a247436e84c0b4eea828cb11e8f665fcde30. - Modify the libjpeg-turbo regression test suite (make test) so that it checks for the issue reported in #441 (segfault in jpeg_skip_scanlines() when used with 4:2:0 merged upsampling/color conversion.) - Fix issues in jpeg_skip_scanlines() that caused incorrect output with h2v2 (4:2:0) merged upsampling/color conversion. The previous commit fixed the segfault reported in #441, but that was a symptom of a larger problem. Because merged 4:2:0 upsampling uses a "spare row" buffer, it is necessary to allow the upsampler to run when skipping rows (fancy 4:2:0 upsampling, which uses context rows, also requires this.) Otherwise, if skipping starts at an odd-numbered row, the output image will be incorrect. - Throw an error if jpeg_skip_scanlines() is called with two-pass color quantization enabled. With two-pass color quantization, the first pass occurs within jpeg_start_decompress(), so subsequent calls to jpeg_skip_scanlines() interfere with the multipass state and prevent the second pass from occurring during subsequent calls to jpeg_read_scanlines().