kmx git

Commit	Date	Message
129f0cb7	2021-08-25T12:07:58	Neon/AArch64: Don't put GAS functions in .rodata Regression introduced by 240ba417aa4b3174850d05ea0d22dbe5f80553c1 Closes #546
0a9b9721	2021-08-09T17:25:36	jmemmgr.c: Pass correct size arg to jpeg_free_*() This issue was introduced in 5557fd22173ea9ab4c02c81e1dcec9bd6927814f due to an oversight, so it has existed in libjpeg-turbo since the project's inception. However, the issue is effectively a non-issue. Although #325 proposes allowing programs to override jpeg_get_*() and jpeg_free_*() externally, there is currently no way to override those functions without modifying the libjpeg-turbo source code. libjpeg-turbo only includes the malloc()/free() memory manager from libjpeg, and the implementation of jpeg_free_*() in that memory manager ignores the size argument. libjpeg had several additional memory managers for legacy systems (MS-DOS, System 7, etc.), but those memory managers ignored the size argument to jpeg_free_*() as well. Thus, this issue would have only potentially affected custom memory managers in downstream libjpeg-turbo forks, and since no one has complained until now, apparently those are rare. Fixes #542
2849d86a	2021-08-06T13:41:15	SSE2/64-bit: Fix trans. segfault w/ malformed JPEG Attempting to losslessly transform certain malformed JPEG images can cause the nbits table index in the Huffman encoder to exceed 32768, so we need to pad the SSE2 implementation of that table to 65536 entries as we do with the C implementation. Regression introduced by 087c29e07f7533ec82fd7eb1dafc84c29e7870ec Fixes #543
84d6306f	2021-07-27T11:02:23	Fix build w/CMake 3.14+ when CMAKE_SYSTEM_NAME=iOS Closes #539
e6e952d5	2021-07-15T17:33:24	Suppress UBSan error in decode_mcu_fast() This is the same error that d147be83e9a9f904918ba7f834b0fb28e09de9b5 suppressed in decode_mcu_slow(). The image that reproduces this error in decode_mcu_fast() has been added to the libjpeg-turbo seed corpora. Closes #537
a72816ed	2021-07-16T09:37:06	Use uintptr_t, if avail, for pointer-to-int casts Although sizeof(void *) == sizeof(size_t) for all architectures that are currently supported by libjpeg-turbo, such is not guaranteed by the C standard. Specifically, CHERI-enabled architectures (e.g. CHERI-RISC-V or Arm's Morello) use capability pointers that are twice the size of size_t (128 bits for Morello and RV64), so casting to size_t strips the upper bits of the pointer (including the validity bit) and makes it non-dereferenceable, as indicated by the following compiler warning: warning: cast from provenance-free integer type to pointer type will give pointer that can not be dereferenced [-Werror,-Wcheri-capability-misuse] cvalue = values = (JCOEF *)PAD((size_t)values_unaligned, 16); Ignoring this warning results in a run-time crash. Casting pointers to uintptr_t, if it is available, avoids this problem, since uintptr_t is defined as an unsigned integer type that can hold a pointer value. Since C89 compatibility is still necessary in libjpeg-turbo, this commit introduces a new typedef for pointer-to-integer casts that uses a GNU-specific extension available in GCC 4.6+ and Clang 3.0+ and falls back to using size_t if the extension is unavailable. The only other options would require C99 or Clang-specific builtins. Closes #538
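The typedef-with-fallback approach described in a72816ed can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the name `ptr_int_t` and the helper `align16()` are hypothetical, and the sketch assumes that the GNU-specific extension in question is the predefined `__UINTPTR_TYPE__` macro.

```c
#include <stddef.h>

/* Hypothetical typedef name.  __UINTPTR_TYPE__ is predefined by recent GCC
   and Clang releases; other compilers fall back to size_t (assumption: this
   mirrors the fallback the commit message describes). */
#ifdef __UINTPTR_TYPE__
typedef __UINTPTR_TYPE__ ptr_int_t;
#else
typedef size_t ptr_int_t;
#endif

/* Round v up to the next multiple of a, where a is a power of 2. */
#define PAD(v, a)  (((v) + (a) - 1) & (~((a) - 1)))

/* Hypothetical helper: return the first 16-byte-aligned address at or
   after p, casting through the pointer-sized integer type. */
static void *align16(void *p)
{
  return (void *)PAD((ptr_int_t)p, (ptr_int_t)16);
}
```

Calling `align16()` on an arbitrary address inside a buffer yields a pointer whose low four bits are zero and that lies at most 15 bytes past the input.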
4d9f256b	2021-07-13T11:52:49	jpegtran: Add option to copy only ICC markers Closes #533
b201838d	2021-07-10T16:07:05	Neon: Silence -Wimplicit-fallthrough warnings Refer to https://bugs.chromium.org/p/chromium/issues/detail?id=995993 Closes #534
a1bfc058	2021-07-12T13:52:38	Neon/AArch32: Mark inline asm output as read/write 'buffer' is both passed into the inline assembly code and modified by it. See https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html, 6.47.2.3. With GCC 4, this commit does not change the generated assembly code at all. With GCC 8, this commit fixes an assembly error: /tmp/{foo}.s: Assembler messages: /tmp/{foo}.s:775: Error: registers may not be the same -- `str r9,[r9],#4' I'm not sure why that error went unnoticed, since I definitely benchmarked the previous commit with GCC 8. Anyhow, this commit changes the generated assembly code slightly but does not alter performance. With Clang 10, this commit changes the generated assembly code slightly but does not alter performance. Refer to #529
2a2970af	2021-07-09T15:35:56	Neon/AArch32: Work around Clang T32 miscompilation Referring to the C standard (http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf, J.2 Undefined behavior), the behavior of the compiler is undefined if "conversion between two pointer types produces a result that is incorrectly aligned." Thus, the behavior of this code *((uint32_t *)buffer) = BUILTIN_BSWAP32(put_buffer); in the AArch32 version of the FLUSH() macro is undefined unless 'buffer' is 32-bit-aligned. Referring to https://bugs.llvm.org/show_bug.cgi?id=50785, certain versions of Clang, when generating Thumb (T32) instructions, miscompile that code into an assembly instruction (stm) that requires the destination to be 32-bit-aligned. Since such alignment cannot be guaranteed within the Huffman encoder, this reportedly led to crashes (SIGBUS: illegal alignment) with AArch32/Thumb builds of libjpeg-turbo running on Android devices, although thus far I have been unable to reproduce those crashes with a plain Linux/Arm system. The miscompilation is visible with the Compiler Explorer: https://godbolt.org/z/rv1ccx1Pb However, it goes away when removing the return statement from the function. Thus, it seems that Clang's behavior in this regard is somewhat variable, which may explain why the crashes are only reproducible on certain platforms. The suggested workaround is to use memcpy(), but whereas Clang and recent GCC releases are smart enough to compile a 4-byte memcpy() call into a str instruction, GCC < 6 is not. Referring to https://godbolt.org/z/ae7Wje3P6, the only way to consistently produce the desired str instruction across all supported compilers is to use inline assembly. Visual C++ presumably does not miscompile the code in question, since no issues have been reported with it, but since the code relies on undefined compiler behavior, prudence dictates that e4ec23d7ae051c1c73947f889818900362fdc52d should be reverted for Visual C++, which this commit does. The performance impact of e4ec23d7ae051c1c73947f889818900362fdc52d for Visual C++/Arm builds is unknown (I have no ability to test such builds), but regardless, this commit reverts the Visual C++/Arm performance to that of libjpeg-turbo 2.1 beta1. Closes #529
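For reference, the memcpy() workaround that 2a2970af mentions (and ultimately passes over in favor of inline assembly) looks roughly like the sketch below. The function names `bswap32()` and `store_swapped32()` are hypothetical stand-ins, and the portable shift-based byte swap is used only to keep the example self-contained.

```c
#include <stdint.h>
#include <string.h>

/* Portable 32-bit byte swap (stand-in for a compiler builtin). */
static uint32_t bswap32(uint32_t x)
{
  return ((x & 0x000000FFu) << 24) | ((x & 0x0000FF00u) << 8) |
         ((x & 0x00FF0000u) >> 8) | ((x & 0xFF000000u) >> 24);
}

/* Store the byte-swapped value at 'buffer', which may be unaligned.
   Unlike *((uint32_t *)buffer) = ..., memcpy() makes no alignment
   assumption, so the behavior is defined for any buffer address. */
static void store_swapped32(unsigned char *buffer, uint32_t value)
{
  uint32_t swapped = bswap32(value);

  memcpy(buffer, &swapped, sizeof(swapped));
}
```

Whether the compiler reduces the memcpy() call to a single store instruction depends on the compiler version, which is the reason the commit gives for not adopting this form.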
97a1575c	2021-07-07T14:38:17	RPM: Don't include system lib dir in file list This resolves a conflict between the RPM generated by the libjpeg-turbo build system and the Red Hat 'filesystem' RPM if CMAKE_INSTALL_LIBDIR=/usr/lib[64]. This code was largely borrowed from the VirtualGL RPM spec. (I can legally do that because I hold the copyright on VirtualGL's implementation.) Fixes #532
0081c2de	2021-07-07T10:12:46	Neon/AArch32: Fix build if 'soft' float ABI used Arm compilers have three floating point ABI options: 'soft' compiles floating point operations as function calls into a software floating point library, which emulates floating point operations using integer operations. Floating point function arguments are passed using integer registers. 'softfp' also compiles floating point operations as function calls into a floating point library and passes floating point function arguments using integer registers, but the floating point library functions can use FPU instructions if the CPU supports them. 'hard' compiles floating point operations into inline FPU instructions, similarly to x86 and other architectures, and passes floating point function arguments using FPU registers. Not all AArch32 CPUs have FPUs or support Neon instructions, so on Linux and Android platforms, the AArch32 SIMD dispatcher in libjpeg-turbo only enables the Neon SIMD extensions at run time if /proc/cpuinfo indicates that the CPU supports Neon instructions or if Neon instructions are explicitly enabled (e.g. by passing -mfpu=neon to the compiler.) In order to support all AArch32 CPUs using the same code base, i.e. to support run-time FPU and Neon auto-detection, it is necessary to compile the scalar C source code using -mfloat-abi=soft. However, the 'soft' floating point ABI cannot be used when compiling Neon intrinsics, so the intrinsics implementation of the Neon SIMD extensions must be compiled using -mfloat-abi=softfp if the scalar C source code is compiled using -mfloat-abi=soft. This commit modifies the build system so that it detects whether -mfloat-abi=softfp must be explicitly added to the compiler flags when building the intrinsics implementation of the Neon SIMD extensions. This will be necessary if the build is using the 'soft' floating point ABI along with run-time auto-detection of Neon instructions. Fixes #523
9df5786f	2021-06-26T16:22:21	Fix -Wimplicit-fallthrough warnings with Clang The existing /*FALLTHROUGH*/ comments work with GCC but not Clang, so this commit adds a FALLTHROUGH macro that uses the 'fallthrough' attribute if the compiler supports it. Refer to https://bugs.chromium.org/p/chromium/issues/detail?id=995993 NOTE: All versions of GCC that support -Wimplicit-fallthrough also support the 'fallthrough' attribute, but certain other compilers (Oracle Solaris Studio, for instance) support /*FALLTHROUGH*/ but not the 'fallthrough' attribute. Thus, this commit retains the /*FALLTHROUGH*/ comments, which have existed in the libjpeg code base in some form since 1994 (libjpeg v5.) Closes #531
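A macro of the kind 9df5786f describes might be written as in the following sketch. The feature test via `__has_attribute` is an assumption about how support is detected, and `count_from()` is a hypothetical function that exists only to exercise the macro.

```c
/* Use the 'fallthrough' statement attribute when the compiler reports
   support for it; otherwise expand to nothing and rely on the comment. */
#if defined(__has_attribute)
#if __has_attribute(fallthrough)
#define FALLTHROUGH  __attribute__((fallthrough));
#endif
#endif
#ifndef FALLTHROUGH
#define FALLTHROUGH
#endif

/* Hypothetical example: count how many cases execute from 'start'. */
static int count_from(int start)
{
  int n = 0;

  switch (start) {
  case 0:
    n++;
    FALLTHROUGH                 /*FALLTHROUGH*/
  case 1:
    n++;
    FALLTHROUGH                 /*FALLTHROUGH*/
  case 2:
    n++;
    break;
  default:
    break;
  }
  return n;
}
```

Keeping the comment next to the macro preserves the annotation for compilers that recognize only the comment form.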
1a1fb615	2021-06-18T09:46:03	ChangeLog.md: List CVE ID fixed by c76f4a08 Referring to #527, the security community did not assign this CVE ID until more than 8 months after the fix for the issue was released. By the time they assigned the ID, libjpeg-turbo already had two production releases containing the fix. This calls into question the usefulness of assigning a CVE ID to the issue, particularly given that the buffer overrun in question was fully contained in the stack, not detectable with valgrind, and confined to lossless transformation (it did not affect JPEG compression or decompression.) https://vuldb.com/?id.176175 says that "the exploitability is told to be easy" but provides no clarification, and given that the author of that page does not seem to be aware that a fix for the issue has been available since early December of 2019, it calls into question the accuracy of everything else on the page. It would really be nice if the security community approached me about these things before wasting my time, but I guess it's my lot in life to modify a change log entry from 2019 to include a CVE ID from 2020. So it goes...
5135c2e2	2021-05-28T12:51:53	Build: Use PIC for jsimd_none.o in shared libs In theory, all objects that will be included in a Un*x shared library must be built using PIC. In practice, most compilers don't require PIC to be explicitly specified for jsimd_none.o, either because the compiler automatically enables PIC in all cases (Ubuntu) or because the size of the generated object is too small. But some rare compilers do require PIC to be explicitly specified for jsimd_none.o. Fixes #520
3932190c	2021-05-17T13:05:16	Fix build w/ non-GCC-compatible Un*x/Arm compilers Regression introduced by d2c407995992be1f128704ae2479adfd7906c158 Closes #519
a219fd13	2021-05-12T10:58:59	GitHub bug-report.md: "master" branch --> "main"
c23672ce	2021-04-23T13:05:25	GitHub Actions: Don't build tags Our workflow script does not currently work with tags, and there is no point to building tags anyhow, since we do not use the CI system to spin official builds.
4f51f36e	2021-04-23T11:42:40	Bump version to 2.1.0 to prepare for final release
e0606daf	2021-04-21T14:49:06	TurboJPEG: Update JPEG buf ptrs on comp/xform err When using the in-memory destination manager, it is necessary to explicitly call the destination manager's term_destination() method if an error occurs. That method is called by jpeg_finish_compress() but not by jpeg_abort_compress(). This fixes a potential double free() that could occur if tjCompress*() or tjTransform() returned an error and the calling application tried to clean up a JPEG buffer that was dynamically re-allocated by one of those functions.
ffc1aa96	2021-04-21T11:06:22	Include TJ.FLAG_LIMITSCANS in JNI header (oversight from c81e91e8ca34f4e8b43cf48277c2becf3fe9447d) This is purely cosmetic, since the JNI wrapper doesn't actually use that flag.
55ec9b3b	2021-04-21T11:04:42	OSS-Fuzz: Code comment tweaks for compr. targets (oversight from 171b875b272f47f1ae42a5009c64f424db22a95b)
4de8f692	2021-04-16T16:34:12	jdhuff.h: Fix ASan regression caused by 8fa70367 The 0xFF is, in fact, necessary.
785ec30e	2021-04-16T15:59:38	cjpeg_fuzzer: Add cov for h2v2 smooth downsampling
d147be83	2021-04-15T23:31:51	Huff decs: Fix/suppress more innocuous UBSan errs - UBSan complained that entropy->restarts_to_go was underflowing an unsigned integer when it was decremented while cinfo->restart_interval == 0. That was, of course, completely innocuous behavior, since the result of the underflowing computation was never used. - d3a3a73f64041c6a6905faf6f9f9832e735fd880 and 7bc9fca4309563d66b0c5665a616285d0e9baeb4 silenced a UBSan signed integer overflow error, but unfortunately other malformed JPEG images have been discovered that cause unsigned integer overflow in the same computation. Since, to the best of our understanding, this behavior is innocuous, this commit reverts the commits listed above, suppresses the UBSan errors, and adds code comments to document the issue.
8fa70367	2021-04-15T22:26:53	Huff dec: Fix non-deterministic output w/bad input Referring to https://bugzilla.mozilla.org/show_bug.cgi?id=1050342, there are certain very rare circumstances under which a malformed JPEG image can cause different Huffman decoder output to be produced, depending on the size of the source manager's I/O buffer. (More specifically, the fast Huffman decoder didn't handle invalid codes in the same manner as the slow decoder, and since the fast decoder requires all data to be memory-resident, the buffering strategy determines whether or not the fast decoder can be used on a particular MCU block.) After extensive experimentation, the Mozilla and Chrome developers and I determined that this truly was an innocuous issue. The patch that both browsers adopted as a workaround caused a performance regression with 32-bit code, which is why it was not accepted into libjpeg-turbo. This commit fixes the problem in a less disruptive way with no performance regression.
171b875b	2021-04-15T19:03:53	OSS-Fuzz: Check img size b4 readers allocate mem After the completion of the start_input() method, it's too late to check the image size, because the image readers may have already tried to allocate memory for the image. If the width and height are excessively large, then attempting to allocate memory for the image could slow performance or lead to out-of-memory errors prior to the fuzz target checking the image size. NOTE: Specifically, the aforementioned OOM errors and slow units were observed with the compression fuzz targets when using MSan.
3ab32348	2021-04-13T11:51:29	OSS-Fuzz: More code coverage improvements
3e68a5ee	2021-04-12T14:37:43	jchuff.c: Fix MSan error Certain rare malformed input images can cause the Huffman encoder to generate a value for nbits that corresponds to an uninitialized member of the DC code table. The ramifications of this are minimal and would basically amount to a different bogus JPEG image being generated from a particular bogus input image.
4e451616	2021-04-12T11:53:29	compress_yuv_fuzzer: Minor code coverage tweak
629e96ee	2021-04-12T11:52:55	cjpeg.c: Code formatting tweak
ebaa67ea	2021-04-12T10:38:52	rdbmp.c: Fix more innocuous UBSan errors - Referring to 3311fc00010c6cb305d87525c9ef60ebdf036cfc, we need to use unsigned intermediate math in order to make UBSan happy, even though (JDIMENSION)(A * B) is effectively the same as (JDIMENSION)A *(JDIMENSION)B, regardless of intermediate overflow. - Because of the previous commit, it is now possible for bfOffBits to be INT_MIN, which would cause the initial computation of bPad to underflow a signed integer. Thus, we need to check for that possibility as soon as we know the values of bfOffBits and headerSize. The worst case from this regression is that bPad could wrap around to a large positive value, which would cause a "Premature end of input file" error in the subsequent read_byte() loop. Thus, this issue was effectively innocuous as well, since it resulted in catching the same error later and in a different way. Also, the issue was very well-contained, since it was both introduced and fixed as part of the ongoing OSS-Fuzz integration project.
dd830b3f	2021-04-09T17:36:41	rdbmp.c/rdppm.c: Fix more innocuous UBSan errors - rdbmp.c: Because of 8fb37b81713a0cdc14622dc08892ebd28a3233aa, bfOffBits, biClrUsed, and headerSize were made into unsigned ints. Thus, if bPad would eventually be negative due to a malformed header, UBSan complained about unsigned math being used in the intermediate computations. It was unnecessary to make those variables unsigned, since they are only meant to hold small values, so this commit makes them signed again. The UBSan error was innocuous, because it is effectively (if not officially) the case that (int)((unsigned int)a - (unsigned int)b) == (int)a - (int)b. - rdbmp.c: If (biWidth * source->bits_per_pixel / 8) would overflow an unsigned int, then UBSan complained at the point at which row_width was set in start_input_bmp(), even though the overflow would have been detected later in the function. This commit adds overflow checks prior to setting row_width. - rdppm.c: read_pbm_integer() now bounds-checks the intermediate value computations in order to catch integer overflow caused by a malformed text PPM. It's possible, though extremely unlikely, that the intermediate value computations could have wrapped around to a value smaller than maxval, but the worst case is that this would have generated a bogus pixel in the uncompressed image rather than throwing an error.
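The kind of bounds check that dd830b3f describes for read_pbm_integer() can be illustrated with the sketch below. It is not the rdppm.c code: `parse_bounded_uint()` is a hypothetical helper that parses from a string rather than a file, and it shows only the idea of rejecting a value before the intermediate computation can wrap around.

```c
/* Hypothetical helper: parse a run of decimal digits at s into *out.
   Returns 1 on success, or 0 if there are no digits or if the value
   would exceed maxval.  The check is done before multiplying, so the
   intermediate value can never overflow. */
static int parse_bounded_uint(const char *s, unsigned int maxval,
                              unsigned int *out)
{
  unsigned int val = 0;

  if (*s < '0' || *s > '9')
    return 0;
  while (*s >= '0' && *s <= '9') {
    unsigned int digit = (unsigned int)(*s - '0');

    /* val * 10 + digit <= maxval  <=>  val <= (maxval - digit) / 10 */
    if (digit > maxval || val > (maxval - digit) / 10)
      return 0;
    val = val * 10 + digit;
    s++;
  }
  *out = val;
  return 1;
}
```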
4ede2ef5	2021-04-09T17:26:19	OSS-Fuzz: cjpeg fuzz target
5cda8c5e	2021-04-09T13:12:32	compress_yuv_fuzzer: Use unique filename template
47b66d1d	2021-04-09T11:26:34	OSS-Fuzz: Fix UBSan err caused by TJFLAG_FUZZING
55ab0d39	2021-04-08T16:13:06	OSS-Fuzz: YUV encoding/compression fuzz target
18bc4c61	2021-04-07T16:04:58	compress.cc: Code formatting tweak
b1079002	2021-04-07T15:51:05	rdppm.c: Fix innocuous MSan error A fuzzing test case that was effectively a 1-pixel PGM file with a maximum value of 1 and an actual value of 8 caused an uninitialized member of the rescale[] array to be accessed in get_gray_rgb_row() or get_gray_cmyk_row(). Since, for performance reasons, those functions do not perform bounds checking on the PPM values, we need to ensure that unused members of the rescale[] array are initialized.
3311fc00	2021-04-07T14:20:49	rdbmp.c: Fix innocuous UBSan error A fuzzing test case with an image width of 838860946 triggered a UBSan error: rdbmp.c:633:34: runtime error: signed integer overflow: 838860946 * 3 cannot be represented in type 'int' Because the result is cast to an unsigned int (JDIMENSION), this error is irrelevant, because (unsigned int)((int)838860946 * (int)3) == (unsigned int)838860946 * (unsigned int)3
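The width from the test case in 3311fc00 makes a convenient worked example of unsigned intermediate math. In the sketch below, `row_samples()` is a hypothetical function, and the typedef assumes that JDIMENSION is a 32-bit unsigned int. Casting each operand before multiplying keeps the arithmetic unsigned, so it is well-defined even when the signed product would not fit in an int.

```c
typedef unsigned int JDIMENSION;  /* assumption: 32-bit unsigned int */

/* Hypothetical helper: samples per row, computed with unsigned math.
   838860946 * 3 = 2516582838, which exceeds INT_MAX (2147483647) but
   fits in a 32-bit unsigned int. */
static JDIMENSION row_samples(int width, int components)
{
  return (JDIMENSION)width * (JDIMENSION)components;
}
```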
34d264d6	2021-04-07T12:44:50	OSS-Fuzz: Private TurboJPEG API flag for fuzzing This limits the tjLoadImage() behavioral changes to the scope of the compress_fuzzer target. Otherwise, TJBench in fuzzer builds would refuse to load images larger than 1 Mpixel.
f35fd27e	2021-04-06T12:51:03	tjLoadImage: Fix issues w/loading 16-bit PPMs/PGMs - The PPM reader now throws an error rather than segfaulting (due to a buffer overrun) if an application attempts to load a 16-bit PPM file into a grayscale uncompressed image buffer. No known applications allowed that (not even the test applications in libjpeg-turbo), because that mode of operation was never expected to work and did not work under any circumstances. (In fact, it was necessary to modify TJBench in order to reproduce the issue outside of a fuzzing environment.) This was purely a matter of making the library bow out gracefully rather than crash if an application tries to do something really stupid. - The PPM reader now throws an error rather than generating incorrect pixels if an application attempts to load a 16-bit PGM file into an RGB uncompressed image buffer. - The PPM reader now correctly loads 16-bit PPM files into extended RGB uncompressed image buffers. (Previously it generated incorrect pixels unless the input colorspace was JCS_RGB or JCS_EXT_RGB.) The only way that users could have potentially encountered these issues was through the tjLoadImage() function. cjpeg and TJBench were unaffected.
df17d398	2021-04-06T11:34:30	jcphuff.c: -Wjump-misses-init warning w/GCC 9 -m32 (verified that this commit does not change the generated 64-bit or 32-bit assembly code)
cd9a3185	2021-04-05T22:20:52	Bump TurboJPEG C API version to 2.1 (because of TJFLAG_LIMITSCANS)
d2d44655	2021-04-05T21:41:30	OSS-Fuzz: Compression fuzz target
5536ace1	2021-04-05T21:12:29	OSS-Fuzz: Fix C++11 compiler warnings in targets
5dd906be	2021-04-05T17:47:34	OSS-Fuzz: Test non-default opts w/ decompress_yuv The non-default options were not being tested because of a pixel format comparison buglet. This commit also changes the code in both decompression fuzz targets such that non-default options are tested based on the pixel format index rather than the pixel format value, which is a bit more idiot-proof.
c81e91e8	2021-04-05T16:08:22	TurboJPEG: New flag for limiting prog JPEG scans This also fixes timeouts reported by OSS-Fuzz.
bff7959e	2021-04-02T14:53:43	OSS-Fuzz: Require static libraries Refer to https://google.github.io/oss-fuzz/further-reading/fuzzer-environment/#runtime-dependencies for the reasons why this is necessary.
6ad658be	2021-04-02T14:50:35	OSS-Fuzz: Build fuzz targets using C++ compiler Otherwise, the targets will require libstdc++, the i386 version of which is not available in the OSS-Fuzz runtime environment. The OSS-Fuzz build environment passes -stdlib:libc++ in the CXXFLAGS environment variable in order to mitigate this issue, since the runtime environment has the i386 version of libc++, but using that compiler flag requires using the C++ compiler.
7b57cba6	2021-03-31T11:16:51	OSS-Fuzz: Fix uninitialized reads detected by MSan
2f9e8a11	2021-03-29T18:54:12	OSS-Fuzz integration This commit integrates OSS-Fuzz targets directly into the libjpeg-turbo source tree, thus obsoleting and improving code coverage relative to Google's OSS-Fuzz target for libjpeg-turbo (previously available here: https://github.com/google/oss-fuzz). I hope to eventually create fuzz targets for the BMP, GIF, and PPM readers as well, which would allow for fuzz-testing compression, but since those readers all require an input file, it is unclear how to build an efficient fuzzer around them. It doesn't make sense to fuzz-test compression in isolation, because compression can't accept arbitrary input data.
e4ec23d7	2021-02-10T16:45:50	Neon: Use byte-swap builtins instead of inline asm Define compiler-independent byte-swap macros and use them instead of executing 'rev' via inline assembly code with GCC-compatible compilers or a slow shift-store sequence with Visual C++. * This produces identical assembly code with: - 64-bit GCC 8.4.0 (Linux) - 64-bit GCC 9.3.0 (Linux) - 64-bit Clang 10.0.0 (Linux) - 64-bit Clang 10.0.0 (MinGW) - 64-bit Clang 12.0.0 (Xcode 12.2, macOS) - 64-bit Clang 12.0.0 (Xcode 12.2, iOS) * This produces different assembly code with: - 64-bit GCC 4.9.1 (Linux) - 32-bit GCC 4.8.2 (Linux) - 32-bit GCC 8.4.0 (Linux) - 32-bit GCC 9.3.0 (Linux) Since the intrinsics implementation of Huffman encoding is not used by default with these compilers, this is not a concern. - 32-bit Clang 10.0.0 (Linux) Verified performance neutrality Closes #507
e795afc3	2021-03-25T22:36:15	SSE2: Fix prog Huff enc err if Sl%32==0 && Al!=0 (regression introduced by 16bd984557fa2c490be0b9665e2ea0d4274528a8) This implements the same fix for jsimd_encode_mcu_AC_refine_prepare_sse2() that a81a8c137b3f1c65082aa61f236aa88af61b3ad4 implemented for jsimd_encode_mcu_AC_first_prepare_sse2(). Based on: https://github.com/MegaByte/libjpeg-turbo/commit/1a59587397150c9ef9dffc5813cb3891db4bc0c8 https://github.com/MegaByte/libjpeg-turbo/commit/eb176a91d87a470bf8c987be786668aa944dd1dd Fixes #509 Closes #510
2c01200c	2021-03-15T19:56:53	Build: Fix incorrect regexes w/ if(...MATCHES...) "arm*" as a regex means 'ar' followed by zero or more 'm' characters, which matches 'parisc' and 'sparc64' as well.
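The regex pitfall in 2c01200c can be reproduced outside of CMake. The sketch below uses POSIX extended regular expressions as a stand-in for CMake's regex engine (an assumption; the two engines agree on this simple pattern), and the anchored pattern "^arm" is only an illustration, not necessarily the expression the commit adopted. `matches()` is a hypothetical helper.

```c
#include <stddef.h>
#include <regex.h>

/* Hypothetical helper: return 1 if 'pattern' matches anywhere in 'text',
   0 if it does not, or -1 if the pattern fails to compile. */
static int matches(const char *pattern, const char *text)
{
  regex_t re;
  int rc;

  if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec(&re, text, 0, NULL, 0);
  regfree(&re);
  return rc == 0;
}
```

With this helper, "arm*" matches "parisc" and "sparc64" because both contain "ar", whereas "^arm" matches "armv7l" but neither of the other two.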
ed70101d	2021-03-15T12:36:55	ChangeLog.md: List CVE ID fixed by 1719d12e Referring to https://bugzilla.redhat.com/show_bug.cgi?id=1937385#c2, it is my opinion that the severity of this bug was grossly overstated and that a CVE never should have been assigned to it, but since one was assigned, users need to know which version of libjpeg-turbo contains the fix. Dear security community, please learn what "DoS" actually means and stop misusing that term for dramatic effect. Thanks.
8a2cad02	2021-01-21T10:51:49	Build: Handle CMAKE_OSX_ARCHITECTURES=(i386\|ppc) We don't officially support i386 or PowerPC Mac builds of libjpeg-turbo anymore, but they still work (bearing in mind that PowerPC builds require GCC v4.0 in Xcode 3.2.6, and i386 builds require Xcode 9.x or earlier.) Referring to #495, apparently MacPorts needs this functionality.
b6772910	2021-01-19T15:32:32	Add Sponsor button for GitHub repository
399aa374	2021-01-19T12:25:11	Build: Support CMAKE_OSX_ARCHITECTURES ... as long as it contains only a singular value, which must equal "x86_64" or "arm64". Refer to #495
1719d12e	2021-01-14T18:35:15	cjpeg: Fix FPE when compressing 0-width GIF Fixes #493
486cdcfb	2021-01-12T17:45:55	Fix build with Visual C++ and /std:c11 or /std:c17 Fixes #481 Closes #482
74e6ea45	2021-01-05T20:23:11	Neon: Fix Huffman enc. error w/Visual Studio+Clang The GNU builtin function __builtin_clzl() accepts an unsigned long argument, which is 8 bytes wide on LP64 systems (most Un*x systems, including Mac) but 4 bytes wide on LLP64 systems (Windows.) This caused the Neon intrinsics implementation of Huffman encoding to produce mathematically incorrect results when compiled using Visual Studio with Clang. This commit changes all invocations of __builtin_clzl() in the Neon SIMD extensions to __builtin_clzll(), which accepts an unsigned long long argument that is guaranteed to be 8 bytes wide on all systems. Fixes #480 Closes #490
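The LP64/LLP64 distinction in 74e6ea45 comes down to which builtin is applied to a 64-bit value. A minimal sketch, assuming a GCC-compatible compiler and using the hypothetical wrapper name `leading_zeros64()`:

```c
#include <stdint.h>

/* Count leading zeros in a 64-bit value (x must be nonzero).
   __builtin_clzll() takes an unsigned long long, which is at least
   64 bits wide on every data model, whereas __builtin_clzl() takes an
   unsigned long, which is only 32 bits wide on LLP64 (Windows). */
static int leading_zeros64(uint64_t x)
{
  return __builtin_clzll((unsigned long long)x);
}
```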
d2c40799	2020-12-17T16:02:47	Use CLZ compiler intrinsic for Windows/Arm builds The __builtin_clz() compiler intrinsic was already used in the C Huffman encoders when building libjpeg-turbo for Arm CPUs using a GCC-compatible compiler. This commit modifies the C Huffman encoders so that they also use __builtin_clz() when building for Arm CPUs using Visual Studio + Clang, as well as the equivalent _CountLeadingZeros() compiler intrinsic when building for Arm CPUs using Visual C++. In addition to making the C Huffman encoders faster on Windows/Arm, this also prevents jpeg_nbits_table from being included in Windows/Arm builds, thus saving 128 KB of memory.
3e8911aa	2021-01-11T13:56:01	Build: Use correct SIMD exts w/VStudio IDE + Arm64 When configuring a Visual Studio IDE build and passing -A arm64 to CMake, CMAKE_SYSTEM_PROCESSOR will be amd64, so we should set CPU_TYPE based on the value of CMAKE_GENERATOR_PLATFORM rather than the value of CMAKE_SYSTEM_PROCESSOR.
4b838c38	2021-01-11T13:45:25	jcphuff.c: Fix compiler warning with clang-cl Fixes #492
944f5915	2021-01-08T12:41:02	Migrate from Travis CI to GitHub Actions Note that this removes our ability to regression test the Armv8 and PowerPC SIMD extensions, effectively reverting a524b9b06be2e0c24d8abc6528cf29316cfe8dc5 and 02227e48a990911a6da35ab8034911a9fbc1055a, but at the moment, there is no other way.
3179f330	2021-01-04T14:54:35	tjexample.c: Fix mem leak if tjTransform() fails Fixes #479
1388ad67	2020-12-08T21:25:47	Build: Officially support Ninja
110d8d6d	2020-12-07T11:12:49	decompress_smooth_data(): Fix another uninit. read Regression introduced by 42825b68d570fb07fe820ac62ad91017e61e9a25 The test case https://user-images.githubusercontent.com/3491627/101376530-fde56180-38b0-11eb-938d-734119a5b5ba.jpg is a malformed progressive JPEG image containing an interleaved Y/Cb/Cr DC scan followed by two non-interleaved Y DC scans. Thus, the prev_coef_bits[] array was initialized for the Y component but not the other components, the uninitialized values for Cb and Cr were transferred to the prev_coef_bits_latch[] array in smoothing_ok(), and because cinfo->master->last_good_iMCU_row was 0, decompress_smooth_data() read those uninitialized values when attempting to smooth the second iMCU row. Possibly fixes #478
7b687649	2020-12-03T19:15:07	LICENSE.md: Remove trailing whitespace Use <br> to indicate a line break, as we do in README.md, in order to make checkstyle happy.
21d05684	2020-12-03T18:50:08	Build: Test for correct AArch32 RPM/DEBARCH value ... based on the floating point ABI being used by the compiler (which do you choose, a hard or soft option?)
6e4509a3	2020-12-01T09:04:27	LICENSE.md: Formatting tweak
c7ca521b	2020-11-28T06:38:27	Fix uninitialized read in decompress_smooth_data() Regression introduced by 42825b68d570fb07fe820ac62ad91017e61e9a25 Referring to the discussion in #459, the OSS-Fuzz test case https://github.com/libjpeg-turbo/libjpeg-turbo/files/5597075/clusterfuzz-testcase-minimized-pngsave_buffer_fuzzer-5728375846731776.txt created a situation in which cinfo->output_iMCU_row > cinfo->master->last_good_iMCU_row but cinfo->input_scan_number == 1 thus causing decompress_smooth_data() to read from prev_coef_bits_latch[], which was uninitialized. I was unable to create the same situation with a real JPEG image.
ccaba5d7	2020-11-25T14:55:55	Fix buffer overrun with certain narrow prog JPEGs Regression introduced by 6d91e950c871103a11bac2f10c63bf998796c719 last_block_column in decompress_smooth_data() can be 0 if, for instance, decompressing a 4:4:4 image of width 8 or less or a 4:2:2 or 4:2:0 image of width 16 or less. Since last_block_column is an unsigned int, subtracting 1 from it produced 0xFFFFFFFF, the test in line 590 passed, and we attempted to access blocks from a second block column that didn't actually exist. Closes #476
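The wraparound behind ccaba5d7 is easy to demonstrate in isolation. The two functions below are hypothetical and are not taken from decompress_smooth_data(); they only contrast a comparison that subtracts from an unsigned value with an equivalent one that adds to the other side instead.

```c
/* Wraparound-prone form: if last_block_column == 0, then
   last_block_column - 1 wraps to UINT_MAX, so the test passes. */
static int second_next_column_exists_buggy(unsigned int block_num,
                                           unsigned int last_block_column)
{
  return block_num < last_block_column - 1;
}

/* Safer form: the same comparison with the 1 moved to the left side,
   which cannot wrap for realistic block counts. */
static int second_next_column_exists_safe(unsigned int block_num,
                                          unsigned int last_block_column)
{
  return block_num + 1 < last_block_column;
}
```

For an image that is only one block wide (last_block_column == 0), the first form reports that a further block column exists and the second does not; for wider images the two agree.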
cfc7e6e5	2020-11-25T14:10:55	Bump revision to 2.0.91 for post-beta fixes
4e52b66f	2020-11-24T21:54:42	Travis: Use Docker tag that matches Git branch
8cf6f716	2020-11-24T21:32:48	Bump revision to 2.0.90 to prepare for beta
eb14189c	2020-11-17T12:48:49	Fix Neon SIMD build issues with Visual Studio - Use the _M_ARM and _M_ARM64 macros provided by Visual Studio for compile-time detection of Arm builds, since __arm__ and __aarch64__ are only present in GNU-compatible compilers. - Neon/intrinsics: Use the _CountLeadingZeros() and _CountLeadingZeros64() intrinsics provided by Visual Studio, since __builtin_clz() and __builtin_clzl() are only present in GNU-compatible compilers. - Neon/intrinsics: Since Visual Studio does not support static vector initialization, replace static initialization of Neon vectors with the appropriate intrinsics. Compared to the static initialization approach, this produces identical assembly code with both GCC and Clang. - Neon/intrinsics: Since Visual Studio does not support inline assembly code, provide alternative code paths for Visual Studio whenever inline assembly is used. - Build: Set FLOATTEST appropriately for AArch64 Visual Studio builds (Visual Studio does not emit fused multiply-add [FMA] instructions by default for such builds.) - Neon/intrinsics: Move temporary buffer allocation outside of nested loops. Since Visual Studio configures Arm builds with a relatively small amount of stack memory, attempting to allocate those buffers within the inner loops caused a stack overflow. Closes #461 Closes #475
91dd3b23	2020-11-24T19:22:38	ChangeLog: macOS Armv8/x86-64 univ. binary support
7e0d94d3	2020-11-24T20:31:51	Merge branch 'master' into dev
1c839761	2020-11-24T18:51:16	Force Git to treat testorig.ppm as a binary file Otherwise, because the file begins with an ASCII header, Git will erroneously treat it as an ASCII file, and if Git for Windows is configured with default options (specifically, "Checkout windows-style, commit Unix-style line endings"), it will add carriage return characters to all of the "linefeed" characters in the PPM file, thus corrupting it and causing libjpeg-turbo's regression tests to fail.
6d91e950	2020-10-05T13:37:44	Use 5x5 win & 9 AC coeffs when smoothing DC scans ... of progressive images. Based on: https://github.com/mo271/libjpeg-turbo/commit/be8d36d13b79a472e56da0717ba067e6139bc0e1 https://github.com/mo271/libjpeg-turbo/commit/9d528f278ee3a5ba571c0b9ec4567c557614fb25 https://github.com/mo271/libjpeg-turbo/commit/85f36f0765ea2c28909fc4c0e570cd68d3a1ed85 https://github.com/mo271/libjpeg-turbo/commit/63a4d39e387f61bcb83b393838f436b410b97308 https://github.com/mo271/libjpeg-turbo/commit/51336a6ad5acb9379dc8e3e5e5758fd439224b7c Closes #459 Closes #474
d523435e	2020-11-19T19:30:38	Travis: Use Xcode 12.2 for all iOS & macOS builds There doesn't seem to be any performance or compatibility downside to this, and it has the advantages of simplicity and consistency between the PR and official builds.
1ac83cd6	2020-11-18T18:16:12	Travis: The Mac build log is now log-macos.txt (oversight from f7a10a61e3bbab14d2e901c8823cec4961a46b2f)
0ba70b6a	2020-11-18T15:01:24	Build: Support macOS Armv8/x86-64 univ. binaries - Rename IOS_ARMV8_BUILD to ARMV8_BUILD. - Rename install_ios() to install_subbuild() in makemacpkg. - Wordsmith the build instructions accordingly. - Use xcode12.2 image in Travis CI.
e417033d	2020-11-18T14:13:54	Merge branch 'master' into dev
6d2e8837	2020-11-18T13:25:06	jpeg_skip_scanlines(): Avoid NULL + 0 UBSan error This error occurs at the call to (cinfo->cconvert->color_convert)() in sep_upsample() whenever cinfo->upsample->need_context_rows == TRUE (i.e. whenever h2v2 or h1v2 fancy upsampling is used.) The error is innocuous, since (cinfo->cconvert->color_convert)() points to a dummy function (noop_convert()) in that case. Fixes #470
f7c54892	2020-11-18T10:11:21	Travis: Add /opt/local/bin to PATH for Mac build (oversight from previous commit) macports-ci does this, and it's necessary in order for the build script to find md5sum.
f7a10a61	2020-11-17T13:51:28	Build: "OS X"/"OSX" = "macOS"/"MACOS" There are no supported versions of "OS X" anymore. The operating system has been named "macOS" since 10.12 Sierra, which was released four years ago.
d111d9ff	2020-11-17T11:54:20	Merge branch 'master' into dev
10ba6ed3	2020-11-16T17:30:37	Travis: Install MacPorts without using macports-ci
292d78e7	2020-11-16T15:28:02	Merge branch 'master' into dev
88bf1d16	2020-11-16T14:38:15	Build: Set FLOATTEST more intelligently The "32bit" vs. "64bit" floating point test results actually have nothing to do with the FPU. That was a fallacious assumption based on the observation that, with multiple CPU types, 32-bit and 64-bit builds produce different floating point test results. It seems that this is, in fact, due to differing compiler behavior-- more specifically, whether fused multiply-add (FMA) instructions are used to combine multiple floating point operations into a single instruction ("floating point expression contraction".) GCC does this by default if the target supports FMA instructions, which PowerPC and AArch64 targets both do. Fixes #468
8f830598	2020-11-13T15:21:26	Merge branch 'master' into dev
42f7c78f	2020-11-13T15:18:35	BUILDING.md: Use min. iOS v8 in iOS Armv8 example This is necessary in order to enable thread-local storage.
33859880	2020-11-13T12:12:47	Neon: Auto-detect compiler intrinsics completeness This allows the Neon intrinsics code to be built successfully (albeit likely with reduced run-time performance) with Xcode 5.0-6.2 (iOS/AArch64) and Android NDK < r19 (AArch32). Note that Xcode 5.0-6.2 will not build the Armv8 GAS code without gas-preprocessor.pl, and no version of Xcode will build the Armv7 GAS code without gas-preprocessor.pl, so we always use the full Neon intrinsics implementation by default with macOS and iOS builds. Auto-detecting the completeness of the compiler's set of Neon intrinsics also allows us to more intelligently set the default value of NEON_INTRINSICS, based on the values of HAVE_VLD1. This is a reasonable, albeit imperfect, proxy for whether a compiler has a full and optimal set of Neon intrinsics. Specific notes: - 64-bit RGB-to-YCbCr color conversion does not use any of the intrinsics in question, regresses with GCC - 64-bit accurate integer forward DCT uses vld1_s16_x3(), regresses with GCC - 64-bit Huffman encoding uses vld1q_u8_x4(), regresses with GCC - 64-bit YCbCr-to-RGB color conversion does not use any of the intrinsics in question, regresses with GCC - 64-bit accurate integer inverse DCT uses vld1_s16_x3(), regresses with GCC - 64-bit 4x4 inverse DCT uses vld1_s16_x3(). I did not test this algorithm in isolation, so it may in fact regress with GCC, but the regression may be hidden by the speedup from the new SIMD-accelerated upsampling algorithms. - 32-bit RGB-to-YCbCr color conversion: uses vld1_u16_x2(), regresses with GCC - 32-bit accurate integer forward DCT uses vld1_s16_x3(), regression irrelevant because there was no previous implementation - 32-bit accurate integer inverse DCT uses vld1_s16_x3(), regresses with GCC - 32-bit fast integer inverse DCT does not use any of the intrinsics in question, regresses with GCC - 32-bit 4x4 inverse DCT uses vld1_s16_x3(). I did not test this algorithm in isolation, so it may in fact regress with GCC, but the regression may be hidden by the speedup from the new SIMD-accelerated upsampling algorithms. Presumably when GCC includes a full and optimal set of Neon intrinsics, the HAVE_VLD1 tests will pass, and the full Neon intrinsics implementation will be enabled automatically.
3e9e7c70	2020-11-11T17:54:06	Fix build if WITH_12BIT==1 && WITH_JPEG(7|8)==1 Fixes #466
bbd80892	2020-11-10T17:54:14	Neon: Finalize intrinsics implementation - Remove gas-preprocessor.pl. None of the compilers that can build the new intrinsics implementation require gas-preprocessor.pl (tested with Xcode and with Clang 3.9+ for Linux.) - Document that Xcode 6.3.x or later is now required for iOS builds (older versions of Xcode do not have a full set of Neon intrinsics.) - Add a change log entry. - Do not enable the ASM CMake language unless NEON_INTRINSICS is false. - Add a Clang/Arm64 test to .travis.yml in order to test the new intrinsics implementation. Closes #455
141f26ff	2018-09-18T18:28:31	Neon: Intrinsics impl. of 2x2 and 4x4 scaled IDCTs The previous AArch32 and AArch64 GAS implementations have been removed, since the intrinsics implementations provide the same or better performance.
4574f01f	2018-06-28T16:17:36	Neon: Intrinsics impl. of h2v1 & h2v2 plain upsamp There was no previous GAS implementation. NOTE: This doesn't produce much of a speedup when using -O3, because -O3 already enables Neon autovectorization, which works well for the scalar C implementation of plain upsampling. However, the Neon SIMD implementation will benefit other optimization levels.

129f0cb7

2021-08-25T12:07:58

Neon/AArch64: Don't put GAS functions in .rodata Regression introduced by 240ba417aa4b3174850d05ea0d22dbe5f80553c1 Closes #546

0a9b9721

2021-08-09T17:25:36

jmemmgr.c: Pass correct size arg to jpeg_free_*() This issue was introduced in 5557fd22173ea9ab4c02c81e1dcec9bd6927814f due to an oversight, so it has existed in libjpeg-turbo since the project's inception. However, the issue is effectively a non-issue. Although #325 proposes allowing programs to override jpeg_get_*() and jpeg_free_*() externally, there is currently no way to override those functions without modifying the libjpeg-turbo source code. libjpeg-turbo only includes the malloc()/free() memory manager from libjpeg, and the implementation of jpeg_free_*() in that memory manager ignores the size argument. libjpeg had several additional memory managers for legacy systems (MS-DOS, System 7, etc.), but those memory managers ignored the size argument to jpeg_free_*() as well. Thus, this issue would have only potentially affected custom memory managers in downstream libjpeg-turbo forks, and since no one has complained until now, apparently those are rare. Fixes #542

2849d86a

2021-08-06T13:41:15

SSE2/64-bit: Fix trans. segfault w/ malformed JPEG Attempting to losslessly transform certain malformed JPEG images can cause the nbits table index in the Huffman encoder to exceed 32768, so we need to pad the SSE2 implementation of that table to 65536 entries as we do with the C implementation. Regression introduced by 087c29e07f7533ec82fd7eb1dafc84c29e7870ec Fixes #543

84d6306f

2021-07-27T11:02:23

Fix build w/CMake 3.14+ when CMAKE_SYSTEM_NAME=iOS Closes #539

e6e952d5

2021-07-15T17:33:24

Suppress UBSan error in decode_mcu_fast() This is the same error that d147be83e9a9f904918ba7f834b0fb28e09de9b5 suppressed in decode_mcu_slow(). The image that reproduces this error in decode_mcu_fast() has been added to the libjpeg-turbo seed corpora. Closes #537

a72816ed

2021-07-16T09:37:06

Use uintptr_t, if avail, for pointer-to-int casts Although sizeof(void *) == sizeof(size_t) for all architectures that are currently supported by libjpeg-turbo, such is not guaranteed by the C standard. Specifically, CHERI-enabled architectures (e.g. CHERI-RISC-V or Arm's Morello) use capability pointers that are twice the size of size_t (128 bits for Morello and RV64), so casting to size_t strips the upper bits of the pointer (including the validity bit) and makes it non-dereferenceable, as indicated by the following compiler warning: warning: cast from provenance-free integer type to pointer type will give pointer that can not be dereferenced [-Werror,-Wcheri-capability-misuse] cvalue = values = (JCOEF *)PAD((size_t)values_unaligned, 16); Ignoring this warning results in a run-time crash. Casting pointers to uintptr_t, if it is available, avoids this problem, since uintptr_t is defined as an unsigned integer type that can hold a pointer value. Since C89 compatibility is still necessary in libjpeg-turbo, this commit introduces a new typedef for pointer-to-integer casts that uses a GNU-specific extension available in GCC 4.6+ and Clang 3.0+ and falls back to using size_t if the extension is unavailable. The only other options would require C99 or Clang-specific builtins. Closes #538
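
The pointer-alignment pattern described in this message can be sketched as follows. This is a minimal illustration, not the code from the commit: the typedef name ptr_int_t and the helper align16() are hypothetical, and the use of the predefined __UINTPTR_TYPE__ macro is an assumption about which GNU extension is meant.

```c
#include <stddef.h>

/* Hypothetical typedef name.  __UINTPTR_TYPE__ is predefined by GCC and
   Clang; with other compilers, this sketch falls back to size_t. */
#ifdef __UINTPTR_TYPE__
typedef __UINTPTR_TYPE__ ptr_int_t;
#else
typedef size_t ptr_int_t;
#endif

/* Round v up to the next multiple of p (p must be a power of 2.) */
#define PAD(v, p)  (((v) + (p) - 1) & (~((ptr_int_t)(p) - 1)))

/* Hypothetical helper: return the first 16-byte-aligned address at or after
   ptr.  The pointer is cast to ptr_int_t rather than size_t, so no pointer
   bits are lost on targets where pointers are wider than size_t. */
static void *align16(void *ptr)
{
  return (void *)PAD((ptr_int_t)ptr, 16);
}
```

Calling align16() on an arbitrary address inside a buffer yields an address whose low four bits are zero and which lies at most 15 bytes past the input.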

4d9f256b

2021-07-13T11:52:49

jpegtran: Add option to copy only ICC markers Closes #533

b201838d

2021-07-10T16:07:05

Neon: Silence -Wimplicit-fallthrough warnings Refer to https://bugs.chromium.org/p/chromium/issues/detail?id=995993 Closes #534

a1bfc058

2021-07-12T13:52:38

Neon/AArch32: Mark inline asm output as read/write 'buffer' is both passed into the inline assembly code and modified by it. See https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html, 6.47.2.3. With GCC 4, this commit does not change the generated assembly code at all. With GCC 8, this commit fixes an assembly error: /tmp/{foo}.s: Assembler messages: /tmp/{foo}.s:775: Error: registers may not be the same -- `str r9,[r9],#4' I'm not sure why that error went unnoticed, since I definitely benchmarked the previous commit with GCC 8. Anyhow, this commit changes the generated assembly code slightly but does not alter performance. With Clang 10, this commit changes the generated assembly code slightly but does not alter performance. Refer to #529

2a2970af

2021-07-09T15:35:56

Neon/AArch32: Work around Clang T32 miscompilation Referring to the C standard (http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf, J.2 Undefined behavior), the behavior of the compiler is undefined if "conversion between two pointer types produces a result that is incorrectly aligned." Thus, the behavior of this code *((uint32_t *)buffer) = BUILTIN_BSWAP32(put_buffer); in the AArch32 version of the FLUSH() macro is undefined unless 'buffer' is 32-bit-aligned. Referring to https://bugs.llvm.org/show_bug.cgi?id=50785, certain versions of Clang, when generating Thumb (T32) instructions, miscompile that code into an assembly instruction (stm) that requires the destination to be 32-bit-aligned. Since such alignment cannot be guaranteed within the Huffman encoder, this reportedly led to crashes (SIGBUS: illegal alignment) with AArch32/Thumb builds of libjpeg-turbo running on Android devices, although thus far I have been unable to reproduce those crashes with a plain Linux/Arm system. The miscompilation is visible with the Compiler Explorer: https://godbolt.org/z/rv1ccx1Pb However, it goes away when removing the return statement from the function. Thus, it seems that Clang's behavior in this regard is somewhat variable, which may explain why the crashes are only reproducible on certain platforms. The suggested workaround is to use memcpy(), but whereas Clang and recent GCC releases are smart enough to compile a 4-byte memcpy() call into a str instruction, GCC < 6 is not. Referring to https://godbolt.org/z/ae7Wje3P6, the only way to consistently produce the desired str instruction across all supported compilers is to use inline assembly. Visual C++ presumably does not miscompile the code in question, since no issues have been reported with it, but since the code relies on undefined compiler behavior, prudence dictates that e4ec23d7ae051c1c73947f889818900362fdc52d should be reverted for Visual C++, which this commit does. 
The performance impact of e4ec23d7ae051c1c73947f889818900362fdc52d for Visual C++/Arm builds is unknown (I have no ability to test such builds), but regardless, this commit reverts the Visual C++/Arm performance to that of libjpeg-turbo 2.1 beta1. Closes #529
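
For reference, the memcpy() workaround that this message mentions (and then sets aside in favor of inline assembly, because GCC < 6 does not compile it into a single str instruction) might look like the sketch below. The function names are hypothetical, and bswap32() is a portable stand-in for BUILTIN_BSWAP32().

```c
#include <stdint.h>
#include <string.h>

/* Portable stand-in for BUILTIN_BSWAP32(): reverse the byte order of v. */
static uint32_t bswap32(uint32_t v)
{
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
         ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Hypothetical helper: store the byte-swapped value at buffer, which does not
   need to be 32-bit-aligned, and return the advanced pointer.  memcpy() has
   defined behavior for any alignment, unlike *((uint32_t *)buffer) = ... */
static unsigned char *store_swapped32(unsigned char *buffer,
                                      uint32_t put_buffer)
{
  uint32_t swapped = bswap32(put_buffer);

  memcpy(buffer, &swapped, sizeof(swapped));
  return buffer + sizeof(swapped);
}
```

Because the store goes through memcpy(), the compiler cannot assume that the destination is aligned, which is the assumption that led to the stm instruction described above.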

97a1575c

2021-07-07T14:38:17

RPM: Don't include system lib dir in file list This resolves a conflict between the RPM generated by the libjpeg-turbo build system and the Red Hat 'filesystem' RPM if CMAKE_INSTALL_LIBDIR=/usr/lib[64]. This code was largely borrowed from the VirtualGL RPM spec. (I can legally do that because I hold the copyright on VirtualGL's implementation.) Fixes #532

0081c2de

2021-07-07T10:12:46

Neon/AArch32: Fix build if 'soft' float ABI used Arm compilers have three floating point ABI options: 'soft' compiles floating point operations as function calls into a software floating point library, which emulates floating point operations using integer operations. Floating point function arguments are passed using integer registers. 'softfp' also compiles floating point operations as function calls into a floating point library and passes floating point function arguments using integer registers, but the floating point library functions can use FPU instructions if the CPU supports them. 'hard' compiles floating point operations into inline FPU instructions, similarly to x86 and other architectures, and passes floating point function arguments using FPU registers. Not all AArch32 CPUs have FPUs or support Neon instructions, so on Linux and Android platforms, the AArch32 SIMD dispatcher in libjpeg-turbo only enables the Neon SIMD extensions at run time if /proc/cpuinfo indicates that the CPU supports Neon instructions or if Neon instructions are explicitly enabled (e.g. by passing -mfpu=neon to the compiler.) In order to support all AArch32 CPUs using the same code base, i.e. to support run-time FPU and Neon auto-detection, it is necessary to compile the scalar C source code using -mfloat-abi=soft. However, the 'soft' floating point ABI cannot be used when compiling Neon intrinsics, so the intrinsics implementation of the Neon SIMD extensions must be compiled using -mfloat-abi=softfp if the scalar C source code is compiled using -mfloat-abi=soft. This commit modifies the build system so that it detects whether -mfloat-abi=softfp must be explicitly added to the compiler flags when building the intrinsics implementation of the Neon SIMD extensions. This will be necessary if the build is using the 'soft' floating point ABI along with run-time auto-detection of Neon instructions. Fixes #523

9df5786f

2021-06-26T16:22:21

Fix -Wimplicit-fallthrough warnings with Clang The existing /*FALLTHROUGH*/ comments work with GCC but not Clang, so this commit adds a FALLTHROUGH macro that uses the 'fallthrough' attribute if the compiler supports it. Refer to https://bugs.chromium.org/p/chromium/issues/detail?id=995993 NOTE: All versions of GCC that support -Wimplicit-fallthrough also support the 'fallthrough' attribute, but certain other compilers (Oracle Solaris Studio, for instance) support /*FALLTHROUGH*/ but not the 'fallthrough' attribute. Thus, this commit retains the /*FALLTHROUGH*/ comments, which have existed in the libjpeg code base in some form since 1994 (libjpeg v5.) Closes #531
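
A FALLTHROUGH macro of the kind described here could be written roughly as follows. This is a sketch under the assumption that __has_attribute is used for detection; the exact definition in the commit is not shown in this log, and steps_from() is a hypothetical function that exists only to exercise the macro.

```c
/* Use the 'fallthrough' attribute if the compiler reports support for it;
   otherwise, expand to nothing and rely on the comment. */
#ifdef __has_attribute
#if __has_attribute(fallthrough)
#define FALLTHROUGH  __attribute__((fallthrough));
#else
#define FALLTHROUGH
#endif
#else
#define FALLTHROUGH
#endif

/* Hypothetical example: count the steps executed when entering at a stage. */
static int steps_from(int stage)
{
  int steps = 0;

  switch (stage) {
  case 0:
    steps++;
    FALLTHROUGH                 /*FALLTHROUGH*/
  case 1:
    steps++;
    FALLTHROUGH                 /*FALLTHROUGH*/
  case 2:
    steps++;
    break;
  default:
    break;
  }
  return steps;
}
```

Keeping the /*FALLTHROUGH*/ comment next to the macro mirrors the approach in the message: compilers that only recognize the comment still see it, and compilers that recognize the attribute get the attribute.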

1a1fb615

2021-06-18T09:46:03

ChangeLog.md: List CVE ID fixed by c76f4a08 Referring to #527, the security community did not assign this CVE ID until more than 8 months after the fix for the issue was released. By the time they assigned the ID, libjpeg-turbo already had two production releases containing the fix. This calls into question the usefulness of assigning a CVE ID to the issue, particularly given that the buffer overrun in question was fully contained in the stack, not detectable with valgrind, and confined to lossless transformation (it did not affect JPEG compression or decompression.) https://vuldb.com/?id.176175 says that "the exploitability is told to be easy" but provides no clarification, and given that the author of that page does not seem to be aware that a fix for the issue has been available since early December of 2019, it calls into question the accuracy of everything else on the page. It would really be nice if the security community approached me about these things before wasting my time, but I guess it's my lot in life to modify a change log entry from 2019 to include a CVE ID from 2020. So it goes...

5135c2e2

2021-05-28T12:51:53

Build: Use PIC for jsimd_none.o in shared libs In theory, all objects that will be included in a Un*x shared library must be built using PIC. In practice, most compilers don't require PIC to be explicitly specified for jsimd_none.o, either because the compiler automatically enables PIC in all cases (Ubuntu) or because the size of the generated object is too small. But some rare compilers do require PIC to be explicitly specified for jsimd_none.o. Fixes #520

3932190c

2021-05-17T13:05:16

Fix build w/ non-GCC-compatible Un*x/Arm compilers Regression introduced by d2c407995992be1f128704ae2479adfd7906c158 Closes #519

a219fd13

2021-05-12T10:58:59

GitHub bug-report.md: "master" branch --> "main"

c23672ce

2021-04-23T13:05:25

GitHub Actions: Don't build tags Our workflow script does not currently work with tags, and there is no point to building tags anyhow, since we do not use the CI system to spin official builds.

4f51f36e

2021-04-23T11:42:40

Bump version to 2.1.0 to prepare for final release

e0606daf

2021-04-21T14:49:06

TurboJPEG: Update JPEG buf ptrs on comp/xform err When using the in-memory destination manager, it is necessary to explicitly call the destination manager's term_destination() method if an error occurs. That method is called by jpeg_finish_compress() but not by jpeg_abort_compress(). This fixes a potential double free() that could occur if tjCompress*() or tjTransform() returned an error and the calling application tried to clean up a JPEG buffer that was dynamically re-allocated by one of those functions.

ffc1aa96

2021-04-21T11:06:22

Include TJ.FLAG_LIMITSCANS in JNI header (oversight from c81e91e8ca34f4e8b43cf48277c2becf3fe9447d) This is purely cosmetic, since the JNI wrapper doesn't actually use that flag.

55ec9b3b

2021-04-21T11:04:42

OSS-Fuzz: Code comment tweaks for compr. targets (oversight from 171b875b272f47f1ae42a5009c64f424db22a95b)

4de8f692

2021-04-16T16:34:12

jdhuff.h: Fix ASan regression caused by 8fa70367 The 0xFF is, in fact, necessary.

785ec30e

2021-04-16T15:59:38

cjpeg_fuzzer: Add cov for h2v2 smooth downsampling

d147be83

2021-04-15T23:31:51

Huff decs: Fix/suppress more innocuous UBSan errs - UBSan complained that entropy->restarts_to_go was underflowing an unsigned integer when it was decremented while cinfo->restart_interval == 0. That was, of course, completely innocuous behavior, since the result of the underflowing computation was never used. - d3a3a73f64041c6a6905faf6f9f9832e735fd880 and 7bc9fca4309563d66b0c5665a616285d0e9baeb4 silenced a UBSan signed integer overflow error, but unfortunately other malformed JPEG images have been discovered that cause unsigned integer overflow in the same computation. Since, to the best of our understanding, this behavior is innocuous, this commit reverts the commits listed above, suppresses the UBSan errors, and adds code comments to document the issue.

8fa70367

2021-04-15T22:26:53

Huff dec: Fix non-deterministic output w/bad input Referring to https://bugzilla.mozilla.org/show_bug.cgi?id=1050342, there are certain very rare circumstances under which a malformed JPEG image can cause different Huffman decoder output to be produced, depending on the size of the source manager's I/O buffer. (More specifically, the fast Huffman decoder didn't handle invalid codes in the same manner as the slow decoder, and since the fast decoder requires all data to be memory-resident, the buffering strategy determines whether or not the fast decoder can be used on a particular MCU block.) After extensive experimentation, the Mozilla and Chrome developers and I determined that this truly was an innocuous issue. The patch that both browsers adopted as a workaround caused a performance regression with 32-bit code, which is why it was not accepted into libjpeg-turbo. This commit fixes the problem in a less disruptive way with no performance regression.

171b875b

2021-04-15T19:03:53

OSS-Fuzz: Check img size b4 readers allocate mem After the completion of the start_input() method, it's too late to check the image size, because the image readers may have already tried to allocate memory for the image. If the width and height are excessively large, then attempting to allocate memory for the image could slow performance or lead to out-of-memory errors prior to the fuzz target checking the image size. NOTE: Specifically, the aforementioned OOM errors and slow units were observed with the compression fuzz targets when using MSan.

3ab32348

2021-04-13T11:51:29

OSS-Fuzz: More code coverage improvements

3e68a5ee

2021-04-12T14:37:43

jchuff.c: Fix MSan error Certain rare malformed input images can cause the Huffman encoder to generate a value for nbits that corresponds to an uninitialized member of the DC code table. The ramifications of this are minimal and would basically amount to a different bogus JPEG image being generated from a particular bogus input image.

4e451616

2021-04-12T11:53:29

compress_yuv_fuzzer: Minor code coverage tweak

629e96ee

2021-04-12T11:52:55

cjpeg.c: Code formatting tweak

ebaa67ea

2021-04-12T10:38:52

rdbmp.c: Fix more innocuous UBSan errors - Referring to 3311fc00010c6cb305d87525c9ef60ebdf036cfc, we need to use unsigned intermediate math in order to make UBSan happy, even though (JDIMENSION)(A * B) is effectively the same as (JDIMENSION)A *(JDIMENSION)B, regardless of intermediate overflow. - Because of the previous commit, it is now possible for bfOffBits to be INT_MIN, which would cause the initial computation of bPad to underflow a signed integer. Thus, we need to check for that possibility as soon as we know the values of bfOffBits and headerSize. The worst case from this regression is that bPad could wrap around to a large positive value, which would cause a "Premature end of input file" error in the subsequent read_byte() loop. Thus, this issue was effectively innocuous as well, since it resulted in catching the same error later and in a different way. Also, the issue was very well-contained, since it was both introduced and fixed as part of the ongoing OSS-Fuzz integration project.

dd830b3f

2021-04-09T17:36:41

rdbmp.c/rdppm.c: Fix more innocuous UBSan errors - rdbmp.c: Because of 8fb37b81713a0cdc14622dc08892ebd28a3233aa, bfOffBits, biClrUsed, and headerSize were made into unsigned ints. Thus, if bPad would eventually be negative due to a malformed header, UBSan complained about unsigned math being used in the intermediate computations. It was unnecessary to make those variables unsigned, since they are only meant to hold small values, so this commit makes them signed again. The UBSan error was innocuous, because it is effectively (if not officially) the case that (int)((unsigned int)a - (unsigned int)b) == (int)a - (int)b. - rdbmp.c: If (biWidth * source->bits_per_pixel / 8) would overflow an unsigned int, then UBSan complained at the point at which row_width was set in start_input_bmp(), even though the overflow would have been detected later in the function. This commit adds overflow checks prior to setting row_width. - rdppm.c: read_pbm_integer() now bounds-checks the intermediate value computations in order to catch integer overflow caused by a malformed text PPM. It's possible, though extremely unlikely, that the intermediate value computations could have wrapped around to a value smaller than maxval, but the worst case is that this would have generated a bogus pixel in the uncompressed image rather than throwing an error.
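
The idea of checking for overflow before setting row_width can be illustrated with the sketch below. It is not the code from rdbmp.c; compute_row_width() and its signature are hypothetical.

```c
#include <limits.h>

/* Hypothetical helper: compute the number of bytes in a row.  Returns 1 and
   sets *row_width on success, or returns 0 if width * bits_per_pixel would
   not fit in an unsigned int. */
static int compute_row_width(unsigned int width, unsigned int bits_per_pixel,
                             unsigned int *row_width)
{
  /* Test with division so that the check itself cannot overflow. */
  if (bits_per_pixel != 0 && width > UINT_MAX / bits_per_pixel)
    return 0;
  *row_width = width * bits_per_pixel / 8;
  return 1;
}
```

A caller would treat a return value of 0 as a malformed header and report an error before any row buffer is sized.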

4ede2ef5

2021-04-09T17:26:19

OSS-Fuzz: cjpeg fuzz target

5cda8c5e

2021-04-09T13:12:32

compress_yuv_fuzzer: Use unique filename template

47b66d1d

2021-04-09T11:26:34

OSS-Fuzz: Fix UBSan err caused by TJFLAG_FUZZING

55ab0d39

2021-04-08T16:13:06

OSS-Fuzz: YUV encoding/compression fuzz target

18bc4c61

2021-04-07T16:04:58

compress.cc: Code formatting tweak

b1079002

2021-04-07T15:51:05

rdppm.c: Fix innocuous MSan error A fuzzing test case that was effectively a 1-pixel PGM file with a maximum value of 1 and an actual value of 8 caused an uninitialized member of the rescale[] array to be accessed in get_gray_rgb_row() or get_gray_cmyk_row(). Since, for performance reasons, those functions do not perform bounds checking on the PPM values, we need to ensure that unused members of the rescale[] array are initialized.

3311fc00

2021-04-07T14:20:49

rdbmp.c: Fix innocuous UBSan error A fuzzing test case with an image width of 838860946 triggered a UBSan error: rdbmp.c:633:34: runtime error: signed integer overflow: 838860946 * 3 cannot be represented in type 'int' Because the result is cast to an unsigned int (JDIMENSION), this error is irrelevant, because (unsigned int)((int)838860946 * (int)3) == (unsigned int)838860946 * (unsigned int)3
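
The equivalence claimed at the end of this message can be shown with a short sketch. row_samples() is a hypothetical name; JDIMENSION is declared here as unsigned int, as in libjpeg. The cast is applied to each operand before multiplying, so the arithmetic is unsigned and wraps instead of overflowing a signed int.

```c
typedef unsigned int JDIMENSION;

/* Hypothetical helper: multiply using unsigned intermediate math.  Unsigned
   arithmetic wraps modulo UINT_MAX + 1, so there is no undefined behavior
   even if the mathematical product exceeds INT_MAX. */
static JDIMENSION row_samples(int width, int samples_per_pixel)
{
  return (JDIMENSION)width * (JDIMENSION)samples_per_pixel;
}
```

With the width from the fuzzing test case, the product is 2516582838, which exceeds INT_MAX for a 32-bit int but still fits in a 32-bit unsigned int.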

34d264d6

2021-04-07T12:44:50

OSS-Fuzz: Private TurboJPEG API flag for fuzzing This limits the tjLoadImage() behavioral changes to the scope of the compress_fuzzer target. Otherwise, TJBench in fuzzer builds would refuse to load images larger than 1 Mpixel.

f35fd27e

2021-04-06T12:51:03

tjLoadImage: Fix issues w/loading 16-bit PPMs/PGMs - The PPM reader now throws an error rather than segfaulting (due to a buffer overrun) if an application attempts to load a 16-bit PPM file into a grayscale uncompressed image buffer. No known applications allowed that (not even the test applications in libjpeg-turbo), because that mode of operation was never expected to work and did not work under any circumstances. (In fact, it was necessary to modify TJBench in order to reproduce the issue outside of a fuzzing environment.) This was purely a matter of making the library bow out gracefully rather than crash if an application tries to do something really stupid. - The PPM reader now throws an error rather than generating incorrect pixels if an application attempts to load a 16-bit PGM file into an RGB uncompressed image buffer. - The PPM reader now correctly loads 16-bit PPM files into extended RGB uncompressed image buffers. (Previously it generated incorrect pixels unless the input colorspace was JCS_RGB or JCS_EXT_RGB.) The only way that users could have potentially encountered these issues was through the tjLoadImage() function. cjpeg and TJBench were unaffected.

df17d398

2021-04-06T11:34:30

jcphuff.c: -Wjump-misses-init warning w/GCC 9 -m32 (verified that this commit does not change the generated 64-bit or 32-bit assembly code)

cd9a3185

2021-04-05T22:20:52

Bump TurboJPEG C API version to 2.1 (because of TJFLAG_LIMITSCANS)

d2d44655

2021-04-05T21:41:30

OSS-Fuzz: Compression fuzz target

5536ace1

2021-04-05T21:12:29

OSS-Fuzz: Fix C++11 compiler warnings in targets

5dd906be

2021-04-05T17:47:34

OSS-Fuzz: Test non-default opts w/ decompress_yuv The non-default options were not being tested because of a pixel format comparison buglet. This commit also changes the code in both decompression fuzz targets such that non-default options are tested based on the pixel format index rather than the pixel format value, which is a bit more idiot-proof.

c81e91e8

2021-04-05T16:08:22

TurboJPEG: New flag for limiting prog JPEG scans This also fixes timeouts reported by OSS-Fuzz.

bff7959e

2021-04-02T14:53:43

OSS-Fuzz: Require static libraries Refer to https://google.github.io/oss-fuzz/further-reading/fuzzer-environment/#runtime-dependencies for the reasons why this is necessary.

6ad658be

2021-04-02T14:50:35

OSS-Fuzz: Build fuzz targets using C++ compiler Otherwise, the targets will require libstdc++, the i386 version of which is not available in the OSS-Fuzz runtime environment. The OSS-Fuzz build environment passes -stdlib=libc++ in the CXXFLAGS environment variable in order to mitigate this issue, since the runtime environment has the i386 version of libc++, but using that compiler flag requires using the C++ compiler.

7b57cba6

2021-03-31T11:16:51

OSS-Fuzz: Fix uninitialized reads detected by MSan

2f9e8a11

2021-03-29T18:54:12

OSS-Fuzz integration This commit integrates OSS-Fuzz targets directly into the libjpeg-turbo source tree, thus obsoleting and improving code coverage relative to Google's OSS-Fuzz target for libjpeg-turbo (previously available here: https://github.com/google/oss-fuzz). I hope to eventually create fuzz targets for the BMP, GIF, and PPM readers as well, which would allow for fuzz-testing compression, but since those readers all require an input file, it is unclear how to build an efficient fuzzer around them. It doesn't make sense to fuzz-test compression in isolation, because compression can't accept arbitrary input data.

e4ec23d7

2021-02-10T16:45:50

Neon: Use byte-swap builtins instead of inline asm Define compiler-independent byte-swap macros and use them instead of executing 'rev' via inline assembly code with GCC-compatible compilers or a slow shift-store sequence with Visual C++. * This produces identical assembly code with: - 64-bit GCC 8.4.0 (Linux) - 64-bit GCC 9.3.0 (Linux) - 64-bit Clang 10.0.0 (Linux) - 64-bit Clang 10.0.0 (MinGW) - 64-bit Clang 12.0.0 (Xcode 12.2, macOS) - 64-bit Clang 12.0.0 (Xcode 12.2, iOS) * This produces different assembly code with: - 64-bit GCC 4.9.1 (Linux) - 32-bit GCC 4.8.2 (Linux) - 32-bit GCC 8.4.0 (Linux) - 32-bit GCC 9.3.0 (Linux) Since the intrinsics implementation of Huffman encoding is not used by default with these compilers, this is not a concern. - 32-bit Clang 10.0.0 (Linux) Verified performance neutrality Closes #507

e795afc3

2021-03-25T22:36:15

SSE2: Fix prog Huff enc err if Sl%32==0 && Al!=0 (regression introduced by 16bd984557fa2c490be0b9665e2ea0d4274528a8) This implements the same fix for jsimd_encode_mcu_AC_refine_prepare_sse2() that a81a8c137b3f1c65082aa61f236aa88af61b3ad4 implemented for jsimd_encode_mcu_AC_first_prepare_sse2(). Based on: https://github.com/MegaByte/libjpeg-turbo/commit/1a59587397150c9ef9dffc5813cb3891db4bc0c8 https://github.com/MegaByte/libjpeg-turbo/commit/eb176a91d87a470bf8c987be786668aa944dd1dd Fixes #509 Closes #510

2c01200c

2021-03-15T19:56:53

Build: Fix incorrect regexes w/ if(...MATCHES...) "arm*" as a regex means 'ar' followed by zero or more 'm' characters, which matches 'parisc' and 'sparc64' as well.
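The false matches described above can be reproduced with a small sketch. It uses POSIX regcomp()/regexec() as a stand-in for CMake's own regex engine, on the assumption that both perform an unanchored search for these simple patterns. The helper name regex_matches and the anchored pattern "^arm" are illustrative assumptions, not necessarily what the commit uses.

```c
#include <regex.h>
#include <stddef.h>

/* Returns 1 if the POSIX extended regex 'pattern' matches anywhere in
   'text', 0 if it does not, and -1 if the pattern fails to compile.
   (Hypothetical helper, for illustration only.) */
int regex_matches(const char *pattern, const char *text)
{
  regex_t re;
  int rc;

  if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec(&re, text, 0, NULL, 0);  /* unanchored search */
  regfree(&re);
  return rc == 0;
}
```

With this helper, "arm*" matches "parisc" and "sparc64" because both contain "ar", whereas an anchored pattern such as "^arm" matches "armv7l" but neither of the other two.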

ed70101d

2021-03-15T12:36:55

ChangeLog.md: List CVE ID fixed by 1719d12e Referring to https://bugzilla.redhat.com/show_bug.cgi?id=1937385#c2, it is my opinion that the severity of this bug was grossly overstated and that a CVE never should have been assigned to it, but since one was assigned, users need to know which version of libjpeg-turbo contains the fix. Dear security community, please learn what "DoS" actually means and stop misusing that term for dramatic effect. Thanks.

8a2cad02

2021-01-21T10:51:49

Build: Handle CMAKE_OSX_ARCHITECTURES=(i386|ppc) We don't officially support i386 or PowerPC Mac builds of libjpeg-turbo anymore, but they still work (bearing in mind that PowerPC builds require GCC v4.0 in Xcode 3.2.6, and i386 builds require Xcode 9.x or earlier.) Referring to #495, apparently MacPorts needs this functionality.

b6772910

2021-01-19T15:32:32

Add Sponsor button for GitHub repository

399aa374

2021-01-19T12:25:11

Build: Support CMAKE_OSX_ARCHITECTURES ... as long as it contains only a singular value, which must equal "x86_64" or "arm64". Refer to #495

1719d12e

2021-01-14T18:35:15

cjpeg: Fix FPE when compressing 0-width GIF Fixes #493

486cdcfb

2021-01-12T17:45:55

Fix build with Visual C++ and /std:c11 or /std:c17 Fixes #481 Closes #482

74e6ea45

2021-01-05T20:23:11

Neon: Fix Huffman enc. error w/Visual Studio+Clang The GNU builtin function __builtin_clzl() accepts an unsigned long argument, which is 8 bytes wide on LP64 systems (most Un*x systems, including Mac) but 4 bytes wide on LLP64 systems (Windows.) This caused the Neon intrinsics implementation of Huffman encoding to produce mathematically incorrect results when compiled using Visual Studio with Clang. This commit changes all invocations of __builtin_clzl() in the Neon SIMD extensions to __builtin_clzll(), which accepts an unsigned long long argument that is guaranteed to be 8 bytes wide on all systems. Fixes #480 Closes #490
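The effect of the narrower argument type can be sketched as follows, assuming a GCC-compatible compiler. The function names are hypothetical. The second function simulates what happens on an LLP64 system, where passing a 64-bit value through a 4-byte unsigned long discards the upper 32 bits before the leading zeros are counted.

```c
#include <stdint.h>

/* Count leading zeros of a 64-bit value (x must be nonzero).
   unsigned long long is at least 64 bits wide on every platform. */
static int clz64(uint64_t x)
{
  return __builtin_clzll((unsigned long long)x);
}

/* Simulates the LLP64 behavior: the value is truncated to 32 bits first,
   so the count no longer reflects the original 64-bit value.
   (The low 32 bits of x must be nonzero.) */
static int clz_after_truncation(uint64_t x)
{
  return __builtin_clz((unsigned int)x);
}
```

For a value such as (1ULL << 40) | 1, clz64() reports 23 leading zeros, while the truncated variant sees only the low word and reports 31.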

d2c40799

2020-12-17T16:02:47

Use CLZ compiler intrinsic for Windows/Arm builds The __builtin_clz() compiler intrinsic was already used in the C Huffman encoders when building libjpeg-turbo for Arm CPUs using a GCC-compatible compiler. This commit modifies the C Huffman encoders so that they also use __builtin_clz() when building for Arm CPUs using Visual Studio + Clang, as well as the equivalent _CountLeadingZeros() compiler intrinsic when building for Arm CPUs using Visual C++. In addition to making the C Huffman encoders faster on Windows/Arm, this also prevents jpeg_nbits_table from being included in Windows/Arm builds, thus saving 128 KB of memory.
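As a rough illustration of why a CLZ intrinsic can replace a lookup table, the number of bits needed to represent a value can be derived directly from its leading-zero count. The function name nbits and the preprocessor layout below are assumptions made for this sketch and do not reproduce the actual libjpeg-turbo code.

```c
#if defined(_MSC_VER) && !defined(__clang__) && \
    (defined(_M_ARM) || defined(_M_ARM64))
#include <intrin.h>
#endif

/* Number of bits needed to represent x (0 for x == 0).
   Hypothetical helper; assumes a 32-bit unsigned int. */
static int nbits(unsigned int x)
{
  if (x == 0)
    return 0;
#if defined(__GNUC__) || defined(__clang__)
  /* GCC-compatible compilers, including Visual Studio + Clang */
  return 32 - __builtin_clz(x);
#elif defined(_MSC_VER) && (defined(_M_ARM) || defined(_M_ARM64))
  /* Visual C++ targeting Arm */
  return 32 - (int)_CountLeadingZeros(x);
#else
  /* Portable fallback: shift until the value is exhausted. */
  {
    int n = 0;
    while (x) { n++; x >>= 1; }
    return n;
  }
#endif
}
```

Because the result is computed rather than looked up, no precomputed table needs to be linked into the binary on platforms where the intrinsic path is taken.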

3e8911aa

2021-01-11T13:56:01

Build: Use correct SIMD exts w/VStudio IDE + Arm64 When configuring a Visual Studio IDE build and passing -A arm64 to CMake, CMAKE_SYSTEM_PROCESSOR will be amd64, so we should set CPU_TYPE based on the value of CMAKE_GENERATOR_PLATFORM rather than the value of CMAKE_SYSTEM_PROCESSOR.

4b838c38

2021-01-11T13:45:25

jcphuff.c: Fix compiler warning with clang-cl Fixes #492

944f5915

2021-01-08T12:41:02

Migrate from Travis CI to GitHub Actions Note that this removes our ability to regression test the Armv8 and PowerPC SIMD extensions, effectively reverting a524b9b06be2e0c24d8abc6528cf29316cfe8dc5 and 02227e48a990911a6da35ab8034911a9fbc1055a, but at the moment, there is no other way.

3179f330

2021-01-04T14:54:35

tjexample.c: Fix mem leak if tjTransform() fails Fixes #479

1388ad67

2020-12-08T21:25:47

Build: Officially support Ninja

110d8d6d

2020-12-07T11:12:49

decompress_smooth_data(): Fix another uninit. read Regression introduced by 42825b68d570fb07fe820ac62ad91017e61e9a25 The test case https://user-images.githubusercontent.com/3491627/101376530-fde56180-38b0-11eb-938d-734119a5b5ba.jpg is a malformed progressive JPEG image containing an interleaved Y/Cb/Cr DC scan followed by two non-interleaved Y DC scans. Thus, the prev_coef_bits[] array was initialized for the Y component but not the other components, the uninitialized values for Cb and Cr were transferred to the prev_coef_bits_latch[] array in smoothing_ok(), and because cinfo->master->last_good_iMCU_row was 0, decompress_smooth_data() read those uninitialized values when attempting to smooth the second iMCU row. Possibly fixes #478

7b687649

2020-12-03T19:15:07

LICENSE.md: Remove trailing whitespace Use <br> to indicate a line break, as we do in README.md, in order to make checkstyle happy.

21d05684

2020-12-03T18:50:08

Build: Test for correct AArch32 RPM/DEBARCH value ... based on the floating point ABI being used by the compiler (which do you choose, a hard or soft option?)

6e4509a3

2020-12-01T09:04:27

LICENSE.md: Formatting tweak

c7ca521b

2020-11-28T06:38:27

Fix uninitialized read in decompress_smooth_data() Regression introduced by 42825b68d570fb07fe820ac62ad91017e61e9a25 Referring to the discussion in #459, the OSS-Fuzz test case https://github.com/libjpeg-turbo/libjpeg-turbo/files/5597075/clusterfuzz-testcase-minimized-pngsave_buffer_fuzzer-5728375846731776.txt created a situation in which cinfo->output_iMCU_row > cinfo->master->last_good_iMCU_row but cinfo->input_scan_number == 1 thus causing decompress_smooth_data() to read from prev_coef_bits_latch[], which was uninitialized. I was unable to create the same situation with a real JPEG image.

ccaba5d7

2020-11-25T14:55:55

Fix buffer overrun with certain narrow prog JPEGs Regression introduced by 6d91e950c871103a11bac2f10c63bf998796c719 last_block_column in decompress_smooth_data() can be 0 if, for instance, decompressing a 4:4:4 image of width 8 or less or a 4:2:2 or 4:2:0 image of width 16 or less. Since last_block_column is an unsigned int, subtracting 1 from it produced 0xFFFFFFFF, the test in line 590 passed, and we attempted to access blocks from a second block column that didn't actually exist. Closes #476
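The wraparound described above can be demonstrated in isolation. The two functions below are hypothetical stand-ins for the kind of bounds check involved. They do not reproduce the actual test in decompress_smooth_data(), and the guarded form is only one possible way to avoid the problem.

```c
/* Unguarded check: when last_block_column is 0, the unsigned subtraction
   wraps to UINT_MAX and the comparison passes for any realistic block_num. */
static int has_next_column_unguarded(unsigned int block_num,
                                     unsigned int last_block_column)
{
  return block_num < last_block_column - 1;
}

/* Guarded check: rule out the zero case before subtracting. */
static int has_next_column_guarded(unsigned int block_num,
                                   unsigned int last_block_column)
{
  return last_block_column > 0 && block_num < last_block_column - 1;
}
```

With last_block_column equal to 0, the unguarded version reports that a further block column exists, while the guarded version does not. For nonzero values the two agree.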

cfc7e6e5

2020-11-25T14:10:55

Bump revision to 2.0.91 for post-beta fixes

4e52b66f

2020-11-24T21:54:42

Travis: Use Docker tag that matches Git branch

8cf6f716

2020-11-24T21:32:48

Bump revision to 2.0.90 to prepare for beta

eb14189c

2020-11-17T12:48:49

Fix Neon SIMD build issues with Visual Studio - Use the _M_ARM and _M_ARM64 macros provided by Visual Studio for compile-time detection of Arm builds, since __arm__ and __aarch64__ are only present in GNU-compatible compilers. - Neon/intrinsics: Use the _CountLeadingZeros() and _CountLeadingZeros64() intrinsics provided by Visual Studio, since __builtin_clz() and __builtin_clzl() are only present in GNU-compatible compilers. - Neon/intrinsics: Since Visual Studio does not support static vector initialization, replace static initialization of Neon vectors with the appropriate intrinsics. Compared to the static initialization approach, this produces identical assembly code with both GCC and Clang. - Neon/intrinsics: Since Visual Studio does not support inline assembly code, provide alternative code paths for Visual Studio whenever inline assembly is used. - Build: Set FLOATTEST appropriately for AArch64 Visual Studio builds (Visual Studio does not emit fused multiply-add [FMA] instructions by default for such builds.) - Neon/intrinsics: Move temporary buffer allocation outside of nested loops. Since Visual Studio configures Arm builds with a relatively small amount of stack memory, attempting to allocate those buffers within the inner loops caused a stack overflow. Closes #461 Closes #475

91dd3b23

2020-11-24T19:22:38

ChangeLog: macOS Armv8/x86-64 univ. binary support

7e0d94d3

2020-11-24T20:31:51

Merge branch 'master' into dev

1c839761

2020-11-24T18:51:16

Force Git to treat testorig.ppm as a binary file Otherwise, because the file begins with an ASCII header, Git will erroneously treat it as an ASCII file, and if Git for Windows is configured with default options (specifically, "Checkout windows-style, commit Unix-style line endings"), it will add carriage return characters to all of the "linefeed" characters in the PPM file, thus corrupting it and causing libjpeg-turbo's regression tests to fail.

6d91e950

2020-10-05T13:37:44

Use 5x5 win & 9 AC coeffs when smoothing DC scans ... of progressive images. Based on: https://github.com/mo271/libjpeg-turbo/commit/be8d36d13b79a472e56da0717ba067e6139bc0e1 https://github.com/mo271/libjpeg-turbo/commit/9d528f278ee3a5ba571c0b9ec4567c557614fb25 https://github.com/mo271/libjpeg-turbo/commit/85f36f0765ea2c28909fc4c0e570cd68d3a1ed85 https://github.com/mo271/libjpeg-turbo/commit/63a4d39e387f61bcb83b393838f436b410b97308 https://github.com/mo271/libjpeg-turbo/commit/51336a6ad5acb9379dc8e3e5e5758fd439224b7c Closes #459 Closes #474

d523435e

2020-11-19T19:30:38

Travis: Use Xcode 12.2 for all iOS & macOS builds There doesn't seem to be any performance or compatibility downside to this, and it has the advantages of simplicity and consistency between the PR and official builds.

1ac83cd6

2020-11-18T18:16:12

Travis: The Mac build log is now log-macos.txt (oversight from f7a10a61e3bbab14d2e901c8823cec4961a46b2f)

0ba70b6a

2020-11-18T15:01:24

Build: Support macOS Armv8/x86-64 univ. binaries - Rename IOS_ARMV8_BUILD to ARMV8_BUILD. - Rename install_ios() to install_subbuild() in makemacpkg. - Wordsmith the build instructions accordingly. - Use xcode12.2 image in Travis CI.

e417033d

2020-11-18T14:13:54

Merge branch 'master' into dev

6d2e8837

2020-11-18T13:25:06

jpeg_skip_scanlines(): Avoid NULL + 0 UBSan error This error occurs at the call to (*cinfo->cconvert->color_convert)() in sep_upsample() whenever cinfo->upsample->need_context_rows == TRUE (i.e. whenever h2v2 or h1v2 fancy upsampling is used.) The error is innocuous, since (*cinfo->cconvert->color_convert)() points to a dummy function (noop_convert()) in that case. Fixes #470

f7c54892

2020-11-18T10:11:21

Travis: Add /opt/local/bin to PATH for Mac build (oversight from previous commit) macports-ci does this, and it's necessary in order for the build script to find md5sum.

f7a10a61

2020-11-17T13:51:28

Build: "OS X"/"OSX" = "macOS"/"MACOS" There are no supported versions of "OS X" anymore. The operating system has been named "macOS" since 10.12 Sierra, which was released four years ago.

d111d9ff

2020-11-17T11:54:20

Merge branch 'master' into dev

10ba6ed3

2020-11-16T17:30:37

Travis: Install MacPorts without using macports-ci

292d78e7

2020-11-16T15:28:02

Merge branch 'master' into dev

88bf1d16

2020-11-16T14:38:15

Build: Set FLOATTEST more intelligently The "32bit" vs. "64bit" floating point test results actually have nothing to do with the FPU. That was a fallacious assumption based on the observation that, with multiple CPU types, 32-bit and 64-bit builds produce different floating point test results. It seems that this is, in fact, due to differing compiler behavior-- more specifically, whether fused multiply-add (FMA) instructions are used to combine multiple floating point operations into a single instruction ("floating point expression contraction".) GCC does this by default if the target supports FMA instructions, which PowerPC and AArch64 targets both do. Fixes #468
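The effect of floating point expression contraction can be sketched without relying on FMA hardware. In the sketch below, the first function rounds the product before the addition (two roundings). The second emulates the single rounding of a fused multiply-add by evaluating in double precision, which is exact for the inputs suggested here but is not a general substitute for a real FMA. Both function names are hypothetical.

```c
/* a*b + c with the product rounded to float before the addition.
   The volatile store keeps the compiler from contracting the expression. */
static float mul_add_two_roundings(float a, float b, float c)
{
  volatile float product = a * b;
  return product + c;
}

/* a*b + c rounded only once at the end. The product of two floats is exact
   in double precision, so this emulates a fused multiply-add whenever the
   double-precision sum is also exact (an assumption that holds for the
   example inputs below, not in general). */
static float mul_add_one_rounding(float a, float b, float c)
{
  return (float)((double)a * (double)b + (double)c);
}
```

With a = b = 1.000244140625f (1 + 2^-12) and c = -1.00048828125f (-(1 + 2^-11)), the two-rounding version loses the 2^-24 term of the product and returns 0, whereas the single-rounding version returns 2^-24. Differences of this kind are why builds with and without contraction can produce different floating point test results.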

8f830598

2020-11-13T15:21:26

Merge branch 'master' into dev

42f7c78f

2020-11-13T15:18:35

BUILDING.md: Use min. iOS v8 in iOS Armv8 example This is necessary in order to enable thread-local storage.

33859880

2020-11-13T12:12:47

Neon: Auto-detect compiler intrinsics completeness This allows the Neon intrinsics code to be built successfully (albeit likely with reduced run-time performance) with Xcode 5.0-6.2 (iOS/AArch64) and Android NDK < r19 (AArch32). Note that Xcode 5.0-6.2 will not build the Armv8 GAS code without gas-preprocessor.pl, and no version of Xcode will build the Armv7 GAS code without gas-preprocessor.pl, so we always use the full Neon intrinsics implementation by default with macOS and iOS builds. Auto-detecting the completeness of the compiler's set of Neon intrinsics also allows us to more intelligently set the default value of NEON_INTRINSICS, based on the values of HAVE_VLD1*. This is a reasonable, albeit imperfect, proxy for whether a compiler has a full and optimal set of Neon intrinsics. Specific notes: - 64-bit RGB-to-YCbCr color conversion does not use any of the intrinsics in question, regresses with GCC - 64-bit accurate integer forward DCT uses vld1_s16_x3(), regresses with GCC - 64-bit Huffman encoding uses vld1q_u8_x4(), regresses with GCC - 64-bit YCbCr-to-RGB color conversion does not use any of the intrinsics in question, regresses with GCC - 64-bit accurate integer inverse DCT uses vld1_s16_x3(), regresses with GCC - 64-bit 4x4 inverse DCT uses vld1_s16_x3(). I did not test this algorithm in isolation, so it may in fact regress with GCC, but the regression may be hidden by the speedup from the new SIMD-accelerated upsampling algorithms. - 32-bit RGB-to-YCbCr color conversion: uses vld1_u16_x2(), regresses with GCC - 32-bit accurate integer forward DCT uses vld1_s16_x3(), regression irrelevant because there was no previous implementation - 32-bit accurate integer inverse DCT uses vld1_s16_x3(), regresses with GCC - 32-bit fast integer inverse DCT does not use any of the intrinsics in question, regresses with GCC - 32-bit 4x4 inverse DCT uses vld1_s16_x3(). I did not test this algorithm in isolation, so it may in fact regress with GCC, but the regression may be hidden by the speedup from the new SIMD-accelerated upsampling algorithms. Presumably when GCC includes a full and optimal set of Neon intrinsics, the HAVE_VLD1* tests will pass, and the full Neon intrinsics implementation will be enabled automatically.

3e9e7c70

2020-11-11T17:54:06

Fix build if WITH_12BIT==1 && WITH_JPEG(7|8)==1 Fixes #466

bbd80892

2020-11-10T17:54:14

Neon: Finalize intrinsics implementation - Remove gas-preprocessor.pl. None of the compilers that can build the new intrinsics implementation require gas-preprocessor.pl (tested with Xcode and with Clang 3.9+ for Linux.) - Document that Xcode 6.3.x or later is now required for iOS builds (older versions of Xcode do not have a full set of Neon intrinsics.) - Add a change log entry. - Do not enable the ASM CMake language unless NEON_INTRINSICS is false. - Add a Clang/Arm64 test to .travis.yml in order to test the new intrinsics implementation. Closes #455

141f26ff

2018-09-18T18:28:31

Neon: Intrinsics impl. of 2x2 and 4x4 scaled IDCTs The previous AArch32 and AArch64 GAS implementations have been removed, since the intrinsics implementations provide the same or better performance.

4574f01f

2018-06-28T16:17:36

Neon: Intrinsics impl. of h2v1 & h2v2 plain upsamp There was no previous GAS implementation. NOTE: This doesn't produce much of a speedup when using -O3, because -O3 already enables Neon autovectorization, which works well for the scalar C implementation of plain upsampling. However, the Neon SIMD implementation will benefit other optimization levels.
