kmx git

Commit	Date	Message
d7d16df6	2022-02-01T09:11:19	Fix segv w/ h2v2 merged upsamp, jpeg_crop_scanline The h2v2 (4:2:0) merged upsampler uses a spare row buffer so that it can upsample two rows at a time but return only one row to the application, if necessary. merged_2v_upsample() copies from this spare row buffer into the application-supplied output buffer, using the out_row_width field in the my_merged_upsampler struct to determine how many samples to copy. out_row_width is set in jinit_merged_upsampler(), which is called within the body of jpeg_start_decompress(). Since jpeg_crop_scanline() must be called after jpeg_start_decompress(), jpeg_crop_scanline() must modify the value of out_row_width if the h2v2 merged upsampler will be used. Otherwise, merged_2v_upsample() can overflow the output buffer if the number of bytes between the current output buffer position and the end of the buffer is less than the number of bytes required to represent an uncropped scanline of the output image. All of the destination managers used by djpeg allocate either a whole image buffer or a scanline buffer based on the uncropped output image width, so this issue is not reproducible using djpeg. Fixes #574
57ba02a4	2021-10-01T16:28:54	Build: Improve Neon capability detection - Use check_c_source_compiles() rather than check_symbol_exists() to detect the presence of vld1_s16_x3(), vld1_u16_x2(), and vld1q_u8_x4(). check_symbol_exists() is unreliable for detecting intrinsics, and in practice, it did not detect the presence of the aforementioned intrinsics in versions of GCC that support them. - Set DEFAULT_NEON_INTRINSICS=0 for GCC < 12, even if the aforementioned intrinsics are available. The AArch64 back end in GCC 10 and 11 supports the necessary intrinsics, but the GAS implementation is still faster when using those compilers. Fixes #547
73eff6ef	2021-11-30T15:06:54	cjpeg: auto. compr. gray BMP/GIF-->grayscale JPEG aa7459050d7a50e1d8a99488902d41fbc118a50f was supposed to enable this for BMP input images but didn't, due to a similar oversight to the one fixed in the previous commit.
2ce32e0f	2021-11-30T10:54:24	cjpeg: automatically compress PGM-->grayscale JPEG (regression introduced by aa7459050d7a50e1d8a99488902d41fbc118a50f) cjpeg sets cinfo.in_color_space to JCS_RGB as an "arbitrary guess." Since tjLoadImage() never uses JCS_RGB, the PGM reader should treat JCS_RGB the same as JCS_UNKNOWN. Fixes #566
ecf021bc	2021-11-18T21:04:35	cjpeg: Add -strict arg to treat warnings as fatal This adds fault tolerance to the LZW-compressed GIF reader, which is the only compression-side code that can throw warnings.
d401d625	2021-10-27T03:39:09	PowerPC: Detect AltiVec support on FreeBSD Recent FreeBSD/PowerPC compilers, such as Clang 11.0.x on FreeBSD 13, do the equivalent of passing -maltivec to the compiler by default, so run-time AltiVec detection is unnecessary. However, it becomes necessary when using other compilers or when passing -mno-altivec to the compiler. Closes #552
a9c41fbc	2021-10-03T12:43:15	Build: Don't enable Neon SIMD exts with Armv6- When building for 32-bit Arm platforms, test whether basic Neon intrinsics will compile with the specified compiler and C flags. This prevents the build system from enabling the Neon SIMD extensions when targetting Armv6 and other legacy architectures that do not support Neon instructions. Regression introduced by bbd8089297862efb6c39a22b5623f04567ff6443. (Checking whether gas-preprocessor.pl was needed for 32-bit Arm builds had the effect of checking whether Neon instructions were supported.) Fixes #553
739ecbc5	2021-09-30T16:28:51	ChangeLog.md: List CVE ID fixed by 2849d86a
173900b1	2021-09-02T12:48:50	tjTrans: Allow 8x8 crop alignmnt w/odd 4:4:4 JPEGs Fixes #549
5d2430f4	2021-09-02T13:17:32	ChangeLog.md: Add missing sub-header for 2.1.2
129f0cb7	2021-08-25T12:07:58	Neon/AArch64: Don't put GAS functions in .rodata Regression introduced by 240ba417aa4b3174850d05ea0d22dbe5f80553c1 Closes #546
2849d86a	2021-08-06T13:41:15	SSE2/64-bit: Fix trans. segfault w/ malformed JPEG Attempting to losslessly transform certain malformed JPEG images can cause the nbits table index in the Huffman encoder to exceed 32768, so we need to pad the SSE2 implementation of that table to 65536 entries as we do with the C implementation. Regression introduced by 087c29e07f7533ec82fd7eb1dafc84c29e7870ec Fixes #543
a72816ed	2021-07-16T09:37:06	Use uintptr_t, if avail, for pointer-to-int casts Although sizeof(void ) == sizeof(size_t) for all architectures that are currently supported by libjpeg-turbo, such is not guaranteed by the C standard. Specifically, CHERI-enabled architectures (e.g. CHERI-RISC-V or Arm's Morello) use capability pointers that are twice the size of size_t (128 bits for Morello and RV64), so casting to size_t strips the upper bits of the pointer (including the validity bit) and makes it non-deferenceable, as indicated by the following compiler warning: warning: cast from provenance-free integer type to pointer type will give pointer that can not be dereferenced [-Werror,-Wcheri-capability-misuse] cvalue = values = (JCOEF )PAD((size_t)values_unaligned, 16); Ignoring this warning results in a run-time crash. Casting pointers to uintptr_t, if it is available, avoids this problem, since uintptr_t is defined as an unsigned integer type that can hold a pointer value. Since C89 compatibility is still necessary in libjpeg-turbo, this commit introduces a new typedef for pointer-to-integer casts that uses a GNU-specific extension available in GCC 4.6+ and Clang 3.0+ and falls back to using size_t if the extension is unavailable. The only other options would require C99 or Clang-specific builtins. Closes #538
4d9f256b	2021-07-13T11:52:49	jpegtran: Add option to copy only ICC markers Closes #533
2a2970af	2021-07-09T15:35:56	Neon/AArch32: Work around Clang T32 miscompilation Referring to the C standard (http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf, J.2 Undefined behavior), the behavior of the compiler is undefined if "conversion between two pointer types produces a result that is incorrectly aligned." Thus, the behavior of this code ((uint32_t )buffer) = BUILTIN_BSWAP32(put_buffer); in the AArch32 version of the FLUSH() macro is undefined unless 'buffer' is 32-bit-aligned. Referring to https://bugs.llvm.org/show_bug.cgi?id=50785, certain versions of Clang, when generating Thumb (T32) instructions, miscompile that code into an assembly instruction (stm) that requires the destination to be 32-bit-aligned. Since such alignment cannot be guaranteed within the Huffman encoder, this reportedly led to crashes (SIGBUS: illegal alignment) with AArch32/Thumb builds of libjpeg-turbo running on Android devices, although thus far I have been unable to reproduce those crashes with a plain Linux/Arm system. The miscompilation is visible with the Compiler Explorer: https://godbolt.org/z/rv1ccx1Pb However, it goes away when removing the return statement from the function. Thus, it seems that Clang's behavior in this regard is somewhat variable, which may explain why the crashes are only reproducible on certain platforms. The suggested workaround is to use memcpy(), but whereas Clang and recent GCC releases are smart enough to compile a 4-byte memcpy() call into a str instruction, GCC < 6 is not. Referring to https://godbolt.org/z/ae7Wje3P6, the only way to consistently produce the desired str instruction across all supported compilers is to use inline assembly. Visual C++ presumably does not miscompile the code in question, since no issues have been reported with it, but since the code relies on undefined compiler behavior, prudence dictates that e4ec23d7ae051c1c73947f889818900362fdc52d should be reverted for Visual C++, which this commit does. The performance impact of e4ec23d7ae051c1c73947f889818900362fdc52d for Visual C++/Arm builds is unknown (I have no ability to test such builds), but regardless, this commit reverts the Visual C++/Arm performance to that of libjpeg-turbo 2.1 beta1. Closes #529
0081c2de	2021-07-07T10:12:46	Neon/AArch32: Fix build if 'soft' float ABI used Arm compilers have three floating point ABI options: 'soft' compiles floating point operations as function calls into a software floating point library, which emulates floating point operations using integer operations. Floating point function arguments are passed using integer registers. 'softfp' also compiles floating point operations as function calls into a floating point library and passes floating point function arguments using integer registers, but the floating point library functions can use FPU instructions if the CPU supports them. 'hard' compiles floating point operations into inline FPU instructions, similarly to x86 and other architectures, and passes floating point function arguments using FPU registers. Not all AArch32 CPUs have FPUs or support Neon instructions, so on Linux and Android platforms, the AArch32 SIMD dispatcher in libjpeg-turbo only enables the Neon SIMD extensions at run time if /proc/cpuinfo indicates that the CPU supports Neon instructions or if Neon instructions are explicitly enabled (e.g. by passing -mfpu=neon to the compiler.) In order to support all AArch32 CPUs using the same code base, i.e. to support run-time FPU and Neon auto-detection, it is necessary to compile the scalar C source code using -mfloat-abi=soft. However, the 'soft' floating point ABI cannot be used when compiling Neon intrinsics, so the intrinsics implementation of the Neon SIMD extensions must be compiled using -mfloat-abi=softfp if the scalar C source code is compiled using -mfloat-abi=soft. This commit modifies the build system so that it detects whether -mfloat-abi=softfp must be explicitly added to the compiler flags when building the intrinsics implementation of the Neon SIMD extensions. This will be necessary if the build is using the 'soft' floating point ABI along with run-time auto-detection of Neon instructions. Fixes #523
1a1fb615	2021-06-18T09:46:03	ChangeLog.md: List CVE ID fixed by c76f4a08 Referring to #527, the security community did not assign this CVE ID until more than 8 months after the fix for the issue was released. By the time they assigned the ID, libjpeg-turbo already had two production releases containing the fix. This calls into question the usefulness of assigning a CVE ID to the issue, particularly given that the buffer overrun in question was fully contained in the stack, not detectable with valgrind, and confined to lossless transformation (it did not affect JPEG compression or decompression.) https://vuldb.com/?id.176175 says that "the exploitability is told to be easy" but provides no clarification, and given that the author of that page does not seem to be aware that a fix for the issue has been available since early December of 2019, it calls into question the accuracy of everything else on the page. It would really be nice if the security community approached me about these things before wasting my time, but I guess it's my lot in life to modify a change log entry from 2019 to include a CVE ID from 2020. So it goes...
3932190c	2021-05-17T13:05:16	Fix build w/ non-GCC-compatible Un*x/Arm compilers Regression introduced by d2c407995992be1f128704ae2479adfd7906c158 Closes #519
4f51f36e	2021-04-23T11:42:40	Bump version to 2.1.0 to prepare for final release
e0606daf	2021-04-21T14:49:06	TurboJPEG: Update JPEG buf ptrs on comp/xform err When using the in-memory destination manager, it is necessary to explicitly call the destination manager's term_destination() method if an error occurs. That method is called by jpeg_finish_compress() but not by jpeg_abort_compress(). This fixes a potential double free() that could occur if tjCompress*() or tjTransform() returned an error and the calling application tried to clean up a JPEG buffer that was dynamically re-allocated by one of those functions.
f35fd27e	2021-04-06T12:51:03	tjLoadImage: Fix issues w/loading 16-bit PPMs/PGMs - The PPM reader now throws an error rather than segfaulting (due to a buffer overrun) if an application attempts to load a 16-bit PPM file into a grayscale uncompressed image buffer. No known applications allowed that (not even the test applications in libjpeg-turbo), because that mode of operation was never expected to work and did not work under any circumstances. (In fact, it was necessary to modify TJBench in order to reproduce the issue outside of a fuzzing environment.) This was purely a matter of making the library bow out gracefully rather than crash if an application tries to do something really stupid. - The PPM reader now throws an error rather than generating incorrect pixels if an application attempts to load a 16-bit PGM file into an RGB uncompressed image buffer. - The PPM reader now correctly loads 16-bit PPM files into extended RGB uncompressed image buffers. (Previously it generated incorrect pixels unless the input colorspace was JCS_RGB or JCS_EXT_RGB.) The only way that users could have potentially encountered these issues was through the tjLoadImage() function. cjpeg and TJBench were unaffected.
c81e91e8	2021-04-05T16:08:22	TurboJPEG: New flag for limiting prog JPEG scans This also fixes timeouts reported by OSS-Fuzz.
e795afc3	2021-03-25T22:36:15	SSE2: Fix prog Huff enc err if Sl%32==0 && Al!=0 (regression introduced by 16bd984557fa2c490be0b9665e2ea0d4274528a8) This implements the same fix for jsimd_encode_mcu_AC_refine_prepare_sse2() that a81a8c137b3f1c65082aa61f236aa88af61b3ad4 implemented for jsimd_encode_mcu_AC_first_prepare_sse2(). Based on: https://github.com/MegaByte/libjpeg-turbo/commit/1a59587397150c9ef9dffc5813cb3891db4bc0c8 https://github.com/MegaByte/libjpeg-turbo/commit/eb176a91d87a470bf8c987be786668aa944dd1dd Fixes #509 Closes #510
ed70101d	2021-03-15T12:36:55	ChangeLog.md: List CVE ID fixed by 1719d12e Referring to https://bugzilla.redhat.com/show_bug.cgi?id=1937385#c2, it is my opinion that the severity of this bug was grossly overstated and that a CVE never should have been assigned to it, but since one was assigned, users need to know which version of libjpeg-turbo contains the fix. Dear security community, please learn what "DoS" actually means and stop misusing that term for dramatic effect. Thanks.
1719d12e	2021-01-14T18:35:15	cjpeg: Fix FPE when compressing 0-width GIF Fixes #493
74e6ea45	2021-01-05T20:23:11	Neon: Fix Huffman enc. error w/Visual Studio+Clang The GNU builtin function __builtin_clzl() accepts an unsigned long argument, which is 8 bytes wide on LP64 systems (most Un*x systems, including Mac) but 4 bytes wide on LLP64 systems (Windows.) This caused the Neon intrinsics implementation of Huffman encoding to produce mathematically incorrect results when compiled using Visual Studio with Clang. This commit changes all invocations of __builtin_clzl() in the Neon SIMD extensions to __builtin_clzll(), which accepts an unsigned long long argument that is guaranteed to be 8 bytes wide on all systems. Fixes #480 Closes #490
c7ca521b	2020-11-28T06:38:27	Fix uninitialized read in decompress_smooth_data() Regression introduced by 42825b68d570fb07fe820ac62ad91017e61e9a25 Referring to the discussion in #459, the OSS-Fuzz test case https://github.com/libjpeg-turbo/libjpeg-turbo/files/5597075/clusterfuzz-testcase-minimized-pngsave_buffer_fuzzer-5728375846731776.txt created a situation in which cinfo->output_iMCU_row > cinfo->master->last_good_iMCU_row but cinfo->input_scan_number == 1 thus causing decompress_smooth_data() to read from prev_coef_bits_latch[], which was uninitialized. I was unable to create the same situation with a real JPEG image.
ccaba5d7	2020-11-25T14:55:55	Fix buffer overrun with certain narrow prog JPEGs Regression introduced by 6d91e950c871103a11bac2f10c63bf998796c719 last_block_column in decompress_smooth_data() can be 0 if, for instance, decompressing a 4:4:4 image of width 8 or less or a 4:2:2 or 4:2:0 image of width 16 or less. Since last_block_column is an unsigned int, subtracting 1 from it produced 0xFFFFFFFF, the test in line 590 passed, and we attempted to access blocks from a second block column that didn't actually exist. Closes #476
8cf6f716	2020-11-24T21:32:48	Bump revision to 2.0.90 to prepare for beta
eb14189c	2020-11-17T12:48:49	Fix Neon SIMD build issues with Visual Studio - Use the _M_ARM and _M_ARM64 macros provided by Visual Studio for compile-time detection of Arm builds, since __arm__ and __aarch64__ are only present in GNU-compatible compilers. - Neon/intrinsics: Use the _CountLeadingZeros() and _CountLeadingZeros64() intrinsics provided by Visual Studio, since __builtin_clz() and __builtin_clzl() are only present in GNU-compatible compilers. - Neon/intrinsics: Since Visual Studio does not support static vector initialization, replace static initialization of Neon vectors with the appropriate intrinsics. Compared to the static initialization approach, this produces identical assembly code with both GCC and Clang. - Neon/intrinsics: Since Visual Studio does not support inline assembly code, provide alternative code paths for Visual Studio whenever inline assembly is used. - Build: Set FLOATTEST appropriately for AArch64 Visual Studio builds (Visual Studio does not emit fused multiply-add [FMA] instructions by default for such builds.) - Neon/intrinsics: Move temporary buffer allocation outside of nested loops. Since Visual Studio configures Arm builds with a relatively small amount of stack memory, attempting to allocate those buffers within the inner loops caused a stack overflow. Closes #461 Closes #475
91dd3b23	2020-11-24T19:22:38	ChangeLog: macOS Armv8/x86-64 univ. binary support
6d91e950	2020-10-05T13:37:44	Use 5x5 win & 9 AC coeffs when smoothing DC scans ... of progressive images. Based on: https://github.com/mo271/libjpeg-turbo/commit/be8d36d13b79a472e56da0717ba067e6139bc0e1 https://github.com/mo271/libjpeg-turbo/commit/9d528f278ee3a5ba571c0b9ec4567c557614fb25 https://github.com/mo271/libjpeg-turbo/commit/85f36f0765ea2c28909fc4c0e570cd68d3a1ed85 https://github.com/mo271/libjpeg-turbo/commit/63a4d39e387f61bcb83b393838f436b410b97308 https://github.com/mo271/libjpeg-turbo/commit/51336a6ad5acb9379dc8e3e5e5758fd439224b7c Closes #459 Closes #474
8f830598	2020-11-13T15:21:26	Merge branch 'master' into dev
3e9e7c70	2020-11-11T17:54:06	Fix build if WITH_12BIT==1 && WITH_JPEG(7\|8)==1 Fixes #466
bbd80892	2020-11-10T17:54:14	Neon: Finalize intrinsics implementation - Remove gas-preprocessor.pl. None of the compilers that can build the new intrinsics implementation require gas-preprocessor.pl (tested with Xcode and with Clang 3.9+ for Linux.) - Document that Xcode 6.3.x or later is now required for iOS builds (older versions of Xcode do not have a full set of Neon intrinsics.) - Add a change log entry. - Do not enable the ASM CMake language unless NEON_INTRINSICS is false. - Add a Clang/Arm64 test to .travis.yml in order to test the new intrinsics implementation. Closes #455
240ba417	2020-01-07T16:40:32	Neon: Intrinsics impl. of prog. Huffman encoding The previous AArch64 GAS implementation has been removed, since the intrinsics implementation provides the same or better performance. There was no previous AArch32 GAS implementation.
7c1a1789	2020-11-05T16:04:55	Merge branch 'master' into dev
6e632af9	2020-11-04T10:13:06	Demote "fast" [I]DCT algorithms to legacy status - Refer to the "slow" [I]DCT algorithms as "accurate" instead, since they are not slow under libjpeg-turbo. - Adjust documentation claims to reflect the fact that the "slow" and "fast" algorithms produce about the same performance on AVX2-equipped CPUs (because of the dual-lane nature of AVX2, it was not possible to accelerate the "fast" algorithm beyond what was achievable with SSE2.) Also adjust the claims to reflect the fact that the "fast" algorithm tends to be ~5-15% faster than the "slow" algorithm on non-AVX2-equipped CPUs, regardless of the use of the libjpeg-turbo SIMD extensions. - Indicate the legacy status of the "fast" and float algorithms in the documentation and cjpeg/djpeg usage info. - Remove obsolete paragraph in the djpeg man page that suggested that the float algorithm could be faster than the "fast" algorithm on some CPUs.
88ae6098	2020-10-27T13:28:56	Merge branch 'ijg' into dev - Restore GIF read/compressed GIF write support from jpeg-6a and jpeg-9d. - Integrate jpegtran -wipe and -drop options from jpeg-9a and jpeg-9d. - Integrate jpegtran -crop extension (for expanding the image size) from jpeg-9a and jpeg-9d. - Integrate other minor code tweaks from jpeg-9*
59352195	2020-10-19T21:17:46	Merge branch 'master' into dev
1ed312ea	2020-10-15T17:47:31	"ARM"="Arm", "NEON"="Neon" Refer to: https://www.arm.com/company/policies/trademarks/arm-trademark-list/arm-trademark https://www.arm.com/company/policies/trademarks/arm-trademark-list/neon-trademark NOTE: These changes are only applied to change log entries for 2.0.x and later, since the change log is a historical record and Arm's new trademark policy did not go into effect until late 2017.
b8200c66	2019-03-08T11:57:54	Build: Add CMake package config files Based on: https://github.com/hjmallon/libjpeg-turbo/commit/d34b89b41134bd2b581e222514ee493594193d87 Closes #339 Closes #342
460dfe40	2020-10-14T14:46:30	ChangeLog: Acknowledge 2.0.6 release
ae08115d	2020-10-15T10:25:46	Merge branch 'master' into dev
190382b7	2020-10-14T15:14:26	ChangeLog: Fix minor formatting issue
8789a5e2	2020-10-01T21:27:47	Merge branch 'master' into dev
89c88c25	2020-10-01T21:24:27	ChangeLog.md: jpeg_crop_scanline(), not scanlines
2ec4a5eb	2020-10-01T19:18:44	Fix dec artifacts w/cropped+smoothed prog DC scans This commit modifies decompress_smooth_data(), adding missing MCU column offsets to the prev_block_row and next_block_row indices that are used for block rows other than the first and last. Effectively, this eliminates unexpected visual artifacts when using jpeg_crop_scanline() along with interblock smoothing while decompressing the DC scan of a progressive JPEG image. Based on: https://github.com/mo271/libjpeg-turbo/commit/0227d4fb484e6baf1565163211ee64e52e7b96bd Fixes #456 Closes #457
6ab61fa1	2020-09-13T17:02:09	Merge branch 'master' into dev
6ee5d5f5	2020-07-28T18:06:20	ARMv8 NEON: Support Windows builds w/AArch64 MinGW Based on: https://github.com/mstorsjo/libjpeg-turbo/commit/c5ef6659285a7d5bc74c679aa87ad187186cf7e1 Closes #438
fe79f56b	2020-07-28T15:09:00	Merge branch 'master' into dev
c1037f43	2020-07-28T14:57:47	Fix bad return val when skipping past end of image Fixes #439
a46c111d	2020-07-27T14:21:23	Further jpeg_skip_scanlines() fixes - Introduce a partial image decompression regression test script that validates the correctness of jpeg_skip_scanlines() and jpeg_crop_scanlines() for a variety of cropping regions and libjpeg settings. This regression test catches the following issues: #182, fixed in 5bc43c7821df982f65aa1c738f67fbf7cba8bd69 #237, fixed in 6e95c08649794f5018608f37250026a45ead2db8 #244, fixed in 398c1e9acc9b4531edceb3d77da0de5744675052 #441, fully fixed in this commit It does not catch the following issues: #194, fixed in 773040f9d949d5f313caf7507abaf4bd5d7ffa12 #244 (additional segfault), fixed in 9120a247436e84c0b4eea828cb11e8f665fcde30 - Modify the libjpeg-turbo regression test suite (make test) so that it checks for the issue reported in #441 (segfault in jpeg_skip_scanlines() when used with 4:2:0 merged upsampling/color conversion.) - Fix issues in jpeg_skip_scanlines() that caused incorrect output with h2v2 (4:2:0) merged upsampling/color conversion. The previous commit fixed the segfault reported in #441, but that was a symptom of a larger problem. Because merged 4:2:0 upsampling uses a "spare row" buffer, it is necessary to allow the upsampler to run when skipping rows (fancy 4:2:0 upsampling, which uses context rows, also requires this.) Otherwise, if skipping starts at an odd-numbered row, the output image will be incorrect. - Throw an error if jpeg_skip_scanlines() is called with two-pass color quantization enabled. With two-pass color quantization, the first pass occurs within jpeg_start_decompress(), so subsequent calls to jpeg_skip_scanlines() interfere with the multipass state and prevent the second pass from occurring during subsequent calls to jpeg_read_scanlines().
9120a247	2020-07-23T21:24:38	Fix jpeg_skip_scanlines() segfault w/merged upsamp The additional segfault mentioned in #244 was due to the fact that the merged upsamplers use a different private structure than the non-merged upsamplers. jpeg_skip_scanlines() was assuming the latter, so when merged upsampling was enabled, jpeg_skip_scanlines() clobbered one of the IDCT method pointers in the merged upsampler's private structure. For reasons unknown, the test image in #441 did not encounter this segfault (too small?), but it encountered an issue similar to the one fixed in 5bc43c7821df982f65aa1c738f67fbf7cba8bd69, whereby it was necessary to set up a dummy postprocessing function in read_and_discard_scanlines() when merged upsampling was enabled. Failing to do so caused either a segfault in merged_2v_upsample() (due to a NULL pointer being passed to jcopy_sample_rows()) or an error ("Corrupt JPEG data: premature end of data segment"), depending on the number of scanlines skipped and whether the first scanline skipped was an odd- or even-numbered row. Fixes #441 Fixes #244 (for real this time)
c965dc7a	2020-07-22T13:59:27	ChangeLog.md: Add missing sub-header for 2.0.6
b9142b21	2020-07-22T13:24:51	Android: Fix "using JNI after critical get" errors (again.) Fixes #300
4c5a15c3	2020-06-25T19:08:19	Eliminate 32-bit Mac build/packaging support The scales have now tilted overwhelmingly in favor of eliminating support for 32-bit Macs: - 32-bit applications are only necessary in order to support OS X 10.5 "Leopard" and OS X 10.6 "Snow Leopard". OS X 10.7 "Lion" requires a 64-bit Mac and supports all 64-bit Macs. - 32-bit applications are no longer allowed in the macOS App Store. - 32-bit applications no longer run in macOS 10.15 "Catalina". - 32-bit applications do not support thread-local storage, so the TurboJPEG API library's global error handler is not thread-safe with such applications. - libjpeg-turbo 2.1.x no longer supports 32-bit iOS apps, so it makes sense to also eliminate support for 32-bit macOS applications. It's time.
aecee256	2020-06-19T00:03:51	Merge branch 'master' into dev
ae87a958	2020-06-16T13:52:39	TurboJPEG: Make global error handling thread-safe ... on platforms that support thread-local storage. This currently includes all supported platforms except 32-bit macOS. Fixes #396
b443c541	2020-06-03T16:08:08	ChangeLog.md: Add missing sub-header for 2.0.5
cf483eee	2020-06-03T16:04:06	ChangeLog.md: List CVE ID fixed by previous commit
70040cb7	2020-06-02T15:05:43	Merge branch 'master' into dev
3de15e0c	2020-06-02T14:15:37	rdppm.c: Fix buf overrun caused by bad binary PPM This extends the fix in 1e81b0c3ea26f4ea8f56de05367469333de64a9f to include binary PPM files with maximum values < 255, thus preventing a malformed binary PPM input file with those specifications from triggering an overrun of the rescale array and potentially crashing cjpeg, TJBench, or any program that uses the tjLoadImage() function. Fixes #433
b0f92a1d	2020-03-04T11:06:45	Merge branch 'master' into dev
8cc1277b	2020-02-24T13:29:50	TJCompressor.compress(int): Fix YUV-to-JPEG error Due to an oversight, the TJCompressor.compress(int) method did not handle YUV source images. Fixes #413
77ff3bd6	2020-02-18T12:56:01	Merge branch 'master' into dev
ecf5f9a9	2020-02-18T10:43:23	Bump version to 2.0.5; Document previous commit
c4675d62	2019-12-31T00:42:53	Merge branch 'master' into dev
b542e4c8	2019-12-20T13:18:23	ARMv8 SIMD: Support execute-only memory (XOM) Move constants out of the .text section in simd/arm64/jsimd_neon.S and into a .rodata section. This ensures that the ARMv8 NEON SIMD extensions are compatible with memory layouts that are marked execute-only (and thus unreadable.) Based on: https://github.com/ivanloz/libjpeg-turbo/commit/88f3ca7664fadfb5e106efecb7845753aaf330b7 Closes #318
e98b0612	2019-12-18T15:12:33	Add fault tolerance features to djpeg and jpegtran - Enable progress reporting at run time using a new -report argument (cjpeg now supports that argument as well) - Limit the allowable number of scans using a new -maxscans argument - Treat warnings as fatal using a new -strict argument This mainly demonstrates how to work around the two issues with the JPEG standard described here: https://libjpeg-turbo.org/pmwiki/uploads/About/TwoIssueswiththeJPEGStandard.pdf since those and similar issues continue to be erroneously reported as libjpeg-turbo bugs.
54288598	2019-12-17T15:37:50	ChangeLog.md: Document 81b8c0ee
e821464f	2018-04-03T12:47:54	ARM64 NEON SIMD impl. of prog. Huffman encoding This commit adds ARM64 NEON optimizations for the encode_mcu_AC_first() and encode_mcu_AC_refine() functions used in progressive Huffman encoding. Compression speedups for the typical set of five libjpeg-turbo test images (https://libjpeg-turbo.org/About/Performance): Cortex-A53: 23.8-39.2% (avg. 32.2%) Cortex-A72: 26.8-41.1% (avg. 33.5%) Apple A7: 29.7-45.9% (avg. 39.6%) Closes #229
f64c5508	2019-12-05T15:39:28	Merge branch 'master' into dev
c76f4a08	2019-12-05T13:12:28	Huffman enc.: Fix very rare local buffer overrun ... detected by ASan. This is a similar issue to the issue that was fixed with 402a715f82313384ef4606660c32d8678c79f197. Apparently it is possible to create a malformed JPEG image that exceeds the Huffman encoder's 256-byte local buffer when attempting to losslessly tranform the image. That makes sense, given that it was necessary to extend the Huffman decoder's local buffer to 512 bytes in order to handle all pathological cases (refer to 0463f7c9aad060fcd56e98d025ce16185279e2bc.) Since this issue affected only lossless transformation, a workflow that isn't generally exposed to arbitrary data exploits, and since the overrun did not overflow the stack (i.e. it did not result in a segfault or other user-visible issue, and valgrind didn't even detect it), it did not likely pose a security risk. Fixes #392
26b4d62a	2019-11-15T13:57:56	Merge branch 'master' into dev
c0b16e3d	2019-11-15T13:29:11	TurboJPEG: Fix erroneous subsampling detection ... that caused some JPEG images with unusual sampling factors to be misidentified as 4:4:4. This led to a buffer overflow when attempting to decompress some such images using tjDecompressToYUV*(). Regression introduced by 479501b07c0afd8912a0e0f5b4740cfce9a89a4c The correct behavior is for the TurboJPEG API to refuse to decompress such images, which it did prior to the aforementioned commit. Fixes #389
6cedf37c	2019-11-15T12:36:11	ChangeLog.md: List CVE IDs for specific fixes
c43a5081	2019-11-12T17:07:46	Merge branch 'master' into dev
bd20344b	2019-11-12T15:45:20	tjDecompressToYUV*(): Fix OOB write/double free ... when attempting to decompress grayscale JPEG images with sampling factors != 1. Fixes #387
c30b1e72	2019-11-12T12:27:22	64-bit tjbench: Fix signed int overflow/segfault ... that occurred when attempting to decompress images with more than 715827882 (204810241024 / 3) pixels. Fixes #388
713c451f	2019-11-08T14:53:55	Enable SSE2 progressive Huffman encoder for x32 Referring to #289, I'm not sure where I arrived at the conclusion that the SSE2 progressive Huffman encoder doesn't provide any speedup for x32. Upon re-testing, I discovered it to be about 50% faster than the C encoder. This commit also re-purposes one of the CI tests (specifically, the jpeg-7 API/ABI test) so that it tests x32 as well.
2234deee	2019-11-08T13:51:34	Fix MSan use-of-uninitialized-value error ... introduced by 42825b68d570fb07fe820ac62ad91017e61e9a25. In fact, fault-tolerant multi-scan block smoothing cannot currently be used with the arithmetic decoder, because that decoder doesn't have any way of distinguishing a normal end of scan from an unexpected end of scan. Thus, this commit also modifies the change log to reset the expectations regarding the scope of the fault-tolerant multi-scan block smoothing feature. If, at some point in the future, the arithmetic decoder can be modified to detect an unexpected end of scan, then one would need only set entropy->pub.insufficient_data = TRUE when the arithmetic decoder encounters an unexpected end of scan in order to make fault-tolerant block smoothing work properly with that decoder.
82695cfd	2019-11-07T15:15:36	ChangeLog: Acknowledge 2.0.3/de-acknowledge 2.0.4 It is very unlikely that there will be an official 2.0.4 release.
42825b68	2019-11-07T14:03:23	Fault-tolerant multi-scan block smoothing This commit modifies the behavior of the block smoothing algorithm in the libjpeg API library so that, if a scan in a multi-scan JPEG image is incomplete (due to premature termination of the image stream), the block smoothing parameters from the previous (complete) scan are used to smooth any iMCU rows that the incomplete scan does not contain. Closes #343
087c29e0	2018-10-22T10:05:18	Optimize Huffman encoding This commit improves the C and SSE2 Huffman encoding implementations in the following ways: - Avoid using xmm8-xmm15 in the x86-64 SSE2 implementation. There is no actual need to use those registers, and avoiding them produces a cleaner WIN64 function entry/exit-- as well as shorter code, since REX prefixes can be avoided (this is helpful on certain CPUs, such as Intel Atom, for which instruction fetch and decoding can be a bottleneck.) - Optimize register usage so that fewer REX prefixes and register-register moves are needed. - Use the bit counter to store the number of free bits in the bit buffer rather than the number of bits in the bit buffer. This changes the method for inserting a code into the bit buffer to: (put_buffer \|= code << (free_bits -= code_size)); As a result: * Only one bit counter needs to stay in a register (we just keep it in cl.) * The bit buffer contents are already properly aligned to be written out (after a byte swap.) * Adjusting the free bits counter and checking if the bit buffer is full can be combined into a single operation. * We can wait to flush the bit buffer until the buffer is actually full and not just in danger of becoming full. Thus, eight bytes can be flushed at a time. - Speed is quite sensitive to the alignment of branch target labels, so insert some padding and remove branches from the flush code. (Flushing this way isn't actually faster when compared to using branches, but the branchless code doesn't need extra alignment and is thus smaller.) - Speculatively write out the bit buffer as a single 8-byte write, falling back to a byte-by-byte write only if there are any 0xFF bytes in the bit buffer that need to be encoded as 0xFF 0x00. - Use MMX registers for the 32-bit implementation (so the bit buffer can be 64 bits wide.) - Slightly reduce overall function code size. - Eliminate or combine a few SSE instructions. - Make some minor improvements to instruction scheduling. - Adjust flush_bits() in jchuff.c to handle cases in which the bit buffer has less than 7 free bits (apparently that couldn't happen before.) Based on: https://github.com/1camper/libjpeg-turbo/commit/947a09defa2ec848322b1bae050d1b57b316a32a https://github.com/1camper/libjpeg-turbo/commit/262ebb6b816fd8a49ff4d7185f6c5153dddde02f https://github.com/1camper/libjpeg-turbo/commit/6e9a091221bb244c8ba232a942650e94254ffcf0 See change log for performance claims. Closes #292
95f4d6ef	2019-10-24T02:13:23	Merge branch 'master' into dev
708f013f	2019-10-22T20:08:57	Win packaging: Fix 64-bit VC/GCC co-install issue
2ae2bd66	2019-10-04T16:05:11	Remove support for Apple's Java implementation (AKA "Java for OS X systems.") This implementation of Java 1.6 is long obsolete and not supported on any version of macOS past High Sierra. Oracle no longer provides a 32-bit JVM on macOS, so it is no longer necessary to provide a 32-bit version of the TurboJPEG Java wrapper on macOS.
b4110b65	2019-09-04T18:58:12	Merge branch 'master' into dev
ded5a504	2019-08-15T13:24:25	tjDecodeYUV*: Fix err if TJ inst used for prog dec If the TurboJPEG instance passed to tjDecodeYUV[Planes]() was previously used to decompress a progressive JPEG image, then we need to disable the progressive decompression parameters in the underlying libjpeg instance before calling jinit_master_decompress(). This commit also modifies the build system so that the "tjtest" target will test for this issue, and it corrects a previous oversight in the build system whereby tjbenchtest did not test progressive compression/decompression unless WITH_JAVA was true.
8ef53b10	2019-08-14T22:08:59	Merge branch 'master' into dev
c0d0fe86	2019-08-14T22:08:44	ChangeLog.md: Wordsmithing
a81a8c13	2019-08-14T13:17:11	SSE2 SIMD: Fix prog Huffman enc. error if Sl%16==0 (regression introduced by 5b177b3cab5cfb661256c1e74df160158ec6c34e) The SSE2 implementation of progressive Huffman encoding performed extraneous iterations when the scan length was a multiple of 16. Based on: https://github.com/rouault/libjpeg-turbo/commit/bb7f1ef98305da915e581b59bd0ec2ef7bdb8468 Fixes #335 Closes #367
7fbfe29c	2019-07-18T15:18:27	Merge branch 'master' into dev
2a9e3bd7	2019-07-11T15:30:04	TurboJPEG: Properly handle gigapixel images Prevent several integer overflow issues and subsequent segfaults that occurred when attempting to compress or decompress gigapixel images with the TurboJPEG API: - Modify tjBufSize(), tjBufSizeYUV2(), and tjPlaneSizeYUV() to avoid integer overflow when computing the return values and to return an error if such an overflow is unavoidable. - Modify tjunittest to validate the above. - Modify tjCompress2(), tjEncodeYUVPlanes(), tjDecompress2(), and tjDecodeYUVPlanes() to avoid integer overflow when computing the row pointers in the 64-bit TurboJPEG C API. - Modify TJBench (both C and Java versions) to avoid overflowing the size argument to malloc()/new and to fail gracefully if such an overflow is unavoidable. In general, this allows gigapixel images to be accommodated by the 64-bit TurboJPEG C API when using automatic JPEG buffer (re)allocation. Such images cannot currently be accommodated without automatic JPEG buffer (re)allocation, due to the fact that tjAlloc() accepts a 32-bit integer argument (oops.) Such images cannot be accommodated in the TurboJPEG Java API due to the fact that Java always uses a signed 32-bit integer as an array index. Fixes #361
509c2680	2019-05-09T13:46:53	Use bias pattern for 4:4:0 (h1v2) fancy upsampling This commit modifies h1v2_fancy_upsample() so that it uses an ordered dither pattern, similar to that of h2v1_fancy_upsample(), rounding up or down the result for alternate pixels rather than always rounding down. This ensures that the decompression error pattern for a 4:4:0 JPEG image will be similar to the rotated decompression error pattern for a 4:2:2 JPEG image. Thus, the final result will be similar regardless of whether a 4:2:2 JPEG image is rotated or transposed before or after decompression. Closes #356
c055c880	2019-05-09T20:36:51	Discontinue support for 32-bit iOS builds
f36d5315	2019-04-23T14:54:23	Merge branch 'master' into dev
aa9db616	2019-04-15T17:55:47	x86 SIMD: Check for CPUID leaf 07H before using According to Intel's manual [1], "If a value entered for CPUID.EAX is higher than the maximum input value for basic or extended function for that processor then the data for the highest basic information leaf is returned." Right now, libjpeg-turbo doesn't first check that leaf 07H is supported before attempting to use it, so the ostensible AVX2 bit (Bit 05) of the CPUID result might actually be Bit 05 from a lower leaf. That bit might be set, even if the CPU doesn't support AVX2. This commit modifies the x86 and x86-64 SIMD feature detection code so that it first checks whether CPUID leaf 07H is supported before attempting to use it to check for AVX2 instruction support. DRC: This commit should fix https://bugzilla.mozilla.org/show_bug.cgi?id=1520760 However, I have not personally been able to reproduce that issue, despite using a Nehalem (pre-AVX2) CPU on which the maximum CPUID leaf has been limited via a BIOS setting. Closes #348 [1] "Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 2 (2A, 2B, 2C & 2D): Instruction Set Reference, A-Z", https://software.intel.com/sites/default/files/managed/a4/60/325383-sdm-vol-2abcd.pdf, page 3-192.
8a0e35b2	2019-04-15T13:41:45	Merge branch 'master' into dev

d7d16df6

2022-02-01T09:11:19

Fix segv w/ h2v2 merged upsamp, jpeg_crop_scanline The h2v2 (4:2:0) merged upsampler uses a spare row buffer so that it can upsample two rows at a time but return only one row to the application, if necessary. merged_2v_upsample() copies from this spare row buffer into the application-supplied output buffer, using the out_row_width field in the my_merged_upsampler struct to determine how many samples to copy. out_row_width is set in jinit_merged_upsampler(), which is called within the body of jpeg_start_decompress(). Since jpeg_crop_scanline() must be called after jpeg_start_decompress(), jpeg_crop_scanline() must modify the value of out_row_width if the h2v2 merged upsampler will be used. Otherwise, merged_2v_upsample() can overflow the output buffer if the number of bytes between the current output buffer position and the end of the buffer is less than the number of bytes required to represent an uncropped scanline of the output image. All of the destination managers used by djpeg allocate either a whole image buffer or a scanline buffer based on the uncropped output image width, so this issue is not reproducible using djpeg. Fixes #574

57ba02a4

2021-10-01T16:28:54

Build: Improve Neon capability detection - Use check_c_source_compiles() rather than check_symbol_exists() to detect the presence of vld1_s16_x3(), vld1_u16_x2(), and vld1q_u8_x4(). check_symbol_exists() is unreliable for detecting intrinsics, and in practice, it did not detect the presence of the aforementioned intrinsics in versions of GCC that support them. - Set DEFAULT_NEON_INTRINSICS=0 for GCC < 12, even if the aforementioned intrinsics are available. The AArch64 back end in GCC 10 and 11 supports the necessary intrinsics, but the GAS implementation is still faster when using those compilers. Fixes #547

73eff6ef

2021-11-30T15:06:54

cjpeg: auto. compr. gray BMP/GIF-->grayscale JPEG aa7459050d7a50e1d8a99488902d41fbc118a50f was supposed to enable this for BMP input images but didn't, due to a similar oversight to the one fixed in the previous commit.

2ce32e0f

2021-11-30T10:54:24

cjpeg: automatically compress PGM-->grayscale JPEG (regression introduced by aa7459050d7a50e1d8a99488902d41fbc118a50f) cjpeg sets cinfo.in_color_space to JCS_RGB as an "arbitrary guess." Since tjLoadImage() never uses JCS_RGB, the PGM reader should treat JCS_RGB the same as JCS_UNKNOWN. Fixes #566

ecf021bc

2021-11-18T21:04:35

cjpeg: Add -strict arg to treat warnings as fatal This adds fault tolerance to the LZW-compressed GIF reader, which is the only compression-side code that can throw warnings.

d401d625

2021-10-27T03:39:09

PowerPC: Detect AltiVec support on FreeBSD Recent FreeBSD/PowerPC compilers, such as Clang 11.0.x on FreeBSD 13, do the equivalent of passing -maltivec to the compiler by default, so run-time AltiVec detection is unnecessary. However, it becomes necessary when using other compilers or when passing -mno-altivec to the compiler. Closes #552

a9c41fbc

2021-10-03T12:43:15

Build: Don't enable Neon SIMD exts with Armv6- When building for 32-bit Arm platforms, test whether basic Neon intrinsics will compile with the specified compiler and C flags. This prevents the build system from enabling the Neon SIMD extensions when targetting Armv6 and other legacy architectures that do not support Neon instructions. Regression introduced by bbd8089297862efb6c39a22b5623f04567ff6443. (Checking whether gas-preprocessor.pl was needed for 32-bit Arm builds had the effect of checking whether Neon instructions were supported.) Fixes #553

739ecbc5

2021-09-30T16:28:51

ChangeLog.md: List CVE ID fixed by 2849d86a

173900b1

2021-09-02T12:48:50

tjTrans: Allow 8x8 crop alignmnt w/odd 4:4:4 JPEGs Fixes #549

5d2430f4

2021-09-02T13:17:32

ChangeLog.md: Add missing sub-header for 2.1.2

129f0cb7

2021-08-25T12:07:58

Neon/AArch64: Don't put GAS functions in .rodata Regression introduced by 240ba417aa4b3174850d05ea0d22dbe5f80553c1 Closes #546

2849d86a

2021-08-06T13:41:15

SSE2/64-bit: Fix trans. segfault w/ malformed JPEG Attempting to losslessly transform certain malformed JPEG images can cause the nbits table index in the Huffman encoder to exceed 32768, so we need to pad the SSE2 implementation of that table to 65536 entries as we do with the C implementation. Regression introduced by 087c29e07f7533ec82fd7eb1dafc84c29e7870ec Fixes #543

a72816ed

2021-07-16T09:37:06

Use uintptr_t, if avail, for pointer-to-int casts Although sizeof(void *) == sizeof(size_t) for all architectures that are currently supported by libjpeg-turbo, such is not guaranteed by the C standard. Specifically, CHERI-enabled architectures (e.g. CHERI-RISC-V or Arm's Morello) use capability pointers that are twice the size of size_t (128 bits for Morello and RV64), so casting to size_t strips the upper bits of the pointer (including the validity bit) and makes it non-deferenceable, as indicated by the following compiler warning: warning: cast from provenance-free integer type to pointer type will give pointer that can not be dereferenced [-Werror,-Wcheri-capability-misuse] cvalue = values = (JCOEF *)PAD((size_t)values_unaligned, 16); Ignoring this warning results in a run-time crash. Casting pointers to uintptr_t, if it is available, avoids this problem, since uintptr_t is defined as an unsigned integer type that can hold a pointer value. Since C89 compatibility is still necessary in libjpeg-turbo, this commit introduces a new typedef for pointer-to-integer casts that uses a GNU-specific extension available in GCC 4.6+ and Clang 3.0+ and falls back to using size_t if the extension is unavailable. The only other options would require C99 or Clang-specific builtins. Closes #538

4d9f256b

2021-07-13T11:52:49

jpegtran: Add option to copy only ICC markers Closes #533

2a2970af

2021-07-09T15:35:56

Neon/AArch32: Work around Clang T32 miscompilation Referring to the C standard (http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf, J.2 Undefined behavior), the behavior of the compiler is undefined if "conversion between two pointer types produces a result that is incorrectly aligned." Thus, the behavior of this code *((uint32_t *)buffer) = BUILTIN_BSWAP32(put_buffer); in the AArch32 version of the FLUSH() macro is undefined unless 'buffer' is 32-bit-aligned. Referring to https://bugs.llvm.org/show_bug.cgi?id=50785, certain versions of Clang, when generating Thumb (T32) instructions, miscompile that code into an assembly instruction (stm) that requires the destination to be 32-bit-aligned. Since such alignment cannot be guaranteed within the Huffman encoder, this reportedly led to crashes (SIGBUS: illegal alignment) with AArch32/Thumb builds of libjpeg-turbo running on Android devices, although thus far I have been unable to reproduce those crashes with a plain Linux/Arm system. The miscompilation is visible with the Compiler Explorer: https://godbolt.org/z/rv1ccx1Pb However, it goes away when removing the return statement from the function. Thus, it seems that Clang's behavior in this regard is somewhat variable, which may explain why the crashes are only reproducible on certain platforms. The suggested workaround is to use memcpy(), but whereas Clang and recent GCC releases are smart enough to compile a 4-byte memcpy() call into a str instruction, GCC < 6 is not. Referring to https://godbolt.org/z/ae7Wje3P6, the only way to consistently produce the desired str instruction across all supported compilers is to use inline assembly. Visual C++ presumably does not miscompile the code in question, since no issues have been reported with it, but since the code relies on undefined compiler behavior, prudence dictates that e4ec23d7ae051c1c73947f889818900362fdc52d should be reverted for Visual C++, which this commit does. The performance impact of e4ec23d7ae051c1c73947f889818900362fdc52d for Visual C++/Arm builds is unknown (I have no ability to test such builds), but regardless, this commit reverts the Visual C++/Arm performance to that of libjpeg-turbo 2.1 beta1. Closes #529

0081c2de

2021-07-07T10:12:46

Neon/AArch32: Fix build if 'soft' float ABI used Arm compilers have three floating point ABI options: 'soft' compiles floating point operations as function calls into a software floating point library, which emulates floating point operations using integer operations. Floating point function arguments are passed using integer registers. 'softfp' also compiles floating point operations as function calls into a floating point library and passes floating point function arguments using integer registers, but the floating point library functions can use FPU instructions if the CPU supports them. 'hard' compiles floating point operations into inline FPU instructions, similarly to x86 and other architectures, and passes floating point function arguments using FPU registers. Not all AArch32 CPUs have FPUs or support Neon instructions, so on Linux and Android platforms, the AArch32 SIMD dispatcher in libjpeg-turbo only enables the Neon SIMD extensions at run time if /proc/cpuinfo indicates that the CPU supports Neon instructions or if Neon instructions are explicitly enabled (e.g. by passing -mfpu=neon to the compiler.) In order to support all AArch32 CPUs using the same code base, i.e. to support run-time FPU and Neon auto-detection, it is necessary to compile the scalar C source code using -mfloat-abi=soft. However, the 'soft' floating point ABI cannot be used when compiling Neon intrinsics, so the intrinsics implementation of the Neon SIMD extensions must be compiled using -mfloat-abi=softfp if the scalar C source code is compiled using -mfloat-abi=soft. This commit modifies the build system so that it detects whether -mfloat-abi=softfp must be explicitly added to the compiler flags when building the intrinsics implementation of the Neon SIMD extensions. This will be necessary if the build is using the 'soft' floating point ABI along with run-time auto-detection of Neon instructions. Fixes #523

1a1fb615

2021-06-18T09:46:03

ChangeLog.md: List CVE ID fixed by c76f4a08 Referring to #527, the security community did not assign this CVE ID until more than 8 months after the fix for the issue was released. By the time they assigned the ID, libjpeg-turbo already had two production releases containing the fix. This calls into question the usefulness of assigning a CVE ID to the issue, particularly given that the buffer overrun in question was fully contained in the stack, not detectable with valgrind, and confined to lossless transformation (it did not affect JPEG compression or decompression.) https://vuldb.com/?id.176175 says that "the exploitability is told to be easy" but provides no clarification, and given that the author of that page does not seem to be aware that a fix for the issue has been available since early December of 2019, it calls into question the accuracy of everything else on the page. It would really be nice if the security community approached me about these things before wasting my time, but I guess it's my lot in life to modify a change log entry from 2019 to include a CVE ID from 2020. So it goes...

3932190c

2021-05-17T13:05:16

Fix build w/ non-GCC-compatible Un*x/Arm compilers Regression introduced by d2c407995992be1f128704ae2479adfd7906c158 Closes #519

4f51f36e

2021-04-23T11:42:40

Bump version to 2.1.0 to prepare for final release

e0606daf

2021-04-21T14:49:06

TurboJPEG: Update JPEG buf ptrs on comp/xform err When using the in-memory destination manager, it is necessary to explicitly call the destination manager's term_destination() method if an error occurs. That method is called by jpeg_finish_compress() but not by jpeg_abort_compress(). This fixes a potential double free() that could occur if tjCompress*() or tjTransform() returned an error and the calling application tried to clean up a JPEG buffer that was dynamically re-allocated by one of those functions.

f35fd27e

2021-04-06T12:51:03

tjLoadImage: Fix issues w/loading 16-bit PPMs/PGMs - The PPM reader now throws an error rather than segfaulting (due to a buffer overrun) if an application attempts to load a 16-bit PPM file into a grayscale uncompressed image buffer. No known applications allowed that (not even the test applications in libjpeg-turbo), because that mode of operation was never expected to work and did not work under any circumstances. (In fact, it was necessary to modify TJBench in order to reproduce the issue outside of a fuzzing environment.) This was purely a matter of making the library bow out gracefully rather than crash if an application tries to do something really stupid. - The PPM reader now throws an error rather than generating incorrect pixels if an application attempts to load a 16-bit PGM file into an RGB uncompressed image buffer. - The PPM reader now correctly loads 16-bit PPM files into extended RGB uncompressed image buffers. (Previously it generated incorrect pixels unless the input colorspace was JCS_RGB or JCS_EXT_RGB.) The only way that users could have potentially encountered these issues was through the tjLoadImage() function. cjpeg and TJBench were unaffected.

c81e91e8

2021-04-05T16:08:22

TurboJPEG: New flag for limiting prog JPEG scans This also fixes timeouts reported by OSS-Fuzz.

e795afc3

2021-03-25T22:36:15

SSE2: Fix prog Huff enc err if Sl%32==0 && Al!=0 (regression introduced by 16bd984557fa2c490be0b9665e2ea0d4274528a8) This implements the same fix for jsimd_encode_mcu_AC_refine_prepare_sse2() that a81a8c137b3f1c65082aa61f236aa88af61b3ad4 implemented for jsimd_encode_mcu_AC_first_prepare_sse2(). Based on: https://github.com/MegaByte/libjpeg-turbo/commit/1a59587397150c9ef9dffc5813cb3891db4bc0c8 https://github.com/MegaByte/libjpeg-turbo/commit/eb176a91d87a470bf8c987be786668aa944dd1dd Fixes #509 Closes #510

ed70101d

2021-03-15T12:36:55

ChangeLog.md: List CVE ID fixed by 1719d12e Referring to https://bugzilla.redhat.com/show_bug.cgi?id=1937385#c2, it is my opinion that the severity of this bug was grossly overstated and that a CVE never should have been assigned to it, but since one was assigned, users need to know which version of libjpeg-turbo contains the fix. Dear security community, please learn what "DoS" actually means and stop misusing that term for dramatic effect. Thanks.

1719d12e

2021-01-14T18:35:15

cjpeg: Fix FPE when compressing 0-width GIF Fixes #493

74e6ea45

2021-01-05T20:23:11

Neon: Fix Huffman enc. error w/Visual Studio+Clang The GNU builtin function __builtin_clzl() accepts an unsigned long argument, which is 8 bytes wide on LP64 systems (most Un*x systems, including Mac) but 4 bytes wide on LLP64 systems (Windows.) This caused the Neon intrinsics implementation of Huffman encoding to produce mathematically incorrect results when compiled using Visual Studio with Clang. This commit changes all invocations of __builtin_clzl() in the Neon SIMD extensions to __builtin_clzll(), which accepts an unsigned long long argument that is guaranteed to be 8 bytes wide on all systems. Fixes #480 Closes #490

c7ca521b

2020-11-28T06:38:27

Fix uninitialized read in decompress_smooth_data() Regression introduced by 42825b68d570fb07fe820ac62ad91017e61e9a25 Referring to the discussion in #459, the OSS-Fuzz test case https://github.com/libjpeg-turbo/libjpeg-turbo/files/5597075/clusterfuzz-testcase-minimized-pngsave_buffer_fuzzer-5728375846731776.txt created a situation in which cinfo->output_iMCU_row > cinfo->master->last_good_iMCU_row but cinfo->input_scan_number == 1 thus causing decompress_smooth_data() to read from prev_coef_bits_latch[], which was uninitialized. I was unable to create the same situation with a real JPEG image.

ccaba5d7

2020-11-25T14:55:55

Fix buffer overrun with certain narrow prog JPEGs Regression introduced by 6d91e950c871103a11bac2f10c63bf998796c719 last_block_column in decompress_smooth_data() can be 0 if, for instance, decompressing a 4:4:4 image of width 8 or less or a 4:2:2 or 4:2:0 image of width 16 or less. Since last_block_column is an unsigned int, subtracting 1 from it produced 0xFFFFFFFF, the test in line 590 passed, and we attempted to access blocks from a second block column that didn't actually exist. Closes #476

8cf6f716

2020-11-24T21:32:48

Bump revision to 2.0.90 to prepare for beta

eb14189c

2020-11-17T12:48:49

Fix Neon SIMD build issues with Visual Studio - Use the _M_ARM and _M_ARM64 macros provided by Visual Studio for compile-time detection of Arm builds, since __arm__ and __aarch64__ are only present in GNU-compatible compilers. - Neon/intrinsics: Use the _CountLeadingZeros() and _CountLeadingZeros64() intrinsics provided by Visual Studio, since __builtin_clz() and __builtin_clzl() are only present in GNU-compatible compilers. - Neon/intrinsics: Since Visual Studio does not support static vector initialization, replace static initialization of Neon vectors with the appropriate intrinsics. Compared to the static initialization approach, this produces identical assembly code with both GCC and Clang. - Neon/intrinsics: Since Visual Studio does not support inline assembly code, provide alternative code paths for Visual Studio whenever inline assembly is used. - Build: Set FLOATTEST appropriately for AArch64 Visual Studio builds (Visual Studio does not emit fused multiply-add [FMA] instructions by default for such builds.) - Neon/intrinsics: Move temporary buffer allocation outside of nested loops. Since Visual Studio configures Arm builds with a relatively small amount of stack memory, attempting to allocate those buffers within the inner loops caused a stack overflow. Closes #461 Closes #475

91dd3b23

2020-11-24T19:22:38

ChangeLog: macOS Armv8/x86-64 univ. binary support

6d91e950

2020-10-05T13:37:44

Use 5x5 win & 9 AC coeffs when smoothing DC scans ... of progressive images. Based on: https://github.com/mo271/libjpeg-turbo/commit/be8d36d13b79a472e56da0717ba067e6139bc0e1 https://github.com/mo271/libjpeg-turbo/commit/9d528f278ee3a5ba571c0b9ec4567c557614fb25 https://github.com/mo271/libjpeg-turbo/commit/85f36f0765ea2c28909fc4c0e570cd68d3a1ed85 https://github.com/mo271/libjpeg-turbo/commit/63a4d39e387f61bcb83b393838f436b410b97308 https://github.com/mo271/libjpeg-turbo/commit/51336a6ad5acb9379dc8e3e5e5758fd439224b7c Closes #459 Closes #474

8f830598

2020-11-13T15:21:26

Merge branch 'master' into dev

3e9e7c70

2020-11-11T17:54:06

Fix build if WITH_12BIT==1 && WITH_JPEG(7|8)==1 Fixes #466

bbd80892

2020-11-10T17:54:14

Neon: Finalize intrinsics implementation - Remove gas-preprocessor.pl. None of the compilers that can build the new intrinsics implementation require gas-preprocessor.pl (tested with Xcode and with Clang 3.9+ for Linux.) - Document that Xcode 6.3.x or later is now required for iOS builds (older versions of Xcode do not have a full set of Neon intrinsics.) - Add a change log entry. - Do not enable the ASM CMake language unless NEON_INTRINSICS is false. - Add a Clang/Arm64 test to .travis.yml in order to test the new intrinsics implementation. Closes #455

240ba417

2020-01-07T16:40:32

Neon: Intrinsics impl. of prog. Huffman encoding The previous AArch64 GAS implementation has been removed, since the intrinsics implementation provides the same or better performance. There was no previous AArch32 GAS implementation.

7c1a1789

2020-11-05T16:04:55

Merge branch 'master' into dev

6e632af9

2020-11-04T10:13:06

Demote "fast" [I]DCT algorithms to legacy status - Refer to the "slow" [I]DCT algorithms as "accurate" instead, since they are not slow under libjpeg-turbo. - Adjust documentation claims to reflect the fact that the "slow" and "fast" algorithms produce about the same performance on AVX2-equipped CPUs (because of the dual-lane nature of AVX2, it was not possible to accelerate the "fast" algorithm beyond what was achievable with SSE2.) Also adjust the claims to reflect the fact that the "fast" algorithm tends to be ~5-15% faster than the "slow" algorithm on non-AVX2-equipped CPUs, regardless of the use of the libjpeg-turbo SIMD extensions. - Indicate the legacy status of the "fast" and float algorithms in the documentation and cjpeg/djpeg usage info. - Remove obsolete paragraph in the djpeg man page that suggested that the float algorithm could be faster than the "fast" algorithm on some CPUs.

88ae6098

2020-10-27T13:28:56

Merge branch 'ijg' into dev - Restore GIF read/compressed GIF write support from jpeg-6a and jpeg-9d. - Integrate jpegtran -wipe and -drop options from jpeg-9a and jpeg-9d. - Integrate jpegtran -crop extension (for expanding the image size) from jpeg-9a and jpeg-9d. - Integrate other minor code tweaks from jpeg-9*

59352195

2020-10-19T21:17:46

Merge branch 'master' into dev

1ed312ea

2020-10-15T17:47:31

"ARM"="Arm", "NEON"="Neon" Refer to: https://www.arm.com/company/policies/trademarks/arm-trademark-list/arm-trademark https://www.arm.com/company/policies/trademarks/arm-trademark-list/neon-trademark NOTE: These changes are only applied to change log entries for 2.0.x and later, since the change log is a historical record and Arm's new trademark policy did not go into effect until late 2017.

b8200c66

2019-03-08T11:57:54

Build: Add CMake package config files Based on: https://github.com/hjmallon/libjpeg-turbo/commit/d34b89b41134bd2b581e222514ee493594193d87 Closes #339 Closes #342

460dfe40

2020-10-14T14:46:30

ChangeLog: Acknowledge 2.0.6 release

ae08115d

2020-10-15T10:25:46

Merge branch 'master' into dev

190382b7

2020-10-14T15:14:26

ChangeLog: Fix minor formatting issue

8789a5e2

2020-10-01T21:27:47

Merge branch 'master' into dev

89c88c25

2020-10-01T21:24:27

ChangeLog.md: jpeg_crop_scanline(), not scanlines

2ec4a5eb

2020-10-01T19:18:44

Fix dec artifacts w/cropped+smoothed prog DC scans This commit modifies decompress_smooth_data(), adding missing MCU column offsets to the prev_block_row and next_block_row indices that are used for block rows other than the first and last. Effectively, this eliminates unexpected visual artifacts when using jpeg_crop_scanline() along with interblock smoothing while decompressing the DC scan of a progressive JPEG image. Based on: https://github.com/mo271/libjpeg-turbo/commit/0227d4fb484e6baf1565163211ee64e52e7b96bd Fixes #456 Closes #457

6ab61fa1

2020-09-13T17:02:09

Merge branch 'master' into dev

6ee5d5f5

2020-07-28T18:06:20

ARMv8 NEON: Support Windows builds w/AArch64 MinGW Based on: https://github.com/mstorsjo/libjpeg-turbo/commit/c5ef6659285a7d5bc74c679aa87ad187186cf7e1 Closes #438

fe79f56b

2020-07-28T15:09:00

Merge branch 'master' into dev

c1037f43

2020-07-28T14:57:47

Fix bad return val when skipping past end of image Fixes #439

a46c111d

2020-07-27T14:21:23

Further jpeg_skip_scanlines() fixes - Introduce a partial image decompression regression test script that validates the correctness of jpeg_skip_scanlines() and jpeg_crop_scanlines() for a variety of cropping regions and libjpeg settings. This regression test catches the following issues: #182, fixed in 5bc43c7821df982f65aa1c738f67fbf7cba8bd69 #237, fixed in 6e95c08649794f5018608f37250026a45ead2db8 #244, fixed in 398c1e9acc9b4531edceb3d77da0de5744675052 #441, fully fixed in this commit It does not catch the following issues: #194, fixed in 773040f9d949d5f313caf7507abaf4bd5d7ffa12 #244 (additional segfault), fixed in 9120a247436e84c0b4eea828cb11e8f665fcde30 - Modify the libjpeg-turbo regression test suite (make test) so that it checks for the issue reported in #441 (segfault in jpeg_skip_scanlines() when used with 4:2:0 merged upsampling/color conversion.) - Fix issues in jpeg_skip_scanlines() that caused incorrect output with h2v2 (4:2:0) merged upsampling/color conversion. The previous commit fixed the segfault reported in #441, but that was a symptom of a larger problem. Because merged 4:2:0 upsampling uses a "spare row" buffer, it is necessary to allow the upsampler to run when skipping rows (fancy 4:2:0 upsampling, which uses context rows, also requires this.) Otherwise, if skipping starts at an odd-numbered row, the output image will be incorrect. - Throw an error if jpeg_skip_scanlines() is called with two-pass color quantization enabled. With two-pass color quantization, the first pass occurs within jpeg_start_decompress(), so subsequent calls to jpeg_skip_scanlines() interfere with the multipass state and prevent the second pass from occurring during subsequent calls to jpeg_read_scanlines().

9120a247

2020-07-23T21:24:38

Fix jpeg_skip_scanlines() segfault w/merged upsamp The additional segfault mentioned in #244 was due to the fact that the merged upsamplers use a different private structure than the non-merged upsamplers. jpeg_skip_scanlines() was assuming the latter, so when merged upsampling was enabled, jpeg_skip_scanlines() clobbered one of the IDCT method pointers in the merged upsampler's private structure. For reasons unknown, the test image in #441 did not encounter this segfault (too small?), but it encountered an issue similar to the one fixed in 5bc43c7821df982f65aa1c738f67fbf7cba8bd69, whereby it was necessary to set up a dummy postprocessing function in read_and_discard_scanlines() when merged upsampling was enabled. Failing to do so caused either a segfault in merged_2v_upsample() (due to a NULL pointer being passed to jcopy_sample_rows()) or an error ("Corrupt JPEG data: premature end of data segment"), depending on the number of scanlines skipped and whether the first scanline skipped was an odd- or even-numbered row. Fixes #441 Fixes #244 (for real this time)

c965dc7a

2020-07-22T13:59:27

ChangeLog.md: Add missing sub-header for 2.0.6

b9142b21

2020-07-22T13:24:51

Android: Fix "using JNI after critical get" errors (again.) Fixes #300

4c5a15c3

2020-06-25T19:08:19

Eliminate 32-bit Mac build/packaging support The scales have now tilted overwhelmingly in favor of eliminating support for 32-bit Macs: - 32-bit applications are only necessary in order to support OS X 10.5 "Leopard" and OS X 10.6 "Snow Leopard". OS X 10.7 "Lion" requires a 64-bit Mac and supports all 64-bit Macs. - 32-bit applications are no longer allowed in the macOS App Store. - 32-bit applications no longer run in macOS 10.15 "Catalina". - 32-bit applications do not support thread-local storage, so the TurboJPEG API library's global error handler is not thread-safe with such applications. - libjpeg-turbo 2.1.x no longer supports 32-bit iOS apps, so it makes sense to also eliminate support for 32-bit macOS applications. It's time.

aecee256

2020-06-19T00:03:51

Merge branch 'master' into dev

ae87a958

2020-06-16T13:52:39

TurboJPEG: Make global error handling thread-safe ... on platforms that support thread-local storage. This currently includes all supported platforms except 32-bit macOS. Fixes #396

b443c541

2020-06-03T16:08:08

ChangeLog.md: Add missing sub-header for 2.0.5

cf483eee

2020-06-03T16:04:06

ChangeLog.md: List CVE ID fixed by previous commit

70040cb7

2020-06-02T15:05:43

Merge branch 'master' into dev

3de15e0c

2020-06-02T14:15:37

rdppm.c: Fix buf overrun caused by bad binary PPM This extends the fix in 1e81b0c3ea26f4ea8f56de05367469333de64a9f to include binary PPM files with maximum values < 255, thus preventing a malformed binary PPM input file with those specifications from triggering an overrun of the rescale array and potentially crashing cjpeg, TJBench, or any program that uses the tjLoadImage() function. Fixes #433

b0f92a1d

2020-03-04T11:06:45

Merge branch 'master' into dev

8cc1277b

2020-02-24T13:29:50

TJCompressor.compress(int): Fix YUV-to-JPEG error Due to an oversight, the TJCompressor.compress(int) method did not handle YUV source images. Fixes #413

77ff3bd6

2020-02-18T12:56:01

Merge branch 'master' into dev

ecf5f9a9

2020-02-18T10:43:23

Bump version to 2.0.5; Document previous commit

c4675d62

2019-12-31T00:42:53

Merge branch 'master' into dev

b542e4c8

2019-12-20T13:18:23

ARMv8 SIMD: Support execute-only memory (XOM) Move constants out of the .text section in simd/arm64/jsimd_neon.S and into a .rodata section. This ensures that the ARMv8 NEON SIMD extensions are compatible with memory layouts that are marked execute-only (and thus unreadable.) Based on: https://github.com/ivanloz/libjpeg-turbo/commit/88f3ca7664fadfb5e106efecb7845753aaf330b7 Closes #318

e98b0612

2019-12-18T15:12:33

Add fault tolerance features to djpeg and jpegtran - Enable progress reporting at run time using a new -report argument (cjpeg now supports that argument as well) - Limit the allowable number of scans using a new -maxscans argument - Treat warnings as fatal using a new -strict argument This mainly demonstrates how to work around the two issues with the JPEG standard described here: https://libjpeg-turbo.org/pmwiki/uploads/About/TwoIssueswiththeJPEGStandard.pdf since those and similar issues continue to be erroneously reported as libjpeg-turbo bugs.

54288598

2019-12-17T15:37:50

ChangeLog.md: Document 81b8c0ee

e821464f

2018-04-03T12:47:54

ARM64 NEON SIMD impl. of prog. Huffman encoding This commit adds ARM64 NEON optimizations for the encode_mcu_AC_first() and encode_mcu_AC_refine() functions used in progressive Huffman encoding. Compression speedups for the typical set of five libjpeg-turbo test images (https://libjpeg-turbo.org/About/Performance): Cortex-A53: 23.8-39.2% (avg. 32.2%) Cortex-A72: 26.8-41.1% (avg. 33.5%) Apple A7: 29.7-45.9% (avg. 39.6%) Closes #229

f64c5508

2019-12-05T15:39:28

Merge branch 'master' into dev

c76f4a08

2019-12-05T13:12:28

Huffman enc.: Fix very rare local buffer overrun ... detected by ASan. This is a similar issue to the issue that was fixed with 402a715f82313384ef4606660c32d8678c79f197. Apparently it is possible to create a malformed JPEG image that exceeds the Huffman encoder's 256-byte local buffer when attempting to losslessly tranform the image. That makes sense, given that it was necessary to extend the Huffman decoder's local buffer to 512 bytes in order to handle all pathological cases (refer to 0463f7c9aad060fcd56e98d025ce16185279e2bc.) Since this issue affected only lossless transformation, a workflow that isn't generally exposed to arbitrary data exploits, and since the overrun did not overflow the stack (i.e. it did not result in a segfault or other user-visible issue, and valgrind didn't even detect it), it did not likely pose a security risk. Fixes #392

26b4d62a

2019-11-15T13:57:56

Merge branch 'master' into dev

c0b16e3d

2019-11-15T13:29:11

TurboJPEG: Fix erroneous subsampling detection ... that caused some JPEG images with unusual sampling factors to be misidentified as 4:4:4. This led to a buffer overflow when attempting to decompress some such images using tjDecompressToYUV*(). Regression introduced by 479501b07c0afd8912a0e0f5b4740cfce9a89a4c The correct behavior is for the TurboJPEG API to refuse to decompress such images, which it did prior to the aforementioned commit. Fixes #389

6cedf37c

2019-11-15T12:36:11

ChangeLog.md: List CVE IDs for specific fixes

c43a5081

2019-11-12T17:07:46

Merge branch 'master' into dev

bd20344b

2019-11-12T15:45:20

tjDecompressToYUV*(): Fix OOB write/double free ... when attempting to decompress grayscale JPEG images with sampling factors != 1. Fixes #387

c30b1e72

2019-11-12T12:27:22

64-bit tjbench: Fix signed int overflow/segfault ... that occurred when attempting to decompress images with more than 715827882 (2048*1024*1024 / 3) pixels. Fixes #388

713c451f

2019-11-08T14:53:55

Enable SSE2 progressive Huffman encoder for x32 Referring to #289, I'm not sure where I arrived at the conclusion that the SSE2 progressive Huffman encoder doesn't provide any speedup for x32. Upon re-testing, I discovered it to be about 50% faster than the C encoder. This commit also re-purposes one of the CI tests (specifically, the jpeg-7 API/ABI test) so that it tests x32 as well.

2234deee

2019-11-08T13:51:34

Fix MSan use-of-uninitialized-value error ... introduced by 42825b68d570fb07fe820ac62ad91017e61e9a25. In fact, fault-tolerant multi-scan block smoothing cannot currently be used with the arithmetic decoder, because that decoder doesn't have any way of distinguishing a normal end of scan from an unexpected end of scan. Thus, this commit also modifies the change log to reset the expectations regarding the scope of the fault-tolerant multi-scan block smoothing feature. If, at some point in the future, the arithmetic decoder can be modified to detect an unexpected end of scan, then one would need only set entropy->pub.insufficient_data = TRUE when the arithmetic decoder encounters an unexpected end of scan in order to make fault-tolerant block smoothing work properly with that decoder.

82695cfd

2019-11-07T15:15:36

ChangeLog: Acknowledge 2.0.3/de-acknowledge 2.0.4 It is very unlikely that there will be an official 2.0.4 release.

42825b68

2019-11-07T14:03:23

Fault-tolerant multi-scan block smoothing This commit modifies the behavior of the block smoothing algorithm in the libjpeg API library so that, if a scan in a multi-scan JPEG image is incomplete (due to premature termination of the image stream), the block smoothing parameters from the previous (complete) scan are used to smooth any iMCU rows that the incomplete scan does not contain. Closes #343

087c29e0

2018-10-22T10:05:18

Optimize Huffman encoding This commit improves the C and SSE2 Huffman encoding implementations in the following ways: - Avoid using xmm8-xmm15 in the x86-64 SSE2 implementation. There is no actual need to use those registers, and avoiding them produces a cleaner WIN64 function entry/exit-- as well as shorter code, since REX prefixes can be avoided (this is helpful on certain CPUs, such as Intel Atom, for which instruction fetch and decoding can be a bottleneck.) - Optimize register usage so that fewer REX prefixes and register-register moves are needed. - Use the bit counter to store the number of free bits in the bit buffer rather than the number of bits in the bit buffer. This changes the method for inserting a code into the bit buffer to: (put_buffer |= code << (free_bits -= code_size)); As a result: * Only one bit counter needs to stay in a register (we just keep it in cl.) * The bit buffer contents are already properly aligned to be written out (after a byte swap.) * Adjusting the free bits counter and checking if the bit buffer is full can be combined into a single operation. * We can wait to flush the bit buffer until the buffer is actually full and not just in danger of becoming full. Thus, eight bytes can be flushed at a time. - Speed is quite sensitive to the alignment of branch target labels, so insert some padding and remove branches from the flush code. (Flushing this way isn't actually faster when compared to using branches, but the branchless code doesn't need extra alignment and is thus smaller.) - Speculatively write out the bit buffer as a single 8-byte write, falling back to a byte-by-byte write only if there are any 0xFF bytes in the bit buffer that need to be encoded as 0xFF 0x00. - Use MMX registers for the 32-bit implementation (so the bit buffer can be 64 bits wide.) - Slightly reduce overall function code size. - Eliminate or combine a few SSE instructions. - Make some minor improvements to instruction scheduling. - Adjust flush_bits() in jchuff.c to handle cases in which the bit buffer has less than 7 free bits (apparently that couldn't happen before.) Based on: https://github.com/1camper/libjpeg-turbo/commit/947a09defa2ec848322b1bae050d1b57b316a32a https://github.com/1camper/libjpeg-turbo/commit/262ebb6b816fd8a49ff4d7185f6c5153dddde02f https://github.com/1camper/libjpeg-turbo/commit/6e9a091221bb244c8ba232a942650e94254ffcf0 See change log for performance claims. Closes #292

95f4d6ef

2019-10-24T02:13:23

Merge branch 'master' into dev

708f013f

2019-10-22T20:08:57

Win packaging: Fix 64-bit VC/GCC co-install issue

2ae2bd66

2019-10-04T16:05:11

Remove support for Apple's Java implementation (AKA "Java for OS X systems.") This implementation of Java 1.6 is long obsolete and not supported on any version of macOS past High Sierra. Oracle no longer provides a 32-bit JVM on macOS, so it is no longer necessary to provide a 32-bit version of the TurboJPEG Java wrapper on macOS.

b4110b65

2019-09-04T18:58:12

Merge branch 'master' into dev

ded5a504

2019-08-15T13:24:25

tjDecodeYUV*: Fix err if TJ inst used for prog dec If the TurboJPEG instance passed to tjDecodeYUV[Planes]() was previously used to decompress a progressive JPEG image, then we need to disable the progressive decompression parameters in the underlying libjpeg instance before calling jinit_master_decompress(). This commit also modifies the build system so that the "tjtest" target will test for this issue, and it corrects a previous oversight in the build system whereby tjbenchtest did not test progressive compression/decompression unless WITH_JAVA was true.

8ef53b10

2019-08-14T22:08:59

Merge branch 'master' into dev

c0d0fe86

2019-08-14T22:08:44

ChangeLog.md: Wordsmithing

a81a8c13

2019-08-14T13:17:11

SSE2 SIMD: Fix prog Huffman enc. error if Sl%16==0 (regression introduced by 5b177b3cab5cfb661256c1e74df160158ec6c34e) The SSE2 implementation of progressive Huffman encoding performed extraneous iterations when the scan length was a multiple of 16. Based on: https://github.com/rouault/libjpeg-turbo/commit/bb7f1ef98305da915e581b59bd0ec2ef7bdb8468 Fixes #335 Closes #367

7fbfe29c

2019-07-18T15:18:27

Merge branch 'master' into dev

2a9e3bd7

2019-07-11T15:30:04

TurboJPEG: Properly handle gigapixel images Prevent several integer overflow issues and subsequent segfaults that occurred when attempting to compress or decompress gigapixel images with the TurboJPEG API: - Modify tjBufSize(), tjBufSizeYUV2(), and tjPlaneSizeYUV() to avoid integer overflow when computing the return values and to return an error if such an overflow is unavoidable. - Modify tjunittest to validate the above. - Modify tjCompress2(), tjEncodeYUVPlanes(), tjDecompress2(), and tjDecodeYUVPlanes() to avoid integer overflow when computing the row pointers in the 64-bit TurboJPEG C API. - Modify TJBench (both C and Java versions) to avoid overflowing the size argument to malloc()/new and to fail gracefully if such an overflow is unavoidable. In general, this allows gigapixel images to be accommodated by the 64-bit TurboJPEG C API when using automatic JPEG buffer (re)allocation. Such images cannot currently be accommodated without automatic JPEG buffer (re)allocation, due to the fact that tjAlloc() accepts a 32-bit integer argument (oops.) Such images cannot be accommodated in the TurboJPEG Java API due to the fact that Java always uses a signed 32-bit integer as an array index. Fixes #361

509c2680

2019-05-09T13:46:53

Use bias pattern for 4:4:0 (h1v2) fancy upsampling This commit modifies h1v2_fancy_upsample() so that it uses an ordered dither pattern, similar to that of h2v1_fancy_upsample(), rounding up or down the result for alternate pixels rather than always rounding down. This ensures that the decompression error pattern for a 4:4:0 JPEG image will be similar to the rotated decompression error pattern for a 4:2:2 JPEG image. Thus, the final result will be similar regardless of whether a 4:2:2 JPEG image is rotated or transposed before or after decompression. Closes #356

c055c880

2019-05-09T20:36:51

Discontinue support for 32-bit iOS builds

f36d5315

2019-04-23T14:54:23

Merge branch 'master' into dev

aa9db616

2019-04-15T17:55:47

x86 SIMD: Check for CPUID leaf 07H before using According to Intel's manual [1], "If a value entered for CPUID.EAX is higher than the maximum input value for basic or extended function for that processor then the data for the highest basic information leaf is returned." Right now, libjpeg-turbo doesn't first check that leaf 07H is supported before attempting to use it, so the ostensible AVX2 bit (Bit 05) of the CPUID result might actually be Bit 05 from a lower leaf. That bit might be set, even if the CPU doesn't support AVX2. This commit modifies the x86 and x86-64 SIMD feature detection code so that it first checks whether CPUID leaf 07H is supported before attempting to use it to check for AVX2 instruction support. DRC: This commit should fix https://bugzilla.mozilla.org/show_bug.cgi?id=1520760 However, I have not personally been able to reproduce that issue, despite using a Nehalem (pre-AVX2) CPU on which the maximum CPUID leaf has been limited via a BIOS setting. Closes #348 [1] "Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 2 (2A, 2B, 2C & 2D): Instruction Set Reference, A-Z", https://software.intel.com/sites/default/files/managed/a4/60/325383-sdm-vol-2abcd.pdf, page 3-192.

8a0e35b2

2019-04-15T13:41:45

Merge branch 'master' into dev

kc3-lang/libjpeg-turbo/ChangeLog.md

ChangeLog.md

Log