|
54792ba3
|
2015-09-16T23:05:46
|
|
Fix MIPS DSPr2 4:2:0 upsample bug w/ small images
The DSPr2 code was errantly comparing the residual (t9, width & 0xF)
with the end pointer (t4, out + width) instead of the width directly
(a1). This would give the wrong results with any image whose output
width was less than 16. The other small changes (ulw to lw and removal
of the nop) are just some easy optimizations around this code.
This issue caused a buffer overrun and subsequent segfault on images
whose scaled output height was 1 pixel and whose scaled output width was
< 16 pixels. Note that the "plain" (non-fancy and non-merged) upsample
routine, which was affected by this bug, is normally not used except
when decompressing a non-YCbCr JPEG image, but it is also used when
decompressing a single-row image (because the other upsampling
algorithms require at least two rows.)
Closes #16.
|
|
498d9bc9
|
2015-09-15T11:57:03
|
|
Fix x86-64 ABI conformance issue in SIMD code
(descriptions cribbed by DRC from discussion in #20)
In the x86-64 ABI, the high (unused) DWORD of a 32-bit argument's
register is undefined, so it was incorrect to use a 64-bit mov
instruction to transfer a JDIMENSION argument in the 64-bit SSE2 SIMD
functions. The code worked thus far only because the existing compiler
optimizers weren't smart enough to do anything else with the register in
question, so the upper 32 bits happened to be all zeroes-- for the past
6 years, on every x86-64 compiler previously known to mankind.
The bleeding-edge Clang/LLVM compiler has a smarter optimizer, and
under certain circumstances, it will attempt to load-combine adjacent
32-bit integers from one of the libjpeg structures into a single 64-bit
integer and pass that 64-bit integer as a 32-bit argument to one of the
SIMD functions (which is allowed by the ABI, since the upper 32 bits of
the 32-bit argument's register are undefined.) This caused the
libjpeg-turbo regression tests to crash.
Also enhance the documentation of JDIMENSION to explain that its size
is significant to the implementation of the SIMD code.
Closes #20. Refer also to http://crbug.com/532214.
|
|
eea64245
|
2015-06-20T15:51:34
|
|
Typo
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.4.x@1570 632fc199-4ca6-4c93-a231-07263d6284db
|
|
f15ef337
|
2015-06-08T17:41:34
|
|
Fix a segfault that occured in the MIPS DSPr2 fancy upsampling routine when downsampled_width==3. Because the DSPr2 code unrolls the loop for the middle columns (refer to jdsample.c), it has the effect of performing two column iterations, and that only works properly if the number of columns (minus the first and last) is >= 2. For the specific case of downsampled_width==3, this patch skips to the second iteration of the unrolled column loop.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.4.x@1562 632fc199-4ca6-4c93-a231-07263d6284db
|
|
3b7015d5
|
2015-02-23T19:03:29
|
|
Enable silent build rules for the NASM objects, if the source is configured with automake 1.11 or later. NOTE: the build still spits out "error: ignoring unknown tag NASM" for each object, but unfortunately, if we remove "--tag NASM" from the command line, the build breaks under older versions of automake (it aborts with "unable to infer tagged configuration.")
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.4.x@1534 632fc199-4ca6-4c93-a231-07263d6284db
|
|
8b5a0093
|
2015-01-21T17:42:28
|
|
Oops. The MIPS SIMD implementations of h2v1 and h2v2 upsampling were not checking for DSPr2 support, so running 'djpeg -nosmooth' on a non-DSPr2-enabled platform caused an "illegal instruction" error.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.4.x@1523 632fc199-4ca6-4c93-a231-07263d6284db
|
|
62999d77
|
2014-12-19T15:36:39
|
|
Modify the ARM64 assembly file so that it uses only syntax that the clang assembler in XCode 5.x can understand. These changes should all be cosmetic in nature-- they do not change the meaning or readability of the code nor the ability to build it for Linux. Actually, the code is now more in compliance with the ARM64 programming manual. In addition to these changes, there were a couple of instructions that clang simply doesn't support, so gas-preprocessor.pl was modified so that it now converts those into equivalent instructions that clang can handle.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.4.x@1450 632fc199-4ca6-4c93-a231-07263d6284db
|
|
0a9a2526
|
2014-08-29T01:53:17
|
|
Rename the ARM64 assembly file to match the C file
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1390 632fc199-4ca6-4c93-a231-07263d6284db
|
|
de26249a
|
2014-08-29T01:49:59
|
|
Fix several mathematical issues discovered in the ARM64 NEON code while running the extended regression tests introduced in r1267. Specific comments can be found in the original patches:
https://sourceforge.net/p/libjpeg-turbo/patches/64/
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1389 632fc199-4ca6-4c93-a231-07263d6284db
|
|
a6efae14
|
2014-08-25T15:26:09
|
|
Reformat code per Siarhei's original patch (to clearly indicate that the offset instructions are completely independent) and add Siarhei as an individual author (he no longer works for Nokia.)
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1388 632fc199-4ca6-4c93-a231-07263d6284db
|
|
bdc7650b
|
2014-08-23T15:57:38
|
|
ARM64 NEON SIMD support for YCC-to-RGB565 conversion
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1386 632fc199-4ca6-4c93-a231-07263d6284db
|
|
d729f4da
|
2014-08-23T15:47:51
|
|
ARM NEON SIMD support for YCC-to-RGB565 conversion, and optimizations to the existing YCC-to-RGB conversion code:
-----
https://github.com/ssvb/libjpeg-turbo/commit/aee36252be20054afce371a92406fc66ba6627b5.patch
From aee36252be20054afce371a92406fc66ba6627b5 Mon Sep 17 00:00:00 2001
From: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Date: Wed, 13 Aug 2014 03:50:22 +0300
Subject: [PATCH] ARM: Faster NEON yuv->rgb conversion for Krait and Cortex-A15
The older code was developed and tested only on ARM Cortex-A8 and ARM Cortex-A9.
Tuning it for newer ARM processors can introduce some speed-up (up to 20%).
The performance of the inner loop (conversion of 8 pixels) improves from
~27 cycles down to ~22 cycles on Qualcomm Krait 300, and from ~20 cycles
down to ~18 cycles on ARM Cortex-A15.
The performance remains exactly the same on ARM Cortex-A7 (~58 cycles),
ARM Cortex-A8 (~25 cycles) and ARM Cortex-A9 (~30 cycles) processors.
Also use larger indentation in the source code for separating two independent
instruction streams.
-----
https://github.com/ssvb/libjpeg-turbo/commit/a5efdbf22ce9c1acd4b14a353cec863c2c57557e.patch
From a5efdbf22ce9c1acd4b14a353cec863c2c57557e Mon Sep 17 00:00:00 2001
From: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Date: Wed, 13 Aug 2014 07:23:09 +0300
Subject: [PATCH] ARM: NEON optimized yuv->rgb565 conversion
The performance of the inner loop (conversion of 8 pixels):
* ARM Cortex-A7: ~55 cycles
* ARM Cortex-A8: ~28 cycles
* ARM Cortex-A9: ~32 cycles
* ARM Cortex-A15: ~20 cycles
* Qualcomm Krait: ~24 cycles
Based on the Linaro rgb565 patch from
https://sourceforge.net/p/libjpeg-turbo/patches/24/
but implements better instructions scheduling.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1385 632fc199-4ca6-4c93-a231-07263d6284db
|
|
55e328ec
|
2014-08-22T18:30:44
|
|
Revert r1335 and r1336. It was a valiant effort, but on Windows, xmm8-xmm15 are non-volatile, and the overhead of pushing them onto the stack at the beginning of each function and popping them at the end was causing worse performance (in the neighborhood of 3-5%) than just using the work areas and limiting the register usage to xmm0-xmm7. Best to leave the SSE2 code alone. We can optimize the register usage for AVX2, once that port takes place.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1382 632fc199-4ca6-4c93-a231-07263d6284db
|
|
2e2ce5a1
|
2014-08-22T11:31:46
|
|
.func/.endfunc are only necessary when generating STABS debug info, which basically went out of style with parachute pants and Rick Astley. At any rate, none of the platforms for which we're building the ARM code use it (DWARF is the common format these days), and the .func/.endfunc directives cause the clang integrated assembler to fail (http://llvm.org/bugs/show_bug.cgi?id=20424).
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1375 632fc199-4ca6-4c93-a231-07263d6284db
|
|
8a74848a
|
2014-08-09T22:58:18
|
|
Oops. The Windows version of collect_args/uncollect_args uses rsp, so we still need the rsp prologue/epilogue, despite the fact that we aren't using the stack as a work area. This fixes a segfault on Windows caused by r1335.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1336 632fc199-4ca6-4c93-a231-07263d6284db
|
|
a8ab3424
|
2014-08-09T14:30:28
|
|
Attempt to improve performance by refactoring the compression-side color conversion and DCT algorithms so that they take full advantage of the additional registers available with 64-bit SSE2. This produces a somewhat yawn-worthy speedup of 2-3%, but at least the code is a lot more readable now.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1335 632fc199-4ca6-4c93-a231-07263d6284db
|
|
3728aa01
|
2014-07-23T14:14:14
|
|
Fix performance and other issues uncovered in testing with actual ARM64 hardware; formatting tweaks; remove NEON platform check (NEON is always available with ARMv8)
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1333 632fc199-4ca6-4c93-a231-07263d6284db
|
|
2472bc71
|
2014-06-22T21:14:39
|
|
Add proper support for Borland compilers (Borland needs section names to be prefixed with an underscore, and it needs OMF object files.)
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1328 632fc199-4ca6-4c93-a231-07263d6284db
|
|
495e4342
|
2014-05-19T19:13:22
|
|
Allow for building the MIPS DSPr2 extensions if the host is mips-* as well as mipsel-*. The DSPr2 extensions are little endian, so we still have to check that the compiler defines __MIPSEL__ before enabling them. This paves the way for supporting big-endian MIPS, and in the near term, it allows the SIMD extensions to be built with Sourcery CodeBench.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1316 632fc199-4ca6-4c93-a231-07263d6284db
|
|
5ef46305
|
2014-05-18T20:04:47
|
|
SIMD-accelerated int upsample routine for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1315 632fc199-4ca6-4c93-a231-07263d6284db
|
|
c728cfd8
|
2014-05-18T19:36:05
|
|
Fix MIPS build
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1314 632fc199-4ca6-4c93-a231-07263d6284db
|
|
5033f3e1
|
2014-05-18T18:33:44
|
|
Remove MS-DOS code and information, and adjust copyright headers to reflect the removal of features in r1307 and r1308. libjpeg-turbo has never supported MS-DOS, nor is it even possible for us to do so.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1312 632fc199-4ca6-4c93-a231-07263d6284db
|
|
bc56b754
|
2014-05-16T10:43:44
|
|
Get rid of the HAVE_PROTOTYPES configuration option, as well as the related JMETHOD and JPP macros. libjpeg-turbo has never supported compilers that don't handle prototypes. Doing so requires ansi2knr, which isn't even supported in the IJG code anymore.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1308 632fc199-4ca6-4c93-a231-07263d6284db
|
|
52ded876
|
2014-05-15T20:30:16
|
|
Remove all of the NEED_SHORT_EXTERNAL_NAMES stuff. There is scant information available as to which linkers ever had a 15-character global symbol name limit. AFAICT, it might have been a VMS and/or a.out BSD thing, but none of those platforms have ever been supported by libjpeg-turbo (nor are such systems supported by other open source libraries of this nature.)
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1307 632fc199-4ca6-4c93-a231-07263d6284db
|
|
64086281
|
2014-05-15T19:46:48
|
|
Clean up code formatting in the SIMD interface functions
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.3.x@1306 632fc199-4ca6-4c93-a231-07263d6284db
|
|
1419852c
|
2014-05-15T19:45:11
|
|
Clean up code formatting in the SIMD interface functions
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1305 632fc199-4ca6-4c93-a231-07263d6284db
|
|
1b3fd7ee
|
2014-05-15T18:26:01
|
|
SIMD-accelerated NULL convert routine for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1304 632fc199-4ca6-4c93-a231-07263d6284db
|
|
0bf325b8
|
2014-05-15T17:10:39
|
|
Fix error in MIPS DSPr2 accelerated smooth downsample routine
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1302 632fc199-4ca6-4c93-a231-07263d6284db
|
|
6a61c1e6
|
2014-05-14T15:00:10
|
|
SIMD-accelerated h2v2 smooth downsampling routine for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1301 632fc199-4ca6-4c93-a231-07263d6284db
|
|
b844eaa3
|
2014-05-13T18:40:14
|
|
SIMD-accelerated merged upsampling routines for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1297 632fc199-4ca6-4c93-a231-07263d6284db
|
|
b7753510
|
2014-05-11T09:36:25
|
|
Convert tabs to spaces in the libjpeg code and the SIMD code (TurboJPEG retains the use of tabs for historical reasons. They were annoying in the libjpeg code primarily because they were not consistently used and because they were used to format as well as indent the code. In the case of TurboJPEG, tabs are used just to indent the code, so even if the editor assumes a different tab width, the code will still be readable.)
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.3.x@1285 632fc199-4ca6-4c93-a231-07263d6284db
|
|
e47364ac
|
2014-05-10T10:10:03
|
|
Modify Windows build system to take into account new assembly file names
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1283 632fc199-4ca6-4c93-a231-07263d6284db
|
|
24e92e9f
|
2014-05-10T09:53:34
|
|
Using subdirectories unfortunately opened up a can of worms. In order to prevent object name conflicts, it is necessary to use the subdir-objects automake directive, but it simply doesn't work right on some of the versions of automake we still have to support. Another option would be to add a separate Makefile.am file to each subdirectory, but that requires maintaining a completely different set of build rules for each one. Fortunately, however, we're in the 21st century now, so we can use filenames longer than 8.3.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1282 632fc199-4ca6-4c93-a231-07263d6284db
|
|
72130be9
|
2014-05-09T20:14:26
|
|
Re-organize the x86/x86-64 SIMD routines into separate folders by instruction set so we can name each routine similarly to its corresponding C file. This also makes it easier to add support for new instruction sets.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1280 632fc199-4ca6-4c93-a231-07263d6284db
|
|
1a45b81f
|
2014-05-09T18:06:58
|
|
Remove trailing spaces (+ one additional tab in TJUnitTest.java that was missed in the previous commit)
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1279 632fc199-4ca6-4c93-a231-07263d6284db
|
|
e5eaf374
|
2014-05-09T18:00:32
|
|
Convert tabs to spaces in the libjpeg code and the SIMD code (TurboJPEG retains the use of tabs for historical reasons. They were annoying in the libjpeg code primarily because they were not consistently used and because they were used to format as well as indent the code. In the case of TurboJPEG, tabs are used just to indent the code, so even if the editor assumes a different tab width, the code will still be readable.)
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1278 632fc199-4ca6-4c93-a231-07263d6284db
|
|
771886c1
|
2014-05-09T14:45:55
|
|
Fix an error in the MIPS DSPr2 fancy upsampling routine
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1277 632fc199-4ca6-4c93-a231-07263d6284db
|
|
34347862
|
2014-05-06T09:53:21
|
|
SIMD-accelerated slow integer IDCT routine for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1269 632fc199-4ca6-4c93-a231-07263d6284db
|
|
e189ec7a
|
2014-02-06T19:15:03
|
|
Remove trailing space
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.3.x@1111 632fc199-4ca6-4c93-a231-07263d6284db
|
|
93974690
|
2014-02-06T19:13:24
|
|
Remove trailing space
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1110 632fc199-4ca6-4c93-a231-07263d6284db
|
|
2d07ee51
|
2014-02-05T19:03:41
|
|
Create a separate stub file for 64-bit ARM, since it currently implements only the decompression-related functions.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1109 632fc199-4ca6-4c93-a231-07263d6284db
|
|
ba55b2cd
|
2014-02-05T08:15:44
|
|
First pass at ARMv8 64-bit NEON SIMD support
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1108 632fc199-4ca6-4c93-a231-07263d6284db
|
|
3e00f03a
|
2014-02-05T07:40:00
|
|
Formatting tweaks
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.3.x@1107 632fc199-4ca6-4c93-a231-07263d6284db
|
|
edc846f4
|
2014-02-05T07:39:38
|
|
Formatting tweaks
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1106 632fc199-4ca6-4c93-a231-07263d6284db
|
|
19eeaa79
|
2013-10-31T07:40:24
|
|
Make environment variable syntax consistent between ARM and x86 code, and add an option to disable SIMD on x86 (this option will be added to the x86-64 code as well, but it makes more sense to add it when we add AVX support.)
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1073 632fc199-4ca6-4c93-a231-07263d6284db
|
|
fff6c23a
|
2013-10-12T21:39:20
|
|
SIMD-accelerated integer convsamp routine for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1059 632fc199-4ca6-4c93-a231-07263d6284db
|
|
3d727281
|
2013-10-09T18:39:44
|
|
SIMD-accelerated floating point quantize and convsamp routines for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1058 632fc199-4ca6-4c93-a231-07263d6284db
|
|
d3131c1b
|
2013-10-08T02:18:59
|
|
SIMD-accelerated fast integer inverse DCT routine for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1056 632fc199-4ca6-4c93-a231-07263d6284db
|
|
71e06a7d
|
2013-10-08T02:11:21
|
|
SIMD-accelerated fast integer forward DCT routine for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1055 632fc199-4ca6-4c93-a231-07263d6284db
|
|
a6b7fbd3
|
2013-09-30T18:13:27
|
|
SIMD-accelerated slow integer forward DCT and quantize routines for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1054 632fc199-4ca6-4c93-a231-07263d6284db
|
|
e5005917
|
2013-09-27T17:51:08
|
|
SIMD-accelerated 3/4 and 3/2 decompression scaling for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1047 632fc199-4ca6-4c93-a231-07263d6284db
|
|
2ccf4d1a
|
2013-09-27T17:43:23
|
|
SIMD-accelerated 1/2 and 1/4 decompression scaling for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1046 632fc199-4ca6-4c93-a231-07263d6284db
|
|
49eaa757
|
2013-09-27T17:39:57
|
|
SIMD-optimized RGB-to-grayscale conversion for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1045 632fc199-4ca6-4c93-a231-07263d6284db
|
|
922b14b8
|
2013-09-25T17:33:37
|
|
Fix segfault in MIPS DSPr2 upsample routines that occurred when doing 'make test'
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1040 632fc199-4ca6-4c93-a231-07263d6284db
|
|
371b420e
|
2013-08-23T07:57:21
|
|
Fix 'make dist'
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1024 632fc199-4ca6-4c93-a231-07263d6284db
|
|
16962c11
|
2013-07-27T21:50:02
|
|
SIMD support for performing upsampling using MIPS DSPr2 instructions
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@996 632fc199-4ca6-4c93-a231-07263d6284db
|
|
6f2d3c2c
|
2013-07-27T21:48:18
|
|
SIMD support for performing downsampling using MIPS DSPr2 instructions
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@995 632fc199-4ca6-4c93-a231-07263d6284db
|
|
86fbf35f
|
2013-07-27T21:44:14
|
|
SIMD support for performing fancy upsampling using MIPS DSPr2 instructions
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@994 632fc199-4ca6-4c93-a231-07263d6284db
|
|
0be9fa57
|
2013-07-24T21:50:20
|
|
SIMD support for performing color conversion using MIPS DSPr2 instructions
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@993 632fc199-4ca6-4c93-a231-07263d6284db
|
|
b87a0b46
|
2013-01-13T12:15:58
|
|
Fix the x86 build with NASM 0.98. Since NASM 0.98 is the default version on OS X, we want to at least allow people to build 32-bit code with it, even though it can't properly build 64-bit code.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.2.x@906 632fc199-4ca6-4c93-a231-07263d6284db
|
|
29d8f253
|
2012-08-07T21:59:59
|
|
Fix build issues that occurred whenever the source directory contained the letters "col", "mer", or "gra".
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.2.x@860 632fc199-4ca6-4c93-a231-07263d6284db
|
|
112a0bb9
|
2012-06-28T23:25:34
|
|
More recent versions of autoconf add -traditional-cpp to the CPP flags, which causes jsimdcfg.inc.h to not preprocess correctly unless we expand all of the instances of the #definev macro.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.2.x@848 632fc199-4ca6-4c93-a231-07263d6284db
|
|
98eecb86
|
2012-06-18T00:18:04
|
|
Fixed regression caused by a bug in the 32-bit strict memory access code in jdmrgss2.asm (contributed by Chromium to stop valgrind from whining whenever the output buffer size was not evenly divisible by 16 bytes.) On Linux/x86, this regression caused incorrect pixels on the right-hand side of images whose rows were not 16-byte aligned, whenever fancy upsampling was used and when decompressing to a 32-bit (RGBX, etc.) buffer.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.1.x@844 632fc199-4ca6-4c93-a231-07263d6284db
|
|
13c318da
|
2012-06-18T00:16:15
|
|
Fixed regression caused by a bug in the 32-bit strict memory access code in jdmrgss2.asm (contributed by Chromium to stop valgrind from whining whenever the output buffer size was not evenly divisible by 16 bytes.) On Linux/x86, this regression caused incorrect pixels on the right-hand side of images whose rows were not 16-byte aligned, whenever fancy upsampling was used and when decompressing to a 32-bit (RGBX, etc.) buffer.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.0.x@843 632fc199-4ca6-4c93-a231-07263d6284db
|
|
8126d0c5
|
2012-06-15T21:54:45
|
|
Fixed regression caused by a bug in the 32-bit strict memory access code in jdmrgss2.asm (contributed by Chromium to stop valgrind from whining whenever the output buffer size was not evenly divisible by 16 bytes.) On Linux/x86, this regression generated incorrect pixels on the right-hand side of images whose rows were not 16-byte aligned, whenever fancy upsampling was used. This patch also enables the strict memory access code on all platforms, not just Linux (it does no harm on other platforms) and removes a couple of pcmpeqb instructions that were rendered unnecessary by r835.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.2.x@838 632fc199-4ca6-4c93-a231-07263d6284db
|
|
316617fa
|
2012-06-13T05:17:03
|
|
Accelerated 4:2:2 upsampling routine for ARM (improves performance ~20-30% when decompressing 4:2:2 JPEGs using fancy upsampling)
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.2.x@837 632fc199-4ca6-4c93-a231-07263d6284db
|
|
66fe68b0
|
2012-06-13T01:23:09
|
|
Eliminate the use of the MASKMOVDQU instruction, to speed up decompression performance by 10x on AMD Bobcat embedded processors (and ~5% on AMD desktop processors.)
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@836 632fc199-4ca6-4c93-a231-07263d6284db
|
|
69799275
|
2012-06-13T01:21:29
|
|
Eliminate the use of the MASKMOVDQU instruction, to speed up decompression performance by 10x on AMD Bobcat embedded processors (and ~5% on AMD desktop processors.)
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.2.x@835 632fc199-4ca6-4c93-a231-07263d6284db
|
|
4f24016b
|
2012-04-26T19:50:37
|
|
Preserve all 128 bits of xmm6 and xmm7
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.2.x@829 632fc199-4ca6-4c93-a231-07263d6284db
|
|
4ef9c954
|
2012-04-26T19:48:33
|
|
Preserve all 128 bits of xmm6 and xmm7
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@828 632fc199-4ca6-4c93-a231-07263d6284db
|
|
8015a303
|
2012-03-17T14:32:38
|
|
Visual Studio 2010 doesn't like the wildcard at compile time, so let CMake expand it instead.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.2.x@813 632fc199-4ca6-4c93-a231-07263d6284db
|
|
6bc59384
|
2012-03-17T14:32:05
|
|
Visual Studio 2010 doesn't like the wildcard, so let CMake expand it instead.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@812 632fc199-4ca6-4c93-a231-07263d6284db
|
|
0f0fd751
|
2012-02-07T23:27:14
|
|
Compiler warnings
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.1.x@784 632fc199-4ca6-4c93-a231-07263d6284db
|
|
44a97e8f
|
2012-02-07T23:13:24
|
|
Not necessary to save r10 and r11, since these are scratch registers
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.1.x@782 632fc199-4ca6-4c93-a231-07263d6284db
|
|
be12e1de
|
2012-02-02T22:32:45
|
|
Accelerated 4:2:2 upsampling routine for ARM (improves performance ~20-30% when decompressing 4:2:2 JPEGs using fancy upsampling)
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@773 632fc199-4ca6-4c93-a231-07263d6284db
|
|
d65d99a9
|
2012-01-31T03:39:23
|
|
Compiler warnings
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.2.x@758 632fc199-4ca6-4c93-a231-07263d6284db
|
|
67ce3b23
|
2011-12-19T02:21:03
|
|
Added new alpha channel colorspace constants/pixel formats, so applications can specify that they need the unused byte in a 4-component RGB output buffer set to 0xFF when decompressing.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@732 632fc199-4ca6-4c93-a231-07263d6284db
|
|
795e6ad3
|
2011-12-01T11:14:18
|
|
Fixed non-fatal out-of-bounds read in SSE2 SIMD code reported by valgrind when decompressing a JPEG image to a bitmap buffer whose size was not a multiple of 16 bytes.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.0.x@729 632fc199-4ca6-4c93-a231-07263d6284db
|
|
c412184b
|
2011-12-01T11:11:29
|
|
Fixed non-fatal out-of-bounds read in SSE2 SIMD code reported by valgrind when decompressing a JPEG image to a bitmap buffer whose size was not a multiple of 16 bytes.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.1.x@728 632fc199-4ca6-4c93-a231-07263d6284db
|
|
ebfe9e4a
|
2011-12-01T10:58:36
|
|
Fixed non-fatal out-of-bounds read in SSE2 SIMD code reported by valgrind when decompressing a JPEG image to a bitmap buffer whose size was not a multiple of 16 bytes.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@727 632fc199-4ca6-4c93-a231-07263d6284db
|
|
105f9a94
|
2011-11-29T09:02:10
|
|
Expose NASM variable in ccmake
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.1.x@726 632fc199-4ca6-4c93-a231-07263d6284db
|
|
0f905d35
|
2011-11-29T09:01:23
|
|
Expose NASM variable in ccmake
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@725 632fc199-4ca6-4c93-a231-07263d6284db
|
|
1ca924a5
|
2011-11-29T08:58:27
|
|
NASM automatically adds the current directory to the include path, but YASM doesn't, so we need to explicitly add it.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@724 632fc199-4ca6-4c93-a231-07263d6284db
|
|
c08e8c15
|
2011-09-08T23:54:40
|
|
When decompressing to a 4-byte RGB buffer, set the unused byte to 0xFF so it can be interpreted as an opaque alpha channel.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@699 632fc199-4ca6-4c93-a231-07263d6284db
|
|
b4570bbf
|
2011-09-07T06:31:00
|
|
Improve performance of non-SIMD color conversion routines and use global constants to define colorspace extension parameters
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@698 632fc199-4ca6-4c93-a231-07263d6284db
|
|
b071f01c
|
2011-09-06T18:58:22
|
|
Update Nokia contact info
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@694 632fc199-4ca6-4c93-a231-07263d6284db
|
|
ad6955d4
|
2011-09-06T18:57:53
|
|
Improve performance of IFAST iDCT by changing the order of transpose and descale steps
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@693 632fc199-4ca6-4c93-a231-07263d6284db
|
|
5129e396
|
2011-09-06T18:55:45
|
|
Make ARM ISLOW iDCT faster on typical cases, and eliminate the possibility of 16-bit overflows when handling arbitrary coefficients.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@692 632fc199-4ca6-4c93-a231-07263d6284db
|
|
98a44fe0
|
2011-08-24T23:27:44
|
|
Improve the performance of YCbCr to RGB conversion on ARM
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@691 632fc199-4ca6-4c93-a231-07263d6284db
|
|
ce4e3e86
|
2011-08-22T13:48:01
|
|
NEON-accelerated slow integer inverse DCT
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@690 632fc199-4ca6-4c93-a231-07263d6284db
|
|
82bd5219
|
2011-08-17T21:00:59
|
|
NEON-accelerated quantization
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@689 632fc199-4ca6-4c93-a231-07263d6284db
|
|
4b024a62
|
2011-08-15T08:36:51
|
|
Improve performance of ARM NEON IFAST iDCT
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@686 632fc199-4ca6-4c93-a231-07263d6284db
|
|
7a9376c1
|
2011-08-12T19:27:20
|
|
ARM NEON-accelerated RGB-to-YCbCr conversion
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@682 632fc199-4ca6-4c93-a231-07263d6284db
|
|
b740054f
|
2011-08-10T23:31:13
|
|
Support for accelerated forward DCT using ARM NEON instructions
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@678 632fc199-4ca6-4c93-a231-07263d6284db
|
|
8c60d22f
|
2011-06-17T21:12:58
|
|
NEON-optimized 2x2 and 4x4 scaled iDCTs
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@662 632fc199-4ca6-4c93-a231-07263d6284db
|
|
4346f91f
|
2011-06-14T22:16:50
|
|
iOS ARM support
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@659 632fc199-4ca6-4c93-a231-07263d6284db
|
|
e19f15e3
|
2011-05-10T21:44:33
|
|
Not necessary to save r10 and r11, since these are scratch registers
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@610 632fc199-4ca6-4c93-a231-07263d6284db
|
|
321e0686
|
2011-05-03T08:47:43
|
|
ARM NEON support
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@607 632fc199-4ca6-4c93-a231-07263d6284db
|
|
56fb2374
|
2011-05-03T06:32:41
|
|
YASM support
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@606 632fc199-4ca6-4c93-a231-07263d6284db
|
|
dc6f6a9c
|
2011-04-07T05:27:29
|
|
Don't need MSVC definition in assembler code anymore
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@575 632fc199-4ca6-4c93-a231-07263d6284db
|