|
9055fb40
|
2016-07-07T13:10:30
|
|
ARM/MIPS: Change the behavior of JSIMD_FORCE*
The JSIMD_FORCE* environment variables previously meant "force the use
of this instruction set if it is available but others are available as
well", but that did nothing on ARM platforms, since there is only ever
one instruction set available. Since the ARM and MIPS CPU feature
detection code is less than bulletproof, and since there is only one
SIMD instruction set (currently) supported on those platforms, it makes
sense for the JSIMD_FORCE* environment variables on those platforms to
actually force the use of the SIMD instruction set, thus bypassing the
CPU feature detection code.
This addresses a concern raised in #88 whereby parsing /proc/cpuinfo
didn't work within a QEMU environment. This at least provides a
workaround, allowing users to force-enable or force-disable SIMD
instructions for ARM and MIPS builds of libjpeg-turbo.
|
|
123f7258
|
2016-05-24T10:23:56
|
|
Format copyright headers more consistently
The IJG convention is to format copyright notices as:
Copyright (C) YYYY, Owner.
We try to maintain this convention for any code that is part of the
libjpeg API library (with the exception of preserving the copyright
notices from Cendio's code verbatim, since those predate
libjpeg-turbo.)
Note that the phrase "All Rights Reserved" is no longer necessary, since
all Buenos Aires Convention signatories signed onto the Berne Convention
in 2000. However, our convention is to retain this phrase for any files
that have a self-contained copyright header but to leave it off of any
files that refer to another file for conditions of distribution and use.
For instance, all of the non-SIMD files in the libjpeg API library refer
to README.ijg, and the copyright message in that file contains "All
Rights Reserved", so it is unnecessary to add it to the individual
files.
The TurboJPEG code retains my preferred formatting convention for
copyright notices, which is based on that of VirtualGL (where the
TurboJPEG API originated.)
|
|
bd49803f
|
2016-02-19T08:53:33
|
|
Use consistent/modern code formatting for pointers
The convention used by libjpeg:
type * variable;
is not very common anymore, because it looks too much like
multiplication. Some (particularly C++ programmers) prefer to tuck the
pointer symbol against the type:
type* variable;
to emphasize that a pointer to a type is effectively a new type.
However, this can also be confusing, since defining multiple variables
on the same line would not work properly:
type* variable1, variable2; /* Only variable1 is actually a
pointer. */
This commit reformats the entirety of the libjpeg-turbo code base so
that it uses the same code formatting convention for pointers that the
TurboJPEG API code uses:
type *variable1, *variable2;
This seems to be the most common convention among C programmers, and
it is the convention used by other codec libraries, such as libpng and
libtiff.
|
|
f3a8684c
|
2016-01-07T00:19:43
|
|
SSE2 SIMD implementation of Huffman encoding
Full-color compression speedups relative to libjpeg-turbo 1.4.2:
2.8 GHz Intel Xeon W3530, Linux, 64-bit: 2.2-18% (avg. 9.5%)
2.8 GHz Intel Xeon W3530, Linux, 32-bit: 10-25% (avg. 17%)
2.3 GHz AMD A10-4600M APU, Linux, 64-bit: 4.9-17% (avg. 11%)
2.3 GHz AMD A10-4600M APU, Linux, 32-bit: 8.8-19% (avg. 15%)
3.0 GHz Intel Core i7, OS X, 64-bit: 3.5-16% (avg. 10%)
3.0 GHz Intel Core i7, OS X, 32-bit: 4.8-14% (avg. 11%)
2.6 GHz AMD Athlon 64 X2 5050e:
Performance-neutral (give or take a few percent)
Full-color compression speedups relative to IPP:
2.8 GHz Intel Xeon W3530, Linux, 64-bit: 4.8-34% (avg. 19%)
2.8 GHz Intel Xeon W3530, Linux, 32-bit: -19%-7.0% (avg. -7.0%)
Refer to #42 for discussion. Numerous other approaches were attempted,
but this one proved to be the most performant across all platforms.
This commit also fixes #3 (works around, really-- the clang-compiled version
of jchuff.c still performs 20% worse than its GCC-compiled counterpart, but
that code is now bypassed by the new SSE2 Huffman algorithm.)
Based on:
https://github.com/mayeut/libjpeg-turbo/commit/2cb4d41330e1edc4469f6b97ba73b73abfbeb02f
https://github.com/mayeut/libjpeg-turbo/commit/36c94e050d117912adbff9fbcc6fe307df240168
|
|
8b5a0093
|
2015-01-21T17:42:28
|
|
Oops. The MIPS SIMD implementations of h2v1 and h2v2 upsampling were not checking for DSPr2 support, so running 'djpeg -nosmooth' on a non-DSPr2-enabled platform caused an "illegal instruction" error.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.4.x@1523 632fc199-4ca6-4c93-a231-07263d6284db
|
|
d729f4da
|
2014-08-23T15:47:51
|
|
ARM NEON SIMD support for YCC-to-RGB565 conversion, and optimizations to the existing YCC-to-RGB conversion code:
-----
https://github.com/ssvb/libjpeg-turbo/commit/aee36252be20054afce371a92406fc66ba6627b5.patch
From aee36252be20054afce371a92406fc66ba6627b5 Mon Sep 17 00:00:00 2001
From: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Date: Wed, 13 Aug 2014 03:50:22 +0300
Subject: [PATCH] ARM: Faster NEON yuv->rgb conversion for Krait and Cortex-A15
The older code was developed and tested only on ARM Cortex-A8 and ARM Cortex-A9.
Tuning it for newer ARM processors can introduce some speed-up (up to 20%).
The performance of the inner loop (conversion of 8 pixels) improves from
~27 cycles down to ~22 cycles on Qualcomm Krait 300, and from ~20 cycles
down to ~18 cycles on ARM Cortex-A15.
The performance remains exactly the same on ARM Cortex-A7 (~58 cycles),
ARM Cortex-A8 (~25 cycles) and ARM Cortex-A9 (~30 cycles) processors.
Also use larger indentation in the source code for separating two independent
instruction streams.
-----
https://github.com/ssvb/libjpeg-turbo/commit/a5efdbf22ce9c1acd4b14a353cec863c2c57557e.patch
From a5efdbf22ce9c1acd4b14a353cec863c2c57557e Mon Sep 17 00:00:00 2001
From: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Date: Wed, 13 Aug 2014 07:23:09 +0300
Subject: [PATCH] ARM: NEON optimized yuv->rgb565 conversion
The performance of the inner loop (conversion of 8 pixels):
* ARM Cortex-A7: ~55 cycles
* ARM Cortex-A8: ~28 cycles
* ARM Cortex-A9: ~32 cycles
* ARM Cortex-A15: ~20 cycles
* Qualcomm Krait: ~24 cycles
Based on the Linaro rgb565 patch from
https://sourceforge.net/p/libjpeg-turbo/patches/24/
but implements better instructions scheduling.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1385 632fc199-4ca6-4c93-a231-07263d6284db
|
|
495e4342
|
2014-05-19T19:13:22
|
|
Allow for building the MIPS DSPr2 extensions if the host is mips-* as well as mipsel-*. The DSPr2 extensions are little endian, so we still have to check that the compiler defines __MIPSEL__ before enabling them. This paves the way for supporting big-endian MIPS, and in the near term, it allows the SIMD extensions to be built with Sourcery CodeBench.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1316 632fc199-4ca6-4c93-a231-07263d6284db
|
|
5ef46305
|
2014-05-18T20:04:47
|
|
SIMD-accelerated int upsample routine for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1315 632fc199-4ca6-4c93-a231-07263d6284db
|
|
c728cfd8
|
2014-05-18T19:36:05
|
|
Fix MIPS build
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1314 632fc199-4ca6-4c93-a231-07263d6284db
|
|
1419852c
|
2014-05-15T19:45:11
|
|
Clean up code formatting in the SIMD interface functions
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1305 632fc199-4ca6-4c93-a231-07263d6284db
|
|
1b3fd7ee
|
2014-05-15T18:26:01
|
|
SIMD-accelerated NULL convert routine for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1304 632fc199-4ca6-4c93-a231-07263d6284db
|
|
6a61c1e6
|
2014-05-14T15:00:10
|
|
SIMD-accelerated h2v2 smooth downsampling routine for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1301 632fc199-4ca6-4c93-a231-07263d6284db
|
|
b844eaa3
|
2014-05-13T18:40:14
|
|
SIMD-accelerated merged upsampling routines for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1297 632fc199-4ca6-4c93-a231-07263d6284db
|
|
34347862
|
2014-05-06T09:53:21
|
|
SIMD-accelerated slow integer IDCT routine for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1269 632fc199-4ca6-4c93-a231-07263d6284db
|
|
fff6c23a
|
2013-10-12T21:39:20
|
|
SIMD-accelerated integer convsamp routine for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1059 632fc199-4ca6-4c93-a231-07263d6284db
|
|
3d727281
|
2013-10-09T18:39:44
|
|
SIMD-accelerated floating point quantize and convsamp routines for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1058 632fc199-4ca6-4c93-a231-07263d6284db
|
|
d3131c1b
|
2013-10-08T02:18:59
|
|
SIMD-accelerated fast integer inverse DCT routine for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1056 632fc199-4ca6-4c93-a231-07263d6284db
|
|
71e06a7d
|
2013-10-08T02:11:21
|
|
SIMD-accelerated fast integer forward DCT routine for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1055 632fc199-4ca6-4c93-a231-07263d6284db
|
|
a6b7fbd3
|
2013-09-30T18:13:27
|
|
SIMD-accelerated slow integer forward DCT and quantize routines for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1054 632fc199-4ca6-4c93-a231-07263d6284db
|
|
e5005917
|
2013-09-27T17:51:08
|
|
SIMD-accelerated 3/4 and 3/2 decompression scaling for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1047 632fc199-4ca6-4c93-a231-07263d6284db
|
|
2ccf4d1a
|
2013-09-27T17:43:23
|
|
SIMD-accelerated 1/2 and 1/4 decompression scaling for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1046 632fc199-4ca6-4c93-a231-07263d6284db
|
|
49eaa757
|
2013-09-27T17:39:57
|
|
SIMD-optimized RGB-to-grayscale conversion for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1045 632fc199-4ca6-4c93-a231-07263d6284db
|
|
16962c11
|
2013-07-27T21:50:02
|
|
SIMD support for performing upsampling using MIPS DSPr2 instructions
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@996 632fc199-4ca6-4c93-a231-07263d6284db
|
|
6f2d3c2c
|
2013-07-27T21:48:18
|
|
SIMD support for performing downsampling using MIPS DSPr2 instructions
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@995 632fc199-4ca6-4c93-a231-07263d6284db
|
|
86fbf35f
|
2013-07-27T21:44:14
|
|
SIMD support for performing fancy upsampling using MIPS DSPr2 instructions
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@994 632fc199-4ca6-4c93-a231-07263d6284db
|
|
0be9fa57
|
2013-07-24T21:50:20
|
|
SIMD support for performing color conversion using MIPS DSPr2 instructions
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@993 632fc199-4ca6-4c93-a231-07263d6284db
|