simd/jsimd.h


Log

Author Commit Date CI Message
DRC c641cddb 2015-01-14T15:41:11 AltiVec SIMD implementation of H2V1 and H2V2 plain upsampling (used only when decompressing YCCK images with fast upsampling enabled.) git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1504 632fc199-4ca6-4c93-a231-07263d6284db
DRC 86af36ae 2015-01-14T13:27:32 AltiVec SIMD implementation of H2V1 and H2V2 merged upsampling git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1503 632fc199-4ca6-4c93-a231-07263d6284db
DRC 52a4ec6c 2015-01-13T09:02:29 AltiVec SIMD implementation of H2V1 and H2V2 fancy upsampling git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1495 632fc199-4ca6-4c93-a231-07263d6284db
DRC ac4daa77 2015-01-10T22:56:26 AltiVec SIMD implementation of YCC-to-RGB color conversion git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1489 632fc199-4ca6-4c93-a231-07263d6284db
DRC 22048207 2015-01-08T06:18:33 AltiVec SIMD implementation of 2x1 and 2x2 downsampling git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1483 632fc199-4ca6-4c93-a231-07263d6284db
DRC 577ecd93 2014-12-23T04:14:54 AltiVec SIMD implementation of sample conversion and integer quantization git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1474 632fc199-4ca6-4c93-a231-07263d6284db
DRC b1fec4ff 2014-12-22T14:10:33 AltiVec SIMD implementation of RGB-to-Grayscale color conversion git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1471 632fc199-4ca6-4c93-a231-07263d6284db
DRC 62bae204 2014-12-22T13:42:26 AltiVec SIMD implementation of RGB-to-YCC color conversion git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1469 632fc199-4ca6-4c93-a231-07263d6284db
DRC 0691162a 2014-12-20T01:17:39 AltiVec SIMD implementation of slow integer inverse DCT git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1461 632fc199-4ca6-4c93-a231-07263d6284db
DRC 6cb7f40a 2014-12-18T10:12:29 AltiVec SIMD implementation of fast integer inverse DCT git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1445 632fc199-4ca6-4c93-a231-07263d6284db
DRC fb0c3940 2014-12-17T08:04:39 AltiVec SIMD implementation of slow integer forward DCT; Clean up fast integer forward DCT code so that it is easier to see how it derives from the SSE2 code and to make it play more nicely with the slow FDCT code. git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1443 632fc199-4ca6-4c93-a231-07263d6284db
DRC cd2d8e1c 2014-09-05T06:33:42 AltiVec SIMD implementation of fast forward DCT git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1405 632fc199-4ca6-4c93-a231-07263d6284db
DRC d729f4da 2014-08-23T15:47:51 ARM NEON SIMD support for YCC-to-RGB565 conversion, and optimizations to the existing YCC-to-RGB conversion code: ----- https://github.com/ssvb/libjpeg-turbo/commit/aee36252be20054afce371a92406fc66ba6627b5.patch From aee36252be20054afce371a92406fc66ba6627b5 Mon Sep 17 00:00:00 2001 From: Siarhei Siamashka <siarhei.siamashka@gmail.com> Date: Wed, 13 Aug 2014 03:50:22 +0300 Subject: [PATCH] ARM: Faster NEON yuv->rgb conversion for Krait and Cortex-A15 The older code was developed and tested only on ARM Cortex-A8 and ARM Cortex-A9. Tuning it for newer ARM processors can introduce some speed-up (up to 20%). The performance of the inner loop (conversion of 8 pixels) improves from ~27 cycles down to ~22 cycles on Qualcomm Krait 300, and from ~20 cycles down to ~18 cycles on ARM Cortex-A15. The performance remains exactly the same on ARM Cortex-A7 (~58 cycles), ARM Cortex-A8 (~25 cycles) and ARM Cortex-A9 (~30 cycles) processors. Also use larger indentation in the source code for separating two independent instruction streams. ----- https://github.com/ssvb/libjpeg-turbo/commit/a5efdbf22ce9c1acd4b14a353cec863c2c57557e.patch From a5efdbf22ce9c1acd4b14a353cec863c2c57557e Mon Sep 17 00:00:00 2001 From: Siarhei Siamashka <siarhei.siamashka@gmail.com> Date: Wed, 13 Aug 2014 07:23:09 +0300 Subject: [PATCH] ARM: NEON optimized yuv->rgb565 conversion The performance of the inner loop (conversion of 8 pixels): * ARM Cortex-A7: ~55 cycles * ARM Cortex-A8: ~28 cycles * ARM Cortex-A9: ~32 cycles * ARM Cortex-A15: ~20 cycles * Qualcomm Krait: ~24 cycles Based on the Linaro rgb565 patch from https://sourceforge.net/p/libjpeg-turbo/patches/24/ but implements better instructions scheduling. git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1385 632fc199-4ca6-4c93-a231-07263d6284db
DRC 5ef46305 2014-05-18T20:04:47 SIMD-accelerated int upsample routine for MIPS DSPr2 git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1315 632fc199-4ca6-4c93-a231-07263d6284db
DRC c728cfd8 2014-05-18T19:36:05 Fix MIPS build git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1314 632fc199-4ca6-4c93-a231-07263d6284db
DRC bc56b754 2014-05-16T10:43:44 Get rid of the HAVE_PROTOTYPES configuration option, as well as the related JMETHOD and JPP macros. libjpeg-turbo has never supported compilers that don't handle prototypes. Doing so requires ansi2knr, which isn't even supported in the IJG code anymore. git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1308 632fc199-4ca6-4c93-a231-07263d6284db
DRC 52ded876 2014-05-15T20:30:16 Remove all of the NEED_SHORT_EXTERNAL_NAMES stuff. There is scant information available as to which linkers ever had a 15-character global symbol name limit. AFAICT, it might have been a VMS and/or a.out BSD thing, but none of those platforms have ever been supported by libjpeg-turbo (nor are such systems supported by other open source libraries of this nature.) git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1307 632fc199-4ca6-4c93-a231-07263d6284db
DRC 1b3fd7ee 2014-05-15T18:26:01 SIMD-accelerated NULL convert routine for MIPS DSPr2 git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1304 632fc199-4ca6-4c93-a231-07263d6284db
DRC 6a61c1e6 2014-05-14T15:00:10 SIMD-accelerated h2v2 smooth downsampling routine for MIPS DSPr2 git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1301 632fc199-4ca6-4c93-a231-07263d6284db
DRC b844eaa3 2014-05-13T18:40:14 SIMD-accelerated merged upsampling routines for MIPS DSPr2 git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1297 632fc199-4ca6-4c93-a231-07263d6284db
DRC 34347862 2014-05-06T09:53:21 SIMD-accelerated slow integer IDCT routine for MIPS DSPr2 git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1269 632fc199-4ca6-4c93-a231-07263d6284db
DRC fff6c23a 2013-10-12T21:39:20 SIMD-accelerated integer convsamp routine for MIPS DSPr2 git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1059 632fc199-4ca6-4c93-a231-07263d6284db
DRC 3d727281 2013-10-09T18:39:44 SIMD-accelerated floating point quantize and convsamp routines for MIPS DSPr2 git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1058 632fc199-4ca6-4c93-a231-07263d6284db
DRC d3131c1b 2013-10-08T02:18:59 SIMD-accelerated fast integer inverse DCT routine for MIPS DSPr2 git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1056 632fc199-4ca6-4c93-a231-07263d6284db
DRC 71e06a7d 2013-10-08T02:11:21 SIMD-accelerated fast integer forward DCT routine for MIPS DSPr2 git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1055 632fc199-4ca6-4c93-a231-07263d6284db
DRC a6b7fbd3 2013-09-30T18:13:27 SIMD-accelerated slow integer forward DCT and quantize routines for MIPS DSPr2 git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1054 632fc199-4ca6-4c93-a231-07263d6284db
DRC e5005917 2013-09-27T17:51:08 SIMD-accelerated 3/4 and 3/2 decompression scaling for MIPS DSPr2 git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1047 632fc199-4ca6-4c93-a231-07263d6284db
DRC 2ccf4d1a 2013-09-27T17:43:23 SIMD-accelerated 1/2 and 1/4 decompression scaling for MIPS DSPr2 git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1046 632fc199-4ca6-4c93-a231-07263d6284db
DRC 49eaa757 2013-09-27T17:39:57 SIMD-optimized RGB-to-grayscale conversion for MIPS DSPr2 git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1045 632fc199-4ca6-4c93-a231-07263d6284db
DRC 16962c11 2013-07-27T21:50:02 SIMD support for performing upsampling using MIPS DSPr2 instructions git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@996 632fc199-4ca6-4c93-a231-07263d6284db
DRC 6f2d3c2c 2013-07-27T21:48:18 SIMD support for performing downsampling using MIPS DSPr2 instructions git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@995 632fc199-4ca6-4c93-a231-07263d6284db
DRC 86fbf35f 2013-07-27T21:44:14 SIMD support for performing fancy upsampling using MIPS DSPr2 instructions git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@994 632fc199-4ca6-4c93-a231-07263d6284db
DRC 0be9fa57 2013-07-24T21:50:20 SIMD support for performing color conversion using MIPS DSPr2 instructions git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@993 632fc199-4ca6-4c93-a231-07263d6284db
DRC 316617fa 2012-06-13T05:17:03 Accelerated 4:2:2 upsampling routine for ARM (improves performance ~20-30% when decompressing 4:2:2 JPEGs using fancy upsampling) git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.2.x@837 632fc199-4ca6-4c93-a231-07263d6284db
DRC ce4e3e86 2011-08-22T13:48:01 NEON-accelerated slow integer inverse DCT git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@690 632fc199-4ca6-4c93-a231-07263d6284db
DRC 82bd5219 2011-08-17T21:00:59 NEON-accelerated quantization git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@689 632fc199-4ca6-4c93-a231-07263d6284db
DRC 7a9376c1 2011-08-12T19:27:20 ARM NEON-accelerated RGB-to-YCbCr conversion git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@682 632fc199-4ca6-4c93-a231-07263d6284db
DRC b740054f 2011-08-10T23:31:13 Support for accelerated forward DCT using ARM NEON instructions git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@678 632fc199-4ca6-4c93-a231-07263d6284db
DRC 8c60d22f 2011-06-17T21:12:58 NEON-optimized 2x2 and 4x4 scaled iDCTs git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@662 632fc199-4ca6-4c93-a231-07263d6284db
DRC 321e0686 2011-05-03T08:47:43 ARM NEON support git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@607 632fc199-4ca6-4c93-a231-07263d6284db
DRC ddb158c5 2011-02-27T09:09:54 Add short names for RGB->grayscale MMX functions git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@468 632fc199-4ca6-4c93-a231-07263d6284db
DRC 392e0483 2011-02-18T20:43:04 Updated (C) git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@394 632fc199-4ca6-4c93-a231-07263d6284db
DRC c8666333 2011-02-18T11:23:45 SIMD-accelerated RGB-to-Grayscale color conversion git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@393 632fc199-4ca6-4c93-a231-07263d6284db
DRC af1ca9bc 2011-02-02T05:42:37 Clarify that the C wrappers and headers fall under the same license as the rest of the SIMD code git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.0.x@335 632fc199-4ca6-4c93-a231-07263d6284db
DRC 720e1610 2009-04-05T21:51:25 Add colorspace extensions to merged upsampling routines git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@42 632fc199-4ca6-4c93-a231-07263d6284db
DRC f25c071e 2009-04-03T12:00:51 Implement new colorspaces to allow directly compressing from/decompressing to RGB/RGBX/BGR/BGRX/XBGR/XRGB without conversion git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@35 632fc199-4ca6-4c93-a231-07263d6284db
Pierre Ossman eea72155 2009-03-09T13:34:17 Add SSE2 SIMD implementation of computationally intensive routines. git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@22 632fc199-4ca6-4c93-a231-07263d6284db
Pierre Ossman 018fc429 2009-03-09T13:31:56 Add SSE SIMD implementation of computationally intensive routines. git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@21 632fc199-4ca6-4c93-a231-07263d6284db
Pierre Ossman 65d03173 2009-03-09T13:28:10 Add 3DNow SIMD implementation of computationally intensive routines. git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@18 632fc199-4ca6-4c93-a231-07263d6284db
Pierre Ossman 5eb84ff9 2009-03-09T13:25:30 Add MMX SIMD implementation of computationally intensive routines. git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@17 632fc199-4ca6-4c93-a231-07263d6284db
Pierre Ossman 2ae181c7 2009-03-09T13:21:27 Implement x86 SIMD framework Add NASM support and stub routine for detecting SIMD extensions. git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@15 632fc199-4ca6-4c93-a231-07263d6284db