simd/jsimd_x86_64.c


Log

Author Commit Date CI Message
DRC d729f4da 2014-08-23T15:47:51 ARM NEON SIMD support for YCC-to-RGB565 conversion, and optimizations to the existing YCC-to-RGB conversion code: ----- https://github.com/ssvb/libjpeg-turbo/commit/aee36252be20054afce371a92406fc66ba6627b5.patch From aee36252be20054afce371a92406fc66ba6627b5 Mon Sep 17 00:00:00 2001 From: Siarhei Siamashka <siarhei.siamashka@gmail.com> Date: Wed, 13 Aug 2014 03:50:22 +0300 Subject: [PATCH] ARM: Faster NEON yuv->rgb conversion for Krait and Cortex-A15 The older code was developed and tested only on ARM Cortex-A8 and ARM Cortex-A9. Tuning it for newer ARM processors can introduce some speed-up (up to 20%). The performance of the inner loop (conversion of 8 pixels) improves from ~27 cycles down to ~22 cycles on Qualcomm Krait 300, and from ~20 cycles down to ~18 cycles on ARM Cortex-A15. The performance remains exactly the same on ARM Cortex-A7 (~58 cycles), ARM Cortex-A8 (~25 cycles) and ARM Cortex-A9 (~30 cycles) processors. Also use larger indentation in the source code for separating two independent instruction streams. ----- https://github.com/ssvb/libjpeg-turbo/commit/a5efdbf22ce9c1acd4b14a353cec863c2c57557e.patch From a5efdbf22ce9c1acd4b14a353cec863c2c57557e Mon Sep 17 00:00:00 2001 From: Siarhei Siamashka <siarhei.siamashka@gmail.com> Date: Wed, 13 Aug 2014 07:23:09 +0300 Subject: [PATCH] ARM: NEON optimized yuv->rgb565 conversion The performance of the inner loop (conversion of 8 pixels): * ARM Cortex-A7: ~55 cycles * ARM Cortex-A8: ~28 cycles * ARM Cortex-A9: ~32 cycles * ARM Cortex-A15: ~20 cycles * Qualcomm Krait: ~24 cycles Based on the Linaro rgb565 patch from https://sourceforge.net/p/libjpeg-turbo/patches/24/ but implements better instructions scheduling. git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1385 632fc199-4ca6-4c93-a231-07263d6284db
DRC 64086281 2014-05-15T19:46:48 Clean up code formatting in the SIMD interface functions git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.3.x@1306 632fc199-4ca6-4c93-a231-07263d6284db
DRC 1419852c 2014-05-15T19:45:11 Clean up code formatting in the SIMD interface functions git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1305 632fc199-4ca6-4c93-a231-07263d6284db
DRC b7753510 2014-05-11T09:36:25 Convert tabs to spaces in the libjpeg code and the SIMD code (TurboJPEG retains the use of tabs for historical reasons. They were annoying in the libjpeg code primarily because they were not consistently used and because they were used to format as well as indent the code. In the case of TurboJPEG, tabs are used just to indent the code, so even if the editor assumes a different tab width, the code will still be readable.) git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.3.x@1285 632fc199-4ca6-4c93-a231-07263d6284db
DRC 1a45b81f 2014-05-09T18:06:58 Remove trailing spaces (+ one additional tab in TJUnitTest.java that was missed in the previous commit) git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1279 632fc199-4ca6-4c93-a231-07263d6284db
DRC 67ce3b23 2011-12-19T02:21:03 Added new alpha channel colorspace constants/pixel formats, so applications can specify that they need the unused byte in a 4-component RGB output buffer set to 0xFF when decompressing. git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@732 632fc199-4ca6-4c93-a231-07263d6284db
DRC 392e0483 2011-02-18T20:43:04 Updated (C) git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@394 632fc199-4ca6-4c93-a231-07263d6284db
DRC c8666333 2011-02-18T11:23:45 SIMD-accelerated RGB-to-Grayscale color conversion git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@393 632fc199-4ca6-4c93-a231-07263d6284db
DRC af1ca9bc 2011-02-02T05:42:37 Clarify that the C wrappers and headers fall under the same license as the rest of the SIMD code git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.0.x@335 632fc199-4ca6-4c93-a231-07263d6284db
DRC c06073a9 2010-09-06T17:37:12 Remove simd/ prefix from #include (not necessary and was causing problems with Visual Studio project) git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@228 632fc199-4ca6-4c93-a231-07263d6284db
DRC 30959719 2010-08-07T16:06:56 Fix typo in SIMD dispatch routines which was causing 4:2:0 upsampling to be used instead of 4:2:2 when decompressing JPEG images using SSE2 code git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@223 632fc199-4ca6-4c93-a231-07263d6284db
DRC 04899094 2010-02-26T23:01:19 Bleepin' Windows uses LLP64, not LP64 git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@158 632fc199-4ca6-4c93-a231-07263d6284db
DRC 321ad513 2009-10-08T09:41:39 Enable 64-bit build on Snow Leopard git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@68 632fc199-4ca6-4c93-a231-07263d6284db
Pierre Ossman ba82ddf6 2009-06-29T11:20:42 Clean up SIMD glue code The SIMD glue code has gotten a bit #ifdef heavy so clean it up by having one file for each possible SIMD arch. This also allows a simplification of the x86_64 code as SSE/SSE2 is always known to exist on that arch. git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@49 632fc199-4ca6-4c93-a231-07263d6284db