Hash :
16bd9845
Author :
Date :
2018-03-02T22:33:19
C/SSE2 optimization of encode_mcu_AC_refine() This commit adds C and SSE2 optimizations for the encode_mcu_AC_refine() function used in progressive Huffman encoding. The image used for testing can be retrieved from this page: https://blog.cloudflare.com/doubling-the-speed-of-jpegtran All timings done on `Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz` clang version is `Apple LLVM version 9.0.0 (clang-900.0.39.2)` gcc-5 version is `gcc-5 (Homebrew GCC 5.5.0) 5.5.0` gcc-7 version is `gcc-7 (Homebrew GCC 7.2.0) 7.2.0` Here are the results in comparison to libjpeg-turbo@3c54642 using `time ./jpegtran -outfile /dev/null -progressive -optimise -copy none print_poster_0025.jpg` C clang x86_64: +7% gcc-5 x86_64: +30% gcc-7 x86_64: +33% clang i386: +0% gcc-5 i386: +24% gcc-7 i386: +23% SSE2 clang x86_64: +42% gcc-5 x86_64: +53% gcc-7 x86_64: +64% clang i386: +35% gcc-5 i386: +46% gcc-7 i386: +49% Discussion in libjpeg-turbo/libjpeg-turbo#46
/* libjpeg-turbo build number */
#define BUILD "@BUILD@"
/* Compiler's inline keyword */
#undef inline
/* How to obtain function inlining. */
#define INLINE @INLINE@
/* Define to the full name of this package. */
#define PACKAGE_NAME "@CMAKE_PROJECT_NAME@"
/* Version number of package */
#define VERSION "@VERSION@"
/* The size of `size_t', as computed by sizeof. */
#define SIZEOF_SIZE_T @SIZE_T@
/* Define if your compiler has __builtin_ctzl() and sizeof(unsigned long) == sizeof(size_t). */
#cmakedefine HAVE_BUILTIN_CTZL
/* Define to 1 if you have the <intrin.h> header file. */
#cmakedefine HAVE_INTRIN_H
#if defined(_MSC_VER) && defined(HAVE_INTRIN_H)
#if (SIZEOF_SIZE_T == 8)
#define HAVE_BITSCANFORWARD64
#elif (SIZEOF_SIZE_T == 4)
#define HAVE_BITSCANFORWARD
#endif
#endif