    Commit

  • Hash : 2a2970af
    Author : DRC
    Date : 2021-07-09T15:35:56

    Neon/AArch32: Work around Clang T32 miscompilation
    
    Referring to the C standard
    (http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf,
    J.2 Undefined behavior), the behavior of a program is undefined if
    "conversion between two pointer types produces a result that is
    incorrectly aligned."  Thus, the behavior of this code
    
      *((uint32_t *)buffer) = BUILTIN_BSWAP32(put_buffer);
    
    in the AArch32 version of the FLUSH() macro is undefined unless 'buffer'
    is 32-bit-aligned.  Referring to
    https://bugs.llvm.org/show_bug.cgi?id=50785, certain versions of Clang,
    when generating Thumb (T32) instructions, miscompile that code into an
    assembly instruction (stm) that requires the destination to be
    32-bit-aligned.  Since such alignment cannot be guaranteed within the
    Huffman encoder, this reportedly led to crashes (SIGBUS: illegal
    alignment) with AArch32/Thumb builds of libjpeg-turbo running on Android
    devices, although thus far I have been unable to reproduce those crashes
    with a plain Linux/Arm system.
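
    To illustrate why 32-bit alignment cannot be assumed: an output
    pointer that advances by arbitrary byte counts lands on a 4-byte
    boundary only by chance.  The following is a minimal sketch, not code
    from libjpeg-turbo.  is_aligned32() is a hypothetical helper, and
    testing the low address bits through uintptr_t assumes a flat address
    space, which holds on mainstream Arm and x86 platforms but is not
    guaranteed by the C standard.

```c
#include <stdint.h>

/* Hypothetical helper: returns nonzero if 'p' sits on a 4-byte boundary.
 * Assumes that converting a pointer to uintptr_t preserves the low
 * address bits, which is true on common flat-memory platforms. */
static int is_aligned32(const void *p)
{
  return ((uintptr_t)p & 3u) == 0;
}
```

    Given a buffer that starts on a 4-byte boundary, only every fourth
    byte offset into it passes this check, so a store through a
    (uint32_t *) cast at any other offset is the undefined case
    described above.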
    
    The miscompilation is visible with the Compiler Explorer:
    https://godbolt.org/z/rv1ccx1Pb
    However, it goes away when removing the return statement from the
    function.  Thus, it seems that Clang's behavior in this regard is
    somewhat variable, which may explain why the crashes are only
    reproducible on certain platforms.
    
    The suggested workaround is to use memcpy(), but whereas Clang and
    recent GCC releases are smart enough to compile a 4-byte memcpy() call
    into a str instruction, GCC < 6 is not.  Referring to
    https://godbolt.org/z/ae7Wje3P6, the only way to consistently produce
    the desired str instruction across all supported compilers is to use
    inline assembly.  Visual C++ presumably does not miscompile the code in
    question, since no issues have been reported with it, but since the code
    relies on undefined behavior, prudence dictates that
    e4ec23d7ae051c1c73947f889818900362fdc52d should be reverted for Visual
    C++, which this commit does.  The performance impact of
    e4ec23d7ae051c1c73947f889818900362fdc52d for Visual C++/Arm builds is
    unknown (I have no ability to test such builds), but regardless, this
    commit reverts the Visual C++/Arm performance to that of libjpeg-turbo
    2.1 beta1.
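
    For reference, the memcpy() approach discussed above can be sketched
    as follows.  This is an illustration only and is not the code that
    this commit adds (the commit uses inline assembly for AArch32).
    bswap32() and store_swapped32() are hypothetical names, with
    bswap32() standing in for BUILTIN_BSWAP32.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for BUILTIN_BSWAP32: reverse the byte order. */
static uint32_t bswap32(uint32_t v)
{
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
         ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Store the byte-swapped value at 'buffer', which may have any alignment.
 * memcpy() has no alignment requirement, so this is well defined even
 * when 'buffer' is not on a 4-byte boundary. */
static void store_swapped32(unsigned char *buffer, uint32_t put_buffer)
{
  uint32_t swapped = bswap32(put_buffer);

  memcpy(buffer, &swapped, sizeof(swapped));
}
```

    Whether the 4-byte memcpy() call collapses into a single store
    instruction depends on the compiler and its version, which is the
    tradeoff described above.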
    
    Closes #529
    

  • Properties

  • Git HTTP https://git.kmx.io/kc3-lang/libjpeg-turbo.git
    Git SSH git@git.kmx.io:kc3-lang/libjpeg-turbo.git
    Public access ? public
    Description

    Fork of libjpeg with SIMD
