ARM64 NEON: Fix another ABI conformance issue

Based on
https://github.com/mayeut/libjpeg-turbo/commit/98a5a9dc899aa9265858a3cbe0a96289a31a1322
with wordsmithing by DRC.

In the AArch64 ABI, as in many others, it's forbidden to read/store data
below the stack pointer.  Some SIMD functions were doing just that
(stack pointer misuse) when trying to preserve callee-saved registers,
and this resulted in those registers being restored with incorrect
contents under certain circumstances.  This patch fixes that behavior,
and callee-saved registers are now stored above the stack pointer
throughout the function call.  The patch also removes register saving in
places where it is unnecessary for this ABI, or it makes use of unused
scratch registers instead of callee-saved registers.

Fixes #97.  Closes #101.

Refer also to https://bugzilla.redhat.com/show_bug.cgi?id=1368569