audiocvt: stereo-to-mono SSE3 now uses unaligned accesses.

On modern CPUs, there's no penalty for using the unaligned instructions on aligned memory, and the SSE3 path can now vectorize unaligned data too. Even if that isn't optimal, it's still going to be faster than the scalar fallback.

Fixes #4532.
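
For illustration, a minimal sketch of the idea, not the actual SDL code: the function name stereo_to_mono_sketch, its signature, and the HAVE_SSE3_SKETCH guard are hypothetical, and it assumes interleaved float32 stereo input and a GCC/Clang-style compiler on x86. The point is only that _mm_loadu_ps/_mm_storeu_ps accept any address, so the vector loop no longer needs an alignment check before it runs.

```c
#include <stddef.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <pmmintrin.h>  /* SSE3: _mm_hadd_ps */
#define HAVE_SSE3_SKETCH 1  /* hypothetical guard for this sketch only */
#endif

/* Hypothetical helper (not SDL's function): averages interleaved stereo
   float frames (L,R,L,R,...) down to one mono sample per frame. */
#ifdef HAVE_SSE3_SKETCH
__attribute__((target("sse3")))
#endif
static void stereo_to_mono_sketch(float *dst, const float *src, size_t frames)
{
    size_t i = 0;
#ifdef HAVE_SSE3_SKETCH
    const __m128 half = _mm_set1_ps(0.5f);
    /* 4 frames (8 input floats) per iteration; no alignment requirement. */
    for (; i + 4 <= frames; i += 4) {
        __m128 a = _mm_loadu_ps(src + 2 * i);      /* L0 R0 L1 R1 */
        __m128 b = _mm_loadu_ps(src + 2 * i + 4);  /* L2 R2 L3 R3 */
        __m128 sum = _mm_hadd_ps(a, b);            /* L0+R0 L1+R1 L2+R2 L3+R3 */
        _mm_storeu_ps(dst + i, _mm_mul_ps(sum, half));
    }
#endif
    /* Scalar path: handles the tail, or everything on non-x86 builds. */
    for (; i < frames; i++) {
        dst[i] = (src[2 * i] + src[2 * i + 1]) * 0.5f;
    }
}
```

Passing pointers that are only 4-byte aligned (for example, one float past the start of a buffer) goes through the same vector loop; with the aligned variants (_mm_load_ps/_mm_store_ps) that would fault.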