IABSD.fr/src/lib/libcrypto/sha


Log

Commit	Date	Message
95317e4b	2026-05-12 15:07:30	Add a guarded .note.GNU-stack section to crypto assembly files. Add a .note.GNU-stack section to avoid ending up with an executable stack on toolchains that believe we should have an executable stack by default. Reported by ruuda on Github. Discussed with tb@
84482c2d	2026-05-09 07:14:42	Use uint32_t instead of SHA_LONG in the SHA-256 code. This is more readable and we already have a compile time assert that they are the same size. ok tb@
f35adb27	2026-05-09 07:12:51	Use W rather than X for the SHA-256 message schedule. This more closely matches the SHA-256 specification in FIPS 180-4. ok tb@
7b35a4fe	2026-05-09 07:11:05	Use consistent variable names in the sha256 code. Use 'ctx' rather than 'c' for the SHA256_CTX and use data/len rather than d/n. ok kenjiro@ tb@
0eb29a10	2026-05-09 07:08:43	Use crypto_add_u32dw_u64() to increment SHA-256 message bit counter. ok kenjiro@ tb@
dcbba1f6	2026-05-09 07:03:49	Correct argument type for SHA context. These are SHA_CTX not SHA256_CTX.
ca56b5a4	2026-05-09 07:02:29	Correct argument type in comments.
ef976b65	2026-05-07 15:50:47	Use macros for global functions and objects within SHA assembly. This lets us remove some of the repetitive statements and allows for them to be adjusted for various platforms. ok kenjiro@ tb@
eaa7a734	2026-05-07 15:41:37	Use defines for symbol offsets in aarch64 assembly. These also vary between platforms. ok kenjiro@ tb@
94719c1d	2026-05-07 15:40:33	Use defines for text and rodata section names in SHA assembly. These vary between platforms. ok kenjiro@ tb@
c4e88d03	2026-05-07 15:38:03	Use a define based instruction separator in SHA assembly. Unfortunately, not all assemblers use the same instruction separator. In particular, LLVM on macOS uses %% as an instruction separator, while most other assemblers use a semi-colon. ok kenjiro@ tb@
921eb3c3	2026-04-25 05:47:03	Add FIPS 180-4 references for SHA-256 constants.
b39c1312	2026-03-28 13:11:28	Include crypto_assembly.h instead of manually ensuring _CET_ENDBR exists. ok kenjiro@ tb@
877296eb	2026-01-25 08:22:17	Make SHA aarch64 assembly build with gcc. gcc is extremely fussy about register naming and insists on q and s naming for the ARM CE SHA instructions, even though they're referring to the same register (while LLVM just figures it out). Work around this by mapping registers to their required variant at usage and defining a handful of mappings between v registers and alternate names/views. This is still somewhat ugly, but seems to be one of the cleaner options that will allow portable to enable SHA assembly on platforms that use gcc. ok kenjiro@ tb@
09f01e6f	2026-01-24 14:20:52	Tidy instruction separators in SHA assembly. Remove unnecessary separators and add a few to macros that call other macros (instead of expecting them to exist).
14fe603b	2026-01-17 06:31:45	Use .section before .rodata to appease gas. gas dislikes bare .rodata - add .section before .rodata to make it happier (LLVM does not care and is happy with either). For consistency, do the same with .text.
ef798222	2026-01-17 06:23:42	Use local label prefix for loop labels.
87ed7926	2025-06-28 12:51:08	Provide accelerated SHA-1 for aarch64. Provide an assembly implementation of SHA-1 for aarch64 using the ARM Cryptographic Extension (CE). This results in around a 2x speed up for larger block sizes. ok tb@
23f4dfa0	2025-06-09 14:28:33	Make OPENSSL_IA32_SSE2 the default for i386 and remove the flag. The OPENSSL_IA32_SSE2 flag controls whether a number of the perlasm scripts generate additional implementations that use SSE2 functionality. In all cases except ghash, the code checks OPENSSL_ia32cap_P for SSE2 support, before trying to run SSE2 code. For ghash it generates a CLMUL based implementation in addition to different MMX version (one MMX version hides behind OPENSSL_IA32_SSE2, the other does not), however this does not appear to actually use SSE2. We also disable AES-NI on i386 if OPENSSL_IA32_SSE2. On OpenBSD, we've always defined OPENSSL_IA32_SSE2 so this is effectively a no-op. The only change is that we now check MMX rather than SSE2 for the ghash MMX implementation. ok bcook@ beck@
234f524d	2025-06-09 13:58:28	Remove GNU assembler version check. GNU assembler version 2.19 was released in 2008, so it does not seem unreasonable to expect that we have an assembler that supports AVX. Furthermore, the current check fails on LLVM. ok bcook@ beck@
f2f8d78d	2025-04-18 07:36:11	Use 'ctx' for sha3_ctx variables, rather than the less readable 'c'. ok tb@
2b08fe26	2025-04-18 07:27:42	Pull casts from void * to uint8_t * up to variables, rather than inline. ok tb@
80598ffc	2025-04-18 07:23:53	Use two temporary variables in sha3_keccakf(), rather than reusing bc[0]. ok tb@
aca95e1c	2025-04-18 07:19:48	Use crypto_rol_u64() instead of a separate ROTL64 define. ok tb@
f24522bc	2025-03-12 14:13:41	Provide an accelerated SHA-512 assembly implementation for aarch64. This provides a SHA-512 assembly implementation that makes use of the ARM Cryptographic Extension (CE), which is found on many arm64 CPUs. This gives a performance gain of up to 2.5x on an Apple M2 (dependent on block size). If an aarch64 machine does not have SHA512 support, then we'll fall back to using the existing C implementation. ok kettenis@ tb@
80bce72e	2025-03-12 12:53:33	Use .arch rather than .cpu for sha2 instructions. We have code that targets a specific architecture level, hence .arch makes more sense here than .cpu. Suggested by kettenis@
08386632	2025-03-07 14:21:22	Provide an accelerated SHA-256 assembly implementation for aarch64. This provides a SHA-256 assembly implementation that makes use of the ARM Cryptographic Extension (CE), which is found on many arm64 CPUs. This gives a performance gain of up to 7.5x on an Apple M2 (dependent on block size). If an aarch64 machine does not have SHA2 support, then we'll fall back to using the existing C implementation. ok kettenis@ tb@
4eb9c9dc	2025-02-14 12:01:58	Replace Makefile based SHA_ASM defines with HAVE_SHA_ defines. Currently, SHA{1,256,512}_ASM defines are used to remove the C implementation of sha{1,256,512}_block_data_order() when it is provided by assembly. However, this prevents the C implementation from being used as a fallback. Rename the C sha_block_data_order() to sha_block_generic() and provide a sha_block_data_order() that calls sha_block_generic(). Replace the Makefile based SHA_ASM defines with two HAVE_SHA_ defines that allow these functions to be compiled in or removed, such that machine specific versions can be provided. This should effectively be a no-op on any platform that defined SHA{1,256,512}_ASM. ok tb@
515aa502	2025-01-25 17:59:44	Remove #error if OPENSSL_NO_FOO is defined discussed with jsing
f6bb4990	2025-01-18 02:56:07	Use name instead of register.
90c5a28a	2024-12-06 11:57:17	Provide a SHA-1 assembly implementation for amd64 using SHA-NI. This provides a SHA-1 assembly implementation for amd64, which uses the Intel SHA Extensions (aka SHA New Instructions or SHA-NI). This provides a 2-2.5x performance gain on some Intel CPUs and many AMD CPUs. ok tb@
550a1cbd	2024-12-04 13:14:45	Another now unused perlasm script can bite the dust.
a61493a0	2024-12-04 13:13:33	Provide a replacement assembly implementation for SHA-1 on amd64. As already done for SHA-256 and SHA-512, replace the perlasm generated SHA-1 assembly implementation with one that is actually readable. Call the assembly implementation from a C wrapper that can, in the future, dispatch to alternate implementations. On a modern CPU the performance is around 5% faster than the base implementation generated by sha1-x86_64.pl, however it is around 15% slower than the excessively complex SSSE3/AVX version that is also generated by the same script (a SHA-NI version will greatly outperform this and is much cleaner/simpler). ok tb@
45e2a6c1	2024-11-23 15:38:12	Simplify endian handling in SHA-3. Rather than having blocks of code that are conditional on BYTE_ORDER != LITTLE_ENDIAN, use le64toh() and htole64() unconditionally. In the case of a little endian platform, the compiler will optimise this away, while on a big endian platform we'll either end up with better code or the same code than we have currently. ok tb@
228e7c1e	2024-11-16 15:31:36	Provide a SHA-256 assembly implementation for amd64 using SHA-NI. This provides a SHA-256 assembly implementation for amd64, which uses the Intel SHA Extensions (aka SHA New Instructions or SHA-NI). This provides a 3-5x performance gain on some Intel CPUs and many AMD CPUs. ok tb@
08bba489	2024-11-16 15:06:08	Remove sha512-x86_64.pl. Now that we have replacement SHA-256 and SHA-512 assembly implementations for amd64, sha512-x86_64.pl can go the way of the dodo.
8a0aadfb	2024-11-16 14:56:39	Provide a replacement assembly implementation for SHA-512 on amd64. Replace the perlasm generated SHA-512 assembly with a more readable version and the same C wrapper introduced for SHA-256. As for SHA-256, on a modern CPU the performance is largely the same. ok tb@
07e532b2	2024-11-16 12:34:16	Specify size for K256 symbol. Missing sizes spotted by guenther@
644472e5	2024-11-12 13:51:14	Use multipliers for stack offsets and tweak comment.
0acd6edb	2024-11-08 15:09:48	Provide a replacement assembly implementation for SHA-256 on amd64. Replace the perlasm generated SHA-256 assembly implementation with one that is actually readable. Call the assembly implementation from a C wrapper that can, in the future, dispatch to alternate implementations. Performance is similar (or even better) on modern CPUs, while somewhat slower on older CPUs (this is in part due to the wrapper, the impact of which is more noticeable with small block sizes). Thanks to gkoehler@ and tb@ for testing. ok tb@
f6fc4eaf	2024-06-01 08:11:44	Missed SHA224() in previous: reverse order of attributes
fd65fe5a	2024-06-01 07:44:11	Reverse order of attributes requested by jsing on review
9cb04522	2024-06-01 07:36:16	Remove support for static buffers in HMAC/digests. HMAC() and the one-step digests used to support passing a NULL buffer and would return the digest in a static buffer. This design is firmly from the nineties, not thread safe and it saves callers a single line. The few ports that used to rely on this were fixed with patches sent to non-hostile (and non-dead) upstreams. It's early enough in the release cycle that remaining uses hidden from the compiler should be caught, at least the ones that matter. There won't be that many since BoringSSL removed this feature in 2017. https://boringssl-review.googlesource.com/14528 Add non-null attributes to the headers and add a few missing bounded attributes. ok beck jsing
a407cbb3	2024-03-28 07:06:12	Demacro sha1. Replace macros with static inline functions and use names that follow the spec more closely. Unlike SHA256/SHA512, the functions and constants do not align with the number of words loaded, which means we cannot easily loop and end up just unrolling everything. ok joshua@ tb@
71b54e50	2024-03-28 04:23:02	Fix line wrapping.
c2de78a7	2024-03-26 12:54:22	Rework input and output handling for sha1. Use be32toh(), htobe32() and crypto_{load,store}_htobe32() as appropriate. Also use the same while() loop that is used for other hash functions. ok joshua@ tb@
22787c51	2024-02-24 15:30:14	Replace uses of endbr64 with _CET_ENDBR from cet.h. cet.h is needed for other platforms to emit the relevant .gnu.properties sections that are necessary for them to enable IBT. It also avoids issues with older toolchains on macOS that explode on encountering endbr64. Based on a diff by kettenis. ok beck kettenis
2d2b9ed9	2023-08-11 15:27:28	Stop including md32_common.h. Now that we're no longer dependent on md32_common.h, stop including it. Remove various defines that only existed for md32_common.h usage.
d83e85e7	2023-08-11 15:25:36	Demacro sha256. Replace macros with static inline functions, as well as writing out the variable rotations instead of trying to outsmart the compiler. Also pull the message schedule update up and complete it prior to commencement of the round. Also use rotate right, rather than transposed rotate left. Overall this is more readable and more closely follows the specification. On some platforms (e.g. aarch64) there is no notable change in performance, while on others there is a significant improvement (more than 25% on arm). ok miod@ tb@
cd67cc31	2023-08-10 07:15:23	Remove MD32_REG_T. This is a hack that is only enabled on a handful of 64 bit platforms, as a workaround for poor compiler optimisation. If you're running an archaic compiler on an archaic architecture, then you can deal with slightly lower performance. ok tb@
65be244d	2023-07-08 12:24:10	Hide symbols in sha ok tb@
26be10ed	2023-07-08 07:58:25	Remove unused SHA-1 implementation.
be81028a	2023-07-08 07:52:25	Remove now unnecessary "do { } while (0)"
f7bb1d80	2023-07-08 07:49:45	Inline HASH_MAKE_STRING macro. No change to generated assembly.
eb6cfd0b	2023-07-08 07:43:44	Reorder functions. No functional change.
06b4c63b	2023-07-08 07:08:11	style(9)
cbefc5eb	2023-07-07 15:09:45	Implement SHA1_{Update,Transform,Final}() directly in sha1.c. Copy the update, transform and final functions from md32_common.h, manually expanding the macros for SHA1. This will allow for further clean up to occur. No change in generated assembly.
b039d949	2023-07-07 15:06:50	Clean up alignment handling for SHA-256. If input data is 32 bit aligned use be32toh() directly, otherwise use crypto_load_be32toh(), cleaning up all of the HOST_c2l() usage. ok beck@
1fd3fa42	2023-07-07 15:03:55	Clean up SHA-256 input handling and round macros. Avoid reach around and initialisation outside of the macro, cleaning up the call sites to remove the initialisation. ok beck@
6fa35e22	2023-07-07 14:32:41	Remove unused SHA-256 implementation. ok beck@
e609121d	2023-07-07 10:22:28	Replace HOST_l2c() with htobe32() or crypto_store_htobe32(). ok beck@
a255a78f	2023-07-02 14:57:58	Demacro SHA-512. Use static inline functions instead of macros to implement SHA-512. At the same time, make two key changes - firstly, rather than trying to outsmart the compiler and shuffle variables around, write the algorithm the way it is documented and actually swap the variable contents. Secondly, instead of interleaving the message schedule update and the round, do the full message schedule update first, then process the round. Overall, we get safer and more readable code. Additionally, the compiler can generate smaller and faster code (with a gain of 5-10% across a range of architectures). ok beck@ tb@
f1e15a90	2023-05-28 14:54:37	Sprinkle some style(9).
627637ad	2023-05-28 14:49:21	Expand occurrences of HASH_CTX that were previously missed. No change in generated assembly.
a94aa803	2023-05-28 14:14:33	Reorder functions. No intended functional change.
a2833576	2023-05-28 13:57:27	Clean up includes.
19eea776	2023-05-28 13:55:55	Remove now unnecessary do {} while(0);
03f0084a	2023-05-28 13:53:08	Inline HASH_MAKE_STRING for SHA256. No change to generated assembly.
73c48ca3	2023-05-27 18:39:03	Implement SHA256_{Update,Transform,Final}() directly in sha256.c. md32_common.h is a typical OpenSSL macro horror show - copy the update, transform and final functions from md32_common.h, manually expanding the macros for SHA256. This will allow for further clean up to occur. No change in generated assembly. ok beck@ tb@
825d9bb4	2023-05-27 09:18:17	Clean up alignment handling for SHA-512. This recommits r1.37 of sha512.c, however uses uint8_t * instead of void * for the crypto_load_* functions and primarily uses const uint8_t * to track input, only casting to const SHA_LONG64 * once we know that it is suitably aligned. This prevents the compiler from implying alignment based on type. Tested by tb@ and deraadt@ on platforms with gcc and strict alignment. ok tb@
3c6df8cf	2023-05-19 00:54:27	backout alignment changes (breaking at least two architectures)
ac15c2ab	2023-05-17 06:37:14	Clean up alignment handling for SHA-512. All assembly implementations are required to perform their own alignment handling. In the case of the C implementation, on strict alignment platforms, unaligned data will be copied into an aligned buffer. However, most platforms then perform byte-by-byte reads (via the PULL64 macros). Instead, remove SHA512_BLOCK_CAN_MANAGE_UNALIGNED_DATA and move alignment handling to sha512_block_data_order() - if the data is aligned then simply perform 64 bit loads and then do endian conversion via be64toh(). If the data is unaligned then use memcpy() and be64toh() (in the form of crypto_load_be64toh()). Overall this reduces complexity and can improve performance (on aarch64 we get a ~10% performance gain with aligned input and a ~1-2% gain on armv7), while the same movq/bswapq is generated for amd64 and movl/bswapl for i386. ok tb@
c7cae210	2023-05-16 07:04:57	Clean up SHA-512 input handling and round macros. Avoid reach around and initialisation outside of the macro, cleaning up the call sites to remove the initialisation. Use a T2 variable to more closely follow the documented algorithm and remove the gorgeous compound statement X = Y += A + B + C. There is no change to the clang generated assembly on aarch64. ok tb@
d6fa391b	2023-05-12 10:10:55	Reduce the number of SHA-512 C implementations from three to one. We currently have three C implementations for SHA-512 - a version that is optimised for CPUs with minimal registers (specifically i386), a regular implementation and a semi-unrolled implementation. Testing on a ~15 year old i386 CPU, the fastest version is actually the semi-unrolled version (not to mention that we still currently have an i586 assembly implementation that is used on i386 instead...). More decent architectures do not seem to care between the regular and semi-unrolled version, presumably since they are effectively doing the same thing in hardware during execution. Remove all except the semi-unrolled version. ok tb@
ce228578	2023-04-25 19:32:19	Remove duplicate NID definitions
21724f70	2023-04-25 15:47:29	Remove no longer necessary compat #defines
1d4dcfa7	2023-04-25 04:42:25	Add endbr64 where needed by inspection. Passes regression tests. ok jsing, and kind of tb for an earlier version
925de8c6	2023-04-16 17:06:19	Provide EVP methods for SHA3 224/256/384/512. ok tb@
2afadb71	2023-04-16 16:42:06	Provide EVP methods for SHA512/224 and SHA512/256. ok tb@
aae7803d	2023-04-16 15:32:16	Bounds check mdlen that is passed to sha3_init(). While here, use KECCAK_BYTE_WIDTH instead of hardcoding the value.
acc18af8	2023-04-15 20:00:24	Use size_t rather than int. Also buy a vowel for rsiz.
b658812f	2023-04-15 19:44:36	Add SHA3 digest length define that was previously missed.
6be04bb3	2023-04-15 19:30:31	Remove sha3() function, which will not be used or exposed.
8e11058e	2023-04-15 19:29:20	Mark sha3_keccakf() as static and remove prototype from header.
0d9460fd	2023-04-15 19:27:54	Use memset() to zero the context, instead of zeroing manually.
a7bada8b	2023-04-15 19:22:34	Provide SHA3 length related defines. These will make EVP integration easier, as well as being used in the SHA3 implementation itself.
ac5713a6	2023-04-15 19:15:53	Use the same byte order tests as we do elsewhere in libcrypto.
ee266ad5	2023-04-15 18:32:55	Rename SHA3 context struct field from 'st' to 'state'.
f1b36196	2023-04-15 18:30:27	Rename SHA3 context to align with existing code.
d44d5087	2023-04-15 18:29:26	Move some defines out of the sha3_internal.h header.
dd34866c	2023-04-15 18:22:53	Revise header guards.
5c0ae387	2023-04-15 18:19:06	Pull constant tables out of sha3_keccakf().
e967e4e7	2023-04-15 18:14:21	Strip and reformat comments. Remove various comments that are unhelpful or obvious. Reformat remaining comments per style(9).
8a4ba2fc	2023-04-15 18:07:44	Apply style(9) (first pass).
c0cd1c8b	2023-04-15 18:00:57	Import sha3_internal.h.
9bb5e18b	2023-04-15 17:59:50	Add license to sha3 files.
e70bbf9b	2023-04-15 17:56:35	Import tiny_sha3 This is a minimal and readable SHA3 implementation. ok tb@
8e9acae6	2023-04-14 10:45:15	Add support for truncated SHA512 variants. This adds support for SHA512/224 and SHA512/256, as specified in FIPS 180-4. These are truncated versions of the SHA512 hash. ok tb@
e9f61643	2023-04-14 10:41:34	Use memset() and only initialise non-zero struct members. ok tb@
afc643d3	2023-04-12 05:16:08	Remove now unused sha_local.h.
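
Note on the alignment handling commits (ac15c2ab, 825d9bb4): the log describes loading big-endian 64 bit words from unaligned input with memcpy() followed by be64toh(), tracking the input as const uint8_t * so the compiler cannot imply alignment from the pointer type. The following is a minimal sketch of that idea only, not the libcrypto code. The names example_be64toh() and example_load_be64() are hypothetical, and the byte swap is written out by hand so the sketch does not depend on a platform specific endian header.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for be64toh(): interpret the in-memory bytes of v
 * as a big-endian value. Gives the same result on little and big endian hosts.
 */
static inline uint64_t
example_be64toh(uint64_t v)
{
	uint8_t b[8];

	memcpy(b, &v, sizeof(b));
	return ((uint64_t)b[0] << 56) | ((uint64_t)b[1] << 48) |
	    ((uint64_t)b[2] << 40) | ((uint64_t)b[3] << 32) |
	    ((uint64_t)b[4] << 24) | ((uint64_t)b[5] << 16) |
	    ((uint64_t)b[6] << 8) | (uint64_t)b[7];
}

/*
 * Hypothetical helper: load a big-endian 64 bit word from a byte pointer
 * that may be unaligned. memcpy() avoids an unaligned 64 bit access, and
 * taking const uint8_t * means no alignment is implied by the type.
 */
static inline uint64_t
example_load_be64(const uint8_t *p)
{
	uint64_t v;

	memcpy(&v, p, sizeof(v));
	return example_be64toh(v);
}
```

A caller would pass any offset into an input buffer, for example example_load_be64(buf + 1). On platforms where unaligned loads are cheap, compilers commonly reduce the memcpy() to a single load plus a byte swap.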