This should have happened when we removed most of the MAINTAIN_BACKWARDS_COMPATIBILITY artifacts. It's not practical to move SHA1 into the Weak:: namespace or "typedef SHA256 SHA" because SHA1 is too intertwined at the moment.
In the interim, maybe we can place SHA1 in both CryptoPP:: and Weak:: namespaces. This will allow us to transition into Weak::SHA1 over time, and signal to users SHA1 should be avoided.
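A minimal sketch of the dual-namespace idea, assuming a using-declaration is an acceptable way to expose the same class under both names. DemoLib, its nested Weak namespace, and DigestSize() are hypothetical stand-ins for illustration, not the library's actual headers.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for the library namespace (assumption, not Crypto++).
namespace DemoLib {

// The class keeps its current home so existing code continues to compile.
class SHA1 {
public:
    static std::size_t DigestSize() { return 20; }
};

namespace Weak {
// Re-export the same type under Weak:: to signal that it should be avoided.
using DemoLib::SHA1;
}  // namespace Weak

}  // namespace DemoLib

// Both names refer to one and the same type, so callers can migrate gradually.
static_assert(std::is_same<DemoLib::SHA1, DemoLib::Weak::SHA1>::value,
              "Weak::SHA1 must alias the existing SHA1");
```

Because the two names denote a single type, objects created through either spelling interoperate, and the unqualified name could be retired later without changing the class itself.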
They are giving ARIA and BLAKE2 trouble. It looks like SSE4 support appeared in the GCC compiler around 4.1 or 4.2. It looks like SHA support appeared in the GNU assembler around 2.18.
Initially we performed a 32-bit word-size ByteReverse() on the entire 64-byte buffer being hashed. Then we performed another fix-up when loading each 16-byte portion of the buffer into the SSE2 registers for SHA processing. The [undesired] consequence was that byte swapping and reversals happened twice. Worse, the call to ByteReverse() produced 16 bswaps instead of 1 call to pshufb, so it was orders of magnitude slower than it needed to be.
This check-in takes the sane approach to byte reversals and swapping. It performs it once when the message is loaded for SSE processing. The result is SHA1 calculations drop from about 3.0 cpb to about 2.5 cpb.
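The "convert once, at load time" idea can be sketched in portable scalar C++. This is not the SSE code from the check-in. LoadBE32 and LoadBlockBE are hypothetical helper names. An SSSE3 version would presumably do the same job with one byte shuffle per 16-byte chunk.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assemble one big-endian 32-bit word from four bytes. The byte order is
// fixed here, so the result is the same on little- and big-endian hosts.
inline std::uint32_t LoadBE32(const unsigned char* p)
{
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}

// Hypothetical helper: turn a 64-byte message block into 16 words in a
// single pass. No earlier ByteReverse() over the buffer is needed, so each
// byte is reordered exactly once.
inline void LoadBlockBE(const unsigned char* block, std::uint32_t* w)
{
    for (std::size_t i = 0; i < 16; ++i)
        w[i] = LoadBE32(block + 4 * i);
}
```

The point of the design is that the caller hands over the raw message bytes untouched, and the only reordering happens where the words are formed.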
Previously, all 1024-bit tests were run, and then 2048-bit tests were run. Splitting them meant there were two entries for DSA-RFC6979/SHA-1, two entries for DSA-RFC6979/SHA-256 and so on. Now there will be one entry output during testing.
sha2.txt and sha3.txt are just collections of other files, so they don't take up much space.
This commit stems from an exception when running 'cryptest.exe tv sha2' and 'cryptest.exe tv sha3'. It's not obvious that the name of the file to be run is sha2_224_fips_180.txt. Users should not have to hunt for the reason sha2 and sha3 do not work.
regtest.cpp is where ciphers register by name. The library has added a number of ciphers over the last couple of years and the source file has experienced bloat. Most of the ARM and MIPS test boards were suffering Out of Memory (OOM) kills as the compiler processed the source file and the included header files.
This won't stop the OOM kills, but it will help the situation. An early BeagleBoard with 512 MB of RAM is still going to have trouble, but it can be worked around by building with 1 make job as opposed to 2 or 4.
The ARIA S-boxes could leak timing information. This commit applies the countermeasures present in Rijndael and Camellia to ARIA. We take a penalty of about 0.05 to 0.1 cpb. It equates to about 0 MiB/s on an ARM device, and about 2 MiB/s on a modern Skylake.
We recently gained some performance through use of SSE and NEON in ProcessAndXorBlock, so the net result is an improvement.
Tune the CRYPTOPP_ENABLE_ARIA_SSE2_INTRINSICS and CRYPTOPP_ENABLE_ARIA_SSSE3_INTRINSICS macros for older GCC and Clang. Clang needs some more tuning on Aarch64 because performance is off by about 15%.
Add additional NEON code paths.
Remove keyBits from Aarch64 code paths.
The SSSE3 intrinsics were performing aligned loads with _mm_load_si128 on user supplied pointers. The pointers are only byte pointers, so their alignment can drop to 1 or 2. Switching to _mm_loadu_si128 will sidestep potential problems. The crash surfaced under Win32 testing.
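A small sketch of the unaligned-load pattern, assuming an x86 target with SSE2 and falling back to memcpy elsewhere. FirstLaneUnaligned is a hypothetical helper written for illustration and is not part of the ARIA code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

#if defined(__SSE2__) || defined(_M_X64)
# include <emmintrin.h>
# define DEMO_HAVE_SSE2 1   // hypothetical macro, used only in this sketch
#endif

// Hypothetical helper: read 16 bytes from a byte pointer that carries no
// alignment guarantee and return the first 32-bit lane.
inline std::uint32_t FirstLaneUnaligned(const unsigned char* p)
{
#if defined(DEMO_HAVE_SSE2)
    // _mm_loadu_si128 accepts any address. _mm_load_si128 would require
    // 16-byte alignment and may fault on a pointer like buffer+1.
    const __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(p));
    return static_cast<std::uint32_t>(_mm_cvtsi128_si32(v));
#else
    // Portable fallback: memcpy is valid for any alignment.
    std::uint32_t r;
    std::memcpy(&r, p, sizeof(r));
    return r;
#endif
}
```

Calling the helper with an odd address, such as one byte past the start of a buffer, is the situation that a user supplied byte pointer can produce.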
Switch to memcpy's when performing bulk assignment x[0]=y[0] ... x[3]=y[3]. I believe Yun used the pattern to promote vectorization. Some compilers appear to be braindead and issue integer moves one word at a time. Non-braindead compilers will still take the optimization when advantageous, and slower compilers will benefit from the bulk move. We also cherry picked vectorization opportunities, like in ARIA_GSRK_NEON.
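The bulk-assignment change can be illustrated as follows. CopyWordwise and CopyBulk are hypothetical names for the before and after patterns. The four-word block size is taken from the x[0]...x[3] description above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Before: four separate assignments. Some compilers emit one integer move
// per word for this pattern.
inline void CopyWordwise(std::uint32_t* x, const std::uint32_t* y)
{
    x[0] = y[0]; x[1] = y[1]; x[2] = y[2]; x[3] = y[3];
}

// After: a single 16-byte memcpy. A good compiler can lower this to one
// vector move, and a weaker one still gets a bulk copy.
inline void CopyBulk(std::uint32_t* x, const std::uint32_t* y)
{
    std::memcpy(x, y, 4 * sizeof(std::uint32_t));
}
```

Both functions produce the same result for non-overlapping buffers. The memcpy form only changes how the intent is expressed to the compiler.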
Remove keyBits variable. We now use UncheckedSetKey's keylen throughout.
Also fix a typo in CRYPTOPP_BOOL_SSSE3_INTRINSICS_AVAILABLE. __SSSE3__ was listed twice.