Commit Graph

100 Commits (c25a1e354de1d12e79ba265f37036c908ce9db8f)

Author SHA1 Message Date
Jeffrey Walton 01779726db
Use consistent suffix for SSE2 ASM 2018-08-20 07:16:59 -04:00
Jeffrey Walton 4282f94712
Disable X32 inline assembly (GH #686, PR #704)
Also use CRYPTOPP_DISABLE_XXX_ASM consistently. The pattern is needed for Clang which still can't compile Intel assembly language. Also see http://llvm.org/bugs/show_bug.cgi?id=24232.
2018-08-18 04:44:53 -04:00
Jeffrey Walton e580ed588a
Disable same buffer for in and out on ARM A-32 (GH #683) 2018-07-12 07:05:18 -04:00
Jeffrey Walton b3fe24b8b5
Remove CRYPTOPP_ALLOW_UNALIGNED_DATA_ACCESS support (GH #682)
We were able to gut CRYPTOPP_ALLOW_UNALIGNED_DATA_ACCESS for everything except Rijndael. Rijndael uses unaligned accesses on x86 to harden against timing attacks.
There's a little more to CRYPTOPP_ALLOW_UNALIGNED_DATA_ACCESS and Rijndael. If we remove unaligned access then AliasedWithTable hangs in an endless loop on non-AESNI machines. So care must be taken when trying to remove the vestige from Rijndael.
2018-07-11 11:40:25 -04:00
Jeffrey Walton 3ff7d7f028
Add ARM AES asm implementation from Cryptogams (GH #683) 2018-07-11 06:59:44 -04:00
Jeffrey Walton b74a6f4445
Add algorithm provider member function to Algorithm class 2018-07-06 09:23:37 -04:00
Jeffrey Walton 244c40ed61
Remove unneeded round parameter on Rijndael_UncheckedSetKey_SSE4_AESNI 2018-02-20 13:32:53 -05:00
Jeffrey Walton c80e28eec8
Remove unneeded parameter for Rijndael_UncheckedSetKey_POWER8 2018-02-20 06:42:43 -05:00
Jeffrey Walton fac3a44a84
Move Altivec AdvancedProcessBlocks into adv-simd.h 2018-01-02 07:08:13 -05:00
Jeffrey Walton a074722bfa
Switch to rotlConstant and rotrConstant
This will help Clang and its need for a constexpr
2017-11-25 02:52:19 -05:00
Jeffrey Walton 561926db34
Rename CRYPTOPP_ENABLE_ADVANCED_PROCESS_BLOCKS for Rijndael 2017-11-22 17:55:20 -05:00
Jeffrey Walton 69c8a4f9c6
Prefix IS_LITTLE_ENDIAN and IS_BIG_ENDIAN with CRYPTOPP 2017-11-10 14:15:30 -05:00
Jeffrey Walton 6e436427fb
Use SetMark to avoid unneeded zeroization in Rijndael 2017-10-08 12:05:33 -04:00
Jeffrey Walton 01e46aa474
Move AliasedWithTable into unnamed namespace
Move m_aliasBlock into Rijndael::Base. m_aliasBlock is now an extra data member for Dec because the aliased table is only used for Enc when unaligned data access is in effect. However, the SecBlock is not allocated in the Dec class so there is no runtime penalty.

Moving m_aliasBlock into Base also allowed us to remove the Enc::Enc() constructor, which always appeared as a wart in my eyes. Now m_aliasBlock is sized in UncheckedSetKey, so there's no need for the ctor initialization.

Also see https://stackoverflow.com/q/46561818/608639 on Stack Overflow. The SO question had an unusual/unexpected interaction with CMake, so the removal of the Enc::Enc() ctor should help the problem.
2017-10-05 09:28:56 -04:00
Jeffrey Walton 1057f89363
Move Power8 crypto functions into ppc-crypto.h 2017-09-22 05:23:29 -04:00
Jeffrey Walton e78464a1af
Enable little endian Rijndael_UncheckedSetKey_POWER8 using built-ins
The problem was vec_sld is endian sensitive. The built-in required more than us setting up arguments to ensure the vsx load resulted in a big endian value. Thanks to Paul R on Stack Overflow for sharing the information that IBM did not provide. Also see http://stackoverflow.com/q/46341923/608639
2017-09-21 09:56:37 -04:00
Jeffrey Walton c6b096ddd4
Move Rijndael_UncheckedSetKey_POWER8 prior to GetUserKey call
Arg... GetUserKey was performing a 32-bit word reverse. It was part of the problem on little endian machines
2017-09-21 01:08:44 -04:00
Jeffrey Walton 6440921723
Add Rijndael_UncheckedSetKey_POWER8
We are going to attempt to perform key setup using Power8 in-core vector instructions
2017-09-19 04:55:15 -04:00
Jeffrey Walton 923cf95571
ByteReverseArray → ReverseByteArrayLE 2017-09-18 18:40:19 -04:00
Jeffrey Walton 2c18fe8af8
Refactor LoadT() and StoreT(). Add separate ReverseT() for little endian machines
The refactoring has no effect on little endian machines. However, on big endian GCC119 using GCC 7.1 the performance improved by 2.5x for ECB and CTR modes:

BEFORE:

<TR><TH>AES/CTR (128-bit key)<TD>2723<TD>1.4<TD>0.163<TD>670
<TR><TH>AES/CTR (192-bit key)<TD>2560<TD>1.5<TD>0.175<TD>719
<TR><TH>AES/CTR (256-bit key)<TD>2728<TD>1.4<TD>0.183<TD>749
<TR><TH>AES/CBC (128-bit key)<TD>1204<TD>3.2<TD>0.135<TD>554
<TR><TH>AES/CBC (192-bit key)<TD>1066<TD>3.7<TD>0.148<TD>605
<TR><TH>AES/CBC (256-bit key)<TD>948<TD>4.1<TD>0.155<TD>635
<TR><TH>AES/OFB (128-bit key)<TD>1019<TD>3.8<TD>0.158<TD>648
<TR><TH>AES/CFB (128-bit key)<TD>949<TD>4.1<TD>0.192<TD>787
<TR><TH>AES/ECB (128-bit key)<TD>3564<TD>1.1<TD>0.082<TD>337

AFTER:

<TR><TH>AES/CTR (128-bit key)<TD>6484<TD>0.6<TD>0.163<TD>677
<TR><TH>AES/CTR (192-bit key)<TD>5641<TD>0.7<TD>0.176<TD>728
<TR><TH>AES/CTR (256-bit key)<TD>5005<TD>0.8<TD>0.183<TD>761
<TR><TH>AES/CBC (128-bit key)<TD>1223<TD>3.2<TD>0.135<TD>559
<TR><TH>AES/CBC (192-bit key)<TD>1080<TD>3.7<TD>0.147<TD>611
<TR><TH>AES/CBC (256-bit key)<TD>966<TD>4.1<TD>0.155<TD>642
<TR><TH>AES/OFB (128-bit key)<TD>1057<TD>3.7<TD>0.158<TD>656
<TR><TH>AES/CFB (128-bit key)<TD>1217<TD>3.3<TD>0.186<TD>774
<TR><TH>AES/ECB (128-bit key)<TD>7289<TD>0.5<TD>0.082<TD>342
2017-09-18 18:15:25 -04:00
Jeffrey Walton 6899d3f8bb
Add AdvancedProcessBlocks for Power8
This increases performance to about 1.6 cpb. We are about 0.5 cpb behind Botan, and about 1.0 cpb behind OpenSSL. However, it beats the snot out of C/C++, which runs at 20 to 30 cpb
2017-09-12 18:15:55 -04:00
Jeffrey Walton b090e5f69f
Add Power8 AES decryption 2017-09-12 05:53:17 -04:00
Jeffrey Walton 81a272b046
Update comments 2017-09-12 00:30:48 -04:00
Jeffrey Walton 7fb34e9b08
Add Power8 AES encryption
This is the forward direction on encryption only.  Crypto++ uses the "Equivalent Inverse Cipher" (FIPS-197, Section 5.3.5, p.23), and it is not compatible with IBM hardware. The library library will need to re-work the decryption key scheduling routines. (We may be able to work around it another way, but I have not investigated it).
2017-09-11 22:52:22 -04:00
Jeffrey Walton 37e02f9e0e
Revert AltiVec and Power8 commits
The strategy of "cleanup under-aligned buffers" is not scaling well. Corner cases are still turing up. The library has some corner-case breaks, like old 32-bit Intels. And it still has not solved the AltiVec and Power8 alignment problems.
For now we are backing out the changes and investigating other strategies
2017-09-05 16:28:00 -04:00
Jeffrey Walton fe0a5ee8e8
Warn of under-aligned buffers when using AES in debug mode
This commit supports the upcoming AltiVec and Power8 processor. This commit affects a number of classes due to the ubiquitous use of AES. The commit adds debug asserts to warn of under-aligned and misaligned buffers in debug builds.
2017-09-04 12:01:44 -04:00
Jeffrey Walton 75aef9bded
Fixup under-aligned buffers when using AES on AltiVec and Power8
This commit supports the upcoming AltiVec and Power8 processor. This commit affects a number of classes due to the ubiquitous use of AES. The commit provides the data alignment requirements.
2017-09-04 11:21:47 -04:00
Jeffrey Walton 5c6a32ba0f
Support Base Implementation + SIMD implementation on Solaris (PR #461) 2017-08-24 19:17:21 -04:00
Jeffrey Walton 7851a0d510 Remove BOOL macro value (GH #462)
Currently the CRYPTOPP_BOOL_XXX macros set the macro value to 0 or 1. If we remove setting the 0 value (the #else part of the expression), then the self tests speed up by about 0.3 seconds. I can't explain it, but I have observed it repeatedly.
This check-in prepares for the removal in Upstream master
2017-08-20 21:25:29 -04:00
Jeffrey Walton a1b3102eab
Update comments 2017-08-19 01:35:36 -04:00
Jeffrey Walton e2c377effd Split source files to support Base Implementation + SIMD implementation (GH #461)
Split source files to support Base Implementation + SIMD implementation
2017-08-17 12:33:43 -04:00
Jeffrey Walton 08c37e5887
Update comments in Rijndael head comments 2017-08-15 14:26:30 -04:00
Jeffrey Walton 2aff92ddb6
Fix bad SHA::Transform calculation (Issue 455)
Reworked SHA class internals to align all the implementations. Formerly all hashes were software based, IterHashBase handled endian conversions, IterHashBase repeatedly called the single block SHA{N}::Transform. The rework added SHA{N}::HashMultipleBlocks, and the SHA classes attempt to always use it.

Now SHA{N}::Transform calls into SHA{N}_HashMultipleBlocks, which is a free standing function. An added wrinkle is hardware wants little endian data and software presents big endian data, so HashMultipleBlocks accepts a ByteOrder for the incoming data. Hardware based SHA{N}_HashMultipleBlocks can often perform the endian swap much easier by setting an EPI mask so it was profitable to defer to hardware when available.

The rework also removed the hacked-in pointers to implementations. The class now looks more like AES, GCM, etc.
2017-08-13 16:05:39 -04:00
Jeffrey Walton 863bf9133c
Cleanup casts due to Clang 2017-08-13 06:32:09 -04:00
Jeffrey Walton 173dd0b530
Add AES for ARMv8 (Issue 458) 2017-08-11 07:31:09 -04:00
Jeffrey Walton 301437e693
Updated static initializers
When MSVC init_seg or GCC init_priority is available, we don't need to use the Singleton. We only need to create a file scope class variable and place it in the segment for MSVC or provide the attribute for GCC.
An additional upside is we cleared all the memory leaks that used to be reported by MSVC for debug builds.
2017-03-17 20:47:32 -04:00
Jeffrey Walton 5efb019d8b
Add C++ nullptr support (Issue 383) 2017-03-01 06:10:06 -05:00
Jeffrey Walton 733a073d65
Fix mismatched arch capabilities (Issue 283) 2016-10-27 01:01:01 -04:00
Jeffrey Walton 19ebf769e7
Add debug instrumentation to Rijndael
We added asserts due to Coverity findings. We beieve the findings were false positives
2016-09-30 13:14:29 -04:00
Jeffrey Walton 2b328e8f8b
Fix AES and X86 compile on Solaris 2016-09-30 09:31:23 -04:00
Jeffrey Walton 4c1b5472cc Cutover to SecByteBlock member for AES (Issue 302, CVE-2016-7544) 2016-09-30 01:09:21 -04:00
Jeffrey Walton bfd23861f4 Whitespace cleanup 2016-09-24 18:59:55 -04:00
John Byrd a33b95325f When calculating the AES block cipher, allocate 4K of memory on the stack instead of 256+ bytes. Search within that 4K space to put the 256-byte aligned Locals struct in a place which does not have 4K cache conflicts with the Te temporary buffer. This permits us to call _malloca() or alloca() once per call of this function. This commit also makes sure that the Microsoft-only _freea() occurs at the correct location instead of at a pointer to the middle of the stack, when the memory allocated by _malloca() or alloca() is not 256-byte aligned. 2016-09-22 17:43:57 -07:00
Jeffrey Walton 399a1546de Add CRYPTOPP_ASSERT (Issue 277, CVE-2016-7420)
trap.h and CRYPTOPP_ASSERT has existed for over a year in Master. We deferred on the cut-over waiting for a minor version bump (5.7). We have to use it now due to CVE-2016-7420
2016-09-16 11:27:15 -04:00
Jeffrey Walton ada2aa55ed Fix typo on SunCC version 2016-08-26 05:08:57 -04:00
Jeffrey Walton 4fd51eb06c Add vec_swap for compilers which do not support std::swap'ing SSE and NEON types 2016-07-17 21:25:55 -04:00
Jeffrey Walton 1cb906938d Fix SunCC 12.2 and 12.3 failed compile in rijndael.cpp due to std::swap(__m128i, __m128i) 2016-07-16 23:45:16 -04:00
Jeffrey Walton ba2c778f1b Fix typo in SunCC check 2016-07-15 01:53:01 -04:00
Jeffrey Walton b099030c46 Fix broken rijndael.cpp compile under Sun Studio (Issue 224) 2016-07-15 00:40:13 -04:00
Jeffrey Walton c1f025343a Add C++11 alignas support. Deleting 'alignas' branch 2016-06-14 19:14:09 -04:00