Jeffrey Walton
7f86f498d6
Remove GCC_NO_UBSAN attribute
2018-07-01 01:02:33 -04:00
Jeffrey Walton
80ae9f4f0a
Add AVX512 rotates to RotateLeft and RotateRight templates
2018-06-22 17:44:16 -04:00
Fabrice Fontaine
3c01bcc352
Allow user to set -DCRYPTOPP_ARM_NEON_AVAILABLE=0 ( #595 )
...
Disable neon through -DCRYPTOPP_ARM_NEON_AVAILABLE=0,
replace "if defined(CRYPTOPP_ARM_NEON_AVAILABLE)" by
"if (CRYPTOPP_ARM_NEON_AVAILABLE)"
Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com>
2018-03-05 18:49:10 -05:00
Jeffrey Walton
e5a362c026
Re-add Simon and Speck, enable NEON and Aarch64 (GH #585 )
...
This commit re-adds Simon and Speck. The commit includes NEON, Aarch32 and Aarch64
2018-02-19 04:47:19 -05:00
Jeffrey Walton
5da795bf56
Whitespace check-in
2018-02-18 23:44:23 -05:00
Jeffrey Walton
e416b243d3
Re-add Simon and Speck, enable SSE (GH #585 )
...
This commit re-adds Simon and Speck. The commit includes C++, SSSE3 and SSE4. NEON, Aarch32 and Aarch64 are disabled at the moment.
2018-02-18 23:23:50 -05:00
Jeffrey Walton
15b14cc618
Remove Simon and Speck ciphers (GH #585 )
...
We recently learned our Simon and Speck implementation was wrong. The removal will stop harm until we can loop back and fix the issue.
The issue is, the paper, the test vectors and the ref-impl do not align. Each produces slightly different result. We followed the test vectors but they turned out to be wrong for the ciphers.
We have one kernel test vector but we don't have a working implementation to observe it to fix our implementation. Ugh...
2018-02-14 04:06:16 -05:00
Jeffrey Walton
541caa3978
Guard use of Aarch64 tbl instruction
2018-02-13 08:48:13 -05:00
Jeffrey Walton
db7b341f95
Fix Aarch64 RotateRight32<8> typo
2018-02-13 07:26:15 -05:00
Jeffrey Walton
5cee4a6573
Improve logic for <arm_acle.h> include (GH #568 )
2018-01-20 13:23:41 -05:00
Jeffrey Walton
fac3a44a84
Move Altivec AdvancedProcessBlocks into adv-simd.h
2018-01-02 07:08:13 -05:00
Jeffrey Walton
5f083d652e
Clear signed/unsigned warnings
2017-12-31 03:54:33 -05:00
Jeffrey Walton
4d9c91b425
Fix missing define for MSVC
2017-12-26 15:07:28 -05:00
Jeffrey Walton
4904d0fc8d
Fix unaligned load for _mm_loaddup_pd with GCC and UBsan
2017-12-26 14:55:10 -05:00
Jeffrey Walton
3fff9e85df
Fix unaligned load for _mm_loaddup_pd with GCC and UBsan
2017-12-26 12:41:04 -05:00
Jeffrey Walton
ae445c0b0f
Clear signed/unsigned warnings with GCC and -Wall -Wextra
2017-12-26 11:48:11 -05:00
Jeffrey Walton
e90cc9a028
Update comments
2017-12-10 05:41:19 -05:00
Jeffrey Walton
8a5911e6eb
Refactor <cipher>_AdvancedProcessBlocks_<arch> into adv-simd.h
...
This also fixes the SPECK64 bug where CTR mode self tests fail. It was an odd failure because it only affected 64-bit SPECK. SIMON was fine and it used nearly the same code. We tracked it down through trial and error to the table based rotates.
2017-12-09 21:04:25 -05:00
Jeffrey Walton
e457ca26f7
Add SSE3 <pmmintrin.h> for SImon and Speck
...
Add additional comments for WORKAROUND_GCC_OPTERON_ISSUE
2017-12-08 13:54:00 -05:00
Jeffrey Walton
148202369b
Fix Speck-64 CTR mode
...
It looks like the delay was due to some GCC 7 issue. We had to disable parallel blocks on Aarch64 with GCC 7. We may be running out of registers and that could be causing problems. It looks like GCC uses up to v30.
2017-12-07 22:30:03 -05:00
Jeffrey Walton
02037b5ce6
Fix Simon-64 CTR mode
...
This fixes CTR mode for Simon-64. We were only incrementing half the counters.
We still have Speck-64 to cleanup.
2017-12-07 19:45:32 -05:00
Jeffrey Walton
07f2a4fc3f
Fix Simon-64 and Speck-64 CTR mode
...
This fixes CTR mode for IA-32. We were only incrementing half the counters.
Added additional test vectors
2017-12-07 16:55:23 -05:00
Jeffrey Walton
86acc8ed45
Use 6x-2x-1x for Simon and Speck on IA-32
...
For Simon-64 and Speck-64 this means we are effectively using 12x-4x-1x. We are mostly at the threshold for IA-32 and parallelization. At any time 10 to 13 XMM registers are being used.
Prefer movsd by way of _mm_load_sd and _mm_store_sd.
Fix "error C3861: _mm_cvtsi128_si64x identifier not found".
2017-12-06 06:18:46 -05:00
Jeffrey Walton
e9654192f2
Remove unneeded temp[] array
2017-12-05 20:35:57 -05:00
Jeffrey Walton
490701acca
Use 12x-4x-1x for Simon and Speck on ARM
2017-12-05 18:43:53 -05:00
Jeffrey Walton
7bc621da62
Enable NEON/ASIMD for Simon and Speck on Aarch32/Aarch64 (GH #545 )
2017-12-05 14:02:48 -05:00
Jeffrey Walton
9b61d4143d
Add big- and little-endian rotates for Aarch32 and Aarch64
2017-12-05 12:32:26 -05:00
Jeffrey Walton
9faa504a24
Fix Aarch32 and Aarch64 rotates
2017-12-05 11:15:26 -05:00
Jeffrey Walton
c18793f862
Fix SIMON-64 missing transform
2017-12-05 09:14:58 -05:00
Jeffrey Walton
b208c8c1b4
Add 4 additional lanes to SPECK-64 for ARM
2017-12-05 07:16:34 -05:00
Jeffrey Walton
e09e6af1f8
Enable multi-block for SPECK-64 and SIMON-64
...
Also cleaned up SIMON-64 vector permute code. Thanks again to Peter Cordes
2017-12-05 04:19:44 -05:00
Jeffrey Walton
147ecba5df
Add temp working variable for SPECK64_AdvancedProcessBlocks_SSE41
...
Avoid potential undefined behavior by using aligned words
2017-12-04 14:52:36 -05:00
Jeffrey Walton
076937eb81
Update comments for vector permutes in SPECK-128
2017-12-04 12:31:32 -05:00
Jeffrey Walton
25709d2597
Fix SPECK64 vector permutes
...
Thanks to Peter Cordes for the suggestion on handling the case
2017-12-04 09:47:26 -05:00
Jeffrey Walton
46271660a1
Switch to uint64x2_t for SIMON-128
2017-12-04 05:47:34 -05:00
Jeffrey Walton
e9714b40d2
Switch to _mm_unpacklo_epi32 and _mm_unpackhi_epi32
...
The manual _mm_extract_epi32 and _mm_insert_epi32 are required during setup, be we can use SSE on teardown
2017-12-04 05:01:27 -05:00
Jeffrey Walton
cd31fa29dc
Switch to uint64x2_t for SPECK-128
2017-12-04 03:38:39 -05:00
Jeffrey Walton
1de143203e
Add SPECK-64 NEON intrinsics
2017-12-03 18:47:39 -05:00
Jeffrey Walton
f0e49785f6
Fix incorrect SPECK-128 decrypt when blocks >= 6
...
Add defines for CRYPTOPP_SPECK64_ADVANCED_PROCESS_BLOCKS and CRYPTOPP_SPECK128_ADVANCED_PROCESS_BLOCKS
2017-12-03 09:00:39 -05:00
Jeffrey Walton
6bb1f1d9c4
Add SPECK-64 SSE intrinsics
...
Performance went from about 11.9 cpb (C++) to about 4.5 cpb (SSE)
2017-12-03 02:28:40 -05:00
Jeffrey Walton
25493ded49
Add AVX512VL rotate support
2017-12-01 09:39:05 -05:00
Jeffrey Walton
a7fec9c0f6
Fix assert in Debug builds
...
This was copy/paste from the template function
2017-11-30 11:54:21 -05:00
Jeffrey Walton
22257c4b6e
Remove SunCC const cast workaround
...
This code does not suffer SunCC losing const-ness
2017-11-29 12:56:19 -05:00
Jeffrey Walton
39594a53b0
Add fast rotate-by-8 for Aarch32 and Aarch64
2017-11-29 12:33:34 -05:00
Jeffrey Walton
532f13fe53
Fix compile using SunCC 12.4
2017-11-29 12:10:19 -05:00
Jeffrey Walton
16ebfa72bf
Cleanup comments and whitespace
2017-11-29 10:15:41 -05:00
Jeffrey Walton
6e829cebee
Use EPI8 Shuffle rather than Shifts and Or for rotate when R=8
...
Louis Wingers and Bryan Weeks from the Simon and Speck team offered the suggestion. The change save 0.7 cpb for Speck, and 5 cpb for Simon on x86_64.
Speck is now running very close to the Team's time sor SSE4. Simon is still off, but we know the root cause. For Simon, the Team used a fast bit-sliced implementation
2017-11-29 08:53:48 -05:00
Jeffrey Walton
07c2047cec
Add simon-simd.cpp to file list and nmake file
2017-11-27 01:20:15 -05:00
Jeffrey Walton
4f2d6f713f
Switch to rotlConstant and rotrConstant
...
Update comments
2017-11-24 17:54:12 -05:00
Jeffrey Walton
2e63e46747
Fix Speck compile error with iOS Watch
2017-11-23 09:45:53 -05:00