Commit Graph

22 Commits (076937eb81b8bf1a7703894efee6fb90247874cc)

Author SHA1 Message Date
Jeffrey Walton 076937eb81
Update comments for vector permutes in SPECK-128 2017-12-04 12:31:32 -05:00
Jeffrey Walton 25709d2597
Fix SPECK64 vector permutes
Thanks to Peter Cordes for the suggestion on handling the case
2017-12-04 09:47:26 -05:00
Jeffrey Walton 46271660a1
Switch to uint64x2_t for SIMON-128 2017-12-04 05:47:34 -05:00
Jeffrey Walton e9714b40d2
Switch to _mm_unpacklo_epi32 and _mm_unpackhi_epi32
The manual _mm_extract_epi32 and _mm_insert_epi32 are required during setup, but we can use SSE on teardown
2017-12-04 05:01:27 -05:00
Jeffrey Walton cd31fa29dc
Switch to uint64x2_t for SPECK-128 2017-12-04 03:38:39 -05:00
Jeffrey Walton 1de143203e
Add SPECK-64 NEON intrinsics 2017-12-03 18:47:39 -05:00
Jeffrey Walton f0e49785f6
Fix incorrect SPECK-128 decrypt when blocks >= 6
Add defines for CRYPTOPP_SPECK64_ADVANCED_PROCESS_BLOCKS and CRYPTOPP_SPECK128_ADVANCED_PROCESS_BLOCKS
2017-12-03 09:00:39 -05:00
Jeffrey Walton 6bb1f1d9c4
Add SPECK-64 SSE intrinsics
Performance went from about 11.9 cpb (C++) to about 4.5 cpb (SSE)
2017-12-03 02:28:40 -05:00
Jeffrey Walton 25493ded49
Add AVX512VL rotate support 2017-12-01 09:39:05 -05:00
Jeffrey Walton a7fec9c0f6
Fix assert in Debug builds
This was copy/paste from the template function
2017-11-30 11:54:21 -05:00
Jeffrey Walton 22257c4b6e
Remove SunCC const cast workaround
This code does not suffer SunCC losing const-ness
2017-11-29 12:56:19 -05:00
Jeffrey Walton 39594a53b0
Add fast rotate-by-8 for AArch32 and AArch64 2017-11-29 12:33:34 -05:00
Jeffrey Walton 532f13fe53
Fix compile using SunCC 12.4 2017-11-29 12:10:19 -05:00
Jeffrey Walton 16ebfa72bf
Cleanup comments and whitespace 2017-11-29 10:15:41 -05:00
Jeffrey Walton 6e829cebee
Use EPI8 Shuffle rather than Shifts and Or for rotate when R=8
Louis Wingers and Bryan Weeks from the Simon and Speck team offered the suggestion. The change saves 0.7 cpb for Speck and 5 cpb for Simon on x86_64.
Speck is now running very close to the Team's time for SSE4. Simon is still off, but we know the root cause: for Simon, the Team used a fast bit-sliced implementation.
2017-11-29 08:53:48 -05:00
Jeffrey Walton 07c2047cec
Add simon-simd.cpp to file list and nmake file 2017-11-27 01:20:15 -05:00
Jeffrey Walton 4f2d6f713f
Switch to rotlConstant and rotrConstant
Update comments
2017-11-24 17:54:12 -05:00
Jeffrey Walton 2e63e46747
Fix Speck compile error with iOS Watch 2017-11-23 09:45:53 -05:00
Jeffrey Walton 304809a65d
Add NEON and ASIMD intrinsics for SPECK-128 (GH #538)
Performance increased by about 115% on a 980 MHz BananaPi dev-board. Throughput went from about 46.2 cpb to about 21.5 cpb.
2017-11-23 02:47:44 -05:00
Jeffrey Walton f5784c1634
Update comments 2017-11-22 17:35:59 -05:00
Jeffrey Walton f2bc3cd0ca
Add speck-simd.cpp to project files (GH #538, #539)
Cleaned up whitespace
2017-11-22 08:45:38 -05:00
Jeffrey Walton e7fee716d6
Add SSSE3 intrinsics for SPECK-128 (GH #538)
Performance increased by about 100% on a 3.1 GHz Core i5 Skylake. Throughput went from about 7.3 cpb to about 3.5 cpb. Not bad for a software-based implementation of a block cipher
2017-11-22 08:01:41 -05:00
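
Note on commit 6e829cebee (rotate when R=8): rotating a 64-bit word by 8 bits only moves whole bytes, so the two shifts and the OR can be replaced by a single byte permutation. The portable C++ sketch below illustrates that equivalence. It is not the library's code: the function names rotl8_shift and rotl8_bytes are hypothetical, and a real SIMD version would presumably apply the same permutation with one byte-shuffle instruction (for example SSSE3's _mm_shuffle_epi8) instead of the loops shown here.

```cpp
#include <array>
#include <cstdint>

// Reference version: rotate left by 8 using two shifts and an OR.
// (Hypothetical name, for illustration only.)
inline std::uint64_t rotl8_shift(std::uint64_t x) {
    return (x << 8) | (x >> 56);
}

// Same rotation written as a byte permutation. Byte i of the input
// (counting from the least significant byte) lands in byte (i + 1) % 8
// of the output. (Hypothetical name, for illustration only.)
inline std::uint64_t rotl8_bytes(std::uint64_t x) {
    std::array<std::uint8_t, 8> in{}, out{};

    // Split into bytes; in[0] is the least significant byte.
    for (int i = 0; i < 8; ++i)
        in[i] = static_cast<std::uint8_t>(x >> (8 * i));

    // Permute: every byte moves up one position, the top byte wraps to 0.
    for (int i = 0; i < 8; ++i)
        out[(i + 1) % 8] = in[i];

    // Reassemble the word from the permuted bytes.
    std::uint64_t r = 0;
    for (int i = 0; i < 8; ++i)
        r |= static_cast<std::uint64_t>(out[i]) << (8 * i);
    return r;
}
```

Both functions return the same value for every input. The permutation form matters for SIMD because a shuffle is one instruction per vector, while the shift form needs two shifts plus an OR.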