Commit Graph

18 Commits (9faa504a2444e4f53b6f1b3de87519e52bbfa804)

Author SHA1 Message Date
Jeffrey Walton 9faa504a24
Fix Aarch32 and Aarch64 rotates 2017-12-05 11:15:26 -05:00
Jeffrey Walton c18793f862
Fix SIMON-64 missing transform 2017-12-05 09:14:58 -05:00
Jeffrey Walton 4990ffe5b8
Add SIMON-64 NEON intrinsics 2017-12-05 08:53:57 -05:00
Jeffrey Walton e09e6af1f8
Enable multi-block for SPECK-64 and SIMON-64
Also cleaned up SIMON-64 vector permute code. Thanks again to Peter Cordes
2017-12-05 04:19:44 -05:00
Jeffrey Walton 46271660a1
Switch to uint64x2_t for SIMON-128 2017-12-04 05:47:34 -05:00
Jeffrey Walton f0e49785f6
Fix incorrect SPECK-128 decrypt when blocks >= 6
Add defines for CRYPTOPP_SPECK64_ADVANCED_PROCESS_BLOCKS and CRYPTOPP_SPECK128_ADVANCED_PROCESS_BLOCKS
2017-12-03 09:00:39 -05:00
Jeffrey Walton 081afde0fd
Add SIMON-64 SSE intrinsics
Performance went from about 29 cpb (C++) to about 11.1 cpb (SSE)
2017-12-03 04:10:55 -05:00
Jeffrey Walton 25493ded49
Add AVX512VL rotate support 2017-12-01 09:39:05 -05:00
Jeffrey Walton 4792578f09
Rearrange statements and avoid intermediates
The folding of statements helps GCC elimate some of the intermediate stores it was performing. The elimination saved about 1.0 cpb. SIMON-128 is now running around 10 cpb, but it is still off the Simon and Speck team's numbers of 3.5 cpb
2017-12-01 04:11:31 -05:00
Jeffrey Walton b7ced67892
Update comments 2017-12-01 02:38:19 -05:00
Jeffrey Walton a7fec9c0f6
Fix assert in Debug builds
This was copy/paste from the template function
2017-11-30 11:54:21 -05:00
Jeffrey Walton 22257c4b6e
Remove SunCC const cast workaround
This code does not suffer SunCC losing const-ness
2017-11-29 12:56:19 -05:00
Jeffrey Walton 39594a53b0
Add fast rotate-by-8 for Aarch32 and Aarch64 2017-11-29 12:33:34 -05:00
Jeffrey Walton 532f13fe53
Fix compile using SunCC 12.4 2017-11-29 12:10:19 -05:00
Jeffrey Walton 16ebfa72bf
Cleanup comments and whitespace 2017-11-29 10:15:41 -05:00
Jeffrey Walton 6e829cebee
Use EPI8 Shuffle rather than Shifts and Or for rotate when R=8
Louis Wingers and Bryan Weeks from the Simon and Speck team offered the suggestion. The change save 0.7 cpb for Speck, and 5 cpb for Simon on x86_64.
Speck is now running very close to the Team's time sor SSE4. Simon is still off, but we know the root cause. For Simon, the Team used a fast bit-sliced implementation
2017-11-29 08:53:48 -05:00
Jeffrey Walton a29b36c197
Whitespace check-in 2017-11-27 01:51:27 -05:00
Jeffrey Walton 568e608ea6
Add NEON and ASIMD intrinsics for SPECK-128 (GH #539)
Performance increased by about 200% on a 980 MHz BananaPi dev-board. Throughput went from about 176.6 cpb to about 60.3 cpb.
2017-11-27 00:36:45 -05:00