Commit Graph

10 Commits (4792578f09521968b88a9747c00eb4b462074100)

Author SHA1 Message Date
Jeffrey Walton 4792578f09
Rearrange statements and avoid intermediates
The folding of statements helps GCC eliminate some of the intermediate stores it was performing. The elimination saved about 1.0 cpb. SIMON-128 is now running around 10 cpb, but it is still off the Simon and Speck team's numbers of 3.5 cpb.
2017-12-01 04:11:31 -05:00
Jeffrey Walton b7ced67892
Update comments 2017-12-01 02:38:19 -05:00
Jeffrey Walton a7fec9c0f6
Fix assert in Debug builds
This was copy/paste from the template function
2017-11-30 11:54:21 -05:00
Jeffrey Walton 22257c4b6e
Remove SunCC const cast workaround
This code does not suffer SunCC losing const-ness
2017-11-29 12:56:19 -05:00
Jeffrey Walton 39594a53b0
Add fast rotate-by-8 for Aarch32 and Aarch64 2017-11-29 12:33:34 -05:00
Jeffrey Walton 532f13fe53
Fix compile using SunCC 12.4 2017-11-29 12:10:19 -05:00
Jeffrey Walton 16ebfa72bf
Cleanup comments and whitespace 2017-11-29 10:15:41 -05:00
Jeffrey Walton 6e829cebee
Use EPI8 Shuffle rather than Shifts and Or for rotate when R=8
Louis Wingers and Bryan Weeks from the Simon and Speck team offered the suggestion. The change saves 0.7 cpb for Speck, and 5 cpb for Simon on x86_64.
Speck is now running very close to the Team's times for SSE4. Simon is still off, but we know the root cause. For Simon, the Team used a fast bit-sliced implementation.
2017-11-29 08:53:48 -05:00
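The idea behind the commit above is that a rotation by 8 bits moves whole bytes, so it can be done with a single byte permutation instead of two shifts and an OR. The following is a minimal portable sketch of that equivalence for one 64-bit word. It is not the code from the commit: the function names `rotl8_shift`, `rotl8_bytes` and `check_rotl8` are hypothetical, the rotation direction (left) is chosen only for illustration, and the actual SIMD version would use a byte-shuffle intrinsic such as `_mm_shuffle_epi8` with a mask that is not shown in this log.

```cpp
#include <cassert>
#include <cstdint>

// Rotate left by 8 with the usual two shifts and an OR.
inline std::uint64_t rotl8_shift(std::uint64_t x) {
    return (x << 8) | (x >> 56);
}

// The same rotation written as a byte permutation. Bytes are extracted
// with shifts, so the result does not depend on host endianness.
inline std::uint64_t rotl8_bytes(std::uint64_t x) {
    std::uint8_t in[8], out[8];
    for (int i = 0; i < 8; ++i)
        in[i] = static_cast<std::uint8_t>(x >> (8 * i));  // in[0] is the low byte

    // Each byte moves up one position; the top byte wraps to the bottom.
    for (int i = 0; i < 8; ++i)
        out[i] = in[(i + 7) % 8];

    std::uint64_t r = 0;
    for (int i = 0; i < 8; ++i)
        r |= static_cast<std::uint64_t>(out[i]) << (8 * i);
    return r;
}

// Hypothetical self-check: both forms agree on a sample value.
inline void check_rotl8() {
    const std::uint64_t x = 0x0123456789ABCDEFULL;
    assert(rotl8_shift(x) == rotl8_bytes(x));
}
```

In a vectorized build the loop over `out[i] = in[(i + 7) % 8]` collapses into one shuffle instruction applied to every 64-bit lane, which is presumably where the saving reported in the commit message comes from.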
Jeffrey Walton a29b36c197
Whitespace check-in 2017-11-27 01:51:27 -05:00
Jeffrey Walton 568e608ea6
Add NEON and ASIMD intrinsics for SPECK-128 (GH #539)
Performance increased by about 200% on a 980 MHz BananaPi dev-board. The cost dropped from about 176.6 cpb to about 60.3 cpb.
2017-11-27 00:36:45 -05:00
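The cpb (cycles per byte) figures in these messages are costs, so lower is faster. A small sketch of the arithmetic relating them to speedup and throughput follows. The helper names are hypothetical, and the throughput conversion assumes the board actually runs at the stated 980 MHz for the whole measurement.

```cpp
#include <cassert>

// Speedup factor between two cycles-per-byte figures (lower cpb is faster).
inline double speedup(double old_cpb, double new_cpb) {
    return old_cpb / new_cpb;
}

// Throughput increase expressed as a percentage of the old throughput.
inline double percent_increase(double old_cpb, double new_cpb) {
    return (speedup(old_cpb, new_cpb) - 1.0) * 100.0;
}

// Throughput in bytes per second at a given clock rate (assumed constant).
inline double bytes_per_second(double clock_hz, double cpb) {
    return clock_hz / cpb;
}

// Hypothetical self-check using the figures quoted in the commit message.
inline void check_cpb() {
    const double s = speedup(176.6, 60.3);
    assert(s > 2.9 && s < 3.0);
}
```

With the quoted numbers, `speedup(176.6, 60.3)` is a little over 2.9, which corresponds to a throughput increase of roughly 193% and matches the "about 200%" in the message. At 980 MHz, 60.3 cpb works out to a bit over 16 million bytes per second.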