Commit Graph

29 Commits (5b9b9b8d08dc760b5762d8b49f892aed58adb013)

Author SHA1 Message Date
Jeffrey Walton 225ab6cb7b
Drop ChaCha requirements to POWER7
This costs about 0.6 cpb (700 MB/s on GCC112), but it makes the faster algorithm available to more machines. In the future we may want to provide both POWER7 and POWER8
2018-11-14 08:19:13 -05:00
Jeffrey Walton d9011f07d2
Add ChaCha AVX2 implementation (GH #735) 2018-11-08 16:20:31 -05:00
Jeffrey Walton 6cc763939e
Skip unneeded wrap check in SIMD book keeping (GH #732) 2018-11-04 15:35:34 -05:00
Jeffrey Walton 29be6ed97a
Work-around potential counter increment problem in ChaCha20 (GH #732)
This is only a work-around for the moment. The issue only affects SIMD code. The problem is, the algorithm we use performs a 32-bit add as an intermediate result, but we really need a 64-bit add. We are running 4 transforms in parallel, and we can't add and carry the way we need to.

The workaround is, whenever we could cross the 32-bit counter boundary we use the C version of the transform. We determine the cross-over point by 'bool safe = 0xffffffff - state.low > 4'. When not safe we skip the SIMD version of the algorithm and use the C version. Once we are safe again we use the SIMD version again.

The work-around costs us about 0.1 to 0.2 cpb. At 1.10 or 1.15 cpb that equates to about 200 MB/s on a Skylake. We'd like to get it back eventually.
2018-11-04 14:49:26 -05:00
Jeffrey Walton d7d76fa5f7
Add ChaCha Power8 implementation 2018-10-27 08:40:07 -04:00
Jeffrey Walton c0b273dac8
Remove xorInput parameter from ChaCha SIMD functions
We can use the input pointer directly after checking KeystreamOperation
2018-10-26 10:10:52 -04:00
Jeffrey Walton 8da2b91cba
Add ChaCha AlgorithmName override 2018-10-26 03:13:06 -04:00
Jeffrey Walton b4b3623938
Whitespace check-in 2018-10-25 12:15:33 -04:00
Jeffrey Walton b1050636a6
Add ChaCha NEON implementation 2018-10-25 12:08:32 -04:00
Jeffrey Walton b4c4c5aa14
Add SSSE3 rotates when available
This change obtains the remaining 0.1 to 0.15 cpb. It should be engaged with -march=native
2018-10-24 15:34:54 -04:00
Jeffrey Walton 18dcbdf514
Move input xor to ChaCha_OperateKeystream_SSE2
This picks up about 0.2 cpb in ChaCha::OperateKeystream. It may not sound like much but it puts SSE2 intrinsics version on par with the ASM version of Salsa20. Salsa20 leads ChaCha by 0.1 to 0.15 cpb, which equates to about 50 MB/s.
2018-10-24 11:00:35 -04:00
Jeffrey Walton d230999b40
Fix ChaCha compile on ARM and MIPS 2018-10-24 01:11:45 -04:00
Jeffrey Walton 6a5d2ab03d
Remove unneeded params from ChaCha_OperateKeystream_SSE2 2018-10-23 08:52:29 -04:00
Jeffrey Walton 028a9f0494
Remove old comments from chacha.cpp
This should have been done at 916c4484a2
2018-10-23 08:12:02 -04:00
Jeffrey Walton 916c4484a2
Add ChaCha SSE2 implementation
Thanks to Jack Lloyd and Botan for allowing us to use the implementation.
The numbers for SSE2 are very good. When compared with Salsa20 ASM the results are:
  * Salsa20 2.55 cpb; ChaCha/20 2.90 cpb
  * Salsa20/12 1.61 cpb; ChaCha/12 1.90 cpb
  * Salsa20/8 1.34 cpb; ChaCha/8 1.5 cpb
2018-10-23 07:57:59 -04:00
Jeffrey Walton 48f2d95b0f
Fix ChaCha debug builds
This broke at https://github.com/weidai11/cryptopp/commit/e2be0cdecce7
2018-08-18 01:31:35 -04:00
Jeffrey Walton e2be0cdecc
Make ChaCha an Salsa use the same design pattern 2018-08-17 06:19:30 -04:00
Jeffrey Walton 2f83777e9b
Backout ChaCha changes to Crypto++ 7.0
These changes made it in by accident at Commit b74a6f4445. We were going to try to let them ride but they broke versioning. They may be added later but we should avoid the change at this time.
2018-07-25 16:25:41 -04:00
Jeffrey Walton b74a6f4445
Add algorithm provider member function to Algorithm class 2018-07-06 09:23:37 -04:00
Jeffrey Walton a074722bfa
Switch to rotlConstant and rotrConstant
This will help Clang and its need for a constexpr
2017-11-25 02:52:19 -05:00
Jeffrey Walton 7851a0d510 Remove BOOL macro value (GH #462)
Currently the CRYPTOPP_BOOL_XXX macros set the macro value to 0 or 1. If we remove setting the 0 value (the #else part of the expression), then the self tests speed up by about 0.3 seconds. I can't explain it, but I have observed it repeatedly.
This check-in prepares for the removal in Upstream master
2017-08-20 21:25:29 -04:00
Jeffrey Walton 8c20630c2d
Remove extra preamble for copyright.
Similar text may be added in the future
2017-02-21 02:54:09 -05:00
Jeffrey Walton 54d17c7361
Updated CRYPTOPP_ASSERT based on comments
Also see 399a1546de (commitcomment-19448453)
2016-10-17 22:00:31 -04:00
Jeffrey Walton 91ca6c117d Change from NDEBUG to CRYPTOPP_DEBUG in source files to ensure all debug behavior pivots on CRYPTOPP_DEBUG, and not NDEBUG (Issue 277, CVE-2016-7420) 2016-09-16 14:51:48 -04:00
Jeffrey Walton 399a1546de Add CRYPTOPP_ASSERT (Issue 277, CVE-2016-7420)
trap.h and CRYPTOPP_ASSERT has existed for over a year in Master. We deferred on the cut-over waiting for a minor version bump (5.7). We have to use it now due to CVE-2016-7420
2016-09-16 11:27:15 -04:00
Jeffrey Walton 7378a1b86d Cleared analysis warning on use of boolean in arithmetic expression 2016-07-23 19:37:17 -04:00
Jeffrey Walton 433f2d6566 Remove branch in increment counter 2016-04-21 19:53:04 -04:00
Jeffrey Walton 38f6c33789 Update ChaCha to latest sources 2016-04-21 12:12:42 -04:00
Jeffrey Walton 0f702accfc Add ChaCha family of stream ciphers 2016-04-21 12:08:36 -04:00