Formerly the ARM code favored CPU probes with SIGILLs. We've found its ineffiient on most platforms and dangerous on Apple platforms. This commit splits feature probes into CPU_QueryXXX(), which asks the OS if a feature is present. The detection code then falls back to CPU_ProbeXXX() using SIGILLs as a last resort.
We only need to base it on the compiler in config.h. config.h activates the code path guarded by HasNEON(). The source file that actially provides the NEON implementation will be compiled with -fpu=neon or -march=armv8-a.
Since we are providing the specialized implementation in a sequestered source file (and not a header file), we can probably avoid the defines like CRYPTOPP_ARM_NEON_AVAILABLE altogether.
Wrap DetectArmFeatures and DetectX86Features in InitializeCpu class
Use init_priority for InitializeCpu
Remove HAVE_GCC_CONSTRUCTOR1 and HAVE_GCC_CONSTRUCTOR0
Use init_seg(<name>) on Windows and explicitly insert at XCU segment
Simplify logic for HAVE_GAS
Remove special recipies for MACPORTS_GCC_COMPILER
Move C++ static initializers into anonymous namespace when possible
Add default NullNameValuePairs ctor for Clang
When MSVC init_seg or GCC init_priority is available, we don't need to use the Singleton. We only need to create a file scope class variable and place it in the segment for MSVC or provide the attribute for GCC.
An additional upside is we cleared all the memory leaks that used to be reported by MSVC for debug builds.
The macros that invoke GCC inline ASM have better code generation and speedup GCM ops by about 70 MiB/s on an Opteron 1100. The intrinsics are still available for Windows platforms and Visual Studio 2017 and above
It appears Apple Clang disgorges carryless multiply (PMULL) from Crypto (AES and SHA). The breakout added CRYPTOPP_BOOL_ARM_PMULL_INTRINSICS_AVAILABLE for PMULL, and retained CRYPTOPP_BOOL_ARM_CRYPTO_INTRINSICS_AVAILABLE for AES and SHA only
This should not be needed, but it does not hurt. According to Ian Lance Taylor (http://gcc.gnu.org/ml/gcc-help/2014-02/msg00023.html), the CC clobber causes GCC to forget its internal representation of flag state. It should not be needed for cpuid. However, Clang has some odd behave in a couple of versions of its compiler when using cpuid. Both JW and UB experienced it on separate occassions.