SIMD group kernels don't work on 32bit Windows
Tested SSE2 and SSE4.1 but I assume the AVX ones are broken too.
Compilation fails with
error C2719: 'x2': formal parameter with __declspec(align('16')) won't be aligned
32-bit Windows doesn't allow aligned function parameters because of its calling conventions. Fixing this won't be easy and I suppose it won't be done (I think the only way would be changing all SSE functions to macros).
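To make the failure mode concrete, here is a minimal sketch in C of the two ways around a by-value `__m128` parameter list. The names `sum4_by_pointer`, `SUM4_MACRO`, and `demo_sum4` are hypothetical and do not come from the GROMACS sources. Passing by pointer is an assumption added for comparison; only the macro route is mentioned in this report.

```c
#include <emmintrin.h> /* SSE2 intrinsics; also pulls in the SSE header */

/* By-value form, shown only as a comment. With 32-bit MSVC a signature
 * like this is the kind that is reported as error C2719:
 *
 *   static __m128 sum4(__m128 x1, __m128 x2, __m128 x3, __m128 x4);
 */

/* Hypothetical alternative 1: pass pointers, so no aligned value has to
 * be placed on the argument stack. */
static __m128 sum4_by_pointer(const __m128 *x1, const __m128 *x2,
                              const __m128 *x3, const __m128 *x4)
{
    return _mm_add_ps(_mm_add_ps(*x1, *x2), _mm_add_ps(*x3, *x4));
}

/* Hypothetical alternative 2: a macro, which has no parameters in the
 * calling-convention sense because it expands at the call site. */
#define SUM4_MACRO(x1, x2, x3, x4) \
    _mm_add_ps(_mm_add_ps((x1), (x2)), _mm_add_ps((x3), (x4)))

/* Small driver: returns lane 0 of 1 + 2 + 3 + 4 computed either way. */
static float demo_sum4(int use_macro)
{
    __m128 a = _mm_set1_ps(1.0f);
    __m128 b = _mm_set1_ps(2.0f);
    __m128 c = _mm_set1_ps(3.0f);
    __m128 d = _mm_set1_ps(4.0f);
    __m128 r = use_macro ? SUM4_MACRO(a, b, c, d)
                         : sum4_by_pointer(&a, &b, &c, &d);
    return _mm_cvtss_f32(r);
}
```

Both variants compute the same value. The macro form is harder to step through in a debugger, which is the trade-off against keeping ordinary functions.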
But we should at least automatically disable all CPU acceleration for 32bit Windows and print a warning that this will have bad performance and that one should compile for 64bit if possible.
Ideally we would automatically compile for 64bit if possible, but it seems it isn't possible to change the cmake generator to 64bit by default from within CMakeLists.txt.
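The proposal above is a CMake-level check. As a rough illustration of the same idea at the source level instead, the following C sketch detects a 32-bit Windows target with the predefined `_WIN32` and `_WIN64` macros and prints a warning. `DEMO_SIMD_DISABLED`, `simd_disabled`, and `warn_if_simd_disabled` are hypothetical names, and the warning text is made up for the example.

```c
#include <stdio.h>

/* _WIN32 is defined for both 32-bit and 64-bit Windows targets;
 * _WIN64 is defined only for 64-bit ones. */
#if defined(_WIN32) && !defined(_WIN64)
#define DEMO_SIMD_DISABLED 1 /* hypothetical flag: 32-bit Windows build */
#else
#define DEMO_SIMD_DISABLED 0
#endif

/* Returns 1 when this build targets 32-bit Windows, 0 otherwise. */
static int simd_disabled(void)
{
    return DEMO_SIMD_DISABLED;
}

/* Prints a performance warning to `out` if acceleration was disabled. */
static void warn_if_simd_disabled(FILE *out)
{
    if (simd_disabled())
    {
        fprintf(out, "Warning: CPU acceleration is disabled on 32-bit "
                     "Windows; expect poor performance. Build for 64-bit "
                     "if possible.\n");
    }
}
```

A caller would run `warn_if_simd_disabled(stderr)` once at startup. A build-system check would still be needed to stop the SIMD kernels from being compiled at all.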
Fixes SSE/AVX compilation under Windows
- 32-bit MSVC cannot handle more than 3 xmm/ymm register
arguments due to stack alignment, so some group kernel
routines have been copied into optional macros. These are
only used for 32-bit MSVC compiles; other alternatives including
icc on windows use the normal functions that are easier to debug.
- Since the windows compilers control 32 vs 64 bit with flags, a
new log file entry has been added to show whether the present
build is a 32 or 64 bit one.
- Minor fixes to enable double precision AVX_128_FMA builds on
- Replace use of explicit binary constants with _MM_SHUFFLE()
macro in nbnxn kernels to make it compile under windows.
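Two items in the list above can be sketched in a few lines of C. The helper names `build_bits` and `swap_pairs` are hypothetical and do not reflect the actual GROMACS code. The particular shuffle pattern is only an example, and reading "explicit binary constants" as compiler-specific binary literals is an assumption.

```c
#include <limits.h>    /* CHAR_BIT */
#include <xmmintrin.h> /* _MM_SHUFFLE, _mm_shuffle_ps */

/* Pointer width of the current build in bits; 32 or 64 on the usual
 * targets. A value like this could be written to a log file. */
static int build_bits(void)
{
    return (int)(sizeof(void *) * CHAR_BIT);
}

/* Swap adjacent lanes: (x0, x1, x2, x3) -> (x1, x0, x3, x2).
 * _MM_SHUFFLE(2, 3, 0, 1) expands to the immediate 0xB1 without
 * relying on any compiler-specific literal syntax. */
static __m128 swap_pairs(__m128 v)
{
    return _mm_shuffle_ps(v, v, _MM_SHUFFLE(2, 3, 0, 1));
}
```

`_MM_SHUFFLE(z, y, x, w)` packs four 2-bit lane selectors as `(z << 6) | (y << 4) | (x << 2) | w`, so the selectors read from the highest lane to the lowest.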
With these fixes, the SSE2, SSE4, and AVX256 group kernels all pass
regressiontests in single and double with MSVC2010, MSVC2012, and
icc 2013.1 used with visual studio 2012. The nbnxn kernels pass
all tests with the exception of 32-bit double precision AVX_256
where all three compilers still fail (Refs #1097).
Fixes #1092, #1093, #1068.
#2 Updated by Roland Schulz about 7 years ago
I just tested it with VS2012 and it doesn't work there either. Are you sure it works with ICC? I don't have an ICC license, and the Jenkins Windows host only has 64bit ICC, so I can't test it. Microsoft claims the cause is the calling convention, in which case it shouldn't work with ICC either. But maybe you are right and this is a bug in MSVC, and the calling convention is just an excuse.
Like I said, I'm not suggesting to fix the error. We should just fix cmake to disable the acceleration and print a warning. So it would be good to know whether ICC works or not. As far as I know we don't support native GCC (MINGW). And cygwin shouldn't be an issue.