MSVC AVX double-precision kernels fail tests
With MSVC 2010, AVX, and double precision, the regressiontest kernel tests fail:
FAILED. Check checkforce.out (260 errors) files in nb_kernel_ElecCoul_VdwCSTab_GeomW3P1
FAILED. Check checkforce.out (1176 errors) files in nb_kernel_ElecCoul_VdwCSTab_GeomW3W3
FAILED. Check checkforce.out (310 errors) files in nb_kernel_ElecCoul_VdwCSTab_GeomW4P1
FAILED. Check checkforce.out (1156 errors) files in nb_kernel_ElecCoul_VdwCSTab_GeomW4W4
FAILED. Check checkforce.out (260 errors) files in nb_kernel_ElecCoul_VdwLJ_GeomW3P1
FAILED. Check checkforce.out (1176 errors) files in nb_kernel_ElecCoul_VdwLJ_GeomW3W3
FAILED. Check checkforce.out (310 errors) files in nb_kernel_ElecCoul_VdwLJ_GeomW4P1
FAILED. Check checkforce.out (1156 errors) files in nb_kernel_ElecCoul_VdwLJ_GeomW4W4
FAILED. Check checkforce.out (1238 errors) files in nb_kernel_ElecCoul_VdwNone_GeomP1P1
FAILED. Check checkforce.out (568 errors) files in nb_kernel_ElecCoul_VdwNone_GeomW3P1
FAILED. Check checkforce.out (1176 errors) files in nb_kernel_ElecCoul_VdwNone_GeomW3W3
FAILED. Check checkforce.out (312 errors) files in nb_kernel_ElecCoul_VdwNone_GeomW4P1
FAILED. Check checkforce.out (1156 errors) files in nb_kernel_ElecCoul_VdwNone_GeomW4W4
13 out of 142 kernel tests FAILED
The same 13 kernels also fail with SSE4.1, but only when the /arch:AVX compiler flag is set. Without that flag, the SSE4.1 kernels pass the tests.
Fixes SSE/AVX compilation under Windows
- 32-bit MSVC cannot handle more than 3 xmm/ymm register
arguments due to stack alignment, so some group kernel
routines have been copied into optional macros. These are
only used for 32-bit MSVC compiles; other alternatives including
icc on windows use the normal functions that are easier to debug.
- Since the windows compilers control 32 vs 64 bit with flags, a
new log file entry has been added to show whether the present
build is a 32 or 64 bit one.
- Minor fixes to enable double precision AVX_128_FMA builds on
- Replace use of explicit binary constants with _MM_SHUFFLE()
macro in nbnxn kernels to make it compile under windows.
With these fixes, the SSE2, SSE4, and AVX256 group kernels pass
regressiontests in single and double with MSVC2010, MSVC2012, and
icc 2013.1 used with visual studio 2012. The nbnxn kernels pass
all tests with the exception of 32-bit double precision AVX_256
where all three compilers still fail (Refs #1097).
Fixes #1092, #1093, #1068.
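For readers unfamiliar with the _MM_SHUFFLE() change mentioned in the commit message, here is a minimal sketch of the idea. It assumes the "explicit binary constants" were 0b... literals, which some compilers accept as an extension but MSVC 2010/2012 do not; the helper name reverse4 is made up for illustration and is not taken from the nbnxn kernels.

```c
#include <xmmintrin.h>  /* SSE intrinsics and the _MM_SHUFFLE macro */

/* Hypothetical helper: reverse the four floats of a vector.
 * _MM_SHUFFLE(0, 1, 2, 3) expands to the immediate 0x1B (binary 00011011),
 * so the same shuffle is expressed without writing a binary literal. */
static void reverse4(const float in[4], float out[4])
{
    __m128 v = _mm_loadu_ps(in);
    /* Result lanes 0..3 take source lanes 3, 2, 1, 0 respectively. */
    __m128 r = _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 1, 2, 3));
    _mm_storeu_ps(out, r);
}
```

Calling reverse4 on {1, 2, 3, 4} should yield {4, 3, 2, 1}; the macro form is portable because it only relies on ordinary integer shifts and ORs.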
#1 Updated by Erik Lindahl about 8 years ago
This might have been fixed by removing the extra /arch:SSE2 that was always added to the release flags. Currently the group kernel tests work fine for me in 32-bit MSVC (yes, Virginia, I've hacked up a few macros to work around the problems in 32-bit Windows). Unfortunately, there seems to be a bug in the NBNXN kernels when using double precision AVX_256.
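The macro workaround mentioned above can be sketched roughly as follows. This is an illustration only, not the actual group kernel code: dot3_fn, DOT3_MACRO, DOT3, and dot3_scalar are hypothetical names, and the preprocessor test for a 32-bit MSVC build (_MSC_VER defined, _WIN64 not defined) is an assumption about how such a switch could be written.

```c
#include <emmintrin.h>  /* SSE2 double-precision intrinsics */

/* Macro form: expands in place, so no vector arguments are passed by value. */
#define DOT3_MACRO(ax, ay, az, bx, by, bz, result)                     \
    do {                                                               \
        (result) = _mm_add_pd(_mm_add_pd(_mm_mul_pd((ax), (bx)),       \
                                         _mm_mul_pd((ay), (by))),      \
                              _mm_mul_pd((az), (bz)));                 \
    } while (0)

#if defined(_MSC_VER) && !defined(_WIN64)
/* Assumed 32-bit MSVC build: use the macro, since more than three
 * xmm/ymm arguments by value are not accepted there. */
#define DOT3(ax, ay, az, bx, by, bz, result) \
    DOT3_MACRO(ax, ay, az, bx, by, bz, result)
#else
/* Everywhere else: an ordinary function, which is easier to debug. */
static __m128d dot3_fn(__m128d ax, __m128d ay, __m128d az,
                       __m128d bx, __m128d by, __m128d bz)
{
    return _mm_add_pd(_mm_add_pd(_mm_mul_pd(ax, bx), _mm_mul_pd(ay, by)),
                      _mm_mul_pd(az, bz));
}
#define DOT3(ax, ay, az, bx, by, bz, result) \
    ((result) = dot3_fn(ax, ay, az, bx, by, bz))
#endif

/* Small scalar wrapper so the result can be inspected without intrinsics. */
static double dot3_scalar(const double a[3], const double b[3])
{
    __m128d r;
    double  out;
    DOT3(_mm_set1_pd(a[0]), _mm_set1_pd(a[1]), _mm_set1_pd(a[2]),
         _mm_set1_pd(b[0]), _mm_set1_pd(b[1]), _mm_set1_pd(b[2]), r);
    _mm_store_sd(&out, r);  /* store the low lane */
    return out;
}
```

Both branches compute the same thing; the only difference is whether the six vectors cross a function-call boundary, which is the part 32-bit MSVC reportedly cannot align on the stack.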