Kernel010 in Single precision on Windows crashes with Segfault
I just tried to run the regression tests on Windows as well, and noticed that several of the complex (7/14), kernel (6/63) and pdb2gmx (32/42) tests fail. They all pass on Mac/Linux in all configurations. One cause is that Kernel010 segfaults in single precision; it seems to be OK in double precision. It segfaults with the latest 4.5 release in the "Compiler=msvc CompilerVersion=2010 host=bs_Win2008_64" and "Compiler=icc CompilerVersion=12.1.2 GMX_DLOPEN=OFF GMX_THREADS=OFF GMX_FFT_LIBRARY=fftpack host=bs_Win2008_64" configs. It is OK with "Compiler=msvc CompilerVersion=2008 GMX_DOUBLE=ON GMX_OPENMP=OFF GMX_THREADS=OFF GMX_FFT_LIBRARY=fftpack host=bs_Win2008_64".
New CPU detection & AVX/SSE code, removed raw assembly files.
Removed all raw assembly files and deprecated altivec support.
Removed support for NASM and other assemblers, and replaced
previous SSE detection code with a new module using CPUID instead.
Added detection for SSE2, SSE4.1, AVX 128-bit with FMA, and AVX 256-bit.
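For readers unfamiliar with CPUID-based detection, here is a minimal sketch of how feature bits from CPUID leaf 1 can be decoded. It is not the code from the patch: the struct and function names are hypothetical, and only the bit positions are taken from the Intel/AMD manuals. It decodes the FMA3 bit from leaf 1 only; flags reported in the extended leaves are not covered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature record; names are illustrative, not GROMACS identifiers. */
typedef struct {
    int sse2;
    int sse4_1;
    int avx;
    int fma;
} cpu_features_t;

/* Decode the ECX/EDX values returned by CPUID with EAX=1.
 * Bit positions:
 *   EDX bit 26 = SSE2
 *   ECX bit 19 = SSE4.1
 *   ECX bit 12 = FMA (FMA3)
 *   ECX bit 27 = OSXSAVE
 *   ECX bit 28 = AVX
 */
void decode_cpuid_leaf1(unsigned int ecx, unsigned int edx, cpu_features_t *out)
{
    int osxsave;

    assert(out != NULL);

    osxsave     = (int)((ecx >> 27) & 1u);
    out->sse2   = (int)((edx >> 26) & 1u);
    out->sse4_1 = (int)((ecx >> 19) & 1u);
    /* AVX is only reported as usable when the OS has enabled XSAVE.
     * A complete check would also read XCR0 via XGETBV; omitted here. */
    out->avx    = osxsave && ((ecx >> 28) & 1u);
    out->fma    = out->avx && ((ecx >> 12) & 1u);
}
```

The register values would normally come from the compiler's CPUID wrapper, for example `__get_cpuid(1, &eax, &ebx, &ecx, &edx)` from `<cpuid.h>` with GCC/Clang, or `__cpuid(regs, 1)` from `<intrin.h>` with MSVC. Keeping the decoding separate from the instruction itself makes it testable on any host.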
Added Cmake detection of build platform based on CPUID, and output this
to the log file. The executables now compare the compile-time platform
and selected acceleration with the run-time platform and most suitable
acceleration and warn the user if they do not match. The compiler
detection code has also been reordered slightly to produce more readable
warnings when OpenMP is not available, and correctly disable pragma
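The compile-time versus run-time comparison described above could look roughly like the following sketch. All names here are hypothetical, and the strictly linear ordering of acceleration levels is a simplifying assumption (real hardware does not order AVX_128_FMA and AVX_256 this neatly); the actual check in the patch may differ.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical acceleration levels, ordered from least to most capable.
 * Assumption: a simple linear ordering, for illustration only. */
typedef enum {
    ACCEL_NONE = 0,
    ACCEL_SSE2,
    ACCEL_SSE4_1,
    ACCEL_AVX_128_FMA,
    ACCEL_AVX_256
} accel_t;

static const char *accel_name(accel_t a)
{
    static const char *names[] = {
        "None", "SSE2", "SSE4.1", "AVX_128_FMA", "AVX_256"
    };
    return names[a];
}

/* Compare the acceleration the binary was built with against the best one
 * the current host supports, and write a note to the log.
 * Returns  0 if they match,
 *          1 if the host could run something faster than the binary uses,
 *         -1 if the binary needs more than the host supports. */
int check_acceleration(accel_t compiled, accel_t runtime_best, FILE *log)
{
    assert(log != NULL);

    fprintf(log, "Compiled acceleration: %s, best for this host: %s\n",
            accel_name(compiled), accel_name(runtime_best));

    if (compiled > runtime_best) {
        fprintf(log, "WARNING: binary requires %s, which this host lacks.\n",
                accel_name(compiled));
        return -1;
    }
    if (compiled < runtime_best) {
        fprintf(log, "NOTE: rebuilding with %s may improve performance.\n",
                accel_name(runtime_best));
        return 1;
    }
    return 0;
}
```

A caller would pass the level fixed at build time and the level found at start-up, for example `check_acceleration(ACCEL_SSE2, ACCEL_AVX_256, stderr)`.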
Added intrinsics code and math functions for SSE2, SSE4.1, AVX128/256
both in single and double precision. All math functions and permutation
code have been tested & verified. Single precision math functions are
correct apart from the least significant bit, and double precision has
roughly twice the accuracy.
This has forced me to temporarily disable the SSE & Fortran acceleration.
SSE will be added back soon based on new intrinsics-only kernels currently
in testing, and we will test if Fortran still makes sense then.
Finally, the patch includes a modification to gmx_rmsdist where
a regression issue was introduced recently by using sqrtf() for
the norm function. This caused the intel compiler to produce slightly
different results at high optimization levels, which became evident here.
Closes #926 - Raw assembly code has been removed.
Refs #923 - Old kernels removed, new will be added shortly.
Fixes #914 - Cmake now does architecture-specific optimization.
Fixes #912, #913
Fixes #857 - We detect rdtscp support with CPUID and use it if possible.
Closes #537, #574 - Altivec is now deprecated.
#1 Updated by Roland Schulz over 7 years ago
The non-segfault errors in the regression-tests were caused by a bug in the perl script fixed by https://gerrit.gromacs.org/#/c/794/. This leaves only segfaults: 6/14 in complex (aminoacids, argon, butane, dec+water, fe_test, tip4pflex) , 1/63 in kernel (kernel010), and 32/42 in pdb2gmx.
#2 Updated by Roland Schulz over 7 years ago
The Jenkins test results: http://jenkins.gromacs.org/job/Gromacs_Gerrit_4_5_gmxTestXML/7/testReport/ (without the 794 fix)
#3 Updated by Roland Schulz over 7 years ago
It is also OK with the C kernel or when compiled as 32-bit. For 4.6, #923 will probably fix this issue automatically. But we probably still need to fix it for the 4.5 branch somehow. It seems that the stack gets trashed before the segfault (stack depth is 1), and I'm not aware of any memory debugger for 64-bit Windows that could show where the kernel starts to go wrong.
#5 Updated by Roland Schulz over 7 years ago
On Linux, reverting 0b8f869fe63dfe2bbb7 fixes the issue, but it doesn't fix the issue on Windows. Thus these are different problems.
Also, using a different NASM version doesn't fix the problem on Windows. The segfault happens with 2.08.02, 2.09.10, and 2.10.
Is FAH only using 32-bit binaries? If not, why is this error not affecting FAH?
#9 Updated by Erik Lindahl over 7 years ago
Roland Schulz wrote:
But it is affecting 4.6 too.
As you noticed in the third comment, this will be fixed by #923.
However, sorry for getting a little grumpy about this, but we are MONTHS past the complete feature freeze and reporting of new bugs for 4.6. Even if this would have affected 4.6, it would not have been fixed for 4.6.0 unless somebody volunteered. Please help with addressing currently open Redmine issues, but new issues opened now cannot have 4.6.0 as a target anymore.
#10 Updated by Roland Schulz over 7 years ago
True. I overlooked the #923 fix when writing that. Sorry.
Since single precision on 64-bit Windows seems never to have worked with SSE, it isn't a regression either.
My main interest in this was to keep the Jenkins builds from all failing. I have now simply changed the single-precision configurations on Windows so that they are built with GMX_ACCELERATION=none.
#11 Updated by Peter Kasson over 7 years ago
Erik's comments (and the #923 fix) stand, but you are correct that, to my knowledge, no one is using Gromacs on 64-bit Windows yet. We did some test builds for FAH, but the cores there remain 32-bit at the moment. That's something we should change (memory limitations with 32-bit get in the way), but it hasn't happened yet.