Bug #174

MDRUN segfaults on an AMD64 system using Intel compilers and Intel MKL for FFT

Added by Jason Lander about 12 years ago. Updated about 11 years ago.

Status: Closed
Priority: Normal
Assignee: Erik Lindahl
Category: mdrun
Target version:
Affected version - extra info:
Affected version:
Difficulty: uncategorized

Description

Reporting on behalf of a GROMACS user of the UK National Grid Service.

When GROMACS is compiled on an AMD64 system using version 9.1 of the Intel
compilers and configured with

CC=icc CXX=icpc F77=ifort \
CPPFLAGS="-g -I$INTEL_MKL_INCLUDEDIR" \
LDFLAGS="-g -L$INTEL_MKL_LIBDIR -lmkl -lmkl_lapack" \
./configure --enable-fortran \
    --with-external-blas \
    --with-external-lapack \
    --with-fft=mkl --without-x --enable-float \
    --enable-mpi

mdrun dumps core. The best information I can find comes from GDB, as the
Intel debugger is not co-operating.

The fault comes after the call to...

gmx_fft_2d_real (fft=0x71ecb0, dir=GMX_FFT_REAL_TO_COMPLEX,
in_data=0x2aaaabd7da20, out_data=0x2aaaabd7da20) at gmx_fft_mkl.c:939
939 status = DftiComputeForward(fft->inplace1,in_data);

where fft is

$1 = {ndim = 2, nx = 48, ny = 72, nz = 0, real_fft = 1, inplace = {0x95eb70,
0x9600b0, 0x0}, ooplace = {0x95f610, 0x960b30, 0x961670, 0x0},
work = 0x9621b0}

My guess is that this is stack corruption: the address of in_data differs
before and after the crash. A possible source is the inplace array within the
fft structure: its elements are meant to be pointers, but the values look like
32-bit integers, and this is a 64-bit machine.
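A caveat on the 32-bit guess: in the GDB output above, fft itself is 0x71ecb0, which also fits in 32 bits, yet GDB dereferenced it and printed the structure. So a small value alone may not mean a pointer was truncated. The sketch below only classifies the addresses quoted in this report by width. The helper name fits_in_32_bits and the constant names are made up for illustration and are not part of GROMACS or MKL.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if no bits above bit 31 are set. */
static int fits_in_32_bits(uint64_t addr)
{
    return (addr >> 32) == 0;
}

/* Addresses copied verbatim from the GDB output in this report. */
static const uint64_t ADDR_FFT      = 0x71ecb0ULL;       /* fft (printed fine by GDB) */
static const uint64_t ADDR_INPLACE0 = 0x95eb70ULL;       /* first element of inplace  */
static const uint64_t ADDR_IN_DATA  = 0x2aaaabd7da20ULL; /* in_data                   */
```

With these values, fits_in_32_bits() returns 1 for both fft and the first inplace element and 0 for in_data. The inplace values are therefore in the same range as a pointer known to be valid, which neither confirms nor rules out corruption.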

Recompiling the code to use FFTW stops it crashing.

Attachment: gmx_fft_mkl.c (32.7 KB), Modified (fixed?) version of gmx_fft_mkl.c, added by Erik Lindahl, 11/15/2007 01:27 PM

Related issues

Is duplicate of GROMACS - Bug #195: MDRUN segfaults on Intel64 platform due to using wrong data type in MKL FFT (Closed, 05/14/2008)

History

#1 Updated by Ulf Markwardt about 12 years ago

I can confirm the problem. It occurs when compiling with the GNU (4.1), Pathscale (3.0), and Intel (10.0) compilers and MKL 9.1,
running SLES 10 on AMD Opteron.

Ulf Markwardt
TU Dresden
Center for Information Services and High Performance Computing (ZIH)

#2 Updated by Erik Lindahl about 12 years ago

Created an attachment (id=254)
Modified (fixed?) version of gmx_fft_mkl.c

#3 Updated by Erik Lindahl about 12 years ago

Hi,

Please try the attached file and see if that solves the problem. There was an error in an index that might have caused this bug as a side-effect.

#4 Updated by David van der Spoel over 11 years ago

Jason,

did Erik's patch fix the problem?

#5 Updated by Erik Lindahl about 11 years ago

This is probably solved by the fix for #195.
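The title of #195 points to a wrong data type, but the details of that fix are not in this thread. As a general illustration of how an integer-width mismatch can bite on a 64-bit platform, here is a minimal sketch. It assumes a library routine that reads 64-bit lengths and a caller that stores them as 32-bit ints. All function names are hypothetical stand-ins, not MKL or GROMACS calls. The dimensions 48 and 72 are the nx and ny values from the report.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a library routine that expects an array of
   64-bit lengths: it reads the first 8 bytes of the caller's buffer. */
static int64_t first_length_as_seen_by_library(const void *lengths)
{
    int64_t first;
    memcpy(&first, lengths, sizeof first);
    return first;
}

/* Mismatched caller: two 32-bit ints occupy the 8 bytes that the
   library interprets as a single 64-bit length. */
static int64_t misread_first_length(void)
{
    int32_t dims[2] = {48, 72};
    return first_length_as_seen_by_library(dims);
}

/* Matching caller: lengths stored with the width the library expects. */
static int64_t correct_first_length(void)
{
    int64_t dims[2] = {48, 72};
    return first_length_as_seen_by_library(dims);
}
```

correct_first_length() yields 48, while misread_first_length() yields a value that merges both dimensions and is far larger than 48. A transform set up with such a length could write well beyond the intended buffers, which would be consistent with the memory corruption suspected in the description.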
