Bug #298

Segfault in a parallel particle decomposition run

Added by Ondrej Marsalek over 10 years ago. Updated over 10 years ago.

Status: Closed
Priority: Normal
Assignee: Erik Lindahl
Category: mdrun
Difficulty: uncategorized

Description

Created an attachment (id=354)
tpr file that produces the error

I have encountered a segmentation fault with Gromacs 4.0.3. It only
happens in a parallel run (particle decomposition, open boundary
conditions); the same setup runs fine in serial. It also runs fine if I switch the thermostat from v-rescale to Berendsen.

The command that produces the error was:
mpirun -np 2 mdrun_d -pd -s w.tpr

I use OpenMPI 1.3 and the Intel compiler on a Core2 machine. I can provide more information if needed.

I attach the tpr file.

The stack trace was:

starting mdrun 'Neat water'
500000 steps, 500.0 ps.
[isis:23154] *** Process received signal ***
[isis:23154] Signal: Segmentation fault (11)
[isis:23154] Signal code: Address not mapped (1)
[isis:23154] Failing at address: (nil)
[isis:23154] [ 0] /lib/libpthread.so.0 [0x7f41e64050f0]
[isis:23154] [ 1] /opt/gromacs-4.0.3/double/lib/libmd_mpi_d.so.5(vrescale_tcoupl+0x184) [0x7f41e8b3048c]
[isis:23154] [ 2] /opt/gromacs-4.0.3/double/lib/libmd_mpi_d.so.5(update+0x1fa) [0x7f41e8bad03c]
[isis:23154] [ 3] mdrun_d(do_md+0x25f9) [0x416ed1]
[isis:23154] [ 4] mdrun_d(mdrunner+0xf35) [0x419bd3]
[isis:23154] [ 5] mdrun_d(main+0x55f) [0x41a35f]
[isis:23154] [ 6] /lib/libc.so.6(__libc_start_main+0xe6) [0x7f41e60a2466]
[isis:23154] [ 7] mdrun_d [0x4062d9]
[isis:23154] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 23154 on node isis exited
on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

w.tpr (7.55 KB): tpr file that produces the error. Ondrej Marsalek, 02/22/2009 12:51 PM

History

#1 Updated by Berk Hess over 10 years ago

This bug is not directly related to particle decomposition.
The thermostat integral flag (estTC_INT) for the state was not set with pbc=none.
With particle decomposition this causes a segmentation fault.
I fixed it.
The fix is below if you need it.

Berk

RCS file: /home/gmx/cvs/gmx/src/mdlib/init.c,v
retrieving revision 1.59.2.1
diff -r1.59.2.1 init.c
125,127c125,127
<     if (ir->etc == etcNOSEHOOVER || ir->etc == etcVRESCALE) {
<       state->flags |= (1<<estTC_INT);
<     }
---
>   }
>   if (ir->etc == etcNOSEHOOVER || ir->etc == etcVRESCALE) {
>     state->flags |= (1<<estTC_INT);
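
For readers without the surrounding source, the effect of the diff above can be sketched in a few lines of standalone C. This is only an illustration, not the actual init.c code: the struct names, the enum values, the SKETCH_BOX_BIT flag and all three functions are hypothetical stand-ins, and the sketch assumes that the block the check was moved out of was conditional on periodic boundaries, which the diff itself does not show. Only etcNOSEHOOVER, etcVRESCALE and estTC_INT are names taken from the diff.

```c
/* Hypothetical, simplified stand-ins; NOT the real GROMACS definitions. */
enum { etcNO, etcBERENDSEN, etcNOSEHOOVER, etcVRESCALE }; /* thermostat type */
enum { SKETCH_PBC_XYZ, SKETCH_PBC_NONE };                 /* boundary type   */
enum { SKETCH_BOX_BIT, estTC_INT };                       /* flag bit index  */

typedef struct {
    int etc;  /* thermostat selection */
    int pbc;  /* boundary conditions  */
} inputrec_sketch;

/* Returns 1 if the thermostat selected in ir needs the integral flag. */
static int needs_tc_integral(const inputrec_sketch *ir)
{
    return ir->etc == etcNOSEHOOVER || ir->etc == etcVRESCALE;
}

/* Before the fix (assumed layout): the thermostat check sits inside a
 * block that is skipped for pbc=none, so the flag is never set there. */
static int state_flags_before(const inputrec_sketch *ir)
{
    int flags = 0;
    if (ir->pbc != SKETCH_PBC_NONE) {
        flags |= (1 << SKETCH_BOX_BIT);
        if (needs_tc_integral(ir)) {
            flags |= (1 << estTC_INT);
        }
    }
    return flags;
}

/* After the fix: the check follows the closing brace of that block,
 * so it runs regardless of the boundary conditions. */
static int state_flags_after(const inputrec_sketch *ir)
{
    int flags = 0;
    if (ir->pbc != SKETCH_PBC_NONE) {
        flags |= (1 << SKETCH_BOX_BIT);
    }
    if (needs_tc_integral(ir)) {
        flags |= (1 << estTC_INT);
    }
    return flags;
}

/* Returns 1 if the thermostat integral bit is set in flags. */
static int has_tc_integral(int flags)
{
    return (flags & (1 << estTC_INT)) != 0;
}
```

With etc set to etcVRESCALE and pbc set to SKETCH_PBC_NONE, state_flags_before() leaves the estTC_INT bit clear while state_flags_after() sets it. Switching to etcBERENDSEN never sets the bit in either version, which would be consistent with the report that the Berendsen run did not crash.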
