Bug #285

new mdp option "nstlist=-1" doesn't work with parallel runs

Added by Lanyuan Lu over 10 years ago. Updated over 10 years ago.

Status: Closed
Priority: Normal
Assignee: Erik Lindahl
Category: mdrun
Target version:
Affected version - extra info:
Affected version:
Difficulty: uncategorized

Description

Created an attachment (id=347)
Input files

The following is my post to the gmx-user mailing list:
Hello,
Again I have a problem regarding the "nstlist=-1" option and this time it's related to parallel runs.
With help from Berk, I can run my system with the nstlist=-1 option on 1 cpu without any problem. However, each time I run it on more than one cpu under MPI, the run crashes immediately. The input files are the same for both serial and parallel runs. I tried changing to nstlist=1, and this time it ran OK both in serial and in parallel. So it seems to me there is a problem with the automatic neighbor list updating in parallel runs. I tried both domain decomposition and particle decomposition with "nstlist=-1" and in both cases the run crashed.
Could Berk or anyone else have a look at it?
Thanks very much.
Lanyuan

The attachment contains input files for the test. The commands are:
grompp -f sample.mdp -c system.pdb -p A8H.top -o a.tpr -n index.ndx
mpirun -np 2 -machinefile host mdrun -s a.tpr -table table.xvg
In my MPI environment I got:
"mdrun:31199 terminated with signal 11 at PC=6115d6 SP=7fbfffc520. Backtrace:
/uufs/hec.utah.edu/common/cbmsfs/u0578352/gmx4/bin/mdrun[0x6115d6]".

Thanks very much for your help!
Lanyuan

gmx4test.tar.gz (618 KB) Input files, Lanyuan Lu, 01/30/2009 06:49 PM

History

#1 Updated by Berk Hess over 10 years ago

Hi,

For me it also seems to run in parallel.

But I get warnings like this for all table files:
WARNING: For the 2398 non-zero entries for table 1 in table_CGW_CGW.xvg the forces deviate on average 126% from minus the numerical derivative of the potential

Did you not get these, or did you just ignore them?

You should first make the potentials and forces in your tables consistent;
this will probably fix your problems.
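
A consistency check along the lines of the warning above could be sketched as follows. This is only an illustration, not the formula mdrun uses: the function name, the central-difference scheme and the way the average is taken are all assumptions. It compares the tabulated force with minus the numerical derivative of the tabulated potential and reports the mean relative deviation in percent.

```python
def mean_force_deviation(r, v, f):
    """Mean relative deviation (percent) of f from -dV/dr.

    Hypothetical helper, not GROMACS code. Uses a central finite
    difference on interior points and skips entries where the
    tabulated force is exactly zero.
    """
    total, count = 0.0, 0
    for i in range(1, len(r) - 1):
        if f[i] == 0.0:
            continue
        # Minus the numerical derivative of the potential at r[i]
        numeric = -(v[i + 1] - v[i - 1]) / (r[i + 1] - r[i - 1])
        total += abs(numeric - f[i]) / abs(f[i])
        count += 1
    return 100.0 * total / count if count else 0.0


# Toy data: V(r) = r^2, so the consistent force is F(r) = -2 r
r = [0.1 + 0.01 * i for i in range(50)]
v = [x * x for x in r]
consistent = [-2.0 * x for x in r]
wrong_sign = [2.0 * x for x in r]

print(mean_force_deviation(r, v, consistent))
print(mean_force_deviation(r, v, wrong_sign))
```

A deviation of the order of 100% or more, as in the warning, usually points to a sign or unit error in the force column rather than to numerical noise.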

PS I would use tables with 10 times wider spacing; this saves disk space
and makes mdrun run slightly more efficiently. There is no detail in your
potential (you could even use 100x fewer points).
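
One way to widen the table spacing is simply to keep every 10th data row. The sketch below assumes the table is a plain-text .xvg file in which lines starting with # or @ are comments or directives; the function name and the stride argument are hypothetical, not part of any GROMACS tool.

```python
def thin_table_lines(lines, stride=10):
    """Keep every `stride`-th data line, starting with the first one.

    Hypothetical helper. Comment/directive lines (starting with '#'
    or '@') and blank lines are passed through unchanged.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    out = []
    data_index = 0
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped[0] in "#@":
            out.append(line)
            continue
        if data_index % stride == 0:
            out.append(line)
        data_index += 1
    return out


# Toy table with 0.002 spacing; after thinning the spacing is 0.02
table = ["# toy table"] + [
    "%.3f 0 0 0 0 0 0" % (i * 0.002) for i in range(25)
]
for line in thin_table_lines(table, stride=10):
    print(line)
```

Note that rows after the last kept one are dropped, so it is worth checking that the thinned table still covers the full distance range that mdrun needs.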

Berk

#2 Updated by Berk Hess over 10 years ago

I think there is no problem with nstlist=-1 in parallel.
Lanyuan, can we close this bug?

Berk

#3 Updated by Berk Hess over 10 years ago

The problem was not caused by nstlist=-1,
but probably due to user table input V/F mismatch.

Berk
