Bug #450

Settle errors and decomposition error depending on auto PME decomposition

Added by Peter Kasson over 9 years ago. Updated over 9 years ago.

Status: Closed
Priority: Normal
Assignee: Erik Lindahl
Category: mdrun
Target version:
Affected version - extra info:
Affected version:
Difficulty: uncategorized

Description

I'm getting some interesting behavior with git-current Gromacs and a large vesicle system.
With -nt 24 and -npme 0, I get a 6 x 4 x 1 decomposition grid and things seem to work fine.
With -nt 12 and -npme 0, I get a 12 x 1 x 1 decomposition grid, which seems odd. On the PRACE system things look OK, but one of our users reported settle errors plus a decomposition error, shown in the log below.
With -nt 12 and no -npme set, I get a 3 x 1 x 3 decomposition grid plus 3 PME nodes, and settle errors on startup (for some starts but not others).

It would seem that with -nt 12 -npme 0, we would want something like 4 x 3 x 1 rather than 12 x 1 x 1.
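To make the comparison concrete, here is a minimal Python sketch that lists every way to split a given number of ranks into an nx x ny x nz grid and prints the resulting cell sizes for two of them. It is not the grid-selection logic mdrun uses; the function names `dd_grids` and `cell_size` and the 30 nm cubic box are assumptions for illustration only, since the real box dimensions in frame6.tpr are not stated here.

```python
from itertools import product


def dd_grids(n_ranks):
    """Return every ordered (nx, ny, nz) with nx * ny * nz == n_ranks."""
    divisors = [d for d in range(1, n_ranks + 1) if n_ranks % d == 0]
    return [
        (nx, ny, n_ranks // (nx * ny))
        for nx, ny in product(divisors, repeat=2)
        if n_ranks % (nx * ny) == 0
    ]


def cell_size(box, grid):
    """Edge lengths of one cell when the box is split evenly by the grid."""
    return tuple(length / n for length, n in zip(box, grid))


# Hypothetical cubic box in nm (an assumption, not taken from the tpr file).
box = (30.0, 30.0, 30.0)

# Compare the grid mdrun chose with the one suggested above.
for grid in [(12, 1, 1), (4, 3, 1)]:
    print(grid, cell_size(box, grid))
```

With that assumed box, 12 x 1 x 1 produces slabs only 2.5 nm thick along x, while 4 x 3 x 1 produces cells of 7.5 nm by 10 nm in x and y. Both grids appear in `dd_grids(12)`, and 3 x 1 x 3 appears in `dd_grids(9)`, which matches the 9 particle-particle ranks left after 3 PME nodes are set aside.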

The tpr file (from an older grompp) is at ~kasson/frame6.tpr.

-----------------------
Starting 12 threads
NNODES=12, MYRANK=0, HOSTNAME=thread #0
NNODES=12, MYRANK=4, HOSTNAME=thread #4
NNODES=12, MYRANK=6, HOSTNAME=thread #6
NNODES=12, MYRANK=7, HOSTNAME=thread #7
NNODES=12, MYRANK=2, HOSTNAME=thread #2
NNODES=12, MYRANK=1, HOSTNAME=thread #1
NNODES=12, MYRANK=3, HOSTNAME=thread #3
NNODES=12, MYRANK=9, HOSTNAME=thread #9
NNODES=12, MYRANK=5, HOSTNAME=thread #5
NNODES=12, MYRANK=8, HOSTNAME=thread #8
Reading file work/wudata_01.tpr, VERSION 4.0.99_development_20090605 (single precision)
NNODES=12, MYRANK=10, HOSTNAME=thread #10
NNODES=12, MYRANK=11, HOSTNAME=thread #11
Making 1D domain decomposition 12 x 1 x 1
starting mdrun 'SINGLE VESICLE in water'
1750000 steps, 7000.0 ps (continuing from step 1500000, 6000.0 ps).
[06:27:06] Completed 0 out of 250000 steps (0%)

step 1500001: Water molecule starting at atom 662814 can not be settled.
Check for bad contacts and/or reduce the timestep if appropriate.

step 1500001: Water molecule starting at atom 101529 can not be settled.
Check for bad contacts and/or reduce the timestep if appropriate.

-------------------------------------------------------
Program mdrun, VERSION 4.0.99-dev-20100610-b6a86-dirty
Source code file: /data0/FAHdev/a3_development/gromacs/src/mdlib/pme.c, line: 535

Fatal error:
2 particles communicated to PME node 10 are more than 2/3 times the cut-off out of the domain decomposition cell of their charge group in dimension x
This usually means that your system is not well equilibrated
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------

History

#1 Updated by Peter Kasson over 9 years ago

The error was occurring with gcc 4.1.x builds. The 12 x 1 x 1 decomposition still occurs with 4.3.x builds, but I have not been able to reproduce the settle or DD errors there. Will reopen the bug if I can reproduce them in the future.
