Bug #212

LINCS problem with large system

Added by Bert empty about 12 years ago. Updated about 12 years ago.

Status: Closed
Priority: Normal
Assignee: Erik Lindahl
Category: mdrun
Target version:
Affected version - extra info:
Affected version:
Difficulty: uncategorized

Description

I have run a series of simulations of an inorganic surface, and they go quite well with gmx-4.0rc1. When I tried a relatively larger system (7.2*7.2*22), mdrun crashed. I think this is a bug for two reasons:

1. It does not occur in smaller systems. I have tested 3*3*10, 4*4*13, 5*5*16 and 6*6*19.

2. With gromacs-3.3.3 there is no crash and everything runs fine.

----------------------------error output----------------------------------------
Getting Loaded...
Reading file prod.tpr, VERSION 4.0_rc1 (single precision)
Loaded with Money

Making 1D domain decomposition 3 x 1 x 1
starting mdrun 'mica'
100000 steps, 200.0 ps.
step 0
imb F 1% step 100, will finish Sat Sep 27 00:35:49 2008
imb F 0% step 200, will finish Sat Sep 27 00:52:44 2008
imb F 0% step 300, will finish Sat Sep 27 00:52:52 2008
imb F 1% step 400, will finish Sat Sep 27 00:48:47 2008
imb F 1% step 500, will finish Sat Sep 27 00:49:39 2008

Step 572, time 1.144 (ps) LINCS WARNING
relative constraint deviation after LINCS:
rms 0.000001, max 0.000003 (between atoms 3898 and 3904)
bonds that rotated more than 30 degrees:
atom 1 atom 2 angle previous, current, constraint length
4862 4868 63.9 0.1000 0.1000 0.1000

Step 573, time 1.146 (ps) LINCS WARNING
relative constraint deviation after LINCS:
rms 0.007133, max 0.122505 (between atoms 3770 and 3776)

Step 573, time 1.146 (ps) LINCS WARNING
relative constraint deviation after LINCS:
rms 0.043646, max 0.758486 (between atoms 746 and 752)

Step 573, time 1.146 (ps) LINCS WARNING
relative constraint deviation after LINCS:
rms 0.376296, max 6.506767 (between atoms 7970 and 7976)
bonds that rotated more than 30 degrees:
atom 1 atom 2 angle previous, current, constraint length
bonds that rotated more than 30 degrees:
atom 1 atom 2 angle previous, current, constraint length
746 752 90.0 0.1000 0.1758 0.1000
7970 7976 90.0 0.1000 0.7507 0.1000
bonds that rotated more than 30 degrees:
atom 1 atom 2 angle previous, current, constraint length
3770 3776 90.0 0.1000 0.1123 0.1000
4862 4868 46.9 0.1000 0.1000 0.1000
Wrote pdb files with previous and current coordinates
Wrote pdb files with previous and current coordinates

Step 574, time 1.148 (ps) LINCS WARNING
relative constraint deviation after LINCS:
rms 0.340733, max 4.475628 (between atoms 4862 and 4868)

Step 574, time 1.148 (ps) LINCS WARNING
relative constraint deviation after LINCS:
rms 0.247949, max 3.215772 (between atoms 7332 and 7338)
bonds that rotated more than 30 degrees:
atom 1 atom 2 angle previous, current, constraint length

Step 574, time 1.148 (ps) LINCS WARNING
relative constraint deviation after LINCS:
rms 0.000001, max 0.000002 (between atoms 1096 and 1102)
7332 7338 90.0 0.1000 0.4216 0.1000
bonds that rotated more than 30 degrees:
atom 1 atom 2 angle previous, current, constraint length
7970 7976 90.0 0.7507 0.3836 0.1000
746 752 49.4 0.1758 0.1000 0.1000
bonds that rotated more than 30 degrees:
atom 1 atom 2 angle previous, current, constraint length
3098 3104 43.7 0.1000 0.1000 0.1000
4862 4868 90.0 0.1000 0.5476 0.1000
5578 5584 90.0 0.1000 0.4771 0.1000
5356 5362 34.4 0.1000 0.1000 0.1000
Wrote pdb files with previous and current coordinates
Wrote pdb files with previous and current coordinates

Step 575, time 1.15 (ps) LINCS WARNING
relative constraint deviation after LINCS:
rms 11.064705, max 192.282532 (between atoms 2712 and 2718)
bonds that rotated more than 30 degrees:
atom 1 atom 2 angle previous, current, constraint length

Step 575, time 1.15 (ps) LINCS WARNING
relative constraint deviation after LINCS:
rms nan, max 6.485989 (between atoms 7970 and 7976)
bonds that rotated more than 30 degrees:
atom 1 atom 2 angle previous, current, constraint length
7298 7304 90.0 0.1000 0.4121 0.1000
7332 7338 58.1 0.4216 0.1000 0.1000
8004 8010 90.0 0.1000 0.3708 0.1000
8044 8050 90.0 0.1000 0.2387 0.1000
8770 8776 90.0 0.1000 0.1898 0.1000
7970 7976 90.0 0.3836 0.7486 0.1000

Step 575, time 1.15 (ps) LINCS WARNING
relative constraint deviation after LINCS:
rms 237515.571801, max 4079464.000000 (between atoms 5578 and 5584)
746 752 90.0 0.1000 0.1770 0.1000
1714 1720 30.2 0.1000 0.1000 0.1000
2712 2718 90.0 0.1000 19.3283 0.1000
bonds that rotated more than 30 degrees:
atom 1 atom 2 angle previous, current, constraint length
3098 3104 90.0 0.1000 2.1622 0.1000
4862 4868 90.0 0.5476 0.1042 0.1000
5578 5584 90.0 0.4771 407946.5312 0.1000
5356 5362 41.9 0.1000 0.1000 0.1000
Wrote pdb files with previous and current coordinates
Wrote pdb files with previous and current coordinates
Wrote pdb files with previous and current coordinates

-------------------------------------------------------
Program mdrun, VERSION 4.0_rc1
Source code file: pme.c, line: 518

Fatal error:
2 particles communicated to PME node 0 are more than a cell length out of the domain decomposition cell of their charge group
-------------------------------------------------------

"It's Not Your Fault" (Pulp Fiction)

Error on node 0, will try to stop all the nodes
Halting parallel program mdrun on CPU 0 out of 3

gcq#132: "It's Not Your Fault" (Pulp Fiction)

-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 1031 failed on node n0 (127.0.0.1) due to signal 11.
-----------------------------------------------------------------------------
mpirun failed with exit status 11
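
The LINCS warnings above are interleaved output from several ranks, which makes it hard to see how quickly the constraint deviation grows. As a rough aid (not part of GROMACS), the following Python sketch pulls the largest reported relative deviation per step out of such output. The function name `max_deviation_per_step`, the regular expressions, and the short `SAMPLE` excerpt are assumptions based only on the warning format shown in this report; each "rms ..., max ..." line is attributed to the most recent "Step ..." header, which may not hold if the interleaving is worse than it is here.

```python
import re

# Patterns modelled on the warning lines in this report (an assumption,
# not an official description of the mdrun output format).
STEP_RE = re.compile(r"^Step (\d+), time ([\d.]+) \(ps\)\s+LINCS WARNING")
DEV_RE = re.compile(r"^rms (\S+), max (\S+) \(between atoms (\d+) and (\d+)\)")


def max_deviation_per_step(log_text):
    """Return {step: (max_deviation, atom_i, atom_j)} for the worst constraint per step."""
    worst = {}
    step = None
    for raw in log_text.splitlines():
        line = raw.strip()
        m = STEP_RE.match(line)
        if m:
            # Remember which step the following deviation line belongs to.
            step = int(m.group(1))
            continue
        m = DEV_RE.match(line)
        if m and step is not None:
            dev = float(m.group(2))
            atoms = (int(m.group(3)), int(m.group(4)))
            # Keep only the largest deviation seen for this step.
            if step not in worst or dev > worst[step][0]:
                worst[step] = (dev, atoms[0], atoms[1])
    return worst


# A few lines copied from the output above, used as a small example.
SAMPLE = """\
Step 572, time 1.144 (ps) LINCS WARNING
relative constraint deviation after LINCS:
rms 0.000001, max 0.000003 (between atoms 3898 and 3904)

Step 573, time 1.146 (ps) LINCS WARNING
relative constraint deviation after LINCS:
rms 0.007133, max 0.122505 (between atoms 3770 and 3776)

Step 573, time 1.146 (ps) LINCS WARNING
relative constraint deviation after LINCS:
rms 0.376296, max 6.506767 (between atoms 7970 and 7976)
"""

result = max_deviation_per_step(SAMPLE)
for step_number in sorted(result):
    deviation, atom_i, atom_j = result[step_number]
    print(step_number, deviation, atom_i, atom_j)
```

Run on the full output, a summary like this shows the deviation going from negligible at step 572 to very large values within three steps, just before the PME error.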

History

#1 Updated by Bert empty about 12 years ago

If ewald_geometry=3dc is changed to 3d, everything works again, so this error appears to be specific to PME with the 3DC geometry.

#2 Updated by Erik Lindahl about 12 years ago

Hi,

We just found a bug in the PME code (new optimization stuff) that could explain this. Please try the new RC2 version that I'm putting on the ftp site in 10 minutes!

#3 Updated by Erik Lindahl about 12 years ago

Since this should be fixed as of RC2, I'm closing it. You can reopen it later if the problem is still there.
