Bug #2165

GROMACS-2016 gives the domain decomposition error when doing free energy calculation with MPI

Added by Ahmet Yildirim over 3 years ago. Updated over 3 years ago.

Status: Rejected
Priority: Normal
Assignee:
Category: mdrun
Target version: -
Affected version - extra info:
Affected version:
Difficulty: uncategorized

Description

Hi,

I previously submitted a bug report (http://redmine.gromacs.org/issues/2141) about simulating "ligand decoupling from complex" using MPI with 28 cores, but did not get a solution. I am now hitting the same problem when doing "ligand decoupling from solution" using GROMACS-2016.2 with MPI and 28 cores. I get a "domain decomposition error" as soon as the job (solvation.sh) in the attached file starts to run. I also tried the following commands with GROMACS-2016.2 and 2016.3, but they didn't help. The domain decomposition error arises in both the eqA and eqB simulations in the attached file.

mpirun -n 28 gmx_mpi mdrun -ntomp 1 -dd 5 4 1 -npme 8

mpirun -n 28 gmx_mpi mdrun -ntomp 1 -dd 3 3 2 -npme 10

mpirun -n 28 gmx_mpi mdrun -ntomp 7 -dd 2 2 1

mpirun -n 28 gmx_mpi mdrun -ntomp 2 -npme 14
mpirun -n 28 gmx_mpi mdrun -ntomp 4 -npme 7

Fatal error:
The size of the domain decomposition grid (0) does not match the number of ranks (28). The total number of ranks is 28.

Any help to resolve this error is greatly appreciated.

solvation.tar.gz (82.9 KB) Ahmet Yildirim, 04/22/2017 07:18 AM

History

#1 Updated by Berk Hess over 3 years ago

  • Category set to mdrun
  • Status changed from New to Rejected
  • Assignee set to Berk Hess
  • Priority changed from High to Normal

Have you looked at the error message?

Fatal error:
The initial cell size (0.715508) is smaller than the cell size limit
(1.735298), change options -dd, -rdd or -rcon, see the log file for details

Your interaction distance is too long to run a decomposition with cells smaller than 1.73 nm, so you can't use 28 MPI ranks.
To avoid sampling issues in the vacuum state, you likely need to use couple-intramol=yes. This will also let you run your system on 28 ranks.
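
To make the cell-size argument concrete, here is a rough sketch of the arithmetic in Python. It is not how mdrun picks its grid (the real algorithm also accounts for dynamic load balancing, triclinic boxes and PME ranks); it only shows why a cell size limit of 1.735298 nm caps the number of domains. The box dimensions below are hypothetical, since the actual box is only in the attached files, and the function names are made up for illustration.

```python
import math

# Values quoted in the error message above (nm).
CELL_SIZE_LIMIT_NM = 1.735298
INITIAL_CELL_SIZE_NM = 0.715508


def max_cells(box_length_nm: float, cell_size_limit_nm: float) -> int:
    """Largest number of cells along one box edge such that each cell
    is at least cell_size_limit_nm wide (never fewer than one cell)."""
    return max(1, math.floor(box_length_nm / cell_size_limit_nm))


def max_pp_ranks(box_nm, cell_size_limit_nm: float) -> int:
    """Upper bound on particle-particle ranks for a rectangular box:
    the product of the per-dimension cell counts."""
    return math.prod(max_cells(edge, cell_size_limit_nm) for edge in box_nm)


if __name__ == "__main__":
    # ASSUMPTION: a hypothetical 4 x 4 x 4 nm box, not the box from the report.
    box = (4.0, 4.0, 4.0)
    print("cells per dimension:",
          [max_cells(edge, CELL_SIZE_LIMIT_NM) for edge in box])
    print("upper bound on PP ranks:", max_pp_ranks(box, CELL_SIZE_LIMIT_NM))
    # How far the requested decomposition undershoots the limit.
    print("limit / initial cell size:",
          round(CELL_SIZE_LIMIT_NM / INITIAL_CELL_SIZE_NM, 2))
```

With the values from the error message, the requested cells are well under half the minimum allowed size, so no choice of -dd with this many ranks can satisfy the limit unless the limit itself is reduced, which is what couple-intramol=yes is expected to do here.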