Bug #2614

mdrun can exit with atom moved too far in DD or PME due to too small domains

Added by Berk Hess about 1 year ago. Updated about 1 year ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
mdrun
Target version:
Affected version - extra info:
Affected version:
Difficulty:
uncategorized

Description

Because of the large nstlist values that are now used, often by default with GPUs, atom displacements between pair-list updates are far larger than they used to be. This can lead to mdrun stopping with a fatal error due to a particle moving too far in DD repartitioning or PME.
Possible solutions are limiting the minimum domain size and/or the maximum nstlist appropriately for the system.
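
To give a feel for the scaling, here is a rough sketch of how the expected displacement grows with nstlist. It assumes free (ballistic) flight at the thermal velocity, which ignores collisions and is not the estimate mdrun itself uses. The function name, the water-like mass, and the 2 fs time step are illustrative assumptions only.

```python
import math

# Boltzmann constant in GROMACS units: kJ mol^-1 K^-1
KB = 0.0083144626


def ballistic_rms_displacement(temperature_k, mass_amu, dt_ps, nsteps):
    """Per-dimension RMS displacement (nm) for free flight over nsteps steps.

    Hypothetical helper: a crude ballistic model, not GROMACS code.
    """
    # Per-dimension RMS thermal velocity in nm/ps: sqrt(kB*T/m).
    # (kJ/mol)/amu equals (nm/ps)^2, so no unit conversion is needed.
    v_rms = math.sqrt(KB * temperature_k / mass_amu)
    return v_rms * dt_ps * nsteps


if __name__ == "__main__":
    # Assumed values: 300 K, mass of a water molecule, 2 fs time step.
    for nstlist in (10, 40, 100):
        d = ballistic_rms_displacement(300.0, 18.015, 0.002, nstlist)
        print(f"nstlist={nstlist:3d}  displacement ~ {d:.4f} nm")
```

In this model the displacement is linear in nstlist, so going from nstlist=10 to nstlist=100 increases it tenfold, which is why a domain size that was safe before can become too small.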

Associated revisions

Revision 2cb82175 (diff)
Added by Berk Hess about 1 year ago

Refactor calc_verlet_buffer_size()

Split off parts, which will be reused, from calc_verlet_buffer_size().
This change is only code motion.

Refs #2614

Change-Id: I9cf68a2661ee10eb991240d2768c077df9f9c0c5

Revision cd41b0fe (diff)
Added by Berk Hess about 1 year ago

Ensure domains are large enough for atom motion

The introduction of the dual pair list has led to larger nstlist
values, which lead to larger atom displacements between domain
decomposition steps. This has made it much more likely that
"atom moved too far" errors appear at DD and PME redistribution.
Now the minimum DD cell size setting correctly takes atom
displacement into account (when there is a reference temperature).

Note that this can significantly increase the minimum DD cell size
for solvent systems and slightly for systems with large molecules.

Fixes #2614

Change-Id: Ie41131e9eed3ef828928516a6b8ebfb9b5ba2bdb
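
As an illustration of why a larger minimum DD cell size matters, the sketch below assumes the displacement-based requirement is combined with the existing interaction-based limit by taking the larger of the two, and then counts how many domains fit along one box dimension. All function names and numbers are hypothetical and are not taken from the GROMACS source.

```python
import math


def min_cell_size(interaction_limit_nm, displacement_limit_nm):
    """Hypothetical: the cell must satisfy both limits, so take the larger."""
    return max(interaction_limit_nm, displacement_limit_nm)


def max_domains_along_dim(box_length_nm, min_cell_nm):
    """Largest number of equal-sized cells along one dimension that still
    respects the minimum cell size (at least one domain always exists)."""
    return max(1, math.floor(box_length_nm / min_cell_nm))


if __name__ == "__main__":
    box = 10.0  # nm, assumed box length
    before = min_cell_size(1.0, 0.0)  # interaction limit only
    after = min_cell_size(1.0, 1.5)   # displacement limit now dominates
    print(max_domains_along_dim(box, before), "domains before")
    print(max_domains_along_dim(box, after), "domains after")
```

With these made-up numbers, raising the minimum cell size reduces the number of domains that fit along the dimension, which limits how many ranks can be used for a given box.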

History

#1 Updated by Gerrit Code Review Bot about 1 year ago

Gerrit received a related patchset '1' for Issue #2614.
Uploader: Berk Hess
Change-Id: gromacs~release-2018~I9cf68a2661ee10eb991240d2768c077df9f9c0c5
Gerrit URL: https://gerrit.gromacs.org/8197

#2 Updated by Gerrit Code Review Bot about 1 year ago

Gerrit received a related patchset '1' for Issue #2614.
Uploader: Berk Hess
Change-Id: gromacs~release-2018~Ie41131e9eed3ef828928516a6b8ebfb9b5ba2bdb
Gerrit URL: https://gerrit.gromacs.org/8198

#3 Updated by Berk Hess about 1 year ago

  • Status changed from In Progress to Fix uploaded

I uploaded a fix for systems with a reference temperature.
Note that for systems without T-coupling or with T=0 we should think of a better solution than using the pairlist buffer size.

#4 Updated by Szilárd Páll about 1 year ago

Berk Hess wrote:

Note that for systems without T-coupling or with T=0 we should think of a better solution than using the pairlist buffer size.

Should we have a separate issue for that given that the current fix is partial -- asking because of the "Fix uploaded" status?

#5 Updated by Berk Hess about 1 year ago

  • Status changed from Fix uploaded to Resolved

#6 Updated by Paul Bauer about 1 year ago

  • Status changed from Resolved to Closed
