Bug #1981

dd121 regressiontest fails with >=4 MPI ranks + OpenMP

Added by Roland Schulz over 3 years ago. Updated over 3 years ago.

Status: Closed
Priority: High
Assignee:
Category: mdrun
Target version:
Affected version - extra info:
Affected version:
Difficulty: uncategorized

Description

E.g. with 4 MPI ranks and 2 OpenMP threads I get:
checkpot.out (3 errors), checkforce.out (2 errors)
Angle step 20: 31.0352, step 20: 31.5902
Proper Dih. step 20: 20.253, step 20: 19.9199
Improper Dih. step 20: 5.84165, step 20: 5.98111

I get the error with both GCC (4.9.3) and ICC (16).

Associated revisions

Revision 0e31b39c (diff)
Added by Berk Hess over 3 years ago

Fix vsite bug with MPI+OpenMP

The recent commit b7e4f30d caused non-local virtual sites not to be
treated when using OpenMP. This means their coordinates lagged one
step behind and their forces were not spread to the atoms, leading
to small errors in the forces. Note that non-local virtual sites are
only used when local virtual sites use them as a constructing atom;
the most common case is a C/N in a CH3/NH3 group with vsite H's.
Also added a check on the vsite count for debug builds.

Fixes #1981.

Change-Id: Ibe13b75b8ae9841937ad4abc007dba5ad78a30cd
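
To illustrate the two effects described in the commit message above (stale vsite coordinates and unspread forces), here is a minimal sketch. It is not the GROMACS implementation: it assumes the simplest linear two-atom construction, x_v = (1 - a) x_i + a x_j, whereas the vsite types used for CH3/NH3 hydrogens are more involved. The function names and all numeric values are hypothetical and chosen only for illustration.

```python
def construct_vsite(x_i, x_j, a):
    """Place a linear two-atom virtual site: x_v = (1 - a) * x_i + a * x_j."""
    return [(1.0 - a) * p + a * q for p, q in zip(x_i, x_j)]


def spread_vsite_force(f_v, a):
    """Distribute the force acting on the vsite onto its constructing atoms."""
    f_i = [(1.0 - a) * c for c in f_v]
    f_j = [a * c for c in f_v]
    return f_i, f_j


# Hypothetical constructing atoms and weight (illustrative values only).
x_i = [0.0, 0.0, 0.0]
x_j = [1.0, 0.0, 0.0]
a = 0.25

# Normal behaviour: construct the vsite, then spread its force.
x_v = construct_vsite(x_i, x_j, a)
f_v = [0.0, 2.0, 0.0]
f_i, f_j = spread_vsite_force(f_v, a)

# The constructing atoms move during a step.
x_i_new = [c + 0.1 for c in x_i]
x_j_new = [c + 0.1 for c in x_j]

# If construction is skipped, the vsite keeps last step's position ...
x_v_stale = x_v
# ... while a correct run would rebuild it from the new positions.
x_v_fresh = construct_vsite(x_i_new, x_j_new, a)

# Size of the lag in the vsite position caused by the skipped construction.
lag = [fresh - stale for fresh, stale in zip(x_v_fresh, x_v_stale)]
```

In this sketch the spread forces on the two constructing atoms sum to the force on the vsite, so skipping the spreading step silently drops that contribution from the real atoms. The `lag` vector shows how far a vsite that was not reconstructed trails its correct position after the constructing atoms have moved.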

History

#1 Updated by Berk Hess over 3 years ago

  • Subject changed from dd121 regressiontest fails with >=4 MPI ranks to dd121 regressiontest fails with >=4 MPI ranks + OpenMP
  • Category changed from testing to mdrun
  • Status changed from New to In Progress
  • Assignee set to Berk Hess
  • Priority changed from Normal to High
  • Target version set to 2016

This is an OpenMP issue in the vsite code rather than an MPI issue.

#2 Updated by Gerrit Code Review Bot over 3 years ago

Gerrit received a related patchset '1' for Issue #1981.
Uploader: Berk Hess
Change-Id: Ibe13b75b8ae9841937ad4abc007dba5ad78a30cd
Gerrit URL: https://gerrit.gromacs.org/5920

#3 Updated by Berk Hess over 3 years ago

  • Status changed from In Progress to Fix uploaded

Good that you caught this (before the 2016 release).
This was a subtle bug that only affects non-local vsites, which don't occur often. And even then, the errors in the forces are rather small.

#4 Updated by Berk Hess over 3 years ago

  • Status changed from Fix uploaded to Resolved

#5 Updated by Mark Abraham over 3 years ago

  • Status changed from Resolved to Closed
