Project

General

Profile

Bug #1647

race condition with lincs + openmp + free-energy

Added by Mark Abraham almost 3 years ago. Updated almost 3 years ago.

Status:
Closed
Priority:
High
Assignee:
Category:
mdrun
Target version:
Affected version - extra info:
release-5-0 and master
Affected version:
Difficulty:
uncategorized
Close

Description

When I converted regressiontests/freeenergy/expanded and transformAtoB to the Verlet scheme and switched them to use LINCS, I exposed a race condition in the way LINCS reduces dvdlambda at http://redmine.gromacs.org/projects/gromacs/repository/revisions/release-5-0/entry/src/gromacs/mdlib/clincs.c#L470, which was caught by the TSAN build on Jenkins - see http://jenkins.gromacs.org/job/Gromacs_Gerrit_master/7403/OPTIONS=Compiler=gcc%20CompilerVersion=4.9%20CMAKE_BUILD_TYPE=TSAN%20host=bs_nix1310,label=bs_nix1310/testReport/junit/(root)/freeenergy/expanded/.

I have reproduced this locally with a no-futex build of gcc-4.9 with mdrun -ntmpi 1 -ntomp 2, but of course mdrun -ntmpi 1 -ntomp 1 and mdrun -ntmpi 2 -ntomp 1 are both fine.

I am not sure if release-4-6 is affected, because those test cases don't run with the Verlet scheme of that era. complex/nbnxn-free-energy does not use constraints, so we didn't catch it there, either. We should expedite converting regressiontests so we can run both cutoff schemes, even if we don't do so in Jenkins for every commit.

Replacing #pragma omp barrier with #pragma omp critical fixes the issue. I will upload a fix.

Associated revisions

Revision 8b404bee (diff)
Added by Berk Hess almost 3 years ago

Avoid race on dvdl with Verlet+OpenMP+LINCS+FE+VV

Also restructured the dH/dlambda reduction in do_lincs (used for
coordinates and not affected by the race issue) to work similar
do the do_lincsp code and properly use thread parallelization.
Fixes #1647.

Change-Id: I4eeb131018abca88b3635932491d99a779e16037

Revision d4c85926 (diff)
Added by Mark Abraham almost 3 years ago

Add test cases for FE+Verlet

One modifies a test case to test the VV integrator, another modifies
a test case to use LINCS instead of SHAKE.

Refs #1647

Change-Id: I090172d475b1c7224e18a52e72767999ce99bf7f

History

#1 Updated by Gerrit Code Review Bot almost 3 years ago

Gerrit received a related patchset '1' for Issue #1647.
Uploader: Mark Abraham ()
Change-Id: Ib3e08f163e621b9eacc0fdf3d5a2cbc8e36b7053
Gerrit URL: https://gerrit.gromacs.org/4239

#2 Updated by Gerrit Code Review Bot almost 3 years ago

Gerrit received a related patchset '1' for Issue #1647.
Uploader: Berk Hess ()
Change-Id: I4eeb131018abca88b3635932491d99a779e16037
Gerrit URL: https://gerrit.gromacs.org/4243

#3 Updated by Berk Hess almost 3 years ago

Note that this bug only affect constraining some else than coordinates, e.g. velocities or forces. So only running a test with md-vv would have caught this.
4.6 is unaffected.

#4 Updated by Mark Abraham almost 3 years ago

  • Status changed from New to Closed

Also available in: Atom PDF