Bug #2433

Error in MPI_Allreduce when applying AWH biasing and for multiple sharing biases

Added by Viveca Lindahl about 1 year ago. Updated about 1 year ago.

Status: Closed
Priority: Normal
Assignee:
Category: mdrun
Target version:
Affected version - extra info: GROMACS version: 2018.1-dev-20180301-6358ce7
Affected version:
Difficulty: uncategorized

Description

Running e.g. mpirun -np 4 $gmx_mpi mdrun -v -multidir walker-1 walker-2 fails with an error from MPI_Allreduce. On my machine the error is:

 *** An error occurred in MPI_Allreduce
 *** on communicator MPI_COMM_WORLD
 *** MPI_ERR_COMM: invalid communicator

and when running on the cluster it is:

Rank 8 [Thu Mar  1 16:03:40 2018] [c1-0c0s8n3] Fatal error in MPI_Allreduce: Invalid communicator, error stack:
MPI_Allreduce(1007): MPI_Allreduce(sbuf=MPI_IN_PLACE, rbuf=0x22e08f0, count=337, MPI_INT, MPI_SUM, MPI_COMM_NULL) failed
MPI_Allreduce(926).: Null communicator

A tpr for this is attached. This is similar to https://redmine.gromacs.org/issues/2403 in that there is no error when there is only one rank per directory given to -multidir, i.e. mpirun -np 2 $gmx_mpi mdrun -v -multidir walker-1 walker-2 runs error-free. I have applied all related fixes available so far.
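
The pattern reported above (failure only with more than one rank per -multidir directory, and MPI_COMM_NULL in the cluster error stack) is consistent with a communicator between simulations that is only valid on one rank per simulation. The following is a minimal sketch in plain Python, without MPI, of that reading. It assumes ranks are divided evenly and contiguously over the directories and that only the first rank of each simulation holds the inter-simulation communicator. The function names are hypothetical and are not GROMACS code.

```python
def rank_layout(n_ranks, n_sims):
    """Return (simulation index, rank within simulation) for each global rank.

    Assumption: ranks are split evenly and contiguously over simulations.
    """
    if n_sims < 1 or n_ranks % n_sims != 0:
        raise ValueError("ranks must divide evenly over simulations")
    per_sim = n_ranks // n_sims
    return [(rank // per_sim, rank % per_sim) for rank in range(n_ranks)]


def ranks_with_null_comm(n_ranks, n_sims):
    """List global ranks that would hold a null inter-simulation communicator.

    Assumption: only rank 0 within each simulation joins that communicator,
    so a collective called on it from any other rank would be invalid.
    """
    layout = rank_layout(n_ranks, n_sims)
    return [rank for rank, (_, rank_in_sim) in enumerate(layout) if rank_in_sim != 0]


# -np 4 with two directories: two ranks per simulation.
print(ranks_with_null_comm(4, 2))
# -np 2 with two directories: one rank per simulation.
print(ranks_with_null_comm(2, 2))
```

Under these assumptions the -np 2 case leaves no rank without a valid communicator, which matches the observation that it runs error-free.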

awh-share-on.tpr (365 KB) Viveca Lindahl, 03/01/2018 05:11 PM

Associated revisions

Revision cb1947ec (diff)
Added by Berk Hess about 1 year ago

Fix AWH bias-sharing with parallel simulation

Sharing the AWH bias over multiple simulations only worked when
each simulation was running on a single MPI rank. Now parallel
simulations also work.

Fixes #2433.

Change-Id: I71f9069a31b033151c772aac84c9912d91b213a1
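
The commit message does not describe how the fix is implemented. One common pattern for sharing data between simulations that each run on several ranks is to sum over one main rank per simulation and then pass the result on to the remaining ranks within each simulation. The sketch below models that pattern in plain Python without MPI. It is an assumption for illustration only, and share_over_simulations is a hypothetical name, not a GROMACS function.

```python
def share_over_simulations(local_updates, ranks_per_sim):
    """Model a two-step sum of per-simulation updates.

    local_updates[s] is the update held by the main rank of simulation s.
    Returns the buffer that each global rank ends up with, in rank order.
    """
    if ranks_per_sim < 1:
        raise ValueError("each simulation needs at least one rank")
    # Step 1: element-wise sum over the main ranks only. This stands in for
    # an all-reduce on a communicator that contains one rank per simulation.
    total = [sum(column) for column in zip(*local_updates)]
    # Step 2: every rank of every simulation receives a copy of the total.
    # This stands in for a broadcast within each simulation.
    result = []
    for _sim in local_updates:
        for _rank in range(ranks_per_sim):
            result.append(list(total))
    return result


# Two simulations with two ranks each, as in the -np 4 example.
buffers = share_over_simulations([[1, 2, 3], [10, 20, 30]], ranks_per_sim=2)
print(buffers)
```

With this structure no rank outside the main ranks ever calls a collective on the inter-simulation communicator.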

History

#1 Updated by Gerrit Code Review Bot about 1 year ago

Gerrit received a related patchset '1' for Issue #2433.
Uploader: Berk Hess
Change-Id: gromacs~release-2018~I71f9069a31b033151c772aac84c9912d91b213a1
Gerrit URL: https://gerrit.gromacs.org/7637

#2 Updated by Berk Hess about 1 year ago

  • Status changed from New to Fix uploaded
  • Assignee set to Berk Hess
  • Target version set to 2018.1

#3 Updated by Mark Abraham about 1 year ago

  • Status changed from Fix uploaded to Resolved

#4 Updated by Mark Abraham about 1 year ago

  • Status changed from Resolved to Closed
