Bug #1334

concurrency-related bug with thread-MPI

Added by Szilárd Páll almost 7 years ago. Updated over 6 years ago.


78569, while fixing some thread-safety related aspects of hardware detection, introduced other concurrency issues, which manifest as:
  • hardware detection report not printed to log/console when ntmpi>1;
  • GPU oversubscription check not working: e.g. -ntmpi 2 -gpu_id 00 starts to run, but due to concurrency issues with GPU context sharing between tMPI ranks, it throws an error or segfaults before exiting (which is why GPU oversubscription has not been allowed with tMPI).
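
For illustration, the kind of check that is being bypassed can be sketched as below. This is only a minimal sketch, not the GROMACS implementation: the function name gpu_ids_oversubscribed() is hypothetical, and it assumes that each character of the -gpu_id string is a single-digit GPU id assigned to one rank, so that "00" maps two ranks onto GPU 0.

```c
#include <string.h>

/* Hypothetical helper, not part of GROMACS.
 * Returns 1 if any single-digit GPU id occurs more than once in the
 * id string (i.e. two ranks would share one GPU), 0 otherwise.
 * Assumption: one character per rank, each character a digit 0-9. */
static int gpu_ids_oversubscribed(const char *gpu_id_str)
{
    int    use_count[10] = {0};
    size_t len           = strlen(gpu_id_str);
    size_t i;

    for (i = 0; i < len; i++)
    {
        char c = gpu_id_str[i];

        if (c < '0' || c > '9')
        {
            continue; /* ignore anything that is not a digit */
        }
        if (++use_count[c - '0'] > 1)
        {
            return 1; /* this GPU id was already assigned to another rank */
        }
    }
    return 0;
}
```

With this sketch, "00" is flagged as oversubscribed while "01" is not.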

The issue has also been reported on the user's list:

Related issues

Related to GROMACS - Bug #1270: affinity setting broken with MPI (Closed)

Associated revisions

Revision 95d10d39
Added by Berk Hess over 6 years ago

reorganized GPU detection and selection

The GPU selection has been separated from the GPU detection
and now happens after the thread-MPI threads are started.
The GPU user/auto-selected options have been removed from
gmx_hw_info_t, such that it only contains hardware info
and can be passed around as const.
As both the CPU and GPU options structs are now tMPI rank local,
tMPI thread concurrency issues are avoided.
Fixes #1334 #1359

The GPU detection is now skipped with mdrun -nb cpu
CPU acceleration binary/hardware mismatch is now only printed once
to stderr (instead of #MPI-rank times to stdout).
Removed the master_inf_t struct.

Change-Id: If497f611b911808f6d01ca83f41ae288061dd361


#1 Updated by Szilárd Páll almost 7 years ago

  • Priority changed from Normal to High
  • Parent task set to #1270

#2 Updated by Szilárd Páll almost 7 years ago

  • Parent task deleted (#1270)

#3 Updated by Szilárd Páll almost 7 years ago

Here's what happens: because of the mutex-based implementation, the first thread to arrive now grabs the mutex and does the consistency checks, but all messages and warnings are printed using md_print_warn()/md_print_info(), which only print on rank 0. Hence, if rank 0 is not the first to arrive and grab the mutex, neither the warnings/errors nor the detection information will be issued.
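
The failure mode can be reproduced in a small single-threaded simulation of the arrival order. This is a sketch under stated assumptions, not the actual GROMACS code: rank_ctx_t, print_info(), check_once() and simulate() are hypothetical stand-ins, the mutex is replaced by a plain "already done" flag, and a NULL context is assumed to disable the rank-0 filter.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for t_commrec: only the rank matters here. */
typedef struct
{
    int rank;
} rank_ctx_t;

static int messages_printed = 0; /* messages that actually reached the output */
static int check_done       = 0; /* stands in for the mutex-guarded once-only state */

/* Prints only on rank 0, like the print functions described above.
 * Assumption: a NULL context bypasses the rank filter. */
static void print_info(const rank_ctx_t *ctx, const char *msg)
{
    if (ctx == NULL || ctx->rank == 0)
    {
        printf("%s\n", msg);
        messages_printed++;
    }
}

/* Once-only section: whichever rank arrives first does the check and reports. */
static void check_once(const rank_ctx_t *ctx, int bypass_filter)
{
    if (check_done)
    {
        return; /* a previous rank already did the work */
    }
    check_done = 1;
    print_info(bypass_filter ? NULL : ctx, "hardware detection report");
}

/* Sends the ranks through the check in the given arrival order and
 * returns how many messages were printed. */
static int simulate(const int *arrival_order, int nranks, int bypass_filter)
{
    int i;

    check_done       = 0;
    messages_printed = 0;
    for (i = 0; i < nranks; i++)
    {
        rank_ctx_t ctx;

        ctx.rank = arrival_order[i];
        check_once(&ctx, bypass_filter);
    }
    return messages_printed;
}
```

When rank 0 arrives first, the report is printed once; when rank 1 arrives first, the check runs but the report is silently dropped. Passing a context that bypasses the filter restores the output regardless of arrival order.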

I don't have a suggestion for a good solution. Making sure that tMPI rank 0 executes the critical region would defeat the purpose of the elegant "proper" threading-style mutex-based implementation. To me it still seems that this issue is yet another reason not to treat the thread-MPI parallelization at the top level as a "native" multi-threading implementation, but rather as an MPI implementation which in some cases requires special measures to ensure thread safety.

For the full discussion see my comments on gerrit #2433 PS5.

#4 Updated by Szilárd Páll almost 7 years ago

  • Target version set to 4.6.4

Here is a possible workaround: at the beginning of gmx_check_hw_runconf_consistency() do the following:

t_commrec *cr_hack;

#ifdef GMX_THREAD_MPI
  /* NULL makes the print functions skip their rank-0 check */
  cr_hack = NULL;
#else
  cr_hack = cr;
#endif

and replace cr with cr_hack in all md_print_warn() and md_print_info() calls.

This is a rather hackish workaround, but to me it seems the least invasive solution, unless we want to strip out the mutex-based implementation (which I'd be in favor of).

PS: I set the target version because the problem has been well defined and there is a suggestion for the solution.

#5 Updated by Mark Abraham over 6 years ago

  • Status changed from New to Fix uploaded

#6 Updated by Berk Hess over 6 years ago

  • Status changed from Fix uploaded to Resolved
  • % Done changed from 0 to 100

#7 Updated by Szilárd Páll over 6 years ago

  • Status changed from Resolved to Closed
