Bug #1359

race condition(s) in hardware detection(?)

Added by Mark Abraham almost 6 years ago. Updated almost 6 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: mdrun
Target version:
Affected version - extra info:
Affected version:
Difficulty: uncategorized

Description

As noted in https://gerrit.gromacs.org/2633, there is a data race on (at least) pin_stride. It shows up as non-reproducible .log file output, where the pin stride in use is reported sometimes as auto-selected and sometimes as user-set, and it is also flagged by clang TSan.

Currently blocking 4.6.4 until we know at least the severity of the underlying problem.

Associated revisions

Revision 95d10d39
Added by Berk Hess almost 6 years ago

reorganized GPU detection and selection

The GPU selection has been separated from the GPU detection
and now happens after the thread-MPI threads are started.
The GPU user/auto-selected options have been removed from
gmx_hw_info_t, such that it only contains hardware info
and can be passed around as const.
As both the CPU and GPU options structs are now tMPI rank local,
tMPI thread concurrency issues are avoided.
Fixes #1334 #1359

The GPU detection is now skipped with mdrun -nb cpu
CPU acceleration binary/hardware mismatch is now only printed once
to stderr (instead of #MPI-rank times to stdout).
Removed the master_inf_t struct.

Change-Id: If497f611b911808f6d01ca83f41ae288061dd361
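
The pattern described in the commit message above (read-only hardware info shared between ranks, with every user-set or auto-selected option kept rank local) can be illustrated with a small sketch. This is not GROMACS code: Python threads stand in for thread-MPI ranks, and `HwInfo`, `RankOptions`, `select_options`, `run_ranks`, the stride heuristic and the round-robin GPU assignment are all hypothetical names and rules chosen only for illustration.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class HwInfo:
    """Detected hardware only (hypothetical); frozen, so ranks cannot modify it."""
    n_cores: int
    n_gpus: int


@dataclass
class RankOptions:
    """Per-rank choices (hypothetical); each rank owns its own instance."""
    pin_stride: int
    pin_stride_user_set: bool
    gpu_id: int  # -1 means no GPU assigned


def select_options(hw, rank, n_ranks, user_pin_stride=0):
    """Build rank-local options from the read-only hardware info.

    Assumption for this sketch: user_pin_stride == 0 means "not set by the user".
    """
    user_set = user_pin_stride > 0
    # Illustrative heuristic only: spread the ranks evenly over the cores.
    stride = user_pin_stride if user_set else max(1, hw.n_cores // n_ranks)
    # Illustrative round-robin GPU assignment.
    gpu_id = rank % hw.n_gpus if hw.n_gpus > 0 else -1
    return RankOptions(stride, user_set, gpu_id)


def run_ranks(hw, n_ranks, user_pin_stride=0):
    """Start one thread per rank; each thread writes only its own result slot."""
    results = [None] * n_ranks

    def worker(rank):
        # Selection happens after the threads have started and touches no
        # shared mutable state: hw is read-only, the options are local.
        results[rank] = select_options(hw, rank, n_ranks, user_pin_stride)

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(n_ranks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_ranks(HwInfo(n_cores=8, n_gpus=2), n_ranks=4)` returns one `RankOptions` per rank. Because no thread writes to data that another thread reads, the auto-selected versus user-set flag is the same on every run, which is the property the non-reproducible .log output showed to be missing.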

History

#1 Updated by Mark Abraham almost 6 years ago

  • Description updated

#2 Updated by Mark Abraham almost 6 years ago

  • Status changed from New to Fix uploaded

#3 Updated by Berk Hess almost 6 years ago

  • Status changed from Fix uploaded to Resolved
  • % Done changed from 0 to 100

#4 Updated by Rossen Apostolov almost 6 years ago

  • Status changed from Resolved to Closed
