Bug #2319

Affinity setting fails when only starting a single thread

Added by Erik Lindahl almost 2 years ago. Updated almost 2 years ago.

Status: Closed
Priority: Low
Assignee:
Category: mdrun
Target version:
Affected version - extra info:
Affected version:
Difficulty: uncategorized

Description

When starting only a single thread (-nt 1), thread affinity setting fails with beta1, both on Skylake-X with gcc/7.1 and on Broadwell with gcc-5.4.

It works fine at least when using as many threads as there are hardware threads.

run.log (23.5 KB), Erik Lindahl, 12/04/2017 01:35 AM
run.tpr (354 KB), Erik Lindahl, 12/04/2017 01:35 AM

Related issues

Is duplicate of GROMACS - Bug #2088: affinity setting code issues spurious warning notes (Closed)

Associated revisions

Revision 7bc4ec9b
Added by Berk Hess almost 2 years ago

Tone down note about not pinning threads

The note mdrun prints to stderr and log file when not pinning
threads is now a short note instead of the old failure message,
since not pinning is often a choice, not a failure. With more than
one thread, a failure or longer explanatory note will anyhow have
been issued before.
Also removed a warning about OpenMP set affinity when the total
number of threads was chosen by mdrun.
Updated the thread affinity tests for the new message. Added tests
that are now possible with the recently introduced auto thread count
variable.

Fixes #2088 and #2319

Change-Id: I1411a1ce6e222d22da8d70bf7bab2c9bb7564507

History

#1 Updated by Szilárd Páll almost 2 years ago

  • Status changed from New to Blocked, need info

I first thought you observed a regression compared to v2016, but having done multiple builds and tests I can't reproduce at all, so I assume this is simply a duplicate of #2088.

#2 Updated by Erik Lindahl almost 2 years ago

Steps to reproduce:

1) Build latest master revision on login.biopysics.kth.se, with default compiler & options (see log for gory details)

2) Run the attached TPR as "gmx mdrun -nt 1 -v -deffnm run" (I don't expect it to be specific to this file, though)

This gives me (100% of the time):

NOTE: Thread affinity setting failed. This can cause performance degradation.
If you think your settings are correct, ask on the gmx-users list.

#3 Updated by Erik Lindahl almost 2 years ago

... and I just noticed that you are right about 2016.3 giving the same error.

In any case: This is a perfectly normal run (just limited to a single core), so either there's a bug in the pinning code, or (if we cannot pin in this case) we shouldn't issue messages telling people to ask on gmx-users if they think their settings are correct (we know they are here).

#4 Updated by Mark Abraham almost 2 years ago

Not sure if related, but there is also

1 GPU auto-selected for this run.
Mapping of GPU IDs to the 1 GPU task in the 1 rank on this node:
  PP:0

NOTE: Thread affinity setting failed. This can cause performance degradation.
      If you think your settings are correct, ask on the gmx-users list.
starting mdrun 'spc-and-methanol'
2 steps,      0.0 ps.

from http://jenkins.gromacs.org/job/Matrix_PreSubmit_2018/37/OPTIONS=gcc-5%20openmp%20simd=avx_128_fma%20opencl%20amdappsdk-3.0%20host=bs_nix-amd_gpu,label=bs_nix-amd_gpu/testReport/junit/(root)/CTest/MdrunTests/ (which ends in a segfault in what might be the affinity setting code...)

#5 Updated by Berk Hess almost 2 years ago

We only pin automatically when using the whole machine or when mdrun automatically reduced the number of threads. So we correctly don't pin here, but we should not issue a warning that pinning failed. We should either issue a note that we are not pinning or don't say anything at all.
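The rule described in this comment can be sketched as a small predicate. This is only an illustration in Python, not the mdrun implementation: the function and parameter names are hypothetical, and comparing the requested thread count with the hardware thread count is an assumed stand-in for "using the whole machine".

```python
def should_pin_automatically(
    threads_requested: int,
    hardware_threads: int,
    count_reduced_by_mdrun: bool,
) -> bool:
    """Return True when automatic pinning would be attempted (illustrative only).

    Assumption: "using the whole machine" is modelled as the run using exactly
    as many threads as there are hardware threads.
    """
    uses_whole_machine = threads_requested == hardware_threads
    # Pin when the whole machine is used, or when mdrun itself lowered
    # the thread count; otherwise leave affinities alone.
    return uses_whole_machine or count_reduced_by_mdrun
```

Under these assumptions, a run started with -nt 1 on a multi-core machine makes the predicate false, which matches the report: not pinning is the expected outcome here rather than a failure.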

#6 Updated by Szilárd Páll almost 2 years ago

OK, so there is nothing to fix here unless this is more than just #2088 -- where solutions were discussed. This is essentially an artifact of work that simplified/unified code paths, so the same path is now taken both when affinities are expectedly not set and when they are unexpectedly not set, and we can't distinguish the two cases at the spot where we warn.
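One way to make the two cases distinguishable at the point where the note is printed is to carry an explicit outcome value. The following Python sketch only illustrates that idea and is not the fix that was merged: the `PinOutcome` type, the `affinity_note` function, and the wording of the short note are hypothetical, while the failure text is the one quoted in this issue.

```python
from enum import Enum, auto
from typing import Optional


class PinOutcome(Enum):
    PINNED = auto()         # affinities were set
    NOT_ATTEMPTED = auto()  # pinning was skipped on purpose
    FAILED = auto()         # pinning was attempted but did not succeed


# Failure text as quoted in this issue.
FAILURE_NOTE = (
    "NOTE: Thread affinity setting failed. This can cause performance degradation.\n"
    "If you think your settings are correct, ask on the gmx-users list."
)
# Hypothetical wording for the short note; a placeholder, not taken from the source.
SHORT_NOTE = "NOTE: Thread affinity was not set."


def affinity_note(outcome: PinOutcome) -> Optional[str]:
    """Pick the message to print for a given pinning outcome (illustrative only)."""
    if outcome is PinOutcome.FAILED:
        return FAILURE_NOTE
    if outcome is PinOutcome.NOT_ATTEMPTED:
        return SHORT_NOTE
    return None  # nothing to report when pinning succeeded
```

With such a value, a deliberate decision not to pin yields only the short note, and the longer message is reserved for a real failure.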

#7 Updated by Mark Abraham almost 2 years ago

  • Is duplicate of Bug #2088: affinity setting code issues spurious warning notes added

#8 Updated by Mark Abraham almost 2 years ago

Szilárd Páll wrote:

OK, so there is nothing to fix here unless this is more than just #2088 -- where solutions were discussed. This is essentially an artifact of work that simplified/unified code paths, so the same path is now taken both when affinities are expectedly not set and when they are unexpectedly not set, and we can't distinguish the two cases at the spot where we warn.

Agree that this is a dupe. Szilard, can you please work on a resolution?

#9 Updated by Gerrit Code Review Bot almost 2 years ago

Gerrit received a related patchset '1' for Issue #2319.
Uploader: Berk Hess
Change-Id: gromacs~release-2018~I1411a1ce6e222d22da8d70bf7bab2c9bb7564507
Gerrit URL: https://gerrit.gromacs.org/7291

#10 Updated by Berk Hess almost 2 years ago

  • Status changed from Blocked, need info to Fix uploaded
  • Assignee set to Berk Hess

#11 Updated by Erik Lindahl almost 2 years ago

  • Status changed from Fix uploaded to Resolved

#12 Updated by Erik Lindahl almost 2 years ago

  • Status changed from Resolved to Closed
