Bug #1633

mdrun -nsteps -1 reports silly numbers

Added by Mark Abraham almost 3 years ago. Updated over 2 years ago.

Status: Closed
Priority: Low
Category: mdrun
Target version:
Affected version - extra info:
Affected version:
Difficulty: uncategorized

Description

The execution of mdrun is OK, but the reporting on stdout (and similarly in the log file) has:

GROMACS:      GROMACS_Cray-XC30-ARCHER_cname_Cray-XC30-ARCHER, VERSION 5.0.2
Executable:   /fs4/d37/d37/mabraham/CRESTA_BENCH_release_v1/applications/GROMACS/tmp/GROMACS_Cray-XC30-ARCHER_ion_channel_i000013/n1p12t2_t001_i01/GROMACS_Cray-XC30-ARCHER_cname_Cray-XC30-ARCHER.exe
Library dir:  /fs4/d37/d37/mabraham/CRESTA_BENCH_release_v1/applications/GROMACS/tmp/GROMACS_Cray-XC30-ARCHER_ion_channel_1000_i000008/n2p24t1_t001_i01/share/gromacs/top
Command line:
  GROMACS_Cray-XC30-ARCHER_cname_Cray-XC30-ARCHER -deffnm bench -noconfout -nsteps -1 -maxh 0.04 -resetstep 1000 -gcom 100

Number of hardware threads detected (48) does not match the number reported by OpenMP (1).
Consider setting the launch configuration manually!
Reading file bench.tpr, VERSION 5.0.3-dev-20141020-bf6deb3 (single precision)
Changing nstlist from 10 to 20, rlist from 1 to 1.028

The number of OpenMP threads was set by environment variable OMP_NUM_THREADS to 2

Overriding nsteps with value passed on the command line: -1 steps, -0.003 ps

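This .tpr uses a 2.5 fs time step, so the final line shown above has two problems. First, -nsteps -1 requests a run with no step limit, so printing a negative duration makes no sense. Second, -1 × 2.5 fs is -0.0025 ps, not -0.003 ps.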
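The arithmetic behind the reported string can be reproduced in a few lines. This is only an illustration in Python: the three-decimal format string is an assumption chosen to match the observed output, since the actual mdrun format string is not shown in this report.

```python
# Values taken from the report: -nsteps -1 with a 2.5 fs time step.
nsteps = -1
dt_ps = 0.0025  # 2.5 fs expressed in ps

duration_ps = nsteps * dt_ps

# Assumed format: three fixed decimals, which cannot represent 0.0025 ps
# and therefore rounds the value.
fixed = "%d steps, %.3f ps" % (nsteps, duration_ps)

# General-purpose formatting keeps the digits below 0.001 ps.
general = "%d steps, %g ps" % (nsteps, duration_ps)

print(fixed)    # -1 steps, -0.003 ps
print(general)  # -1 steps, -0.0025 ps
```

Even with the rounding removed, the negative duration remains meaningless, which is why the two problems need separate handling.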

I have no idea what Cray/EPCC are doing to produce 48 and 1 in the mismatch reported higher up.


Related issues

Related to GROMACS - Feature #1122: Allow to force pinning (Blocked, need info)

Associated revisions

Revision ac6556c4 (diff)
Added by Szilárd Páll over 2 years ago

Fix nstep command line override print

The commit addresses two issues:
- printing a negative simulation length with "-nsteps -1";
- rounding when converting a non-integer time-step value from
fs to ps units.

Fixes #1633

Change-Id: If1aac7e0f4e8e37f3e9777fa4eaa79744f3ccd65

History

#1 Updated by Szilárd Páll over 2 years ago

I have no idea what Cray/EPCC are doing to produce 48 and 1 in the mismatch reported higher up.

This is not necessarily something Cray or EPCC misconfigured; it can also be a sign of an incorrect launch configuration.

I think this commonly seen issue is related to OpenMP initialization (and likely pinning) happening outside of mdrun. Did you set the threads per rank/task flag for the job scheduler? While not a proper fix, aprun -cc none will likely work around the warning.

#2 Updated by Mark Abraham over 2 years ago

  • Target version changed from 5.0.3 to 5.0.4

#3 Updated by Szilárd Páll over 2 years ago

While looking at node sharing setups I managed to reproduce this issue by simply using taskset on the mdrun process (and telling mdrun to pin). E.g.

$ taskset 0x1 $gmx mdrun -ntmpi 1 -ntomp 2
[...]

GROMACS:      gmx mdrun, VERSION 5.0.4-dev-20141209-a79e02b-dirty
Executable:   /nethome/pszilard-projects/gromacs/gromacs-5.0/build_gcc48_pd/bin/gmx
Library dir:  /nethome/pszilard/programs/gromacs-5.0-pd/share/gromacs/top
Command line:
  gmx mdrun -ntmpi 1 -ntomp 2

Number of hardware threads detected (32) does not match the number reported by OpenMP (1).
Consider setting the launch configuration manually!

There is room for improvement. I'd say we should:
  • improve the message by including a hint about what can cause this;
  • make the message shown on encountering non-default affinity with "-pin auto" much more prominent, to emphasize that incorrect affinity settings can cause severe performance loss and that the correct way to run mdrun with external affinities is to explicitly set "-pin off";
  • revisit my previous, not very successful attempt to allow affinity overriding (see #1122).
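The mismatch seen in the taskset reproduction can be observed outside mdrun as well. The sketch below is Linux-only and rests on an assumption: that the count reported by the OpenMP runtime follows the process affinity mask, which is consistent with the taskset experiment but not confirmed here. The function names and the hint text are hypothetical and are not mdrun's.

```python
import os
from typing import Optional, Tuple


def mismatch_warning(detected: int, usable: int) -> Optional[str]:
    """Return a warning with a hint about the likely cause, or None if counts agree."""
    if detected == usable:
        return None
    return (
        f"Number of hardware threads detected ({detected}) does not match "
        f"the number usable by this process ({usable}). An external affinity "
        "mask (for example from taskset or the job launcher) may be set."
    )


def query_counts() -> Tuple[int, int]:
    """Return (online hardware threads, threads allowed by the affinity mask)."""
    detected = os.cpu_count() or 1
    if hasattr(os, "sched_getaffinity"):  # not available on every platform
        usable = len(os.sched_getaffinity(0))
    else:
        usable = detected
    return detected, usable


if __name__ == "__main__":
    print(mismatch_warning(*query_counts()) or "No affinity restriction detected.")
```

Running the script under `taskset 0x1` on a multi-core Linux machine should make the two counts differ, mirroring the 32 versus 1 in the output above.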

#4 Updated by Szilárd Páll over 2 years ago

#5 Updated by Szilárd Páll over 2 years ago

  • Status changed from New to In Progress
  • Assignee set to Szilárd Páll

#6 Updated by Gerrit Code Review Bot over 2 years ago

Gerrit received a related patchset '1' for Issue #1633.
Uploader: Szilárd Páll
Change-Id: If1aac7e0f4e8e37f3e9777fa4eaa79744f3ccd65
Gerrit URL: https://gerrit.gromacs.org/4292

#7 Updated by Mark Abraham over 2 years ago

  • Status changed from In Progress to Closed
