Bug #2041

mdrun -resetstep can finish too early

Added by Berk Hess over 3 years ago. Updated almost 3 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: mdrun
Target version:
Affected version - extra info:
Affected version:
Difficulty: uncategorized

Description

With mdrun -resetstep the run can finish before the reset step is reached and print timings without warning. In contrast, we now generate a fatal error when PME tuning is still active at -resetstep. So we should also exit with a fatal error without printing timings when finishing before -resetstep.


Related issues

Related to GROMACS - Task #1781: re-design benchmarking functionality (Accepted)
Related to GROMACS - Bug #2131: mdrun hangs upon "-nsteps " or "-maxh" trigger with more than 20 MPI processes (Closed)

Associated revisions

Revision 1d2d95e3
Added by Mark Abraham almost 3 years ago

Don't print invalid performance data

If mdrun finished before a scheduled reset of the timing information
(e.g. from mdrun -resetstep or mdrun -resethway), then misleading
timing information should not be reported.

Fixes #2041

Change-Id: I4bd4383c924a342c01e9a3f06b521da128f96a35

History

#1 Updated by Mark Abraham about 3 years ago

  • Target version changed from 2016.1 to 2016.2

#2 Updated by Mark Abraham almost 3 years ago

  • Assignee set to Mark Abraham

#3 Updated by Gerrit Code Review Bot almost 3 years ago

Gerrit received a related patchset '1' for Issue #2041.
Uploader: Mark Abraham
Change-Id: gromacs~release-2016~I4bd4383c924a342c01e9a3f06b521da128f96a35
Gerrit URL: https://gerrit.gromacs.org/6428

#4 Updated by Mark Abraham almost 3 years ago

  • Status changed from New to Fix uploaded

I didn't follow Berk's suggestion to add a new fatal error. The fatal error for the interaction between PME tuning and the reset exists because it isn't valid to continue tuning after a reset, nor would it make sense to interpret the performance data if tuning were allowed to continue. But in the present case, it suffices to simply not print performance data when it is known not to be what the user asked for.
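
The approach described above can be illustrated with a minimal sketch. This is not the code from the actual patch: the struct, its fields, the function names, and the message text are all hypothetical, chosen only to show the idea of suppressing the report instead of raising a fatal error.

```cpp
#include <cstdint>
#include <string>

// Hypothetical summary of a finished run; these names are illustrative
// assumptions and do not come from the GROMACS source.
struct RunSummary
{
    std::int64_t resetStep;         // step at which counters were to be reset, or -1 if no reset was requested
    bool         countersWereReset; // true once the requested reset has actually happened
};

// Returns true when the collected timings cover the interval the user asked for.
bool timingsAreValid(const RunSummary& run)
{
    const bool resetWasRequested = (run.resetStep >= 0);
    return !resetWasRequested || run.countersWereReset;
}

// Builds the end-of-run report, omitting performance numbers when the run
// ended before the requested reset took place.
std::string performanceReport(const RunSummary& run, double nsPerDay)
{
    if (!timingsAreValid(run))
    {
        return "Performance data not printed: run ended before the requested counter reset.";
    }
    return "Performance: " + std::to_string(nsPerDay) + " ns/day";
}
```

With this shape the run still exits normally, and the only visible difference from a valid benchmark is that the misleading numbers are missing from the output.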

#5 Updated by Mark Abraham almost 3 years ago

  • Status changed from Fix uploaded to Resolved

#6 Updated by Mark Abraham almost 3 years ago

  • Status changed from Resolved to Closed

#7 Updated by Mark Abraham almost 3 years ago

  • Related to Task #1781: re-design benchmarking functionality added

#8 Updated by Mark Abraham almost 3 years ago

  • Status changed from Closed to Blocked, need info
  • Target version changed from 2016.2 to 2016.3

This fix may break mdrun -resethway with or without PME tuning, because bResetCountersHalfMaxH is then set on all ranks but only ever cleared on the master rank, which might break the new logic at the end of do_md. But perhaps the fix for #2131 also resolves this case.
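
The concern can be shown with a toy model in which ranks are plain vector elements rather than MPI processes. It is a sketch under stated assumptions, not GROMACS code: RankState, resetPending, and the helper functions are hypothetical stand-ins for a flag such as bResetCountersHalfMaxH that is set everywhere but cleared in one place only.

```cpp
#include <cstddef>
#include <vector>

// Toy model of one rank's state; the field name is illustrative only.
struct RankState
{
    bool resetPending = false;
};

// Every rank records that a counter reset has been requested.
void requestReset(std::vector<RankState>& ranks)
{
    for (auto& rank : ranks)
    {
        rank.resetPending = true;
    }
}

// Models the suspected problem: only rank 0 (the master) clears the flag.
void clearOnMasterOnly(std::vector<RankState>& ranks)
{
    if (!ranks.empty())
    {
        ranks[0].resetPending = false;
    }
}

// True when all ranks hold the same flag value, so that any end-of-run
// decision based on it would take the same branch on every rank.
bool ranksAgree(const std::vector<RankState>& ranks)
{
    for (std::size_t i = 1; i < ranks.size(); ++i)
    {
        if (ranks[i].resetPending != ranks[0].resetPending)
        {
            return false;
        }
    }
    return true;
}
```

After requestReset the ranks agree; after clearOnMasterOnly they no longer do when there is more than one rank, which is the kind of divergence that could upset logic at the end of the run.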

#9 Updated by Mark Abraham almost 3 years ago

  • Related to Bug #2131: mdrun hangs upon "-nsteps " or "-maxh" trigger with more than 20 MPI processes added

#10 Updated by Mark Abraham almost 3 years ago

  • Status changed from Blocked, need info to Closed

Yes, looks like -resethway was broken, but now seems to be fixed by the solution for #2131.
