Bug #2680

mdrun-non-integrator-test with nt-mpi>2

Added by Roland Schulz almost 2 years ago. Updated over 1 year ago.

Status: Closed
Priority: Normal
Assignee:
Category: testing
Target version:
Affected version - extra info:
Affected version:
Difficulty: uncategorized

Description

With nt-mpi 3:
MdrunIsReproduced/MdrunRerunFreeEnergyTest.WithinTolerances/22

Source file: src/gromacs/mdlib/update.cpp (line 916)
Function: auto doSDUpdateGeneral(gmx_stochd_t *, int, int, real, const rvec *, const ivec *, const real *, const unsigned short *, const unsigned short *, const unsigned short *, const unsigned short *, const rvec *, rvec *, rvec *, const rvec *, int64_t, int, const int *)::(anonymous class)::operator()() const
MPI rank: 1 (out of 3)

Assertion failed:
Condition: f != nullptr
SD update with only forces requires forces

With nt-mpi 4:
[ FAILED ] NormalMdrunIsReproduced/MdrunRerunTest.WithinTolerances/7, where GetParam() = ("spc5", "sd")

With nt-mpi 5:
MinimizersWorkWithConstraints/EnergyMinimizationTest.WithinTolerances/4
Fatal error:
There is no domain decomposition for 5 ranks that is compatible with the given
box and a minimum cell size of 0.452126 nm

It is fine if the tests don't work with certain numbers of threads, but that should probably be documented. Ideally, running with an unsupported number should give a nicer error, or the tests should be disabled for thread counts that are not supported.
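The last suggestion, disabling a test for unsupported counts, could be sketched roughly as below. This is only an illustration: `reasonToSkip`, its parameters, and the message text are hypothetical names and not part of the GROMACS test harness.

```cpp
#include <optional>
#include <set>
#include <string>

// Hypothetical sketch, not the GROMACS test-harness API.
// Returns a human-readable reason when the test should be skipped for
// the requested rank count, or std::nullopt when the test can run.
std::optional<std::string> reasonToSkip(const std::string&   testName,
                                        int                  numRanks,
                                        const std::set<int>& supportedRankCounts)
{
    if (supportedRankCounts.count(numRanks) > 0)
    {
        return std::nullopt;
    }
    return "Skipping " + testName + ": not supported with "
           + std::to_string(numRanks) + " ranks";
}
```

A test body would call this first and report the returned message as a skip, rather than letting mdrun reach an assertion or a fatal error.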

Associated revisions

Revision 875b03c8 (diff)
Added by Mark Abraham over 1 year ago

Improve robustness of end-to-end testing

Previously one could choose the number of ranks to run the test binary
with, but e.g. DD errors could exit the binary if an unsuitable number
was chosen.

Fixes #2680

Change-Id: I5ca165c6a80763083f2db9beec61ad3b4bfbe00d

History

#1 Updated by Mark Abraham almost 2 years ago

  • Category set to testing
  • Status changed from New to Accepted
  • Assignee set to Mark Abraham
  • Target version set to 2019-beta1

#2 Updated by Berk Hess almost 2 years ago

I assume this is due to the larger DD cutoff that results from the update groups. If so, other solutions would be to decrease the cutoff or to increase the box size.
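To illustrate why a smaller cutoff or a larger box helps, here is a deliberately simplified sketch of the compatibility check. It assumes a rectangular box and only asks whether some grid nx * ny * nz equal to the rank count keeps every cell at least as wide as the minimum cell size. The function name `hasCompatibleGrid` is hypothetical, and the real GROMACS domain decomposition setup considers much more than this.

```cpp
#include <array>

// Hypothetical helper, not GROMACS code. Returns true if some grid
// nx * ny * nz == numRanks keeps every cell at least minCellSize wide.
// Assumes a rectangular box; ignores PME ranks, triclinic cells, etc.
bool hasCompatibleGrid(int numRanks, const std::array<double, 3>& box, double minCellSize)
{
    if (numRanks < 1 || minCellSize <= 0)
    {
        return false;
    }
    for (int nx = 1; nx <= numRanks; ++nx)
    {
        if (numRanks % nx != 0)
        {
            continue;
        }
        const int rest = numRanks / nx;
        for (int ny = 1; ny <= rest; ++ny)
        {
            if (rest % ny != 0)
            {
                continue;
            }
            const int nz = rest / ny;
            if (box[0] / nx >= minCellSize && box[1] / ny >= minCellSize
                && box[2] / nz >= minCellSize)
            {
                return true;
            }
        }
    }
    return false;
}
```

The box size of the failing test system is not given in this issue. With an assumed 2.0 nm cubic box and the reported minimum cell size of 0.452126 nm, the check passes for 4 ranks but fails for 5, because 5 is prime and 2.0 / 5 = 0.4 nm is too narrow. Lowering the minimum cell size below 0.4 nm or enlarging the box to 2.5 nm makes 5 ranks feasible in this sketch.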

#3 Updated by Mark Abraham almost 2 years ago

There are some tiny systems and no checks, so that's the first thing to fix.

#4 Updated by Mark Abraham almost 2 years ago

The assertion failure was fixed by the MSAN fix. The others need more support from the test runner to be handled stably. WIP.

#5 Updated by Gerrit Code Review Bot almost 2 years ago

Gerrit received a related patchset '1' for Issue #2680.
Uploader: Mark Abraham
Change-Id: gromacs~master~I5ca165c6a80763083f2db9beec61ad3b4bfbe00d
Gerrit URL: https://gerrit.gromacs.org/8522

#6 Updated by Mark Abraham almost 2 years ago

  • Status changed from Accepted to Fix uploaded

#7 Updated by Mark Abraham over 1 year ago

  • Status changed from Fix uploaded to Resolved

#8 Updated by Mark Abraham over 1 year ago

  • Status changed from Resolved to Closed
