calc_verlet_buffer_size can't handle nstlist==1
kT_fac becomes zero if you run gmx mdrun -nstlist 1, because the list then has no lifetime and so there is no particle drift. This leads to s2 being zero and an arithmetic exception. The code should be asserting against that, because it divides by s2 in several places.
It seems like something should also be recognizing this trivial case and returning a zero-sized buffer, but the code probably isn't resilient enough to do that.
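A minimal sketch of the guard suggested above, not the actual calc_verlet_buffer_size code. All names (toyDriftEstimate, toyVerletBufferSize, varPerStep, driftTolerance) and the drift formula are hypothetical. The sketch assumes the list lifetime is nstlist - 1 steps and only mirrors the failure mode described here: a quantity s2 that scales with the lifetime and is used as a divisor.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the drift estimate; NOT the GROMACS formula.
// It divides by s2, so s2 == 0 must never reach it.
static double toyDriftEstimate(double s2, double buffer)
{
    assert(s2 > 0.0 && "s2 must be positive: the estimate divides by it");
    return std::exp(-buffer * buffer / (2.0 * s2)) / std::sqrt(s2);
}

// Hypothetical wrapper: smallest buffer whose toy drift estimate is
// at or below driftTolerance.
double toyVerletBufferSize(int nstlist, double varPerStep, double driftTolerance)
{
    assert(nstlist >= 1);
    assert(varPerStep > 0.0 && driftTolerance > 0.0);

    // Assumption: a list built at step 0 is reused for nstlist - 1 steps.
    const int lifetimeSteps = nstlist - 1;
    if (lifetimeSteps == 0)
    {
        // Trivial case: the list is rebuilt every step, so there is no
        // drift and no buffer is needed. Returning here avoids s2 == 0.
        return 0.0;
    }

    const double s2 = varPerStep * lifetimeSteps;

    // Bracket the answer, then bisect; the estimate decreases with buffer.
    double lo = 0.0;
    double hi = 1.0;
    while (toyDriftEstimate(s2, hi) > driftTolerance)
    {
        hi *= 2.0;
    }
    for (int i = 0; i < 60; i++)
    {
        const double mid = 0.5 * (lo + hi);
        if (toyDriftEstimate(s2, mid) > driftTolerance)
        {
            lo = mid;
        }
        else
        {
            hi = mid;
        }
    }
    return hi;
}
```

With this shape, nstlist == 1 returns a zero-sized buffer before any division happens, and the assertion in the estimate catches any other path that would pass s2 == 0.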
Fix Verlet buffer calculation with nstlist=1
Under rare circumstances the Verlet buffer calculation code was
called with nstlist=1, which caused a division by zero. The division
by zero is now avoided.
Furthermore, grompp now also determines and prints the Verlet buffer
sizes with nstlist=1, which provides the user with information and adds
#2 Updated by Berk Hess over 3 years ago
I agree it should simply return a zero buffer (but I have to think which checks should be done before, since we shouldn't fail with a strange error when grompp used nstlist=1 and mdrun uses nstlist > 1).
But I can't reproduce this issue. I thought I added a check for all cases with nstlist=1. What setup are you using?
#4 Updated by Mark Abraham over 3 years ago
I found it while implementing the integration tests of https://gerrit.gromacs.org/#/c/5899/17 when I set -nstlist to 1 in terminationhelper.cpp. But so far I indeed haven't been able to reproduce with standalone mdrun, and the gdb backtrace hasn't enlightened me yet.
#7 Updated by Mark Abraham over 3 years ago
- File MdrunTerminationTest_WritesCheckpointAfterMaxhTerminationAndThenRestarts.tpr added
Running gmx mdrun -s MdrunTerminationTest_WritesCheckpointAfterMaxhTerminationAndThenRestarts.tpr -deffnm debug -nstlist 1 with the attached .tpr eventually produces a NaN value in drift. Apparently that is flagged as an FP exception in the test binary, but not in the real gmx. That tpr is just two waters in a decent box with v-rescale.
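One possible reading of the difference between the two binaries, sketched with the standard <cfenv> interface only: a 0/0 division always sets the FE_INVALID status flag and yields NaN, but the program continues unless trapping has been enabled for that flag. The function name below is hypothetical, and whether the test binary actually enables trapping is an assumption, not something established in this thread.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Hypothetical helper: performs 0.0 / 0.0 and reports whether the
// FE_INVALID status flag was raised and the result is NaN.
// The volatile variables keep the compiler from folding the division
// away or moving it across the flag queries.
bool zeroOverZeroRaisesInvalid()
{
    std::feclearexcept(FE_ALL_EXCEPT);

    volatile double s2     = 0.0;
    volatile double result = s2 / s2; // sets FE_INVALID, produces NaN

    const bool   flagged = std::fetestexcept(FE_INVALID) != 0;
    const double value   = result;
    return flagged && std::isnan(value);
}
```

Without trapping, the NaN simply propagates, which would match gmx reporting NaN in drift rather than aborting.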