steepest descent minimization works with ns_type=simple but not ns_type=grid
Occasionally, steepest descent minimization explodes for me (but not always --
usually it is fine). I finally established that, in the cases where it fails,
ns_type=simple works, but not ns_type=grid. Attached are two tpr files for
minimization illustrating this difference. The one using ns_type=grid exits
with an error after 12 steps of minimization; the other finishes 500 steps just
fine and is stable in subsequent MD. I mentioned this on the users list.
Is this a bug? Or does it just mean ns_type=grid doesn't work well for
minimization for some reason? Or do I have some other input option set wrong?
#2 Updated by David van der Spoel about 14 years ago
On my iMac G5 both calculations give almost identical results and converge after
25 steps. Incidentally they both produce step11 and step12 .pdb files because of
a settle problem. The G5 message is the normal one:
Stepsize too small, or no change in energy.
Converged to machine precision,
but not to the requested precision Fmax < 100
On an Opteron however, both minimizations continue to 500 steps. This is weird...
The simple and grid neighborsearching algorithms produce neighborlists in a
different order, which means that on any platform the results will diverge due
to round-off (as can be tested with gmxcheck). Since neither of the two can be
used as a standard we have to accept this fact. That also means that
occasionally there will be crashes in either of the algorithms, and the grid
algorithm is somewhat more sensitive to crashes than the simple algorithm.
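
As an illustration of the round-off point above, here is a minimal Python sketch (not GROMACS code). It assumes a hypothetical set of three pair contributions and two hypothetical neighborlist orderings, and shows that accumulating the same terms in a different order can give a different floating-point total. The values are chosen to exaggerate the effect; in a real run the differences start in the last bits and grow over many steps.

```python
import math

def accumulate(contributions):
    """Sum contributions strictly in the order given, as a plain loop would."""
    total = 0.0
    for value in contributions:
        total += value
    return total

# Hypothetical pair contributions: two large terms that cancel and one small term.
# These numbers are illustrative only and do not come from any real tpr file.
order_a = [1.0e16, 1.0, -1.0e16]   # small term is added between the large ones
order_b = [1.0e16, -1.0e16, 1.0]   # large terms cancel first, then the small term

sum_a = accumulate(order_a)
sum_b = accumulate(order_b)
exact = math.fsum(order_a)         # correctly rounded sum, independent of order

print("order A:", sum_a)
print("order B:", sum_b)
print("exact  :", exact)
```

Both lists contain exactly the same terms, yet order A loses the small contribution, because 1.0 is below the spacing of doubles near 1e16, while order B keeps it. Neither ordering is "the" reference, which is why two neighborsearching schemes can legitimately diverge.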
Since I cannot reproduce the problem, could you tell me which GROMACS version
you are using, on which platform, and with which compiler?
#3 Updated by David Mobley about 14 years ago
That is really weird.
Platforms: These particular calculations were run on Pentium 4's. But I also
have occasional identical problems on an Opteron cluster; I can probably dig up
tprs which show the same behavior from the Opterons. This is an overarching
problem on my end: I run a lot of calculations on many slightly
different systems (the same protein binding a lot of different ligands, each of
which may have several different orientations). Occasionally, one of the ligand
orientations has this problem. Now that I have switched to ns_type=simple, it
seems to never occur anymore.
Compilers: gcc 4.0.2 (on both the Pentium 4's and the Opterons). These are with
Fedora Core 4. I basically just ran configure and make with FC 4 and didn't do
anything special, except using FFTW 2 (although the Opteron version has been
compiled with FFTW 3).
Again, if it would be helpful, I can try to give you a pair of tpr files which
cause the problem on my Opterons.
#4 Updated by David van der Spoel about 14 years ago
On my Mac I have gcc 4.0.0, on the Opteron 3.4.4.
There is a slim chance that FFTW2 could be involved in the problem (or our
linking to it). This has changed in 3.3. The bottom line for me is that if I cannot
reproduce it I can't fix it, so if you can cough up another pair of tpr files
that show this problem it would be interesting. Maybe you could start by testing
the current tpr files on the different platforms you have available. Anyway, I
appreciate your cooperation in stomping out the last few bugs from GROMACS :).
#5 Updated by David Mobley about 14 years ago
I'm sorry, but now I'm having trouble turning up any tpr files which can reproduce the problem on the
Opterons. I think it really must be pretty sporadic and partly due to finite precision, as you suggest. I
had some before, but I fixed the problem by going to ns_type=simple, and now when I go back to
ns_type=grid I can't find any that crash.
Anyway, I guess I suggest ignoring this for now and if I am able to find any way to reliably reproduce the
problem, I will update later.
Thanks for the fix on the g_angle bug.