Bug #390
my_prev.cpt should not be overwritten by a my.cpt file of zero size
Description
Hello,
This is not a gromacs problem per se, but given that we all work on clusters that are sometimes unstable, I still think that it would be an important enhancement.
Our cluster recently experienced GPFS problems. During this period, gromacs wrote an empty my.cpt file (not gromacs's fault, I admit). Then, one checkpoint interval later, gromacs copied this (empty) my.cpt file to my_prev.cpt and wrote another empty my.cpt file (again, not gromacs's fault that the new my.cpt was empty).
Now I have no usable .cpt file, and my xtc file, while mostly fine, is corrupted at the end. I'm going to need to use trjconv to cut the corrupted part out of the .xtc, use trjconv -dump to extract a frame, and grompp a new tpr (not to mention the problem with simulation_part, which can also be worked around: http://bugzilla.gromacs.org/show_bug.cgi?id=389).
So I think that it would be quite amazing if mdrun would check to see that the existing my.cpt is non-empty before copying it to my_prev.cpt in preparation for writing a new my.cpt.
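To illustrate the requested behaviour, here is a minimal sketch in Python. It is not mdrun's actual code: the function name rotate_checkpoint and the temporary-file suffix are hypothetical, and the only assumption is that a missing or zero-size my.cpt should never replace an existing my_prev.cpt. A non-zero size does not prove the checkpoint is valid, so this only guards against the empty-file case described above.

```python
import os
import shutil


def rotate_checkpoint(current="my.cpt", previous="my_prev.cpt"):
    """Copy `current` over `previous` only if `current` exists and is non-empty.

    Hypothetical helper for illustration; returns True if the backup was
    refreshed and False if the old backup was left untouched.
    """
    try:
        size = os.path.getsize(current)
    except OSError:
        # Missing or unreadable checkpoint: keep the old backup.
        return False

    if size == 0:
        # Zero-size checkpoint (e.g. after a filesystem fault): keep the old backup.
        return False

    # Copy to a temporary name first and then rename, so that an interrupted
    # copy cannot leave a truncated backup under the final name.
    tmp = previous + ".tmp"
    shutil.copyfile(current, tmp)
    os.replace(tmp, previous)
    return True
```

Calling rotate_checkpoint() just before writing a new my.cpt would leave a good my_prev.cpt in place whenever the current checkpoint is empty.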
Thanks,
Chris.
History
#1 Updated by Chris Neale almost 11 years ago
Perhaps I should clarify that this happened to a randomly selected 600 of the 1400 jobs that were running when we encountered the GPFS problems, so it's more than a 10-minute task to fix. I also suspect that MD users will increasingly be working with large numbers of simulations.
#2 Updated by Sander Pronk over 10 years ago
Fixed together with new state_prev.cpt handling in the current git version.