Bug #390

my_prev.cpt should not be overwritten by a my.cpt file of zero size

Added by Chris Neale almost 10 years ago. Updated over 9 years ago.

Status: Closed
Priority: Normal
Assignee: Erik Lindahl
Category: mdrun
Target version:
Affected version - extra info:
Affected version:
Difficulty: uncategorized

Description

Hello,

This is not a GROMACS problem per se, but given that we all work on clusters that are sometimes unstable, I still think it would be an important enhancement.

Our cluster recently experienced GPFS problems. During this period, GROMACS wrote an empty my.cpt file (not GROMACS's fault, I admit). Then, one checkpoint interval later, GROMACS copied this empty my.cpt to my_prev.cpt and wrote another empty my.cpt (again, not GROMACS's fault that the new file was empty).

Now I have no usable .cpt file, and my .xtc file, while mostly fine, is corrupted at the end. I will need to use trjconv to cut the corrupted part out of the .xtc, use trjconv -dump to extract a frame, and run grompp to build a new .tpr (not to mention the problem with simulation_part, which can also be worked around: http://bugzilla.gromacs.org/show_bug.cgi?id=389).

So I think it would be very helpful if mdrun checked that the existing my.cpt is non-empty before copying it to my_prev.cpt in preparation for writing a new my.cpt.
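To make the request concrete, here is a minimal sketch of the check in Python. It is not mdrun's actual code: the function name `backup_checkpoint` and the `.tmp` suffix are hypothetical, chosen only for illustration. It refreshes the backup only when the current checkpoint exists and has non-zero size.

```python
import os
import shutil


def backup_checkpoint(current: str, previous: str) -> bool:
    """Copy `current` to `previous` only if `current` exists and is non-empty.

    Returns True if the backup was refreshed, False if it was left untouched.
    Hypothetical helper for illustration; not part of GROMACS.
    """
    try:
        size = os.path.getsize(current)
    except OSError:
        # The current checkpoint is missing or unreadable: keep the old backup.
        return False
    if size == 0:
        # An empty checkpoint must never replace a possibly good backup.
        return False
    # Copy to a temporary name first, then rename over the old backup, so an
    # interrupted copy does not leave a half-written backup in its place.
    tmp = previous + ".tmp"
    shutil.copyfile(current, tmp)
    os.replace(tmp, previous)
    return True
```

A size check only catches the zero-length case described here. A checkpoint that is truncated but non-empty would still pass, so a stricter version would have to validate the file contents as well.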

Thanks,
Chris.

History

#1 Updated by Chris Neale almost 10 years ago

Perhaps I should clarify that this happened to a seemingly random 600 of the 1400 jobs that were running when we encountered the GPFS problems, so fixing it is more than a 10-minute task. I also suspect that MD users will increasingly be working with large numbers of simulations.
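With that many jobs, the first step is finding out which ones are affected. A small sketch of how this could be done, assuming each job writes its checkpoints somewhere under a common root directory (the function name `find_empty_checkpoints` and the directory layout are assumptions for illustration):

```python
from pathlib import Path


def find_empty_checkpoints(root):
    """Return the sorted paths of all zero-size .cpt files under `root`."""
    return sorted(
        p
        for p in Path(root).rglob("*.cpt")
        if p.is_file() and p.stat().st_size == 0
    )
```

Calling `find_empty_checkpoints("/path/to/jobs")` with the real root directory lists every empty my.cpt and my_prev.cpt, which identifies the jobs that need to be rebuilt by hand.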

#2 Updated by Sander Pronk over 9 years ago

Fixed together with new state_prev.cpt handling in the current git version.
