Last frame not written when re-starting from a checkpoint
I have run a series of simulations, each 100 ns in length. I do the runs 10 ns at a time, and restart from checkpoints after each segment, i.e.:
tpbconv_4.0.7_s -s ./0_10ns/md_0_10.tpr -o md_10_20.tpr -until 20000
mpirun -np 24 mdrun_4.0.7_gcc_mpi -cpi ./0_10ns/md_0_10.cpt -noaddpart -deffnm md_10_20
tpbconv_4.0.7_s -s ./0_10ns/md_0_10.tpr -o md_20_30.tpr -until 30000
mpirun -np 24 mdrun_4.0.7_gcc_mpi -cpi ./10_20ns/md_10_20.cpt -noaddpart -deffnm md_20_30
What I have noticed is that the trajectory file that is written (.xtc) does not contain the last frame. This is not a big deal for the most part, since, for instance, the first frame of md_10_20.xtc contains the frame at 10 ns. The problem comes at the end of the run. All of my trajectories are now showing only 10000 frames (instead of 10001, since I saved every 10 ps), and end at 99990 ps, instead of 100000. The final .gro file is written correctly, and the .edr file is written properly. It seems that only the .xtc file is affected. I can add back this last frame by converting the .gro file to .xtc and using trjcat, but I would think it would be preferable if mdrun wrote the complete trajectory, since overlapping frames within the trajectory (i.e., at 10000, 20000, etc) are by default over-written with trjcat.
#2 Updated by Berk Hess almost 10 years ago
Are you sure your tpr files actually have an nsteps that runs up to exactly a multiple
of 10000 ps?
In Gromacs 4 (and all versions before) the time step is stored as a real,
which is a float when Gromacs is compiled in single precision. This can give rounding errors around
the times you are using, leading to a few steps more or fewer than you want.
If that's the case, you need to use -nsteps for tpbconv.
In 4.5 I changed all time variables to doubles, which gets rid of this nasty issue.
#3 Updated by Justin Lemkul almost 10 years ago
Thanks, Berk. That appears to be what's going on. The md_10_20.tpr file specifies 10000000 steps, but md_20_30.tpr specifies 14999999 instead of 15000000. I've yet to run extensive simulations with the 4.5 beta series, but it sounds like you've already taken care of this.
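The rounding effect described in this thread can be illustrated with a short Python sketch. It is not the tpbconv source, and the exact arithmetic tpbconv performs is an assumption here: the sketch stores the end time and the time step as single-precision values, divides them, rounds the quotient to single precision, and truncates to an integer. The 2 fs (0.002 ps) time step is inferred from the reported 10000000 steps for a 20000 ps end time, and the helper names (to_f32, steps_single, steps_double) are hypothetical.

```python
import struct


def to_f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def steps_single(t_end_ps, dt_ps):
    """Step count with operands and quotient rounded to float32, then truncated.

    Assumption: this mimics a single-precision build; it is not taken from
    the Gromacs source.
    """
    dt = to_f32(dt_ps)
    t_end = to_f32(t_end_ps)
    return int(to_f32(t_end / dt))


def steps_double(t_end_ps, dt_ps):
    """Same calculation carried out entirely in double precision."""
    return int(t_end_ps / dt_ps)


if __name__ == "__main__":
    dt_ps = 0.002  # assumed 2 fs time step, inferred from the step counts in the thread
    for t_end_ps in (20000.0, 30000.0):
        print(t_end_ps, steps_single(t_end_ps, dt_ps), steps_double(t_end_ps, dt_ps))
```

Under these assumptions the single-precision path gives 10000000 steps for 20000 ps but 14999999 for 30000 ps, the same pattern as the step counts reported above, while the double-precision path gives 10000000 and 15000000. The cause is that 0.002 has no exact binary representation, and its nearest single-precision value is slightly larger than 0.002, so the quotient for 30000 ps falls just below 15000000 and is truncated.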