Project

General

Profile

Task #1729

Resolve whether and how to resolve "state" variables stored in .tpr

Added by Mark Abraham over 4 years ago. Updated over 4 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
preprocessing (pdb2gmx,grompp)
Target version:
Difficulty:
uncategorized
Close

Description

Following on from some discussion on https://gerrit.gromacs.org/#/c/4575/...

In the dark ages, the .tpr was the only way to tell mdrun where to start for free-energy algorithms, but these days the state variables can also be read from a checkpoint file. The inputrec has init-lambda and init-lambda state, but grompp fills fields like state->fep_state (at least, and probably others), only to perhaps have them over-written by the checkpoint later. Michael's free-energy changes contain a lot of comments that we need to handle such initialization more sanely. It is bug-prone to have code that handles setting and I/O of state variables to both the .tpr and .cpt.

I can see three reasonable ways forward

  1. Instead of writing state variables to the .tpr, bundle the state data (I/O done in the checkpoint-file format by the checkpoint-file code) after the parameter and inputrec data.
  2. As above, but have grompp write a separate checkpoint file
  3. Continue as before, but move free-energy state-initialization code from grompp to mdrun and run it instead of reading the checkpoint, when there is no checkpoint to read

2 has the defect that can't start a run from a single file any more, and I think that is useful enough that we want to keep it. 1 is a bit of work, but is conceptually clean - under the hood there's always a "checkpoint file" to read - so the implementation is easier to write and maintain. 3 is the easiest to do, but it's not a useful step on the path to 1.


Related issues

Related to GROMACS - Bug #1730: gmx compare does not compare all fields of a .tprNew05/12/2015
Related to GROMACS - Task #2971: Rework TPR reading to allow reading of raw bytes from disk and communication of complete information at setup timeClosed

Associated revisions

Revision 9fbe7e5b (diff)
Added by Mark Abraham over 4 years ago

Fix uninitialized fields in grompp t_state

Allocating t_state on the stack and using an incomplete
pseudo-constructor means it is possible to write an uninitialized
value in fep_state field to the .tpr file, which we've been
doing. Found with Memory Sanitizer. Perhaps this behaviour lies behind
some of the strange behaviour that is periodically seen on Jenkins.

Fixed by allocating t_state on the heap, which might resolve other
issues, since snew() zeroes the memory as a side effect. Also
initialized fep_state field in init_state().

Refs #1729, #1730

Change-Id: Ibcee7bff1e090fb1991969c4562f44f056868a03

History

#1 Updated by Mark Abraham over 4 years ago

  • Related to Bug #1730: gmx compare does not compare all fields of a .tpr added

#2 Updated by Gerrit Code Review Bot over 4 years ago

Gerrit received a related patchset '2' for Issue #1729.
Uploader: Mark Abraham ()
Change-Id: Ibcee7bff1e090fb1991969c4562f44f056868a03
Gerrit URL: https://gerrit.gromacs.org/4575

#3 Updated by Berk Hess over 4 years ago

The doesn't seem to be an issue for the fix of initializing fep_state in state, so I'm writing here.
Apart from all the valid points made above, it seems to me that state->fep_state should only be accessed when free-energy=yes. The only exceptions might be printing and comparing the state. Since the current code neither prints nore compares fep_state without free-energy, this sounds like there's are bug somewhere where code read fep_state when free-energy=no. Mark, did you find use of uninitialized fep_state?

I added a print of fep_state to gmx dump and saw that all reference tprs in our regressiontests have fep_state=0.

#4 Updated by Mark Abraham over 4 years ago

write_tpx from grompp on master copied the uninitialized state->fep_state value from new_status into IIRC a tpx-header structure and wrote that. MSan complained when that memory got passed to the XDR write routine

#5 Updated by Paul Bauer 6 months ago

  • Related to Task #2971: Rework TPR reading to allow reading of raw bytes from disk and communication of complete information at setup time added

Also available in: Atom PDF