C++ API for simulation input and output
Functionalities such as (hybrid) Monte Carlo, simulation replicas, replica exchange, and input preparation/manipulation share a need for API access to simulation inputs and outputs.
Additionally, efforts to limit the responsibilities of individual tools (and separate out convenience options) warrant light-weight ways to connect tools, including ways to filter or manipulate trajectory output before it hits the filesystem. See, for instance, #3286.
This issue is intended to collect a road map for design and development.
Related efforts include:
- encapsulation, abstraction, and interface development under nb-lib
- restructuring of simulator launch, collaborations, and data structures related to expansion of the ModularSimulator (links from Pascal? Paul?)
- expansion of the MdModules framework (links from Christian? Others?)
- evolution of modular input handling
- evolution of the checkpoint facilities
- clarifying simulator program state and invariants (#3325, #2375)
To clarify the scope of this issue, we define some use cases below.
Application use cases: features and tools enabled by the API functionality described in this issue.
Ensemble simulation / multi-sim
Temperature replica exchange
Hamiltonian replica exchange
Monte Carlo rejection of a trajectory segment
convert-tpr / gmxapi.modify_input
Filesystem-decoupled input preparation and simulation
Filesystem-decoupled simulation output handling
API use cases driving features within this issue scope, supporting the scenarios expected within the application use cases above.
Obtain a reference to the output of a simulation segment.
Produce input for a simulation segment from the output of a simulation segment.
Obtain a modified SimulationInput from an "editing" operation.
Compose a SimulationInput.
Decompose a SimulationInput (topology, microstate, simulation parameters, metadata, others?).
Fingerprint a SimulationInput: identify the trajectory of which it is a part and the segment that will be produced, uniquely to the point of reproducibility and/or scientific relevance.
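As a discussion aid for the compose, decompose, edit, and fingerprint use cases, here is a minimal sketch. Every name in it (`sketch::SimulationInput`, `Topology`, `Microstate`, `Parameters`, `withParameter`, `fingerprint`, `makeExampleInput`) is hypothetical and does not describe an existing GROMACS interface. It assumes the input is an immutable record whose components are shared rather than copied, and it uses an FNV-1a hash over a canonical string purely as a stand-in for whatever fingerprint scheme is eventually chosen.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace sketch
{

// Hypothetical components; the real decomposition is an open question in this issue.
struct Topology
{
    std::vector<std::string> atomNames;
};

struct Microstate
{
    std::vector<double> positions;
    std::int64_t        step;
};

using Parameters = std::map<std::string, std::string>;

// Immutable record of simulation input. Components are held through
// shared pointers to const, so "editing" never mutates an existing input.
class SimulationInput
{
public:
    // Compose an input from its parts.
    SimulationInput(std::shared_ptr<const Topology>   topology,
                    std::shared_ptr<const Microstate> microstate,
                    std::shared_ptr<const Parameters> parameters) :
        topology_(std::move(topology)),
        microstate_(std::move(microstate)),
        parameters_(std::move(parameters))
    {
    }

    // Decompose: hand out read-only references to the parts.
    std::shared_ptr<const Topology>   topology() const { return topology_; }
    std::shared_ptr<const Microstate> microstate() const { return microstate_; }
    std::shared_ptr<const Parameters> parameters() const { return parameters_; }

    // "Editing" operation: returns a new input that shares the untouched parts.
    SimulationInput withParameter(const std::string& key, const std::string& value) const
    {
        auto edited    = std::make_shared<Parameters>(*parameters_);
        (*edited)[key] = value;
        return SimulationInput(topology_, microstate_, std::move(edited));
    }

    // Illustrative fingerprint: 64-bit FNV-1a over a canonical string.
    std::uint64_t fingerprint() const
    {
        std::string canonical;
        for (const auto& name : topology_->atomNames) { canonical += name + ';'; }
        for (double x : microstate_->positions) { canonical += std::to_string(x) + ';'; }
        canonical += "step=" + std::to_string(microstate_->step) + ';';
        for (const auto& entry : *parameters_) { canonical += entry.first + '=' + entry.second + ';'; }

        std::uint64_t hash = 14695981039346656037ull; // FNV offset basis
        for (unsigned char c : canonical)
        {
            hash ^= c;
            hash *= 1099511628211ull; // FNV prime
        }
        return hash;
    }

private:
    std::shared_ptr<const Topology>   topology_;
    std::shared_ptr<const Microstate> microstate_;
    std::shared_ptr<const Parameters> parameters_;
};

// Builds a small input for illustration only.
inline SimulationInput makeExampleInput()
{
    auto topology       = std::make_shared<Topology>();
    topology->atomNames = { "OW", "HW1", "HW2" };
    auto microstate       = std::make_shared<Microstate>();
    microstate->positions = { 0.0, 0.1, 0.2 };
    microstate->step      = 0;
    auto parameters         = std::make_shared<Parameters>();
    (*parameters)["nsteps"] = "100";
    return SimulationInput(topology, microstate, parameters);
}

} // namespace sketch
```

In this sketch, `makeExampleInput().withParameter("nsteps", "200")` yields a second input that shares the topology and microstate of the first while carrying its own parameters, and the two inputs report different fingerprints. Whether a fingerprint should cover run parameters, or only the data that determines the trajectory, is one of the questions this issue would need to settle.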
Library-internal use cases included by the above API implementation scenarios, or connected to the accompanying (re)factoring.
Apply SimulationInput to consuming modules.
Initialize volatile data (internal state) from the (immutable) record of input.
Coordinate a Memento, or publish a lightweight (opaque) handle to simulator output or checkpoint, without baking in details of data locality or structure.
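The Memento idea can be illustrated with a short sketch, which also shows how it could support Monte Carlo rejection of a trajectory segment. The classes `sketch::Simulator` and `sketch::CheckpointHandle` are hypothetical, and the "simulation" is a trivial stand-in. The point is only that the client can hold and return the handle but cannot inspect it, so data layout and locality stay private to the simulator.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

namespace sketch
{

class Simulator;

// Opaque memento: clients may copy and store it, but only Simulator
// can create one or look inside.
class CheckpointHandle
{
private:
    friend class Simulator;

    struct State
    {
        long                step;
        std::vector<double> positions;
    };

    explicit CheckpointHandle(std::shared_ptr<const State> state) : state_(std::move(state)) {}

    std::shared_ptr<const State> state_;
};

// Toy simulator, standing in for the real thing.
class Simulator
{
public:
    explicit Simulator(std::vector<double> positions) : step_(0), positions_(std::move(positions)) {}

    // Publish a handle to the current state without exposing its structure.
    CheckpointHandle checkpoint() const
    {
        return CheckpointHandle(
                std::make_shared<CheckpointHandle::State>(CheckpointHandle::State{ step_, positions_ }));
    }

    // Roll back to a previously published state, e.g. after a rejected move.
    void restore(const CheckpointHandle& handle)
    {
        step_      = handle.state_->step;
        positions_ = handle.state_->positions;
    }

    // Stand-in for running a trajectory segment.
    void run(long nsteps)
    {
        for (long i = 0; i < nsteps; ++i)
        {
            for (double& x : positions_) { x += 0.01; }
            ++step_;
        }
    }

    long                       step() const { return step_; }
    const std::vector<double>& positions() const { return positions_; }

private:
    long                step_;
    std::vector<double> positions_;
};

} // namespace sketch
```

A client would call `checkpoint()` before a trial segment, `run()` the segment, and call `restore()` with the saved handle if the segment is rejected. Because the handle holds a shared pointer to const data here, copying it is cheap. A real implementation might instead refer to data on an accelerator or in a file, which is exactly the detail the opaque handle is meant to hide.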
Interactions between GROMACS internal modules and the new API facilities or supporting infrastructure.
(Re)initialize internal state.
Dump internal state.
Confirm input validity.
Register information or collaboration dependencies.
Register, publish, or be able to describe available outputs.
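One way to picture these module interactions is as a small abstract interface. The sketch below is hypothetical throughout: `sketch::IStatefulModule`, `sketch::Record`, and `sketch::ToyThermostat` are illustrative names, not existing GROMACS or MdModules classes, and a flat string map stands in for whatever structured record the API would actually pass.

```cpp
#include <cassert>
#include <exception>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

namespace sketch
{

// Hypothetical flat record standing in for a slice of input or a state dump.
using Record = std::map<std::string, std::string>;

// Hypothetical module-facing interface covering the interactions listed above.
class IStatefulModule
{
public:
    virtual ~IStatefulModule() = default;

    // Register information dependencies and describe available outputs.
    virtual std::vector<std::string> requiredInputs() const   = 0;
    virtual std::vector<std::string> availableOutputs() const = 0;

    // Confirm input validity without changing state.
    virtual bool isValid(const Record& input) const = 0;

    // (Re)initialize internal state from an immutable record.
    virtual void initialize(const Record& input) = 0;

    // Dump internal state in a form that initialize() accepts.
    virtual Record dumpState() const = 0;
};

// Toy module used only to exercise the interface.
class ToyThermostat final : public IStatefulModule
{
public:
    std::vector<std::string> requiredInputs() const override { return { "ref_t" }; }
    std::vector<std::string> availableOutputs() const override { return { "temperature" }; }

    bool isValid(const Record& input) const override
    {
        const auto entry = input.find("ref_t");
        if (entry == input.end()) { return false; }
        try
        {
            return std::stod(entry->second) > 0.0;
        }
        catch (const std::exception&)
        {
            return false; // not parseable as a number
        }
    }

    void initialize(const Record& input) override
    {
        if (!isValid(input)) { throw std::invalid_argument("ref_t must be a positive number"); }
        referenceTemperature_ = std::stod(input.at("ref_t"));
    }

    Record dumpState() const override
    {
        return { { "ref_t", std::to_string(referenceTemperature_) } };
    }

    double referenceTemperature() const { return referenceTemperature_; }

private:
    double referenceTemperature_ = 0.0;
};

} // namespace sketch
```

The design choice illustrated is the round trip: the record produced by `dumpState()` is itself valid input to `initialize()`, so a second module instance can be brought to the same state without touching the filesystem.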
Distinguish between (immutable) input and (mutable) program state (clarify stages of initialization, reform inputrec use cases).
Clarify the information hierarchy represented by SimulationInput (and SimulationOutput).
Maximize reusability of the MD runner:
- allow SimulationInput to be reapplied in a process lifetime
- understand reusable resources or data structures that do not need reinitialization
Define SimulationState encapsulation, or coordinate with its road map.
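The split between immutable input, mutable program state, and reusable resources might look roughly like the following. This is a sketch under stated assumptions, not a proposal for the actual Mdrunner: `sketch::Input`, `sketch::State`, and `sketch::Runner` are hypothetical, the "dynamics" are a trivial constant-velocity update, and the scratch buffer stands in for any resource that survives between runs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch
{

// Immutable record of input; hypothetical stand-in for SimulationInput.
struct Input
{
    std::vector<double> initialPositions;
    double              velocity;
    long                nsteps;
};

// Mutable program state, rebuilt from the input record for every run.
struct State
{
    std::vector<double> positions;
    long                step;
};

class Runner
{
public:
    // Applies an input record and runs to completion. The input is taken by
    // const reference and never modified, so it can be applied again later.
    State run(const Input& input)
    {
        // Reusable resource: only (re)allocated when a larger system is seen.
        if (scratch_.size() < input.initialPositions.size())
        {
            scratch_.resize(input.initialPositions.size());
            ++allocationCount_;
        }

        // Volatile state is initialized from the immutable record.
        State state{ input.initialPositions, 0 };
        for (; state.step < input.nsteps; ++state.step)
        {
            for (std::size_t i = 0; i < state.positions.size(); ++i)
            {
                scratch_[i] = input.velocity; // stand-in for a per-step work buffer
                state.positions[i] += scratch_[i];
            }
        }
        return state;
    }

    // Number of times the scratch buffer had to be (re)allocated.
    int allocationCount() const { return allocationCount_; }

private:
    std::vector<double> scratch_;
    int                 allocationCount_ = 0;
};

} // namespace sketch
```

Applying the same `Input` twice to one `Runner` gives identical final states and leaves the input untouched, while the scratch buffer is allocated only once. Which real resources and data structures can safely be reused this way is the open question raised above.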
To further clarify the scope of this issue, identify related tasks that should have a more explicit road map, but which are (currently) considered beyond the scope of this feature topic.
- Decouple Mdrunner collaborations from assumptions of file-based I/O (Remove the ArrayRef<const t_filenm> from gmx::Mdrunner.)
- Modernize/unify run time simulation options handling (#2877)
- Clean up the mdrun call hierarchy and program flow (input aggregation, acquisition of run time resources, component initialization and binding, creation protocols, "runner" versus "simulator")
- Decouple Mdrunner from membed and essential-dynamics implementation details.
- Logging abstraction (#2999)
- SimulationInput abstraction (#3374): let the existence of TPR and checkpoint input files be client-level concerns, and encapsulate their handling from the rest of the mdrun call stack.
- Read files during acquisition of the SimulationInput handle. (proposed during dev telco, 12 February 2020)
- Optimize concrete SimulationInput for TPR/CPT serialization protocol. (proposed during dev telco, 12 February 2020)
- Export SimulationInput bindings to the Python package.
- Reimplement gmxapi.simulation operations in terms of SimulationInput.
- Reimplement gmxapi.modify_input and convert-tpr in terms of SimulationInput. (also ref #3295)
- more (please contribute)
Criteria for completion
This issue may remain open as long as it is a useful road map, but can likely be considered "resolved" when the API use cases to support the targeted applications are well understood, and either implemented or independently tracked on another road map.