Feature #2585

Updated by Eric Irrgang over 2 years ago

This issue describes the major feature developments necessary to support development of external API facilities with appropriate abstraction and integration testing. Although issues such as #988 remain open, it is already possible to describe the GROMACS infrastructure changes needed to support API development, along with the best known API functionality goals.

h1. Goals

The following list summarizes feature requirements established during development of

* MPI environment sharing
* Binding external objects with interfaces for external code to retain access to launched tasks (such as objects instantiated during thread-MPI launch)
* Accessible simulation stop signal
* Launch MD runner through library calls
* Extensible ForceProviders
* Restraint framework / extensible “pull” code
* Libgmxapi build target and public headers
* Libgmxapi doxygen documentation
* Manage filesystem input and output locations
* Interaction / synchronization with predetermined integer time steps rather than time value.

h1. Packaging and project structure

The following notes may belong in a different issue, such as #2045, but are provided here for context.

h2. Libgromacs

Long term expectations consistent with this proposal:

* Its modules may have headers internal to them
* Exposes a library API of deliberately selected components
* Its modules may expose components to that library API, e.g. IForceProvider
* That library API must be unit tested so it is known to work
* Does not expose a user-facing API (i.e. no installed headers)
* Library API will be stable over the lifetime of a release branch (i.e. from about the time of the official release)
* Soversion bumps with each major release
* Library API will not be stable in master branch (but must pass its own tests in CI)
* Expects to be able to link itself dynamically
* Expects to be able to link itself statically (and its dependencies, if available)
* Linked as position-independent code
* Requires Python 3 (patch from Pascal in Gerrit; releng probably needs an update too)

h2. Libgmxapi in GROMACS repo

While the exposed symbols and public headers are reconsidered and as code in libgromacs continues to undergo more compartmentalization and modernization, it is convenient to define a second library and set of public headers. For now, we defer the question of whether libgmxapi would ultimately be subsumed into libgromacs, but we propose that the gmxapi headers ultimately replace the current set of public headers.

* Builds with access to libgromacs library API headers
* Installs the user-facing C++ headers for client code (such as those the Python module builds against)
* “plugin” API. E.g. a stable framework(s) for implementing various sorts of potentials while hiding IForceProvider e.g. by exposing adapters to things like domain-distributed positions. May warrant different versioning semantics than API/ABI supporting user interfaces: probably more stable API; possibly less stable ABI.
* (Proposing) semantic versioning (standard guarantees starting somewhere between 0.1 and 1.0)
* One name or separate name for tMPI / MPI? (suggest “no”)
* Plugin API only in C++, but with a supported method to expose bindings to the inserted code that can be used in Python scripts, i.e. plugins are not written in pure Python
* Main (high-level workflow) API complete only in Python, but will migrate to C++. (We can accelerate the C++ version if there is a need.)
* No plans to load C++ plugin modules natively (i.e. no dlopen) because this can be handled more portably at the Python level (or in some other user interface code), but we need to be able to receive and register factory functions to get objects from external code

Parts of this discussion may belong under #988

h2. Python package

A GROMACS Python package is nominally beyond the scope of this issue, but a few points from #2045 are worth mentioning:

* serves to prove libgmxapi functionality; needs integration testing with libgromacs and libgmxapi
* Ultimately, it should be available as part of a GROMACS installation, but should it be in the same repository and/or CMake build environment?
* pybind11 source bundled with repo
* Provides both an implementation and an API specification.
** Researchers can be compatible with it without depending on it.
** Writers of GROMACS extension code are free to use other Python bindings frameworks.
** We provide tools and helpers, but C++ helpers use pybind11

h1. Details

A big picture of planned development is necessary even before Redmine Issues exist, so milestones are enumerated with feature ID tags. Dependencies are better illustrated in the accompanying chart. Each numbered feature (chart node) is expected to be from 1 to 10 Gerrit changes, generally 3 to 5.

h2. Proposed development targets

h3. Gmx1 (this issue) Design documentation strategy / project management plan

* this issue and a cluster of Redmine issues with subtasks should be good
* documentation and visual aids like the attached progress chart should probably be in the repository somewhere

h3. Gmx2 (Issue #2586) Versioned libgmxapi target for build, install, headers, docs

Issue #2586

h3. Gmx3 Integration testing

* Gmxapi interfaces should continue functioning with unchanged semantics for other GROMACS changes, or API level needs to be incremented according to semantic versioning.
* External projects need to be tested outside of the gromacs build tree to validate external interfaces of installation. Suggested external projects: Python package, sample_restraint, yet-to-be-written integration test suite.
* Tests should be clear about the API version they are testing. We should test all supported versions (though we need a policy in this regard) and note whether new API updates are backwards compatible.
* Forward-compatibility testing: we should at least _know_ whether we are breaking old client code and include release notes, regardless of policy
* ABI compatibility testing? (should we test mismatched compilers and such?)
* Example code in documentation should be tested, if possible.
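
As a rough illustration of the versioning point above, an integration test could check the installed API level against the level a client was written for. The sketch below is not gmxapi code: @ApiVersion@, @parse_version@, and @is_compatible@ are hypothetical names, and the treatment of 0.x releases (a minor bump counts as potentially breaking) is an assumed policy, since the issue leaves the exact guarantees below 1.0 open.

```python
from typing import NamedTuple


class ApiVersion(NamedTuple):
    """Hypothetical (major, minor, patch) triple for an API release."""
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> ApiVersion:
    """Parse a 'major.minor.patch' string into an ApiVersion."""
    major, minor, patch = (int(part) for part in text.split("."))
    return ApiVersion(major, minor, patch)


def is_compatible(installed: ApiVersion, required: ApiVersion) -> bool:
    """Return True if a client written against `required` may use `installed`.

    The major version must match and the installed library must be at least
    as new as the required one. Assumed policy for major version 0: a minor
    bump is treated as potentially breaking.
    """
    if installed.major != required.major:
        return False
    if installed.major == 0:
        return installed.minor == required.minor and installed.patch >= required.patch
    return (installed.minor, installed.patch) >= (required.minor, required.patch)
```

A test suite could call @is_compatible@ once per supported client version and record which combinations are expected to work.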

h3. Gmx4 Library access to MD runner

* mdrun CLI program is an API client

Relates to #2229

h3. Gmx5 (Issue #2587) Provide runner with context manager

Issue #2587

h3. Gmx6 Extensible MDModules and ForceProviders

* ForceProviders obtained after tMPI threads have spawned.
* MDModules list extended at runtime during simulation launch.
* External code may be provided to the runner to instantiate or get a handle to a module.
* Expanded Context class can broker object binding by registering and holding factory functions for modules, as well as other resources that may be implemented differently in different environments.
* Somewhere in here, MDModules either need access to the integral timestep number or the ability to register call-backs or signals on a schedule.
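
The brokering role described for the Context might look roughly like the following. This is a Python sketch of the idea only, not the C++ design: @Context@, @register_module_factory@, @create_modules@, @launch@, and @HarmonicRestraint@ are all made-up names. It assumes factories are registered before launch and invoked once per rank after the ranks exist.

```python
from typing import Callable, Dict, List


class Context:
    """Hypothetical stand-in for an expanded Context that brokers module binding."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[int], object]] = {}

    def register_module_factory(self, name: str, factory: Callable[[int], object]) -> None:
        """Hold a factory until simulation launch; nothing is instantiated yet."""
        if name in self._factories:
            raise ValueError(f"module {name!r} is already registered")
        self._factories[name] = factory

    def create_modules(self, rank: int) -> List[object]:
        """Invoke every factory for one rank, in registration order."""
        return [factory(rank) for factory in self._factories.values()]


class HarmonicRestraint:
    """Illustrative external module; the name and fields are invented."""

    def __init__(self, rank: int, spring_constant: float) -> None:
        self.rank = rank
        self.spring_constant = spring_constant


def launch(context: Context, num_ranks: int) -> Dict[int, List[object]]:
    """Mimic a runner that obtains module handles on every rank at launch."""
    return {rank: context.create_modules(rank) for rank in range(num_ranks)}
```

Registering @lambda rank: HarmonicRestraint(rank, 100.0)@ under a name and then calling @launch(context, 2)@ yields one module instance per rank, which mirrors the requirement that ForceProviders be obtained after threads have spawned.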

Relates to #2590, #2574, #1972

Do MDModules live in a scope of tight association with an integrator? Do we need other concepts, like RunnerModules? Or subdivisions like MDForceModule, MDObserverModule, MDControlModule?

h3. Gmx7 Binding API for higher-level client code

h3. Gmx8 Binding API for plug-in ForceProviders

Ultimately tied to gmx5 and gmx24, but we can start stabilizing the external interfaces now. The external interfaces are for (a) user interface / workflow management code, and (b) MD extension code. We define a simple message-passing C structure along with a PyCapsule name and semantics. An MD extension object can provide a factory method with which the MD runner can get an IMDModule interface at simulation launch. The object pointed to may exist before and/or after the lifetime of the simulation. It must be understood that the IMDModule handle will be obtained on every rank.

The design should consider future infrastructure and needs (expressing data dependencies and locality, negotiating parallelism, expressing periodicity), but does not need to implement them now. A short-term implementation may require workarounds for some of these, but the workarounds can mostly be segregated from this issue’s resolution.

Relates to #2590

h3. Gmx9 Headers and adapter classes for Restraint framework

Relates to #1972, #2590

h3. Gmx10 MD signalling API

Relates to #2224

h3. Gmx11 Replace MPI_COMM_WORLD with a Context resource

h3. Gmx12 Runtime API for sharing / discovering hardware / parallelism resources

* Libgmxapi requests resources from libgromacs from the current node
* CUDA environment can be manipulated but we shouldn’t have to deal with that for a while
* Evolving task scheduling interfaces, expressing data locality
* Concepts of time and timestep

h3. Gmx13 API for working directory, input/output targets?

h3. Gmx14 Generalized pull groups / “generalized sites”

Christian Blau actively working on this from mid-July

h3. Gmx15 API logging resource

Log "file" artifacts are produced through API, allowing extensibility and abstraction from filesystem dependence. Progress has already been made in this direction, but the logging resource could be more clearly owned by the client code (or a Context object owned or managed on behalf of the client code) rather than created and destroyed in, say, the Mdrunner.
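
To make the ownership question concrete, here is a minimal sketch using Python's standard @logging@ module. The client creates and configures the log resource (an in-memory buffer here, so no filesystem is involved) and lends it to the runner. @run_simulation@, @make_client_logger@, and the logger name are hypothetical and do not correspond to existing GROMACS code.

```python
import io
import logging


def run_simulation(logger: logging.Logger, nsteps: int) -> None:
    """Hypothetical runner that writes to a logger it was given and does not own."""
    logger.info("starting %d steps", nsteps)
    logger.info("finished")


def make_client_logger(stream: io.StringIO) -> logging.Logger:
    """Client code decides where log output goes; here, an in-memory buffer."""
    logger = logging.getLogger("gmxapi_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep output out of any root handlers
    logger.handlers.clear()   # start from a known state if called twice
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger.addHandler(handler)
    return logger
```

Because the buffer outlives the call to @run_simulation@, the client can inspect or redirect the log "file" artifact after the runner has returned.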

h3. Gmx16 Exception handler firewall

Currently, the gmx binary has a command-line runner that catches exceptions, reports an error, and exits. The API can and should do something else, because it plays the same role as the command-line runner.

h3. Gmx17 API status object

* Status type defines the interface for discovering operation success or failure, plus details.
* Consistent status object interface is portable across Python, C++, and C
* Status object can be used to ferry information across API boundaries from exceptions thrown. Exceptions could be chained / status nested.


* What are concerns and solutions for memory allocation for status objects? Should objects own one or generate one on function return?
* Should the API (or Context) keep a Status singleton? A Status stack? Or should operations create ephemeral Status objects, or objects implementing a Status interface?
* Should the status object contain strings, reference strings mapped by enum, or defer textual messages to messaging and logging facilities?
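
One possible shape for the nested-status idea, sketched in Python under the assumption that operations create ephemeral status objects (only one of the options raised above). @Status@, @from_exception@, @api_firewall@, and @run@ are hypothetical names. The sketch follows explicit exception chains (@raise ... from ...@) and stores messages as plain strings, which sidesteps rather than answers the question about enums versus strings.

```python
import functools
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Status:
    """Hypothetical status record: success flag, message, optional nested cause."""
    success: bool
    message: str = ""
    cause: Optional["Status"] = None

    @classmethod
    def from_exception(cls, error: BaseException) -> "Status":
        """Convert an exception and its explicit chain into nested statuses."""
        cause = cls.from_exception(error.__cause__) if error.__cause__ is not None else None
        return cls(False, f"{type(error).__name__}: {error}", cause)


def api_firewall(func: Callable[..., object]) -> Callable[..., Status]:
    """Wrap an API entry point so exceptions are returned as a Status instead."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs) -> Status:
        try:
            func(*args, **kwargs)
        except Exception as error:
            return Status.from_exception(error)
        return Status(True)
    return wrapper


@api_firewall
def run(tpr_path: str) -> None:
    """Illustrative operation that always fails with a chained exception."""
    try:
        raise FileNotFoundError(tpr_path)
    except FileNotFoundError as error:
        raise RuntimeError("could not start simulation") from error
```

Calling @run("topol.tpr")@ returns a failed @Status@ whose @cause@ field carries the underlying file error, so the caller can walk the chain without catching anything.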

h3. Gmx18 Thinner test harness (for API client tests)

h3. Gmx19 API manipulation of simulation input / output

(For better testing.) The GlobalTopology class and IGlobalTopologyUser interface, currently underway, will help here: client changes to the global topology can ripple through to the modules, because the modules that care have registered themselves at setup time.

h3. Gmx20 Accessible test data resources

h3. Gmx21 Break up runner state into a hierarchy of RAII classes with API hooks

* break up mdrun program into clearly defined layers and phases
* CLI program parses various inputs in order to launch an Mdrunner object that is CLI-agnostic
* launching tMPI threads and other significant changes of state establish a sequence or hierarchy of invariants through RAII and/or State pattern.
* Sebastian Wingbermuhle working now on aspects of this for hybrid MC/MD (ref #2375, …)
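
The layering described in this list can be illustrated by analogy with Python context managers, which play a role similar to RAII scopes in C++. The sketch below is only that analogy: @phase@ and @run_layers@ are hypothetical, and the phase names are placeholders rather than a proposed breakdown of mdrun. Each phase establishes its invariant on entry and is torn down in reverse order, including when a later phase fails.

```python
from contextlib import ExitStack, contextmanager
from typing import Iterator, List


@contextmanager
def phase(name: str, events: List[str]) -> Iterator[str]:
    """Hypothetical runner phase: set up on entry, always torn down on exit."""
    events.append(f"enter {name}")
    try:
        yield name
    finally:
        events.append(f"exit {name}")


def run_layers(events: List[str], fail_in_md: bool = False) -> None:
    """Enter phases in order; ExitStack unwinds them in reverse, even on error."""
    with ExitStack() as stack:
        stack.enter_context(phase("parse input", events))
        stack.enter_context(phase("spawn threads", events))
        stack.enter_context(phase("md loop", events))
        if fail_in_md:
            raise RuntimeError("simulated failure")
        events.append("run")
```

API hooks would then attach to phase boundaries, where the state of the runner is well defined.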

h3. Gmx22 API management of input objects

* Structure, topology
* Microstate
* Simulation state
* Simulation parameters
* Runtime parameters / execution environment
* Anything else?

h3. Gmx23 Event hooks or signals

Event hooks or signals for

* checkpoint
* time step number or delta / trajectory advancement
* input configuration
* input topology
* input state
* simulation parameters
* output data streams
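
A minimal sketch of how such hooks could be registered and fired, keyed on integer step numbers rather than time values. @EventHooks@, @connect@, @emit@, and @run_steps@ are hypothetical names, and the two event names used are just examples taken from the list above.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class EventHooks:
    """Hypothetical registry mapping event names to subscriber callbacks."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[int], None]]] = defaultdict(list)

    def connect(self, event: str, callback: Callable[[int], None]) -> None:
        """Subscribe a callback that receives the integer step number."""
        self._subscribers[event].append(callback)

    def emit(self, event: str, step: int) -> None:
        """Call every subscriber of `event`; unknown events are ignored."""
        for callback in list(self._subscribers.get(event, [])):
            callback(step)


def run_steps(hooks: EventHooks, nsteps: int, checkpoint_interval: int) -> None:
    """Toy loop that signals on integer step numbers, not on time values."""
    for step in range(1, nsteps + 1):
        hooks.emit("step", step)
        if step % checkpoint_interval == 0:
            hooks.emit("checkpoint", step)
```

With a checkpoint interval of 4 and 10 steps, a subscriber to @"checkpoint"@ is called at steps 4 and 8.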

h3. Gmx24 API expression of MDOptions interfaces and embedded user documentation

h3. Gmx25 Avoid std::exit

Generally, replace std::exit (gmx_fatal) with exceptions

* Root out gmx_fatal, clearly define regular exit points and exception throwers
* API firewall should catch exceptions from gmx and convert to status objects for ABI compatibility. (gmx17)
* Clearly document regular and irregular shutdown behavior under MPI, tMPI, and generally, specifying responsibilities
* Create issue tickets for discovered missing exception safety, memory leaks, opportunities for RAII refactoring, and complicated protocols that should either be better documented or replaced with a clearer hierarchy (or sequence) of invariants

h3. gmx26 API messaging resources

Abstraction for status messages, such as are currently printed to stdout or stderr

h3. gmx27 (retracted)

h3. gmx28 set simulation parameters from API

Short term: mdrun CLI-like functionality to override other input is sufficient

Long term: sufficient API to update parameters between phases of simulation work

Implementation roadmap is probably

1. Inject argv fields
2. Write to input_rec or other structures
3. Interact with MDOptions framework
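
Step 1 of this roadmap could be as simple as merging API-supplied overrides into a CLI-style argument list before handing it to the runner. The helper below is a hypothetical sketch, not existing gmxapi code. It assumes every option is a @-flag@ followed by zero or more values, and the flag strings in the usage note are sample data only.

```python
from typing import Dict, List, Union


def _is_flag(token: str) -> bool:
    """Treat '-name' as a flag, but not negative numbers such as '-1'."""
    return len(token) > 1 and token[0] == "-" and token[1].isalpha()


def inject_args(base_argv: List[str], overrides: Dict[str, Union[str, int, float]]) -> List[str]:
    """Return a new argv in which overridden flags are replaced or appended."""
    program, rest = base_argv[0], base_argv[1:]
    options: Dict[str, List[str]] = {}
    current = None
    for token in rest:
        if _is_flag(token):
            current = token
            options[current] = []
        elif current is not None:
            options[current].append(token)
        else:
            raise ValueError(f"unexpected positional argument {token!r}")
    for flag, value in overrides.items():
        options[flag] = [str(value)]  # replaces in place or appends at the end
    argv = [program]
    for flag, values in options.items():
        argv.append(flag)
        argv.extend(values)
    return argv
```

For example, @inject_args(["mdrun", "-s", "topol.tpr", "-nsteps", "1000"], {"-nsteps": 500})@ keeps the input file argument and replaces only the step count.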

h3. gmx29 API access to grompp functionality

* Generate runnable input from user input
* United implementation for workflow API and utility functions (e.g. possibility of deferred execution / data transfer)
* Ultimately should not require writing output to (tpr) file
* File inputs ultimately should be generalized to API objects

h3. gmx30 API access to GROMACS file manipulation and topology manipulation tools.

* United implementation for workflow API and utility functions (e.g. possibility of deferred execution / data transfer)
* Utility API should be sufficient to reimplement CLI tools
* I/O should ultimately be separate from algorithm; filesystem interaction optional
* Consider feature requirements of other projects such as MDAnalysis.

h2. Scope

There are definitely design points for consideration that are left out of this list merely because they are not essential to gmxapi functionality or because gmxapi doesn’t have strong dependence on the ultimate design choice. These topics include:

* Task scheduling framework
* Insertion points in the MD loop
* Encapsulation of integrator

Further downstream, this infrastructure is necessary to support new high level interfaces to GROMACS, but the discussion of such interfaces is deferred as much as possible to separate issues to streamline incorporation of the changes proposed here in less public / stable code.

h1. Criteria for Completion

This issue is resolved when sufficient infrastructure is in place to support ongoing development of the other subprojects in issue #2045 against the GROMACS master branch in Gerrit. In particular, the proof-of-concept client code at and would not require a forked specialized copy of GROMACS.