Task #2616

Updated by Roland Schulz over 2 years ago

Currently we don't have a coherent model for our MD state. Most of the important information (such as t_state, gmx_domdec_t, t_inputrec, t_commrec, MdrunOptions, t_mdatoms, ...) is just widely passed around to whoever needs it. Given how widely they are passed, it would be about as good if they simply were global variables. At least then most function would have a reasonable number of arguments.

Recent efforts at modernization (such as, try to avoid the problem by storing pointers to them in some Handler/Task object but that doesn't actually solve the problem. It is also ad-hoc and the states which store the information have no relationship to the data stored. As far as I know there is no suggestion for a long term solution yet.

One thing we need to do is collect requirements for our MD state data-structure / object model. A start of this is the following:

requirement is, that we need to be able to write certain data (e.g. forces, counter, log) from multiple tasks in a thread safe way to support scheduling. For this I suggest we split out all data which is regularly modified from the other, so that data which is never(/rarely) modified can be made a read-only (/can be accessed by most through a read only view). For the data which needs per step writing in parallel paralleled from different tasks and are write-only (such as force) we can create buffers which automatically handle the buffer reduction.

Do we need the capability to create more than one state of any of the major MD variables? In other words for the API effort do we want to support creating multiple states in the same process? Or do we want to have things like -multi/-replex continue to be limited to MPI and have always one state per MD process? Where do we already already need copies of state objects (such as t_state in minimize)? And are there exceptions either way? Meaning do we want most state to have a single/multiple instance(s) but some state should be different.

Another requirement is that we need to be able to unit test functions. If we make sure that the global state is fast to default construct and have utility function for testing to initialize parts of the state required for the function under test.

I think a solution might be the following:
Basic idea: I think there should be one object which fully described the total state of the MD simulation. This should contain everything needed to perform an MD step excluding only data which changes frequently (e.g. forces, note 4). Obviously it shouldn't be one class. But instead one class with (smart) pointers to all the relevant other data. Method signatures can be simplified to only require that object. Encapsulation is preserved (note 3). That And that MD state object should be shallow copyable. We have methods (such as minimize) which require to make local modifications to the state. Without being able to make a full shallow copy of the MD state object, such a method would have to have one object which contains all the non-modified state (where the modified parts are invalid) plus another which contains the modified state. To avoid this mess, such a method should be able to make a shallow copy of the whole MD state and a deep copy for the parts it wants to modify. This can also help with initialization.

Have one MDState object which contains a shared_ptr<const X> const preserving (/copy on write) shared_ptr to each of the state objects. Every method gets a const& to that MDState object (or only the subobject needed but not multiple sub-objects objects to avoid the many arguments). All our state objects should be made fast/shallow copyable. That means:
- any data which is expensive to copy should use also use a shared_ptr<const X> shared_ptr and just copy the pointer
- data which should be deep-copied by default should either not be a pointer or have a copy constructor which copies it (or use value_ptr)
- and function which needs to locally modify MDState (e.g. minimize) can make a writeable copy and modify whatever it wants (2)
- for function which needs to modify the copy given by the caller (such as initialization/input reading) has two options:
a) the caller can pass a non-const pointer to the part it wants to give write access to (obvious the caller has to have write access itself, ideally again not multiple subobjects and thus multiple arguments) itself)
b) the callee creates a non-const copy (deep-copy for parts it wants to change, same note 2 applies) and returns that copy. The caller than commits those changes by copying them back into its version (again the caller has to have write access). This might be too expansive in some cases. But it has the advantage that it makes things exception safe (if a function returns early things don't get written back to the original version) and lets the caller do sanity checks before committing the changes (including verifying that only expected things were deep copied). It also avoids the issue that granting write access (by copying back) to multiple subobjects doesn't require many arguments. its version.

1) (got removed by edit but don't want to renumber) one option would be std::experimental::propagate_const. I have a non constexpr version of it which works in C++11 and could be put into compat.
2) After simply creating a shallow copy it is still not modifiable through it parent object (everything is pointers isn't sufficient. Those parts which should be modified need to const data) but because one created the be deep copy one has copied. Ideally that would be automatic. But using something like "copy-on-write-ptr": might have too much overhead. An option might be to create a non-const reference to the part pointer which in release mode tracks whether it has been copied and asserts that it was deep copied. copied when being written to. This would give the safety in Debug without any overhead in Release.
3) Things can be private and only accessible to a class. They can also be or restricted to a module if the type the smart-pointer points to is forward declared and the implementation is local the module. For read access I don't it is helpful to know what part of MDState is accessed from the function signature. And it can be documented. Write access has to be granted specially and thus it is visible in the code what parts can be written too.
4) "data which changes frequently" only refers to data which has observable simulation result an needs to be correctly propagate to the next step. Mutable data (caches, log, tmp-buffers) would be fine.