-reprod: checkpoint reading bug and general considerations
The `-reprod` option is described in the manual as
-[no]reprod (no) Try to avoid optimizations that affect binary reproducibility
At first glance, it seems that in practice this option
- turns off dynamic load balancing / PME tuning
- does not allow the run to stop between neighbor-searching steps
- checks, when reading a checkpoint, whether the numbers of total/PP/PME/PP&PME ranks are consistent between the current simulation and the checkpoint
(Am I missing something?)
Bug: Currently, the last check is not correctly implemented, because checkpoint reading is performed before domain decomposition (and hence before the PP/PME duties are set up). Checkpoint reading will always yield a "PME rank mismatch" if the number of PME ranks was not 0 in the checkpoint file, even when the same number of dedicated PME ranks is set explicitly on the command line.
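To make the ordering problem concrete, here is a minimal sketch in Python. It is not the GROMACS code: `RankLayout` and `check_rank_layout` are hypothetical names, and the sketch assumes that before domain decomposition the current run simply reports 0 dedicated PME ranks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RankLayout:
    """Hypothetical stand-in for the rank counts stored in a checkpoint."""
    total: int
    pme_only: int  # dedicated PME ranks; 0 if there are none


def check_rank_layout(checkpoint: RankLayout, current: RankLayout) -> Optional[str]:
    """Return an error message on mismatch, or None if the layouts agree."""
    if checkpoint.total != current.total:
        return "total rank mismatch"
    if checkpoint.pme_only != current.pme_only:
        return "PME rank mismatch"
    return None


checkpoint = RankLayout(total=8, pme_only=2)

# Assumption of this sketch: at checkpoint-reading time the PP/PME duties
# have not been assigned yet, so the current run still reports 0 PME ranks.
before_dd = RankLayout(total=8, pme_only=0)
# After domain decomposition setup, the 2 requested PME ranks are known.
after_dd = RankLayout(total=8, pme_only=2)

early_result = check_rank_layout(checkpoint, before_dd)  # spurious mismatch
late_result = check_rank_layout(checkpoint, after_dd)    # layouts agree
```

The same comparison gives the spurious error when run early and passes when run after the duties are known, which is why the check would have to run once DD setup is complete.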
Further considerations: While the issue above can easily be fixed (probably by moving the check from checkpoint reading to DD setup), the broader question is whether the `-reprod` option is restrictive enough. Personally, I feel that the name and the description of the `-reprod` option promise a level of (binary) reproducibility which we can't and shouldn't guarantee except under very specific constraints. If my list above is complete, it seems that simulations would hardly be binary identical when running on multiple ranks or on GPUs.
Other mentions of `-reprod` in the manual:
It is generally difficult to run an efficient parallel MD simulation that is based primarily on floating-point arithmetic and is fully reproducible. By default, gmx mdrun will observe how things are going and vary how the simulation is conducted in order to optimize throughput. However, there is a “reproducible mode” available with mdrun -reprod that will systematically eliminate all sources of variation within that run; repeated invocations on the same input and hardware will be binary identical. However, running in this mode on different hardware, or with a different compiler, etc. will not be reproducible. This should normally only be used when investigating possible problems.
Section http://manual.gromacs.org/documentation/current/user-guide/managing-simulations.html#reproducibility in general, and especially:
Further, using `gmx mdrun -reprod` will eliminate all sources of non-reproducibility that it can, i.e. same executable + same hardware + same shared libraries + same run input file + same command line parameters will lead to reproducible results.
#2 Updated by Pascal Merz 2 months ago
Summary of lunch discussion with Berk:
- MPI simulations without load balancing are in principle reproducible (depending on implementation of MPI reduction), and in practice often are (we could implement a version that fixes the order of reduction if we wanted to be sure)
- Simulations using GPUs are currently not reproducible (it would take some effort to write a fixed-order version)
- The -reprod option is useful for debugging
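The dependence on reduction order mentioned in the first point comes from floating-point addition not being associative. The following is a minimal sketch in Python, not MPI code: the function names and the `(rank, value)` pairs are hypothetical, and the values are chosen to exaggerate the effect.

```python
from typing import Iterable, List, Tuple

Contribution = Tuple[int, float]  # (rank id, partial sum from that rank)


def reduce_in_arrival_order(contributions: Iterable[Contribution]) -> float:
    """Sum partial results in whatever order they happen to arrive."""
    total = 0.0
    for _rank, value in contributions:
        total += value
    return total


def reduce_in_rank_order(contributions: Iterable[Contribution]) -> float:
    """Sum partial results in ascending rank order, whatever the arrival order."""
    total = 0.0
    for _rank, value in sorted(contributions, key=lambda c: c[0]):
        total += value
    return total


# The same three contributions, arriving in two different orders.
run_a: List[Contribution] = [(0, 1e16), (1, 1.0), (2, -1e16)]
run_b: List[Contribution] = [(0, 1e16), (2, -1e16), (1, 1.0)]

arrival_a = reduce_in_arrival_order(run_a)  # 1.0 is absorbed by 1e16
arrival_b = reduce_in_arrival_order(run_b)  # 1.0 survives the cancellation
fixed_a = reduce_in_rank_order(run_a)
fixed_b = reduce_in_rank_order(run_b)
```

Fixing the order makes the result repeatable, not more accurate: both fixed-order sums agree with each other, but they still carry the rounding error of that particular order.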
Next steps therefore include:
- make documentation unambiguous about what -reprod does
- move comparison with checkpoint data to construction of domdec to fix bug
- add assertion to disallow -reprod runs with GPU
- add GPU flag to checkpoint to check that previous run did not use GPU either
#3 Updated by Erik Lindahl about 1 month ago
Update the documentation for now - the command line option docs are correct in that we try, but it's not perfect in all cases.
We have thought for a while about implementing our own fixed-point accumulators, which should likely solve the GPU issue, but likely not general load balancing. This is an option meant for debugging, not a magical solution that hides the fact that we're dealing with floating point and doing dynamic ordering - there will always be a performance penalty.
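For reference, the idea behind a fixed-point accumulator can be sketched as follows. This is an illustration in Python, not a proposed implementation: `FRACTION_BITS`, `to_fixed` and `fixed_point_sum` are hypothetical names, the scale is an arbitrary assumption, and Python's unbounded integers hide the overflow handling that a fixed-width accumulator would need.

```python
from typing import Iterable

FRACTION_BITS = 32            # assumed number of fractional bits
SCALE = 1 << FRACTION_BITS    # fixed-point scale factor, 2**32


def to_fixed(x: float) -> int:
    """Convert a float to a scaled integer; rounding happens once, per value."""
    return round(x * SCALE)


def fixed_point_sum(values: Iterable[float]) -> float:
    """Accumulate in integers, so the sum does not depend on the order."""
    acc = 0
    for v in values:
        acc += to_fixed(v)  # integer addition is exact and associative
    return acc / SCALE


values = [0.1, 0.2, 0.3]
reordered = [0.3, 0.2, 0.1]

float_forward = sum(values)       # plain floating-point accumulation
float_reordered = sum(reordered)  # same values, different order
fixed_forward = fixed_point_sum(values)
fixed_reordered = fixed_point_sum(reordered)
```

Each value is rounded once on conversion and the accumulation itself is exact, so the two fixed-point sums are bitwise identical while the two floating-point sums are not. The costs are the extra conversions and a limited dynamic range.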