Feature #2816: GPU offload / optimization for update & constraints, buffer ops and multi-GPU communication
CUDA version of SETTLE
Initial implementation that works as a separate instance, i.e. it can copy coordinates and velocities to and from the GPU, handle PBC, and compute the virial. The infrastructure that maintains coordinates, velocities and PBC is temporary and will be removed when it is integrated with other parts of the GPU-only loop. Enabled as a part of the GPU update.
Stand-alone module that is enabled using environment variables.
TODO:
- Unify virial reduction with LINCS
- Unify SettleParameters and their initialization with CPU version
- Better integration with CPU version (e.g. checks for input consistency)
CUDA version of SETTLE algorithm with basic tests
CUDA-based GPU implementation of SETTLE. This is part of the
all-GPU loop. It can work in isolation from other parts of the
code, since coordinates are copied to the device before the
SETTLE kernel call and back from it afterwards. The velocity
update and the virial evaluation can be enabled.
To enable, set the GMX_SETTLE_GPU environment variable.
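A minimal sketch of how such a switch might be read at run time is given below. It assumes that only the presence of the variable matters and not its value, which the commit message does not state. The helper names `isEnvSet` and `settleGpuRequested` are hypothetical.

```cpp
#include <cstdlib>

// Returns true when the named environment variable is present,
// regardless of its value (assumption: presence alone enables the feature).
bool isEnvSet(const char* name)
{
    return std::getenv(name) != nullptr;
}

// Hypothetical convenience wrapper for the variable named in this commit.
bool settleGpuRequested()
{
    return isEnvSet("GMX_SETTLE_GPU");
}
```

Under this assumption, exporting the variable with any value in the shell before launching the run would select the GPU path.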
Limitations:
1. Does not work when domain decomposition is enabled.
2. Projection of the derivative is not implemented.
3. Not fully integrated/unified with the CPU version.
TODO:
1. Multi-GPU case.
2. Better virial reduction. This is a more general feature,
not only related to constraints.
5. More cleanup in constr.cpp needed.
6. Better unit tests.
Refactoring of the SETTLE tests
The current tests for the CUDA version of SETTLE were a quick
addition to the old tests, directly comparing the GPU
implementation with the original CPU-based implementation.
This commit rearranges the test structure, making it possible
to apply the same set of tests to both implementations. There
are no changes to the tests themselves. Currently, comparison tests
will run twice and will dry-run on CUDA builds without CUDA-
TODO: Add comparison with pre-computed values for coordinates,
velocities and virial. Remove the CPU vs GPU comparison.
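The idea of applying one set of checks to several implementations can be sketched as follows. This is a toy illustration and not the actual test fixture: the 1D "constraint", the `Pair1D` type and all function names are hypothetical stand-ins for the CPU and GPU runners.

```cpp
#include <cmath>
#include <functional>
#include <string>
#include <vector>

// Toy system: two points on a line whose separation should equal a target.
struct Pair1D
{
    double x0;
    double x1;
};

using Runner = std::function<void(Pair1D&, double)>;

struct NamedRunner
{
    std::string name;
    Runner      run;
};

// First stand-in implementation: moves both ends symmetrically.
void constrainSymmetric(Pair1D& p, double target)
{
    const double mid = 0.5 * (p.x0 + p.x1);
    p.x0             = mid - 0.5 * target;
    p.x1             = mid + 0.5 * target;
}

// Second stand-in implementation: keeps x0 fixed and moves x1.
void constrainAnchored(Pair1D& p, double target)
{
    p.x1 = p.x0 + target;
}

// Applies the same check to every runner, each on its own copy of the
// input, and returns the names of the runners that fail it.
std::vector<std::string> failingRunners(const std::vector<NamedRunner>& runners,
                                        const Pair1D&                   input,
                                        double                          target,
                                        double                          tol)
{
    std::vector<std::string> failed;
    for (const auto& r : runners)
    {
        Pair1D p = input;
        r.run(p, target);
        if (std::fabs((p.x1 - p.x0) - target) > tol)
        {
            failed.push_back(r.name);
        }
    }
    return failed;
}
```

Because the check only sees a `Runner`, adding another implementation means adding one entry to the list rather than duplicating test bodies.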
Make use of reference data in SETTLE tests
As a temporary measure, the CPU and GPU versions of SETTLE
were tested against each other. Making use of the reference
data framework allows them to be tested against precomputed values.
Now, the final positions, velocities and virial are properly
tested in the CPU version and, if available, in the GPU version.
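Comparing against precomputed values usually comes down to an element-wise check within a tolerance. The sketch below shows one way to write such a check. It is a hypothetical helper and does not reproduce the reference data framework's own API; the function name and the combined absolute/relative tolerance are assumptions.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Returns true when every computed value matches its stored reference
// within absTol + relTol * |reference|. A size mismatch or a NaN in the
// difference counts as a failure.
bool matchesReference(const std::vector<double>& computed,
                      const std::vector<double>& reference,
                      double                     absTol,
                      double                     relTol)
{
    if (computed.size() != reference.size())
    {
        return false;
    }
    for (std::size_t i = 0; i < computed.size(); ++i)
    {
        const double diff  = std::fabs(computed[i] - reference[i]);
        const double limit = absTol + relTol * std::fabs(reference[i]);
        if (!(diff <= limit)) // written this way so that NaN fails
        {
            return false;
        }
    }
    return true;
}
```

With such a check, the CPU and GPU results are each validated against the same stored numbers, so neither implementation serves as the reference for the other.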