Feature #2885

Updated by Artem Zhmurov 7 months ago

Adapt the LINCS constraints to work efficiently on CUDA-enabled GPUs.

TODO:

* -A separate class that contains the logic.-
* Reduction for the virial using shuffle.
* PLINCS.
* Many-GPU version.
* Free energy.
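
A rough illustration of the shuffle-based reduction item above, written as a host-side C++ emulation so the data flow can be checked without a GPU. The function name `emulatedWarpReduce`, the constant `c_warpSize`, and the choice to reduce a single scalar per lane are assumptions made for this sketch; a real kernel would presumably reduce each virial component this way, with every round being `value += __shfl_down_sync(0xffffffff, value, offset);`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed warp width for the emulation (32 lanes on current CUDA hardware).
constexpr std::size_t c_warpSize = 32;

// Hypothetical helper: emulates one warp summing one value per lane with the
// shuffle-down pattern. Not taken from the actual change.
inline float emulatedWarpReduce(std::vector<float> lanes)
{
    assert(lanes.size() == c_warpSize);
    for (std::size_t offset = c_warpSize / 2; offset > 0; offset /= 2)
    {
        // Every lane reads its partner's value from before this round.
        const std::vector<float> previous = lanes;
        for (std::size_t lane = 0; lane < c_warpSize; ++lane)
        {
            // A partner outside the warp yields the lane's own value,
            // which is how the shuffle intrinsic behaves on the device.
            const std::size_t source = (lane + offset < c_warpSize) ? lane + offset : lane;
            lanes[lane] = previous[lane] + previous[source];
        }
    }
    // Only lane 0 is guaranteed to hold the complete sum.
    return lanes[0];
}
```

After the five rounds only lane 0 holds the full sum, so a kernel built on this pattern would need a single atomic (or shared-memory write) per warp rather than one per thread.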

Ideas for kernel improvement:

* Use an analytical solution for the explicit inversion of matrix A (for the small matrices of H-bond constraints); the inverted matrix itself can be reused rather than recomputed.
* Move more data to local/shared memory and try to get rid of atomics (at least on the device level).
* Use locality of coupled constraints better (maybe go from block-sync to warp-sync).
* Introduce a mapping of thread id to both a single constraint and a single atom, thus designating Nth threads to deal with Nat <= Nth coupled atoms and Nc <= Nth coupled constraints.
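
To make the explicit-inversion idea above more concrete, here is a minimal sketch of a closed-form inverse for a 3x3 matrix, the size one would get for three mutually coupled constraints (an assumption for illustration; the ticket does not fix the matrix size or say whether A itself or a combination such as I - A would be inverted). The names `Matrix3`, `invert3x3` and `apply` are hypothetical and not taken from the actual change.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Matrix3 = std::array<std::array<double, 3>, 3>;
using Vector3 = std::array<double, 3>;

// Closed-form inverse of a 3x3 matrix: adjugate divided by the determinant.
inline Matrix3 invert3x3(const Matrix3& m)
{
    // Cofactors of the first row, reused for the determinant.
    const double c00 = m[1][1] * m[2][2] - m[1][2] * m[2][1];
    const double c01 = m[1][2] * m[2][0] - m[1][0] * m[2][2];
    const double c02 = m[1][0] * m[2][1] - m[1][1] * m[2][0];
    const double det = m[0][0] * c00 + m[0][1] * c01 + m[0][2] * c02;
    assert(std::fabs(det) > 1e-12); // a singular matrix has no inverse
    const double r = 1.0 / det;

    Matrix3 inv;
    inv[0][0] = c00 * r;
    inv[0][1] = (m[0][2] * m[2][1] - m[0][1] * m[2][2]) * r;
    inv[0][2] = (m[0][1] * m[1][2] - m[0][2] * m[1][1]) * r;
    inv[1][0] = c01 * r;
    inv[1][1] = (m[0][0] * m[2][2] - m[0][2] * m[2][0]) * r;
    inv[1][2] = (m[0][2] * m[1][0] - m[0][0] * m[1][2]) * r;
    inv[2][0] = c02 * r;
    inv[2][1] = (m[0][1] * m[2][0] - m[0][0] * m[2][1]) * r;
    inv[2][2] = (m[0][0] * m[1][1] - m[0][1] * m[1][0]) * r;
    return inv;
}

// Matrix-vector product; the stored inverse can be applied repeatedly
// instead of being recomputed.
inline Vector3 apply(const Matrix3& m, const Vector3& v)
{
    Vector3 result = { 0.0, 0.0, 0.0 };
    for (int i = 0; i < 3; ++i)
    {
        for (int j = 0; j < 3; ++j)
        {
            result[i] += m[i][j] * v[j];
        }
    }
    return result;
}
```

The inverse would be computed once per group of coupled constraints and then applied with `apply` wherever the same matrix is needed again, which is the reuse the item above refers to.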


Testing:


* -Initial integration to the constraints test.-
* Add bigger systems to test virial reduction and overall redistribution of constraints among threads.
* Generalization of tests for different platforms.

The current version of the code is in Gerrit change 9193 (https://gerrit.gromacs.org/#/c/9193/).
