Integration planning for GROMACS 2020 GPU features

We have a set of target functionality to integrate by 9/16 (TODO: record that list here).

Due 8/30

  • GPU halo-exchange for positions. Coding assigned to Alan. Review assigned to Mark and Berk.
    Status: code in Gerrit; input given on class, file, and method naming; can merge once revised accordingly.

Due 9/2

  • PP-PME exchange class. Coding assigned to Alan. Review assigned to Mark, Berk.
    Status: https://gerrit.gromacs.org/c/gromacs/+/12783 awaiting further review/merge.
  • Simulation flags structs. Coding assigned to Szilard. Review assigned to Mark, Berk, Paul, Erik.
    Status: WIP at https://gerrit.gromacs.org/c/gromacs/+/11622; would benefit from face-to-face discussion of how to describe the lifetime and intent of this initial patch, whose form is understood to differ from what we want in the long term.
  • Question: does multi-domain GPU halo-exchange force handling support adding extra CPU forces? Alan's answer: I believe it should, provided an H2D copy of the CPU forces is placed before the halo exchange, since the non-local part from the remote GPU will then be accumulated into the local part of the force buffer (see the sketch after this list).
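
A minimal CUDA sketch of the ordering in Alan's answer. The function names, the staging buffer, and the force-buffer layout (local atoms first, non-local halo atoms after) are assumptions for illustration, not the actual GROMACS code:

    // Sketch only (hypothetical names, not the GROMACS API): CPU force
    // contributions are copied H2D and accumulated into the local part of
    // the GPU force buffer *before* the force halo exchange is enqueued,
    // so the non-local forces arriving from the remote GPU are summed on
    // top of a complete local buffer.
    #include <cuda_runtime.h>

    __global__ void accumulateForces(float3* d_f, const float3* d_fCpu, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
        {
            d_f[i].x += d_fCpu[i].x;
            d_f[i].y += d_fCpu[i].y;
            d_f[i].z += d_fCpu[i].z;
        }
    }

    void addCpuForcesBeforeHaloExchange(float3*       d_f,           // local part is [0, numLocal)
                                        float3*       d_fCpuStaging, // device staging buffer
                                        const float3* h_fCpu,        // forces computed on the CPU
                                        int           numLocal,
                                        cudaStream_t  stream)
    {
        // H2D copy of the CPU force contributions, placed before the exchange.
        cudaMemcpyAsync(d_fCpuStaging, h_fCpu, numLocal * sizeof(float3),
                        cudaMemcpyHostToDevice, stream);
        accumulateForces<<<(numLocal + 127) / 128, 128, 0, stream>>>(d_f, d_fCpuStaging, numLocal);
        // The force halo exchange is then enqueued on the same stream; the
        // remote non-local contributions accumulate into the local part.
    }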

Due 9/3

  • Leapfrog (LF) pressure-coupling support for GPU-based update. Coding assigned to Artem. Review assigned to TODO.
    Status: Parrinello-Rahman is +2 on Gerrit at https://gerrit.gromacs.org/c/gromacs/+/12477. Is Berendsen support easy to add in a future patch?
  • Tests covering the .mdp nst* flags, in particular well-chosen co-prime cases, to prevent unexpected behavior changes; also, can we test e.g. forces with FEP on non-output steps? Coding assigned to Mark. Review assigned to TODO. (See the test sketch after this list.)
    Status: work starts Monday afternoon.
  • Make a GPU version of StatePropagatorData. Coding assigned to Artem. Review assigned to Erik, Mark.
    Status: awaiting a patch from Artem. Naming decision recorded at https://gerrit.gromacs.org/c/gromacs/+/11986.
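
On the co-prime nst* idea above, a toy sketch (GoogleTest style) of why pairwise co-prime intervals give good coverage. The StepWork struct and stepWorkAt helper are hypothetical, not the GROMACS test harness, though nstlist, nstcalcenergy and nstfout are real .mdp options:

    // Toy sketch, not the GROMACS test harness: with pairwise co-prime
    // intervals, every on/off combination of the per-step tasks occurs
    // within one period, which is good at exposing flag-interaction bugs.
    #include <gtest/gtest.h>

    struct StepWork // hypothetical per-step task flags
    {
        bool doNeighborSearch;
        bool computeEnergy;
        bool writeForces;
    };

    static StepWork stepWorkAt(long step, int nstlist, int nstcalcenergy, int nstfout)
    {
        return { step % nstlist == 0, step % nstcalcenergy == 0,
                 nstfout > 0 && step % nstfout == 0 };
    }

    TEST(NstFlagsTest, CoprimeIntervalsExerciseAllCombinations)
    {
        const int nstlist = 10, nstcalcenergy = 7, nstfout = 3; // pairwise co-prime
        bool seen[2][2][2] = {};
        for (long step = 0; step < nstlist * nstcalcenergy * nstfout; step++)
        {
            StepWork w = stepWorkAt(step, nstlist, nstcalcenergy, nstfout);
            seen[w.doNeighborSearch][w.computeEnergy][w.writeForces] = true;
        }
        // By the Chinese remainder theorem, one full period visits every
        // combination of the three flags.
        for (int a = 0; a < 2; a++)
            for (int b = 0; b < 2; b++)
                for (int c = 0; c < 2; c++)
                    EXPECT_TRUE(seen[a][b][c]);
    }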

Due 9/5

  • Clean up to make a PP-rank "PME force receiver". Coding assigned to Mark. Review assigned to Alan, Paul, Berk.
    Status: awaiting stabilization of the PP-PME exchange class; also awaiting review of the clFFT init change at https://gerrit.gromacs.org/c/gromacs/+/12897.
  • Code to receive PME-rank forces into PP-rank GPU buffers (avoiding the CPU); see the sketch after this list. Coding assigned to Alan. Review assigned to Szilard, Berk, Mark.
    Status: code at https://gerrit.gromacs.org/c/gromacs/+/12980, awaiting review.
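
A minimal sketch of the direct path the second item above targets, assuming a CUDA-aware MPI build so a device pointer can be handed straight to MPI_Recv. The function name, tag, and buffer layout are hypothetical:

    // Hypothetical sketch: receive PME-rank forces directly into a GPU
    // buffer on the PP rank. Assumes a CUDA-aware MPI build, so a device
    // pointer can be passed to MPI_Recv and no CPU staging is needed.
    #include <cuda_runtime.h>
    #include <mpi.h>

    void receivePmeForcesToGpu(float3*  d_fPme,   // device buffer on the PP rank
                               int      numAtoms,
                               int      pmeRank,  // rank running the PME task
                               MPI_Comm comm)
    {
        const int tag = 0; // hypothetical message tag
        // With CUDA-aware MPI, the message lands directly in GPU memory.
        MPI_Recv(d_fPme, numAtoms * 3, MPI_FLOAT, pmeRank, tag, comm,
                 MPI_STATUS_IGNORE);
        // The received forces can then be reduced into the main GPU force
        // buffer without ever staging through the CPU.
    }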

Due 9/6

  • Halo-exchange class for forces. Coding assigned to Alan. Review assigned to Szilard, Mark, Berk.
    Status: code in Gerrit at https://gerrit.gromacs.org/c/gromacs/+/12943, awaiting review.
  • Stitching together high-level single-GPU logic to achieve performance. Coding assigned to Artem. Review assigned to Erik, Mark.
    Status: awaiting progress on the GPU StatePropagatorData patch.
  • Lift cr->duty assignment out of init_domain_decomposition(). Coding assigned to Berk. Review assigned to Mark, Paul, Szilard.

Due 9/9

  • Class containing PME-PP exchange for coordinate buffers. Coding assigned to Alan. Review assigned to Szilard, Berk, Mark.
    Status: code at https://gerrit.gromacs.org/c/gromacs/+/13043, awaiting review.
  • Refinement of the simulation flag structs (a sketch of the kind of struct under discussion follows this list). Coding assigned to Szilard. Review assigned to Mark, Berk, Paul, Erik.
    Status: awaiting submission of the earlier patch.
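
For context on the flag-struct items, a minimal sketch of the kind of per-run workload description under discussion. The struct and field names are illustrative guesses, not the form of the actual patch:

    // Illustrative sketch only: a plain struct of booleans, set once per
    // run, describing which parts of the work run on the GPU. Field names
    // are guesses; the actual patch may differ.
    struct SimulationWorkload
    {
        bool useGpuNonbonded    = false; // short-range nonbondeds on the GPU
        bool useGpuPme          = false; // PME long-range work on the GPU
        bool useGpuUpdate       = false; // integration/constraints on the GPU
        bool useGpuHaloExchange = false; // direct GPU halo communication
        bool useGpuPmePpComms   = false; // direct GPU PME-PP communication
    };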

Due 9/12

  • Stitching together high-level multi-GPU logic to achieve performance. Coding assigned to TODO; will need input from many people.
    Status: not started yet.

Due 9/14

  • mdrun user interface and choice of defaults. Coding assigned to Paul. Review assigned to Alan, Artem, Mark.
    Status: not started yet.

Due last minute

  • Link task assignment to the simulation flags and the high-level GPU logic. Coding assigned to TODO. Will need input from many people.
    Status: the clFFT init patch needs review (https://gerrit.gromacs.org/c/gromacs/+/12897). That prepares for a patch to manage GPU streams at a high level and pass handles into the individual modules that need to collaborate on the ~5 streams we identified (namely non-local, local, PME work, PP-PME transfer, and update); a rough sketch follows below. We are also waiting on Berk's patch to pull the cr->duty assignment out of init_domain_decomposition(), so that the order of operations in the runner can be rearranged to make GPU device info available earlier, allowing GPU streams to be set up before init_domain_decomposition() and init_forcerec().
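
A rough sketch of the high-level stream management described above, assuming CUDA: one object owns the ~5 streams and hands out non-owning handles for the modules to collaborate on. The class and method names are hypothetical:

    // Hypothetical sketch: the runner creates the ~5 streams once, at a
    // high level, and passes non-owning handles into the modules that
    // need to collaborate on them.
    #include <cuda_runtime.h>
    #include <initializer_list>

    class GpuStreamManager
    {
    public:
        GpuStreamManager()
        {
            for (cudaStream_t* s : { &local_, &nonLocal_, &pme_, &pmePpTransfer_, &update_ })
            {
                cudaStreamCreate(s);
            }
        }
        ~GpuStreamManager()
        {
            for (cudaStream_t s : { local_, nonLocal_, pme_, pmePpTransfer_, update_ })
            {
                cudaStreamDestroy(s);
            }
        }
        // Non-owning handles, passed into e.g. the halo-exchange, PME, and
        // update modules at setup time.
        cudaStream_t localStream() const { return local_; }
        cudaStream_t nonLocalStream() const { return nonLocal_; }
        cudaStream_t pmeStream() const { return pme_; }
        cudaStream_t pmePpTransferStream() const { return pmePpTransfer_; }
        cudaStream_t updateStream() const { return update_; }

    private:
        cudaStream_t local_, nonLocal_, pme_, pmePpTransfer_, update_;
    };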
    Status: clfft init patch needing review https://gerrit.gromacs.org/c/gromacs/+/12897. That prepares for a patch to manage GPU streams at high level and pass handles into individual modules that need to collaborate on the ~5 streams we identified, namely nonlocal, local, pme work, pp-pme xfer, update). Waiting on Berk's patch to pull cr->duty assignment out of init_domain_decomposition(), so we can rearrange order of operations in runner so that GPU device info is available earlier, so that GPU streams can be set up before init_domain_decomposition() and init_forcerec()