Integration planning for GROMACS 2020 GPU features

We have a set of target functionality to integrate by 9/16. (TODO record that)

Due 8/30

  • GPU halo-exchange for positions. Coding assigned to Alan. Review assigned to Mark and Berk.
    Status - code in gerrit, class+file+method naming input given, can merge when revised accordingly

DONE with follow-up work needed ASAP, see TODOs in code, in #2890, and subtasks of #2890

Due 9/2

  • pp-pme exchange class. Coding assigned to Alan. Review assigned to Mark, Berk.
    Status - https://gerrit.gromacs.org/c/gromacs/+/12783 awaiting further review/merge.
    Reviewed/Ready, merge pending rebase (StatePropagator upstream improvements?)
  • simulation flags structs. Coding assigned to Szilard. Review assigned to Mark, Berk, Paul, Erik.
    Status - WIP at https://gerrit.gromacs.org/c/gromacs/+/11622, would benefit from FTF discussion on how to describe lifetime and intent for this initial patch, whose form is understood to be unlike what we want in the long term
    DONE
  • Question: Does multi-domain GPU-halo-exchange force handling support adding extra CPU forces? Alan's answer: I believe it should (if an H2D copy of CPU forces is placed before the halo exchange), since the non-local part from the remote GPU will be accumulated into the local part of the force buffer.
    DONE
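
The ordering in Alan's answer can be modelled on the host as a minimal sketch. Everything here is hypothetical (the `Force` type, `addCpuForces`, `accumulateHaloForces`, and the assumption that the H2D copy is reduced into the buffer rather than overwriting it); it only illustrates why CPU forces survive when they are in the local buffer before the halo contributions are accumulated.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side model; none of these names are GROMACS APIs.
struct Force
{
    float x, y, z;
};

// Models the H2D copy of CPU forces followed by a reduction into the
// GPU-resident local force buffer (assumption: the copy adds, it does
// not overwrite).
void addCpuForces(std::vector<Force>& gpuForces, const std::vector<Force>& cpuForces)
{
    assert(gpuForces.size() == cpuForces.size());
    for (std::size_t i = 0; i < gpuForces.size(); ++i)
    {
        gpuForces[i].x += cpuForces[i].x;
        gpuForces[i].y += cpuForces[i].y;
        gpuForces[i].z += cpuForces[i].z;
    }
}

// Models the force halo exchange: each contribution received from the
// remote rank is accumulated into the local atom named by the matching
// entry of localIndex.
void accumulateHaloForces(std::vector<Force>&             gpuForces,
                          const std::vector<Force>&       received,
                          const std::vector<std::size_t>& localIndex)
{
    assert(received.size() == localIndex.size());
    for (std::size_t i = 0; i < received.size(); ++i)
    {
        Force& f = gpuForces.at(localIndex[i]);
        f.x += received[i].x;
        f.y += received[i].y;
        f.z += received[i].z;
    }
}
```

Calling `addCpuForces` first and `accumulateHaloForces` second leaves each local atom with the sum of GPU, CPU, and remote contributions, because the second step only adds to what is already there.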

Due 9/3

  • LF pressure-coupling support for GPU-based update. Coding assigned to Artem. Review assigned to TODO.
    Status - P-R is plus2 on gerrit at https://gerrit.gromacs.org/c/gromacs/+/12477. Berendsen support easy to add in a future patch?
    DONE
  • tests covering .mdp nst* flags, in particular for well-chosen co-prime cases to prevent unexpected behavior changes. Also, can we test e.g. forces with FEP on non-output steps? Coding assigned to Mark. Review assigned to TODO.
    Status - start work Monday afternoon
    Status?
  • Make GPU version of StatePropagatorData. Coding assigned to Artem. Review assigned to Erik, Mark.
    Status - awaiting patch from Artem. Naming decision recorded at https://gerrit.gromacs.org/c/gromacs/+/11986
    DONE; follow-up WIP linking tasks with event dependencies
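
For the nst* test item above, a small helper for picking and checking co-prime intervals might look like the following sketch. The function names are hypothetical and it assumes C++17 for `std::gcd` and `std::lcm`. It also assumes the usual "act every nst steps, nst <= 0 means never" convention, and it ignores any constraints that real .mdp options place on one another.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// True if every pair of intervals has greatest common divisor 1.
bool pairwiseCoprime(const std::vector<int>& intervals)
{
    for (std::size_t i = 0; i < intervals.size(); ++i)
    {
        for (std::size_t j = i + 1; j < intervals.size(); ++j)
        {
            if (std::gcd(intervals[i], intervals[j]) != 1)
            {
                return false;
            }
        }
    }
    return true;
}

// First step after 0 on which all intervals trigger together, which is
// their least common multiple.
long long firstCommonStep(const std::vector<int>& intervals)
{
    long long step = 1;
    for (int n : intervals)
    {
        assert(n > 0);
        step = std::lcm(step, static_cast<long long>(n));
    }
    return step;
}

// Assumed convention: an action runs on steps divisible by nst, and
// nst <= 0 disables it.
bool isActionStep(long long step, int nst)
{
    return nst > 0 && step % nst == 0;
}
```

With pairwise co-prime intervals such as 3, 5 and 7, every combination of "which actions run on this step" occurs before step 105, so a short test run exercises each combination at least once.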

Due 9/5

  • Clean up to make a PP-rank "PME force receiver". Coding assigned to Mark. Review assigned to Alan, Paul, Berk.
    Status: awaiting pp-pme exchange class stabilizing. Also review of clfft init change https://gerrit.gromacs.org/c/gromacs/+/12897.
    DONE
  • Code to receive PME-rank forces on PP-rank GPU buffers (avoiding CPU). Coding assigned to Alan. Review assigned to Szilard, Berk, Mark.
    Status: Code at https://gerrit.gromacs.org/c/gromacs/+/12980 awaiting review.
    Reviewed (Szilard); merge pending rebase

Due 9/6

Due 9/9

  • Class containing PME-PP exchange for coordinate buffers. Coding assigned to Alan. Review assigned to Szilard, Berk, Mark.
    Status: Code at https://gerrit.gromacs.org/c/gromacs/+/13043 awaiting review.
    Rebase needed!
  • refinement of simulation flag structs. Coding assigned to Szilard. Review assigned to Mark, Berk, Paul, Erik.
    Status: awaiting submission of earlier patch
    DONE / follow-up needed: SimulationWorkload flags prepared after task assignment

Due 9/12

  • Stitching together high-level multi-GPU logic to achieve performance. Coding assigned to TODO. Will need input from many people
    Status: precursors WIP (13427, 13494); related tasks on #2890

Due 9/14

  • mdrun user interface + choices of defaults. Coding assigned to Paul. Review assigned to Alan, Artem, Mark.
    Status: not started yet
    Partially DONE; follow-up needed: SimulationWorkload, refine defaults selection.

Due last minute

  • Link task assignment to simulation flags and high-level GPU logic. Coding assigned to TODO. Will need input from many people.
    Status: clfft init patch needing review https://gerrit.gromacs.org/c/gromacs/+/12897. That prepares for a patch to manage GPU streams at a high level and pass handles into the individual modules that need to collaborate on the ~5 streams we identified (nonlocal, local, PME work, PP-PME transfer, update). Waiting on Berk's patch to pull cr->duty assignment out of init_domain_decomposition(), so that we can reorder operations in the runner to make GPU device info available earlier and set up GPU streams before init_domain_decomposition() and init_forcerec().
    Status: data structures are ready for simulation flags to be populated (SimulationWorkload) after task assignment. A SimulationWorkloadBuilder is needed that takes the task-assignment output as well as some of the dev feature flags and makes a runtime-constant workload descriptor that can be passed down to do_force(). Note that some of the current c_useFeatureX flags are used to compute a per-step flag with overrides (like buffer ops), but others have the overrides built into the construction of the global boolean (like c_enableGpuHaloExchange). A distinction needs to be made between i) per-step overrides and ii) feature requirements/static overrides, which might be reasonable to include in the SimulationWorkload flags or might be best to assert on.
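
One possible shape for the workload builder described in the last status note, as a rough sketch only. All struct and function names below are hypothetical stand-ins, and the specific rules (halo exchange needing more than one domain and GPU nonbondeds, force buffer ops being skipped on virial steps, buffer ops requiring GPU nonbondeds) are illustrative assumptions chosen to show the three categories: static overrides folded into a flag, requirements that are asserted on, and per-step overrides computed later.

```cpp
#include <cassert>

// Hypothetical inputs: output of task assignment and development feature flags.
struct TaskAssignmentSketch
{
    bool nonbondedOnGpu;
    bool pmeOnGpu;
    bool updateOnGpu;
    int  numDomains;
};

struct DevFeatureFlagsSketch
{
    bool enableGpuBufferOps;
    bool enableGpuHaloExchange;
};

// Hypothetical runtime-constant descriptor, built once after task assignment.
struct SimulationWorkloadSketch
{
    bool useGpuNonbonded;
    bool useGpuPme;
    bool useGpuUpdate;
    bool useGpuBufferOps;
    bool useGpuHaloExchange;
};

// Hypothetical per-step descriptor, derived from the constant one each step.
struct StepWorkloadSketch
{
    bool useGpuFBufferOps;
};

SimulationWorkloadSketch buildSimulationWorkload(const TaskAssignmentSketch&  tasks,
                                                 const DevFeatureFlagsSketch& dev)
{
    // Feature requirement handled by asserting (illustrative rule).
    assert(!dev.enableGpuBufferOps || tasks.nonbondedOnGpu);

    SimulationWorkloadSketch work{};
    work.useGpuNonbonded = tasks.nonbondedOnGpu;
    work.useGpuPme       = tasks.pmeOnGpu;
    work.useGpuUpdate    = tasks.updateOnGpu;
    work.useGpuBufferOps = dev.enableGpuBufferOps;
    // Static override folded into the flag itself (illustrative rule).
    work.useGpuHaloExchange =
            dev.enableGpuHaloExchange && tasks.nonbondedOnGpu && tasks.numDomains > 1;
    return work;
}

StepWorkloadSketch buildStepWorkload(const SimulationWorkloadSketch& sim, bool computeVirial)
{
    StepWorkloadSketch step{};
    // Per-step override (illustrative rule): fall back on virial steps.
    step.useGpuFBufferOps = sim.useGpuBufferOps && !computeVirial;
    return step;
}
```

Keeping the per-step decision in a separate function means the descriptor passed down from the runner never changes during the run, while the step-dependent overrides stay visible in one place.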