Integration planning for GROMACS 2020 GPU features
The functionality below is targeted for integration by 9/16. (TODO: record that.)
- GPU halo-exchange for positions. Coding assigned to Alan. Review assigned to Mark and Berk.
Status: code is in Gerrit; needs input on class, file, and method naming
- pp-pme exchange class. Coding assigned to Alan. Review assigned to Mark, Berk.
Status: awaiting Alan's update to the existing change https://gerrit.gromacs.org/c/gromacs/+/12783, following last week's review
- simulation flags structs. Coding assigned to Szilard. Review assigned to Mark, Berk, Paul, Erik.
Status: WIP at https://gerrit.gromacs.org/c/gromacs/+/11622. Would benefit from a face-to-face discussion on how to describe the lifetime and intent of this initial patch, whose form is understood to be unlike what we want in the long term
- microstate-vector GPU buffer manager. Coding assigned to Artem. Review assigned to Erik, Mark.
Status: awaiting patch from Artem
- Question: Does multi-domain GPU-halo-exchange force handling support adding extra CPU forces? Alan had an answer on 8/30 but we didn't write it down
- tests covering .mdp nst* flags, in particular for well-chosen co-prime cases, to prevent unexpected behavior changes. Coding assigned to Mark. Review assigned to TODO.
Status: work starts Monday afternoon
- LF pressure-coupling support for GPU-based update. Coding assigned to Artem. Review assigned to TODO.
Status: Parrinello-Rahman (P-R) is /12477. Is Berendsen support easy to add in a future patch?
- Clean up to make a PP-rank "PME force receiver". Coding assigned to Mark. Review assigned to Alan, Paul, Berk.
Status: awaiting stabilization of the pp-pme exchange class. Also needs review of the clFFT init change https://gerrit.gromacs.org/c/gromacs/+/12897.
- Class to contain code to receive PME-rank forces on PP-rank GPU buffers (avoiding CPU). Coding assigned to Alan. Review assigned to Szilard, Berk, Mark.
Status: awaiting pp-pme exchange class stabilizing
- Halo-exchange class for force+reduction. Coding assigned to Alan. Review assigned to Szilard, Mark, Berk.
Status: need feedback on position exchange to guide the similar choices here; otherwise Alan will just be refactoring existing code
- Stitching together high-level single-GPU logic to achieve performance. Coding assigned to Artem. Review assigned to Erik, Mark.
- refinement of simulation flag structs. Coding assigned to Szilard. Review assigned to Mark, Berk, Paul, Erik.
Status: awaiting submission of earlier patch
- Stitching together high-level multi-GPU logic to achieve performance. Coding assigned to TODO. Will need input from many people
Status: not started yet
- mdrun user interface + choices of defaults. Coding assigned to Paul. Review assigned to Alan, Artem, Mark.
Status: not started yet
Due last minute
- Link task assignment to simulation flags and high-level GPU logic. Coding assigned to TODO. Will need input from many people.
Status: the clFFT init patch needs review: https://gerrit.gromacs.org/c/gromacs/+/12897. That prepares for a patch to manage GPU streams at a high level and pass handles into the individual modules that need to collaborate on the ~5 streams we identified (namely nonlocal, local, PME work, PP-PME transfer, and update). We probably need to pull the cr->duty assignment out of init_domain_decomposition() and rearrange the order of operations in the runner so that GPU device info is available earlier, so that GPU streams can be set up before init_domain_decomposition() and init_forcerec()
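As a starting point for discussion, the stream-handle idea in the last status note could be sketched roughly as follows. This is only an illustration under stated assumptions: `StreamRole`, `StreamHandle`, `StreamManager`, `HaloExchangeModule`, `PmePpReceiverModule`, `RunnerState`, and `setUpRunner` are hypothetical names, not existing GROMACS types, and the integer `id` stands in for a real stream object. Only the five stream roles are taken from the note itself.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// All names here are hypothetical placeholders, not existing GROMACS code.
// The five roles mirror the streams identified in the planning note.
enum class StreamRole : std::size_t
{
    NonLocal,
    Local,
    PmeWork,
    PpPmeTransfer,
    Update,
    Count
};

// Opaque stand-in for a real GPU stream object; an int keeps the sketch portable.
struct StreamHandle
{
    int  id = -1;
    bool isValid() const { return id >= 0; }
};

// Owns every stream. Constructed once, after GPU device info is known.
class StreamManager
{
public:
    explicit StreamManager(int deviceId)
    {
        assert(deviceId >= 0 && "device info must be available before stream setup");
        for (std::size_t i = 0; i < streams_.size(); ++i)
        {
            // Fake creation: a real version would create the stream on the device here.
            streams_[i].id = deviceId * 100 + static_cast<int>(i);
        }
    }
    StreamHandle stream(StreamRole role) const
    {
        return streams_[static_cast<std::size_t>(role)];
    }

private:
    std::array<StreamHandle, static_cast<std::size_t>(StreamRole::Count)> streams_;
};

// Modules receive only the handles they collaborate on; they do not create streams.
struct HaloExchangeModule
{
    StreamHandle local;
    StreamHandle nonLocal;
    HaloExchangeModule(StreamHandle l, StreamHandle nl) : local(l), nonLocal(nl) {}
};

struct PmePpReceiverModule
{
    StreamHandle ppPmeTransfer;
    explicit PmePpReceiverModule(StreamHandle s) : ppPmeTransfer(s) {}
};

struct RunnerState
{
    StreamManager       streams;
    HaloExchangeModule  halo;
    PmePpReceiverModule pmeReceiver;
};

// Intended ordering: streams first, then the modules that consume the handles
// (in the note's terms, before domain decomposition and forcerec initialization).
RunnerState setUpRunner(int deviceId)
{
    StreamManager       streams(deviceId);
    HaloExchangeModule  halo(streams.stream(StreamRole::Local), streams.stream(StreamRole::NonLocal));
    PmePpReceiverModule receiver(streams.stream(StreamRole::PpPmeTransfer));
    return { streams, halo, receiver };
}
```

The point of the sketch is the ownership direction: the runner creates the streams once and hands out handles, so two modules that must synchronize on the same stream are guaranteed to hold the same handle.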
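For the nst* test item earlier in this list, one possible way to pick co-prime interval cases is sketched below. It assumes, for illustration only, that an action with interval n fires on steps where `step % n == 0`; the helper names `coprimePairs` and `coincidingSteps` are hypothetical, and which nst* options to pair (and any constraints between them) is left open.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper: returns every pair of candidate intervals
// (earlier element first) whose greatest common divisor is 1.
std::vector<std::pair<int, int>> coprimePairs(const std::vector<int>& candidates)
{
    std::vector<std::pair<int, int>> pairs;
    for (std::size_t i = 0; i < candidates.size(); ++i)
    {
        for (std::size_t j = i + 1; j < candidates.size(); ++j)
        {
            if (std::gcd(candidates[i], candidates[j]) == 1)
            {
                pairs.emplace_back(candidates[i], candidates[j]);
            }
        }
    }
    return pairs;
}

// Hypothetical helper: steps in [0, nsteps] on which two periodic actions
// with intervals a and b fire together, assuming "fires when step % n == 0".
std::vector<long> coincidingSteps(int a, int b, long nsteps)
{
    assert(a > 0 && b > 0);
    std::vector<long> steps;
    for (long step = 0; step <= nsteps; ++step)
    {
        if (step % a == 0 && step % b == 0)
        {
            steps.push_back(step);
        }
    }
    return steps;
}
```

Co-prime intervals coincide only on multiples of their product, so a short run exercises both the "only one action fires" and the "both fire" code paths with few steps. For example, intervals 3 and 5 over 30 steps coincide on steps 0, 15, and 30.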