Task #3170

Feature #2816: GPU offload / optimization for update & constraints, buffer ops and multi-GPU communication

Feature #2817: GPU X/F buffer ops

Feature #3029: GPU force buffer ops + reduction

investigate GPU f buffer ops use cases

Added by Szilárd Páll 3 months ago. Updated 26 days ago.

Target version:


Check whether there are any performance benefits to be had, and in which regimes, from x / f buffer ops without GPU update in:
  • runs with DD and CPU update
    • x buffer ops: offloadable, likely with a simple crossover heuristic threshold, i.e. not offloaded below N atoms/core (locals only or also nonlocals, with/without CPU work?)
    • f buffer ops: the heuristic likely needs more complex criteria (as these ops are combined with reductions)
  • runs with / without DD and vsites
    • with GPU update: requires D2H and H2D transfers; is it worth it? Test use cases (e.g. multiple ranks per GPU, both ensemble and DD runs; transfers might be overlapped)
    • without GPU update: the same applies as for the non-vsites runs above, except that the wait on D2H needs to happen earlier
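
The "simple crossover heuristic threshold" for x buffer ops could look roughly like the sketch below. Everything in it is an assumption made for illustration: the struct, the function name and especially the threshold value are hypothetical and not taken from GROMACS; the actual N would have to come from the benchmarks this task asks for.

```cpp
// Hypothetical inputs to the decision; these names are not GROMACS identifiers.
struct BufferOpsInputs
{
    int numAtomsOnRank;     // atoms handled by this rank
    int numCpuCoresPerRank; // CPU cores available to this rank
};

// Placeholder for the crossover value "N" (atoms/core); to be determined by measurement.
constexpr int c_minAtomsPerCoreForOffload = 5000;

// Returns true when x buffer ops would be offloaded under the simple threshold rule:
// offload only if the rank has at least `threshold` atoms per CPU core.
bool shouldOffloadXBufferOps(const BufferOpsInputs& in, int threshold = c_minAtomsPerCoreForOffload)
{
    // Treat a non-positive core count as a single core to keep the rule well-defined.
    const long long cores = in.numCpuCoresPerRank > 0 ? in.numCpuCoresPerRank : 1;
    // Compare atoms >= threshold * cores to avoid integer-division rounding.
    return static_cast<long long>(in.numAtomsOnRank) >= static_cast<long long>(threshold) * cores;
}
```

With the placeholder value, a rank with 40000 atoms on 8 cores sits exactly at the threshold and is offloaded, while 39999 atoms on 8 cores is not. A single threshold like this would probably not suffice for f buffer ops, since their cost is tied to the reduction.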

Related issues

Related to GROMACS - Task #3171: schedule CPU H2D force contribution in separate stream (New)

Associated revisions

Revision 86a27bc2 (diff)
Added by Szilárd Páll about 1 month ago

Allow overlapping CPU force H2D with compute

The reduction orchestration code already uses explicit sync event
in all cases and StateGpu implements the ability to schedule force
H2D in a separate stream for the "All" locality.
Hence, for non-DD runs this change switches the CPU force H2D to be done
in the update stream, allowing overlap with force work in the local stream.
Refs #3170 #3029

Change-Id: Iceb9aac395335c062109d552d3f0289688a9c75f
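
The potential gain from moving the CPU force H2D into a separate stream can be illustrated with a toy timing model. This is only a sketch under strong assumptions: the struct, function names and timings are hypothetical, the copy is assumed to be able to start as soon as the force kernels start (i.e. the CPU forces are already available), and synchronization overheads are ignored.

```cpp
#include <algorithm>

// Hypothetical per-step timings in microseconds; illustrative values, not measurements.
struct StepTimings
{
    double gpuForceCompute; // force kernels running in the local stream
    double cpuForceH2D;     // copy of the CPU force contribution to the GPU
    double forceReduction;  // GPU f buffer ops + reduction
};

// H2D enqueued in the same stream as the force kernels: the operations serialize.
double timeSameStream(const StepTimings& t)
{
    return t.gpuForceCompute + t.cpuForceH2D + t.forceReduction;
}

// H2D enqueued in a separate stream: the reduction waits (via a sync event)
// until both the force kernels and the copy have completed.
double timeSeparateStream(const StepTimings& t)
{
    return std::max(t.gpuForceCompute, t.cpuForceH2D) + t.forceReduction;
}
```

For example, with 200 us of force compute, a 50 us copy and a 30 us reduction, the model gives 280 us in a single stream and 230 us with the copy in a separate stream. In this model the benefit is bounded by the shorter of the copy and compute times, which is one reason the regimes need to be measured rather than assumed.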


#1 Updated by Szilárd Páll 3 months ago

  • Subject changed from investigate GPU f buffer ops + vsites use case to investigate GPU f buffer ops use cases
  • Description updated (diff)

#2 Updated by Szilárd Páll 3 months ago

  • Parent task set to #3029

#3 Updated by Szilárd Páll 2 months ago

  • Related to Task #3171: schedule CPU H2D force contribution in separate stream added

#4 Updated by Paul Bauer 26 days ago

  • Target version changed from 2020 to 2021
