Task #2675

bonded CUDA offload task

Added by Szilárd Páll over 1 year ago. Updated 21 days ago.

In Progress
Target version: 2021


Top-level task, a summary of the sub-tasks required to deliver the bonded GPU offload in CUDA for the 2019 release.

The plan is to take the NVIDIA code and integrate it into the 2019 release with two goals: running it next to the PP task using the same coordinates (and possibly the same force output buffer), and minimizing the amount of new CUDA code needed. The initial implementation will only support bonded offload if all listed interactions can be offloaded; offloading a subset should be a straightforward extension, as should excluding perturbed bondeds.
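
The all-or-nothing rule described above can be sketched as a simple check. This is only an illustration, not the GROMACS implementation: the function name `canOffloadBondedsToGpu` and the use of strings for interaction types are assumptions made for this example.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for an interaction-type identifier;
// the real code would use its own enumeration.
using InteractionType = std::string;

// Returns true only if every bonded interaction type present in the system
// is in the set the GPU kernels can handle (all-or-nothing policy).
bool canOffloadBondedsToGpu(const std::vector<InteractionType>& typesInSystem,
                            const std::set<InteractionType>&    gpuSupportedTypes)
{
    return std::all_of(typesInSystem.begin(), typesInSystem.end(),
                       [&](const InteractionType& type) {
                           return gpuSupportedTypes.count(type) > 0;
                       });
}
```

Relaxing this to subset offload would amount to partitioning the types into a GPU list and a CPU list instead of returning a single boolean.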

Coarse list (individual subtasks linked):
  • filler-particle extension to the DD module bonded task conversion based on NB indexing (allows reuse of nbnxn coordinates +/- force buffer for bondeds)
  • initial bonded CUDA code cleanup
  • bonded task scheduling and reduction scheduling code
  • command line interface and task assignment
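
To illustrate the first item, here is a minimal sketch of converting a bonded interaction list to non-bonded (NB) indexing so that bondeds can read the nbnxn coordinate buffer. The names `BondedEntry`, `remapToNbnxnLayout`, and `atomToNbnxnIndex` are hypothetical, and the data layout is an assumption made for this example rather than the actual GROMACS structures.

```cpp
#include <vector>

// One bonded interaction: a parameter-type index plus the atoms involved.
// (Hypothetical layout for illustration.)
struct BondedEntry
{
    int              parameterType;
    std::vector<int> atoms;
};

// Rewrites the atom indices of each bonded entry from local-atom order into
// the nbnxn buffer order. atomToNbnxnIndex[i] is assumed to give the nbnxn
// buffer slot of local atom i; slots that no atom maps to correspond to
// filler particles.
std::vector<BondedEntry> remapToNbnxnLayout(const std::vector<BondedEntry>& bondeds,
                                            const std::vector<int>&         atomToNbnxnIndex)
{
    std::vector<BondedEntry> remapped;
    remapped.reserve(bondeds.size());
    for (const BondedEntry& entry : bondeds)
    {
        BondedEntry out{ entry.parameterType, {} };
        out.atoms.reserve(entry.atoms.size());
        for (int atom : entry.atoms)
        {
            // at() throws std::out_of_range on an invalid atom index
            out.atoms.push_back(atomToNbnxnIndex.at(atom));
        }
        remapped.push_back(out);
    }
    return remapped;
}
```

With indices in this form, a bonded kernel could use the same coordinate buffer as the non-bonded kernel and, possibly, accumulate into the same force buffer.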


Task #2676: map bonded interaction list to nbnxn layout (Closed, Berk Hess)
Task #2677: bonded task scheduling (Closed, Szilárd Páll)
Task #2678: bonded force reduction (Closed)
Task #2679: bonded GPU offload task assignment (Closed, Mark Abraham)
Task #2686: add tests for gpu bonded interactions (New)
Task #2694: bonded CUDA kernels (Closed)
Task #2695: bonded GPU module timing (New)
Task #2723: Update mdrun-performance.rst to clearly express the nature of task (New, Joe Jordan)
Task #2724: Clean up organization of bonded cuda module (Closed, Mark Abraham)
Task #2983: better suited data-types for bonded GPU kernels (New)
Bug #2987: assess the bonded GPU task assignment default (New)
Task #2988: clean up and refactor code to modern standards (In Progress)
Task #3001: explore simplifying virial and shift force reduction (New)
Task #3002: consider splitting bonded work into local/nonlocal (New)
Task #3003: implement heuristic fallback to CPU when there is too little work for GPU offload (New)
Task #3008: verify block size choice of CUDA bonded kernel (New)
Task #3183: enable bonded interactions on GPU (Accepted)

Related issues

Related to GROMACS - Task #2818: bonded GPU kernel fusion (In Progress)
Related to GROMACS - Task #2936: introduce check that CPU-GPU transfers are made between arrays of compatible types (New)

Associated revisions

Revision b9e713b6
Added by Jonathan Vincent about 1 year ago

Add CUDA bonded kernels

CUDA bonded kernels are added for the most common bonded and LJ-14
interactions. With the default auto settings, mdrun offloads these
interactions to the GPU when possible.
Currently these interactions are computed in the local or non-local
nbnxn non-bonded streams. We should consider using a separate stream.
This change uses synchronous transfers. A child change will make
these asynchronous.

Updated release notes and performance guide.

Fixes #2678
Refs #2675

Change-Id: Ifc6d97854cc7afa8526602942ec3b1712ba45bac


#1 Updated by Gerrit Code Review Bot over 1 year ago

Gerrit received a related patchset '10' for Issue #2675.
Uploader: Szilárd Páll
Change-Id: gromacs~master~Ia19452df74407186aaff350d8df27dfc3d7d359f

#2 Updated by Szilárd Páll over 1 year ago

  • Description updated

#3 Updated by Mark Abraham over 1 year ago

  • Status changed from New to In Progress

#4 Updated by Gerrit Code Review Bot about 1 year ago

Gerrit received a related patchset '1' for Issue #2675.
Uploader: Berk Hess
Change-Id: gromacs~release-2019~Ifc6d97854cc7afa8526602942ec3b1712ba45bac

#5 Updated by Szilárd Páll about 1 year ago

This task is now blocked by subtasks targeted at 2020, so we need to either retarget this task or, preferably, move those subtasks out.

#6 Updated by Paul Bauer about 1 year ago

  • Target version changed from 2019 to 2020

#7 Updated by Szilárd Páll 11 months ago

  • Related to Task #2818: bonded GPU kernel fusion added

#8 Updated by Szilárd Páll 9 months ago

  • Related to Task #2936: introduce check that CPU-GPU transfers are made between arrays of compatible types added

#9 Updated by Paul Bauer 21 days ago

  • Target version changed from 2020 to 2021
