Task #3221

Feature #2816: GPU offload / optimization for update & constraints, buffer ops and multi-GPU communication

Feature #2888: CUDA Update and Constraints module

fix the x D2H overlap limitation with GPU update

Added by Szilárd Páll 10 months ago. Updated 10 months ago.

Status: Closed
Priority: High
Assignee: -
Category: mdrun
Target version:
Difficulty: uncategorized

Description

When a run uses GPU update and some tasks remain on the CPU, the coordinate (x) device-to-host (D2H) transfer should overlap with the GPU force compute. Because of the simplistic safety measures put in place to keep the StatePropagatorGpu code maintainable, we still launch the transfer and wait for it back-to-back, which prevents any overlap.

This makes the x D2H pure overhead in GPU update runs. To avoid that, we should move the waits to just before the CPU tasks that consume x (preferably after buffer state tracking has been added to StatePropagatorGpu, so that we do not incur the overhead of multiple CUDA wait calls).
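
The proposed ordering can be illustrated with a minimal CPU-only sketch. It is an analogy and not GROMACS code: std::async stands in for an asynchronous copy on a CUDA stream, std::future::get() stands in for the CUDA wait, and every name (CoordinateTransfer, launchGpuForceWork, cpuTaskA, cpuTaskB, doStep) is hypothetical. The ready_ flag is a toy version of the buffer state tracking mentioned above, assuming its purpose is to let only the first consumer pay for the wait.

```cpp
#include <algorithm>
#include <cassert>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the x D2H transfer; not the StatePropagatorGpu API.
class CoordinateTransfer
{
public:
    // Start the copy on another thread and return immediately (no wait here).
    void launch(const std::vector<float>& deviceX)
    {
        pending_ = std::async(std::launch::async, [deviceX] { return deviceX; });
        ready_   = false;
    }

    // Block until the copy has finished, but only on the first call after a
    // launch; later consumers see ready_ == true and skip the wait.
    const std::vector<float>& waitAndGet()
    {
        if (!ready_)
        {
            hostX_ = pending_.get();
            ready_ = true;
            ++numWaits_;
        }
        return hostX_;
    }

    int numWaits() const { return numWaits_; }

private:
    std::future<std::vector<float>> pending_;
    std::vector<float>              hostX_;
    bool                            ready_    = true;
    int                             numWaits_ = 0;
};

// Stand-in for GPU force work, which needs x only on the device.
float launchGpuForceWork(const std::vector<float>& deviceX)
{
    return std::inner_product(deviceX.begin(), deviceX.end(), deviceX.begin(), 0.0f);
}

// Stand-ins for two CPU tasks that consume x on the host.
float cpuTaskA(const std::vector<float>& x)
{
    return std::accumulate(x.begin(), x.end(), 0.0f);
}

float cpuTaskB(const std::vector<float>& x)
{
    assert(!x.empty());
    return *std::max_element(x.begin(), x.end());
}

struct StepResult
{
    float gpuForce;
    float cpuSum;
    float cpuMax;
    int   numWaits;
};

StepResult doStep(const std::vector<float>& deviceX)
{
    CoordinateTransfer transfer;
    transfer.launch(deviceX);                             // 1. start x D2H, do not wait
    const float gpuForce = launchGpuForceWork(deviceX);   // 2. issued before any wait
    const float cpuSum   = cpuTaskA(transfer.waitAndGet()); // 3. first consumer waits
    const float cpuMax   = cpuTaskB(transfer.waitAndGet()); // 4. second consumer does not
    return { gpuForce, cpuSum, cpuMax, transfer.numWaits() };
}
```

Calling doStep({ 1.0f, 2.0f, 3.0f }) performs a single wait even though two CPU tasks read x. In the back-to-back scheme described above, step 3's wait would instead sit directly after step 1, so the force work in step 2 could not start until the copy had completed.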

Associated revisions

Revision 9a3c2ce3 (diff)
Added by Szilárd Páll 10 months ago

Allow x D2H to overlap with GPU force compute

With GPU update, coordinates are transferred back to the CPU every step
if there are forces to compute on the CPU. Originally this was
implemented with a back-to-back transfer launch and wait at the
beginning of do_force().
This change moves the CPU wait for the completion of the coordinate
transfer closer to the consumer tasks, which avoids blocking the launch
of GPU force tasks and allows compute and transfer to overlap.

Fixes #3221

Change-Id: Ia6641147bbec1186b54c1445d36dc31000eae9c4

History

#1 Updated by Szilárd Páll 10 months ago

  • Status changed from New to Resolved

#2 Updated by Szilárd Páll 10 months ago

  • Status changed from Resolved to Closed
