Task #3157
Task #3370: Further improvements to GPU Buffer Ops and Comms
Feature #2891: PME/PP GPU communications
separate PME x receive sync
Description
As agreed, the data-dependency synchronization should be implemented on the consumer task's end, which is PME spread in the case of PME. PME-only ranks have the receive code enqueue the wait as soon as MPI returns. Consider assembling a list of events and passing it to spread instead.
Consider whether having to receive from multiple PP ranks actually makes it more beneficial to overlap some receives with the event-wait enqueue.
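The proposal of collecting events and handing them to spread can be sketched roughly as follows. This is only an illustration of the intended ownership of the waits, not the GROMACS implementation: `Event`, `Stream`, `CoordinateReceiver`, `onReceive`, `takeEvents` and `launchSpread` are all hypothetical names, and the stream is a mock that records what would be enqueued instead of calling a real GPU API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a GPU event marking one sender's coordinate transfer.
struct Event
{
    int senderRank; // PP rank whose data this event guards
};

// Hypothetical stand-in for a GPU stream; it only records the enqueue order.
struct Stream
{
    std::vector<std::string> log;
    void enqueueWait(const Event& e)
    {
        log.push_back("wait(rank " + std::to_string(e.senderRank) + ")");
    }
    void enqueueKernel(const std::string& name) { log.push_back(name); }
};

// Receiver side: collects one event per completed receive, enqueues no waits.
class CoordinateReceiver
{
public:
    void onReceive(int senderRank) { pending_.push_back(Event{ senderRank }); }

    // Hands the collected events to the consumer and resets for the next step.
    std::vector<Event> takeEvents()
    {
        std::vector<Event> out;
        out.swap(pending_);
        return out;
    }

private:
    std::vector<Event> pending_;
};

// Consumer side: spread enqueues the waits itself, right before its kernel.
void launchSpread(Stream& pmeStream, const std::vector<Event>& coordinateEvents)
{
    for (const Event& e : coordinateEvents)
    {
        pmeStream.enqueueWait(e);
    }
    pmeStream.enqueueKernel("spread");
}
```

In this arrangement the receive path stays free of stream operations, and the dependency is expressed where the data is consumed. Whether enqueueing some waits early (overlapped with later receives) is faster when there are many PP ranks is the open question raised above; the sketch does not answer it.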
Associated revisions
History
#1 Updated by Alan Gray about 1 year ago
- Status changed from New to Closed
Moved to umbrella task https://redmine.gromacs.org/issues/3370
GPU Coordinate PME/PP Communications
Extends PmePpCommGpu class to provide PP-side support for coordinate
transfers from either GPU or CPU to PME task, and adds new
PmeCoordinateReceiverGpu class to receive coordinate data directly to
the GPU on the PME task.
Implements part of #2817
Refs TODOs #3157 #3158 #3159
Change-Id: Iefa2bdfd9813282ad8b07feeb7691f16880e61a2