Feature #3087

Task #3370: Further improvements to GPU Buffer Ops and Comms

Feature #2915: GPU direct communications

enable GPU peer to peer access

Added by Szilárd Páll about 1 year ago. Updated 11 months ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
mdrun
Target version:
Difficulty:
uncategorized

Description

For efficient direct GPU communication, peer-to-peer access between the GPUs used in the run should be enabled.

However, this functionality should be implemented so that all (or most) errors are handled explicitly, and the function aborts the run only if an error known to be fatal is detected. Otherwise, since peer access is only a performance concern, the run should continue.

Related: the current working assumption is that, even if peer access is not enabled, direct copy should not be slower than staged copy. As we are not sure of this, we might want to consider disabling GPU direct copy if enabling peer access fails.
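
A minimal C++ sketch of the policy described above, not the actual GROMACS code. All names here (PeerStatus, PeerApi, PeerSetupResult, setupPeerAccess, chooseCopyPath) are hypothetical and introduced only for illustration. The GPU runtime is hidden behind two callbacks so the logic can be exercised without a GPU; in a CUDA build these would presumably wrap cudaDeviceCanAccessPeer and cudaDeviceEnablePeerAccess. Which runtime errors count as fatal is also an assumption left to the caller.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical status codes standing in for the GPU runtime's error codes.
enum class PeerStatus
{
    Enabled,        // peer access was enabled by this call
    AlreadyEnabled, // benign: peer access had been enabled earlier
    Unsupported,    // hardware or topology does not allow peer access
    Failed,         // enabling failed, but the devices remain usable
    Fatal           // unrecoverable error: the run must abort
};

// Hypothetical thin wrapper around the GPU runtime (assumed, not a real API).
struct PeerApi
{
    std::function<bool(int, int)>       canAccess; // (device, peer) -> supported?
    std::function<PeerStatus(int, int)> enable;    // (device, peer) -> outcome
};

struct PeerSetupResult
{
    bool                     allPairsEnabled = true;
    bool                     fatal           = false;
    std::vector<std::string> warnings;
};

// Try to enable peer access for every ordered pair of GPUs in the run.
// Non-fatal failures are recorded as warnings; only a fatal error stops early.
PeerSetupResult setupPeerAccess(const std::vector<int>& gpuIds, const PeerApi& api)
{
    PeerSetupResult result;
    for (int src : gpuIds)
    {
        for (int dst : gpuIds)
        {
            if (src == dst)
            {
                continue;
            }
            const PeerStatus status =
                    api.canAccess(src, dst) ? api.enable(src, dst) : PeerStatus::Unsupported;
            switch (status)
            {
                case PeerStatus::Enabled:
                case PeerStatus::AlreadyEnabled: break;
                case PeerStatus::Fatal:
                    result.fatal           = true;
                    result.allPairsEnabled = false;
                    return result; // caller is expected to abort the run
                default:
                    // Performance concern only: note it and carry on.
                    result.allPairsEnabled = false;
                    result.warnings.push_back("Peer access not enabled from GPU "
                                              + std::to_string(src) + " to GPU "
                                              + std::to_string(dst));
                    break;
            }
        }
    }
    return result;
}

enum class CopyPath
{
    Direct,
    Staged
};

// Optional fallback: use staged copies if peer access could not be fully enabled.
CopyPath chooseCopyPath(const PeerSetupResult& result, bool disableDirectOnFailure)
{
    return (!result.allPairsEnabled && disableDirectOnFailure) ? CopyPath::Staged
                                                               : CopyPath::Direct;
}
```

A caller would abort only when result.fatal is set, print result.warnings otherwise, and pass disableDirectOnFailure = true only if measurements show that direct copy without peer access is slower than staged copy.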


Related issues

Related to GROMACS - Feature #2890: GPU Halo Exchange In Progress
Related to GROMACS - Feature #2891: PME/PP GPU communications In Progress

Associated revisions

Revision 643e75da (diff)
Added by Alan Gray 11 months ago

Enable GPU Peer Access in GPU Utilities

When using the new GPU communication features, enabling peer access
between pairs of GPUs (where supported) will allow peer-to-peer
communications. In this patch the CUDA code to enable peer access is
introduced into central GPU utilities and called from do_md.

Implements #3087

Change-Id: If668366b76d49f7b624eedb501f8af19135c4386

History

#1 Updated by Szilárd Páll about 1 year ago

#2 Updated by Szilárd Páll about 1 year ago

#3 Updated by Alan Gray 11 months ago

  • Status changed from New to Closed
