Feature #3087

Feature #2816: GPU offload / optimization for update & constraints, buffer ops and multi-GPU communication

Feature #2915: GPU direct communications

enable GPU peer to peer access

Added by Szilárd Páll 3 months ago. Updated about 2 months ago.

Status: Closed
Priority: Normal
Assignee: -
Category: mdrun
Target version:
Difficulty: uncategorized

Description

For efficient direct GPU communication, peer-to-peer access should be enabled between the GPUs used in the run.

This functionality should, however, be implemented so that all or most errors are handled explicitly and the run is aborted only if an error known to be fatal is detected; otherwise, as this is only a performance concern, the run should continue.

Related: the current working assumption is that, even if peer access is not enabled, direct copy should not be slower than staged copy. As we are not sure of this, we might want to consider disabling GPU direct copy if enabling peer access fails.
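
To make the requested behaviour concrete, here is a minimal sketch of the control flow, not the GROMACS implementation. All names (`PeerBackend`, `PeerStatus`, `setupPeerAccess`, `shouldUseDirectCopy`) are hypothetical. The GPU runtime is hidden behind two callables so the logic can be exercised without a GPU; in a CUDA build they would presumably wrap `cudaDeviceCanAccessPeer` and `cudaDeviceEnablePeerAccess`, with `cudaErrorPeerAccessAlreadyEnabled` mapped to `AlreadyEnabled`. Which errors count as fatal is not specified in this issue, so that classification is left to the backend and is an assumption of the sketch.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// How one attempt to enable access from a device to a peer ended.
enum class PeerStatus { Enabled, AlreadyEnabled, RecoverableError, FatalError };

// Hypothetical wrappers around the GPU runtime; neither is expected to throw.
struct PeerBackend
{
    std::function<bool(int device, int peer)>       canAccess;
    std::function<PeerStatus(int device, int peer)> enable;
};

struct PeerSetupResult
{
    bool                     allPairsEnabled = true;  // every ordered pair has peer access
    bool                     fatal           = false; // caller should abort the run
    std::vector<std::string> warnings;                // non-fatal problems to log
};

// Try to enable peer access for every ordered pair of devices in the run.
// Unsupported pairs and recoverable errors only produce warnings; the loop
// stops early only when the backend reports a fatal error.
PeerSetupResult setupPeerAccess(const std::vector<int>& deviceIds, const PeerBackend& backend)
{
    assert(backend.canAccess && backend.enable);
    PeerSetupResult result;
    for (int device : deviceIds)
    {
        for (int peer : deviceIds)
        {
            if (device == peer)
            {
                continue;
            }
            const std::string pair =
                    "device " + std::to_string(device) + " -> device " + std::to_string(peer);
            if (!backend.canAccess(device, peer))
            {
                result.allPairsEnabled = false;
                result.warnings.push_back("peer access not supported: " + pair);
                continue;
            }
            switch (backend.enable(device, peer))
            {
                case PeerStatus::Enabled:
                case PeerStatus::AlreadyEnabled: break;
                case PeerStatus::RecoverableError:
                    result.allPairsEnabled = false;
                    result.warnings.push_back("could not enable peer access: " + pair);
                    break;
                case PeerStatus::FatalError:
                    result.allPairsEnabled = false;
                    result.fatal           = true;
                    return result;
            }
        }
    }
    return result;
}

// One possible policy for the open question above: optionally fall back to
// staged copy when peer access could not be enabled for all pairs.
bool shouldUseDirectCopy(const PeerSetupResult& result, bool disableOnPeerFailure)
{
    return result.allPairsEnabled || !disableOnPeerFailure;
}
```

A caller would log `warnings`, abort only if `fatal` is set, and pass the result to `shouldUseDirectCopy` to choose between direct and staged copy. The `disableOnPeerFailure` switch keeps both options discussed above available until the performance question is settled.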


Related issues

Related to GROMACS - Feature #2890: GPU Halo Exchange New
Related to GROMACS - Feature #2891: PME/PP GPU communications New

Associated revisions

Revision 643e75da (diff)
Added by Alan Gray about 2 months ago

Enable GPU Peer Access in GPU Utilities

When using the new GPU communication features, enabling peer access
between pairs of GPUs (where supported) will allow peer-to-peer
communications. In this patch the CUDA code to enable peer access is
introduced into central GPU utilities and called from do_md.

Implements #3087

Change-Id: If668366b76d49f7b624eedb501f8af19135c4386

History

#1 Updated by Szilárd Páll 3 months ago

#2 Updated by Szilárd Páll 3 months ago

#3 Updated by Alan Gray about 2 months ago

  • Status changed from New to Closed
