Feature #3115

Device stream manager

Added by Artem Zhmurov about 2 months ago. Updated about 2 months ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
Difficulty: uncategorized

Description

Create a class that will manage GPU streams and contexts. The first version may be nothing more than a collection of streams (PME, local, non-local and update streams) and the OpenCL context. Everything would be constructed together with the object and exposed through appropriate getters. The streams should then be passed to all consumers, and the management of streams/contexts inside those consumers should be removed.

Tentative plan:
  • Make a plain object.
  • Connect to PME.
  • Connect to NBNXM/Bonded.
Will solve the following issues:
  • Currently we have two contexts for OpenCL. Consequently, one has to make sure that they are in sync for the device buffers.
  • It is not transparent when and if certain streams are created. This makes constructions such as "if (pmeStream != nullptr) {...}" necessary.
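
The "plain object" step of the plan could look roughly like the sketch below. It is only an illustration of the idea, not the eventual implementation: DeviceContext and DeviceStream are hypothetical stand-ins for the real handles (OpenCL context/command queue or CUDA stream), and the class name, enum values, constructor flags and method names are all assumptions made for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for the real device context handle.
struct DeviceContext
{
    int deviceId = 0;
};

// Hypothetical stand-in for a GPU stream / command queue tied to one context.
struct DeviceStream
{
    explicit DeviceStream(const DeviceContext& context) : context_(&context) {}
    const DeviceContext* context_;
};

// The four streams named in the description (names are assumptions).
enum class DeviceStreamType : std::size_t
{
    Pme,
    NonBondedLocal,
    NonBondedNonLocal,
    Update,
    Count
};

// Owns the single context and every stream; consumers only borrow references.
class DeviceStreamManager
{
public:
    DeviceStreamManager(int deviceId, bool usePme, bool useDomainDecomposition, bool useGpuUpdate) :
        context_{ deviceId }
    {
        // All creation logic lives in one place, so it is explicit which streams exist.
        create(DeviceStreamType::NonBondedLocal);
        if (usePme)
        {
            create(DeviceStreamType::Pme);
        }
        if (useDomainDecomposition)
        {
            create(DeviceStreamType::NonBondedNonLocal);
        }
        if (useGpuUpdate)
        {
            create(DeviceStreamType::Update);
        }
    }

    // Streams keep a pointer to context_, so the manager must not be copied or moved.
    DeviceStreamManager(const DeviceStreamManager&) = delete;
    DeviceStreamManager& operator=(const DeviceStreamManager&) = delete;

    const DeviceContext& context() const { return context_; }

    // Replaces ad-hoc nullptr checks scattered through the consumers.
    bool streamIsValid(DeviceStreamType type) const { return streams_[index(type)] != nullptr; }

    // Throws instead of handing out a stream that was never created.
    const DeviceStream& stream(DeviceStreamType type) const
    {
        if (!streamIsValid(type))
        {
            throw std::logic_error("Requested GPU stream was not created");
        }
        return *streams_[index(type)];
    }

private:
    static std::size_t index(DeviceStreamType type) { return static_cast<std::size_t>(type); }

    void create(DeviceStreamType type)
    {
        assert(!streamIsValid(type) && "Each stream is created at most once");
        streams_[index(type)] = std::make_unique<DeviceStream>(context_);
    }

    DeviceContext context_;
    std::array<std::unique_ptr<DeviceStream>, static_cast<std::size_t>(DeviceStreamType::Count)> streams_;
};
```

With an object like this, a consumer would receive a reference to the stream it needs (for example manager.stream(DeviceStreamType::Pme)) and would share the one context returned by manager.context(), rather than creating either itself.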

Subtasks

Feature #3135: Make GPU traits into opaque types (New)

Related issues

Related to GROMACS - Task #2522: OpenCL context duplication (New)
Related to GROMACS - Task #3077: PME/PP GPU Comms unique pointer deletion causes seg fault when CUDA calls exist in destructor (Feedback wanted)

Associated revisions

Revision 77857c59 (diff)
Added by Artem Zhmurov about 1 month ago

Pass the GPU streams to StatePropagatorDataGpu constructor

The StatePropagatorDataGpu now has a local copy of all GPU streams and
manages the update stream. This will allow selecting a specific stream
for a specific copy event in the follow-ups. The update stream is now
created in the constructor of the StatePropagatorDataGpu object, which
is a temporary solution until there is a separate device stream manager
(#3115).

Notes:

- In the current implementation, StatePropagatorDataGpu is also used
on PME-only ranks, where many of the streams do not exist, without
any restriction on the methods that require these streams. This is a
weakness of the design that will be dealt with in a follow-up.
- The OpenCL builds unconditionally use the PME stream/context, since for
these builds this object is only used when the initial coordinates are copied.
- The update stream is created in the constructor, whereas the rest of
the streams are passed as arguments. This asymmetry will be removed
with the introduction of centralized management of contexts/streams.

Refs. #2816.

Change-Id: Ia9b1cabd1d3d4942dba8465c716bf644037581e7

History

#1 Updated by Artem Zhmurov about 2 months ago

  • Description updated (diff)

#2 Updated by Szilárd Páll about 2 months ago

  • Related to Task #2522: OpenCL context duplication added

#4 Updated by Mark Abraham about 2 months ago

Artem Zhmurov wrote:

Create a class that will manage GPU streams and contexts. The first version may be nothing more than a collection of streams (PME, local, non-local and update streams) and the OpenCL context. Everything would be constructed together with the object and exposed through appropriate getters. The streams should then be passed to all consumers, and the management of streams/contexts inside those consumers should be removed.

Done

Tentative plan:
  • Make a plain object.
  • Connect to PME.
  • Connect to NBNXM/Bonded.

Bonded not complete, but easy to fix. Others done.

Will solve the following issues:
  • Currently we have two contexts for OpenCL. Consequently, one has to make sure that they are in sync for the device buffers.

I think we need only one context.

  • It is not transparent when and if certain streams are created. This makes constructions such as "if (pmeStream != nullptr) {...}" necessary.

The creation logic in my patch is clear and tested, but I have not considered how modules that consume the streams should react to whether a stream is valid.

#5 Updated by Szilárd Páll 18 days ago

  • Related to Task #3077: PME/PP GPU Comms unique pointer deletion causes seg fault when CUDA calls exist in destructor added
