Feature #3311

GPU infrastructure development

Added by Artem Zhmurov 8 months ago. Updated 7 months ago.

Status:
In Progress
Priority:
Normal
Assignee:
-
Category:
-
Target version:
Difficulty:
uncategorized
Close

Description

The general goal is to develop platform-agnostic infrastructure for CUDA and OpenCL, with the prospect of expanding to other technologies (SYCL?).


Subtasks

Feature #2967: GPU reallocateDeviceBuffer improvements (New)
Task #3312: Data type for coordinates, xyzq data, LJ parameters data to use for GPU buffers (In Progress, Artem Zhmurov)
Bug #3372: Re-enable RVec and float3 compatibility tests (Closed, Artem Zhmurov)
Feature #3313: Introduce and use opaque types for the DeviceStream and DeviceContext (Accepted, Artem Zhmurov)
Task #3314: Platform agnostic DeviceStream (Resolved, Artem Zhmurov)
Task #3315: Platform agnostic DeviceContext (Resolved, Artem Zhmurov)
Task #3316: Context and Stream manager (Accepted, Artem Zhmurov)
Task #3317: Improve testing of the GPU code (Accepted, Artem Zhmurov)
Feature #3318: Use wrappers for the GPU buffer copy/allocations (In Progress, Artem Zhmurov)
Task #3319: Use DeviceBuffer instead of native GPU types in NBNXM (In Progress, Artem Zhmurov)
Task #3320: Remove duplicating D2H/H2D wrappers in NBNXM (In Progress, Artem Zhmurov)
Task #3321: Add D2D wrapper (Accepted, Artem Zhmurov)
Task #3322: Add reallocate(...) function that does not care about the contents of the buffer (Accepted, Artem Zhmurov)
Task #3323: Rework the StatePropagatorDataGpu (In Progress, Artem Zhmurov)
Task #3324: Rework CMake handling of GPU code (New)

Related issues

Related to GROMACS - Task #2936: introduce check that CPU-GPU transfers/assignments are made between compatible types (New)

Associated revisions

Revision 6197aaed (diff)
Added by Artem Zhmurov 8 months ago

Split and move the checkDeviceBuffer(...) function from PME

Resolving a TODO.

Also fixed the formatting in neighboring comment.

Refs. #3311.

Change-Id: I1687981cc80e2388714cbbb3113f37e34582e31c

Revision ca9c6942 (diff)
Added by Artem Zhmurov 8 months ago

Make OpenCL DeviceVendor into enum class and move to GPU traits

The OpenCL device context requires the vendor information when it is
constructed. To prepare for an opaque DeviceContext, the vendor
enum was moved into the OpenCL traits.

Refs. #3311, needed for #3315.

Change-Id: Iec22ff17543b6a99407048de6e0cd82bb7218fb0
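
For illustration, a scoped vendor enum of the kind described in this revision might look like the sketch below. The enumerator names, the vendor strings, and the `vendorFromString` helper are assumptions made for this example; they are not taken from the actual patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical scoped enum: the enumerator names are illustrative only.
// Being an enum class, it does not convert implicitly to int.
enum class DeviceVendor : int
{
    Unknown = 0,
    Nvidia  = 1,
    Amd     = 2,
    Intel   = 3,
    Count   = 4
};

// Hypothetical helper that maps a vendor string, such as one reported by
// a GPU runtime, to the enum using simple substring matching.
inline DeviceVendor vendorFromString(const std::string& vendorName)
{
    if (vendorName.find("NVIDIA") != std::string::npos)
    {
        return DeviceVendor::Nvidia;
    }
    if (vendorName.find("AMD") != std::string::npos
        || vendorName.find("Advanced Micro Devices") != std::string::npos)
    {
        return DeviceVendor::Amd;
    }
    if (vendorName.find("Intel") != std::string::npos)
    {
        return DeviceVendor::Intel;
    }
    return DeviceVendor::Unknown;
}
```

Keeping such an enum in a traits header lets the context constructor depend on it without pulling in module-specific code.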

Revision e742ad10 (diff)
Added by Artem Zhmurov 8 months ago

Move DeviceInfo into GPU traits

The DeviceInfo is needed upon construction of DeviceContext. To
prepare for opaque DeviceContext type, it is moved to GPU traits
and renamed according to the common naming scheme.

Refs. #3311, needed for #3315.

Change-Id: I2a9f1d932f142d645df75901521a734d208de509

Revision 84e5a0e6 (diff)
Added by Artem Zhmurov 7 months ago

Use init(..) function to build DeviceContext

This patch unifies the logic of OpenCL context creation in PME and
NBNXM by using the same init(..) function for the DeviceContext
object.

Also, the DeviceInfo is now de-referenced directly after the check
on the pointer validity and passed along as a const reference, which
improves the clarity of the code.

Refs. #3315, #3311.

Change-Id: I5ba0f530918f3340fa1a5ad3e8d60fe4e0967dab
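
The pattern described in this revision (validate the pointer once, then pass a const reference to a shared init function) can be sketched as follows. This is a host-only mock: `DeviceInformation`, its fields, and `setupContext` are hypothetical stand-ins, and the real init(..) would call into the OpenCL API instead of storing an integer.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the device description type.
struct DeviceInformation
{
    int         id;
    std::string vendor;
};

// Mock context: the real one would wrap a native context handle.
class DeviceContext
{
public:
    // Single initialization path shared by all consumers.
    void init(const DeviceInformation& deviceInfo)
    {
        deviceId_    = deviceInfo.id;
        initialized_ = true;
    }
    bool isInitialized() const { return initialized_; }
    int  deviceId() const { return deviceId_; }

private:
    int  deviceId_    = -1;
    bool initialized_ = false;
};

// Hypothetical caller: the pointer is checked exactly once, then
// de-referenced and passed on as a const reference.
inline void setupContext(const DeviceInformation* deviceInfoPtr, DeviceContext* context)
{
    if (deviceInfoPtr == nullptr)
    {
        throw std::invalid_argument("Device information is required to create a context");
    }
    const DeviceInformation& deviceInfo = *deviceInfoPtr;
    context->init(deviceInfo);
}
```

Code downstream of the check never sees a nullable pointer, which is the clarity gain the commit message refers to.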

Revision 6975fbfd (diff)
Added by Artem Zhmurov 7 months ago

Take over management of OpenCL context from PME and NBNXM

This patch set creates the DeviceContext in the runner and passes it to the
consumers (PME and NBNXM). This removes unnecessary duplication of management
code and makes the device buffers in the two modules compatible.

Fixes #2522
Fixes #3315
Refs #3311

Change-Id: I10358cfaced5b5c7dbdddf95679c9a9703f3a2c0
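
A minimal host-only sketch of the ownership model in this revision: one context is created by the caller (the runner) and handed to both consumers by const reference, so buffers from either module refer to the same context. The class names `PmeGpu` and `NbnxmGpu`, the `DeviceBuffer` struct, and `areCompatible` are hypothetical and exist only for this example.

```cpp
#include <cassert>

// Mock context; non-copyable so that there is exactly one owner.
class DeviceContext
{
public:
    DeviceContext()                     = default;
    DeviceContext(const DeviceContext&) = delete;
    DeviceContext& operator=(const DeviceContext&) = delete;
};

// Hypothetical buffer that remembers the context it was allocated in.
struct DeviceBuffer
{
    const DeviceContext* context;
};

// Hypothetical consumers: they receive the context, they do not create it.
class PmeGpu
{
public:
    explicit PmeGpu(const DeviceContext& context) : context_(context) {}
    DeviceBuffer allocate() const { return DeviceBuffer{ &context_ }; }

private:
    const DeviceContext& context_;
};

class NbnxmGpu
{
public:
    explicit NbnxmGpu(const DeviceContext& context) : context_(context) {}
    DeviceBuffer allocate() const { return DeviceBuffer{ &context_ }; }

private:
    const DeviceContext& context_;
};

// Buffers can be shared between modules only if they share a context.
inline bool areCompatible(const DeviceBuffer& a, const DeviceBuffer& b)
{
    return a.context == b.context;
}
```

With each module managing its own context, the equivalent of `areCompatible` would return false for a PME buffer and an NBNXM buffer.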

Revision 8eadec22 (diff)
Added by Artem Zhmurov 7 months ago

Make DeviceStream into a class

Refs #3314
Refs #3311

Change-Id: Ic270864f0e82af63f91a91c9951bf678795680fa

Revision e3d904d0 (diff)
Added by Artem Zhmurov 7 months ago

Use DeviceStream init(...) function to create streams

Change the stream creation procedures from direct calls to the CUDA
and OpenCL APIs to the pre-defined init(...) method of the
DeviceStream class.

Refs #3314
Refs #3311

Change-Id: I96a0ca41f251b9925ef9bed77c4f355939b65c6d

Revision c26d93da (diff)
Added by Artem Zhmurov 7 months ago

Small fixes to the DeviceStream

1. Fix compiler warning on having const modifier for bool in function declaration.
2. Fix comments.
3. Introduce isValid(...) method and use it inside the class.

Refs #3314
Refs #3311

Change-Id: I482aee831461f6b170c5fbf90f3f3e978282d226
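
The DeviceStream revisions above (a class, an init(...) method, and an isValid(...) check) can be pictured with the host-only mock below. The integer handle stands in for a native stream object, and the `DeviceStreamPriority` enum, the member names, and the `usesTiming` getter are assumptions for this sketch. It also shows the const fix from item 1: the declaration takes a plain `bool`, while the definition is still free to mark its own parameter copy `const`.

```cpp
#include <cassert>

// Hypothetical priority setting for the sketch.
enum class DeviceStreamPriority
{
    Normal,
    High
};

class DeviceStream
{
public:
    DeviceStream() = default;

    // Declaration: no top-level const on by-value parameters. It has no
    // effect on callers, and some compilers warn about it.
    void init(DeviceStreamPriority priority, bool useTiming);

    // A stream is valid once init(...) has assigned a handle.
    bool isValid() const { return handle_ >= 0; }

    // The class uses isValid() internally before touching the handle.
    void synchronize() const { assert(isValid()); }

    bool usesTiming() const { return useTiming_; }

private:
    int                  handle_    = -1; // stand-in for a native stream handle
    DeviceStreamPriority priority_  = DeviceStreamPriority::Normal;
    bool                 useTiming_ = false;
};

// Definition: const here only constrains the function body.
inline void DeviceStream::init(const DeviceStreamPriority priority, const bool useTiming)
{
    static int nextHandle = 0; // mock replacement for the native create call
    priority_             = priority;
    useTiming_            = useTiming;
    handle_               = nextHandle++;
}
```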

Revision 99f4253d (diff)
Added by Artem Zhmurov 6 months ago

Introduce DeviceStreamManager

Make a separate object that will be handling the creation,
management and destruction of the GPU context and streams.
It is detached from the rest of the code in this patch,
but will be attached in the follow-up.

Refs #3316
Refs #3311

Change-Id: I2c59b930ac266d89fafe9e0172b83f07e9858f0b

Revision db2c0d2e (diff)
Added by Artem Zhmurov 6 months ago

Make use of the DeviceStreamManager

Use the DeviceStreamManager throughout the code. The manager is
owned by the runner and created when a GPU is active. The consumers
get the context and the streams they need from it.

TODOs:
1. Make builders and move the selection on whether the stream should
be created there. The builders should take the manager and pass
the context and the stream to the consumer. Builders should have
the option to create a stream.
2. Makefile in ewald tests uses old infrastructure. Also, the device
context management should be lifted from there and utilized in
all the tests that can run on GPU hardware.

Refs #3316
Refs #3311

Change-Id: I0d08adbe1dee19c1890e55f0e0cf79cea97d39bd
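
As a rough sketch of the manager described in the two revisions above: a single object owns the context and the streams, creates only the streams that are needed, and hands out const references to consumers. This is a host-only mock; the stream type names, the constructor flags, and the `streamIsValid` method are assumptions for the example and do not reproduce the actual interface.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical set of stream roles.
enum class DeviceStreamType : int
{
    NonBondedLocal = 0,
    Pme,
    UpdateAndConstraints,
    Count
};

class DeviceContext
{
};

// Mock stream that records the context it was created in.
class DeviceStream
{
public:
    explicit DeviceStream(const DeviceContext& context) : context_(&context) {}
    const DeviceContext& context() const { return *context_; }

private:
    const DeviceContext* context_;
};

class DeviceStreamManager
{
public:
    // The flags decide which optional streams are created.
    DeviceStreamManager(bool usePme, bool useUpdate)
    {
        create(DeviceStreamType::NonBondedLocal);
        if (usePme)
        {
            create(DeviceStreamType::Pme);
        }
        if (useUpdate)
        {
            create(DeviceStreamType::UpdateAndConstraints);
        }
    }
    // Streams point at the owned context, so copying is disabled.
    DeviceStreamManager(const DeviceStreamManager&) = delete;
    DeviceStreamManager& operator=(const DeviceStreamManager&) = delete;

    const DeviceContext& context() const { return context_; }

    bool streamIsValid(DeviceStreamType type) const
    {
        return streams_[index(type)] != nullptr;
    }

    const DeviceStream& stream(DeviceStreamType type) const
    {
        assert(streamIsValid(type));
        return *streams_[index(type)];
    }

private:
    static std::size_t index(DeviceStreamType type)
    {
        return static_cast<std::size_t>(type);
    }
    void create(DeviceStreamType type)
    {
        streams_[index(type)].reset(new DeviceStream(context_));
    }

    // Declared before the streams so it is constructed first and destroyed last.
    DeviceContext context_;
    std::array<std::unique_ptr<DeviceStream>, static_cast<std::size_t>(DeviceStreamType::Count)> streams_;
};
```

A consumer that needs a PME stream would ask `manager.stream(DeviceStreamType::Pme)` after checking `streamIsValid`, instead of creating a stream itself.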

Revision 1ced5fb7 (diff)
Added by Artem Zhmurov 6 months ago

Unify CUDA and OpenCL lookup-table creation

In the CUDA code, textures are used for the lookup tables,
whereas in OpenCL they are created as read-only
buffers. This commit hides these differences behind a
unified wrapper.

Refs #3318
Refs #3311

Change-Id: I003e0c982c2452a2753e331b46fc59f0b7e1b711
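
The idea of a unified wrapper can be sketched on the host as below: callers use one function, and the backend decides whether the table is additionally bound to a texture. Everything here is a mock under stated assumptions: the `Backend` enum, the `LookupTable` struct, and `initLookupTable` are hypothetical names, and a `std::vector` stands in for device memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical backend selector; real code would select at build time.
enum class Backend
{
    Cuda,
    OpenCL
};

// Mock lookup table: the vector stands in for a device allocation.
template<typename T>
struct LookupTable
{
    std::vector<T> data;
    bool           readOnly       = false;
    bool           boundToTexture = false;
};

// Unified entry point: the caller does not need to know whether the
// backend uses a texture (CUDA-like path) or a plain read-only buffer
// (OpenCL-like path).
template<Backend backend, typename T>
LookupTable<T> initLookupTable(const T* hostData, std::size_t numValues)
{
    LookupTable<T> table;
    table.data.assign(hostData, hostData + numValues); // mock host-to-device copy
    table.readOnly       = true;
    table.boundToTexture = (backend == Backend::Cuda);
    return table;
}
```

Both paths return the same data through the same call, so the calling code needs no backend-specific branches.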

Revision 5f8899ba (diff)
Added by Artem Zhmurov 5 months ago

Unify CUDA and OpenCL lookup-table creation

In the CUDA code, textures are used for the lookup tables,
whereas in OpenCL they are created as read-only
buffers. This commit hides these differences behind a
unified wrapper.

Refs #3318
Refs #3311

Change-Id: I003e0c982c2452a2753e331b46fc59f0b7e1b711

Revision c048437f (diff)
Added by Artem Zhmurov 5 months ago

Unify CUDA and OpenCL lookup-table creation

In the CUDA code, textures are used for the lookup tables,
whereas in OpenCL they are created as read-only
buffers. This commit hides these differences behind a
unified wrapper.

Refs #3318
Refs #3311

Change-Id: I003e0c982c2452a2753e331b46fc59f0b7e1b711

Revision 986b2bb1 (diff)
Added by Artem Zhmurov 5 months ago

Unify CUDA and OpenCL lookup-table creation

In the CUDA code, textures are used for the lookup tables,
whereas in OpenCL they are created as read-only
buffers. This commit hides these differences behind a
unified wrapper.

Refs #3318
Refs #3311

Change-Id: I003e0c982c2452a2753e331b46fc59f0b7e1b711

Revision d3ce8501 (diff)
Added by Artem Zhmurov 5 months ago

Unify CUDA and OpenCL lookup-table creation

In the CUDA code, textures are used for the lookup tables,
whereas in OpenCL they are created as read-only
buffers. This commit hides these differences behind a
unified wrapper.

Refs #3318
Refs #3311

Change-Id: I003e0c982c2452a2753e331b46fc59f0b7e1b711

History

#1 Updated by Artem Zhmurov 8 months ago

  • Tracker changed from Task to Feature

#2 Updated by Artem Zhmurov 8 months ago

  • Target version set to 2021-refactoring

#3 Updated by Szilárd Páll 8 months ago

  • Related to Task #2936: introduce check that CPU-GPU transfers/assignments are made between compatible types added
