Feature #2888

Feature #2816: Device-side update&constraints, buffer ops and multi-gpu comms

CUDA Update and Constraints module

Added by Artem Zhmurov 6 months ago. Updated 17 days ago.

Status: New
Priority: Normal
Assignee:
Category: -
Target version:
Difficulty: uncategorized

Description

  • LINCS for non-water constraints.
  • SETTLE for water constraints.
  • Leap-Frog integrator.
  • Merge of the three into a single module.
  • Remove the scaffolding from LINCS, SETTLE and Leap-Frog:
    • Coordinates, velocities, forces management.
    • PBC management.
    • Virial reduction.
    • Update tests.
    • Remove Impl.
    • Template computeVirial and updateVelocities
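
The last item above, templating on computeVirial and updateVelocities, could look roughly like the following host-side sketch. All names here (Vec3, applyCorrection, applyCorrectionDispatch) are hypothetical, and the virial accumulation is only a placeholder, not the formula used in the module. The point is that the two flags become compile-time constants, so each of the four instantiations contains only the work it needs.

```cpp
#include <array>
#include <cstddef>
#include <vector>

using Vec3 = std::array<float, 3>;

// Hypothetical stand-in for a constraint-correction pass. The two flags are
// template parameters, so the conditions below are compile-time constants
// and the compiler can drop the branches that are switched off.
template<bool updateVelocities, bool computeVirial>
void applyCorrection(std::vector<Vec3>&       x,
                     std::vector<Vec3>&       v,
                     const std::vector<Vec3>& dx,
                     float                    invdt,
                     float*                   virialTrace)
{
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        for (int d = 0; d < 3; ++d)
        {
            x[i][d] += dx[i][d];
            if (updateVelocities)
            {
                v[i][d] += dx[i][d] * invdt;
            }
            if (computeVirial)
            {
                // Placeholder accumulation, not the real virial expression.
                *virialTrace += dx[i][d] * dx[i][d];
            }
        }
    }
}

// Runtime flags are mapped to one of the four instantiations once,
// outside the inner loop.
void applyCorrectionDispatch(bool                     updateVelocities,
                             bool                     computeVirial,
                             std::vector<Vec3>&       x,
                             std::vector<Vec3>&       v,
                             const std::vector<Vec3>& dx,
                             float                    invdt,
                             float*                   virialTrace)
{
    if (updateVelocities && computeVirial)
    {
        applyCorrection<true, true>(x, v, dx, invdt, virialTrace);
    }
    else if (updateVelocities)
    {
        applyCorrection<true, false>(x, v, dx, invdt, virialTrace);
    }
    else if (computeVirial)
    {
        applyCorrection<false, true>(x, v, dx, invdt, virialTrace);
    }
    else
    {
        applyCorrection<false, false>(x, v, dx, invdt, virialTrace);
    }
}
```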

Related issues

Related to GROMACS - Feature #2885: CUDA version of LINCS (New)
Related to GROMACS - Feature #2886: CUDA version of SETTLE (New)
Related to GROMACS - Feature #2887: CUDA version of Leap Frog algorithm (New)
Related to GROMACS - Task #2936: introduce check that CPU-GPU transfers are made between arrays of compatible types (New)

Associated revisions

Revision 1c8eb7c5 (diff)
Added by Artem Zhmurov 2 months ago

Combine CUDA Leap-Frog, LINCS and SETTLE. I.

This is the first step in combining constraints and integrator
into the "UpdateAndConstraints" module. The initial merge does not
imply any performance optimisation or code clean-up. Hence, this
patch keeps all the temporary infrastructure that was built
around SETTLE, LINCS and Leap-Frog to allow them to function as
separate units. In the following commits, this infrastructure
will be removed and these three implementations will be more closely
integrated. To enable, set the GMX_UPDATE_CONSTRAIN_GPU environment
variable. Note that the environment variables GMX_LINCS_GPU,
GMX_SETTLE_GPU and GMX_INTEGRATE_GPU will no longer work.
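
A minimal sketch of how such an opt-in switch can be read on the host side. The helper names are hypothetical, and treating the variable as enabled whenever it is set (regardless of its value) is an assumption, not something the commit message states.

```cpp
#include <cstdlib>

// Returns true when the named environment variable is set to any value.
// Assumption: presence alone enables the feature; the value is ignored.
bool isEnvironmentVariableSet(const char* name)
{
    return std::getenv(name) != nullptr;
}

// Hypothetical helper: decides whether the combined GPU update and
// constraints path was requested by the user.
bool useGpuUpdateAndConstraints()
{
    return isEnvironmentVariableSet("GMX_UPDATE_CONSTRAIN_GPU");
}
```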

Refs #2816, #2888

Change-Id: I8730aad0ecaa0230686fe89d1157b0da2f01f7bc

Revision fb7a59cd (diff)
Added by Artem Zhmurov about 2 months ago

Combine CUDA Leap-Frog, LINCS and SETTLE. II.

Stand-alone CUDA implementations of Leap-Frog, LINCS
and SETTLE required additional scaffolding for integration
and testing. The most prominent part of this is the
management of coordinates, velocities and forces, which
is removed in this commit. Management of periodic boundary
conditions and virial reduction will be removed in
the following commits.

Refs #2816, #2888

Change-Id: I4c65a6c7088fd8059f4e7fa3cb4637cb2af79ebc

Revision 747c371c (diff)
Added by Artem Zhmurov about 2 months ago

Memory management fixes in CUDA version of LINCS

This fix is to prepare LINCS to run with DD.

1. The masses array size depends on the current number of atoms
rather than on the number of constraints.
2. The size of other arrays should be based on the number of
threads launched on the GPU, which includes padding added to
align coupled constraints with the thread blocks. Also
renamed a variable according to conventions.
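
The two sizing rules above can be illustrated with a small host-side sketch. It assumes, for illustration only, that constraints come in groups of coupled constraints, that a group must not straddle a thread-block boundary, and that every group fits into one block. The names BufferSizes and computeBufferSizes are hypothetical.

```cpp
#include <vector>

struct BufferSizes
{
    int numMasses;  // one entry per atom, independent of constraints
    int numThreads; // one entry per GPU thread, padding included
};

// groupSizes[i] is the number of constraints in the i-th group of coupled
// constraints. If a group does not fit into what is left of the current
// thread block, the rest of that block is padded and the group starts at
// the next block. Assumes each group size is <= threadsPerBlock.
BufferSizes computeBufferSizes(int numAtoms, const std::vector<int>& groupSizes, int threadsPerBlock)
{
    int numThreads = 0;
    for (int size : groupSizes)
    {
        const int usedInBlock = numThreads % threadsPerBlock;
        if (usedInBlock + size > threadsPerBlock)
        {
            numThreads += threadsPerBlock - usedInBlock; // pad to block boundary
        }
        numThreads += size;
    }
    // Round up so that only whole thread blocks are launched.
    const int numBlocks = (numThreads + threadsPerBlock - 1) / threadsPerBlock;
    return { numAtoms, numBlocks * threadsPerBlock };
}
```

With 10 atoms, three groups of 3 constraints and 4 threads per block, this gives 10 masses and 12 threads, although there are only 9 constraints. Sizing per-thread arrays by the number of constraints would therefore under-allocate.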

Refs #2885 and #2888

Change-Id: I20cb53ebc6da6a1ff2ee1e385613b27c4a01d11f

Revision 1b64f6aa (diff)
Added by Artem Zhmurov about 2 months ago

Use reallocateDeviceBuffer(...) in CUDA version of SETTLE

Refs #2886 and #2888

Change-Id: Ia45254a24eda8e6ad151b1f4c6583b1a2c926004

Revision 6385f296 (diff)
Added by Artem Zhmurov about 2 months ago

Remove PImpl scaffolding from CUDA version of LINCS

The CUDA implementation of LINCS was initially introduced as a
stand-alone feature. This required hiding CUDA-specific variables
and subroutines in a private implementation subclass. Since
LINCS is now a part of the Update and Constraints module, this is no
longer required and can be removed.

Refs #2816, #2888

Change-Id: I9698224d4702dfb8d99106999335c62e83a511df

Revision b1150eee (diff)
Added by Artem Zhmurov about 1 month ago

Remove PImpl scaffolding from CUDA version of SETTLE

The GPU version of SETTLE was implemented as a class with a private
implementation so that it could be initialized on
non-CUDA hosts. Now the implementation can be hidden
inside the Update and Constraints PImpl, so the CUDA-specific
types and calls can be exposed in SETTLE and a
private implementation is no longer needed there.
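
The layering described here can be sketched as follows. The class names are hypothetical and plain host members stand in for CUDA-specific types. Only the outer class keeps a private implementation, so the inner component can hold device-specific members directly.

```cpp
#include <memory>

// What would live in the public header: no device-specific types are
// visible, so the header can be included in non-CUDA translation units.
class UpdateConstrainSketch
{
public:
    explicit UpdateConstrainSketch(int numAtoms);
    ~UpdateConstrainSketch();
    int numAtoms() const;

private:
    class Impl;
    std::unique_ptr<Impl> impl_;
};

// What would live in the device-side source file. The inner component is
// a plain class: it no longer needs a PImpl of its own, because it is only
// ever seen from inside UpdateConstrainSketch::Impl.
class SettleSketch
{
public:
    explicit SettleSketch(int numAtoms) : numAtoms_(numAtoms) {}
    int numAtoms() const { return numAtoms_; }

private:
    int numAtoms_; // device buffers and streams would be plain members here
};

class UpdateConstrainSketch::Impl
{
public:
    explicit Impl(int numAtoms) : settle_(numAtoms) {}
    SettleSketch settle_;
};

UpdateConstrainSketch::UpdateConstrainSketch(int numAtoms) : impl_(new Impl(numAtoms)) {}
UpdateConstrainSketch::~UpdateConstrainSketch() = default;
int UpdateConstrainSketch::numAtoms() const
{
    return impl_->settle_.numAtoms();
}
```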

Refs #2816, #2888

Change-Id: I4c78f2629be34b42bb5f4f7d34970c3e41515691

Revision 1bfc9ba5 (diff)
Added by Artem Zhmurov 19 days ago

Remove PImpl scaffolding from CUDA version of Leap-Frog

The private implementation in the CUDA version of Leap-Frog was
used to introduce this integrator as a stand-alone unit.
Now that it is merged with constraints, PImpl is no longer
needed.

Refs #2816, #2888

Change-Id: Iea82abef016b7e15b9be44a0e1b446e12e582d3c

Revision b1be1e72 (diff)
Added by Artem Zhmurov 17 days ago

Refactor Leap-Frog tests and connect them to CPU version

This introduces a test data object and runners to the Leap-Frog
tests, which are now connected to the CPU version of Leap-Frog.
This also makes it possible to include tests based on reference
values, which are needed to make sure that temperature and/or
pressure control works correctly in new implementations.
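
The split into a test data object and runners might look like this sketch. The names (LeapFrogTestData, Runner, integrateOnHost, runSteps) are hypothetical, coordinates are stored as flat arrays of 3 * numAtoms values, and no temperature or pressure coupling is included. A GPU runner would have the same signature and operate on the same data object.

```cpp
#include <cstddef>
#include <vector>

// Everything one test case needs, independent of where it is integrated.
struct LeapFrogTestData
{
    int                numAtoms;
    float              timestep;
    std::vector<float> x;           // positions, 3 * numAtoms
    std::vector<float> v;           // velocities, 3 * numAtoms
    std::vector<float> f;           // forces, 3 * numAtoms
    std::vector<float> inverseMass; // numAtoms
};

// A runner is anything that advances the test data by one step.
using Runner = void (*)(LeapFrogTestData*);

// Host runner: plain leap-frog update, v += f/m * dt, then x += v * dt.
void integrateOnHost(LeapFrogTestData* data)
{
    for (int i = 0; i < data->numAtoms; ++i)
    {
        for (int d = 0; d < 3; ++d)
        {
            const std::size_t k = static_cast<std::size_t>(i) * 3 + d;
            data->v[k] += data->f[k] * data->inverseMass[i] * data->timestep;
            data->x[k] += data->v[k] * data->timestep;
        }
    }
}

// The same test body can be applied to any runner.
void runSteps(Runner runner, LeapFrogTestData* data, int numSteps)
{
    for (int step = 0; step < numSteps; ++step)
    {
        runner(data);
    }
}
```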

Refs. #2816, #2888.

Change-Id: Id2d934c43138889ad178a94126cab4da2895bb5a

Revision d1f2302e (diff)
Added by Artem Zhmurov 16 days ago

Refactoring of the SETTLE tests

The current tests for the CUDA version of SETTLE were a quick
addition to the old tests, directly comparing the GPU
implementation with the original CPU-based implementation.
This commit rearranges the test structure, making it possible
to apply the same set of tests to both implementations. There
are no changes to the tests themselves. Currently, comparison tests
will run twice and will dry-run on CUDA builds without CUDA-capable
devices.

TODO: Add comparison with pre-computed values for coordinates,
velocities and virial. Remove the CPU vs GPU comparison
tests.

Refs #2886, #2888.

Change-Id: Ifcb6af9af6c93787b919b785348f9f4547b6c267

Revision 0cd72f2b (diff)
Added by Artem Zhmurov 16 days ago

Prepare Update and Constraints for Domain Decomposition

The initial GPU-based version of update and constraints was not
designed to run with domain decomposition. This introduces a
couple of fixes to the memory management that should allow the
module to work with DD enabled. The memory buffers are now
re-allocated at the set(...) stage, if needed.
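
The "re-allocate at set(...) if needed" pattern can be sketched with host memory standing in for a device buffer. The class name is hypothetical and this is not the module's actual buffer code. The idea is that with domain decomposition the local atom count changes between calls, so the buffer grows only when the requested size exceeds its current capacity.

```cpp
#include <cstddef>
#include <memory>

class GrowOnlyBufferSketch
{
public:
    // Called whenever the local system changes (e.g. after repartitioning).
    // Returns true if the storage was actually reallocated. Old contents
    // are not preserved, which is fine when set() refills the buffer.
    bool set(std::size_t numValues)
    {
        bool reallocated = false;
        if (numValues > capacity_)
        {
            data_.reset(new float[numValues]);
            capacity_   = numValues;
            reallocated = true;
        }
        size_ = numValues;
        return reallocated;
    }

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }

private:
    std::unique_ptr<float[]> data_;
    std::size_t              size_     = 0;
    std::size_t              capacity_ = 0;
};
```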

Refs. #2816, #2888.

Change-Id: I155884f5797252cf048a6400a2dd7b042d355b7e

Revision 7bd1c817 (diff)
Added by Artem Zhmurov 12 days ago

Make use of reference data in SETTLE tests

As a temporary measure, the CPU and GPU versions of SETTLE
were tested against each other. Making use of the reference
data framework allows testing them against precomputed values.
Now the final positions, velocities and virial are properly
tested in the CPU and, if available, GPU versions.
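
Testing each implementation against precomputed values, rather than against each other, comes down to a tolerance-based comparison like the one sketched below. This is a generic illustration with a hypothetical function name and tolerance scheme; it is not the reference data framework mentioned above.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Returns true when every computed value is within
// absTol + relTol * |reference| of the stored reference value.
bool matchesReference(const std::vector<float>& computed,
                      const std::vector<float>& reference,
                      float                     relTol,
                      float                     absTol)
{
    if (computed.size() != reference.size())
    {
        return false;
    }
    for (std::size_t i = 0; i < computed.size(); ++i)
    {
        const float diff = std::fabs(computed[i] - reference[i]);
        if (diff > absTol + relTol * std::fabs(reference[i]))
        {
            return false;
        }
    }
    return true;
}
```

The same stored values can then be checked against the output of the CPU version and, when a device is available, the GPU version.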

Refs. #2886, #2888.

Change-Id: I8e54e1a741263b8bf9774a21141c527f58130fa9

Revision 1fbaf8ff (diff)
Added by Artem Zhmurov 9 days ago

Remove PImpl scaffolding from CUDA version of SETTLE

The GPU version of SETTLE was implemented as a class with a private
implementation so that it could be initialized on
non-CUDA hosts. Now the implementation can be hidden
inside the Update and Constraints PImpl, so the CUDA-specific
types and calls can be exposed in SETTLE and a
private implementation is no longer needed there.

Refs #2816, #2888

Change-Id: I4c78f2629be34b42bb5f4f7d34970c3e41515691

Revision 3d35e919 (diff)
Added by Artem Zhmurov 9 days ago

Remove PImpl scaffolding from CUDA version of Leap-Frog

The private implementation in the CUDA version of Leap-Frog was
used to introduce this integrator as a stand-alone unit.
Now that it is merged with constraints, PImpl is no longer
needed.

Refs #2816, #2888

Change-Id: Iea82abef016b7e15b9be44a0e1b446e12e582d3c

Revision 039709b7 (diff)
Added by Artem Zhmurov 9 days ago

Prepare Update and Constraints for Domain Decomposition

The initial GPU-based version of update and constraints was not
designed to run with domain decomposition. This introduces a
couple of fixes to the memory management that should allow the
module to work with DD enabled. The memory buffers are now
re-allocated at the set(...) stage, if needed.

Refs. #2816, #2888.

Change-Id: I155884f5797252cf048a6400a2dd7b042d355b7e

History

#1 Updated by Artem Zhmurov 6 months ago

#2 Updated by Artem Zhmurov 6 months ago

#3 Updated by Artem Zhmurov 5 months ago

  • Related to Feature #2887: CUDA version of Leap Frog algorithm added

#4 Updated by Artem Zhmurov 5 months ago

  • Description updated (diff)

I have LINCS, SETTLE and the Leap-Frog integrator, each with some tests. Now I am combining them into one "Update and Constrain" module. Any ideas for a test to verify that the merge was successful?

#5 Updated by Artem Zhmurov 5 months ago

  • Subject changed from CUDA GPU-only loop to CUDA Update and Constraints module
  • Description updated (diff)

#6 Updated by Gerrit Code Review Bot 5 months ago

Gerrit received a related patchset '4' for Issue #2888.
Uploader: Artem Zhmurov
Change-Id: gromacs~master~I8730aad0ecaa0230686fe89d1157b0da2f01f7bc
Gerrit URL: https://gerrit.gromacs.org/9329

#7 Updated by Artem Zhmurov 5 months ago

  • Description updated (diff)

#8 Updated by Artem Zhmurov 5 months ago

  • Description updated (diff)

#9 Updated by Artem Zhmurov 4 months ago

  • Description updated (diff)

#10 Updated by Artem Zhmurov 4 months ago

  • Description updated (diff)

#11 Updated by Szilárd Páll 4 months ago

  • Related to Task #2936: introduce check that CPU-GPU transfers are made between arrays of compatible types added

#12 Updated by Artem Zhmurov 17 days ago

  • Description updated (diff)
