Task #3216

improve CPU force reductions

Added by Szilárd Páll about 1 year ago. Updated 12 months ago.

Status: New
Priority: Normal
Category: mdrun
Target version: -
Difficulty: uncategorized

Description

Force reductions across different CPU modules need improvement, especially when considering heterogeneous / accelerator code-paths:
- make it clearer which buffer is the "master" CPU buffer, and what its lifetime and validity are
-- code-path aware force clearing
-- consumers should only get a read-only view
- have a separate reduction module in charge of summing the separate force contributions (e.g. the PME GPU module does the reduction of PME forces into the main buffer)
- make the reduction code aware of when there is an opportunity to store rather than accumulate (e.g. when all forces are offloaded on a rank, the final F reduction can be 30% faster by storing rather than accumulating into the master buffer)
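
To make the points above concrete, here is a minimal sketch of a dedicated reduction module. It is not existing GROMACS code: `ForceReducer`, `addContribution`, `reduce`, `forces` and `Vec3` are hypothetical names invented for this illustration. It assumes that all contributions cover the same atoms, and it shows two ideas from the list: the first contribution is stored into the master buffer (so no separate clearing pass is needed), and consumers only get a read-only view.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 3-vector type for this sketch only.
using Vec3 = std::array<float, 3>;

// Hypothetical reduction module that owns the "master" CPU force buffer.
class ForceReducer
{
public:
    explicit ForceReducer(std::size_t numAtoms) : master_(numAtoms) {}

    // Register one force contribution for the current step.
    // The caller keeps ownership and must keep it alive until reduce().
    void addContribution(const std::vector<Vec3>* f)
    {
        assert(f != nullptr && f->size() == master_.size());
        contributions_.push_back(f);
    }

    // Sum all registered contributions into the master buffer.
    void reduce()
    {
        if (contributions_.empty())
        {
            // Nothing was computed: the only case that needs explicit zeroing.
            master_.assign(master_.size(), Vec3{});
        }
        else
        {
            // Store the first contribution, overwriting stale data.
            master_ = *contributions_[0];
            // Accumulate the remaining contributions.
            for (std::size_t c = 1; c < contributions_.size(); c++)
            {
                const std::vector<Vec3>& f = *contributions_[c];
                for (std::size_t i = 0; i < master_.size(); i++)
                {
                    for (std::size_t d = 0; d < 3; d++)
                    {
                        master_[i][d] += f[i][d];
                    }
                }
            }
        }
        contributions_.clear();
    }

    // Consumers only get a read-only view of the master buffer.
    const std::vector<Vec3>& forces() const { return master_; }

private:
    std::vector<Vec3>                     master_;
    std::vector<const std::vector<Vec3>*> contributions_;
};
```

With a single registered contribution (e.g. when everything else is offloaded), `reduce()` degenerates to a plain copy, which is the store-instead-of-accumulate case described above.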

History

#1 Updated by Szilárd Páll about 1 year ago

  • Description updated (diff)

#2 Updated by Berk Hess 12 months ago

What is code-path aware force clearing?

#3 Updated by Szilárd Páll 12 months ago

Berk Hess wrote:

What is code-path aware force clearing.

Not the best wording, I guess. On the GPU-offload code-path, both with and without buffer ops offloaded, we may not have anything to compute on the CPU, so clearing (as well as accumulating into) the CPU force buffer is a waste.
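
As a rough illustration of what this could mean in code, the sketch below skips the clearing pass when no CPU force work is scheduled for the step. All names here (`CpuStepWork`, `haveCpuForceWork`, `clearCpuForcesIfNeeded`) are hypothetical and chosen for this example; the set of flags is an assumption, not a description of what mdrun actually tracks.

```cpp
#include <array>
#include <vector>

// Hypothetical 3-vector type for this sketch only.
using Vec3 = std::array<float, 3>;

// Hypothetical per-step description of which force work runs on the CPU.
struct CpuStepWork
{
    bool bonded    = false;
    bool nonbonded = false;
    bool pme       = false;
};

// True if any force contribution is computed on the CPU this step.
inline bool haveCpuForceWork(const CpuStepWork& work)
{
    return work.bonded || work.nonbonded || work.pme;
}

// Clears the CPU force buffer only when some CPU task will accumulate into it.
// Returns whether the buffer was cleared, so the caller knows if it is valid.
inline bool clearCpuForcesIfNeeded(const CpuStepWork& work, std::vector<Vec3>& cpuForces)
{
    if (!haveCpuForceWork(work))
    {
        // Everything is offloaded: clearing would be wasted memory traffic.
        return false;
    }
    cpuForces.assign(cpuForces.size(), Vec3{});
    return true;
}
```

The returned flag matters: when clearing is skipped the buffer holds stale data, so later code must not read or accumulate from it on that step.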
