Bug #1049

pick_nbnxn_kernel needs reworking

Added by Mark Abraham over 6 years ago. Updated over 6 years ago.

Status: Closed
Priority: High
Assignee:
Category: mdrun
Target version:
Affected version - extra info:
Affected version:
Difficulty: uncategorized

Description

In https://gerrit.gromacs.org/#/c/1695/4/src/mdlib/forcerec.c I mention how I think the initialization of Verlet GPU settings can be made much clearer. We need this because of the complex interaction with GPU settings, emulation, hybrid, etc.

Berk can you comment please on the reorganization I talk about there and continue here?

Basically, I think we should use a different way of constructing bUseGPU, which will lead to simpler functionality for pick_nbnxn_kernel(), which should lead to clearer variable naming. I am not sure that "bUseGPU" actually reflects its function (does it mean run on a GPU, or just use 8x8x8 lists?), but you are the judge of that.

There are six(?) things the kernel picking code wants to be able to achieve. The necessary conditions are in brackets afterwards.
1) set up 8x8x8 lists and CUDA for GPU and run non-bonded on a GPU (if we have a GPU and nothing's set, use it)
2) set up 8x8x8 lists and run non-bonded on the CPU (GMX_EMULATE_GPU and not GMX_NO_NONBONDED)
3) set up 8x8x8 lists and do not run non-bonded anywhere (GMX_EMULATE_GPU and GMX_NO_NONBONDED, which is a combination useful to document for users to assess whether GPUs might be worthwhile)
4) set up CPU-suitable lists and run non-bonded on the CPU (no GPU found, not GMX_EMULATE_GPU and not GMX_NO_NONBONDED)
5) set up CPU-suitable lists and do not run non-bonded anywhere (no GPU found, not GMX_EMULATE_GPU and GMX_NO_NONBONDED)
6) manage hybrid GPU mode somehow? (NFI)

So we need booleans for
bCanUseGPU = hwinfo->bCanUseGPU
bEmulateGPU = getenv("GMX_EMULATE_GPU") != NULL
bNoNonbonded = getenv("GMX_NO_NONBONDED") != NULL
Those are easily defined above. Now we can use those in simple combinations to detect 1-5 above, call the right routines and set the value of bUseGPU appropriately. With good planning, we can avoid the complex OR-fest and composite booleans that currently exist. For example, 1) and 4) should be the fall-back paths once we've checked for all the weird things we might have been asked to do. 6) seems to be detected by the calling routine?
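
A minimal sketch of that ordering, assuming the three booleans above are the only inputs. The enum, its values and both function names are hypothetical and do not exist in GROMACS. The list above does not cover a usable GPU combined with GMX_NO_NONBONDED; the sketch maps that combination to 3), following the behavior described in comment #9, which is also an assumption.

```c
#include <stdlib.h>

/* Hypothetical names for cases 1)-5); not part of GROMACS. */
typedef enum {
    NB_GPU_LISTS_ON_GPU,       /* 1) 8x8x8 lists, non-bonded on the GPU  */
    NB_GPU_LISTS_ON_CPU,       /* 2) 8x8x8 lists, non-bonded on the CPU  */
    NB_GPU_LISTS_NO_NONBONDED, /* 3) 8x8x8 lists, non-bonded skipped     */
    NB_CPU_LISTS_ON_CPU,       /* 4) CPU lists, non-bonded on the CPU    */
    NB_CPU_LISTS_NO_NONBONDED  /* 5) CPU lists, non-bonded skipped       */
} nb_setup_t;

/* Check the unusual requests first; 1) and 4) are the fall-back paths. */
nb_setup_t pick_nb_setup(int bCanUseGPU, int bEmulateGPU, int bNoNonbonded)
{
    if (bEmulateGPU)
    {
        return bNoNonbonded ? NB_GPU_LISTS_NO_NONBONDED : NB_GPU_LISTS_ON_CPU;
    }
    if (bNoNonbonded)
    {
        /* Assumption: with a usable GPU we keep the 8x8x8 lists but skip
         * the kernels; without one we stay on CPU-suitable lists, so a
         * CPU-only run never sees the emulation warning. */
        return bCanUseGPU ? NB_GPU_LISTS_NO_NONBONDED : NB_CPU_LISTS_NO_NONBONDED;
    }
    return bCanUseGPU ? NB_GPU_LISTS_ON_GPU : NB_CPU_LISTS_ON_CPU;
}

/* Convenience wrapper reading the two environment variables named above. */
nb_setup_t pick_nb_setup_from_env(int bCanUseGPU)
{
    int bEmulateGPU  = (getenv("GMX_EMULATE_GPU") != NULL);
    int bNoNonbonded = (getenv("GMX_NO_NONBONDED") != NULL);

    return pick_nb_setup(bCanUseGPU, bEmulateGPU, bNoNonbonded);
}
```

The caller would then switch on the returned value to call the right setup routines and set bUseGPU, rather than testing composite booleans in several places.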

There's currently a bug (both before and after Szilard's patch) where GMX_NO_NONBONDED in CPU-only mdrun sets bEmulateGPU, which triggers the 8x8x8 lists. That's benign but wrong, and the warning "Emulating a GPU run on the CPU (slow)" confuses a user who has never even heard of GPUs and just set GMX_NO_NONBONDED in a rerun with a Verlet .mdp for the usual purpose of playing with their bonded interactions or something.

I think that it makes the most sense to refactor pick_nbnxn_kernel into two functions. In init_nb_verlet, the "local" case calls both functions in succession, and the "non-local" hybrid case uses only the function containing the second half of the logic currently in pick_nbnxn_kernel.
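
The split could look roughly like the sketch below. Every type, enum value and function name here is made up for illustration and none of the signatures are the real GROMACS ones. It also assumes that in hybrid mode the non-local interactions run on the CPU, so the non-local case can skip the resource decision entirely.

```c
/* Illustrative types only; not the GROMACS definitions. */
typedef enum {
    KERNEL_CPU_LISTS,      /* CPU-suitable lists and kernel         */
    KERNEL_GPU_8X8X8,      /* 8x8x8 lists, kernel runs on the GPU   */
    KERNEL_EMULATED_8X8X8  /* 8x8x8 lists, kernel runs on the CPU   */
} kernel_type_t;

typedef struct {
    int bUseGPU;
    int bEmulateGPU;
} nb_resource_t;

typedef struct {
    kernel_type_t local;
    kernel_type_t nonlocal;
} nb_verlet_sketch_t;

/* First half: decide which resource the non-bonded work should use. */
nb_resource_t pick_resource_sketch(int bGPUDetected, int bEmulateRequested)
{
    nb_resource_t res;

    res.bEmulateGPU = bEmulateRequested;
    res.bUseGPU     = bGPUDetected && !bEmulateRequested;
    return res;
}

/* Second half: map an already-decided resource to a kernel type. */
kernel_type_t pick_kernel_sketch(const nb_resource_t *res)
{
    if (res->bEmulateGPU)
    {
        return KERNEL_EMULATED_8X8X8;
    }
    return res->bUseGPU ? KERNEL_GPU_8X8X8 : KERNEL_CPU_LISTS;
}

/* Local case: both halves in succession. Non-local hybrid case: only the
 * second half, with the resource fixed to the CPU (an assumption). */
nb_verlet_sketch_t init_nb_verlet_sketch(int bGPUDetected,
                                         int bEmulateRequested,
                                         int bHybridGPURun)
{
    nb_verlet_sketch_t nbv;
    nb_resource_t      res = pick_resource_sketch(bGPUDetected, bEmulateRequested);

    nbv.local = pick_kernel_sketch(&res);
    if (bHybridGPURun)
    {
        nb_resource_t cpu_only = { 0, 0 };

        nbv.nonlocal = pick_kernel_sketch(&cpu_only);
    }
    else
    {
        nbv.nonlocal = nbv.local;
    }
    return nbv;
}
```

Keeping the resource decision in its own function means environment variables and hardware detection are consulted in one place only, and the kernel choice becomes a plain mapping.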

Associated revisions

Revision 438c04eb (diff)
Added by Berk Hess over 6 years ago

split off pick_nbnxn_resource from nbnxn_pick_nbnxn

Fixes #1049

Change-Id: Iadf46f40676551b9b58d0d3da5bdca0f94d4a3fb

History

#1 Updated by Szilárd Páll over 6 years ago

  • Priority changed from Normal to High

Bumped priority just because I think this is important.

#2 Updated by Mark Abraham over 6 years ago

Can this be done in time for 4.6?

#3 Updated by Berk Hess over 6 years ago

This doesn't seem like much work, so yes.

#4 Updated by Szilárd Páll over 6 years ago

There are a few (hopefully) minor things I would like to consider.

After 4.6 I plan to work on some improvements that are minor code-wise but possibly quite important performance-wise, and which could hopefully be considered for inclusion as optimizations during the 4.6.x series:
  • using multiple GPUs from a single process - would help avoid the large DD performance hit with a small number of GPUs => small number of domains/processes;
  • CPU-GPU non-bonded task splitting & load balancing: all the infrastructure is there, as this already works with the hybrid scheme (but that's a static task splitting);
  • multiple processes per GPU in a more flexible manner: M processes using N GPUs, where M % N != 0 (might require more code, which might push it to 5.0).

It would be of great help if the reworked pick_nbnxn_kernel() and related code would consider the above aspects and facilitate adding them later on.

#5 Updated by Erik Lindahl over 6 years ago

Hi Berk,

There are currently 11 redmine issues with you as assignee, so even though it might not be a huge amount of work, this might not be the most urgent thing to fix before 4.6?

#6 Updated by Berk Hess over 6 years ago

I have a fix ready.
But I can't reproduce the issue with GMX_NO_NONBONDED triggering GPU emulation. Looking at the code this is impossible.

#7 Updated by Szilárd Páll over 6 years ago

Berk Hess wrote:

I have a fix ready.
But I can't reproduce the issue with GMX_NO_NONBONDED triggering GPU emulation. Looking at the code this is impossible.

I'm not sure which issue you are referring to, could you elaborate?

#8 Updated by Berk Hess over 6 years ago

Mark mentions it above.

#9 Updated by Szilárd Páll over 6 years ago

I still don't see any issue mentioned by Mark. He describes the use cases that do work with the current code - or at least did work last time I checked/worked with that code.

There is no requirement of GMX_NO_NONBONDED alone always triggering GPU emulation, but rather:
  • with -nb gpu (or any other means through which GPU acceleration is enabled) + GMX_NO_NONBONDED set => switch to GPU emulation to avoid calling the data management functions
  • with mdrun compiled without GPU emulation, setting GMX_EMULATE_GPU should trigger the emulation code-path.
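
The two rules above can be written as a single predicate. This is only a sketch: the function name and its parameters are invented here, and bGPUEnabled stands for "GPU acceleration was enabled by any means", which the real code derives elsewhere.

```c
/* Illustrative helper, not GROMACS code: decide whether to take the
 * GPU-emulation code path from three already-evaluated conditions. */
int should_emulate_gpu(int bGPUEnabled, int bEmulateRequested, int bNoNonbonded)
{
    if (bEmulateRequested)
    {
        /* GMX_EMULATE_GPU always selects the emulation code path. */
        return 1;
    }
    /* GMX_NO_NONBONDED switches to emulation only when a GPU run was
     * requested, so the GPU data management functions are not called. */
    return bGPUEnabled && bNoNonbonded;
}
```

Under these rules a CPU-only run with only GMX_NO_NONBONDED set does not enter emulation, i.e. `should_emulate_gpu(0, 0, 1)` evaluates to 0.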

#10 Updated by Berk Hess over 6 years ago

  • Status changed from New to Feedback wanted

Patch in gerrit.

#11 Updated by Berk Hess over 6 years ago

  • Status changed from Feedback wanted to Closed
