Task #2535

Feature #2054: PME on GPU

Task #2453: PME OpenCL porting effort

Consider compiling OpenCL FFT kernels once

Added by Mark Abraham almost 3 years ago. Updated almost 3 years ago.

core library
Target version:


In, TODOs were noted that we should consider

  • lazy pre-compilation of FFT kernels for PME running on OpenCL
  • thread-safe RAII-style management of (at least) the underlying clFFT library setup and teardown
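
As a rough illustration of the second item, here is a minimal sketch of a reference-counted RAII guard. It is not the class used in GROMACS; `FftLibraryInitializer`, `librarySetup()`, `libraryTeardown()` and the two counters are hypothetical names invented for this example. With clFFT, the stand-in functions would presumably wrap `clfftSetup()` and `clfftTeardown()`. The sketch also assumes that setting the library up again after a teardown is acceptable, which may not hold for a real library.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-ins for the real library calls, instrumented with
// counters so the behaviour of the guard can be observed.
static int g_setupCalls    = 0;
static int g_teardownCalls = 0;
inline void librarySetup() { ++g_setupCalls; }
inline void libraryTeardown() { ++g_teardownCalls; }

// RAII guard: the first live instance sets the library up, the last one
// to be destroyed tears it down. A mutex serializes both operations, so
// instances may be created and destroyed from different threads.
class FftLibraryInitializer
{
public:
    FftLibraryInitializer()
    {
        std::lock_guard<std::mutex> guard(mutex_);
        if (liveCount_++ == 0)
        {
            librarySetup();
        }
    }
    ~FftLibraryInitializer()
    {
        std::lock_guard<std::mutex> guard(mutex_);
        assert(liveCount_ > 0);
        if (--liveCount_ == 0)
        {
            libraryTeardown();
        }
    }
    // Copying would unbalance the count, so it is disabled.
    FftLibraryInitializer(const FftLibraryInitializer&) = delete;
    FftLibraryInitializer& operator=(const FftLibraryInitializer&) = delete;

private:
    static std::mutex mutex_;
    static int        liveCount_;
};

std::mutex FftLibraryInitializer::mutex_;
int        FftLibraryInitializer::liveCount_ = 0;
```

Any code path that needs the library would hold one `FftLibraryInitializer` for as long as it uses FFT resources; nested or concurrent holders then share a single setup/teardown pair.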

Associated revisions

Revision a41344a0 (diff)
Added by Aleksei Iupinov almost 3 years ago

Added the bundled clFFT into OpenCL builds

Used an object library, since we have no need of a real library, to
have or to install, whether shared or static. Checked for the
availability of dynamic loading, and made it available portably to

Clfft initialization class is added and used in mdrunner to
initialize/tear down clFFT library resources in a thread-safe
manner, and only on ranks that require such setup. Noted TODOs
for future work.

Noted a useful style for explicit listing of source files.

Refs #2500
Refs #2515
Refs #2535

Change-Id: I62d7d66f65e147bde17929ccc30abad36e2373c6

Revision f8443e2b (diff)
Added by Mark Abraham over 1 year ago

Move initialization of clFFT

Gave ClfftInitializer the responsibility for mutual exclusion, which
means the initialization is now convenient to do alongside other
PME-on-GPU initialization tasks. This simplifies the code.

Removed mention of lazy initialization, which was not implemented at

Refs #2535

Change-Id: I429767b059ddc3b4f16f2f4a8830b54ed1f49fab


#1 Updated by Mark Abraham almost 3 years ago

The compilation of such kernels does depend on the FFT grid size, which may not be known until after the PME module is set up (because a user may have used fourierspacing), and which might also be subject to auto-tuning. Not needing to re-compile if auto-tuning returns to a previous grid size would be an advantage, but that's a different consideration from pre-compilation.

#2 Updated by Aleksei Iupinov almost 3 years ago

clFFT clearly has some mentions of caching in its code. I would think that properly storing all grid-related data of PME (such as FFT plan instances) in a map, using grid dimensions as a key, instead of deleting all the old stuff on each reinit, would already achieve this. One would still have to watch out for resource exhaustion, of course, and try to delete old stuff if really needed.
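
A minimal sketch of that idea, assuming C++14 and using only hypothetical names (`GridSize`, `GridResources`, `GridResourceCache`): per-grid data lives in a map keyed by the grid dimensions, and a small least-recently-used list bounds how many entries are kept. `GridResources` is just a placeholder that counts how often it gets built; in real code it would own the FFT plan and whatever else is tied to the grid.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <list>
#include <map>
#include <memory>
#include <utility>

using GridSize = std::array<int, 3>;

// Hypothetical placeholder for everything tied to one grid size
// (FFT plans, compiled kernels, buffers). It only counts constructions.
struct GridResources
{
    explicit GridResources(const GridSize& gridSize) : size(gridSize) { ++buildCount; }
    GridSize   size;
    static int buildCount;
};
int GridResources::buildCount = 0;

// Cache keyed by grid dimensions with least-recently-used eviction.
class GridResourceCache
{
public:
    explicit GridResourceCache(std::size_t maxEntries) : maxEntries_(maxEntries) {}

    // Returns the resources for gridSize, building them only on a miss.
    // The reference stays valid until that entry is evicted.
    GridResources& get(const GridSize& gridSize)
    {
        auto it = entries_.find(gridSize);
        if (it != entries_.end())
        {
            // Hit: mark as most recently used.
            lru_.splice(lru_.begin(), lru_, it->second.second);
            return *it->second.first;
        }
        if (entries_.size() >= maxEntries_ && !entries_.empty())
        {
            // Full: drop the least recently used entry to free resources.
            assert(!lru_.empty());
            entries_.erase(lru_.back());
            lru_.pop_back();
        }
        lru_.push_front(gridSize);
        auto inserted = entries_.emplace(
                gridSize, std::make_pair(std::make_unique<GridResources>(gridSize), lru_.begin()));
        return *inserted.first->second.first;
    }

    std::size_t size() const { return entries_.size(); }

private:
    using LruList = std::list<GridSize>;
    std::size_t maxEntries_;
    LruList     lru_; // front = most recently used
    std::map<GridSize, std::pair<std::unique_ptr<GridResources>, LruList::iterator>> entries_;
};
```

With such a cache, a reinit that returns to a previously seen grid size (for example during auto-tuning) would reuse the stored entry rather than rebuild it, and the `maxEntries` bound is one simple way to guard against resource exhaustion.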
