Task #2453

Updated by Aleksei Iupinov over 2 years ago

In porting PME from CUDA to OpenCL, I'm starting with dirty code with lots of duplication, to see how to strike a balance between neatness and extensibility. Most of the host-side logic is quite easy to wrap so that it looks the same in CUDA and OpenCL, since there are no C++ limitations there.

Functionality already achieved with dirty, duplicated code:
- Spline/spread OpenCL kernels passing unit tests on NVIDIA GPU.
TODO for correctness of the development branch:
- implement gather and solve kernels as well, using unit tests;
- test on AMD/Intel, remove warpsize==32 assumptions;
- try importing and using clFFT, verifying correctness of the full PME OpenCL with PmeTest/regression tests;
- take a glance at performance.
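
For the warpsize==32 item above, a minimal sketch of what removing the assumption could look like on the host side: the launch geometry is computed from an execution width passed in at runtime instead of a hard-coded 32. The helper names (atomsPerExecutionUnit, blocksNeeded) and the figure of 16 threads per atom are assumptions made for illustration, not existing GROMACS code. In OpenCL the width can be queried per kernel with clGetKernelWorkGroupInfo and CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE; it is typically 32 on NVIDIA and 64 on AMD GCN hardware.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers (not GROMACS API): launch geometry derived from a
// runtime execution width rather than a hard-coded warp size of 32.

// Number of atoms one warp/wavefront can process when each atom is
// handled by threadsPerAtom cooperating threads.
inline std::size_t atomsPerExecutionUnit(std::size_t executionWidth, std::size_t threadsPerAtom)
{
    assert(threadsPerAtom > 0);
    // The width must be a whole multiple of the per-atom thread count,
    // otherwise some lanes would straddle two atoms.
    assert(executionWidth % threadsPerAtom == 0);
    return executionWidth / threadsPerAtom;
}

// Number of blocks (work-groups) needed to cover atomCount atoms,
// rounding up so that a partially filled last block is still launched.
inline std::size_t blocksNeeded(std::size_t atomCount, std::size_t atomsPerBlock)
{
    assert(atomsPerBlock > 0);
    return (atomCount + atomsPerBlock - 1) / atomsPerBlock;
}
```

With 16 threads per atom (an assumed value), a width of 32 gives 2 atoms per execution unit and a width of 64 gives 4, so the same host code yields a different block count per vendor without any kernel-side constant being edited by hand.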

TODO for clean submission into master branch: checklist
(which should eventually have gerrit links for everything)
(There is also probably much more stuff that I've forgotten)