## Task #2537

Feature #2054: PME on GPU

Task #2453: PME OpenCL porting effort

### Simplify PME solve reduction

**Description**

The solve reduction (7 energy/virial components) in the PME CUDA/OpenCL kernels is written in a rather contrived way and could be rewritten more simply, e.g.:

    for (each of the 7 components)
    {
        put the component into shared memory;
        reduce shared memory with a binary tree reduction
            (with a barrier as long as the reduction stride is larger than the execution width);
        add the component to global memory atomically on thread 0;
    }

It might be a bit slower than the current approach, but it would simplify the code.
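For discussion, the proposed loop could look roughly like the CUDA sketch below. It is not the existing GROMACS kernel. Every name in it (`solveReduceSketch`, `reduceOnGpu`, `sm_partial`, the `c_` constants) is hypothetical, as are the block size of 128, the warp width of 32 and the component-major input layout. It uses a block-wide barrier while the stride exceeds the warp width and `__syncwarp()` afterwards, plus one extra barrier per component so shared memory is not overwritten too early. Error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// All names and sizes here are assumptions for illustration only.
constexpr int c_numComponents = 7;   // energy/virial components, per the description
constexpr int c_blockSize     = 128; // assumed power-of-two block size
constexpr int c_warpSize      = 32;  // assumed execution width (CUDA warp)

// contributions: component-major array [c_numComponents][n].
// output: c_numComponents sums, zeroed before the launch.
__global__ void solveReduceSketch(const float* contributions, int n, float* output)
{
    __shared__ float sm_partial[c_blockSize];
    const int tid = threadIdx.x;
    const int gid = blockIdx.x * c_blockSize + tid;

    for (int c = 0; c < c_numComponents; c++)
    {
        // Put this thread's value of the component into shared memory.
        sm_partial[tid] = (gid < n) ? contributions[c * n + gid] : 0.0f;
        __syncthreads();

        // Binary tree reduction over the block.
        for (int stride = c_blockSize / 2; stride > 0; stride >>= 1)
        {
            if (tid < stride)
            {
                sm_partial[tid] += sm_partial[tid + stride];
            }
            if (stride > c_warpSize)
            {
                __syncthreads(); // the next step reads values written by other warps
            }
            else
            {
                __syncwarp(); // the next step only reads values written by warp 0
            }
        }

        // Thread 0 adds the block's partial sum to global memory atomically.
        if (tid == 0)
        {
            atomicAdd(&output[c], sm_partial[0]);
        }
        __syncthreads(); // keep other warps from overwriting sm_partial too early
    }
}

// Host-side helper (hypothetical): copies the input, launches the kernel, returns the 7 sums.
std::vector<float> reduceOnGpu(const std::vector<float>& h_contributions, int n)
{
    float* d_in  = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_in, h_contributions.size() * sizeof(float));
    cudaMalloc(&d_out, c_numComponents * sizeof(float));
    cudaMemcpy(d_in, h_contributions.data(), h_contributions.size() * sizeof(float),
               cudaMemcpyHostToDevice);
    cudaMemset(d_out, 0, c_numComponents * sizeof(float));

    const int numBlocks = (n + c_blockSize - 1) / c_blockSize;
    solveReduceSketch<<<numBlocks, c_blockSize>>>(d_in, n, d_out);

    std::vector<float> h_out(c_numComponents);
    cudaMemcpy(h_out.data(), d_out, c_numComponents * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}
```

Calling `reduceOnGpu(values, n)` with `values` holding `7 * n` floats returns the seven per-component sums. The kernel must be launched with exactly `c_blockSize` threads per block, since the shared array and the starting stride both depend on it. The seven sequential passes and the per-component barriers are where the possible slowdown mentioned above would come from.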