Feature #2934

Updated by Alan Gray 4 months ago

Implement and improve the GPU version of position buffer operations. Gerrit change 9169 implements the functionality, and a follow-up change will make the improvements listed below.

* -Use pinned host vectors for grid and gridset arrays and remove explicit cudahostregister/unregister calls in init fn-
* -Replace allocatedevicebuffer with reallocatedevicebuffer in init fn-
* -Improve variable naming in init and buffer ops fns-
* -Fix issue with position buffer pinning to allow use of gmx api for memcpy-
* Implement sync point between PME and NB streams.
* Improve mechanism for deciding if position buffer needs to be copied to GPU in advance of buffer op
* Move GPU conditional out of OpenMP loop in atomdata_copy_x_to_nbat_x. Precompute maxAtomsInColumn and store it in the grid object, instead of recomputing it every step.
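
The last item could look roughly like the sketch below. It is not the GROMACS code: the Grid struct, updateColumns, copyColumnsOnHost, copyPositions and CopyPath are hypothetical names made up for illustration, and the GPU path is a plain host copy standing in for a kernel launch. It shows only the two ideas in that item: the GPU/CPU decision is taken once, outside the OpenMP loop, and maxAtomsInColumn is computed when the grid is rebuilt rather than on every step.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a grid: atoms are grouped into columns, and
// column c covers atom indices columnStart[c] .. columnStart[c + 1] - 1.
struct Grid
{
    std::vector<int> columnStart;          // size is numColumns + 1
    int              maxAtomsInColumn = 0; // cached by updateColumns()

    // Meant to be called only when the grid is rebuilt, not every step.
    void updateColumns(std::vector<int> newColumnStart)
    {
        columnStart      = std::move(newColumnStart);
        maxAtomsInColumn = 0;
        for (std::size_t c = 0; c + 1 < columnStart.size(); c++)
        {
            maxAtomsInColumn = std::max(maxAtomsInColumn, columnStart[c + 1] - columnStart[c]);
        }
    }
};

enum class CopyPath
{
    Host,
    Device
};

// CPU path: the OpenMP loop body contains no GPU conditional.
static void copyColumnsOnHost(const Grid& grid, const std::vector<float>& x, std::vector<float>* out)
{
    const int numColumns = static_cast<int>(grid.columnStart.size()) - 1;
#pragma omp parallel for schedule(static)
    for (int c = 0; c < numColumns; c++)
    {
        for (int i = grid.columnStart[c]; i < grid.columnStart[c + 1]; i++)
        {
            (*out)[i] = x[i];
        }
    }
}

// The GPU/CPU decision is made once per call, before any threaded loop.
CopyPath copyPositions(const Grid& grid, const std::vector<float>& x, std::vector<float>* out, bool useGpu)
{
    assert(out->size() == x.size());
    if (useGpu)
    {
        // Assumption: a real implementation would launch a single kernel here,
        // sized from the cached grid.maxAtomsInColumn. A host copy stands in for it.
        std::copy(x.begin(), x.end(), out->begin());
        return CopyPath::Device;
    }
    copyColumnsOnHost(grid, x, out);
    return CopyPath::Host;
}
```

With column boundaries {0, 3, 4, 9}, updateColumns caches a maxAtomsInColumn of 5, and later calls to copyPositions read that value instead of rescanning the columns.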