Task #2183
GPU-accessed memory page-locking and page sizes
Description
For fast asynchronous transfers between the host and CUDA GPUs, host memory buffers need to be page-locked, i.e. aligned to the memory page size and exposed to CUDA via the cudaHostRegister() function.
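To make the requirement concrete, here is a minimal host-side sketch of a page-aligned allocation, assuming a POSIX system. The helper names (roundUpToPage, isPageAligned, allocatePageAligned) are hypothetical and are not taken from either Gerrit change. The CUDA registration step is shown only as a comment so that the sketch compiles without the CUDA toolkit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <unistd.h> // sysconf (POSIX)

// Round a byte count up to a whole number of pages.
inline std::size_t roundUpToPage(std::size_t bytes, std::size_t pageSize)
{
    assert(pageSize > 0);
    return ((bytes + pageSize - 1) / pageSize) * pageSize;
}

// True if the pointer sits on a page boundary.
inline bool isPageAligned(const void* ptr, std::size_t pageSize)
{
    return reinterpret_cast<std::uintptr_t>(ptr) % pageSize == 0;
}

// Hypothetical helper: allocate a buffer that starts on a page boundary
// and spans whole pages. Returns nullptr on failure; the padded size is
// written to *paddedBytes. Release the buffer with free().
inline void* allocatePageAligned(std::size_t bytes, std::size_t* paddedBytes)
{
    assert(bytes > 0 && paddedBytes != nullptr);
    const std::size_t pageSize = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    *paddedBytes               = roundUpToPage(bytes, pageSize);
    void* ptr                  = nullptr;
    if (posix_memalign(&ptr, pageSize, *paddedBytes) != 0)
    {
        return nullptr;
    }
    // In a CUDA build the buffer would be pinned at this point, e.g.
    //   cudaHostRegister(ptr, *paddedBytes, cudaHostRegisterDefault);
    // and unpinned with cudaHostUnregister(ptr) before free(ptr).
    return ptr;
}
```

The padded size, not the requested size, is what ends up page-locked, which is why the page size matters for the questions raised in this description.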
There is a crude hack for PME GPU purposes, which aligns the existing coordinate, charge and force buffers: https://gerrit.gromacs.org/#/c/6578
There is also tangential work by Mark: https://gerrit.gromacs.org/#/c/6552
Whichever code we use, the important questions are whether and how we want to use page-aligned memory conditionally (e.g. based on whether PME runs on the CPU or the GPU), and whether we could otherwise run into problems or limits (consider a dozen ranks, each using a dozen page-aligned buffers, with a potentially large page size).
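The scale of that concern can be estimated with simple arithmetic. The sketch below assumes, purely for illustration, that every buffer has the same size and is padded to whole pages; the function names and the 100000-byte buffer size are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Bytes actually reserved when a buffer is padded to whole pages.
constexpr std::size_t paddedSize(std::size_t bytes, std::size_t pageSize)
{
    return ((bytes + pageSize - 1) / pageSize) * pageSize;
}

// Total page-aligned footprint on a node, assuming uniform buffer sizes.
constexpr std::size_t totalFootprint(std::size_t ranks,
                                     std::size_t buffersPerRank,
                                     std::size_t bytesPerBuffer,
                                     std::size_t pageSize)
{
    return ranks * buffersPerRank * paddedSize(bytesPerBuffer, pageSize);
}

// A 100000-byte buffer occupies 25 pages of 4 KiB ...
static_assert(paddedSize(100000, 4096) == 102400, "4 KiB pages");
// ... but a full 2 MiB when padded to a single large page.
static_assert(paddedSize(100000, 2097152) == 2097152, "2 MiB pages");
```

With 12 ranks and 12 such buffers each, totalFootprint gives about 14 MiB for 4 KiB pages and 288 MiB for 2 MiB pages, so the padding is negligible with small pages and dominant with large ones.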
One long-term solution would be a page-locked memory provider object that minimizes paged memory use.
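One possible reading of such a provider, sketched under assumptions: a single page-aligned slab is allocated (and, in a CUDA build, registered) once, and individual buffers are carved out of it, so that page padding is paid once per slab instead of once per buffer. The class name PinnedSlab and its interface are hypothetical; this is not the HostAllocator mentioned later in this issue, and the CUDA calls appear only as comments.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <unistd.h> // sysconf (POSIX)

// Hypothetical provider: one page-aligned slab, sub-allocated with a bump offset.
class PinnedSlab
{
public:
    explicit PinnedSlab(std::size_t capacityBytes)
    {
        const std::size_t pageSize = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
        capacity_ = ((capacityBytes + pageSize - 1) / pageSize) * pageSize;
        void* ptr = nullptr;
        if (posix_memalign(&ptr, pageSize, capacity_) == 0)
        {
            base_ = static_cast<char*>(ptr);
            // A CUDA build would pin the whole slab once here (cudaHostRegister).
        }
        else
        {
            capacity_ = 0;
        }
    }
    ~PinnedSlab()
    {
        // A CUDA build would call cudaHostUnregister(base_) first.
        std::free(base_);
    }
    PinnedSlab(const PinnedSlab&) = delete;
    PinnedSlab& operator=(const PinnedSlab&) = delete;

    // Hands out the next free region, or nullptr when the slab is exhausted.
    // alignment must be a power of two no larger than the page size.
    void* allocate(std::size_t bytes, std::size_t alignment = 64)
    {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        const std::size_t start = (used_ + alignment - 1) & ~(alignment - 1);
        if (base_ == nullptr || start > capacity_ || bytes > capacity_ - start)
        {
            return nullptr;
        }
        used_ = start + bytes;
        return base_ + start;
    }

    std::size_t capacity() const { return capacity_; }
    std::size_t used() const { return used_; }

private:
    char*       base_     = nullptr;
    std::size_t capacity_ = 0;
    std::size_t used_     = 0;
};
```

A bump allocator cannot free individual buffers, which is a deliberate simplification here; a real provider would also need to handle buffers that grow, such as coordinate arrays after repartitioning.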
History
#1 Updated by Aleksei Iupinov over 3 years ago
- Blocks Feature #2054: PME on GPU added
#2 Updated by Aleksei Iupinov over 3 years ago
- Subject changed from GPU memory page-locking and page sizes to GPU-accessed memory page-locking and page sizes
#3 Updated by Mark Abraham about 3 years ago
- Target version set to 2019
I'm sure there's things to improve here moving forward!
#4 Updated by Mark Abraham about 3 years ago
- Blocks deleted (Feature #2054: PME on GPU)
#5 Updated by Szilárd Páll about 3 years ago
- Status changed from New to In Progress
Mark Abraham wrote:
I'm sure there's things to improve here moving forward!
Is there? I think most if not all use-cases are covered by the (still) so-called HostAllocator implementation. If there are remaining cases relevant for this release and its current use-cases, let's keep this issue; otherwise, I suggest focusing on what we want to do next and organizing ideas/requirements in new issues that we can target specifically.
#6 Updated by Szilárd Páll about 3 years ago
Marked it "in progress" for now, but as noted above, unless there are objections, this might be best closed.
#7 Updated by Mark Abraham about 3 years ago
- Status changed from In Progress to Closed
needs to be closed because it's listed as blocking other stuff