Feature #2054: PME on GPU
ensure PME queue is flushed
At least one clFlush() call is needed to make sure the enqueued work is actually submitted to the GPU and the runtime does not postpone it until the first blocking API call (i.e. the wait) is issued.
Also revise the nbnxn module's use of flush when both PME and nonbondeds run on the GPU.
#1 Updated by Szilárd Páll 10 months ago
- Status changed from New to In Progress
Note: I am not sure whether this is necessary at all to achieve concurrency, but if it is, we might need a clFlush() after each task that we enqueue. Additionally, at the end of the step we currently flush when clearing and only then launch the pruning kernels, so the flush should probably be moved after the latter.
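To make the end-of-step ordering concern concrete, here is a minimal sketch in plain C. It is a toy model of deferred submission, not the OpenCL API and not GROMACS code: ToyQueue and all toy_* and end_of_step_* names are hypothetical, and the assumption that the runtime holds back every enqueued task until a flush is the worst case, not guaranteed behavior. toy_enqueue stands in for a clEnqueue* call and toy_flush for clFlush().

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, no real OpenCL calls are made. */
typedef struct {
    size_t enqueued;   /* tasks handed to the runtime so far        */
    size_t submitted;  /* tasks the runtime has pushed to the device */
} ToyQueue;

static void toy_init(ToyQueue *q)
{
    q->enqueued  = 0;
    q->submitted = 0;
}

/* Stands in for a clEnqueue* call: the task is recorded, but in this
 * worst-case model the runtime holds it back until a flush. */
static void toy_enqueue(ToyQueue *q)
{
    q->enqueued++;
}

/* Stands in for clFlush(): everything enqueued so far is submitted.
 * Like clFlush(), it does not wait for completion. */
static void toy_flush(ToyQueue *q)
{
    q->submitted = q->enqueued;
}

/* Number of tasks the runtime is still holding back. */
static size_t toy_pending(const ToyQueue *q)
{
    assert(q->submitted <= q->enqueued);
    return q->enqueued - q->submitted;
}

/* Order described in the note: flush at clearing, then launch pruning. */
static size_t end_of_step_flush_before_pruning(void)
{
    ToyQueue q;
    toy_init(&q);
    toy_enqueue(&q);  /* clearing task                */
    toy_flush(&q);    /* flush issued at clearing     */
    toy_enqueue(&q);  /* pruning kernel, after flush  */
    return toy_pending(&q);
}

/* Suggested order: launch pruning first, then flush once. */
static size_t end_of_step_flush_after_pruning(void)
{
    ToyQueue q;
    toy_init(&q);
    toy_enqueue(&q);  /* clearing task                */
    toy_enqueue(&q);  /* pruning kernel               */
    toy_flush(&q);    /* single flush covers both     */
    return toy_pending(&q);
}
```

In this model the first ordering leaves the pruning task unsubmitted at the end of the step, while the second leaves nothing pending. Whether a real OpenCL runtime actually delays the pruning kernel this way is exactly the open question in the note above.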