Task #2675: bonded CUDA offload task
consider making GPU bonded work independent from nonbonded
The pair search can leave the GPU idle for a considerable time in cases where a weak GPU is paired with a strong CPU. Allowing the bondeds to run independently of the nbnxm xq buffer would allow most of the pair search to overlap with PME and bonded execution.
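A rough way to reason about the potential gain is sketched below. It is a toy timing model, not GROMACS code. The struct, the function names and the timings are all hypothetical, and it assumes that GPU work which does not need the pair list runs back to back from the start of the step while the CPU does the search.

```cpp
// Hypothetical per-step timings in milliseconds (illustrative only, not measured).
struct StepTimings
{
    double search;    // CPU pair search
    double pme;       // GPU PME work, does not need the pair list
    double bonded;    // GPU bonded work
    double nonbonded; // GPU nonbonded work, needs the pair list (unused by the idle estimate)
};

// GPU idle time during the search, given how much GPU work can start
// before the search has finished. Assumes that work runs serially.
constexpr double gpuIdleDuringSearch(double search, double independentWork)
{
    return search > independentWork ? search - independentWork : 0.0;
}

// Bondeds tied to the nbnxm xq buffer: only PME can overlap with the search.
constexpr double idleWithCoupledBonded(StepTimings t)
{
    return gpuIdleDuringSearch(t.search, t.pme);
}

// Bondeds independent of the xq buffer: PME and bondeds both overlap with the search.
constexpr double idleWithIndependentBonded(StepTimings t)
{
    return gpuIdleDuringSearch(t.search, t.pme + t.bonded);
}
```

With the made-up timings `StepTimings{4.0, 1.5, 1.0, 3.0}`, the model gives 2.5 ms of GPU idle time with coupled bondeds and 1.5 ms with independent bondeds. The difference is capped by the bonded execution time, so the gain only matters when the search is long relative to PME.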
#1 Updated by Szilárd Páll 8 months ago
On second thought, the benefit from this might be smaller than I originally expected: we are planning to split the search out of do_force(), which will make it impossible to launch any force task that does not depend on the pair list early enough to overlap with the search.