Bonded work might fail when there is no nonlocal nonbonded work
The coordinate copy is skipped when the nonlocal pair list is empty. An empty pair list, however, does not imply that there are no bonded interactions that need the nonlocal atom coordinates.
Proposed solution: change the conditional preceding the coordinate H2D copy to one that reflects the new dependency.
Split nbnxn input copy and kernel launch
The nonbonded x+q host-to-device copy and the kernel launch are split
into two functions that are called separately from do_force().
This will allow improving the bonded scheduling and better expressing a
missing bonded dependency (and fixing the related bug).
This change only moves code.
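
A minimal sketch of the split, assuming hypothetical names
(GpuNonbondedSketch, copyXqToGpuSketch, launchNonbondedKernelSketch,
doForceSketch) that are not the actual GROMACS functions. It only
illustrates why separating the copy from the launch gives the caller a
point at which other work that consumes the same coordinates can be
scheduled; the ordering shown is one possible schedule, not the one the
code necessarily uses.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the GPU nonbonded state. It records the order
// of operations instead of talking to a real device.
struct GpuNonbondedSketch
{
    bool                     coordinatesOnDevice = false;
    std::vector<std::string> log;
};

// Step 1: only the x+q host-to-device copy.
void copyXqToGpuSketch(GpuNonbondedSketch* nb)
{
    nb->coordinatesOnDevice = true;
    nb->log.push_back("copy xq");
}

// Step 2: only the nonbonded kernel launch.
void launchNonbondedKernelSketch(GpuNonbondedSketch* nb)
{
    // The kernel reads the coordinates, so the copy must have been issued.
    assert(nb->coordinatesOnDevice);
    nb->log.push_back("launch nonbonded");
}

// Caller in the style of do_force(): with two separate calls, bonded work
// that reads the same coordinates can be placed between them.
void doForceSketch(GpuNonbondedSketch* nb, bool haveBondedWork)
{
    copyXqToGpuSketch(nb);
    if (haveBondedWork)
    {
        nb->log.push_back("launch bonded"); // assumed placement, illustration only
    }
    launchNonbondedKernelSketch(nb);
}
```

With a single combined function, the caller has no place between the
copy and the launch to express such a dependency.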
Fix conditional in nonlocal nbnxn GPU work skipping
The nbnxn nonlocal work, including the coordinate buffer copy, could be
skipped when the nonlocal pair list is empty. However, this condition
now also needs to take into account that the bonded kernels take the
same coordinates as input.
This change makes the non-local nbnxn copy depend on whether there is
either non-local nonbonded work or bonded work on the current domain.
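
The difference between the old and the corrected condition can be
sketched as follows. The struct and function names (NonlocalGpuWork,
needsNonlocalCopyOld, needsNonlocalCopy, checkBondedInputsSketch) are
hypothetical and chosen for illustration; they are not the identifiers
used in the code.

```cpp
#include <cassert>

// Hypothetical summary of the GPU work on the non-local part of the
// current domain.
struct NonlocalGpuWork
{
    bool hasNonbondedPairs; // the non-local pair list is non-empty
    bool hasBondedWork;     // bonded kernels read non-local coordinates
};

// Old condition: the copy is issued only when there are non-local pairs.
constexpr bool needsNonlocalCopyOld(const NonlocalGpuWork& work)
{
    return work.hasNonbondedPairs;
}

// Corrected condition: the copy is issued when either consumer of the
// non-local coordinates has work.
constexpr bool needsNonlocalCopy(const NonlocalGpuWork& work)
{
    return work.hasNonbondedPairs || work.hasBondedWork;
}

// Invariant the bonded kernels rely on: if they have work, the
// non-local coordinates must have been copied to the device.
inline void checkBondedInputsSketch(const NonlocalGpuWork& work, bool coordinatesCopied)
{
    assert(!work.hasBondedWork || coordinatesCopied);
}
```

For a domain with an empty non-local pair list but with bonded work,
the old condition skips the copy, while the corrected one issues it.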