Current code does:
useGpuFBufferOps = simulationWork.useGpuBufferOps && simulationWork.useGpuNonbonded && !(flags.computeVirial || flags.computeEnergy);
Energy steps have no reason to fall back to CPU buffer ops, so the flags.computeEnergy condition needs to be removed.
Make the wait on nonbonded GPU results conditional
When the force reduction is done on the GPU and there are no energy or
shift force results required, there is no need to block and wait on the
CPU until the GPU nonbonded kernels complete.
This change makes the wait conditional on whether nonbonded force,
energy, or shift force outputs are required, so the blocking wait is
now skipped on force-only steps with GPU buffer ops.
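
The condition described above can be sketched roughly as follows. This is an illustration only, not the actual code: the struct `NonbondedOutputs`, its fields, and the function `mustWaitForNonbondedGpu` are hypothetical names, and the real decision is made from the existing workload and step flags.

```cpp
// Hypothetical description of which nonbonded results the CPU needs
// on a given step (names are assumptions, not the real data structures).
struct NonbondedOutputs
{
    bool forcesNeededOnCpu; // false when the force reduction runs on the GPU
    bool energyNeeded;      // true on energy steps
    bool shiftForcesNeeded; // true when shift forces are required
};

// The CPU only has to block on the GPU nonbonded kernels when at least
// one of their outputs is consumed on the CPU.
constexpr bool mustWaitForNonbondedGpu(const NonbondedOutputs& outputs)
{
    return outputs.forcesNeededOnCpu || outputs.energyNeeded || outputs.shiftForcesNeeded;
}

// Force-only step with GPU buffer ops: no blocking wait.
static_assert(!mustWaitForNonbondedGpu(NonbondedOutputs{ false, false, false }),
              "force-only step with GPU reduction should not wait");
// Energy step: the wait is still required.
static_assert(mustWaitForNonbondedGpu(NonbondedOutputs{ false, true, false }),
              "energy step should wait");
```

In this sketch the caller would wrap the existing blocking wait in `if (mustWaitForNonbondedGpu(outputs))`.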
Also removed the now unnecessary boolean argument passed to
Remove energy step dependence on GPU force buffer ops
Energy steps have no reason to fall back to CPU buffer ops. This
change removes the energy step flag from the calculation of the GPU
force buffer op flag.
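
For comparison, a minimal sketch of the flag before and after this change. The structs `SimulationWorkload` and `StepFlags` here are simplified stand-ins introduced for illustration (an assumption, not the real types); only the member names are taken from the expression quoted above.

```cpp
// Simplified, hypothetical stand-ins for the real workload and flag types.
struct SimulationWorkload
{
    bool useGpuBufferOps;
    bool useGpuNonbonded;
};

struct StepFlags
{
    bool computeVirial;
    bool computeEnergy;
};

// Before: energy steps also forced a fallback to CPU force buffer ops.
constexpr bool useGpuFBufferOpsBefore(const SimulationWorkload& simulationWork, const StepFlags& flags)
{
    return simulationWork.useGpuBufferOps && simulationWork.useGpuNonbonded
           && !(flags.computeVirial || flags.computeEnergy);
}

// After: only virial steps fall back; the energy flag no longer matters.
constexpr bool useGpuFBufferOpsAfter(const SimulationWorkload& simulationWork, const StepFlags& flags)
{
    return simulationWork.useGpuBufferOps && simulationWork.useGpuNonbonded
           && !flags.computeVirial;
}

// An energy-only step now keeps GPU force buffer ops enabled.
static_assert(!useGpuFBufferOpsBefore(SimulationWorkload{ true, true }, StepFlags{ false, true }),
              "old expression disabled GPU buffer ops on energy steps");
static_assert(useGpuFBufferOpsAfter(SimulationWorkload{ true, true }, StepFlags{ false, true }),
              "new expression keeps GPU buffer ops on energy steps");
```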