LJ-PME unstable with OpenCL
Jenkins tests intermittently fail with OpenCL for complex.nbnxn-ljpme-LB-geometric:
gmx: /home/jenkins/workspace/Gromacs_Gerrit_2016_presubmit/fdad3ebf/gromacs/src/gromacs/mdlib/nbnxn_ocl/nbnxn_ocl_data_mgmt.cpp:842: void nbnxn_ocl_clear_f(gmx_nbnxn_ocl_t*, int): Assertion `cl_error == 0' failed.
Aborted (core dumped)
#2 Updated by Berk Hess over 4 years ago
Note that there is only one ljpme test using OpenCL; the other two use the group scheme. So I suspect all ljpme runs are affected by this issue.
The code looks fine to me. The only thing I can't tell is whether the force buffer reallocation is guaranteed to happen after the force clearing has finished. If it is not, that could explain the failure. Note that in that case all assertion failures should occur at DD (=nstlist) steps.
#6 Updated by Berk Hess over 4 years ago
- Status changed from New to Fix uploaded
- Target version changed from 2016 to 5.1.3
- Affected version changed from 2016 to 5.1.2
This was a simple issue: the force clearing kernel was called with 0 total threads, which is not allowed.
Uploaded a fix to release-5-1.
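For readers hitting a similar assertion, here is a minimal sketch of the failure mode and one way to guard against it. It is not the uploaded fix. It assumes the pre-2.1 OpenCL behaviour, where clEnqueueNDRangeKernel returns CL_INVALID_GLOBAL_WORK_SIZE when a global work size is 0. To stay self-contained it does not call OpenCL: mockEnqueueKernel, globalWorkSize and clearForces are hypothetical names, and only the two error code values are taken from CL/cl.h.

```cpp
#include <cassert>
#include <cstddef>

// Error code values as defined in CL/cl.h, reproduced to avoid the dependency.
constexpr int c_success               = 0;   // CL_SUCCESS
constexpr int c_invalidGlobalWorkSize = -63; // CL_INVALID_GLOBAL_WORK_SIZE

// Hypothetical stand-in for clEnqueueNDRangeKernel: like pre-2.1 OpenCL,
// it rejects a launch whose global work size is zero.
int mockEnqueueKernel(std::size_t globalSize)
{
    return (globalSize == 0) ? c_invalidGlobalWorkSize : c_success;
}

// Round the number of atoms up to a whole number of work-groups.
// Returns 0 when there is nothing to process.
std::size_t globalWorkSize(int numAtoms, std::size_t localSize)
{
    assert(localSize > 0);
    if (numAtoms <= 0)
    {
        return 0;
    }
    const std::size_t n = static_cast<std::size_t>(numAtoms);
    return ((n + localSize - 1) / localSize) * localSize;
}

// Clear the force buffer for numAtoms atoms, skipping the launch entirely
// when there is no work. numLaunches counts the kernels actually enqueued.
int clearForces(int numAtoms, std::size_t localSize, int* numLaunches)
{
    const std::size_t globalSize = globalWorkSize(numAtoms, localSize);
    if (globalSize == 0)
    {
        return c_success; // nothing to clear, so do not enqueue an empty kernel
    }
    const int err = mockEnqueueKernel(globalSize);
    if (err == c_success)
    {
        ++(*numLaunches);
    }
    return err;
}
```

Without the early return, a call with zero atoms would pass a global size of 0 to the enqueue call, get a nonzero error code back, and trip an assertion of the form `cl_error == 0`.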