Avoid requiring the user to recompile GROMACS for Intel OpenCL support
Instead of setting c_nbnxnGpuClusterSize to a constant based on a CMake setting, it seems like the obvious solution is to make it a template parameter and compile with both 4 and 8.
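As a rough illustration of the proposal above, the sketch below instantiates a routine for cluster sizes 4 and 8 and picks the instance at run time. The names countClusters and countClustersDispatch are hypothetical stand-ins and are not functions from the GROMACS source; the body only computes how many clusters are needed to hold a given number of atoms.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a search routine that depends on the cluster size.
// It returns the number of clusters needed to hold numAtoms atoms.
template<int clusterSize>
int countClusters(int numAtoms)
{
    static_assert(clusterSize == 4 || clusterSize == 8,
                  "Only cluster sizes 4 and 8 are supported in this sketch");
    assert(numAtoms >= 0);
    // Round up so that a partially filled last cluster is counted.
    return (numAtoms + clusterSize - 1) / clusterSize;
}

// Hypothetical run-time dispatch: selects the instantiation that matches
// the cluster size required by the device detected at run time.
int countClustersDispatch(int numAtoms, int clusterSize)
{
    switch (clusterSize)
    {
        case 4: return countClusters<4>(numAtoms);
        case 8: return countClusters<8>(numAtoms);
        default: throw std::invalid_argument("Unsupported cluster size");
    }
}
```

With this layout both instantiations are compiled into the same binary, so the choice between 4 and 8 no longer needs a CMake setting.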
#1 Updated by Mark Abraham 6 months ago
- Category set to core library
- Target version set to 2020
Indeed. It need not be necessary to template, either. It's not clear to me that the setup and search code benefits a lot from knowing that the cluster width is a compile-time constant, since its role is to set up data structures that also have several dynamically sized aspects. Since addressing this does require functional changes to the NBNXN module, we should target 2020.
Unfortunately the Intel OpenCL functionality sat in Gerrit and was not integrated over the May-August timeframe despite review, fixes, and voting from several people. That probably inhibited people from considering working on such usability aspects. Clearly we need to improve our workflows there. It's not acceptable that someone proposing GPU code has to wait months for us to prioritize the final stages of integrating it. We like having input from those most experienced with our GPU code, but when they are not available, we need to integrate so we can work on other aspects without large rebase chains. If there had been functional aspects of the Intel OpenCL code to address when we had time from those developers (e.g. I believe Intel OpenCL is currently broken for PME), it would have been fine to fix them in follow-up work in August-September.
#5 Updated by Mark Abraham 6 months ago
Berk Hess wrote:
There are only a handful of functions in nbnxn_search.cpp which would need to be templated.
Do we need templating? We would anyway need to have a cluster size chosen at run time in order to choose the instance of the template to use, and it is simpler to just pass 4 or 8. :-)
The OpenCL kernels are always jitted, right? Then this is a rather small change.
Yes, always jitted. I'd hope it was fairly small, but I didn't want to tackle it while Intel OpenCL support was still in Gerrit and the scale of any changes to search code for the update groups wasn't known.
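A minimal sketch of the simpler alternative discussed in this comment, assuming the cluster size is passed as an ordinary run-time argument and forwarded to the JIT-compiled OpenCL kernels as a preprocessor define. The function names and the CLUSTER_SIZE macro are hypothetical and do not come from the GROMACS source; only the "-D name=definition" form of OpenCL build options is standard.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: the cluster size is a plain run-time value,
// so no template instantiation or dispatch is needed on the host side.
int countClusters(int numAtoms, int clusterSize)
{
    assert(clusterSize == 4 || clusterSize == 8);
    assert(numAtoms >= 0);
    // Round up so that a partially filled last cluster is counted.
    return (numAtoms + clusterSize - 1) / clusterSize;
}

// Hypothetical helper: builds the define that would be appended to the
// options string given to the OpenCL JIT compiler, so the kernel source
// sees the same cluster size as the host code.
std::string clusterSizeBuildOption(int clusterSize)
{
    assert(clusterSize == 4 || clusterSize == 8);
    return "-D CLUSTER_SIZE=" + std::to_string(clusterSize);
}
```

Because the kernels are compiled at run time, the kernel code can still treat the cluster size as a compile-time constant while the host-side setup code treats it as a variable.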