Task #2518

Task #2454: OpenCL infrastructure improvements

redesign task-assignment code for OpenCL

Added by Mark Abraham 3 months ago.

Status: New
Priority: Normal
Assignee: -
Category: mdrun
Target version:
Difficulty: uncategorized

Description

At https://gerrit.gromacs.org/#/c/7857/ it became clear that we need to work on how OpenCL devices are chosen to run compute tasks (whether any, and which, e.g. NB, PME), and how that choice might affect parameters for e.g. the NB search code or kernel JIT.

OpenCL permits multiple platforms to be available and choosable at run time. It is quite possible to have a node with 5 platforms, e.g. NVIDIA GPU, AMD GPU, Intel GPU, somebody's CPU, and even an FPGA. Once there's a libOpenCL.so, it finds the configured ICDs via the OS/filesystem; these get installed alongside the drivers. If there are multiple devices visible, it's up to the application to decide what makes sense to do. Our Jenkins slaves literally have this issue to solve.

It's quite likely that current HPC nodes will only have one kind of GPU, but if they have the latest Intel OpenCL support installed, that will sometimes already be two platforms (Intel CPU and somebody's GPU). And with Intel positioning to make FPGAs available too...

I don't think it's a good long-term approach to require the user to configure for a single platform, and I don't think that helps us much with the run-time issue of which platform that was.

We can probably adopt the heuristic that we only use GPU platforms, but that assumes we know how to identify those (query something at run time?). We could let the user configure what the default GPU platform vendor is, so that e.g. the default for gmx mdrun -openclplatform is that value (but still potentially configurable at run time).
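
To make the heuristic above concrete, here is a rough C++ sketch of the selection logic only. It is not existing GROMACS code, and all names in it (PlatformInfo, selectPlatform, the vendor strings) are made up for illustration. It assumes the platform summaries were already filled in at run time, e.g. from clGetPlatformIDs, clGetPlatformInfo with CL_PLATFORM_VENDOR, and clGetDeviceIDs with CL_DEVICE_TYPE_GPU. The priority order (user request, then configured default, then first platform with a GPU) is one possible policy, not a decision.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical summary of one OpenCL platform, assumed to be filled in
// beforehand from the OpenCL platform and device queries.
struct PlatformInfo
{
    std::string vendor;         // e.g. the CL_PLATFORM_VENDOR string
    int         gpuDeviceCount; // devices reported for CL_DEVICE_TYPE_GPU
};

// Case-insensitive substring match, so "nvidia" matches "NVIDIA Corporation".
static bool vendorMatches(const std::string& vendor, const std::string& wanted)
{
    auto lower = [](std::string s) {
        std::transform(s.begin(), s.end(), s.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        return s;
    };
    return lower(vendor).find(lower(wanted)) != std::string::npos;
}

// Returns the index of the platform to use, or std::nullopt if none fits.
// Assumed policy: a vendor requested at run time wins over the configured
// default; if neither is given or the default is absent, take the first
// platform that has at least one GPU device.
std::optional<std::size_t> selectPlatform(const std::vector<PlatformInfo>& platforms,
                                          const std::string&               userVendor,
                                          const std::string&               defaultVendor)
{
    const std::string&         wanted = !userVendor.empty() ? userVendor : defaultVendor;
    std::optional<std::size_t> firstGpuPlatform;
    for (std::size_t i = 0; i < platforms.size(); ++i)
    {
        assert(platforms[i].gpuDeviceCount >= 0); // a device count cannot be negative
        if (platforms[i].gpuDeviceCount == 0)
        {
            continue; // heuristic: only consider platforms that expose a GPU
        }
        if (!wanted.empty() && vendorMatches(platforms[i].vendor, wanted))
        {
            return i;
        }
        if (!firstGpuPlatform)
        {
            firstGpuPlatform = i;
        }
    }
    // An explicit user request that cannot be met is reported as "no platform"
    // rather than silently replaced by another vendor.
    if (!userVendor.empty())
    {
        return std::nullopt;
    }
    return firstGpuPlatform;
}
```

With this policy a CPU-only platform is never selected, even if the user names its vendor, which may or may not be what we want in the long run. Returning an index keeps the choice separate from the later questions of which tasks run on the device and how that feeds into NB search parameters or kernel JIT.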

More background information: https://stackoverflow.com/questions/18602534/is-there-a-good-way-to-choose-the-right-platform-on-the-fly

https://gerrit.gromacs.org/#/c/7787/ introduced a CMake variable that can be used to set things up to run on an Intel iGPU, which is fine for getting some changes integrated and runnable, but we need to do a better job before we can consider exposing support for such devices to users.