mdrun can oversubscribe the cores with GPUs and thread-MPI when only -ntomp is set
The code for deciding the number of thread-MPI ranks did not correctly take into account that the number of OpenMP threads may have been set (either by the user or by something in the scheduling). This could lead to oversubscribing the hardware threads.
Since in general there is no unique way to decide the number of thread-MPI ranks given the number of OpenMP threads, the number of hardware threads and the number of GPUs, we should not allow specifying -ntomp without -ntmpi.
For the CPU case there is a unique solution, so we do not need to change that (and we should not in a release branch).
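To illustrate why the GPU case has no unique answer, here is a minimal sketch in Python. It is not mdrun's actual logic: the function names are hypothetical, and it assumes (as a simplification) that ranks are spread evenly over the GPUs and that the CPU-only choice is simply hardware threads divided by OpenMP threads.

```python
from __future__ import annotations


def candidate_rank_counts(hw_threads: int, ntomp: int, ngpus: int) -> list[int]:
    """List rank counts that fit on the hardware for a fixed ntomp.

    Assumption for illustration only: the rank count must be a multiple
    of the GPU count, and ranks * ntomp must not exceed hw_threads.
    """
    if hw_threads < 1 or ntomp < 1 or ngpus < 1:
        raise ValueError("all arguments must be positive")
    max_ranks = hw_threads // ntomp
    return list(range(ngpus, max_ranks + 1, ngpus))


def cpu_rank_count(hw_threads: int, ntomp: int) -> int:
    """CPU-only case (assumed rule): fill the hardware threads exactly once."""
    if hw_threads < 1 or ntomp < 1:
        raise ValueError("all arguments must be positive")
    return hw_threads // ntomp


if __name__ == "__main__":
    # 16 hardware threads, -ntomp 4, 2 GPUs: both 2 and 4 ranks are valid.
    print(candidate_rank_counts(16, 4, 2))
    # CPU only: a single answer.
    print(cpu_rank_count(16, 4))
```

With 16 hardware threads, 4 OpenMP threads per rank and 2 GPUs, both 2 ranks (one per GPU) and 4 ranks (two per GPU) fit without oversubscription, so the choice has to come from the user.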
Require -ntmpi when setting -ntomp with GPUs
With GPUs and thread-MPI, setting only -ntomp could lead to
oversubscription of the hardware threads.
Now with GPUs and thread-MPI the user is required to set -ntmpi when
using -ntomp. We chose to also require -ntmpi when the user specifies
both -nt and -ntomp; in that case we could infer the number of ranks,
but it is safer to ask the user to explicitly set -ntmpi.
Note that specifying both -ntmpi and -nt has always worked correctly.
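The rule described in this message can be sketched as a small validation function. This is an illustrative Python model, not the code in mdrun: the function name, the exception type and the convention that 0 means "not set by the user" are assumptions made for the example.

```python
class LaunchConfigError(ValueError):
    """Raised when the thread options are ambiguous (hypothetical type)."""


def check_thread_options(uses_gpus: bool, uses_thread_mpi: bool,
                         nt: int = 0, ntmpi: int = 0, ntomp: int = 0) -> None:
    """Reject -ntomp without -ntmpi when running with GPUs and thread-MPI.

    A value of 0 stands for "option not given" (assumption).
    -nt is accepted but never substitutes for -ntmpi, matching the
    choice to require -ntmpi even when -nt and -ntomp are both given.
    """
    if min(nt, ntmpi, ntomp) < 0:
        raise LaunchConfigError("thread counts must not be negative")
    if uses_gpus and uses_thread_mpi and ntomp > 0 and ntmpi == 0:
        raise LaunchConfigError(
            "with GPUs and thread-MPI, -ntomp requires -ntmpi to be set")


if __name__ == "__main__":
    check_thread_options(True, True, ntmpi=2, ntomp=4)   # accepted
    check_thread_options(True, True, ntmpi=2, nt=16)     # accepted
    try:
        check_thread_options(True, True, nt=16, ntomp=4)  # rejected
    except LaunchConfigError as err:
        print(err)
```

On the command line this corresponds to writing, for example, `gmx mdrun -ntmpi 2 -ntomp 4` rather than `gmx mdrun -ntomp 4` when GPUs are in use.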