fix & improve CPU oversubscription handling
Oversubscribing the available CPU cores should be avoided in most (if not all) cases because it hurts performance, and thread pinning makes this even worse. The current oversubscription check implemented in the gmx_omp_nthreads module is incorrect when separate PME nodes use a different number of OpenMP threads than the PP nodes. The following improvements are required:
- implementing a correct check outside of gmx_omp_nthreads - oversubscription is not (only) OpenMP-related, it can also happen with pure MPI/tMPI;
- turning off thread pinning when oversubscription is detected.
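To make the two points above concrete, here is a minimal sketch of a per-node check that does not assume a uniform thread count per rank. It is not GROMACS code: `rank_info_t`, `count_threads_on_node` and `pinning_allowed` are hypothetical names, and the sketch assumes the node index and thread count of every rank are already known.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-rank description; not an actual GROMACS type. */
typedef struct {
    int node_id;  /* index of the physical node the rank runs on */
    int nthreads; /* threads started by this rank (1 for pure MPI/tMPI) */
} rank_info_t;

/* Sum the threads of all ranks placed on node_id. PP and PME ranks may
 * use different thread counts, so each rank is counted individually. */
static int count_threads_on_node(const rank_info_t *ranks, size_t nranks,
                                 int node_id)
{
    int total = 0;
    for (size_t i = 0; i < nranks; i++) {
        assert(ranks[i].nthreads >= 1);
        if (ranks[i].node_id == node_id) {
            total += ranks[i].nthreads;
        }
    }
    return total;
}

/* Pinning stays enabled only if the node is not oversubscribed, i.e. the
 * total thread count does not exceed the number of hardware threads. */
static bool pinning_allowed(const rank_info_t *ranks, size_t nranks,
                            int node_id, int hw_threads)
{
    assert(hw_threads >= 1);
    return count_threads_on_node(ranks, nranks, node_id) <= hw_threads;
}
```

With two PP ranks of 4 threads and one PME rank of 2 threads on an 8-thread node, the sum is 10, so the sketch reports oversubscription and pinning would be turned off for that node.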
Added basic CPU topology information to cpuid code
We can now detect the locality of hardware threads, cores,
and packages for Intel and AMD CPUs under Linux and Windows.
In particular, this provides an array with locality order
for logical processors that can be used to optimize placement.
Refs #1086, #1101.
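As an illustration of what a locality-order array could look like, the sketch below sorts logical processors by package, then core, then hardware thread. It is only an assumption about the idea described in the commit message, not the cpuid code itself; `hw_locality_t` and `build_locality_order` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical topology record for one logical processor. */
typedef struct {
    int package_id;  /* socket/package index */
    int core_id;     /* core index within the package */
    int hwthread_id; /* hardware thread index within the core */
    int logical_id;  /* logical processor number as seen by the OS */
} hw_locality_t;

/* Order by package first, then core, then hardware thread. */
static int compare_locality(const void *a, const void *b)
{
    const hw_locality_t *x = a;
    const hw_locality_t *y = b;
    if (x->package_id != y->package_id) {
        return x->package_id < y->package_id ? -1 : 1;
    }
    if (x->core_id != y->core_id) {
        return x->core_id < y->core_id ? -1 : 1;
    }
    if (x->hwthread_id != y->hwthread_id) {
        return x->hwthread_id < y->hwthread_id ? -1 : 1;
    }
    return 0;
}

/* Sort procs in place and write the logical processor ids in locality
 * order into order[0..n-1]. */
static void build_locality_order(hw_locality_t *procs, size_t n, int *order)
{
    assert(n == 0 || (procs != NULL && order != NULL));
    qsort(procs, n, sizeof(*procs), compare_locality);
    for (size_t i = 0; i < n; i++) {
        order[i] = procs[i].logical_id;
    }
}
```

Neighbouring entries in `order` then share a core or package where possible, which is the property a placement scheme would rely on.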
#2 Updated by Szilárd Páll over 7 years ago
Mark Abraham wrote:
Can we deal with this in the next fortnight or so for 4.6, or push it back to 4.6.1?
Depends on what "we" means.
I have not tried to fix the current code because that requires yet another splitting of the default communicator (the same is already done in 2-3 places) and a per-node thread enumeration (at least I don't know of a simpler way). However, as we discussed earlier (and I now realize that an issue for this is missing), communicators for inter- and intra-node communication should instead be set up and stored, which would remove the current redundancy and enable implementing other features. This task is pretty simple, so I might as well just do it myself if nobody can pitch in.
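A rough sketch of the bookkeeping behind such a split, written without MPI so it can be run standalone: each rank gets a node index and an intra-node rank derived from its host name. The helper `assign_node_ranks` is hypothetical and assumes the host names of all ranks have already been gathered. In an MPI build the node index could serve as the `color` and the intra-node rank as the `key` argument of `MPI_Comm_split` for the intra-node communicator, while the ranks with intra-node rank 0 would form the inter-node communicator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: for each rank, derive a node index and the rank's
 * position among the ranks on the same node from the host names.
 * Returns the number of distinct nodes found. */
static int assign_node_ranks(const char *const *hostnames, size_t nranks,
                             int *node_index, int *intra_rank)
{
    int nnodes = 0;
    for (size_t i = 0; i < nranks; i++) {
        assert(hostnames[i] != NULL);
        node_index[i] = -1;
        intra_rank[i] = 0;
        /* Every earlier rank on the same host shares our node index and
         * pushes our intra-node rank up by one. */
        for (size_t j = 0; j < i; j++) {
            if (strcmp(hostnames[i], hostnames[j]) == 0) {
                node_index[i] = node_index[j];
                intra_rank[i]++;
            }
        }
        /* First rank seen on this host: open a new node. */
        if (node_index[i] < 0) {
            node_index[i] = nnodes++;
        }
    }
    return nnodes;
}
```

Computing and storing this once would also provide the per-node enumeration needed for the thread count, instead of repeating the split in several places.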