LINCS: (erroneous?) assertion with multiple tMPI ranks and multiple OpenMP threads
When running with more than one tMPI rank and more than one OpenMP thread per rank,
I get error messages of the following type with gmx mdrun versions >5.0:
gmx_d: /home/tullman/GROMACS/gromacs.git/src/gromacs/mdlib/clincs.cpp:1417: void set_lincs_matrix_task(gmx_lincsdata*, lincs_task_t*, const real*, int*): Assertion `k >= li_task->b0 && k < li_task->b1' failed.
The error messages are due to these two assertions in clincs.cpp:
1413 /* If we are using multiple tasks for LINCS,
1414 * the calls to check_assign_triangle should have
1415 * put all constraints in the triangle in our task.
1417 assert(k >= li_task->b0 && k < li_task->b1);
1418 assert(kk >= li_task->b0 && kk < li_task->b1);
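The condition these assertions test can be illustrated with a minimal sketch. This is not the GROMACS code: the `Task` struct and the helper functions are hypothetical, and only the `b0`/`b1` names and the half-open range check are taken from the assertion text above. It assumes each task owns a contiguous range of constraint indices.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the per-task data; only the names b0 and b1
// come from the assertion quoted above.
struct Task
{
    int b0; // first constraint index owned by this task
    int b1; // one past the last constraint index owned by this task
};

// The condition tested by the quoted assertions: k lies in [b0, b1).
bool ownsConstraint(const Task& task, int k)
{
    return k >= task.b0 && k < task.b1;
}

// Returns true when every constraint of a triangle belongs to the task.
bool triangleInTask(const Task& task, const std::vector<int>& triangle)
{
    for (int k : triangle)
    {
        if (!ownsConstraint(task, k))
        {
            return false;
        }
    }
    return true;
}

// Mirrors the debug-build behaviour: abort if the triangle is split.
void requireTriangleInTask(const Task& task, const std::vector<int>& triangle)
{
    assert(triangleInTask(task, triangle));
}
```

With a hypothetical triangle made of constraints 3, 4 and 5, a task owning the range [0, 6) passes the check, while a task boundary at index 4 splits the triangle so that neither task [0, 4) nor task [4, 8) passes it.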
The assertions were introduced in this commit:
Commenting out the assertions seems to let the simulation run just fine,
judging from a quick look at the trajectory and some observables from
a short (200 ps) run. This run was done with a GPU build, but the
problem also occurs when not compiling for GPUs, and on two different machines.
Only thread-MPI + OpenMP versions were tested so far.
Versions >=5.1 run without problems with the same input with a single rank
and any number of OpenMP threads, or two ranks with a single OpenMP thread.
Versions 4.x run without problems for any number of (t)MPI ranks and OpenMP
threads.
Could the conditions for the assertion be inapplicable for >=2 ranks and >=2
OpenMP threads?
Fix LINCS triangle constraint thread issue
With triangle constraints, an OpenMP thread barrier could be missing
under certain conditions. A debug build produced an assertion failure.
This was unlikely to happen without DD, but quite likely with DD.
This bug will have had minimal effect on the results.