Bug #1578
PME incorrect with MPI+OpenMP and multiple MPI communication pulses
Affected version - extra info:
Affected version:
Difficulty: uncategorized
Description
Change 272736bc fixed a related issue involving multiple OpenMP thread domains along a PME domain decomposition dimension, but this fix broke a more common case without this specific thread division.
Now the PME energies and forces are incorrect with MPI+OpenMP when multiple MPI communication pulses are used, that is, when:
fftgrid_size[d]/#PME_ranks[d] < pme_order && fftgrid_size[d]/#PME_ranks[d] != pme_order-1
Note that without OpenMP threads there are no issues.
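To make the trigger concrete, here is a minimal sketch (not GROMACS code; all names are illustrative) that evaluates the condition above for one decomposition dimension d:

    /* Hypothetical helper, not part of GROMACS: checks whether the
     * local FFT grid slab per PME rank along dimension d is small
     * enough to require more than one MPI communication pulse,
     * excluding the slab == pme_order-1 case from the condition above. */
    #include <stdbool.h>
    #include <stdio.h>

    static bool needs_multiple_pulses(int fftgrid_size, int num_pme_ranks, int pme_order)
    {
        int slab = fftgrid_size / num_pme_ranks; /* grid lines per rank along d */
        return slab < pme_order && slab != pme_order - 1;
    }

    int main(void)
    {
        /* 48-point grid, 16 PME ranks, order 4: slab = 3 = pme_order-1,
         * so this case is excluded -> prints 0. */
        printf("%d\n", needs_multiple_pulses(48, 16, 4));
        /* 48-point grid, 24 PME ranks, order 4: slab = 2 < 3 -> prints 1. */
        printf("%d\n", needs_multiple_pulses(48, 24, 4));
        return 0;
    }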
Related issues
Related to Bug #1572: Incorrect PME energies and forces with high numbers of OpenMP threads
History
#1 Updated by Gerrit Code Review Bot over 4 years ago
Gerrit received a related DRAFT patchset '1' for Issue #1578.
Uploader: Szilárd Páll (pall.szilard@gmail.com)
Change-Id: I3649294a143bb744a2e26fd1d9dbfb87dea421ca
Gerrit URL: https://gerrit.gromacs.org/3905
#2 Updated by Berk Hess over 4 years ago
- Status changed from In Progress to Closed
#3 Updated by Roland Schulz about 4 years ago
- Related to Bug #1572: Incorrect PME energies and forces with high numbers of OpenMP threads added
Associated revisions
Fixed two PME issues with MPI+OpenMP
Change 272736bc partially fixed #1388, but broke the more general
case of multiple MPI communication pulses in PME by incorrectly
changing tx1 and ty1. That change has been reverted.
Change 27189bba fixed the incorrect PME grid reduction with multiple
thread grid overlap in y, but it broke the much more common case
where the y-size of the PME grid is not divisible by the number of
domains in y: that change incorrectly changed buf_my.
Now buf_my is set to the correct value, which solves both issues.
Fixes #1578.
Refs #1388 and #1572.
Change-Id: Id2d7d013a3b8cdc04eda1fb026567088a38ec81f
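The divisibility pitfall described in the commit message can be sketched in a few lines (an illustration of the general problem, not the actual GROMACS fix; all names are hypothetical). Sizing a per-thread reduction buffer as ny/ndomains, analogous to the broken buf_my, undercounts the rows of any y-domain that receives an extra grid line:

    #include <stdio.h>

    int main(void)
    {
        int ny = 25, ndomains = 4; /* 25 is not divisible by 4 */
        for (int d = 0; d < ndomains; d++)
        {
            /* Domain boundaries from index ranges, the usual way to
             * split a grid dimension over domains: */
            int y0 = (d * ny) / ndomains;
            int y1 = ((d + 1) * ny) / ndomains;
            int local_ny = y1 - y0;       /* correct per-domain row count */
            int naive_ny = ny / ndomains; /* always 6: wrong for domain 3 */
            printf("domain %d: local_ny=%d naive_ny=%d\n", d, local_ny, naive_ny);
        }
        return 0;
    }

With ny = 25 and 4 domains the correct counts are 6, 6, 6, 7, while the naive division yields 6 everywhere, so the last domain's buffer is one row short.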