Difference between single rank and multiple rank when pulling using constraints relative to rest of the system
I noticed that quantities such as the pull forces and the temperature differ completely between a run started on a GPU server (nonbonded and PME both running on the GPU) and its continuation from a restart file on the PDC Beskow supercomputer running MPI (no GPUs).
Continuing from a checkpoint on different hardware is not expected to give binary-identical results, but in this case the difference is striking: the fluctuations in the pull forces and the temperature are much larger on Beskow. I suspect something is going wrong, and I assume the output from the GPU server is the correct one, based only on the fact that it is more stable.
I'm attaching the pull force and temperature output and the log file from a run where the first 100 ps were run on a GPU server, the next 200 ps on Beskow, and the final 200 ps on the GPU server again.
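To put a number on "more stable", the fluctuations in each segment of the run can be compared directly. The following is only a minimal sketch, assuming the output is a single XVG file with time in ps in the first column, starting at 0, and the quantity of interest in the second column. The function names, the window boundaries and the file name in the usage comment are my own illustration, not something taken from the attached files.

```python
import statistics


def parse_xvg(text):
    """Return rows of floats from XVG text, skipping '#' comments and '@' directives."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#@":
            continue
        rows.append([float(x) for x in line.split()])
    return rows


def segment_stats(rows, windows, column=1):
    """Mean and sample standard deviation of one column per [start, end) time window.

    Windows with fewer than two frames yield None for both values.
    """
    stats = []
    for start, end in windows:
        values = [r[column] for r in rows if start <= r[0] < end]
        if len(values) < 2:
            stats.append((start, end, None, None))
        else:
            stats.append(
                (start, end, statistics.mean(values), statistics.stdev(values))
            )
    return stats


# Assumed windows in ps, following the description of the run:
# GPU server, then Beskow, then GPU server again.
# The windows are half-open, so a frame at exactly 500 ps is not counted.
SEGMENTS = [(0.0, 100.0), (100.0, 300.0), (300.0, 500.0)]

# Usage (hypothetical file name):
#   with open("pullf.xvg") as f:
#       rows = parse_xvg(f.read())
#   for start, end, mean, sd in segment_stats(rows, SEGMENTS):
#       print(start, end, mean, sd)
```

A clearly larger standard deviation in the middle window than in the first and last would match what is described above, although it would not by itself say which of the two outputs is correct.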
#3 Updated by Magnus Lundborg 5 months ago
- Subject changed from Difference between MPI and thread-MPI version pulling using constraints relative to rest of the system to Difference between single rank and multiple rank when pulling using constraints relative to rest of the system
The problem was identified as being related to running on a single rank versus multiple ranks. Subject updated.