Bug #3443

Bonded GPU kernel performance regression with 2020

Added by Andreas Baer 8 months ago. Updated 7 months ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
mdrun
Target version:
-
Affected version - extra info:
Affected version:
Difficulty:
uncategorized

Description

Hello,

A set of benchmarks on large systems with GROMACS 2019.5 and 2020 showed that performance with 2020 drops to about 2/3 of that with 2019.5. Interestingly, according to nvidia-smi, GPU usage is about 20% higher with the 2020 version.
Apparently, the regression affects the energy calculation steps, where the GPU bonded computation became significantly slower (as a side effect of optimizations that mainly targeted the force-only kernels).

All logfiles of the benchmarks can be found at the following link:
https://faubox.rrze.uni-erlangen.de/getlink/fiUpELsXokQr3a7vyeDSKdY3/benchmarks_2019-2020_all
The same link also provides the setup files (.tpr/.gro/.top/.mdp in `C60xh.7z`) and the scripts that start the benchmarks (`runfiles.7z`).
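For anyone comparing the logfiles, the following is a minimal sketch of how the ns/day figures could be extracted and compared. It assumes each md.log ends with the usual `Performance:` summary line (ns/day followed by hour/ns). The function names and the example numbers are made up for illustration and are not taken from the linked logs.

```python
import re
from typing import Optional

# Matches the summary line printed at the end of md.log, e.g.
# "Performance:       12.345        1.944"  (ns/day, hour/ns).
_PERF_RE = re.compile(r"^Performance:\s+([0-9.]+)\s+([0-9.]+)", re.MULTILINE)


def ns_per_day(log_text: str) -> Optional[float]:
    """Return the ns/day value from md.log text, or None if the line is absent."""
    match = _PERF_RE.search(log_text)
    return float(match.group(1)) if match else None


def relative_performance(new_log: str, old_log: str) -> float:
    """Ratio of new to old ns/day; 1.0 means no change, 0.67 means about 2/3."""
    new, old = ns_per_day(new_log), ns_per_day(old_log)
    if new is None or old is None:
        raise ValueError("no 'Performance:' line found in one of the logs")
    return new / old


# Made-up numbers for illustration only, NOT taken from the linked logfiles.
EXAMPLE_2019 = "               (ns/day)    (hour/ns)\nPerformance:       30.000        0.800\n"
EXAMPLE_2020 = "               (ns/day)    (hour/ns)\nPerformance:       20.000        1.200\n"
```

With real logs, the two strings would be read from the respective md.log files of the 2019.5 and 2020 runs.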

Some background info on the benchmarks:
- System contains about 2.1 million atoms.
- Hardware: 2x Intel Xeon Gold 6134 („Skylake“) 3.2 GHz = 16 cores + SMT; 4x NVIDIA Tesla V100
  (similar results, with a smaller performance drop (~15%), on a different machine: 2 or 4 nodes, each with 2x Intel Xeon 2660v2 („Ivy Bridge“) 2.2 GHz = 20 cores + SMT and 2x NVIDIA Kepler K20)
- Several settings for -ntmpi, -ntomp, -bonded and -pme were tried to find the optimal set. However, the performance drop persists across all of these options.
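The kind of option sweep described above can be sketched as follows. This is not the content of `runfiles.7z`, only an illustration under assumptions: the helper name, the `bench_...` output naming and the `topol.tpr` file name are hypothetical, and only the `cpu`/`gpu` values of `-bonded` and `-pme` are enumerated. Not every combination is necessarily accepted by mdrun (for example, PME on the GPU with several ranks may need a separate PME rank).

```python
import itertools
import shlex


def mdrun_commands(tpr, ntmpi_values, ntomp_values, gmx="gmx"):
    """Build one `gmx mdrun` command line per option combination."""
    commands = []
    for ntmpi, ntomp, bonded, pme in itertools.product(
        ntmpi_values, ntomp_values, ("cpu", "gpu"), ("cpu", "gpu")
    ):
        # Hypothetical naming scheme so that each run writes its own md.log.
        deffnm = f"bench_ntmpi{ntmpi}_ntomp{ntomp}_bonded-{bonded}_pme-{pme}"
        args = [
            gmx, "mdrun", "-s", tpr, "-deffnm", deffnm,
            "-ntmpi", str(ntmpi), "-ntomp", str(ntomp),
            "-bonded", bonded, "-pme", pme,
        ]
        commands.append(shlex.join(args))  # requires Python 3.8+
    return commands
```

For example, `mdrun_commands("topol.tpr", [4, 8], [4])` yields eight command lines, which could be run once per GROMACS version so that the resulting logs are directly comparable.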

Please let me know if you need further information.

Cheers,
Andreas


History

#1 Updated by Szilárd Páll 7 months ago

  • Subject changed from Performance regression with 2020 on GPUs to Bonded GPU kernel performance regression with 2020
  • Description updated (diff)

Ref. users' list discussion: as noted there, the regression is in the bonded force and energy kernels.
