Bug #1325

"Core t" time is wrong with separate PME nodes

Added by Mark Abraham about 4 years ago. Updated about 4 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: mdrun
Target version:
Affected version - extra info: probably all 4.6.x
Affected version:
Difficulty: uncategorized

Description

On 128 nodes:

     R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G

 Computing:         Nodes   Th.     Count  Wall t (s)     G-Cycles       %
-----------------------------------------------------------------------------
 Domain decomp.        96   16         51       2.526     6207.876     1.6
 DD comm. load         96   16         51       0.013       33.110     0.0
 DD comm. bounds       96   16         51       0.047      116.244     0.0
 Vsite constr.         96   16       1001       5.217    12821.130     3.3
 Send X to PME         96   16       1001       0.297      728.685     0.2
 Neighbor search       96   16         51       3.391     8334.242     2.2
 Comm. coord.          96   16        950       1.685     4142.285     1.1
 Force                 96   16       1001      61.505   151154.371    39.5
 Wait + Comm. F        96   16       1001       2.926     7191.164     1.9
 PME mesh              32   16       1001      26.229    21486.727     5.6
 PME wait for PP       32                      90.659    74267.855    19.4
 Wait + Recv. PME F    96   16       1001       0.490     1203.323     0.3
 NB X/F buffer ops.    96   16       2901       1.571     3860.317     1.0
 Vsite spread          96   16       1052       1.307     3211.383     0.8
 Write traj.           96   16          1       0.539     1325.423     0.3
 Update                96   16       1001       0.435     1070.252     0.3
 Constraints           96   16       1001      34.171    83978.435    21.9
 Comm. energies        96   16         51       0.035       85.321     0.0
 Rest                  96                       0.549     1800.140     0.5
-----------------------------------------------------------------------------
 Total                128                     116.888   383018.284   100.0
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
 PME redist. X/F       32   16       2002       7.337     6010.227     1.6
 PME spread/gather     32   16       2002      10.680     8749.207     2.3
 PME 3D-FFT            32   16       2002       4.841     3965.990     1.0
 PME 3D-FFT Comm.      32   16       4004       2.051     1680.014     0.4
 PME solve             32   16       1001       1.246     1020.900     0.3
-----------------------------------------------------------------------------

               Core t (s)   Wall t (s)        (%)
       Time:    11221.440      116.888     9600.2
                 (ns/day)    (hour/ns)
Performance:        2.960        8.109

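The reported Core t of 11221 s is approximately 117*96, that is, the wall time multiplied by the 96 PP nodes only. It should be approximately 117*128, because Core t is computed by a reduction over all 128 MPI nodes. The (%) column is wrong for the same reason.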
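The arithmetic can be checked with a short Python sketch. It uses only the Wall t and Core t values from the report above. The helper name is hypothetical, and the assumption that (%) is Core t divided by Wall t, times 100, is inferred from the reported numbers rather than taken from the mdrun source.

```python
def expected_core_time(wall_t_s: float, n_nodes: int) -> float:
    """Core time if every node contributes its full wall time to the sum."""
    return wall_t_s * n_nodes


wall_t = 116.888             # "Wall t (s)" from the report
reported_core_t = 11221.440  # "Core t (s)" from the report

pp_only = expected_core_time(wall_t, 96)     # only the 96 non-PME nodes counted
all_nodes = expected_core_time(wall_t, 128)  # all 128 MPI nodes counted

# Assumption: (%) = 100 * Core t / Wall t
reported_pct = 100.0 * reported_core_t / wall_t
expected_pct = 100.0 * all_nodes / wall_t

print(f"wall t * 96  = {pp_only:.3f} s (reported Core t: {reported_core_t:.3f} s)")
print(f"wall t * 128 = {all_nodes:.3f} s")
print(f"reported (%) = {reported_pct:.1f}, expected (%) = {expected_pct:.1f}")
```

Under these assumptions the reported Core t matches the 96-node product to within a fraction of a second, which is what points at the 32 PME nodes as the missing contribution.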

The problem is caused by runtime_start() not being called for separate PME nodes, so they contribute zero to the reduction. This bug could be fixed by calling runtime_start() before the PME split, but that would include some setup time in the "run" time. Instead, the PME code needs to start the timer when iteration starts.
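A toy model of the reduction illustrates the effect. This is plain Python, not GROMACS code: the function name is hypothetical, and it assumes every node that starts its timer measures roughly the same elapsed time as the reported wall time.

```python
def summed_core_time(per_node_elapsed):
    """Stand-in for the sum reduction over all MPI nodes."""
    return sum(per_node_elapsed)


N_PP, N_PME = 96, 32
WALL_T = 116.888  # "Wall t (s)" from the report

# Before the fix: PME-only nodes never start their timer, so each contributes 0.
before = [WALL_T] * N_PP + [0.0] * N_PME

# After the fix: PME-only nodes start their timer when iteration starts.
after = [WALL_T] * N_PP + [WALL_T] * N_PME

core_before = summed_core_time(before)
core_after = summed_core_time(after)

print(f"Core t with untimed PME nodes: {core_before:.3f} s")
print(f"Core t with timed PME nodes:   {core_after:.3f} s")
```

Starting the PME timer at the start of iteration, rather than before the PME split, keeps setup time out of both sums.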

Fortunately, the rates are computed based solely on the real time on the SIMMASTER node, so they are correct.
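A minimal sketch of why the rates are unaffected: they depend only on the simulated time and the wall time, not on Core t. The function name is hypothetical. The step count of 1001 is taken from the Count column above, but the 4 fs time step is an assumption chosen because it is consistent with the reported rates; it is not stated in this report.

```python
SECONDS_PER_DAY = 86400.0
SECONDS_PER_HOUR = 3600.0


def performance(simulated_ns: float, wall_t_s: float):
    """Return (ns/day, hour/ns) from simulated time and wall time only."""
    ns_per_day = simulated_ns * SECONDS_PER_DAY / wall_t_s
    hours_per_ns = (wall_t_s / SECONDS_PER_HOUR) / simulated_ns
    return ns_per_day, hours_per_ns


n_steps = 1001     # from the Count column in the report
dt_ps = 0.004      # ASSUMPTION: 4 fs time step, not stated in the report
simulated_ns = n_steps * dt_ps / 1000.0

wall_t = 116.888   # "Wall t (s)" from the report
ns_per_day, hours_per_ns = performance(simulated_ns, wall_t)

print(f"{ns_per_day:.3f} ns/day, {hours_per_ns:.3f} hour/ns")
```

Core t does not appear anywhere in this calculation, so an undercounted Core t leaves both rates unchanged.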

Associated revisions

Revision 3a0323cb
Added by Mark Abraham about 4 years ago

Fix total time measurement with separate PME nodes

The runtime counter needs to be passed to the PME code so that
PME-only nodes can have their time included in the statistics.

Fixes #1325

Change-Id: I13effaa185b1290e41bdd642c607ff75ab8db929

History

#1 Updated by Mark Abraham about 4 years ago

  • Status changed from New to Fix uploaded

Fixed in https://gerrit.gromacs.org/#/c/2573/. Output with the fix applied:

     R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G

 Computing:         Nodes   Th.     Count  Wall t (s)     G-Cycles       %
-----------------------------------------------------------------------------
 Domain decomp.        96   16         51       2.513     6175.897     1.6
 DD comm. load         96   16         51       0.014       34.229     0.0
 DD comm. bounds       96   16         51       0.048      118.054     0.0
 Vsite constr.         96   16       1001       5.143    12638.874     3.3
 Send X to PME         96   16       1001       0.206      507.097     0.1
 Neighbor search       96   16         51       3.397     8347.827     2.2
 Comm. coord.          96   16        950       1.455     3574.664     0.9
 Force                 96   16       1001      61.445   151003.097    39.9
 Wait + Comm. F        96   16       1001       2.793     6862.720     1.8
 PME mesh              32   16       1001      26.158    21427.811     5.7
 PME wait for PP       32                      89.386    73222.661    19.3
 Wait + Recv. PME F    96   16       1001       0.427     1048.845     0.3
 NB X/F buffer ops.    96   16       2901       1.559     3832.358     1.0
 Vsite spread          96   16       1052       1.173     2881.573     0.8
 Write traj.           96   16          1       0.537     1320.013     0.3
 Update                96   16       1001       0.437     1073.793     0.3
 Constraints           96   16       1001      33.628    82641.949    21.8
 Comm. energies        96   16         51       0.034       84.075     0.0
 Rest                  96                       0.551     1806.387     0.5
-----------------------------------------------------------------------------
 Total                128                     115.544   378601.924   100.0
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
 PME redist. X/F       32   16       2002       7.204     5901.486     1.6
 PME spread/gather     32   16       2002      10.744     8801.418     2.3
 PME 3D-FFT            32   16       2002       4.885     4001.408     1.1
 PME 3D-FFT Comm.      32   16       4004       2.043     1673.734     0.4
 PME solve             32   16       1001       1.206      987.938     0.3
-----------------------------------------------------------------------------

               Core t (s)   Wall t (s)        (%)
       Time:    14789.220      115.544    12799.7
                 (ns/day)    (hour/ns)
Performance:        2.994        8.016

#2 Updated by Mark Abraham about 4 years ago

  • Status changed from Fix uploaded to Resolved
  • % Done changed from 0 to 100

#3 Updated by Mark Abraham about 4 years ago

  • Status changed from Resolved to Closed
