Bug #1019

Timing stats broken

Added by Mark Abraham over 6 years ago. Updated over 6 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: mdrun
Target version:
Affected version - extra info:
Affected version:
Difficulty: uncategorized

Description

Something's broken in 4-6 with the timing stats. This is from a 6-replica REMD run using the Verlet kernels on a pure water system, with 8 procs per replica:

R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
Computing:          Nodes  Th.     Count  Wall t (s)     G-Cycles       %
-----------------------------------------------------------------------------
Domain decomp.          8    1   2000001     423.166     9906.176     1.5
DD comm. load           8    1     40001       0.090        2.112     0.0
Neighbor search         8    1   2000001    1114.841    26098.045     4.0
Comm. coord.            8    1  18000000     484.870    11350.643     1.7
Force                   8    1  20000001   16907.690   395803.171    60.1
Wait + Comm. F          8    1  20000001    1193.734    27944.906     4.2
PME mesh                8    1  20000001    4732.416   110784.219    16.8
NB X/F buffer ops.      8    1  56000001     185.132     4333.880     0.7
Write traj.             8    1        32       3.073       71.929     0.0
Update                  8    1  20000001     206.091     4824.524     0.7
Constraints             8    1  40000002    1042.440    24403.161     3.7
Comm. energies          8    1  40000002     363.917     8519.178     1.3
Rest                    8                   1455.743    34078.436     5.2
-----------------------------------------------------------------------------
Total                   8                  28113.203   658120.381   100.0
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
PME redist. X/F         8    1  40000002     479.478    11224.419     1.7
PME spread/gather       8    1  40000002    1754.612    41074.862     6.2
PME 3D-FFT              8    1  40000002    1282.474    30022.274     4.6
PME 3D-FFT Comm.        8    1  40000002     546.852    12801.618     1.9
PME solve               8    1  20000001     640.001    14982.194     2.3
-----------------------------------------------------------------------------

               Core t (s)   Wall t (s)        (%)
       Time:   193616.300    28113.203      688.7
                             7h48:33
                 (ns/day)    (hour/ns)
Performance:      122.932        0.195

Core time of 193616.3 s, divided over the 8 processes, agrees with the queueing system's report of 6h44min, but "Wall t" is about 4000 s too large. That looks like double counting of "Comm. coord." somehow?
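
For reference, the arithmetic behind that comparison can be sketched as below. This is only a sanity check on the numbers quoted in this report, not anything taken from the mdrun source. The division by 8 assumes the core time is summed over the 8 processes of one replica, and the helper name hms is made up for illustration.

```python
# Figures copied from the report above.
core_t = 193616.300              # "Core t (s)"
wall_t = 28113.203               # "Wall t (s)"
procs = 8                        # assumption: core time is summed over 8 processes
queue_t = 6 * 3600 + 44 * 60     # queueing system's 6h44min, in seconds

core_per_proc = core_t / procs   # elapsed time implied by the core time
excess_wall = wall_t - queue_t   # how far "Wall t" overshoots the queue report


def hms(seconds):
    """Format seconds as XhMM:SS (hypothetical helper, truncates fractions)."""
    s = int(seconds)
    return f"{s // 3600}h{(s % 3600) // 60:02d}:{s % 60:02d}"


print(hms(core_per_proc))   # 6h43:22, close to the queue's 6h44min
print(hms(wall_t))          # 7h48:33, the value printed under "Wall t (s)"
print(round(excess_wall))   # 3873, i.e. roughly 4000 s too large
```

So the core time is consistent with the queueing system, and only the reported wall time is off.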

History

#1 Updated by Mark Abraham over 6 years ago

688.7% also looks wrong...

#2 Updated by Berk Hess over 6 years ago

  • Status changed from New to Feedback wanted

This is probably fixed by now.
The usage percentage is the same as what top would report, so above 100% with more than 1 thread.
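
A quick sketch of how the reported percentage follows from the two times in the report, assuming it is simply core time over wall time. The second figure is hypothetical: it assumes the queueing system's 6h44min is the true wall time.

```python
# Figures copied from the report above.
core_t = 193616.300              # "Core t (s)"
wall_t = 28113.203               # "Wall t (s)" as reported
queue_t = 6 * 3600 + 44 * 60     # queueing system's 6h44min, in seconds

# Assumption: usage is core time divided by wall time, as a percentage.
reported_usage = 100.0 * core_t / wall_t
# Hypothetical: what the same ratio would give with the queue's elapsed time.
usage_with_queue_time = 100.0 * core_t / queue_t

print(f"{reported_usage:.1f}")          # 688.7
print(f"{usage_with_queue_time:.1f}")   # 798.7
```

Under those assumptions, the 688.7% is just a consequence of the inflated wall time. With the shorter elapsed time the ratio would sit near the 800% one might expect from 8 fully busy processes.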

#3 Updated by Mark Abraham over 6 years ago

Seems OK now, yes.

#4 Updated by Berk Hess over 6 years ago

  • Status changed from Feedback wanted to Closed

I assume this issue has been fixed by one or more patches in the meantime. We can reopen it if we still find issues.
