test_gmx467_1x16_cpu.log

Szilárd Páll, 04/28/2015 03:06 AM

 
Log file opened on Mon Apr 27 16:27:03 2015
Host: tcbs22  pid: 4603  nodeid: 0  nnodes:  1
Gromacs version:    VERSION 4.6.8-dev-20140830-0336ab2
GIT SHA1 hash:      0336ab2d9dad7fe9db608620ad7b1ee146f70bce
Precision:          single
Memory model:       64 bit
MPI library:        thread_mpi
OpenMP support:     enabled
GPU support:        enabled
invsqrt routine:    gmx_software_invsqrt(x)
CPU acceleration:   AVX_128_FMA
FFT library:        fftw-3.3.2-sse2
Large file support: enabled
RDTSCP usage:       enabled
Built on:           Fri Jan 24 13:51:01 CET 2014
Built by:           pszilard@tcbs22 [CMAKE]
Build OS/arch:      Linux 3.5.0-45-generic x86_64
Build CPU vendor:   AuthenticAMD
Build CPU brand:    AMD Opteron(tm) Processor 6376
Build CPU family:   21   Model: 2   Stepping: 0
Build CPU features: aes apic avx clfsh cmov cx8 cx16 f16c fma fma4 htt lahf_lm misalignsse mmx msr nonstop_tsc pclmuldq pdpe1gb popcnt pse rdtscp sse2 sse3 sse4a sse4.1 sse4.2 ssse3 xop
C compiler:         /usr/bin/gcc-4.8 GNU gcc-4.8 (Ubuntu 4.8.1-2ubuntu1~12.04) 4.8.1
C compiler flags:   -mxop -mfma4 -mavx    -Wextra -Wno-missing-field-initializers -Wno-sign-compare -Wall -Wno-unused -Wunused-value -Wno-unused-parameter -Wno-array-bounds -Wno-maybe-uninitialized -Wno-strict-overflow   -fomit-frame-pointer -funroll-all-loops -fexcess-precision=fast  -O3 -DNDEBUG -march=bdver2
C++ compiler:       /usr/bin/g++-4.8 GNU g++-4.8 (Ubuntu 4.8.1-2ubuntu1~12.04) 4.8.1
C++ compiler flags: -mxop -mfma4 -mavx   -Wextra -Wno-missing-field-initializers -Wno-sign-compare -Wall -Wno-unused -Wunused-value -Wno-unused-parameter -Wno-array-bounds -Wno-maybe-uninitialized -Wno-strict-overflow   -fomit-frame-pointer -funroll-all-loops -fexcess-precision=fast  -O3 -DNDEBUG
CUDA compiler:      /opt/cuda-5.5.22/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2013 NVIDIA Corporation;Built on Wed_Jul_17_18:36:13_PDT_2013;Cuda compilation tools, release 5.5, V5.5.0
CUDA compiler flags:-gencode;arch=compute_20,code=sm_20;-gencode;arch=compute_20,code=sm_21;-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_35,code=compute_35;-use_fast_math;-ccbin=/usr/bin/gcc-4.8;-Xcompiler;-fPIC ; -mxop;-mfma4;-mavx;-Wextra;-Wno-missing-field-initializers;-Wno-sign-compare;-Wall;-Wno-unused;-Wunused-value;-Wno-unused-parameter;-Wno-array-bounds;-Wno-maybe-uninitialized;-Wno-strict-overflow;-fomit-frame-pointer;-funroll-all-loops;-fexcess-precision=fast;-O3;-DNDEBUG
CUDA driver:        7.0
CUDA runtime:       5.50


                         :-)  G  R  O  M  A  C  S  (-:

                  Green Red Orange Magenta Azure Cyan Skyblue

                  :-)  VERSION 4.6.8-dev-20140830-0336ab2  (-:

        Contributions from Mark Abraham, Emile Apol, Rossen Apostolov,
           Herman J.C. Berendsen, Aldert van Buuren, Pär Bjelkmar,
     Rudi van Drunen, Anton Feenstra, Gerrit Groenhof, Christoph Junghans,
        Peter Kasson, Carsten Kutzner, Per Larsson, Pieter Meulenhoff,
           Teemu Murtola, Szilard Pall, Sander Pronk, Roland Schulz,
                Michael Shirts, Alfons Sijbers, Peter Tieleman,

               Berk Hess, David van der Spoel, and Erik Lindahl.

       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
         Copyright (c) 2001-2012,2013, The GROMACS development team at
        Uppsala University & The Royal Institute of Technology, Sweden.
            check out http://www.gromacs.org for more information.

         This program is free software; you can redistribute it and/or
       modify it under the terms of the GNU Lesser General Public License
        as published by the Free Software Foundation; either version 2.1
             of the License, or (at your option) any later version.

        :-)  /nethome/pszilard-projects/gromacs/gromacs-4.6/build_gcc48_pd_cuda5522/src/kernel/mdrun  (-:


++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
B. Hess and C. Kutzner and D. van der Spoel and E. Lindahl
GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable
molecular simulation
J. Chem. Theory Comput. 4 (2008) pp. 435-447
-------- -------- --- Thank You --- -------- --------


++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark and H. J. C.
Berendsen
GROMACS: Fast, Flexible and Free
J. Comp. Chem. 26 (2005) pp. 1701-1719
-------- -------- --- Thank You --- -------- --------


++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
E. Lindahl and B. Hess and D. van der Spoel
GROMACS 3.0: A package for molecular simulation and trajectory analysis
J. Mol. Mod. 7 (2001) pp. 306-317
-------- -------- --- Thank You --- -------- --------


++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
H. J. C. Berendsen, D. van der Spoel and R. van Drunen
GROMACS: A message-passing parallel molecular dynamics implementation
Comp. Phys. Comm. 91 (1995) pp. 43-56
-------- -------- --- Thank You --- -------- --------


Changing rlist from 0.945 to 0.937 for non-bonded 4x4 atom kernels
    
Input Parameters:
   integrator           = md
   nsteps               = 10000
   init-step            = 0
   cutoff-scheme        = Verlet
   ns_type              = Grid
   nstlist              = 10
   ndelta               = 2
   nstcomm              = 100
   comm-mode            = Linear
   nstlog               = 0
   nstxout              = 0
   nstvout              = 0
   nstfout              = 0
   nstcalcenergy        = 100
   nstenergy            = 500
   nstxtcout            = 0
   init-t               = 0
   delta-t              = 0.005
   xtcprec              = 1000
   fourierspacing       = 0.1125
   nkx                  = 56
   nky                  = 56
   nkz                  = 56
   pme-order            = 4
   ewald-rtol           = 1e-05
   ewald-geometry       = 0
   epsilon-surface      = 0
   optimize-fft         = FALSE
   ePBC                 = xyz
   bPeriodicMols        = FALSE
   bContinuation        = FALSE
   bShakeSOR            = FALSE
   etc                  = V-rescale
   bPrintNHChains       = FALSE
   nsttcouple           = 10
   epc                  = No
   epctype              = Isotropic
   nstpcouple           = -1
   tau-p                = 1
   ref-p (3x3):
      ref-p[    0]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
      ref-p[    1]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
      ref-p[    2]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
   compress (3x3):
      compress[    0]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
      compress[    1]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
      compress[    2]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
   refcoord-scaling     = No
   posres-com (3):
      posres-com[0]= 0.00000e+00
      posres-com[1]= 0.00000e+00
      posres-com[2]= 0.00000e+00
   posres-comB (3):
      posres-comB[0]= 0.00000e+00
      posres-comB[1]= 0.00000e+00
      posres-comB[2]= 0.00000e+00
   verlet-buffer-drift  = 0.005
   rlist                = 0.937
   rlistlong            = 0.937
   nstcalclr            = 10
   rtpi                 = 0.05
   coulombtype          = PME
   coulomb-modifier     = Potential-shift
   rcoulomb-switch      = 0
   rcoulomb             = 0.9
   vdwtype              = Cut-off
   vdw-modifier         = Potential-shift
   rvdw-switch          = 0
   rvdw                 = 0.9
   epsilon-r            = 1
   epsilon-rf           = inf
   tabext               = 1
   implicit-solvent     = No
   gb-algorithm         = Still
   gb-epsilon-solvent   = 80
   nstgbradii           = 1
   rgbradii             = 1
   gb-saltconc          = 0
   gb-obc-alpha         = 1
   gb-obc-beta          = 0.8
   gb-obc-gamma         = 4.85
   gb-dielectric-offset = 0.009
   sa-algorithm         = Ace-approximation
   sa-surface-tension   = 2.05016
   DispCorr             = No
   bSimTemp             = FALSE
   free-energy          = no
   nwall                = 0
   wall-type            = 9-3
   wall-atomtype[0]     = -1
   wall-atomtype[1]     = -1
   wall-density[0]      = 0
   wall-density[1]      = 0
   wall-ewald-zfac      = 3
   pull                 = no
   rotation             = FALSE
   disre                = No
   disre-weighting      = Conservative
   disre-mixed          = FALSE
   dr-fc                = 1000
   dr-tau               = 0
   nstdisreout          = 100
   orires-fc            = 0
   orires-tau           = 0
   nstorireout          = 100
   dihre-fc             = 0
   em-stepsize          = 0.01
   em-tol               = 10
   niter                = 20
   fc-stepsize          = 0
   nstcgsteep           = 1000
   nbfgscorr            = 10
   ConstAlg             = Lincs
   shake-tol            = 0.0001
   lincs-order          = 6
   lincs-warnangle      = 30
   lincs-iter           = 1
   bd-fric              = 0
   ld-seed              = 1993
   cos-accel            = 0
   deform (3x3):
      deform[    0]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
      deform[    1]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
      deform[    2]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
   adress               = FALSE
   userint1             = 0
   userint2             = 0
   userint3             = 0
   userint4             = 0
   userreal1            = 0
   userreal2            = 0
   userreal3            = 0
   userreal4            = 0
grpopts:
   nrdf:       31693
   ref-t:         300
   tau-t:         0.1
anneal:          No
ann-npoints:           0
   acc:	           0           0           0
   nfreeze:           N           N           N
   energygrp-flags[  0]: 0
   efield-x:
      n = 0
   efield-xt:
      n = 0
   efield-y:
      n = 0
   efield-yt:
      n = 0
   efield-z:
      n = 0
   efield-zt:
      n = 0
   bQMMM                = FALSE
   QMconstraints        = 0
   QMMMscheme           = 0
   scalefactor          = 1
qm-opts:
   ngQM                 = 0
    
Overriding nsteps with value passed on the command line: 10000 steps, 50.000 ps

Using 1 MPI thread
Using 16 OpenMP threads

Detecting CPU-specific acceleration.
Present hardware specification:
Vendor: AuthenticAMD
Brand:  AMD Opteron(tm) Processor 6376
Family: 21  Model:  2  Stepping:  0
Features: aes apic avx clfsh cmov cx8 cx16 f16c fma fma4 htt lahf_lm misalignsse mmx msr nonstop_tsc pclmuldq pdpe1gb popcnt pse rdtscp sse2 sse3 sse4a sse4.1 sse4.2 ssse3 xop
Acceleration most likely to fit this hardware: AVX_128_FMA
Acceleration selected at GROMACS compile time: AVX_128_FMA

Will do PME sum in reciprocal space.

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen
A smooth particle mesh Ewald method
J. Chem. Phys. 103 (1995) pp. 8577-8592
-------- -------- --- Thank You --- -------- --------

Will do ordinary reciprocal space Ewald sum.
Using a Gaussian width (1/beta) of 0.288146 nm for Ewald
Cut-off's:   NS: 0.937   Coulomb: 0.9   LJ: 0.9
System total charge: 0.000
Generated table with 968 data points for Ewald.
Tabscale = 500 points/nm
Generated table with 968 data points for LJ6.
Tabscale = 500 points/nm
Generated table with 968 data points for LJ12.
Tabscale = 500 points/nm
Generated table with 968 data points for 1-4 COUL.
Tabscale = 500 points/nm
Generated table with 968 data points for 1-4 LJ6.
Tabscale = 500 points/nm
Generated table with 968 data points for 1-4 LJ12.
Tabscale = 500 points/nm
    
Using AVX-128-FMA 4x4 non-bonded kernels

Using Lorentz-Berthelot Lennard-Jones combination rule

Potential shift: LJ r^-12: 3.541 r^-6 1.882, Ewald 1.000e-05
Initialized non-bonded Ewald correction tables, spacing: 6.26e-04 size: 3096

Removing pbc first time
Pinning threads with an auto-selected logical core stride of 1

Initializing LINear Constraint Solver

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
B. Hess
P-LINCS: A Parallel Linear Constraint Solver for molecular simulation
J. Chem. Theory Comput. 4 (2008) pp. 116-122
-------- -------- --- Thank You --- -------- --------

The number of constraints is 1220
249 constraints are involved in constraint triangles,
will apply an additional matrix expansion of order 6 for couplings
between constraints inside triangles

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
S. Miyamoto and P. A. Kollman
SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for Rigid
Water Models
J. Comp. Chem. 13 (1992) pp. 952-962
-------- -------- --- Thank You --- -------- --------

Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
  0:  rest

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
G. Bussi, D. Donadio and M. Parrinello
Canonical sampling through velocity rescaling
J. Chem. Phys. 126 (2007) pp. 014101
-------- -------- --- Thank You --- -------- --------

There are: 15900 Atoms
There are: 1048 VSites

Constraining the starting coordinates (step 0)

Constraining the coordinates at t0-dt (step 0)
RMS relative constraint deviation after constraining: 2.85e-05
Initial temperature: 300.134 K
    
Started mdrun on node 0 Thu Jan  1 01:00:00 1970
           Step           Time         Lambda
              0        0.00000        0.00000

   Energies (kJ/mol)
          Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    2.61927e+03    5.31560e+03    1.71086e+02    2.16343e+03    1.68043e+04
        LJ (SR)   Coulomb (SR)   Coul. recip.      Potential    Kinetic En.
    2.70096e+04   -2.75073e+05    2.54483e+03   -2.18445e+05    3.94954e+04
   Total Energy  Conserved En.    Temperature Pressure (bar)   Constr. rmsd
   -1.78949e+05   -1.78949e+05    2.99762e+02    2.74516e+02    4.92996e-05


step 5000: resetting all time and cycle counters

Restarted time on node 0 Mon Apr 27 16:27:34 2015
           Step           Time         Lambda
          10000       50.00000        0.00000

   Energies (kJ/mol)
          Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    2.38556e+03    5.36417e+03    1.66238e+02    2.10951e+03    1.70832e+04
        LJ (SR)   Coulomb (SR)   Coul. recip.      Potential    Kinetic En.
    2.67052e+04   -2.75658e+05    2.41930e+03   -2.19425e+05    3.96650e+04
   Total Energy  Conserved En.    Temperature Pressure (bar)   Constr. rmsd
   -1.79760e+05   -1.79472e+05    3.01049e+02   -1.29138e+01    5.14728e-05

	<======  ###############  ==>
	<====  A V E R A G E S  ====>
	<==  ###############  ======>

	Statistics over 10001 steps using 101 frames

   Energies (kJ/mol)
          Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    2.48686e+03    5.32844e+03    1.81928e+02    2.08913e+03    1.68984e+04
        LJ (SR)   Coulomb (SR)   Coul. recip.      Potential    Kinetic En.
    2.73909e+04   -2.76083e+05    2.47858e+03   -2.19228e+05    3.94975e+04
   Total Energy  Conserved En.    Temperature Pressure (bar)   Constr. rmsd
   -1.79731e+05   -1.79216e+05    2.99778e+02    6.27593e+01    0.00000e+00
    
   Total Virial (kJ/mol)
    1.26901e+04   -5.89796e+01    2.29335e+02
   -5.88232e+01    1.28908e+04    7.17809e+00
    2.29363e+02    7.12782e+00    1.29608e+04

   Pressure (bar)
    1.00487e+02    1.03805e+01   -4.54460e+01
    1.03497e+01    4.88689e+01   -6.12600e+00
   -4.54515e+01   -6.11610e+00    3.89217e+01
    
    
	M E G A - F L O P S   A C C O U N T I N G

 NB=Group-cutoff nonbonded kernels    NxN=N-by-N cluster Verlet kernels
 RF=Reaction-Field  VdW=Van der Waals  QSTab=quadratic-spline table
 W3=SPC/TIP3p  W4=TIP4p (single or pairs)
 V&F=Potential and force  V=Potential only  F=Force only

 Computing:                               M-Number         M-Flops  % Flops
-----------------------------------------------------------------------------
 Pair Search distance check            2365.602464       21290.422     1.0
 NxN Ewald Elec. + VdW [F]            16078.662680     1061191.737    48.8
 NxN Ewald Elec. + VdW [V&F]            165.557200       17714.620     0.8
 NxN Ewald Elec. [F]                  12149.214952      741102.112    34.1
 NxN Ewald Elec. [V&F]                  125.061312       10505.150     0.5
 1,4 nonbonded interactions              26.710341        2403.931     0.1
 Calc Weights                           254.270844        9153.750     0.4
 Spread Q Bspline                      5424.444672       10848.889     0.5
 Gather F Bspline                      5424.444672       32546.668     1.5
 3D-FFT                               30602.049186      244816.393    11.3
 Solve PME                               15.683136        1003.721     0.0
 Shift-X                                  8.490948          50.946     0.0
 Angles                                  15.973194        2683.497     0.1
 Propers                                 25.545108        5849.830     0.3
 Impropers                                2.110422         438.968     0.0
 Virial                                   0.866643          15.600     0.0
 Stop-CM                                  0.864348           8.643     0.0
 Calc-Ekin                               16.964948         458.054     0.0
 Lincs                                    6.101220         366.073     0.0
 Lincs-Mat                              162.492492         649.970     0.0
 Constraint-V                            86.137224         689.098     0.0
 Constraint-Vir                           0.816204          19.589     0.0
 Settle                                  24.644928        7960.312     0.4
 Virtual Site 3                           0.666864          24.674     0.0
 Virtual Site 3fd                         0.944724          89.749     0.0
 Virtual Site 3fad                        0.383952          67.576     0.0
 Virtual Site 3out                        2.556312         222.399     0.0
 Virtual Site 4fdn                        0.742644         188.632     0.0
-----------------------------------------------------------------------------
 Total                                                 2172361.001   100.0
-----------------------------------------------------------------------------
    
    
     R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G

 Computing:         Nodes   Th.     Count  Wall t (s)     G-Cycles       %
-----------------------------------------------------------------------------
 Vsite constr.          1   16       5001       0.156        5.749     0.5
 Neighbor search        1   16        501       2.144       78.904     7.0
 Force                  1   16       5001      17.887      658.299    58.5
 PME mesh               1   16       5001       7.265      267.380    23.8
 NB X/F buffer ops.     1   16       9501       0.977       35.943     3.2
 Vsite spread           1   16       5052       0.172        6.338     0.6
 Update                 1   16       5001       0.158        5.818     0.5
 Constraints            1   16       5001       1.538       56.596     5.0
 Rest                   1                       0.267        9.830     0.9
-----------------------------------------------------------------------------
 Total                  1                      30.564     1124.857   100.0
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
 PME spread/gather      1   16      10002       3.782      139.179    12.4
 PME 3D-FFT             1   16      10002       3.207      118.035    10.5
 PME solve              1   16       5001       0.249        9.163     0.8
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
 NS grid local          1   16        501       0.132        4.870     0.4
 NS search local        1   16        501       1.919       70.633     6.3
 Bonded F               1   16       5001       3.456      127.186    11.3
 Nonbonded F            1   16       5001      14.151      520.780    46.3
 NB X buffer ops.       1   16       4500       0.254        9.358     0.8
 NB F buffer ops.       1   16       5001       0.721       26.518     2.4
-----------------------------------------------------------------------------
    
               Core t (s)   Wall t (s)        (%)
       Time:      488.250       30.564     1597.4
                 (ns/day)    (hour/ns)
Performance:       70.684        0.340
Finished mdrun on node 0 Mon Apr 27 16:28:04 2015
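
Note on the performance figures: the ns/day and hour/ns values above can be cross-checked against the other numbers in this log. The sketch below is only an illustration, not mdrun's own code. It assumes that the timed interval covers the 5001 steps counted in the cycle table (the counters were reset at step 5000) and that throughput is simply simulated time divided by wall time. The function names are hypothetical, and the constants are copied by hand from the log.

```python
# Values copied by hand from the log above.
WALL_TIME_S = 30.564     # "Wall t (s)" in the Time row
CORE_TIME_S = 488.250    # "Core t (s)" in the Time row
TIMED_STEPS = 5001       # "Count" for Force; assumed to be the timed step count
DELTA_T_PS = 0.005       # delta-t from the input parameters
OPENMP_THREADS = 16      # "Using 16 OpenMP threads"


def ns_per_day(steps, dt_ps, wall_s):
    """Simulated nanoseconds per day of wall-clock time (assumed formula)."""
    simulated_ns = steps * dt_ps / 1000.0
    return simulated_ns * 86400.0 / wall_s


def hours_per_ns(rate_ns_per_day):
    """Wall-clock hours needed to simulate one nanosecond."""
    return 24.0 / rate_ns_per_day


def thread_utilisation(core_s, wall_s, threads):
    """Fraction of the available thread time that was spent busy."""
    return core_s / (wall_s * threads)


rate = ns_per_day(TIMED_STEPS, DELTA_T_PS, WALL_TIME_S)
print(f"ns/day:      {rate:.3f}  (log reports 70.684)")
print(f"hour/ns:     {hours_per_ns(rate):.3f}  (log reports 0.340)")
print(f"core/wall:   {100.0 * CORE_TIME_S / WALL_TIME_S:.1f} %  (log reports 1597.4)")
print(f"utilisation: {thread_utilisation(CORE_TIME_S, WALL_TIME_S, OPENMP_THREADS):.3f}")
```

Small differences in the last digit are expected, because the wall time printed in the log is rounded to three decimals.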