Bug #296

Infinite loop in nb_kernel100_ia64_single subroutine

Added by Dallas empty over 10 years ago. Updated about 9 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Erik Lindahl
Category:
mdrun
Target version:
Affected version - extra info:
Affected version:
Difficulty:
uncategorized

Description

Created an attachment (id=353)
File used on the computer where the run goes into an infinite loop.

GROMACS 4.0.3 on the SC was compiled with intel-cc v.11.0.069 with fftw v.2.1.5 on itanium ia64.

The program works fine with the generic kernel gmx_nb_generic_kernel but goes into an infinite loop inside nb_kernel110_ia64_single.

The double-precision version behaves exactly the same way: fine with the generic kernel, infinite loop in nb_kernel110_ia64_double.

Installation details for the double-precision, single-processor version:

setenv CC icc
setenv CFLAGS "-O3 -ip -ftz"
setenv CPPFLAGS "-I$FFTWDIR/include"
setenv CXX icpc
setenv MPI icc
setenv F77 ifort
setenv FFLAGS "-O3 -ip -ftz"
setenv LD ifort
setenv LDFLAGS "-L$FFTWDIR/lib"

./configure --prefix=/opt/gromacs/4.0.3 --exec-prefix=/opt/gromacs/4.0.3 --disable-nice --disable-float --program-suffix=_d --with-fft=fftw2

gmake
gmake install

My md.mdp file for these runs is (yes, I need to update a few things in this, but want to make some comparisons between versions etc first):

; Run Control
integrator = md ; simulation algorithm
dt = 0.005 ; time step (ps)
nsteps = 20000 ; # steps
comm_mode = linear ; c.o.m. motion reset
nstcomm = 1 ; steps for c.o.m. motion reset
comm_grps = System ; groups for c.o.m. removal
;
; Output Control
nstxout = 5000 ; write coordinates to .trr
nstvout = 5000 ; write velocities to .trr
nstlog = 800 ; write energies to .log
nstenergy = 400 ; write energies to .edr
nstxtcout = 800 ; write coordinates to .xtc
;
; Neighbour Searching
nstlist = 10 ; update neighbour list
ns_type = grid ; neighbour list method
pbc = xyz ; periodic boundary conditions
rlist = 1.0 ; cut-off for short-range neighbour (nm)
;
; Electrostatics and VdW
coulombtype = cut-off ; type of coulomb interaction
rcoulomb = 1.5 ; cut-off distance for coulomb (nm)
epsilon_r = 1 ; dielectric constant
vdwtype = cut-off ; type of VdW interaction
rvdw = 1.0 ; cut-off for vdw (nm)
DispCorr = EnerPres ; long range correction
;
; Temperature Coupling
Tcoupl = berendsen ; type of temperature coupling
tc-grps = System ; coupled groups
tau_t = .05 ; T-coupling time constant (ps)
ref_t = 310 ; reference temperature (K)
;
; Pressure Coupling
Pcoupl = berendsen ; type of pressure coupling
Pcoupltype = anisotropic ; pressure coupling geometry
tau_p = 4.0 4.0 4.0 4.0 4.0 4.0 ; p-coupling time constant (ps)
compressibility = 4.5e-5 4.5e-5 4.5e-5 0 0 0 ; compressibility
ref_p = 1.0 1.0 1.0 0 0 0 ; reference pressure (bar)
;
; Velocity Generation
gen_vel = no ; generate initial velocities
;
; Bonds
constraints = all-bonds ; which bonds to constrain
constraint_algorithm = lincs ; algorithm to use
lincs_order = 8 ; highest order matrix expansion

Attached is the .tpr file used to run this job.

md_0.tpr (1.19 MB) File used on the computer where the run goes into an infinite loop. Dallas empty, 02/13/2009 12:05 AM

History

#1 Updated by Erik Lindahl over 10 years ago

Hi Dallas,

We've had issues on SGIs with these loops, which is kind of strange since we haven't changed a line in them since Itanium 2 was released and they worked great.

We've scheduled an overview of all the assembly loops, but in the meantime I would recommend

1) Disable ia64 assembly
2) Compile with Fortran support

If you have a recent version of the Intel compiler this will reportedly give you very good performance, and on some hardware it can even beat the assembly loops. The reason is that ia64 is really sensitive to instruction scheduling, and since it's somewhat of a niche platform nowadays we haven't updated the assembly kernels for the latest instruction latencies (that is unfortunately a MAJOR pain to do on ia64, since the entire loop has to be rewritten).
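
For anyone applying recommendations 1) and 2) to the build from the report, a rough sketch of the changed configure line follows. It is a dry run that only prints the command. It is written for POSIX sh rather than the csh used in the report, and it assumes the same compiler environment (CC=icc, F77=ifort, and so on) is already set. The two extra flag names, --disable-ia64-asm and --enable-fortran, are assumptions recalled from the GROMACS 4.0 autoconf build and are not confirmed in this thread; please check them against ./configure --help first.

```shell
#!/bin/sh
# Dry-run sketch: assemble a configure line for a rebuild without the ia64
# assembly kernels. Nothing is executed; the command is only printed.

PREFIX=/opt/gromacs/4.0.3   # install prefix taken from the original report

# Options copied from the reporter's double-precision build.
BASE_OPTS="--prefix=$PREFIX --exec-prefix=$PREFIX --disable-nice --disable-float --program-suffix=_d --with-fft=fftw2"

# ASSUMPTION: these two flag names are recalled from the GROMACS 4.0
# autoconf build; confirm them with `./configure --help` before use.
WORKAROUND_OPTS="--disable-ia64-asm --enable-fortran"

CONFIGURE_CMD="./configure $BASE_OPTS $WORKAROUND_OPTS"

# Print instead of running, so the line can be reviewed first.
echo "$CONFIGURE_CMD"
```

If the printed line looks right, it can be run from the source directory, followed by the same gmake and gmake install steps as in the report.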

#2 Updated by Rossen Apostolov over 9 years ago

Hi Erik,

(In reply to comment #1)

Is that issue resolved or shall we recommend/enforce/require 1) and 2) in 4.5?

#3 Updated by Rossen Apostolov about 9 years ago

Closing the bug as WONTFIX. The assembly loops for ia64 will be deprecated in future releases.
