Bug #1762

arithmetic exception in pair search with GPUs

Added by Szilárd Páll over 4 years ago. Updated over 4 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: core library
Target version:
Affected version - extra info: 4372f13dd72303b63e71850493246227dac177fd
Affected version:
Difficulty: uncategorized

Description

After anywhere from tens to hundreds of thousands of steps, I get an arithmetic exception:

Program terminated with signal 8, Arithmetic exception.
#0  0x0000000000839764 in nbnxn_make_pairlist._omp_fn.4 () at /nethome/pszilard/projects/gromacs/gromacs-master/src/gromacs/mdlib/nbnxn_search.c:2795
2795                excl[w]->pair[(ej & (NBNXN_GPU_JGROUP_SIZE-1))*nbl->na_ci + ei] &=
(gdb) bt
#0  0x0000000000839764 in nbnxn_make_pairlist._omp_fn.4 () at /nethome/pszilard/projects/gromacs/gromacs-master/src/gromacs/mdlib/nbnxn_search.c:2795
#1  0x00007f24e80ff0f0 in ?? ()
#2  0x00007f24e81e2ea8 in ?? ()
#3  0x00007f24f4b3f09e in ?? ()
#4  0x0000000000000000 in ?? ()

Attached is the input, run with:

gmx mdrun -quiet -v -ntmpi 8 -ntomp 4 -gpu_id 00001111 -nsteps -1 -pforce 15000

Associated revisions

Revision ba4c59d2 (diff)
Added by Berk Hess over 4 years ago

Fix too small GPU pair count estimates

For triclinic unit-cells with DD the non-local cluster pair count
estimate was too high, especially for thin local domains, due to an
incorrect estimate of the cluster size. Since the pair count estimate
for the local pair-list was determined as a total minus a non-local
estimate, the local estimate could get negative and cause exceptions.
Fixed the cluster size estimate and added a lower limit for the local
size estimate.

Fixes #1762.

Change-Id: I3489550968f66bc03ba4e6056017a58eba37f7cc
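
The failure mode described in the commit message can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the code from nbnxn_search.c: the function name local_pair_estimate, its parameters, and the choice of lower limit are all hypothetical. It shows why a local estimate computed as "total minus non-local" can turn negative when the non-local estimate is too high, and how a lower limit guards against that.

```c
#include <assert.h>

/* Hypothetical helper for illustration only; this is NOT the GROMACS
 * implementation. All arguments are cluster pair count estimates. */
static int local_pair_estimate(int total_estimate,
                               int nonlocal_estimate,
                               int min_local_estimate)
{
    /* The lower limit must be positive to be a useful guard. */
    assert(min_local_estimate > 0);

    /* Local estimate as a difference: if the non-local estimate is
     * overestimated (e.g. for thin local domains), this goes to zero
     * or negative. */
    int local = total_estimate - nonlocal_estimate;

    /* Clamp to the lower limit so that later code never sees a
     * non-positive pair count estimate. */
    return (local < min_local_estimate) ? min_local_estimate : local;
}
```

With a lower limit of 1, a total of 1000 and a non-local estimate of 400 yields 600 unchanged, while a non-local estimate of 1500 yields 1 instead of -500.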

History

#1 Updated by Gerrit Code Review Bot over 4 years ago

Gerrit received a related patchset '1' for Issue #1762.
Uploader: Berk Hess
Change-Id: I3489550968f66bc03ba4e6056017a58eba37f7cc
Gerrit URL: https://gerrit.gromacs.org/4824

#2 Updated by Mark Abraham over 4 years ago

  • Status changed from New to Resolved

#3 Updated by Mark Abraham over 4 years ago

  • Status changed from Resolved to Closed
