Bug #2085

mdrun crashes with segmentation fault if started with more than 32 OpenMP threads.

Added by Jiri Kraus almost 3 years ago. Updated almost 3 years ago.

Status: Closed
Priority: Normal
Assignee: Berk Hess
Category: mdrun
Target version: 2016.2
Affected version - extra info:
Affected version:
Difficulty: uncategorized

Description

I built GROMACS 2016.1 with CUDA 8 and GCC 5.4.0 using the following configure command:

CC=gcc CXX=g++ cmake ../gromacs-2016.1 -DGMX_GPU=ON -DGMX_BUILD_OWN_FFTW=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=...

With more than 32 OpenMP threads mdrun crashes on water/0000.96:

[0000.96]$ gmx mdrun -pin on -ntmpi 1 -ntomp 33 -nb gpu -noconfout -resethway -v -maxh 0.08333 -nsteps 100000
                      :-) GROMACS - gmx mdrun, 2016.1 (-:
[...] snip
starting mdrun 'Water'
100000 steps,    200.0 ps.
Segmentation fault

With 32 threads it runs:

[0000.96]$ gmx mdrun -pin on -ntmpi 1 -ntomp 32 -nb gpu -noconfout -resethway -v -maxh 0.08333 -nsteps 100000
                      :-) GROMACS - gmx mdrun, 2016.1 (-:
[...] snip
               Core t (s)   Wall t (s)        (%)
       Time:      437.159       13.661     3200.0
                 (ns/day)    (hour/ns)
Performance:      632.459        0.038

GROMACS reminds you: "Hangout In the Suburbs If You've Got the Guts" (Urban Dance Squad)

[0000.96]$

Even if it does not make sense to use that many OpenMP threads, mdrun should not crash.

Associated revisions

Revision c0a90b8b (diff)
Added by Berk Hess almost 3 years ago

Add bonded #thread runtime check

Replaced a debug assertion on the number of OpenMP threads not being
larger than GMX_OPENMP_MAX_THREADS with a fatal error.
But since the listed forces reduction is actually not required without
listed forces, the check and reduction are now conditional and mdrun can
run with more than GMX_OPENMP_MAX_THREADS threads in that case.

Fixes #2085.

Change-Id: I7a6049d727924cd0b4df10a3525f9e7aec49c3dc

History

#1 Updated by Szilárd Páll almost 3 years ago

Indeed, it should not crash. By design, GMX_OPENMP_MAX_THREADS needs to be set at compile time to be able to use more than 32 threads per rank (it sets the number of bits used by the masked sparse reduction in the bondeds). I'm not sure whether the check broke recently or something else is wrong. (IIRC this worked the last time I tested on Power 8 a few weeks ago.)
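To illustrate why a bitmask-based reduction ties the thread count to a compile-time constant, here is a minimal sketch. It is not the GROMACS implementation: `kMaxThreads`, `ThreadMask`, `fitsInMask`, `markThread` and `reduceBlock` are hypothetical names, and the 32-bit mask is an assumption chosen to match the default limit mentioned above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for GMX_OPENMP_MAX_THREADS (assumed default of 32).
constexpr int kMaxThreads = 32;

// One bit per thread: bit t is set when thread t contributed to a block.
using ThreadMask = std::uint32_t;
static_assert(sizeof(ThreadMask) * 8 >= kMaxThreads, "mask too narrow for kMaxThreads");

// A thread index is only representable if it has a bit in the mask.
bool fitsInMask(int thread)
{
    return thread >= 0 && thread < kMaxThreads;
}

// Record that `thread` wrote into the block described by `mask`.
void markThread(ThreadMask& mask, int thread)
{
    // Debug-only guard: with NDEBUG defined this check is compiled out.
    assert(fitsInMask(thread));
    mask |= (ThreadMask{1} << thread);
}

// Sum the per-thread values for one block, visiting only flagged threads.
double reduceBlock(ThreadMask mask, const std::vector<double>& perThreadValue)
{
    double sum = 0.0;
    const int n = static_cast<int>(perThreadValue.size());
    for (int t = 0; t < kMaxThreads && t < n; ++t)
    {
        if (mask & (ThreadMask{1} << t))
        {
            sum += perThreadValue[t];
        }
    }
    return sum;
}
```

In this sketch a 33rd thread (index 32) has no bit to set, so the guard in `markThread` is the only thing standing between the caller and undefined behaviour, and that guard disappears when assertions are disabled.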

#2 Updated by Berk Hess almost 3 years ago

I can't reproduce this with GMX_OPENMP_MAX_THREADS=64 on x86.
Could you run a debugger to see where it segfaults?

#3 Updated by Jiri Kraus almost 3 years ago

Berk Hess wrote:

I can't reproduce this with GMX_OPENMP_MAX_THREADS=64 on x86.
Could you run a debugger to see where it segfaults?

As Szilard said in comment #1, I would say it's expected that the error does not reproduce with GMX_OPENMP_MAX_THREADS=64. I did not file this issue because I expected mdrun to work with more than 32 threads when GMX_OPENMP_MAX_THREADS is left unchanged. But if the number of OpenMP threads requested on the command line exceeds GMX_OPENMP_MAX_THREADS, mdrun should either exit with an understandable error message or cap the number of OpenMP threads to GMX_OPENMP_MAX_THREADS and print a warning. Does that make sense?
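The two behaviours proposed above could look roughly like the following sketch. The names `kCompiledMaxThreads`, `requireThreadCount` and `capThreadCount` are hypothetical and the message wording is invented for illustration; this is not what mdrun actually prints.

```cpp
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the compile-time GMX_OPENMP_MAX_THREADS value.
constexpr int kCompiledMaxThreads = 32;

// Option 1: refuse to run and explain why.
int requireThreadCount(int requested)
{
    if (requested > kCompiledMaxThreads)
    {
        std::ostringstream msg;
        msg << "Requested " << requested << " OpenMP threads, but this build supports at most "
            << kCompiledMaxThreads << ". Request fewer threads or rebuild with a larger limit.";
        throw std::runtime_error(msg.str());
    }
    return requested;
}

// Option 2: cap the thread count and report the change through `log`.
int capThreadCount(int requested, std::ostream& log)
{
    if (requested > kCompiledMaxThreads)
    {
        log << "Warning: reducing the number of OpenMP threads from " << requested << " to "
            << kCompiledMaxThreads << ", the limit this build was compiled with.\n";
        return kCompiledMaxThreads;
    }
    return requested;
}
```

Either check runs in every build type, unlike a debug assertion, which is usually compiled out of a Release build such as the one described in this report.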

#4 Updated by Berk Hess almost 3 years ago

  • Status changed from New to Accepted
  • Assignee set to Berk Hess

Ah, I misunderstood Szilard's comment.
Only some parts of the code that use OpenMP have a thread count limitation. I'll check where checks are missing.

#5 Updated by Gerrit Code Review Bot almost 3 years ago

Gerrit received a related patchset '1' for Issue #2085.
Uploader: Berk Hess
Change-Id: gromacs~release-2016~I7a6049d727924cd0b4df10a3525f9e7aec49c3dc
Gerrit URL: https://gerrit.gromacs.org/6365

#6 Updated by Berk Hess almost 3 years ago

  • Status changed from Accepted to Fix uploaded
  • Target version set to 2016.2

I uploaded a fix that not only adds the check, but also skips the OpenMP bonded thread reduction for cases without bondeds. So now the water system can run on 33 threads with GMX_OPENMP_MAX_THREADS=32.
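As a rough sketch of that logic (not the code in the actual patch): the thread limit is only enforced when there is something to reduce. `kReductionMaxThreads` and `needsBondedReduction` are hypothetical names, and the error text is made up for illustration.

```cpp
#include <stdexcept>

// Hypothetical stand-in for GMX_OPENMP_MAX_THREADS.
constexpr int kReductionMaxThreads = 32;

// Decide whether the per-thread bonded force reduction has to run.
// Throws only when a reduction is needed and the thread count exceeds
// what the reduction data structure can represent.
bool needsBondedReduction(int numBondedInteractions, int numThreads)
{
    if (numBondedInteractions == 0)
    {
        // No bondeds: nothing to reduce, so the thread limit does not apply.
        return false;
    }
    if (numThreads > kReductionMaxThreads)
    {
        throw std::runtime_error(
                "Too many OpenMP threads for the bonded force reduction in this build");
    }
    return true;
}
```

With this shape, a system without bondeds, such as the water box from the report, passes with 33 threads, while a system with bondeds stops with an error instead of a segmentation fault.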

#7 Updated by Berk Hess almost 3 years ago

  • Status changed from Fix uploaded to Resolved

#8 Updated by Mark Abraham almost 3 years ago

  • Status changed from Resolved to Closed
