Bug #970

segfault in tip4pflex test case

Added by Christoph Junghans about 7 years ago. Updated almost 7 years ago.

Status:
Closed
Priority:
High
Assignee:
Category:
mdrun
Target version:
Affected version - extra info:
51f2f97d29cb6e57851a4b7ff037068c355e8b72
Affected version:
Difficulty:
uncategorized

Description

This happens in 50% of runs of the tip4p{,flex} examples on my x86 laptop (Intel Core 2 Duo T7100; gcc 4.6.3, Gentoo 4.6.3 p1.0, pie-0.5.1).

Program received signal SIGSEGV, Segmentation fault.
sse_mask_init (order=4, work=0xb5313ce8) at
/home/christoph/votca/src/gromacs/src/mdlib/pme.c:2881
2881 work->mask_SSE0[of] = _mm_loadu_ps(tmp);
(gdb) bt
#0 sse_mask_init (order=4, work=0xb5313ce8) at
/home/christoph/votca/src/gromacs/src/mdlib/pme.c:2881
#1 gmx_pme_init (pmedata=0xb530676c, cr=0xb5303ab8, nnodes_major=1,
nnodes_minor=2, ir=0x807e9b8,
homenr=3072, bFreeEnergy=0, bReproducible=0, nthread=1)
at /home/christoph/votca/src/gromacs/src/mdlib/pme.c:3164
#2 0x08054732 in mdrunner (nthreads_requested=0, fplog=0x807e378,
cr=0xb5303ab8, nfile=36, fnm=0xbfffe1a8,
oenv=0x807e2d8, bVerbose=0, bCompact=1, nstglobalcomm=-1,
ddxyz=0xbfffe7e4, dd_node_order=1, rdd=0,
rconstr=0, dddlb_opt=0x807238d "auto", dlb_scale=0.800000012,
ddcsx=0x0, ddcsy=0x0, ddcsz=0x0,
nstepout=100, resetstep=-1, nmultisim=0, repl_ex_nst=0,
repl_ex_nex=0, repl_ex_seed=-1, pforce=-1,
cpt_period=15, max_hours=-1, deviceOptions=0x80704ba "", Flags=7168)
at /home/christoph/votca/src/gromacs/src/kernel/runner.c:841
#3 0x0804e581 in main (argc=1, argv=0xbfffe9f4) at
/home/christoph/votca/src/gromacs/src/kernel/mdrun.c:685

Roland's first guess was that a segfault on an SSE intrinsic means the data isn't sufficiently aligned, so:
(gdb) print &tmp
$1 = (float (*)[8]) 0xbfffcb50
but that is not the case: 0xbfffcb50 is a multiple of 16, so tmp is 16-byte aligned.

pme.c (130 KB) Berk Hess, 07/16/2012 05:55 PM

Associated revisions

Revision 3ee63ecf (diff)
Added by Berk Hess about 7 years ago

fix a segfault in sse_mask_init

  • work array was unaligned in some cases
  • fixes #970

Change-Id: I1b474019bf93e6ef6f7cc935aa3f73f2597a91a8

History

#1 Updated by Berk Hess about 7 years ago

What are the tip4p{,flex} examples exactly?
Have you run valgrind on this?

#2 Updated by Berk Hess about 7 years ago

I think this is an alignment issue.
tmp doesn't need to be aligned, we are using loadu.
But mask_SSE0 should be aligned and I think putting an SSE register in a struct does not guarantee this.
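
The alignment argument in this comment can be checked with a few lines of C. The following is a minimal sketch, not the code from the attached pme.c: addr_is_aligned16, is_aligned16 and alloc_aligned16 are hypothetical helper names, and over-allocating and rounding the pointer up is just one common way to get a 16-byte boundary out of plain malloc, assumed here for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: 1 if the numeric address lies on a 16-byte boundary,
 * which aligned SSE loads and stores require. */
static int addr_is_aligned16(uintptr_t addr)
{
    return (addr % 16) == 0;
}

/* Same check for a pointer. */
static int is_aligned16(const void *p)
{
    return addr_is_aligned16((uintptr_t)p);
}

/* Hypothetical helper: allocate nbytes starting on a 16-byte boundary.
 * malloc alone does not promise 16-byte alignment on every platform, so
 * over-allocate by 15 bytes and round the address up. The pointer that
 * must later be passed to free() is returned through *raw. */
static void *alloc_aligned16(size_t nbytes, void **raw)
{
    uintptr_t addr;

    *raw = malloc(nbytes + 15);
    if (*raw == NULL)
    {
        return NULL;
    }
    addr = ((uintptr_t)*raw + 15) & ~(uintptr_t)15;
    assert(addr_is_aligned16(addr));
    return (void *)addr;
}
```

Applied to the addresses in the backtrace, the check gives 0xbfffcb50 % 16 == 0 for tmp but 0xb5313ce8 % 16 == 8 for work, so the work struct starts on an 8-byte boundary only. Whether mask_SSE0 itself is misaligned depends on its offset inside the struct, which the report does not show.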

#3 Updated by Berk Hess about 7 years ago

Please try the patched version of pme.c attached.

#4 Updated by Christoph Junghans about 7 years ago

Berk Hess wrote:

What are the tip4p{,flex} examples exactly?

These are two "complex" tests in the regressiontests repository.

Have you run valgrind on this?

No, just gdb, but if you tell me exactly what you need, I can provide it.

#5 Updated by Christoph Junghans about 7 years ago

Berk Hess wrote:

Please try the patched version of pme.c attached.

It happens on my machine at home; I will test there and report back.

#6 Updated by Roland Schulz almost 7 years ago

  • Status changed from New to Closed
