Bug #3294

multiple tests fail on fedora 31

Added by Christoph Junghans about 1 month ago. Updated about 1 month ago.

Status:
New
Priority:
Normal
Assignee:
Category:
-
Target version:
-
Affected version - extra info:
Affected version:
Difficulty:
uncategorized

Description

     49 - GmxapiExternalInterfaceTests (SEGFAULT)
     50 - GmxapiMpiTests (Failed)
     51 - GmxapiInternalInterfaceTests (Child aborted)
     52 - GmxapiInternalsMpiTests (Child aborted)
     53 - regressiontests/complex (Failed)

Build log attached.

This can easily be reproduced in docker:

docker run -it fedora /bin/bash
dnf install -y fedpkg git make
git clone https://github.com/junghans/gromacs-rpm.git
cd gromacs-rpm/
dnf builddep gromacs.spec
spectool -g gromacs.spec
fedpkg --release f31 local
cd gromacs-2020/serial
LD_LIBRARY_PATH=~/rpmbuild/BUILDROOT/gromacs-2020-1.fc31.x86_64/usr/lib64/:$PWD/lib/ make VERBOSE=1 check

On Fedora Rawhide, some more tests fail:

     49 - GmxapiExternalInterfaceTests (SEGFAULT)
     50 - GmxapiMpiTests (Failed)
     51 - GmxapiInternalInterfaceTests (Child aborted)
     52 - GmxapiInternalsMpiTests (Child aborted)
     53 - regressiontests/complex (Failed)
     54 - regressiontests/freeenergy (Failed)
     56 - regressiontests/essentialdynamics (Failed)

See https://koji.fedoraproject.org/koji/taskinfo?taskID=40388148 for details.

build_docker.log (5.94 MB), Christoph Junghans, 01/11/2020 03:53 AM

History

#1 Updated by Paul Bauer about 1 month ago

I just had a quick look at this, and it is really strange that it fails on x86_64.
I will check next week.

#2 Updated by Paul Bauer about 1 month ago

Debugging backtrace from a build in Docker

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff63188d9 in __GI_abort () at abort.c:79
#2  0x00007ffff70469e8 in std::__replacement_assert (__file=__file@entry=0x7ffff7b464c8 "/usr/include/c++/9/bits/stl_vector.h", 
    __line=__line@entry=1042, 
    __function=__function@entry=0x7ffff7c3e6c0 "std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = gmx::BasicVector<float>; _Alloc = gmx::DefaultInitializationAllocator<gmx::Bas"..., 
    __condition=__condition@entry=0x7ffff7b47858 "__builtin_expect(__n < this->size(), true)")
    at /usr/include/c++/9/x86_64-redhat-linux/bits/c++config.h:2533
#3  0x00007ffff78e292f in std::vector<gmx::BasicVector<float>, gmx::DefaultInitializationAllocator<gmx::BasicVector<float>, std::allocator<gmx::BasicVector<float> > > >::operator[] (this=this@entry=0x7ffed0072a80, __n=__n@entry=145) at /usr/include/c++/9/bits/stl_vector.h:1040
#4  0x00007ffff78e24bf in dd_pmeredist_pos_coeffs (pme=pme@entry=0x7ffed00326b0, bX=bX@entry=true, x=..., data=data@entry=0x7ffed0085d80, 
    atc=atc@entry=0x7ffed00729e0) at /gromacs-rpm/gromacs-2020/src/gromacs/ewald/pme_redistribute.cpp:374
#5  0x00007ffff78e26bf in do_redist_pos_coeffs (pme=pme@entry=0x7ffed00326b0, cr=cr@entry=0x7ffed0018b20, bFirst=bFirst@entry=true, x=..., 
    data=data@entry=0x7ffed0085d80) at /gromacs-rpm/gromacs-2020/src/gromacs/ewald/pme_redistribute.cpp:480
#6  0x00007ffff78cb2d6 in gmx_pme_do (pme=0x7ffed00326b0, coordinates=..., forces=..., chargeA=chargeA@entry=0x7ffed0085d80, chargeB=0x0, 
    c6A=0x0, c6B=0x0, sigmaA=0x0, sigmaB=0x0, box=0x7ffed0076b64, cr=0x7ffed0018b20, maxshift_x=1, maxshift_y=1, nrnb=0x7fffbbffd460, 
    wcycle=0x7ffed0022de0, vir_q=0x7ffed001ac74, vir_lj=0x7ffed001ac98, energy_q=0x7fffbbffae18, energy_lj=0x7fffbbffae1c, 
    lambda_q=<optimized out>, lambda_q@entry=0, lambda_lj=1, lambda_lj@entry=0, dvdlambda_q=0x7ffed001ac60, dvdlambda_lj=0x7ffed001ac64, 
    flags=7) at /gromacs-rpm/gromacs-2020/src/gromacs/ewald/pme.cpp:1149
#7  0x00007ffff781ab6d in do_force_lowlevel (fr=fr@entry=0x7ffed001aa40, ir=ir@entry=0x7fffbbffd0d0, idef=idef@entry=0x7fffbbffbe10, 
    cr=cr@entry=0x7ffed0018b20, ms=ms@entry=0x0, nrnb=nrnb@entry=0x7fffbbffd460, wcycle=0x7ffed0022de0, md=0x7ffed00325a0, 
    coordinates=..., hist=0x7ffed0076de0, forceOutputs=0x7fffbbffb2a0, enerd=0x7fffbbffd800, fcd=0x7ffed00243f0, box=0x7ffed0076b64, 
    lambda=0x7ffed0076b48, graph=0x0, mu_tot=0x7ffed001aab8, stepWork=..., ddBalanceRegionHandler=...)
    at /gromacs-rpm/gromacs-2020/src/gromacs/mdlib/force.cpp:290
#8  0x00007ffff78532c0 in do_force (fplog=0x0, cr=0x7ffed0018b20, ms=0x0, inputrec=inputrec@entry=0x7fffbbffd0d0, awh=0x0, 
    enforcedRotation=0x0, imdSession=0x7ffed002f5d0, pull_work=0x0, step=1, nrnb=0x7fffbbffd460, wcycle=0x7ffed0022de0, 
    top=0x7fffbbffbe10, box=0x7ffed0076b64, x=..., hist=0x7ffed0076de0, force=..., vir_force=0x7fffbbffb860, mdatoms=0x7ffed00325a0, 
    enerd=0x7fffbbffd800, fcd=0x7ffed00243f0, lambda=..., graph=0x0, fr=0x7ffed001aa40, runScheduleWork=0x7fffbbffcc10, 
    vsite=0x7ffed0019410, mu_tot=0x7fffbbffb854, t=t@entry=0.0050000000000000001, ed=0x0, legacyFlags=<optimized out>, 
    ddBalanceRegionHandler=...) at /gromacs-rpm/gromacs-2020/src/gromacs/mdlib/sim_util.cpp:1519
#9  0x00007ffff791add6 in gmx::LegacySimulator::do_md (this=<optimized out>) at /gromacs-rpm/gromacs-2020/src/gromacs/mdrun/md.cpp:942
#10 0x00007ffff7916e71 in gmx::LegacySimulator::run (this=<optimized out>)
    at /gromacs-rpm/gromacs-2020/src/gromacs/mdrun/legacysimulator.cpp:73
#11 0x00007ffff79381c9 in gmx::Mdrunner::mdrunner (this=0x7fffbbffdcd0) at /gromacs-rpm/gromacs-2020/src/gromacs/mdrun/runner.cpp:1613
#12 0x00007ffff79392a8 in gmx::mdrunner_start_fn (arg=<optimized out>) at /gromacs-rpm/gromacs-2020/src/gromacs/mdrun/runner.cpp:374
#13 0x00007ffff7a64b06 in tMPI_Thread_starter (arg=0x5555555f41d8)
    at /gromacs-rpm/gromacs-2020/src/external/thread_mpi/src/tmpi_init.cpp:399
#14 0x00007ffff3d114e2 in start_thread (arg=<optimized out>) at pthread_create.c:479
#15 0x00007ffff63f4643 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

The actual error is this one:

/usr/include/c++/9/bits/stl_vector.h:1042: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = gmx::BasicVector<float>; _Alloc = gmx::DefaultInitializationAllocator<gmx::BasicVector<float>, std::allocator<gmx::BasicVector<float> > >; std::vector<_Tp, _Alloc>::reference = gmx::BasicVector<float>&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__builtin_expect(__n < this->size(), true)' failed.
/usr/include/c++/9/bits/stl_vector.h:1042: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = gmx::BasicVector<float>; _Alloc = gmx::DefaultInitializationAllocator<gmx::BasicVector<float>, std::allocator<gmx::BasicVector<float> > >; std::vector<_Tp, _Alloc>::reference = gmx::BasicVector<float>&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__builtin_expect(__n < this->size(), true)' failed.
/usr/include/c++/9/bits/stl_vector.h:1042: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = gmx::BasicVector<float>; _Alloc = gmx::DefaultInitializationAllocator<gmx::BasicVector<float>, std::allocator<gmx::BasicVector<float> > >; std::vector<_Tp, _Alloc>::reference = gmx::BasicVector<float>&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__builtin_expect(__n < this->size(), true)' failed.
/usr/include/c++/9/bits/stl_vector.h:1042: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = gmx::BasicVector<float>; _Alloc = gmx::DefaultInitializationAllocator<gmx::BasicVector<float>, std::allocator<gmx::BasicVector<float> > >; std::vector<_Tp, _Alloc>::reference = gmx::BasicVector<float>&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__builtin_expect(__n < this->size(), true)' failed.

#3 Updated by Paul Bauer about 1 month ago

The backtrace is from running the dd121 regression test as follows:

LD_LIBRARY_PATH=~/rpmbuild/BUILDROOT/gromacs-2020-1.fc31.x86_64/usr/lib64/:/gromacs-rpm/gromacs-2020/serial/lib/ gdb --args ../../../bin/gmx mdrun -ntmpi 12 -notunepme -debug 3

#4 Updated by Paul Bauer about 1 month ago

Built on my machine, so x86_64 architecture.

#5 Updated by Christoph Junghans about 1 month ago

OK, usually these errors come from compiling with `-D_GLIBCXX_ASSERTIONS` when accessing a null pointer.

#6 Updated by Paul Bauer about 1 month ago

It is an out-of-bounds access on the atc->xBuffer vector in frame 4: the vector has length 152 (on my machine) and it is indexed with pos_local equal to 152, one past the last valid element, which causes the SIGABRT.
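
To illustrate the failure mode described above, here is a minimal standalone sketch, not GROMACS code. It assumes a stand-in type Vec3 for gmx::BasicVector<float>, and the helpers indexIsValid, checkedElement and accessThrows are hypothetical names made up for this example. The condition in indexIsValid mirrors the one in the assertion message (__n < this->size()); with a buffer of length 152, index 151 passes and index 152 does not.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Stand-in for gmx::BasicVector<float> (three floats); illustration only.
using Vec3 = std::array<float, 3>;

// Same condition as in the failed assertion: __n < this->size().
inline bool indexIsValid(const std::vector<Vec3>& buffer, std::size_t index)
{
    return index < buffer.size();
}

// Hypothetical helper: asserts the condition before indexing, which is roughly
// what operator[] does when libstdc++ is built with -D_GLIBCXX_ASSERTIONS.
inline Vec3& checkedElement(std::vector<Vec3>& buffer, std::size_t index)
{
    assert(indexIsValid(buffer, index));
    return buffer[index];
}

// at() is bounds-checked regardless of compiler flags and throws
// std::out_of_range instead of aborting, so the result can be inspected.
inline bool accessThrows(std::vector<Vec3>& buffer, std::size_t index)
{
    try
    {
        buffer.at(index)[0] = 0.0F;
        return false;
    }
    catch (const std::out_of_range&)
    {
        return true;
    }
}
```

For a std::vector<Vec3> xBuffer(152), accessThrows(xBuffer, 151) returns false and accessThrows(xBuffer, 152) returns true. Without the assertions enabled, the same one-past-the-end access through operator[] is undefined behaviour and may go unnoticed, which would be consistent with the tests only failing in the Fedora build.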
