Bug #3294
multiple tests fail on fedora 31
Description
49 - GmxapiExternalInterfaceTests (SEGFAULT)
50 - GmxapiMpiTests (Failed)
51 - GmxapiInternalInterfaceTests (Child aborted)
52 - GmxapiInternalsMpiTests (Child aborted)
53 - regressiontests/complex (Failed)
Build log attached.
This can easily be reproduced in docker:
docker run -it fedora /bin/bash
dnf install -y fedpkg git make
git clone https://github.com/junghans/gromacs-rpm.git
cd gromacs-rpm/
dnf builddep gromacs.spec
spectool -g gromacs.spec
fedpkg --release f31 local
cd gromacs-2020/serial
LD_LIBRARY_PATH=~/rpmbuild/BUILDROOT/gromacs-2020-1.fc31.x86_64/usr/lib64/:$PWD/lib/ make VERBOSE=1 check
on Fedora Rawhide some more tests fail:
49 - GmxapiExternalInterfaceTests (SEGFAULT)
50 - GmxapiMpiTests (Failed)
51 - GmxapiInternalInterfaceTests (Child aborted)
52 - GmxapiInternalsMpiTests (Child aborted)
53 - regressiontests/complex (Failed)
54 - regressiontests/freeenergy (Failed)
56 - regressiontests/essentialdynamics (Failed)
see https://koji.fedoraproject.org/koji/taskinfo?taskID=40388148 for details
History
#1 Updated by Paul Bauer about 1 year ago
I just had a quick look at this, and it looks really weird that it is failing on x86_64
Will check next week
#2 Updated by Paul Bauer 12 months ago
Debugging backtrace from a build in Docker
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007ffff63188d9 in __GI_abort () at abort.c:79
#2 0x00007ffff70469e8 in std::__replacement_assert (__file=__file@entry=0x7ffff7b464c8 "/usr/include/c++/9/bits/stl_vector.h", __line=__line@entry=1042, __function=__function@entry=0x7ffff7c3e6c0 "std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = gmx::BasicVector<float>; _Alloc = gmx::DefaultInitializationAllocator<gmx::Bas"..., __condition=__condition@entry=0x7ffff7b47858 "__builtin_expect(__n < this->size(), true)") at /usr/include/c++/9/x86_64-redhat-linux/bits/c++config.h:2533
#3 0x00007ffff78e292f in std::vector<gmx::BasicVector<float>, gmx::DefaultInitializationAllocator<gmx::BasicVector<float>, std::allocator<gmx::BasicVector<float> > > >::operator[] (this=this@entry=0x7ffed0072a80, __n=__n@entry=145) at /usr/include/c++/9/bits/stl_vector.h:1040
#4 0x00007ffff78e24bf in dd_pmeredist_pos_coeffs (pme=pme@entry=0x7ffed00326b0, bX=bX@entry=true, x=..., data=data@entry=0x7ffed0085d80, atc=atc@entry=0x7ffed00729e0) at /gromacs-rpm/gromacs-2020/src/gromacs/ewald/pme_redistribute.cpp:374
#5 0x00007ffff78e26bf in do_redist_pos_coeffs (pme=pme@entry=0x7ffed00326b0, cr=cr@entry=0x7ffed0018b20, bFirst=bFirst@entry=true, x=..., data=data@entry=0x7ffed0085d80) at /gromacs-rpm/gromacs-2020/src/gromacs/ewald/pme_redistribute.cpp:480
#6 0x00007ffff78cb2d6 in gmx_pme_do (pme=0x7ffed00326b0, coordinates=..., forces=..., chargeA=chargeA@entry=0x7ffed0085d80, chargeB=0x0, c6A=0x0, c6B=0x0, sigmaA=0x0, sigmaB=0x0, box=0x7ffed0076b64, cr=0x7ffed0018b20, maxshift_x=1, maxshift_y=1, nrnb=0x7fffbbffd460, wcycle=0x7ffed0022de0, vir_q=0x7ffed001ac74, vir_lj=0x7ffed001ac98, energy_q=0x7fffbbffae18, energy_lj=0x7fffbbffae1c, lambda_q=<optimized out>, lambda_q@entry=0, lambda_lj=1, lambda_lj@entry=0, dvdlambda_q=0x7ffed001ac60, dvdlambda_lj=0x7ffed001ac64, flags=7) at /gromacs-rpm/gromacs-2020/src/gromacs/ewald/pme.cpp:1149
#7 0x00007ffff781ab6d in do_force_lowlevel (fr=fr@entry=0x7ffed001aa40, ir=ir@entry=0x7fffbbffd0d0, idef=idef@entry=0x7fffbbffbe10, cr=cr@entry=0x7ffed0018b20, ms=ms@entry=0x0, nrnb=nrnb@entry=0x7fffbbffd460, wcycle=0x7ffed0022de0, md=0x7ffed00325a0, coordinates=..., hist=0x7ffed0076de0, forceOutputs=0x7fffbbffb2a0, enerd=0x7fffbbffd800, fcd=0x7ffed00243f0, box=0x7ffed0076b64, lambda=0x7ffed0076b48, graph=0x0, mu_tot=0x7ffed001aab8, stepWork=..., ddBalanceRegionHandler=...) at /gromacs-rpm/gromacs-2020/src/gromacs/mdlib/force.cpp:290
#8 0x00007ffff78532c0 in do_force (fplog=0x0, cr=0x7ffed0018b20, ms=0x0, inputrec=inputrec@entry=0x7fffbbffd0d0, awh=0x0, enforcedRotation=0x0, imdSession=0x7ffed002f5d0, pull_work=0x0, step=1, nrnb=0x7fffbbffd460, wcycle=0x7ffed0022de0, top=0x7fffbbffbe10, box=0x7ffed0076b64, x=..., hist=0x7ffed0076de0, force=..., vir_force=0x7fffbbffb860, mdatoms=0x7ffed00325a0, enerd=0x7fffbbffd800, fcd=0x7ffed00243f0, lambda=..., graph=0x0, fr=0x7ffed001aa40, runScheduleWork=0x7fffbbffcc10, vsite=0x7ffed0019410, mu_tot=0x7fffbbffb854, t=t@entry=0.0050000000000000001, ed=0x0, legacyFlags=<optimized out>, ddBalanceRegionHandler=...) at /gromacs-rpm/gromacs-2020/src/gromacs/mdlib/sim_util.cpp:1519
#9 0x00007ffff791add6 in gmx::LegacySimulator::do_md (this=<optimized out>) at /gromacs-rpm/gromacs-2020/src/gromacs/mdrun/md.cpp:942
#10 0x00007ffff7916e71 in gmx::LegacySimulator::run (this=<optimized out>) at /gromacs-rpm/gromacs-2020/src/gromacs/mdrun/legacysimulator.cpp:73
#11 0x00007ffff79381c9 in gmx::Mdrunner::mdrunner (this=0x7fffbbffdcd0) at /gromacs-rpm/gromacs-2020/src/gromacs/mdrun/runner.cpp:1613
#12 0x00007ffff79392a8 in gmx::mdrunner_start_fn (arg=<optimized out>) at /gromacs-rpm/gromacs-2020/src/gromacs/mdrun/runner.cpp:374
#13 0x00007ffff7a64b06 in tMPI_Thread_starter (arg=0x5555555f41d8) at /gromacs-rpm/gromacs-2020/src/external/thread_mpi/src/tmpi_init.cpp:399
#14 0x00007ffff3d114e2 in start_thread (arg=<optimized out>) at pthread_create.c:479
#15 0x00007ffff63f4643 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
The actual error is this one:
/usr/include/c++/9/bits/stl_vector.h:1042: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = gmx::BasicVector<float>; _Alloc = gmx::DefaultInitializationAllocator<gmx::BasicVector<float>, std::allocator<gmx::BasicVector<float> > >; std::vector<_Tp, _Alloc>::reference = gmx::BasicVector<float>&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__builtin_expect(__n < this->size(), true)' failed.
/usr/include/c++/9/bits/stl_vector.h:1042: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = gmx::BasicVector<float>; _Alloc = gmx::DefaultInitializationAllocator<gmx::BasicVector<float>, std::allocator<gmx::BasicVector<float> > >; std::vector<_Tp, _Alloc>::reference = gmx::BasicVector<float>&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__builtin_expect(__n < this->size(), true)' failed.
/usr/include/c++/9/bits/stl_vector.h:1042: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = gmx::BasicVector<float>; _Alloc = gmx::DefaultInitializationAllocator<gmx::BasicVector<float>, std::allocator<gmx::BasicVector<float> > >; std::vector<_Tp, _Alloc>::reference = gmx::BasicVector<float>&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__builtin_expect(__n < this->size(), true)' failed.
/usr/include/c++/9/bits/stl_vector.h:1042: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = gmx::BasicVector<float>; _Alloc = gmx::DefaultInitializationAllocator<gmx::BasicVector<float>, std::allocator<gmx::BasicVector<float> > >; std::vector<_Tp, _Alloc>::reference = gmx::BasicVector<float>&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__builtin_expect(__n < this->size(), true)' failed.
#3 Updated by Paul Bauer 12 months ago
The backtrace is from running the dd121 regression test as follows:
LD_LIBRARY_PATH=~/rpmbuild/BUILDROOT/gromacs-2020-1.fc31.x86_64/usr/lib64/:/gromacs-rpm/gromacs-2020/serial/lib/ gdb --args ../../../bin/gmx mdrun -ntmpi 12 -notunepme -debug 3
#4 Updated by Paul Bauer 12 months ago
Built on my machine, so x86_64 architecture.
#5 Updated by Christoph Junghans 12 months ago
OK, usually these errors come from compiling with `-D_GLIBCXX_ASSERTIONS` when accessing a null pointer.
#6 Updated by Paul Bauer 12 months ago
It is an out-of-bounds access on the atc->xBuffer vector in frame 4: the length is 152 (on my machine) and the index taken from pos_local is also 152, one past the last valid element, and that is what causes the SIGABRT.
#7 Updated by Christoph Junghans 11 months ago
- Target version set to 2020.1
#8 Updated by Paul Bauer 11 months ago
I still have no idea why the gmxapi tests fail. They cause a segmentation fault when freeing the state, but I don't understand why this only happens there.
#9 Updated by Paul Bauer 11 months ago
- Target version changed from 2020.1 to 2020.2
#10 Updated by Christoph Junghans 11 months ago
I will just check 2020.1 again and see if the issue persists.