Bug #2197
Regression (mdrun segfault) after 1ecd43b0034
Description
Since:
commit 1ecd43b00346806a549ba4c0258206fb5fb29aeb
Author: Berk Hess <hess@kth.se>
Date: Tue Oct 18 15:09:34 2016 +0200

    Introduce ObservablesHistory container

    Introduces a ObservablesHistory class and moved energyhistory_t,
    edsamstate_t and swapstate_t into this.
    TODO: Move more observables history from t_state into this container.
    Also added documentation in state.h.

    Part of #2059.

    Change-Id: Ic1efd95c5be2dede137763bfd24e3fb7d676eadd
gmx mdrun, when run with the attached files like this:
gmx grompp -f acacaca_3-5_md2.mdp -c acacaca_3-5_md2.gro -p acacaca_3-5.top -po acacaca_3-5_md2out.mdp -o acacaca_3-5_md2.tpr -renum
gmx mdrun -s acacaca_3-5_md2.tpr -o acacaca_3-5_md2.trr -cpo acacaca_3-5_md2.cpt -c acacaca_3-5_md2out.gro -e acacaca_3-5_md2.edr -g acacaca_3-5_md2.log -nsteps -2 -cpi acacaca_3-5_md1.cpt -noappend -x acacaca_3-5_md2.xtc -v -tunepme
will segfault with:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff627fc2d in do_cpt_enerhist (xd=0x680a00, bRead=1, fflags=0, enerhist=0x0, list=0x0) at /home/miletivn/workspace/gromacs-reactivemd/src/gromacs/fileio/checkpoint.cpp:1259
1259 enerhist->nsteps = 0;
Missing separate debuginfos, use: dnf debuginfo-install fftw-libs-single-3.3.5-3.fc25.x86_64 libgcc-6.3.1-1.fc25.x86_64 libgfortran-6.3.1-1.fc25.x86_64 libgomp-6.3.1-1.fc25.x86_64 libquadmath-6.3.1-1.fc25.x86_64 libstdc++-6.3.1-1.fc25.x86_64 openblas-0.2.19-4.fc25.x86_64
(gdb) backtrace
#0 0x00007ffff627fc2d in do_cpt_enerhist (xd=0x680a00, bRead=1, fflags=0, enerhist=0x0, list=0x0) at /home/miletivn/workspace/gromacs-reactivemd/src/gromacs/fileio/checkpoint.cpp:1259
#1 0x00007ffff6282a02 in read_checkpoint (fn=0x68cb30 "acacaca_3-5_md1.cpt", pfplog=0x7fffffffbfe0, cr=0x6812a0, dd_nc=0x7fffffffc800, npme=0x7fffffffc710, eIntegrator=0, init_fep_state=0x681410, step=0x7fffffffbf58, t=0x7fffffffbf50, state=0x67fa70, bReadEkin=0x7fffffffc00c, observablesHistory=0x7fffffffc020, simulation_part=0x7fffffffc080, bAppendOutputFiles=0, bForceAppend=2097152, reproducibilityRequested=0) at /home/miletivn/workspace/gromacs-reactivemd/src/gromacs/fileio/checkpoint.cpp:2156
#2 0x00007ffff628354f in load_checkpoint (fn=0x68cb30 "acacaca_3-5_md1.cpt", fplog=0x7fffffffbfe0, cr=0x6812a0, dd_nc=0x7fffffffc800, npme=0x7fffffffc710, ir=0x7fffffffc070, state=0x67fa70, bReadEkin=0x7fffffffc00c, observablesHistory=0x7fffffffc020, bAppend=0, bForceAppend=2097152, reproducibilityRequested=0) at /home/miletivn/workspace/gromacs-reactivemd/src/gromacs/fileio/checkpoint.cpp:2386
#3 0x00000000004390ca in gmx::mdrunner (hw_opt=0x7fffffffcdb0, fplog=0x67eda0, cr=0x6812a0, nfile=33, fnm=0x7fffffffcf70, oenv=0x688880, bVerbose=1, nstglobalcomm=-1, ddxyz=0x7fffffffc800, dd_rank_order=1, npme=0, rdd=0, rconstr=0, dddlb_opt=0x43d4cf "auto", dlb_scale=0.800000012, ddcsx=0x0, ddcsy=0x0, ddcsz=0x0, nbpu_opt=0x43d4cf "auto", nstlist_cmdline=0, nsteps_cmdline=-2, nstepout=100, resetstep=-1, nmultisim=0, repl_ex_nst=0, repl_ex_nex=0, repl_ex_seed=-1, pforce=-1, cpt_period=15, max_hours=-1, imdport=8888, Flags=3415040) at /home/miletivn/workspace/gromacs-reactivemd/src/programs/mdrun/runner.cpp:1024
#4 0x0000000000427cf2 in gmx_mdrun (argc=22, argv=0x7fffffffdcb0) at /home/miletivn/workspace/gromacs-reactivemd/src/programs/mdrun/mdrun.cpp:551
#5 0x00007ffff616d137 in gmx::(anonymous namespace)::CMainCommandLineModule::run (this=0x677a80, argc=22, argv=0x7fffffffdcb0) at /home/miletivn/workspace/gromacs-reactivemd/src/gromacs/commandline/cmdlinemodulemanager.cpp:133
#6 0x00007ffff616ec47 in gmx::CommandLineModuleManager::run (this=0x7fffffffdb80, argc=22, argv=0x7fffffffdcb0) at /home/miletivn/workspace/gromacs-reactivemd/src/gromacs/commandline/cmdlinemodulemanager.cpp:583
#7 0x000000000041bb7c in main (argc=23, argv=0x7fffffffdca8) at /home/miletivn/workspace/gromacs-reactivemd/src/programs/gmx.cpp:60
(gdb)
Associated revisions
History
#1 Updated by Berk Hess over 3 years ago
I don't get a segfault with 9f989798b3d44e6bdada80323447b24fcfa5ff46
Or do I need to put in an (old?) .cpi file that you did not attach?
#2 Updated by Roland Schulz over 3 years ago
You might also want to attach the log file so that we see compiler version, SIMD instructions, and number of threads.
#3 Updated by Gerrit Code Review Bot over 3 years ago
Gerrit received a related patchset '1' for Issue #2197.
Uploader: Berk Hess (hess@kth.se)
Change-Id: gromacs~master~Icf0e3022a62bdbcd890d1c79c4a0f6c748df4402
Gerrit URL: https://gerrit.gromacs.org/6685
#4 Updated by Berk Hess over 3 years ago
- Status changed from New to Feedback wanted
- Target version set to 2018
I can't reproduce this, but did notice an issue in the code.
Vedran, please check if my fix fixes your issue.
#5 Updated by Vedran Miletic over 3 years ago
Berk Hess wrote:
I don't get a segfault with 9f989798b3d44e6bdada80323447b24fcfa5ff46
Or do I need to put in an (old?) .cpi file that you did not attach?
Indeed, I forgot it, sorry for that. Anyhow, your patch on Gerrit fixes the problem!
#6 Updated by Berk Hess over 3 years ago
- Status changed from Feedback wanted to Fix uploaded
#7 Updated by Berk Hess over 3 years ago
- Status changed from Fix uploaded to Resolved
#8 Updated by Berk Hess over 3 years ago
Applied in changeset cd6b63dbe65fd262a1732ee3a1ff38b2be0018a7.
#9 Updated by Berk Hess over 3 years ago
- Status changed from Resolved to Closed
Fix checkpoint reading without energy history
When reading a checkpoint file without energy history a null pointer
could be dereferenced.
Fixes #2197.
Change-Id: Icf0e3022a62bdbcd890d1c79c4a0f6c748df4402
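
For readers who want to see the failure mode in isolation, here is a minimal sketch of the pattern the backtrace and the commit message describe. It is not the actual GROMACS patch: EnergyHistory, ObservablesHistory and readEnergyHistory are simplified, hypothetical stand-ins for energyhistory_t, the ObservablesHistory container and do_cpt_enerhist. The assumption is that the energy history pointer is only allocated on demand, so a reader that writes through it unconditionally crashes when the checkpoint carries no energy history (fflags=0, enerhist=0x0 in frame #0).

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical, simplified stand-in for energyhistory_t (not the real definition).
struct EnergyHistory
{
    std::int64_t nsteps = 0;
    std::int64_t nsum   = 0;
};

// Hypothetical, simplified stand-in for the ObservablesHistory container:
// the energy history is optional and starts out unallocated.
struct ObservablesHistory
{
    std::unique_ptr<EnergyHistory> energyHistory;
};

// Sketch of a guarded reader. Returns true if energy history was set up,
// false if the checkpoint contains none (fflags == 0).
bool readEnergyHistory(int fflags, ObservablesHistory* observablesHistory)
{
    assert(observablesHistory != nullptr);

    if (fflags == 0)
    {
        // No energy history in the checkpoint: return before touching the
        // pointer. An unconditional "enerhist->nsteps = 0;" here would
        // dereference a null pointer, as in frame #0 of the backtrace.
        return false;
    }

    if (!observablesHistory->energyHistory)
    {
        // Allocate on demand so the writes below have a valid target.
        observablesHistory->energyHistory =
            std::unique_ptr<EnergyHistory>(new EnergyHistory());
    }

    EnergyHistory* enerhist = observablesHistory->energyHistory.get();
    enerhist->nsteps        = 0;
    enerhist->nsum          = 0;
    // A real reader would go on to fill these fields from the file.
    return true;
}
```

Calling readEnergyHistory(0, &history) on a default-constructed ObservablesHistory leaves energyHistory unallocated and returns false, while a non-zero fflags allocates it before any field is written. Whether the real fix skips the read or allocates the structure is not stated in this issue; the sketch shows both guards only to illustrate the idea.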