Project

General

Profile

Bug #1596

mdrun crashes in specific free energy runs

Added by Oleg Titov almost 3 years ago. Updated about 2 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
mdrun
Target version:
-
Affected version - extra info:
Affected version:
Difficulty:
uncategorized
Close

Description

mdrun crashes with segfault for a particular kind of systems.

I have PhBr molecule solvated by water. I'm trying to calculate free energy difference between PhBr molecule and PhBr molecule with atomic charges from PhF using thermodynamic integration. mdrun terminates with segfault after a couple of seconds after initialization finishes.

The same system without water (and without periodic boundaries) runs fine.

Input files attached.

phbr-phf.segfault.top (10.7 KB) Oleg Titov, 09/15/2014 11:48 PM

phbr.equil.segfault.gro (111 KB) Oleg Titov, 09/15/2014 11:48 PM

prod.0.5.segfault.mdp (1.41 KB) Oleg Titov, 09/15/2014 11:48 PM

Associated revisions

Revision 77356f36 (diff)
Added by Mark Abraham over 2 years ago

Fix group-scheme bug with changing LJ parameters in FE

We don't optimize for the case when we have only changed one of charge
or type, so the other vector must always be valid even when it is not
changing. The logic of calling ewald_LRcorrection didn't do this
correctly, perhaps because the construction logic in md2atoms was
unclear.

Changed name, origin and logic for bFreeEnergy to
bHaveChargeOrTypePerturbed to better reflect the correct usage and
meaning. Avoided testing any pointers for NULL - we should use
explicit control-flow constructs.

Fixes #1596

Change-Id: I61172681048075d3022bd6c4b781c6c9153eeadd

History

#1 Updated by Oleg Titov almost 3 years ago

I ran it with debugger and localized the bug.

For version 5.0.1
In file src/gromacs/mdlib/force.c
lines 540-557

ewald_LRcorrection(fr->excl_load[t], fr->excl_load[t+1],
                   cr, t, fr,
                   md->chargeA,
                   md->nChargePerturbed ? md->chargeB : NULL,
                   md->sqrt_c6A,
                   md->nTypePerturbed ? md->sqrt_c6B : NULL,
                   md->sigmaA,
                   md->nTypePerturbed ? md->sigmaB : NULL,
                   md->sigma3A,
                   md->nTypePerturbed ? md->sigma3B : NULL,
                   ir->cutoff_scheme != ecutsVERLET,
                   excl, x, bSB ? boxs : box, mu_tot,
                   ir->ewald_geometry,
                   ir->epsilon_surface,
                   fnv, *vir_q, *vir_lj,
                   Vcorrt_q, Vcorrt_lj,
                   lambda[efptCOUL], lambda[efptVDW],
                   dvdlt_q, dvdlt_lj);

There are two checks for free energy parameter presence in this function call: for El and for VdW.

At the same time in this function
In file src/gromacs/gmxlib/ewald_util.c
in function ewald_LRcorrection()
line 159

    gmx_bool    bFreeEnergy  = (chargeB != NULL);

The presence of free energy calculation is determined only by presence of valid pointer for charges of B and presence of VdW data for system B is never checked.
The crash happens in the same function at line 400, where VdW parameters for system B are requested (but null-pointer was passed during call, since VdW does not change in this system).
c6Bi = C6B[i];

#2 Updated by Gerrit Code Review Bot over 2 years ago

Gerrit received a related patchset '1' for Issue #1596.
Uploader: Mark Abraham ()
Change-Id: I61172681048075d3022bd6c4b781c6c9153eeadd
Gerrit URL: https://gerrit.gromacs.org/4148

#3 Updated by Mark Abraham over 2 years ago

I've uploaded a fix, Oleg. It would be great if you could follow the link to Gerrit, download the latest patch and verify that it produces the correct answers. Then we can likely have it in 5.0.3

#4 Updated by Berk Hess over 2 years ago

I have two, somewhat unrelated questions:

You are using the group cut-off scheme, maybe because you were also running vacuum calculations before. This bug does not affect the Verlet cut-off scheme. Don't you want to use the Verlet cut-off scheme?

The mdp file you attached does not use temperature coupling, but does use pressure coupling. Is this an error in preparing the files for redmine, or are you really running with these settings, which will not give reasonable results?

#5 Updated by Oleg Titov over 2 years ago

Berk Hess wrote:

I have two, somewhat unrelated questions:

You are using the group cut-off scheme, maybe because you were also running vacuum calculations before. This bug does not affect the Verlet cut-off scheme. Don't you want to use the Verlet cut-off scheme?

I was using group cut-off scheme because manual for version 4.6.x stated that free energy perturbed non-bondeds were not implemented for Verlet. I didn't notice that this was changed in 5.0 release. Anyway, Verlet was tested while investigating issue #1442 http://redmine.gromacs.org/issues/1442#note-36 .

Berk Hess wrote:

The mdp file you attached does not use temperature coupling, but does use pressure coupling. Is this an error in preparing the files for redmine, or are you really running with these settings, which will not give reasonable results?

I'm using sd integrator with ref-t=300 and tau-t=0.2, which should result in Langevein thermostat, so the results should be reasonable. Am I correct?

#6 Updated by Oleg Titov over 2 years ago

Mark Abraham wrote:

I've uploaded a fix, Oleg. It would be great if you could follow the link to Gerrit, download the latest patch and verify that it produces the correct answers. Then we can likely have it in 5.0.3

The answers look correct.

#7 Updated by Erik Lindahl about 2 years ago

  • Status changed from New to Resolved

#8 Updated by Erik Lindahl about 2 years ago

  • Status changed from Resolved to Closed

Also available in: Atom PDF