Project

General

Profile

Bug #1255

bond constraint contribution to dhdl not being added.

Added by Michael Shirts over 4 years ago. Updated about 3 years ago.

Status:
Closed
Priority:
High
Category:
mdrun
Target version:
Affected version - extra info:
Affected version:
Difficulty:
uncategorized
Close

Description

See subject. The dhdl of bond constraint is being added to enerd->term[F_DVDL_BONDED], but then erased before being summed.

Proposed fix is to resurrect the F_DHDL_CON term (renamed F_DVDL_CONSTR) and use it to save this information, and THEN sum it into the bond component.

A fix has been (lightly) tested and will be posted to gerrit very soon.

Associated revisions

Revision 5d9070d9 (diff)
Added by Michael Shirts over 4 years ago

Fixed problem with bond constraint dhdl being left out

At some point, the dhdl from the holonomic bond constraints got left out,
because the term got zeroed before being added to the bonded energy.

Also, I made some of the variable names involved more descriptive to
clarify the logic.

Removed velocity constraint calls, as don't contribute to dhdl.

Fixes redmine #1255

Change-Id: I2eebf357d1a922b7636dc015b09f286827283dd0

Revision 714b43e2 (diff)
Added by Michael Shirts over 4 years ago

Fixing the constraint dhdl term in shake

Addresses and corrects new issues with #1255,
with additional comments explaining what else
may need to be done in the future to make sure
that the theory of the changes are completely
correctly founded. See the redmine entry #1255
to see the numerical validation of the current choices.
Also adds possible missing constraint dHdl contribution.
Fixes #1268

Change-Id: Ibe3980c2e265cced49b03b966e02143c5cf9f07c

Revision 513813b5 (diff)
Added by Mark Abraham about 3 years ago

Clean up cshake() and resolve old issues

C++ conversion in Id190e36 exposed some "interesting" use of the
variable toler caused by an ancient bug that has since been resolved.
Investigation showed that variable never meant anything to do with a
tolerance!

Renamed a bunch of variables for clarity and consistency. Documented
the core function. Clarified code comments and manual sections
accordingly.

Added some unit tests while trying to understand what things were
actually going on.

Refs #1255

Change-Id: I5b5eee0b8f3f4761ce9cb681dbdb0d4526a6761d

History

#1 Updated by Mark Abraham over 4 years ago

The parade of mdrun fixes for 4.6.2 continues :-)

#2 Updated by Michael Shirts over 4 years ago

But everybody loves a parade!

#3 Updated by Mark Abraham over 4 years ago

  • Status changed from New to Accepted

#4 Updated by Mark Abraham over 4 years ago

  • Status changed from Accepted to In Progress

#5 Updated by Mark Abraham over 4 years ago

  • Status changed from In Progress to Fix uploaded

#6 Updated by Szilárd Páll over 4 years ago

Is this really a "normal" priority bug? Should it not be at least high priority and perhaps announced so people know about it and can redo their simulations if necessary?

#7 Updated by Michael Shirts over 4 years ago

  • Priority changed from Normal to High

Probably should be higher. Raised.

#8 Updated by Mark Abraham over 4 years ago

  • Status changed from Fix uploaded to Resolved

#9 Updated by Michael Shirts over 4 years ago

  • Status changed from Resolved to In Progress

Further testing showed that although these difference are now summed, they are not yet calculated correctly. Validation was against changing a bond length, which should have constraint free energy propto kbT ln (bond_old/bond_new), and thus dh/dl = (dr/dl)/r, where r = bond_old + lambda*(bond_new-bond_old).

  • shake dhdl has an extra factor of the bond length in the denominator that was added at some point in the past (likely a few versions ago). I am figuring out where to put this exactly and am documenting in the code so the derivations are correct (i know WHAT the error is, but not where it was made yet). Lincs dhdl has the correct dependence on r.
  • Both shake and lincs differ by a factor of 2 between when applied to md-vv and md (not clear which is correct yet, appears to be md-vv, but I need to double check my derivation). I will figure out which one is correct and put the documentation of this in comments in the code. Free energy contribution, SHOULD be (I think) 2* kbT ln (bond_old/bond_new), while md gives 4*kbT kn (bond_old/bond_new), but I need to double check this.

I hope to finish this today, but requires being extra careful, so may not be until over the weekend.

#10 Updated by Michael Shirts over 4 years ago

  • Status changed from In Progress to Fix uploaded

I've fixed the terms now, as validated by a thermodynamic cycle below -- this allowed me to address the ambiguity in
the constant term in front of the correction.

Difference in solvation of a 0.5 and 0.2 nm dipole carried out two ways:

1. solvation difference between long and short
2. difference in change in free energy in solution and in vacuum:

solv-long - solv-short = (96.397 +- 0.163) - (56.424 +- 0.196) = 39.94 +/- 0.25 kJ/mol
change-solv - change-vac = (43.597 +- 0.141) - ( 3.321 +- 0.005) = 40.27 +/- 0.14 kJ/mol

One issue: the md-vv constraint is smaller by a factor of 1/2, because only 1/2 the velocity term is uncorrected. The fastest fix is to simply double the constraint contribution. In the limit of small timesteps, this is completely correct, but it is unclear if there is noticible error over longer timesteps. A first pass over 10 ns show that there probably isn't any noticeable error in this. the other way is to go back and compute the dhdl due to the forces in the velocity step, but this would require a lot more deeper work. This approach is, at the worst, correct to the limit of statistical error, and I'd rather make sure this gets in 4.6.2, and make fine adjustments later than not have it in at all.

#11 Updated by Michael Shirts over 4 years ago

Just to clarify about whether the approximation for md-vv (doubling the position restraint correction term) is correct,
here's the test with changing the dipole length from 0.5 to 0.2 nm with md, md-vv, and sd, 10 ns simulatiom. Note there is one sd simulation (lambda state = 1) that is running really slowly for some pinning issue, affecting the overall uncertainty-- I will fix it when it finishes. I don't think it affects the conclusions.

                   md:                  md-vv:                sd:
   0 -- 1         1.760  +-  0.018     1.752  +-  0.018      1.680  +-  0.078
   1 -- 2         1.831  +-  0.018     1.805  +-  0.018      1.764  +-  0.067
   2 -- 3         2.307  +-  0.019     2.267  +-  0.019      2.315  +-  0.024
   3 -- 4         2.887  +-  0.019     2.831  +-  0.019      2.887  +-  0.022
   4 -- 5         3.412  +-  0.019     3.386  +-  0.019      3.415  +-  0.021
   5 -- 6         4.225  +-  0.018     4.235  +-  0.018      4.226  +-  0.021
   6 -- 7         5.246  +-  0.017     5.239  +-  0.018      5.268  +-  0.020
   7 -- 8         6.361  +-  0.017     6.341  +-  0.017      6.335  +-  0.020
   8 -- 9         7.340  +-  0.016     7.288  +-  0.016      7.287  +-  0.019
   9 -- 10        8.085  +-  0.016     8.032  +-  0.019      8.044  +-  0.018
---------- ---------------- -----------------  ------------------
 TOTAL:          43.455  +-  0.069    43.178  +-  0.070     43.221  +-  0.154

(can't figure out how to fix spacing)
So, they appear to be almost statistically indistinguishable. The approximation for md-vv (if it is an approximation) gives results that appear to be well within noise for sd, and close to noise to md-vv (2.6 sigma). This should be good enough for essentially all situations.

#12 Updated by Mark Abraham over 4 years ago

Tabs are evil. Fixed :-)

#13 Updated by Mark Abraham over 4 years ago

  • Target version changed from 4.6.2 to 4.6.3

#14 Updated by Mark Abraham over 4 years ago

  • Status changed from Fix uploaded to Resolved

#15 Updated by Rossen Apostolov almost 4 years ago

  • Status changed from Resolved to Closed

#16 Updated by Mark Abraham about 3 years ago

While converting files to C++, I unearthed the cause of the bug that led to this fix.

Michael inserted some braces wrongly in http://redmine.gromacs.org/projects/gromacs/repository/revisions/a6ee084bcf3ac87d9dd493702b1c737d86506bdc/diff/src/mdlib/shakef.c that broke some FEP functionality at the end of vec_shake, which led to him later re-implementing that in bshakef (see http://redmine.gromacs.org/projects/gromacs/repository/revisions/714b43e235bf3ca67ebbe73bf5d267e1bf352b7b/diff/src/mdlib/shakef.c)

Berk's original implementation re-used the toler variable to post-multiply the Lagrangian multipliers by the constraint distance (weighted by FE-lambda in that case), but those multipliers are unused in the non-FE case. I don't think there is any use for that other than computing dH/dl, but we should clear that up rather than returning a Lagrangian that is scaled by the constraint distance. Michael's subsequent changes have converged on an implementation equivalent to Berk's original one, but with clarity in what value lagr[] actually holds.

Michael also observed that there is a factor of two that seems to be missing in the SHAKE case. The original Ryckaert SHAKE paper derives the constraints based on (r_i-r_j)^2-d^2=0, but the LINCS paper uses |r_i-r_j|-d=0. GROMACS documentation is inconsistent, with neither manual section 3.6.1 and 3.6.2 directly agreeing with section 4.5. The implementation seems to use the LINCS definition in the SHAKE case for computing dH/dl, which offhand seems OK so long as it is consistently implemented (and documented). So, is it? And how should we fix the docs?

#17 Updated by Gerrit Code Review Bot about 3 years ago

Gerrit received a related patchset '1' for Issue #1255.
Uploader: Mark Abraham ()
Change-Id: I5b5eee0b8f3f4761ce9cb681dbdb0d4526a6761d
Gerrit URL: https://gerrit.gromacs.org/4057

#18 Updated by Mark Abraham about 3 years ago

Mark Abraham wrote:

While converting files to C++, I unearthed the cause of the bug that led to this fix.

Michael inserted some braces wrongly in http://redmine.gromacs.org/projects/gromacs/repository/revisions/a6ee084bcf3ac87d9dd493702b1c737d86506bdc/diff/src/mdlib/shakef.c that broke some FEP functionality at the end of vec_shake, which led to him later re-implementing that in bshakef (see http://redmine.gromacs.org/projects/gromacs/repository/revisions/714b43e235bf3ca67ebbe73bf5d267e1bf352b7b/diff/src/mdlib/shakef.c)

Berk's original implementation re-used the toler variable to post-multiply the Lagrangian multipliers by the constraint distance (weighted by FE-lambda in that case), but those multipliers are unused in the non-FE case. I don't think there is any use for that other than computing dH/dl, but we should clear that up rather than returning a Lagrangian that is scaled by the constraint distance. Michael's subsequent changes have converged on an implementation equivalent to Berk's original one, but with clarity in what value lagr[] actually holds.

Clarified - the cshake/crattle routines return the Lagrangian multipliers divided by their respective constraint distances, so the post-multiplication was simply finishing the job of computing the Lagrangian multipliers.

Michael also observed that there is a factor of two that seems to be missing in the SHAKE case. The original Ryckaert SHAKE paper derives the constraints based on (r_i-r_j)^2-d^2=0, but the LINCS paper uses |r_i-r_j|-d=0. GROMACS documentation is inconsistent, with neither manual section 3.6.1 and 3.6.2 directly agreeing with section 4.5. The implementation seems to use the LINCS definition in the SHAKE case for computing dH/dl, which offhand seems OK so long as it is consistently implemented (and documented). So, is it? And how should we fix the docs?

Fixed docs, and made derivation in 4.5 make more sense, which clarifies why the code was correct all along. Note the distinction in the case of SHAKE between the quadratic constraint equation, and the linear lambda-interpolation.

Also available in: Atom PDF