Project

General

Profile

Bug #1218

vrescale_resamplekin assumes an integer, but an integer isn't always passed

Added by Michael Shirts over 4 years ago. Updated over 3 years ago.

Status:
Closed
Priority:
High
Assignee:
Category:
mdrun
Target version:
Affected version - extra info:
4.6.1
Affected version:
Difficulty:
uncategorized
Close

Description

vrescale_resamplekin expects an integer:

static real vrescale_resamplekin(real kk, real sigma, int ndeg, real taut,
gmx_rng_t rng)

But the value that is passed in to ndeg is opts->nrdf[i], which, due to subtraction of COM degrees of freedom which are then partitioned among the
temperature coupling groups, may not be an integer. In my case, I was trying every atom as its own tcoupling group, which lead to opts->nrdf[i] = 2.99 or so, and ndeg becomes 2. It might make more sense to round?

I've tried without COM removal with one thermostat per atom, so that there is now exactly 3 DOF per temperature group, to see if it gives the correct temperature, but it gives a temperature too high. (306 vs 300) I wonder if there is some other approximation that fails when nrdf becomes low?

I've attached a .tpr, though it will fail without increasing the size of several of thetemperature group variables.

Getting vrescale to work right for many temperature groups is probably not a super high priority, but I'm worried that this behavior may be indicative of subtle errors with in more normal situations.

md_all.tpr (296 KB) md_all.tpr Michael Shirts, 04/09/2013 04:27 AM

Associated revisions

Revision 1f8a4f94 (diff)
Added by Berk Hess over 3 years ago

Removed truncation of nrdf in v-rescale thermostat

The resampling function for the v-rescale thermostat expected an
integer value for nrdf, but a real was passed, which was truncated.
With a single coupling coupling group nrdf is analytically an int,
but could be off by a bit. The could lead to incorrect kinetic
energy fluctuations (averages were correct).
Now fractional nrdf's are properly handled for nrdf > 3.
For nrdf < 3 a check is added for integer values with a small margin
for rounding.
Fixes #1218

Change-Id: I4c60c337f9874d0bff51220ad09429140be2a056

History

#1 Updated by Berk Hess over 4 years ago

  • Priority changed from Normal to High

We should change ndeg to a real.
Have you tried that with your many group case?
It could be that some atoms get a ndeg of 3 - one bit.

#2 Updated by Michael Shirts over 4 years ago

I'm don't think making it ndeg a real would work, since some of the algorithms that vrescale_resamplekin uses require integers.

I didn't pursue this fully, because when I used COM = none, so I got full integers everywhere, then the temperature was too high -- 306, statistically significantly (errors was about 0.5 K, over about 10 ps or so). So the thermostat has additional problems beyond this.

Note that for, say, 900 argon molecules with one temperature group, the differences are statistically insignificant (see the checkensemble paper), so I don't know how vital it is to fix this -- but it should be fixed eventually.

#3 Updated by Berk Hess over 4 years ago

Indeed, for small ndeg we need integers.
I'm still not sure if there are no rounding issues. FP calculations are not that simple, there could still be rounding issues due to the ratio of sums being off by one bit. To check that you would need to round ndeg before passing it.
Your tpr format changed, so I can't check your input.
I ran a 20 system LJ particle system and didn't observe any issues.
What tau_t are you using compared to the LJ time?
It could be that the v-rescale algorithm "simply" is not valid with multiple/many t-coupl groups.
Indeed, not a critical issue, but we need to understand what is going on.

#4 Updated by Michael Shirts over 4 years ago

FP calculations are not that simple, there could still be rounding issues due to the ratio of sums being off by one bit. To check that you would need to round ndeg before passing it.

Agreed. Though being off by one bit is better than being off by 0.99.

Your tpr format changed, so I can't check your input.

Ah, sorry. Let me try to construct a system that runs under unmodified gromacs. Might take a day or two to get a chance.

I ran a 20 system LJ particle system and didn't observe any issues.
What tau_t are you using compared to the LJ time?

The .tpr is with water (not constrained), not LJ. tau_t is 0.1, timestep was 0.0005.

It could be that the v-rescale algorithm "simply" is not valid with multiple/many t-coupl groups.

Could be, will need to investigate this a bit more.

Indeed, not a critical issue, but we need to understand what is going on.

Right.

#5 Updated by Michael Shirts over 4 years ago

Still working on constructing a small enough problem to fit in the unmodified 4.6 to have it show up statistically. Perhaps for < 256 groups it ends up not being statistically signficant? Certainly the ndeg error should be fixed, but I can't yet determine if the other issues beyond that are relevant or not.

#6 Updated by Mark Abraham over 4 years ago

  • Target version changed from 4.6.2 to future
  • Affected version set to 4.6.1

#7 Updated by Mark Abraham over 4 years ago

  • Target version changed from future to 4.6.x

#8 Updated by Michael Shirts almost 4 years ago

Was there any final solution here?

#9 Updated by Mark Abraham almost 4 years ago

Not that I know of

#10 Updated by Berk Hess over 3 years ago

I don't see an easy fully accurate solution for fractional degrees of freedom.
I would suggest that we add rounding for nrdf, which will always be correct for a single coupling group and at least better for multiple groups, which you should anyhow only use with large groups and weak coupling, where the error will be small.
Should I put this in 4.6, which should be merged into 5.0?

#11 Updated by Gerrit Code Review Bot over 3 years ago

Gerrit received a related patchset '1' for Issue #1218.
Uploader: Berk Hess ()
Change-Id: I4c60c337f9874d0bff51220ad09429140be2a056
Gerrit URL: https://gerrit.gromacs.org/3392

#12 Updated by Berk Hess over 3 years ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100

#13 Updated by Rossen Apostolov over 3 years ago

  • Status changed from Resolved to Closed

Also available in: Atom PDF