Project

General

Profile

Bug #165

Center of mass drifts despite setting nstcomm=1

Added by Markus Miettinen about 12 years ago. Updated over 9 years ago.

Status:
Closed
Priority:
Normal
Category:
mdrun
Target version:
Affected version - extra info:
Affected version:
Difficulty:
uncategorized
Close

Description

Created an attachment (id=235)
The first 200 lines of gmxdump of the tpr-file, showing the value of nstcomm

Setting nstcomm=1, to remove center of mass movement after each integration step,
sometimes fails to have the desired effect, causing the center of mass to drift considerably
(> 1 nm) during long (>100 ns) simulations.

The effect is difficult to reproduce. In 13 systems studied 3 showed a considerable
drift, but were identical in general simulation parameters (i.e. identical .mdp-files)
to the other 10.

The systems studied were lipid bilayers with periodic boundary conditions. The
drift was noticed when plotting the centers of mass of the bilayers as a function
of time. This revealed that the whole bilayer was drifting steadily along the direction
of the bilayer normal. Further investigation (with trjconv -pbc nojump) revealed that,
in fact, the center of mass of the whole system was drifting.

The expected behaviour would be that the center of mass of the system would
stay fixed. As a result of this, the center of mass of the bilayer should not drift
considerably along the bilayer normal direction.

tap06s10.dump (7.54 KB) tap06s10.dump The first 200 lines of gmxdump of the tpr-file, showing the value of nstcomm Markus Miettinen, 09/19/2007 04:34 PM
bilayer_drifts.png (438 KB) bilayer_drifts.png Drifting of the bilayer center of mass along bilayer normal Markus Miettinen, 09/19/2007 04:41 PM
system_drifts.png (215 KB) system_drifts.png Center of mass of the whole system drifts Markus Miettinen, 09/19/2007 04:45 PM
tap06s10_1ns_dk.tpr (3.14 MB) tap06s10_1ns_dk.tpr The .tpr file for one of the drifting systems Markus Miettinen, 09/20/2007 11:13 PM
boxDimensionsNotDrifting.png (272 KB) boxDimensionsNotDrifting.png The box does not deform in any of the systems mentioned in Comment #1 Markus Miettinen, 09/23/2007 03:54 PM
no_jumps_between_monolayers.png (357 KB) no_jumps_between_monolayers.png Lipids do not change monolayers during simulation Markus Miettinen, 09/23/2007 03:58 PM
coms_and_waterJumps.png (524 KB) coms_and_waterJumps.png There is no pore in the bilayer Markus Miettinen, 09/23/2007 04:16 PM
CoMz:boxZ.png (77.6 KB) CoMz:boxZ.png System center of mass position scaled with the box size (z direction) Markus Miettinen, 09/24/2007 04:52 PM
bug165_scaledCoMs.png (283 KB) bug165_scaledCoMs.png Scaled CoMs (x,y,z) for all 13 systems show drift Markus Miettinen, 10/02/2007 04:56 PM
system_drifts_2.png (410 KB) system_drifts_2.png Comparison of different p-couplings, when using Nose-Hoover T-coupling for the whole system Markus Miettinen, 11/23/2007 05:22 PM
allcoords.jpg (124 KB) allcoords.jpg More evidence David van der Spoel, 10/01/2008 08:47 AM
kkk.jpg (210 KB) kkk.jpg Results David van der Spoel, 10/13/2008 11:45 AM
bb.tgz (519 KB) bb.tgz Two xvg files showing COM before and after some magic updates David van der Spoel, 06/10/2010 07:23 AM
bx.zip (502 KB) bx.zip Two more xvg files, showing it is not perfect anyway David van der Spoel, 06/22/2010 04:08 PM

History

#1 Updated by Markus Miettinen about 12 years ago

Created an attachment (id=236)
Drifting of the bilayer center of mass along bilayer normal

#2 Updated by Markus Miettinen about 12 years ago

Created an attachment (id=237)
Center of mass of the whole system drifts

The plot shows that the center of mass of the whole system is drifting towards higher "z" and smaller "x" and "y" values. In addition, it shows that the simulation box size is not changing.

#3 Updated by David van der Spoel about 12 years ago

please submit a tpr file.

which com groups did you use?

#4 Updated by Markus Miettinen about 12 years ago

Created an attachment (id=238)
The .tpr file for one of the drifting systems

The com group was not explicitly defined; hence, should be the whole system.

#5 Updated by David van der Spoel about 12 years ago

If the center of mass drifts so much, could that be cause by box deformations? Otherwise where should it go?

Try using
g_traj -com -ekt
and select System. This will give you the translational energy. Since this is what the center of mass algorithm sets to zero it should be zero.

And also
g_traj -com -ox
and select System (this will give you the 3 coordinates of the COM). Compare to box edges. Because molecules may flip to other side of the box due to PBC there will always be fluctuations in this property.

#6 Updated by David van der Spoel about 12 years ago

I have now 700 ps of this sim and what is happening is that the box deforms (it slowly becomes cubic). There the COM changes as well. Please verify whether you the same effect, in which case it probably is something to do with the input, it could be that your lipid density is too high.

#7 Updated by Markus Miettinen about 12 years ago

The box did not deform considerably during a 220 ns simulation; please see the Comment #2, and especially the plot (http://bugzilla.gromacs.org/attachment.cgi?id=237) there. The plot shows both the box dimensions and the system CoM as a function of time in x, y and z directions.

Unfortunately I can not check CoM kinetic energy using "g_traj -com -ekt", because we did not save the particle velocities.

I suppose "g_traj -com -ox" should be applied on a trajectory that has been converted using "trjconv -pbc nojump"? When using PBC, one naturally sees no CoM drift, because, by definition, each particle is represented by the periodic image that lies inside the original simulation box. The plot in Comment #2 has been measured using a "nojump" trajectory. It also shows the comparison to box edges.

#8 Updated by David van der Spoel about 12 years ago

I think the nojump may be the cause. Consider the following thought experiment: if we have a membrane and one of the lipids moves out of the box in one direction while the others stay in the box. Now if you apply the nojump option you will see a net drift from the center of mass as a function of time. Since you are not coupling the lipids separately they may move with respect to the water. Please try running the analysis on a "jump" trajectory.

As regards velocities, the tpr you uploaded does save the velocities every 10 ps or so which is enough to show that no velocity builds up.

#9 Updated by Markus Miettinen about 12 years ago

Let me be explicit.

The plots in Comment #1 are done with PBC. They show that the CoM of the bilayer drifts towards higher z. So just the bilayer CoM is calculated here.

The plots in Comment #2 are done without PBC. This is because we are looking the whole system CoM here. If I will do this with PBC (and I did, when you suggested it), the CoM will ofcourse not show the drift, because everything is kept in the original box by definition.

To sum up. The bilayer CoM drifts about 1.6 nm in 220 ns (plot in Comment #1, which has PBC). The system CoM drifts about 1.6 nm in 220 ns (plot in Comment #2, which has no-PBC).

#10 Updated by David van der Spoel about 12 years ago

Could it be that single lipids are extracted from the bilayer and coming back on the other side? This would of course lead to imbalance in the number of lipids per monolayer. You could analyse the mean square displacement of single lipid molecules in the bilayer (g_msd -mol). That analysis is probably more reliable than the trjconv.

If you say that with comm removal on the whole system and with PBC everything stays in place, that is the expected behavior. That is no explanation for the result without pbc, but maybe an animation of the -pbc nojump can explain more.

Somehow Comment 1 does not make sense. If the bilayer moves upwards, but the center of mass remains remains in place then the water has to go down. That means there should be a pore in the membrane. Can you plot the Z component of the COM for the water corresponding to Comment 1? Please check the boxes in the 3 sims in comment 1 as well. In the tpr that you sent there is a clear change after very short time.

#11 Updated by Markus Miettinen about 12 years ago

Created an attachment (id=241)
The box does not deform in any of the systems mentioned in Comment #1

"Please check the boxes in the 3 sims in comment 1 as well. In the tpr that you sent there is a clear change
after very short time."

The plot shows the box (z) dimensions for each of the three systems in Comments #1. The colours correspond to those of the plot in Comment #1. No drift is seen.

The measurements (both the box sizes here and the bilayer CoMs in Comment #1) are done after the system has relaxed, i.e. after several nanoseconds. The .tpr file of Comment #4 is from the very beginning of the simulation.

#12 Updated by Markus Miettinen about 12 years ago

Created an attachment (id=242)
Lipids do not change monolayers during simulation

From Comment #10: "Could it be that single lipids are extracted from the bilayer and coming back on the other side? This would of course lead to imbalance in the number of lipids per monolayer. You could analyse the mean square displacement of single lipid molecules in the bilayer (g_msd -mol). That analysis is probably more reliable than the trjconv."

Lipid exchange between monolayers does not happen, please see the plot attached.

The plot also shows the MSD (g_msd -mol), which looks like ballistic motion, in accordance with the whole bilayer drifitng towards higher values of z with constant velocity.

#13 Updated by David van der Spoel about 12 years ago

Where these analyses done on the original trajectory or the one after trjconv?

#14 Updated by Markus Miettinen about 12 years ago

Created an attachment (id=243)
There is no pore in the bilayer

From Comment #10: "If you say that with comm removal on the whole system and with PBC everything stays in place, that is the expected behavior. That is no explanation for the result without pbc, but maybe an animation of the -pbc nojump can explain more."

This is not exactly what I am saying. When using g_traj with PBC (see the "jump" plot in attachment) the bilayer CoM goes up, the water CoM goes down and the whole system CoM stays more or less fixed.

The reason for the water CoM going down could then be either (1) water leaking through bilayer or (2) water going around the box, because of PBC.

The "nojump" plot on shows that, indeed, the CoM of the whole system is drifting. And the bilayer and the waters with it.

From Comment #10: "Somehow Comment 1 does not make sense. If the bilayer moves upwards, but the center of mass remains remains in place then the water has to go down. That means there should be a pore in the membrane. Can you plot the Z component of the COM for the water corresponding to Comment 1?"

To make absolutely sure that there is no pore in the membrane I plotted the centers of mass for each water molecule in the beginning of the simulation and then followed them throught the whole trajectory (g_traj -nojump) and plotted their positions at the end of the trajectory. This should reveal if the waters go through the bilayer. The last plot on the attachment shows that waters do go through the bilayer, but this movement happens quite seldom, and more importantly, it is symmetrical. Thus, this does not explain the bilayer drift.

To sum it up: The whole system center of mass drifts.

#15 Updated by Markus Miettinen about 12 years ago

(In reply to comment #13)

Where these analyses done on the original trajectory or the one after trjconv?

The original trajectory.

#16 Updated by David van der Spoel about 12 years ago

I think I understand what happened now (maybe you did it a long time before me but anyway). The bilayer goes up and the water goes up as well but wraps around the box. Hence if you use the "jump" option you see that the COM remains in place, and if you use nojump the COM goes upwards.

Now what is the correct way of looking at it? I don't think it matters, and if you check the kinetic energy of the COM you see that is close to zero, ergo the algorithm works. It could be that your original velocities are such that the whole thing moves (these are chosen such the the total COM is zero). You could run g_traj -ov to check the velocity of the bilayer and the water at step zero.

#17 Updated by Markus Miettinen about 12 years ago

(In reply to comment #16)

I think I understand what happened now (maybe you did it a long time before me
but anyway). The bilayer goes up and the water goes up as well but wraps around
the box. Hence if you use the "jump" option you see that the COM remains in
place, and if you use nojump the COM goes upwards.

Yes. This is exactly what happens and what I have been trying to explain :)

Now what is the correct way of looking at it? I don't think it matters, and if
you check the kinetic energy of the COM you see that is close to zero, ergo the
algorithm works. It could be that your original velocities are such that the
whole thing moves (these are chosen such the the total COM is zero). You could
run g_traj -ov to check the velocity of the bilayer and the water at step zero.

Well, I do think that it matters. The correct way of finding out if the CoM of the
system is moving is using the "nojump" condition. If the CoM stays fixed in this
setting, then it does so also in the "jump" condition. Furthermore, in this case
the CoMs of different components (bilayer, water) stay fixed in the PBC.

I looked at the velocities at which the CoM of the whole system moves in a few
frames, of which I had the .gro files. In all these the speeds were of the order
of a few 10e-7 nm/ps. This is an order of magnitude smaller than the average speed
required (7 x 10e-6 nm/ps) to see the drift of 1.6 nm in about 220 ns as seen
in plot of Comment #2.

If my original velocities were such that the systems moves at this speed, why
does the algorithm not remove the movement during the simulation? Or even
diminish its magnitude?

#18 Updated by David van der Spoel about 12 years ago

Comments from Berk Hess:

Since we only correct v and not x, the COM can move. I have never checked if this happened.

Berendsen pressure coupling does not affect the velocities, so it will not affect the COM velocity, it will only scale the COM position.

The only strange thing is that it happens in only some cases.
What is the difference between the systems in which this happens or does not happen? Are all of these DMPC bilayers?

#19 Updated by Berk Hess about 12 years ago

Some more comments:

Since we correct only v (after the update) and not x,
x is incremented with the uncorrected v.
But since v is corrected each step, v can only build up
a COM component is one step.
Therefore I guess one would usually not observe significant
COM motion.
Unless you not only have pair forces, but also forces with
an absolute reference, e.g. freeze groups, position restraints,
COM pulling.
You don't have those, right?

In your case the COM motion is still extremely small over
one step. To see how the COM really bahaves, could you
plot com[Z]/box[Z][Z] and post the graph?

What I consider the most worrying is that it only happens
with three out of all systems.
What are the differences between the systems?
Only starting coordinates and velocities, or also the composition?

Berk.

#20 Updated by David van der Spoel about 12 years ago

Could this be a variant of the flying ice-cube problem? The COM vel of the previous step is added to the velocity, which is then scaled by temperature coupling, that means we are not applying the whole stopcm correction.

#21 Updated by Berk Hess about 12 years ago

No.

The COM velocity is set to zero after each step, so no velocity
can accumulate.

To locate the problem we first need to look at a v_com/box
graph, so we can see if the COM has a constant velocity or
if it diffuses.

Berk.

#22 Updated by Markus Miettinen about 12 years ago

(In reply to comment #18)

The only strange thing is that it happens in only some cases.
What is the difference between the systems in which this happens or does not
happen? Are all of these DMPC bilayers?

All the systems (drifting as well as non-drifting) comprise DMPC and DMTAP in differing proportions.

All the drifting systems have also NaCl in them. But among the non-drifting systems there are also several systems with NaCl.

I do not see any clear connection between a particular system setup and the occurrence of drift.

#23 Updated by Markus Miettinen about 12 years ago

(In reply to comment #19)

Some more comments:

Since we correct only v (after the update) and not x,
x is incremented with the uncorrected v.
But since v is corrected each step, v can only build up
a COM component is one step.
Therefore I guess one would usually not observe significant
COM motion.
Unless you not only have pair forces, but also forces with
an absolute reference, e.g. freeze groups, position restraints,
COM pulling.
You don't have those, right?

No, I do not have any of these.

What I consider the most worrying is that it only happens
with three out of all systems.
What are the differences between the systems?
Only starting coordinates and velocities, or also the composition?

As I mentioned in Comment #22 there are differences in lipid composition (DMPC/DMTAP proportion) and NaCl concentration, but there is no systamatic link between composition and the occurrence of drift.

#24 Updated by Markus Miettinen about 12 years ago

Created an attachment (id=244)
System center of mass position scaled with the box size (z direction)

From Comment #19:

In your case the COM motion is still extremely small over
one step. To see how the COM really bahaves, could you
plot com[Z]/box[Z][Z] and post the graph?

The plot attached. This is the blue curve, in plot of Comment #2, divided by the red curve.

#25 Updated by Berk Hess about 12 years ago

Thanks.

The scaled plot looks lice diffusion, no constant velocity
as I expected.
But now and then there must be something non-symmetric happening.

I don't understand how this could happen (unless surface_epsilon
is not infinity, in which case the ions would cause trouble).

I guess that for the non-drifting system you scaled COM
would be exactly constant.
Is that correct?

Until next week I probably don't have time to look into this deeper.

Berk.

#26 Updated by Markus Miettinen about 12 years ago

(In reply to comment #25)

The scaled plot looks lice diffusion, no constant velocity as I expected.
But now and then there must be something non-symmetric happening.

Speaking of non-symmetricity, one thing really stroke me as odd in the scaled plot of Comment #24: The com-position either grows towards more positive values or stays constant. It never goes towards smaller values.

I don't understand how this could happen (unless surface_epsilon
is not infinity, in which case the ions would cause trouble).

According to the dump attached to original Description "epsilon_surface = 0"

#27 Updated by Markus Miettinen about 12 years ago

Created an attachment (id=246)
Scaled CoMs (x,y,z) for all 13 systems show drift

(In reply to comment #25)

Sorry, it took a while to find the time to make this analysis:

I guess that for the non-drifting system you scaled COM
would be exactly constant. Is that correct?

To my surprise, this is not correct.

Instead the scaled CoM position is drifting in all 13 systems and in all directions (x,y,z), please see the attachment. The drift is just smaller in the 10 other systems, which seemed like not drifting at all in the (lower) plot of Comment #2.

Could this have something to do with round-off error?

#28 Updated by Markus Miettinen about 12 years ago

(In reply to comment #27)

Instead the scaled CoM position is drifting in all 13 systems and in all
directions (x,y,z), please see the attachment. The drift is just smaller in the
10 other systems, which seemed like not drifting at all in the (lower) plot of
Comment #2.

Oops, a typo. I ofcourse meant the plots of Comment #1, not those of Comment #2. The same typo is also in the attachment of Comment #27.

#29 Updated by Berk Hess about 12 years ago

One complicating factor is the Berendsen coupling with two different groups.
To exclude effects due to this issue, could you run the most drifting
system with Nose-Hoover with one tcoupl group, with say tau=2 ps?

Berk.

#30 Updated by Markus Miettinen about 12 years ago

Created an attachment (id=269)
Comparison of different p-couplings, when using Nose-Hoover T-coupling for the whole system

(In reply to comment #29)

One complicating factor is the Berendsen coupling with two different groups.
To exclude effects due to this issue, could you run the most drifting
system with Nose-Hoover with one tcoupl group, with say tau=2 ps?

Sorry, that this has taken a while. The results, however, are hopefully useful.

I resimulated the most drifting system for 20 ns (between 220 and 240ns). I did this in three different ways, all of which had just one tcoupl group, with tau=2ps and Nose-Hoover. The three ways differed in the way that I handled the pressure: Berendsen, Parrinello-Rahman and no control were tested.

The results between all three differ qualitatively. No control does exactly what is expected, and Parrinello-Rahman also looks promising. I, however, have to point out that atleast when using two T-coupling groups, also NH+PR -simulatios are known to drift.

Please see the attachment for plots.

#31 Updated by David van der Spoel over 11 years ago

-------- Original Message --------
Subject: Gromacs Bug#: 165
Date: Mon, 07 Jul 2008 21:46:26 +0200
From: Martin Dahlberg <>
To:

Hi David,

With regards to the Bug 165 in the gromacs bugzilla (Center of mass
drifts despite setting nstcomm=1), which is still open and assigned to you:

I recently found some odd behavior in the pressure control at long time
scales, and found this post by Berk which addresses my issues:

Berk Hess hessb at mpip-mainz.mpg.de
Wed Feb 13 14:43:04 CET 2008

In all Gromacs versions before 4 the box shape would "diffuse" over time
when using (semi)isotropic pressure coupling. This issue (only noticable
for very long simulations) has been fixed for version 4.0.

I believe this could influence the CoM behavior, which is also indicated
in Markus Miettinen's plots in the bug report because the problem
disappears when he uses the NVT ensemble.

#32 Updated by Berk Hess over 11 years ago

Hi Martin,

With "I believe this could influence", do you really mean believe,
or have you tried it and it fixes your problem?

The fix I implemented only affects the shape of non-cubic boxes.
I would guess it would not influence your issue, since that is probably
caused by rounding of the last bit of the coordinates during pressure scaling.
The pressure scaling itself is not affected by my fix, only the relative scaling
of x, y and z.

Berk.

#33 Updated by Markus Miettinen over 11 years ago

(In reply to comment #32)

Hi Berk,

you wrote, in your reply to Martin:

I would guess [...] your issue [...] is probably caused by rounding
of the last bit of the coordinates during pressure scaling.

where "your issue" refers to the issue of the present Bug (165).

I started to wonder that if the CoM drift really is caused by rounding
error, then

1. Why is the drift taking place so consistently (over hundreds of nanoseconds)
towards either bigger or smaller values (see plot of Comment #27), instead of
showing random fluctuations to both directions?
2. Why is the drift in z-coordinate often towards different direction than the drifts
in x- and y-coordinates? (I guess the whole simulation box is in the 1st octet (x>0,
y>0, z>0) of the coordinate system?)

#34 Updated by David van der Spoel about 11 years ago

Created an attachment (id=285)
More evidence

I have now reproduced this bug using a much simpler decane/water system. Upon testing different T and P coupling algorithms (see figure) I found that Berendsen P coupling is the culprit. T coupling does not matter. It could be that the reason is an aggregation of numerical error in the following manner: multiplying large numbers (high Z) by a scaling factor mu > 1 will usually give less round-off error than multiplying by a mu < 1. I will test this by writing a simple program. Further tests: double precision Berendsen P, Berendsen T. A solution to the problem could be to scale the positions and box wrt to the center of the box. If that fixes it then it could even be worthwhile to write a short paper about it.

#35 Updated by Berk Hess about 11 years ago

The center of the system is irrelevant, since absolute coordinates
are irrelevant for the current algorithms.
vcm is determined from v, not x.
The only thing I can think of is rounding issues, which might cause
a systematic effect when scaling with 1.0 + 1 bit and 1.0 - 1 bit.
But I have not looked into this.
The interesting question is why it goes wrong with Berendsen
and not with Parrinello-Rahman.

I have looked at the Parrinello-Rahman code, but I don't understand it.
The coordinates are never scaled, only the velocities are scaled.
Since the box is scaled, this would that at the edges artificial
forces are introduced.
Maybe Erik can explain this?

Berk

#36 Updated by Erik Lindahl about 11 years ago

With Parinello-Rahman the box vectors aren't scaled, but integrated - they have accelerations and velocities.
Formally the atom coordinates are integrated in relative box coordinates, but after going back this can be solved to yield an extra (velocity-dependent) force on each atom in Cartesian coordinates. Then it's a separate issue this might not be perfect with leap-frog.

#37 Updated by David van der Spoel about 11 years ago

My proposal earlier fixes the problem. I will do more systematic tests and write it up. I think it is purely numerical error, but I'll make a separate toy program to prove that.

#38 Updated by Markus Miettinen about 11 years ago

(In reply to comment #35, where Berk wrote:)

The interesting question is why it goes wrong with Berendsen
and not with Parrinello-Rahman.

I would like to point out that there are cases in which this happens
with Nose-Hoover temperature coupling and Parrinello-Rahman
barostat.

I have posted an example of such incident in Comment #30
(attachment http://bugzilla.gromacs.org/attachment.cgi?id=269
please see the lower part of the attachment).

#39 Updated by David van der Spoel about 11 years ago

Thanks for pointing that out. It looks more like a random walk however, then any of the other cases. As said, I'm now doing a series of systematic tests. Will come back with results after lunch.

#40 Updated by Berk Hess about 11 years ago

Erik,
I don't see how you could incorporate coordinate scaling
into the velocity scaling.
The velocities are completely uncorrelated with the positions
and the velocities are just scaled with M.
So I think that the current implementation of P-R is incorrect.
The coordinates are not scaled and the box is,
therefore overlap or a gap is introduced at the box boundaries.

BTW Also scaling the coordinates with M might lead to the same
com motion problem as with Berendsen.

Berk

#41 Updated by David van der Spoel about 11 years ago

OK, I was too optimistic. First off, my proposed patch does not seem to work, second the simple test program does not show any systematic drift even when drawing billions of random numbers. The problem is there in double precision as well, which basically tells us that it is unrelated to precision.

Some new information from my tests is that the problem goes away when using only P-scaling in the Z-direction (normal), or using P-scaling in the lateral direction only. In other words, there is a coupling between lateral scaling and normal scaling. I will investigate this further by changing my decane layer to a propane layer with fixed angles to see whether there is a coupling between angle potential and pressure scaling. We have previously had problems with coupling between pressure scaling and improper dihedrals, leading to instabilities when tau_P was too small (1 ps).

#42 Updated by Berk Hess about 11 years ago

I want to bring up the Parrinello-Rahman issue again.
I think not scaling the coordinates there is incorrect.
We should also scale the coordinates with M.
And I think we should implement this before the official 4.0 release.

Berk

#43 Updated by Erik Lindahl about 11 years ago

The problem is that it is not just a matter of blindly testing scaling or not, but somebody has to work through all the equations for velocity verlet, translate to leap-frog and solve the equations you end up with, calculate what the conserved properties in leap-frog should be, and test in a couple of long simulations that they really are conserved.

It would of course be great to have, but since it's not been quite proper for the last two years I don't think it is critical enough to hold up the 4.0 release and force everybody to use 3.3 in the meantime.

#44 Updated by David van der Spoel about 11 years ago

Berk, are you aware that the update algorithm is different for PR coupling as
well?

I'm not sure whether the implementation is completely correct, but there is more to is than what is coupling.c.

On the other hand people are using PR coupling in good faith and it should be correct, or disabled.

#45 Updated by Erik Lindahl about 11 years ago

Even the current partially broken implementation of P-R is better than Berendsen, so disabling it would be counter-productive.

There are a bunch of minor things open with RC3. I hope to release RC4 tomorrow, and in a week or so we could be at a final 4.0 release. If somebody fixes P-R before then it would of course be great, but I won't have the time to do it :-)

#46 Updated by Berk Hess about 11 years ago

Since you said before that the equations are in relative
coordinates, I deduce that we should just scale the coordinates.
Since the coordinates themselves do not affect any other
quantity in the update, this change will not affect any other
property.

I think that one can anyhow not do proper second order leap
frog integration with P-R. Velocity verlet is the only real
solution. But not scaling x really introduces artifacts.

Erik, from which paper did you get the formulas for the P-R
integration?

Before the 4.0 release we need to fix some serios issues
on the Cray XT4 and on the Blue Gene.

Berk

#47 Updated by David van der Spoel about 11 years ago

Maybe we should reopen

http://bugzilla.gromacs.org/show_bug.cgi?id=18

and

http://bugzilla.gromacs.org/show_bug.cgi?id=14

both containing additional information on PR coupling.

#48 Updated by Erik Lindahl about 11 years ago

The references are in the manual, basic stuff in the Parrinello/Rahman paper,
but the Nosé/Klein is a better reference for what we actually do. All
equations in those papers are for velocity verlet - you will have to translate
to leap-frog by making the updated v/x unknown quantities, and solving the
resulting equation system.

However, IF you want to change it this close to the release I would first
want to see that it really improves properties and still produces the right
pressures. Changing the particle integration to relative coordinates but not
the box vector integration stuff could easily make things much worse.

Remember that we had an issue a couple of years ago with multiple minor
"improvements" to the PME code that shouldn't have affected anything else but
still broke the code completely for non-rectangular boxes...

#49 Updated by David van der Spoel about 11 years ago

OK, since we apparently have Velocity Verlet implemented almost, I propose we do the following:

- Fix other bugs related to mdrun on the big machines (and as many as the others as possible obviously)
- Release 4.0

- Implement Velocity Verlet and Parrinello Rahman coupling correctly
- Disable PR with Leap Frog
- Implement 2 D FFT decomposition
- Release 4.1

Everything else should come after 4.1.

#50 Updated by Berk Hess about 11 years ago

I basically agree.

But next week I will have a look into the P-R scaling.
I don't think it is an issue of relative coordinates or not.
Currently everything is in real coordinates, but the coordinates
themselves are not scaled at all.

You now I never commit changes without thorough testing :)

Berk

#51 Updated by David van der Spoel about 11 years ago

This really is a bitch. I have tried 17 different simplified membrane systems with different variables and parameters, most notably decane/water. I typically used semi-isotropic pressure scaling. Without strong interactions in the "membrane" phase, the box extends forever in the Z-direction (that is until the cut-off gets larger than half the box). In other words, without "real" lipids the surface tension of the system is not high enough. The plots I submitted earlier do not show moving of a layer in a constant box but rather the extension of the box itself.

I could try having a positive pressure in the Z direction and a negative one in the X/Y direction, just to get a stable system. Objections?

An observation I could make based on these simulations is that the effect (wandering lipid layer) only happens with semi-isotropic scaling, not with isotropic and not with semi-isotropic scaling where the compressibility in the lateral direction is zero.

In order to test spurious coupling between forces and pressure coupling I have turned off angles (using constraints) and dihedral forces. I will probably hve to do those tests again once I get a stable system.

#52 Updated by David van der Spoel about 11 years ago

Created an attachment (id=288)
Results

Here are more results, for decane-water with three different P-scaling regimes. It seems that there is an effect even with isotropic scaling, although the scales are different. In the case where the P-ref are different the position of the decane layer looks a bit like a random walk.

I have created a slightly polar molecule now that equilibrates with ref-p = 1 in both lateral and normal directions. More on this later.

#53 Updated by Rossen Apostolov over 9 years ago

What is the status of this bug? Can we close it?

#54 Updated by David van der Spoel over 9 years ago

Not yet. It would be good to do some more test with some of the new integrators to see if that resolves it.

#55 Updated by David van der Spoel over 9 years ago

Created an attachment (id=462)
Two xvg files showing COM before and after some magic updates

Two hundred ns simulations of dppc are shown. One (before) run in Nov. 2009, this clearly shows the system center of mass deviating from the half box size. The second (after) was run in June 2010 with the latest git code. Here the effect is gone. It seems that some magic update has fixed the problem. Unless anyone object violently I will close this bug in a few days.

#56 Updated by Rossen Apostolov over 9 years ago

Using
$ git bisect
will easily spot which commit was responsible for the bug, although you still have to run multiple simulations for the checks.

#57 Updated by Berk Hess over 9 years ago

That's very interesting (and nice)!

I don't recall any significant changes to the pressure coupling code.
Except maybe for allowing pressure coupling only every nstcalcenergy steps.
What is nstcalcenergy set to (grep for it in the log file)?

Berk

#58 Updated by David van der Spoel over 9 years ago

[anfinsen:165/dppc4] % grep nstcalcenergy md.log
nstcalcenergy = 5

The only worry is that I've run it once only.

#59 Updated by Berk Hess over 9 years ago

This means that the pressure (and temperature) coupling is only applied
every 5 steps with a 5 times larger effect in that step.
I don't see directly how this would affect the drift, but it might be
worth it to run once more with nstcalcenergy set to 1.

Berk

#60 Updated by David van der Spoel over 9 years ago

Created an attachment (id=480)
Two more xvg files, showing it is not perfect anyway

Here are two more xvg files, in a response to Berk's question. Unfortunately I managed to make an error. The first one (bx5.xvg) is still with nstcalcenergy = 5, however now run on 8 cores. This gives not as good conservation of the COM as the previous run (with the same tpr) on 32 cores. The second one (bx6.xvg) is with nstcalcenergy = 1, but now also with using 1 T-coupling group for the whole system only. Unfortunately two changes, but this one does not show great conservation either, although it is not too bad. In other words, still no clue what is causing this other then numerical round-off problems.

#61 Updated by Berk Hess over 9 years ago

Although these results do not show perfect conservation, I think we can
consider these to be pure numerical rounding effects. The systematic increase
of the COM with respect to the box size seems to have disappeared.
The unsatisfying part is that we don't (yet) know what caused this.

Berk

#62 Updated by David van der Spoel over 9 years ago

The results are now so much better that the resulting drift of the center of mass from half the box size can most likely be attributed to numerical round-off. Although this is not quite satisfactory I close the bug.

Also available in: Atom PDF