Project

General

Profile

Bug #747

g_rms and g_rmsf give completely wrong results in some cases

Added by Semen Yesylevskyy over 8 years ago. Updated over 8 years ago.

Status:
Closed
Priority:
High
Category:
analysis tools
Target version:
Affected version - extra info:
Affected version:
Difficulty:
uncategorized
Close

Description

Attached is the test trajectory where the group of atoms is simply moved linearly each frame. RMSD in this case should be a straight line (when aligned by whole structure). This is really so in VMD (file rmsd_vmd.dat). BUT Gromacs shows very weird curve with a discontinuity (rmsd.dat).
This is simply wrong.
When rmsf is computed there is also big discrepancy and Gromacs gives wrong results.
It seems to be some severe bug in alignment routines, so this should affect other analysis programs as well.

Apparently this bug is very hard to find in "normal" circumstances. I've found it when I was developing custom analysis code (http://sourceforge.net/projects/pteros/). Gromacs and VMD were used as references and the results were not the same.

linear_motion.xtc (4.14 MB) linear_motion.xtc Semen Yesylevskyy, 05/03/2011 06:38 PM
rmsd_vmd.dat (1.41 KB) rmsd_vmd.dat Semen Yesylevskyy, 05/03/2011 06:38 PM
rmsd.xvg (2.9 KB) rmsd.xvg Semen Yesylevskyy, 05/03/2011 06:38 PM
dimer_pdb2gmx.gro (472 KB) dimer_pdb2gmx.gro Semen Yesylevskyy, 05/03/2011 06:41 PM
new.xvg (2.89 KB) new.xvg Semen Yesylevskyy, 05/03/2011 07:06 PM
rmsd_from_gro.xvg (2.93 KB) rmsd_from_gro.xvg Justin Lemkul, 05/03/2011 07:30 PM

History

#1 Updated by Semen Yesylevskyy over 8 years ago

Structure file for attached trajectory

#2 Updated by Justin Lemkul over 8 years ago

The jump coincides with your pulled group crossing the box. Did you perform any fitting or periodicity removal (with trjconv) ? If so, what were your commands? If not, then that's the problem. For complex systems, proper measurement of RMSD and RMSF (and probably more) is not handled very well by the tools in the absence of using trjconv.

#3 Updated by Semen Yesylevskyy over 8 years ago

As far as I understand "-pbs nojump" is what you are talking about. Just tested it - bug is still there:

trjconv -f linear_motion.xtc -s dimer_md.tpr -pbc nojump -o linear_nojump.xtc
g_rms -s dimer_md.tpr -f linear_nojump.xtc -n -o new.xvg

And new.xvg is attached - the same discontinuity.

#4 Updated by Justin Lemkul over 8 years ago

Semen Yesylevskyy wrote:

As far as I understand "-pbs nojump" is what you are talking about. Just tested it - bug is still there:

Not always. There are many different iterations of -pbc that might be necessary, and the result is also dependent upon the fitting group chosen. Are you fitting to the whole protein with -pbc nojump? If so, you may need a custom index group to fit only to your chosen group that moves. I see this all the time with dimeric proteins - one peptide (or a part thereof) crosses a boundary and the RMSD shoots up. But if you fit to one chain and center it in the box, using that new reference structure for RMSD, RMSF, etc. then there is no problem.

trjconv -f linear_motion.xtc -s dimer_md.tpr -pbc nojump -o linear_nojump.xtc
g_rms -s dimer_md.tpr -f linear_nojump.xtc -n -o new.xvg

And new.xvg is attached - the same discontinuity.

If you can provide a .tpr file and .ndx file (if necessary, as it would appear it is from the g_rms command) it would be easier to debug.

#5 Updated by Semen Yesylevskyy over 8 years ago

I'm interested in the RMSD of whole protein, not the moving part. I deliberately move this part far apart - this is stupid test case, I agree, but nojump should handle it correctly.

If the problem is with pbc as you described, then this is absolutely counterintuitive for the user and is a bug by itself, imho. I would expect that nojump keeps the whole protein contiguous and the result must be the same as VMD gives (VMD ignores the box and works with coordinates as it is).

tpr file is 6.5 Mb - it is too large to attach here. I've sent it to .
First 100 residues are moved, all the rest is fixed.

#6 Updated by Justin Lemkul over 8 years ago

Semen Yesylevskyy wrote:

I'm interested in the RMSD of whole protein, not the moving part. I deliberately move this part far apart - this is stupid test case, I agree, but nojump should handle it correctly.

If the problem is with pbc as you described, then this is absolutely counterintuitive for the user and is a bug by itself, imho. I would expect that nojump keeps the whole protein contiguous and the result must be the same as VMD gives (VMD ignores the box and works with coordinates as it is).

tpr file is 6.5 Mb - it is too large to attach here. I've sent it to .
First 100 residues are moved, all the rest is fixed.

Got it, thanks. The difference here is that VMD has no knowledge of PBC, since it can't read .tpr files, so to VMD, the movement is entirely linear. If you calculate the RMSD in a similarly PBC-agnostic way, i.e.:

g_rms -s dimer_pdb2gmx.gro -f linear_motion.xtc

You get a completely reasonable result (see attached).

The solution in this case would be to use either:

1. the above command
2. g_rms -nopbc, which actually gives a seg fault and thus needs to be fixed
3. use trjconv with custom index groups to account for the periodicity of the moving group. Fitting solely to the whole protein doesn't do any actual, useful, PBC removal.

#7 Updated by Justin Lemkul over 8 years ago

Justin Lemkul wrote:

Semen Yesylevskyy wrote:

I'm interested in the RMSD of whole protein, not the moving part. I deliberately move this part far apart - this is stupid test case, I agree, but nojump should handle it correctly.

If the problem is with pbc as you described, then this is absolutely counterintuitive for the user and is a bug by itself, imho. I would expect that nojump keeps the whole protein contiguous and the result must be the same as VMD gives (VMD ignores the box and works with coordinates as it is).

tpr file is 6.5 Mb - it is too large to attach here. I've sent it to .
First 100 residues are moved, all the rest is fixed.

Got it, thanks. The difference here is that VMD has no knowledge of PBC, since it can't read .tpr files, so to VMD, the movement is entirely linear. If you calculate the RMSD in a similarly PBC-agnostic way, i.e.:

g_rms -s dimer_pdb2gmx.gro -f linear_motion.xtc

You get a completely reasonable result (see attached).

The solution in this case would be to use either:

1. the above command
2. g_rms -nopbc, which actually gives a seg fault and thus needs to be fixed
3. use trjconv with custom index groups to account for the periodicity of the moving group. Fitting solely to the whole protein doesn't do any actual, useful, PBC removal.

To expand upon point #3 a bit here, the reason -pbc nojump does not work in your case is that half of your protein is outside the box to start with, so -pbc nojump just shifts the whole structure across the box to start with, since frame 0 is considered the reference, but it has its coordinates "fixed" by creating a jump and thus having no net effect. Probably some combination of -pbc atom, then -pbc nojump, then using a new .tpr file create from frame 0 of the completely fixed trajectory will give a proper result.

#8 Updated by Justin Lemkul over 8 years ago

2. g_rms -nopbc, which actually gives a seg fault and thus needs to be fixed

This option works in the latest release-4-5-patches branch, thus:

g_rms -s linear.tpr -f linear_motion.xtc -nopbc

produces proper output.

#9 Updated by Semen Yesylevskyy over 8 years ago

Now I understood it, thank you! I thought that -nojump is actually equivalent to -nopbc, but it is not. I confirm the segfault with -nopbc.

The whole story with pbc treatment is a bit messy to say the least. It should not be that complicated. If RMSD is computed for group A and alignment is done using group B then I'd expect that:
1) the program makes selection (A or B) for all atoms, which are involved.
2) ensures that (A or B) never jumps across pbc, so there is a guarantee that no artifacts appears
3) computes RMSD
This will produce expected result regardless of placement of A and B in initial frame, etc.

#10 Updated by Justin Lemkul over 8 years ago

Semen Yesylevskyy wrote:

Now I understood it, thank you! I thought that -nojump is actually equivalent to -nopbc, but it is not. I confirm the segfault with -nopbc.

The whole story with pbc treatment is a bit messy to say the least. It should not be that complicated. If RMSD is computed for group A and alignment is done using group B then I'd expect that:
1) the program makes selection (A or B) for all atoms, which are involved.
2) ensures that (A or B) never jumps across pbc, so there is a guarantee that no artifacts appears
3) computes RMSD
This will produce expected result regardless of placement of A and B in initial frame, etc.

PBC fitting and RMSD calculation are necessarily separate processes, if for no other reason but to keep the tools streamlined and compact. All tools (generally) assume that the input trajectory with which it is provided is a valid one that does not include any PBC "artifacts" or factors that skew the resulting calculation.

The whole reason your calculation doesn't work using the .tpr file is that your structure is split over multiple PBC boundaries. If you start with a properly centered structure, with all atoms in the box, then none of this happens. You can't use a single iteration of trjconv because you're trying to solve two problems - an initially broken (or "jumping") structure in one dimension, which then jumps across PBC in another. Reconciling these two factors is certainly not the job of g_rms, nor should it be. In your three steps, it has always been assumed that 1 & 2 have been handled by the user, providing a suitable reference, i.e.:

http://www.gromacs.org/Documentation/Terminology/Periodic_Boundary_Conditions#Suggested_trjconv_workflow

It is complex, but perhaps it is a cruel necessity, especially for complex systems rather than the trivial single protein in a box of water, which is very easy to work with. If there is a more generalized solution to this problem, I would suggest filing an enhancement request and close this "bug" (and certainly remove its label as high priority).

#11 Updated by Semen Yesylevskyy over 8 years ago

I think this issue could be closed - thank you again for clarifying it!

#12 Updated by Justin Lemkul over 8 years ago

  • Status changed from New to Closed

Also available in: Atom PDF