Project

General

Profile

Bug #356

g_sdf output has wrong normalization and wrong reference structure for more than one reference molecule

Added by Ondrej Marsalek almost 10 years ago. Updated about 9 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Erik Lindahl
Category:
analysis tools
Target version:
Affected version - extra info:
Affected version:
Difficulty:
uncategorized
Close

Description

Created an attachment (id=397)
a simple test case that reproduces the bugs

I have found two problems with g_sdf, I report them together as the attached test case illustrates both.

The test case is based on the example water simulation provided with GROMACS in the directory share/gromacs/tutor/water. Simply run run.sh to produce all the data.

1) The refmol structure is wrong when more than one reference molecule is used. This is evident from comparing one-refmol.gro and two-refmols.gro. It has also been mentioned before in the users mailing list:

http://lists.gromacs.org/pipermail/gmx-users/2009-March/040364.html

2) The normalization of the volumetric data is wrong. Clearly, the absolute scale of the data grows with the number of frames used. A comparison of the code for g_sdf and g_rdf suggests that there could be other problems as well. The output of the last two runs of g_sdf in run.sh should illustrate this (notice the very different mean probability density).

I'll be happy to do more testing or help with resolving these issues, but at the moment I am unable to provide a full fix.

water.tar.bz2 (10.5 KB) water.tar.bz2 a simple test case that reproduces the bugs Ondrej Marsalek, 10/15/2009 08:13 PM
snap.jpg (39.8 KB) snap.jpg TIP4p water sdf Rasmus Lundsgaard, 11/10/2009 06:53 PM

History

#1 Updated by Rasmus Lundsgaard almost 10 years ago

Created an attachment (id=402)
TIP4p water sdf

#2 Updated by David van der Spoel over 9 years ago

I downloaded the test case and ran the simulation and analysis. However I have no idea what to do with the data or how to analyze it further.
[kahlo:356/water] % diff one-refmol.gro two-refmols.gro
3,5c3,5
< 1 SOL OW 1 0.000 -0.000 -0.000
< 1 SOL HW1 2 -0.000 -0.000 -0.100
< 1 SOL HW2 3 -0.000 0.094 0.034
--

1 SOL OW 1 0.617 0.197 0.382
1 SOL HW1 2 0.581 0.228 0.302
1 SOL HW2 3 0.579 0.187 0.427

In other words these are different, but both correct water molecules. Any more more directed info?

#3 Updated by Ondrej Marsalek over 9 years ago

Strange, I get two-refmols.gro as:
Pure Water
3
1 SOL OW 1 0.325 -0.588 0.191
1 SOL HW1 2 0.295 -0.615 0.209
1 SOL HW2 3 0.302 -0.585 0.146
10.00000 10.00000 10.00000

Anyway, both yours and mine have wrong dimension and are positioned wrong relative to the volumetric data, unlike the case with one refmol.

In VMD, you can open both the reference molecule and the volumetric data like this:

vmd two-refmols-test.gro -plt two-refmols.dat

Then you need to create a representation for the volumetric data, for example "isosurface". Then the difference between the two cases is clear.

I do not know why your two-refmols.gro is different from mine, I used "VERSION 4.0.5 (single precision)" on x86-64.

#4 Updated by David van der Spoel over 9 years ago

I don't have VMD installed, and would like to debug it without. Are you sure you don't need to set the -mode flags (and others) if you want to use two molecules as reference groups?

#5 Updated by Ondrej Marsalek over 9 years ago

The description of the modes from the help text for reference:

In -mode 1 the triples will be just the 1st, 2nd, 3rd, ... atoms from
groups 1, 2 and 3. Hence the nth entries in groups 1, 2 and 3 must be from
the same residue.

In -mode 2 the triples will be 1st, 2nd, 3rd, ... atoms
from groups 1 and 2 (with the nth entries in groups 1 and 2 having the same
res-id). For each pair from groups 1 and 2 group 3 is searched for an atom
meeting the distance conditions set with -triangle and -dtri relative to
atoms 1 and 2.

In -mode 3 for each atom in group 1 group 2 is searched for an
atom meeting the distance condition and if a pair is found group 3 is
searched for an atom meeting the further conditions. The triple will only be
used if all three atoms have different res-id's.

I am pretty sure I do not want my reference atoms to be from different residues. I always want one reference residue - one water molecule. So it looks like modes 2 and 3 are of no use here.

Without VMD, measuring the OH distances shows that the "water molecule" in two-refmols.gro is not really a water molecule:

$ echo 1 2 | g_dist f two-refmols.gro -n index.ndx -noxvgr -o dist-OH1
$ echo 1 3 | g_dist -f two-refmols.gro -n index.ndx -noxvgr -o dist-OH2
$ cat dist
*

For the last command, I get:
0.0000000 0.0441928 0.0300000 0.0270000 -0.0180000
0.0000000 0.0506261 0.0230000 -0.0030000 0.0450000

#6 Updated by David van der Spoel over 9 years ago

But then what is the point of your index groups which have two atoms each? Do you expect this to work just like the first case, where you have only the first molecule? Definitely something is strange with the ref gro file. In the code it says:

if ( bRef )
          {
for (j=0; j&lt;isize[G_REFMOL]; j++) {
pbc_dx(&pbc,xi,x[index[G_REFMOL][j]],dx);
mvmul(rot,dx,dx_mol);
rvec_inc(x_refmol[j],dx_mol);
for(k=XX; k<=ZZ; k++)
x_refmol[j][k] = x_refmol[j][k] / 2;
}
}

that means that for each atom its coordinates are added to the refmol and the total coordinate is divided by two. So if you would have three atoms in the reference the final ref coord would be .125 x1 + 0.25 x2 + 0. x3.
I will contact the original author of this program.

#7 Updated by Mark Abraham over 9 years ago

I've no actual insight to offer, but "git blame src/tools/gmx_sdf.c" says that this block of code is unchanged from David's original commit. Berk renamed cprod to oprod in the lines before that, but "git log 57330cbd -p" makes clear this was merely a name change.

#8 Updated by Erik Lindahl about 9 years ago

Has this been confirmed to be a bug or not?

#9 Updated by David van der Spoel about 9 years ago

I contacted the original author of the program but he hasn't replied (or I couldn't find the email address).

There is a bug, but since I don't know what the output should be and the code is quite unreadable I don't feel much like spending time on it.

#10 Updated by Erik Lindahl about 9 years ago

Basically, this appears to have been buggy "forever", none of the current developers seems to be able to invest the time to fix it, and the original author isn't available anymore.

Even if we could fix one or two issues, we will have the same problem the next time there's a bug in this tool.

I'm moving g_sdf to the contrib directory for now, and will disable it in the standard build.

#11 Updated by Erik Lindahl about 9 years ago

g_sdf is no longer part of the standard installation, commit f4d5cefe90abf6b9c214eab389adf3e158c9c9cf.

This tool will have to be rewritten from scratch if we want it; for now we'll refer to g_spatial instead, although the functionality is not identical. Sorry.

Also available in: Atom PDF