Project

General

Profile

Bug #3206

2 not-critical bugs in analyse tool

Added by Boris Timofeev 21 days ago. Updated 9 days ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
analysis tools
Target version:
Affected version - extra info:
Affected version:
Difficulty:
simple
Close

Description

To solve the problem of building complete statistics on all angles and dichedrals of modified DNA,
(See issue # 3184), I had to write a small console program on C++ using the libgromacs API.

Two bugs was found and corrected:
1.
The gmx_conect_addline parsing function of CONNECT fields in pdb files appears to contain errors in the format string, due to migrating from sprintf to std::string in 2019.4.
Must be:

static void gmx_conect_addline(gmx_conect_t *con, char *line) {
int n, ai, aj;

std::string form2  = "%*s"; 
std::string format = form2 + "%d";;
if (sscanf(line, format.c_str(), &ai) == 1) {
do {
form2 += "%*s";
format = form2 + "%d";
n = sscanf(line, format.c_str(), &aj);
if (n 1) {
gmx_conect_add(con, ai - 1, aj - 1); /* to prevent duplicated records */
}
}
while (n 1);
}
}

In "git master" branch sprintf function is used and all is right.

Gromacs never uses CONNECT records in pdb (source context search), but this is not a reason to contain knowingly inoperable code.

2.
It is further revealed that the gmx angle for Std.deviation sometimes returns a NAN, i.e. use sqrt from a negative number.
In a detailed review, it was revealed that:
at first, it is mathematically incorrect to calculate the mean square deviation from the pre-calculated means,
and second, float (32 bit) accuracy is lacking even if the standard deviation is properly calculated.

I propose the following corrections in patch attached.

my_angle_analize.patch (2.83 KB) my_angle_analize.patch Boris Timofeev, 11/15/2019 02:17 PM
my_angle_analize.patch (2.84 KB) my_angle_analize.patch Boris Timofeev, 11/15/2019 02:35 PM
pdb_analyse.cpp (9.67 KB) pdb_analyse.cpp Boris Timofeev, 11/15/2019 03:45 PM

Associated revisions

Revision 7d142404 (diff)
Added by Test User 10 days ago

Fix duplicate CONNECT records in pdb files

When adding connect records in pdb files, it was not checked if records
already existed and were thus duplicated in output. This patch fixes
this behavior.

Fix originally provided by Boris Timofeev.

refs #3206

Change-Id: I5850f4b6279b37df07282acc4416f433099d90c8

History

#1 Updated by Boris Timofeev 21 days ago

Patch 2 must be used

#2 Updated by Boris Timofeev 21 days ago

Source for "batch angle statistic", if it is interesting

#3 Updated by Christian Blau 11 days ago

The standard deviation that is calculated is also correct, though it is the standard deviation of the average angle per frame, instead of the overall standard deviation. I changed the output to reflect that better.

#4 Updated by Boris Timofeev 9 days ago

The average calculated from the frame averages is, at the same time, the average over the entire sample. You can 't say the same about the standard deviation. But the issue of loss of accuracy in calculating angles and the standard deviation remains open. Even if we use double to accumulate, the values obtained by calculating angles with float accuracy regularly result in a situation not possible to calculate the standard deviation.
And the using of float is simply unsuitable. Try to run next sample:

int main()
  {
#define RAD2DEG (180.0/M_PI)
typedef float MY_REAL;
float a[] = { 119.634, 119.657, 119.634 };
MY_REAL sum_a = 0, sum_a2 = 0;
int i;
for (i = 0; i < sizeof(a) / sizeof(a[0]); ++i) {
sum_a += a[i]*(180.0 / M_PI);
sum_a2 += a[i] * a[i] * (180.0 / M_PI) * (180.0 / M_PI);
}
sum_a /= i;
sum_a2 /= i;
sum_a *= sum_a;
assert(sum_a2 > sum_a );
}

assertion failed when MY_REAL => float, and not failed when MY_REAL => double

#5 Updated by Boris Timofeev 9 days ago

It's need also remove double "%" sign for version with std::string in pdbio.cpp:

std::string form2  = "%%*s";        !!!!!!!!!!!! Here must be:  std::string form2  = "%*s";
std::string format = form2 + "%%d"; !!!!!!!!!!!! Here must be: std::string format = form2 + "%d";
if (sscanf(line, format.c_str(), &ai) == 1) {
do {
form2 += "%*s";
format = form2 + "%%d"; !!!!!!!!!!!! Here must be: format = form2 + "%d";

#6 Updated by Boris Timofeev 9 days ago

And another question about angles:
We have two measured torsion angles: -179 deg and 179 deg.
What is a mean value?
I think, 180 degrees, not 0.

Also available in: Atom PDF