Project

General

Profile

Bug #3368

Erroneous interplay between gmx rms command and atommass.dat: Can not find mass in database for atom MG in residue

Added by Vedat Durmaz 2 months ago.

Status:
New
Priority:
Normal
Assignee:
Category:
analysis tools
Target version:
Affected version - extra info:
gromacs-2018.7, gromacs-2019.5, gromacs-2020
Affected version:
Difficulty:
uncategorized
Close

Description

Bug is related to the closed Gromacs Bug #20 (issued by Erik Lindahl on 10/18/2005 04:44 PM):
"g_rms uses incorrect masses when reference is a PDB/GRO file"

The following description was copied from a mail to the Gromacs users list:

I'm pretty sure it's not a feature but a bug which I've faced in the GMX versions 2018.7, 2019.5 and 2020.

When I try to calculate RMSD values for a protein system including a catalytic magnesium ion "Mg" using the command

$ gmx rms -s mol.pdb -f mol.xtc -f2 mol.xtc -o mol-rmsd.xvg -debug

I get this error:

Can not find mass in database for atom MG in residue 1370 MG

-------------------------------------------------------
Program:     gmx rms, version 2019.5
Source file: src/gromacs/fileio/confio.cpp (line 517)

Fatal error:
Masses were requested, but for some atom(s) masses could not be found in the
database. Use a tpr file as input, if possible, or add these atoms to the mass
database.

Let's accept for the moment that using a .tpr file with -s (rather than a .pdb file) is no option for me. Consequently, gmx retrieves atom masses from atommass.dat which actually contains the Mg ion:

; NOTE: longest names match
; General atoms
; '???' or '*' matches any residue name
???  H      1.00790 
???  He     4.00260 
???  Li     6.94100 
???  Be     9.01220 
???  B     10.81100 
???  C     12.01070 
???  N     14.00670 
???  O     15.99940 
???  F     18.99840 
???  Ne    20.17970 
???  Na    22.98970 
???  Mg    24.30500    <<< in this line
???  Al    26.98150 
???  Si    28.08550 
???  P     30.97380

Having a look at the log file of "gmx rms -debug ...", I see some strange output. Here's a snippet:

searching residue:  ??? atom:    H
 not successful
searching residue:  ??? atom:   He
 match:  ???    H
searching residue:  ??? atom:   Li
 not successful
searching residue:  ??? atom:   Be
 not successful
searching residue:  ??? atom:    B
 not successful
searching residue:  ??? atom:    C
 not successful
searching residue:  ??? atom:    N
 not successful
searching residue:  ??? atom:    O
 not successful
searching residue:  ??? atom:    F
 not successful
searching residue:  ??? atom:   Ne
 match:  ???    N
searching residue:  ??? atom:   Na
 match:  ???    N
searching residue:  ??? atom:   Mg
 not successful
searching residue:  ??? atom:   Al
 not successful
searching residue:  ??? atom:   Si
 not successful
searching residue:  ??? atom:    P
 not successful
searching residue:  ??? atom:    S
 not successful
searching residue:  ??? atom:   Cl
 match:  ???    C
 ...
searching residue:  ??? atom:   Cu
 match:  ???    C
...

I don't know what exactly the rms command is doing behind the scenes and also don't want to agonize over the cpp code. But to me it seems as if the element assignment is based only on the FIRST LETTER of each element name. If I use copper (Cu) instead of magnesium, the program runs fine. Now let's compare the Cu lines with the Mg lines in the debugging output above.

searching residue:  ??? atom:   Cu
 match:  ???    C
searching residue:  ??? atom:   Mg
 not successful

Obviously, Cu is found because the first letter of its element abbreviation, C (because Cu[index 0] is C) does exist as an element, but there is no element M, the first letter of Mg. A little test (respectively an improper workaround). If I add a line for the element "M" to atommass.dat like this

???  Na    22.98970 
???  Mg    24.30500
???  M     24.30500      <<< new line
???  Al    26.98150

then the rms command executed on the protein with Mg runs without any error. But this observation also implies that masses, the rms command retrieves from atommass.dat, are wrong because, e.g., to each of Cl, Cr, Cu and Co, the mass of C is assigned.

I see 2 options for GMX developers. Either you check the rms<->atommass.dat interplay, or you disable the possibility to use PDB files with the -s option. However, I would strongly discourage from the latter decision since there are cases where you have an xtc trajectory possibly generated with another tool along with a pdb file, but don't want to spend too much time in the generation of a proper tpr file.

Also available in: Atom PDF