Feature #1307

PDB files should be written with the element symbol column

Added by Pedro Lacerda almost 4 years ago. Updated almost 2 years ago.

Status:
Closed
Priority:
Low
Assignee:
Category:
core library
Target version:
Difficulty:
uncategorized

Description

Generally it isn't a problem. But for molecules like FMNO from G53a6, where atom names start with 'F', VMD incorrectly sets the elements to fluorine, producing wrong bonds and colors. How do you avoid this situation?

The write_pdbfile_indexed function in pdbio.c gets the element from t_atom.elem, but it seems to be always blank, because I have never seen a generated PDB with the element column filled. Should this function perhaps use gmx_atomprop_element to set the column correctly?
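For reference, the PDB format places the element symbol right-justified in columns 77-78 of ATOM and HETATM records. The following is a minimal sketch of writing such a record. `format_atom_record` and its parameters are hypothetical names chosen for illustration; this is not the actual write_pdbfile_indexed code, and atom-name alignment, chain ID, and charge are simplified away.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not GROMACS code): format one PDB ATOM record,
 * without a trailing newline, into buf. The element symbol lands
 * right-justified in columns 77-78. Returns the snprintf result. */
int format_atom_record(char *buf, size_t size, int serial,
                       const char *name, const char *resname, int resnr,
                       double x, double y, double z,
                       double occupancy, double bfactor,
                       const char *element)
{
    assert(buf != NULL && name != NULL && resname != NULL && element != NULL);
    /* Columns: 1-6 record name, 7-11 serial, 13-16 atom name,
     * 18-20 residue name, 23-26 residue number, 31-54 coordinates,
     * 55-60 occupancy, 61-66 B-factor, 77-78 element. */
    return snprintf(buf, size,
                    "ATOM  %5d %-4.4s %3.3s  %4d    "
                    "%8.3f%8.3f%8.3f%6.2f%6.2f          %2.2s",
                    serial, name, resname, resnr,
                    x, y, z, occupancy, bfactor, element);
}
```

Passing an empty string as `element` leaves columns 77-78 blank, which is the situation described above.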

input.tpr (7.82 KB) Rebeca Garcia Fandino, 07/07/2015 06:07 PM

input.gro (917 Bytes) Rebeca Garcia Fandino, 07/07/2015 06:07 PM

output.pdb (1.54 KB) Rebeca Garcia Fandino, 07/07/2015 06:08 PM

output2.pdb (1.58 KB) Rebeca Garcia Fandino, 07/07/2015 06:08 PM

Associated revisions

Revision 8a6e4217 (diff)
Added by Erik Lindahl almost 3 years ago

Improve PDB I/O (keep occupancy, b-factor, element)

Retain the extra PDB info such as occupancy, b-factors,
and element name columns with trjconv when providing a
PDB structure file. If you use a TPR file, we still
write the element (given atomnumbers being present).
However, we deliberately don't store the pdbinfo in the
TPR file since that is a specification for a particular
experiment, rather than something that's valid in the
simulation.

Fixes #917, #1307.

Change-Id: I80cb8f5a250ba094fe81f32c58b4eb0298164053

History

#1 Updated by Mark Abraham almost 4 years ago

The general solution is to give VMD input that it can guess right, which in this case means you need to post-process the coordinate file before giving it to VMD. An element column is not a solution if it is only present in one file format, and only one column wide, e.g. what is HG11? Mercury, or hydrogen 1 on gamma-carbon 1? So we don't care too much. We do sometimes fill the field (see 4.6 HEAD).
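The ambiguity mentioned above can be made concrete with a small sketch. The two heuristics below are hypothetical and are not VMD's actual algorithm; the symbol table is deliberately tiny. They disagree on names such as HG11 and CA, which is exactly why a name alone cannot settle the element.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Small illustrative table of two-letter symbols; deliberately incomplete. */
static const char *const two_letter_symbols[] = { "HG", "CA", "FE", "CL", "NA" };

/* Skip leading non-letters, then copy up to max_len letters, upper-cased.
 * Returns the number of letters copied; out is always NUL-terminated. */
static size_t leading_letters(const char *name, char *out, size_t max_len)
{
    size_t n = 0;
    while (*name != '\0' && !isalpha((unsigned char)*name)) {
        name++;
    }
    while (n < max_len && isalpha((unsigned char)name[n])) {
        out[n] = (char)toupper((unsigned char)name[n]);
        n++;
    }
    out[n] = '\0';
    return n;
}

/* Heuristic A (hypothetical): the first letter is the element. */
void guess_first_letter(const char *name, char out[3])
{
    assert(name != NULL && out != NULL);
    leading_letters(name, out, 1);
}

/* Heuristic B (hypothetical): prefer a known two-letter symbol,
 * otherwise fall back to the first letter. */
void guess_two_letters(const char *name, char out[3])
{
    size_t i;
    assert(name != NULL && out != NULL);
    if (leading_letters(name, out, 2) == 2) {
        for (i = 0; i < sizeof two_letter_symbols / sizeof two_letter_symbols[0]; i++) {
            if (strcmp(out, two_letter_symbols[i]) == 0) {
                return;
            }
        }
        out[1] = '\0'; /* unknown pair: keep only the first letter */
    }
}
```

For the name HG11, heuristic A yields H while heuristic B yields HG; for CA, they yield C and CA respectively.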

#2 Updated by Pedro Lacerda almost 4 years ago

Yes, we can't solve this issue for e.g. GRO, since it doesn't have a field for the element, just atom names.

But I think write_pdbfile_indexed should output the element field anyway, because we have this information and VMD uses it to determine the element type. See lines 146 and 157 of the source linked below. I believe VMD guesses elements from atom names only when nothing better is available. Other software may also expect the element symbol to be present.

http://www.ks.uiuc.edu/Research/vmd/plugins/doxygen/pdbplugin_8c-source.html (btw, this source seems pretty nice and 'software agnostic')

I believe that t_atom's elem member is used only in pdbio.c, because I didn't find other usages in release-4-6, so it's only set when reading PDB files.

#3 Updated by Pedro Lacerda almost 4 years ago

The logic of my last sentence doesn't work, but I guess that t_atom's elem really is always empty before the ATOM records are written with fprintf (pdbio.c:409).

#4 Updated by Mark Abraham almost 4 years ago

Actually, I'm not sure we do have element information all that often. Offhand, I can see little harm in populating atom.elem when we know. Anywhere we do know is probably only when atom.atomnumber is filled - but I would guess that this is not true even after a .tpr read.

#5 Updated by Pedro Lacerda almost 4 years ago

Okay, I see. Anyway, I just made a commit that uses atom.elem or atom.atomnumber to get this info. You can just ignore it if it's dumb :)

With this, `editconf -f INPUT -o conf.pdb` writes out the element symbol when INPUT is a .tpr and does no harm when INPUT is a .gro.

https://github.com/pslacerda/gromacs/commit/17ee3d4329d653a6d8314572fc0030119a12c7df
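The fallback described in this comment can be sketched roughly as follows. This is not the code from the linked commit: `sketch_atom` is a hypothetical stand-in that only mirrors the elem and atomnumber fields named in this thread, and the symbol table is an assumption that covers just the first 20 elements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two fields discussed in this thread;
 * not the real t_atom definition. */
struct sketch_atom {
    char elem[4];    /* element symbol, empty string if unknown */
    int  atomnumber; /* atomic number, out of range if unknown  */
};

/* Symbols indexed by atomic number; index 0 is unused. */
static const char *const symbols[] = {
    "",   "H",  "He", "Li", "Be", "B",  "C",
    "N",  "O",  "F",  "Ne", "Na", "Mg", "Al",
    "Si", "P",  "S",  "Cl", "Ar", "K",  "Ca"
};

/* Prefer the stored symbol; otherwise derive it from the atomic number;
 * otherwise return an empty string so the column stays blank. */
const char *element_symbol(const struct sketch_atom *atom)
{
    assert(atom != NULL);
    if (atom->elem[0] != '\0') {
        return atom->elem;
    }
    if (atom->atomnumber >= 1
        && (size_t)atom->atomnumber < sizeof symbols / sizeof symbols[0]) {
        return symbols[atom->atomnumber];
    }
    return "";
}
```

Returning an empty string when neither field is usable keeps the output unchanged for inputs that carry no element information, such as a .gro file.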

#6 Updated by Mark Abraham almost 4 years ago

  • Target version changed from 4.6.x to 5.0

Pedro Lacerda wrote:

Okay, I see. Anyway, I just made a commit that uses atom.elem or atom.atomnumber to get this info. You can just ignore it if it's dumb :)

It's not dumb at all :-) I just didn't expect the information to be available often enough to be worth trying it. Ultimately, this is an open source project, and if someone can scratch an itch with decent code, then there's a fair chance it'll pass review.

With this, `editconf -f INPUT -o conf.pdb` writes out the element symbol when INPUT is a .tpr and does no harm when INPUT is a .gro.

https://github.com/pslacerda/gromacs/commit/17ee3d4329d653a6d8314572fc0030119a12c7df

Cool. It turns out that .tpr construction does fill atomnumber fields, probably because of the desire to support QM/MM. Those fields are written to the .tpr file, so they continue to be available after reading.

The code you propose looks fine - I have made a suggestion there. It should go on GROMACS master branch (git cherry-pick is your friend), since it is a change of behaviour and not really a bug fix.

#7 Updated by David van der Spoel over 3 years ago

I have had this implemented for a long time in another branch. I will move it over.

#8 Updated by Mark Abraham almost 3 years ago

  • Target version changed from 5.0 to future

#9 Updated by Erik Lindahl almost 3 years ago

  • Status changed from New to Fix uploaded

#10 Updated by Teemu Murtola almost 3 years ago

  • Category set to core library
  • Status changed from Fix uploaded to Resolved
  • Assignee set to Erik Lindahl
  • Target version changed from future to 5.0

#11 Updated by Mark Abraham over 2 years ago

  • Status changed from Resolved to Closed

#12 Updated by Rebeca Garcia Fandino almost 2 years ago

I am trying to obtain a PDB file with the element symbol column included, using editconf.

I tried two suggestions from Mark Abraham, but neither of them produced the desired PDB output:

1) gmx editconf -f input.tpr -o output.pdb

Where output.pdb is:

(...)
ATOM 1 C MOL 1 28.780 30.430 14.380 1.00 0.00
ATOM 2 C0 MOL 1 32.260 29.290 13.120 1.00 0.00
ATOM 3 C1 MOL 1 27.970 30.090 15.640 1.00 0.00
(...)

2) gmx trjconv -s topol.tpr -f input.gro -o output2.pdb

where output2.pdb is:

(...)
ATOM 1 C MOL 1 28.770 30.460 14.390 1.00 0.00
ATOM 2 C0 MOL 1 32.270 29.290 13.110 1.00 0.00
ATOM 3 C1 MOL 1 27.960 30.090 15.640 1.00 0.00
(...)

My input and output files are attached. I am using GROMACS 5.0.4.

Thanks a lot for your help.

Rebeca.
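One way to check output files like the ones above is to test whether columns 77-78 of each ATOM or HETATM record hold an element symbol. The helper below is a hypothetical sketch for that check, not part of GROMACS; it assumes the line keeps the fixed-column PDB layout, which is not the case for the whitespace-collapsed excerpts shown in this comment.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical checker: returns 1 if the ATOM/HETATM line has a letter
 * in column 78 (the element symbol is right-justified in columns 77-78),
 * and 0 otherwise. */
int pdb_line_has_element(const char *line)
{
    size_t len;
    assert(line != NULL);
    if (strncmp(line, "ATOM  ", 6) != 0 && strncmp(line, "HETATM", 6) != 0) {
        return 0; /* not a coordinate record */
    }
    len = strcspn(line, "\r\n"); /* record length without line ending */
    if (len < 78) {
        return 0; /* record ends before the element column */
    }
    return isalpha((unsigned char)line[77]) != 0;
}
```

A record that stops after the B-factor field (column 66) is reported as having no element.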
