Four letter residues and default index groups
Some force fields use residue names with 4 letters, e.g. Amber, where the N- and C-terminal residues are prefixed with an N or a C.
PDB files written by GROMACS tools only contain 3 letters of the residue name. These files are parsed incorrectly by make_ndx and other tools that construct the default index groups: the N- and C-terminal residues are not recognized as being part of a protein.
Either these residues should be written with all four letters (my suggestion, since column 21 in the PDB file format standard is not defined: http://www.wwpdb.org/documentation/format32/sect9.html) or the routine that generates the default groups should be made more intelligent so that it recognizes them as protein residues.
Enable 4-letter resname in PDB output, keeps more pdbinfo.
This still fully adheres to the PDB standard since column 21
is not used by the standard. All common programs (PyMol, VMD, etc)
understand the 4-letter format, and programs that only read three
letters will still read the same residue name as they used to. In
particular, this conserves most residue names during pdb<->gro
format conversions. We have also killed the non-standard
wide pdb format to avoid writing broken PDB files.
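
To make the column argument concrete, here is a minimal Python sketch of an ATOM record writer and reader. It is not GROMACS code: the names format_atom and read_resname are hypothetical, and the sketch assumes that the old 3-letter output kept the first three letters of the residue name. It only illustrates that a name written into columns 18-21 looks the same to a 3-letter reader as the truncated name did before.

```python
def format_atom(serial, atom_name, resname, chain, resseq, x, y, z, four_letter=True):
    """Return a fixed-column PDB ATOM record (66 characters, up to the B-factor).

    Hypothetical helper for illustration only, not the GROMACS writer.
    """
    width = 4 if four_letter else 3
    # The residue name starts in column 18. Columns 18-20 are the standard
    # field; column 21 is the one the standard leaves undefined.
    # Assumption: the 3-letter writer keeps the first three letters.
    res_field = resname[:width].ljust(4)
    return "ATOM  %5d %-4s %-4s%1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f" % (
        serial, atom_name, res_field, chain, resseq, x, y, z, 1.0, 0.0)


def read_resname(line, four_letter=True):
    """Read the residue name from columns 18-21 (or 18-20 for a 3-letter reader)."""
    end = 21 if four_letter else 20
    return line[17:end].strip()


# N-terminal alanine in Amber naming.
old_style = format_atom(1, " N  ", "NALA", "A", 1, 1.0, 2.0, 3.0, four_letter=False)
new_style = format_atom(1, " N  ", "NALA", "A", 1, 1.0, 2.0, 3.0, four_letter=True)

print(read_resname(old_style))                     # NAL
print(read_resname(new_style))                     # NALA
print(read_resname(new_style, four_letter=False))  # NAL
```

Under these assumptions, a 3-letter reader gets NAL from both files, while a 4-letter reader recovers NALA from the new output, which is the name the default group construction needs in order to classify the residue as protein.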
#2 Updated by Erik Lindahl almost 5 years ago
- Category set to preprocessing (pdb2gmx,grompp)
- Assignee set to Erik Lindahl
- Target version set to 5.0
- Affected version set to 4.6
This is not really a bug but a limitation of the PDB format; we've never claimed to write more than 3 letters. Since Gromacs-4.6 uses our new force field setup (that does not rely on 4-letter amino acid names in PDB files) that particular problem is no longer a concern, but I'll try to add 4-letter residue names in PDB output for Gromacs-5.0.