Project

General

Profile

Bug #2900

Cofactors vanishing in editconf

Added by Erik Marklund 7 months ago. Updated 6 months ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
preprocessing (pdb2gmx,grompp)
Target version:
Affected version - extra info:
2016.4 and 2018.X not tested
Affected version:
Difficulty:
uncategorized
Close

Description

I have a dimeric protein with two cofactors. After getting it through pdb2gmx (with some effort) I obtained a new pdb-file, which I passed on to editconf to get a suitable box. The cofactors however vanished in the editconf step.

I then tried using gromacs 2016.3, and then editconf managed to handle the cofactors. I also noticed some differences in the pdb2gmx step, where 2019.1 required the two cofactors to have different residue numbers, even though they belonged to different chains, whereas 2016 needed no such workaround.

Note that while investigating this issue, we also found that pdb2gmx, solvate and insert-molecules no longer accepted .tpr files as input, which is now fixed.

SR1_2016.3.pdb (592 KB) SR1_2016.3.pdb Erik Marklund, 04/07/2019 07:42 PM

Associated revisions

Revision 69731121 (diff)
Added by Berk Hess 6 months ago

Fix pdb-related issues

In editconf, if a PDB input was read, chain IDs are known, and should
be written if the output is also PDB. This now works again, after
being broken in commit 8dd3c9ae88004054b3.

This required adding missing tpr support to readConfAndAtoms(), which
also fixes that pdb2gmx, insert-molecules, and solvate could not read
tpr files. Those are the only tools that directly or indirectly called
readConfAndAtoms with a .tpr file.

Fixes #2900

Change-Id: I883777be945023a7e69260b22e76df54477828ee

Revision 57b3e3d9 (diff)
Added by Berk Hess 6 months ago

Fix pdb-related issues

In editconf, if a PDB input was read, chain IDs are known, and should
be written if the output is also PDB. This now works again, after
being broken in commit 8dd3c9ae88004054b3.

This required adding missing tpr support to readConfAndAtoms(), which
also fixes that pdb2gmx, insert-molecules, and solvate could not read
tpr files. Those are the only tools that directly or indirectly called
readConfAndAtoms with a .tpr file.

Fix in master also closes memory leaks.

Refs #2900

Change-Id: I883777be945023a7e69260b22e76df54477828ee

History

#1 Updated by Erik Marklund 7 months ago

Now it works, somehow. Not sure what is different. Feel free to close.

#2 Updated by Erik Marklund 7 months ago

Or actually, one problem remains. That the chain labels vanish with 2019.1, but not with 2016.3.

#3 Updated by Mark Abraham 7 months ago

I can see the usefulness of both of gmx pdb2gmx -f in.pdb -o out.pdb and gmx editconf -f in.pdb -o out.pdb retaining chain labels. You seem to be suggesting that pdb2gmx did that in 2016.3 and not in 2019.1? Many things influence the behaviour here (including things like TER fields). Can we have a full command line that used to work, now does not, and a relevant input file, please?

The general case of implementing all the things that pdb2gmx claims to do, and retain chain labels, and make sure that rebuilt H atoms and/or termini have the right chain labels is tricky. The state of the pdb2gmx code makes it imfeasible, sadly. I can well believe that attempts to fix other problems would break such things that worked in particular cases.

#4 Updated by Erik Marklund 7 months ago

Sory if I was unclear. pdb2gmx behaves the same for both versions, requiring some unexpected and annoying workarounds. editconf behavious differs between versions however, with chain labels being preserved in 2016.3 but not in 2019.1.

I used pymol to quickly look at the output structure, and the cofactors did not show up where they used to be in the sequence because of the missing chain labels (which I suppose is an issue with pymol). Hence I drew the incorrect conclusion that the cofactors were gone.

I will provide more details after more systematic testing this afternoon.

#5 Updated by Mark Abraham 7 months ago

OK, an editconf input file + command line would be great. We'll do the 2019.2 release in about 10 days, so we'd need something before then.

#6 Updated by Mark Abraham 7 months ago

  • Status changed from New to Blocked, need info

#7 Updated by Mark Abraham 7 months ago

@Erik any inputs available please?

#8 Updated by Erik Marklund 7 months ago

This is the editconf command (script):

#!/bin/bash

v=2019.1
source ~/bin/gmx${v}/bin/GMXRC.bash

gmx editconf \
-f SR1_${v}.pdb \
-o inbox_${v}.pdb \
-bt dodecahedron \
-d 1.5

Changing v... to get 2016.3 version.

The input files are identical:

$ diff ../2016.3/SR1_2016.3.pdb SR1_2019.1.pdb
1c1
< TITLE Glycine aRginine prOline Methionine Alanine Cystine Serine
---

TITLE Green Red Orange Magenta Azure Cyan Skyblue

The output differs however, in that the 2019.1 removes chain labels:
KMB-C02SWYYEGTF:~/Projects/Interface_evolution/sim/SR1_dimer/preproc/bugtest/2019.1$ head inbox_2019.1.pdb
TITLE Green Red Orange Magenta Azure Cyan Skyblue
REMARK THIS IS A SIMULATION BOX
CRYST1 106.339 106.339 106.339 60.00 60.00 90.00 P 1 1
MODEL 1
ATOM 1 MN1 ILE 2 68.111 97.700 21.501 1.00 0.00
ATOM 2 MN2 ILE 2 68.709 98.114 21.822 1.00 0.00
ATOM 3 N ILE 2 68.430 97.858 21.688 1.00 0.00 N
ATOM 4 H1 ILE 2 67.610 97.644 21.157 1.00 0.00 H
ATOM 5 H2 ILE 2 68.210 98.547 22.378 1.00 0.00 H
ATOM 6 H3 ILE 2 69.139 98.212 21.078 1.00 0.00 H
KMB-C02SWYYEGTF:~/Projects/Interface_evolution/sim/SR1_dimer/preproc/bugtest/2019.1$ head ../2016.3/inbox_2016.3.pdb
TITLE Glycine aRginine prOline Methionine Alanine Cystine Serine
REMARK THIS IS A SIMULATION BOX
CRYST1 106.339 106.339 106.339 60.00 60.00 90.00 P 1 1
MODEL 1
ATOM 1 MN1 ILE A 2 68.111 97.700 21.501 1.00 0.00
ATOM 2 MN2 ILE A 2 68.709 98.114 21.822 1.00 0.00
ATOM 3 N ILE A 2 68.430 97.858 21.688 1.00 0.00 N
ATOM 4 H1 ILE A 2 67.610 97.644 21.157 1.00 0.00 H
ATOM 5 H2 ILE A 2 68.210 98.547 22.378 1.00 0.00 H
ATOM 6 H3 ILE A 2 69.139 98.212 21.078 1.00 0.00 H

#9 Updated by Mark Abraham 6 months ago

  • Status changed from Blocked, need info to In Progress
  • Assignee set to Mark Abraham

I can reproduce the difference. Bisecting to see if it was intentional :-)

#10 Updated by Mark Abraham 6 months ago

The difference arose in 8dd3c9ae88004054b3 which looks like it was unintentional, but I am not yet sure how editconf was affected.

#11 Updated by Mark Abraham 6 months ago

The old code left mtop.mols empty, so top.mols was empty, so tpx_make_chain_identifiers did nothing to the chain IDs. The change in the above commit fills top.mols with the single-molecule description that mtop has in the one moleculetype that got built, which triggers tpx_make_chain_identifiers to do things that culminate in deleting the chain IDs. (That method should probably be removed altogether at some point.) We can fix that for editconf by recognizing that it can just use the t_atoms that is built from the pdb file, without converting it to an mtop, and then to a t_topology. No other tool tries to preserve chain IDs, so probably there are no further problems

#12 Updated by Mark Abraham 6 months ago

  • Status changed from In Progress to Fix uploaded

@Erik M: Thanks - we wouldn't have found this without your report and repro case!

#13 Updated by Erik Marklund 6 months ago

Mark Abraham wrote:

@Erik M: Thanks - we wouldn't have found this without your report and repro case!

Thank you fir fixing this and numerous other issues!

I have not yet figured out why the code was broken. I guess gmx_mtop_molecules_t_block() somehow fails? I will need to look in a debugger before I can complete the code review but will try to find time for that tonight.

#14 Updated by Mark Abraham 6 months ago

Erik Marklund wrote:

Mark Abraham wrote:

@Erik M: Thanks - we wouldn't have found this without your report and repro case!

Thank you fir fixing this and numerous other issues!

I have not yet figured out why the code was broken. I guess gmx_mtop_molecules_t_block() somehow fails? I will need to look in a debugger before I can complete the code review but will try to find time for that tonight.

mols.nr was zero before the change (probably mols was just not being filled at all), which meant that chain IDs were left alone. The new version filled mols and set mols.nr to 1, which led to chain IDs being rewritten and then deleted in tpx_make_chain_identifiers.

The whole thing is a combinatorial disaster, as nobody knows what was supposed to work in which sub cases.

#15 Updated by Erik Marklund 6 months ago

Ok. Am on it now. If I approve I will anyway wait for an answer to Berk's comment before +1.

#16 Updated by Erik Marklund 6 months ago

Mark Abraham wrote:

Erik Marklund wrote:

Mark Abraham wrote:

@Erik M: Thanks - we wouldn't have found this without your report and repro case!

Thank you fir fixing this and numerous other issues!

I have not yet figured out why the code was broken. I guess gmx_mtop_molecules_t_block() somehow fails? I will need to look in a debugger before I can complete the code review but will try to find time for that tonight.

mols.nr was zero before the change (probably mols was just not being filled at all), which meant that chain IDs were left alone. The new version filled mols and set mols.nr to 1, which led to chain IDs being rewritten and then deleted in tpx_make_chain_identifiers.

The whole thing is a combinatorial disaster, as nobody knows what was supposed to work in which sub cases.

When I run 2019.1 through the debugger mols.nr is set to 1, not zero. What happened was that on line 290 and onwards, chain labels are blanked out because it believes that only one chain is present. This doesn't happen when readConfAndTopology is used directly without conversion. I will +1 if someone can please give input to Berk's comment.

#17 Updated by Mark Abraham 6 months ago

Yes that was my analysis also. I replied to Berk's comment. Whatever reads coordinate files and does least processing before returning a t_atoms is best.

#18 Updated by Mark Abraham 6 months ago

@Erik Do you happen to have a matching .tpr file? That pdb with virtual sites is awkward to handle for building one.

#19 Updated by Mark Abraham 6 months ago

  • Description updated (diff)
  • Status changed from Fix uploaded to Resolved

@Erik Don't worry, I made my own

#20 Updated by Mark Abraham 6 months ago

  • Status changed from Resolved to Closed

Also available in: Atom PDF