Bug #1090

g_bar isn't compatible with current free energy code in mdrun

Added by Alexey Shvetsov over 6 years ago. Updated over 6 years ago.

Status: Closed
Priority: High
Assignee:
Category: analysis tools
Target version:
Affected version - extra info:
Affected version:
Difficulty: uncategorized

Description

The current g_bar implementation doesn't work with the free-energy code in 4.6.

For example, the tutorial at http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin/gmx-tutorials/free_energy/ will not work after updating the mdp files for 4.6.

g_bar cannot parse the dhdl.xvg output from mdrun; it also cannot find the foreign lambdas in the *.edr files.
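For reference, a typical invocation of the kind that fails (file names are illustrative; both input modes are affected):

g_bar -f md0.xvg md1.xvg md2.xvg -o bar.xvg
g_bar -g md0.edr md1.edr md2.edr -o bar.xvg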

edr.tar.gz (131 KB) - Sander Pronk, 12/28/2012 06:07 PM

Associated revisions

Revision 23eb71d1 (diff)
Added by Sander Pronk over 6 years ago

Fixes to free energy code, output & g_bar compatibility.

- Fixes to the handling of the new free energy input options, so that
inputs are more stringently checked and common errors are caught.
- Fixed g_bar so it can deal with output from mdrun (in dhdl.xvg and
ener.edr input)
- Free energy output is now backward compatible with g_bar if 4.5-style
settings are used
- Some mdp option documentation clarifications.

This fixes issue #1090

Still left to do:
- g_energy -odh relies on an ugly hack to get the xvg headers for free
energy output to work. That needs to be fixed so ener.edr can be
interpreted independently from the associated tpr file.
- The mdp documentation could use some more improvements.

Change-Id: I6198f935a94b7341d8dd85444e348ea8a55621ad

History

#1 Updated by Szilárd Páll over 6 years ago

  • Category set to analysis tools
  • Priority changed from Normal to High

#2 Updated by Sander Pronk over 6 years ago

  • Assignee set to Sander Pronk

There are two issues: the new FE code outputs slightly different legend items than before, and .edr output does not appear to work at all.

#3 Updated by Michael Shirts over 6 years ago

Can you be more specific when you say .edr output doesn't work at all?

#4 Updated by Sander Pronk over 6 years ago

As far as I can tell with gmxdump, the right number of data blocks is there, but there are far too few data points. Each energy block seems to have only one or two data points where there should be 20 in my test case.

The other issue is that the dhdl.xvg output does not contain the current lambda state. It correctly lists the lambda vector for all delta H values, but not the lambda vector of the current state. This is needed for g_bar to be able to connect paths in lambda space.

I'll add that data so that the current output:

@ s0 legend "dH/d\xl\f{} (fep-lambdas)" 
@ s1 legend "dH/d\xl\f{} (coul-lambdas)" 
@ s2 legend "\xD\f{}H \xl\f{} (0,0)" 

will change to

@ s0 legend "dH/d\xl\f{} (fep-lambdas = 0.1)" 
@ s1 legend "dH/d\xl\f{} (coul-lambdas = 0.0)" 
@ s2 legend "\xD\f{}H \xl\f{} (0,0)" 

Will that break things in, for example, pymbar?

#5 Updated by Michael Shirts over 6 years ago

Hi, Sander-

That change should be fine. I forgot about this since pymbar reads the current state from the dG information.
Another place to put the vector would be at the top of the file, after the "state = x" information.
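For concreteness, such a header line might look something like this (purely illustrative; the exact format was still being decided at this point):

@ subtitle "current state: fep-lambdas = 0.1, coul-lambdas = 0.0"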

I'm not sure why there are fewer points than there were before. Perhaps it's only updating them every nstdhdl points now? Hard to tell without example files to compare to.

#6 Updated by Sander Pronk over 6 years ago

Another place to put the vector would be at the top of the file, after the "state = x" information.

That would work too - I think I'll do both.

I'm not sure why there are fewer points than there were before. Perhaps it's only updating them every nstdhdl points now? Hard to tell without example files to compare to.

I've taken a closer look and it may be that far fewer dhdls get calculated in the 4.6 code (I can make it output more if I set nstdhdl lower). Could this be the case?
I've attached an example system where this is being output for two different versions.

#7 Updated by Michael Shirts over 6 years ago

That would work too - I think I'll do both.

Sure, that's fine.

I've taken a closer look and it may be that far fewer dhdls get calculated in the 4.6 code (I can make it output more if I set nstdhdl lower). Could this be the case?
I've attached an example system where this is being output for two different versions.

Right, I changed it so it only calculates dhdl every nstdhdl points. It's not clear that one would WANT dhdl more frequently than that -- the correlation times are usually 200-500 ps. Most uncertainty formulas only work when the data is independent. Perhaps this is the correct behavior, and it just needs to be documented better?
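For context, nstdhdl is set in the .mdp file; a minimal fragment with the relevant settings (values are illustrative, not recommendations):

free-energy       = yes
init-lambda-state = 0
nstdhdl           = 100    ; evaluate and write dH/dl and foreign Delta H every 100 steps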

#8 Updated by Sander Pronk over 6 years ago

One more thing: the default value for init_fep_state is 0, while init_lambda is -1 when not set. That is potentially dangerous because there might be legitimate situations where the user wants to set lambda to a value outside [0,1], and setting magic values in floats/doubles is asking for trouble.

I'll change the initialization code so that init_fep_state is set to -1 if init_lambda contains a valid value: that should never be a legitimately set value and is therefore safe as a flag value.
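A minimal sketch of that initialization logic (illustrative only; t_fep_opts is a hypothetical stand-in whose field names follow the mdp options, not the actual GROMACS structs):

#include <stdio.h>

typedef struct {
    double init_lambda;    /* -1 when not set */
    int    init_fep_state; /* discrete state index */
} t_fep_opts;

int main(void)
{
    t_fep_opts fep = { 0.4, 0 }; /* as if the user set init_lambda = 0.4 */

    /* If a legacy scalar init_lambda was given, mark the discrete state
       index as unused so the two inputs cannot conflict. */
    if (fep.init_lambda >= 0)
    {
        fep.init_fep_state = -1;
    }
    printf("init_fep_state = %d\n", fep.init_fep_state);
    return 0;
}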

Right, I changed it so it only calculates dhdl every nstdhdl points. It's not clear that one would WANT dhdl more frequently than that -- the correlation times are usually 200-500 ps. Most uncertainty formulas only work when the data is independent. Perhaps this is the correct behavior, and it just needs to be documented better?

IMHO 100 steps is too many. The correlation times are very system-dependent: correlation times for small molecule solvation free energies are probably much shorter than that, and are a component in most binding free energies. And either way, having more values is always better than fewer values even if there are correlations.

The uncertainties based on uncorrelated data are worst-case errors, and must be based on a good estimate of the correlation time itself, which probably requires measurements with intervals shorter than the correlation times.
Besides, g_bar estimates errors based on variance estimates, and detects most cases where there is severe undersampling that might lead to overly confident error margins.

#9 Updated by Michael Shirts over 6 years ago

Sander Pronk wrote:

One more thing: the default value for init_fep_state is 0, while init_lambda is -1 when not set. That is potentially dangerous because there might be legitimate situations where the user wants to set lambda to a value outside [0,1]

I'd dispute this. Lambda is between end states, and is by definition between 0 and 1. If you want to test other parameters, change the end states. I don't want to give up a completely consistent mathematical description of free energy calculations to allow people shorthand to describe new states. In many cases (softcore, for example), lambdas outside [0,1] give nonsensical, nonphysical potentials. I think it would be best to restrict lambda to [0,1].

and setting magic values in floats/doubles is asking for trouble.

Can you explain this comment more? Sounds like something I need to know more about.

I'll change the initialization code so that init_fep_state is set to -1 if init_lambda contains a valid value: that should never be a legitimately set value and is therefore safe as a flag value.

Init_fep_state is 0, meaning that the default is state A. init_lambda = -1 is used to tell whether init_lambda has been specified -- if not, it is written over by the value determined by the lambda vectors and init_fep_state. So there are different reasons for the defaults. Perhaps init_fep_state should REQUIRE specification, and in that case, having init_fep_state = -1 would make sense.

I can think about changing it, but that requires going over all the logic to make sure it doesn't break anything. I also don't think it's a good idea to change default values unless you've gone through and checked that it doesn't break the rest of the code. Another way of putting it is: I don't see the bug that changing init_fep_state to -1 would fix.

And at this point in the 4.6 release cycle, I don't think we should be making code changes that don't fix bugs, but might introduce them. I'm happy to reconsider for 5.0.

Right, I changed it so it only calculates dhdl every nstdhdl points. It's not clear that one would WANT dhdl more frequently than that -- the correlation times are usually 200-500 ps. Most uncertainty formulas only work when the data is independent. Perhaps this is the correct behavior, and it just needs to be documented better?

IMHO 100 steps is too many. The correlation times are very system-dependent: correlation times for small molecule solvation free energies are probably much shorter than that, and are a component in most binding free energies. And either way, having more values is always better than fewer values even if there are correlations.

Then set nstdhdl to be smaller for the run! Very simple solution. But in my experience, 100 steps is actually a pretty good estimate even for small molecule solvation energies.

The uncertainties based on uncorrelated data are worst-case errors,

Uncertainties based on uncorrelated data are exactly what you would get by running the same simulation N times, by definition. So I'm not sure what you mean here.

and must be based on a good estimate of the correlation time itself, which probably requires measurements with intervals shorter than the correlation times.

I think you mean measurements with intervals longer than the correlation time?

Besides, g_bar estimates errors based on variance estimates, and detects most cases where there is severe undersampling that might lead to overly confident error margins.

Then the solution is to set nstdhdl smaller. Perhaps this is an argument for a different default value of nstdhdl, but it is still fully configurable. I would rather have people make the mistake of running a bit too long than undersampling and getting uncertainties that are too low.

Let me know if you have any other questions here! I'm not seeing any need for action based on what you have written, other than explaining things more clearly in the manual, potentially changing the nstdhdl default to something smaller, and revisiting the logic for 5.0.

#10 Updated by Sander Pronk over 6 years ago

One more thing: the default value for init_fep_state is 0, while init_lambda is -1 when not set. That is potentially dangerous because there might be legitimate situations where the user wants to set lambda to a value outside [0,1]

I'd dispute this. Lambda is between end states, and is by definition between 0 and 1. If you want to test other parameters, change the end states. I don't want to give up a completely consistent mathematical description of free energy calculations to allow people shorthand to describe new states. In many cases (softcore, for example), lambdas outside [0,1] give nonsensical, nonphysical potentials. I think it would be best to restrict lambda to [0,1].

and setting magic values in floats/doubles is asking for trouble.

Can you explain this comment more? Sounds like something I need to know more about.

I checked the code and it's simply checking for negative values. That's safe, but obviously checking for specific floating point values is potentially unsafe, especially with tpr settings being xdr-encoded and decoded.
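To illustrate the general concern, a generic C sketch (not GROMACS code): both tests agree on a pristine flag value, but only the range check stays correct if the value is ever perturbed by conversion or arithmetic along the way.

#include <stdio.h>

int main(void)
{
    double init_lambda = -1.0; /* flag value meaning "unset" */

    /* Fragile: requires the flag to survive every round trip bit-exactly. */
    if (init_lambda == -1.0)
    {
        printf("unset (exact comparison)\n");
    }
    /* Robust: any negative value means "unset". */
    if (init_lambda < 0)
    {
        printf("unset (range check)\n");
    }
    return 0;
}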

I'll change the initialization code so that init_fep_state is set to -1 if init_lambda contains a valid value: that should never be a legitimately set value and is therefore safe as a flag value.

Init_fep_state is 0, meaning that the default is state A. init_lambda = -1 is used to tell whether init_lambda has been specified -- if not, it is written over by the value determined by the lambda vectors and init_fep_state. So there are different reasons for the defaults. Perhaps init_fep_state should REQUIRE specification, and in that case, having init_fep_state = -1 would make sense.
I can think about changing it, but that requires going over all the logic to make sure it doesn't break anything. I also don't think it's a good idea to change default values unless you've gone through and checked that it doesn't break the rest of the code. Another way of putting it is: I don't see the bug that changing init_fep_state to -1 would fix.
And at this point in the 4.6 release cycle, I don't think we should be making code changes that don't fix bugs, but might introduce them. I'm happy to reconsider for 5.0.

I hadn't considered the possibility of slow growth based on init_lambda together with discrete states. You're right that it's probably best to disallow negative lambdas and leave the rest of the code as it is.
An unspecified init_fep_state with specified lambda points would be a little dangerous because it is almost certainly a user oversight - IMO the user shouldn't rely on grompp choosing point 0 by default.

IMHO 100 steps is too many. The correlation times are very system-dependent: correlation times for small molecule solvation free energies are probably much shorter than that, and are a component in most binding free energies. And either way, having more values is always better than fewer values even if there are correlations.

Then set nstdhdl to be smaller for the run! Very simple solution. But in my experience, 100 steps is actually a pretty good estimate even for small molecule solvation energies.

The uncertainties based on uncorrelated data are worst-case errors,

Uncertainties based on uncorrelated data are exactly what you would get by running the same simulation N times, by definition. So I'm not sure what you mean here.

Of course!

But in my opinion (and I might very well be wrong here) that doesn't mean that there is no information in samples that are correlated with previous samples. The only case I can think of where sampling between observed correlation times gives you nothing extra is the case of a time series obtained from a single random process with finite correlation times. But the Hamiltonian differences from a thermal system are a highly non-linear function of the combination of many random processes with different characteristic times (see Markov State Modeling) which means that the picture of a system transforming from one uncorrelated state to another without contributions from other states would be incomplete.

I think it's primarily a question of perspective: g_bar makes use of simple block averaging and calculates variance estimates of those averages, which is simple and doesn't give you the data that MBAR (which is based on accurate error estimates) can give. Algorithms like MBAR force you to take a pessimistic approach to error estimates, which is what I meant with 'worst-case errors'. I'm primarily interested in maximum simulation time efficiency given an FEP integration/averaging algorithm, and if delta H data is almost free given an investment in simulation time, then I don't see a reason not to use/calculate it.

and must be based on a good estimate of the correlation time itself, which probably requires measurements with intervals shorter than the correlation times.

I think you mean measurements with intervals longer than the correlation time?

I'd guess you need both for a good estimate of the correlation time.

Besides, g_bar estimates errors based on variance estimates, and detects most cases where there is severe undersampling that might lead to overly confident error margins.

Then the solution is to set nstdhdl smaller. Perhaps this is an argument for a different default value of nstdhdl, but it is still fully configurable. I would rather have people make the mistake of running a bit too long than undersampling and getting uncertainties that are too low.

Let me know if you have any other questions here! I'm not seeing any need for action based on what you have written, other than explaining things more clearly in the manual, potentially changing the nstdhdl default to something smaller, and revisiting the logic for 5.0.

Yes, I have a few questions:
- How does couple-intramol interact with the bonded-lambdas and the restraint-lambdas? Does couple-intramol=no mean that these have no effect?
- Do the restraint-lambdas couple to the pull code restraints?
- How are the different temperatures for temperature-lambdas set?

Happy New Year!
Sander

#11 Updated by Michael Shirts over 6 years ago

I checked the code and it's simply checking for negative values. That's safe, but obviously checking for specific floating point values is potentially unsafe, especially with tpr settings being xdr-encoded and decoded.

readir.c, 581-582 checks the lambdas.
sprintf(err_buf, "Entry %d for %s must be between 0 and 1, instead is %g", i, efpt_names[j], fep->all_lambda[j][i]);
CHECK((fep->all_lambda[j][i] < 0) || (fep->all_lambda[j][i] > 1));

An unspecified init_fep_state with specified lambda points would be a little dangerous because it is almost certainly a user oversight - IMO the user shouldn't rely on grompp choosing point 0 by default.

If init_lambda is specified, then it overrides init_fep_state regardless. So I think you are asking about when neither init_fep_state nor init_lambda is specified, but free energy is on?

But in my opinion (and I might very well be wrong here) that doesn't mean that there is no information in samples that are correlated with previous samples.

You don't get no information, but you don't get very much. I need to do a little work and understand this better. I have some ideas for testing, but don't have time right now.

The only case I can think of where sampling between observed correlation times gives you nothing extra is the case of a time series obtained from a single random process with finite correlation times. But the Hamiltonian differences from a thermal system are a highly non-linear function of the combination of many random processes with different characteristic times (see Markov State Modeling) which means that the picture of a system transforming from one uncorrelated state to another without contributions from other states would be incomplete.

You're dominated by weights that are likely in the current state, so the correlations should be pretty similar. Note that dh/dl and BAR free energies are very close, and dh/dl has standard single variable behavior. So it's a pretty good approximation.

I think it's primarily a question of perspective: g_bar makes use of simple block averaging and calculates variance estimates of those averages, which is simple and doesn't give you the data that MBAR (which is based on accurate error estimates) can give. Algorithms like MBAR force you to take a pessimistic approach to error estimates, which is what I meant with 'worst-case errors'. I'm primarily interested in maximum simulation time efficiency given an FEP integration/averaging algorithm, and if delta H data is almost free given an investment in simulation time, then I don't see a reason not to use/calculate it.

As I said, I don't believe you get that much more efficiency out. The real test is to do 100 independent simulations, with varying subsamplings. For example, if the statistical inefficiency is 10 samples, then calculate the free energy with every 10th sample, 9th sample, 8th sample, etc. When the true sample variance plateaus out, then you know you're not getting any more information out of the simulation by using more data. Exactly how much less than the correlation time you can go -- I don't know. I suspect not much, but I could be wrong.
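A toy version of that test, sketched in C (AR(1) noise stands in for the dH/dl time series; every parameter here is illustrative). If var(mean) stays flat as the stride drops below the correlation time, the extra samples carried essentially no new information.

#include <stdio.h>
#include <stdlib.h>

#define NREP  100    /* independent "simulations" */
#define NSTEP 20000  /* samples per series */

int main(void)
{
    /* AR(1) noise with coefficient phi has a correlation time of roughly
       1/(1-phi) samples; phi = 0.9 gives ~10, matching the example above. */
    double phi       = 0.9;
    int    strides[] = { 1, 2, 5, 10, 20, 50 };

    for (size_t s = 0; s < sizeof(strides)/sizeof(strides[0]); s++)
    {
        int    k   = strides[s];
        double sum = 0, sumsq = 0;

        srand(1234); /* same random stream for every stride */
        for (int r = 0; r < NREP; r++)
        {
            double x = 0, mean = 0;
            int    n = 0;
            for (int t = 0; t < NSTEP; t++)
            {
                x = phi*x + ((double)rand()/RAND_MAX - 0.5);
                if (t % k == 0)
                {
                    mean += x;
                    n++;
                }
            }
            mean  /= n;
            sum   += mean;
            sumsq += mean*mean;
        }
        /* variance of the per-replica means: flat below the correlation
           time means the denser sampling added no information */
        double var = sumsq/NREP - (sum/NREP)*(sum/NREP);
        printf("stride %2d: var(mean) = %g\n", k, var);
    }
    return 0;
}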

maximum simulation time efficiency given an FEP integration/averaging algorithm, and if delta H data is almost free given an investment in simulation time, then I don't see a reason not to use/calculate it.

If it's a LINEAR difference, then it is very cheap and could be done every step. If it's a nonlinear difference, it is not that cheap, and is only done every nstfep steps, which is currently an internal variable -- it could be set independently but right now it's tied to nstdhdl. But it's complicated to tell at runtime whether it's completely linear or not -- the bookkeeping would not be that fun to set up. I think the best solution is just to set nstdhdl to something like 10 or 20, and do some experiments to see. I'm happy to look at this in more depth for 5.0.

Yes, I have a few questions:
- How does couple-intramol interact with the bonded-lambdas and the restraint-lambdas? Does couple-intramol=no mean that these have no effect?

Couple-intramol should just affect nonbondeds. Verifying this is one of the things on my todo list...

- Do the restraint-lambdas couple to the pull code restraints?

Yes. This appears to be working in the tests I've done, though I haven't run through all the pull code options.

- How are the different temperatures for temperature-lambdas set?

GetSimTemps in readir.c. They can be set to be linear, geometric, or exponential. Linear is the default, i.e. it will set them at low + (high-low)*lambda.

#12 Updated by Sander Pronk over 6 years ago

readir.c, 581-582 checks the lambdas.
sprintf(err_buf, "Entry %d for %s must be between 0 and 1, instead is %g", i, efpt_names[j], fep->all_lambda[j][i]);
CHECK((fep->all_lambda[j][i] < 0) || (fep->all_lambda[j][i] > 1));

OK - that's good enough IMO. There is indeed no loss of generality in restricting the lambda range to [0, 1].

An unspecified init_fep_state with specified lambda points would be a little dangerous because it is almost certainly a user oversight - IMO the user shouldn't rely on grompp choosing point 0 by default.

If init_lambda is specified, then it overrides init_fep_state regardless. So I think you are asking about when neither init_fep_state nor init_lambda is specified, but free energy is on?

Yes - sorry, that's what I meant.

The only case I can think of where sampling between observed correlation times gives you nothing extra is the case of a time series obtained from a single random process with finite correlation times. But the Hamiltonian differences from a thermal system are a highly non-linear function of the combination of many random processes with different characteristic times (see Markov State Modeling) which means that the picture of a system transforming from one uncorrelated state to another without contributions from other states would be incomplete.

You're dominated by weights that are likely in the current state, so the correlations should be pretty similar. Note that dh/dl and BAR free energies are very close, and dh/dl has standard single variable behavior. So it's a pretty good approximation.

That's interesting - do you have a reference for that?

I think it's primarily a question of perspective: g_bar makes use of simple block averaging and calculates variance estimates of those averages, which is simple and doesn't give you the data that MBAR (which is based on accurate error estimates) can give. Algorithms like MBAR force you to take a pessimistic approach to error estimates, which is what I meant with 'worst-case errors'. I'm primarily interested in maximum simulation time efficiency given an FEP integration/averaging algorithm, and if delta H data is almost free given an investment in simulation time, then I don't see a reason not to use/calculate it.

As I said, I don't believe you get that much more efficiency out. The real test is to do 100 independent simulations, with varying subsamplings. For example, if the statistical inefficiency is 10 samples, then calculate the free energy with every 10th sample, 9th sample, 8th sample, etc. When the true sample variance plateaus out, then you know you're not getting any more information out of the simulation by using more data. Exactly how much less than the correlation time you can go -- I don't know. I suspect not much, but I could be wrong.

Yes, that would be the way to test it; I'd also be curious about how the correlation times and the shared information correlate with dynamics as measured with MSM or some other dynamic clustering method.

maximum simulation time efficiency given an FEP integration/averaging algorithm, and if delta H data is almost free given an investment in simulation time, then I don't see a reason not to use/calculate it.

If it's a LINEAR difference, then it is very cheap and could be done every step. If it's a nonlinear difference, it is not that cheap, and is only done every nstfep steps, which is currently an internal variable -- it could be set independently but right now it's tied to nstdhdl. But it's complicated to tell at runtime whether it's completely linear or not -- the bookkeeping would not be that fun to set up. I think the best solution is just to set nstdhdl to something like 10 or 20, and do some experiments to see. I'm happy to look at this in more depth for 5.0.

Great. I was thinking of a default for nstdhdl in the range of 20-50: the global communication cost for large systems with high parallelization would also start to add up for lower values.

I'll make a separate commit with those changes (and some changes to the mdp options help file).

- How does couple-intramol interact with the bonded-lambdas and the restraint-lambdas? Does couple-intramol=no mean that these have no effect?

Couple-intramol should just affect nonbondeds. Verifying this is one of the things on my todo list...

OK.

- Do the restraint-lambdas couple to the pull code restraints?

Yes. This appears to be working in the tests I've done, though I haven't run through all the pull code options.

Excellent! That's very useful. Does it couple to restraints in the topology, too (dihedral restraints, etc.)?

- How are the different temperatures for temperature-lambdas set?

GetSimTemps in readir.c. They can be set to be linear, geometric, or exponential. Linear is the default, i.e. it will set them at low + (high-low)*lambda.

OK - I hadn't seen the simulated tempering settings.

#13 Updated by Michael Shirts over 6 years ago

Yes - sorry, that's what I meant.

Let me think about this, but it's lower priority. If you have a proposed commit, and take the time to look over the logic, I can potentially comment.

That's interesting - do you have a reference for that?

You can see Paliwal and Shirts (http://pubs.acs.org/doi/abs/10.1021/ct2003995) but that might not answer all the questions. Part of it is just experience.

Yes, that would be the way to test it; I'd also be curious about how the correlation times and the shared information correlate with dynamics as measured with MSM or some other dynamic clustering method.

Why do the dynamics matter if one is just calculating ensemble averages? It's about how quickly you can generate uncorrelated data.

Great. I was thinking of a default for nstdhdl in the range of 20-50: the global communication cost for large systems with high parallelization would also start to add up for lower values.

The one issue with larger values is the file size. But you can weigh the pros and cons in the discussion.

I'll make a separate commit with those changes (and some changes to the mdp options help file).
Excellent! That's very useful. Does it couple to restraints in the topology, too (dihedral restraints, etc.)?

Yes, that's what it's designed for. Orientation restraints, I believe, are the only things that it's not coupled to because of all the complicated averaging.

#14 Updated by Mark Abraham over 6 years ago

Is there work to do here for the current target 4.6? If so, it needs to take place in the next week or two if it wants to make the 4.6 release. If not, please change the target version to 5.0 or 4.6.1 as applicable.

#15 Updated by Sander Pronk over 6 years ago

I'm hoping to finish something early this week, if not today.

#16 Updated by Sander Pronk over 6 years ago

You can see Paliwal and Shirts (http://pubs.acs.org/doi/abs/10.1021/ct2003995) but that might not answer all the questions. Part of it is just experience.

Thanks!

Yes, that would be the way to test it; I'd also be curious about how the correlation times and the shared information correlate with dynamics as measured with MSM or some other dynamic clustering method.

Why do the dynamics matter if one is just calculating ensemble averages? It's about how quickly you can generate uncorrelated data.

Because presumably the correlation times have something to do with the underlying dynamics.

Excellent! That's very useful. Does it couple to restraints in the topology, too (dihedral restraints, etc.)?

Yes, that's what it's designed for. Orientation restraints, I believe, are the only things that it's not coupled to because of all the complicated averaging.

So how are the A and B states set? From the code, it looks like the restraint lambda simply couples directly to the pull code force. If that is the case, we need to turn that off by default: such a coupling would completely change the default behavior of existing simulation setups - but I may have misinterpreted the code.

#17 Updated by Sander Pronk over 6 years ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 80

(updated as in progress, just in case somebody wants to release early)

#18 Updated by Michael Shirts over 6 years ago

Because presumably the correlation times have something to do with the underlying dynamics.

OK, I see what you mean. It's not always 100% clear. For calculating statistical properties, I'm going with the fundamental statistical definition; if the data is uncorrelated, it is effectively independent. As you can see for small molecule affinities, that works pretty well when compared with independent samples. The only issue is if it's not quite sensitive enough, and you actually need to sample less frequently.

So how are the A and B states set? From the code, it looks like the restraint lambda simply couples directly to the pull code force. If that is the case, we need to turn that off by default: such a coupling would completely change the default behavior of existing simulation setups - but I may have misinterpreted the code.

The only change for the pull code is that instead of passing in the scalar lambda into the restraint code (which was there before), it passes in the restraint component of the lambda vector. I checked a number of setups to make sure that 4.5 .mdp's gave the same result. This should only add new functionality, rather than alter default behavior.

#19 Updated by Michael Shirts over 6 years ago

Is there someone with more experience with the pull code that can try their existing setups? No one has complained so far, but perhaps they haven't been looking.

#20 Updated by Sander Pronk over 6 years ago

So how are the A and B states set? From the code, it looks like the restraint lambda simply couples directly to the pull code force. If that is the case, we need to turn that off by default: such a coupling would completely change the default behavior of existing simulation setups - but I may have misinterpreted the code.

The only change for the pull code is that instead of passing in the scalar lambda into the restraint code (which was there before), it passes in the restraint component of the lambda vector. I checked a number of setups to make sure that 4.5 .mdp's gave the same result. This should only add new functionality, rather than alter default behavior.

Oh wow, this is a surprise to me. As far as I can tell this is not in the manual - and is crucial for setting up binding free energy calculations correctly.

Is there someone with more experience with the pull code that can try their existing setups? No one has complained so far, but perhaps they haven't been looking.

That would be a good idea.

#21 Updated by Michael Shirts over 6 years ago

Oh wow, this is a surprise to me. As far as I can tell this is not in the manual - and is crucial for setting up binding free energy calculations correctly.

Which part is a surprise? The pull code certainly used lambda in 4.5; were you surprised by this? Default behavior from 4.5 should be preserved, because if you put in a scalar lambda in 4.6 instead of a vector, then all components of the lambda vector use the same scalar. If the pull code is not supposed to be using lambda, then that was something introduced by someone else previous to 4.5.

and is crucial for setting up binding free energy calculations correctly.

By which you mean binding free energy calculations which also use the pull code, correct? There are lots of ways to do binding free energy calculations.

#22 Updated by Sander Pronk over 6 years ago

Oh wow, this is a surprise to me. As far as I can tell this is not in the manual - and is crucial for setting up binding free energy calculations correctly.

Which part is a surprise? The pull code certainly used lambda in 4.5; were you surprised by this? Default behavior from 4.5 should be preserved, because if you put in a scalar lambda in 4.6 instead of a vector, then all components of the lambda vector use the same scalar. If the pull code is not supposed to be using lambda, then that was something introduced by someone else previous to 4.5.

I'm surprised the pull code was always coupled to lambda. The most straightforward way to introduce positional restraints on ligands is using the pull code, and apparently having a multi-part FE calculation path that first turns off charges and then vdw breaks this, because lambda is set to 0 twice - turning off the restraints mid-way into the calculation for those unaware of this.

By which you mean binding free energy calculations which also use the pull code, correct? There are lots of ways to do binding free energy calculations.

Of course - it's just that this is the most obvious way and I can't find any place in the manual that says that this will happen.

#23 Updated by Michael Shirts over 6 years ago

I'm surprised the pull code was always coupled to lambda. The most straightforward way to introduce positional restraints on ligands is using the pull code, and apparently having a multi-part FE calculation path that first turns off charges and then vdw breaks this, because lambda is set to 0 twice - turning off the restraints mid-way into the calculation for those unaware of this.

Well, if so, then the new vector format at least fixes this problem, as restraints can be set to change independently.
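For illustration, a 4.6-style vector setup along those lines (values are examples only): charges are turned off first, then vdw, while the restraint stays fully coupled and is never switched off mid-path.

coul-lambdas      = 0.0 0.25 0.5 0.75 1.0 1.0  1.0 1.0  1.0
vdw-lambdas       = 0.0 0.0  0.0 0.0  0.0 0.25 0.5 0.75 1.0
restraint-lambdas = 1.0 1.0  1.0 1.0  1.0 1.0  1.0 1.0  1.0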

#24 Updated by Sander Pronk over 6 years ago

Well, if so, then the new vector format at least fixes this problem, as restraints can be set to change independently.

Indeed! Though it might be a good idea to note this potential issue in the manual.

#25 Updated by Sander Pronk over 6 years ago

  • Status changed from In Progress to Closed

Fixed with I6198f935a94b7341d8dd85444e348ea8a55621ad

#26 Updated by Alexey Shvetsov over 6 years ago

Are there example input files for fep and g_bar for 4.6?
