## Bug #2862

### Division by zero in restrained dihedrals

**Description**

The following code in restcbt.cpp, line 154 can lead to division by zero (variable sine_phi_sq is set to zero).

`/* It is possible that cosine_phi is slightly bigger than 1.0 due to round-off errors. */`

if (sine_phi_sq < 0.0)
{

sine_phi_sq = 0.0;

}

`< snip >`

`/* Computation of the prefactor - common term for all forces */`

*prefactor_phi = -(k_torsion) * delta_cosine * norm_phi * term_phi_phi0 / (sine_phi_sq * sine_phi_sq);

### Associated revisions

### History

#### #1 Updated by Mark Abraham almost 2 years ago

Nico, should we improve something here?

#### #2 Updated by Berk Hess almost 2 years ago

Unrelated to the division by zero, but the factor (sine_theta_sq)^2 in the denominator of prefactor looks suspicious to me. I would expect a factor sin^2, not a factor sin^4.

#### #3 Updated by Berk Hess almost 2 years ago

Maybe the sin^4 factor could be correct, but that is very difficult to verify without the definitions of any of the variables in the code.

The potential diverges at theta=0, so you should simply never put in such a configuration!

#### #4 Updated by David van der Spoel almost 2 years ago

Here is an issue related to the same code, where single and double produce completely different results.

#### #5 Updated by David van der Spoel almost 2 years ago

For sure the code is not sufficiently robust. Maybe some variables need to be made double precision.

#### #6 Updated by Gerrit Code Review Bot almost 2 years ago

Gerrit received a related patchset '1' for Issue #2862.

Uploader: David van der Spoel (spoel@xray.bmc.uu.se)

Change-Id: gromacs~master~Iad7f2bc45be996ba3e16358aab838c5427b157b8

Gerrit URL: https://gerrit.gromacs.org/9148

#### #7 Updated by David van der Spoel almost 2 years ago

**Status**changed from*New*to*Resolved*

Applied in changeset dfa2114913705207ddf932ef753859cf46dfa851.

#### #8 Updated by Erik Lindahl almost 2 years ago

**Status**changed from*Resolved*to*In Progress*

I'm reopening this again so we do not close it, but fix it properly.

While moving to double might have fixed one particular case, that is never a proper solution to make numerically sensitive operatios robust - it just means the error will be rarer. To fix it, we have to understand

- Is it allowed for the dihedrals to have values in the range where this can happen? (likely yes, even if it is rare/stupid). If the answer to that is "no", we should add tests to catch the bad configurations.
- How do we fix the equations so we can handle the case where theta approaches 0?
- Given those fixes, the obvious test is to confirm the code works fine in single precision

For the record, doing some calculations in double is really horrible from a performance p-o-v since it forces single-to-double-to-single conversions that are very slow.

#### #9 Updated by Nicolae goga almost 2 years ago

Friends,

Sorry again for the late reply. Code was done according to the physics described in the cited article. So that the formula with cosinus at power 4 should be fine.

Nicu

#### #10 Updated by Mark Abraham almost 2 years ago

Nicolae goga wrote:

Sorry again for the late reply. Code was done according to the physics described in the cited article. So that the formula with cosinus at power 4 should be fine.

Nicu, what happens as theta approaches zero? How should we implement this stably?

#### #11 Updated by Mark Abraham almost 2 years ago

Berk Hess wrote:

Maybe the sin^4 factor could be correct, but that is very difficult to verify without the definitions of any of the variables in the code.

Also, Nicu can you please have a look at the code and propose some better comments? If the core team can't maintain code because we can't understand it, then we will have to remove it.

#### #12 Updated by Nicolae goga almost 2 years ago

Friends,

I was discussing with Monica Bulacu, the main author of the theory. The if part with

if (sine_phi_sq < 0.0)
{

sine_phi_sq = 0.0;

}

should be removed. Citing Monica, this will never happen and it was an error of translating the code from Fortrand (her code) to Gromacs.

I hope that this helps

Keep in touch

Nicu

#### #13 Updated by Erik Lindahl almost 2 years ago

I had a closer look at this code, and it appears to be written in a very fragile way.

- First, sin^2(x) should NEVER be calculated as 1.0-cos^2(x) if there is a risk x is anywhere close to zero, since it will lead to a drastic loss-of-precision. The proper solution is rather to calculate sin(x) directly from a cross product instead.

- Second, if "Code was done according to the physics described in the cited article", that might be a sign the equations are also bad and do not properly take into account what happens if the value approaches 0 (which I would assume means linear configurations). Even if that is not expected to happen in a stable restrained system, it might definitely happen for an initial system we want to restrain.

- As Mark asks, the code has to be implemented in a way where both the potential and derivative are stable as the value approaches zero.

Notably, this appears to be the case for all three routines. David's earlier fix of using double precision will not fix the problem - it just makes it slightly more rare, and created "reproducibility" in the sense that we get the same errors in single and double precision builds (since both use double).

#### #14 Updated by Erik Lindahl almost 2 years ago

PS Nicu: I very much agree with Mark this code urgently needs proper documentation, being made stable, and proper unit tests.

#### #15 Updated by Paul Bauer about 1 year ago

**Target version**changed from*2020*to*2020.1*

not done yet, but needs to be fixed for 2020.1

#### #16 Updated by Paul Bauer 11 months ago

**Status**changed from*In Progress*to*Resolved*

is the bug itself fixed with the change from David?

The refactoring should be its own issue

#### #17 Updated by Paul Bauer 11 months ago

**Status**changed from*Resolved*to*Closed*

Precision fix for rescbt code.

The compute_restangles function was not sufficiently

precise. In fact it return completely different

results in single and double precision. By making part

of the function work in double the issue is fixed.

New tests added.

Part of #2795

Fixes #2862

Change-Id: Iad7f2bc45be996ba3e16358aab838c5427b157b8