Bug #1716

TrajectoryAnalysisUnitTests are failing on i686 openmpi double precision

Added by Dominik Mierzejewski over 4 years ago. Updated over 4 years ago.

Status: Closed
Priority: Normal
Assignee: Teemu Murtola
Category: testing
Target version: 5.0.5
Affected version - extra info: 5.0-5.0.4
Affected version:
Difficulty: uncategorized

Description

One of the TrajectoryAnalysisUnitTests is failing in the following configuration:
os: Fedora rawhide
arch: i686
gcc-5.0.1
openmpi-1.8.5
cmake command line:
/usr/bin/cmake -DCMAKE_C_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_Fortran_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_INSTALL_PREFIX:PATH=/usr -DINCLUDE_INSTALL_DIR:PATH=/usr/include -DLIB_INSTALL_DIR:PATH=/usr/lib -DSYSCONF_INSTALL_DIR:PATH=/etc -DSHARE_INSTALL_PREFIX:PATH=/usr/share -DBUILD_SHARED_LIBS:BOOL=ON -DBUILD_SHARED_LIBS=ON -DBUILD_TESTING:BOOL=ON -DCMAKE_C_FLAGS_RELEASE= -DCMAKE_CXX_FLAGS_RELEASE= -DCMAKE_SKIP_RPATH:BOOL=ON -DCMAKE_SKIP_BUILD_RPATH:BOOL=ON -DGMX_BLAS_USER=satlas -DGMX_BUILD_UNITTESTS:BOOL=ON -DGMX_LAPACK_USER=satlas -DGMX_X11=ON -DGMX_SIMD=None -DGMX_MPI=ON -DGMX_THREAD_MPI=OFF -DGMX_DEFAULT_SUFFIX=OFF -D GMX_BINARY_SUFFIX=_openmpi_d -D GMX_LIBS_SUFFIX=_openmpi_d -D GMX_DOUBLE=ON

Failed subtest log:
14: [ RUN ] AngleModuleTest.HandlesOneVsMultipleVectorGroupsAngles
14:
14: WARNING: If there are molecules in the input trajectory file
14: that are broken across periodic boundaries, they
14: cannot be made whole (or treated as whole) without
14: you providing a run input file.
14:
14: Analyzed topology coordinates
14:
14: WARNING: Masses and atomic (Van der Waals) radii will be guessed
14: based on residue and atom names, since they could not be
14: definitively assigned from the information in your input
14: files. These guessed numbers might deviate from the mass
14: and radius of the atom type. Please check the output
14: files if necessary.
14:
14: /builddir/build/BUILD/gromacs-5.0.4/src/testutils/refdata.cpp:931: Failure
14: Value of: value
14: Actual: 0.016666666666666666
14: Expected: refValue
14: Which is: 0.0083333333333333332
14: Difference: 0.00833333 (4503599627370496 double-prec. ULPs)
14: Tolerance: abs. 8.88178e-16, 4 ULPs
14: Google Test trace:
14: /builddir/build/BUILD/gromacs-5.0.4/src/testutils/refdata.cpp:912: Checking '/Data/histogram/Frame1/[1]/[1]/Value'
14: /builddir/build/BUILD/gromacs-5.0.4/src/testutils/refdata.cpp:931: Failure
14: Value of: value
14: Actual: 0
14: Expected: refValue
14: Which is: 0.0083333333333333332
14: Difference: 0.00833333 (4575957461383581969 double-prec. ULPs)
14: Tolerance: abs. 8.88178e-16, 4 ULPs
14: Google Test trace:
14: /builddir/build/BUILD/gromacs-5.0.4/src/testutils/refdata.cpp:912: Checking '/Data/histogram/Frame2/[1]/[1]/Value'
14: [ FAILED ] AngleModuleTest.HandlesOneVsMultipleVectorGroupsAngles (3 ms)
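
The ULP counts in the log above can be reproduced with a short sketch. It assumes that the ULP distance is the difference between the ordered integer bit patterns of the two doubles, which is a common definition but is not confirmed here as the exact method used in refdata.cpp. The helper names `ordered_bits` and `ulp_distance` are hypothetical; the numeric literals are copied from the log.

```python
import struct


def ordered_bits(x):
    """Map a double to an integer that is monotonic in the double's value."""
    (bits,) = struct.unpack("<q", struct.pack("<d", x))
    # Negative doubles are stored as sign + magnitude; fold them onto the
    # negative integers so that neighbouring doubles map to neighbouring ints.
    return bits if bits >= 0 else -(bits & 0x7FFFFFFFFFFFFFFF)


def ulp_distance(a, b):
    """Number of representable doubles between a and b (assumed definition)."""
    return abs(ordered_bits(a) - ordered_bits(b))


ref_value = 0.0083333333333333332   # "Which is" value from the log
frame1_actual = 0.016666666666666666  # "Actual" value for Frame1
frame2_actual = 0.0                   # "Actual" value for Frame2

print(ulp_distance(frame1_actual, ref_value))  # distance reported for Frame1
print(ulp_distance(frame2_actual, ref_value))  # distance reported for Frame2
```

The Frame1 distance is exactly 2^52, which is what one expects when the actual value is twice the reference value: the two doubles share a mantissa and differ by one in the exponent.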

Associated revisions

Revision f42a9e9a (diff)
Added by Teemu Murtola over 4 years ago

Avoid rounding errors affecting results in one test

Change the input for one of 'gmx gangle' tests such that it does not
produce angles that are exactly at an edge of the three-bin histogram
used in the test. Rounding could affect the bin into which the angle
was assigned, causing the test to fail erroneously.

Fixes #1716

Change-Id: I9979a8dfee0b870b3904fa28e274540f892f542d

History

#1 Updated by Dominik Mierzejewski over 4 years ago

Update: the same test is failing with mpich-3.1.4, output is exactly the same.

#2 Updated by Dominik Mierzejewski over 4 years ago

And another update: the same test is failing in a serial build, without MPI.
For reference, fftw and atlas versions used (in all configurations):
fftw-3.3.4
atlas-3.10.1

#3 Updated by Mark Abraham over 4 years ago

I haven't been able to reproduce this on x86_64, and don't think I have an i686 to try. Are you able to reproduce it on an x86_64 machine, Dominik?

#4 Updated by Dominik Mierzejewski over 4 years ago

You can reproduce it in a 32bit chroot or container environment, for example using mock on Fedora. At least, that's how I'm doing it. Also, a new bit of information: it works just fine on ARM 32bit: http://koji.fedoraproject.org/koji/taskinfo?taskID=9506969 so it isn't an issue of 64bit vs 32bit.

#5 Updated by Teemu Murtola over 4 years ago

  • Category set to testing
  • Status changed from New to Accepted

Most likely the issue gets fixed by changing the coordinate file used in the test to not produce an exact 120 degree angle (which falls exactly at a histogram bin edge and can get rounded differently depending on the compiler, architecture, etc.).
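
A minimal sketch of why a value at a bin edge is fragile, assuming the bin index is computed as floor(angle / bin width). This formula and the name `bin_index` are illustrative assumptions, not the actual gangle histogramming code. It requires Python 3.9 or later for `math.nextafter`.

```python
import math

BIN_WIDTH = 60.0  # degrees; the bin width mentioned for this test


def bin_index(angle_deg, bin_width=BIN_WIDTH):
    """Index of the bin that contains angle_deg (assumed floor-based binning)."""
    return int(math.floor(angle_deg / bin_width))


edge = 120.0
# The closest representable double below 120.0, standing in for a result
# that was rounded slightly differently on another compiler or architecture.
just_below = math.nextafter(edge, 0.0)

print(edge - just_below)      # a difference of one ULP, far below any angle tolerance
print(bin_index(edge))        # bin for the exact edge value
print(bin_index(just_below))  # bin for the value one ULP lower
```

The two inputs differ by a single ULP, so a tolerance-based check on the angle itself would pass, yet they land in adjacent bins.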

#6 Updated by Berk Hess over 4 years ago

But we shouldn't have tests that require exact matching of histograms.
Can this test be changed to look at the value of the angle instead (with a margin)?

#7 Updated by Teemu Murtola over 4 years ago

Berk Hess wrote:

But we shouldn't have tests that require exact matching of histograms.
Can this test be changed to look at the value of the angle instead (with a margin)?

The test already checks the actual angles (with a margin) in addition to the histogram. The histogram is computed with a 60 degree bin width from two angle values, so checking it directly with existing code is a much more straightforward way to test the basic histogramming than writing elaborate extra code to test, e.g., the integral of the histogram. We could of course drop the histogram check entirely and just trust that it works without any testing, but then we could miss issues.
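
For illustration, here is a sketch of a three-bin histogram built from two angles with a 60 degree bin width. It assumes the histogram is normalized as a probability density (count divided by number of angles times bin width); that assumption is inferred from the reference value 0.00833... = 1/(2 * 60) in the failure log, not taken from the source. The input angles and the name `density_histogram` are hypothetical. It requires Python 3.9 or later for `math.nextafter`.

```python
import math


def density_histogram(angles, bin_width=60.0, n_bins=3):
    """Histogram of angles in degrees, normalized to a probability density."""
    counts = [0] * n_bins
    for angle in angles:
        # Floor-based binning (assumed); clamp so that 180.0 stays in the last bin.
        idx = min(int(math.floor(angle / bin_width)), n_bins - 1)
        counts[idx] += 1
    norm = len(angles) * bin_width
    return [count / norm for count in counts]


# Hypothetical input: one angle well inside a bin, one exactly at the 120 degree edge.
exact = density_histogram([90.0, 120.0])
# Same input, but the edge value comes out one ULP lower.
nudged = density_histogram([90.0, math.nextafter(120.0, 0.0)])

print(exact)
print(nudged)
```

Under these assumptions a one-ULP change in a single angle moves a whole count between bins, so a bin value changes from 1/120 to 2/120 or to 0. These are the same magnitudes as the actual values reported in the failure log, and no ULP-level tolerance on the histogram can absorb such a change.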

#8 Updated by Gerrit Code Review Bot over 4 years ago

Gerrit received a related patchset '1' for Issue #1716.
Uploader: Teemu Murtola
Change-Id: I9979a8dfee0b870b3904fa28e274540f892f542d
Gerrit URL: https://gerrit.gromacs.org/4539

#9 Updated by Teemu Murtola over 4 years ago

  • Status changed from Accepted to Fix uploaded
  • Assignee set to Teemu Murtola
  • Target version set to 5.0.5

#10 Updated by Teemu Murtola over 4 years ago

  • Status changed from Fix uploaded to Resolved

Fix has been merged.

#11 Updated by Teemu Murtola over 4 years ago

  • Affected version - extra info set to 5.0-5.0.4

#12 Updated by Mark Abraham over 4 years ago

  • Status changed from Resolved to Closed
