Double precision is broken on big-endian floating point systems with cmake
src/gmxlib/maths.c has a few #ifdef IEEE754_BIG_ENDIAN_WORD_ORDER lines, but CMake defines the floating point format macros with a GMX_ prefix. This means that double-precision gmx_erf() and gmx_erfc() will give nonsense results if Gromacs is compiled with CMake on a system where this define should be set. With autoconf, the macro is defined as the name that exists in the code.
If this is relevant for some architecture where Gromacs is used, then I suggest this is fixed for 4.5.6 and/or 4.6, since the fix is very simple, but set the target version however you want. If it is not relevant, I would suggest considering whether the floating-point format check is necessary at all. Note that it checks for a number of things, but these few #ifdefs are the only ones that actually use the results. Considering the comments in gmxTestFloatFormat.cmake, it's also questionable whether the code in question works with all possible combinations of integer and float endianness...
Fix endian test functionality
Code for big-endian systems was silently failing with CMake. This
patch brings CMake and autotools into agreement.
Partial fix of #944
#1 Updated by Erik Lindahl almost 6 years ago
We should fix this; right now the big-endian side of the world is a bit deserted (apart from Power), but with ARM64 appearing in a year or two we need to make sure we don't assume things are intel-only. (ARM has some horribly strange behavior with different endian-ness of bytes inside words and the multiple words in a double precision floating-point number).
#2 Updated by Mark Abraham over 5 years ago
I attempted a fix for this without realising it was a known problem. https://gerrit.gromacs.org/#/c/1076/
BlueGene indeed requires those #defines to work correctly, so this definitely requires fixing. It affects all GROMACS builds that use CMake.
The automated testing for endianness does not work with the XLC cross-compiler for BlueGene/P. I have no idea why. I have hacked a hard fix for BlueGene, which I'll add in a separate commit.
#4 Updated by Teemu Murtola over 5 years ago
- Target version set to 4.5.6
- Affected version - extra info changed from 4.5.* to 4.5-4.5.5
Roland Schulz wrote:
What remains to be done after Mark's (partial) fix?
Mark's fix should make the cmake build work as well as the autoconf build did. Set the target version and known affected versions based on this, as the remaining problems (if any) are not common.Still, I think it's quite confusing that:
- We explicitly test for the byte endian order of floats, but the result of the byte endian order is never used.
- The test code has comments that one should not assume the endianness to be the same as integer endian.
- The code in maths.c most likely will not work if the integer and double byte-endian order is different.
Don't know how relevant this is, but perhaps we could at least issue an error in CMake if we detect a mismatch, or clarify the comments.
#5 Updated by Teemu Murtola about 5 years ago
- Status changed from New to Closed
- Assignee changed from Erik Lindahl to Mark Abraham
It seems that those confusing comments are not high on the priority list. Since the major issue is fixed, marking this as closed. If there ever is an architecture where the endianness differences between integers and floating-point numbers cause issues, I guess we will find out from bug reports...