Incorrect C code produced by mknb in single-precision when configure included --enable-ppc-sqrt
On an IBM PowerPC architecture, when using the configure line
the combination of single-precision and --enable-ppc-sqrt means that line 47 of src/gmxlib/nonbonded/nb_kernel/mknb_innerloop.c outputs C code such as
This should be
and so line 47 of mknb_innerloop.c should read
The conversion to and from double precision probably makes the use of a non-vectorized intrinsic reciprocal square root a net loss. I'd suggest a warning in configure that the combination of single precision and --enable-ppc-sqrt may be ineffective, and that the user should compare performance against (--enable-double and --enable-ppc-sqrt) or (--enable-float and --disable-ppc-sqrt).
#2 Updated by Erik Lindahl over 11 years ago
I've changed it to use the single-precision frsqrtes() instead. The drawback is that this call is only available on POWER5 or later CPUs, but it is more important to get a good speedup on modern processors than a tiny gain (or worse, a slowdown) on old ones.