Bug #250

erroneous type-hackery in BlueGene kernels

Added by Mark Abraham about 11 years ago. Updated about 11 years ago.

Status: Closed
Priority: Normal
Assignee: Erik Lindahl
Category: mdrun
Target version:
Affected version - extra info:
Affected version:
Difficulty: uncategorized

Description

src/gmxlib/nonbonded/nb_kernel_bluegene/nb_kernel300_bluegene.h and src/gmxlib/nonbonded/nb_kernel_bluegene/nb_kernel312_bluegene.h each contain a manual typedef of float to real instead of #include <types/simple.h>. The other nb_kernelxxx_bluegene.h headers seem to #include <types/simple.h> in the normal fashion, so the fix would seem to be to remove these typedefs and add the #include, probably before the 'extern "C"' bit.
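A minimal sketch of the header layout proposed above, written so that it compiles on its own. The typedef block at the top is only a stand-in for <types/simple.h>, which is assumed here to select real from a GMX_DOUBLE macro; sum_forces is a hypothetical function used purely for illustration and is not one of the actual kernels.

```c
#include <stddef.h>

/* Stand-in for <types/simple.h> (assumption): real follows the build
 * precision instead of being hard-wired by a local "typedef float real". */
#ifdef GMX_DOUBLE
typedef double real;
#else
typedef float real;
#endif

/* The precision-dependent type is defined BEFORE the extern "C" block,
 * mirroring the suggestion to place the #include ahead of it. */
#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical kernel-style declaration: uses real, never literal float. */
real sum_forces(const real *f, size_t n);

#ifdef __cplusplus
}
#endif

/* Hypothetical definition so the sketch links and can be exercised. */
real sum_forces(const real *f, size_t n)
{
    real total = 0;
    size_t i;

    for (i = 0; i < n; i++)
    {
        total += f[i];
    }
    return total;
}
```

Compiling the same file with -DGMX_DOUBLE switches every declaration to double without touching the header, which is the point of routing the type through a single definition.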

Separately, src/gmxlib/nonbonded/nb_kernel_bluegene/nb_kernel_w4_bluegene.h and src/gmxlib/nonbonded/nb_kernel_bluegene/nb_kernel_w3_bluegene.h each define generalized function declarations and code for the water-optimized kernels. These use literal "float" types in the function declarations and in some variable declarations. These should all be "real", of course.
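To illustrate why a literal "float" in a declaration matters in a double-precision build, the sketch below reads the same buffer of doubles once as double and once as float. It assumes IEEE 754 with a 4-byte float and an 8-byte double; both helper names are hypothetical, and memcpy is used so that the reinterpretation is well defined. It only shows the general hazard and is not a diagnosis of the crashes reported here.

```c
#include <string.h>

/* Hypothetical helper: reads the first element the way code hard-coded
 * to float would, i.e. it consumes only sizeof(float) bytes. */
float read_first_as_float(const void *buf)
{
    float x;

    memcpy(&x, buf, sizeof x);
    return x;
}

/* Hypothetical helper: reads the first element with the type the caller
 * actually stored in a double-precision build. */
double read_first_as_double(const void *buf)
{
    double x;

    memcpy(&x, buf, sizeof x);
    return x;
}
```

With a buffer holding the double 1.5, read_first_as_double returns 1.5, while read_first_as_float returns a different value on both little- and big-endian machines, because it interprets only half of the stored bytes.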

These make one suspect that development and testing only took place in single precision!

With the above modifications, I succeeded in compiling for a BlueGene/L in both single- and double-precision. However test runs at both single- and double-precision "blew up" and dumped core, suggesting some kind of mismatch in calling these kernels. My runs should have been calling nb_kernel310_bluegene.o and I'm still looking for what's going on. The same runs using generic C or FORTRAN loops at either precision seem fine.

History

#1 Updated by Erik Lindahl about 11 years ago

Fixed, but not tested. We should ask IBM to donate a BG to us for release testing ;-)
