BlueGene kernels have various problems
1) There's a six-fold loop unroll in nb_kernel_gen_bluegene.h which constructs code for non-solvent-optimized kernels. It has at least two problems.
a) Generalized Born would be horribly broken, since in six places "==" was typed where "-=" was intended, so the dvda updates never happen.
b) Any code with a Coulomb interaction is likely to be broken, since the fifth and sixth charges in the unrolled loop were not loaded - the third and fourth charges were recycled instead.
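To illustrate problem a), here is a minimal C sketch of why the typo is silent. The function names and the plain double arrays are hypothetical stand-ins, not the actual kernel code (which works on complex pairs via __creal). In C, "x == y;" is a valid expression statement that compares and discards the result, so the accumulator is never touched.

```c
/* Hypothetical stand-ins for one accumulation step of the unrolled loop. */

/* Buggy form: "==" only compares; dvda[j] is left unchanged.
 * The cast to void is there only to silence the "unused value" warning
 * that most compilers would otherwise emit for this statement. */
static void update_buggy(double *dvda, int j, double dvdaj)
{
    (void)(dvda[j] == dvdaj);
}

/* Intended form: subtract the contribution from the accumulator. */
static void update_fixed(double *dvda, int j, double dvdaj)
{
    dvda[j] -= dvdaj;
}
```

Starting from dvda[0] = 1.0 and applying a contribution of 0.25, the buggy form leaves 1.0 in place while the fixed form gives 0.75. Compiling with unused-value warnings enabled would typically flag this kind of typo.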
A cvs diff of my fixes follows:
jcs158161 181# cvs -z3 -d :pserver:email@example.com:/home/gmx/cvs diff nb_kernel_gen_bluegene.h
Index: nb_kernel_gen_bluegene.h
===================================================================
RCS file: /home/gmx/cvs/gmx/src/gmxlib/nonbonded/nb_kernel_bluegene/nb_kernel_gen_bluegene.h,v
retrieving revision 1.1
< dvda[jnr11] == __creal(dvdaj);
< dvda[jnr21] == __creal(dvdaj);
---
> dvda[jnr11] -= __creal(dvdaj);
> dvda[jnr21] -= __creal(dvdaj);
#if COULOMB != COULOMB_NONE
qq = _fxpmul(_cmplx(charge[jnr13],charge[jnr23]),_iq);
< dvda[jnr12] == __creal(dvdaj);
< dvda[jnr22] == __creal(dvdaj);
---
> dvda[jnr12] -= __creal(dvdaj);
> dvda[jnr22] -= __creal(dvdaj);
< dvda[jnr13] == __creal(dvdaj);
< dvda[jnr23] == __creal(dvdaj);
---
> dvda[jnr13] -= __creal(dvdaj);
> dvda[jnr23] -= __creal(dvdaj);
This seems to work - equilibrated NVT runs with PME have had stable total energy, temperature and pressure through 10k+ steps at both single- and double-precision.
2) Puetz plays around with the sign of fscal in interaction.h. The results are correct for COULOMB_TAB and GENERALIZED_BORN: relative to the GMX generic C kernel, he negates fscal when he calculates it from the components, and then negates the sense in which fscal is used to create the force components, so the two negations cancel. I don't think the kernels work for REACTION_FIELD or COULOMB_CUTOFF (since the calculation is normal-sense and the usage is still negative-sense), but I haven't put any time into analysing that at this stage.
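The sign bookkeeping in 2) can be sketched as follows. This is a 1-D toy with hypothetical names (pair_force, apply_generic, apply_negated), not the code in interaction.h; it only shows why a negated fscal combined with a negated usage matches the generic convention, and what happens when only one of the two is negated.

```c
/* Hypothetical 1-D pair: force on atom i and on atom j. */
typedef struct {
    double fi;
    double fj;
} pair_force;

/* Generic-style convention: fscal*dx is added to i and subtracted from j. */
static pair_force apply_generic(double fscal, double dx)
{
    pair_force f = { fscal * dx, -(fscal * dx) };
    return f;
}

/* Negated usage: expects an already-negated fscal, so it
 * subtracts from i and adds to j. */
static pair_force apply_negated(double fscal_neg, double dx)
{
    pair_force f = { -(fscal_neg * dx), fscal_neg * dx };
    return f;
}
```

With fscal = 2.0 and dx = 0.5, apply_generic(2.0, 0.5) and apply_negated(-2.0, 0.5) both give fi = 1.0 and fj = -1.0. Passing a normal-sense value to the negated usage, apply_negated(2.0, 0.5), reverses both forces, which is the failure mode suspected above for REACTION_FIELD and COULOMB_CUTOFF.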
3) I haven't tested any of the water loops yet.
4) Combined with the other problems I've noticed (e.g. http://bugzilla.gromacs.org/show_bug.cgi?id=250) it seems to me that these kernels were never tested in a meaningful fashion. I'll run gmxtest.pl on them when I get a chance.
#1 Updated by Erik Lindahl almost 11 years ago
I've committed the diff you specified, but unfortunately I fear there will be more BG bugs until we get shell access on a machine and start testing things more systematically. Right now we can't check these changes, but we appreciate somebody else working on it! Please reopen when you have more info on the last part!
#2 Updated by Mark Abraham almost 11 years ago
I've actually done a serious re-work of the BG kernels to use more inlined functions rather than Puetz's programmer- and debugger-unfriendly macros. As far as I have seen so far, the performance difference is negligible, as you would hope. I still haven't tried kernels other than 310/313/314 and my own CHARMM-TIP3P ones, but I'm happy to do so (e.g. with gmxtest.pl) and if other things also look functional, suggest including my versions in 4.0.3. I'm on partial leave for another week or two, however. Is there a rough timeline for 4.0.3?