Task #1058

gcc 4.4.something up to at least 4.6.0 can't compile AVX256 kernels

Added by Mark Abraham over 6 years ago. Updated over 6 years ago.

Status: Closed
Priority: High
Assignee:
Category: mdrun
Target version:
Difficulty: uncategorized

Associated revisions

Revision 80721741 (diff)
Added by Erik Lindahl over 6 years ago

Added work-around for gcc bug in AVX intrinsics formal parameter

Many relatively recent gcc versions (4.5.2, possibly 4.6.0) use incorrect
formal parameters for maskload and maskstore intrinsics. This patch
detects the bug during configuration and works around it with a define.
Refs #1058, and fixes it at least from gcc-4.5.2.

Change-Id: I447a968d2153f5acf59ad170ace69ad1b6b3f24d
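
As a rough illustration of the kind of work-around the commit message describes, the sketch below wraps the masked load/store intrinsics in macros that are switched by a configure-time define. The macro name HYPOTHETICAL_GCC_MASKLOAD_BUG, the my_mm256_* wrappers and the copy_first_n helper are assumptions made for this example and are not taken from the patch. The sketch also assumes that the affected compilers declare the mask parameter as a float vector rather than an integer vector, so the wrapper casts the mask in that case. A scalar fallback keeps the example compilable when AVX is not enabled.

```c
#if defined(__AVX__)
#include <immintrin.h>

/* HYPOTHETICAL_GCC_MASKLOAD_BUG is an illustrative name. A configure-time
 * compile test would define it when the compiler's maskload/maskstore
 * prototypes expect a float-vector mask instead of an integer-vector mask. */
#ifdef HYPOTHETICAL_GCC_MASKLOAD_BUG
#  define my_mm256_maskload_ps(mem, mask) \
       _mm256_maskload_ps((mem), _mm256_castsi256_ps(mask))
#  define my_mm256_maskstore_ps(mem, mask, x) \
       _mm256_maskstore_ps((mem), _mm256_castsi256_ps(mask), (x))
#else
#  define my_mm256_maskload_ps(mem, mask) \
       _mm256_maskload_ps((mem), (mask))
#  define my_mm256_maskstore_ps(mem, mask, x) \
       _mm256_maskstore_ps((mem), (mask), (x))
#endif
#endif /* __AVX__ */

/* Copy the first n floats (at most 8) from src to dst and leave the
 * remaining elements of dst untouched. */
static void copy_first_n(const float *src, float *dst, int n)
{
#if defined(__AVX__)
    /* A lane is selected when the top bit of its mask element is set;
     * -(n > k) evaluates to -1 (all bits set) or 0. */
    __m256i mask = _mm256_setr_epi32(-(n > 0), -(n > 1), -(n > 2), -(n > 3),
                                     -(n > 4), -(n > 5), -(n > 6), -(n > 7));
    __m256  v    = my_mm256_maskload_ps(src, mask);
    my_mm256_maskstore_ps(dst, mask, v);
#else
    /* Scalar fallback so the sketch also builds without AVX support. */
    for (int i = 0; i < n && i < 8; i++)
    {
        dst[i] = src[i];
    }
#endif
}
```

Kernel code would then call only the wrapper macros, so the cast stays in one place and disappears once the define is no longer set.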

History

#1 Updated by Roland Schulz over 6 years ago

The patch is included in gcc 4.4.6 and 4.5.3, which is why Jenkins doesn't complain. I can't find which 4.6 version has the fix. We could add a corresponding line to TestAVX.c to check for this bug.

#2 Updated by Mark Abraham over 6 years ago

Thanks.

Or we could add more general checks for the intrinsics we use for the configured acceleration level. This is an area in which compilers will tend to suck a bit. Feature tests are of course preferred over version checks. git grep _mm_, sed and uniq should be able to identify the ones we ever use.
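
The identification step could also be sketched in C rather than with shell tools. The function below scans a piece of source text and collects each distinct identifier that starts with _mm. The name collect_intrinsics, the fixed-size table and the choice to match the prefix _mm (so that 256-bit names such as _mm256_maskload_ps are included, not only names containing _mm_) are assumptions for this illustration.

```c
#include <ctype.h>
#include <string.h>

#define MAX_NAMES 64   /* illustrative capacity of the name table */
#define MAX_LEN   64   /* illustrative maximum identifier length, incl. NUL */

/* Collect the unique identifiers in text that start with "_mm".
 * Names are stored in first-seen order; returns how many were stored. */
static int collect_intrinsics(const char *text, char names[][MAX_LEN],
                              int max_names)
{
    int         count = 0;
    const char *p     = text;

    while (*p != '\0')
    {
        if (isalnum((unsigned char)*p) || *p == '_')
        {
            /* Consume one whole identifier-like token. */
            const char *start = p;
            while (isalnum((unsigned char)*p) || *p == '_')
            {
                p++;
            }
            size_t len = (size_t)(p - start);

            if (len >= 3 && len < MAX_LEN && strncmp(start, "_mm", 3) == 0)
            {
                /* Skip names that are already in the table. */
                int seen = 0;
                for (int i = 0; i < count; i++)
                {
                    if (strlen(names[i]) == len
                        && strncmp(names[i], start, len) == 0)
                    {
                        seen = 1;
                        break;
                    }
                }
                if (!seen && count < max_names)
                {
                    memcpy(names[count], start, len);
                    names[count][len] = '\0';
                    count++;
                }
            }
        }
        else
        {
            p++;
        }
    }
    return count;
}
```

Each collected name could then drive one small configure-time compile test. Wrapper names such as gmx_mm256_set_m128 are skipped because the token does not start with _mm.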

Our kernel tests should serve as correctness tests for compiler intrinsics. (Once I get time to organize their distribution.)

#3 Updated by Christoph Junghans over 6 years ago

Similar problems on my core i7 with avx256 acceleration for double precision:
/var/tmp/portage/sci-chemistry/gromacs-4.6_beta2/work/gromacs-4.6-beta2/src/gmxlib/nonbonded/nb_kernel_avx_256_double/nb_kernel_ElecRFCut_VdwNone_GeomW3W3_avx_256_double.c: In function 'nb_kernel_ElecRFCut_VdwNone_GeomW3W3_F_avx_256_double':
/var/tmp/portage/sci-chemistry/gromacs-4.6_beta2/work/gromacs-4.6-beta2/src/gmxlib/nonbonded/nb_kernel_avx_256_double/nb_kernel_ElecRFCut_VdwNone_GeomW3W3_avx_256_double.c:1675:22: error: incompatible types when assigning to type '__m128' from type '__vector(2) double

#4 Updated by Berk Hess over 6 years ago

  • Assignee changed from Mark Abraham to Erik Lindahl
  • Priority changed from Normal to 6

I think all instances of gmx_mm256_set_m128 in the AVX double kernels should be replaced by gmx_mm256_set_m128d.
I guess that these kernels have never been compiled, nor tested then?

#5 Updated by Berk Hess over 6 years ago

PS This shows we need an AVX-256 double precision build.

#6 Updated by Roland Schulz over 6 years ago

These are two different problems, correct? 1) Some gcc versions fail to compile the single precision kernels. 2) The double precision kernels have a bug. I think it would be better to have one redmine issue per problem and open a separate issue for the second (double precision) problem.

#7 Updated by Martin Hoefling over 6 years ago

Created separate bug #1074 as suggested by Roland for the double precision gcc compilation problem (#3 from Christoph Junghans).

#8 Updated by Erik Lindahl over 6 years ago

Mostly fixed by gerrit 1944. There is one remaining related caveat, for the record: gcc-4.5.2 appears to produce buggy code for AVX128 FMA instructions. That is not entirely surprising, since it was released before the AMD hardware was available, and for that reason I'm not going to spend any time tracking down the exact versions where it was fixed. It works fine with gcc-4.6.3, which comes with e.g. the Ubuntu release that we installed the first second the Bulldozer hardware was available.

#9 Updated by Erik Lindahl over 6 years ago

  • Status changed from New to Closed
