Bug #2312

GROMACS-2018 beta-1 does not compile in double with AVX_512

Added by Carsten Kutzner over 1 year ago. Updated over 1 year ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
mdrun
Target version:
Affected version - extra info:
Affected version:
Difficulty:
uncategorized

Description

When I try to compile version 2154a4f95ffccaf1c75315c0 (2 commits past 2018 beta-1) with gcc 4.8.5 in double precision, I run into the following issue:

[ 5%] Building CXX object src/gromacs/CMakeFiles/libgromacs.dir/commandline/cmdlinehelpmodule.cpp.o
[ 5%] Building CXX object src/gromacs/CMakeFiles/libgromacs.dir/commandline/cmdlinehelpwriter.cpp.o
In file included from /tmp/git-gromacs-2018/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512.h:46:0,
from /tmp/git-gromacs-2018/src/gromacs/simd/simd.h:126,
from /tmp/git-gromacs-2018/src/gromacs/listed-forces/listed-forces.cpp:69:
/tmp/git-gromacs-2018/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512_util_float.h: In function ‘gmx::SimdFloat gmx::loadU4NOffset(const float*, int)’:
/tmp/git-gromacs-2018/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512_util_float.h:504:74: error: cannot convert ‘const float*’ to ‘const double*’ for argument ‘2’ to ‘__m512d _mm512_i32gather_pd(__m256i, const double*, int)’
_mm512_castpd_ps(_mm512_i32gather_pd(gdx, f, sizeof(float)))
^
/tmp/git-gromacs-2018/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512_util_float.h:505:5: error: could not convert ‘{<expression error>}’ from ‘<brace-enclosed initializer list>’ to ‘gmx::SimdFloat’
};
^
In file included from /tmp/git-gromacs-2018/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512.h:46:0,
from /tmp/git-gromacs-2018/src/gromacs/simd/simd.h:126,
from /tmp/git-gromacs-2018/src/gromacs/pbcutil/pbc-simd.h:51,
from /tmp/git-gromacs-2018/src/gromacs/listed-forces/bonded.cpp:65:
/tmp/git-gromacs-2018/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512_util_float.h: In function ‘gmx::SimdFloat gmx::loadU4NOffset(const float*, int)’:
/tmp/git-gromacs-2018/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512_util_float.h:504:74: error: cannot convert ‘const float*’ to ‘const double*’ for argument ‘2’ to ‘__m512d _mm512_i32gather_pd(__m256i, const double*, int)’
_mm512_castpd_ps(_mm512_i32gather_pd(gdx, f, sizeof(float)))
^
/tmp/git-gromacs-2018/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512_util_float.h:505:5: error: could not convert ‘{<expression error>}’ from ‘<brace-enclosed initializer list>’ to ‘gmx::SimdFloat’
};
^
In file included from /tmp/git-gromacs-2018/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512.h:46:0,
from /tmp/git-gromacs-2018/src/gromacs/simd/simd.h:126,
from /tmp/git-gromacs-2018/src/gromacs/pbcutil/pbc-simd.h:51,
from /tmp/git-gromacs-2018/src/gromacs/listed-forces/pairs.cpp:58:
/tmp/git-gromacs-2018/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512_util_float.h: In function ‘gmx::SimdFloat gmx::loadU4NOffset(const float*, int)’:
/tmp/git-gromacs-2018/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512_util_float.h:504:74: error: cannot convert ‘const float*’ to ‘const double*’ for argument ‘2’ to ‘__m512d _mm512_i32gather_pd(__m256i, const double*, int)’
_mm512_castpd_ps(_mm512_i32gather_pd(gdx, f, sizeof(float)))
^
/tmp/git-gromacs-2018/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512_util_float.h:505:5: error: could not convert ‘{<expression error>}’ from ‘<brace-enclosed initializer list>’ to ‘gmx::SimdFloat’
};
^
[ 5%] Building CXX object src/gromacs/CMakeFiles/libgromacs.dir/commandline/cmdlineinit.cpp.o
make[2]: *** [src/gromacs/CMakeFiles/libgromacs.dir/listed-forces/listed-forces.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[2]: *** [src/gromacs/CMakeFiles/libgromacs.dir/listed-forces/pairs.cpp.o] Error 1
make[2]: *** [src/gromacs/CMakeFiles/libgromacs.dir/listed-forces/bonded.cpp.o] Error 1
make[1]: *** [src/gromacs/CMakeFiles/libgromacs.dir/all] Error 2
make: *** [all] Error 2

Associated revisions

Revision 6456d2f1 (diff)
Added by Erik Lindahl over 1 year ago

Fix compilation issues for AVX-512

- gcc-5.4.0 incorrectly requires the second argument of
_mm512_i32gather_pd() to be a double pointer instead
of void, but this should fix compilation for both
cases.
- Work around double precision permute instruction
only available with AVX512VL instructions.

Fixes #2312.

Change-Id: I31420e71064b1c5c25c8af29a1d41c7f372375c1
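
The first point of the commit message can be illustrated without AVX-512 hardware. The sketch below is a scalar emulation, not the GROMACS code or the actual patch: `gatherVoidBase`, `gatherDoubleBase` and `castSatisfiesBothPrototypes` are hypothetical names. It assumes the gather copies 8 bytes per index from `base + index * scale` bytes, and it shows that an explicit cast to `const double*` is accepted by both a void-pointer prototype (as in Intel's documentation) and a double-pointer prototype (as in the gcc-5.4.0 log above), because `const double*` converts implicitly to `const void*`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Scalar emulation behind a void-pointer prototype (hypothetical name).
// For each of the 8 indices, copy 8 bytes (two floats) from
// base + index[i] * scale bytes into the output.
inline void gatherVoidBase(const std::int32_t (&index)[8], const void* base, int scale, float (&out)[16])
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    const char* bytes = static_cast<const char*>(base);
    for (int i = 0; i < 8; i++)
    {
        std::memcpy(&out[2 * i], bytes + static_cast<std::ptrdiff_t>(index[i]) * scale, 2 * sizeof(float));
    }
}

// Same operation behind a double-pointer prototype (hypothetical name),
// mirroring the signature quoted in the gcc-5.4.0 error message.
inline void gatherDoubleBase(const std::int32_t (&index)[8], const double* base, int scale, float (&out)[16])
{
    gatherVoidBase(index, base, scale, out);
}

// Hypothetical caller: gathers float pairs through both prototypes.
inline bool castSatisfiesBothPrototypes()
{
    alignas(8) float f[16];
    for (int i = 0; i < 16; i++)
    {
        f[i] = static_cast<float>(i);
    }
    // Indices are in units of scale bytes, so with scale = sizeof(float)
    // index 2 addresses f[2], and each gather picks up f[2] and f[3].
    const std::int32_t index[8] = { 0, 2, 4, 6, 8, 10, 12, 14 };
    const int          scale    = static_cast<int>(sizeof(float));
    float              a[16];
    float              b[16];
    // gatherDoubleBase(index, f, scale, b) would not compile:
    // const float* does not convert implicitly to const double*.
    gatherVoidBase(index, reinterpret_cast<const double*>(f), scale, a);
    gatherDoubleBase(index, reinterpret_cast<const double*>(f), scale, b);
    // With these indices the gather reproduces the input unchanged.
    return std::memcmp(a, f, sizeof(f)) == 0 && std::memcmp(b, f, sizeof(f)) == 0;
}
```

The cast never dereferences the pointer as a `double`; the emulation copies raw bytes, so the pointer type only has to satisfy the prototype.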

History

#1 Updated by Erik Lindahl over 1 year ago

My hunch is that gcc-4.8.5 is too old to support AVX-512 properly. I'll look into it, which might just result in adding this instruction to the AVX512 test, so we complain about the compiler at configuration time.

#2 Updated by Gerrit Code Review Bot over 1 year ago

Gerrit received a related patchset '1' for Issue #2312.
Uploader: Erik Lindahl ()
Change-Id: gromacs~master~I31420e71064b1c5c25c8af29a1d41c7f372375c1
Gerrit URL: https://gerrit.gromacs.org/7256

#3 Updated by Erik Lindahl over 1 year ago

  • Status changed from New to Fix uploaded

Carsten: The attached patch might fix it. Let me know if it works (or better, give the patch a +2).

Unfortunately gcc-4.8.5 is simply too old, so I don't have any host where I could get it working easily.

#4 Updated by Carsten Kutzner over 1 year ago

In the meantime I also tried with gcc 5.4.0, 6.4.0, and 7.2.0, where I also get compilation issues (without your patch) in double when using AVX_512, though not exactly the same as posted above.

#5 Updated by Carsten Kutzner over 1 year ago

WITH the patch and GCC 5.4.0 I now get:

[ 4%] Building CXX object src/gromacs/CMakeFiles/libgromacs.dir/commandline/cmdlinehelpmodule.cpp.o
In file included from /tmp/git-gromacs-vanilla/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512.h:46:0,
from /tmp/git-gromacs-vanilla/src/gromacs/simd/simd.h:126,
from /tmp/git-gromacs-vanilla/src/gromacs/pbcutil/pbc-simd.h:51,
from /tmp/git-gromacs-vanilla/src/gromacs/listed-forces/pairs.cpp:58:
/tmp/git-gromacs-vanilla/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512_util_float.h: In function ‘gmx::SimdFloat gmx::loadU4NOffset(const float*, int)’:
/tmp/git-gromacs-vanilla/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512_util_float.h:504:74: error: cannot convert ‘const float*’ to ‘const double*’ for argument ‘2’ to ‘__m512d _mm512_i32gather_pd(__m256i, const double*, int)’
_mm512_castpd_ps(_mm512_i32gather_pd(gdx, f, sizeof(float)))
^
/tmp/git-gromacs-vanilla/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512_util_float.h:505:5: error: could not convert ‘{<expression error>}’ from ‘<brace-enclosed initializer list>’ to ‘gmx::SimdFloat’
};

#6 Updated by Carsten Kutzner over 1 year ago

Tried also with GCC 7.2.0, with the following result:

[ 70%] Building CXX object src/gromacs/CMakeFiles/libgromacs.dir/mdlib/nbnxn_kernels/simd_4xn/nbnxn_kernel_ElecEwTwinCut_VdwLJEwCombGeom_VF_4xn.cpp.o
In file included from /sw/cluster/gcc-7.2.0/lib/gcc/x86_64-pc-linux-gnu/7.2.0/include/immintrin.h:53:0,
from /tmp/git-gromacs-vanilla/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512_general.h:39,
from /tmp/git-gromacs-vanilla/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512.h:40,
from /tmp/git-gromacs-vanilla/src/gromacs/simd/simd.h:126,
from /tmp/git-gromacs-vanilla/src/gromacs/mdlib/nbnxn_simd.h:40,
from /tmp/git-gromacs-vanilla/src/gromacs/mdlib/nbnxn_kernels/simd_4xn/nbnxn_kernel_ElecEwTwinCut_VdwLJCombGeom_F_4xn.cpp:48:
/sw/cluster/gcc-7.2.0/lib/gcc/x86_64-pc-linux-gnu/7.2.0/include/avx512vlintrin.h: In function ‘double gmx::reduceIncr4ReturnSum(double*, gmx::SimdDouble, gmx::SimdDouble, gmx::SimdDouble, gmx::SimdDouble)’:
/sw/cluster/gcc-7.2.0/lib/gcc/x86_64-pc-linux-gnu/7.2.0/include/avx512vlintrin.h:12355:1: error: inlining failed in call to always_inline ‘__m256d _mm256_permutex_pd(__m256d, int)’: target specific option mismatch
_mm256_permutex_pd (__m256d __X, const int __M)
^~~~~~~~~~~~~~~~
In file included from /tmp/git-gromacs-vanilla/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512.h:45:0,
from /tmp/git-gromacs-vanilla/src/gromacs/simd/simd.h:126,
from /tmp/git-gromacs-vanilla/src/gromacs/mdlib/nbnxn_simd.h:40,
from /tmp/git-gromacs-vanilla/src/gromacs/mdlib/nbnxn_kernels/simd_4xn/nbnxn_kernel_ElecEwTwinCut_VdwLJCombGeom_F_4xn.cpp:48:
/tmp/git-gromacs-vanilla/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512_util_double.h:294:23: note: called from here
t3 = _mm256_add_pd(t3, _mm256_permutex_pd(t3, 0xB1));
~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

#7 Updated by Erik Lindahl over 1 year ago

Carsten Kutzner wrote:

WITH the patch and GCC 5.4.0 I now get:
/tmp/git-gromacs-vanilla/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512_util_float.h:504:74: error: cannot convert ‘const float*’ to ‘const double*’ for argument ‘2’ to ‘__m512d _mm512_i32gather_pd(__m256i, const double*, int)’
_mm512_castpd_ps(_mm512_i32gather_pd(gdx, f, sizeof(float)))

The last line here still quotes the old contents of line 504, so this does not seem to include the patch?

#8 Updated by Erik Lindahl over 1 year ago

... but _mm256_permutex_pd() is apparently AVX512VL only, so we have to replace that.
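
To see what a replacement has to reproduce, the following scalar sketch emulates the documented element selection of two permutes. It is an illustration only: the thread does not show which instruction the patch actually uses, and `permuteAcrossVector`, `permuteWithinLanes` and `swapsAgree` are hypothetical names. With the immediate `0xB1` from line 294, the full-width permute swaps neighbouring elements, which an in-lane permute in the style of plain AVX `_mm256_permute_pd` can also express with the immediate `0x5`.

```cpp
#include <array>
#include <cassert>

using Vec4d = std::array<double, 4>;

// Emulates a full-width 4-element permute in the style of _mm256_permutex_pd:
// output element i selects input element (imm >> 2*i) & 3.
inline Vec4d permuteAcrossVector(const Vec4d& a, unsigned imm)
{
    assert(imm <= 0xFFu);
    Vec4d r{};
    for (unsigned i = 0; i < 4; i++)
    {
        r[i] = a[(imm >> (2 * i)) & 3u];
    }
    return r;
}

// Emulates an in-lane permute in the style of _mm256_permute_pd:
// bit i of imm picks the upper (1) or lower (0) element of the
// 128-bit lane that holds output element i.
inline Vec4d permuteWithinLanes(const Vec4d& a, unsigned imm)
{
    assert(imm <= 0xFu);
    Vec4d r{};
    for (unsigned i = 0; i < 4; i++)
    {
        const unsigned laneBase = i & ~1u; // 0 for elements 0-1, 2 for elements 2-3
        r[i]                    = a[laneBase + ((imm >> i) & 1u)];
    }
    return r;
}

// Checks that both forms swap neighbouring elements of the same input.
inline bool swapsAgree()
{
    const Vec4d t3       = { 1.0, 2.0, 3.0, 4.0 };
    const Vec4d expected = { 2.0, 1.0, 4.0, 3.0 };
    return permuteAcrossVector(t3, 0xB1) == expected && permuteWithinLanes(t3, 0x5) == expected;
}
```

Because `0xB1` never moves data across the two 128-bit halves, an instruction that only permutes within lanes is sufficient for this particular reduction step.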

#9 Updated by Carsten Kutzner over 1 year ago

Sorry, my bad.

Right, now I get another error message (GCC 7.2.0):

[ 70%] Building CXX object src/gromacs/CMakeFiles/libgromacs.dir/mdlib/nbnxn_kernels/simd_4xn/nbnxn_kernel_ElecEwTwinCut_VdwLJEwCombGeom_VF_4xn.cpp.o
In file included from /sw/cluster/gcc-7.2.0/lib/gcc/x86_64-pc-linux-gnu/7.2.0/include/immintrin.h:53:0,
from /tmp/git-gromacs-vanilla/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512_general.h:39,
from /tmp/git-gromacs-vanilla/src/gromacs/simd/impl_x86_avx_512/impl_x86_avx_512.h:40,
from /tmp/git-gromacs-vanilla/src/gromacs/simd/simd.h:126,
from /tmp/git-gromacs-vanilla/src/gromacs/mdlib/nbnxn_simd.h:40,
from /tmp/git-gromacs-vanilla/src/gromacs/mdlib/nbnxn_kernels/simd_4xn/nbnxn_kernel_ElecEwTwinCut_VdwLJCombGeom_F_4xn.cpp:48:
/sw/cluster/gcc-7.2.0/lib/gcc/x86_64-pc-linux-gnu/7.2.0/include/avx512vlintrin.h: In function ‘double gmx::reduceIncr4ReturnSum(double*, gmx::SimdDouble, gmx::SimdDouble, gmx::SimdDouble, gmx::SimdDouble)’:
/sw/cluster/gcc-7.2.0/lib/gcc/x86_64-pc-linux-gnu/7.2.0/include/avx512vlintrin.h:12355:1: error: inlining failed in call to always_inline ‘__m256d _mm256_permutex_pd(__m256d, int)’: target specific option mismatch
_mm256_permutex_pd (__m256d __X, const int __M)

#10 Updated by Gerrit Code Review Bot over 1 year ago

Gerrit received a related patchset '1' for Issue #2312.
Uploader: Erik Lindahl ()
Change-Id: gromacs~release-2018~I31420e71064b1c5c25c8af29a1d41c7f372375c1
Gerrit URL: https://gerrit.gromacs.org/7259

#11 Updated by Erik Lindahl over 1 year ago

This change compiles fine for me with gcc-7.1 and passes the SIMD unit tests in double precision with AVX-512.

(Then there are other unit tests failing in very benign ways - we'll handle those separately)

#12 Updated by Carsten Kutzner over 1 year ago

Yes, I can confirm that. Just tested with gcc-5.4.0 and 7.2.0, both compile now and pass the SIMD unit tests. I just see some problems in the Ewald Unit Tests and the Table Unit Tests for both compilers.

#13 Updated by Mark Abraham over 1 year ago

Something in Erik's patch fails with gcc 4.8.5 on dev-purley01 (Erik, I fixed the modules) viz:

-- Detecting best SIMD instructions for this CPU
-- Performing Test C_xCORE_AVX512_qopt_zmm_usage_high_FLAG_ACCEPTED
-- Performing Test C_xCORE_AVX512_qopt_zmm_usage_high_FLAG_ACCEPTED - Failed
-- Performing Test C_xCORE_AVX512_FLAG_ACCEPTED
-- Performing Test C_xCORE_AVX512_FLAG_ACCEPTED - Failed
-- Performing Test C_mavx512f_mfma_FLAG_ACCEPTED
-- Performing Test C_mavx512f_mfma_FLAG_ACCEPTED - Failed
-- Performing Test C_mavx512f_FLAG_ACCEPTED
-- Performing Test C_mavx512f_FLAG_ACCEPTED - Failed
-- Performing Test C_arch_AVX_FLAG_ACCEPTED
-- Performing Test C_arch_AVX_FLAG_ACCEPTED - Failed
-- Performing Test C_hgnu_FLAG_ACCEPTED
-- Performing Test C_hgnu_FLAG_ACCEPTED - Failed
-- Performing Test C_COMPILE_WORKS_WITHOUT_SPECIAL_FLAGS
-- Performing Test C_COMPILE_WORKS_WITHOUT_SPECIAL_FLAGS - Failed
-- Could not find any flag to build test source (this could be due to either the compiler or binutils)
-- Performing Test CXX_xCORE_AVX512_qopt_zmm_usage_high_FLAG_ACCEPTED
-- Performing Test CXX_xCORE_AVX512_qopt_zmm_usage_high_FLAG_ACCEPTED - Failed
-- Performing Test CXX_xCORE_AVX512_FLAG_ACCEPTED
-- Performing Test CXX_xCORE_AVX512_FLAG_ACCEPTED - Failed
-- Performing Test CXX_mavx512f_mfma_FLAG_ACCEPTED
-- Performing Test CXX_mavx512f_mfma_FLAG_ACCEPTED - Failed
-- Performing Test CXX_mavx512f_FLAG_ACCEPTED
-- Performing Test CXX_mavx512f_FLAG_ACCEPTED - Failed
-- Performing Test CXX_arch_AVX_FLAG_ACCEPTED
-- Performing Test CXX_arch_AVX_FLAG_ACCEPTED - Failed
-- Performing Test CXX_hgnu_FLAG_ACCEPTED
-- Performing Test CXX_hgnu_FLAG_ACCEPTED - Failed
-- Performing Test CXX_COMPILE_WORKS_WITHOUT_SPECIAL_FLAGS
-- Performing Test CXX_COMPILE_WORKS_WITHOUT_SPECIAL_FLAGS - Failed
-- Could not find any flag to build test source (this could be due to either the compiler or binutils)
CMake Warning at cmake/gmxDetectSimd.cmake:89 (message):
  Could not run code to detect number of AVX-512 FMA units - assuming 2.
Call Stack (most recent call first):
  cmake/gmxDetectSimd.cmake:151 (gmx_suggest_simd)
  cmake/gmxManageSimd.cmake:92 (gmx_detect_simd)
  CMakeLists.txt:703 (gmx_manage_simd)

-- Detected best SIMD instructions for this CPU - AVX_512
CMake Error at cmake/gmxManageSimd.cmake:51 (message):
  Cannot find AVX 512F compiler flag.  Use a newer compiler, or choose a
  lower level of SIMD (slower).
Call Stack (most recent call first):
  cmake/gmxManageSimd.cmake:186 (gmx_give_fatal_error_when_simd_support_not_found)
  CMakeLists.txt:703 (gmx_manage_simd)

-- Configuring incomplete, errors occurred!
See also "/tmp/build-mark/CMakeFiles/CMakeOutput.log".
See also "/tmp/build-mark/CMakeFiles/CMakeError.log".

#14 Updated by Erik Lindahl over 1 year ago

Mark: That doesn't look like a failure, but CMake identifying that AVX-512 doesn't work for that compiler.

#15 Updated by Roland Schulz over 1 year ago

I get the same error as Mark for GCC 4.8.5, which is to be expected because, according to its man page, GCC 4.8.5 doesn't support AVX512. According to https://www.gnu.org/software/gcc/gcc-4.9/changes.html it was added in 4.9. Carsten, are you running a GCC with backported AVX512, or did you mean to say 4.9?

#16 Updated by Carsten Kutzner over 1 year ago

I also get the same error as Mark and Roland with GCC 4.8.5 during cmake.

It turns out I reported the wrong compiler version in my original report. Instead of 4.8.5, I have been using 5.4.0 when I stumbled upon the error. So the error message I reported in the first paragraph is for GCC 5.4.0. Just checked again with 4.8.5 and 5.4.0. Sorry Erik and Roland, for the confusion!

But apart from that https://gerrit.gromacs.org/#/c/7259/ fixes the issue for 5.4.0,
whereas 4.8.5 never gets through cmake, as it should.

#17 Updated by Mark Abraham over 1 year ago

Erik Lindahl wrote:

Mark: That doesn't look like a failure, but CMake identifying that AVX-512 doesn't work for that compiler.

Indeed. We need to be clear with ourselves (and in our code docs and perhaps install guide) about what our intent for GMX_SIMD=auto is. Forcing the user to do things at configure time to maximize runtime performance is one way to get a reputation for being hard to build. Currently we will not build by default on Ubuntu 14.04 (which ships gcc 4.8) on AVX512 build hosts. We have the following options:

  • detect best build-host SIMD support, detect compiler SIMD capability, and build the most effective SIMD level possible (remembering that the log file will note that performance may not be optimal)
  • detect best build-host SIMD support, detect compiler SIMD capability, and issue an error if the former exceeds the latter (as above), which forces the user to choose a SIMD level or a different compiler

The latter has been our historical behaviour, and I am open to us choosing not to change that during this beta phase. But I think it would be wiser for us to focus our efforts towards the former in future. I think that many users will prefer that the build always works and they can go on with other things until they decide they care about CPU performance, e.g. because the log file observes that it could be improved with better SIMD support, and the performance analysis confirms it.
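
The two options can be contrasted in a few lines. This is a sketch under stated assumptions, not how GROMACS implements it (the real logic lives in the CMake modules shown in the log above): `SimdLevel`, `AutoPolicy` and `resolveAutoSimd` are hypothetical names, and the level list is a simplified, ordered subset.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical, ordered subset of SIMD levels (higher value = more capable).
enum class SimdLevel
{
    None = 0,
    Sse41,
    Avx256,
    Avx2_256,
    Avx512
};

// The two behaviours discussed for GMX_SIMD=auto.
enum class AutoPolicy
{
    FallBack, // first option: build the best level the compiler supports
    Fail      // second option: stop and make the user choose
};

// Resolves the SIMD level to build, given what the build host offers and
// the highest level the compiler can generate code for.
inline SimdLevel resolveAutoSimd(SimdLevel host, SimdLevel compilerMax, AutoPolicy policy)
{
    if (host <= compilerMax)
    {
        return host; // compiler is not the limiting factor
    }
    if (policy == AutoPolicy::Fail)
    {
        throw std::runtime_error("compiler cannot target the host SIMD level; choose a lower level or a newer compiler");
    }
    return compilerMax; // build succeeds; the log can note suboptimal performance
}

// Example: an AVX-512 host with a compiler limited to AVX2.
inline bool fallBackPicksCompilerMax()
{
    return resolveAutoSimd(SimdLevel::Avx512, SimdLevel::Avx2_256, AutoPolicy::FallBack) == SimdLevel::Avx2_256;
}
```

Under `FallBack` the configure step always produces a build, at the cost of leaving some performance unused until the user picks a newer compiler; under `Fail` the mismatch is surfaced immediately, which is the historical behaviour described above.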

#18 Updated by Erik Lindahl over 1 year ago

  • Status changed from Fix uploaded to Resolved

#19 Updated by Erik Lindahl over 1 year ago

  • Status changed from Resolved to Closed
