Bug #3385

FindLibStdCpp.cmake - wrong sanity check for clang

Added by Anton Shterenlikht 7 months ago. Updated 7 months ago.

Status: New
Priority: Normal
Assignee: -
Category: build system
Target version: -
Affected version - extra info:
Affected version:
Difficulty: uncategorized

Description

I'm using a Clang-based host compiler (based on Clang 9).
NVCC, at least as of CUDA 10.2, does not accept Clang 9, only versions up to 8.
So I have to use GCC as the CUDA host compiler, which should be fine
via -DCUDA_HOST_COMPILER.

However, I fail with:

+ cmake .. -DCMAKE_C_COMPILER=cc -DCMAKE_C_FLAGS=-O3 -DCMAKE_CXX_COMPILER=CC -DCMAKE_CXX_FLAGS=-O3 -DCUDA_HOST_COMPILER=/global/opt/gcc/8.3.0/snos/bin/g++ -DGMX_BUILD_OWN_FFTW=ON -DGMX_HWLOC=OFF -DGMX_BUILD_SHARED_EXE=OFF -DGMX_CYCLE_SUBCOUNTERS=ON -DGMX_GPU=ON -DGMX_CUDA_TARGET_SM=70 -DGMX_OPENMP=ON -DGMX_MPI=ON -DGMX_SIMD=AVX2_256
-- The C compiler identification is Clang 9.0.0
-- The CXX compiler identification is Clang 9.0.0
-- Check for working C compiler: /opt/cray/pe/craype/2.6.5.1/bin/cc
-- Check for working C compiler: /opt/cray/pe/craype/2.6.5.1/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /opt/cray/pe/craype/2.6.5.1/bin/CC
-- Check for working CXX compiler: /opt/cray/pe/craype/2.6.5.1/bin/CC -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test USING_LIBSTDCXX
-- Performing Test USING_LIBSTDCXX - Success
CMake Error at cmake/FindLibStdCpp.cmake:128 (message):
  /global/opt/gcc/8.3.0/include/c++ doesn't exist even though it should.
  Please report to developers.
Call Stack (most recent call first):
  CMakeLists.txt:69 (find_package)

Note that GCC does not have c++ under include, only g++.

$ for num in 5.3.0 6.1.0 6.2.0 7.1.0 7.2.0 8.1.0 8.3.0 9.1.0; do ls /opt/gcc/$num/snos/include/; done
g++
g++
g++
g++
g++
g++
g++
g++
$
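
One way to see where a given g++ actually keeps its C++ headers, rather than inferring it from directory names, is to ask the compiler for its include search list. The helper below is only a sketch: the function name include_search_dirs is hypothetical, and it assumes the usual GCC verbose output, in which the list sits between the "#include <...> search starts here:" and "End of search list." markers.

```shell
# Print the directories that g++ lists between the
# "#include <...> search starts here:" and "End of search list." markers.
# Reads the verbose compiler output from stdin.
include_search_dirs() {
    sed -n '/^#include <\.\.\.> search starts here:/,/^End of search list\./p' |
        sed -e '1d' -e '$d' -e 's/^ *//'
}
```

Typical use would be `g++ -E -x c++ -v /dev/null 2>&1 | include_search_dirs`, run with the g++ in question.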

History

#1 Updated by Szilárd Páll 7 months ago

  • Description updated (diff)

#2 Updated by Szilárd Páll 7 months ago

  • Description updated (diff)

#3 Updated by Szilárd Páll 7 months ago

  • Category set to build system

Anton Shterenlikht wrote:

Note that GCC does not have c++ under include, only g++.

[...]

Note that the code in question is looking for the C++ standard library headers' location (not the c++ binary).

The check, which assumes the headers are in $(dirname $(which c++))/../include/c++, seems to be correct for a system gcc installation and its C++ headers (checked on Debian, CentOS, and a Cray XC too). Where it fails is with custom toolchain installations.
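
For illustration, the same test can be reproduced outside cmake. This is a rough shell sketch of the logic described above, not the GROMACS code itself: gxx_prefix and has_libstdcxx_headers are hypothetical names, and `readlink -f` (GNU coreutils) is assumed to be available.

```shell
# Print the installation prefix implied by a g++ path:
# resolve symlinks, then strip the binary name and the bin directory.
gxx_prefix() {
    real=$(readlink -f "$1") || return 1
    dirname "$(dirname "$real")"
}

# Succeed only if <prefix>/include/c++ exists for the given g++.
has_libstdcxx_headers() {
    prefix=$(gxx_prefix "$1") || return 1
    [ -d "$prefix/include/c++" ]
}
```

On a layout like the one reported here, where the directory under include is named g++ rather than c++, `has_libstdcxx_headers /path/to/g++` would fail in the same way as the cmake check.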

I am not sure whether the check needs to be improved or the failing test indicates that the build will / can fail later (e.g. due to clang not finding the right C++ headers).

Q:
- Can you try to comment out the check to see if the configure and build complete successfully?
- Does this happen if you use a supported clang host compiler, e.g. clang 8?

Note that it is not safe to mix different C++ compilers, so unless this issue appears with CMAKE_CXX_COMPILER == CUDA_HOST_COMPILER, we should probably mark this as a low priority issue.

#4 Updated by Anton Shterenlikht 7 months ago

ok, I double checked on XC - even without any GPU offloading,
seems it's a regression 2019.5 -> 2020.
This test wasn't in 2019.5.

2019.5:

+ cmake .. -DCMAKE_C_COMPILER=cc -DCMAKE_C_FLAGS=-O3 -DCMAKE_CXX_COMPILER=CC -DCMAKE_CXX_FLAGS=-O3 -DGMX_BUILD_OWN_FFTW=ON -DGMX_HWLOC=OFF -DGMX_BUILD_SHARED_EXE=OFF -DGMX_CYCLE_SUBCOUNTERS=ON -DGMX_OPENMP=ON -DGMX_MPI=ON -DGMX_SIMD=AVX2_256
-- The C compiler identification is Clang 9.0.0
-- The CXX compiler identification is Clang 9.0.0
-- Check for working C compiler: /opt/cray/pe/craype/2.6.5.1/bin/cc
-- Check for working C compiler: /opt/cray/pe/craype/2.6.5.1/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /opt/cray/pe/craype/2.6.5.1/bin/CC
-- Check for working CXX compiler: /opt/cray/pe/craype/2.6.5.1/bin/CC -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create 
-- Looking for pthread_create - found
-- Found Threads: TRUE  
-- Performing Test CXXFLAG_STD_CXX0X
-- Performing Test CXXFLAG_STD_CXX0X - Success
-- Performing Test CXX11_SUPPORTED
-- Performing Test CXX11_SUPPORTED - Success
-- Performing Test CXX11_STDLIB_PRESENT 
-- Performing Test CXX11_STDLIB_PRESENT - Success
-- Looking for NVIDIA GPUs present in the system
-- Number of NVIDIA GPUs detected: 1 

while 2020 fails here:

+ cmake .. -DCMAKE_C_COMPILER=cc -DCMAKE_C_FLAGS=-O3 -DCMAKE_CXX_COMPILER=CC -DCMAKE_CXX_FLAGS=-O3 -DGMX_BUILD_OWN_FFTW=ON -DGMX_HWLOC=OFF -DGMX_BUILD_SHARED_EXE=OFF -DGMX_CYCLE_SUBCOUNTERS=ON -DGMX_OPENMP=ON -DGMX_MPI=ON -DGMX_SIMD=AVX2_256
-- The C compiler identification is Clang 9.0.0
-- The CXX compiler identification is Clang 9.0.0
-- Check for working C compiler: /opt/cray/pe/craype/2.6.5.1/bin/cc
-- Check for working C compiler: /opt/cray/pe/craype/2.6.5.1/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /opt/cray/pe/craype/2.6.5.1/bin/CC
-- Check for working CXX compiler: /opt/cray/pe/craype/2.6.5.1/bin/CC -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test USING_LIBSTDCXX
-- Performing Test USING_LIBSTDCXX - Success
CMake Error at cmake/FindLibStdCpp.cmake:128 (message):
  /opt/gcc/8.3.0/include/c++ doesn't exist even though it should.  Please

cce/9.1.2 is used in both cases

$ CC --version
Cray clang version 9.1.2 (4476ef5e094fa4af6221b63f08ebb96a84090d21) (based on LLVM 9.0.0)

#5 Updated by Szilárd Páll 7 months ago

  • Subject changed from FindLibStdCpp.cmake - wrong logic for Clang host compiler + GCC NVCC compiler to FindLibStdCpp.cmake - wrong sanity check for clang

Anton Shterenlikht wrote:

ok, I double checked on XC - even without any GPU offloading,
seems it's a regression 2019.5 -> 2020.
This test wasn't in 2019.5.

Sure, this is not related to GPU offload, but to the use of compilers that might require a libstdc++ (clang/Intel) -- which we now manage explicitly internally (see the docs here: source:cmake/FindLibStdCpp.cmake#L35). The code is new in the 2020 release, so the regression is expected.

- Can you try to comment out the check to see if the configure and build complete successfully?

Can you please try that?

#6 Updated by Anton Shterenlikht 7 months ago

I disabled the whole check

--- FindLibStdCpp.cmake.orig    2020-02-19 09:49:27.000000000 -0600
+++ FindLibStdCpp.cmake    2020-02-19 09:49:00.000000000 -0600
@@ -44,6 +44,7 @@
 # for builds at different times using the same cache file (so that e.g. module loading is
 # not required for a reproducible build).

+return()
 if (NOT CMAKE_CXX_COMPILER_ID MATCHES "Clang" AND NOT CMAKE_CXX_COMPILER_ID MATCHES "Intel") # Compilers supported
     return()
 endif()

Is that ok?

cmake now completes fine, but the build fails here:

[ 51%] Building CXX object src/gromacs/CMakeFiles/libgromacs.dir/nbnxm/prunekerneldispatch.cpp.o
[ 51%] Building CXX object src/gromacs/CMakeFiles/libgromacs.dir/nbnxm/benchmark/bench_setup.cpp.o
In file included from /lus/scratch/ashterenli/BENCH9237/gromacs/cce/gromacs/src/gromacs/nbnxm/benchmark/bench_setup.cpp:48:
In file included from /lus/scratch/ashterenli/BENCH9237/gromacs/cce/gromacs/src/gromacs/compat/optional.h:55:
/lus/scratch/ashterenli/BENCH9237/gromacs/cce/gromacs/src/external/nonstd/optional.hpp:1664:22: error: no member named 'bad_optional_access' in namespace 'nonstd::optional_lite'
using optional_lite::bad_optional_access;
      ~~~~~~~~~~~~~~~^
In file included from /lus/scratch/ashterenli/BENCH9237/gromacs/cce/gromacs/src/gromacs/nbnxm/benchmark/bench_setup.cpp:48:
/lus/scratch/ashterenli/BENCH9237/gromacs/cce/gromacs/src/gromacs/compat/optional.h:62:15: error: no member named 'bad_optional_access' in namespace 'nonstd'
using nonstd::bad_optional_access;
      ~~~~~~~~^
2 errors generated.
make[2]: *** [src/gromacs/CMakeFiles/libgromacs.dir/build.make:5926: src/gromacs/CMakeFiles/libgromacs.dir/nbnxm/benchmark/bench_setup.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:2976: src/gromacs/CMakeFiles/libgromacs.dir/all] Error 2
make: *** [Makefile:163: all] Error 2

Not sure if this is related to the missed cmake test,
or is a separate issue.

#7 Updated by Szilárd Páll 7 months ago

Anton Shterenlikht wrote:

I disabled the whole check

[...]

Is that ok?

That disables the entire detection, which could be the reason why your build fails. As a result, clang will by default detect the C++ standard library based on the g++ it finds in the PATH -- assuming you are using a vanilla clang (or something that has the same behaviour).

I instead recommend commenting out lines 123-135 of FindLibStdCpp.cmake. Also make sure that a modern g++ is detected, or pass one to cmake via GMX_GPLUSPLUS_PATH (you can verify that the right --gcc-toolchain=PATH appears in the build commands).
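
A quick way to do that verification is to filter the verbose build output for the flag. Clang's option for selecting a GCC installation is --gcc-toolchain; the helper name toolchain_flags below is hypothetical, and the sketch assumes the compile commands are echoed on stdin, one per line.

```shell
# Print every distinct --gcc-toolchain value seen on stdin, one per line.
# No output means the flag never appeared in the build commands.
toolchain_flags() {
    grep -o -e '--gcc-toolchain=[^ ]*' | sort -u
}
```

With the Makefile generator this could be used as `make VERBOSE=1 2>&1 | toolchain_flags`; a single line naming the intended GCC prefix is the expected result.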

#8 Updated by Anton Shterenlikht 7 months ago

Commenting out just lines 123-135 leads to a different cmake failure.

--- FindLibStdCpp.cmake.orig    2020-01-17 08:01:39.000000000 -0600
+++ FindLibStdCpp.cmake    2020-02-21 10:47:39.000000000 -0600
@@ -120,19 +120,19 @@
     endif()

     # Now make some sanity checks on the compiler using libstdc++.
-    if (CMAKE_CXX_COMPILER_ID MATCHES "Clang")
-        get_filename_component(GMX_GPLUSPLUS_PATH "${GMX_GPLUSPLUS_PATH}" REALPATH)
-        get_filename_component(GMX_GPLUSPLUS_PATH "${GMX_GPLUSPLUS_PATH}" DIRECTORY) #strip g++
-        get_filename_component(GMX_GPLUSPLUS_PATH "${GMX_GPLUSPLUS_PATH}" DIRECTORY) #strip bin
-        if (NOT EXISTS "${GMX_GPLUSPLUS_PATH}/include/c++")
-            message(FATAL_ERROR "${GMX_GPLUSPLUS_PATH}/include/c++ doesn't exist even though it should. " 
-                "Please report to developers.")
-        endif()
-    else() #Intel
-        if (${GMX_GPLUSPLUS_VERSION} VERSION_GREATER_EQUAL 7 AND CMAKE_CXX_COMPILER_VERSION VERSION_LESS 19)
-            message(FATAL_ERROR "ICC versions below 19 don't support GCC versions above 6.")
-        endif ()
-    endif()
+#    if (CMAKE_CXX_COMPILER_ID MATCHES "Clang")
+#        get_filename_component(GMX_GPLUSPLUS_PATH "${GMX_GPLUSPLUS_PATH}" REALPATH)
+#        get_filename_component(GMX_GPLUSPLUS_PATH "${GMX_GPLUSPLUS_PATH}" DIRECTORY) #strip g++
+#        get_filename_component(GMX_GPLUSPLUS_PATH "${GMX_GPLUSPLUS_PATH}" DIRECTORY) #strip bin
+#        if (NOT EXISTS "${GMX_GPLUSPLUS_PATH}/include/c++")
+#            message(FATAL_ERROR "${GMX_GPLUSPLUS_PATH}/include/c++ doesn't exist even though it should. " 
+#                "Please report to developers.")
+#        endif()
+#    else() #Intel
+#        if (${GMX_GPLUSPLUS_VERSION} VERSION_GREATER_EQUAL 7 AND CMAKE_CXX_COMPILER_VERSION VERSION_LESS 19)
+#            message(FATAL_ERROR "ICC versions below 19 don't support GCC versions above 6.")
+#        endif ()
+#    endif()

     # Set up to use the libstdc++ from that g++. Note that we checked
     # the existing contents of CMAKE_CXX_FLAGS* variables earlier, so
+ cmake .. -DCMAKE_CXX_FLAGS=--stdlib=libc++ -DCMAKE_C_COMPILER=cc '-DCMAKE_C_FLAGS=-O3 -Ofast -Rpass=.* -fsave-loopmark' -DCMAKE_CXX_COMPILER=CC '-DCMAKE_CXX_FLAGS=-O3 -Ofast -Rpass=.* -fsave-loopmark' -DGMX_HWLOC=OFF -DGMX_BUILD_SHARED_EXE=OFF -DGMX_CYCLE_SUBCOUNTERS=ON -DGMX_BUILD_OWN_FFTW=ON -DGMX_GPU=OFF -DGMX_OPENMP=ON -DGMX_MPI=ON -DGMX_SIMD=AVX_256 
Re-run cmake no build system arguments
-- The C compiler identification is Clang 9.0.0
-- The CXX compiler identification is Clang 9.0.0
-- Check for working C compiler: /opt/cray/pe/craype/2.6.5.1/bin/cc
-- Check for working C compiler: /opt/cray/pe/craype/2.6.5.1/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /opt/cray/pe/craype/2.6.5.1/bin/CC
-- Check for working CXX compiler: /opt/cray/pe/craype/2.6.5.1/bin/CC -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test USING_LIBSTDCXX
-- Performing Test USING_LIBSTDCXX - Success
-- Performing Test CXX14_COMPILES
-- Performing Test CXX14_COMPILES - Failed
CMake Error at cmake/FindLibStdCpp.cmake:162 (message):
  GROMACS requires C++14, but a test of such functionality in the C++
  standard library failed to compile.  The g++ found at
  /opt/gcc/8.1.0/bin/g++ had a suitable version, so ;something else must be
  the problem
Call Stack (most recent call first):
  CMakeLists.txt:69 (find_package)

#9 Updated by Szilárd Páll 7 months ago

+ cmake .. -DCMAKE_CXX_FLAGS=--stdlib=libc++ -DCMAKE_C_COMPILER=cc '-DCMAKE_C_FLAGS=-O3 -Ofast -Rpass=.* -fsave-loopmark' -DCMAKE_CXX_COMPILER=CC '-DCMAKE_CXX_FLAGS=-O3 -Ofast -Rpass=.* -fsave-loopmark' -DGMX_HWLOC=OFF -DGMX_BUILD_SHARED_EXE=OFF -DGMX_CYCLE_SUBCOUNTERS=ON -DGMX_BUILD_OWN_FFTW=ON -DGMX_GPU=OFF -DGMX_OPENMP=ON -DGMX_MPI=ON -DGMX_SIMD=AVX_256

You are passing a zoo of flags there; have you tested with just the default flags?

Also, note that the -DCMAKE_CXX_FLAGS=--stdlib=libc++ in your command line has no effect, because you set the same variable again later (also, -Ofast overrides -O3).
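
Repeated definitions like this are easy to miss in a long command line. As a small sketch (the function name duplicate_defines is hypothetical), the following lists every -D variable that is given more than once; with cmake, the last definition is the one that takes effect.

```shell
# Print the name of every -D variable defined more than once in the arguments.
duplicate_defines() {
    for arg in "$@"; do
        case $arg in
            -D*=*)
                name=${arg#-D}       # drop the leading -D
                name=${name%%=*}     # drop the value
                name=${name%%:*}     # drop an optional :TYPE suffix
                printf '%s\n' "$name"
                ;;
        esac
    done | sort | uniq -d
}
```

Called with the same arguments as the cmake command above, it would print CMAKE_CXX_FLAGS.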

#10 Updated by Anton Shterenlikht 7 months ago

I know about Ofast/O3.

--stdlib=libc++ does not help anyway:

/opt/cray/pe/cce/9.1.3/binutils/x86_64/x86_64-pc-linux-gnu/bin/ld.gold: error: cannot find -lc++

The other flags are all fairly standard - nothing special.

So here's the latest - I now tried 3 different systems,
and get the same result on all 3.

1. If I build with -DGMX_GPU=OFF, i.e. don't load any cuda
modules, I can build and link gmx_mpi successfully
with the first patch - bypassing all FindLibStdCpp.cmake checks.
But I haven't tried running that binary.

2. If I use the second patch - just commenting out lines 123-135
in FindLibStdCpp.cmake - I cannot complete cmake even with -DGMX_GPU=OFF.

-- Performing Test CXX14_COMPILES
-- Performing Test CXX14_COMPILES - Failed
CMake Error at cmake/FindLibStdCpp.cmake:162 (message):
  GROMACS requires C++14, but a test of such functionality in the C++
  standard library failed to compile.  The g++ found at
  /opt/gcc/8.1.0/bin/g++ had a suitable version, so ;something else must be
  the problem

3. If I build with -DGMX_GPU=ON, load the relevant cuda modules,
and use e.g.

-DCUDA_HOST_COMPILER=/global/opt/gcc/7.2.0/snos/bin/g++

I complete cmake, but the build fails at

/lus/scratch/ashterenli/BENCH9237/gromacs/cce/gromacs/src/external/nonstd/optional.hpp:1664:22: error: no member named 'bad_optional_access' in namespace 'nonstd::optional_lite'
using optional_lite::bad_optional_access;
      ~~~~~~~~~~~~~~~^

#11 Updated by Szilárd Páll 7 months ago

Anton Shterenlikht wrote:

--stdlib=libc++ does not help anyway:

[...]

The use case where you manually pass the C++ library in the CXX flags won't be catered for by the GROMACS build system, so that is not unexpected. You'll have to set up the library paths so libc++ is found during the link stage.

So here's the latest - I now tried 3 different systems,
and get the same result on all 3.

1. If I build with -DGMX_GPU=OFF, i.e. don't load any cuda
modules, I can build and link gmx_mpi successfully
with the first patch - bypassing all FindLibStdCpp.cmake checks.
But I haven't tried running that binary.

Works for me without modifying anything in FindLibStdCpp.cmake or elsewhere; for reference, here's my cmake output and my CMakeCache.txt

2. If I use the second patch - just commenting out lines 123-135
in FindLibStdCpp.cmake - I cannot complete cmake even with -DGMX_GPU=OFF.

[...]

That suggests that something is very wrong with how your toolchain gets set up in cmake, as the simple C++14 test code does not compile. You can check CMakeFiles/CMakeError.log in your build directory to see what is going wrong.
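
CMake releases of that era record the output of failed checks in CMakeFiles/CMakeError.log under the build directory. A minimal sketch for pulling out the entry of one check follows; show_failed_check is a hypothetical helper, and the default of 20 context lines is an arbitrary choice.

```shell
# Print the lines of a log that mention a given check, plus some
# following context (third argument, default 20 lines).
show_failed_check() {
    log=$1
    check=$2
    grep -A "${3:-20}" -e "$check" "$log"
}
```

For the failure above this would be `show_failed_check CMakeFiles/CMakeError.log CXX14_COMPILES`, run from the build directory; the compiler error that follows the match is the part worth reading.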

3. If I build with -DGMX_GPU=ON, load the relevant cuda modules,
and use e.g.
[...]

I complete cmake, but the build fails at

[...]

I'm not sure of the exact reason for that, but mixing C++ compilers/std libraries can and often will have such side effects, which is why it is neither supported nor encouraged. That said, it is a useful way of testing when a new CPU compiler is out that is not yet supported by CUDA, but as soon as you set CUDA_HOST_COMPILER != CMAKE_CXX_COMPILER all bets are off.

Here are a few options to consider. I suggest using a clang-8-based cce, as that is supported by CUDA 10.2. Note that clang (including cce) has been slower than gcc in (nearly) all the cases that I've encountered, so you could just stick to gcc; it will most probably be faster on x86. Last, if you really need clang 9 + CUDA, you could ask NVIDIA for a CUDA that works with clang-9 ;) -- a long shot, but I imagine their next CUDA release should support that.

#12 Updated by Erik Lindahl 7 months ago

Anton Shterenlikht wrote:

The other flags are all fairly standard - nothing special.

At least the -DGMX_SIMD=AVX_256 flag is outright bad to use if you are compiling GROMACS-2019.4 or later and still use the same Cray system. We explicitly detect and support the AVX2_256 capabilities of modern EPYC/Rome chips, but your flag disables that.

Even though you might think all your flags aren't special, GROMACS frequently does a pretty darned good job at detecting the hardware and producing fast code if you just leave it to its own devices with a vanilla gcc compiler.

In particular, if you are trying to get higher performance, I would strongly recommend that you start with that as a baseline, and then check which flags actually make it faster (and in particular: don't make it slower!) rather than assuming that's the case ;-)
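
Before overriding GMX_SIMD by hand, it may help to confirm what the CPU itself reports. The sketch below assumes a Linux x86 machine whose /proc/cpuinfo has a "flags" line; cpu_has_flag is a hypothetical helper and is no substitute for GROMACS's own detection.

```shell
# Succeed if the named CPU feature flag (e.g. avx2) appears in the first
# "flags" line of /proc/cpuinfo-style text read from stdin.
cpu_has_flag() {
    grep -m1 '^flags' | tr -s ' \t' '\n\n' | grep -qx -- "$1"
}
```

For example, `cpu_has_flag avx2 < /proc/cpuinfo && echo "AVX2 available"` on the compute node; if it prints the message, forcing -DGMX_SIMD=AVX_256 gives up AVX2 support.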

#13 Updated by Szilárd Páll 7 months ago

Erik Lindahl wrote:

Anton Shterenlikht wrote:

The other flags are all fairly standard - nothing special.

At least the -DGMX_SIMD=AVX_256 flag is outright bad to use if you are compiling GROMACS-2019.4 or later and still use the same Cray system. We explicitly detect and support the AVX2_256 capabilities of modern EPYC/Rome chips, but your flag disables that.

Good catch; unless the build is for a rather old CPU architecture without AVX2 support, -DGMX_SIMD=AVX_256 can't be right.
