Bug #2561

Incorrectly getting "Enabling single compilation unit for the CUDA non-bonded module." when using CUDA 9.0 and GROMACS 2018.1

Added by Åke Sandgren 23 days ago. Updated about 5 hours ago.

Status:
Feedback wanted
Priority:
Normal
Assignee:
-
Category:
build system
Target version:
2018.3
Affected version - extra info:
Affected version:
Difficulty:
uncategorized

Description

The CMake code in cmake/gmxManageGPU.cmake incorrectly ends up in the
message(STATUS "Enabling single compilation unit for the CUDA non-bonded module....")
branch of the if-statement when building with CUDA 9.0.
It does this despite correctly identifying CUDA 9.0 and using only >= 3.0 CUDA targets.

Since the code in cmake/gmxManageNvccConfig.cmake correctly sets GMX_CUDA_NVCC_FLAGS regardless of whether it does autodetection or uses user-specified SM/COMPUTE targets, the code in the macro(gmx_gpu_setup) in cmake/gmxManageGPU.cmake can be simplified to:

if (GMX_GPU AND NOT GMX_CLANG_CUDA)
  gmx_check_if_changed(_gmx_cuda_target_changed GMX_CUDA_NVCC_FLAGS)
  if(_gmx_cuda_target_changed OR NOT GMX_GPU_DETECTION_DONE)
    # CC 2.x targets only work with a single compilation unit for the non-bonded module
    if(GMX_CUDA_NVCC_FLAGS MATCHES "_2[01]")
      message(STATUS "Enabling single compilation unit for the CUDA non-bonded module. ...")
    else()
      message(STATUS "Enabling multiple compilation units for the CUDA non-bonded module.")
    endif()
  endif()
endif()

(This is the problem I incorrectly targeted in issue #2560.)


Related issues

Related to GROMACS - Bug #2390: GROMACS build system should check for valid nvcc flags before use (New)

History

#1 Updated by Szilárd Páll 2 days ago

I can't reproduce the issue. Can you please provide the full command-line invocation?

#2 Updated by Åke Sandgren 2 days ago

cmd line used:
cmake ../gromacs-2018.2 -DCMAKE_INSTALL_PREFIX=/scratch/ake/gromacs/inst -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_BUILD_TYPE=Release -DGMX_EXTERNAL_BLAS=ON -DGMX_EXTERNAL_LAPACK=ON -DGMX_X11=OFF -DGMX_OPENMP=ON -DGMX_MPI=OFF -DGMX_GPU=ON -DCUDA_TOOLKIT_ROOT_DIR=/hpc2n/eb/software/Compiler/GCC/7.3.0-2.30/CUDA/9.2.88 -DGMX_BLAS_USER="/hpc2n/eb/software/Compiler/GCC/7.3.0-2.30/OpenBLAS/0.3.1/lib/libopenblas.a" -DGMX_LAPACK_USER="/hpc2n/eb/software/Compiler/GCC/7.3.0-2.30/OpenBLAS/0.3.1/lib/libopenblas.a" > cmake.out 2>&1

grep 'Enabling single compilation' cmake.out
-- Enabling single compilation unit for the CUDA non-bonded module. Multiple compilation units are not compatible with CC 2.x devices, to enable the feature specify only CC >=3.0 target architectures in GMX_CUDA_TARGET_SM/GMX_CUDA_TARGET_COMPUTE.

#3 Updated by Szilárd Páll 1 day ago

Åke Sandgren wrote:

cmd line used:
cmake ../gromacs-2018.2 -DCMAKE_INSTALL_PREFIX=/scratch/ake/gromacs/inst -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_BUILD_TYPE=Release -DGMX_EXTERNAL_BLAS=ON -DGMX_EXTERNAL_LAPACK=ON -DGMX_X11=OFF -DGMX_OPENMP=ON -DGMX_MPI=OFF -DGMX_GPU=ON -DCUDA_TOOLKIT_ROOT_DIR=/hpc2n/eb/software/Compiler/GCC/7.3.0-2.30/CUDA/9.2.88 -DGMX_BLAS_USER="/hpc2n/eb/software/Compiler/GCC/7.3.0-2.30/OpenBLAS/0.3.1/lib/libopenblas.a" -DGMX_LAPACK_USER="/hpc2n/eb/software/Compiler/GCC/7.3.0-2.30/OpenBLAS/0.3.1/lib/libopenblas.a" > cmake.out 2>&1

This does not contain either GMX_CUDA_TARGET_SM or GMX_CUDA_TARGET_COMPUTE passed to cmake to restrict the targeted architectures.

#4 Updated by Szilárd Páll 1 day ago

  • Status changed from New to Feedback wanted

#5 Updated by Åke Sandgren 1 day ago

Exactly, it doesn't contain them, so the code should take the default of using what the CUDA version supports.
There is code in there to figure this out based on the CUDA version, but it isn't used; the result is just ignored.

With the code shown in the original post, everything works in both cases, with and without GMX_CUDA_TARGET_SM or GMX_CUDA_TARGET_COMPUTE.

#6 Updated by Mark Abraham about 16 hours ago

  • Description updated (diff)

#7 Updated by Mark Abraham about 14 hours ago

  • Target version set to 2018.3

With

export CC=gcc-6
export CXX=g++-6

cmake .. -DCMAKE_INSTALL_PREFIX=/opt/gromacs/$version -G Ninja

I see

...
-- Enabling single compilation unit for the CUDA non-bonded module. Multiple compilation units are not compatible with CC 2.x devices, to enable the feature specify only CC >=3.0 target architectures in GMX_CUDA_TARGET_SM/GMX_CUDA_TARGET_COMPUTE.
...

which is the expected result (supporting CC 2.x devices with CUDA < 9 requires a single compilation unit in 2018; CC 2.x is no longer supported at all in master), even if it is perhaps a little tricky for users to work out whether they should act on it. (If they know about compute capabilities, it's not clear that by default GROMACS builds for all of them, and if they don't know about compute capabilities then nothing makes sense.)

cmake .. -DCMAKE_INSTALL_PREFIX=/opt/gromacs/$version -G Ninja -DGMX_CUDA_TARGET_SM=30 -DGMX_CUDA_TARGET_COMPUTE=30

enables multiple compilation units, as expected since that is now possible.

With the reform proposed in #2390 we can avoid some aspects of such problems in the future, but for now Åke's proposal looks good to me. Szilárd, can you prepare a patch, please?

#8 Updated by Mark Abraham about 14 hours ago

  • Related to Bug #2390: GROMACS build system should check for valid nvcc flags before use added

#9 Updated by Szilárd Páll about 11 hours ago

Åke Sandgren wrote:

Exactly, it doesn't contain them, thus the code should take the default of using what the CUDA supports.

Sorry, forgot that CUDA 9.0 skips generating CC 2.0 code.

IIRC the reason the current code does not just test CUDA_NVCC_FLAGS is that the compiler can generate code for CC 2.0 even when the command line does not contain "_2[01]". For the same reason, the proposed change could also fail to trigger the single compilation unit for some CUDA versions < 9.0 with a CUDA_NVCC_FLAGS string that implicitly triggers CC 2.0 code generation. I'd rather make the code explicit and add a CUDA version check to the conditional than take that risk (or trawl through the nvcc docs).
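
For illustration, a minimal sketch of such an explicit check (CUDA_VERSION is the variable set by CMake's FindCUDA module; the exact condition and message text here are assumptions, not the actual patch) could look like:

# Sketch only: make the CC 2.x case explicit rather than inferring it from
# the flag string alone. CUDA_VERSION comes from CMake's FindCUDA module.
if(GMX_CUDA_NVCC_FLAGS MATCHES "_2[01]" OR
   (CUDA_VERSION VERSION_LESS "9.0" AND NOT GMX_CUDA_TARGET_SM AND NOT GMX_CUDA_TARGET_COMPUTE))
  # CC 2.x code may be generated (explicitly, or implicitly by the defaults),
  # so the non-bonded module must be built as a single compilation unit.
  message(STATUS "Enabling single compilation unit for the CUDA non-bonded module.")
else()
  message(STATUS "Enabling multiple compilation units for the CUDA non-bonded module.")
endif()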

#10 Updated by Mark Abraham about 5 hours ago

If we test for the features supported by nvcc, then we will have a good way to tell whether it is at all capable of, e.g., generating code for CC 2.0 (which Szilárd identifies as a risk). If the user chooses a particular CC, then we check in cmake that it will work. If the user chooses multiple compilation units but CC 2.0 is supported, then cmake should fail (inconsistent input). If the user chooses nothing and CC 2.0 is supported, then we default to a single compilation unit. If the user chooses nothing and CC 2.0 is not supported, then we default to multiple compilation units.

I don't see any extra value delivered by CUDA version checks. Different operating systems first support a newer CC in different CUDA versions, so such checks are brittle, and they can't help the build work with unknown future CUDA versions. This is the same thinking behind taking advantage of compiler and glibc feature-reporting machinery, and testing functionality at configure time, rather than hard-coding specific version checks.
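
For illustration, a configure-time probe along those lines (a hypothetical sketch, not GROMACS code; CUDA_NVCC_EXECUTABLE is the variable FindCUDA provides, while the temporary file and the _nvcc_supports_cc20 result variable are made up here) could look like:

# Hypothetical sketch: learn at configure time whether this nvcc can still
# target CC 2.0 by trying to compile a trivial kernel for sm_20.
set(_cc20_test_src ${CMAKE_BINARY_DIR}/nvcc_cc20_test.cu)
file(WRITE ${_cc20_test_src} "__global__ void k() {}\nint main() { return 0; }\n")
execute_process(
  COMMAND ${CUDA_NVCC_EXECUTABLE} -arch=sm_20 -c ${_cc20_test_src}
          -o ${CMAKE_BINARY_DIR}/nvcc_cc20_test.o
  RESULT_VARIABLE _cc20_result
  OUTPUT_QUIET ERROR_QUIET)
if(_cc20_result EQUAL 0)
  set(_nvcc_supports_cc20 TRUE)   # CC 2.x still possible: default to a single compilation unit
else()
  set(_nvcc_supports_cc20 FALSE)  # CC 2.x not supported (CUDA >= 9): default to multiple units
endif()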
