Bug #2810

running on Fermi throws cryptic error

Added by Szilárd Páll 9 months ago. Updated 9 months ago.

Status: Closed
Priority: Low
Assignee: -
Category: mdrun
Target version:
Affected version - extra info:
Affected version:
Difficulty: uncategorized

Description

The consistency/support checks were not updated when Fermi support was removed, so no remaining check can detect that a device architecture is unsupported. As a result, mdrun neither falls back to the CPU nor emits a message that users can understand.

$ gmx mdrun -quiet -nb gpu -gpu_id 0

-------------------------------------------------------
Program:     gmx mdrun, version 2019-rc1-dev-20181217-eeda455
Source file: src/gromacs/gpu_utils/cudautils.cuh (line 347)
Function:    void launchGpuKernel(void (*)(Args ...), const KernelLaunchConfig&, CommandEvent*, const char*, const std::array<void*, sizeof... (Args)>&) [with Args = {}; CommandEvent = void]

Internal error (bug):
GPU kernel (Dummy kernel) failed to launch: invalid device function

For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------

Related issues

Related to GROMACS - Task #2665: remove fermi support (Closed)
Related to GROMACS - Bug #2811: CUDA binary target support check can't work (Closed)

Associated revisions

Revision ba3ab170 (diff)
Added by Szilárd Páll 9 months ago

Correct CUDA compatibility check

The CUDA compatibility check became ineffective after the deprecation of
Fermi as it could not detect and flag these GPUs correctly as
"incompatible" but instead was throwing a kernel execution error when
trying to launch the sanity checker kernel.

This change makes the minimal corrections necessary for the now
deprecated Fermi arch to be detected correctly. As expected, CPU
fallback can now be selected automatically (unless the user requests a
GPU).

Minor refactoring was also necessary, but was kept to a minimum.

Fixes #2810

Change-Id: I81bf12cca43a2a5e16d48d9faf4b9fc9627e4452
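
The behaviour described in this commit message (flag the device as incompatible, fall back to the CPU unless a GPU was explicitly requested) can be illustrated with a small standalone sketch. This is not the code from the revision: DeviceInfo, DeviceStatus, Backend, checkDeviceStatus() and chooseBackend() are hypothetical names, and the minimum compute capability of 3.0 is an assumption based on Fermi being the 2.x generation.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the properties a CUDA runtime would report.
struct DeviceInfo
{
    std::string name;
    int         major; // compute capability, major part
    int         minor; // compute capability, minor part
};

enum class DeviceStatus { Compatible, Incompatible };
enum class Backend { Cpu, Gpu };

// Assumption: anything older than compute capability 3.0 (e.g. Fermi, 2.x) is unsupported.
constexpr int c_minSupportedMajor = 3;

// Classifies a device purely from its reported architecture, without launching a kernel.
DeviceStatus checkDeviceStatus(const DeviceInfo& device)
{
    assert(device.major >= 1 && device.minor >= 0); // malformed input is a programming error
    return device.major >= c_minSupportedMajor ? DeviceStatus::Compatible
                                               : DeviceStatus::Incompatible;
}

// Uses the GPU when it is compatible. Otherwise falls back to the CPU,
// unless the user explicitly requested a GPU, in which case a readable error is raised.
Backend chooseBackend(const DeviceInfo& device, bool userRequestedGpu)
{
    if (checkDeviceStatus(device) == DeviceStatus::Compatible)
    {
        return Backend::Gpu;
    }
    if (userRequestedGpu)
    {
        throw std::runtime_error("GPU \"" + device.name + "\" (compute capability "
                                 + std::to_string(device.major) + "."
                                 + std::to_string(device.minor)
                                 + ") is not supported; minimum required is "
                                 + std::to_string(c_minSupportedMajor) + ".0");
    }
    return Backend::Cpu;
}
```

With this shape, a request such as -nb gpu on an unsupported card would end in a message naming the device and the required architecture rather than in a failed kernel launch.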

History

#1 Updated by Szilárd Páll 9 months ago

  • Related to Task #2665: remove fermi support added

#2 Updated by Szilárd Páll 9 months ago

  • Description updated (diff)
  • Priority changed from Normal to Low

#3 Updated by Szilárd Páll 9 months ago

PS: the original check was also misplaced. It should have been called in is_gmx_supported_gpu_id() rather than at initialization; otherwise it cannot catch the error that already occurs when the dummy kernel is first executed.
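
The ordering point above can be sketched as follows: test the architecture first, and launch the sanity kernel only on devices that pass. This is a minimal illustration under assumptions, not the body of is_gmx_supported_gpu_id(); SupportStatus and querySupport() are hypothetical names, the kernel launch is modelled as a callback, and the 3.0 threshold is assumed.

```cpp
#include <cassert>
#include <functional>

enum class SupportStatus { Compatible, Incompatible, NonFunctional };

// Hypothetical per-device support query. The architecture test runs first, so the
// sanity kernel is only launched on devices that could possibly execute it.
SupportStatus querySupport(int computeCapabilityMajor,
                           const std::function<bool()>& runSanityKernel)
{
    assert(runSanityKernel != nullptr); // caller must supply a kernel launcher

    constexpr int c_minSupportedMajor = 3; // assumption: Fermi (2.x) and older are unsupported
    if (computeCapabilityMajor < c_minSupportedMajor)
    {
        return SupportStatus::Incompatible; // returns before any kernel launch
    }
    // Only now is it safe to try the dummy kernel; a failure here means a
    // supported architecture that is not working, which is a different condition.
    return runSanityKernel() ? SupportStatus::Compatible : SupportStatus::NonFunctional;
}
```

Keeping "incompatible" and "non-functional" as separate outcomes lets the caller report an unsupported architecture differently from a supported device that fails its sanity test.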

#4 Updated by Szilárd Páll 9 months ago

  • Related to Bug #2811: CUDA binary target support check can't work added

#5 Updated by Gerrit Code Review Bot 9 months ago

Gerrit received a related patchset '1' for Issue #2810.
Uploader: Szilárd Páll
Change-Id: gromacs~release-2019~I81bf12cca43a2a5e16d48d9faf4b9fc9627e4452
Gerrit URL: https://gerrit.gromacs.org/8844

#6 Updated by Szilárd Páll 9 months ago

  • Status changed from New to Fix uploaded

#7 Updated by Szilárd Páll 9 months ago

  • Status changed from Fix uploaded to Resolved

#8 Updated by Paul Bauer 9 months ago

  • Status changed from Resolved to Closed
