Bug #2061

fix FindNVML for CUDA 8.0

Added by Szilárd Páll about 3 years ago. Updated over 2 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
build system
Target version:
Affected version - extra info:
all GROMACS versions, but only with CUDA 8+
Affected version:
Difficulty:
simple

Description

CUDA 8.0 ships the necessary components for NVML support with the toolkit:
  • library stub at $CUDA_HOME/lib64/stubs/libnvidia-ml.so
  • header at $CUDA_HOME/include/nvml.h

Starting with this release, detecting the GDK is therefore no longer necessary, which greatly simplifies things and makes NVML more widely available. We should consider extending our FindNVML for a future r2016.x release.
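A minimal sketch of what such a lookup could look like for CUDA 8 and later is shown here. It is not the code from the actual GROMACS FindNVML module: the NVML_INCLUDE_DIR and NVML_LIBRARY variable names are illustrative assumptions, and it assumes find_package(CUDA) has already run, so that CUDA_VERSION and CUDA_TOOLKIT_ROOT_DIR are set.

```cmake
# Hypothetical sketch, not the GROMACS implementation.
# Assumes find_package(CUDA) has set CUDA_VERSION and CUDA_TOOLKIT_ROOT_DIR.
if(NOT CUDA_VERSION VERSION_LESS "8.0")
    # From CUDA 8 the NVML header ships in the toolkit's include directory.
    find_path(NVML_INCLUDE_DIR
        NAMES nvml.h
        HINTS "${CUDA_TOOLKIT_ROOT_DIR}"
        PATH_SUFFIXES include)

    # The toolkit only provides a link-time stub of libnvidia-ml.so.
    find_library(NVML_LIBRARY
        NAMES nvidia-ml
        HINTS "${CUDA_TOOLKIT_ROOT_DIR}"
        PATH_SUFFIXES lib64/stubs)
endif()

# Sets NVML_FOUND only if both the header and the library were located.
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(NVML DEFAULT_MSG NVML_LIBRARY NVML_INCLUDE_DIR)
```

For CUDA versions before 8.0 the search would still have to fall back to the GDK locations, which this sketch leaves out.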


Related issues

Related to GROMACS - Bug #2311: NVML compilation issues (Closed)

Associated revisions

Revision 0f541787
Added by Jiri Kraus over 2 years ago

Update FindNVML to fix #2061

Fixes FindNVML to reflect move of the NVML development files from the
GDK to the CUDA Toolkit with CUDA 8.

Change-Id: I1d99ebff1fa32ba1fd44a37dcb43158da733daed

History

#1 Updated by Szilárd Páll about 3 years ago

  • Target version deleted (2016.1)

#2 Updated by Mark Abraham over 2 years ago

This is probably fine for a build, and we should educate our CMake accordingly. We should document that the real libnvidia-ml.so should be found at run time (it still seems to be provided by the driver installation).

#3 Updated by Mark Abraham over 2 years ago

  • Tracker changed from Task to Bug
  • Subject changed from extend FindNVML for CUDA 8.0 to fix FindNVML for CUDA 8.0
  • Status changed from New to Accepted
  • Target version set to 2016.4
  • Affected version - extra info set to all GROMACS versions, but only with CUDA 8+
  • Affected version set to 2016.3

This is a bug: a user with CUDA 8 should not be required to install the separate GDK, since from CUDA 8 onward it is no longer distributed separately. This was probably not clear when the issue was opened.

For a bug fix, setting a target version when accepting makes sense even if not planning/able to work on it right now. There's always one open for branches that are being maintained, and bumping a handful of outstanding known issues to the next bug fix release is reasonable.

Otherwise, leaving a blank target and no affected version gives us no good way to find it in the large number of open issues. (Those issues also include a lot of "wish list" tasks that look the same to the database.)

#4 Updated by Gerrit Code Review Bot over 2 years ago

Gerrit received a related patchset '1' for Issue #2061.
Uploader: Jiri Kraus
Change-Id: gromacs~master~I1d99ebff1fa32ba1fd44a37dcb43158da733daed
Gerrit URL: https://gerrit.gromacs.org/6651

#5 Updated by Gerrit Code Review Bot over 2 years ago

Gerrit received a related patchset '1' for Issue #2061.
Uploader: Jiri Kraus
Change-Id: gromacs~release-2016~I1d99ebff1fa32ba1fd44a37dcb43158da733daed
Gerrit URL: https://gerrit.gromacs.org/6657

#6 Updated by Szilárd Páll over 2 years ago

  • Status changed from Accepted to Fix uploaded

#7 Updated by Mark Abraham over 2 years ago

Confirming that the approach of linking to the stub library has worked for me on Piz Daint.

#8 Updated by Szilárd Páll over 2 years ago

  • Status changed from Fix uploaded to Resolved

I also tested last week on a couple of configs, with both CUDA <8.0 and ==8.0.

#9 Updated by Szilárd Páll over 2 years ago

  • Status changed from Resolved to Closed

I think the amount of feedback and testing warrants closing this issue.

#10 Updated by Aleksei Iupinov about 2 years ago

  • Related to Bug #2311: NVML compilation issues added
