Task #2825

reportGpuUsage is misleading

Added by Joe Jordan 7 months ago. Updated 7 months ago.

Status:
Closed
Priority:
Low
Assignee:
Category:
mdrun
Target version:
Difficulty:
uncategorized
Close

Description

If I specify a gpu_id to mdrun I will get output to stderr as follows.

Command line:
gmx mdrun -deffnm nvt -ntmpi 4 -ntomp 4 -gpu_id 0

Reading file nvt.tpr, VERSION 2018.4 (single precision)
Changing nstlist from 10 to 100, rlist from 1 to 1.124

Using 4 MPI threads
Using 4 OpenMP threads per tMPI thread

On host gpu14 1 GPU auto-selected for this run.
Mapping of GPU IDs to the 4 GPU tasks in the 4 ranks on this node:
PP:0,PP:0,PP:0,PP:0

This is not what I would expect since I specified the gpu but it says the gpu was auto-selected.

I think the issue is that reportGpuUsage in taskassignment.cpp takes a bool that decides whether "user-selected" or "auto-selected" is printed to stderr. This bool is !userGpuTaskAssignment.empty(), and if I understand correctly, userGpuTaskAssignment will be empty unless the user also specifies -gputasks to mdrun. I think the message is a bit confusing since it says the GPU was auto-selected, not the tasks to run on the GPU. I can't quite get to the bottom of this myself on a cursory investigation because there seems to be some ambiguity in the runner between GPU task assignment and GPU ID assignment.
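To make the suspected behaviour concrete, here is a minimal sketch of the wording logic as described above. It is not the actual GROMACS source: the function name `selectionWording` and its parameters are hypothetical, and only the `!userGpuTaskAssignment.empty()` test is taken from the description. The host-name prefix of the real message is left out.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the wording logic described above; this is
// NOT the real reportGpuUsage from taskassignment.cpp.
// userGpuTaskAssignment models the string passed to -gputasks (empty if unset).
// numGpusInUse is the number of distinct GPUs the run ends up using.
std::string selectionWording(const std::string& userGpuTaskAssignment, int numGpusInUse)
{
    assert(numGpusInUse > 0);
    // The bool from the description: true only when -gputasks was given.
    // Note that -gpu_id never enters into this decision.
    const bool userSetGpuIds = !userGpuTaskAssignment.empty();
    return std::to_string(numGpusInUse) + " GPU" + (numGpusInUse > 1 ? "s" : "") + " "
           + (userSetGpuIds ? "user-selected" : "auto-selected") + " for this run.";
}
```

Under these assumptions, running with -gpu_id 0 but without -gputasks leaves the first argument empty, so the sketch yields "1 GPU auto-selected for this run.", which matches the wording seen in the output above.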

Associated revisions

Revision cf71abd7 (diff)
Added by Mark Abraham 7 months ago

Simplify reporting of GPU selection

It is no longer clear to users whether mdrun -gpu_id 0 is
auto-selection or user-selection, since the user has restricted the
range from which auto-selection takes place. It is simpler and less
confusing to report simply that some were selected.

Fixes #2825

Change-Id: I0c8f0c50f6c5e90d86469c6005a61e43b29eb260

History

#1 Updated by Mark Abraham 7 months ago

Joe Jordan wrote:

If I specify a gpu_id to mdrun I will get output to stderr as follows.

Command line:
gmx mdrun -deffnm nvt -ntmpi 4 -ntomp 4 -gpu_id 0

Reading file nvt.tpr, VERSION 2018.4 (single precision)
Changing nstlist from 10 to 100, rlist from 1 to 1.124

Using 4 MPI threads
Using 4 OpenMP threads per tMPI thread

On host gpu14 1 GPU auto-selected for this run.
Mapping of GPU IDs to the 4 GPU tasks in the 4 ranks on this node:
PP:0,PP:0,PP:0,PP:0

This is not what I would expect since I specified the gpu but it says the gpu was auto-selected.

The documentation at http://manual.gromacs.org/documentation/current/user-guide/mdrun-performance.html#running-mdrun-with-gpus uses "specifies" only for -gputasks. Technically, your command line restricted the range from which "auto" selection can take place. However, I agree the output is confusing.

I think the issue is that reportGpuUsage in taskassignment.cpp takes a bool that decides whether "user-selected" or "auto-selected" is printed to stderr. This bool is !userGpuTaskAssignment.empty(), and if I understand correctly, userGpuTaskAssignment will be empty unless the user also specifies -gputasks to mdrun. I think the message is a bit confusing since it says the GPU was auto-selected, not the tasks to run on the GPU. I can't quite get to the bottom of this myself on a cursory investigation because there seems to be some ambiguity in the runner between GPU task assignment and GPU ID assignment.

The code tries to confirm what motivates the selection. The former usage of "auto" and "user" was easy when we only had -gpu_id but now there's a third possibility, and the interpretation of "auto-selected" is unclear. I suggest we merely state the number of GPUs selected, and not try to use wording that matches the way the user invoked mdrun.
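As a rough illustration of that suggestion, the following sketch reports only the number of GPUs selected and drops the "auto"/"user" distinction entirely. The function name `selectionSummary`, its parameters, and the exact wording are assumptions made for illustration, not the committed change.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch of the simplified wording; NOT the committed change.
// host is the node name, numGpusInUse the number of distinct GPUs in use.
std::string selectionSummary(const std::string& host, int numGpusInUse)
{
    assert(numGpusInUse > 0);
    // No "auto-" or "user-" prefix: the message states the outcome only,
    // regardless of whether -gpu_id, -gputasks, or neither was given.
    return "On host " + host + " " + std::to_string(numGpusInUse) + " GPU"
           + (numGpusInUse > 1 ? "s" : "") + " selected for this run.";
}
```

Because the message no longer depends on how mdrun was invoked, no flag describing the user's intent has to be passed down to the reporting code.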

#2 Updated by Joe Jordan 7 months ago

I agree that there is not much value in cleverly attempting to print back to stderr how mdrun was called. In general, the only time a user looks at what was written to stderr is when something went wrong, and I think the less that gets printed there, the easier it will be for users to figure out what went wrong.

#3 Updated by Gerrit Code Review Bot 7 months ago

Gerrit received a related patchset '1' for Issue #2825.
Uploader: Mark Abraham
Change-Id: gromacs~release-2019~I0c8f0c50f6c5e90d86469c6005a61e43b29eb260
Gerrit URL: https://gerrit.gromacs.org/8924

#4 Updated by Mark Abraham 7 months ago

  • Category set to mdrun
  • Status changed from New to Fix uploaded
  • Assignee set to Mark Abraham
  • Target version set to 2019.1

#5 Updated by Mark Abraham 7 months ago

  • Status changed from Fix uploaded to Resolved

#6 Updated by Paul Bauer 7 months ago

  • Status changed from Resolved to Closed
