Bug #1613

Problem with 4.6.x MPI, thread affinity, slurm and node-uneven task spread

Added by Åke Sandgren over 2 years ago. Updated almost 2 years ago.

Status:
Closed
Priority:
Low
Assignee:
-
Category:
mdrun
Target version:
Affected version - extra info:
Affected version:
Difficulty:
uncategorized

Description

Just managed to pin down a weird problem caused by an uneven spread of tasks over nodes combined with thread affinity, which makes jobs hang in gmx_set_thread_affinity.

This happens on our 48-core nodes with a 100-task job that, when submitted through Slurm (without specifying the distribution manually), gets spread over 3 nodes as 6+47+47 tasks.
We also use cgroups to allow multiple jobs per node, so the node with 6 tasks has an affinity mask covering only the 6 cores of a single NUMA node. The nodes with 47 tasks have the whole node allocated and thus get a full 48-core affinity mask.

(Due to a bug (or feature?) in Slurm, the tasks on the node with only 6 cores allocated actually get a single-core-per-task affinity, but that's not relevant here.)

Anyway, the problems start when the code reaches line 1629 in runner.c (this is 4.6.7) and the call to gmx_check_thread_affinity_set.

The loop that sets bAllSet ends up with bAllSet == FALSE for the tasks on the two fully allocated nodes (full 48-core mask) and TRUE for the tasks on the third node (restricted mask, i.e. affinity already set externally).
This in turn changes hw_opt->thread_affinity to threadaffOFF on those 6 tasks, but leaves it at threadaffAUTO for the other 2x47 tasks.

gmx_set_thread_affinity then promptly returns for those poor 6 tasks, while the remaining tasks try in vain to do an MPI_Comm_split with 6 tasks missing from the equation...

I suggest gathering the bAllSet result from all tasks in gmx_check_thread_affinity_set, so that all tasks have the same view of the world...

Suggested patch:
gmx_bool bAllSet_All;
MPI_Allreduce(&bAllSet, &bAllSet_All, 1, MPI_INT, MPI_LAND, MPI_COMM_WORLD);
bAllSet = bAllSet_All;


Related issues

Related to GROMACS - Bug #1614: thread-MPI has broken support for operations on MPI_INT Rejected

Associated revisions

Revision 93a5a180 (diff)
Added by Åke Sandgren over 2 years ago

Fix problem with mixed affinity mask on different nodes.

If the task distribution (with Slurm, for instance) assigns both fully
allocated and not-fully allocated nodes to the job, then there may be
tasks with an all-cores affinity mask alongside tasks with a
not-all-cores affinity mask.

Fixes #1613

Change-Id: I71c0daa43a5dd42da57bfd09037806ce1d9334b5

History

#1 Updated by Szilárd Páll over 2 years ago

Your suggested patch seems reasonable! FYI, our code review system is open for anyone to submit patches, so you could upload the suggested three-liner straight to gerrit.gromacs.org (the three-step instructions are here: http://www.gromacs.org/Developer_Zone/Git/Gerrit#Getting_started).

#2 Updated by Åke Sandgren over 2 years ago

Szilárd Páll wrote:

Your suggested patch seems reasonable! FYI, our code review system is open for anyone to submit patches, so you could upload the suggested three-liner straight to gerrit.gromacs.org (the three-step instructions are here: http://www.gromacs.org/Developer_Zone/Git/Gerrit#Getting_started).

Ok, trying that then :-)

#3 Updated by Gerrit Code Review Bot over 2 years ago

Gerrit received a related patchset '1' for Issue #1613.
Uploader: Åke Sandgren ()
Change-Id: I71c0daa43a5dd42da57bfd09037806ce1d9334b5
Gerrit URL: https://gerrit.gromacs.org/4116

#4 Updated by Åke Sandgren over 2 years ago

Åke Sandgren wrote:

Szilárd Páll wrote:

Your suggested patch seems reasonable! FYI, our code review system is open for anyone to submit patches, so you could upload the suggested three-liner straight to gerrit.gromacs.org (the three-step instructions are here: http://www.gromacs.org/Developer_Zone/Git/Gerrit#Getting_started).

Ok, trying that then :-)

I hope I got it right. Can you check?

#5 Updated by Mark Abraham over 2 years ago

  • Related to Bug #1614: thread-MPI has broken support for operations on MPI_INT added

#6 Updated by Mark Abraham almost 2 years ago

  • Category set to mdrun
  • Status changed from New to Resolved

#7 Updated by Mark Abraham almost 2 years ago

  • Status changed from Resolved to Closed
  • Target version set to 5.0.5
