Bug #1098

mdrun randomly hangs for 32-bit builds running on >1 thread on OS X

Added by Erik Lindahl over 6 years ago. Updated over 6 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version:
Affected version - extra info:
Affected version:
Difficulty: uncategorized

Description

mdrun simply keeps running (using 100% CPU) instead of finishing for several of the regressiontests. The hang is intermittent, and sometimes the tests pass. When mdrun is restricted to a single thread, the problem disappears.

64-bit builds work fine.

This could be a sign of an error in thread-MPI or in the domain decomposition that is sensitive to 32-bit versus 64-bit builds, in which case it might also affect other 32-bit platforms.

md.log (1.27 KB) Erik Lindahl, 12/28/2012 08:25 PM

History

#1 Updated by Erik Lindahl over 6 years ago

The bug seems to disappear when compiling against and running with a 32-bit build of Open MPI. This points to a 32-bit bug in thread-MPI.

#2 Updated by Sander Pronk over 6 years ago

Stupid question: how does one force 32-bit compilation?

#3 Updated by Erik Lindahl over 6 years ago

On OS X the easiest solution is to set CMAKE_OSX_ARCHITECTURES=i386 on the command line when calling cmake (it has to be set on the first invocation, before cmake runs any of its configuration tests).

On Linux (and OS X) a 32-bit build can also be forced by adding "-m32" to CFLAGS, but many Linux distros no longer install the 32-bit libraries by default.
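After rebuilding, it can be worth checking that the resulting binary really is 32-bit before trying to reproduce the hang. The following is a minimal sketch that inspects the file header only. The function names are hypothetical, and it assumes a thin little-endian Mach-O file (OS X on Intel) or an ELF file (Linux). Fat/universal binaries and anything else are reported as unknown (None).

```python
import struct


def binary_word_size(header):
    """Return 32 or 64 for a thin Mach-O or ELF header, or None if unknown.

    `header` is the first few bytes of the executable (at least 5).
    """
    if len(header) < 5:
        return None
    # ELF: magic 0x7f 'E' 'L' 'F', followed by EI_CLASS (1 = 32-bit, 2 = 64-bit).
    if header[:4] == b"\x7fELF":
        return {1: 32, 2: 64}.get(header[4])
    # Mach-O, little-endian on disk: MH_MAGIC (32-bit) or MH_MAGIC_64 (64-bit).
    magic = struct.unpack("<I", header[:4])[0]
    if magic == 0xFEEDFACE:
        return 32
    if magic == 0xFEEDFACF:
        return 64
    # Fat/universal binaries (0xCAFEBABE) and other formats are not handled here.
    return None


def word_size_of_file(path):
    """Read the header of the file at `path` and classify it."""
    with open(path, "rb") as f:
        return binary_word_size(f.read(8))
```

Calling `word_size_of_file` on the path of the freshly built mdrun executable (wherever your build placed it) should return 32 if the architecture setting or "-m32" took effect.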

#4 Updated by Sander Pronk over 6 years ago

I can't seem to reproduce this with llvm on Mountain Lion - which OS version and compiler are you using?

#5 Updated by Erik Lindahl over 6 years ago

Mountain Lion 10.8.2, Apple clang version 4.1 (tags/Apple/clang-421.11.66) (based on LLVM 3.1svn).
Header of a log file attached for more information.

I'm seeing the problem on a quad-core MacBook Pro. The problem occurs more frequently with 8 threads than with 2, but after running the simple tests 5-10 times I got it to hang with 2 threads too. However, note that it is NOT deterministic - many executions work fine, and then it suddenly hangs again.
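Because the hang is intermittent and the process spins at 100% CPU rather than exiting, one way to reproduce it is to run the same command repeatedly under a timeout and count how many runs fail to finish. The sketch below is only an illustration of that idea. The function name, the default run count, and the timeout are assumptions, and the command to run is whatever mdrun invocation the failing test uses.

```python
import subprocess
import sys


def count_hangs(cmd, runs=10, timeout_s=60.0):
    """Run `cmd` `runs` times and return how many runs exceeded `timeout_s`.

    A run that does not finish within the timeout is killed and counted
    as a hang. The exit status of runs that do finish is ignored.
    """
    hangs = 0
    for _ in range(runs):
        try:
            subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            hangs += 1
    return hangs


# Example with a stand-in command that exits immediately; replace the list
# with the actual mdrun command line used by the failing test.
quick_cmd = [sys.executable, "-c", "pass"]
print(count_hangs(quick_cmd, runs=3, timeout_s=30.0), "of 3 runs hung")
```

The timeout should be chosen comfortably above the normal runtime of the test, so that slow but healthy runs are not counted as hangs.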

#7 Updated by Erik Lindahl over 6 years ago

That fix solves all my issues.

#8 Updated by Roland Schulz over 6 years ago

  • Status changed from New to Closed
