Bug #1041

mdrun-openmm segmentation fault

Added by Justin Lemkul over 6 years ago. Updated over 6 years ago.

Status: Rejected
Priority: Normal
Assignee: -
Category: mdrun
Target version: -
Affected version - extra info: release-4-6
Affected version: -
Difficulty: uncategorized

Description

This got no response on the gmx-developers list, so I'm posting it here so the issue isn't forgotten.

Using mdrun-openmm built from a very recent version of the release-4-6 branch results in a segmentation fault. The most recent commit I pulled was:

$ git log -1
commit 7a330e69659febf8663a11b2ff57fb9942e0b7a7
Author: Roland Schulz <roland@utk.edu>
Date:   Fri Nov 16 17:41:48 2012 -0500

    Rename remaining GMX_ACCELERATION to GMX_CPU_ACCELERATION

    366c49a438150 renamed this variable but forgot these three.

    Change-Id: Iad653e2deaa2fc7cb218bffab84c6833ff8b3d56

The log file suggests that the GPU is not being detected properly, though the same setup works just fine in version 4.5.5.

The workstation has a Tesla C2075 card, CUDA 4.0 and OpenMM 4.0.

My command in 4.5.5:

$GMX/mdrun-gpu -device "OpenMM:platform=Cuda,memtest=off,deviceid=0,force-device=yes" -s test.tpr

From the 4.5.5 log file:

OpenMM plugins loaded from directory /usr/local/openmm-4.0/lib/plugins: libOpenMMCuda.so, libOpenMMAmoebaCuda.so, libOpenMMAmoebaSerialization.so, libOpenMMOpenCL.so, libFreeEnergySerialization.so,
The combination rule of the used force field matches the one used by OpenMM.
Gromacs will use the OpenMM platform: Cuda
Non-supported GPU selected (#0, Tesla C2075), forced continuing.Note, that the simulation can be slow or it migth even crash.
Pre-simulation GPU memtest skipped. Note, that faulty memory can cause incorrect results!

In release-4-6 I get this (same command, except "mdrun-openmm" instead of "mdrun-gpu"):

OpenMM plugins loaded from directory /usr/local/openmm-4.0/lib/plugins: libOpenMMCuda.so, libOpenMMAmoebaCuda.so, libOpenMMAmoebaSerialization.so, libOpenMMOpenCL.so, libFreeEnergySerialization.so,
The combination rule of the used force field matches the one used by OpenMM.
Gromacs will use the OpenMM platform: Cuda
Gromacs will run on the GPU #0 (es 148566
procs_running 1
procs_blocked 0
softirq 33540031 0 182f4&).
Pre-simulation GPU memtest skipped. Note, that faulty memory can cause incorrect results!

It seems the GPU name is not being detected properly, and garbage gets printed to the .log file; the stray text (procs_running, procs_blocked, softirq ...) looks like the contents of /proc/stat, as if a stale or uninitialized buffer is being printed. Immediately after this, mdrun-openmm segfaults. I know the card is functional and my installation method is sound, because version 4.5.5 works.

To diagnose this, I compiled in Debug mode instead of Release mode. Running mdrun-openmm through gdb then allows the simulation to run, with the following in the .log file:

OpenMM plugins loaded from directory /usr/local/openmm-4.0/lib/plugins: libOpenMMCuda.so, libOpenMMAmoebaCuda.so, libOpenMMAmoebaSerialization.so, libOpenMMOpenCL.so, libFreeEnergySerialization.so,
The combination rule of the used force field matches the one used by OpenMM.
Gromacs will use the OpenMM platform: Cuda
Gromacs will run on the GPU #0 ().
Pre-simulation GPU memtest skipped. Note, that faulty memory can cause incorrect results!

Then the run proceeds, albeit very slowly.

This all strikes me as very weird: the Debug build works (though it no longer prints the name of the card), but the Release build fails.
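For what it's worth, these symptoms (an empty name in the Debug build, stale-looking garbage in the Release build, then a crash) match the classic signature of printing an uninitialized buffer after the call that should fill it has failed. Below is a minimal sketch of that failure mode; it is purely illustrative, not actual GROMACS code, and detect_gpu_name() is a hypothetical stand-in for whatever fills in the device name.

#include <stdio.h>

/* Hypothetical detection call: writes the name and returns 0 on
   success; returns -1 on failure and leaves 'name' untouched. */
static int detect_gpu_name(int deviceid, char *name, int len)
{
    (void) deviceid;
    (void) name;
    (void) len;
    return -1; /* simulate the detection failure seen above */
}

int main(void)
{
    char gpu_name[256]; /* never initialized */

    /* BUG: the return value is ignored, so when detection fails the
       buffer still holds whatever bytes happened to be on the stack.
       A Debug build may print "" because that memory happens to be
       zeroed; an optimized build can print stale data instead (for
       example, leftovers from an earlier read of /proc/stat), and
       printf can run off the end of the buffer and crash if it never
       finds a terminating '\0'. */
    detect_gpu_name(0, gpu_name, (int) sizeof gpu_name);

    printf("Gromacs will run on the GPU #0 (%s).\n", gpu_name);
    return 0;
}

If anyone does pick this up, running the Release binary under valgrind would probably flag the uninitialized read directly, e.g.:

valgrind --track-origins=yes $GMX/mdrun-openmm -device "OpenMM:platform=Cuda,memtest=off,deviceid=0,force-device=yes" -s test.tpr

whereas running under gdb (or building in Debug mode) can perturb the memory layout just enough to mask the problem.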

History

#1 Updated by Mark Abraham over 6 years ago

  • Status changed from New to Rejected
  • Assignee deleted (Berk Hess)

We have decided to remove support for the OpenMM code path because of a shortage of developers willing to commit time to it. The code itself will continue to exist in 4.6, but will migrate to src/contrib (#1079). I don't expect that anybody will ever find time to explore this report.
