Task #2593

work around denorm handling performance penalty on AMD Vega

Added by Szilárd Páll about 2 years ago. Updated about 2 years ago.

Status:
Closed
Priority:
Normal
Category:
mdrun
Target version:
Difficulty:
simple

Description

The concrete issue we ran into is that the ROCm stack's compiler generates code that handles denorms on AMD Vega and later, on the assumption that doing so carries a low penalty. However, this turns out not to be the case for the nonbonded kernels: flushing denormals to zero reduces wall-time by ~25-30%. Flushing to zero is already the default on all other architectures, as well as in CUDA (when using the fast-math flag).
Ref: https://github.com/RadeonOpenCompute/ROCm/issues/222#issuecomment-393553253
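
For readers unfamiliar with the term, the following is a minimal host-side C++ sketch of what "flushing denormals to zero" means numerically. It is only an illustration: `flushDenormToZero` is a hypothetical helper, not GROMACS code, and on the GPU the flushing is done by the hardware or compiler-generated code rather than by an explicit check like this.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper for illustration only (not part of GROMACS):
// returns x unchanged unless it is a denormal (subnormal) value,
// in which case a zero with the same sign is returned.
inline float flushDenormToZero(float x)
{
    const float result = (std::fpclassify(x) == FP_SUBNORMAL) ? std::copysign(0.0f, x) : x;
    // Postcondition: the result is never a denormal.
    assert(std::fpclassify(result) != FP_SUBNORMAL);
    return result;
}
```

With this helper, `std::numeric_limits<float>::denorm_min()` maps to zero, while the smallest normal value `std::numeric_limits<float>::min()` and ordinary values such as `1.0f` pass through unchanged.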

Besides avoiding this performance penalty, we would also make the denorm handling more uniform across platforms by explicitly using the -cl-denorms-are-zero flag as a hint to the compiler that we prefer denormalized numbers flushed to zero.
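
As a rough sketch of how such a flag might be added to the kernel compilation options, the C++ snippet below composes an OpenCL build-option string. The function name `makeKernelBuildOptions` and its parameters are assumptions made for this example and do not reflect the actual GROMACS implementation; only the `-cl-denorms-are-zero` option itself comes from the issue.

```cpp
#include <string>

// Hypothetical helper (not the GROMACS implementation): appends the
// -cl-denorms-are-zero hint to an existing OpenCL build-option string.
std::string makeKernelBuildOptions(const std::string& baseOptions,
                                   bool               flushDenormsToZero = true)
{
    std::string options = baseOptions;
    if (flushDenormsToZero)
    {
        // Separate from any preceding options with a single space.
        if (!options.empty())
        {
            options += ' ';
        }
        options += "-cl-denorms-are-zero";
    }
    return options;
}
```

The resulting string would then be passed as the `options` argument of `clBuildProgram` when compiling the kernels. Keeping the flag behind a boolean is just one possible design; it would leave room to turn the hint off if denormal handling is ever needed.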

Associated revisions

Revision c8096cb8 (diff)
Added by Szilárd Páll about 2 years ago

Request flushing denorms to zero in OpenCL

This change adds -cl-denorms-are-zero by default to the flags used
for kernel compilation. This is done to:
- avoid a large performance penalty on AMD Vega with ROCm (which by
  default handles denorms on GFX9 or later);
- make the defaults uniform across CUDA and OpenCL.

Fixes #2593

Change-Id: I9e6183c4367b5960e0e21f1dd342d7695acfbc44

History

#1 Updated by Gerrit Code Review Bot about 2 years ago

Gerrit received a related patchset '2' for Issue #2593.
Uploader: Szilárd Páll ()
Change-Id: gromacs~master~I9e6183c4367b5960e0e21f1dd342d7695acfbc44
Gerrit URL: https://gerrit.gromacs.org/8118

#2 Updated by Szilárd Páll about 2 years ago

  • Status changed from New to Closed

#3 Updated by Gerrit Code Review Bot about 2 years ago

Gerrit received a related patchset '1' for Issue #2593.
Uploader: Szilárd Páll ()
Change-Id: gromacs~release-2018~I9e6183c4367b5960e0e21f1dd342d7695acfbc44
Gerrit URL: https://gerrit.gromacs.org/8210
