Feature #1782

OpenCL runtime/compiler version should be reported in the log...

Added by Szilárd Páll about 4 years ago. Updated 8 months ago.

Status:
Closed
Priority:
High
Assignee:
-
Category:
core library
Target version:
Difficulty:
uncategorized
Close

Description

...not just because that's how we do it with CPU and CUDA GPU information, but also because identifying the source of some OpenCL related issues has already been hampered by the lack of such information.

Ref: http://comments.gmane.org/gmane.science.biology.gromacs.devel/6384


Related issues

Related to GROMACS - Task #1783: investigate Apple OpenCL runtime usability (Closed, 07/20/2015)

History

#1 Updated by Szilárd Páll about 4 years ago

  • Related to Task #1783: investigate Apple OpenCL runtime usability added

#2 Updated by Erik Lindahl about 4 years ago

We can't do it the way we report it for CUDA, at the very beginning of the log file, since OpenCL requires us to query specific devices, which in turn depends on the entire GPU initialization having completed.

The best we can achieve with OpenCL is to provide a log file pointer to the initialization routine and write this information for the actually available/selected devices further down in the log file.
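
For illustration, the approach described above (hand the log file pointer to the initialization routine, then write one entry per selected device) could look roughly like the sketch below. The `DeviceReport` struct and both function names are hypothetical and are not taken from the GROMACS sources; the output layout only loosely imitates the existing "GPU info" lines.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical record of what was queried for one device during GPU init.
struct DeviceReport
{
    std::string name;
    std::string vendor;
    std::string deviceVersion;
    std::string driverVersion;
    bool        selected; // true if this device will actually be used
};

// Builds one line per selected device; the index is the detection order.
std::string formatDeviceReports(const std::vector<DeviceReport>& devices)
{
    std::string out;
    int         index = 0;
    for (const DeviceReport& d : devices)
    {
        if (d.selected)
        {
            out += "  #" + std::to_string(index) + ": name: " + d.name
                   + ", vendor: " + d.vendor
                   + ", device version: " + d.deviceVersion
                   + ", driver version: " + d.driverVersion + "\n";
        }
        ++index;
    }
    return out;
}

// Meant to be called once initialization has completed; fplog may be null
// (for example on ranks that do not write a log), in which case nothing is written.
void reportDevices(std::FILE* fplog, const std::vector<DeviceReport>& devices)
{
    if (fplog == nullptr)
    {
        return;
    }
    std::fprintf(fplog, "OpenCL devices in use:\n%s", formatDeviceReports(devices).c_str());
}
```

Keeping the formatting separate from the `FILE*` write is just a convenience for testing; a single function would work equally well.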

#3 Updated by Szilárd Páll about 4 years ago

Indeed, the ICD loader makes things harder, but some form of reporting should happen. Not knowing what the software environment is seems less than ideal.

#4 Updated by Mark Abraham about 4 years ago

Current release-5-1 writes

GROMACS:      gmx mdrun, VERSION 5.1-rc1-dev-20150806-f9747f2-unknown
Executable:   /nethome/mark/redmines/redmine-1496/test-opencl/r51/build-cmake-gcc-gpu-release/install/bin/gmx
Data prefix:  /nethome/mark/redmines/redmine-1496/test-opencl/r51/build-cmake-gcc-gpu-release/install
Command line:
  gmx mdrun -ntmpi 1 -ntomp 4 -notunepme

GROMACS version:    VERSION 5.1-rc1-dev-20150806-f9747f2-unknown
GIT SHA1 hash:      f9747f2ce7acf426ac58f369708ba85d4560fb08
Branched from:      unknown
Precision:          single
Memory model:       64 bit
MPI library:        thread_mpi
OpenMP support:     enabled (GMX_OPENMP_MAX_THREADS = 32)
GPU support:        enabled
OpenCL support:     enabled
invsqrt routine:    gmx_software_invsqrt(x)
SIMD instructions:  AVX2_256
FFT library:        fftw-3.3.4-sse2
RDTSCP usage:       enabled
C++11 compilation:  disabled
TNG support:        enabled
Tracing support:    disabled
Built on:           Fri Aug  7 15:51:02 CEST 2015
Built by:           mark@tcbs15 [CMAKE]
Build OS/arch:      Linux 3.16.0-45-generic x86_64
Build CPU vendor:   GenuineIntel
Build CPU brand:    Intel(R) Core(TM) i7-5960X CPU @ 3.00GHz
Build CPU family:   6   Model: 63   Stepping: 2
Build CPU features: aes apic avx avx2 clfsh cmov cx8 cx16 f16c fma htt lahf_lm mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
C compiler:         /opt/tcbsys/gcc/4.9.2/bin/gcc-4.9 GNU 4.9.2
C compiler flags:    -march=core-avx2    -Wextra -Wno-missing-field-initializers -Wno-sign-compare -Wpointer-arith -Wall -Wno-unused -Wunused-value -Wunused-parameter  -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast  -Wno-array-bounds 
C++ compiler:       /opt/tcbsys/gcc/4.9.2/bin/g++-4.9 GNU 4.9.2
C++ compiler flags:  -march=core-avx2    -Wextra -Wno-missing-field-initializers -Wpointer-arith -Wall -Wno-unused-function  -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast  -Wno-array-bounds 
Boost version:      1.55.0 (internal)
OpenCL include dir: /opt/AMD/AMDAPPSDK-2.9-1/include
OpenCL library:     /opt/AMD/AMDAPPSDK-2.9-1/lib/x86_64/libOpenCL.so
OpenCL version:     1.2

Running on 1 node with total 8 cores, 16 logical cores, 2 compatible GPUs
Hardware detected:
  CPU info:
    Vendor: GenuineIntel
    Brand:  Intel(R) Core(TM) i7-5960X CPU @ 3.00GHz
    Family:  6  model: 63  stepping:  2
    CPU features: aes apic avx avx2 clfsh cmov cx8 cx16 f16c fma htt lahf_lm mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
    SIMD instructions most likely to fit this hardware: AVX2_256
    SIMD instructions selected at GROMACS compile time: AVX2_256
  GPU info:
    Number of GPUs detected: 2
    #0: name: Oland, vendor: Advanced Micro Devices, Inc., device version: OpenCL 1.2 AMD-APP (1800.5), stat: compatible
    #1: name: Hawaii, vendor: Advanced Micro Devices, Inc., device version: OpenCL 2.0 AMD-APP (1800.5), stat: compatible

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
S. Páll, M. J. Abraham, C. Kutzner, B. Hess, E. Lindahl
Tackling Exascale Software Challenges in Molecular Dynamics Simulations with
GROMACS
In S. Markidis & E. Laure (Eds.), Solving Software Challenges for Exascale 8759 (2015) pp. 3–27
-------- -------- --- Thank You --- -------- --------

Can we specify what we think should be there that is not? We've got CL_DEVICE_[NAME|VENDOR|VERSION] already. Looking at https://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/clGetDeviceInfo.html we could get CL_DRIVER_VERSION also? IIRC we don't do that for CUDA, so maybe we don't need that for OpenCL for 5.1 either? clGetPlatformInfo doesn't seem to return anything useful.
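
As a side note on what the existing "device version" field already carries: the OpenCL specification defines the CL_DEVICE_VERSION string as "OpenCL", a space, major.minor, a space, and then vendor-specific information. A minimal sketch of splitting such a string, with a hypothetical struct and function name, might be:

```cpp
#include <sstream>
#include <string>

// Hypothetical holder for the parts of a CL_DEVICE_VERSION string.
struct OpenCLVersionInfo
{
    int         majorVersion;
    int         minorVersion;
    std::string vendorInfo; // vendor-specific tail, may be empty
};

// Parses "OpenCL <major>.<minor> <vendor-specific information>".
// Returns false (leaving *out untouched) if the text does not have that shape.
bool parseDeviceVersion(const std::string& text, OpenCLVersionInfo* out)
{
    std::istringstream in(text);
    std::string        prefix;
    OpenCLVersionInfo  info{};
    char               dot = '\0';
    if (!(in >> prefix) || prefix != "OpenCL")
    {
        return false;
    }
    if (!(in >> info.majorVersion >> dot >> info.minorVersion) || dot != '.')
    {
        return false;
    }
    // Whatever follows the version number is vendor-specific free text.
    std::getline(in >> std::ws, info.vendorInfo);
    *out = info;
    return true;
}
```

For the Oland line in the log above, this would separate version 1.2 from the vendor part "AMD-APP (1800.5)".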

AFAICS there's no such thing as a host compiler to have a version to report - the OpenCL compiler gives you the highest 1.x version the device supports, unless you pass a flag for 2.0 support.

Do we want the AMD SDK version? If so, then I think there is no OpenCL function to query. e.g. amd/appsdk/2.9/include/SDKUtil/SDKUtil.hpp has appropriate #defines but we don't have the CMakery to access it. There is nothing in include/CL/* that seems to be able to help.

I didn't see anything in the discussion that Szilard linked that would have been improved by better mdrun reporting. We now think the issue is the Mac OS version. We could try to have mdrun report runtime OS/arch as well as build time OS/arch, but in Carlo's case both would probably have been the same. Unless someone thinks that is easy, I wouldn't push too hard for that in 5.1, now that we have a patch to ban Mac OS <10.10.4.

#5 Updated by Szilárd Páll about 4 years ago

Mark Abraham wrote:

Current release-5-1 writes

[...]

Can we specify what we think should be there that is not? We've got CL_DEVICE_[NAME|VENDOR|VERSION] already. Looking at https://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/clGetDeviceInfo.html we could get CL_DRIVER_VERSION also? IIRC we don't do that for CUDA, so maybe we don't need that for OpenCL for 5.1 either? clGetPlatformInfo doesn't seem to return anything useful.

OpenCL compiler and driver version should be enough. These will typically be identical as long as no binary caching is supported, so for now there is no difference.

In CUDA we don't use the driver API, but, although not documented, the driver API version at compile-time is always the same as the runtime API version. What we report is the runtime API version we compiled against and the driver API version we detect at runtime. As mentioned above, such a distinction is ATM not needed in OpenCL, but later we may want to implement it by e.g. embedding the compiler version in the binary (if we'll assume backward compatibility between the compiler used and the driver at the time of running).
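
A rough sketch of the "embed a build-time version in the binary and report it next to the run-time one" idea follows. The macro `EXAMPLE_BUILD_OPENCL_VERSION` is a made-up name that a build system would have to define; nothing like it is claimed to exist in GROMACS.

```cpp
#include <string>

// Hypothetical macro: the build system would pass something like
// -DEXAMPLE_BUILD_OPENCL_VERSION="\"1.2\"" at compile time.
#ifndef EXAMPLE_BUILD_OPENCL_VERSION
#define EXAMPLE_BUILD_OPENCL_VERSION "unknown"
#endif

// Combines the version baked into the binary with the one detected at run time.
std::string describeOpenCLVersions(const std::string& runtimeVersion)
{
    const std::string buildVersion = EXAMPLE_BUILD_OPENCL_VERSION;
    std::string       line         = "OpenCL version: built against " + buildVersion
                       + ", detected at run time " + runtimeVersion;
    // Only flag a difference when the build-time value is actually known.
    if (buildVersion != "unknown" && buildVersion != runtimeVersion)
    {
        line += " (mismatch)";
    }
    return line;
}
```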

AFAICS there's no such thing as a host compiler to have a version to report - the OpenCL compiler gives you the highest 1.x version the device supports, unless you pass a flag for 2.0 support.

OpenCL does not do cross-compilation, it's plain C/C++ so there is no such thing as host compiler.

Do we want the AMD SDK version? If so, then I think there is no OpenCL function to query. e.g. amd/appsdk/2.9/include/SDKUtil/SDKUtil.hpp has appropriate #defines but we don't have the CMakery to access it. There is nothing in include/CL/* that seems to be able to help.

Well, if CL_DRIVER_VERSION only reports an OpenCL version and not the vendor version, that would be good in identifying the software env the binary was run in, right?
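
For reference, string properties such as CL_DRIVER_VERSION are read from clGetDeviceInfo with the usual two-call pattern: ask for the required size first, then fetch the value. The sketch below shows only that pattern and deliberately avoids a dependency on the OpenCL headers; `InfoQuery`, `queryInfoString` and `fakeInfoQuery` are hypothetical names. In real code the callable would wrap `clGetDeviceInfo(device, CL_DRIVER_VERSION, size, value, sizeRet)`.

```cpp
#include <cstddef>
#include <cstring>
#include <functional>
#include <string>
#include <vector>

// Mirrors the last three parameters of clGetDeviceInfo:
// (param_value_size, param_value, param_value_size_ret); 0 means success.
using InfoQuery = std::function<int(std::size_t, void*, std::size_t*)>;

// Two-call pattern: first obtain the size, then the string itself.
std::string queryInfoString(const InfoQuery& query)
{
    std::size_t size = 0;
    if (query(0, nullptr, &size) != 0 || size == 0)
    {
        return "unknown";
    }
    std::vector<char> buffer(size);
    if (query(size, buffer.data(), nullptr) != 0)
    {
        return "unknown";
    }
    std::string result(buffer.begin(), buffer.end());
    result.resize(std::strlen(result.c_str())); // cut at the first NUL
    return result;
}

// Stand-in for a real device query, used here only to exercise the helper.
InfoQuery fakeInfoQuery(const std::string& value)
{
    return [value](std::size_t size, void* out, std::size_t* sizeRet) -> int {
        const std::size_t needed = value.size() + 1; // include the terminating NUL
        if (sizeRet != nullptr)
        {
            *sizeRet = needed;
        }
        if (out != nullptr)
        {
            if (size < needed)
            {
                return -30; // numeric value of CL_INVALID_VALUE
            }
            std::memcpy(out, value.c_str(), needed);
        }
        return 0;
    };
}
```

What the returned string actually contains is vendor-defined, so whether it answers the question above would have to be checked per runtime.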

I didn't see anything in the discussion that Szilard linked that would have been improved by better mdrun reporting. We now think the issue is the Mac OS version. We could try to have mdrun report runtime OS/arch as well as build time OS/arch, but in Carlo's case both would probably have been the same. Unless someone thinks that is easy, I wouldn't push too hard for that in 5.1, now that we have a patch to ban Mac OS <10.10.4.

Well, build vs runtime OS could be useful, but not that crucial I believe. My guess is that it's (hopefully) enough to know that we're using x86_64 and AMD APPSDK 2.9. Sadly, as the driver is a kernel module the kernel version may matter, but let's not get into the business of system info reporting now. Could be good to keep in mind for later, e.g. to write a small module that detects at run-time the software environment.

#6 Updated by Mark Abraham about 4 years ago

  • Target version changed from 5.1 to future

Sounds like we could have various things, but the most important of them can be readily supplied by the user in the event of a problem.

In particular, I don't want to put last-minute time into CMakery for noticing that we have AMD OpenCL, finding AMDSDK and querying compile- and/or run-time versions. We might consider that in future, though.

#7 Updated by Szilárd Páll over 3 years ago

Time to revisit this, I'd say. Knowing a bit better how things work, I'm not sure it is useful to include the library. One can link against whatever ICD loader is available on the system, but that tells nothing about the runtime that will be used (well, at runtime).

#8 Updated by Mark Abraham 8 months ago

  • Status changed from New to Resolved

No action mentioned here, so I assume the current behaviour is satisfactory.

#9 Updated by Mark Abraham 8 months ago

  • Status changed from Resolved to Closed
