Project

General

Profile

Task #2193

Task #2454: OpenCL infrastructure improvements

OpenCL code modernization assuming v1.2 reqd

Added by Szilárd Páll about 2 years ago. Updated 9 months ago.

Status:
Closed
Priority:
Normal
Category:
mdrun
Target version:
Difficulty:
simple
Close

Description

A list of tasks that could be beneficial to consider if we require v1.2 standard.

- modernize API use swapping clEnqueueMarker/clEnqueueBarrier/clEnqueueWaitForEvents to clEnqueueMarkerWithWaitList/clEnqueueBarrierWithWaitList;

- consider using CL_MEM_COPY_HOST_WRITE_ONLY, CL_MEM_COPY_HOST_READ_ONLY, CL_MEM_COPY_HOST_NO_ACCESS

- use clEnqueueFillBuffer for 0-ing

- do not add call to clUnloadCompiler

- use OPENCL_C_VERSION for reporting version


Related issues

Related to GROMACS - Task #2787: allow passing flags to allocateDeviceBufferNew
Has duplicate GROMACS - Task #2438: bump OpenCL requirement to 1.2Closed

Associated revisions

Revision 866c0086 (diff)
Added by Szilárd Páll 10 months ago

Modernize OpenCL memory allocation flags

This change correct the memory allocation flags in the nonbonded module
so that these reflect the R/W use on both host- and device-side; the
former is made possible by requiring OpenCL 1.2.

Also improve some lacking error handling.

Refs #2193

Change-Id: Icef4890aa412b811bf189b78ff42ee8ca8c50113

History

#1 Updated by Mark Abraham about 2 years ago

Did you mean to make this private?

#2 Updated by Szilárd Páll about 2 years ago

Mark Abraham wrote:

Did you mean to make this private?

Yes, it is not very well thought through, just wanted to record as a draft some of the thing we can/should do.

#3 Updated by Szilárd Páll over 1 year ago

  • Private changed from Yes to No

#4 Updated by Szilárd Páll over 1 year ago

  • Has duplicate Task #2438: bump OpenCL requirement to 1.2 added

#5 Updated by Szilárd Páll over 1 year ago

  • Description updated (diff)

Dupe/overlaps with #2438, but the 2nd to last point might be useful for #2449

#6 Updated by Szilárd Páll over 1 year ago

  • Parent task set to #2454

#7 Updated by Aleksei Iupinov over 1 year ago

  • Description updated (diff)

OK, it's clUnloadCompiler(), not clUnloadPlatformCompiler(), it didn't help me at all with #2449, and it's actually a deprecated in 1.2 :-)

https://www.khronos.org/registry/OpenCL/sdk/2.1/docs/man/xhtml/deprecated.html

#8 Updated by Aleksei Iupinov over 1 year ago

  • Description updated (diff)

#9 Updated by Szilárd Páll over 1 year ago

Aleksei Iupinov wrote:

OK, it's clUnloadCompiler(), not clUnloadPlatformCompiler(), it didn't help me at all with #2449, and it's actually a deprecated in 1.2 :-)

https://www.khronos.org/registry/OpenCL/sdk/2.1/docs/man/xhtml/deprecated.html

You mean in 2.1. So what exactly is the resource that's using/leaking memory in your case, any idea?

#10 Updated by Aleksei Iupinov over 1 year ago

No, the link is for 2.1, but if you read through it, the function was deprecated in 1.2 already.

No idea so far, but I will eventually have to not compile kernels many times in unit tests (wildly assuming it's the compilation that causes this) - it takes insane time anyway, 2 seconds per each case. We'll see if that 'fixes' it then.

#11 Updated by Mark Abraham 11 months ago

Is there work remaining?

#12 Updated by Szilárd Páll 11 months ago

  • Status changed from New to In Progress
  • Assignee set to Szilárd Páll

Yes, had some WIP for the second item in the list, was hoping the PME code gets merged, but it's been slow to get there, so I should just upload the modernization for the nonbonded module, and do the rest later.

#13 Updated by Szilárd Páll 11 months ago

  • Target version set to 2019-beta1

#14 Updated by Gerrit Code Review Bot 11 months ago

Gerrit received a related patchset '1' for Issue #2193.
Uploader: Szilárd Páll ()
Change-Id: gromacs~master~Icef4890aa412b811bf189b78ff42ee8ca8c50113
Gerrit URL: https://gerrit.gromacs.org/8520

#15 Updated by Mark Abraham 10 months ago

  • Status changed from In Progress to Fix uploaded

#16 Updated by Szilárd Páll 10 months ago

This weekend I realized that the allocation modernization won't be possible in the new PME code due to the an omission/oversimplication of the allocateDeviceBuffer() API which doesn't support passing flags.

The allocateDeviceBuffer() call should take some flags, not sure if we want to just add a flag argument with a default value? Also, to be able to call reallocateDeviceBuffer() without having to pass the flags again, we could store the flags in TypedClMemory class.

Thoughts?

#17 Updated by Mark Abraham 10 months ago

  • Target version changed from 2019-beta1 to 2019

Unlikely to do anything before beta

#18 Updated by Szilárd Páll 10 months ago

Mark Abraham wrote:

Unlikely to do anything before beta

Sure. Do such internal API changes qualify as acceptable for changes during beta?

#19 Updated by Mark Abraham 10 months ago

Szilárd Páll wrote:

Mark Abraham wrote:

Unlikely to do anything before beta

Sure. Do such internal API changes qualify as acceptable for changes during beta?

Sounds plausible. Depends on scope and impact.

#20 Updated by Mark Abraham 9 months ago

Szilard is this done?

#21 Updated by Szilárd Páll 9 months ago

  • Related to Task #2787: allow passing flags to allocateDeviceBuffer added

#22 Updated by Szilárd Páll 9 months ago

  • Status changed from Fix uploaded to Closed

Also available in: Atom PDF