Feature #1641

Add toolchain file for Cray systems

Added by Justin Cook almost 3 years ago. Updated over 2 years ago.

Status: New
Priority: Normal
Assignee: -
Category: build system
Target version: -
Difficulty: uncategorized

Description

There are a few issues with GROMACS building on a Cray system. I was hoping to address these either individually or with a toolchain file for CMake. If this sounds like something you'd be interested in I would be happy to discuss your ideas. I wanted to open discussion up before pushing out any code.


Related issues

Related to GROMACS - Feature #911: implement CMake option to enable fully static binaries (Closed, 2012-04-03)

Associated revisions

Revision 88fda475 (diff)
Added by Roland Schulz over 2 years ago

Facilitate linking of static binaries

Minimal solution. The user has to manually set both
-DBUILD_SHARED_EXE=no and CFLAGS=CXXFLAGS=-static, perhaps manage
their own toolchain, and certainly make static libraries available for
all dependencies. Also does not auto-detect if compiler defaults to
static (Cray). Works better than LINK_SEARCH_END_STATIC because
otherwise dynamic flags can be added to the middle if some libraries
in default search path exist as both dynamic and static.

Fixes #911
Related to #1641

Change-Id: If7b8192b44c33c861f126e3422df04388d2f2be5

History

#1 Updated by Mark Abraham almost 3 years ago

Hi Justin,

Things have been not-quite-right-enough on some Crays for some time, so some discussion is most welcome. Getting the cleverness of any/all of Cray's wrapper compilers (perhaps with a local requirement for static linking), CMake, nvcc, icc and Gromacs configuration to work correctly together is problematic.

Lately, I pushed up some proposed changes at https://gerrit.gromacs.org/#/c/4188/, but consensus is that it needs further work. Do you have suggestions to make (there or here)?

#2 Updated by Justin Cook almost 3 years ago

I'd be happy to help improve the default build behavior and performance.

Internally at Cray, I've been tasked with improving CMake builds. GROMACS is a popular application that comes up often, so I'm glad it is also on your radar.

I'd be happy to review your proposed changes, though it may take me some time.

I have a github account (jscook) if you want to CC me on any other changes.

#3 Updated by Mark Abraham almost 3 years ago

Justin Cook wrote:

I'd be happy to help improve the default build behavior and performance.

Internally at Cray, I've been tasked with improving CMake builds. GROMACS is a popular application that comes up often, so I'm glad it is also on your radar.

Great. After further tinkering, I have proposed a functionality change for CMake, specifically for handling static linking with Intel compilers on Cray (e.g. as implemented on the "Archer" XC30 at EPCC in Edinburgh). Please check out http://public.kitware.com/pipermail/cmake/2014-November/059154.html and comment if you have any insight.

I'd be happy to review your proposed changes, though it may take me some time.

I have a github account (jscook) if you want to CC me on any other changes.

We don't use github for development, but if you log into https://gerrit.gromacs.org with an OpenID (e.g. gmail account), then I'll be able to tag you on any relevant proposals.

#4 Updated by Mark Abraham almost 3 years ago

  • Related to Feature #911: implement CMake option to enable fully static binaries added

#5 Updated by Mark Abraham almost 3 years ago

I have updated https://gerrit.gromacs.org/#/c/4188/, after separating out the non-Cray-specific issues.

#6 Updated by Mark Abraham almost 3 years ago

One aspect of these problems is manifest below:

/opt/cray/craype/2.2.1/bin/CC   -march=core-avx2   -std=c++0x  -Wextra -Wno-missing-field-initializers -Wpointer-arith
-Wall -Wno-unused-function  -O3 -DNDEBUG -fomit-frame-pointer -funroll-all-loops -fexcess-precision=fast  -Wno-array-bounds
src/programs/CMakeFiles/mdrun.dir/mdrun_main.cpp.o src/programs/CMakeFiles/mdrun_objlib.dir/mdrun/membed.c.o
src/programs/CMakeFiles/mdrun_objlib.dir/mdrun/runner.c.o src/programs/CMakeFiles/mdrun_objlib.dir/mdrun/pme_loadbal.c.o
src/programs/CMakeFiles/mdrun_objlib.dir/mdrun/repl_ex.c.o src/programs/CMakeFiles/mdrun_objlib.dir/mdrun/md.c.o
src/programs/CMakeFiles/mdrun_objlib.dir/mdrun/mdrun.cpp.o  -o bin/mdrun_mpi  -rdynamic lib/libgromacs_mdrun_mpi.a -fopenmp -lrt -lm
-Wl,-Bstatic -lz -Wl,-Bdynamic /afs/pdc.kth.se/home/m/mjab/progs/lib/libfftw3f.a && :
/usr/lib/../lib64/librt.a(clock_gettime.o): In function `hp_timing_gettime':
/usr/src/packages/BUILD/glibc-2.11.3/rt/../sysdeps/unix/clock_gettime.c:65: undefined reference to `_dl_cpuclock_offset'

Some code calls a timing function that is resolved from librt.a (clock_gettime.o in the log above), and the effect of the final -Wl,-Bdynamic is to prevent Cray's wrapper compiler from sorting out the issue. Removing -Wl,-Bstatic and -Wl,-Bdynamic solves the problem.

When CMake is used to find a library (zlib in this case), and even when you have found a static version (/usr/lib64/libz.a is in the CMakeCache.txt), the standard CMake behaviour is to decorate it with explicit linker flags so that any platform-specific searching that might be triggered from the use of -lz can work (e.g. /usr/lib/libz.a vs /usr/lib64/libz.a in the general case; see http://www.cmake.org/Wiki/CMake_2.6_Notes#Linking_to_System_Libraries and http://www.cmake.org/pipermail/cmake/2011-June/045222.html). This seems like a self-double-cross by CMake, which searched for a full path to the library in the first place... I can't find more recent CMake docs for this, but inspecting Source/cmComputeLinkInformation.cxx in the cmake repo finds the function ComputeLinkTypeInfo(), which shows it is still valid.

This means that for (at least) targets of type EXE, the CMake code in Modules/Platform/Linux.cmake:

FOREACH(type SHARED_LIBRARY SHARED_MODULE EXE)
  SET(CMAKE_${type}_LINK_STATIC_C_FLAGS "-Wl,-Bstatic")
  SET(CMAKE_${type}_LINK_DYNAMIC_C_FLAGS "-Wl,-Bdynamic")
ENDFOREACH(type)

results in the above static-then-dynamic decoration. The only way to get these variables changed is to use a different Platform file in the first place. Alternatively, the application's CMake code can decorate each target with the LINK_SEARCH_END_STATIC property (see http://www.cmake.org/cmake/help/v2.8.8/cmake.html#prop_tgt%3aLINK_SEARCH_END_STATIC) which is annoying when the application also has to build on non-Cray systems. Or move the static library out of the system folders, so the "optimization" is not triggered.
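For illustration, the per-target alternative mentioned above might look roughly like the sketch below. It assumes a project-level switch (GMX_ON_CRAY_WRAPPER) and a helper function (gmx_fix_link_search), both of which are hypothetical names invented here and not existing GROMACS or CMake features. LINK_SEARCH_START_STATIC and LINK_SEARCH_END_STATIC are real CMake target properties; whether setting them is sufficient on a given Cray system is untested.

```cmake
# Sketch only. GMX_ON_CRAY_WRAPPER is a hypothetical option, not an existing
# GROMACS setting; the user would enable it by hand on Cray systems.
option(GMX_ON_CRAY_WRAPPER "Building with the Cray wrapper compilers" OFF)

# Hypothetical helper: call once per executable target, after add_executable().
function(gmx_fix_link_search target)
  if(GMX_ON_CRAY_WRAPPER)
    # Tell CMake to assume the link line starts in static mode and should be
    # left in static mode, so a static library found in a system directory
    # should not need -Wl,-Bstatic before it or -Wl,-Bdynamic after it.
    set_target_properties(${target} PROPERTIES
      LINK_SEARCH_START_STATIC ON
      LINK_SEARCH_END_STATIC ON)
  endif()
endfunction()

# Usage, assuming a target named mdrun has already been defined:
#   gmx_fix_link_search(mdrun)
```

Guarding the properties behind one switch keeps non-Cray builds unchanged, which is the annoyance noted above; every target still has to be passed through the helper.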

Using export CRAYPE_LINK_TYPE=dynamic works for linking, but means you
  • can't achieve a fully static build even if you wanted to, and
  • don't necessarily find OpenMP or C++ standard libraries in /usr/lib64 at run time...

Justin, this is the kind of thing Cray could usefully help fix. There are lots of platform files distributed by CMake, but none cater to using the Cray wrapper compilers on Linux. As you can see from the above, this problem is not specific to Gromacs and is caused by reasonable Linux assumptions that seem to be somehow violated by the way the Cray wrapper compiler works. Supplying a platform file intended for use with the Cray wrapper compiler would solve this issue (and perhaps others).

#7 Updated by Justin Cook over 2 years ago

OK. It's taken me a little while to process this but I agree on what is needed and I'd like to add one item.

  1. A platform file that knows how to use the Cray PE.
  2. A platform file that knows how to use the Cray PE when building GROMACS.

We're working internally on the first platform file and its distribution but I'm trying to work with you to get a Cray platform file included with GROMACS for build issues that we come across.

The issue you describe above with the inclusion of the dynamic flags is something I'm working on internally with the Cray PE platform file, but I'm always willing to take suggestions based on other problems.

#8 Updated by Mark Abraham over 2 years ago

Mark Abraham wrote:

One aspect of these problems is manifest below:
[...]

Some function's use of some time function looks in librt.a and the effect of the final -Wl,-Bdynamic is to prevent Cray's wrapper compiler sorting out the issue. Removing -Wl,-Bstatic and -Wl,-Bdynamic solves the problem.

When CMake is used to find a library (zlib in this case), and even when you have found a static version (/usr/lib64/libz.a is in the CMakeCache.txt), the standard CMake behaviour is to decorate it with explicit linker flags so that any platform-specific searching that might be triggered from the use of -lz can work (e.g. /usr/lib/libz.a vs /usr/lib64/libz.a in the general case; see http://www.cmake.org/Wiki/CMake_2.6_Notes#Linking_to_System_Libraries and http://www.cmake.org/pipermail/cmake/2011-June/045222.html). This seems like a self-double-cross by CMake, which searched for a full path to the library in the first place... I can't find more recent CMake docs for this, but inspecting Source/cmComputeLinkInformation.cxx in the cmake repo finds the function ComputeLinkTypeInfo(), which shows it is still valid.

http://public.kitware.com/pipermail/cmake/2015-January/059699.html has some relevant discussion, and http://public.kitware.com/pipermail/cmake/2015-January/059702.html links to actual current docs.

However, setting all of the found libraries to IMPORTED targets is not a scalable way to work around standard-Linux defaults in order to implement fully-static linking :-)
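To make the scaling concern concrete, here is a rough sketch of what wrapping a single found library in an IMPORTED target could look like. The target name gmx_zlib is made up for illustration, and the path is the one reported in CMakeCache.txt earlier in this thread; a real build would take it from the find result instead of hard-coding it.

```cmake
# Sketch only: wrap an already-located static zlib in an IMPORTED target so
# that CMake links it by full path instead of as a decorated -lz.
# "gmx_zlib" is a hypothetical target name.
add_library(gmx_zlib STATIC IMPORTED)
set_target_properties(gmx_zlib PROPERTIES
  IMPORTED_LOCATION "/usr/lib64/libz.a")

# Assumes an executable target named mdrun has already been defined.
target_link_libraries(mdrun gmx_zlib)
```

Every external dependency (zlib, FFTW, librt, and so on) would need its own block like this, which is why it does not scale as a general fix.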

#9 Updated by Mark Abraham over 2 years ago

Justin Cook wrote:

OK. It's taken me a little while to process this but I agree on what is needed and I'd like to add one item.

  1. A platform file that knows how to use the Cray PE.
  2. A platform file that knows how to use the Cray PE when building GROMACS.

We're working internally on the first platform file and its distribution but I'm trying to work with you to get a Cray platform file included with GROMACS for build issues that we come across.

Great. Our ideal solution would be for Cray to contribute the platform file to CMake so that it eventually becomes standard, but in the meantime we would be likely to bundle it, so that people installing GROMACS on Cray can be instructed to use it, and it will Just Work on whatever range of things are supported. Building HPC apps is quite hard enough...

The issue you describe above with the inclusion of the dynamic flags is something I'm working on internally with the Cray PE platform file, but I'm always willing to take suggestions based on other problems.

I don't have anything else to suggest right now, other than modifying/making configurable the kind of behaviour produced by my CMake fragment from comment 6. Creating a Modules/Platform/Cray.cmake that responds to CRAYPE_LINK_TYPE at initial configure time would be one approach. By design, CMake treats the compiler as a constant, and IMO that makes sense for linker behaviour, too.
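As a rough sketch of that approach, a platform file could read CRAYPE_LINK_TYPE once and fix the decoration flags accordingly. The file name Platform/CrayWrapper.cmake and the cache variable CRAY_LINK_TYPE are assumptions made for this example; neither is something CMake or Cray is known to ship. It also assumes, as reported in this thread, that the wrapper links statically unless CRAYPE_LINK_TYPE=dynamic is set.

```cmake
# Hypothetical Platform/CrayWrapper.cmake (name invented for illustration).
# Read CRAYPE_LINK_TYPE only at the initial configure and cache the decision,
# so that later re-runs treat the link mode as a constant.
if(NOT DEFINED CRAY_LINK_TYPE)
  if("$ENV{CRAYPE_LINK_TYPE}" STREQUAL "dynamic")
    set(CRAY_LINK_TYPE "dynamic" CACHE STRING "Link mode taken from CRAYPE_LINK_TYPE")
  else()
    # Assumption: the wrapper compiler defaults to static linking.
    set(CRAY_LINK_TYPE "static" CACHE STRING "Link mode taken from CRAYPE_LINK_TYPE")
  endif()
endif()

foreach(type SHARED_LIBRARY SHARED_MODULE EXE)
  if(CRAY_LINK_TYPE STREQUAL "dynamic")
    # Same decoration as Modules/Platform/Linux.cmake.
    set(CMAKE_${type}_LINK_STATIC_C_FLAGS "-Wl,-Bstatic")
    set(CMAKE_${type}_LINK_DYNAMIC_C_FLAGS "-Wl,-Bdynamic")
  else()
    # Static default: emit no mode switches and leave it to the wrapper.
    set(CMAKE_${type}_LINK_STATIC_C_FLAGS "")
    set(CMAKE_${type}_LINK_DYNAMIC_C_FLAGS "")
  endif()
endforeach()
```

Only the C variables are shown; a complete file would presumably need the same treatment for CXX and Fortran.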

One thing to keep in mind is linking with CUDA libraries. Once upon a time, static linking wasn't possible but I believe that has changed. Checking the behaviour of FindCUDA.cmake with Cray PE and its various base compilers should be in the testing plan.

#10 Updated by Roland Schulz over 2 years ago

I think conceptually the problem isn't in the contents of *_LINK_DYNAMIC_C_FLAGS or cmake's use of that variable. It is correct that changing those fixes the problem but it would be somewhat of a hack because it doesn't resolve the underlying problem. That problem is that cmake incorrectly assumes that the compiler defaults to building dynamically-linked executables/libraries. And thus it adds the _LINK_STATIC_C_FLAGS (because under that assumption it is the correct thing to do). A further problem is that cmake doesn't seem to have built-in support for building static executables. add_library has a STATIC option but add_executable doesn't. Also there is no equivalent for executables to the BUILD_SHARED_LIBS variable. That seems like an obvious missing feature in cmake and might be something which should be filed as a feature request in their bug tracker.

If it would be possible to specify STATIC in add_executable I would suggest a proper solution to be:
- Set CMAKE_SHARED_LINKER_FLAGS correctly in the platform file (it needs to be "-shared" for Cray because shared isn't default). This should make building shared binaries work correctly on Cray.
- If the user wants static binaries they would set BUILD_SHARED_EXE=no (either directly or indirectly through a Gromacs option or even by default through a Cray platform file). That would mean that cmake would use the correct linker options when building a static binary.
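Since add_executable has no STATIC option, a project-level approximation of the second point might look like the sketch below. BUILD_SHARED_EXE is the option name used in this thread; the body is an illustrative assumption about what such an option could do, not a description of any particular GROMACS revision. The user would still be expected to pass -static through CFLAGS/CXXFLAGS and to provide static libraries for all dependencies.

```cmake
# Sketch only; place after project(), because the platform file sets the
# CMAKE_EXE_LINK_*_FLAGS variables when languages are enabled.
option(BUILD_SHARED_EXE "Build dynamically linked executables" ON)

if(NOT BUILD_SHARED_EXE)
  # Make find_library() look only for static archives.
  set(CMAKE_FIND_LIBRARY_SUFFIXES "${CMAKE_STATIC_LIBRARY_SUFFIX}")

  # Stop CMake switching executables back to dynamic mode in the middle or
  # at the end of the link line.
  set(CMAKE_EXE_LINK_DYNAMIC_C_FLAGS "")
  set(CMAKE_EXE_LINK_DYNAMIC_CXX_FLAGS "")

  # Static executables need static project libraries as well.
  set(BUILD_SHARED_LIBS OFF)
endif()
```

A Cray platform or toolchain file could preset BUILD_SHARED_EXE to OFF, which would cover the "by default" case described in the list above.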

#11 Updated by Justin Cook over 2 years ago

Is there anything I can distribute in a toolchain file that will fix this problem? It appears like we may need to define new platforms instead here.

#12 Updated by Mark Abraham over 2 years ago

Roland Schulz wrote:

I think conceptually the problem isn't in the contents of *_LINK_DYNAMIC_C_FLAGS or cmake's use of that variable. It is correct that changing those fixes the problem but it would be somewhat of a hack because it doesn't resolve the underlying problem. That problem is that cmake incorrectly assumes that the compiler defaults to building executables/libraries.

Did you miss a word here? "dynamically-linked?"

And thus it adds the _LINK_STATIC_C_FLAGS (because under that assumption it is the correct thing to do). A further problem is that cmake doesn't seem to have built-in support for building static executables. add_library has a STATIC option but add_executable doesn't. Also there is no equivalent for executables to the BUILD_SHARED_LIBS variable. That seems like an obvious missing feature in cmake and might be something which should be filed as a feature request in their bug tracker.

Yes, that seems reasonable. However, I understand compilers for embedded processors frequently only do static linking, and I understand that the solution there is to deploy a custom platform file.

If it would be possible to specify STATIC in add_executable I would suggest a proper solution to be:
- Set CMAKE_SHARED_LINKER_FLAGS correctly in the platform file (it needs to be "-shared" for Cray because shared isn't default). This should make building shared binaries work correctly on Cray.
- If the user wants static binaries they would set BUILD_SHARED_EXE=no (either directly or indirectly through a Gromacs option or even by default through a Cray platform file). That would mean that cmake would use the correct linker options when building a static binary.

Yes, if CMake would support that, this would be a good solution. But I suspect that's an uphill battle.

#13 Updated by Mark Abraham over 2 years ago

Justin Cook wrote:

Is there anything I can distribute in a toolchain file that will fix this problem? It appears like we may need to define new platforms instead here.

I think platform files are necessary. Those should be submitted to CMake for inclusion - we'd likely bundle them too, and instruct people to use the matching toolchain files to get them called.
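A minimal sketch of how a matching toolchain file could get a bundled platform file called is shown below. The system name CrayWrapper, the toolchain file name, and the directory layout are all made up for illustration. The mechanism relied on is that CMake loads Platform/<CMAKE_SYSTEM_NAME>.cmake and that CMAKE_MODULE_PATH is searched for it.

```cmake
# cray-toolchain.cmake (hypothetical file name).

# CMAKE_SYSTEM_NAME selects which Platform/<name>.cmake file CMake loads.
# "CrayWrapper" is an invented name, not a platform known to CMake.
set(CMAKE_SYSTEM_NAME CrayWrapper)

# Use the Cray wrapper compilers (cc for C, CC for C++), found via PATH.
set(CMAKE_C_COMPILER cc)
set(CMAKE_CXX_COMPILER CC)

# Assumed layout: Platform/CrayWrapper.cmake sits next to this file, so
# adding this directory to the module path lets CMake find it.
list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_LIST_DIR}")
```

It would be selected at the initial configure with -DCMAKE_TOOLCHAIN_FILE=/path/to/cray-toolchain.cmake. Note that setting CMAKE_SYSTEM_NAME in a toolchain file puts CMake into cross-compiling mode, which may or may not be wanted on a given system.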

#14 Updated by Roland Schulz over 2 years ago

Mark Abraham wrote:

Roland Schulz wrote:

I think conceptually the problem isn't in the contents of *_LINK_DYNAMIC_C_FLAGS or cmake's use of that variable. It is correct that changing those fixes the problem but it would be somewhat of a hack because it doesn't resolve the underlying problem. That problem is that cmake incorrectly assumes that the compiler defaults to building executables/libraries.

Did you miss a word here? "dynamically-linked?"

yes. Sorry. Fixed.

And thus it adds the _LINK_STATIC_C_FLAGS (because under that assumption it is the correct thing to do). A further problem is that cmake doesn't seem to have built-in support for building static executables. add_library has a STATIC option but add_executable doesn't. Also there is no equivalent for executables to the BUILD_SHARED_LIBS variable. That seems like an obvious missing feature in cmake and might be something which should be filed as a feature request in their bug tracker.

Yes, that seems reasonable. However, I understand compilers for embedded processors frequently only do static linking, and I understand that the solution there is to deploy a custom platform file.

But on Cray it is possible to build both dynamic and static. Of course we could make two sets of platform files. But we would still need to set a variable called _LINK_DYNAMIC_C_FLAGS to flags for static. As I said, it would work, but doesn't seem clean.

If it would be possible to specify STATIC in add_executable I would suggest a proper solution to be:
- Set CMAKE_SHARED_LINKER_FLAGS correctly in the platform file (it needs to be "-shared" for Cray because shared isn't default). This should make building shared binaries work correctly on Cray.
- If the user wants static binaries they would set BUILD_SHARED_EXE=no (either directly or indirectly through a Gromacs option or even by default through a Cray platform file). That would mean that cmake would use the correct linker options when building a static binary.

Yes, if CMake would support that, this would be a good solution. But I suspect that's an uphill battle.

I suspect that if someone is willing to make a cmake patch that that's a kind of contribution they would be happy about. Without a patch it is probably something the cmake developers will take a long time to do. We might still need a work-around for older cmake versions.

#15 Updated by Justin Cook over 2 years ago

Cray now supplies a cmake module available for use that has a work-around for the rdynamic issue and should be useable. I will be adding the platform files (as a long term solution) over the next few months.

But perhaps we should split this bug up into two? One for the Intel issue and one for inclusion of a Cray platform file in GROMACS? One specific issue would be to make sure GROMACS finds Cray's FFTW by default rather than trying to build its own.

I'm also looking into other ways for you to be able to better report issues with CMake on Cray systems as well.

#16 Updated by Szilárd Páll over 2 years ago

Justin Cook wrote:

But perhaps we should split this bug up into two? One for the Intel issue and one for inclusion of a Cray platform file in GROMACS? One specific issue would be to make sure GROMACS finds Cray's FFTW by default rather than trying to build its own.

Related to that, but off-topic, (so we should follow up on #1696): It would be good to re-evaluate whether our FFTW detection warning, which claims that one should avoid using AVX-enabled FFTW, applies to Cray's FFTW derivative.
We are also in the process of reconsidering the suggestion with newer FFTW versions. The latest FFTW increased the auto-tuning run-time, which will hopefully make it more reliable at picking the right kernels; on Haswell even the pure AVX build seemed faster for the cases I tried. I'm not sure to what extent conclusions drawn from the vanilla FFTW performance behavior will apply to Cray's modified version. Could we get some general info on Cray's FFTW modifications and how these affect our use-case?

#17 Updated by Justin Cook over 2 years ago

Let me check what feedback I can get from the Cray FFTW builders. I'm going to likely be the go-between for a few items anyways. If you have other questions please don't hesitate to ask in bugz or email.

#18 Updated by Gerrit Code Review Bot over 2 years ago

Gerrit received a related patchset '1' for Issue #1641.
Uploader: Roland Schulz
Change-Id: If7b8192b44c33c861f126e3422df04388d2f2be5
Gerrit URL: https://gerrit.gromacs.org/4496

#19 Updated by Justin Cook over 2 years ago

I have developer feedback on FFTW for Szilárd Páll. Can you please open up a new issue so I can point them and their feedback there?

#20 Updated by Mark Abraham over 2 years ago

Justin Cook wrote:

I have developer feedback on FFTW for Szilárd Páll. Can you please open up a new issue so I can point them and their feedback there?

Thanks Justin - feedback at #1696 would be welcome.

#21 Updated by Mark Abraham over 2 years ago

Justin Cook wrote:

Cray now supplies a cmake module available for use that has a work-around for the rdynamic issue and should be useable. I will be adding the platform files (as a long term solution) over the next few months.

Great. What should we ask Cray-site sysadmins to do to get access to the work-around version of CMake?

You may also be interested in ideas from https://github.com/chuckatkins/Cray-CMake-Modules

But perhaps we should split this bug up into two? One for the Intel issue and one for inclusion of a Cray platform file in GROMACS?

I think the root cause of the Intel-compiler issue is handled well by my patch at http://public.kitware.com/pipermail/cmake/2014-November/059154.html, but so far I have not generated much interest.

One specific issue would be to make sure GROMACS finds Cray's FFTW by default rather than trying to build its own.

Happy to do that, assuming it proves to be fastest.

The Cray FFTW people might be interested in https://github.com/eriklindahl/fftw3 from one of the founding GROMACS developers, which may enhance FFTW performance (though I haven't seen any performance numbers yet). I understand the FFTW developers plan to incorporate modifications along these lines into future releases.

I'm also looking into other ways for you to be able to better report issues with CMake on Cray systems as well.

Great

#22 Updated by Justin Cook over 2 years ago

Mark Abraham wrote:

Justin Cook wrote:

Cray now supplies a cmake module available for use that has a work-around for the rdynamic issue and should be useable. I will be adding the platform files (as a long term solution) over the next few months.

Great. What should we ask Cray-site sysadmins to do to get access to the work-around version of CMake?

It turns out this is an internal only release currently. I'm going to have to provide toolchain files for the time being.
