Feature #3395

Consider scripted composition of Dockerfiles

Added by Eric Irrgang 8 months ago. Updated 7 months ago.

Status: Feedback wanted
Priority: Normal
Assignee: -
Category: testing
Target version: -
Difficulty: uncategorized

Description

Dockerfiles tend to support chained image builds well, but forked recipes start to develop a maintenance burden as scriptlets get duplicated. Docker images composed from several recipes are kludgey, at best.

As the Dockerfiles for the GitLab Runner images gain more parameters (compiler toolchain, MPI flavor, Python version), we should consider using a tool to script the composition of our Dockerfiles rather than manually maintaining flat files.

Prominent solutions

  • Spack container generation: https://spack.readthedocs.io/en/latest/containers.html
  • NVIDIA HPC Container Maker (HPCCM): https://github.com/NVIDIA/hpc-container-maker

Related issues

Related to GROMACS - Bug #3263: Change dockerfile organization to have all builds in one file (Closed)
Related to GROMACS - Feature #3177: Spack package management support (New)

Associated revisions

Revision 3ba43a73 (diff)
Added by Eric Irrgang 8 months ago

Manage the list of image tags in Docker build script.

Supports flexibility in future handling of built images.

Refs #3395

Change-Id: I207911f190a57cbc4671777b5b92827aa6d3904a

Revision 386b2947 (diff)
Added by Eric Irrgang 8 months ago

Manage the list of image tags in Docker build script.

Supports flexibility in future handling of built images.

Refs #3395

Change-Id: I207911f190a57cbc4671777b5b92827aa6d3904a

Revision f8517ace (diff)
Added by Eric Irrgang 8 months ago

Free up Docker image tag namespace.

Allow tags to distinguish Docker images for different release
requirements. Free up the image tag namespace for granularly describing
image revisions (and create smaller image repositories) by using the
matrix slug as the image name. Tag images to indicate the supported
GROMACS branch.

Refs #3395

Change-Id: Idff95d031cd56a28545562862e78cd99902db620

Revision 7bd5c7f7 (diff)
Added by Eric Irrgang 7 months ago

Distinguish Docker images for different release requirements.

Free up the image tag namespace for granularly describing image
revisions (and create smaller image repositories) by using the matrix
slug as the image name. Tag images with the major release number (2020).

Refs #3395

Change-Id: I917f511c8bc11c4cedf85943eb7388ce0d5f9740

Revision 4f4fa0ae (diff)
Added by Eric Irrgang 7 months ago

Build MPI in Dockerfile.

We should make sure that the MPI compiler wrapper we
set up corresponds to the toolchain we expect to use
in a Docker image.

Refs #3395

Change-Id: I89e2af539199918153560483552b5245c6947acf

Revision 3ea23bfb (diff)
Added by Eric Irrgang 7 months ago

Define an argparse parent parser for container image tools.

Refs #3395

Change-Id: I8bcb314b21732cb870743f44de6aadc4d6bf534d
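
(Aside: an argparse "parent parser" factors options shared by several command-line tools into one parser that each tool inherits. A minimal sketch, with illustrative option names rather than this revision's actual interface:)

    import argparse

    # Common options defined once; add_help=False lets each child add its own -h.
    parent = argparse.ArgumentParser(add_help=False)
    parent.add_argument('--registry', default='gromacs',
                        help='image registry namespace (illustrative)')
    parent.add_argument('--dry-run', action='store_true',
                        help='print actions without executing them')

    # Each container image tool builds on the shared options.
    build = argparse.ArgumentParser(parents=[parent], description='build images')
    push = argparse.ArgumentParser(parents=[parent], description='push images')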

Revision a2142ff6 (diff)
Added by Paul Bauer 7 months ago

Add script to generate CI Docker files

Refs #3395

Change-Id: Id001c87b52f4d1f7afcb7fbb38cd4b5efdbfd90e

Revision d26f554a (diff)
Added by Paul Bauer 7 months ago

Add script to generate CI Docker files

Refs #3395

Change-Id: Id001c87b52f4d1f7afcb7fbb38cd4b5efdbfd90e

History

#1 Updated by Mark Abraham 8 months ago

I strongly support something to build our Dockerfiles so that we repeat ourselves as little as possible. But we should also strive for simplicity in that tool. It's a means to an end that supports our users, but it would be easy to over-engineer :-)

#2 Updated by Eric Irrgang 8 months ago

  • Description updated (diff)

#3 Updated by Eric Irrgang 8 months ago

Mark Abraham wrote:

But we should also strive for simplicity in that tool.

A Spack-based solution would look like a generalization of https://github.com/spack/spack/blob/develop/var/spack/repos/builtin/packages/gromacs/package.py injected into https://spack.readthedocs.io/en/latest/workflows.html#using-spack-to-create-docker-images

An HPC-container-maker based solution would look like a generalization of https://github.com/NVIDIA/hpc-container-maker/blob/master/recipes/gromacs/gromacs.py
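
To make the comparison concrete, a parametrized HPCCM recipe might look like the following sketch (versions, option names, and the choice of building blocks are illustrative assumptions, not a proposed interface):

    """Hypothetical parametrized HPCCM recipe (requires the hpccm package).

    Illustrative usage: python3 recipe.py --gcc 8 --mpi openmpi > Dockerfile
    """
    import argparse

    import hpccm
    import hpccm.config
    from hpccm.building_blocks import cmake, gnu, openmpi
    from hpccm.primitives import baseimage

    parser = argparse.ArgumentParser(description='Generate a CI Dockerfile.')
    parser.add_argument('--gcc', default='8', help='GCC major version')
    parser.add_argument('--mpi', choices=['openmpi', 'none'], default='none')
    args = parser.parse_args()

    # Emit Dockerfile syntax (HPCCM can also emit Singularity definitions).
    hpccm.config.g_ctype = hpccm.container_type.DOCKER

    stage = hpccm.Stage()
    stage += baseimage(image='ubuntu:18.04')
    stage += gnu(version=args.gcc)           # compiler toolchain
    stage += cmake(eula=True)                # build tool (accepts CMake license)
    if args.mpi == 'openmpi':
        stage += openmpi(version='4.0.2')    # MPI flavor
    print(stage)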

Please comment on perceptions of simplicity.

#4 Updated by Erik Lindahl 8 months ago

Not directly related to those scripts, but my main concern for simplicity is the entire CI framework. Our main reason for moving to GitLab is that we have been a bit too happy to implement quite customised solutions, but we have not been as good at writing good documentation for those setups or at sharing the load of maintaining them (not to mention that we are critically dependent on specific hosts being available).

A simple script might be a neat idea, but then we need something that is 100% vanilla, well-documented, and easy to install anywhere - not e.g. a special fork from one user's GitHub repo that might or might not be maintained :-)

Second, we don't want the CI hardware or maintenance needs to keep expanding: the CI stage should finish really fast, and we also want the entire pipeline to run (in reasonable time) on GitLab's public servers in case our own are down. This means we'll have to accept that the CI covers only a VERY small fraction of the most common combinations - it's not a way to automatically test "everything".

Finally, while it might be possible to test some things e.g. nightly or weekly, that comes with other requirements. First, we need good documentation and explanations of what is run when and where, how new tests should be classified, and, not least, how we should monitor failures and who is responsible for debugging a failed test (not Paul). If people want such tests, they will likely also have to sacrifice a bit of their time to start debugging failures.

#5 Updated by Eric Irrgang 8 months ago

Erik summarizes some concerns that are currently under discussion in several other issues, linked below for reader convenience.

writing good documentation for those setups, or sharing the load of maintaining them (not to mention we're critically dependent on specific hosts being available).

Ref #3272 and #3275

we need something that is 100% vanilla

Yes, we should choose a maintained and documented tool that fits our need as closely as possible to minimize the number of unique lines while maximizing their readability and maintainability.

easy to install anywhere

To clarify: it should be easy to write, build, and run tests on supported environments, but the CI infrastructure has a more limited deployment. Non-automated components like Docker image rebuilds need to be easy for core developers to use, but it doesn't seem necessary to target the same universality as a regular GROMACS build.

- not e.g. a special fork from one user's github repo that might or might not be maintained :-)

Spack and NVIDIA HPC Container Maker both seem to be well-maintained and well-documented projects. If we can express our container recipes in more concise, readable, and maintainable ways with one of those tools than with static Dockerfiles, I think there is a big potential win.

Also reference #3263

This means we'll have to accept that the CI only covers a VERY small fraction of the most common combinations - it's not a way to automatically test "everything".

I think this comment may be more relevant in the context of #2756 and #3272

... might be possible to test some things e.g. nightly or weekly...

Yes. Ref #2576 and #3275

#6 Updated by Eric Irrgang 8 months ago

  • Related to Bug #3263: Change dockerfile organization to have all builds in one file added

#7 Updated by Eric Irrgang 8 months ago

  • Related to Feature #3177: Spack package management support added

#8 Updated by Eric Irrgang 8 months ago

There are various conversations on Gerrit containing thoughts or suggestions about what images we need. Can we try to list here the parameters that we want to permute for separate images?

We can also note whether some software should either have multiple installations in a single image or be completely left out of some images.

#9 Updated by Eric Irrgang 8 months ago

On a related note, I can think of at least two reasons why we might want to move some of the image feature slug into the Docker repository name rather than the image tag.
  • It would be nice to reserve the tag for versioning of images with the same slug in case we need to distinguish otherwise equivalently named images for branches or while testing new recipes.
  • The images are getting larger and share very few layers, resulting in unusually large repositories.
I don't know whether the latter causes problems on Docker Hub or for local caches, but we should probably strive not to be an edge case. I propose we put the slug of toolchain software identifiers in the image name instead of the tag (see the sketch after this list). Caveats:
  • Creating an image repository on Docker Hub is a separate permission from updating/tagging, which may affect automation goals when new permutations are added.
  • Each image will need to be pushed separately, whereas currently all tags can be pushed with a single image name. This is not necessarily a bad thing or good thing, but it is a thing.
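
A sketch of the current scheme versus the proposal (all repository and slug names here are made up for illustration):

    # Illustrative only: neither function reflects the actual repositories.
    def current_ref(namespace: str, repo: str, slug: str) -> str:
        """Single repository; the slug consumes the tag namespace."""
        return f'{namespace}/{repo}:{slug}'

    def proposed_ref(namespace: str, slug: str, version: str) -> str:
        """Slug in the repository name; the tag is reserved for versioning."""
        return f'{namespace}/ci-{slug}:{version}'

    print(current_ref('gromacs', 'continuous-integration', 'gcc-7-openmpi'))
    # -> gromacs/continuous-integration:gcc-7-openmpi
    print(proposed_ref('gromacs', 'gcc-7-openmpi', '2020'))
    # -> gromacs/ci-gcc-7-openmpi:2020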

#10 Updated by Mark Abraham 8 months ago

I agree that the tag makes sense for expressing the version. My original choice was based upon ignorance of common practice.

#11 Updated by Eric Irrgang 7 months ago

Mark Abraham wrote:

I agree that the tag makes sense for expressing the version. My original choice was based upon ignorance of common practice.

I don't know if there is enough consistency to define common practice here. :-]

The images are getting larger and share very few layers, resulting in unusually large repositories.

Nvidia's example (https://hub.docker.com/r/nvidia/cuda/tags) seems to indicate that a huge number of very dissimilar images can be effectively managed by tag in a single repository.

It would be nice to reserve the tag for versioning of images with the same slug in case we need to distinguish otherwise equivalently named images for branches or while testing new recipes.

If the slug is sufficiently detailed to distinguish versions, we don't need any extra namespace.

I would reconsider my advocacy of image-name-based slugs if it translated to substantially more local storage or overall download/bootstrap time. Inspection of image hashes in `docker history` output indicates that it shouldn't. But image rebuild time may be affected (at least without effective `--cache-from` hinting). I don't know if there is more than an aesthetic argument to be made for managing dozens of image repositories versus dozens of tags, though (as noted previously) there are access-control implications for docker pushes.
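
For reference, `--cache-from` hinting on a rebuild might look like this sketch (the image name is illustrative; `docker pull` and `docker build --cache-from` are standard Docker CLI):

    # Sketch of --cache-from hinting during an image rebuild.
    import subprocess

    image = 'gromacs/ci-gcc-7-openmpi:2020'  # illustrative name

    # Seed the local layer cache from the registry, if the image exists there.
    subprocess.run(['docker', 'pull', image], check=False)

    # Allow the builder to reuse layers from the pulled image.
    subprocess.run(
        ['docker', 'build', '--cache-from', image, '--tag', image, '.'],
        check=True)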

#12 Updated by Eric Irrgang 7 months ago

  • Description updated (diff)

Spack update

There are at least two ways to incorporate Spack into containerized environment preparation. I just added a link to https://spack.readthedocs.io/en/latest/containers.html in the description.

Spack seems like it could lend itself to a somewhat different approach than the HPCCM approach. The matrix of software configurations can be expressed in YAML (in a way that may be reusable by other tools), but the base OS image is expressed separately.

The most obvious way to use this would be to build the entire matrix of software in a single image for each OS distribution, and use environment activation to provide Spack-based isolation of tools within CI jobs. This could be helpful for debugging, because a developer could try the different configurations easily in the same container.

Alternatively, it may simply be helpful to define the matrix in a single YAML file, possibly use Spack to test the dependencies (perhaps building a complete master image), and generate Dockerfiles for the limited environment parameter permutations. I don't yet have enough of a handle on the steps of Spack specification, concretization, and build, or on the "environment" facilities, to comment further on such specifics at the moment.
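
As a concrete sketch of that second route (the spec and base images are illustrative assumptions), a single environment file with a container section can be rendered to a Dockerfile with `spack containerize`:

    # Sketch: render a Dockerfile from a Spack environment description.
    import pathlib
    import subprocess

    SPACK_YAML = """\
    spack:
      specs:
        - gromacs@2020 +mpi ^openmpi
      container:
        format: docker
        images:
          os: ubuntu:18.04
          spack: develop
    """

    env_dir = pathlib.Path('ci-env')
    env_dir.mkdir(exist_ok=True)
    (env_dir / 'spack.yaml').write_text(SPACK_YAML)

    # 'spack containerize' reads spack.yaml in the working directory and
    # prints the corresponding Dockerfile on stdout.
    dockerfile = subprocess.check_output(
        ['spack', 'containerize'], cwd=env_dir, text=True)
    (env_dir / 'Dockerfile').write_text(dockerfile)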

Note that it is probably substantially more straightforward to use Spack to build GROMACS if we are using Spack to prepare the environment, but doing so may obfuscate the CMake calls in the CI infrastructure (at least, in a different way than they are currently obfuscated in the gitlab-ci YAML). If we don't want to use Spack to build GROMACS in the GitLab Runner, then we need to extract the relevant tool paths/locations from the Spack environment to construct a CMake command line, roughly as sketched below.
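
A rough sketch of such extraction (the specs, paths, and CMake options are assumptions for illustration):

    # Locate Spack-installed tools and assemble a CMake command line.
    import subprocess

    def prefix(spec: str) -> str:
        """Install prefix of an installed spec, via 'spack location -i'."""
        return subprocess.check_output(
            ['spack', 'location', '-i', spec], text=True).strip()

    mpi = prefix('openmpi')
    subprocess.run(
        ['cmake', 'path/to/gromacs-source',  # source path is illustrative
         f'-DCMAKE_C_COMPILER={mpi}/bin/mpicc',
         f'-DCMAKE_CXX_COMPILER={mpi}/bin/mpicxx',
         '-DGMX_MPI=ON'],
        check=True)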

Next steps:

  • understand how best to perform a non-Spack-driven GROMACS build in terms of the Spack environment toolchain.
  • understand how to extract a Dockerfile (or new environment) for a single matrix element from an environment with a complete matrix.
