Feature #2128

add means to verify at build/run-time that source tarball is not tainted

Added by Szilárd Páll over 2 years ago. Updated over 1 year ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
build system
Target version:
-
Difficulty:
simple

Description

Builds from a dev tree identify the exact source version and record it in their version string (if possible). However, checking the integrity of a source release at either compile time or run time is hard, although it is quite relevant. Use cases where detecting changes would be useful:
- The user intentionally applies changes to the code (e.g. patch-based software distribution like PLUMED). The resulting binaries will, however, report the vanilla version despite the tainted source (unless explicit efforts are made to change the version string too);
- A user downloads the tarball, tries out a modification leaving the vanilla-looking tree behind. Later this tree can accidentally end up used to build and install a .

One solution could be to generate a checksum and distribute it with the source release. It could be regenerated at build time (excluding the checksum file itself), which would allow compile-time warnings to be issued and the result of the verification to be encoded in the binary. In both of the above use cases, the -version and log outputs could then report e.g. 2016.2-tainted if the checksum does not match.
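A minimal sketch of this idea, written in Python purely for illustration (an actual implementation would presumably live in the build system). The file name `SOURCE_CHECKSUM`, the function names, and the choice of SHA-256 are assumptions and not part of the proposal. A missing checksum file is treated like a mismatch here, which is also only an assumption.

```python
import hashlib
from pathlib import Path

# Hypothetical name of the checksum file shipped inside the tarball.
CHECKSUM_FILE = "SOURCE_CHECKSUM"


def tree_digest(root: Path) -> str:
    """Hash the relative paths and contents of all files under root."""
    h = hashlib.sha256()
    # Sort by relative path so the digest does not depend on directory order.
    rel_paths = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )
    for rel in rel_paths:
        if rel == CHECKSUM_FILE:
            continue  # the checksum file cannot cover itself
        h.update(rel.encode("utf-8"))
        h.update(b"\0")
        h.update((root / rel).read_bytes())
    return h.hexdigest()


def version_string(root: Path, base_version: str) -> str:
    """Return base_version, with '-tainted' appended if verification fails."""
    checksum_path = root / CHECKSUM_FILE
    if not checksum_path.is_file():
        # Assumption: no checksum file is handled like a mismatch.
        return base_version + "-tainted"
    expected = checksum_path.read_text().strip()
    if tree_digest(root) == expected:
        return base_version
    return base_version + "-tainted"
```

At release time one would write the output of `tree_digest(root)` into the checksum file. At build time `version_string(root, "2016.2")` then yields either the plain version or the `-tainted` variant.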

History

#1 Updated by Szilárd Páll over 2 years ago

If we want to cater a little for those who knowingly want to run GROMACS built from modded release sources, we could allow them to pass a version tag at CMake time, which could also disable a runtime warning.
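The policy could look roughly like the following Python sketch. The function name, its parameters, and the rule that a user-supplied tag always suppresses the warning are assumptions made for illustration. The thread does not specify how the tag would be passed or formatted.

```python
from typing import Optional, Tuple


def describe_build(
    base_version: str,
    source_verified: bool,
    custom_tag: Optional[str] = None,
) -> Tuple[str, bool]:
    """Return (version string, whether to warn at run time).

    Assumed policy: a user-supplied tag marks the build as knowingly
    modified, so it is appended to the version and no warning is issued.
    """
    if custom_tag:
        return f"{base_version}-{custom_tag}", False
    if source_verified:
        return base_version, False
    # Unverified source and no tag: flag the build and warn.
    return f"{base_version}-tainted", True
```

For example, `describe_build("2016.1", False, "plumed_2.3b")` returns `("2016.1-plumed_2.3b", False)`, whereas the same call without a tag returns `("2016.1-tainted", True)`.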

I've also checked a PLUMED install and unfortunately there is no version tag or any other sign that the build is not vanilla (other than the path in this case):

[...]
GROMACS:      gmx_mpi, version 2016
Executable:   /pdc/vol/gromacs/2016.1-plumed_2.3b/amd64_co7/haswell/bin/gmx_mpi
Data prefix:  /pdc/vol/gromacs/2016.1-plumed_2.3b/amd64_co7/haswell
Working dir:  /afs/pdc.kth.se/home/p/pszilard
Command line:
  gmx_mpi -version

GROMACS version:    2016
Precision:          single
Memory model:       64 bit
MPI library:        MPI
[...]

#2 Updated by Mark Abraham over 2 years ago

Sounds like a useful rainy-day project. It would require build-system integration that does not get in the way of a build that cannot find a suitable cross-platform hashing program to use. Would we just not verify the source, e.g. on Windows, and would we report that as unverified, or merely not report it as verified?

#3 Updated by Aleksei Iupinov over 2 years ago

Apparently there is a hasher, starting with Windows 7.
https://superuser.com/a/898377

#4 Updated by Teemu Murtola over 2 years ago

I'm sure CMake has built-in hashing facilities that can be used for this. The trickier part is getting the correct files hashed (both from the source repository, and then after extracting from the tarballs). Also detecting extra files that can change behavior is tricky, but we don't do that fully for the git-based mechanism, either.

#5 Updated by Szilárd Páll over 2 years ago

  • Category set to build system
  • Difficulty simple added
  • Difficulty deleted (uncategorized)

#6 Updated by Szilárd Páll almost 2 years ago

Teemu Murtola wrote:

I'm sure CMake has built-in hashing facilities that can be used for this. The trickier part is getting the correct files hashed (both from the source repository, and then after extracting from the tarballs). Also detecting extra files that can change behavior is tricky, but we don't do that fully for the git-based mechanism, either.

This will clearly not make it into the next release, but having thought of it just recently I do wonder (noob question alert :): how does one generally checksum multiple files at the same time? I can imagine a dumb way of concatenating everything into a single bitstream and feeding that to shasum. Or would one do them one at a time and check each file?
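Both approaches are in use. Tools such as `sha256sum -c` and `shasum -c` check a list of per-file digests one file at a time. A sketch of that per-file variant in Python follows; the function names are hypothetical and SHA-256 is an arbitrary choice. Its advantage over a single concatenated stream is that it can name the files that differ.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    manifest = {}
    for path in root.rglob("*"):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            manifest[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def diff_manifest(expected: Dict[str, str], actual: Dict[str, str]) -> List[str]:
    """Return the sorted paths that were modified, removed, or added."""
    changed = []
    for rel in sorted(set(expected) | set(actual)):
        # .get() yields None for a missing file, which never equals a digest.
        if expected.get(rel) != actual.get(rel):
            changed.append(rel)
    return changed
```

The manifest built at release time would be shipped with the tarball and compared against one rebuilt from the extracted tree.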

#7 Updated by Szilárd Páll over 1 year ago

Looks like the equivalent of the following one-liner is what we want to do:
`cat $(find $PATH -type f) | shasum`
with the caveat that we'll need some filtering. Not sure if we should rely on cmake-internal hashing/globbing or make a best-effort attempt at finding external tools.
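For comparison, here is a Python sketch of the same single-stream idea with the filtering added. It reads `$PATH` in the one-liner as a placeholder for the source directory. The exclude patterns and the function name are hypothetical. Files are visited in sorted order because `find` does not guarantee any particular order, so the plain one-liner may not be reproducible across machines. SHA-1 is used only because that is the default algorithm of `shasum`.

```python
import hashlib
from fnmatch import fnmatch
from pathlib import Path
from typing import Tuple

# Hypothetical filter: files that should not influence the checksum.
EXCLUDE_PATTERNS = ("*.orig", "*.rej", "build/*")


def concatenated_digest(
    root: Path, exclude: Tuple[str, ...] = EXCLUDE_PATTERNS
) -> str:
    """SHA-1 over the concatenated contents of all non-excluded files."""
    h = hashlib.sha1()
    rel_paths = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )
    for rel in rel_paths:
        if any(fnmatch(rel, pattern) for pattern in exclude):
            continue
        # Read in chunks so large files are not loaded into memory at once.
        with open(root / rel, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
    return h.hexdigest()
```

Like the one-liner, this hashes file contents only, so a rename that keeps the sort order unchanged would go unnoticed.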

#8 Updated by Mark Abraham over 1 year ago

Szilárd Páll wrote:

Looks like the equivalent of the following one-liner is what we want to do:
`cat $(find $PATH -type f) | shasum`
with the caveat that we'll need some filtering. Not sure if we should rely on cmake-internal hashing/globbing or make a best-effort attempt at finding external tools.

One would need that to run as a make target that depends on all the header and source files (and probably CMake files), and its output would probably have to be a dependency for a source file that gets compiled into libgromacs. Probably one can use CMake target properties to get much of the former information about (at least) source files.
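As an illustration of the last step, the following Python sketch shows a generator that such a target might run to produce a small C++ source file for libgromacs. The symbol name `g_sourceVerified`, the function name, and the file layout are hypothetical. Rewriting the file only when its content changes is an added assumption, intended to avoid recompiling when the verification result is unchanged.

```python
from pathlib import Path


def write_if_changed(output: Path, checksum_ok: bool) -> bool:
    """Write the generated source file; return True if it was (re)written."""
    content = (
        "// Generated at build time; do not edit.\n"
        # Hypothetical symbol that the version-reporting code would read.
        f"extern const bool g_sourceVerified = {'true' if checksum_ok else 'false'};\n"
    )
    if output.is_file() and output.read_text() == content:
        return False  # keep the old timestamp so nothing is rebuilt
    output.write_text(content)
    return True
```

The build system would list the header, source, and CMake files as dependencies of the target that runs this step, and list the generated file among the sources of the library.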
