Project

General

Profile

Bug #1926

gmx analysis tools require big-endian TRR trajectories

Added by Daniel Roe over 3 years ago. Updated almost 3 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
core library
Target version:
Affected version - extra info:
Affected version:
Difficulty:
uncategorized
Close

Description

Hi,

Recently on the Amber mailing list a user (Robert Molt) reported that do_x3dna would crash when he tried to run it with a TRR trajectory generated by cpptraj from AmberTools. However, VMD was able to read and display the cpptraj-generated trajectory with no issues. I was able to reproduce this behavior with 'gmx rms' and found that when I forced cpptraj to write the TRR trajectory as big-endian, 'gmx rms' was able to run successfully. Is this intentional behavior? Should I be forcing cpptraj to write TRR trajectories as big endian by default? Thanks.

-Dan

GROMACS version: VERSION 5.1.2
Precision: single
Memory model: 32 bit
MPI library: thread_mpi
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 32)
GPU support: disabled
OpenCL support: disabled
invsqrt routine: gmx_software_invsqrt(x)
SIMD instructions: SSE2
FFT library: fftw-3.3.4-sse2
RDTSCP usage: disabled
C++11 compilation: enabled
TNG support: enabled
Tracing support: disabled
Built on: Thu Mar 17 07:30:12 MDT 2016
Built by: droe@choxnuxe [CMAKE]
Build OS/arch: Linux 3.13.0-83-generic i686
Build CPU vendor: GenuineIntel
Build CPU brand: Intel(R) Atom(TM) CPU N280 @ 1.66GHz
Build CPU family: 6 Model: 28 Stepping: 2
Build CPU features: apic clfsh cmov cx8 htt lahf_lm mmx msr pdcm pse sse2 sse3 ssse3
C compiler: /usr/bin/cc GNU 4.8.4
C compiler flags: -msse2 -Wundef -Wextra -Wno-missing-field-initializers -Wno-sign-compare -Wpointer-arith -Wall -Wno-unused -Wunused-value -Wunused-parameter -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast -Wno-array-bounds
C++ compiler: /usr/bin/c++ GNU 4.8.4
C++ compiler flags: -msse2 -std=c++0x -Wundef -Wextra -Wno-missing-field-initializers -Wpointer-arith -Wall -Wno-unused-function -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast -Wno-array-bounds
Boost version: 1.55.0 (internal)

Associated revisions

Revision f7d4d019 (diff)
Added by Mark Abraham over 3 years ago

Fix trr magic number reading

The trr header-reading routine returned an "OK" value if the magic
number was wrong, which might lead to chaotic results everywhere,
because checking of return values tends not to happen (even when
they're right). Other GROMACS magic-number reading routines tend to
give a fatal error if the the number is wrong, e.g. by reading a file
written in wrong endianness. (This should never be a thing for XDR
files, which are defined to be big endian, but such code has existed.)

This change adds/restores that behaviour to trr reading, along
with separating the behaviour of failing to read a magic integer
from reading one that doesn't match.

Fixes #1926

Change-Id: I3cdd8ae9172e3b95fc232d8fa31a442d239233db

Revision d465ad8b (diff)
Added by Mark Abraham almost 3 years ago

Fix logic of TRR reading

Commit f7d4d019 introduced a bug where TRR reading reaching the end of
the file was indistinguishable from a reading error or a magic-number
error. This is now fixed, restoring the end-of-file behaviour that
existed before f7d4d019, while retaining the wrong-magic-number
behaviour that it introduced.

Refs #1926

Change-Id: Ic8540846c481f022bc6ae7b774794792c8c7a523

History

#1 Updated by Mark Abraham over 3 years ago

It's not clear whether you are saying that gmx rms reproduced the crash of do_x3dna or the success of VMD.

In any case, the TRR format is based on XDR, which is defined as big-endian, and I think that all XDR writers should conform. Thus I think that any tool that fails to read a little-endian TRR file has a reasonable implementation.

That VMD apparently also copes with little-endian is nice from a certain point of view, but I don't see any value in encouraging proliferation of forms of XDR. I would encourage e.g. cpptraj, VMD and other tools to use a standard XDR library, or at least http://www.gromacs.org/Developer_Zone/Programming_Guide/XTC_Library, rather than rolling their own code, and thus accidentally writing non-conforming XDR, leading to further fragmentation of the software landscape. :-)

#2 Updated by Daniel Roe over 3 years ago

That's good to know, although I was not able to find any clear documentation to this effect. For example, the first couple of results of a google search for "gromacs trr file format description" indicate no clear link between the XDR format and gromacs TRR files (although the fourth hit, a link to MDAnalysis documentation, starts to hint at this indirectly via the way the modules are named, but even then it's not clear). Compare this to the first result of a google search for "amber netcdf file format description", which for me is http://ambermd.org/netcdf/nctraj.pdf.

Now that I know that TRR should be XDR I will make the appropriate updates in cpptraj, although it does seem strange to me to have a big-endian format in what is primarily a little-endian world! Thanks for the info,

-Dan

-Dan

#3 Updated by Mark Abraham over 3 years ago

If you have a "little-endian TRR file" that makes gmx rms crash, please upload a copy here, so we can see how to improve that.

#4 Updated by Gerrit Code Review Bot over 3 years ago

Gerrit received a related patchset '1' for Issue #1926.
Uploader: Mark Abraham ()
Change-Id: If919545a74dc415225706c9e4d23968fd04dc3a1
Gerrit URL: https://gerrit.gromacs.org/5736

#5 Updated by Daniel Roe over 3 years ago

Hi,

I'm attaching a tar/gz containing files that can be used to reproduce the behavior. It contains a README describing each file and steps to reproduce the bug. Let me know if you need any more info.

-Dan

#6 Updated by Erik Lindahl over 3 years ago

  • Status changed from New to Rejected

IMHO, this is not a bug, but a file that has been generated incorrectly with some other tool.

Once upon a long time ago Gromacs had another trajectory format (trj) which was endian-dependent. However, the whole idea with using XDR data representation is that we leave it to the operating system to translate data to the XDR network standard representation (just as it does e.g. for any data over NFS). While it's a fun quirk of history that all modern systems happen to be little-endian, that doesn't change the fact that the XDR standard is big endian, so all unix systems covert back-and-forth. However, XDR also goes further than this by specifying that the conversions have to work even if the system e.g. isn't using IEEE754 floating-point format internally.

So, while it's of course unfortunate if somebody missed that and generate files that are suddenly endian-dependent, I wouldn't put very high priority at rewriting the GROMACS trr layer to handle that type of files, but we might be able to detect it and refuse to read the file in a nicer fashion.

And, just for the record: The future of file formats in Gromacs is TrajNG, which will gradually become the new standard we use (it's just not written by default - yet)!

#7 Updated by Mark Abraham over 3 years ago

I still plan to look into the reported segfault, but haven't had time just now.

#8 Updated by Mark Abraham over 3 years ago

  • Status changed from Rejected to In Progress
  • Assignee set to Mark Abraham
  • Target version set to 2016

#9 Updated by Gerrit Code Review Bot over 3 years ago

Gerrit received a related patchset '1' for Issue #1926.
Uploader: Mark Abraham ()
Change-Id: I3cdd8ae9172e3b95fc232d8fa31a442d239233db
Gerrit URL: https://gerrit.gromacs.org/6037

#10 Updated by Mark Abraham over 3 years ago

  • Category changed from analysis tools to core library
  • Status changed from In Progress to Fix uploaded

The code that read the trr header wasn't doing anything reasonable when it saw the wrong magic number, but merely returned with a success error code, and then e.g. gmx rms didn't check the error code (which was anyway wrong), which led to the segfault. Other tools will have had other strange behaviour. My patch fixes this for the 2016 release, by issuing a fatal error like our other file-reading code does when it can't find a magic number, but the whole fileio module needs a complete overhaul.

#11 Updated by Mark Abraham over 3 years ago

  • Status changed from Fix uploaded to Resolved

#12 Updated by Mark Abraham over 3 years ago

  • Status changed from Resolved to Closed

#13 Updated by Gerrit Code Review Bot almost 3 years ago

Gerrit received a related patchset '1' for Issue #1926.
Uploader: Mark Abraham ()
Change-Id: gromacs~release-2016~Ic8540846c481f022bc6ae7b774794792c8c7a523
Gerrit URL: https://gerrit.gromacs.org/6367

Also available in: Atom PDF