Bug #601

proposed change to type of c and tot arguments to print_cycles()

Added by Mark Abraham almost 9 years ago. Updated almost 9 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: mdrun
Target version:
Affected version - extra info:
Affected version:
Difficulty: uncategorized

Description

Hi,

In src/mdlib/gmx_wallcycle.c, the definition of print_cycles() is

static void print_cycles(FILE *fplog, double c2t, const char *name, int nnodes,
                         int n, gmx_cycles_t c, gmx_cycles_t tot)

The type of the last two parameters is a typedef whose definition varies with the platform. Inside this function, they are used only after typecasting to double. When print_cycles() is called, arguments of either type double or type gmx_cycles_t are supplied.
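To make the round trip concrete, here is a minimal sketch of what happens when a double is passed to a parameter of the typedef'd integer type and then cast back. The function names cycles_as_typedef() and cycles_as_double() are hypothetical reductions for illustration, not code from gmx_wallcycle.c, and the typedef is fixed to unsigned long long as reported for this platform. For in-range values the conversion just drops the fractional part. C leaves the conversion undefined when the truncated value cannot be represented in the integer type; whether that is what happens on BlueGene/L is not established here.

```c
/* Stand-in for the platform-dependent typedef; unsigned long long is
 * what this report states for BlueGene/L. */
typedef unsigned long long gmx_cycles_t;

/* Hypothetical reduction of the current parameter type: a double argument
 * is implicitly converted to gmx_cycles_t at the call, then cast back. */
static double cycles_as_typedef(gmx_cycles_t c)
{
    return (double)c;
}

/* Hypothetical reduction of the proposed parameter type: a double argument
 * passes through with no integer conversion at all. */
static double cycles_as_double(double c)
{
    return c;
}
```

With these definitions, cycles_as_typedef(1234.75) yields 1234.0, while cycles_as_double(1234.75) yields 1234.75.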

On my BlueGene/L, I observe that the .log output from print_cycles() is

R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
Computing:          Nodes  Number         G-Cycles        Seconds     %
-----------------------------------------------------------------------
Domain decomp.         40     750  18446744073.710  26352583927.2 100.0
DD comm. load          40     750  18446744073.710  26352583927.2 100.0
DD comm. bounds        40     751  18446744073.710  26352583927.2 100.0
Vsite constr.          40          18446744073.710  26352583927.2 100.0
Send X to PME          40    7501  18446744073.710  26352583927.2 100.0
Comm. coord.           40    7501  18446744073.710  26352583927.2 100.0
Neighbor search        40     751  18446744073.710  26352583927.2 100.0
Born radii             40          18446744073.710  26352583927.2 100.0
Force                  40    7501  18446744073.710  26352583927.2 100.0
Wait + Comm. F         40    7501  18446744073.710  26352583927.2 100.0
PME mesh               24    7501  18446744073.710  26352583927.2 100.0
Wait + Comm. X/F       24          18446744073.710  26352583927.2 100.0
Wait + Recv. PME F     40    7501  18446744073.710  26352583927.2 100.0
Vsite spread           40          18446744073.710  26352583927.2 100.0
Write traj.            40      16  18446744073.710  26352583927.2 100.0
Update                 40    7501  18446744073.710  26352583927.2 100.0
Constraints            40    7501  18446744073.710  26352583927.2 100.0
Comm. energies         40     752  18446744073.710  26352583927.2 100.0
Test                   40          18446744073.710  26352583927.2 100.0
Rest                   40          18446744073.710  26352583927.2 100.0
-----------------------------------------------------------------------
Total                  64          18446744073.710  26352583927.2 100.0
-----------------------------------------------------------------------
-----------------------------------------------------------------------
PME redist. X/F        24   15002  18446744073.710  26352583927.2 100.0
PME spread/gather      24   15002  18446744073.710  26352583927.2 100.0
PME 3D-FFT             24   15002  18446744073.710  26352583927.2 100.0
PME solve              24    7501  18446744073.710  26352583927.2 100.0
-----------------------------------------------------------------------

If I change the function definition such that c and tot are of type double, then I get sensible numbers in the output. I can't see why that should make any difference, but it does.

FWIW, gmx_cyclecounter typedefs gmx_cycles_t as unsigned long long on this architecture.
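One observation about the numbers in the log above: the repeated G-Cycles value, 18446744073.710, is numerically 2^64 x 1e-9. That is what the largest 64-bit unsigned long long looks like after conversion to double and scaling to giga-cycles. The sketch below reproduces the arithmetic. It assumes that unsigned long long is 64 bits wide and that the G-Cycles column is the raw cycle count multiplied by 1e-9 (inferred from the column header, not checked against the source). The helper names are hypothetical. This is consistent with, but does not prove, the integer parameters ending up at their maximum value.

```c
#include <limits.h>

/* Assumed scaling for the G-Cycles column: raw cycle count times 1e-9. */
static double to_gcycles(double cycles)
{
    return cycles * 1e-9;
}

/* Largest unsigned long long converted to double. With a 64-bit
 * unsigned long long this rounds to exactly 2^64. */
static double max_cycles_as_double(void)
{
    return (double)ULLONG_MAX;
}
```

Under the 64-bit assumption, printing to_gcycles(max_cycles_as_double()) with "%.3f" gives 18446744073.710, matching every row of the table.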

Can I please change the types of these arguments to double in the GROMACS repo to work around the problem permanently? (Or, can someone see what the actual problem is?)

History

#1 Updated by Berk Hess almost 9 years ago

Strange.
During a normal run print_cycles always gets called with doubles.
So something is going wrong with the unsigned long long to double
conversion or in the cast back. But on Linux x86_64, which we use
most of the time, gmx_cycles_t is also an unsigned long long
and we have never seen problems.
Anyhow, I changed the arguments to double:
commit 89c0883022896ff35b58c81a59a2e0dfee1c1ca5

Berk
